TY - GEN
T1 - A human-in-the-loop attribute design framework for classification
AU - Salam, Md Abdus
AU - Koone, Mary E.
AU - Saravanan,
AU - Das, Gautam
AU - Roy, Senjuti Basu
N1 - Funding Information:
The work of Senjuti Basu Roy is supported by the National Science Foundation under Grant No.: 1814595 and Office of Naval Research under Grant No.: N000141812838. The work of Gautam Das is supported in part by grant W911NF-15-1-0020 from the Army Research Office, grant 1745925 from the National Science Foundation, and a grant from AT&T.
Publisher Copyright:
© 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
PY - 2019/5/13
Y1 - 2019/5/13
N2 - In this paper, we present a semi-automated, “human-in-the-loop” framework for attribute design that helps human analysts transform raw attributes into effective derived attributes for classification problems. Our proposed framework is optimization guided and fully agnostic to the underlying classification model. We present an algebra with various operators (arithmetic, relational, and logical) to transform raw attributes into derived attributes and solve two technical problems: (a) the top-k buckets design problem, which presents human analysts with k buckets, each containing promising raw attributes that the analyst can focus on without having to examine all raw attributes; and (b) the top-l snippets generation problem, which iteratively provides human analysts with the top-l derived attributes involving a given attribute. For the former problem, we present an effective exact bottom-up algorithm with pruning capability, as well as random-walk-based heuristic algorithms that are intuitive and work well in practice. For the latter, we present a greedy heuristic algorithm that is scalable and effective. Rigorous evaluations on six real-world datasets show that our framework generates effective derived attributes compared with fully manual or fully automated methods.
KW - Attribute design
KW - Crowdsourcing
KW - Feature engineering
KW - Human computation
UR - http://www.scopus.com/inward/record.url?scp=85066894041&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066894041&partnerID=8YFLogxK
U2 - 10.1145/3308558.3313547
DO - 10.1145/3308558.3313547
M3 - Conference contribution
AN - SCOPUS:85066894041
T3 - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
SP - 1612
EP - 1622
BT - The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
PB - Association for Computing Machinery, Inc
T2 - 2019 World Wide Web Conference, WWW 2019
Y2 - 13 May 2019 through 17 May 2019
ER -