TY - GEN
T1 - Generating better concept hierarchies using automatic document classification
AU - Bot, Razvan Stefan
AU - Wu, Yi Fang Brook
AU - Chen, Xin
AU - Li, Quanzhi
PY - 2005
Y1 - 2005
N2 - This paper presents a hybrid concept hierarchy development technique for web documents retrieved by a meta-search engine. The aim of the technique is to separate the initially retrieved documents into topically oriented categories prior to the actual concept hierarchy generation. The topical categories correspond to different semantic aspects of the query. This is done by applying 1-of-n automatic document classification to the initial set of returned documents. Then, an individual topical concept hierarchy is automatically generated inside each of the resulting categories. Both steps are executed on the fly at retrieval time. Due to the efficiency constraints imposed by the web retrieval context, the algorithm uses only document snippets (rather than full web pages) for both document classification and concept hierarchy generation. Experimental results show that the algorithm is able to improve the quality of the concept hierarchy presented to the searcher; at the same time, the efficiency parameters are kept within reasonable intervals.
AB - This paper presents a hybrid concept hierarchy development technique for web documents retrieved by a meta-search engine. The aim of the technique is to separate the initially retrieved documents into topically oriented categories prior to the actual concept hierarchy generation. The topical categories correspond to different semantic aspects of the query. This is done by applying 1-of-n automatic document classification to the initial set of returned documents. Then, an individual topical concept hierarchy is automatically generated inside each of the resulting categories. Both steps are executed on the fly at retrieval time. Due to the efficiency constraints imposed by the web retrieval context, the algorithm uses only document snippets (rather than full web pages) for both document classification and concept hierarchy generation. Experimental results show that the algorithm is able to improve the quality of the concept hierarchy presented to the searcher; at the same time, the efficiency parameters are kept within reasonable intervals.
KW - Automatic classification
KW - Concept hierarchy
KW - Document classification
KW - Information retrieval
KW - Manual classification
UR - http://www.scopus.com/inward/record.url?scp=33745799488&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745799488&partnerID=8YFLogxK
U2 - 10.1145/1099554.1099627
DO - 10.1145/1099554.1099627
M3 - Conference contribution
AN - SCOPUS:33745799488
SN - 1595931406
SN - 9781595931405
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 281
EP - 282
BT - CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
T2 - CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Y2 - 31 October 2005 through 5 November 2005
ER -