TY - JOUR
T1 - Document keyphrases as subject metadata
T2 - Incorporating document key concepts in search results
AU - Wu, Yi Fang Brook
AU - Li, Quanzhi
N1 - Funding Information:
Acknowledgements: The authors would like to thank Allison Zhang, Manager of the Digital Collections Production Center, Washington Research Library Consortium, for providing us with the document collection and building the glossary for our experiment. Partial support for this research was provided by the United Parcel Service Foundation; the National Science Foundation under grants DUE-0226075, DUE-0434581 and DUE-0434998; and the Institute of Museum and Library Services under grant LG-02-04-0002-04.
PY - 2008/6
Y1 - 2008/6
N2 - Most search engines display some document metadata, such as title, snippet and URL, alongside the returned hits to help users decide which documents to examine. However, this metadata usually consists of fragmented pieces of information that, even when combined, do not provide an overview of a returned document. In this paper, we propose a mechanism for enriching the metadata of returned results by incorporating automatically extracted document keyphrases with each returned hit. We hypothesize that the keyphrases of a document can better represent the major theme of that document. Therefore, by examining the keyphrases in each returned hit, users can better predict the content of documents, and the time spent downloading and examining irrelevant documents will be reduced substantially.
AB - Most search engines display some document metadata, such as title, snippet and URL, alongside the returned hits to help users decide which documents to examine. However, this metadata usually consists of fragmented pieces of information that, even when combined, do not provide an overview of a returned document. In this paper, we propose a mechanism for enriching the metadata of returned results by incorporating automatically extracted document keyphrases with each returned hit. We hypothesize that the keyphrases of a document can better represent the major theme of that document. Therefore, by examining the keyphrases in each returned hit, users can better predict the content of documents, and the time spent downloading and examining irrelevant documents will be reduced substantially.
KW - Document keyphrase
KW - Document metadata
KW - Document surrogate
KW - Keyphrase extraction
KW - Search interface
UR - http://www.scopus.com/inward/record.url?scp=42149106891&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=42149106891&partnerID=8YFLogxK
U2 - 10.1007/s10791-008-9044-1
DO - 10.1007/s10791-008-9044-1
M3 - Article
AN - SCOPUS:42149106891
SN - 1386-4564
VL - 11
SP - 229
EP - 249
JO - Information Retrieval
JF - Information Retrieval
IS - 3
ER -