Domain-specific keyphrase extraction

Yi Fang Brook Wu, Quanzhi Li, Razvan Stefan Bot, Xin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

47 Scopus citations

Abstract

Document keyphrases provide semantic metadata that characterizes a document and gives an overview of its content. They can be used in many text-mining and knowledge-management applications. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified domain keyphrases to assign weights to the candidate keyphrases. The logic of our algorithm is: the more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. To obtain prior positive inputs, KIP first populates its glossary database using manually identified keyphrases and keywords. It then checks the composition of all noun phrases of a document, looks them up in the database and calculates a score for each noun phrase. The noun phrases with the highest scores are extracted as keyphrases.
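The scoring idea in the abstract can be illustrated with a minimal sketch. It is not KIP's actual implementation: the abstract does not give the scoring formula, so the additive score below (sum of the keyword weights plus a bonus for phrases already in the glossary) is an assumption, as are the names `KEYWORD_WEIGHTS`, `KEYPHRASE_WEIGHTS`, `score_phrase`, and `extract_keyphrases` and all weight values. Noun-phrase detection is assumed to have happened upstream, so the function takes candidate phrases as input.

```python
from typing import Dict, List, Tuple

# Hypothetical glossary database: a real system would populate these from
# manually identified domain keyphrases. The weights are illustrative only.
KEYWORD_WEIGHTS: Dict[str, float] = {
    "keyphrase": 0.9,
    "extraction": 0.7,
    "text": 0.5,
    "mining": 0.6,
}
KEYPHRASE_WEIGHTS: Dict[str, float] = {
    "keyphrase extraction": 1.0,
    "text mining": 0.8,
}


def score_phrase(
    phrase: str,
    keyword_weights: Dict[str, float],
    keyphrase_weights: Dict[str, float],
) -> float:
    """Score a candidate noun phrase (assumed additive formula).

    More known keywords, and more significant ones, give a higher score;
    a phrase that is itself in the glossary gets an extra bonus.
    """
    normalized = phrase.lower()
    keyword_score = sum(keyword_weights.get(w, 0.0) for w in normalized.split())
    return keyword_score + keyphrase_weights.get(normalized, 0.0)


def extract_keyphrases(
    noun_phrases: List[str],
    keyword_weights: Dict[str, float],
    keyphrase_weights: Dict[str, float],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top_k candidate phrases ranked by descending score."""
    scored = [
        (p, score_phrase(p, keyword_weights, keyphrase_weights))
        for p in noun_phrases
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    candidates = ["keyphrase extraction", "text mining", "conference room"]
    for phrase, score in extract_keyphrases(
        candidates, KEYWORD_WEIGHTS, KEYPHRASE_WEIGHTS
    ):
        print(f"{phrase}: {score:.2f}")
```

In this sketch, "conference room" contains no glossary keywords and scores zero, so it is ranked below the two domain phrases. A fixed `top_k` cutoff is only one possible selection rule; a score threshold would serve equally well.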

Original language: English (US)
Title of host publication: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 283-284
Number of pages: 2
ISBN (Print): 1595931406, 9781595931405
State: Published - 2005
Event: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management - Bremen, Germany
Duration: Oct 31 2005 → Nov 5 2005

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Country/Territory: Germany
City: Bremen
Period: 10/31/05 → 11/5/05

All Science Journal Classification (ASJC) codes

  • General Decision Sciences
  • General Business, Management and Accounting

Keywords

  • Document Keyphrase
  • Document Metadata
  • Keyphrase Extraction
  • Text Mining
