Automatically finding significant topical terms from documents

Quanzhi Li, Yi Fang Brook Wu, Razvan Stefan Bot, Xin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the proliferation of digital textual data, text mining is becoming more and more important to deriving competitive advantages. One factor for successful text mining applications is the ability to find significant topical terms for discovering interesting patterns or relationships. Document keyphrases are phrases carrying the most important topical concepts for a given document. In many applications, keyphrases as textual elements are better suited for text mining and could provide more discriminating power than single words. This paper describes an automatic keyphrase identification program (KIP). KIP's algorithm examines the composition of noun phrases and calculates their scores by looking up a domain-specific glossary database; the ones with higher scores are extracted as keyphrases. KIP's learning function can enrich its glossary database by automatically adding newly identified keyphrases. KIP's personalization feature allows the user to build a glossary database specifically suited to his/her area of interest.
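The scoring idea in the abstract can be illustrated with a minimal sketch. This is not KIP's actual algorithm, which the abstract does not specify: the weighting scheme, the function names (`score_phrase`, `extract_keyphrases`, `learn`), and the glossary layout (a plain dictionary from lowercase terms to weights) are all assumptions made for illustration, and candidate noun phrases are assumed to be supplied by some upstream step.

```python
from typing import Dict, List, Tuple


def score_phrase(phrase: str, glossary: Dict[str, float]) -> float:
    """Score a noun phrase from glossary weights (hypothetical scheme)."""
    key = phrase.lower()
    words = key.split()
    # Weight of the whole phrase if it is a glossary entry, else 0.
    phrase_weight = glossary.get(key, 0.0)
    # Weights of component words; skipped for one-word phrases so that a
    # single word is not counted twice.
    word_weight = sum(glossary.get(w, 0.0) for w in words) if len(words) > 1 else 0.0
    return phrase_weight + word_weight


def extract_keyphrases(
    candidates: List[str], glossary: Dict[str, float], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k highest-scoring candidate phrases with their scores."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    scored = [(p, score_phrase(p, glossary)) for p in unique]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def learn(
    glossary: Dict[str, float],
    keyphrases: List[Tuple[str, float]],
    default_weight: float = 1.0,
) -> None:
    """Add newly identified keyphrases to the glossary (assumed default weight)."""
    for phrase, _ in keyphrases:
        glossary.setdefault(phrase.lower(), default_weight)


if __name__ == "__main__":
    # Toy glossary and candidates, invented for this example.
    glossary = {"text mining": 6.0, "keyphrase": 3.0, "extraction": 2.0, "glossary": 1.0}
    candidates = ["text mining", "keyphrase extraction", "glossary database"]
    top = extract_keyphrases(candidates, glossary, top_k=2)
    learn(glossary, top)
    print(top)
```

A personalized glossary, in this sketch, would simply be a different dictionary passed to the same functions.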

Original language: English (US)
Title of host publication: Association for Information Systems - 11th Americas Conference on Information Systems, AMCIS 2005
Subtitle of host publication: A Conference on a Human Scale
Pages: 452-459
Number of pages: 8
State: Published - Dec 1 2005
Event: 11th Americas Conference on Information Systems, AMCIS 2005 - Omaha, NE, United States
Duration: Aug 11 2005 to Aug 15 2005

Publication series

Name: Association for Information Systems - 11th Americas Conference on Information Systems, AMCIS 2005: A Conference on a Human Scale
Volume: 1

Other

Other: 11th Americas Conference on Information Systems, AMCIS 2005
Country: United States
City: Omaha, NE
Period: 8/11/05 to 8/15/05

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Computer Networks and Communications
  • Information Systems
  • Library and Information Sciences

Keywords

  • Document keyphrase
  • Document metadata
  • Glossary database
  • Keyphrase extraction
  • Text mining
