KIP: A keyphrase identification program with learning functions

Yi Fang Brook Wu, Quanzhi Li, Razvan Stefan Bot, Xin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

In this paper, we report a keyphrase identification program (KIP) that starts from sample human-identified keyphrases and then learns to identify additional new keyphrases. KIP first populates its database with manually identified keyphrases; each keyphrase is pre-processed and assigned an initial weight. It then extracts noun phrases from documents. Each noun phrase is assigned a score that depends on the weights of the words it contains; those with a score higher than the threshold are selected as keyphrases. Newly learned keyphrases are inserted into the database and the weights are updated, which triggers a new keyphrase identification iteration. The process stops when no new keyphrases were identified during the previous iteration. According to the evaluation results, the base KIP system's average recall was 0.7 and precision was 0.44. The augmented KIP with learning functions produced new keyphrases that the base system had not identified.
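The iterative loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, the weight values, or the update rule, so the mean-of-word-weights score, the `initial_weight` and `learned_weight` parameters, and the function names are all assumptions made for illustration. Noun phrase extraction is also assumed to have happened already, so the function takes candidate phrases as input.

```python
def tokenize(phrase):
    """Lowercase and split on whitespace (a stand-in for real pre-processing)."""
    return phrase.lower().split()


def score(phrase, weights):
    """Assumed scoring rule: mean weight of the words in the phrase."""
    words = tokenize(phrase)
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


def identify_keyphrases(seed_keyphrases, candidates, threshold,
                        initial_weight=1.0, learned_weight=0.5):
    """Repeat identification until an iteration finds no new keyphrase.

    seed_keyphrases: manually identified keyphrases (the initial database).
    candidates: noun phrases already extracted from the documents.
    Returns the learned keyphrases in the order they were found.
    """
    # Populate the word-weight database from the manual keyphrases.
    weights = {}
    for phrase in seed_keyphrases:
        for word in tokenize(phrase):
            weights[word] = initial_weight

    known = {p.lower() for p in seed_keyphrases}
    learned = []
    while True:
        # Score every unseen candidate against the current weights.
        new = []
        for candidate in candidates:
            key = candidate.lower()
            if key not in known and score(candidate, weights) > threshold:
                known.add(key)
                new.append(candidate)
        if not new:
            return learned  # stop: the last iteration found nothing new
        # Assumed update rule: unseen words get a smaller weight.
        for phrase in new:
            for word in tokenize(phrase):
                weights.setdefault(word, learned_weight)
        learned.extend(new)


if __name__ == "__main__":
    seeds = ["information retrieval", "text mining"]
    noun_phrases = ["information extraction", "extraction rules", "weather report"]
    print(identify_keyphrases(seeds, noun_phrases, threshold=0.2))
    # prints ['information extraction', 'extraction rules']
```

In this toy run, "extraction rules" scores 0 in the first iteration and is only selected in the second, after "extraction" has received a weight from the newly learned "information extraction". That is the kind of keyphrase a single-pass base system would miss. Because the set of known phrases only grows and the candidate list is finite, the loop always terminates.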

Original language: English (US)
Title of host publication: International Conference on Information Technology
Subtitle of host publication: Coding Computing, ITCC 2004
Editors: P.K. Srimani, A. Abraham, M. Cannataro, J. Domingo-Ferrer, R. Hashemi
Pages: 450-454
Number of pages: 5
State: Published - Jul 6 2004
Event: International Conference on Information Technology: Coding Computing, ITCC 2004 - Las Vegas, NV, United States
Duration: Apr 5 2004 → Apr 7 2004

Publication series

Name: International Conference on Information Technology: Coding Computing, ITCC
Volume: 2

Other

Other: International Conference on Information Technology: Coding Computing, ITCC 2004
Country/Territory: United States
City: Las Vegas, NV
Period: 4/5/04 → 4/7/04

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Engineering (all)
