Improved shortest path edit distance for synonyms identification

Alex Rudniy, Min Song, James Geller

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Integration of proliferous sequencing and related data into the UniProt Knowledgebase is an important ongoing research project. This paper proposes Improved Shortest Path Edit Distance (ISPED) as an algorithm for enhancing existing integration techniques. ISPED is an improved version of the algorithm previously developed by the authors. Three major adjustments have been made: better node weight calculation, score normalization, and implementation of a re-scorer. We apply ISPED as an approximate string similarity metric to five datasets extracted from UNIPROT-GOA during synonym identification experiments. ISPED outperforms nine well-known string similarity metrics and achieves the highest values of average precision and F1 on all selected datasets.

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Publisher: International Society for Computers and Their Applications
Pages: 97-102
Number of pages: 6
ISBN (Print): 9781632665140
State: Published - 2014
Event: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 - Las Vegas, NV, United States
Duration: Mar 24 2014 to Mar 26 2014

Publication series

Name: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

Other

Other: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Country/Territory: United States
City: Las Vegas, NV
Period: 3/24/14 to 3/26/14

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Health Informatics
