TY - GEN
T1 - Improved shortest path edit distance for synonyms identification
AU - Rudniy, Alex
AU - Song, Min
AU - Geller, James
PY - 2014
Y1 - 2014
N2 - Integration of proliferous sequencing and related data into the UniProt Knowledgebase is an important ongoing research project. This paper proposes Improved Shortest Path Edit Distance (ISPED) as an algorithm for enhancing existing integration techniques. ISPED is an improved version of the algorithm previously developed by the authors. Three major adjustments have been made: better node weight calculation, score normalization, and implementation of a re-scorer. We apply ISPED as an approximate string similarity metric to five datasets extracted from UNIPROT-GOA during synonym identification experiments. ISPED outperforms nine well-known string similarity metrics and achieves the highest values of average precision and F1 on all selected datasets.
AB - Integration of proliferous sequencing and related data into the UniProt Knowledgebase is an important ongoing research project. This paper proposes Improved Shortest Path Edit Distance (ISPED) as an algorithm for enhancing existing integration techniques. ISPED is an improved version of the algorithm previously developed by the authors. Three major adjustments have been made: better node weight calculation, score normalization, and implementation of a re-scorer. We apply ISPED as an approximate string similarity metric to five datasets extracted from UNIPROT-GOA during synonym identification experiments. ISPED outperforms nine well-known string similarity metrics and achieves the highest values of average precision and F1 on all selected datasets.
UR - http://www.scopus.com/inward/record.url?scp=84905826168&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905826168&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84905826168
SN - 9781632665140
T3 - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
SP - 97
EP - 102
BT - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
PB - International Society for Computers and Their Applications
T2 - 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Y2 - 24 March 2014 through 26 March 2014
ER -