TY - GEN
T1 - IFME
T2 - 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2013
AU - Zhu, Mingzhu
AU - Xu, Chao
AU - Wu, Yi Fang Brook
PY - 2013
Y1 - 2013
N2 - With the number of digitized documents increasing exponentially, it is increasingly difficult for users to keep up to date with the knowledge in their domain. In this paper, we present a framework named IFME (Information Filtering by Multiple Examples) in a digital library environment that helps users identify the literature related to their interests by leveraging Positive Unlabeled learning (PU learning). Using a few relevant documents provided by a user as positive examples (called P) and treating the documents in an online database as unlabeled data (called U), it ranks the documents in U using a PU learning algorithm. From the experimental results, we found that while the approach performed well when a large set of relevant feedback documents was available, it performed relatively poorly when the relevant feedback documents were few. We improved IFME by combining PU learning with under-sampling to tune the performance. Measured by Mean Average Precision (MAP), our experimental results indicated that with under-sampling, the performance improved significantly even when the size of P was small. We believe the PU learning based IFME framework offers insights for developing more effective digital library systems.
AB - With the number of digitized documents increasing exponentially, it is increasingly difficult for users to keep up to date with the knowledge in their domain. In this paper, we present a framework named IFME (Information Filtering by Multiple Examples) in a digital library environment that helps users identify the literature related to their interests by leveraging Positive Unlabeled learning (PU learning). Using a few relevant documents provided by a user as positive examples (called P) and treating the documents in an online database as unlabeled data (called U), it ranks the documents in U using a PU learning algorithm. From the experimental results, we found that while the approach performed well when a large set of relevant feedback documents was available, it performed relatively poorly when the relevant feedback documents were few. We improved IFME by combining PU learning with under-sampling to tune the performance. Measured by Mean Average Precision (MAP), our experimental results indicated that with under-sampling, the performance improved significantly even when the size of P was small. We believe the PU learning based IFME framework offers insights for developing more effective digital library systems.
KW - Information retrieval
KW - Positive unlabeled learning
KW - Relevance feedback
KW - Search by multiple examples
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=84882257258&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84882257258&partnerID=8YFLogxK
U2 - 10.1145/2467696.2467736
DO - 10.1145/2467696.2467736
M3 - Conference contribution
AN - SCOPUS:84882257258
SN - 9781450320764
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 107
EP - 110
BT - JCDL 2013 - Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries
Y2 - 22 July 2013 through 26 July 2013
ER -