TY - GEN
T1 - Improving access to digital library resources by automatically generating complete reading level metadata
AU - Will, Todd
AU - Brook Wu, Yi Fang
PY - 2012
Y1 - 2012
AB - Digital library collections usually hold resources that cover a limited set of topics but span a wide range of reading levels, so complete reading-level metadata is required to filter relevant resources from the collection. To suggest a reading level for every resource in the test collection, we propose an SVM-based classification tool that predicts the specific reading level with an F-measure of 0.70 across all resources, outperforming the other classification methods and readability formulas under evaluation. To measure the impact of reading-level metadata completeness on retrieval performance, a knowledge-based system retrieves documents from three collections with differing reading-level completeness: one with complete reading-level information generated by the proposed SVM method, one missing all reading-level information, and one containing limited metadata provided by human experts. The collection with automatically generated, complete reading-level metadata outperforms the collection-provided reading-level metadata on all five sample tasks.
KW - Automatic metadata generation
KW - Digital libraries
KW - Knowledge-based filtering
KW - Reading level
UR - http://www.scopus.com/inward/record.url?scp=84877918216&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84877918216&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84877918216
SN - 9781622768271
T3 - 18th Americas Conference on Information Systems 2012, AMCIS 2012
SP - 2122
EP - 2131
BT - 18th Americas Conference on Information Systems 2012, AMCIS 2012
T2 - 18th Americas Conference on Information Systems 2012, AMCIS 2012
Y2 - 9 August 2012 through 12 August 2012
ER -