Abstract
Readability assessment is a fundamental problem in textbook evaluation. For low-resource languages (LRLs), however, the readability of textbooks has received little investigation. In this paper, we propose a readability assessment method for textbooks in Tibetan, a low-resource language. We extract features from the information obtained through Tibetan word segmentation and named entity recognition. We then compute the correlations among these features using the Pearson correlation coefficient and select feature sets from which to design the readability formula. Goodness-of-fit checks, F-tests, and t-tests are applied to the selected features to produce a new readability assessment formula. Experiments show that the new formula is capable of assessing the readability of Tibetan textbooks.
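The pipeline described in the abstract (Pearson correlation for feature screening, then a linear formula checked with an F-test) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names `avg_sentence_len` and `entity_ratio`, the synthetic values, and the use of ordinary least squares with an overall F statistic. The t-tests on individual coefficients are omitted for brevity.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ols(features, target):
    """Least-squares fit; returns (coefficients, R^2, overall F statistic).

    `features` is a list of columns; the intercept is the first coefficient.
    """
    n, k = len(target), len(features)
    X = [[1.0] + [col[i] for col in features] for i in range(n)]
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k + 1)]
           for a in range(k + 1)]
    Xty = [sum(X[i][a] * target[i] for i in range(n)) for a in range(k + 1)]
    beta = solve(XtX, Xty)
    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(target) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(target, pred))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    r2 = 1.0 - ss_res / ss_tot
    # F = (explained variance / k) / (residual variance / (n - k - 1))
    if ss_res == 0:
        f_stat = math.inf
    else:
        f_stat = ((ss_tot - ss_res) / k) / (ss_res / (n - k - 1))
    return beta, r2, f_stat

# Synthetic data (hypothetical): one row per textbook passage.
grade = [1, 2, 3, 4, 5, 6, 7, 8]
avg_sentence_len = [5.1, 6.0, 7.2, 7.9, 9.1, 9.8, 11.2, 12.0]
entity_ratio = [0.02, 0.05, 0.03, 0.06, 0.04, 0.08, 0.07, 0.09]

# Step 1: screen features by their correlation with the grade level.
corr_len = pearson(avg_sentence_len, grade)
corr_ent = pearson(entity_ratio, grade)

# Step 2: fit a linear readability formula on the retained features.
beta, r2, f_stat = fit_ols([avg_sentence_len, entity_ratio], grade)
print("correlations:", corr_len, corr_ent)
print("coefficients:", beta, "R^2:", r2, "F:", f_stat)
```

In practice the F statistic would be compared against the critical value of the F distribution with (k, n - k - 1) degrees of freedom, and a statistics library would normally replace the hand-written solver.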
Original language | English (US) |
---|---|
Pages (from-to) | 213-225 |
Number of pages | 13 |
Journal | Computers, Materials and Continua |
Volume | 61 |
Issue number | 1 |
State | Published - 2019 |
All Science Journal Classification (ASJC) codes
- Biomaterials
- Modeling and Simulation
- Mechanics of Materials
- Computer Science Applications
- Electrical and Electronic Engineering
Keywords
- Linear regression
- Low resource language
- Named entity
- Readability assessment
- Textbook in Tibetan