Diverse Linguistic Features for Assessing Reading Difficulty of Educational Filipino Texts

Research output: Working paper / Preprint



Properly identifying the difficulty levels of reading materials is necessary to ensure quality and effective learning, fluency, and comprehension. In this paper, we describe the development of automatic, machine learning-based readability assessment models for educational Filipino texts using the most diverse set of linguistic features for the language. Results show that a Random Forest model achieved a high accuracy of 62.7%, and 66.1% when using the optimal combination of feature sets, which consists of traditional and syllable pattern-based predictors.
Original language: Undefined/Unknown
Publisher: Asia-Pacific Society for Computers in Education
Publication status: Published - 31 Jul 2021

Publication series

Name: International Conference on Computers in Education

Bibliographical note

Accepted at ICCE 2021


  • cs.CL
  • cs.LG
