Semi-automatic Construction of Sight Words Dictionary for Filipino Text Readability

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

Readability formulas consider word familiarity as one of the factors for predicting the readability of children’s books. Word familiarity depends on how frequently words are encountered in daily reading. Effective recognition of these high-frequency words, often referred to as “sight words”, can help young readers develop reading fluency and comprehension. In this paper, we describe our work in building a dictionary of sight words for Filipino using a corpus of Filipino literary materials written for children. We expanded the dictionary to a total of 664 words using a pre-trained word embedding model. The availability of such a dictionary can facilitate the development of a readability formula for Filipino text, especially with respect to lexical complexity.
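
As a rough illustration of the two steps the abstract mentions (selecting high-frequency words from a corpus, then expanding the list with word embeddings), a minimal sketch might look like the following. This is not the authors' procedure: the function names, the tokenization rule, the similarity threshold, the three-sentence corpus, and the two-dimensional vectors are all assumptions made for the example, and a real setup would load a pre-trained embedding model instead of a hand-written dictionary.

```python
from collections import Counter
import math
import re


def top_frequent_words(texts, k):
    """Return the k most frequent lowercase word tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Assumed tokenization: runs of letters, optionally joined by hyphens.
        counts.update(re.findall(r"[a-zñ]+(?:-[a-zñ]+)*", text.lower()))
    return [word for word, _ in counts.most_common(k)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_with_embeddings(seed_words, vectors, threshold):
    """Append every word whose vector is close to at least one seed word."""
    expanded = list(seed_words)
    seen = set(seed_words)
    for word, vec in vectors.items():
        if word in seen:
            continue
        if any(cosine(vec, vectors[s]) >= threshold
               for s in seed_words if s in vectors):
            expanded.append(word)
            seen.add(word)
    return expanded


# Toy corpus and toy vectors, invented purely for illustration.
corpus = [
    "Ang bata ay masaya.",
    "Ang aso ay mabait.",
    "Ang bata at ang aso ay magkaibigan.",
]
toy_vectors = {
    "ang": [1.0, 0.0],
    "ay": [0.8, 0.6],
    "at": [0.9, 0.1],
    "bata": [0.0, 1.0],
}

seeds = top_frequent_words(corpus, 2)
sight_words = expand_with_embeddings(seeds, toy_vectors, 0.95)
print(seeds)
print(sight_words)
```

In this sketch the frequency step supplies the seed list and the embedding step only adds candidates. The "semi-automatic" in the title suggests that some manual review is also involved, which the sketch does not model.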

Original language: English
Title of host publication: Knowledge Management and Acquisition for Intelligent Systems - 17th Pacific Rim Knowledge Acquisition Workshop, PKAW 2020, Proceedings
Editors: Hiroshi Uehara, Takayasu Yamaguchi, Quan Bai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 168-177
Number of pages: 10
ISBN (Print): 9783030698850
DOIs
Publication status: Published - 20 Feb 2021
Event: 17th Pacific Rim Knowledge Acquisition Workshop, PKAW 2020 held in conjunction with the International Joint Conference on Artificial Intelligence - Pacific Rim International Conference on Artificial Intelligence, IJCAI-PRICAI 2020 - Yokohama, Japan
Duration: 7 Jan 2021 to 8 Jan 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12280 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Pacific Rim Knowledge Acquisition Workshop, PKAW 2020 held in conjunction with the International Joint Conference on Artificial Intelligence - Pacific Rim International Conference on Artificial Intelligence, IJCAI-PRICAI 2020
Country/Territory: Japan
City: Yokohama
Period: 7/01/21 to 8/01/21

Keywords

  • Filipino text
  • High-frequency words
  • Text readability

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
