TY - GEN
T1 - WikAnalytics
T2 - 5th International Conference on Machine Learning and Machine Intelligence, MLMI 2022
AU - Ramos, Jomari Valmadrid
AU - Ballesta, John Michael
AU - Lee Yam, Andrew Kobe
AU - Mogol, Moises Kairon
AU - Rodriguez, Ramon
AU - Imperial, Joseph Marvin
PY - 2022/9/23
Y1 - 2022/9/23
N2 - Reading is one of the first skills humans learn in order to improve their comprehension, vocabulary, and imagination. Determining what to read at one's level can be difficult, but being forced to understand a text above that level is harder still and can result in not learning at all. The Philippines has two official languages, English and Filipino, and Filipinos tend to engage with text written in both. However, not all individuals are bilingual; some focus on only one language, especially those who have not had the privilege of studying in school. Bilingual text can therefore be challenging to understand in context. A system that can recognize and analyze text in English, Filipino, or Taglish while reporting the readability index of the given text could help readers choose material at the readability level they want. This paper presents Wikanalytics, a web-based text analysis tool. The application aims to analyze text files and calculate their readability index. The Agile Software Development Method was used to develop the web application. The system handles text documents written in English, Filipino, or Taglish. The features extracted from an analyzed text are Readability Index, Traditional Features, Lexical Features, Syllable Patterns, Sentences and Paragraphs, and Token Ratios. The accuracy of the tool in identifying the language is 97.5% for English, 96% for Filipino, and 96.89% for Taglish. Runtime was tested on English, Filipino, and Taglish files, and the Taglish file had the longest runtime of the three. Having two different languages in one file affected the runtime of the application's Lexical and Token Ratio features.
AB - Reading is one of the first skills humans learn in order to improve their comprehension, vocabulary, and imagination. Determining what to read at one's level can be difficult, but being forced to understand a text above that level is harder still and can result in not learning at all. The Philippines has two official languages, English and Filipino, and Filipinos tend to engage with text written in both. However, not all individuals are bilingual; some focus on only one language, especially those who have not had the privilege of studying in school. Bilingual text can therefore be challenging to understand in context. A system that can recognize and analyze text in English, Filipino, or Taglish while reporting the readability index of the given text could help readers choose material at the readability level they want. This paper presents Wikanalytics, a web-based text analysis tool. The application aims to analyze text files and calculate their readability index. The Agile Software Development Method was used to develop the web application. The system handles text documents written in English, Filipino, or Taglish. The features extracted from an analyzed text are Readability Index, Traditional Features, Lexical Features, Syllable Patterns, Sentences and Paragraphs, and Token Ratios. The accuracy of the tool in identifying the language is 97.5% for English, 96% for Filipino, and 96.89% for Taglish. Runtime was tested on English, Filipino, and Taglish files, and the Taglish file had the longest runtime of the three. Having two different languages in one file affected the runtime of the application's Lexical and Token Ratio features.
KW - Agile Software Development Cycle
KW - English Language
KW - Filipino Language
KW - Linguistic Properties
KW - Taglish Language
KW - Text Analysis Software
UR - http://www.scopus.com/inward/record.url?scp=85149974509&partnerID=8YFLogxK
U2 - 10.1145/3568199.3568229
DO - 10.1145/3568199.3568229
M3 - Chapter in a published conference proceeding
AN - SCOPUS:85149974509
T3 - ACM International Conference Proceeding Series
SP - 190
EP - 198
BT - Proceedings of MLMI 2022 - 2022 5th International Conference on Machine Learning and Machine Intelligence
PB - Association for Computing Machinery
CY - U.S.A.
Y2 - 23 September 2022 through 25 September 2022
ER -