TY - GEN
T1 - An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework
AU - Shardlow, Matthew
AU - Alva-Manchego, Fernando
AU - Batista-Navarro, Riza
AU - Bott, Stefan
AU - Ramirez, Saul Calderon
AU - Cardon, Rémi
AU - François, Thomas
AU - Hayakawa, Akio
AU - Horbach, Andrea
AU - Hülsing, Anna
AU - Ide, Yusuke
AU - Imperial, Joseph Marvin
AU - Nohejl, Adam
AU - North, Kai
AU - Occhipinti, Laura
AU - Rojas, Nelson Peréz
AU - Raihan, Nishat
AU - Ranasinghe, Tharindu
AU - Salazar, Martin Solis
AU - Zampieri, Marcos
AU - Saggion, Horacio
PY - 2024/5/20
Y1 - 2024/5/20
AB - We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. The dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol, to support the contribution of future datasets, and (2) present summary statistics on the data gathered so far. Multilingual lexical simplification can help low-ability readers engage with otherwise difficult texts in their native, often low-resourced, languages.
KW - lexical complexity prediction
KW - lexical simplification
KW - MultiLS
UR - http://www.scopus.com/inward/record.url?scp=85195174979&partnerID=8YFLogxK
M3 - Chapter in a published conference proceeding
AN - SCOPUS:85195174979
T3 - 3rd Workshop on Tools and Resources for People with REAding DIfficulties, READI 2024 at LREC-COLING 2024 - Workshop Proceedings
SP - 38
EP - 46
BT - 3rd Workshop on Tools and Resources for People with REAding DIfficulties, READI 2024 at LREC-COLING 2024 - Workshop Proceedings
A2 - Wilkens, Rodrigo
A2 - Cardon, Rémi
A2 - Todirascu, Amalia
A2 - Gala, Nuria
PB - European Language Resources Association (ELRA)
T2 - 3rd Workshop on Tools and Resources for People with REAding DIfficulties, READI 2024
Y2 - 20 May 2024
ER -