The Sense Complexity Dataset (SeCoDa) provides a corpus that is annotated jointly for complexity and word senses. It is thus a valuable resource for both word sense disambiguation and complex word identification. The intention is that this dataset will be used to identify complexity at the level of word senses rather than word tokens. For word sense annotation, SeCoDa uses a hierarchical scheme based on information available in the Cambridge Advanced Learner’s Dictionary. This way, we can offer more coarse-grained senses than those directly available in WordNet.
|Publication status||Published - 11 May 2021|
|Event||LREC 2020: Proceedings of the 12th Conference on Language Resources and Evaluation - Virtual, Marseille, France|
|Duration||11 May 2020 → 16 May 2020|
|Abbreviated title||LREC 2020|