Abstract
Dependency parsing is the task of analyzing the syntax of a sentence as a directed graph of binary relations between words. State-of-the-art parsers exist for many languages and are used as a basis for solving more complex problems. However, achieving high accuracy with a dependency parsing model requires large annotated treebanks, which take significant time and labor to build. For languages with few or no annotated treebanks, several approaches have been studied that induce a dependency parser from the treebanks of high-resource languages. In this paper, we propose an approach to building a cross-lingual model that parses Vietnamese as a low-resource target language. The model uses English as a supporting high-resource source language to induce a Vietnamese parser. To reduce the syntactic and lexical differences between English and Vietnamese during training, the approach uses a filtering algorithm that selects English sentences whose syntax is similar to that of Vietnamese sentences, based on Euclidean distance. The results show that the proposed model significantly improves accuracy compared with models trained only on supervised monolingual treebanks.
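The abstract does not specify how sentences are represented before the Euclidean distance is computed. As a minimal sketch of the general idea only, the example below assumes each sentence is summarized by the relative frequency of its part-of-speech tags and keeps the English sentences that lie within a threshold of the centroid of the Vietnamese sentences. The tag set, the feature vector, the centroid comparison, the threshold, and all function names are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

# Hypothetical shared tag inventory (a subset of Universal POS tags).
TAGS = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "ADP", "DET"]

def tag_vector(tags):
    """Relative frequency of each tag in TAGS for one sentence (assumed feature)."""
    counts = Counter(tags)
    total = len(tags) or 1  # avoid division by zero on empty input
    return [counts[t] / total for t in TAGS]

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def filter_source(source, target, threshold):
    """Keep source sentences within `threshold` (Euclidean) of the target centroid."""
    center = centroid([tag_vector(s) for s in target])
    return [s for s in source if math.dist(tag_vector(s), center) <= threshold]

# Toy data: each sentence is given as its sequence of POS tags.
vietnamese = [["NOUN", "VERB", "NOUN", "ADJ"], ["PRON", "VERB", "NOUN"]]
english = [
    ["DET", "ADJ", "NOUN", "VERB", "ADP", "DET", "NOUN"],
    ["PRON", "VERB", "NOUN", "ADJ"],
]

selected = filter_source(english, vietnamese, threshold=0.3)
```

With this toy data, only the second English sentence falls inside the threshold. In practice the threshold would have to be tuned, since a tighter value keeps fewer but more Vietnamese-like training sentences.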
Original language | English |
---|---|
Title of host publication | Artificial Intelligence in Data and Big Data Processing |
Subtitle of host publication | Proceedings of ICABDE 2021 |
Editors | Ngoc Hoang Thanh Dang, Yu-Dong Zhang, João Manuel R. S. Tavares, Bo-Hao Chen |
Place of Publication | Cham, Switzerland |
Publisher | Springer |
Pages | 97-108 |
Number of pages | 12 |
ISBN (Electronic) | 9783030976101 |
ISBN (Print) | 9783030976095 |
Publication status | Published - 19 May 2022 |
Publication series
Name | Lecture Notes on Data Engineering and Communications Technologies |
---|---|
Volume | 124 |
ISSN (Print) | 2367-4512 |
ISSN (Electronic) | 2367-4520 |
Bibliographical note
Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Cross-lingual method
- Deep biaffine attention
- Dependency parsing
- Low-resource language
- Transfer learning
ASJC Scopus subject areas
- Information Systems
- Media Technology
- Computer Science Applications
- Computer Networks and Communications
- Electrical and Electronic Engineering