Adapting Cross-lingual Model To Improve Vietnamese Dependency Parsing

Anh Duc Do Tran, Dien Dinh, An-Vinh Luong, Thao Do

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

Dependency parsing is the task of analyzing the syntax of a sentence as a directed graph of binary relations. For many languages, state-of-the-art models already exist for this task and serve as a basis for solving more complex problems. However, achieving high accuracy with a dependency parsing model requires significant time and labor to build large annotated treebanks. For languages with few or no annotated treebanks, several approaches have been studied that induce a dependency parser from the treebanks of high-resource languages. In this paper, we propose an approach to building a cross-lingual model that parses Vietnamese as a low-resource target language. The model uses English as a supporting high-resource source language to induce a Vietnamese parser. To reduce the syntactic and lexical differences between English and Vietnamese during training, the approach uses a filtering algorithm that selects English sentences whose syntax is similar to that of Vietnamese sentences, based on Euclidean distance. The results show that the proposed model significantly improves accuracy compared with models trained only on supervised monolingual treebanks.
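
The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sentences are turned into vectors or how the cutoff is chosen, so everything below is an assumption made for illustration only: each sentence is represented by the relative frequency of a few part-of-speech tags, the Vietnamese data is summarized by its centroid, and an English sentence is kept when its Euclidean distance to that centroid is at most a hypothetical `threshold`. The names `pos_profile`, `centroid`, and `filter_source` are not taken from the paper.

```python
import math
from collections import Counter

# Assumption: a small, hypothetical set of POS tags used as syntactic features.
TAGS = ["NOUN", "VERB", "ADJ", "ADP", "DET"]


def pos_profile(tags):
    """Map a POS-tag sequence to a vector of relative tag frequencies."""
    counts = Counter(tags)
    total = len(tags) or 1  # avoid division by zero for empty sentences
    return [counts[t] / total for t in TAGS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def filter_source(source, target, threshold):
    """Keep source sentences whose profile is within `threshold` of the target centroid."""
    center = centroid([pos_profile(s) for s in target])
    return [s for s in source if euclidean(pos_profile(s), center) <= threshold]


# Toy data: each sentence is given only as its POS-tag sequence.
vietnamese = [["NOUN", "VERB", "NOUN"], ["NOUN", "VERB", "ADJ"]]
english = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["NOUN", "VERB", "NOUN"],
    ["ADP", "DET", "ADJ", "NOUN"],
]

kept = filter_source(english, vietnamese, threshold=0.3)
print(kept)  # [['NOUN', 'VERB', 'NOUN']]
```

In this toy run only the second English sentence is kept, because its tag distribution lies closest to the Vietnamese centroid. A real system would presumably use richer syntactic features and a tuned cutoff.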

Original language: English
Title of host publication: Artificial Intelligence in Data and Big Data Processing
Subtitle of host publication: Proceedings of ICABDE 2021
Editors: Ngoc Hoang Thanh Dang, Yu-Dong Zhang, João Manuel R. S. Tavares, Bo-Hao Chen
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 97-108
Number of pages: 12
ISBN (Electronic): 9783030976101
ISBN (Print): 9783030976095
Publication status: Published - 19 May 2022

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Volume: 124
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

Keywords

  • Cross-lingual method
  • Deep biaffine attention
  • Dependency parsing
  • Low-resource language
  • Transfer learning

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
