Translating sign language videos to talking faces

Seshadri Mazumder, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

Communication with the deaf community relies heavily on the interpretation of sign languages performed by signers. In light of recent breakthroughs in sign language translation, we propose a pipeline that we term "Translating Sign Language Videos to Talking Faces". In this context, we improve existing sign language translation systems by using POS tags to improve language modeling. We further extend the challenge to develop a system that can translate a video of a signer into an avatar speaking a spoken language. We focus on translation systems that attempt to translate sign languages to text without glosses, an expensive form of annotation. We critically analyze two state-of-the-art architectures and, based on their limitations, improve the systems. We propose a two-stage approach that translates sign language into intermediate text, followed by a language model that produces the final predictions. Quantitative evaluations on the challenging RWTH-PHOENIX-Weather 2014 T benchmark show that the translation accuracy of the text generated by our translation model improves on state-of-the-art models by approximately 3 points. We then build a working text-to-talking-face generation pipeline by bringing together multiple existing modules. The overall pipeline is capable of generating talking face videos with speech from sign language poses. Additional materials about this project, including the code and a demo video, can be found at https://seshadri-c.github.io/SLV2TF/

Original language: English
Title of host publication: Proceedings of ICVGIP 2021 - 12th Indian Conference on Computer Vision, Graphics and Image Processing
Place of Publication: U. S. A.
Publisher: Association for Computing Machinery
Pages: 1-10
ISBN (Electronic): 9781450391276
Publication status: Published - 19 Dec 2021
Event: 12th Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP 2021 - Virtual, Online, India
Duration: 20 Dec 2021 to 22 Dec 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 12th Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP 2021
Country/Territory: India
City: Virtual, Online
Period: 20/12/21 to 22/12/21

Keywords

  • POS tagging
  • Sign language
  • Sign language recognition
  • Sign language to text
  • Sign language translation

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
