Cross-language Speech Dependent Lip-synchronization

Abhishek Jha, Vikram Voleti, Vinay Namboodiri, C. V. Jawahar

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

5 Citations (SciVal)
67 Downloads (Pure)

Abstract

Understanding videos of people speaking across international borders is hard, as audiences from different demographics do not understand the language. Such speech videos are often supplemented with subtitles, but these hamper the viewing experience because the viewer's attention is divided. Simple audio dubbing into a different language makes the video appear unnatural due to unsynchronized lip motion. In this paper, we propose a system for automated cross-language lip synchronization of re-dubbed videos. Our model generates photorealistic lip synchronization on the original video that is superior to the current re-dubbing method. Through a user study, we verify that our method is preferred over unsynchronized videos.

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: IEEE
Pages: 7140-7144
Number of pages: 5
ISBN (Electronic): 9781479981311
DOIs
Publication status: Published - May 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 – 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 – 17/05/19

Keywords

  • Lip-synchronization
  • visual-dubbing

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
