Personalized One-Shot Lipreading for an ALS Patient

Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

Research output: Contribution to conference › Paper › peer-review

2 Citations (SciVal)

Abstract

Lipreading, or visually recognizing speech from a speaker's mouth movements, is a challenging and mentally taxing task. Unfortunately, multiple medical conditions force people to depend on this skill in their day-to-day lives for essential communication. Patients suffering from amyotrophic lateral sclerosis (ALS) often lose muscle control and, consequently, their ability to generate speech, so they communicate via lip movements. Existing large datasets do not focus on medical patients or curate personalized vocabulary relevant to an individual. Collecting a large-scale dataset from a single patient, as needed to train modern data-hungry deep learning models, is however extremely challenging. In this work, we propose a personalized network to lipread an ALS patient using only one-shot examples. We depend on synthetically generated lip movements to augment the one-shot scenario. A Variational Encoder based domain adaptation technique is used to bridge the real-synthetic domain gap. Our approach achieves a top-5 accuracy of 83.2% for the patient, compared to 62.6% achieved by comparable methods. Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment who rely extensively on lip movements to communicate.

Original language: English
Publication status: Published - 25 Nov 2021
Event: 32nd British Machine Vision Conference, BMVC 2021 - Virtual, Online
Duration: 22 Nov 2021 → 25 Nov 2021

Conference

Conference: 32nd British Machine Vision Conference, BMVC 2021
City: Virtual, Online
Period: 22/11/21 → 25/11/21

Bibliographical note

Publisher Copyright:
© 2021. The copyright of this document resides with its authors.

Funding

We would like to thank Anuraag Mullick and Vibha Mullick for their continual support in curating the dataset needed for this work. We also thank Harini Bhatt, who is the founder of ASRM Systems, for connecting us with several participants.

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
