Robust explanations for visual question answering

Badri N. Patro, Shivansh Patel, Vinay P. Namboodiri

Research output: Chapter or section in a book/report/conference proceedingChapter in a published conference proceeding

16 Citations (SciVal)
73 Downloads (Pure)

Abstract

In this paper, we propose a method to obtain robust explanations for visual question answering(VQA) that correlate well with the answers. Our model explains the answers obtained through a VQA model by providing visual and textual explanations. The main challenges that we address are i) Answers and textual explanations obtained by current methods are not well correlated and ii) Current methods for visual explanation do not focus on the right location for explaining the answer. We address both these challenges by using a collaborative correlated module which ensures that even if we do not train for noise based attacks, the enhanced correlation ensures that the right explanation and answer can be generated. We further show that this also aids in improving the generated visual and textual explanations. The use of the correlated module can be thought of as a robust method to verify if the answer and explanations are coherent. We evaluate this model using VQA-X dataset. We observe that the proposed method yields better textual and visual justification that supports the decision. We showcase the robustness of the model against a noise-based perturbation attack using corresponding visual and textual explanations. A detailed empirical analysis is shown.

Original languageEnglish
Title of host publicationProceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
PublisherIEEE
Pages1566-1575
Number of pages10
ISBN (Electronic)9781728165530
DOIs
Publication statusPublished - 14 May 2020
Event2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 - Snowmass Village, USA United States
Duration: 1 Mar 20205 Mar 2020

Publication series

NameProceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020

Conference

Conference2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
Country/TerritoryUSA United States
CitySnowmass Village
Period1/03/205/03/20

Funding

Assistance through SERB grant CRG/2018/003566 is duly acknowledged.

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition

Fingerprint

Dive into the research topics of 'Robust explanations for visual question answering'. Together they form a unique fingerprint.

Cite this