Bridged variational autoencoders for joint modeling of images and attributes

Ravindra Yadav, Ashish Sardana, Vinay P. Namboodiri, Rajesh M. Hegde

Research output: Chapter or section in a book/report/conference proceeding > Chapter in a published conference proceeding

Abstract

Generative models have recently shown the ability to generate realistic data and to model its distribution accurately. However, jointly modeling an image with the attributes it is labeled with requires learning a cross-modal correspondence between image and attribute data. Although images and their attributes have very different statistical properties, there is an inherent correspondence between them that is challenging to capture. Existing models have aimed to capture this correspondence either through a joint variational autoencoder or through separate encoder networks whose outputs are then concatenated. We present an alternative, a bridged variational autoencoder, which learns the cross-modal correspondence by incorporating cross-modal hallucination losses in the latent space. Compared with existing methods, we find that a bridge connection in the latent space not only yields better generation results but also gives a highly parameter-efficient model, with a 40% reduction in training parameters for a bimodal dataset and a nearly 70% reduction for a trimodal dataset. We validate the proposed method through comparison with state-of-the-art methods and benchmarking on standard datasets.
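
The abstract does not give the form of the cross-modal hallucination loss, so the following is only a minimal sketch of one plausible reading: each modality has its own latent code, a small "bridge" map predicts one modality's latent code from the other's, and the loss penalises the squared error of those predictions. The linear bridges, the mean-squared-error penalty, the latent size, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import random

def linear(weights, bias, x):
    """Apply an affine map y = W x + b to a vector given as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

def init_linear(rng, out_dim, in_dim):
    """Small random weights and zero bias (hypothetical initialisation)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def hallucination_loss(z_img, z_attr, bridge_img_to_attr, bridge_attr_to_img):
    """Assumed form of the loss: each bridge 'hallucinates' the other
    modality's latent code, and both prediction errors are summed."""
    z_attr_hat = linear(*bridge_img_to_attr, z_img)   # image latent -> attribute latent
    z_img_hat = linear(*bridge_attr_to_img, z_attr)   # attribute latent -> image latent
    return mse(z_attr_hat, z_attr) + mse(z_img_hat, z_img)

# Stand-in latent codes; in a real model these would come from the two encoders.
rng = random.Random(0)
latent_dim = 4  # hypothetical latent size
z_img = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
z_attr = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
bridge_ia = init_linear(rng, latent_dim, latent_dim)
bridge_ai = init_linear(rng, latent_dim, latent_dim)

loss = hallucination_loss(z_img, z_attr, bridge_ia, bridge_ai)
print(loss)
```

In a full training objective, a term like this would presumably be added to the usual per-modality reconstruction and KL terms of each variational autoencoder; how the terms are weighted is not stated in the abstract.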

Original language: English
Title of host publication: Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
Publisher: IEEE
Pages: 1468-1476
Number of pages: 9
ISBN (Electronic): 9781728165530
Publication status: Published - 14 May 2020
Event: 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 - Snowmass Village, United States
Duration: 1 Mar 2020 to 5 Mar 2020

Publication series

Name: Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020

Conference

Conference: 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
Country/Territory: United States
City: Snowmass Village
Period: 1/03/20 to 5/03/20

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
