Abstract

We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and learns to render this representation realistically. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still generating images of visual quality similar to or higher than that of other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This makes HoloGAN the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.
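
The idea described in the abstract can be sketched in a few lines of code. The following is a minimal, illustrative sketch only, assuming PyTorch; the class and function names (Simple3DGenerator, rotation_y), the layer sizes, and the crude latent-code modulation are our own assumptions, not the authors' implementation. It shows the core mechanism: a learnt constant 3D feature volume is modulated by the latent code z, rigidly rotated about the vertical axis, and then projected and rendered to a 2D image.

import torch
import torch.nn as nn
import torch.nn.functional as F

def rotation_y(angle: torch.Tensor) -> torch.Tensor:
    """Batch of 3x4 affine matrices for a rigid rotation about the vertical axis."""
    c, s = torch.cos(angle), torch.sin(angle)
    zero, one = torch.zeros_like(c), torch.ones_like(c)
    return torch.stack([
        torch.stack([c,    zero, s,    zero], dim=-1),
        torch.stack([zero, one,  zero, zero], dim=-1),
        torch.stack([-s,   zero, c,    zero], dim=-1),
    ], dim=-2)                                    # shape (N, 3, 4)

class Simple3DGenerator(nn.Module):
    def __init__(self, z_dim=128, feat=64, vol=16):
        super().__init__()
        # learnt constant 3D feature volume, shared across the batch
        self.volume = nn.Parameter(torch.randn(1, feat, vol, vol, vol))
        # crude stand-in for HoloGAN's AdaIN-style conditioning on z
        self.style = nn.Linear(z_dim, 2 * feat)
        self.conv3d = nn.Conv3d(feat, feat, 3, padding=1)
        # "projection" step: fold depth into channels, then render with 2D layers
        self.project = nn.Conv2d(feat * vol, 256, 1)
        self.render = nn.Sequential(
            nn.ConvTranspose2d(256, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, azimuth: torch.Tensor) -> torch.Tensor:
        n = z.shape[0]
        # identity (shape and appearance) enters through z-dependent modulation
        scale, shift = self.style(z).chunk(2, dim=1)
        v = self.volume.expand(n, -1, -1, -1, -1)
        v = v * (1 + scale.view(n, -1, 1, 1, 1)) + shift.view(n, -1, 1, 1, 1)
        # pose enters through a rigid-body transform of the 3D features:
        # build a sampling grid from the rotation and resample the volume
        grid = F.affine_grid(rotation_y(azimuth), v.shape, align_corners=False)
        v = F.grid_sample(v, grid, align_corners=False)
        v = F.relu(self.conv3d(v))
        # project 3D features to 2D and render the final image
        n, c, d, h, w = v.shape
        img = self.project(v.reshape(n, c * d, h, w))
        return self.render(F.relu(img))

# usage: the same latent code rendered under two poses gives the same object
# seen from two viewpoints
g = Simple3DGenerator()
z = torch.randn(1, 128).expand(2, -1)             # one identity, repeated twice
images = g(z, azimuth=torch.tensor([0.0, 1.0]))   # two different poses
print(images.shape)                               # torch.Size([2, 3, 64, 64])

Note that the rigid-body transform is applied to the feature volume itself, before any rendering, which is what gives the generator explicit pose control without pose labels.
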
Original language: English
Pages: 7588-7597
Number of pages: 10
Publication status: Published - 27 Oct 2019
Event: International Conference on Computer Vision 2019
Duration: 27 Oct 2019 - 2 Nov 2019

Conference

Conference: International Conference on Computer Vision 2019
Period: 27/10/19 - 2/11/19

Cite this

Nguyen Phuoc, T., Li, C., Theis, L., Richardt, C., & Yang, Y. (2019). HoloGAN: Unsupervised Learning of 3D Representations From Natural Images. Paper presented at the International Conference on Computer Vision 2019, pp. 7588-7597.

@conference{ddf6fd4500964d4f81ee83ead348dbfb,
title = "HoloGAN: Unsupervised Learning of 3D Representations From Natural Images",
abstract = "We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world, and to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. Particularly, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.",
author = "{Nguyen Phuoc}, Thu and Chuan Li and Lucas Theis and Christian Richardt and Yongliang Yang",
year = "2019",
month = "10",
day = "27",
language = "English",
pages = "7588--7597",
note = "International Conference on Computer Vision 2019 ; Conference date: 27-10-2019 Through 02-11-2019",

}
