pi-GAN: Periodic Implicit Generative Adversarial Networks

for 3D-Aware Image Synthesis

CVPR 2021 (Oral)

Eric Chan*, Marco Monteiro*,
Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein,
Stanford University

Compressed Paper · arXiv Paper · Code · Data

We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. Existing approaches however fall short in two ways: first, they may lack an underlying 3D representation or rely on view-inconsistent rendering, hence synthesizing images that are not multi-view consistent; second, they often depend upon representation network architectures that are not expressive enough, and their results thus lack in image quality. We propose a novel generative model, named Periodic Implicit Generative Adversarial Networks (π-GAN or pi-GAN), for high-quality 3D-aware image synthesis. π-GAN leverages neural representations with periodic activation functions and volumetric rendering to represent scenes as view-consistent 3D representations with fine detail. The proposed approach obtains state-of-the-art results for 3D-aware image synthesis with multiple real and synthetic datasets.
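The periodic activations mentioned above can be sketched as follows. This is a hypothetical illustration (not the authors' released code), assuming a SIREN-style layer whose sine activation is modulated by per-channel frequencies `gamma` and phase shifts `beta` produced by a mapping network:

```python
import numpy as np

# Illustrative FiLM-conditioned SIREN layer: sin(gamma * (Wx + b) + beta).
# `gamma` and `beta` stand in for the output of a mapping network that
# conditions the implicit representation on a latent code.

def siren_layer(x, W, b, gamma, beta):
    """x: (N, d_in) points; W: (d_in, d_out); b, gamma, beta: (d_out,)."""
    return np.sin(gamma * (x @ W + b) + beta)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(4, 3))      # 4 sample points in 3D
W = rng.normal(scale=1 / 3, size=(3, 8))
b = np.zeros(8)
gamma = rng.uniform(15, 30, size=8)      # high frequencies, SIREN-style
beta = rng.normal(size=8)
h = siren_layer(x, W, b, gamma, beta)    # (4, 8) features in [-1, 1]
```

Because the sine is bounded and infinitely differentiable, stacking such layers lets the network represent fine high-frequency detail in the scene.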


π-GAN leverages recent advances in generative visual models and neural rendering to produce high-quality, multi-view-consistent images.

π-GAN achieves state-of-the-art results on 3D-aware image synthesis on CelebA, Cats, and CARLA at 128 x 128.

Interpreting the 3D Representation

π-GAN relies on an underlying multi-view-consistent 3D representation. We can interpret this representation as a mesh by running marching cubes on the density output of the conditioned radiance field.
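As a rough sketch of that extraction step (illustrative only, with a toy density in place of the trained radiance field): evaluate the density on a regular 3D grid, then hand the grid to a marching-cubes routine.

```python
import numpy as np

# Evaluate density sigma(x) on a regular grid, the input marching cubes
# needs. `toy_density` is a placeholder for the conditioned radiance
# field: high density inside a sphere of radius 0.5, near zero outside.

def toy_density(pts):
    return np.exp(-10.0 * np.maximum(np.linalg.norm(pts, axis=-1) - 0.5, 0))

res = 32
lin = np.linspace(-1, 1, res)
grid = np.stack(np.meshgrid(lin, lin, lin, indexing="ij"), axis=-1)
sigma = toy_density(grid.reshape(-1, 3)).reshape(res, res, res)

# A mesh would then be extracted from the level set, e.g. with
#   verts, faces, _, _ = skimage.measure.marching_cubes(sigma, level=0.5)
```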

Inverse Rendering and Novel View Synthesis

Using a trained π-GAN generator, we can perform single-view reconstruction and novel-view synthesis. After freezing the parameters of our implicit representation, we optimize for the conditioning parameters that produce a radiance field which, when rendered, best matches the target image.
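The optimization above can be sketched in a few lines. This is a minimal stand-in, not the paper's pipeline: a frozen linear map `G` plays the role of the π-GAN renderer, and we run gradient descent on the conditioning code `z` so that `G(z)` matches the target.

```python
import numpy as np

# Inversion sketch: freeze the generator, optimize the conditioning code.
rng = np.random.default_rng(1)
G = rng.normal(size=(16, 4))      # frozen "generator" (toy stand-in)
z_true = rng.normal(size=4)
target = G @ z_true               # "image" rendered from the true code

z = np.zeros(4)                   # initial guess for the code
lr = 0.01
for _ in range(500):
    residual = G @ z - target
    grad = G.T @ residual         # gradient of 0.5 * ||G z - target||^2
    z -= lr * grad

loss = 0.5 * np.sum((G @ z - target) ** 2)
```

Once the recovered code reproduces the target view, rendering the same radiance field from new camera poses gives novel-view synthesis for free.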

π-GAN Extrapolates to Unseen Camera Poses

The underlying 3D structural representation makes π-GAN more capable of rendering views absent from the training distribution of camera poses than previous methods that lacked 3D representations or relied on black-box neural rendering. π-GAN offers explicit control over position, rotation, focal length, and other camera parameters. Despite training only on closely cropped images of Cats, π-GAN can render images at much higher or much lower magnification.
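The explicit camera control described above amounts to generating the rays that are volume-rendered. A hedged sketch (assumed pinhole-camera conventions, not the paper's exact code) shows how the focal length parameter changes magnification:

```python
import numpy as np

# Pinhole-camera ray directions for volume rendering. A longer focal
# length narrows the field of view (zoom in); a shorter one widens it.

def camera_rays(height, width, focal):
    """Return unit ray directions (H, W, 3) in the camera frame."""
    i, j = np.meshgrid(np.arange(width), np.arange(height), indexing="xy")
    dirs = np.stack([(i - width / 2) / focal,
                     -(j - height / 2) / focal,
                     -np.ones_like(i, dtype=float)], axis=-1)
    return dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)

wide = camera_rays(8, 8, focal=4.0)    # short focal length: wide view
tele = camera_rays(8, 8, focal=16.0)   # long focal length: zoomed view
```

Because the scene itself is a 3D radiance field, any such camera can be rendered, including poses and magnifications never seen during training.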



@inproceedings{chanmonteiro2020pi-GAN,
  author    = {Chan, Eric and Monteiro, Marco and Kellnhofer, Petr and Wu, Jiajun and Wetzstein, Gordon},
  title     = {pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis},
  booktitle = {arXiv},
  year      = {2020}
}

We would like to offer special thanks to Matthew Chan for fruitful discussions and assistance in completing this work. We'd like to thank Stanford HAI for the AWS Cloud Credits. Gordon Wetzstein was supported by an NSF CAREER Award (IIS 1553333), a Sloan Fellowship, and a PECASE from the ARO.