dc.contributor.author
Shahbazi, Mohamad
dc.contributor.supervisor
Van Gool, Luc
dc.contributor.supervisor
Zhu, Jun-Yan
dc.contributor.supervisor
Aila, Timo
dc.contributor.supervisor
Khoreva, Anna
dc.contributor.supervisor
Paudel, Danda Pani
dc.date.accessioned
2024-09-02T06:58:38Z
dc.date.available
2024-09-01T08:51:23Z
dc.date.available
2024-09-02T06:19:36Z
dc.date.available
2024-09-02T06:58:38Z
dc.date.issued
2024
dc.identifier.uri
http://hdl.handle.net/20.500.11850/691713
dc.identifier.doi
10.3929/ethz-b-000691713
dc.description.abstract
Recent advancements in generative modeling have transformed visual content creation, showing tremendous promise in several applications in Computer Vision and Graphics. However, the adoption of generative models in everyday tasks is hindered by challenges in the controllability of the generation process, data requirements, and computational demands. This thesis focuses on addressing such real-world constraints in 2D and 3D generative models. Firstly, we focus on improving the data efficiency of class-conditional Generative Adversarial Networks (GANs) using transfer learning. We introduce a new class-specific transfer learning method, called cGANTransfer, which explicitly propagates knowledge from old classes to new ones based on their relevance. Through extensive evaluation, we demonstrate the superiority of the proposed approach over previous methods for conditional GAN transfer. Secondly, we investigate the training of class-conditional GANs with small datasets. In particular, we identify conditioning collapse in GANs, a form of mode collapse caused by conditional GAN training on small data. We propose a training strategy based on transitional conditioning that effectively prevents the observed mode collapse by additionally leveraging unconditional learning. The proposed method results not only in stable training but also in high-quality generated images, thanks to the exploitation of shared information across classes in the early stages of training. Thirdly, we tackle the computational efficiency of NeRF-GANs, a class of 3D-aware generative models based on the integration of Neural Radiance Fields (NeRFs) and GANs, trained on single-view image datasets. Specifically, we revisit pose-conditioned 2D GANs for efficient 3D-aware generation at inference time by distilling 3D knowledge from pretrained NeRF-GANs.
We propose a simple and effective method for efficient inference of 3D-aware GANs, based on reusing the well-disentangled latent space of a pretrained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations. Lastly, we address the novel task of object generation in 3D scenes without the need for any 3D supervision or 3D placement guidance from users. We introduce InseRF, a novel method for generative object insertion in the NeRF reconstructions of 3D scenes. Based on a user-provided textual description and only a 2D bounding box in a reference viewpoint, InseRF is capable of controllable and 3D-consistent object insertion in 3D scenes without requiring explicit 3D information as input.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
Computer Vision
en_US
dc.subject
Generative AI
en_US
dc.title
2D and 3D Generative Models under Real-World Constraints
en_US
dc.type
Doctoral Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2024-09-02
ethz.size
153 p.
en_US
ethz.code.ddc
DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science
en_US
ethz.identifier.diss
30227
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02652 - Institut für Bildverarbeitung / Computer Vision Laboratory::03514 - Van Gool, Luc (emeritus) / Van Gool, Luc (emeritus)
en_US
ethz.date.deposited
2024-09-01T08:51:23Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2024-09-02T06:58:40Z
ethz.rosetta.lastUpdated
2024-09-02T06:58:40Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=2D%20and%203D%20Generative%20Models%20under%20Real-World%20Constraints&rft.date=2024&rft.au=Shahbazi,%20Mohamad&rft.genre=unknown&rft.btitle=2D%20and%203D%20Generative%20Models%20under%20Real-World%20Constraints