Diffusion Guided Domain Adaptation of Image Generators

AI-generated keywords: Domain Adaptation, Text-to-Image Diffusion Models, Classifier-Free Guidance, 3D-Aware Style-Based Generators, DreamBooth Guidance

AI-generated Key Points

  • The paper proposes a method for adapting a GAN generator to a new domain using text-to-image diffusion models as training objectives.
  • Classifier-free guidance is used as a critic to enable generators to distill knowledge from large-scale text-to-image diffusion models, allowing them to efficiently shift into new domains indicated by text prompts without access to ground truth samples.
  • The authors demonstrate the effectiveness and controllability of their method through extensive experiments, achieving high CLIP scores and significantly lower FID than prior work on short prompts, and outperforming the baseline qualitatively and quantitatively on long and complicated prompts.
  • The proposed method incorporates large-scale pre-trained diffusion models and distillation sampling for text-driven image generator domain adaptation, reaching a quality that was previously out of reach.
  • The authors extend their work to 3D-aware style-based generators and DreamBooth guidance.
  • Performance gains increase quickly as the text prompts grow longer, with the method generating images with much higher visual quality and fidelity in these experiments.
  • Quantitative comparisons show that the models achieve significantly better FIDs than the baseline, competitive CLIP scores with better LPIPS scores, and capture all key constraints mentioned in long text prompts more effectively than the baseline.
  • Overall, this work presents an innovative approach for adapting image generators to new domains using large-scale pre-trained diffusion models and distillation sampling guided by textual input.

Authors: Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal

Project website: https://styleganfusion.github.io/
License: CC BY 4.0

Abstract: Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain? In this paper, we show that classifier-free guidance can be leveraged as a critic, enabling generators to distill knowledge from large-scale text-to-image diffusion models. Generators can be efficiently shifted into new domains indicated by text prompts without access to ground-truth samples from the target domains. We demonstrate the effectiveness and controllability of our method through extensive experiments. Although not trained to minimize CLIP loss, our model achieves equally high CLIP scores and significantly lower FID than prior work on short prompts, and outperforms the baseline qualitatively and quantitatively on long and complicated prompts. To the best of our knowledge, the proposed method is the first attempt at incorporating large-scale pre-trained diffusion models and distillation sampling for text-driven image generator domain adaptation, and it achieves a quality that was previously out of reach. Moreover, we extend our work to 3D-aware style-based generators and DreamBooth guidance.

Submitted to arXiv on 08 Dec. 2022


Results of the summarizing process for the arXiv paper: 2212.04473v1

This paper proposes a method for adapting a GAN generator to a new domain using text-to-image diffusion models as training objectives. The approach leverages classifier-free guidance as a critic so that generators can distill knowledge from large-scale text-to-image diffusion models and shift efficiently into new domains indicated by text prompts, without access to ground-truth samples. The authors demonstrate the effectiveness and controllability of their method through extensive experiments, achieving high CLIP scores and significantly lower FID than prior work on short prompts, and outperforming the baseline qualitatively and quantitatively on long and complicated prompts. The proposed method incorporates large-scale pre-trained diffusion models and distillation sampling for text-driven image generator domain adaptation, reaching a quality that was previously out of reach. Additionally, the authors extend their work to 3D-aware style-based generators and DreamBooth guidance. Performance gains increase quickly as the text prompts grow longer, with the method generating images of much higher visual quality and fidelity in these experiments. Quantitative comparisons show that the models achieve significantly better FIDs than the baseline and competitive CLIP scores with better LPIPS scores, and that they capture the key constraints mentioned in long text prompts more effectively than the baseline. Overall, this work presents an innovative approach for adapting image generators to new domains using large-scale pre-trained diffusion models and distillation sampling guided by textual input, with improved visual fidelity compared to prior work.
Created on 03 May. 2023
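
The summary above describes classifier-free guidance acting as a critic for the generator. The following is a minimal sketch of how such a signal is commonly formed in score-distillation-style training, not the authors' implementation. The function names, the weighting 1 - alpha_bar_t, and the toy noise predictor are all assumptions made for illustration; a real setup would use a pre-trained text-to-image diffusion model and backpropagate the returned gradient through the generator.

```python
import numpy as np


def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and text-conditional noise predictions."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


def distillation_gradient(image, eps_model, prompt, alpha_bar_t, guidance_scale, rng):
    """Gradient of a score-distillation-style loss with respect to the image.

    eps_model(noisy, prompt) is a hypothetical noise predictor; passing
    prompt=None requests the unconditional prediction.
    """
    # Forward diffusion: add Gaussian noise at the chosen noise level.
    noise = rng.standard_normal(image.shape)
    noisy = np.sqrt(alpha_bar_t) * image + np.sqrt(1.0 - alpha_bar_t) * noise
    # Guided noise prediction from the (frozen) diffusion model.
    eps_hat = classifier_free_guidance(
        eps_model(noisy, None), eps_model(noisy, prompt), guidance_scale
    )
    # Assumed weighting; other schedules are possible.
    weight = 1.0 - alpha_bar_t
    return weight * (eps_hat - noise)


def make_toy_eps_model(target, alpha_bar_t):
    """Toy stand-in for a diffusion model (illustration only).

    With a prompt it assumes the clean image is `target`; without a prompt
    it assumes the clean image is all zeros.
    """
    def eps_model(noisy, prompt):
        clean_guess = target if prompt is not None else np.zeros_like(target)
        return (noisy - np.sqrt(alpha_bar_t) * clean_guess) / np.sqrt(1.0 - alpha_bar_t)
    return eps_model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.ones((4, 4))   # stands in for "what the prompt describes"
    image = np.zeros((4, 4))   # stands in for the generator output
    eps_model = make_toy_eps_model(target, alpha_bar_t=0.5)
    for _ in range(200):
        grad = distillation_gradient(image, eps_model, "a prompt", 0.5, 1.0, rng)
        image = image - 0.1 * grad  # plain gradient descent on the image
    print(float(np.abs(image - target).max()))
```

In this toy setting the gradient points from the text-conditioned target toward the current image, so descending it pulls the image toward the target. In generator adaptation the same gradient would instead be passed back into the generator's weights, which is what lets training proceed without samples from the target domain.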
