Sparse-to-Sparse Training of Diffusion Models

AI-generated keywords: Diffusion models, Sparse-to-sparse training, Efficiency, Performance, Computational overhead

AI-generated Key Points

  • Authors introduce sparse-to-sparse training for diffusion models (DMs) to improve efficiency in training and inference.
  • Sparse DMs can match or outperform dense counterparts while reducing trainable parameters and FLOPs.
  • Three different methods (Static-DM, RigL-DM, MagRan-DM) are used to train sparse DMs from scratch on six datasets.
  • Experimental results show the effectiveness of sparse-to-sparse training in DMs across various sparsity rates and dataset sizes.
  • Safe and effective values for implementing sparse-to-sparse training in DMs are identified.
  • Experiments with larger datasets like ImageNet-1k demonstrate the scalability of the proposed approach.
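
To make the idea of training a sparse model from scratch concrete, here is a minimal sketch of a static sparse mask in plain Python. It is an illustration only, not the authors' Static-DM implementation: the helper names (make_static_mask, masked_sgd_step), the uniform random choice of pruned positions, and the toy SGD update are all assumptions made for this example. The point is that the mask is drawn once before training and never changes, so pruned weights stay at zero throughout.

```python
import random

def make_static_mask(num_weights, sparsity, seed=0):
    """Return a fixed 0/1 mask with round(sparsity * num_weights) zeros.

    Assumption: pruned positions are chosen uniformly at random.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    num_pruned = round(sparsity * num_weights)
    rng = random.Random(seed)
    pruned = set(rng.sample(range(num_weights), num_pruned))
    return [0 if i in pruned else 1 for i in range(num_weights)]

def masked_sgd_step(weights, grads, mask, lr=0.1):
    """Toy SGD update; multiplying by the mask keeps pruned weights at zero."""
    return [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]

# Toy layer with 20 weights at sparsity 0.75: only 5 weights are trainable.
mask = make_static_mask(num_weights=20, sparsity=0.75)
weights = [0.5 * m for m in mask]
weights = masked_sgd_step(weights, grads=[1.0] * 20, mask=mask)
active = sum(mask)
```

With a sparsity rate of 0.75, three quarters of the positions are masked out, which is where the reduction in trainable parameters comes from in this toy setting.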

Authors: Inês Cardoso Oliveira, Decebal Constantin Mocanu, Luis A. Leiva

License: CC BY 4.0

Abstract: Diffusion models (DMs) are a powerful type of generative models that have achieved state-of-the-art results in various image synthesis tasks and have shown potential in other domains, such as natural language processing and temporal data modeling. Despite their stable training dynamics and ability to produce diverse high-quality samples, DMs are notorious for requiring significant computational resources, both in the training and inference stages. Previous work has focused mostly on increasing the efficiency of model inference. This paper introduces, for the first time, the paradigm of sparse-to-sparse training to DMs, with the aim of improving both training and inference efficiency. We focus on unconditional generation and train sparse DMs from scratch (Latent Diffusion and ChiroDiff) on six datasets using three different methods (Static-DM, RigL-DM, and MagRan-DM) to study the effect of sparsity in model performance. Our experiments show that sparse DMs are able to match and often outperform their Dense counterparts, while substantially reducing the number of trainable parameters and FLOPs. We also identify safe and effective values to perform sparse-to-sparse training of DMs.

Submitted to arXiv on 30 Apr. 2025


Results of the summarizing process for the arXiv paper: 2504.21380v1

In their paper "Sparse-to-Sparse Training of Diffusion Models," authors Inês Cardoso Oliveira, Decebal Constantin Mocanu, and Luis A. Leiva introduce sparse-to-sparse training to diffusion models (DMs), with the aim of improving both training and inference efficiency. DMs have stable training dynamics and produce diverse, high-quality samples; they have achieved state-of-the-art results in image synthesis and have shown potential in natural language processing and temporal data modeling. However, they require significant computational resources in both the training and inference stages, and previous work has focused mostly on inference efficiency.

The authors focus on unconditional generation. They train sparse DMs (Latent Diffusion and ChiroDiff) from scratch on six datasets using three different methods (Static-DM, RigL-DM, and MagRan-DM) to study the effect of sparsity on model performance. Sparsity rates are set to {0.1, 0.25, 0.5, 0.75, 0.9}, and the authors also report the exploration frequency (∆Te), the weight prune and regrowth ratio (p), and the sizes of the training datasets, such as CelebA-HQ and LSUN-Bedrooms. Additional experiments on a larger dataset, ImageNet-1k, assess the scalability of the approach.

The experimental results show that sparse DMs can match and often outperform their dense counterparts while substantially reducing the number of trainable parameters and floating-point operations (FLOPs). The study also identifies safe and effective values for sparse-to-sparse training of DMs. Overall, this research introduces a more efficient training method for diffusion models that achieves comparable or superior performance at lower computational cost.
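
The exploration frequency (∆Te) and prune and regrowth ratio (p) mentioned above belong to dynamic sparse training, where the mask is periodically updated. The sketch below shows one such update in plain Python. It is a simplified illustration, not the paper's code. In the original RigL algorithm, connections are pruned by smallest weight magnitude and regrown where the gradient magnitude is largest; the "random" mode is an assumption about MagRan-DM inferred from its name (magnitude pruning, random regrowth). The function name prune_and_regrow, the flat weight list, and initializing regrown weights to zero in both modes are choices made for this example.

```python
import random

def prune_and_regrow(weights, mask, grads, p, mode="random", rng=None):
    """One mask update: drop a fraction p of active weights, regrow as many.

    mode="gradient": regrow where |grad| is largest (RigL-style).
    mode="random":   regrow at random inactive positions (assumed MagRan-style).
    """
    rng = rng or random.Random(0)
    active = [i for i, m in enumerate(mask) if m == 1]
    inactive = [i for i, m in enumerate(mask) if m == 0]
    k = min(int(p * len(active)), len(inactive))

    # Prune the k active weights with the smallest magnitude.
    to_prune = sorted(active, key=lambda i: abs(weights[i]))[:k]

    # Regrow k connections among positions that were inactive before pruning.
    if mode == "gradient":
        to_grow = sorted(inactive, key=lambda i: abs(grads[i]), reverse=True)[:k]
    elif mode == "random":
        to_grow = rng.sample(inactive, k)
    else:
        raise ValueError("mode must be 'gradient' or 'random'")

    new_weights, new_mask = list(weights), list(mask)
    for i in to_prune:
        new_mask[i], new_weights[i] = 0, 0.0
    for i in to_grow:
        new_mask[i], new_weights[i] = 1, 0.0  # regrown weights start at zero
    return new_weights, new_mask

weights = [0.9, -0.05, 0.0, 0.4, 0.0, 0.0, -0.7, 0.1]
mask = [1, 1, 0, 1, 0, 0, 1, 1]
grads = [0.1, 0.2, 0.05, 0.3, 0.9, 0.01, 0.2, 0.1]

rigl_w, rigl_mask = prune_and_regrow(weights, mask, grads, p=0.5, mode="gradient")
magran_w, magran_mask = prune_and_regrow(weights, mask, grads, p=0.5, mode="random")
```

In a training loop, such an update would typically run once every ∆Te optimizer steps, with ordinary masked gradient updates in between. The number of active weights is the same before and after each update, so the sparsity rate stays fixed throughout training.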
Created on 23 May. 2025


Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.