New Noise Schedule Improves Diffusion Models for Imbalanced Data
Summary
Researchers introduce the Class-frequency Guided (CFRG) noise schedule for diffusion models, which assigns larger-scale noise to low-frequency classes. The method targets two problems that arise on imbalanced datasets: inaccurate score estimation in low-density regions and dominance of high-frequency classes. The authors report that it significantly improves both generation quality and diversity.
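As a rough illustration of the idea, the Python sketch below maps class counts to a per-class noise multiplier and applies it in a standard forward-diffusion step. The scaling rule `(n_max / n_c) ** gamma`, the `gamma` parameter, and the function names `class_noise_scales` and `add_noise` are assumptions made for this example; the summary does not state the paper's actual formula.

```python
import math
import random
from collections import Counter


def class_noise_scales(labels, gamma=0.5):
    """Map each class to a noise multiplier >= 1 that grows as the class gets rarer.

    Hypothetical rule (not taken from the paper): scale_c = (n_max / n_c) ** gamma.
    """
    counts = Counter(labels)
    n_max = max(counts.values())
    return {cls: (n_max / n) ** gamma for cls, n in counts.items()}


def add_noise(x0, alpha_bar_t, scale, rng=random):
    """Forward-diffuse one sample (a list of floats) with class-scaled Gaussian noise.

    Standard form x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    with eps additionally multiplied by the per-class scale (an assumption).
    """
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t) * scale
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


# Toy imbalanced label set: one frequent class, one rare class.
labels = ["cat"] * 900 + ["lynx"] * 100
scales = class_noise_scales(labels)

# Noise a toy two-dimensional "lynx" sample at a mid-schedule step.
noisy = add_noise([0.2, -0.4], alpha_bar_t=0.5, scale=scales["lynx"])
```

With 900 `cat` samples, 100 `lynx` samples, and `gamma=0.5`, the rare class receives a multiplier of about 3 while the frequent class keeps a multiplier of 1. Setting `gamma=0` recovers a class-independent schedule, which makes it a convenient baseline.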
Why it matters
Real-world datasets are often imbalanced. Practitioners building generative models on such data can use this technique to produce higher-quality and more diverse outputs for rare classes, which improves model fairness and utility.
How to implement this in your domain
1. Analyze existing diffusion model training pipelines for performance on imbalanced datasets.
2. Implement the Class-frequency Guided (CFRG) noise schedule in custom diffusion models.
3. Experiment with different noise scaling strategies based on class frequency for specific datasets.
4. Evaluate the impact of CFRG on generation quality, diversity, and fairness metrics for low-frequency classes.
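Steps 1 and 4 both depend on reporting metrics separately for frequent and rare classes. A minimal sketch of that bookkeeping, assuming you already have one quality score per class from your own evaluation code: the function name `per_group_score`, the `tail_ratio` threshold, and the example scores are hypothetical placeholders, not values from the paper.

```python
from collections import Counter
from statistics import mean


def per_group_score(labels, scores, tail_ratio=0.2):
    """Average a per-class score separately over head and tail classes.

    A class counts as "tail" when it has fewer than tail_ratio * n_max samples,
    where n_max is the size of the largest class (an illustrative threshold).
    """
    counts = Counter(labels)
    n_max = max(counts.values())
    groups = {"head": [], "tail": []}
    for cls, n in counts.items():
        group = "tail" if n < tail_ratio * n_max else "head"
        groups[group].append(scores[cls])
    # Report None for a group with no classes instead of raising an error.
    return {g: (mean(v) if v else None) for g, v in groups.items()}


# Toy training labels and placeholder per-class scores (lower is better here).
labels = ["cat"] * 900 + ["dog"] * 800 + ["lynx"] * 50
placeholder_scores = {"cat": 10.0, "dog": 20.0, "lynx": 60.0}

summary = per_group_score(labels, placeholder_scores)
```

Running the same summary before and after changing the noise schedule gives a direct view of whether tail classes improve and whether head classes regress.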
Key takeaways
- Diffusion models struggle with imbalanced datasets, leading to poor generation for rare classes.
- A new CFRG noise schedule assigns larger noise to low-frequency classes.
- This improves score estimation and prevents high-frequency class dominance.
- The authors report significant improvements in image and text-to-image generation on imbalanced data.
Original post by Jiequan Cui, Beier Zhu, Qingshan Xu, Xiaojuan Qi, Bei Yu, Hanwang Zhang
"arXiv:2606.27696v1 Announce Type: new Abstract: In this paper, we are the first to examine the correlations between class frequency and the multi-scale noise schedule within diffusion models. For score-based generative models, low-density regions often lead to inaccurately estima…"