D4M: Dataset Distillation via Disentangled Diffusion Model

CVPR 2024

Duo Su1,5,6,†   Junjie Hou2,5,6,†   Weizhi Gao3   Yingjie Tian4,5,6,7,*   Bowen Tang8

1School of Computer Science and Technology, UCAS,   2Sino-Danish College, UCAS,  
3Department of Computer Science, NCSU,   4School of Economics and Management, UCAS,  
5Research Center on Fictitious Economy and Data Science, CAS,   6Key Laboratory of Big Data Mining and Knowledge Management, CAS,  
7MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation, UCAS,   8Institute of Computing Technology, CAS,  

†Equal contribution   *Corresponding author

Abstract

Dataset distillation offers a lightweight synthetic dataset for fast network training with promising test accuracy. To imitate the performance of the original dataset, most approaches employ bi-level optimization, and the distillation space depends on the matching architecture. Nevertheless, these approaches either incur significant computational costs on large-scale datasets or degrade in performance when evaluated across architectures. We advocate for designing an economical dataset distillation framework that is independent of the matching architectures. Based on empirical observations, we argue that constraining the consistency of the real and synthetic image spaces will enhance cross-architecture generalization. Motivated by this, we introduce Dataset Distillation via Disentangled Diffusion Model (D4M), an efficient framework for dataset distillation. Compared to architecture-dependent methods, D4M employs a latent diffusion model to guarantee consistency and incorporates label information into category prototypes. The distilled datasets are versatile, eliminating the need to repeatedly generate distinct datasets for different architectures. Through comprehensive experiments, D4M demonstrates superior performance and robust generalization, surpassing state-of-the-art methods in most aspects.
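The category-prototype idea described in the abstract can be sketched as follows: encode real images into the latent space of a pretrained latent-diffusion autoencoder, then cluster each class's latents into a small set of prototypes (one per synthetic image) that the diffusion decoder would later turn back into images. This is a minimal illustrative sketch, not the paper's implementation: the function names (`kmeans`, `class_prototypes`) and the `ipc` (images-per-class) parameter are our own, the latents are assumed to be precomputed, and the text-conditioned diffusion decoding step is omitted.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means on the rows of x; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    # initialize centroids from k distinct data points
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign every point to its nearest centroid (squared Euclidean distance)
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            members = x[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def class_prototypes(latents, labels, ipc):
    """Cluster each class's latent codes into `ipc` prototypes.

    latents: (N, d) array of autoencoder latents for the real images
    labels:  (N,) integer class labels
    ipc:     prototypes (synthetic images) per class
    Returns {class_id: (ipc, d) prototype array}.
    """
    return {int(c): kmeans(latents[labels == c], ipc)
            for c in np.unique(labels)}
```

Because the prototypes live in the same latent space the diffusion model was trained on, decoding them yields images consistent with the real image distribution, which is the consistency property the abstract attributes to cross-architecture generalization.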


Visualization results

The top row of each dataset comes from D4M; the bottom comes from SRe2L (ImageNet-1K and Tiny-ImageNet) and MTT (CIFAR-10/100). The images generated by D4M have higher resolution and are more lifelike.


Paper and Supplementary Material


Duo Su, Junjie Hou, Weizhi Gao, Yingjie Tian, Bowen Tang;
Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition (CVPR), 2024, pp. 5809-5818

(hosted on CVPR 2024 open access)

BibTeX

@InProceedings{Su_2024_CVPR,
    author    = {Su, Duo and Hou, Junjie and Gao, Weizhi and Tian, Yingjie and Tang, Bowen},
    title     = {D{\textasciicircum}4M: Dataset Distillation via Disentangled Diffusion Model},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {5809-5818}
}