Diffusion Models have recently gained popularity in the field of image generation, with widely used products such as Stable Diffusion employing this approach and yielding impressive results. GANs are also recognized for their efficiency, though. In what scenarios should I choose GANs over Diffusion Models, and do GANs have any advantages over Diffusion Models in image generation?
Here are a few reasons I can think of:
- Diffusion Models take more time and larger datasets to train.
- Training a Diffusion Model requires substantial computational resources (a lot of GPUs) compared to training a GAN.
- The codebases of some popular Diffusion Model projects are not open source.
I don’t know if these are correct. As for the mathematical aspect, I’m not an expert in that area.
GAN - Great if it works, but you’d better get used to praying, because it’s as difficult to train as reinforcement learning. After all the pain, you end up with either complete garbage or an amazing miracle that’s extremely efficient: sampling is O(1), i.e. a single forward pass per image. Look at GigaGAN. Images are sharper and more detailed, and sometimes almost impossible to tell apart from real ones.
Diffusion - Slow, but it gets high-quality results and is super easy to train. It will probably improve in the future as we get better noise schedulers and other breakthroughs. Sampling is O(n), where n is the number of time steps. Images are smoother, but the quality is still good enough to fool most people.
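To make the O(1) vs O(n) point concrete, here is a rough toy sketch that just counts network forward passes per generated sample. Everything in it is made up for illustration: `CountingNet` is a hypothetical stand-in for a trained network, and the update inside `diffusion_sample` is a placeholder, not a real noise scheduler or any actual model's sampler.

```python
import math
import random


class CountingNet:
    """Hypothetical stand-in for a trained network; it only counts forward passes."""

    def __init__(self, scale):
        self.scale = scale
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return [math.tanh(self.scale * v) for v in x]


def gan_sample(generator, dim, rng):
    # GAN: draw a latent vector and run the generator exactly once.
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return generator(z)


def diffusion_sample(denoiser, dim, num_steps, rng):
    # Diffusion: start from noise and call the denoiser once per time step.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(num_steps):
        predicted_noise = denoiser(x)
        # Toy update (assumption): a real sampler would follow a noise schedule.
        x = [xi - ni / num_steps for xi, ni in zip(x, predicted_noise)]
    return x


rng = random.Random(0)
generator = CountingNet(scale=1.0)
denoiser = CountingNet(scale=0.5)

gan_image = gan_sample(generator, dim=16, rng=rng)
diffusion_image = diffusion_sample(denoiser, dim=16, num_steps=50, rng=rng)

print("GAN forward passes:", generator.calls)
print("Diffusion forward passes:", denoiser.calls)
```

With 50 steps the denoiser is called 50 times for one sample, while the generator is called once. That is the whole speed gap in a nutshell, and it is why cutting the number of time steps matters so much for diffusion.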
Is there a paper on 2-time-step diffusion models?
For sure. OAI also mentioned using a similar process. They probably failed in implementation and will probably copy them as soon as they can.
https://arxiv.org/abs/2210.03142