Community Computer Vision Course documentation

Generative Adversarial Networks


Introduction

Generative Adversarial Networks (GANs) are a class of deep learning models introduced by Ian Goodfellow and his colleagues in 2014. The core idea behind GANs is to train a generator network to produce data that is indistinguishable from real data, while simultaneously training a discriminator network to differentiate between real and generated data.

  • Architecture overview: GANs consist of two main components: the generator and the discriminator.
  • Generator: The generator takes random noise z as input and generates synthetic data samples. Its goal is to create data that is realistic enough to deceive the discriminator.
  • Discriminator: The discriminator, akin to a detective, evaluates whether a given sample is real (from the actual dataset) or fake (generated by the generator). Its objective is to become increasingly accurate in distinguishing between real and generated samples.

A common analogy found online is that of an art forger (the generator) who tries to produce convincing fake paintings, and an art critic or investigator (the discriminator) who tries to detect the forgeries.

Figure: GAN architecture overview (from Lilian Weng's blog on GANs)
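The two components above can be sketched in PyTorch. This is a minimal illustration, not a production architecture: the latent dimension, layer sizes, and the use of simple MLPs on flattened 28x28 images are all assumptions made for brevity.

```python
import torch
import torch.nn as nn

LATENT_DIM = 100    # size of the random noise vector z (assumed)
DATA_DIM = 28 * 28  # flattened image size (assumed)

class Generator(nn.Module):
    """Maps a noise vector z to a synthetic data sample."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, DATA_DIM),
            nn.Tanh(),  # outputs in [-1, 1], matching images normalized to that range
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Scores a sample with the probability that it is real."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(DATA_DIM, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability in (0, 1)
        )

    def forward(self, x):
        return self.net(x)

# The generator turns noise into fake samples; the discriminator scores them.
z = torch.randn(16, LATENT_DIM)
fake = Generator()(z)          # shape: (16, 784)
score = Discriminator()(fake)  # shape: (16, 1)
```

In practice, convolutional architectures (as in DCGAN) replace these MLPs for image data, but the generator-in, discriminator-out structure is the same.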

GANs vs VAEs

GANs and VAEs are both popular generative models in machine learning, but they have different strengths and weaknesses. Whether one is “better” depends on the specific task and requirements. Here’s a breakdown of their strengths and weaknesses.

  • Image Generation:
    • GANs:
      • Strengths: Generate higher quality images, especially for complex data with sharp details and realistic textures.
      • Weaknesses: Can be more difficult to train and prone to instability.
      • Example: A GAN-generated image of a bedroom is likely to be indistinguishable from a real one, while a VAE-generated bedroom might appear blurry or have unrealistic lighting (see the GAN-generated bedrooms in Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015).
    • VAEs:
      • Strengths: Easier to train and more stable than GANs.
      • Weaknesses: May generate blurry, less detailed images with unrealistic features.
  • Other Tasks:
    • GANs:
      • Strengths: Can be used for tasks like super-resolution and image-to-image translation.
      • Weaknesses: May not be the best choice for tasks that require a smooth transition between data points.
    • VAEs:
      • Strengths: Widely used for tasks like image denoising and anomaly detection.
      • Weaknesses: May not be as effective as GANs for tasks that require high-quality image generation.

Here’s a table summarizing the key differences:

| Feature | GANs | VAEs |
| --- | --- | --- |
| Image quality | Higher | Lower |
| Ease of training | More difficult | Easier |
| Stability | Less stable | More stable |
| Applications | Image generation, super-resolution, image-to-image translation | Image denoising, anomaly detection, signal analysis |

Ultimately, the best choice depends on one’s specific needs and priorities. If one needs high-quality images for tasks like generating realistic faces or landscapes, then a GAN might be the better choice. However, if one needs a model that is easier to train and more stable, then a VAE might be a better option.

Training GANs

Training GANs involves a unique adversarial process where the generator and discriminator play a cat-and-mouse game.

  • Adversarial Training Process: The generator and discriminator are trained simultaneously. The generator aims to produce data that is indistinguishable from real data, while the discriminator strives to improve its ability to differentiate between real and fake samples.
  • Objective Function: The training process is guided by a min-max objective that optimizes both the generator and the discriminator. The generator aims to minimize the probability of the discriminator correctly classifying generated samples as fake, while the discriminator seeks to maximize this probability. This objective function is represented as:

    $$\min_G \max_D L(D, G) = \mathbb{E}_{x \sim p_{r}(x)} [\log D(x)] + \mathbb{E}_{x \sim p_g(x)} [\log(1 - D(x))]$$

    Here, the discriminator tries to maximize this loss function whereas the generator tries to minimize it, hence the adversarial nature.
  • Iterative Improvement: As training progresses, the generator becomes adept at producing realistic samples, and the discriminator becomes more discerning. This adversarial loop continues until the generator generates data that is virtually indistinguishable from real data.
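The adversarial loop above can be sketched as alternating gradient steps in PyTorch. This is a simplified illustration, assuming small MLP networks and a dummy data batch; the generator step uses the common "non-saturating" variant (maximize log D(G(z)) rather than minimize log(1 - D(G(z)))), which follows the same min-max logic but gives stronger gradients early in training.

```python
import torch
import torch.nn as nn

LATENT_DIM, DATA_DIM = 100, 784  # assumed sizes for illustration

G = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU(),
                  nn.Linear(256, DATA_DIM), nn.Tanh())
D = nn.Sequential(nn.Linear(DATA_DIM, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def training_step(real_batch):
    batch_size = real_batch.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # Discriminator step: maximize log D(x) + log(1 - D(G(z))),
    # i.e. minimize the BCE of real samples vs. 1 and fake samples vs. 0.
    z = torch.randn(batch_size, LATENT_DIM)
    fake = G(z).detach()  # detach: no generator gradients on this step
    d_loss = bce(D(real_batch), real_labels) + bce(D(fake), fake_labels)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: label fresh fakes as "real" so that fooling the
    # discriminator drives the loss down (only opt_g updates its parameters).
    z = torch.randn(batch_size, LATENT_DIM)
    g_loss = bce(D(G(z)), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# One adversarial step on a dummy batch standing in for real data:
d_l, g_l = training_step(torch.randn(8, DATA_DIM))
```

In a real training run this step is repeated over many epochs of real data, with the two losses tracked to monitor the balance between generator and discriminator.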

References:

  1. Lilian Weng’s Awesome Blog on GANs
  2. GAN — What is Generative Adversarial Networks
  3. What are the fundamental differences between VAE and GAN for image generation?
  4. Issues with GAN and VAE models
  5. VAE Vs. GAN For Image Generation
  6. Diffusion Models vs. GANs vs. VAEs: Comparison of Deep Generative Models