Load up your GANs, bring your friends

it's fun to lose and to pretend

Posted by Marianne Linhares on September 13, 2017

Load up your GANs (Generative Adversarial Networks)

GANs (Generative Adversarial Networks) are currently a very hot Deep Learning topic. This week I took some time to learn more about them, and here you can find some of my discoveries.

My plan was to find out:

  1. What are GANs?
  2. Why are GANs such a big thing?
  3. What are the next steps? What is not solved yet?
  4. What are the differences between the types of GANs?
  5. Quick Personal Project

After learning more about GANs, I planned to start a small personal project that should be easy to explain (so I can talk about it to my friends) and should train fast, since I don’t have any GPUs available (if it takes a while to train, it will not be a fun demo).

PS: Feel free to contribute to this blog post. Create an issue on this repo if there’s something wrong or something you think can be improved. Thank you!

PPS: To be clear, the goal of this blog post is not to teach GANs, but only to summarize what I’ve found about them and highlight what I think is most relevant.

At the end of this post you’ll find all the references used.

1. What is a GAN?

A GAN is a generative model in which two neural networks play a minimax game: one network, called the discriminator, tries to tell whether a sample is real or fake (generated by a model), while the other network, called the generator, tries to deceive the discriminator.

The idea is that we can train these networks together so that, if the discriminator is very good, the generator also needs to be very good in order to deceive it. At the end of the game the generator will be so good that the discriminator is reduced to guessing (50% probability) whether samples are real or fake.
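Formally, the original GAN paper (Goodfellow et al., 2014) writes this as a two-player minimax game over a value function, where D(x) is the discriminator’s probability that x is real and G(z) is the generator’s sample for noise z:

```latex
\min_G \max_D V(D, G) =
    \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```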

This is a simple GAN architecture using Fashion-MNIST as an example.

Check out the GANs introduction by Ian Goodfellow.

2. Why are GANs such a big thing?

  • There are a lot of applications for GANs:

From the OpenAI blog:

“This may by itself find use in multiple applications, such as on-demand generated art, or Photoshop++ commands such as “make my smile wider”. Additional presently known applications include image denoising, inpainting, super-resolution, structured prediction, exploration in reinforcement learning, and neural network pretraining in cases where labeled data is expensive.”

  • Yann LeCun (Director of AI Research at Facebook) says:

“Why is that so interesting? It allows us to train a discriminator as an unsupervised “density estimator”, i.e. a contrast function that gives us a low value for data and higher output for everything else. This discriminator has to develop a good internal representation of the data to solve this problem properly. It can then be used as a feature extractor for a classifier, for example.

But perhaps more interestingly, the generator can be seen as parameterizing the complicated surface of real data: give it a vector Z, and it maps it to a point on the data manifold. There are papers where people do amazing things with this, like generating pictures of bedrooms, doing arithmetic on faces in the Z vector space: [man with glasses] - [man without glasses] + [woman without glasses] = [woman with glasses].”
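That last example is just vector arithmetic in the latent space. Here’s a minimal sketch of the idea, assuming a trained generator `G` and sets of latent vectors grouped by attribute; the function name, arguments, and the averaging step are my own illustration (in the spirit of the DCGAN paper), not code from any of those papers:

```python
import numpy as np

def face_arithmetic(G, zs_man_glasses, zs_man_plain, zs_woman_plain):
    """Latent-space arithmetic: [man w/ glasses] - [man] + [woman].

    Each `zs_*` argument is an array of latent vectors whose decoded
    samples share an attribute; averaging smooths out per-sample noise.
    """
    z = (np.mean(zs_man_glasses, axis=0)
         - np.mean(zs_man_plain, axis=0)
         + np.mean(zs_woman_plain, axis=0))
    return G(z)  # decode back to image space: ideally, a woman with glasses
```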

3. What are the next steps? What is not solved yet?

In summary: lots of things. Some are:

  • In Ian’s talk he shows some generated samples that illustrate issues when generating images: problems with counting, perspective, and global structure.

  • Convergence problems: GAN training is unstable, and the minimax game may not converge.

  • Generated images don’t have great resolution yet.

  • This is a new field, so researchers are still trying to find out what works and what doesn’t; more about this in the next section.

4. What are the differences between the types of GANs?

I really liked this blog post about the topic.

5. Quick Personal Project

You can check the code here.

I refactored and made some improvements to a very simple implementation available here by @wiseodd, who has a nice repo with a lot of GAN implementations and a great blog post about this vanilla implementation.
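For intuition, here is a minimal sketch of what such a vanilla training step looks like (PyTorch here purely for brevity; the layer sizes and hyperparameters are illustrative assumptions of mine, so see the linked repo for the real code):

```python
import torch
import torch.nn as nn

Z_DIM, IMG_DIM, H = 100, 28 * 28, 128  # latent size, flattened image, hidden units

G = nn.Sequential(nn.Linear(Z_DIM, H), nn.ReLU(),
                  nn.Linear(H, IMG_DIM), nn.Sigmoid())   # fake image in [0, 1]
D = nn.Sequential(nn.Linear(IMG_DIM, H), nn.LeakyReLU(0.2),
                  nn.Linear(H, 1), nn.Sigmoid())         # P(input is real)

bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

def train_step(real_images):  # real_images: (batch, 784) tensor in [0, 1]
    batch = real_images.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1) Discriminator step: push real samples toward 1, generated toward 0.
    fake_images = G(torch.randn(batch, Z_DIM)).detach()  # detach: freeze G here
    d_loss = bce(D(real_images), ones) + bce(D(fake_images), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Generator step: try to make D label fresh fakes as real.
    g_loss = bce(D(G(torch.randn(batch, Z_DIM))), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```

Note that the generator step maximizes log D(G(z)) instead of minimizing log(1 - D(G(z))); this “non-saturating” variant from the original paper gives stronger gradients early in training, and it’s what most vanilla implementations use.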

Here’s a diagram of the network architecture:

And an example of the samples generated while training on the Fashion-MNIST dataset!

References

References can be found here.

Hope this helps somehow, have a nice day!