Projects of Saleh Sargolzaei - DCGAN for Face Generation

In this project, I defined and trained a DCGAN on a dataset of faces. The goal is to get a generator network to generate new images of faces that look as realistic as possible!

Get the Data

I have used the CelebFaces Attributes Dataset (CelebA) to train the adversarial networks.

Pre-processed Data

Each of the CelebA images has been cropped to remove parts of the image that don't include a face, then resized down to 64x64x3 NumPy images. Some sample data is shown below.

You can download this data by clicking here. This is a zip file that you'll need to extract in the home directory of this notebook for further loading and processing. After extracting the data, you should be left with a data directory named processed_celeba_small/.
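The extraction step can also be done from Python instead of by hand. The following is a minimal sketch using only the standard library. The function name extract_dataset and the archive name in the usage comment are hypothetical, since the actual download file name is not given here. Only the resulting processed_celeba_small/ directory comes from the description above.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, target_dir="."):
    """Extract the dataset archive into target_dir and return the data folder."""
    target = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # The write-up states that extraction leaves this directory behind.
    data_dir = target / "processed_celeba_small"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"expected {data_dir} after extraction")
    return data_dir


# Usage (archive name is an assumption, adjust to the downloaded file):
# data_dir = extract_dataset("processed_celeba_small.zip")
```

Checking for the directory right after extraction makes a wrong archive or a wrong working directory fail early, before any data loading code runs.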

Network Architecture

The architecture used for the generator and the discriminator was inspired by the original DCGAN paper:
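For orientation, the generator in the DCGAN paper projects a 100-dimensional noise vector to a 4x4x1024 feature map and then doubles the spatial size with four stride-2 transposed convolutions until it reaches 64x64x3. The short sketch below only traces those shapes. The kernel size of 4 and padding of 1 are assumptions borrowed from common implementations, not values confirmed for this project, and the function names are illustrative.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution (no dilation or output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def generator_shapes(start_size=4, channels=(1024, 512, 256, 128, 3)):
    """Return (height, width, channels) after each generator stage."""
    shapes = [(start_size, start_size, channels[0])]
    size = start_size
    for ch in channels[1:]:
        # Each stride-2 stage doubles the height and width.
        size = conv_transpose_out(size)
        shapes.append((size, size, ch))
    return shapes


for shape in generator_shapes():
    print(shape)
```

Any kernel and padding combination that doubles the feature map at stride 2 would give the same progression. The layers actually used in this project may differ in depth or channel counts.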

I have also used the same hyperparameters mentioned in this paper.
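For reference, the training settings reported in the DCGAN paper can be collected in one place. This is a sketch of those published values only. The class and field names are hypothetical, and Adam's second momentum term is left out because the paper does not state it (most frameworks default to 0.999).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DCGANHyperparams:
    """Settings reported in the DCGAN paper (field names are illustrative)."""
    batch_size: int = 128            # mini-batch size for SGD
    latent_dim: int = 100            # length of the noise vector z
    learning_rate: float = 0.0002    # Adam learning rate
    adam_beta1: float = 0.5          # Adam momentum term, lowered from 0.9
    leaky_relu_slope: float = 0.2    # negative slope of LeakyReLU
    weight_init_std: float = 0.02    # std of zero-centered normal weight init


params = DCGANHyperparams()
print(params)
```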

The loss functions were inspired by the LSGAN paper.

The binary cross-entropy loss function may lead to vanishing gradients during training. To mitigate this problem, I used a least-squares loss function for the discriminator. This setup is referred to as a least-squares GAN, or LSGAN, and is described in the original LSGAN paper. The authors show that LSGANs are able to generate higher-quality images than regular GANs and that this loss is somewhat more stable during training.

Finally, the last layer of the discriminator was inspired by PatchGAN. A PatchGAN discriminator has fewer parameters and runs faster than a full-image discriminator, and it classifies patches of an image, rather than the whole image at once, as real or fake. PatchGAN is described in the paper Image-to-Image Translation with Conditional Adversarial Networks.
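To make the least-squares objective concrete, here is a minimal, framework-agnostic sketch in plain Python. It assumes the 0/1 label coding from the LSGAN paper (fake target 0, real target 1) and raw discriminator outputs without a sigmoid. The function names are illustrative and this is not the project's actual training code. With a PatchGAN-style output, the score lists would simply hold one value per patch.

```python
def mean_squared_error(scores, target):
    """Average squared distance between each score and the target label."""
    return sum((s - target) ** 2 for s in scores) / len(scores)


def discriminator_loss(real_scores, fake_scores):
    """Push scores for real images toward 1 and scores for fakes toward 0."""
    real_term = 0.5 * mean_squared_error(real_scores, 1.0)
    fake_term = 0.5 * mean_squared_error(fake_scores, 0.0)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Push the discriminator's scores for generated images toward 1."""
    return 0.5 * mean_squared_error(fake_scores, 1.0)


# A discriminator that is confidently right gives the generator a large loss.
print(discriminator_loss([1.0, 1.0], [0.0, 0.0]))
print(generator_loss([0.0, 0.0]))
```

Because the penalty grows quadratically with the distance from the target, generated samples that the discriminator scores far from 1 still produce a usable gradient, which is the motivation for preferring this loss over binary cross-entropy.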

Results

The loss values are shown below:

Some of the faces generated after 25 epochs are shown below. As you can see, they're not the most realistic faces in the world, but I'd argue that they are fantastic, considering that:

I didn't spend much time experimenting with hyperparameters,
The model is not very deep,
The number of epochs is relatively low,
And the initial input is small.