Principles and Implementation of the Deep Convolutional Generative Adversarial Network (DCGAN) (TensorFlow 2)


An intuitive understanding of GAN

Ian Goodfellow, who first proposed the GAN, introduced the model with a vivid metaphor: the generator network G tries to produce realistic counterfeit money in order to fool the discriminator D, while D learns to tell the two apart by studying both real money and the counterfeit money produced by G. The two networks are trained against each other in a game until the counterfeit money produced by G is hard for D to distinguish from real money. DCGAN replaces the fully connected layers of the original GAN with convolution and transposed convolution (deconvolution) operations.

DCGAN network structure

A GAN contains a generator network (G) and a discriminator network (D). G is used to learn the real distribution of the data, and D is used to distinguish the data generated by G from real samples.

The generator network $G(z)$ samples a latent variable $z \sim p_z(\cdot)$ from the prior distribution $p_z(\cdot)$ and, through the distribution $p_g(x \mid z)$ that it learns, produces generated samples $x \sim p_g(x \mid z)$. The prior $p_z(\cdot)$ can be assumed to be a known, simple distribution.

The discriminator network $D(x)$ is a binary classifier. Its inputs are real samples $x_r \sim p_r(\cdot)$ drawn from the real data distribution and fake samples $x_f \sim p_g(x \mid z)$ produced by the generator; $x_r$ and $x_f$ together form the training set of the discriminator. Real samples $x_r$ are labeled 1, and samples $x_f$ produced by the generator are labeled 0. The discriminator is optimized by minimizing the error between its predictions and these labels.

GAN training objectives

The goal of the discriminator network is to distinguish real samples $x_r$ from fake samples $x_f$. Its objective is therefore to minimize the cross-entropy loss between its predictions and the true labels:

$$\underset{\theta}{\min}\; \mathcal{L} = CE(D_\theta(x_r), y_r, D_\theta(x_f), y_f)$$

where CE denotes the cross-entropy loss function:

$$\mathcal{L} = -\sum_{x_r \sim p_r(\cdot)} \log D_\theta(x_r) - \sum_{x_f \sim p_g(\cdot)} \log\left(1 - D_\theta(x_f)\right)$$
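To make the discriminator loss concrete, here is a small worked example in plain Python. The probabilities are made-up values chosen only for illustration, and `discriminator_loss` is a hypothetical helper, not part of the implementation later in this article.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Summed cross-entropy: -sum(log D(x_r)) - sum(log(1 - D(x_f)))."""
    real_term = -sum(math.log(p) for p in d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs (probabilities of "real").
# A discriminator that separates the two classes well gets a low loss ...
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# ... while one that outputs 0.5 everywhere pays log(2) per sample.
confused = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"confident discriminator: {confident:.4f}")
print(f"confused discriminator:  {confused:.4f}")
```

The loss falls as D pushes its outputs toward 1 on real samples and toward 0 on generated ones, which is exactly what the minimization asks for.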

The optimization objective of the discriminator network D is:

$$\theta^* = \underset{\theta}{\arg\min}\; -\sum_{x_r \sim p_r(\cdot)} \log D_\theta(x_r) - \sum_{x_f \sim p_g(\cdot)} \log\left(1 - D_\theta(x_f)\right)$$

Converting the problem $\min \mathcal{L}$ into $\max -\mathcal{L}$ and writing it in expectation form:

$$\theta^* = \underset{\theta}{\arg\max}\; \mathbb{E}_{x_r \sim p_r(\cdot)} \log D_\theta(x_r) + \mathbb{E}_{x_f \sim p_g(\cdot)} \log\left(1 - D_\theta(x_f)\right)$$

For the generator network $G(z)$, the aim is for the generated data to fool the discriminator D: the closer the discriminator's output for a fake sample $x_f$ is to the real label, the better. In other words, when training the generator we want the discriminator output $D(G(z))$ to be as close to 1 as possible, so we minimize the cross-entropy loss between $D(G(z))$ and 1:

$$\underset{\phi}{\min}\; \mathcal{L} = CE(D(G_\phi(z)), 1) = -\log D(G_\phi(z))$$

Converting the problem $\min \mathcal{L}$ into $\max -\mathcal{L}$ and writing it in expectation form:

$$\phi^* = \underset{\phi}{\arg\max}\; \mathbb{E}_{z \sim p_z(\cdot)} \log D(G_\phi(z))$$

The same goal of driving $D(G_\phi(z))$ toward 1 is also commonly written as a minimization:

$$\phi^* = \underset{\phi}{\arg\min}\; \mathcal{L} = \mathbb{E}_{z \sim p_z(\cdot)} \log\left[1 - D(G_\phi(z))\right]$$

where $\phi$ denotes the parameters of the generator network G.
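The two ways of writing the generator objective, minimizing $-\log D(G(z))$ and minimizing $\log(1 - D(G(z)))$, both push $D(G(z))$ toward 1, but they give the generator different gradients. The sketch below compares the derivative of each with respect to the discriminator logit $a$, assuming $D = \mathrm{sigmoid}(a)$. The function names are hypothetical and the logit values are chosen only for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def grad_neg_log_d(a):
    """d/da of -log(sigmoid(a)), which simplifies to -(1 - sigmoid(a))."""
    return -(1.0 - sigmoid(a))


def grad_log_one_minus_d(a):
    """d/da of log(1 - sigmoid(a)), which simplifies to -sigmoid(a)."""
    return -sigmoid(a)


# A very negative logit means the discriminator confidently rejects the fake,
# which is the typical situation early in training.
for logit in (-4.0, 0.0, 4.0):
    print(
        f"logit={logit:+.1f}  D={sigmoid(logit):.3f}  "
        f"grad[-log D]={grad_neg_log_d(logit):+.3f}  "
        f"grad[log(1-D)]={grad_log_one_minus_d(logit):+.3f}"
    )
```

When the discriminator easily rejects generated samples, the $\log(1 - D)$ form yields a gradient close to zero, while the $-\log D$ form still gives a strong signal. This is one reason the cross-entropy-against-1 form is a common choice in practice.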

During training, the discriminator and the generator are updated alternately.

DCGAN implementation

The CIFAR-10 training set is used as the GAN training set to implement DCGAN.

Data loading

Load the CIFAR-10 training set and preprocess the data:

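import tensorflow as tf
from tensorflow import keras

# Batch size
batch_size = 64
# Only the training images are needed; labels and the test split are discarded
(train_x, _), _ = keras.datasets.cifar10.load_data()
# Scale pixel values from [0, 255] to [-1, 1] and cast to float32
train_x = (train_x / (255. / 2) - 1).astype('float32')
print(train_x.shape)
dataset = tf.data.Dataset.from_tensor_slices(train_x)
dataset = dataset.shuffle(1000)
dataset = dataset.batch(batch_size=batch_size, drop_remainder=True)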
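The normalization `x / (255. / 2) - 1` maps pixel values from [0, 255] to [-1, 1], the same range as the `tanh` output of the generator. The following plain-Python sketch checks the mapping and shows the inverse, which is needed when generated images are converted back to pixels for display. The helper names `normalize` and `denormalize` are hypothetical.

```python
def normalize(pixel):
    """Map a pixel value in [0, 255] to [-1, 1]."""
    return pixel / (255.0 / 2) - 1


def denormalize(value):
    """Map a value in [-1, 1] back to [0, 255]."""
    return (value + 1) * (255.0 / 2)


for pixel in (0, 127.5, 255):
    scaled = normalize(pixel)
    print(f"{pixel:>6} -> {scaled:+.2f} -> {denormalize(scaled):.1f}")
```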

Network

The network consists of a discriminator network and a generator network.

Discriminator network

class Discriminator(keras.Model):
    def __init__(self):
        super(Discriminator, self).__init__()
        filters = 64
        # Conv2D arguments: filters, kernel_size, strides, padding
        self.conv1 = keras.layers.Conv2D(filters, 4, 2, 'valid', use_bias=False)
        self.bn1 = keras.layers.BatchNormalization()
        self.conv2 = keras.layers.Conv2D(filters*2, 4, 2, 'valid', use_bias=False)
        self.bn2 = keras.layers.BatchNormalization()
        self.conv3 = keras.layers.Conv2D(filters*4, 3, 1, 'valid', use_bias=False)
        self.bn3 = keras.layers.BatchNormalization()
        self.conv4 = keras.layers.Conv2D(filters*8, 3, 1, 'valid', use_bias=False)
        self.bn4 = keras.layers.BatchNormalization()
        # Global average pooling
        self.pool = keras.layers.GlobalAveragePooling2D()
        self.flatten = keras.layers.Flatten()
        # A single logit; the sigmoid is applied inside the loss function
        self.fc = keras.layers.Dense(1)

    def call(self, inputs, training=True):
        x = inputs
        x = tf.nn.leaky_relu(self.bn1(self.conv1(x), training=training))
        x = tf.nn.leaky_relu(self.bn2(self.conv2(x), training=training))
        x = tf.nn.leaky_relu(self.bn3(self.conv3(x), training=training))
        x = tf.nn.leaky_relu(self.bn4(self.conv4(x), training=training))
        x = self.pool(x)
        x = self.flatten(x)
        logits = self.fc(x)
        return logits
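To see how the discriminator shrinks a 32x32 CIFAR-10 image, the spatial size after each convolution can be computed by hand. The sketch below uses the standard output-size rule for a `'valid'` convolution, `floor((n - k) / s) + 1`, with the kernel sizes and strides read off the listing above. `conv_out` is a hypothetical helper, not a Keras function.

```python
def conv_out(size, kernel, stride):
    """Output length of a 'valid' (unpadded) convolution along one dimension."""
    return (size - kernel) // stride + 1


# (kernel_size, stride) of conv1..conv4 in the discriminator listing
layers = [(4, 2), (4, 2), (3, 1), (3, 1)]

size = 32  # CIFAR-10 images are 32x32
sizes = []
for kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    sizes.append(size)

print(sizes)
```

Under these assumptions the feature map goes from 32 to 15, 6, 4 and finally 2 pixels per side, after which global average pooling reduces it to a single vector per image.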

Generator network

class Generator(keras.Model):
    def __init__(self):
        super(Generator, self).__init__()
        filters = 64
        # Conv2DTranspose arguments: filters, kernel_size, strides, padding
        self.conv1 = keras.layers.Conv2DTranspose(filters*4, 4, 1, 'valid', use_bias=False)
        self.bn1 = keras.layers.BatchNormalization()
        self.conv2 = keras.layers.Conv2DTranspose(filters*3, 4, 2, 'same', use_bias=False)
        self.bn2 = keras.layers.BatchNormalization()
        self.conv3 = keras.layers.Conv2DTranspose(filters*1, 4, 2, 'same', use_bias=False)
        self.bn3 = keras.layers.BatchNormalization()
        self.conv4 = keras.layers.Conv2DTranspose(3, 4, 2, 'same', use_bias=False)

    def call(self, inputs, training=False):
        x = inputs
        # Reshape the latent vector [b, z_dim] to a 1x1 feature map [b, 1, 1, z_dim]
        x = tf.reshape(x, (-1, 1, 1, x.shape[1]))
        x = tf.nn.relu(x)
        x = tf.nn.relu(self.bn1(self.conv1(x), training=training))
        x = tf.nn.relu(self.bn2(self.conv2(x), training=training))
        x = tf.nn.relu(self.bn3(self.conv3(x), training=training))
        x = self.conv4(x)
        # tanh keeps the output in [-1, 1], matching the normalized training images
        x = tf.tanh(x)
        return x
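The generator grows a 1x1 latent "image" back up to 32x32. The sketch below traces the spatial size through the four transposed convolutions, assuming the usual Keras output-size rules for `Conv2DTranspose` without output padding: `n * s + max(k - s, 0)` for `'valid'` and `n * s` for `'same'`. `deconv_out` is a hypothetical helper, not a Keras function.

```python
def deconv_out(size, kernel, stride, padding):
    """Output length of a transposed convolution along one dimension."""
    if padding == "valid":
        return size * stride + max(kernel - stride, 0)
    if padding == "same":
        return size * stride
    raise ValueError(f"unknown padding: {padding}")


# (kernel_size, stride, padding) of conv1..conv4 in the generator listing
layers = [(4, 1, "valid"), (4, 2, "same"), (4, 2, "same"), (4, 2, "same")]

size = 1  # the latent vector is reshaped to a 1x1 feature map
sizes = []
for kernel, stride, padding in layers:
    size = deconv_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)
```

Under these assumptions the feature map grows from 1 to 4, 8, 16 and finally 32 pixels per side, which matches the size of the CIFAR-10 images fed to the discriminator.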

Network training

During training, the discriminator can be updated several times for each update of the generator.
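As a rough illustration of such a schedule, the sketch below lists the order of updates when the generator is trained once after every `d_steps` discriminator updates. Both `update_schedule` and `d_steps` are hypothetical names; the training loop later in this article performs one update of each network per batch.

```python
def update_schedule(num_batches, d_steps):
    """List the updates: D on every batch, G after every `d_steps` D updates."""
    schedule = []
    for step in range(1, num_batches + 1):
        schedule.append("D")
        if step % d_steps == 0:
            schedule.append("G")
    return schedule


print(update_schedule(num_batches=6, d_steps=3))
print(update_schedule(num_batches=3, d_steps=1))
```

With `d_steps=1` the schedule reduces to strict alternation between the two networks.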

Defining the loss functions

def celoss_ones(logits):
    # Cross-entropy between the logits and the label 1
    y = tf.ones_like(logits)
    loss = keras.losses.binary_crossentropy(y, logits, from_logits=True)
    return tf.reduce_mean(loss)


def celoss_zeros(logits):
    # Cross-entropy between the logits and the label 0
    y = tf.zeros_like(logits)
    loss = keras.losses.binary_crossentropy(y, logits, from_logits=True)
    return tf.reduce_mean(loss)


def d_loss_fn(generator, discriminator, batch_z, batch_x, is_training):
    # Loss function of the discriminator
    # Sample generated images
    fake_image = generator(batch_z, training=is_training)
    # Score the generated images
    d_fake_logits = discriminator(fake_image, training=is_training)
    # Score the real images
    d_real_logits = discriminator(batch_x, training=is_training)
    # Error between the real images and the label 1
    d_loss_real = celoss_ones(d_real_logits)
    # Error between the generated images and the label 0
    d_loss_fake = celoss_zeros(d_fake_logits)
    # Combine the two errors
    loss = d_loss_fake + d_loss_real

    return loss


def g_loss_fn(generator, discriminator, batch_z, is_training):
    # Loss function of the generator
    # Sample generated images
    fake_image = generator(batch_z, training=is_training)
    # When training the generator, the generated images should be judged as real
    d_fake_logits = discriminator(fake_image, training=is_training)
    # Error between the generated images and the label 1
    loss = celoss_ones(d_fake_logits)

    return loss
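Because the discriminator outputs raw logits, the loss is computed with `from_logits=True`, which folds the sigmoid into the cross-entropy. The plain-Python sketch below shows one numerically stable way to do this for a single logit, using the formulation documented for `tf.nn.sigmoid_cross_entropy_with_logits`. It is an illustration of the idea, not a claim about the exact Keras internals, and `bce_with_logits` is a hypothetical name.

```python
import math


def bce_with_logits(logit, label):
    """Binary cross-entropy on a raw logit: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


# For moderate logits the stable form agrees with the textbook definition.
logit = 2.0
print(bce_with_logits(logit, 1.0), -math.log(sigmoid(logit)))
print(bce_with_logits(logit, 0.0), -math.log(1.0 - sigmoid(logit)))

# For extreme logits the textbook form would evaluate log(0);
# the stable form still returns a finite value.
print(bce_with_logits(1000.0, 0.0))
```

With `label=1.0` this corresponds to what `celoss_ones` computes per element, and with `label=0.0` to `celoss_zeros`.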

Instantiating the networks and optimizers

# Define hyperparameters
# Latent dimension
z_dim = 100
# Number of epochs
epochs = 300
# Batch size
batch_size = 64
# Learning rate
lr = 0.0002
is_training = True
# Instantiate the networks
discriminator = Discriminator()
discriminator.build(input_shape=(4, 32, 32, 3))
discriminator.summary()
generator = Generator()
generator.build(input_shape=(4, z_dim))
generator.summary()
# Instantiate the optimizers
g_optimizer = keras.optimizers.Adam(learning_rate=lr, beta_1=0.5)
d_optimizer = keras.optimizers.Adam(learning_rate=lr, beta_1=0.5)

Training

# Record the loss values
d_losses = []
g_losses = []
for epoch in range(epochs):
    for _, batch_x in enumerate(dataset):
        # Sample a batch of latent vectors from a standard normal distribution
        batch_z = tf.random.normal([batch_size, z_dim])
        # Update the discriminator
        with tf.GradientTape() as tape:
            d_loss = d_loss_fn(generator, discriminator, batch_z, batch_x, is_training)
        grads = tape.gradient(d_loss, discriminator.trainable_variables)
        d_optimizer.apply_gradients(zip(grads, discriminator.trainable_variables))
        # Update the generator
        with tf.GradientTape() as tape:
            g_loss = g_loss_fn(generator, discriminator, batch_z, is_training)
        grads = tape.gradient(g_loss, generator.trainable_variables)
        g_optimizer.apply_gradients(zip(grads, generator.trainable_variables))
        d_losses.append(float(d_loss))
        g_losses.append(float(g_loss))
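Per-batch GAN losses tend to be noisy, so loss lists such as `d_losses` and `g_losses` are usually easier to read after smoothing. The sketch below is one simple way to do that with a trailing moving average. `moving_average` is a hypothetical helper, and the loss values are made up for illustration.

```python
def moving_average(values, window):
    """Average each value with up to `window - 1` values that precede it."""
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just left the window
            running_sum -= values[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


# Made-up per-batch loss values, for illustration only
example_losses = [1.4, 0.9, 1.6, 0.8, 1.2, 0.7]
print(moving_average(example_losses, window=3))
```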

Results

The training results can be improved by tuning the hyperparameters.

Results after training for 26 epochs: