• This article was transferred from the WeChat official account: Alchemy of Machine Learning
  • Author: Chen Yixin (exchanges and mutual progress are welcome)
  • Contact: WeChat CYX645016617
  • Analyzing and Improving the Image Quality of StyleGAN

[TOC]

4.1 Path Length Regularization

While generating faces, we also want to control the attributes of the face: different latent codes should produce different faces, and moving the latent code along a chosen direction should change the image in a predictable way, with the step size in latent space corresponding to the magnitude of change in the image. To achieve this, Path Length Regularization is designed; the regularization formula is as follows:


$$L_{pl} = \mathbb{E}_w \, \mathbb{E}_y \left( \left\| J_w^T y \right\|_2 - a \right)^2$$

where $J_w = \partial g(w) / \partial w$ is the Jacobian of the generator output with respect to the latent code $w$, $y$ is a random image with normally distributed pixel intensities, and $a$ is an exponential moving average of the path lengths $\| J_w^T y \|_2$.
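Computing the full Jacobian $J_w$ explicitly would be expensive. Fortunately, the product $J_w^T y$ can be obtained with a single backward pass using the standard identity

$$J_w^T y = \nabla_w \big( g(w) \cdot y \big),$$

where $g(w)$ is the generated image and $\cdot$ denotes the pixel-wise inner product. The implementation below relies on exactly this trick.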

4.2 Code Implementation of the PL Loss

```python
import math
import torch
from torch.autograd import grad as torch_grad

def calc_pl_lengths(styles, images):
    device = images.device
    num_pixels = images.shape[2] * images.shape[3]
    # Random image y with normally distributed pixel intensities,
    # normalized so the expectation is independent of resolution
    pl_noise = torch.randn(images.shape, device=device) / math.sqrt(num_pixels)
    # Scalar g(w) . y; its gradient w.r.t. styles is exactly J_w^T y
    outputs = (images * pl_noise).sum()

    pl_grads = torch_grad(outputs=outputs, inputs=styles,
                          grad_outputs=torch.ones(outputs.shape, device=device),
                          create_graph=True, retain_graph=True,
                          only_inputs=True)[0]

    # Per-sample path length ||J_w^T y||_2: sum of squares over the
    # latent dimension, mean over the style layers, then the square root
    return (pl_grads ** 2).sum(dim=2).mean(dim=1).sqrt()
```
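A minimal usage sketch (the tensor shapes here are assumptions for illustration; `styles` must require gradients for the backward pass to reach it):

```python
# Hypothetical shapes: styles (batch, num_layers, latent_dim),
# images (batch, 3, H, W), with images generated from styles
pl_lengths = calc_pl_lengths(styles, images)  # one path length per sample
```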

Here, self.pl_mean corresponds to the moving average a in the formula.

self.pl_mean is updated as an exponential moving average with momentum 0.99:
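A minimal sketch of the update and the resulting penalty term (assuming self.pl_mean is a scalar buffer; the momentum 0.99 comes from the text above, the rest is illustrative):

```python
pl_lengths = calc_pl_lengths(styles, images)

# Moving average with momentum 0.99 (a in the formula)
self.pl_mean = 0.99 * self.pl_mean + 0.01 * pl_lengths.detach().mean()

# Penalize deviation of the path lengths from the moving average
pl_loss = ((pl_lengths - self.pl_mean) ** 2).mean()
```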

4.3 No Progressive Growing

Training StyleGAN with progressive growing has some disadvantages. As shown below, the teeth do not rotate when the face is turned from left to right:

In other words, some details of the face, such as teeth and eyes, stay relatively fixed and do not follow the rotation of the face. This is caused by training with progressive growing. Progressive growing comes from PGGAN: the low-resolution layers are trained first, then, once training is stable, higher-resolution layers are added, and the resolution keeps increasing in this way. Because each resolution in turn serves as the output resolution during training, the network is pushed to generate excessively strong high-frequency details.

Progressive growing is used because the network required for high-resolution image generation is large and deep, and such deep networks are hard to train. However, skip connections can solve the training problem of deep networks. This leads to the following three skip-connection-based architectures, whose effects StyleGAN2 evaluates experimentally:

On the left, (a) uses skip connections between matching resolutions of the generator and discriminator, similar to U-Net; this is the MSG-GAN design. Figure (b) converts the output at each resolution into a 3-channel RGB image and then combines them through upsampling and downsampling (input/output skips). Figure (c) is similar to residual connections.

As the results show, the residual structure works best for the discriminator, while for the generator either output skips or residual connections perform well.
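To make these two ingredients concrete, here is a minimal PyTorch sketch of an output-skip head for the generator and a residual downsampling block for the discriminator. All module names, channel sizes, and details here are illustrative assumptions, not the paper's exact implementation:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToRGB(nn.Module):
    """Projects a feature map to a 3-channel RGB image (illustrative)."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 3, kernel_size=1)

    def forward(self, x, skip=None):
        rgb = self.conv(x)
        if skip is not None:
            # Output skip: upsample the coarser RGB image and add it,
            # so every resolution contributes to the final output
            rgb = rgb + F.interpolate(skip, scale_factor=2,
                                      mode='bilinear', align_corners=False)
        return rgb

class DiscriminatorResBlock(nn.Module):
    """Residual downsampling block, as in architecture (c)."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, padding=1)
        self.skip = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        residual = F.avg_pool2d(self.skip(x), 2)
        x = F.leaky_relu(self.conv1(x), 0.2)
        x = F.leaky_relu(self.conv2(x), 0.2)
        x = F.avg_pool2d(x, 2)
        # Scale the sum to keep the variance roughly constant
        return (x + residual) / math.sqrt(2)
```

Because every resolution's RGB output is summed into the final image, the network no longer needs to grow progressively: all scales are trained jointly from the start.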