A few words up front:

I finished this project seven months ago, but never got around to publishing the code. Later I found that a team with the same idea had already written it up as a paper, so I lost the motivation to write about it. Last night I noticed that my GitHub project had received some stars; that felt like real support and suddenly gave me the motivation to write, so this article came into being.

A brief introduction to SRGAN

SRGAN, a high-profile super-resolution paper from CVPR 2017, pushed the quality of super-resolution results to a new level, and EDSR, the winner of the NTIRE 2017 super-resolution challenge, is also a variant built on SRGAN.

Figure from the original paper

SRGAN is trained with the GAN method: there is a generator and a discriminator. The body of the discriminator is based on VGG19, while the generator is a chain of residual blocks with a subpixel module appended at the end. Following the idea of Shi et al.'s subpixel network [6], the image resolution is only increased in the last layers of the network, which raises the resolution while reducing the consumption of computing resources. For a detailed introduction, I recommend reading the paper [1] directly; there are also several interpretation articles online. Here I only go over a few of the key ideas.

Figure from the original paper
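To make the subpixel idea concrete, here is a minimal sketch of pixel-shuffle upsampling, assuming TensorFlow 2.x (the project itself is built with TensorLayer, which provides an equivalent subpixel layer): a convolution produces scale² times more channels while still in low-resolution space, and depth_to_space then rearranges those channels into a larger spatial grid.

```python
import tensorflow as tf

def subpixel_upsample(x, scale=2, n_out_channels=64):
    # Convolve in low-resolution space, producing scale**2 times more
    # channels than we want in the output...
    x = tf.keras.layers.Conv2D(n_out_channels * scale ** 2, 3, padding="same")(x)
    # ...then rearrange those channels into a (H*scale, W*scale) grid.
    return tf.nn.depth_to_space(x, block_size=scale)

# Example: a 24x24 feature map becomes 48x48.
lr_feat = tf.random.normal([1, 24, 24, 64])
print(subpixel_upsample(lr_feat, scale=2).shape)  # (1, 48, 48, 64)
```

Because the expensive convolutions all run on the small image, this is much cheaper than first interpolating to high resolution and then convolving.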

Problems with GANs

One problem with a traditional GAN is that there is no clear signal for how to balance the training of the Discriminator and the Generator. If the Discriminator is overtrained, the Generator cannot learn; if it is undertrained, the model also behaves badly. If there were a loss indicator that reflected how training is going, the difficulty of training would be greatly reduced. WGAN [3], proposed in 2017, is an important method for solving this problem.

WGAN uses the Wasserstein distance to measure how different two data distributions are. As long as the model is modified into the WGAN form, the progress of training can be monitored from a single Loss curve. I highly recommend the explanation in "The amazing Wasserstein GAN" [4], which introduces WGAN in very plain language.
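For reference, the Wasserstein-1 distance and the critic loss that WGAN derives from it (standard formulas from the WGAN paper [3]) are:

```latex
% Wasserstein-1 (Earth Mover) distance between the real distribution P_r
% and the generated distribution P_g:
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[ \lVert x - y \rVert \right]

% Kantorovich--Rubinstein dual form, a supremum over 1-Lipschitz functions f:
W(P_r, P_g) = \sup_{\lVert f \rVert_L \le 1}
    \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]

% The WGAN critic loss approximates the negative of this quantity, so
% watching it fall gives the single training indicator described above:
L_D = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)]
```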

Combining SRGAN with WGAN

An excellent reproduction of SRGAN comes from @Dong Hao's GitHub project. Converting it to WGAN requires four changes:

· Remove the sigmoid from the last layer of the discriminator;

· Do not take the log in the losses of the generator and the discriminator;

· After each update, clip the discriminator's parameters so that their absolute values do not exceed a fixed constant c;

· Do not use momentum-based optimization algorithms (including Momentum and Adam); RMSProp is recommended, and SGD also works.

– from "The amazing Wasserstein GAN"
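In code, these four changes amount to something like the following. This is only an illustrative sketch written against the TensorFlow 2.x Keras API with toy stand-in networks, not the repository's actual TensorLayer code, and it shows only the adversarial part of the loss (the real SRGAN generator loss also includes the VGG/MSE content terms):

```python
import tensorflow as tf

# Toy stand-in networks: the real SRGAN generator/discriminator are far
# larger, and the real generator also upsamples its input.
critic = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu",
                           input_shape=(96, 96, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),            # 1. no sigmoid on the last layer
])
generator = tf.keras.Sequential([
    tf.keras.layers.Conv2D(3, 3, padding="same", input_shape=(96, 96, 3)),
])

# 4. momentum-free optimizers (RMSProp instead of Adam)
d_opt = tf.keras.optimizers.RMSprop(1e-4)
g_opt = tf.keras.optimizers.RMSprop(1e-4)
clip_value = 0.01                        # the fixed constant c

def train_step(lr_batch, hr_batch):
    # Critic (discriminator) update; in WGAN it is usually run several
    # times per generator update.
    with tf.GradientTape() as tape:
        fake_hr = generator(lr_batch)
        # 2. no log: plain Wasserstein critic loss
        d_loss = tf.reduce_mean(critic(fake_hr)) - tf.reduce_mean(critic(hr_batch))
    grads = tape.gradient(d_loss, critic.trainable_variables)
    d_opt.apply_gradients(zip(grads, critic.trainable_variables))
    # 3. clip critic weights to [-c, c] after every update
    for v in critic.trainable_variables:
        v.assign(tf.clip_by_value(v, -clip_value, clip_value))

    # Generator update (adversarial term only; SRGAN adds content loss too).
    with tf.GradientTape() as tape:
        g_loss = -tf.reduce_mean(critic(generator(lr_batch)))
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))
    return d_loss, g_loss

# Quick smoke test with random tensors.
lr = tf.random.normal([2, 96, 96, 3])
hr = tf.random.normal([2, 96, 96, 3])
print(train_step(lr, hr))
```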

According to that article, applying the four code changes above and converting the GAN training into the WGAN form makes it possible to monitor the decline of the Loss in TensorBoard. I therefore made the following modifications to the original project:

  1. Applied the four WGAN modifications above to the model code;
  2. Added TensorBoard to monitor the decline of the Loss;
  3. In the original model.py, changed the convolution kernel of the last layer of the Generator from 1×1 to 9×9, which is the structure proposed in the original paper (a code sketch follows this list).
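For modifications 2 and 3, a rough sketch of what they look like is shown below, again assuming the TensorFlow 2.x API rather than the TensorLayer code actually used in the repository:

```python
import tensorflow as tf

# 3. Generator output layer: a 9x9 convolution mapping the last feature maps
#    to the 3-channel SR image, as described in the original SRGAN paper.
#    (tanh assumes images are normalized to [-1, 1].)
output_conv = tf.keras.layers.Conv2D(3, kernel_size=9, padding="same",
                                     activation="tanh")

# 2. TensorBoard: log the WGAN losses as scalars so their decline can be
#    watched during training (`tensorboard --logdir logs`).
writer = tf.summary.create_file_writer("logs")

def log_losses(step, d_loss, g_loss):
    with writer.as_default():
        tf.summary.scalar("d_loss", d_loss, step=step)
        tf.summary.scalar("g_loss", g_loss, step=step)
```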

The project repository is SRGAN_Wasserstein, so please go directly to GitHub to check it out; if you find it useful, give it a Star!

The decline of the training Loss after the modifications

Here are some of the super-resolution restoration results from my reproduction:




A question from industry

In actual production use, the low-resolution images you encounter are not always PNGs (lossless images give the best results); they often carry varying degrees of distortion (artifacts from lossy compression). I have tried many algorithms, such as SRGAN, EDSR, RAISR, and Fast Neural Style, and so far none of these super-resolution algorithms can both increase the resolution and remove the distortion of such images at the same time. I raised this problem in an issue on @Dong Hao's repository:

Is the SRGAN super-resolution method less effective on low-resolution JPG images than on low-resolution PNG images?

References

[1] [1609.04802] Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

[2] [1609.07009] Is the deconvolution layer the same as a convolutional layer?

[3] [1701.07875] Wasserstein GAN

[4] The amazing Wasserstein GAN

[5] Project repository: JustinhoCHN/SRGAN_Wasserstein

[6] [1609.05158] Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network