Original article link: machinelearningmastery.com/impressive-…
Introduction
If there is one model that is currently the most popular and most talked about in deep learning, it is the GAN, the Generative Adversarial Network. From the name alone it sounds like just a generative model for producing images.
In fact, when GANs first appeared they were indeed used to generate images, but a GAN is not simply a generative model: it is a game between two networks. One is the generator, which produces fake images; the other is the discriminator, which judges whether an input image is real or fake. The training goal is for the generator to produce images so realistic that the discriminator cannot tell whether they are real or fake.
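To make this adversarial game concrete, here is a minimal sketch of a GAN training loop in PyTorch. It is not the original paper's implementation; the simple MLP networks, noise dimension, and learning rates are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy generator: maps a 64-dim noise vector to a flattened 28x28 image.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
# Toy discriminator: outputs a real/fake logit for a flattened image.
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):                      # real_images: (batch, 784)
    batch = real_images.size(0)

    # 1) Train the discriminator to tell real images from generated ones.
    fake_images = G(torch.randn(batch, 64)).detach()
    d_loss = bce(D(real_images), torch.ones(batch, 1)) + \
             bce(D(fake_images), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator into predicting "real".
    fake_images = G(torch.randn(batch, 64))
    g_loss = bce(D(fake_images), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Example usage with random data standing in for a batch of real images.
print(train_step(torch.rand(32, 784) * 2 - 1))
```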
GANs were first proposed in 2014, and in the five years since, their applications have gone far beyond image generation. More and more researchers have applied them to many areas, including image-to-image translation, image inpainting, image super-resolution, style transfer, text-to-image generation, video generation, and more. This article summarizes some of the interesting applications that GANs have achieved so far.
This article divides these applications into the following areas and then introduces the papers that implemented them, mostly published between 2016 and 2018:
- Image generation
- Face generation
- Photo generation
- Cartoon character generation
- Image-to-image translation
- Text-to-image translation
- Semantic image to photo translation
- Frontal face generation
- Generating new human poses
- Photo-to-emoji translation
- Photo editing
- Image blending
- Super-resolution
- Image inpainting
- Clothing swap (virtual try-on)
- Video prediction
- 3D object generation
1. Image generation
This is the application demonstrated in the original 2014 GAN paper, "Generative Adversarial Networks", shown below. It generated samples of the MNIST handwritten digits, the CIFAR-10 small object images, and a face dataset.
Then the 2015 paper "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", known as DCGAN, achieved stable GAN training with convolutional networks; its results are shown in the figure below:
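For reference, a DCGAN-style generator stacks transposed convolutions with batch normalization and ReLU, ending in a tanh output. The sketch below follows that recipe, but the channel counts are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

# DCGAN-style generator: noise (batch, 100, 1, 1) -> image (batch, 3, 64, 64).
# Channel counts here are illustrative, not the paper's exact configuration.
generator = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, 1, 0, bias=False), nn.BatchNorm2d(512), nn.ReLU(True),
    nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False), nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 3, 4, 2, 1, bias=False), nn.Tanh(),
)

# Sample a batch of fake 64x64 RGB images from random noise.
fake = generator(torch.randn(8, 100, 1, 1))
print(fake.shape)  # torch.Size([8, 3, 64, 64])
```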
2. Face generation
Face-related applications are among the hottest, most deeply studied, and most mature directions in computer vision, so it is natural that GANs have been applied here as well.
The 2017 paper "Progressive Growing of GANs for Improved Quality, Stability, and Variation" (ProGAN for short) can generate very realistic human faces, as shown in the figure below:
The paper also presents results for generating other types of objects:
Additionally, the 2018 report "The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation" describes the rapid development of GANs between 2014 and 2017, using face generation as an example of how the results have become increasingly realistic over those years.
3. Photo generation
The 2018 paper "Large Scale GAN Training for High Fidelity Natural Image Synthesis", known as BigGAN, produced excellent results in realistic photo generation, as shown in the figure below. At the time of publication it attracted a great deal of attention as the strongest GAN image generator to date, more than doubling the previous best Inception Score.
4. Cartoon character generation
The 2017 paper "Towards the Automatic Anime Characters Creation with Generative Adversarial Networks" applies GANs to generating the faces of Japanese anime characters, as shown in the figure below:
In addition, GANs have also been applied to generating Pokemon pictures, as shown below. The project addresses are:
- Github.com/moxiegushi/…
- Github.com/kvpratama/g…
More recently, GANs have also been used to generate Pokemon with different attributes:
Use CycleGAN to generate Pokemon with different attributes
5. Image-to-image translation
Image-to-image translation applies GANs to a wide range of conversion tasks. The most famous paper here is the 2016 "Image-to-Image Translation with Conditional Adversarial Networks", known as pix2pix, which can perform conversions such as:
- Semantic label maps to photos of street scenes and buildings
- Satellite photos to Google Maps
- Daytime photos to night
- Colorizing black-and-white photos
- Sketches to color pictures
Here are the results from the paper. The first row shows semantic map to street view, semantic map to building facade, and black-and-white colorization; the second row shows satellite image to Google Maps, day to night, and sketch to color image.
However, pix2pix requires a paired dataset: each input image must come with its expected output image. This is a demanding requirement, and in many cases such image pairs simply do not exist. So in 2017 an improved paper appeared, "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks", known as CycleGAN. It only needs a dataset from the source domain and one from the target domain, without any one-to-one pairing, relying instead on a cycle-consistency constraint (a minimal code sketch of this loss follows at the end of this section). It can implement transformations such as:
- Photos to artistic paintings
- Horses to zebras and back
- Summer photos to winter style
- Satellite images to Google Maps
The results are shown below: the first row shows conversions between paintings and photos, zebras and horses, and summer and winter; the second and third rows each show a detailed example.
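The trick that makes unpaired training work is the cycle-consistency loss: an image translated to the other domain and then translated back should reconstruct the original. Below is a minimal sketch of that loss term in PyTorch; the single-layer "generators" are placeholders standing in for CycleGAN's real translation networks.

```python
import torch
import torch.nn as nn

# Placeholder generators for the two directions (e.g. horse -> zebra, zebra -> horse).
G_AB = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))   # domain A -> domain B
G_BA = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))   # domain B -> domain A
l1 = nn.L1Loss()

def cycle_consistency_loss(real_a, real_b, lam=10.0):
    # A -> B -> A should reconstruct the original A image, and vice versa.
    recon_a = G_BA(G_AB(real_a))
    recon_b = G_AB(G_BA(real_b))
    return lam * (l1(recon_a, real_a) + l1(recon_b, real_b))

# Example with random image batches standing in for the two unpaired domains.
loss = cycle_consistency_loss(torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64))
print(loss.item())
```

In full CycleGAN training this term is added to the usual adversarial losses of the two discriminators.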
6. Text-to-image translation
The 2016 paper "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" uses StackGAN to generate realistic photos of birds and flowers from simple text descriptions. In the example below, the first sentence describes a bird with a red head whose feathers fade from red to grey from head to tail, while the second describes a dark green bird with a short bill.
Another 2016 paper, "Generative Adversarial Text to Image Synthesis", also generates images of birds, flowers, and other subjects from text descriptions, as shown below:
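In these models the generator is conditioned on an embedding of the text, typically by concatenating it with the noise vector before decoding into an image. The sketch below shows only that conditioning step; the layer sizes and the 128-dimensional "text embedding" are assumptions, and real systems such as StackGAN use a pretrained text encoder and a multi-stage generator.

```python
import torch
import torch.nn as nn

class TextConditionedGenerator(nn.Module):
    """Toy conditional generator: concatenates noise with a text embedding."""
    def __init__(self, noise_dim=100, text_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + text_dim, 256), nn.ReLU(),
            nn.Linear(256, 64 * 64 * 3), nn.Tanh(),
        )

    def forward(self, noise, text_embedding):
        # Condition the generation on the text by concatenation.
        x = torch.cat([noise, text_embedding], dim=1)
        return self.net(x).view(-1, 3, 64, 64)

gen = TextConditionedGenerator()
# A random vector stands in for a sentence embedding such as "a bird with a red head".
images = gen(torch.randn(2, 100), torch.randn(2, 128))
print(images.shape)  # torch.Size([2, 3, 64, 64])
```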
Other similar papers include:
- TAC-GAN: Text Conditioned Auxiliary Classifier Generative Adversarial Network, 2017
- Learning What and Where to Draw, 2016
7. Semantic image to photo translation
The 2017 paper "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs" uses conditional GANs to generate very realistic photos. Given a semantic label map, it can generate different types of photos:
- Street view photos
- Bedroom photos
- Face photos
- Face photos generated from a sketch
An example of generating a street view photo is shown below:
8. Frontal face generation
The 2017 paper "Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis" generates frontal face photos, which can be applied in face verification and face recognition systems.
The effect is shown below:
9. Generating new human poses
The 2017 paper "Pose Guided Person Image Generation" takes an input image of a person and generates images of that person in new poses, as shown in the figure below. Given a front, side, or back view as input, it can synthesize new poses, including front and side views.
10. Photo-to-emoji translation
The 2016 paper "Unsupervised Cross-Domain Image Generation" uses GANs to translate images between domains, for example converting street-view house numbers into handwritten digits, or turning face photos into emoji or cartoon-style faces, as shown below:
11. Photo editing
The CVPR 2018 paper "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation" enables photo editing, mainly of face attributes, as shown in the figure below. It can modify attributes such as hair color, expression, gender, and age, as long as the training set contains the corresponding attribute labels.
StarGAN is open source and the project address is:
Github.com/yunjey/star…
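StarGAN's key idea is to handle many attribute domains with a single generator by feeding the target attribute labels to the network along with the image. The sketch below illustrates that conditioning with a tiny placeholder network; it is not the paper's architecture, and the attribute names in the comment are just an example.

```python
import torch
import torch.nn as nn

class LabelConditionedGenerator(nn.Module):
    """Toy multi-domain generator: target attribute labels are tiled spatially
    and concatenated to the input image as extra channels (the StarGAN idea)."""
    def __init__(self, num_attrs=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_attrs, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, target_labels):
        b, _, h, w = image.shape
        # Tile the label vector so every pixel sees the desired target attributes.
        labels = target_labels.view(b, -1, 1, 1).expand(b, target_labels.size(1), h, w)
        return self.net(torch.cat([image, labels], dim=1))

gen = LabelConditionedGenerator()
# Example target labels, e.g. [black hair, blond hair, brown hair, male, young].
out = gen(torch.rand(1, 3, 128, 128), torch.tensor([[0., 1., 0., 0., 1.]]))
print(out.shape)  # torch.Size([1, 3, 128, 128])
```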
Other similar papers are:
- Invertible Conditional GANs For Image Editing, 2016
- Coupled Generative Adversarial Networks, 2016
- Neural Photo Editing with Introspective Adversarial Networks, 2016
- Image de-raining Using a Conditional Generative Adversarial Network, 2017
The following papers focus mainly on face age transformation:
- Face Aging With Conditional Generative Adversarial Networks, 2017
- Age Progression/Regression by Conditional Adversarial Autoencoder, 2017
12. Image blending
The 2017 paper "GP-GAN: Towards Realistic High-Resolution Image Blending" uses GANs for image blending, that is, fusing elements from multiple pictures, as shown in the figure below: the middle part of picture A is fused into the same position of picture B.
13. Super-resolution
Image super-resolution refers to generating a high-resolution image from a low-resolution one, reconstructing the missing details from the information available in the existing image.
The ECCV 2018 paper "ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks" proposes ESRGAN, which can add realistic detail to low-resolution images and produce much sharper results. Its output is shown below:
ESRGAN project address:
Github.com/xinntao/ESR…
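At a high level, a super-resolution GAN generator learns an upsampling mapping from the low-resolution input to the high-resolution output, trained with a content loss plus an adversarial loss. The sketch below shows only the upsampling path using pixel-shuffle layers; the layer sizes are assumptions, and real models such as SRGAN/ESRGAN add many residual blocks.

```python
import torch
import torch.nn as nn

# Toy 4x super-resolution generator: two pixel-shuffle upsampling stages.
sr_generator = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64 * 4, 3, padding=1), nn.PixelShuffle(2), nn.ReLU(),  # 2x upscale
    nn.Conv2d(64, 64 * 4, 3, padding=1), nn.PixelShuffle(2), nn.ReLU(),  # 4x total
    nn.Conv2d(64, 3, 3, padding=1),
)

low_res = torch.rand(1, 3, 32, 32)
high_res = sr_generator(low_res)
print(high_res.shape)  # torch.Size([1, 3, 128, 128])
```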
The paper "Temporally Coherent GANs for Video Super-Resolution" (TecoGAN) is the first to present an adversarial, recurrent training approach that supervises both spatial high-frequency detail and temporal coherence. For a detailed introduction, see the following article:
Low-definition video can be quickly converted to HD: the super-resolution algorithm TecoGAN
Other papers implementing super-resolution are:
- Photo-realistic Single Image Super-resolution Using a Generative Adversarial Network, 2016
- High-quality Face Image SR Using Conditional Generative Adversarial Networks, 2017
- Analyzing Perception-Distortion Tradeoff Using Enhanced Perceptual Super-Resolution Network, 2018
14. Image inpainting
The 2019 paper "EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning" splits image inpainting into two steps: edge generation followed by image completion.
The results are shown below, with six examples: column A is the damaged image to be repaired, column B is the intermediate generated edge map, and column C is the final inpainted result.
Project Address:
Github.com/knazeri/edg…
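Inpainting generators generally take the damaged image together with its mask as input (EdgeConnect additionally feeds in a predicted edge map) and fill only the missing region. Here is a minimal sketch of that input preparation and compositing step with a placeholder network; it is not EdgeConnect's actual model.

```python
import torch
import torch.nn as nn

# Placeholder inpainting generator: sees the masked image plus the mask channel.
inpaint_net = nn.Sequential(
    nn.Conv2d(3 + 1, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
)

def inpaint(image, mask):
    """image: (b, 3, h, w); mask: (b, 1, h, w) with 1 marking missing pixels."""
    masked = image * (1 - mask)                          # zero out the hole
    out = inpaint_net(torch.cat([masked, mask], dim=1))  # predict the full image
    # Keep the known pixels and use the network output only inside the hole.
    return masked + out * mask

img = torch.rand(1, 3, 64, 64)
hole = torch.zeros(1, 1, 64, 64); hole[:, :, 16:48, 16:48] = 1.0
print(inpaint(img, hole).shape)  # torch.Size([1, 3, 64, 64])
```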
Other papers are:
- Image Inpainting via Generative multi-column Convolutional Neural Networks, 2018
- Generative Image Inpainting with Contextual Attention, 2018
- High resolution Image Inpainting using Multi-scale Neural Patch Synthesis, CVPR 2017
- Generative Face Completion, 2017
- Context Encoders: Feature Learning by Inpainting, 2016
15. Clothing swap (virtual try-on)
The 2017 paper "The Conditional Analogy GAN: Swapping Fashion Articles on People Images" tries to use GANs to achieve a 2D virtual try-on effect. As shown in the results below, given a model and the garment to swap in, it replaces the clothes worn by the model.
Someone later made some modifications to this work, wrote a blog post about it, and open-sourced the code; the results are as follows:
Blog: shaoanlu.wordpress.com/2017/10/26/…
Github address: github.com/shaoanlu/Co…
At present, this technology is not yet very mature.
Other similar papers:
- INSTAGAN, 2018, Github: github.com/sangwoomo/i…
16. Video prediction
The 2016 paper "Generating Videos with Scene Dynamics" shows how GANs can be used for video prediction, mainly predicting the motion of foreground elements in otherwise static scenes, as shown in the figure below:
17. 3D object generation
The 2016 paper "Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling" generates new 3D objects such as chairs, cars, sofas, and tables, as shown below:
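This line of work (often called 3D-GAN) swaps the 2D generator for one built from 3D transposed convolutions that outputs a voxel occupancy grid. The sketch below follows that idea; the channel counts and the 32x32x32 output resolution are assumptions, and the original work generates higher-resolution grids.

```python
import torch
import torch.nn as nn

# Toy 3D-GAN-style generator: noise -> 32x32x32 voxel occupancy grid.
voxel_generator = nn.Sequential(
    nn.ConvTranspose3d(200, 128, 4, 1, 0, bias=False), nn.BatchNorm3d(128), nn.ReLU(True),
    nn.ConvTranspose3d(128, 64, 4, 2, 1, bias=False), nn.BatchNorm3d(64), nn.ReLU(True),
    nn.ConvTranspose3d(64, 32, 4, 2, 1, bias=False), nn.BatchNorm3d(32), nn.ReLU(True),
    nn.ConvTranspose3d(32, 1, 4, 2, 1, bias=False), nn.Sigmoid(),  # occupancy probability
)

noise = torch.randn(2, 200, 1, 1, 1)
voxels = voxel_generator(noise)
print(voxels.shape)  # torch.Size([2, 1, 32, 32, 32])
```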
Another 2016 paper, "3D Shape Induction from 2D Views of Multiple Objects", can also generate 3D objects given 2D images of objects from multiple viewpoints, as shown in the figure below:
Summary
For more information on GAN applications, see the following articles and GitHub project:
- gans-awesome-applications: Curated list of awesome GAN applications and demos.
- Some cool applications of GANs, 2018.
- GANs beyond generation: 7 alternative use cases, 2018.
References
- Use CycleGAN to generate Pokemon with different attributes
- Amazing results: classic games like The Elder Scrolls III can also be remade with super-resolution GANs
- Low-definition video can be quickly converted to HD: the super-resolution algorithm TecoGAN