Using GANs to Create Fantastical Creatures

Creating art for digital video games takes a high degree of artistic creativity and technical knowledge, and it also requires game artists to iterate on ideas quickly and produce large volumes of assets, often under tight deadlines. What if an artist's brush were less like a tool and more like an assistant? A machine learning model acting as such a paintbrush could reduce the time needed to create high-quality art without sacrificing artistic choices, and perhaps even enhance creativity.

Today, we present Chimera Painter, a machine learning (ML) model trained to automatically create a fully fleshed-out rendering from a user-supplied creature outline. In the demo application, when the user clicks the “Transform” button, Chimera Painter adds features and textures to a creature outline segmented with body-part labels, such as “wings” or “claws.” Below is an example of the demo used with one of the preset creature outlines.

In this post, we describe some of the challenges of creating the ML model behind Chimera Painter and demonstrate how the tool can be used to create video-game-ready assets.

Prototyping the Model

While developing an ML model to generate images of video-game-ready creatures, we built a digital card game prototype around the concept of combining creatures into new hybrids that can then battle each other. In this game, a player starts with cards of real-world animals (such as salamanders or whales) and makes them more powerful by combining them (creating a scary salamander-whale chimera). This provided a creative environment for demonstrating an image-generating model, since the sheer number of possible chimeras required a way to rapidly design large volumes of artistic assets that could be combined naturally, while still retaining the identifiable visual characteristics of the original creatures.

Since our goal was to create high-quality creature card images guided by artist input, we experimented with generative adversarial networks (GANs), informed by artist feedback, to create creature images suitable for our fantasy card game prototype. A GAN pits two convolutional neural networks against each other: a generator network that creates new images, and a discriminator network that determines whether these images are samples from the training dataset (in this case, images created by artists) or not. We used a variant called a conditional GAN, where the generator takes a separate input to guide the image generation process. Interestingly, our approach was distinct from other GAN efforts, which typically focus on photorealism.
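To make the conditioning concrete, below is a minimal sketch of one conditional-GAN training step in PyTorch. The network classes, optimizers, and variable names are illustrative assumptions, not the actual Chimera Painter architecture; the key idea is simply that both networks receive the segmentation map as the conditioning input.

```python
# A minimal conditional-GAN training step, assuming a PyTorch setup.
# `generator` and `discriminator` are hypothetical modules: the generator
# maps a segmentation map to an image, and the discriminator scores an
# (image, segmentation map) pair as real or generated.
import torch
import torch.nn.functional as F

def train_step(generator, discriminator, g_opt, d_opt, real_images, seg_maps):
    # --- Discriminator: real pairs should score 1, generated pairs 0. ---
    fake_images = generator(seg_maps).detach()
    d_real = discriminator(real_images, seg_maps)
    d_fake = discriminator(fake_images, seg_maps)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # --- Generator: fool the discriminator on the same conditioning input. ---
    fake_images = generator(seg_maps)
    g_fake = discriminator(fake_images, seg_maps)
    g_loss = F.binary_cross_entropy_with_logits(g_fake, torch.ones_like(g_fake))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```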

To train the GAN, we created a dataset of full-color images paired with single-species creature outlines adapted from 3D creature models. The creature outlines convey the shape and size of each creature and provide a segmentation map identifying individual body parts. After training, the model was tasked with generating multi-species chimeras from outlines provided by artists. The best-performing model was then incorporated into Chimera Painter. Below are some sample assets generated with the model, including single-species creatures as well as more complex multi-species chimeras.

Learning to Generate Creatures with Structure

One issue with using GANs for creature generation is the potential loss of anatomical and spatial coherence when rendering subtle or low-contrast parts of an image, despite their high perceptual importance to humans. Examples include eyes, fingers, and even the distinction between overlapping body parts with similar textures (see the affectionately named BoggleDog below).

Generating chimeras required a new, non-photographic, fantasy-styled dataset with unique characteristics such as dramatic perspective, composition, and lighting. Existing repositories of illustrations were not appropriate as training datasets for the ML model, because they may be subject to licensing restrictions, have conflicting styles, or simply lack the variety needed for this task.

To address this problem, we developed a new, semi-automated, artist-led approach for creating an ML training dataset from 3D creature models, which allowed us to work at scale and rapidly iterate as needed. In this process, artists create or obtain a set of 3D creature models, one for each creature type needed (such as hyenas or lions). The artists then use Unreal Engine to produce two sets of textures that are overlaid on the 3D model: one with the full-color texture (left, below), and one with flat, solid colors assigned to each body part (e.g., head, ears, neck, etc.), called a “segmentation map” (right, below). This second set of body-part segments is given to the model during training to ensure that the GAN learns about body-part-specific structure, shapes, textures, and proportions for a variety of creatures.
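As an illustration of what such a segmentation map encodes, here is a hypothetical sketch of converting a color-coded body-part map into integer class labels for training. The `PART_COLORS` palette and the part names are placeholder assumptions; the actual colors are defined by the demo's preset outlines.

```python
# Hypothetical mapping from body-part labels to the flat colors used in a
# segmentation map. These RGB values are placeholders, not the real palette.
import numpy as np

PART_COLORS = {
    "head":  (255, 0, 0),
    "wings": (0, 255, 0),
    "claws": (0, 0, 255),
    "torso": (255, 255, 0),
}

def seg_map_to_labels(seg_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) color-coded segmentation map into an (H, W)
    array of integer class indices, with 0 reserved for background."""
    labels = np.zeros(seg_rgb.shape[:2], dtype=np.int64)
    for idx, color in enumerate(PART_COLORS.values(), start=1):
        labels[np.all(seg_rgb == color, axis=-1)] = idx
    return labels
```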

The 3D creature models were all placed in a simple 3D scene, again using Unreal Engine. A set of automated scripts then took this 3D scene and interpolated between different poses, viewpoints, and zoom levels for each of the 3D creature models, creating the full-color images and segmentation maps that formed the GAN training dataset. Using this approach, we generated more than 10,000 image + segmentation-map pairs per 3D creature model, saving artists millions of hours compared to creating such data manually (at approximately 20 minutes per image).
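For intuition, such capture scripts could look something like the sketch below, which enumerates pose, viewpoint, and zoom combinations and saves one image + segmentation-map pair per combination. The `render_pair` callback and the specific pose, angle, and zoom grids are assumptions standing in for the engine-side rendering; the actual pipeline ran inside Unreal Engine.

```python
import itertools

# Illustrative sampling grids; the actual pose/viewpoint/zoom ranges used by
# the capture scripts are not published.
POSES = ["idle", "walk", "attack"]   # animation poses to sample
YAW_ANGLES = range(0, 360, 30)       # camera orbit around the creature, in degrees
ZOOMS = [0.8, 1.0, 1.25]             # relative camera distance

def generate_dataset(creature_name, render_pair, out_dir):
    """Enumerate pose/viewpoint/zoom combinations and save one full-color
    image plus its matching segmentation map for each combination.

    `render_pair` is a hypothetical callback standing in for the engine-side
    render; it is assumed to return two PIL-style images.
    """
    for i, (pose, yaw, zoom) in enumerate(
            itertools.product(POSES, YAW_ANGLES, ZOOMS)):
        color_img, seg_map = render_pair(creature_name, pose=pose, yaw=yaw, zoom=zoom)
        color_img.save(f"{out_dir}/{creature_name}_{i:05d}_color.png")
        seg_map.save(f"{out_dir}/{creature_name}_{i:05d}_seg.png")
```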

Fine-Tuning

The GAN has many different hyperparameters that can be adjusted, and they result in different output image quality. To get a better sense of which versions of the model performed better than others, artists were given samples of the different creature types generated by these models and asked to cull them down to the few best examples. We gathered feedback on the desired traits present in those examples, such as a feeling of depth, style with regard to creature textures, and realism of faces and eyes. This information was used both to train new versions of the model and, after the model had generated hundreds of thousands of creature images, to select the best image from each creature category (e.g., gazelle, lynx, gorilla, etc.).

We tuned the GAN for this task by focusing on the perceptual loss. This loss-function component (also used in Stadia’s Style Transfer ML) computes a difference between two images using features extracted from a separate convolutional neural network (CNN) previously trained on millions of photographs from the ImageNet dataset. The features are extracted from different layers of the CNN, and a weight is applied to each, affecting its contribution to the final loss value. We found that these weights were crucial in determining what the final generated image would look like. Below are some examples from GANs trained with different perceptual-loss weights.
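As a concrete sketch, the following shows one way such a weighted perceptual loss can be implemented, assuming torchvision’s pretrained VGG16 as the feature extractor. The post does not specify which network, layers, or weights were actually used, so the layer indices and weights here are illustrative.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

class PerceptualLoss(torch.nn.Module):
    """Weighted perceptual loss: compares the generated image to the target
    image in the feature space of a frozen, ImageNet-pretrained CNN."""

    def __init__(self, layer_weights=None):
        super().__init__()
        # Layer indices into vgg16.features and their weights are illustrative;
        # the post only notes that these weights strongly shape the output.
        self.layer_weights = layer_weights or {3: 1.0, 8: 0.75, 15: 0.5}
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features.eval()
        for p in self.features.parameters():
            p.requires_grad = False  # the feature extractor stays frozen

    def forward(self, generated, target):
        loss = 0.0
        x, y = generated, target
        for idx, layer in enumerate(self.features):
            x, y = layer(x), layer(y)
            if idx in self.layer_weights:
                # Each chosen layer contributes a weighted feature-space MSE.
                loss = loss + self.layer_weights[idx] * F.mse_loss(x, y)
        return loss
```

Sweeping the per-layer weights in a loss like this is what produces the stylistic differences discussed below.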

Some of the variation in the images above is due to the fact that the dataset includes multiple textures for each creature (for example, a reddish or grayish version of the bat). However, ignoring coloration, many of the differences are directly tied to changes in the perceptual-loss weights. In particular, we found that certain values led to sharper facial features (e.g., bottom right vs. top right) or to “smooth” versus “patterned” textures (top right vs. bottom left), making the generated creatures feel more realistic.

Below are some creatures generated by GANs trained with different perceptual-loss weights, demonstrating a small sample of the outputs and poses that the model can handle.

Chimera Painter

The trained GAN is now available in the Chimera Painter demo, allowing artists to work iteratively with the model rather than drawing dozens of similar creatures from scratch. An artist can select a starting point and then adjust the shape, type, or placement of creature parts, enabling rapid exploration and the creation of large volumes of images. The demo also allows uploading a creature outline created in an external program, such as Photoshop. Simply download one of the preset creature outlines to get the colors needed for each creature part, use it as a template for drawing an outline outside of Chimera Painter, and then use the “Load” button in the demo to have the model flesh out your creation from that outline.

We hope that these GAN models and the Chimera Painter demo tool will inspire others to think differently about their artistic pipeline. What can be created using machine learning as a paintbrush?
