A 100-year-old video of Beijing went viral on Weibo after being forwarded by CCTV. The video comes from the Bilibili uploader Otani (channel "Otani's Game Creation Hut"), who used AI to turn black-and-white footage of Beijing shot in 1920 into a smooth, high-definition color video, set it to background music with a local flavor, and brought the street life of Beijing a century ago back to life with a remarkably authentic period feel.
According to Otani, the 100-year-old black-and-white footage is very noisy, and both its frame rate and its resolution are low because of its age. He therefore ran the video through three open-source AI tools:
- DAIN, for frame interpolation, to make the video smoother
- ESRGAN, for super-resolution, to make the video sharper
- DeOldify, to colorize the black-and-white footage
Let's take a look at these three popular AI video-restoration tools, so you can learn how to use them and restore old videos of your own.
Frame interpolation tool DAIN
The number of frames per second (FPS) has a significant impact on how smooth a video looks. At normal playback speed, anything below 30 FPS tends to look choppy, while differences above roughly 60 FPS are hard for most viewers to notice. The higher the frame rate, the smoother the video, and the difference is especially obvious in slow motion:
The clip above is an example from SUPER SLOMO, another AI frame-interpolation tool. When the original 30 FPS drifting shot is slowed down 8 times, the effective frame rate drops below 4 FPS (30 / 8 = 3.75), and the stutter is plainly visible. With AI frame interpolation, the slowed-down clip can be kept at 240 FPS and stays perfectly smooth.
Generally speaking, the core of frame interpolation is to insert intermediate frames between every two consecutive frames in order to raise the video's FPS; the problem an interpolation tool has to solve is how to generate those intermediate frames automatically with an AI algorithm. DAIN is short for Depth-Aware Video Frame Interpolation. The DAIN team proposed a depth-aware frame interpolation model and developed a depth-aware flow projection layer to generate the intermediate frames.
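To make the idea of an intermediate frame concrete, here is a minimal Python sketch that simply cross-fades two consecutive frames with OpenCV. This is not how DAIN works (DAIN estimates optical flow and depth so that moving objects are warped to the right positions), and the file names are placeholders; it only illustrates what "inserting a frame between two frames" means:

import cv2

# Load two consecutive frames (placeholder file names).
frame_a = cv2.imread("frame_0001.png")
frame_b = cv2.imread("frame_0002.png")

# A 50/50 blend approximates a frame at the midpoint in time between the two.
# A real interpolator such as DAIN instead estimates motion and depth,
# so moving objects land where they should instead of ghosting.
midpoint = cv2.addWeighted(frame_a, 0.5, frame_b, 0.5, 0)
cv2.imwrite("frame_0001_5.png", midpoint)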
The test environment
- Ubuntu (tested on Ubuntu 16.04.5 LTS)
- Python: Anaconda3 = 4.1.1 & Python = 3.6.8
- CUDA and cuDNN: CUDA = 9.0 & cuDNN = 7.0
- PyTorch = 1.0.0: the custom depth-aware flow projection layer and other layers require PyTorch's ATen API
- GCC: compiling the PyTorch 1.0.0 extension files (.c/.cu) requires GCC = 4.9.1 and NVCC = 9.0
- GPU: an NVIDIA GPU (the author used a Titan X (Pascal) with compute capability 6.1; compute_50/52/60/61 devices are supported)
Install and use
Clone the repository:
$ git clone https://github.com/baowenbo/DAIN.git
Before building the PyTorch extensions, make sure you have PyTorch >= 1.0.0:
$ python -c "import torch; print(torch.__version__)"Copy the code
Generate PyTorch extension:
$ cd DAIN
$ cd my_package
$ ./build.sh
Generate the Correlation package required by PWCNet:
$ cd ../PWCNet/correlation_package_pytorch1_0
$ ./build.sh
Test the pre-trained model:
Create the model-weights directory and the MiddleBury dataset directory:
$ cd DAIN
$ mkdir model_weights
$ mkdir MiddleBurySet
Download the pre-trained model,
$ cd model_weights
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best.pth
And the Middlebury dataset:
$ cd ../MiddleBurySet
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-color-allframes.zip
$ unzip other-color-allframes.zip
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-gt-interp.zip
$ unzip other-gt-interp.zip
$ cd ..
Preinstallation:
$ cd PWCNet/correlation_package_pytorch1_0
$ sh build.sh
$ cd ../my_package
$ sh build.sh
$ cd ..
Download the results
Use the following method to download the interpolation results:
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/UCF101_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Vimeo90K_interp_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_eval_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_other_DAIN.zip
Generate slow motion:
This model can produce slow-motion effects with only minimal changes to the network architecture. Run the following command to generate 4x slow motion by setting time_step to 0.25:
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.25
Or set time_step to 0.125 or 0.1, as follows,
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.125
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.1
to generate 8x and 10x slow motion, respectively. Or, if you want to film something fun at 100x slow motion:
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.01
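The slow-motion factor is simply the reciprocal of time_step, so the values above correspond to 4x, 8x, 10x, and 100x. A quick Python check of the arithmetic:

# Intermediate frames are generated at multiples of time_step between
# every pair of original frames, so the factor is 1 / time_step.
for time_step in (0.25, 0.125, 0.1, 0.01):
    factor = round(1 / time_step)
    print(f"time_step={time_step}: x{factor} slow motion, "
          f"{factor - 1} new frames per original pair")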
GIF animations can be created by:
$ cd MiddleBurySet/other-result-author/[random number]/Beanbags
$ convert -delay 1 *.png -loop 0 Beanbags.gif   # -delay is in units of 10 ms
Have fun!
More on DAIN, including downloads: https://github.com/baowenbo/DAIN
Resolution enhancement tool ESRGAN
We know that resolution directly affects how sharp an image looks. Zoom in on a small, low-resolution picture and much of the detail turns into the familiar "mosaic" of blocky pixels. Enlarging an image with an ordinary interpolation algorithm likewise blurs the edges of objects, and super-resolution algorithms exist to solve exactly this problem.
ESRGAN (Enhanced Super-Resolution Generative Adversarial Network) uses AI to generate realistic texture while performing image super-resolution, raising the effective resolution of the image, as shown below:
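For comparison, here is what the ordinary interpolation-based enlargement mentioned above looks like in code: a 4x bicubic resize with OpenCV, which inevitably smears fine detail. The input path is a placeholder. ESRGAN performs the same 4x upscale, but its adversarially trained generator reconstructs plausible texture instead of simply averaging neighboring pixels:

import cv2

# Plain bicubic upscaling: enlarges the image 4x but blurs edges and texture.
low_res = cv2.imread("LR/baboon.png")          # placeholder path
height, width = low_res.shape[:2]
bicubic = cv2.resize(low_res, (width * 4, height * 4), interpolation=cv2.INTER_CUBIC)
cv2.imwrite("baboon_bicubic_x4.png", bicubic)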
The test environment
- Python 3
- PyTorch> = 1.0 (CUDA version > = 7.5 if installed with CUDA. For more details)
- Python suite:
pip install numpy opencv-python
Install and use
1. Clone the Github repository.
git clone https://github.com/xinntao/ESRGAN
cd ESRGAN
2. Place your low-resolution images in the ./LR folder (two sample images, baboon and cartoon, are included).
3. Download the pre-trained models from Google Drive or Baidu Cloud Drive and put them in the ./models folder.
4. The authors provide two models: ESRGAN, tuned for high perceptual quality, and RRDB_PSNR, tuned for high PSNR. Which model to use can be configured in test.py. Run the test:
python test.py
5. The results are written to the ./results folder. (A rough sketch of what test.py does internally follows below.)
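For the curious, the following is a rough sketch of the single-image inference that test.py performs, pieced together from the repository's published example. The module and file names (RRDBNet_arch, models/RRDB_ESRGAN_x4.pth, LR/baboon.png) are assumptions that may differ between versions, so check the test.py in your own clone:

import cv2
import numpy as np
import torch
import RRDBNet_arch as arch   # architecture file shipped in the ESRGAN repo (name may vary)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# x4 RRDB generator; weights downloaded in step 3 (path assumed).
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load('models/RRDB_ESRGAN_x4.pth'), strict=True)
model = model.eval().to(device)

# Read a low-resolution image, convert BGR HWC uint8 -> RGB CHW float in [0, 1].
img = cv2.imread('LR/baboon.png', cv2.IMREAD_COLOR).astype(np.float32) / 255.0
img = torch.from_numpy(np.transpose(img[:, :, [2, 1, 0]], (2, 0, 1))).unsqueeze(0).to(device)

with torch.no_grad():
    out = model(img).squeeze().float().cpu().clamp_(0, 1).numpy()

# Convert back to BGR HWC uint8 and save the upscaled result.
out = np.transpose(out[[2, 1, 0], :, :], (1, 2, 0))
cv2.imwrite('results/baboon_rlt.png', (out * 255.0).round().astype(np.uint8))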
More on ESRGAN, including downloads: https://github.com/xinntao/ESRGAN
Black-and-white image colorization tool DeOldify
DeOldify is a deep learning project for colorizing and restoring old images and videos. It uses NoGAN, a new and efficient image-to-image GAN training method, which handles detail better and produces more realistic colors:
NoGAN is a new type of GAN training developed by the author to solve some key problems with the previous DeOldify model. It keeps the benefits of GAN training (attractive, plausible colors) while eliminating the annoying side effects (such as flickering objects in video). Video is colorized from isolated frames, without adding any temporal modeling. The process runs the GAN portion of "NoGAN" training for 30-60 minutes, using 1% to 3% of the ImageNet data at a time. Then, just as with still-image colorization, the individual frames are run through DeOldify before the video is rebuilt, and the rendering stays remarkably consistent even in scenes with motion:
DeOldify currently offers three models: Artistic, Stable, and Video. Each has its own key strengths and weaknesses, and therefore its own use cases.
The test environment
- Linux
- Fast.AI = 1.0.51 (and its dependencies). If you use any later version, you will see grid artifacts in the rendering, and TensorBoard will malfunction.
- PyTorch = 1.0.1
- Jupyter Lab, installed via conda:
conda install -c conda-forge jupyterlab
- TensorBoard (i.e., TensorFlow installed) and TensorboardX. Not strictly required, but FastAI now has native TensorBoard support, which is handy:
conda install -c anaconda tensorflow-gpu
pip install tensorboardX
- ImageNet is a great data set for training.
- GPU: the requirements are modest (an ordinary consumer graphics card is enough for colorizing on its own; for large-scale training a better card is recommended).
Install and use
Open a terminal, navigate to the folder where you want to install DeOldify, and type the following commands:
git clone https://github.com/jantic/DeOldify.git DeOldify
cd DeOldify
conda env create -f environment.yml
Then start it up with these commands:
source activate deoldify
jupyter lab
Open Jupyter Lab in your browser via the URL shown in the console, and start running the notebooks.
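To give a rough idea of what the notebook workflow looks like, here is a minimal image-colorization sketch based on DeOldify's demo notebooks. The function names (get_image_colorizer, plot_transformed_image) and the input path are assumptions taken from those notebooks, and the pre-trained weights are assumed to have been downloaded into ./models as described in the DeOldify README; verify everything against the notebooks shipped in your clone:

# Run inside a notebook from the DeOldify repo root.
from deoldify import device
from deoldify.device_id import DeviceId
device.set(device=DeviceId.GPU0)      # use DeviceId.CPU if no GPU is available

from deoldify.visualize import get_image_colorizer

# artistic=True selects the Artistic model described above.
colorizer = get_image_colorizer(artistic=True)

# Placeholder input path; render_factor trades color detail against speed and memory.
colorizer.plot_transformed_image(path='test_images/old_beijing.jpg', render_factor=35)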
More on DeOldify, including downloads: https://github.com/jantic/DeOldify
If you want to turn an old black-and-white video into a high-definition color video, you can now give it a try yourself.