A 100-year-old video of Beijing went viral on Weibo after being forwarded by CCTV. The video comes from the Bilibili uploader Otani (channel "Otani's Game Creation Hut"), who used AI to turn black-and-white footage of Beijing shot in 1920 into a smooth, high-definition color video, set it to background music with a local flavor, and brought the street life of Beijing a century ago back to life with a remarkably authentic period feel.
According to Otani, the 100-year-old black-and-white footage is very noisy, and both its frame rate and its resolution are low because of its age. He therefore ran the video through three open-source AI tools:
- DAIN, for frame interpolation, to make the video smoother
- ESRGAN, for super-resolution, to make the video sharper
- DeOldify, to colorize the black-and-white footage
Let's take a look at these three popular AI video-restoration tools, so you can learn how to use them and restore old videos of your own.
Frame interpolation tool DAIN
The number of frames per second (FPS) has a significant impact on how smooth a video looks. At normal playback speed, anything below 30 FPS tends to look choppy, while differences above roughly 60 FPS are hard for most viewers to notice. The higher the frame rate, the smoother the video, and the difference is especially obvious in slow motion:
The clip above is an example from SUPER SLOMO, another AI frame-interpolation tool. When the original 30 FPS drifting shot is slowed down 8 times, the effective frame rate drops below 4 FPS (30 / 8 = 3.75), and the stutter is plainly visible. With AI frame interpolation, the slowed-down clip can be kept at 240 FPS and stays perfectly smooth.
Generally speaking, the core of frame interpolation is to insert intermediate frames between every two consecutive frames in order to raise the video's FPS; the problem an interpolation tool has to solve is how to generate those intermediate frames automatically with an AI algorithm. DAIN is short for Depth-Aware Video Frame Interpolation. The DAIN team proposed a depth-aware frame interpolation model and developed a depth-aware flow projection layer to generate the intermediate frames.
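To make the idea of an intermediate frame concrete, here is a minimal Python sketch that simply cross-fades two consecutive frames with OpenCV. This is not how DAIN works (DAIN estimates optical flow and depth so that moving objects are warped to the right positions), and the file names are placeholders; it only illustrates what "inserting a frame between two frames" means:

import cv2

# Load two consecutive frames (placeholder file names).
frame_a = cv2.imread("frame_0001.png")
frame_b = cv2.imread("frame_0002.png")

# A 50/50 blend approximates a frame at the midpoint in time between the two.
# A real interpolator such as DAIN instead estimates motion and depth,
# so moving objects land where they should instead of ghosting.
midpoint = cv2.addWeighted(frame_a, 0.5, frame_b, 0.5, 0)
cv2.imwrite("frame_0001_5.png", midpoint)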
The test environment
- Ubuntu (tested on Ubuntu 16.04.5 LTS)
- Python: Anaconda3 = 4.1.1 & Python = 3.6.8
- CUDA and cuDNN: CUDA = 9.0 & cuDNN = 7.0
- PyTorch = 1.0.0: the custom depth-aware flow projection layer and other layers require PyTorch's ATen API
- GCC: compiling the PyTorch 1.0.0 extension files (.c/.cu) requires GCC = 4.9.1 and NVCC = 9.0
- GPU: an NVIDIA GPU (the author used a Titan X (Pascal) with compute capability 6.1; compute_50/52/60/61 devices are supported)
Install and use
Clone the repository:
$ git clone https://github.com/baowenbo/DAIN.git
Before building the PyTorch extensions, make sure you have PyTorch >= 1.0.0:
$ python -c "import torch; print(torch.__version__)"Copy the code
Generate PyTorch extension:
$ cd DAIN
$ cd my_package
$ ./build.sh
Generate the Correlation package required by PWCNet:
$ cd ../PWCNet/correlation_package_pytorch1_0
$ ./build.sh
Test the pre-trained model:
Create the model-weights directory and the MiddleBury dataset directory:
$ cd DAIN
$ mkdir model_weights
$ mkdir MiddleBurySet
Download the pre-trained model,
$ cd model_weights
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best.pth
And the Middlebury dataset:
$ cd ../MiddleBurySet
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-color-allframes.zip
$ unzip other-color-allframes.zip
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-gt-interp.zip
$ unzip other-gt-interp.zip
$ cd ..
Preinstallation:
$ cd PWCNet/correlation_package_pytorch1_0
$ sh build.sh
$ cd ../my_package
$ sh build.sh
$ cd ..
Download the results
Use the following method to download the interpolation results:
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/UCF101_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Vimeo90K_interp_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_eval_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_other_DAIN.zip
Generate slow motion:
This model can produce slow-motion effects with only minimal changes to the network architecture. Run the following command to generate 4x slow motion by setting time_step to 0.25:
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.25
Or set time_step to 0.125 or 0.1, as follows,
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.125
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.1
to generate 8x and 10x slow motion, respectively. Or, if you want to film something fun at 100x slow motion:
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.01
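The slow-motion factor is simply the reciprocal of time_step, so the values above correspond to 4x, 8x, 10x, and 100x. A quick Python check of the arithmetic:

# Intermediate frames are generated at multiples of time_step between
# every pair of original frames, so the factor is 1 / time_step.
for time_step in (0.25, 0.125, 0.1, 0.01):
    factor = round(1 / time_step)
    print(f"time_step={time_step}: x{factor} slow motion, "
          f"{factor - 1} new frames per original pair")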
GIF animations can be created by:
$ cd MiddleBurySet/other-result-author/[random number]/Beanbags
$ convert -delay 1 *.png -loop 0 Beanbags.gif   # -delay is in units of 10 ms
Have fun!
More on DAIN, including downloads: https://github.com/baowenbo/DAIN
Resolution enhancement tool ESRGAN
We know that resolution directly affects how sharp an image looks. Zoom in on a small, low-resolution picture and much of the detail turns into the familiar "mosaic" of blocky pixels. Enlarging an image with an ordinary interpolation algorithm likewise blurs the edges of objects, and super-resolution algorithms exist to solve exactly this problem.
ESRGAN (Enhanced Super-Resolution Generative Adversarial Network) uses AI to generate realistic texture while performing image super-resolution, raising the effective resolution of the image, as shown below:
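For comparison, here is what the ordinary interpolation-based enlargement mentioned above looks like in code: a 4x bicubic resize with OpenCV, which inevitably smears fine detail. The input path is a placeholder. ESRGAN performs the same 4x upscale, but its adversarially trained generator reconstructs plausible texture instead of simply averaging neighboring pixels:

import cv2

# Plain bicubic upscaling: enlarges the image 4x but blurs edges and texture.
low_res = cv2.imread("LR/baboon.png")          # placeholder path
height, width = low_res.shape[:2]
bicubic = cv2.resize(low_res, (width * 4, height * 4), interpolation=cv2.INTER_CUBIC)
cv2.imwrite("baboon_bicubic_x4.png", bicubic)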
The test environment
- Python 3
- PyTorch> = 1.0 (CUDA version > = 7.5 if installed with CUDA. For more details)
- Python suite:
pip install numpy opencv-python
Install and use
1. Clone the Github repository.
git clone https://github.com/xinntao/ESRGAN
cd ESRGAN
2. Place your low-resolution images in the ./LR folder (two sample images, baboon and cartoon, are included).
3. Download the pre-trained models from Google Drive or Baidu Cloud Drive and put them in the ./models folder.
4. The authors provide two models: ESRGAN, tuned for high perceptual quality, and RRDB_PSNR, tuned for high PSNR. Which model to use can be configured in test.py. Run the test:
python test.py
5. The results are written to the ./results folder. (A rough sketch of what test.py does internally follows below.)
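For the curious, the following is a rough sketch of the single-image inference that test.py performs, pieced together from the repository's published example. The module and file names (RRDBNet_arch, models/RRDB_ESRGAN_x4.pth, LR/baboon.png) are assumptions that may differ between versions, so check the test.py in your own clone:

import cv2
import numpy as np
import torch
import RRDBNet_arch as arch   # architecture file shipped in the ESRGAN repo (name may vary)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# x4 RRDB generator; weights downloaded in step 3 (path assumed).
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load('models/RRDB_ESRGAN_x4.pth'), strict=True)
model = model.eval().to(device)

# Read a low-resolution image, convert BGR HWC uint8 -> RGB CHW float in [0, 1].
img = cv2.imread('LR/baboon.png', cv2.IMREAD_COLOR).astype(np.float32) / 255.0
img = torch.from_numpy(np.transpose(img[:, :, [2, 1, 0]], (2, 0, 1))).unsqueeze(0).to(device)

with torch.no_grad():
    out = model(img).squeeze().float().cpu().clamp_(0, 1).numpy()

# Convert back to BGR HWC uint8 and save the upscaled result.
out = np.transpose(out[[2, 1, 0], :, :], (1, 2, 0))
cv2.imwrite('results/baboon_rlt.png', (out * 255.0).round().astype(np.uint8))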
More on ESRGAN, including downloads: https://github.com/xinntao/ESRGAN
Black-and-white image colorization tool DeOldify
DeOldify is a deep learning project for colorizing and restoring old images and videos. It uses NoGAN, a new and efficient image-to-image GAN training method, which handles detail better and produces more realistic colors:
NoGAN is a new type of GAN training developed by the author to solve some key problems with the previous DeOldify model. It keeps the benefits of GAN training (attractive, plausible colors) while eliminating the annoying side effects (such as flickering objects in video). Video is colorized from isolated frames, without adding any temporal modeling. The process runs the GAN portion of "NoGAN" training for 30-60 minutes, using 1% to 3% of the ImageNet data at a time. Then, just as with still-image colorization, the individual frames are run through DeOldify before the video is rebuilt, and the rendering stays remarkably consistent even in scenes with motion:
DeOldify currently offers three models: Artistic, Stable, and Video. Each has its own key strengths and weaknesses, and therefore its own use cases.
The test environment
- Linux
- Fast.AI = 1.0.51 (and its dependencies). If you use any later version, you will see grid artifacts in the rendering, and TensorBoard will malfunction.
- PyTorch = 1.0.1
- Jupyter Lab, installed via conda:
conda install -c conda-forge jupyterlab
- TensorBoard (i.e., TensorFlow installed) and TensorboardX. Not strictly required, but FastAI now has native TensorBoard support, which is handy:
conda install -c anaconda tensorflow-gpu
pip install tensorboardX
- ImageNet is a great data set for training.
- GPU: the requirements are modest (an ordinary consumer graphics card is enough for colorizing on its own; for large-scale training a better card is recommended).
Install and use
Open a terminal, navigate to the folder where you want to install DeOldify, and type the following commands:
git clone https://github.com/jantic/DeOldify.git DeOldify
cd DeOldify
conda env create -f environment.yml
Then start it up with these commands:
source activate deoldify
jupyter lab
Open Jupyter Lab in your browser via the URL shown in the console, and start running the notebooks.
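To give a rough idea of what the notebook workflow looks like, here is a minimal image-colorization sketch based on DeOldify's demo notebooks. The function names (get_image_colorizer, plot_transformed_image) and the input path are assumptions taken from those notebooks, and the pre-trained weights are assumed to have been downloaded into ./models as described in the DeOldify README; verify everything against the notebooks shipped in your clone:

# Run inside a notebook from the DeOldify repo root.
from deoldify import device
from deoldify.device_id import DeviceId
device.set(device=DeviceId.GPU0)      # use DeviceId.CPU if no GPU is available

from deoldify.visualize import get_image_colorizer

# artistic=True selects the Artistic model described above.
colorizer = get_image_colorizer(artistic=True)

# Placeholder input path; render_factor trades color detail against speed and memory.
colorizer.plot_transformed_image(path='test_images/old_beijing.jpg', render_factor=35)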
More on DeOldify, including downloads: https://github.com/jantic/DeOldify
If you want to turn an old black-and-white video into a high-definition color video, you can now give it a try yourself.