Original: Zhangming, Baidu App Technology
1. Background
Mobile devices have made it much easier to produce and consume content. As a content distribution platform, Baidu App carries a large number of images, texts and videos contributed by PGC and UGC creators. Now that 2K screen resolution has become mainstream on mobile phones, users naturally expect to watch high-definition resources. However, the capture, transmission and storage of pictures and videos are constrained by many factors, so some resources inevitably have relatively poor sharpness and resolution, which degrades the viewing experience. Baidu App, together with the Baidu Visual Technology Department, improves how pictures and videos are displayed on the device through real-time super-resolution reconstruction based on deep learning.
2. How to improve resolution
Generally speaking, image resolution is the number of pixels per unit area that a physical scene occupies on the imaging plane; it measures how much image detail can be resolved. It can be used to describe sharpness: the higher the resolution, the more detail can be presented and the more accurate the pixel values. On the same display hardware this usually gives a better viewing experience, that is, better picture quality, but the resource file is also larger.
Note: Display at different resolutions (image from Wikipedia)
Super-resolution can be understood as the process of creating more pixels based on the pixel content of an existing image.
Traditional methods of increasing image resolution, such as interpolation, compute the values of the added pixels from fixed rules. They often produce blocky (mosaic) artifacts, jagged edges and blurred edges.
In recent years, thanks to the continuous development of deep learning, methods such as convolutional neural networks, which draw on the way the human visual system perceives images, reconstruct an image by extracting and learning its features, and achieve better and more stable results.
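To make the "fixed rules" concrete, here is a minimal sketch of bilinear interpolation, one common interpolation method. The function name `bilinear_upscale` and the tiny test image are illustrative assumptions, not part of Baidu App's pipeline. Every new pixel is just a weighted average of its nearest source pixels, so a sharp edge turns into a ramp.

```python
def bilinear_upscale(img, scale):
    """Upscale a 2D grayscale image (a list of rows) by an integer factor."""
    src_h, src_w = len(img), len(img[0])
    out = []
    for y in range(src_h * scale):
        # Map the centre of the output pixel back into source coordinates,
        # clamping so that border pixels reuse the nearest source row.
        sy = min(max((y + 0.5) / scale - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        wy = sy - y0
        row = []
        for x in range(src_w * scale):
            sx = min(max((x + 0.5) / scale - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            wx = sx - x0
            # Fixed rule: blend the four neighbours by distance.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# A hard vertical edge: black on the left, white on the right.
edge = [[0, 255], [0, 255]]
upscaled = bilinear_upscale(edge, 2)
print(upscaled[0])
```

The two intermediate values in each output row are blends of black and white that were never in the source, which is the edge blurring described above. No rule in the function depends on the image content, so it cannot recover detail that was lost.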
3. Baidu App super-resolution reconstruction model
The super-resolution reconstruction model is a residual learning network based on VDSR. The model is pruned and uses depthwise convolution to speed up computation. The model input is the Y channel, already upsampled to the target resolution, and the model supports variable input sizes.
Note: (image from the VDSR paper)
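To illustrate why depthwise convolution speeds up the model, the sketch below counts multiply-accumulate operations (MACs) for one layer. It rests on several assumptions that the article does not state: the depthwise convolution is paired with a 1x1 pointwise convolution (the usual depthwise separable form), the layer uses 3x3 kernels with 64 input and 64 output channels as in the VDSR paper, and the frame is 640x480. Baidu's pruned model may use different sizes.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a standard k x k convolution over an h x w feature map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """MACs for a k x k depthwise convolution followed by a 1x1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 convolution mixes the channels
    return depthwise + pointwise


# Assumed layer shape, for illustration only.
H, W, C_IN, C_OUT = 480, 640, 64, 64

standard = standard_conv_macs(H, W, C_IN, C_OUT)
separable = depthwise_separable_macs(H, W, C_IN, C_OUT)
ratio = standard / separable
print(f"standard:  {standard:,} MACs")
print(f"separable: {separable:,} MACs")
print(f"reduction: {ratio:.2f}x")
```

Under these assumptions the per-layer cost drops by a factor of roughly 7.9. The saving grows with the number of output channels, because the expensive k x k filtering is no longer repeated for every output channel.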
4. Difficulties and challenges of real-time super-resolution on mobile devices
5. Strategies and optimizations for real-time super-resolution on mobile devices
Application layer optimization:
- 1. Image super-resolution memory: very large images are cut into blocks, grouped into queues, and super-resolved in parallel by multiple instances, so that peak memory usage can be predicted and dynamically constrained.
- 2. Real-time video super-resolution: a strategy module guarantees stability by providing both full-speed super-resolution and safe-frame-rate super-resolution.
- 3. Computing resource scheduling: part of the CPU-based pre- and post-processing is migrated to GPU operators, so that pre-processing, post-processing and prediction all run on the GPU.
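The block-cutting idea in the first item of the list above can be sketched as follows. This is a simplified model under stated assumptions, not Baidu App's implementation: the names `tile_rects`, `peak_bytes` and `choose_tile` are hypothetical, each parallel instance is assumed to hold one input and one output buffer at the target resolution with 4 bytes per pixel, and intermediate network activations are ignored.

```python
def tile_rects(width, height, tile):
    """Return (x, y, w, h) rectangles that cover the image without overlap."""
    rects = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            rects.append((x, y, min(tile, width - x), min(tile, height - y)))
    return rects


def peak_bytes(tile, instances, bytes_per_pixel=4, buffers=2):
    """Upper bound on buffer memory when `instances` tiles are processed at once."""
    return tile * tile * bytes_per_pixel * buffers * instances


def choose_tile(instances, budget_bytes, candidates=(1024, 512, 256, 128)):
    """Pick the largest candidate tile size whose predicted peak fits the budget."""
    for tile in candidates:
        if peak_bytes(tile, instances) <= budget_bytes:
            return tile
    return candidates[-1]  # fall back to the smallest tile


# Hypothetical case: a 4000x3000 image, 2 parallel instances, 8 MiB budget.
tile = choose_tile(instances=2, budget_bytes=8 * 1024 * 1024)
rects = tile_rects(4000, 3000, tile)
print(tile, len(rects), peak_bytes(tile, 2))
```

Because the peak depends only on the tile size and the number of instances, not on the size of the whole image, it can be predicted before any work starts and capped by choosing a smaller tile. A production tiler would typically also overlap neighbouring tiles by a few pixels so that convolution borders do not leave visible seams.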
Inference engine optimization:
Optimization results:
- Image and video super-resolution prediction time is reduced to less than 50% of the original. In batch mode, iOS prediction time can be reduced to 1/4 of the CoreML time. 480p prediction time: 25 ms on the iPhone XR; 23 ms on the Android Snapdragon 845.
- Image and video super-resolution GPU memory usage is reduced to less than 50% of the original.
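As a rough check of what the 480p prediction times above imply for video, the sketch below converts a per-frame time into the maximum frame rate it can sustain and compares it with a playback rate. The helper names are hypothetical, and the calculation assumes prediction is the only per-frame cost, ignoring decoding and rendering.

```python
def max_fps(prediction_ms):
    """Highest frame rate sustainable if every frame takes prediction_ms."""
    return 1000.0 / prediction_ms


def fits_frame_budget(prediction_ms, video_fps):
    """True if one prediction fits inside the time slot of a single frame."""
    return prediction_ms <= 1000.0 / video_fps


# Per-frame times quoted in the article for 480p input.
for device, ms in (("iPhone XR", 25), ("Snapdragon 845", 23)):
    print(device, round(max_fps(ms), 1), fits_frame_budget(ms, 30))
```

Both figures leave headroom for 30 fps video, whose per-frame budget is about 33.3 ms, but neither would fit a 60 fps stream (about 16.7 ms per frame) if every frame were super-resolved.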
6. Business applications and effect comparison
Image and video super-resolution have been applied in several of Baidu's mobile products. Every day, tens of millions of pictures and videos are displayed and played to users after being reconstructed. No server is involved at any point in the process, which saves the computation, storage and bandwidth that server-side reconstruction of low-frequency resources would otherwise consume.
Note: Low resolution reconstructed by super-resolution to the target resolution vs. original quality at the target resolution
7. On-device integration scheme
Baidu App will open up its video super-resolution capability soon; stay tuned.
    // iOS
    /**
     Perform image super-resolution
     @param aImage Image to be super-resolved
     @param scaleType SR multiple
     @param block Result callback
     */
    - (void)executeSuperResolutionWithImage:(UIImage *)aImage
                                      scale:(MMLImageSuperResolutionScaleType)scaleType
                                 Completion:(void (^)(UIImage *srImage, NSError *error))block API_AVAILABLE(ios(9.0));

    // Android
    /**
     * Perform image super-resolution
     * @param inputBitmap Image to be super-resolved
     * @param scale SR multiple
     * @param onSrResultListener Callback that receives the super-resolution result
     */
    void sr(Bitmap inputBitmap, float scale, OnSrResultListener onSrResultListener);
8. References
en.wikipedia.org/wiki/Image_…
arxiv.org/abs/1511.04…