Anyone who follows the live streaming industry knows that 2016 was the first year of mobile live streaming. That year, "internet celebrity" was on everyone's lips and the number of live streaming platforms grew explosively. The model shifted from showroom-style broadcasts to UGC live streaming by ordinary users. Content that began with attractive hosts and pornography gradually branched out and combined with various vertical fields. More recently, the popularity of live-streamed claw machines and live quiz shows has demonstrated how mobile live streaming scenarios can be extended and reinvented.
Application scenarios of mobile live broadcasting
Live streaming is a richer way of presenting content. Combining mobile live streaming with a vertical field creates better scenarios and adds value to the existing business.
Here are some common scenarios:
Live streaming + e-commerce: promoting sales becomes easier. The data shows that more than 60% of viewers browse the products and that the purchase rate reaches 20%.
Live streaming + education: distance learning comes closer to classroom teaching, and real-time interaction can effectively improve the quality of learning.
Live streaming + finance: a financial planner's market analysis and real-time guidance can cover the whole path from information to transaction and promote the purchase of financial products.
Of course, all of these scenarios depend on a stable, complete, high-performance, scalable live streaming system. Below, Xianwang, product manager for Video Cloud, introduces Alibaba Cloud's live streaming solution and the core technical capabilities behind it.
Live broadcast system solutions
Alibaba Cloud's live streaming architecture provides a complete end-to-end solution that helps users build a live streaming system quickly and add live streaming features.
In the figure below, the left side is the stream push end. It offers push SDKs for Android and iOS and can also work with push tools such as OBS and with professional devices. Once a stream has been pushed, the live center processes it in real time (real-time transcoding, screenshots, recording, watermarking, live time shifting, and so on) and then distributes it through the CDN. The player end receives and plays the video stream. The system can support more than 100,000 output streams and more than 10 million simultaneous online viewers.
(Technical Architecture of Live Broadcasting)
As the figure shows, an SDK or OpenAPI is available at every point along the link. Users only need to integrate a few interfaces based on the SDKs to complete the development of the whole live streaming system.
Technical capabilities
Let's look at the specific stages of a live streaming system and how each one is implemented.
I. Stream push end
The push end is the host's side. It captures video through the phone camera and audio through the microphone. After preprocessing, encoding, and pushing, the stream is handed to the CDN for distribution.
1. Advanced beauty
Preprocessing covers rendering operations on the video such as beauty filters, watermarks, overlays, mixing, and noise reduction. The beauty function, essential for any live streaming platform, works by using an algorithm to recognize the skin areas of the face and adjusting the color values of those areas. In keeping with its principle of making technology broadly accessible, Alibaba Cloud now offers advanced beauty functions built on face recognition, such as face slimming, face shrinking, eye enlargement, and blush, completely free of charge. In addition, the push SDK exposes a standardized interface that supports third-party beauty engines, so users can choose according to their own business needs.
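To make the idea of "recognize the skin area, then adjust its color values" concrete, here is a minimal sketch. It is not the algorithm used by the push SDK, which the article does not describe. It assumes a fixed chroma box in YCbCr space as the skin test (the 77-127 and 133-173 bounds are a commonly quoted heuristic, used here as an assumption) and a simple blend toward white as the adjustment. The names `is_skin` and `brighten_skin` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b):
    """Assumed rule: a pixel counts as skin if its chroma lies in a fixed box."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return 77 <= cb <= 127 and 133 <= cr <= 173


def brighten_skin(pixels, strength=0.2):
    """Blend skin pixels toward white by `strength`; leave other pixels untouched."""
    out = []
    for r, g, b in pixels:
        if is_skin(r, g, b):
            r, g, b = (round(c + (255 - c) * strength) for c in (r, g, b))
        out.append((r, g, b))
    return out
```

A production filter would work on GPU textures and smooth the skin area rather than only brightening it. The sketch only shows the two-step structure of masking and then adjusting.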
2. Real-time mixing
At the push stage, Alibaba Cloud has also opened up the popular audio mixing technology, which mixes the captured voice with background music into a single output and supports noise reduction and in-ear monitoring. The process is as follows: the background music is decoded into PCM audio data, the PCM audio captured by the microphone is denoised, and the two are mixed. On the host's side the PCM audio is played back directly. For the audience it is encoded first, and the audio stream is then pushed over RTMP for playback.
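The mixing step itself can be sketched in a few lines. This is an illustrative sketch, not the SDK's implementation. It assumes both inputs are already decoded to little-endian 16-bit mono PCM at the same sample rate, and that the microphone buffer has already been denoised. The function name `mix_pcm16` and the `music_gain` parameter are hypothetical.

```python
import struct


def mix_pcm16(voice: bytes, music: bytes, music_gain: float = 0.5) -> bytes:
    """Mix two little-endian 16-bit mono PCM buffers sample by sample."""
    n = min(len(voice), len(music)) // 2           # number of whole samples
    v = struct.unpack(f"<{n}h", voice[: n * 2])
    m = struct.unpack(f"<{n}h", music[: n * 2])
    mixed = []
    for a, b in zip(v, m):
        s = int(a + b * music_gain)                # attenuate music under the voice
        mixed.append(max(-32768, min(32767, s)))   # clip to the int16 range
    return struct.pack(f"<{n}h", *mixed)
```

Clipping after the sum avoids integer wrap-around, which would otherwise be heard as loud crackling when voice and music peak together.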
II. Live server processing
After the stream has been pushed, the live server handles real-time transcoding, pornography detection, live screenshots, live recording, watermarking, time-shifted viewing, live quizzes, and more, while ensuring stability, smooth playback, and visibility into real-time data.
1. Real-time transcoding
Real-time transcoding converts one stream into multiple streams at different definitions. It is rarely used in mobile live streaming, but streams pushed from a PC are generally high definition, so transcoding is more useful there. The Narrowband HD technology provided by Alibaba Cloud uses high-spec transcoding clusters and highly complex encoding algorithms to achieve a better compression ratio at the same quality, saving 20%-30% of bandwidth.
2. Live pornography detection
To keep platform content compliant, pornography detection for live streams is a necessary step. Large live streaming platforms rely on manual review, which is costly and not accurate enough. Using artificial intelligence to identify pornographic content therefore reduces the review workload and effectively lowers the risk of pornography on the platform.
Once the API is integrated into the live streaming system, the artificial intelligence service can screen screenshots captured at intervals of seconds, judge how pornographic the live stream is, assign a score, and suggest how to handle it, helping the platform keep its content under control.
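On the platform side, the score and suggestion could be consumed roughly as follows. This is a sketch under stated assumptions: the article does not give the score range or the thresholds, so the 0-100 scale, the `review_at` and `block_at` values, and the function names are all hypothetical and do not describe the real API response.

```python
def suggest_action(score: float, review_at: float = 50.0, block_at: float = 90.0) -> str:
    """Map one screenshot score (assumed 0-100) to a handling suggestion."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"    # borderline: hand over to a human reviewer
    return "pass"


def stream_suggestion(scores):
    """Judge a stream by the worst screenshot in the current window."""
    if not scores:
        return "pass"
    return suggest_action(max(scores))
```

Routing only the borderline band to human reviewers is how automated screening reduces the manual workload without removing people from the loop.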
3. Live time shifting
Live time shifting combines on-demand and live streaming. Put simply, it lets viewers replay content that has already been streamed while the broadcast is still in progress. Combined with the player SDK, it can be implemented with a few simple interface calls. It is very common in internet TV and online live streaming.
4. Live quiz
Alibaba Cloud has also launched a solution for the recently popular live quiz shows. Technically it works as follows: the host gives the signal to release a question, the on-site director calls Alibaba Cloud's OpenAPI through the integrating party's AppServer, and SEI information is inserted into the video frames. On the playback end, the player parses the SEI and calls back into the app, which displays the question. The key point is that the SEI information is not lost during CDN distribution or transcoding, which ensures that every client can take part in the quiz successfully. In addition, by relying on a synchronization server, a single transmission channel, and a common point in time, the scheme keeps the questions synchronized with the picture at high precision and protects the user experience.
In a typical live quiz, on-site staff insert the SEI information from a modified OBS client. Alibaba Cloud's live quiz solution will also support setting questions from a mobile device, to cover cases where the host publishes the questions directly.
III. Stream pull end
The core work on the pull end is decoding and rendering in the player. Interactive live streaming also needs to integrate features such as chat rooms, likes, and a gift system.
The pull end currently supports RTMP, HLS, and HTTP-FLV. RTMP is a proprietary Adobe protocol that is well supported by open source software and libraries. Its latency is generally 1-3 seconds. HLS is an HTTP-based streaming protocol proposed by Apple. It is the preferred choice for cross-platform playback: it plays directly in HTML5 and is well supported on mobile devices, but its drawback is high latency. HTTP-FLV transmits streaming media content over HTTP. It raises no patent concerns, and its latency can also be 1-3 seconds.
All three playback protocols are supported, so we can choose according to our own scenario. For example, FLV can be used for playback inside the app, while HLS is recommended for video streams that are shared externally.
Core technologies
So what are the core mobile technologies in Alibaba Cloud's live streaming system? In Xianwang's view, three matter most: dynamic bitrate, advanced beauty filters, and opening the first frame within a second.
I. Push SDK: dynamic bitrate
Dynamic bitrate works by setting a maximum and a minimum bitrate in the application-layer configuration, which defines the range within which the bitrate may float. The current bitrate is then reported in real time and adjusted dynamically according to bandwidth conditions, improving video definition when the network is good and preserving smoothness when it is poor. This strikes a balance between sharpness and stutter.
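The adjustment loop described above can be sketched as a single step function. This is a simplified illustration, not the SDK's actual controller. The default range, the `headroom` factor, and the "drop fast, rise slowly" policy are assumptions chosen for the example.

```python
def next_bitrate(current_kbps: float, bandwidth_kbps: float,
                 min_kbps: int = 400, max_kbps: int = 1500,
                 headroom: float = 0.8, step_up: float = 1.1) -> int:
    """Return the encoder bitrate for the next interval, clamped to [min, max]."""
    target = bandwidth_kbps * headroom        # leave room for audio and jitter
    if target < current_kbps:
        proposed = target                     # network got worse: drop at once
    else:
        proposed = min(current_kbps * step_up, target)   # recover gradually
    return int(max(min_kbps, min(max_kbps, proposed)))
```

For example, with a measured uplink of 500 kbps a stream running at 1000 kbps falls to 400 kbps, and when bandwidth recovers it climbs back in 10% steps until it reaches the configured maximum.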
According to Xianwang, Video Cloud is currently exploring dynamic frame dropping strategies, and in the future the encoding frame rate and resolution may also be adjusted dynamically.
II. Push SDK: real-time beauty
Alibaba Cloud's real-time beauty feature fuses and optimizes beauty-related algorithms such as linear filtering, nonlinear filtering, and Photoshop-style image detail enhancement. The latest beauty algorithm combines these filters to achieve very high performance, taking 6 milliseconds in total.
Mobile video is typically pushed at 15-20 frames per second, which leaves roughly 50 to 67 milliseconds to process each frame. A processing time of 6 milliseconds is therefore essentially not a performance bottleneck.
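The arithmetic behind that claim is short enough to check directly. The sketch below uses only the figures quoted in the text (15-20 frames per second and 6 milliseconds for the filter). The function names are hypothetical.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps


def filter_share(fps: float, filter_ms: float = 6.0) -> float:
    """Fraction of the per-frame budget consumed by the beauty filter."""
    return filter_ms / frame_budget_ms(fps)
```

At 20 frames per second the budget is 50 milliseconds and the filter uses about 12% of it. At 15 frames per second the budget is about 67 milliseconds and the share drops to about 9%.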
III. Player: opening the first frame within a second
On the player side, opening the first frame within a second is a key technology that determines the user experience. The basic principle is this: when a user starts playback at a point that is not a video key frame, the player cannot render and play immediately. The server therefore caches the most recent GOP, so the first thing the player reads is a key frame and it can display a picture right away. PTS correction, frame dropping, frame chasing, and similar strategies then handle the live stream dynamically and reduce latency. In testing, the first frame opens in 200 milliseconds to 1 second.
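The server-side GOP cache can be sketched as follows. This is a minimal illustration of the caching idea only. PTS correction, frame dropping, and frame chasing are not shown, and the `Frame` and `GopCache` types are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    pts: int          # presentation timestamp
    is_key: bool      # True for a key frame (start of a GOP)


class GopCache:
    """Keep every frame since the most recent key frame."""

    def __init__(self):
        self._frames = []

    def push(self, frame: Frame) -> None:
        if frame.is_key:
            self._frames = []              # a new GOP replaces the old one
        if frame.is_key or self._frames:   # ignore frames before the first key frame
            self._frames.append(frame)

    def snapshot(self):
        """Frames to send first to a newly connected player."""
        return list(self._frames)
```

A player that joins mid-GOP first receives the cached frames, beginning with a key frame, so it can decode at once instead of waiting for the next key frame in the live stream. The cost is a little extra latency, which the chase and drop strategies are there to recover.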
Integration process
The full process for streaming from a PC is as follows:
First, complete the preparation before going live: add a domain name, bind the CNAME, and configure authentication. Then push the stream with OBS or other third-party push software. Finally, obtain the playback address and preview the stream in a web page or in VLC.
Although the technology behind it is relatively complex, the operations on the user's side are simple and convenient.
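Once authentication is configured, the push and playback addresses carry a signature. The sketch below shows how such addresses could be generated. It assumes a signature of the form `timestamp-rand-uid-md5(URI-timestamp-rand-uid-key)` passed as an `auth_key` query parameter, and `/AppName/StreamName` paths with `.flv` and `.m3u8` suffixes for playback. These layouts are assumptions that the article does not spell out, and they should be checked against the current product documentation. All domains and keys are placeholders.

```python
import hashlib


def auth_key(uri: str, private_key: str, timestamp: int,
             rand: str = "0", uid: str = "0") -> str:
    """Build the auth_key query value for one URI (assumed signature layout)."""
    to_sign = f"{uri}-{timestamp}-{rand}-{uid}-{private_key}"
    digest = hashlib.md5(to_sign.encode("utf-8")).hexdigest()
    return f"{timestamp}-{rand}-{uid}-{digest}"


def signed_url(scheme: str, domain: str, uri: str,
               private_key: str, timestamp: int) -> str:
    """Attach the signature to a push or playback address."""
    return f"{scheme}://{domain}{uri}?auth_key={auth_key(uri, private_key, timestamp)}"


def live_urls(push_domain: str, play_domain: str, app: str, stream: str,
              push_key: str, play_key: str, timestamp: int) -> dict:
    """Return one push address and three playback addresses for a stream."""
    base = f"/{app}/{stream}"
    return {
        "push": signed_url("rtmp", push_domain, base, push_key, timestamp),
        "rtmp": signed_url("rtmp", play_domain, base, play_key, timestamp),
        "flv": signed_url("http", play_domain, base + ".flv", play_key, timestamp),
        "hls": signed_url("http", play_domain, base + ".m3u8", play_key, timestamp),
    }
```

The `push` address is what would be pasted into OBS, and any of the three playback addresses can be opened in VLC for a preview.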
For pushing and playing streams on mobile devices, refer to the following documents:
iOS push SDK instructions
Android push SDK instructions
iOS player SDK instructions
Android player SDK instructions
Conclusion
To add live streaming quickly without affecting the existing business, the safest and most efficient choice is to build the mobile live streaming system on a platform such as Alibaba Cloud, hand the technical problems over to it, and focus on the core business itself.
To let users try things out before integrating, Alibaba Cloud also provides product demos. Scan the code to download the push, player, short video, and other client SDKs. If you have any ideas or suggestions, you are welcome to leave a comment below the original article in the cloud community.