I. Overview

In 2019, the vivo live broadcasting platform project was approved. In the early stage, we developed live broadcasting jointly with a leading live broadcasting platform and made a preliminary exploration of the market, products and technologies. Later, to enrich the content and forms of live broadcasting, we began to explore independently. Building on vivo’s existing live broadcasting business, we successively implemented several forms of live broadcasting, including pan-entertainment, interactive and company event live broadcasting. We believe that, following our subsequent business planning, we will bring users a better live broadcasting experience.

Today, I would like to share the technical development of the vivo live broadcasting platform over the past two years. I hope it gives you a basic understanding of live broadcasting and, if you have just started developing live broadcasting businesses, some inspiration.

II. Business background

So far, the vivo live broadcasting platform supports three categories of live broadcast. The first is the classic pan-entertainment show. The platform provides the standard functions of a live show, such as interactive chat, PK and gift giving. The pan-entertainment live broadcasting market started early, and its features are rich and varied: co-streaming PK between anchors, co-streaming interaction between anchors and users, gift combos, event leaderboards, draw-and-guess games and so on. The platform keeps iterating on these classic features to give users a better experience.

The second category is low-latency interactive live broadcasting based on real-time audio and video technology. It is characterized by strong interactivity and a strong sense of user participation, and it brings considerable technical challenges and high maintenance costs.

The third category emerged in the context of the epidemic: event live broadcasts for distributing company information, such as company annual meetings, recruitment talks, brand culture and epidemic prevention knowledge. The defining features of this category are flexibility, variability and diverse content forms. As a live broadcasting platform, we need to provide one-stop solutions so that companies can promote their brands and cultures through live broadcasting in a better and more stable way.

First, let us look at the business architecture of the vivo live broadcasting platform. The bottom layer is the company’s monitoring platform, which monitors live traffic, the network and related business metrics and provides alerting services, so that we can respond promptly, then locate and resolve practical problems in each module during live distribution.

On top of this monitoring, we iterate on the upper-level business logic. Live content distribution still relies mainly on the CDNs of cloud vendors and on our internal live broadcast server cluster: one path distributes traffic directly to consumer-facing (C-end) clients, and the other pulls streams internally for related service processing. In terms of business capabilities, we have built up basic capabilities such as massive information storage, video processing, content identification, real-time security and compliance review of live video content, and synchronous and asynchronous processing of live broadcast events. These capabilities are encapsulated in scenario SDKs, such as stream pushing, playback and IM. The standard SDKs make live broadcasting capabilities easy to distribute and reuse, and make it convenient for business teams to integrate them.

In terms of content output and external services, we have enabled live broadcasting in our mobile apps, such as I Video, vivo short video, I Music and the browser, to enrich consumers’ mobile phone experience. Besides supporting live broadcasting in vivo’s own mobile apps, we also cooperate with third-party live broadcasting platforms to distribute and broadcast some content.

III. Technical practice

3.1 Live broadcast of pan-entertainment shows

First, we introduce the most classic form, the pan-entertainment show. The picture below shows a standard live broadcast process based on the RTMP protocol.

The general process is as follows. First, the input source is captured, usually from the screen, camera or microphone. Next comes image processing, such as beautification, filters and watermarks. The stream is then encoded with standard H.264 and transmitted over RTMP. After transmission, the relevant events are processed in the central data center, and the stream is distributed to edge nodes. The user terminal pulls the live stream from an edge node and finally decodes and plays it on the device. On top of this standard viewing process, pan-entertainment shows also provide common functions such as interactive chat, in-room mini-games, PK and gifts.

This standard process involves many technical points. Next, I would like to discuss some practical problems our team encountered while launching the related live broadcasting businesses.

We encountered four issues. The first technical difficulty is the broadcasting tools, that is, stream pushing. There are many anchors, their definitions of “beauty” differ and are highly subjective, and the demand for personalization is strong, with high requirements for stickers, color temperature, picture saturation and so on. During broadcasting there are also requirements on facial jitter and distortion, and on picture clarity and graininess.

The second problem is message distribution, that is, what is traditionally called IM (instant messaging). It differs from traditional instant messaging: besides the classic IM distribution scenarios of group chat, private chat and broadcast, the membership of the “group” is unstable. Users frequently enter and leave live rooms and switch between them, and the evening peak brings message storms and traffic spikes. These are things we need to handle.

The third problem is the delay of live broadcast, which is also one of the classic problems of live broadcast. There are many factors causing delay, such as terminal equipment, transmission protocol, network bandwidth, codec and so on.

The last problem is cost. Anyone who has built a live broadcasting service knows that it relies on RTMP, CDN and machines as basic cloud services, and that CDN billing for streaming media distribution is relatively expensive. We therefore want to control the cost of live broadcasting while still ensuring product features and user experience.

For the first technical difficulty, we made some technical optimizations on the client. For beautification, we made full use of the technology accumulated by the company’s imaging team to standardize, generalize and customize beautification, filters, stickers, makeup and style makeup.

We also ran a series of experiments in the stream pushing module, including cloud transcoding, super-resolution and sharpening, to ensure that each anchor’s picture style is attractive to the audience. For gift animation playback, after many experiments we chose MP4 for special-effect gift animations; the test data shows that, compared with SVGA, MP4 uses significantly less memory and has a smaller file size. Finally, the player kernel team customized the player, using shared player instances and pre-creation on slide and click, so that the instant-start metric of the live room reaches a first-class industry standard.

Next, we introduce another problem of live broadcasting: instant message processing. The potential problems can be analyzed along two dimensions. The first is the user dimension. User behaviors include sending gifts, gift combos, likes and interactive chat, and a large number of users may also flood into a room at the same moment. These scenarios cause IM message flood peaks, and the server load rises sharply. Even though we have componentized the IM module and isolated the modules from one another, sudden IM traffic still affects the distribution of other high-priority or system messages.

The second is the system dimension. In mass-message scenarios there is a read diffusion problem. In addition, because of our business scenarios, the messages we distribute over IM are structured data, and in some particularly complex scenarios the structured packets are large. If many messages are distributed at the same instant, the egress bandwidth of the data center comes under pressure. Another factor is that the message processing capacity of mobile terminals is limited. If too many messages are distributed and the special effects of some messages are complicated, some low-end phone models may not be able to process the messages in time.

Given these constraints, we adopted three schemes. The first combines message push and pull: at peak business traffic, the IM long connection works together with HTTP short polling. The second uses protobuf to compress messages together with graded rate limiting: messages are graded along two dimensions, the business type of the message and the anchor, and the distribution frequency is limited differently for different message types and different live rooms. The last is monitoring and degradation: we monitor designated live rooms and, when a problem is detected, automatically switch the message distribution method to ensure the message arrival rate.
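The graded rate limiting described above can be sketched with one token bucket per message tier. This is only a minimal illustration under stated assumptions: the tier names, the per-second limits and the `room_factor` multiplier are all hypothetical, since the article does not give the actual grading rules or numbers.

```python
# Hypothetical per-second limits for each message tier (None = never throttled).
TIER_LIMITS = {"system": None, "gift": 50, "chat": 20, "like": 5}


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = None  # time of the last call, set on first use

    def allow(self, now):
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RoomLimiter:
    """Per-room limiter; `room_factor` scales limits for higher-graded rooms."""

    def __init__(self, room_factor=1.0):
        self.buckets = {
            tier: TokenBucket(limit * room_factor, limit * room_factor)
            for tier, limit in TIER_LIMITS.items()
            if limit is not None
        }

    def allow(self, tier, now):
        if TIER_LIMITS[tier] is None:  # e.g. system messages always pass
            return True
        return self.buckets[tier].allow(now)


limiter = RoomLimiter()
sent = sum(limiter.allow("like", now=0.0) for _ in range(10))
print(sent)  # number of "like" messages distributed in the burst
```

Messages that are rejected here could be dropped, merged, or left for clients to fetch through the polling path; which of these applies to each tier is a product decision the sketch does not model.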

The third problem is the delay of live broadcast. In dealing with delay, we aim to avoid unnecessary delay in specific live broadcast scenarios and reach the standard delay, rather than blindly pursuing extremely low delay. To solve the problem, we first analyze the links that produce delay and sort out what we can optimize in each of them. The capture side is mainly affected by caching strategies and data encoding; the network environment, the transmission protocol and the physical distance also affect the delay. To address these factors, we collected the anchors’ phone models and actual network environments and encode at a definition that matches the actual conditions, which weakens the influence of the device model and the network on the live broadcast.

Second, as is well known, the other 80% of the delay is produced in the downstream CDN playback link. On the playback end, resisting stutter requires adaptive buffering, adaptive bitrate, forward error correction, packet-loss retransmission and so on. Because of the limitations of the protocol itself, we can only distribute viewing links with different protocols and different definitions according to the user terminal. At the same time, we introduce new live broadcast protocols for special scenarios, such as WebRTC, QUIC and SRT.

The last aspect is the cost of pan-entertainment live broadcasting. As a business unit, we have always paid attention to cost. The cost of live broadcasting has two main components: storage and bandwidth. On the storage side we did three things. First, we transcode the anchors’ recorded videos to reduce the size of the stored files. Second, we grade anchors by popularity and store their content for different lengths of time. Finally, we clip the highlights and, in accordance with the relevant national laws and regulations, delete the original files, which reduces the storage cost.

The second component is the CDN service fee. Cloud vendors charge either by peak bandwidth or by traffic, so we optimize our strategy from a business point of view. In push scenarios, we push in batches at several points in time, so that a large number of viewers do not enter and start playing at the same instant and create a high bandwidth peak. The viewing end supports multiple resolutions at the same time, and we distribute viewing links of different definitions according to the business traffic in different periods. On the operations side, we plan the anchors’ airtime scientifically to prevent high-traffic anchors from clustering in the same slot and causing unnecessary bandwidth and additional traffic fees. The last measure is monitoring: we evaluate and calculate the cost of each billing cycle and use the actual usage to adjust the next cycle, so that we arrive at a better billing plan over time.
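Choosing an encoding definition from the anchor's device and network can be sketched as a simple lookup. The profile names, bandwidth thresholds and the numeric `device_tier` below are hypothetical placeholders; the article does not state the actual rules used.

```python
# (profile name, minimum uplink kbps, minimum device tier); hypothetical values,
# ordered from the highest definition to the lowest.
PROFILES = [
    ("1080p", 4000, 3),
    ("720p", 2000, 2),
    ("540p", 1000, 1),
    ("360p", 0, 0),
]


def pick_profile(uplink_kbps, device_tier):
    """Return the highest profile that both the network and the device can sustain."""
    for name, min_kbps, min_tier in PROFILES:
        if uplink_kbps >= min_kbps and device_tier >= min_tier:
            return name
    return PROFILES[-1][0]  # fall back to the lowest definition


print(pick_profile(uplink_kbps=5000, device_tier=1))  # fast network, low-end phone
print(pick_profile(uplink_kbps=500, device_tier=3))   # high-end phone, weak network
```

The point of the sketch is that the weaker of the two constraints decides the definition, so neither a slow device nor a poor network forces the encoder into a setting it cannot sustain.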
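To see why staggering airtime matters under peak-bandwidth billing, the peak can be computed with a sweep over session start and end events. The schedules and bandwidth figures below are invented purely for illustration.

```python
def peak_bandwidth(sessions):
    """sessions: iterable of (start_min, end_min, mbps). Returns the peak total Mbps."""
    events = []
    for start, end, mbps in sessions:
        events.append((start, mbps))   # bandwidth added when a session starts
        events.append((end, -mbps))    # bandwidth released when it ends
    # At equal timestamps, process releases (negative deltas) before additions.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Three hypothetical high-traffic anchors, all in the same hour vs. back to back.
clustered = [(0, 60, 800), (0, 60, 600), (0, 60, 500)]
staggered = [(0, 60, 800), (60, 120, 600), (120, 180, 500)]

print(peak_bandwidth(clustered))  # 1900
print(peak_bandwidth(staggered))  # 800
```

The total traffic is the same in both schedules; only the peak differs, which is exactly the quantity a peak-bandwidth bill is based on.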

3.2 Interactive Live broadcast

Having analyzed the solutions to several classic problems of pan-entertainment shows, we now turn to low-latency interactive live broadcasting. Its core characteristics are ultra-low delay and strong interactivity, and in certain business scenarios the requirements on the delivery order and timeliness of all kinds of messages are particularly high. The most common scenarios in the industry at present are e-commerce, education, interactive entertainment, and government and enterprise live broadcasting.

The main functions and technologies of interactive live broadcasting are shown in the figure above, such as information synchronization, media processing and security review. Compared with pan-entertainment live broadcasting, it has stricter requirements on the consistency of streaming media across multiple terminals. A large part of the technology stack is based on real-time audio and video technology, combined with SEI and ordered information synchronization. Generally speaking, this meets the needs of the business.

We mainly encountered two business pain points when launching interactive live broadcasting. The first is that the second-level delay of a long live broadcast link based on RTMP can hardly meet the requirements, and that multi-terminal media processing based on RTMP, such as co-streaming, stream mixing and muting, is complex and not conducive to multi-terminal synchronization. The second is information control, which we ran into during actual development: terminal consistency control, streaming media security control, exception control and so on. In a real production environment, a large number of terminals interact and the interaction produces a large number of messages. In some business scenarios, messages are delayed, delivered out of order or even lost, which eventually leaves the business state inconsistent across terminals. Besides delay, this is currently the other obvious problem of interactive live broadcasting.

RTC and RTMP+CDN differ completely in their technical features. RTMP is bound to and coupled with the CDN. It relies on the edge network capability of the CDN, which lets users in multiple regions obtain live content from a nearby node and improves the success rate and instant-start rate of fetching the live video stream. However, because of the limitations of the network protocol and other factors, some delay is inevitable in the RTMP protocol stack. RTC also has two obvious technical features. The first is that RTC is based on UDP, with a theoretical delay within one hundred milliseconds. The second is that the SFU and MCU communication modes are well suited to multi-terminal interaction. We use different techniques depending on the practical problem to be solved.

During the development of interactive live broadcasting, we met a series of information control problems. The first is consistency control. In interactive live broadcasting, the business content is strongly tied to the streaming media information, and information must be delivered in real time, without loss and in order. We added some compensation controls: clients report information individually and in batches, and the server corrects the abnormal state of individual terminals. In addition, SEI should be used sensibly. It carries a small amount of business information inside the video stream, for example in karaoke rooms, live commerce and live quizzes, which ensures that the information stays consistent and synchronized with the stream.

The second is security control. Under the relevant national regulations, whether audio and video broadcast content is illegal is a key item reviewed by the relevant departments. For show live broadcasts, we currently review frames captured from the video stream at regular intervals, using a double review mechanism of machine review and human review. For interactive live broadcasts, each terminal is reviewed independently so that the offending interactive terminal can be identified accurately, and a green and healthy live broadcast environment is maintained.

The last scenario, exception control, is also dictated by the complexity of the business. Because there are many interactive terminals, the probability of exceptions increases greatly due to human, network and equipment factors. We therefore need to identify stream interruptions quickly: within the client’s capability, we make effective use of callbacks or reports, and handle the related exception recovery process after the service has verified them. The second measure is a heartbeat detection mechanism. The client reports a heartbeat periodically; if the server misses more than a specified number of heartbeats, it marks the device as abnormal and runs the related exception handling logic.
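One way to picture the compensation control is a server that keeps the authoritative room state with a sequence number and corrects any terminal whose reported sequence number lags behind. This is a hedged sketch only: the `RoomState` class, the sequence-number scheme and the shape of the correction payload are assumptions, not the platform's actual protocol.

```python
class RoomState:
    """Authoritative state of one interactive room, versioned by a sequence number."""

    def __init__(self):
        self.seq = 0
        self.state = {}

    def apply(self, key, value):
        """Apply one ordered business event and return its sequence number."""
        self.seq += 1
        self.state[key] = value
        return self.seq

    def reconcile(self, reports):
        """reports: {client_id: last_applied_seq}.

        Returns a full-state correction for every client that is behind.
        """
        return {
            client_id: {"seq": self.seq, "state": dict(self.state)}
            for client_id, last_seq in reports.items()
            if last_seq < self.seq
        }


room = RoomState()
room.apply("mic_1", "on")
room.apply("mic_2", "muted")

# Client "a" is up to date; client "b" missed the second event.
corrections = room.reconcile({"a": 2, "b": 1})
print(sorted(corrections))  # ['b']
```

Sending the full state rather than the missed events keeps the correction simple and idempotent, at the cost of a larger payload; a production design might send deltas instead.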
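The heartbeat detection mechanism can be sketched as follows. The interval, the missed-heartbeat threshold and the class name are hypothetical; timestamps are passed in explicitly so the logic is easy to test.

```python
class HeartbeatMonitor:
    """Flags clients whose heartbeats have been missing for too long."""

    def __init__(self, interval_s=5.0, max_missed=3):
        self.interval_s = interval_s    # expected time between heartbeats
        self.max_missed = max_missed    # tolerated number of missed heartbeats
        self.last_seen = {}

    def beat(self, client_id, now):
        """Record a heartbeat reported by a client."""
        self.last_seen[client_id] = now

    def abnormal(self, now):
        """Return clients silent for longer than max_missed heartbeat intervals."""
        limit = self.interval_s * self.max_missed
        return sorted(
            client_id
            for client_id, seen in self.last_seen.items()
            if now - seen > limit
        )


monitor = HeartbeatMonitor(interval_s=5.0, max_missed=3)
monitor.beat("anchor", now=0.0)
monitor.beat("guest", now=0.0)
monitor.beat("anchor", now=10.0)   # "guest" stops reporting after t=0

print(monitor.abnormal(now=16.0))  # ['guest']
```

In practice the server would run `abnormal` on a timer and trigger the exception handling logic, such as removing the terminal from the mixed stream, for each client it returns.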

3.3 Event Live broadcast

In the last module, we focus on how the live broadcasting platform empowers the company’s internal businesses. vivo holds many official live events on a regular basis, such as technology sharing sessions, campus recruitment talks, epidemic knowledge talks, brand image live broadcasts and official mobile phone launch events.

Why is the focus on company live broadcast here?

First of all, official events have a huge influence and a larger audience than other types of live broadcast, so we put more energy into their development, operation and maintenance than into other live broadcasts. We have also summarized some technical lessons from company live broadcasts. First, we need to ensure the stability and smoothness of the whole live broadcast. We monitor and optimize the network, and build internal live broadcast servers in multiple regions to achieve traffic isolation and load balancing across regions. The other aspect is technical support for flexibility. To save development time, after supporting several large company live broadcasts we systematically sorted out and summarized the common features into SDKs. Using live broadcast SDKs that have been proven problem-free many times reduces the probability of errors and raises the success rate of the whole live broadcast.

Next, we will talk about two interesting and practical cases from our exploration of internal live broadcasting. The first case is that last year, because of the epidemic, our company could only hold its annual meeting as an online live broadcast. vivo employees work in many offices located in various parts of the country. How to efficiently guarantee high-definition viewing for 10,000 employees in multiple office locations at home and abroad was the problem we needed to solve at the time.

The root cause of this problem is that the egress bandwidth of each office area is limited. If all employees use the company’s wireless network to watch the video, the egress bandwidth of each office area will be full, which will affect some employees’ viewing experience and even their daily work.

Generally speaking, there are two common solutions to the bandwidth limitation problem.

  • Temporarily increase the egress bandwidth of network operators in each office location;

  • Reduce bandwidth pressure and cost by reducing the bit rate of live broadcast.

In fact, neither of these two schemes can fully guarantee a problem-free live broadcast, and because they sacrifice some users’ viewing experience, they are hard to accept. The solution we finally implemented is to push the stream through the Intranet: Intranet servers perform load balancing, and viewing requests from each office are resolved through DNS to the live broadcast server of the local office. In this way, the bandwidth problem is solved and high-definition 4K resolution can be supported. This capability has now been verified as feasible many times. The practical details have also been published on a number of authoritative technical websites and received consistent praise from the community. If you are interested, you can refer to them.

The second interesting case is that the company’s regular press conferences and publicity events need to be streamed to multiple third-party live broadcasting platforms at the same time, such as B station and Tencent Live. Because the stream pushing equipment limits the number of push streams and cannot support multiple push addresses, the previous approach was to coordinate with each live broadcast partner in a group chat and synchronize a single stream address. Confirming it every time involved a lot of coordination work, which greatly affected efficiency, and configuration errors by some partners could easily prevent a normal live broadcast.

Against this background, we made some adjustments. By integrating the internal live broadcast servers with cloud servers and preparing the corresponding network capacity, we can automatically detect problems on the main network line and switch stream pushing to another cluster, which ultimately guarantees the smoothness and stability of long-distance live broadcasts. We also built a manual operation platform. In addition, through the open platform, partners can modify the relevant configurations ad hoc, which greatly reduces the possibility of manual configuration errors and supports the company’s live broadcasts with higher efficiency.
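The effect of resolving each office's viewing requests to a local live server can be sketched as a subnet lookup. The subnets and host names below are made up for illustration, and a real deployment would implement this in the DNS layer (for example with split-horizon views) rather than in application code.

```python
import ipaddress

# Hypothetical office subnets mapped to hypothetical Intranet live servers.
OFFICE_SERVERS = {
    ipaddress.ip_network("10.1.0.0/16"): "live-office-a.example.internal",
    ipaddress.ip_network("10.2.0.0/16"): "live-office-b.example.internal",
}
DEFAULT_SERVER = "live-central.example.internal"


def resolve_live_server(client_ip):
    """Return the live server of the office whose subnet contains the client."""
    addr = ipaddress.ip_address(client_ip)
    for network, server in OFFICE_SERVERS.items():
        if addr in network:
            return server
    return DEFAULT_SERVER  # clients outside any known office subnet


print(resolve_live_server("10.1.23.45"))   # live-office-a.example.internal
print(resolve_live_server("192.168.0.7"))  # live-central.example.internal
```

Because every viewer in an office pulls from a server inside that office, only one copy of the stream has to cross the office's egress link, regardless of how many employees are watching.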
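The combination of cluster failover and fan-out to several partner platforms can be sketched as below. The cluster names, the URLs and the `is_healthy` callback are hypothetical; the sketch only builds a relay plan and does not push any real stream.

```python
def choose_cluster(clusters, is_healthy):
    """Return the first healthy cluster, in priority order."""
    for cluster in clusters:
        if is_healthy(cluster):
            return cluster
    raise RuntimeError("no healthy relay cluster available")


def build_relay_plan(source_url, target_urls, clusters, is_healthy):
    """Plan one relay task per partner address on the selected cluster.

    The encoder pushes a single stream to `source_url`; the cluster pulls it
    once and re-pushes it to every partner, so the encoder's push limit no
    longer restricts the number of target platforms.
    """
    cluster = choose_cluster(clusters, is_healthy)
    return [
        {"cluster": cluster, "pull": source_url, "push": target}
        for target in target_urls
    ]


# Hypothetical example: the internal cluster is down, so the cloud one is used.
down = {"intranet"}
plan = build_relay_plan(
    "rtmp://source.example/live/event",
    ["rtmp://partner-a.example/live/key", "rtmp://partner-b.example/live/key"],
    clusters=["intranet", "cloud"],
    is_healthy=lambda cluster: cluster not in down,
)
print([task["cluster"] for task in plan])  # ['cloud', 'cloud']
```

Keeping the partner addresses in one place like `target_urls` is also what makes it practical to let partners edit their own configuration through a platform instead of passing addresses around by hand.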

IV. Summary

At present, the vivo live broadcasting platform is still in a stage of preliminary construction and continuous exploration, but our direction and planning are clear: to keep enriching the forms of consumer-facing live broadcasting, bring a better experience to vivo mobile phone users, and at the same time accumulate the related technologies. Eventually, these technologies will be turned into standard solutions that provide content and technology to other departments of the company, such as technical SDK services, internal live broadcast services and live-to-short-video services, forming a virtuous circle in which the internal and external platforms feed each other.

Author: Li Guolin, Vivo Internet Server Team