Clubhouse’s valuation has quadrupled in three months. On April 19, Clubhouse announced that it had raised a Series C round at a valuation of $4 billion.

And that’s just the tip of the iceberg in the explosion of real-time audio and video communications.

Thanks to Musk’s “live streaming”, voice chat products are multiplying, and Facebook is rumored to be launching a product to rival Clubhouse. Beyond that, demand for real-time audio and video is also surging in remote work, online education, and pan-entertainment scenarios.

Thanks to the development of 5G, RTC, and other technologies, a voice chat room, live event broadcast room, or online classroom can be built and released quickly, further stimulating the real-time audio and video market. Rongyun’s real-time audio and video service, for example, lets developers integrate audio and video capabilities within 30 minutes in just three steps. First, they register as a developer on the official website, obtain an App Key and related information, and download the SDK; this step can usually be completed within ten minutes. Second, they integrate the downloaded SDK into their development tools, initialize the SDK, and join a room; initialization sets up the device and the audio and video parameters. Third, they publish their own audio stream and subscribe to other users’ streams.
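The three steps can be pictured with a small in-memory sketch. It is only an illustration of the flow: `RtcClient`, `Room`, and every method name here are hypothetical and are not taken from Rongyun’s SDK, and no real media or network traffic is involved.

```python
# Hypothetical sketch of the register -> init/join -> publish/subscribe flow.
# None of these class or method names come from Rongyun's SDK.
from dataclasses import dataclass, field


@dataclass
class Room:
    """In-memory stand-in for a room hosted by a media server."""
    room_id: str
    published: dict = field(default_factory=dict)  # user_id -> stream label


class RtcClient:
    def __init__(self, app_key: str, user_id: str):
        # Step 2a: initialise the client with the App Key obtained in step 1.
        if not app_key:
            raise ValueError("an App Key from developer registration is required")
        self.app_key = app_key
        self.user_id = user_id
        self.room = None

    def join(self, room: Room) -> None:
        # Step 2b: join a room.
        self.room = room

    def publish_audio(self) -> str:
        # Step 3a: publish this user's own audio stream.
        if self.room is None:
            raise RuntimeError("join a room before publishing")
        label = f"audio:{self.user_id}"
        self.room.published[self.user_id] = label
        return label

    def subscribe_all(self) -> list:
        # Step 3b: subscribe to everyone else's published streams.
        if self.room is None:
            raise RuntimeError("join a room before subscribing")
        return sorted(
            label
            for uid, label in self.room.published.items()
            if uid != self.user_id
        )
```

In this sketch, two clients that join the same `Room` and each call `publish_audio()` will see only the other’s stream from `subscribe_all()`.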

The 5G era requires more convenient RTC technology services

Why does the market need the ability to quickly integrate real-time audio and video?

On the one hand, thanks to 5G, real-time audio and video functions are being embedded in many traditional Internet scenarios. On the other hand, vendors focused on the application layer need to ship features at the lowest cost and the fastest speed to support product launch and operation.

Rongyun CTO Ren Jie believes that 5G brings two major changes to the RTC market.

First, 5G will greatly improve bandwidth and latency, so high-definition, low-latency audio and video calls will become mainstream. On 4G networks, mainstream real-time audio and video calls run at 720p, and 1080p is somewhat less stable. With the advent of 5G, call scenarios at 1080p and at even higher resolutions such as 4K and 8K will become more common.
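To get a feel for why higher resolutions depend on more bandwidth, it helps to compare raw pixel counts per frame. The sketch below does only that arithmetic, using the standard 16:9 frame sizes. Actual bitrates depend on the codec, frame rate, and content, so the ratios are a rough indication rather than a bandwidth requirement.

```python
# Pixel counts per frame for common 16:9 resolutions.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def pixel_ratio(name: str, baseline: str = "720p") -> float:
    """Return how many times more pixels `name` has than `baseline`."""
    width, height = RESOLUTIONS[name]
    base_width, base_height = RESOLUTIONS[baseline]
    return (width * height) / (base_width * base_height)


if __name__ == "__main__":
    for name in RESOLUTIONS:
        print(f"{name}: {pixel_ratio(name):g}x the pixels of 720p")
```

A 1080p frame has 2.25 times the pixels of a 720p frame, a 4K frame 9 times, and an 8K frame 36 times.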

Second, more kinds of IoT devices will be connected. Until now, real-time audio and video has mainly served mobile and PC applications, while other IoT devices such as vehicles, cameras, and large-screen devices have been connected less often. Ren believes that with the arrival of 5G, access scenarios for all kinds of IoT devices will also increase. From a technical point of view, once 5G solves the latency problem, a large number of devices can be connected, and RTC applications on many real-time operating systems (RTOS), as well as on Linux, will become more mainstream.

The arrival of 5G has spawned many new real-time audio and video applications, which also means that RTC technology providers need to provide better support.

“We need to be able to provide a high-definition, stable, and smooth audio and video experience anytime, anywhere.” Ren Jie summed up that the RTC field involves many related technologies, but Rongyun has only one business goal. Behind the high-quality audio and video experience presented to users, Rongyun of course has to solve technical problems such as network bandwidth limits and audio and video processing.

What are RTC technology service providers doing?

RTC technology providers have a lot of work to do behind HD, stable, and smooth audio and video services, as well as behind “fast integration in 30 minutes”.

In general, the RTC technology stack can be divided into two parts: client-side (on-device) processing and server-side processing.

On-device processing mainly covers audio and video encoding and decoding. Alongside the codec sits a series of audio and video pre-processing steps such as echo cancellation, noise reduction, howling suppression, and voice gain, which include the “3A” algorithms (acoustic echo cancellation, noise suppression, and automatic gain control). The server side uses some processing techniques similar to those on the device, such as audio and video recording, but it focuses on the transmission level, such as network optimization.

Network optimization can be divided into two parts: weak-network resilience, and distributed networking and scheduling. Whether the connection is 5G, 4G, or WiFi, any wireless signal is subject to occlusion and attenuation. In practice this shows up as an unstable network: instantaneous packet loss is often high, and latency may suddenly spike.

In addition, once an endpoint device is connected to the network, real-time audio and video streams between two or more endpoints have to pass through multiple network nodes. Which set of paths gives endpoints better access and keeps the streams more stable is a question of network routing strategy, and it is the problem that distributed networking and scheduling must solve.
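One textbook way to frame the path question is as a shortest-path search over measured link latencies. The sketch below uses Dijkstra’s algorithm for that. It is an assumption made for illustration, not a description of Rongyun’s scheduler, and the node names and latency figures are invented.

```python
# Illustrative only: choose the lowest-latency path between two endpoints
# with Dijkstra's algorithm. A real scheduler would weigh more than latency.
import heapq


def best_path(graph: dict, src: str, dst: str):
    """graph maps node -> {neighbor: latency_ms}. Returns (total_ms, path) or None."""
    queue = [(0.0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, latency in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + latency, neighbor, path + [neighbor]))
    return None  # dst is unreachable from src


# Hypothetical topology with made-up latencies in milliseconds.
EXAMPLE = {
    "caller": {"edge_a": 20, "edge_b": 5},
    "edge_a": {"callee": 10},
    "edge_b": {"relay": 30},
    "relay": {"callee": 5},
}
```

With `EXAMPLE`, the path through `edge_a` totals 30 ms and beats the 40 ms path through `edge_b` and `relay`, even though the first hop to `edge_b` is faster.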

To address these problems, Rongyun has optimized in three areas:

First, it has optimized its weak-network resilience algorithms, including redefining some of them so that they strictly distinguish random packet loss from bandwidth limitation and respond quickly.
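The article does not say how the two kinds of loss are told apart. A common heuristic, assumed here purely for illustration, is that loss caused by a bandwidth limit tends to come with rising round-trip time as queues fill, while random wireless loss does not. The thresholds below are arbitrary placeholders.

```python
# Illustrative heuristic, not Rongyun's algorithm: classify a loss episode
# by whether round-trip time was climbing while packets were being lost.
from statistics import mean


def classify_loss(rtt_ms: list, loss_rate: float,
                  loss_threshold: float = 0.02,
                  rtt_growth: float = 1.25) -> str:
    """Return 'healthy', 'random_loss', or 'bandwidth_limited'."""
    if loss_rate < loss_threshold:
        return "healthy"
    half = len(rtt_ms) // 2
    if half == 0:
        # Too few RTT samples to see a trend; treat the loss as random.
        return "random_loss"
    early = mean(rtt_ms[:half])
    late = mean(rtt_ms[half:])
    if late >= early * rtt_growth:
        # Delay is building up alongside the loss: queues are likely filling.
        return "bandwidth_limited"
    return "random_loss"
```

The distinction matters because the usual responses differ: random loss is typically countered with retransmission or forward error correction, whereas a bandwidth limit calls for lowering the send bitrate.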

Second, for dynamic routing, Rongyun probes links in advance and checks them dynamically. A link failure can be detected within 3 to 4 seconds and traffic rescheduled immediately, which includes scheduling across servers and loads. This is the distributed deployment.
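Detection within a few seconds followed by rescheduling can be sketched as a heartbeat timeout plus a load-aware pick among the surviving nodes. This is a minimal sketch under assumptions: the `LinkMonitor` class, the 3-second timeout, and the load figures are hypothetical and do not describe Rongyun’s implementation.

```python
# Illustrative heartbeat-based failure detection with load-aware rescheduling.
class LinkMonitor:
    def __init__(self, timeout_s: float = 3.0):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node -> timestamp of last successful probe

    def heartbeat(self, node: str, now: float) -> None:
        """Record that a probe to `node` succeeded at time `now` (seconds)."""
        self.last_seen[node] = now

    def alive(self, now: float) -> list:
        """Nodes whose last successful probe is within the timeout window."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_s
        )

    def pick(self, load: dict, now: float):
        """Choose the least-loaded live node, or None if every node timed out."""
        candidates = self.alive(now)
        if not candidates:
            return None
        return min(candidates, key=lambda node: load.get(node, 0.0))
```

Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock; a production monitor would take them from a monotonic timer.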

Third, global network deployment. Rongyun’s audio and video service uses a large number of IaaS providers and cooperates with data centers around the world to deploy as many nodes as possible. It can also monitor the status of these nodes in real time, expand capacity, and add nodes. In the background, an audio and video quality (QoE) system monitors the entire network and tunes it in real time.

Be the developer’s ally, not a competitor

“We are a PaaS provider, so we will build aPaaS products on top of that, but we will not build to-C products directly.” Ren Jie said that wherever a capability is highly reusable, Rongyun will consider packaging it at a higher level so that developers can use it closer to their scenario. However, Rongyun always serves developers, so it will not make to-C (consumer-facing) products and end up competing with its developer customers.

Take chat rooms as an example: Rongyun’s support for chat room applications focuses on the technical level, so that they can deliver better sound quality.

After developers spend 30 minutes integrating a voice chat room application, Rongyun also provides SDK interfaces tailored to business scenarios, so that developers can more quickly implement functions such as mixing, background music, mic seat management, and network signal display.

One area is mic seat control in the voice chat room. Ren Jie said that mic seat support relies on signaling management. With years of accumulated experience in IM, signaling is a traditional strength of Rongyun, and Rongyun also has a complete set of technical support for voice chat rooms, so mic seat management is comparatively easy to complete. The other area is real-time audio processing, including voice beautification, voice changing, various sound effects, and effects applied after mixing.

Ren pointed out that the development environment is complicated because the RTC field involves many specialized audio and video concepts, as well as audio and video quality control. The more scenario-specific the SDK itself is, the easier it is to integrate, because it hides the technical details from developers, so it has been suggested that providers move toward SaaS.

“To a certain extent, I couldn’t agree more. As a PaaS capability provider, we too want most of all to reduce developers’ costs, so we are also building aPaaS capabilities that sit between SaaS and PaaS. For example, in addition to providing audio and video capabilities, we also provide MeetingLib, a complete control signaling system that is directly tied to certain operations on audio and video streams.”

Developers using this SDK don’t have to worry much about how audio and video streams are processed. When they need to mute microphones, for example, they do it in MeetingLib through a standard interface. Without MeetingLib, developers could still do it themselves with RTCLib, but they would have to issue the mute commands on the app side, or invoke IM capabilities to send a mute command to everyone and disable everyone’s microphone.
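The difference between the two approaches can be sketched as one room-level control call versus a signal that the app would otherwise deliver to each participant itself. The classes below are hypothetical stand-ins written for this illustration; they do not reproduce the MeetingLib or RTCLib interfaces.

```python
# Hypothetical illustration of a room-level control signaling layer.
# These names are invented and are not MeetingLib or RTCLib APIs.
class Participant:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.mic_enabled = True

    def on_signal(self, signal: str) -> None:
        # Each client reacts to control signals it receives.
        if signal == "mute":
            self.mic_enabled = False


class ControlChannel:
    """Room-level signaling: one call fans a command out to participants."""

    def __init__(self, participants):
        self.participants = list(participants)

    def mute_all(self, except_user=None) -> list:
        """Send 'mute' to everyone except `except_user`; return the muted ids."""
        muted = []
        for participant in self.participants:
            if participant.user_id != except_user:
                participant.on_signal("mute")
                muted.append(participant.user_id)
        return muted
```

With a layer like this, the host’s app makes a single `mute_all(except_user="host")` call. Without it, the app has to build and deliver the equivalent signal to each user on its own.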

“So we go up one level, and developers don’t have to pay attention to every single process, every single user.” Ren says that is what they are doing now.

In addition, in terms of customer support, Rongyun has also identified the two types of problems that need the most support.

One is SDK access, that is, issues related to development and integration. Ren Jie said that customers first read the documentation, and can then turn to the support system, tickets, and support groups that Rongyun provides to help them complete the integration. However, since most developers are new to the audio and video technology stack, they struggle when they run into domain-specific concepts that are not part of general development.

For this reason, Rongyun provides summary explanations in the relevant documents, covering basic concepts such as encoding, frame rate, and bit rate. There are also step-by-step instructions detailing the SDK integration process, and a quick demo is provided to help developers get started and integrate quickly.

The other is quality-related support. Because audio and video are heavily affected by the network during a real-time call, problems can arise on the endpoint’s network. “There are a number of issues that need to be addressed in this process, and we also have a self-service platform, Polaris.” According to Ren, Polaris is essentially a QoE system for audio and video. The system records every call and the transmission of its audio and video streams throughout, including data indicators and curves such as transmission rate, stall rate, and whether a black screen occurred. Developers can also query call quality and statistics on the platform.
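Indicators like these can be derived from periodic per-call samples. The sketch below shows one plausible way to summarize them. The `Sample` fields, the one-second sampling interval, and the definition of stall rate as the share of stalled intervals are assumptions for illustration, not Polaris’s actual data model.

```python
# Illustrative QoE summary over per-interval call samples.
# Field names and metric definitions are assumptions, not Polaris's schema.
from dataclasses import dataclass


@dataclass
class Sample:
    bitrate_kbps: float   # transmission rate observed in this interval
    stalled: bool         # True if playback froze during this interval
    black_screen: bool    # True if no video frame was rendered


def summarize(samples: list) -> dict:
    """Aggregate one call's samples into headline quality indicators."""
    count = len(samples)
    if count == 0:
        return {"avg_bitrate_kbps": 0.0, "stall_rate": 0.0, "black_screen": False}
    return {
        "avg_bitrate_kbps": sum(s.bitrate_kbps for s in samples) / count,
        "stall_rate": sum(s.stalled for s in samples) / count,
        "black_screen": any(s.black_screen for s in samples),
    }
```

Keeping the raw samples alongside the summary is what makes it possible to draw the per-call curves the platform exposes.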

Next generation RTC market

As a technology service provider in the RTC market, Rongyun has recently been trying some newer things.

5G has spawned more VR apps, and VR live-streaming apps will soon follow. Rongyun has partnered with many companies in the VR industry and currently provides them with services such as remote maintenance and technical guidance. “With the rollout of 5G, a truly large-scale to-C scenario in the entertainment industry will gradually emerge.”

Pan-entertainment is a major scenario for the next generation of RTC applications. In addition to the emerging voice rooms, there are live streaming, Werewolf, script-murder (murder mystery) games, KTV, and so on, as well as new VR-integrated scenarios in the near future.

In addition, office applications include conference and monitoring scenarios. For example, in monitoring scenarios for public security, security systems, and emergency command, a range of devices has to be connected, which includes support for GB28181 and SIP. Online education also involves many specialized applications, such as small classes, large classes, large live broadcasts, and dual-teacher classes.

Recently, to better enable developers to explore newer applications, Rongyun also launched a promotion offering 200,000 free audio and video minutes. Real-time audio and video users get 200,000 free minutes per month at up to 1080p HD resolution.

In the current boom of RTC applications, Rongyun, as a leader in the communication cloud field with years of IM experience, is able to provide technical services covering all communication scenarios.

Ren Jie said that Rongyun’s advantages come from several aspects. Rongyun is a public cloud PaaS provider and has been working on IM for many years, and more than 99% of RTC scenarios use IM-related capabilities. In addition, Rongyun has a large, highly professional team, invests heavily in audio and video technology, and iterates on it constantly. “A single service provider can offer the integrated ‘RTC+IM+PUSH’ communication capability and cover all communication scenarios with one set of SDKs. Rongyun alone is enough to get this done.”