Contents

  • [How to quickly develop a complete iOS live app] (Principles)
  • [How to quickly develop a complete iOS live app]

Preface

I haven't written a blog post in a long time, but I have stayed focused on the industry and recently studied a lot of new things worth sharing. Mobile live streaming has taken off this year: it has produced a crowd of web celebrities, and even stars have started streaming, so we have to keep up with the times. Since this was my first contact with live streaming, I spent a lot of time learning how it works. This article covers only the principles; practical installments will follow, teaching you to build a complete iOS live streaming app from scratch, and I hope it helps more people understand live streaming quickly. If you like my articles, you can follow my Weibo: acridzheng, and you can also come to Xiao Xiao Ge to learn about our iOS training course. More content will be updated in the future; if you have any questions, please leave me a message on Jianshu.

I. Personal Opinion (Is Live Streaming Hard or Easy?)

Live streaming is hard: in my opinion, anyone who can build live streaming from scratch is a true expert among experts. The technical challenges involved (video/audio processing, image processing, video/audio compression, CDN distribution, instant messaging, and so on) are each deep enough to take years to learn.

Live streaming is easy: experts in each of these fields have already packaged up many excellent frameworks. We only need to use the frameworks others have written to quickly assemble a live streaming app, which is what people call standing on the shoulders of giants.

II. Learn About Live Streaming

Hot Live Streaming products

Yingke (Inke), Douyu, Panda TV, Huya, Huajiao, and so on

Live streaming screenshots




[image: live streaming effects]


1. Features of a complete live streaming app (shared by loyinglin)

  • 1. Chat

  • 2. Gifts

    • Ordinary gifts, luxury gifts, red envelopes, leaderboards, third-party recharge, in-app purchase, dynamic gift updates, cash withdrawal, etc.
  • 3. Live stream list

  • 4. Broadcasting your own stream

    • Recording, stream pushing, decoding, playback, beautification, heartbeat, anchor-to-administrator operations, administrator-to-user operations, background switching, etc.
  • 5. Room logic

    • Creating a room, entering a room, exiting a room, closing a room, switching rooms, room administrator settings, room user list, etc.
  • 6. User logic

    • Regular login, third-party login, registration, search, editing personal information, follow list, fan list, password recovery, viewing personal information, income list, follow/unfollow, etc.
  • 7. Watching live streams

  • 8. Statistics

  • 9. Super administrators

2. The principle of a complete live streaming app

How live streaming works: the video recorded by the anchor is pushed to a server, and the server distributes it to the audience.

Live streaming pipeline: push side (capture, beautification, encoding, stream pushing), server side (transcoding, recording, screenshots, porn detection), player side (stream pulling, decoding, rendering), and interactive system (chat room, gift system, likes)

3. A complete implementation process of live streaming app

1. Capture, 2. Filter processing, 3. Encoding, 4. Stream pushing, 5. Streaming media server processing, 6. Stream pulling, 7. Decoding, 8. Playback, 9. Chat interaction




[figure: live streaming process]


4. The architecture of a complete live streaming app




[figure: live streaming architecture]


5. The technical points of a complete live streaming app




[figure: technical points of a live streaming app]


III. Understand Streaming Media (live streaming relies on streaming media)

  • Streaming media development: the network layer (Socket or ST) handles transmission, the protocol layer (RTMP or HLS) handles network packaging, the container layer (FLV, TS) handles encapsulating the encoded data, and the codec layer (H.264 and AAC) handles image and audio compression.
  • Frame: each frame is a single still image
  • GOP (Group of Pictures): a group of consecutive pictures; each picture is a frame, and a GOP is a collection of many frames
    • Live stream data is essentially a group of pictures made up of I frames, P frames, and B frames. When a user first tunes in, the player looks for an I frame: it fetches the most recent I frame from the server and presents it to the user. The GOP cache therefore adds end-to-end latency, because playback must start from the latest I frame
    • The longer the GOP cache, the better the picture quality
  • Bit rate: the amount of data per second after the image is compressed.
  • Frame rate: the number of images displayed per second. It affects how smooth the picture feels and is directly proportional to smoothness: the higher the frame rate, the smoother the picture; the lower the frame rate, the choppier the picture.
    • Because of the physiology of the human eye, images played above 16 frames per second are perceived as continuous; this phenomenon is called persistence of vision. Once the frame rate reaches a certain level, increasing it further brings no easily noticeable improvement in smoothness.
  • Resolution: the width and height of the (rectangular) picture, i.e. the image dimensions
  • Uncompressed data volume per second: frame rate × resolution (in bytes; see the worked example after this list)
  • Compression ratio: uncompressed data volume per second ÷ bit rate (for the same video source and the same video coding algorithm, the higher the compression ratio, the worse the picture quality.)
  • Video file format: the file suffix, such as .wmv, .mov, .mp4, .mp3, .avi

    • Main purpose: based on the file format, the system automatically decides which software to open the file with.

      Note: changing the file suffix at will does not change the file itself much; renaming .avi to .mp4, for example, still leaves an AVI file inside.

  • Video container format: a container that stores video information. Streaming containers include TS, FLV, etc.; indexed containers include MP4, MOV, AVI, etc.

    • Main purpose: a video file usually contains images and audio, plus some configuration information (such as how the images and audio relate and how to decode them); this content needs to be organized and packaged according to certain rules.
    • Note: the container format looks the same as the file format because a video file's suffix is usually the name of its container format, so people treat the video file format as the video container format.
  • Video container formats vs. video compression standards: they relate like a project and a programming language. The container format is the project, the video encoding is the programming language, and the same project can be implemented in different languages.
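
A quick back-of-the-envelope check of the two definitions above, as a sketch (the 720p/30 fps figures and the 1.5 bytes-per-pixel YUV 4:2:0 assumption are illustrative values, not from the original):

```swift
// Uncompressed data volume per second for a 720p, 30 fps stream.
let width = 1280, height = 720
let fps = 30.0
let bytesPerPixel = 1.5                       // YUV 4:2:0 stores 1.5 bytes per pixel
let rawBytesPerSecond = Double(width * height) * bytesPerPixel * fps
print(rawBytesPerSecond / 1_000_000)          // ≈ 41.5 MB/s before compression

// Compression ratio against a typical 720p live stream bit rate.
let bitRate = 1_200_000.0                     // bits per second
print(rawBytesPerSecond * 8 / bitRate)        // ≈ 276:1
```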

IV. Basic Knowledge of Live Streaming

1. Capture video and audio

* 1.1 Video and audio capture framework *

  • AVFoundation: the framework for playing and creating real-time audiovisual media data. It provides an Objective-C interface for manipulating this audiovisual data, such as editing, rotating, and re-encoding.
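
A minimal AVFoundation capture sketch (it assumes camera and microphone permissions are already granted; the preset and queue label are illustrative choices, not from the original):

```swift
import AVFoundation

// Camera + microphone in, raw video frames out to a delegate callback.
final class CaptureController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()

    func configure() throws {
        session.beginConfiguration()
        session.sessionPreset = .hd1280x720            // a common live streaming preset

        // Video input: the default camera.
        if let camera = AVCaptureDevice.default(for: .video) {
            let input = try AVCaptureDeviceInput(device: camera)
            if session.canAddInput(input) { session.addInput(input) }
        }
        // Audio input: the default microphone.
        if let mic = AVCaptureDevice.default(for: .audio) {
            let input = try AVCaptureDeviceInput(device: mic)
            if session.canAddInput(input) { session.addInput(input) }
        }

        // Video output: uncompressed frames delivered to our queue.
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "capture.video"))
        if session.canAddOutput(output) { session.addOutput(output) }

        session.commitConfiguration()
        session.startRunning()
    }

    // Each CMSampleBuffer is one raw frame, ready for filtering and encoding.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Hand the frame to the beautification/encoding pipeline here.
    }
}
```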

* 1.2 Video and audio hardware equipment *

  • CCD: an image sensor, used for image capture and processing; it converts an image into an electrical signal.
  • Pickup: a sound sensor, used for sound capture and processing; it converts sound into an electrical signal.
  • Audio sample data: generally in PCM format
  • Video sample data: generally in YUV or RGB format. The raw audio/video captured is very large and must be compressed to improve transmission efficiency

2. Video processing (beautification, watermarks)

  • How video processing works: since video is ultimately rendered to the screen frame by frame by the GPU, we can use OpenGL ES to process video frames so that the video shows different effects. It is like water flowing out of a tap, passing through several pipes, and then arriving at different destinations.
    • The various beauty and video-effect apps on the market are implemented with the GPUImage framework.

* Video processing framework *

  • GPUImage: a powerful image/video processing framework based on OpenGL ES. It packages up all kinds of filters, supports custom filters, and has more than 120 common filter effects built in (see the sketch after this list).
  • OpenGL: OpenGL (Open Graphics Library) is a specification that defines a cross-language, cross-platform programming interface for 3D (and 2D) graphics. It is a professional graphics interface: a powerful, easy-to-call low-level graphics library.
  • OpenGL ES: OpenGL ES (OpenGL for Embedded Systems) is a subset of the OpenGL 3D graphics API designed for embedded devices such as mobile phones, PDAs, and game consoles.
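
As a sketch, a minimal GPUImage filter chain (this assumes the Objective-C GPUImage library is integrated; the bilateral filter stands in for a real beauty filter, and the parameter value is illustrative):

```swift
import AVFoundation
import GPUImage
import UIKit

// Camera -> edge-preserving blur (simple skin smoothing) -> on-screen preview.
let camera = GPUImageVideoCamera(sessionPreset: AVCaptureSession.Preset.hd1280x720.rawValue,
                                 cameraPosition: .front)
camera.outputImageOrientation = .portrait

let smoothing = GPUImageBilateralFilter()   // blurs skin while keeping edges sharp
smoothing.distanceNormalizationFactor = 4.0

let preview = GPUImageView(frame: UIScreen.main.bounds)

camera.addTarget(smoothing)                 // raw frames flow into the filter...
smoothing.addTarget(preview)                // ...filtered frames flow into the view
camera.startCameraCapture()
```

Additional filters and watermark overlays stack the same way: each addTarget call appends another stage to the pipeline.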

3. Video encoding and decoding

* 3.1 Video coding framework *

  • FFmpeg: a cross-platform open-source video framework that provides rich functionality such as video encoding, decoding, transcoding, streaming, and playback. It supports almost all audio/video codecs, container formats, and playback protocols.
    • libswresample: audio resampling, rematrixing, and sample-format conversion.
    • libavcodec: a general-purpose codec framework covering many video, audio, and subtitle codecs.
    • libavformat: muxes and demuxes video containers.
    • libavutil: common utility functions such as random number generation, data structures, and math operations.
    • libpostproc: video post-processing.
    • libswscale: video image scaling, color-space conversion, etc.
    • libavfilter: filter functionality.
  • X264: encodes the video's YUV data into the H.264 format
  • VideoToolbox: Apple's built-in API for hardware video decoding and encoding, available only since iOS 8 (see the sketch after this list).
  • AudioToolbox: Apple's built-in API for hardware audio decoding and encoding
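
A minimal sketch of setting up a VideoToolbox H.264 hardware encoder (the resolution, GOP length, and bit rate are illustrative values, not from the original):

```swift
import Foundation
import VideoToolbox

// Create an H.264 compression session (hardware-accelerated on iOS 8+).
var session: VTCompressionSession?
let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 1280, height: 720,
    codecType: kCMVideoCodecType_H264,
    encoderSpecification: nil,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: { _, _, callbackStatus, _, sampleBuffer in
        // Called once per encoded frame: the sample buffer holds H.264 NAL units
        // ready to be packaged into FLV/RTMP chunks for stream pushing.
        guard callbackStatus == noErr, sampleBuffer != nil else { return }
    },
    refcon: nil,
    compressionSessionOut: &session)

if status == noErr, let session = session {
    // Typical live streaming tuning: real-time mode, 2 s GOP at 30 fps, ~1.2 Mbps.
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_MaxKeyFrameInterval, value: NSNumber(value: 60))
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_AverageBitRate, value: NSNumber(value: 1_200_000))
    VTCompressionSessionPrepareToEncodeFrames(session)
    // Feed captured frames with VTCompressionSessionEncodeFrame(...) as they arrive.
}
```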

* 3.2 Video coding technology *

  • Video compression coding standards: coding techniques for compressing (encoding) and decompressing (decoding) video, such as MPEG and H.264
    • Main purpose: to compress raw video pixel data into a video bitstream, greatly reducing the amount of video data. Without compression coding, video is usually enormous; a movie could take hundreds of gigabytes of space.
    • Note: the factor that most affects video quality is its video and audio encoding; the container format matters little
  • MPEG: a video compression method that uses inter-frame compression, storing only the differences between consecutive frames to achieve a high compression ratio
  • H.264/AVC: a video compression method that uses intra-frame prediction plus the same inter-frame prediction as MPEG's P and B frames. It can produce video streams suited to network transmission on demand, with a higher compression ratio and better image quality
    • Note 1: comparing the clarity of a single image, MPEG-4 has the advantage; for clarity during continuous motion, H.264 has the advantage
    • Note 2: because H.264's algorithms are more complex, implementations are more involved and running them needs more processor and memory resources, so systems running H.264 have relatively high requirements.
    • Note 3: because H.264's specification leaves considerable flexibility to individual vendors, interoperability between different products has become a big problem. Data produced by company A's encoder may only be decodable by company A's decoder, which is awkward
  • H.265/HEVC: a video compression method based on H.264 that keeps some of the original techniques and improves others to optimize the trade-off among bitstream size, coding quality, latency, and algorithm complexity.
    • H.265 is a more efficient coding standard: at the same image quality it can compress content into a smaller volume and transmit it faster with less bandwidth
  • I frame (key frame): holds a complete picture; decoding needs only this frame's data (because it contains the full picture)
  • P frame (difference frame): holds the difference between this frame and the previous frame. When decoding, the difference defined by this frame is overlaid on the previously cached picture to produce the final picture. (A P frame has no complete picture data, only the data that differs from the previous frame's picture)
  • B frame (bidirectional difference frame): holds the differences between this frame and both the previous and the following frames. Decoding a B frame requires the cached previous picture as well as the decoded following picture; the final picture comes from combining this frame's data with both. B frames compress well, but the CPU works harder when decoding them
  • Intra-frame compression: when compressing one frame, only that frame's data is considered, ignoring redundancy between adjacent frames; intra-frame compression is generally lossy
  • Inter-frame compression: temporal compression that compresses data by comparing different frames along the timeline; inter-frame compression is generally lossless
  • Muxing: packaging video streams, audio streams, and even subtitle streams into one file (a container format such as FLV or TS), transmitted as a single signal.

* 3.3 Audio coding technology *

  • AAC: the audio compression coding commonly used in live streaming; the raw PCM audio samples are compressed into AAC to reduce bandwidth (the fdk_aac framework in 7.2 converts between the two)
* 3.4 bit rate control *

  • Multiple bit rates: the audience's network conditions vary widely; they may be on WiFi, 4G, 3G, or even 2G. How do we satisfy everyone? Provide multiple lines and tailor the bit rate to the current network environment.
    • For example: the 1024×720, HD, SD, and smooth options often seen in video player software refer to different bit rates.

* 3.5 Video container formats *

  • TS: a streaming container format. One advantage of streaming formats is that no index needs to be loaded before playback, which greatly shortens the initial loading delay. For a long film, an MP4 file's index is quite large, which hurts the user experience

    • Why TS: because two TS segments can be spliced together seamlessly and the player can play them continuously
  • FLV: a streaming media container format. FLV files are very small and load very quickly, which made watching video over the network feasible; FLV has therefore become one of today's mainstream video formats

4. Stream pushing

* 4.1 Data transmission framework *

librtmp: used to transmit data in the RTMP format

* 4.2 Streaming media data transfer protocols *

  • RTMP: Real-Time Messaging Protocol, an open protocol developed by Adobe Systems for transferring audio, video, and data between a Flash player and a server.

    • The RTMP protocol is used to transmit objects, video, and audio.
    • The protocol sits on top of TCP or polling HTTP.
    • The RTMP protocol is like a container for data packets; the payload can be FLV video and audio data. A single connection can carry multiple streams over different channels, and the packets on these channels are transmitted as fixed-size chunks

    Chunk: a message packet

5. Streaming media server

* 5.1 Common Servers *

  • SRS: an excellent open-source streaming media server system developed in China
  • BMS: also a streaming media server system, but not open source; it is the commercial version of SRS with more features than SRS
  • nginx: a free, open-source web server that can be configured as a streaming media server with the RTMP module (see the sketch below).
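
For instance, a minimal streaming setup using the open-source nginx-rtmp-module might look like this (a sketch only: the module must be compiled into nginx, and the paths are illustrative):

```nginx
rtmp {
    server {
        listen 1935;              # the standard RTMP port
        application live {
            live on;              # accept streams pushed to rtmp://<host>/live/<stream>
            hls on;               # also repackage the stream into HLS
            hls_path /tmp/hls;    # where the .m3u8 playlist and .ts segments go
            hls_fragment 3s;      # target segment duration
        }
    }
}
```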

* 5.2 Data Distribution *

  • CDN (Content Delivery Network): a content delivery network publishes a site's content to the network "edge" closest to users, so users can fetch the content they need nearby. This relieves Internet congestion and improves response times when users access the site.
    • A CDN node is essentially a proxy server, acting as an intermediary.
    • How a CDN works, using a request for streaming media data as an example:
      • 1. Streaming media data is uploaded to a server (the origin)
      • 2. The origin stores the streaming media data
      • 3. A client plays the stream and requests the encoded streaming data from the CDN
      • 4. The CDN server responds to the request; if the node does not hold the streaming data, it requests it from the origin in turn. If the video is already cached on the node, skip to step 6.
      • 5. The origin responds to the CDN's request and distributes the stream to the corresponding CDN node
      • 6. The CDN sends the streaming media data to the client
  • Back-to-origin: when a user accesses a URL and the resolved CDN node has not cached the response (or the cache has expired), the node goes back to the origin to fetch it. If no one requests the content, the CDN node will not proactively fetch it from the origin.
  • Bandwidth: the amount of data that can be transmitted in a fixed amount of time
    • For example, a 64-bit, 800 MHz front-side bus has a data transfer rate of 64 bit × 800 MHz ÷ 8 = 6.4 GB/s
  • Load balancing: multiple servers are composed symmetrically into a server set; each server has equal status and can serve requests independently, without help from the others.
    • A load balancing technique distributes external requests evenly across the servers in the symmetric set, and the server that receives a request responds to the client on its own.
    • Load balancing spreads client requests evenly across the server array, providing fast access to important data and solving the problem of massive concurrent access.
    • This clustering technique achieves performance close to a mainframe with minimal investment.
  • QoS (bandwidth management): limits the bandwidth of each group so that the limited bandwidth available delivers the most value

6. Stream pulling

  • Choosing a live streaming protocol:
    • For low-latency or interactive requirements, use RTMP or RTSP
    • For on-demand playback or cross-platform requirements, HLS is recommended
  • Comparison of live streaming protocols:



[figure: comparison of live streaming protocols]


  • HLS: a real-time streaming protocol defined by Apple, implemented on top of HTTP. The transmitted content consists of an M3U8 description file plus TS media files. It supports both live and on-demand streaming and is used mainly on iOS
    • HLS implements live streaming using on-demand technology
    • HLS is adaptive bit rate streaming: the client automatically selects among video streams of different bit rates according to network conditions, using a high bit rate when conditions allow and a lower one when the network is busy, switching freely between them. This is very helpful for keeping playback smooth when a mobile device's network is unstable.
    • It works by having the server provide the video stream at multiple bit rates and list them in an index file; the player adjusts automatically based on playback progress and download speed (see the sample playlist after this list).
  • HLS vs. RTMP: HLS has high latency; RTMP has low latency
    • HLS's small segments produce a huge number of files, and storing or processing them wastes many resources
    • Compared with the RTSP protocol, though, once segmentation is done the subsequent distribution needs no extra special software: an ordinary web server suffices, which greatly lowers the configuration requirements for CDN edge servers, and any off-the-shelf CDN can be used, whereas ordinary servers rarely support RTSP.
  • HTTP-FLV: transmits media content over the HTTP protocol.
    • HTTP is simpler and better known than RTMP; content latency can likewise be 1-3 seconds, and startup is faster because HTTP itself has no complex state interactions. So in terms of latency, HTTP-FLV is superior to RTMP.
  • RTSP: Real-Time Streaming Protocol, which defines how one-to-many applications can efficiently transmit multimedia data over an IP network.
  • RTP: Real-time Transport Protocol, based on UDP and often used together with RTCP. RTP itself provides no timely-delivery mechanism or other quality-of-service (QoS) guarantees; it relies on lower-level services for that.
  • RTCP: RTP's companion protocol. It provides feedback on the quality of service (QoS) delivered by RTP and collects statistics about the media connection, such as bytes transferred, packets transferred, packets lost, and one-way and round-trip network delay.
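
As a sketch, an HLS master playlist advertising three bit rates (all URIs and numbers are illustrative, not from the original article):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1400000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1920x1080
high/index.m3u8
```

The player measures its download speed and switches between entries as network conditions change.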

7. Decoding

* 7.1 Decapsulation *

  • Demuxing: splits a muxed file (a container format such as FLV or TS) back into its video, audio, or subtitle streams so that each can be decoded separately.
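
For example, demuxing with the FFmpeg command-line tool might look like this (a sketch assuming the FLV holds H.264 video and AAC audio; filenames are illustrative):

```sh
# Extract each elementary stream from the container without re-encoding.
ffmpeg -i input.flv -an -c:v copy video.h264   # keep video only (-an drops audio)
ffmpeg -i input.flv -vn -c:a copy audio.aac    # keep audio only (-vn drops video)
```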

* 7.2 Audio codec framework *

  • fdk_aac: an audio codec framework that converts between PCM audio data and AAC audio data

* 7.3 Introduction to Decoding *

  • Hardware decoding: decode with the GPU, reducing CPU load
    • Advantages: smooth playback, low power consumption, fast decoding; disadvantages: weaker compatibility
  • Software decoding: decode with the CPU
    • Advantages: good compatibility; disadvantages: heavier CPU load, higher power consumption, less smooth than hardware decoding, relatively slow decoding

8. Playback

  • ijkplayer: an open-source Android/iOS video player based on FFmpeg
    • API is easy to integrate;
    • Compilation and configuration can be tailored, making it easy to control the installed package size;
    • Supports hardware-accelerated decoding, saving power;
    • Easy to use: specify the pull-stream URL and it decodes and plays automatically.
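
ijkplayer covers RTMP and HTTP-FLV pulling; for a plain HLS stream, the system player is already enough. A minimal sketch (the URL is a placeholder, not a real stream):

```swift
import AVFoundation
import AVKit

// AVPlayer natively understands HLS (.m3u8): it pulls, decodes, and renders the stream.
let url = URL(string: "https://example.com/live/stream.m3u8")!
let player = AVPlayer(url: url)

let controller = AVPlayerViewController()
controller.player = player
// From a UIViewController:
//   present(controller, animated: true) { player.play() }
```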

9. Chat and interact

  • IM (Instant Messaging): a real-time communication system that lets two or more people exchange text messages, files, voice, and video in real time over the Internet.
    • In a live streaming app, IM is mainly used for the text interaction between audience and anchor, and among audience members.

      *Third-party SDKs*

  • Tencent Cloud: an instant messaging SDK provided by Tencent that can serve as a live chat room
  • Rongyun (RongCloud): a popular instant messaging SDK that can serve as a live chat room

V. How to Quickly Develop a Complete iOS Live App

1. Third-party live streaming SDKs for rapid development

Qiniu Live Cloud: a global live streaming cloud service built specifically for live streaming platforms, a one-stop enterprise-grade live streaming cloud platform whose SDK covers end-to-end live streaming scenarios.

    Note: Panda TV, Longzhu TV, and other live streaming platforms all use Qiniu Cloud.

Netease Video Cloud: built on professional cross-platform video codec technology and a large-scale video content distribution network, it provides stable, smooth, low-latency, high-concurrency real-time audio and video services, and live video can be seamlessly integrated into your own app.

2. Why do third-party companies provide their SDKs to us?

  • They hope our products become tied to their SDK and grow dependent on it.
  • Selling the technology makes money, which supports a lot of brilliant programmers

3. Live streaming: develop it yourself or use a third-party SDK?

Third-party SDK development: for a startup team, self-developed live streaming has a very high threshold in technology, CDN, and bandwidth, and it takes a long time to produce a finished product, which is not conducive to attracting investment.

Self-development: for a company whose live streaming platform is large, self-development saves costs in the long run, and the technology is far more controllable than using an SDK directly.

4. Benefits of third-party SDKs

  • Lower cost
    • With a good third-party service, you do not have to pay a headhunter to poach an expensive expert, or soothe that expert's temperament
  • Higher efficiency
    • Third-party services are focused and their code is easy to integrate: integration can take as little as 1-2 hours, saving nearly 99% of the time, enough to buy more time to compete with rivals and raise the odds of success
  • Lower risk
    • Professional third-party services are fast, professional, and stable, which greatly strengthens a product's competitiveness (service quality, development speed, and so on) and shortens trial-and-error time; in a startup this is one way to stay alive
  • Leave professional work to the professionals
    • A third-party service is a team of at least 10-20 people focused on the same problem, doing the same thing. The support a dedicated team provides is more than one or two individuals can ever match

Conclusion

Future installments will cover video capture, beautification, chat rooms, the gift system, and other features. Stay tuned!