Introduction

In the previous article we covered the basics of audio and audio rendering. In this article we turn to video, following the same approach: fundamentals plus a demo, with the focus on rendering, capture, and encoding. Player development and screen recording will be covered later.

The basics of video

The physics of the image

If you have ever worked with camera capture or frame animation, you know that a video is simply a sequence of image frames (or frames of YUV data), so learning video starts with understanding images.

Recall the prism experiment from school physics: a prism splits sunlight into a band of colors. Isaac Newton was the first to demonstrate this. Each color of light is bent at a different angle of refraction, just like in a rainbow, so white light can be separated into many colors. Later experiments showed that red, green, and blue light cannot be decomposed any further, so they are called the three primary colors of light. Mixed together in equal amounts they produce white light; in other words, white light contains equal parts of red (R), green (G), and blue (B).

In everyday life we can see the outlines and colors of objects only because they reflect light. Does the same theory hold for a phone screen? No: we can see what is on the screen even in the dark. Here is how the human eye actually sees the content of a phone screen.

Suppose a phone screen has a resolution of 1920 x 1080: there are 1080 pixels horizontally and 1920 pixels vertically, for a total of 1920 x 1080 pixels (this is what resolution means). Each pixel is made up of three sub-pixels, as shown in the image below; they are densely packed and become clearly visible when the picture is magnified or viewed under a microscope. When text or an image is displayed, the sub-pixels corresponding to the R, G, and B channels of each pixel are driven on the screen, and together they form the whole picture.

The reason you can see the screen even in the dark is that the screen emits light itself rather than reflecting it.

(image: the RGB sub-pixel structure of a phone screen)

Numerical representation of the image

RGB representation

From the previous section we know that any image can be described in terms of RGB. So how do we represent the RGB values of a single pixel? In audio, each sample is typically represented by 16 bits; what about the sub-pixels of a pixel? There are several common representations.

  • Floating-point representation: each sub-pixel is a value in the range 0.0 to 1.0, which is how OpenGL ES represents color components.
  • Integer representation: each sub-pixel is a value from 0 to 255 (0x00 to 0xFF), i.e. 8 bits per sub-pixel and 32 bits per pixel. This corresponds to the RGBA_8888 image format found on most platforms. There are also more compact layouts; for example, RGB_565 on Android packs a pixel into 16 bits: 5 bits for R, 6 bits for G, and 5 bits for B (see the packing sketch below).
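
To make the integer layouts concrete, here is a small illustrative C++ sketch (not part of the demo code) that packs 8-bit R, G, B values into the RGB_565 layout and extracts channels from a 32-bit pixel, assuming a 0xAARRGGBB packing:

#include <cstdint>

// Pack 8-bit channels into a 16-bit RGB_565 pixel: RRRRR GGGGGG BBBBB
uint16_t packRGB565(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Extract channels from a 32-bit pixel assumed to be packed as 0xAARRGGBB
uint8_t alphaOf(uint32_t p) { return (p >> 24) & 0xFF; }
uint8_t redOf(uint32_t p)   { return (p >> 16) & 0xFF; }
uint8_t greenOf(uint32_t p) { return (p >> 8) & 0xFF; }
uint8_t blueOf(uint32_t p)  { return p & 0xFF; }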

Images are generally described with the integer representation. For example, the size of an uncompressed 1920 x 1080 RGBA_8888 image can be calculated as follows:

1920 * 1080 * 4 / 1024 / 1024 ≈ 7.91 MB

This is also roughly the memory footprint of a bitmap, so the raw data of even a single image is large, and raw image data is impractical to transmit over a network directly. This is why image compression formats exist. JPEG, the static image compression standard defined by ISO, is a good example (I have open-sourced a JPEG compression tool before). The JPEG algorithm offers both a good compression ratio and good reconstruction quality, it is widely used in image processing, and it is a lossy compression; many websites, Taobao included, serve JPEG-compressed images. However, this kind of compression cannot be applied to video as-is, because video has a time dimension to consider: we need not only intra-frame coding but also inter-frame coding. Video therefore uses more sophisticated algorithms, and we will introduce video compression later.

YUV representation

Raw video frames are usually represented in a YUV format rather than RGB. YUV was originally designed to optimize the transmission of color television signals while remaining backward compatible with black-and-white televisions. Compared with transmitting RGB, its biggest advantage is that it needs far less bandwidth (RGB requires three full, independent signals to be transmitted simultaneously). Y stands for luminance (brightness), while U and V carry chrominance, which describes the color and its saturation and is used to specify the color of a pixel. Luminance is derived from the RGB input signal by summing weighted portions of the R, G, and B components. Chrominance describes two aspects of color, hue and saturation, represented by Cr and Cb: Cr is the difference between the red component of the RGB input and the luminance, and Cb is the difference between the blue component and the luminance.

The YUV color space is used because the luminance signal Y is separated from the chrominance signals U and V. If a frame contains only the Y component and no U or V, it is a grayscale (black, white, and gray) image. Color television adopted YUV precisely so that the luminance signal Y could keep color broadcasts compatible with black-and-white sets: a black-and-white TV can still display a color signal. The most common representation uses 8 bits for each of Y, U, and V, giving a value range of 0 to 255. In broadcast television, however, the extreme values are not transmitted, in order to leave headroom against overload caused by signal variation: Y is limited to 16 to 235, and U and V to 16 to 240.
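
As a small illustration of what these limited ranges mean in code, here is a sketch of the standard scaling from limited ("TV") range back to full range, assuming BT.601-style 8-bit data:

#include <cstdint>

// Expand a limited-range luma sample (16..235) to full range (0..255).
uint8_t lumaToFullRange(uint8_t y) {
    int v = (y - 16) * 255 / 219;   // 219 = 235 - 16 usable luma levels
    if (v < 0) v = 0; else if (v > 255) v = 255;
    return static_cast<uint8_t>(v);
}
// Chroma uses 224 levels (16..240) and is handled analogously around its midpoint of 128.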

The most commonly used YUV sampling format is 4:2:0. The name does not mean that there are only Y and Cb components and no Cr; it means that on each scan line only one chrominance component is stored, sampled at half the horizontal rate (2:1), and adjacent scan lines alternate between the two chrominance components: if one line is 4:2:0, the next is 4:0:2, the one after that 4:2:0 again, and so on. Each chrominance component is therefore subsampled 2:1 both horizontally and vertically, so overall chrominance is sampled at a quarter of the luminance rate (4:1). For uncompressed 8-bit video, an 8 x 4 image takes 48 bytes of memory (32 bytes of Y plus 8 bytes each of U and V).
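
As a quick illustration (not taken from the demo code), the plane sizes of a YUV420P frame, with the Y plane followed by the U and V planes, can be computed like this:

#include <cstddef>

// Plane sizes of a YUV420P frame: Y is full resolution, U and V are quarter resolution each.
struct Yuv420pLayout {
    size_t ySize, uSize, vSize, total;
};

Yuv420pLayout yuv420pLayout(size_t width, size_t height) {
    Yuv420pLayout l;
    l.ySize = width * height;               // one byte per pixel
    l.uSize = (width / 2) * (height / 2);
    l.vSize = (width / 2) * (height / 2);
    l.total = l.ySize + l.uSize + l.vSize;  // = width * height * 3 / 2
    return l;
}
// For 1920 x 1080: total = 1920 * 1080 * 1.5 bytes ≈ 2.97 MB, matching the calculation below.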

For comparison with RGB, consider a 1920 x 1080 video frame stored in YUV420P format; its size is:

(1920 * 1080 * 1 + 1920 * 1080 * 0.5) / 1024 / 1024 ≈ 2.966 MB

If the frame rate (FPS, the number of video frames per second) is 25, then a 5-minute short video stored as raw YUV420P would require:

2.966 MB * 25 fps * 5 min * 60 s / 1024 ≈ 21 GB

As you can see, just 5 minutes of raw video already amounts to about 21 GB, yet short-video apps such as Douyin and Kuaishou play smoothly. So how can short videos be stored and streamed at all? The answer, of course, is video encoding, which I will cover later.

If you are not sure how YUV is sampled and stored, you can read this article: Audio and Video basics – Pixel format YUV.

YUV and RGB conversion

As mentioned above, everything rendered on the screen, whether text or images, ultimately has to be expressed in RGB. So how do we convert between the YUV and RGB representations? You can refer to the article "YUV <—> RGB conversion algorithm"; C++ code for the conversion is available at the linked address.
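
As a hedged illustration of what such a conversion looks like, here is one common full-range BT.601 form in C++; the exact coefficients depend on the color standard and may differ from those in the linked article:

#include <algorithm>
#include <cstdint>

static uint8_t clamp8(int v) { return static_cast<uint8_t>(std::min(255, std::max(0, v))); }

// Full-range BT.601 YUV -> RGB for a single pixel (one common variant of the formula).
void yuvToRgb(uint8_t y, uint8_t u, uint8_t v, uint8_t &r, uint8_t &g, uint8_t &b) {
    int c = y, d = u - 128, e = v - 128;
    r = clamp8(static_cast<int>(c + 1.402 * e));
    g = clamp8(static_cast<int>(c - 0.344 * d - 0.714 * e));
    b = clamp8(static_cast<int>(c + 1.772 * d));
}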

How video is encoded

Video coding

Remember how audio encoding worked in the previous article? Audio coding is mainly about removing redundant information in order to compress the data. So which kinds of redundancy can video compression exploit? Like the audio encoding discussed earlier, video compression also works by removing redundancy. Compared with audio, video data is highly correlated, which means it contains a great deal of redundant information, both spatial and temporal. The key tools for removing temporal redundancy are the following.

  • Motion compensation: predicting and compensating the current local image from a previous local image; an effective way to reduce redundancy across a sequence of frames.
  • Motion representation: different regions of the image need different motion vectors to describe their motion.
  • Motion estimation: a set of techniques for extracting motion information from a video sequence.

Spatial redundancy, on the other hand, is removed with intra-frame coding.
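
To make "motion estimation" a bit more concrete, here is a toy C++ sketch, purely illustrative and not taken from any real encoder, that finds the best motion vector for one 16 x 16 block by exhaustive search, using the sum of absolute differences (SAD) as the matching cost (bounds checks omitted for brevity):

#include <cstdint>
#include <cstdlib>
#include <limits>

struct MotionVector { int dx, dy; };

// SAD between the 16x16 block at (bx,by) in `cur` and the block at (bx+dx, by+dy) in `ref`.
static int sad16x16(const uint8_t *cur, const uint8_t *ref, int stride,
                    int bx, int by, int dx, int dy) {
    int sum = 0;
    for (int y = 0; y < 16; ++y)
        for (int x = 0; x < 16; ++x)
            sum += std::abs(cur[(by + y) * stride + bx + x] -
                            ref[(by + dy + y) * stride + bx + dx + x]);
    return sum;
}

// Exhaustive search in a +/- range window; real encoders use much faster search strategies.
MotionVector estimateMotion(const uint8_t *cur, const uint8_t *ref, int stride,
                            int bx, int by, int range) {
    MotionVector best{0, 0};
    int bestCost = std::numeric_limits<int>::max();
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int cost = sad16x16(cur, ref, stride, bx, by, dx, dy);
            if (cost < bestCost) { bestCost = cost; best = {dx, dy}; }
        }
    }
    return best;  // the encoder then codes this vector plus the residual (motion compensation)
}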

Remember JPEG, the still-image standard? For video, ISO also defined standards, through MPEG (the Moving Picture Experts Group). The MPEG algorithms are designed for moving pictures: besides coding individual images, they exploit the correlation between successive images in a sequence to remove temporal redundancy, which greatly improves the compression ratio. MPEG has been updated continuously, and its main versions are MPEG-1 (used for VCD), MPEG-2 (used for DVD), and MPEG-4 AVC (the one most used for streaming media today).

Alongside the ISO MPEG compression standards, the ITU-T defined its own series of video coding standards: H.261, H.262, H.263, and H.264. Among them, H.264 combines the advantages of the earlier standards and draws on their experience while adopting a simple design, which made it easier to popularize than MPEG-4 (Part 2). H.264 is now the most widely used standard. It introduced new compression techniques such as multiple reference frames, multiple block types, integer transforms, and intra prediction, and it uses more precise sub-pixel motion vectors (1/4, 1/8 pixel) together with a new-generation in-loop deblocking filter. Together these greatly improve compression performance and make the whole system more complete.

Coding concept

In video coding, each frame represents a still image. During actual compression, various techniques are used to reduce the amount of data, and the division into I, P, and B frames is the most common one.

IPB frames

  • I frame: a key frame. Think of it as a fully self-contained picture; it can be decoded on its own because it carries the complete frame data.
  • P frame: a predicted frame that stores the difference between the current frame and a previous frame (an I frame or another P frame). To decode it, the difference is applied on top of the previously cached picture to produce the final picture. (Also called a differential frame: a P frame does not carry complete picture data, only the changes relative to the previous frame.)
  • B frame: a bidirectional frame that stores differences relative to both a previous frame (I or P) and a following P frame. To decode a B frame you need both the previously decoded picture and the later decoded picture; the final picture is obtained by combining the two with the data of this frame. B frames compress very well, but they make decoding more CPU-intensive.

Understanding IDR frames versus I frames

H.264 has the concept of an IDR frame (Instantaneous Decoding Refresh picture). How does it differ from an ordinary I frame? Because H.264 uses multi-frame prediction, a P frame that follows an I frame may still reference frames that precede that I frame. This means that, for random access, landing on an arbitrary I frame is not enough: even if you find the I frame, the frames after it may not be decodable. An IDR frame is a special I frame: every frame that follows it may only reference the IDR frame or frames after it, never anything before it. As soon as the decoder receives an IDR frame, it flushes the reference frame buffer and uses the IDR frame as the new reference.

PTS and DTS

DTS (Decoding Time Stamp) tells the decoder when to decode a frame, while PTS (Presentation Time Stamp) is used for display and synchronization: it tells the player when to show the frame. When there are no B frames, DTS and PTS follow the same order. B frames, however, decouple decoding order from display order, so once B frames are present, PTS and DTS necessarily differ. In most coding standards (such as H.264 and HEVC) the encoding order is not the same as the input order, which is exactly why two separate timestamps are needed.
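
As a concrete illustration (a hypothetical four-frame group, not taken from a real stream), suppose the encoder emits an I frame, then a P frame on which the two intermediate B frames depend. The stream is stored and decoded in DTS order, and the player reorders frames by PTS for display:

Display order (by PTS): I1  B2  B3  P4
Decode order (by DTS):  I1  P4  B2  B3

The decoder must receive and decode P4 before it can decode B2 and B3, which is why the two timestamps diverge as soon as B frames appear.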

The concept of the GOP

A Group of Pictures (GOP) is the set of frames between two I frames. When configuring an encoder you usually set gop_size, the number of frames between two I frames. The largest frame in a GOP is the I frame, so, relatively speaking, a larger gop_size leaves more bits for the rest of the frames and tends to improve overall picture quality. However, the decoder can only start decoding correctly from the first I frame it receives; before that, no picture can be reconstructed. Another common trick for improving quality is to use more B frames: roughly speaking, an I frame compresses about 7:1 (similar to JPEG), a P frame about 20:1, and a B frame up to 50:1. Using B frames therefore saves a lot of bits, which can be spent elsewhere in the stream to deliver better quality at the same bit rate. In practice, gop_size should be chosen according to the business scenario to get the best quality.
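
To show where these parameters appear in practice, here is a minimal sketch assuming the encoder is configured through FFmpeg's libavcodec (the values are illustrative only, not recommendations):

extern "C" {
#include <libavcodec/avcodec.h>
}

// Configure GOP-related parameters on an already-allocated encoder context.
void configureGop(AVCodecContext *ctx) {
    ctx->time_base = AVRational{1, 25}; // 25 fps input
    ctx->gop_size = 50;                 // one I frame every 2 seconds at 25 fps
    ctx->max_b_frames = 2;              // allow up to 2 consecutive B frames
}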

Hopefully the ordering example above makes the concepts of DTS and PTS easier to grasp.

Video rendering

OpenGL ES

Implementation effect

Introduction

OpenGL (Open Graphics Library) defines a cross-language, cross-platform programming interface for professional graphics. It can be used for 2D and 3D image processing and rendering, and it is a powerful, convenient low-level graphics library. For embedded devices there is OpenGL ES (OpenGL for Embedded Systems), a subset of OpenGL designed for devices such as phones and tablets. OpenGL ES has gone through many iterations, and to date OpenGL ES 2.0 is still the most widely used version. The demo that follows is programmed against the OpenGL ES 2.0 API for image rendering.

Because OpenGL ES is designed to be cross-platform, each platform must provide its own glue layer that supplies the OpenGL ES context and window management; by design, OpenGL itself does not manage windows. On Android, that platform layer is EGL, which connects OpenGL ES to the local window system.

Usage

The easiest way to use OpenGL ES on Android is through GLSurfaceView: you do not have to set up an OpenGL ES context or create a display surface yourself. But GLSurfaceView is not very flexible; many of the more advanced uses of OpenGL ES (such as sharing a context across threads) are built directly on the EGL API in a C++ environment. Writing everything in the Java layer may be fine for simple applications, but scenarios that involve decoding or third-party native libraries (face recognition, for example) need to run in the C++ layer anyway. For efficiency and performance, the architecture here therefore builds the OpenGL ES environment directly with EGL in the native layer. To use EGL in native code, you must add the EGL library to CMakeLists.txt (see the CMakeLists configuration below) and include the corresponding headers in the C++ files that use it. The headers are:

// 1. To use EGL, add the EGL library to CMakeLists.txt and include the EGL headers:
#include <EGL/egl.h>
#include <EGL/eglext.h>

// 2. To use OpenGL ES 2.0, also add the GLESv2 library to CMakeLists.txt and include its headers:
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>

CMakeLists file configuration:

cmake_minimum_required(VERSION 3.4.1)

# Audio Rendering
set(OpenSL ${CMAKE_SOURCE_DIR}/opensl)
# Video Rendering
set(OpenGL ${CMAKE_SOURCE_DIR}/gles)


# Collect all of our .cpp files (do not add *.h here)
file(GLOB ALL_CPP ${OpenSL}/*.cpp ${OpenGL}/*.cpp)

# Add your own CPP source files to generate dynamic library
add_library(audiovideo SHARED ${ALL_CPP})

# find NDK log library in system
find_library(log_lib
        log)

# Finally start linking libraries
target_link_libraries(
        # Name of the generated .so library
        audiovideo
        # Audio rendering
        OpenSLES

        # EGL: the bridge between OpenGL ES and the native window (ANativeWindow)
        EGL
        # Video rendering
        GLESv2
        # Android native window library
        android

        ${log_lib}
)

At this point all the header files and libraries needed for OpenGL development have been brought in. Next, let's see how to use EGL to set up an OpenGL ES context and render video data.

    1. To use EGL, first create a connection between the local window system and OpenGL ES

      // 1. Get the native window
      nativeWindow = ANativeWindow_fromSurface(env, surface);
      // Get the default display
      display = eglGetDisplay(EGL_DEFAULT_DISPLAY);
      if (display == EGL_NO_DISPLAY) {
              LOGD("egl display failed");
              showMessage(env, "egl display failed", false);
              return;
      }
    2. Initialize EGL

      // Initialize EGL; the last two parameters receive the major and minor version numbers (may be null)
          if (EGL_TRUE != eglInitialize(display, 0, 0)) {
              LOGD("eglInitialize failed");
              showMessage(env, "eglInitialize failed", false);
              return;
          }
    3. Determine the configuration of the available rendering surface

          // Surface configuration; think of it as the window configuration
          EGLConfig eglConfig;
          EGLint configNum;
          EGLint configSpec[] = {
                  EGL_RED_SIZE, 8,
                  EGL_GREEN_SIZE, 8,
                  EGL_BLUE_SIZE, 8,
                  EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
                  EGL_NONE
          };

          if (EGL_TRUE != eglChooseConfig(display, configSpec, &eglConfig, 1, &configNum)) {
              LOGD("eglChooseConfig failed");
              showMessage(env, "eglChooseConfig failed", false);
              return;
          }
    4. Create the rendering surface (steps 4 and 5 are interchangeable)

      // Create the surface (associates EGL with the native window; the last parameter is an attribute list, 0 means defaults)
          winSurface = eglCreateWindowSurface(display, eglConfig, nativeWindow, 0);
          if (winSurface == EGL_NO_SURFACE) {
              LOGD("eglCreateWindowSurface failed");
              showMessage(env, "eglCreateWindowSurface failed", false);
              return;
          }
    5. Create the rendering context

          // Create the associated context
          const EGLint ctxAttr[] = {
                  EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE
          };
          // EGL_NO_CONTEXT: this context does not share resources with any other context
          context = eglCreateContext(display, eglConfig, EGL_NO_CONTEXT, ctxAttr);
          if (context == EGL_NO_CONTEXT) {
              LOGD("eglCreateContext failed");
              showMessage(env, "eglCreateContext failed", false);
              return;
          }
    6. Make the EGLContext current, associating it with the surface

          // Associate EGL with OpenGL ES
          // The draw and read surfaces are passed separately; the read surface usually only differs for off-screen rendering
          if (EGL_TRUE != eglMakeCurrent(display, winSurface, winSurface, context)) {
              LOGD("eglMakeCurrent failed");
              showMessage(env, "eglMakeCurrent failed", false);
              return;
          }
    7. Use the OpenGL ES APIs to draw (the GLSL shader sources passed to initShader are sketched after these steps)

          GLint vsh = initShader(vertexShader, GL_VERTEX_SHADER);
          GLint fsh = initShader(fragYUV420P, GL_FRAGMENT_SHADER);
      
          // Create the shader program object
          GLint program = glCreateProgram();
          if (program == 0) {
              LOGD("glCreateProgram failed");
              showMessage(env, "glCreateProgram failed", false);
              return;
          }
      
          // Attach the shaders to the program
          glAttachShader(program, vsh);
          glAttachShader(program, fsh);
      
          // Link the program
          glLinkProgram(program);
          GLint status = 0;
          glGetProgramiv(program, GL_LINK_STATUS, &status);
          if (status == 0) {
              LOGD("glLinkProgram failed");
              showMessage(env, "glLinkProgram failed", false);
              return;
          }
          LOGD("glLinkProgram success");
          // Activate the program
          glUseProgram(program);
      
          // Vertex data (x, y, z) for a full-screen quad drawn with GL_TRIANGLE_STRIP
          static float ver[] = {
                  1.0f, -1.0f, 0.0f,
                  -1.0f, -1.0f, 0.0f,
                  1.0f, 1.0f, 0.0f,
                  -1.0f, 1.0f, 0.0f
          };
      
          GLuint apos = static_cast<GLuint>(glGetAttribLocation(program, "aPosition"));
          glEnableVertexAttribArray(apos);
          glVertexAttribPointer(apos, 3, GL_FLOAT, GL_FALSE, 0, ver);
      
          // Texture coordinate data (s, t) for the four vertices
          static float fragment[] = {
                  1.0f, 0.0f,
                  0.0f, 0.0f,
                  1.0f, 1.0f,
                  0.0f, 1.0f
          };
          GLuint aTex = static_cast<GLuint>(glGetAttribLocation(program, "aTextCoord"));
          glEnableVertexAttribArray(aTex);
          glVertexAttribPointer(aTex, 2, GL_FLOAT, GL_FALSE, 0, fragment);
      
      
      
          // Initialize the textures
          // Assign a sampler uniform to each texture unit
          // An equivalent variant using different uniform names, kept commented out:
          /*
          GLint textureUniformY = glGetUniformLocation(program, "SamplerY");
          GLint textureUniformU = glGetUniformLocation(program, "SamplerU");
          GLint textureUniformV = glGetUniformLocation(program, "SamplerV");
          glUniform1i(textureUniformY, 0);
          glUniform1i(textureUniformU, 1);
          glUniform1i(textureUniformV, 2);
          */
          // Set the sampler variables with glUniform1i
          glUniform1i(glGetUniformLocation(program, "yTexture"), 0);
          glUniform1i(glGetUniformLocation(program, "uTexture"), 1);
          glUniform1i(glGetUniformLocation(program, "vTexture"), 2);
          // Texture IDs
          GLuint texts[3] = {0};
          // Create the texture objects and get their IDs
          glGenTextures(3, texts);
      
          // Bind a texture; all subsequent settings and uploads apply to the currently bound texture object
          // GL_TEXTURE0/1/2 are texture units; GL_TEXTURE_1D, GL_TEXTURE_2D and GL_TEXTURE_CUBE_MAP are texture targets
          // After glBindTexture binds a texture object to a target, operations on that target affect the bound texture
          glBindTexture(GL_TEXTURE_2D, texts[0]);
          // Minification filter
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
          // Magnification filter
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
          // Set the format and size of the texture and allocate its storage;
          // the actual pixel data is uploaded later with glTexSubImage2D
          glTexImage2D(GL_TEXTURE_2D,
                       0,                // mip level, normally 0
                       GL_LUMINANCE,     // internal format: luminance (grayscale)
                       width,            // texture width: the Y plane is full resolution
                       height,           // texture height
                       0,                // border, must be 0
                       GL_LUMINANCE,     // pixel data format: luminance (grayscale)
                       GL_UNSIGNED_BYTE, // data type of each component
                       NULL              // no data passed yet
          );
      
          // Bind the U texture
          glBindTexture(GL_TEXTURE_2D, texts[1]);
          // Minification and magnification filters
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
          // Set the format and size of the texture
          glTexImage2D(GL_TEXTURE_2D,
                       0,                // mip level
                       GL_LUMINANCE,     // internal format: luminance (grayscale)
                       width / 2,        // the U plane holds 1/4 of the frame's samples
                       height / 2,
                       0,                // border
                       GL_LUMINANCE,     // pixel data format
                       GL_UNSIGNED_BYTE, // data type
                       NULL              // no data passed yet
          );
      
          // Bind the V texture
          glBindTexture(GL_TEXTURE_2D, texts[2]);
          // Minification and magnification filters
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
          // Set the format and size of the texture
          glTexImage2D(GL_TEXTURE_2D,
                       0,                // mip level
                       GL_LUMINANCE,     // internal format: luminance (grayscale)
                       width / 2,        // the V plane holds 1/4 of the frame's samples
                       height / 2,
                       0,                // border
                       GL_LUMINANCE,     // pixel data format
                       GL_UNSIGNED_BYTE, // data type
                       NULL              // no data passed yet
          );
      
          // CPU-side buffers for one frame of YUV420P data
          unsigned char *buf[3] = {0};
          buf[0] = new unsigned char[width * height];     // Y
          buf[1] = new unsigned char[width * height / 4]; // U
          buf[2] = new unsigned char[width * height / 4]; // V
      
          showMessage(env, "onSucceed", true);
      
          FILE *fp = fopen(data_source, "rb");
          if (!fp) {
              LOGD("open file %s fail", data_source);
              return;
          }
      
          while (!feof(fp)) {
              // Stop reading data if playback has been cancelled
              if (!isPlay) return;
              fread(buf[0], 1, width * height, fp);
              fread(buf[1], 1, width * height / 4, fp);
              fread(buf[2], 1, width * height / 4, fp);
      
              // Activate texture unit 0 and bind the Y texture to it
              glActiveTexture(GL_TEXTURE0);
              glBindTexture(GL_TEXTURE_2D, texts[0]);
              // Replace the texture content; much cheaper than re-allocating with glTexImage2D
              glTexSubImage2D(GL_TEXTURE_2D, 0,
                              0, 0,          // offset into the existing texture
                              width, height, // size of the uploaded Y plane
                              GL_LUMINANCE, GL_UNSIGNED_BYTE,
                              buf[0]);
      
              // Activate texture unit 1 and bind the U texture to it
              glActiveTexture(GL_TEXTURE1);
              glBindTexture(GL_TEXTURE_2D, texts[1]);
              // Replace the texture content
              glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width / 2, height / 2, GL_LUMINANCE,
                              GL_UNSIGNED_BYTE,
                              buf[1]);
      
              // Activate texture unit 2 and bind the V texture to it
              glActiveTexture(GL_TEXTURE2);
              glBindTexture(GL_TEXTURE_2D, texts[2]);
              // Replace the texture content
              glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width / 2, height / 2, GL_LUMINANCE,
                              GL_UNSIGNED_BYTE,
                              buf[2]);
      
              // Draw the quad
              glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
              // Display the result: swap the double buffers
              eglSwapBuffers(display, winSurface);
          }
    8. Swap the internal buffers of the EGL surface with the display created by EGL

      // Display the result: swap the double buffers
      eglSwapBuffers(display, winSurface);
    9. Release resources

      /** Destroy and release the EGL resources */
      void Gles_play::release() {
          if (display || winSurface || context) {
              // Unbind the context from this thread before destroying anything
              eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
              // Destroy the rendering surface
              eglDestroySurface(display, winSurface);
              // Destroy the context
              eglDestroyContext(display, context);
              // Release the native window
              ANativeWindow_release(nativeWindow);
              // Release per-thread EGL state
              eglReleaseThread();
              // Terminate the connection to the display
              eglTerminate(display);
              context = EGL_NO_CONTEXT;
              winSurface = EGL_NO_SURFACE;
              display = EGL_NO_DISPLAY;
              nativeWindow = nullptr;
              isPlay = false;
          }
      }
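
For reference, the shader sources (vertexShader and fragYUV420P) passed to initShader in step 7 are not listed above. The following is only a sketch of what they might look like, written to match the attribute and sampler names used in the drawing code (aPosition, aTextCoord, yTexture, uTexture, vTexture); the YUV-to-RGB coefficients are a common BT.601 variant and are an assumption, not necessarily what the demo project uses.

// Hypothetical shader sources matching the names used in the drawing code above.
static const char *vertexShader = R"(
attribute vec4 aPosition;   // vertex position
attribute vec2 aTextCoord;  // texture coordinate
varying vec2 vTexCoord;
void main() {
    vTexCoord = aTextCoord;
    gl_Position = aPosition;
}
)";

static const char *fragYUV420P = R"(
precision mediump float;
varying vec2 vTexCoord;
uniform sampler2D yTexture;
uniform sampler2D uTexture;
uniform sampler2D vTexture;
void main() {
    float y = texture2D(yTexture, vTexCoord).r;
    float u = texture2D(uTexture, vTexCoord).r - 0.5;
    float v = texture2D(vTexture, vTexCoord).r - 0.5;
    // One common (BT.601 full-range) YUV -> RGB conversion
    vec3 rgb;
    rgb.r = y + 1.402 * v;
    rgb.g = y - 0.344 * u - 0.714 * v;
    rgb.b = y + 1.772 * u;
    gl_FragColor = vec4(rgb, 1.0);
}
)";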

Complete demo: OpenGL ES real-time rendering of YUV. To test it, place the raw/*.yuv file in the sdcard/ root directory.

Conclusion

This chapter contains a lot of concepts, which inevitably makes it a bit dry, but understanding them is essential. In the next article we will move on to a hands-on exercise: developing a player with FFmpeg + librtmp that supports both RTMP streaming and local video playback. Looking forward to it? 😜 The article is expected to be published in late February, so stay tuned.