Code address: github.com/deepsadness…

Source of ideas

The previous article on using the recording API for live screen recording said this follow-up was expected in May. After all this time, I finally have the time to fill that gap.

The main idea

  • Use the hardware encoder directly to record the live screen.
  • Use the RTMP protocol to push the live stream.


(Schematic: using MediaProjection)

The whole process: create a VirtualDisplay that renders directly onto MediaCodec's input Surface, obtain the encoded data from MediaCodec, encapsulate it in FLV format, and finally send it over the RTMP protocol.


Capture the screen

1. Use MediaCodec Surface

This section is basically the same as the previous article, except that the Surface now comes from MediaCodec.

@Override
public @Nullable Surface createSurface(int width, int height) {
    mBufferInfo = new MediaCodec.BufferInfo();
    // Create the video MediaFormat and set the encoder parameters.
    MediaFormat format = MediaFormat.createVideoFormat(MIME_TYPE, width, height);
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    format.setInteger(MediaFormat.KEY_BIT_RATE, BIT_RATE);
    format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);
    if (VERBOSE) Log.d(TAG, "format: " + format);
    // Create a MediaCodec encoder, configure it with the format,
    // then create an input Surface to hand to the VirtualDisplay.
    try {
        mEncoder = MediaCodec.createEncoderByType(MIME_TYPE);
        mEncoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        mInputSurface = mEncoder.createInputSurface();
        // Start the encoder
        mEncoder.start();
        // ...
        return mInputSurface;
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
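The snippet above stops at returning the Surface. For completeness, here is a minimal sketch of handing that Surface to the VirtualDisplay; mMediaProjection and the screen size/density variables are assumptions (the MediaProjection itself comes from the user-approved capture intent, as in the previous article).

// Minimal sketch (assumed names): mirror the real screen into the encoder's input Surface.
VirtualDisplay virtualDisplay = mMediaProjection.createVirtualDisplay(
        "ScreenRecord",                                   // debug name of the virtual display
        screenWidth, screenHeight, screenDpi,             // capture size and density (assumed variables)
        DisplayManager.VIRTUAL_DISPLAY_FLAG_AUTO_MIRROR,  // mirror the default display
        mInputSurface,                                    // Surface returned by createSurface() above
        null /* VirtualDisplay.Callback */, null /* Handler */);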

2. Obtain the encoded data

  • Create an encoder HandlerThread. The encoded data is pulled continuously on a new thread, so create a HandlerThread to run the asynchronous work.
    private void createEncoderThread() {
        HandlerThread encoder = new HandlerThread("Encoder");
        encoder.start();
        Looper looper = encoder.getLooper();
        workHandler = new Handler(looper);
    }
  • After the encoder above has started, post a task to begin pulling the encoded data.
// Post with a 1s delay: the hardware encoder needs a moment after start() before output is available.
workHandler.postDelayed(new Runnable() {
    @Override
    public void run() {
        doExtract(mEncoder, null);
    }
}, 1000);

Note that the task is posted with a slight delay, because the hardware encoder takes a little time to initialize and start.

  • Get the encoded data
/**
 * Loop until we are asked to stop.
 *
 * @param encoder
 * @param frameCallback
 */
private void doExtract(MediaCodec encoder, FrameCallback frameCallback) {
    final int TIMEOUT_USEC = 10000;
    long firstInputTimeNsec = -1;
    boolean outputDone = false;
    // No automatic end: keep looping until stopped manually.
    while (!outputDone) {
        // Stop if a manual stop was requested.
        if (mIsStopRequested) {
            Log.d(TAG, "Stop requested");
            return;
        }
        // The encoder is fed directly from the Surface, so here we only poll for encoded output.
        int decoderStatus = encoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
        if (decoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
            // no output available yet
            // if (VERBOSE) Log.d(TAG, "no output from decoder available");
        } else if (decoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
            // not important for us, since we're using Surface
            // if (VERBOSE) Log.d(TAG, "decoder output buffers changed");
        } else if (decoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // After MediaCodec starts encoding it reports a format carrying csd-0 and csd-1,
            // i.e. the SPS and PPS. We need them later, so hand them to the callback.
            MediaFormat newFormat = encoder.getOutputFormat();
            if (VERBOSE) Log.d(TAG, "decoder output format changed: " + newFormat);
            if (frameCallback != null) {
                frameCallback.formatChange(newFormat);
            }
        } else if (decoderStatus < 0) {
            // For now just throw.
            throw new RuntimeException(
                    "unexpected result from decoder.dequeueOutputBuffer: " + decoderStatus);
        } else { // decoderStatus >= 0
            if (firstInputTimeNsec != 0) {
                long nowNsec = System.nanoTime();
                Log.d(TAG, "startup lag " + ((nowNsec - firstInputTimeNsec) / 1000000.0) + " ms");
                firstInputTimeNsec = 0;
            }
            if (VERBOSE) Log.d(TAG, "surface decoder given buffer " + decoderStatus
                    + " (size=" + mBufferInfo.size + ")");
            // End of stream: break out of the loop.
            if ((mBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                if (VERBOSE) Log.d(TAG, "output EOS");
                outputDone = true;
            }
            // If size is greater than 0, there is encoded data to deliver.
            boolean doRender = (mBufferInfo.size != 0);
            if (doRender && frameCallback != null) {
                ByteBuffer outputBuffer = encoder.getOutputBuffer(decoderStatus);
                frameCallback.render(mBufferInfo, outputBuffer);
            }
            encoder.releaseOutputBuffer(decoderStatus, doRender);
        }
    }
}

With this loop, the encoded data is delivered through the callback. Later we push that encoded data out over RTMP.
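The FrameCallback used by doExtract() is not shown in the article; its shape can be inferred from how it is called above. A minimal sketch, assuming nothing beyond that usage:

// Assumed shape of the callback used by doExtract(); method names follow the usage above.
public interface FrameCallback {
    // Called for every encoded output buffer produced by MediaCodec.
    void render(MediaCodec.BufferInfo info, ByteBuffer outputBuffer);

    // Called once when the encoder reports its output format (the format carries csd-0/csd-1).
    void formatChange(MediaFormat mediaFormat);
}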


Use RTMP to push the stream

  1. Understand the RTMP protocol
  2. RTMP connection
  3. Code integration

1. Understand the RTMP protocol

RTMP, short for Real-Time Messaging Protocol, is an application layer protocol proposed by Adobe to solve the multiplexing and packetizing problems of multimedia data transport streams.

  • RTMP runs on top of TCP. After the transport-layer connection is established, the client and server perform a handshake to set up an RTMP Connection on top of it. Control information such as SetChunkSize and SetACKWindowSize is exchanged over this Connection, and the CreateStream command creates a Stream for carrying the actual audio/video data together with the command messages that control it. RTMP formats the data it carries into RTMP Messages. During transmission, to support multiplexing, packetization, and fairness, the sender splits each Message into Chunks that carry a Message ID; a Chunk may be a complete Message or only part of one. The receiver reassembles Chunks into complete Messages based on the data length, Message ID, and Message length carried in each Chunk, and this is how messages are sent and received.
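As a reference for the chunking described above, the layout of a single chunk from the RTMP specification, shown only as comments since librtmp does the chunking for us:

// Chunk layout per the RTMP specification (librtmp builds these internally):
//   Basic Header        1-3 bytes : fmt (2 bits) + chunk stream id
//   Message Header      0/3/7/11 bytes depending on fmt (11 = full header with timestamp,
//                       message length, message type id and message stream id)
//   Extended Timestamp  0 or 4 bytes, present when the timestamp is >= 0xFFFFFF
//   Chunk Data          up to the negotiated chunk size (default 128 bytes)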

2. RTMP Connection

Handshake

An RTMP connection begins with a handshake, in which each side sends three fixed-size blocks of data.

  1. The handshake starts when the client sends blocks C0 and C1. The server sends S0 and S1 after receiving C0.
  2. After receiving S0 and S1, the client sends C2. After receiving both C0 and C1, the server sends S2.
  3. When the client has received S2 and the server has received C2, the handshake is complete.



Theoretically, as long as the above conditions are met, the six messages can be sent in any order. In practice, however, to minimize the number of round trips while still providing the handshake's verification function, the usual sending order is:

  1. The client sends C0+C1 to the server.
  2. The server sends S0+S1+S2 to the client.
  3. The client sends C2 to the server. The handshake is complete.
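For reference, the sizes of the handshake blocks defined by the RTMP specification (librtmp performs this handshake for us inside RTMP_Connect):

// Handshake block sizes from the RTMP specification:
static final int C0_S0_SIZE = 1;             // 1 byte: protocol version (0x03)
static final int C1_S1_SIZE = 4 + 4 + 1528;  // time(4) + zero(4) + random(1528) = 1536 bytes
static final int C2_S2_SIZE = 4 + 4 + 1528;  // time(4) + time2(4) + echo of the peer's random data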
Setting up a Network Connection
  1. The client sends the “connect” in the command message to the server, requesting a connection to a service application instance.
  2. After the server receives the connect command message, it sends the Window Acknowledgement Size protocol message to the client and connects to the application mentioned in the connect command.
  3. The server sends a Set Peer Bandwidth message to the client.
  4. The client sends the Window Acknowledgement Size protocol message to the server after processing the set bandwidth protocol message.
  5. The server sends the “Stream Begin” message in the user control message to the client.
  6. The server sends the _result in the command message to inform the client of the connection status.
  7. After receiving the message from the server, the client returns an acknowledgement window size message, and the network connection is established.

After receiving the connection request from the client, the server sends the following information:



In short, the server tells the client the acknowledgement window size and the peer bandwidth, connects the "connect" request to the specified application, and returns the result ("network connection succeeded"), together with a Stream Begin message for stream 0.

Establish a NetStream
  1. The client sends the createStream command in the command message to the server.
  2. After receiving the createStream command, the server sends the _result in the command message to inform the client of the stream status.
Push the flow process
  1. The client sends the publish push flow instruction.
  2. The server sends the “Stream Begin” message in the user control message to the client.
  3. The client sends metadata (resolution, frame rate, audio sampling rate, audio bit rate, and so on).
  4. The client sends audio data.
  5. The Set Chunk Size protocol message is sent to negotiate the chunk size.
  6. The server sends the _result in the command message to inform the client of the status of the push.
  7. After receiving it, the client sends the video data until the end.


(Diagram: push stream flow)
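Mapping this onto our implementation: librtmp issues connect, createStream and publish for us (given RTMP_EnableWrite, see the code later), so the Java side only has to send the media messages in the right order. A sketch of that order, based on the sections below:

// Order of messages sent by this article's pusher once the RTMP stream is open:
//   1. metadata             RTMP_PACKET_TYPE_INFO   (resolution, frame rate, etc.)
//   2. AVC sequence header  RTMP_PACKET_TYPE_VIDEO  (csd-0/csd-1 packaged as AVCDecoderConfigurationRecord)
//   3. video frames         RTMP_PACKET_TYPE_VIDEO  (one FLV video tag per encoded frame)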

Play stream process
  1. The client sends the play command in the command message to the server.
  2. Upon receiving the playback command, the server sends the ChunkSize protocol message.
  3. The server sends the "Stream Begin" message in the user control message to inform the client of the stream ID.
  4. If the play command succeeds, the server sends the response statuses NetStream.Play.Start and NetStream.Play.Reset in command messages to inform the client that the command was executed successfully.
  5. After that the server sends the audio and video data that the client wants to play.


(Diagram: play stream flow)

3. Code integration

1. Integrate librtmp

Take the librtmp code from librestreaming directly and compile it with CMake.

  • Put librtmp into the project under libs



  • Following the original Android.mk, configure CMakeLists.txt
cmake_minimum_required(VERSION 3.4.1)

add_definitions("-DNO_CRYPTO")

include_directories(${CMAKE_SOURCE_DIR}/libs/rtmp/librtmp)

# native-lib
file(GLOB PROJECT_SOURCES "${CMAKE_SOURCE_DIR}/libs/rtmp/librtmp/*.c")

add_library(rtmp-lib
        SHARED
        src/main/cpp/rtmp-hanlde.cpp
        ${PROJECT_SOURCES})

find_library( # Sets the name of the path variable.
        log-lib
        log)

target_link_libraries( # Specifies the target library.
        rtmp-lib
        ${log-lib})
  • Create a Java file and declare the JNI methods
public class RtmpClient {
    static {
        System.loadLibrary("rtmp-lib");
    }

    /**
     * @param url
     * @param isPublishMode
     * @return rtmpPointer ,pointer to native rtmp struct
     */
    public static native long open(String url, boolean isPublishMode);
    public static native int write(long rtmpPointer, byte[] data, int size, int type, int ts);
    public static native int close(long rtmpPointer);
    public static native String getIpAddr(long rtmpPointer);
}

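A hypothetical usage of this JNI wrapper from the Java side; the URL, the flvTag buffer and the timestamp are placeholders, not taken from the original code:

// Hypothetical usage sketch of the RtmpClient wrapper above.
long rtmpPointer = RtmpClient.open("rtmp://192.168.1.100/live/screen", true /* publish mode */);
if (rtmpPointer != 0) {
    // flvTag: one packaged FLV tag body (metadata, AVC sequence header, or an encoded frame)
    RtmpClient.write(rtmpPointer, flvTag, flvTag.length,
            RESFlvData.FLV_RTMP_PACKET_TYPE_VIDEO, timestampMs);
    // ... keep writing tags while streaming ...
    RtmpClient.close(rtmpPointer);
}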
2. Push the RTMP stream

A previous article analyzed the FLV data format in detail; we need it here to package the encoded data, so I won't repeat it.

Overall flow of the RTMP connection
  1. Open the connection for the RTMP URL. This covers the handshake and connection steps described above.
const char *url = env->GetStringUTFChars(url_, 0);
LOGD("RTMP_OPENING:%s", url);
RTMP *rtmp = RTMP_Alloc();
if (rtmp == NULL) {
    LOGD("RTMP_Alloc=NULL");
    return NULL;
}
// 1. Initialize RTMP and set the URL
RTMP_Init(rtmp);
int ret = RTMP_SetupURL(rtmp, const_cast<char *>(url));
if (!ret) {
    RTMP_Free(rtmp);
    rtmp = NULL;
    LOGD("RTMP_SetupURL=ret");
    return NULL;
}
if (isPublishMode) {
    RTMP_EnableWrite(rtmp);
}
// 2. Connect: performs the handshake and establishes the network connection.
ret = RTMP_Connect(rtmp, NULL);
if (!ret) {
    RTMP_Free(rtmp);
    rtmp = NULL;
    LOGD("RTMP_Connect=ret");
    return NULL;
}
// 3. Connect the stream (createStream / publish).
ret = RTMP_ConnectStream(rtmp, 0);
if (!ret) {
    RTMP_Close(rtmp);
    RTMP_Free(rtmp);
    rtmp = NULL;
    LOGD("RTMP_ConnectStream=ret");
    return NULL;
}
env->ReleaseStringUTFChars(url_, url);
LOGD("RTMP_OPENED");
// Return the native pointer to Java
return reinterpret_cast<jlong>(rtmp);
  2. When the MediaFormat callback is received, package it and publish it first.
  3. Continuously pull the encoded data and continuously push it; the main difference between packets is the type field. Note that the first message must be a complete message and must be the metadata.
jbyte *buffer = env->GetByteArrayElements(data_, NULL);
LOGD("start write");
RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
RTMPPacket_Alloc(packet, size);
RTMPPacket_Reset(packet);
if (type == RTMP_PACKET_TYPE_INFO) { // metadata
    packet->m_nChannel = 0x03;
} else if (type == RTMP_PACKET_TYPE_VIDEO) { // video
    packet->m_nChannel = 0x04;
} else if (type == RTMP_PACKET_TYPE_AUDIO) { // audio
    packet->m_nChannel = 0x05;
} else {
    packet->m_nChannel = -1;
}
RTMP *r = (RTMP *) rtmpPointer;
packet->m_nInfoField2 = r->m_stream_id;
LOGD("write data type: %d, ts %d", type, ts);
memcpy(packet->m_body, buffer, size);
packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
packet->m_hasAbsTimestamp = FALSE;
packet->m_nTimeStamp = ts;
packet->m_packetType = type;
packet->m_nBodySize = size;
int ret = RTMP_SendPacket((RTMP *) rtmpPointer, packet, 0);
RTMPPacket_Free(packet);
free(packet);
env->ReleaseByteArrayElements(data_, buffer, 0);
if (!ret) {
    LOGD("end write error %d", ret);
    return ret;
} else {
    LOGD("end write success");
    return 0;
}
  4. Finally, close the connection.
    RTMP_Close((RTMP *) rtmpPointer);
    RTMP_Free((RTMP *) rtmpPointer);
Receive the encoded data callback
workHandler.postDelayed(new Runnable() {
    @Override
    public void run() {
        doExtract(mEncoder, new FrameCallback() {
            @Override
            public void render(MediaCodec.BufferInfo info, ByteBuffer outputBuffer) {
                Sender.getInstance().rtmpSend(info, outputBuffer);
            }

            @Override
            public void formatChange(MediaFormat mediaFormat) {
                Sender.getInstance().rtmpSendFormat(mediaFormat);
            }
        });
    }
}, 1000);
Handling the MediaFormat callback

From the earlier detailed analysis of the FLV format, we know that for FLV push streaming, the csd-0 (SPS) and csd-1 (PPS) data must be packaged as the AVC sequence header and pushed before anything else can play back normally, and it must be the first video message. So here is how we read csd-0 and csd-1:

public static byte[] generateAVCDecoderConfigurationRecord(MediaFormat mediaFormat) {
            ByteBuffer SPSByteBuff = mediaFormat.getByteBuffer("csd-0");
            SPSByteBuff.position(4);
            ByteBuffer PPSByteBuff = mediaFormat.getByteBuffer("csd-1");
            PPSByteBuff.position(4);
            int spslength = SPSByteBuff.remaining();
            int ppslength = PPSByteBuff.remaining();
            int length = 11 + spslength + ppslength;
            byte[] result = new byte[length];
            SPSByteBuff.get(result, 8, spslength);
            PPSByteBuff.get(result, 8 + spslength + 3, ppslength);
            /**
             * UB[8]configurationVersion
             * UB[8]AVCProfileIndication
             * UB[8]profile_compatibility
             * UB[8]AVCLevelIndication
             * UB[8]lengthSizeMinusOne
             */
            result[0] = 0x01;
            result[1] = result[9];
            result[2] = result[10];
            result[3] = result[11];
            result[4] = (byte) 0xFF;
            /**
             * UB[8]numOfSequenceParameterSets
             * UB[16]sequenceParameterSetLength
             */
            result[5] = (byte) 0xE1;
            ByteArrayTools.intToByteArrayTwoByte(result, 6, spslength);
            /**
             * UB[8]numOfPictureParameterSets
             * UB[16]pictureParameterSetLength
             */
            int pos = 8 + spslength;
            result[pos] = (byte) 0x01;
            ByteArrayTools.intToByteArrayTwoByte(result, pos + 1, ppslength);

            return result;
        }

Following the FLV format analysis, fill in the FLV video tag header.

 public static void fillFlvVideoTag(byte[] dst, int pos, boolean isAVCSequenceHeader, boolean isIDR, int readDataLength) {
            //FrameType&CodecID
            dst[pos] = isIDR ? (byte) 0x17 : (byte) 0x27;
            //AVCPacketType
            dst[pos + 1] = isAVCSequenceHeader ? (byte) 0x00 : (byte) 0x01;
            //LAKETODO CompositionTime
            dst[pos + 2] = 0x00;
            dst[pos + 3] = 0x00;
            dst[pos + 4] = 0x00;
            if (!isAVCSequenceHeader) {
                //NALU HEADER
                ByteArrayTools.intToByteArrayFull(dst, pos + 5, readDataLength);
            }
        }

And then send it.
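The article does not show how the sequence header itself is packaged. Here is a sketch under the assumption that it follows the same FLV tag layout as sendRealData() below; the helper name packAVCSequenceHeader is an assumption, not code from the original project:

// Sketch (assumed helper): package the AVCDecoderConfigurationRecord built above as the
// AVC sequence header video tag. It must be the first video message pushed.
public static RESFlvData packAVCSequenceHeader(long tms, byte[] record) {
    byte[] finalBuff = new byte[Packager.FLVPackager.FLV_VIDEO_TAG_LENGTH + record.length];
    System.arraycopy(record, 0, finalBuff, Packager.FLVPackager.FLV_VIDEO_TAG_LENGTH, record.length);
    Packager.FLVPackager.fillFlvVideoTag(finalBuff,
            0,
            true,            // isAVCSequenceHeader: no NALU header is written in this case
            true,            // treat the sequence header as a key frame
            record.length);
    RESFlvData flvData = new RESFlvData();
    flvData.byteBuffer = finalBuff;
    flvData.size = finalBuff.length;
    flvData.dts = (int) tms;
    flvData.flvTagType = RESFlvData.FLV_RTMP_PACKET_TYPE_VIDEO;
    flvData.videoFrameType = 5; // IDR / key frame
    return flvData;
}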

Sending actual data
 public static RESFlvData sendRealData(long tms, ByteBuffer realData) {
        int realDataLength = realData.remaining();
        int packetLen = Packager.FLVPackager.FLV_VIDEO_TAG_LENGTH +
                Packager.FLVPackager.NALU_HEADER_LENGTH +
                realDataLength;
        byte[] finalBuff = new byte[packetLen];
        realData.get(finalBuff, Packager.FLVPackager.FLV_VIDEO_TAG_LENGTH +
                        Packager.FLVPackager.NALU_HEADER_LENGTH,
                realDataLength);
        int frameType = finalBuff[Packager.FLVPackager.FLV_VIDEO_TAG_LENGTH +
                Packager.FLVPackager.NALU_HEADER_LENGTH] & 0x1F;
        Packager.FLVPackager.fillFlvVideoTag(finalBuff,
                0,
                false,
                frameType == 5,
                realDataLength);
        RESFlvData resFlvData = new RESFlvData();
        resFlvData.droppable = true;
        resFlvData.byteBuffer = finalBuff;
        resFlvData.size = finalBuff.length;
        resFlvData.dts = (int) tms;
        resFlvData.flvTagType = RESFlvData.FLV_RTMP_PACKET_TYPE_VIDEO;
        resFlvData.videoFrameType = frameType;
        return resFlvData;
//        dataCollecter.collect(resFlvData, RESRtmpSender.FROM_VIDEO);
    }
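How the RESFlvData eventually reaches the native writer is not shown here. A minimal sketch, assuming a sender that simply forwards each packaged tag to RtmpClient.write() with the field-to-parameter mapping following the write() signature declared earlier:

// Sketch: forward one packaged FLV tag to the native writer.
public void sendFlvData(long rtmpPointer, RESFlvData flvData) {
    RtmpClient.write(rtmpPointer,
            flvData.byteBuffer,   // FLV tag body (video tag header + data)
            flvData.size,         // number of bytes to send
            flvData.flvTagType,   // RTMP packet type (video / audio / info)
            flvData.dts);         // timestamp in milliseconds
}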

RTMP server

An RTMP server for receiving the stream can be set up easily for testing.

Conclusion

Compared with the previous article,

A simple attempt at Android-to-PC screen casting:

  • Both obtain the screen data through MediaProjection.createVirtualDisplay. The difference is that the previous article used ImageReader to grab screenshots one by one, while this article feeds MediaCodec directly for hardware encoding and gets H.264 data straight out of the encoder.

  • For the transport protocol, the previous article used WebSocket: the Bitmap byte stream was sent over the socket, and the receiver only had to connect to the socket and parse the bytes back into a Bitmap to display. The advantage is convenience and a fully customizable protocol; the disadvantage is that it is not general-purpose, because a matching client has to be written. This article uses the RTMP streaming protocol instead, so any player that supports RTMP can play our screen casting stream directly.

References

Some notes on implementing the RTMP protocol for live streaming

Screen casting attempt series

  • Android-to-PC screen casting 1 — Custom protocol (Socket + Bitmap)
  • Android-to-PC screen casting 2 — Hardware encoding (MediaCodec + RTMP)