RTSP (Real Time Streaming Protocol) is an Internet protocol specification: an application-layer network communication protocol in the TCP/IP protocol suite. It is designed for entertainment (such as audio and video) and communication systems, where it is used to control streaming media servers and to establish and control media sessions between endpoints. Clients of a media server issue VHS-style commands such as PLAY, PAUSE, SETUP, DESCRIBE, and RECORD to control, in real time, media streams flowing from server to client or from client to server.
RTSP transmission process
- When a user or application tries to stream video from a remote source, the client device sends an RTSP OPTIONS request to the server to determine which methods are available, such as PLAY, PAUSE, SETUP…
- The server then returns a list of the request types it accepts over RTSP.
- Once the client knows which requests it can make, it sends a media description request (DESCRIBE) to the streaming server.
- The server responds with a media description.
- The client then sends a SETUP request, and the server responds with information about the transport mechanism.
- Once the setup process is complete, the client starts the stream transfer by sending a PLAY request, which tells the server to send a bitstream (binary sequence) using the transport mechanism specified in the SETUP request.
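Client -> Server: OPTIONS
Server -> Client: 200 OK (supported methods)
Client -> Server: DESCRIBE
Server -> Client: 200 OK (SDP)
Client -> Server: SETUP
Server -> Client: 200 OK
Client -> Server: PLAY
Server -> Client: 200 OK
Client -> Server: PAUSE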
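The request order described above can be sketched in a few lines of C++. This is only an illustration: the helper names `build_rtsp_request` and `build_handshake`, the `/track1` suffix, the client ports, and the session ID `12345678` are all made up for the example. A real client would take the track URL from the SDP and the session ID from the server's SETUP response.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > Headers;

// Build one RTSP/1.0 request: request line, CSeq, optional headers, blank line.
std::string build_rtsp_request(const std::string &method,
                               const std::string &url,
                               unsigned int cseq,
                               const Headers &headers = Headers())
{
    std::string msg = method + " " + url + " RTSP/1.0\r\n";
    msg += "CSeq: " + std::to_string(cseq) + "\r\n";
    for (Headers::const_iterator h = headers.begin(); h != headers.end(); ++h)
        msg += h->first + ": " + h->second + "\r\n";
    msg += "\r\n";  // an empty line ends the header section
    return msg;
}

// The four client requests in the order the text describes them.
// Track name, ports and session ID are hypothetical placeholder values.
std::vector<std::string> build_handshake(const std::string &url)
{
    std::vector<std::string> out;
    out.push_back(build_rtsp_request("OPTIONS", url, 1));
    out.push_back(build_rtsp_request("DESCRIBE", url, 2,
                                     {{"Accept", "application/sdp"}}));
    out.push_back(build_rtsp_request("SETUP", url + "/track1", 3,
                                     {{"Transport", "RTP/AVP;unicast;client_port=8000-8001"}}));
    out.push_back(build_rtsp_request("PLAY", url, 4,
                                     {{"Session", "12345678"}, {"Range", "npt=0-"}}));
    return out;
}
```

Calling `build_handshake("rtsp://example.com/stream")` returns the four request strings, ready to be written to a TCP socket one at a time, each after the previous response has arrived.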
Packet capture is indispensable when analyzing and learning a protocol. [Figure: screenshot of an RTSP packet capture]
Why is the RTSP protocol important
- RTSP was originally designed to let users play audio and video directly from the Internet without first downloading media files to their devices. The protocol has been used for a variety of purposes, including Internet camera sites, online education, and Internet broadcasting.
- RTSP uses the same concepts as basic HTTP, largely to remain compatible with existing Web infrastructure. Because of this, most of HTTP's extension mechanisms can be carried over directly to RTSP.
- RTSP is also very flexible. Clients can query the media server to find out whether it supports the features they want to use. Similarly, anyone who owns media can deliver media streams from multiple servers. The protocol also aims to adapt to future developments in media, so that media creators can extend it if necessary.
RTSP protocol directives
Although RTSP is similar to HTTP in some respects, it defines control sequences for controlling multimedia playback. Unlike HTTP, which is stateless, RTSP is stateful.
Session identifiers are used when concurrent sessions need to be tracked. Like HTTP, RTSP uses TCP to maintain an end-to-end connection; its default port is 554.
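As a rough illustration of session tracking, the sketch below parses the value of a `Session` header of the form `id;timeout=seconds`. The names `SessionInfo`, `trim`, and `parse_session_header` are invented for this example. The 60-second fallback follows the RTSP 1.0 default for a missing `timeout` parameter.

```cpp
#include <cstdlib>
#include <string>

struct SessionInfo {
    std::string id;             // opaque identifier chosen by the server
    unsigned long timeout_sec;  // session timeout in seconds
};

// Remove leading and trailing spaces/tabs.
static std::string trim(const std::string &s)
{
    const char *ws = " \t";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parse a Session header value such as "12345678;timeout=30".
SessionInfo parse_session_header(const std::string &value)
{
    SessionInfo info;
    info.timeout_sec = 60;  // default when no timeout parameter is present
    const std::string::size_type semi = value.find(';');
    info.id = trim(value.substr(0, semi));
    if (semi != std::string::npos) {
        const std::string key = "timeout=";
        const std::string::size_type pos = value.find(key, semi);
        if (pos != std::string::npos)
            info.timeout_sec = std::strtoul(value.c_str() + pos + key.size(), NULL, 10);
    }
    return info;
}
```

A client would echo `info.id` back in the `Session` header of every later request that belongs to the same session.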
Although most RTSP control messages are sent from the client to the server, some commands are delivered in the other direction, from the server to the client.
Let’s take a look at the basic RTSP requests:
SETUP
The SETUP request specifies how a single media stream must be transmitted. This must be done before sending the PLAY request.
The request contains the media stream URL and transport specifier.
This specifier typically includes a local port for receiving RTP data (audio or video) and another for RTCP data (meta information).
The server reply typically confirms the selected parameters and fills in the missing parts, such as the selected port of the server. Each media stream must be configured using SETUP before an aggregate playback request can be sent.
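To make the port negotiation concrete, here is a small sketch that pulls the RTP/RTCP port pair out of a `Transport` header value. `PortPair` and `find_port_pair` are hypothetical names, and the header string in the usage note is an invented example, not output from a real server.

```cpp
#include <cstdio>
#include <string>

struct PortPair {
    bool found;         // true if the key was present with at least one port
    unsigned int rtp;   // first port: RTP data
    unsigned int rtcp;  // second port: RTCP (0 if only one port was given)
};

// Find "<key>lo-hi" inside a Transport header value,
// e.g. key = "client_port=" or key = "server_port=".
PortPair find_port_pair(const std::string &transport, const std::string &key)
{
    PortPair p = { false, 0, 0 };
    const std::string::size_type pos = transport.find(key);
    if (pos == std::string::npos)
        return p;
    const char *s = transport.c_str() + pos + key.size();
    const int n = std::sscanf(s, "%u-%u", &p.rtp, &p.rtcp);
    p.found = (n >= 1);
    return p;
}
```

For a server reply containing `RTP/AVP;unicast;client_port=8000-8001;server_port=9000-9001`, looking up `client_port=` gives the ports the client offered and `server_port=` gives the ones the server filled in.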
PLAY
The PLAY request causes one or all of the media streams to play. PLAY requests can be queued by sending multiple PLAY requests. The URL can be an aggregate URL (to play all media streams) or a single media stream URL (to play only that stream).
A range can be specified. If no range is specified, the stream plays from the beginning to the end, or resumes from the pause point if it has been paused.
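A range is usually carried in a `Range` header. The sketch below formats one using normal play time (`npt`) in seconds. `make_npt_range` is an illustrative helper, and using a negative end value to mean "open-ended" is a convention of this example only.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a Range header in normal play time (npt), in seconds.
// A negative end_sec means open-ended: play until the stream finishes.
std::string make_npt_range(double start_sec, double end_sec = -1.0)
{
    std::ostringstream out;
    out << std::fixed << std::setprecision(3);
    out << "Range: npt=" << start_sec << "-";
    if (end_sec >= 0.0)
        out << end_sec;
    return out.str();
}
```

`make_npt_range(10.0)` asks for playback from second 10 to the end, while `make_npt_range(0.0, 25.5)` limits playback to the first 25.5 seconds.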
PAUSE
A PAUSE request temporarily suspends one or all of the media streams, so they can be resumed later with a PLAY request. The request contains an aggregate or media stream URL.
The range parameter on the PAUSE request specifies when to pause. If the range parameter is omitted, the pause occurs immediately and lasts indefinitely.
RECORD
This method starts recording a range of media data according to the presentation description. The timestamp reflects the start time and end time (UTC). If no time range is given, the start or end time provided in the presentation description is used.
If the session has already started, recording begins immediately. The server decides whether to store the recorded data under the request URI or under another URI.
If the server does not use the request URI, the response should be 201 (Created) and contain an entity that describes the status of the request and refers to the new resource, together with a Location header.
ANNOUNCE
When sent from the client to the server, ANNOUNCE posts to the server the description of the presentation or media object identified by the request URL. When sent from the server to the client, ANNOUNCE updates the session description in real time.
If a new media stream is added to a presentation (for example, during a live presentation), the entire presentation description should be sent again, not just the additional components, so that components can also be removed.
TEARDOWN
The TEARDOWN request is used to terminate the session. It stops all media streams and frees all session-related data on the server.
GET_PARAMETER
The GET_PARAMETER request retrieves the value of a parameter of the presentation or stream specified in the URI. The content of the reply and response is left to the implementation.
SET_PARAMETER
This method requests that a parameter value be set for the presentation or stream specified by the URI.
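Because RTSP is stateful, the session-changing methods in this section (SETUP, PLAY, PAUSE, RECORD, TEARDOWN) can be viewed as transitions in a small state machine. The following is a simplified sketch along the lines of the client state machine in the RTSP 1.0 specification. The enum and function names are invented for the example, and it assumes every request succeeded.

```cpp
#include <string>

enum RtspState { STATE_INIT, STATE_READY, STATE_PLAYING, STATE_RECORDING };

// State after a successful request. Methods that do not change the
// session state (OPTIONS, GET_PARAMETER, ...) and requests that are not
// valid in the current state leave the state untouched in this sketch.
RtspState next_state(RtspState s, const std::string &method)
{
    if (method == "TEARDOWN")
        return STATE_INIT;  // frees the session from any state
    switch (s) {
    case STATE_INIT:
        if (method == "SETUP") return STATE_READY;
        break;
    case STATE_READY:
        if (method == "PLAY") return STATE_PLAYING;
        if (method == "RECORD") return STATE_RECORDING;
        break;
    case STATE_PLAYING:
    case STATE_RECORDING:
        if (method == "PAUSE") return STATE_READY;
        break;
    }
    return s;
}
```

Starting from `STATE_INIT`, the sequence SETUP, PLAY, PAUSE, TEARDOWN walks through Ready, Playing, Ready, and back to Init, whereas a PLAY sent before SETUP leaves the state at Init.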
Wireshark-style RTSP protocol parsing
With a general understanding of how the RTSP protocol is used, let’s implement a parser for it.
#include <sys/stat.h>
#include <sys/types.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
#include <netinet/ip.h>
#include <netinet/ip6.h>
#include <net/ethernet.h>
#include <pcap.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp */
#include <ctype.h>
#include <arpa/inet.h> /* ntohs */
/* RTSP port */
#define RTSP_TCP_PORT_RANGE 554
typedef enum {
    RTSP_REQUEST,
    RTSP_REPLY,
    RTSP_NOT_FIRST_LINE
} rtsp_type_t;
static const char *rtsp_methods[] = {
    "DESCRIBE",
    "ANNOUNCE",
    "GET_PARAMETER",
    "OPTIONS",
    "PAUSE",
    "PLAY",
    "RECORD",
    "REDIRECT",
    "SETUP",
    "SET_PARAMETER",
    "TEARDOWN"
};
/* Used for RTSP statistics */
struct rtsp_info_value_t {
    char *request_method;
    unsigned long int response_code;
};
/*
 * Takes a byte array (assumed to contain a null-terminated string) as an
 * argument and returns the length of the string, i.e. the size of the array
 * minus 1 for the null terminator.
 */
#define STRLEN_CONST(str) (sizeof (str) - 1)
static const char rtsp_content_type[] = "Content-Type:";
static const char rtsp_transport[] = "Transport:";
static const char rtsp_sps_server_port[] = "server_port=";
static const char rtsp_cps_server_port[] = "client_port=";
static const char rtsp_sps_dest_addr[] = "dest_addr=";
static const char rtsp_cps_src_addr[] = "src_addr=";
static const char rtsp_rtp_udp_default[] = "rtp/avp";
static const char rtsp_rtp_udp[] = "rtp/avp/udp";
static const char rtsp_rtp_tcp[] = "rtp/avp/tcp";
static const char rtsp_rdt_feature_level[] = "RDTFeatureLevel";
static const char rtsp_real_rdt[] = "x-real-rdt/";
static const char rtsp_real_tng[] = "x-pn-tng/"; /* synonym for x-real-rdt */
static const char rtsp_inter[] = "interleaved=";
static const char rtsp_content_length[] = "Content-Length:";
static void rtsp_create_conversation(u_char *line_begin, size_t line_len, rtsp_type_t rtsp_type_packet)
{
    char buf[256];
    char *tmp;
    unsigned int c_data_port;
    unsigned int s_data_port, s_mon_port;
    unsigned int ipv4_1, ipv4_2, ipv4_3, ipv4_4;

    /* Only replies are inspected for transport parameters. */
    if (rtsp_type_packet != RTSP_REPLY) {
        return;
    }

    /* Copy the line into buf, truncating it to avoid overflowing the buffer. */
    if (line_len > sizeof(buf) - 1) {
        line_len = sizeof(buf) - 1;
    }
    memcpy(buf, line_begin, line_len);
    buf[line_len] = '\0';
    printf("%s\n", buf);

    /* src_addr="a.b.c.d:port" */
    if ((tmp = strstr(buf, rtsp_cps_src_addr))) {
        tmp += strlen(rtsp_cps_src_addr);
        if (sscanf(tmp, "\"%u.%u.%u.%u:%u\"", &ipv4_1, &ipv4_2, &ipv4_3, &ipv4_4, &c_data_port) == 5) {
            char *addr_end;
            char *addr;
            printf("c_data_port %u\n", c_data_port);
            /* Skip the leading quote, then copy the address up to the ':' */
            tmp++;
            addr_end = strchr(tmp, ':');
            addr = strndup(tmp, addr_end - tmp);
            if (addr != NULL) {
                printf("src_addr %s\n", addr);
                free(addr);
            }
        }
    }

    /* dest_addr=":port" */
    if ((tmp = strstr(buf, rtsp_sps_dest_addr))) {
        tmp += strlen(rtsp_sps_dest_addr);
        if (sscanf(tmp, "\":%u\"", &s_data_port) == 1) {
            /* Port 9 means "ignore" */
            if (s_data_port == 9) {
                s_data_port = 0;
            }
            printf("s_data_port %u\n", s_data_port);
        }
    }

    /* server_port=port */
    if ((tmp = strstr(buf, rtsp_sps_server_port))) {
        tmp += strlen(rtsp_sps_server_port);
        if (sscanf(tmp, "%u", &s_mon_port) == 1) {
            printf("s_mon_port %u\n", s_mon_port);
        }
    }
}
static bool is_rtsp_request_or_reply(unsigned char *line, int offset, rtsp_type_t *type)
{
    unsigned int ii = 0;
    char *data = reinterpret_cast<char *>(line);
    char response_chars[4];
    struct rtsp_info_value_t rtsp_info;

    /* Is this an RTSP reply? */
    if (strncasecmp("RTSP/", data, 5) == 0) {
        *type = RTSP_REPLY;
        /* The first token is the version: skip "RTSP/1.0 " (9 bytes)
           to reach the three-digit status code. */
        offset += 9;
        if (strlen(data) >= (size_t)offset + 3) {
            memcpy(response_chars, data + offset, 3);
            response_chars[3] = '\0';
            rtsp_info.response_code = strtoul(response_chars, NULL, 10);
            //printf("rtsp_info.response_code %lu\n", rtsp_info.response_code);
        }
        return true;
    }

    /*
     * Is this an RTSP request?
     * Check whether the line starts with one of the RTSP request methods.
     */
    for (ii = 0; ii < sizeof rtsp_methods / sizeof rtsp_methods[0]; ii++) {
        size_t len = strlen(rtsp_methods[ii]);
        if (strncasecmp(rtsp_methods[ii], data, len) == 0 && isspace((unsigned char)data[len])) {
            *type = RTSP_REQUEST;
            /* Point at the static method name instead of allocating a copy. */
            rtsp_info.request_method = const_cast<char *>(rtsp_methods[ii]);
            //printf("request_method: %s\n", rtsp_info.request_method);
            return true;
        }
    }

    /* Neither a request nor a reply */
    *type = RTSP_NOT_FIRST_LINE;
    return false;
}
/* Read the first line of the reply message */
static void process_rtsp_reply(u_char *rtsp_data, int offset, rtsp_type_t rtsp_type_packet)
{
    char *status = reinterpret_cast<char *>(rtsp_data + offset);
    /* The status line ends at the first CRLF (or at the end of the buffer).
       The buffer is assumed to be NUL-terminated. */
    char *lineend = strstr(status, "\r\n");
    unsigned int status_i;

    if (lineend == NULL)
        lineend = status + strlen(status);

    /* Skip protocol/version */
    while (status < lineend && !isspace((unsigned char)*status))
        status++;
    /* Skip spaces */
    while (status < lineend && isspace((unsigned char)*status))
        status++;
    /* Actual code number now */
    status_i = 0;
    while (status < lineend && isdigit((unsigned char)*status))
        status_i = status_i * 10 + (*status++ - '0');
    printf("status %u\n", status_i);

    /* Hand the whole message over so the transport parameters can be inspected. */
    rtsp_create_conversation(rtsp_data + offset,
                             strlen(reinterpret_cast<char *>(rtsp_data + offset)),
                             rtsp_type_packet);
}
static void process_rtsp_request(u_char *rtsp_data, int offset, rtsp_type_t rtsp_type_packet)
{
    char *line = reinterpret_cast<char *>(rtsp_data + offset);
    /* The request line ends at the first CRLF (or at the end of the buffer).
       The buffer is assumed to be NUL-terminated. */
    char *lineend = strstr(line, "\r\n");
    char *url;
    char *url_start;
    char *tmp;
    unsigned int content_length = 0;
    char content_type[256];

    (void)rtsp_type_packet;
    if (lineend == NULL)
        lineend = line + strlen(line);

    /* URL */
    url = line;
    /* Skip method name */
    while (url < lineend && !isspace((unsigned char)*url))
        url++;
    /* Skip spaces */
    while (url < lineend && isspace((unsigned char)*url))
        url++;
    /* URL starts here */
    url_start = url;
    /* Scan to end of URL */
    while (url < lineend && !isspace((unsigned char)*url))
        url++;
    printf("url %.*s\n", (int)(url - url_start), url_start);

    /* Content-Type (headers follow the request line) */
    if ((tmp = strstr(lineend, rtsp_content_type))) {
        tmp += strlen(rtsp_content_type);
        /* Bound the read so content_type cannot overflow */
        if (sscanf(tmp, "%255s", content_type) == 1) {
            //printf("content_type %s\n", content_type);
        }
    }
    /* Content-Length */
    if ((tmp = strstr(lineend, rtsp_content_length))) {
        tmp += strlen(rtsp_content_length);
        if (sscanf(tmp, "%u", &content_length) == 1) {
            //printf("content_length %u\n", content_length);
        }
    }
}
void dissect_rtsp(u_char *rtsp_data)
{
    int offset = 0;
    rtsp_type_t rtsp_type_packet;

    /* Classify the first line; the type is set on every path. */
    is_rtsp_request_or_reply(rtsp_data, offset, &rtsp_type_packet);

    switch (rtsp_type_packet) {
    case RTSP_REQUEST:
        process_rtsp_request(rtsp_data, offset, rtsp_type_packet);
        break;
    case RTSP_REPLY:
        process_rtsp_reply(rtsp_data, offset, rtsp_type_packet);
        break;
    case RTSP_NOT_FIRST_LINE:
        /* Neither a request nor a reply; it may well be a header line */
        break;
    default:
        break;
    }
}
static void dissect_rtsp_tcp(struct ip *pIp)
{
    int iHeadLen = pIp->ip_hl * 4;
    int iPacketLen = ntohs(pIp->ip_len) - iHeadLen;
    struct tcphdr *pTcpHdr;

    /* Only handle TCP */
    if (pIp->ip_p != IPPROTO_TCP)
        return;
    pTcpHdr = (struct tcphdr *)(((char *)pIp) + iHeadLen);
    if (ntohs(pTcpHdr->dest) == RTSP_TCP_PORT_RANGE
        || ntohs(pTcpHdr->source) == RTSP_TCP_PORT_RANGE) {
        /* doff is the TCP header length in 32-bit words, options included */
        int iTcpHdrLen = pTcpHdr->doff * 4;
        int iPayloadLen = iPacketLen - iTcpHdrLen;
        u_char *RtspData;
        if (iPayloadLen <= 0)
            return;
        /* The parsing functions expect a NUL-terminated string, so copy the payload */
        RtspData = (u_char *)malloc(iPayloadLen + 1);
        if (RtspData == NULL)
            return;
        memcpy(RtspData, (u_char *)pTcpHdr + iTcpHdrLen, iPayloadLen);
        RtspData[iPayloadLen] = '\0';
        dissect_rtsp(RtspData);
        free(RtspData);
    }
}
Compile and run
RTSP is a text-based protocol that uses carriage return plus line feed (\r\n) as the terminator of each line. The advantage of this is that custom parameters can easily be added and that packet captures are easy to analyze.
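Since every line ends in \r\n, splitting a message into its start line and headers takes only a few lines. The sketch below is one possible way to do it; `split_crlf_lines` and `parse_headers` are hypothetical helpers, and `X-Custom` in the usage note is a made-up header that stands in for a custom parameter. Header names are stored exactly as received, although RTSP treats them as case-insensitive.

```cpp
#include <map>
#include <string>
#include <vector>

// Split the head of a message into lines on CRLF, stopping at the
// empty line that separates the headers from the body.
std::vector<std::string> split_crlf_lines(const std::string &msg)
{
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (start < msg.size()) {
        const std::string::size_type end = msg.find("\r\n", start);
        if (end == std::string::npos) {
            lines.push_back(msg.substr(start));
            break;
        }
        if (end == start)
            break;  // empty line: end of headers
        lines.push_back(msg.substr(start, end - start));
        start = end + 2;
    }
    return lines;
}

// Turn the "Name: value" lines (everything after the first line) into a map.
std::map<std::string, std::string> parse_headers(const std::vector<std::string> &lines)
{
    std::map<std::string, std::string> headers;
    for (std::vector<std::string>::size_type i = 1; i < lines.size(); ++i) {
        const std::string::size_type colon = lines[i].find(':');
        if (colon == std::string::npos)
            continue;  // not a header line
        const std::string::size_type v = lines[i].find_first_not_of(" \t", colon + 1);
        headers[lines[i].substr(0, colon)] =
            (v == std::string::npos) ? std::string() : lines[i].substr(v);
    }
    return headers;
}
```

Given a reply whose headers are `CSeq: 2` and `X-Custom: demo`, `parse_headers(split_crlf_lines(msg))` yields a map with exactly those two entries, so an unknown custom header needs no special handling.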
RTSP packets are classified into request packets and response packets according to the direction of transmission. A request packet is typically sent from the client to the server, and the response packet is sent back from the server to the client.
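Telling the two packet types apart only requires looking at the first line: a response starts with the protocol version (`RTSP/`), while a request starts with a method name. A minimal sketch, with names (`MessageKind`, `FirstLine`, `classify_first_line`) chosen for this example and a case-sensitive comparison for simplicity:

```cpp
#include <cstdlib>
#include <string>

enum MessageKind { KIND_REQUEST, KIND_RESPONSE, KIND_UNKNOWN };

struct FirstLine {
    MessageKind kind;
    std::string method;    // filled in for requests
    unsigned long status;  // filled in for responses
};

FirstLine classify_first_line(const std::string &line)
{
    static const char *const methods[] = {
        "DESCRIBE", "ANNOUNCE", "GET_PARAMETER", "OPTIONS", "PAUSE", "PLAY",
        "RECORD", "REDIRECT", "SETUP", "SET_PARAMETER", "TEARDOWN"
    };
    FirstLine r;
    r.kind = KIND_UNKNOWN;
    r.status = 0;

    const std::string::size_type sp = line.find(' ');
    if (sp == std::string::npos)
        return r;

    // Response: "RTSP/1.0 200 OK"
    if (line.compare(0, 5, "RTSP/") == 0) {
        r.kind = KIND_RESPONSE;
        r.status = std::strtoul(line.c_str() + sp + 1, NULL, 10);
        return r;
    }

    // Request: "<METHOD> <url> RTSP/1.0"
    const std::string word = line.substr(0, sp);
    for (unsigned int i = 0; i < sizeof methods / sizeof methods[0]; ++i) {
        if (word == methods[i]) {
            r.kind = KIND_REQUEST;
            r.method = word;
            break;
        }
    }
    return r;
}
```

A line such as `CSeq: 3` matches neither pattern and is reported as `KIND_UNKNOWN`, which is how a header line inside a message would be recognized.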
Conclusion
RTSP provides controls for streaming media such as PLAY, PAUSE, and SETUP, but does not transmit the media data itself. RTSP functions as a remote control for streaming media servers.
The server can stream content over either TCP or UDP. RTSP's syntax and operation are similar to those of HTTP. For more information, please refer to the official RFC documentation (RFC 2326 for RTSP 1.0), which is the most authoritative source.