Network gleanings
Part.1 – HTTP protocol
1. The HTTP features
- HTTP is an application-layer protocol based on TCP/IP. The default port number is 80.
- HTTP is connectionless and stateless.
2. HTTP message
HTTP is an application-layer protocol built on TCP/IP, and in HTTP/1.x its messages are transmitted as ASCII text. The specification divides a request message into three parts: the request line, the request headers, and the request body. HTTP defines different methods for interacting with the server. The four most commonly used are GET, POST, DELETE, and PUT. URL stands for Uniform Resource Locator; a URL describes a resource on the network. The POST, DELETE, PUT, and GET operations in HTTP roughly correspond to adding, deleting, modifying, and querying a resource. Other request methods include HEAD, OPTIONS, TRACE, and PATCH.
- GET: Used for information retrieval (safe and idempotent). Note: safe means that the operation only retrieves information and does not modify it. Idempotent means that repeating the same request to the same URL has the same effect on the server as making it once.
- POST: Represents a request that may modify a resource on the server (not safe, not idempotent).
- HEAD: Similar to GET, but the response carries no message body; it only returns the headers that describe the resource, such as Content-Type and Content-Length (safe and idempotent).
- PUT: Used to create or update a resource (not safe, idempotent).
- DELETE: Deletes a resource (not safe, idempotent).
- OPTIONS: Asks which communication options (for example, which methods) a URL supports; often used to check whether an interface is available (safe, idempotent).
- TRACE: Echoes back the request as received by the server, so that the client can see what changes or additions intermediate servers made, if any (safe, idempotent).
- PATCH: Like PUT, used to update a resource, except that PATCH represents a partial update. It was proposed later than the other methods, so it may be necessary to verify that both client and server support it (not safe, and not guaranteed to be idempotent).
The amount of data submitted by GET is limited by the URL length. HTTP itself does not limit URL length; the limit comes from browsers and servers. In theory POST has no size limit either, since HTTP imposes none, but for security reasons servers usually enforce a maximum body size.
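To make the three-part structure of a request concrete, here is a minimal sketch that serializes and parses an HTTP/1.1 request by hand. The helper names `build_request` and `parse_request` and the host `example.com` are made up for illustration; real code would normally rely on an HTTP library instead.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize a request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # an empty line ends the headers
    return head.encode("ascii") + body


def parse_request(raw):
    """Split a raw request into (method, path, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


raw = build_request("GET", "/index.html", "example.com")
print(raw.decode("ascii"))
```

The request line carries the method, the resource path, and the protocol version, which is why the parser splits it on spaces before reading the headers.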
3. Method of submitting data through POST
HTTP stipulates that data submitted by POST goes in the message body, but the protocol does not specify the data format or encoding. The server usually learns the encoding of the request body from the Content-Type field in the request headers and then parses the body accordingly. A scheme for submitting data through POST therefore has two parts: the Content-Type and the encoding of the message body. Common choices:
- application/x-www-form-urlencoded: The most common way to submit POST data. A browser's native form that does not set the enctype attribute ends up submitting data this way.
- multipart/form-data: When uploading files through a form, the form's enctype must be set to multipart/form-data. This method is usually used to upload files.
- application/json
- text/xml
- application/x-protobuf
Any of these works, as long as the server can correctly parse the request based on Content-Type and Content-Encoding.
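As an illustration of how the same data looks under two of these content types, the sketch below encodes a small dictionary as a form body and as a JSON body, then parses each one back according to its Content-Type. The sample data and the `parse_body` helper are assumptions made for this example.

```python
import json
from urllib.parse import urlencode, parse_qs

form = {"name": "alice", "city": "New York"}

# application/x-www-form-urlencoded: key=value pairs joined by "&"
form_type = "application/x-www-form-urlencoded"
form_body = urlencode(form).encode("ascii")

# application/json: the same data serialized as a JSON document
json_type = "application/json"
json_body = json.dumps(form).encode("utf-8")


def parse_body(content_type, body):
    """Pick a parser based on Content-Type, as a server would."""
    if content_type == "application/x-www-form-urlencoded":
        pairs = parse_qs(body.decode("ascii"))
        return {key: values[0] for key, values in pairs.items()}
    if content_type == "application/json":
        return json.loads(body)
    raise ValueError(f"unsupported Content-Type: {content_type}")


print(form_body)
print(json_body)
```

Both bodies decode to the same dictionary; only the Content-Type tells the server which parser to use.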
4. Response message
An HTTP response, like an HTTP request, consists of three parts: a status line, response headers, and a response body. The status line consists of the protocol version, a numeric status code, and a textual description of the status, separated by spaces. Common status codes:
- 200 OK: The client request succeeded.
- 301 Moved Permanently: The requested resource has been permanently redirected.
- 302 Found (formerly "Moved Temporarily"): The requested resource has been temporarily redirected.
- 304 Not Modified: The file has not been modified, so the cached copy can be used directly.
- 400 Bad Request: The server cannot understand the client request because of a syntax error.
- 401 Unauthorized: The request is not authorized (this status code must be used together with the WWW-Authenticate header field).
- 403 Forbidden: The server received the request but refuses to serve it. The server usually gives the reason in the response body.
- 404 Not Found: The requested resource does not exist.
- 500 Internal Server Error: An unexpected error occurred on the server, so the client's request could not be completed.
- 503 Service Unavailable: The server cannot currently process client requests. It may recover after a period of time.
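The status line format described above can be taken apart with a few lines of Python. This is only a sketch: `parse_status_line` and `status_class` are hypothetical helper names, while `http.HTTPStatus` is the standard-library enumeration of status codes.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into version, numeric code, and description."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """Map the first digit of a status code to its general category."""
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes[code // 100]


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(version, code, reason, status_class(code))
print(HTTPStatus(304).phrase)
```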
5. Conditional GET
Conditional GET is a scheme HTTP provides to avoid wasting bandwidth. When to use it: the client has visited a page before and wants to visit it again. How it works: the client sends a request asking the server whether the page has changed since its last visit. If the page has not changed, the server does not need to send the whole page again, and the client simply uses its local cache. If the page has changed after the time the client supplied, the server sends the updated page to the client.
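One common way to carry the "time of last visit" is the If-Modified-Since request header paired with the Last-Modified response header. The sketch below shows the server-side decision only; the `respond` function and its arguments are hypothetical, and a real server would also consider ETag-based validators.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, headers, b""      # unchanged: the client reuses its cache
    return 200, headers, body         # first visit or changed: send the page


modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
first = respond({}, modified, b"<html>v1</html>")
again = respond({"If-Modified-Since": first[1]["Last-Modified"]}, modified, b"<html>v1</html>")
print(first[0], again[0])
```

The second request sends back the timestamp it received earlier, so the server can answer with an empty 304 response instead of the full page.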
6. Persistent connection
HTTP generally follows a request-response model. In the ordinary mode, that is, without keep-alive, the client and server create a new connection for each request/response and close it as soon as the exchange completes (in this sense HTTP is a connectionless protocol). With keep-alive, the connection between client and server stays open, so subsequent requests to the same server avoid establishing a new connection. In HTTP/1.0, a client that supports keep-alive adds the header field Connection: Keep-Alive to its request. In HTTP/1.1, all connections are kept alive by default.
- HTTP keep-alive simply means keeping the current TCP connection open to avoid re-establishing it.
- A persistent HTTP connection cannot last forever. For example, Keep-Alive: timeout=5, max=100 means that an idle TCP connection is kept for 5 seconds, and that the connection will be closed after serving at most 100 requests.
- HTTP is a stateless protocol, which means that each request is independent; Keep-Alive does not change that. Keep-Alive also does not guarantee that the connection between client and server stays active; the only guarantee is that a notification is received when the connection is closed.
- With a persistent connection, how do the client and server know that a transfer is complete?
  1. By checking whether the amount of data received has reached Content-Length.
  2. Dynamically generated content has no Content-Length. It is sent with chunked encoding, and the chunked data ends with an empty chunk that marks the end of the transfer.
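The Keep-Alive header value shown above is a comma-separated list of name=value parameters. As a small sketch, a hypothetical `parse_keep_alive` helper could read it like this:

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=5, max=100'."""
    params = {}
    for part in value.split(","):
        name, _, number = part.strip().partition("=")
        if name and number.isdigit():      # ignore anything malformed
            params[name.lower()] = int(number)
    return params


limits = parse_keep_alive("timeout=5, max=100")
print(limits)
```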
7. Transfer-Encoding
Transfer-Encoding is a header that identifies the transfer format of an HTTP message. The transfer format most commonly used is chunked. If the Transfer-Encoding header of a request or response has the value chunked, the body of the message consists of an indeterminate number of chunks and ends with a final chunk of size 0.
chunked and multipart sound similar, but they belong to different categories in HTTP. multipart is a Content-Type that indicates the type of the message content, while chunked is a transfer format that indicates how the body is transmitted. With chunked transfer the size of the content cannot be known in advance; the end is only signaled by the final empty chunk, so for downloads there is no way to show progress. The advantage of chunked is that the server can generate content and send it at the same time, without having to know everything in advance. HTTP/2 does not support Transfer-Encoding: chunked; it has its own mechanisms for streaming data. Source: MDN, Transfer-Encoding.
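The chunk format itself is simple: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a chunk of size 0 ends the body. The following is a minimal sketch of an encoder and decoder; the function names are made up, and trailers and chunk extensions are not handled beyond being skipped.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for piece in pieces:
        if piece:   # an empty piece would be mistaken for the terminator
            out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    return out + b"0\r\n\r\n"   # final chunk of size 0 marks the end


def decode_chunked(data):
    """Decode a chunked body (no trailers) back into the original bytes."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end].split(b";")[0], 16)   # drop chunk extensions
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2   # skip the CRLF that follows the chunk data


encoded = encode_chunked([b"Wiki", b"pedia"])
print(encoded)
print(decode_chunked(encoded))
```

Because the sender only needs to know the size of each piece, it can start transmitting before the whole body exists, which is the advantage described above.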
8. HTTP Pipelining
By default, each transport-layer connection in HTTP carries one request and response at a time: the browser sends the next request only after receiving the response to the previous one. With a persistent connection, the messages on a connection look like Request 1 -> Response 1 -> Request 2 -> Response 2 -> Request 3 -> Response 3. HTTP pipelining is a technique that sends multiple HTTP requests back to back without waiting for the server's responses. The messages on a connection then look like Request 1 -> Request 2 -> Request 3 -> Response 1 -> Response 2 -> Response 3.
- Pipelining relies on persistent connections, so only HTTP/1.1 supports it (HTTP/1.0 does not).
- Only idempotent requests such as GET and HEAD should be pipelined; POST is restricted.
- Pipelining should not be enabled when a connection is first created, because the other party (the server) may not support HTTP/1.1.
- Pipelining does not change the order of responses: the server must return them in the order in which the requests were received.
- HTTP/1.1 requires servers to accept pipelined requests. It does not require the server to pipeline its responses as well, only that pipelined requests do not fail.
- Because of the server-side issues mentioned above, enabling pipelining is unlikely to bring significant performance gains, and many servers and proxies do not support pipelining well, so modern browsers such as Chrome and Firefox do not enable it by default.
9. Session tracking
Session
The whole process in which a client opens a connection to the server, sends requests, and the server responds to them is called a session.
Session tracking
Session tracking means keeping track of the successive requests and responses exchanged between the same user and the server.
Why session tracking is needed
HTTP is stateless, so when a client communicates with a server over HTTP, the protocol itself cannot save any state (information) about the user. The connection may be closed after a response, and the next request then arrives on a new connection. The server therefore needs some way to determine whether two requests come from the same user.
Common techniques for session tracking
- URL rewriting
  - A URL (Uniform Resource Locator) is the address of a specific resource on the Web. URL rewriting appends extra data that identifies the session to the end of the URL, so the session ID reaches the server through the URL and the server can tell users apart.
- Hidden form fields
  - The session ID is added to a form element that is not displayed to the user and is submitted to the server together with the form.
- Cookie
  - A cookie is a short piece of information sent by the server to the client. The client stores it and sends it back to the server with each subsequent request, where it can be used to identify the user. The server can send a new or updated cookie in any response, and the client saves it for next use.
  - A cookie can be kept in client memory, in which case it is called a temporary cookie and is cleared when the client closes; it can also be stored on disk as a persistent cookie.
- Session
  - Each user has their own Session, which is exclusive to that user and not shared. Information can be stored in the Session.
  - A Session object is created on the server, and a SessionID is generated to identify it. The SessionID is sent to the client in a cookie; on the next access the client sends the SessionID back, and the server uses it to identify the user.
  - Session implementations typically depend on cookies. If cookies are disabled, the Session stops working unless the SessionID is passed another way, such as URL rewriting.
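The cookie-plus-Session combination can be sketched in a few lines. Everything here is illustrative: the cookie name `SESSIONID`, the in-memory `SESSIONS` store, and the `handle_request` function are assumptions, and a real application would use its framework's session support.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}   # session_id -> per-user data, kept on the server


def handle_request(cookie_header):
    """Return (session_data, set_cookie_value_or_None) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("SESSIONID")
    if morsel is not None and morsel.value in SESSIONS:
        return SESSIONS[morsel.value], None          # known user: reuse the Session

    session_id = secrets.token_hex(16)               # new, hard-to-guess SessionID
    SESSIONS[session_id] = {"visits": 0}
    reply = SimpleCookie()
    reply["SESSIONID"] = session_id
    reply["SESSIONID"]["httponly"] = True            # hide the cookie from page scripts
    return SESSIONS[session_id], reply["SESSIONID"].OutputString()


# First request: no cookie, so the server creates a Session and sets a cookie.
data, set_cookie = handle_request(None)
data["visits"] += 1

# Next request: the client sends the cookie back and gets the same Session.
same_data, nothing_to_set = handle_request(set_cookie.split(";")[0])
print(same_data["visits"], nothing_to_set)
```

Only the SessionID travels between client and server; the data itself stays in the server-side store.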
10. Cross-site attacks
Cross-Site Request Forgery (CSRF): the attacker forges a request that impersonates a user's normal operation on the site.
How to prevent CSRF attacks:
- Accept only POST requests for critical actions
- Verification code (captcha)
  - With a captcha, each operation requires user interaction, which effectively defends against CSRF attacks. However, having to enter a captcha for every action on a site seriously hurts the user experience, so captchas are usually reserved for special actions or for registration.
- Referer check
  - For example, when you post a message in a forum, no matter where you are redirected afterwards, the previous URL must be the page containing the message input box, and that URL is carried in the Referer header of the new request. By checking the Referer value we can judge whether a request is legitimate. The problem is that the server does not always receive a Referer value, so the Referer check is generally used to monitor for CSRF attacks rather than to defend against them.
- Token
  - Encrypting the request parameters prevents CSRF attacks.
  - Add a new Token parameter; an attacker who does not know the Token cannot construct a legitimate request.
  - Principles for using the Token:
    - The Token should be random enough
    - The Token is one-time, that is, it is refreshed after each successful request
    - The Token must be kept confidential
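The Token principles listed above (random, one-time, confidential) can be sketched as follows. The `CsrfTokens` class and its in-memory storage are hypothetical; in practice the token would live in the user's server-side session and be embedded in the form.

```python
import hmac
import secrets


class CsrfTokens:
    """Issue and check one-time CSRF tokens, keyed by session ID."""

    def __init__(self):
        self._tokens = {}   # session_id -> currently valid token

    def issue(self, session_id):
        token = secrets.token_urlsafe(32)        # random enough to be unguessable
        self._tokens[session_id] = token
        return token

    def verify(self, session_id, submitted):
        # pop() makes the token one-time: it is discarded after any attempt
        expected = self._tokens.pop(session_id, None)
        if expected is None or submitted is None:
            return False
        # constant-time comparison avoids leaking information through timing
        return hmac.compare_digest(expected.encode(), submitted.encode())


tokens = CsrfTokens()
token = tokens.issue("session-1")
print(tokens.verify("session-1", token))    # legitimate form submission
print(tokens.verify("session-1", token))    # replay of the same token
```

In this sketch a failed attempt also consumes the token, so the page has to request a fresh one; whether that is desirable depends on the application.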
XSS (Cross-Site Scripting): a type of injection attack.
How to defend against XSS: parse the user's input with an HTML parsing library to extract the data, then rebuild the HTML element tree from the user's original tags and attributes. During the rebuild, keep only tags and attributes that are on a whitelist.
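The whitelist approach can be illustrated with Python's standard `html.parser`. This is a deliberately small sketch, not a production sanitizer: the `ALLOWED` table, the rule that links must start with http:// or https://, and the class name are all assumptions, and unbalanced tags are not repaired. Real applications should rely on a well-reviewed sanitizing library.

```python
from html import escape
from html.parser import HTMLParser

# tag -> attributes allowed on that tag (hypothetical whitelist)
ALLOWED = {"b": set(), "i": set(), "p": set(), "a": {"href"}}


class WhitelistSanitizer(HTMLParser):
    """Rebuild markup from parsed input, keeping only whitelisted parts."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return                                   # drop unknown tags entirely
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue                             # drop onclick, style, ...
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue                             # drop javascript: and similar links
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))                # text is always escaped


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(sanitize('<a href="javascript:alert(1)" onclick="x()">link</a>'))
```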
Part.2 – HTTP over SSL/TLS
1. HTTPS basic process
HTTPS (HTTP over SSL/TLS) is a protocol for transmitting HTTP content over an encrypted channel.
Basic TLS process (with RSA key exchange):
- The client sends a ClientHello message to the server, containing its TLS version and the encryption and compression algorithms it supports.
- The server replies with a ServerHello message containing the server's TLS version and the encryption and compression algorithms it selected, and sends its public certificate issued by a Certificate Authority (CA), which contains the public key. The client uses this public key to encrypt the subsequent handshake until a new symmetric key is negotiated. The certificate also contains the Common Name (CN), which the client uses to verify the identity of the server.
- The client validates the server's certificate against its list of trusted CAs. If the certificate is trusted, the client generates a string of pseudo-random numbers and encrypts it with the server's public key. This random data is used to generate a new symmetric key.
- The server decrypts the random data with its own private key and uses it to generate the same symmetric master key.
- The client sends a Finished message to the server, using the symmetric key to encrypt a hash of the handshake communication so far.
- The server computes its own hash, decrypts the message sent by the client, and checks whether the two values match. If they do, it sends a Finished message to the client, also encrypted with the negotiated symmetric key.
- From now on, the entire TLS session is encrypted with the symmetric key and carries application-layer (HTTP) content.
A complete TLS exchange requires three kinds of algorithms (protocols): a key exchange algorithm, a symmetric encryption algorithm, and a message authentication algorithm.
2. TLS certificate mechanism
An important step in the HTTPS process is that the server must have a certificate issued by a CA (certificate authority). The client authenticates the server against its list of trusted CAs. In modern browsers, certificate verification relies on a chain of trust: each certificate relies on the certificate above it to prove its credibility. The top-level certificate is the root certificate, and the authority that holds a root certificate is called a root CA (root certificates are commonly preinstalled in operating systems).
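In Python, the trusted CA list and the verification behavior are represented by an `ssl.SSLContext`. The sketch below inspects the default client context without touching the network; the `peer_subject` helper is a hypothetical illustration of how the context would be used and is not called here.

```python
import socket
import ssl

# Default client-side context: loads the platform's trusted CA certificates
context = ssl.create_default_context()

# The server certificate must chain to a trusted CA, and the name in the
# certificate must match the host we asked for.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)


def peer_subject(host, port=443):
    """Connect to host, run the TLS handshake, and return the certificate subject."""
    with socket.create_connection((host, port), timeout=5) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.getpeercert()["subject"]
```

If verification of the chain or of the host name fails, `wrap_socket` raises an exception instead of returning a connection.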
3. Man-in-the-middle attack
In a man-in-the-middle attack, the attacker establishes independent connections with both ends of a communication and relays the data it receives, so that both parties believe they are talking directly to each other over a private connection. In fact, the whole conversation is controlled by the attacker, who can intercept the communication between the two parties and insert new content.
SSL Stripping Problem
SSL stripping prevents users from reaching a website over HTTPS. Because not all websites are HTTPS-only, most support both HTTP and HTTPS. When a user visits a website, they may type an http:// address into the address bar. That first request is sent entirely in clear text, which gives an attacker an opportunity. By attacking DNS responses, the attacker can become the man in the middle.
HSTS
HSTS (HTTP Strict Transport Security) is a mechanism that forces browsers to access a website over HTTPS. The basic mechanism is that the server adds a special header to its response, instructing the browser to use HTTPS for all further access to the site. HSTS has an obvious weakness: it only takes effect after the browser has received the header in the first response from the server, so what if the very first visit is attacked? To solve this problem, browsers ship with a built-in list of domain names, called the HSTS Preload List. For sites on this list, HTTPS is enforced from the first request.
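The special header is Strict-Transport-Security, whose value is a list of directives such as max-age, includeSubDomains, and preload. As a rough sketch of what a client does with it, the hypothetical helpers below parse the header and rewrite an http:// URL once a policy is known.

```python
def parse_hsts(value):
    """Parse a Strict-Transport-Security header value into a small policy dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.lower()
        if name == "max-age":
            policy["max_age"] = int(arg.strip('"'))   # lifetime of the policy in seconds
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def upgrade_url(url, policy):
    """Rewrite http:// to https:// when an HSTS policy is in force."""
    if policy["max_age"] and url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return url


policy = parse_hsts("max-age=31536000; includeSubDomains; preload")
print(policy)
print(upgrade_url("http://example.com/login", policy))
```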
Forged Certificate Attack
HSTS only addresses the SSL stripping problem; even when HTTPS is used throughout, traffic can still be eavesdropped on. In a forged certificate attack, the first step is to attack the DNS server. The second step is that the attacker's own certificate must be trusted by the user's system. Users have little control over this step; it depends on certificate authorities refraining from issuing certificates carelessly.
HPKP
HPKP was created to counter forged certificate attacks. HPKP (Public Key Pinning Extension for HTTP) goes one step further than HSTS: the server sends the fingerprint of its public key in a response header, and when the browser later detects a difference between the pinned fingerprint and the public key it actually receives, it can assume that an attack is under way. Like HSTS, HPKP relies on a header returned by the server and does not solve the first-visit problem by itself, so browsers also carry some built-in HPKP lists.
Part.3 – TCP protocol
1. TCP features
- TCP provides a connection-oriented, reliable byte stream service.
- In a TCP connection, only two parties communicate with each other. Broadcast and multicast cannot be used for TCP.
- TCP uses checksums, acknowledgements, and retransmission to ensure reliable transmission.
- TCP numbers data segments and uses cumulative acknowledgements to ensure that data is delivered in order and without duplication.
- TCP uses a sliding window for flow control and dynamically adjusts the window size for congestion control.
Note: TCP does not guarantee that the data will be received by the other party, because it is impossible. What TCP does is deliver data to the other party if possible, and notify the user otherwise (by aborting retransmission and breaking the connection). So TCP is not exactly a 100% reliable protocol. What it does provide is reliable delivery of data or reliable notification of failures.
2. Three-way handshake and four-way wave
Three-way handshake
The three-way handshake means that the client and server exchange three packets to establish a TCP connection. Its purpose is to connect to the specified port on the server, establish the TCP connection, synchronize the sequence numbers and acknowledgement numbers of both parties, and exchange TCP window size information. In socket programming, the client triggers the three-way handshake by calling connect().
- First handshake: (SYN = 1, seq = x)
  - The client sends a TCP packet with the SYN flag set to 1, indicating the server port it wants to connect to; the initial sequence number x is stored in the Sequence Number field of the packet header.
  - After sending, the client enters the SYN_SENT state.
- Second handshake: (SYN = 1, ACK = 1, seq = y, ACKnum = x + 1)
  - The server replies with an acknowledgement in which both the SYN and ACK flags are 1. The server chooses its own ISN (initial sequence number) y and places it in the seq field, and sets the Acknowledgement Number to the client's ISN plus 1, namely x + 1.
  - After sending, the server enters the SYN_RCVD state.
- Third handshake: (ACK = 1, ACKnum = y + 1)
  - The client sends another acknowledgement packet with the SYN flag set to 0 and the ACK flag set to 1, and puts the server's ISN plus 1 (y + 1) in the acknowledgement number field.
  - After sending, the client enters the ESTABLISHED state; the server also enters ESTABLISHED when it receives this packet, and the TCP handshake ends.
[Figure: schematic diagram of the three-way handshake]
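The bookkeeping of sequence and acknowledgement numbers in the three steps can be written out as a small simulation. This is only a model of the numbers exchanged, not real networking; the function name and the dictionary layout are made up for illustration.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three segments of the handshake as dictionaries."""
    syn = {"from": "client", "SYN": 1, "ACK": 0, "seq": client_isn}
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1, "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_SPACE}      # ACKnum = x + 1
    ack = {"from": "client", "SYN": 0, "ACK": 1,
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}          # ACKnum = y + 1
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one because the SYN flag itself consumes one sequence number.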
Four-way wave
Tearing down a TCP connection requires four packets, so it is called the four-way wave; it is sometimes described as a modified three-way handshake. Either the client or the server can initiate it. In socket programming, calling close() at either end triggers the wave.
- First wave: (FIN = 1, seq = x)
  - Suppose the client wants to close the connection. The client sends a packet with the FIN flag set to 1, indicating that it has no more data to send but can still receive data.
  - After sending, the client enters the FIN_WAIT_1 state.
- Second wave: (ACK = 1, ACKnum = x + 1)
  - The server acknowledges the client's FIN packet, signalling that it has received the client's request to close the connection but is not yet ready to close it.
  - After sending, the server enters the CLOSE_WAIT state. On receiving this acknowledgement, the client enters the FIN_WAIT_2 state and waits for the server to close the connection.
- Third wave: (FIN = 1, seq = y)
  - When the server is ready to close the connection, it sends the client a close request with FIN set to 1.
  - After sending, the server enters the LAST_ACK state and waits for the final ACK from the client.
- Fourth wave: (ACK = 1, ACKnum = y + 1)
  - The client receives the close request from the server, sends an acknowledgement packet, and enters the TIME_WAIT state, where it waits in case the ACK has to be retransmitted.
  - On receiving the acknowledgement, the server closes the connection and enters the CLOSED state.
  - After waiting a fixed time (twice the maximum segment lifetime, 2MSL) without receiving a retransmitted FIN from the server, the client concludes that the server has closed the connection normally, closes its own side, and enters the CLOSED state.
[Figure: schematic diagram of the four-way wave]
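The state changes during the four waves can be summarized as a transition table. The sketch below models only the states named in this section; the event labels and the `run` helper are invented for illustration and do not cover every path of the full TCP state machine (for example, simultaneous close).

```python
# (current state, event) -> next state
TRANSITIONS = {
    # side that closes first (the client in the description above)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # side that closes second (the server in the description above)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


client = run(["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"])
server = run(["recv FIN, send ACK", "send FIN", "recv ACK"])
print(" -> ".join(client))
print(" -> ".join(server))
```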
3. SYN attack
What is a SYN attack?
After the server sends SYN-ACK in the second handshake and before it receives the client's ACK, the TCP connection is called a half-open connection, and the server is in the SYN_RCVD state. Once it receives the ACK, the server moves to ESTABLISHED. In a SYN attack, the attacking client forges a large number of non-existent source IP addresses in a short time and keeps sending SYN packets to the server. The server replies with acknowledgement packets and waits for the clients' confirmation. Because the source addresses do not exist, the server keeps retransmitting until it times out. These forged SYN packets occupy the half-open connection queue for a long time, so legitimate SYN requests are dropped, and the target system slows down or, in severe cases, suffers network congestion or even crashes. A SYN attack is a typical DoS/DDoS attack.
How to detect a SYN attack?
When the server has a large number of half-open connections, especially with random source IP addresses, it is very likely under a SYN attack. On Linux/Unix, you can use the netstat command to detect SYN attacks.
How to defend against a SYN attack?
SYN attacks cannot be completely blocked unless TCP is redesigned. You can, however, minimize the damage they cause:
- Shorten the SYN timeout
- Increase the maximum number of half-open connections
- Use a filtering gateway for protection
- Use SYN cookies
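As a sketch of the detection idea, the snippet below counts connection states in text shaped like the output of `netstat -ant` on Linux, where half-open connections appear as SYN_RECV. The sample output is made up, and the column layout is an assumption that differs on other platforms.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connection states, assuming the state is the sixth column."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample in the usual Linux layout
SAMPLE = """\
Proto Recv-Q Send-Q Local Address    Foreign Address     State
tcp   0      0      10.0.0.5:80      203.0.113.7:51234   SYN_RECV
tcp   0      0      10.0.0.5:80      198.51.100.9:40022  SYN_RECV
tcp   0      0      10.0.0.5:80      192.0.2.44:38110    ESTABLISHED
tcp   0      0      0.0.0.0:22       0.0.0.0:*           LISTEN
"""

states = count_tcp_states(SAMPLE)
print(states["SYN_RECV"], "half-open connections out of", sum(states.values()))
```

A sustained, unusually high SYN_RECV count with widely scattered source addresses is the pattern described above; what counts as "high" depends on the server's normal load.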
4. TCP KeepAlive
A TCP connection is in fact a purely software-level concept; there is no such thing as a "connection" at the physical layer. If one end fails and the other end cannot detect it, the other end keeps the connection for a long time. As a result, a large number of half-open TCP connections accumulate and are kept for a long time, wasting resources on the end system. TCP's KeepAlive mechanism at the transport layer addresses this problem.
The basic principle of TCP KeepAlive is as follows: at intervals, one end sends a probe packet to its peer. If it receives an ACK reply, the connection is considered alive. If no reply arrives after a certain number of retries, the TCP connection is dropped.
Limitations of TCP KeepAlive: First of all, TCP KeepAlive detects by sending a probe packet, which brings extra traffic to the network. In addition, TCP KeepAlive can only detect whether the connection is alive at the kernel level, which does not necessarily mean that the service is available. For example, when the CPU usage of a server is 100% and the server cannot respond to requests, TCP KeepAlive still considers the connection alive. Therefore, TCP KeepAlive is of relatively little value to application-layer programs.
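In socket programming, KeepAlive is switched on per socket with `SO_KEEPALIVE`. The sketch below also sets the probe timing where the platform exposes it; the option names `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are Linux-style names that are not available everywhere, so the code checks for them first. The `enable_keepalive` helper and its default values are assumptions for illustration.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP KeepAlive and, where supported, tune the probe timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),        # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),   # seconds between probes
        ("TCP_KEEPCNT", count),        # failed probes before dropping the connection
    )
    for name, value in tuning:
        option = getattr(socket, name, None)   # not defined on every platform
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

Because of the limitations described above, applications that need to know whether the peer service is actually responsive usually add their own heartbeat messages on top.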
Part.4 – UDP protocol
UDP is a simple transport layer protocol. Compared with TCP, UDP has the following features:
- UDP lacks reliability. UDP does not provide mechanisms such as acknowledgements, sequence numbers, or timeout retransmission. UDP datagrams may be duplicated or reordered in the network; that is, UDP guarantees neither that a datagram reaches its final destination, nor the order in which datagrams arrive, nor that each datagram arrives only once.
- UDP datagrams have a length. Each UDP datagram has a length, and if a datagram arrives at its destination correctly, its length is passed to the receiver along with the data. TCP, by contrast, is a byte-stream protocol with no record boundaries.
- UDP is connectionless. There is no long-term relationship between a UDP client and server, and no handshake is needed to create a connection before sending a datagram.
- UDP supports multicast and broadcast.
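The connectionless, message-oriented behavior can be observed on the loopback interface. In this sketch two datagrams are sent without any handshake, and each receive call returns exactly one datagram with its own length. It assumes a loopback network is available; on loopback the datagrams normally arrive, but UDP itself gives no such guarantee.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2.0)               # do not wait forever if a datagram is lost
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first", address)       # no connect() and no handshake beforehand
sender.sendto(b"second datagram", address)

# Each recvfrom() returns one whole datagram; boundaries are preserved
messages = [receiver.recvfrom(1024)[0] for _ in range(2)]
print(messages)

sender.close()
receiver.close()
```

With a TCP stream the two writes could be delivered as a single block of bytes; with UDP they remain two separate records.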
Part.5 – IP protocol
IP sits at the network layer, layer 3 of the OSI model. In contrast to transport-layer protocols, the network layer is responsible for host-to-host (point-to-point) delivery, while the transport layer (TCP/UDP) provides end-to-end service.
1. The OSI seven-layer network model
Layer | Name |
---|---|
7 | Application layer |
6 | Presentation layer |
5 | Session layer |
4 | Transport layer |
3 | Network layer |
2 | Data link layer |
1 | Physical layer |
2. IP address classification
- Class A address
- Class B address
- Class C address
- Class D address
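In classful addressing the class is determined by the leading bits of the address, which is equivalent to checking the range of the first octet. The sketch below applies the conventional ranges (A: 0 to 127, B: 128 to 191, C: 192 to 223, D: 224 to 239, with the remainder reserved as class E); the `address_class` function is a made-up helper.

```python
import ipaddress


def address_class(ip):
    """Return the classful category (A to E) of an IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(ip)) >> 24
    if first_octet < 128:
        return "A"      # leading bit 0
    if first_octet < 192:
        return "B"      # leading bits 10
    if first_octet < 224:
        return "C"      # leading bits 110
    if first_octet < 240:
        return "D"      # leading bits 1110, used for multicast
    return "E"          # leading bits 1111, reserved


for sample in ("10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1"):
    print(sample, address_class(sample))
```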
3. Broadcast and multicast
Broadcast and multicast apply only to UDP (TCP is connection-oriented).
Broadcast
There are four kinds of broadcast address:
- Limited broadcast: the limited broadcast address is 255.255.255.255.
- Broadcast to a network: the address whose host number is all 1 bits
- Broadcast to a subnet
- Broadcast to all subnets
Multicast
Multicast uses class D addresses. The 28 bits that follow the class D prefix are used as the multicast group number.
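Python's standard `ipaddress` module can compute these addresses directly. The sketch below derives the broadcast address of an example network (192.168.1.0/24 is chosen arbitrarily) and extracts the 28-bit group number from a class D address.

```python
import ipaddress

# Broadcast to a network: all host bits set to 1
network = ipaddress.ip_network("192.168.1.0/24")
print(network.broadcast_address)

# Limited broadcast: all 32 bits set to 1
limited = ipaddress.ip_address("255.255.255.255")
print(int(limited) == 2 ** 32 - 1)

# Multicast: class D address, low 28 bits are the group number
group = ipaddress.ip_address("224.0.0.1")
group_number = int(group) & 0x0FFFFFFF
print(group.is_multicast, group_number)
```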
BGP
Border Gateway Protocol (BGP) is a routing protocol between autonomous systems that runs over TCP.
Part.6 – Socket programming
1. Basic concepts of Sockets
A socket is an encapsulation of the TCP/IP protocol family: an intermediate software abstraction layer between the application layer and the TCP/IP protocols. In design-pattern terms, a socket is a facade. It hides the complexity of the TCP/IP protocol family behind the socket interface, so that users only deal with a simple set of calls and let the socket organize the data to conform to the chosen protocol. A socket can also be seen as a way for processes on different computers to communicate over the network: the triple (IP address, protocol, port) uniquely identifies a process on the network, and processes use this identifier to interact with each other. Sockets originated in Unix. One of the basic philosophies of Unix/Linux is that everything can be operated on with the pattern "open -> read/write -> close", so sockets are treated as a special kind of file.
2. Write a simple WebServer
A simple server process includes:
- Establish a connection: accept a client connection.
- Receive the request: read an HTTP request message from the network.
- Handle the request: access the requested resource.
- Build the response: create an HTTP response message with headers.
- Send the response to the client.
General flow and the functions called:
- socket(): create a socket
- bind(): assign an address to the socket
- listen(): wait for connection requests
- accept(): accept a connection request
- read()/write(): exchange data
- close(): close the connection
Blog:www.qiuxuewei.com
Wechat Official Account:@ The way developers grow