Network gleanings

Part.1 – HTTP protocol

1. HTTP features

  • HTTP is an application-layer protocol based on TCP/IP. The default port number is 80.
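  • HTTP is stateless, and it is often described as connectionless: without keep-alive, each request/response pair uses its own connection, which is closed afterwards.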

2. HTTP message

HTTP is an application-layer specification built on TCP/IP, and the start line and headers of a message are transmitted as ASCII text. The specification divides a request into three parts: the request line, the request headers, and the request body. HTTP defines different methods for interacting with the server. The four most commonly used are GET, POST, PUT, and DELETE. URL stands for Uniform Resource Locator; a URL describes a resource on the network. The POST, DELETE, PUT, and GET methods roughly correspond to creating, deleting, updating, and querying that resource. Other request methods include HEAD, OPTIONS, TRACE, and PATCH.

  • GET: Retrieves information. (Safe and idempotent.) Note: safe means the operation only retrieves information and does not modify it. Idempotent means that repeating the same request has the same effect on the server as making it once.
  • POST: A request that may modify a resource on the server. (Not safe, not idempotent.)
  • HEAD: Similar to GET, but the response carries no message body, only header information about the resource such as Content-Type and Content-Length. (Safe and idempotent.)
  • PUT: Creates or replaces a resource. (Not safe, idempotent.)
  • DELETE: Deletes a resource. (Not safe, idempotent.)
  • OPTIONS: Asks which methods and options the server supports for a URL; it is also used to check that an interface is reachable. (Safe, idempotent.)
  • TRACE: Echoes back the request as the server received it, so the client can see what changes or additions intermediate servers made, if any. (Safe, idempotent.)
  • PATCH: Like PUT, but represents a partial update of a resource. It was added to HTTP later than the other methods, so the client may need to check that the server supports it. (Not safe, and not guaranteed to be idempotent.)

The amount of data that GET can submit is limited by URL length. HTTP itself does not limit URL length; browsers and servers do. In theory POST has no size limit, since HTTP imposes none, but servers usually set a limit for security reasons.
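To make the message layout concrete, here is a minimal Python sketch that serializes a request into request line, headers, blank line, and body. The function name `build_request` and the host `example.com` are assumptions for illustration; a real client would normally use a library such as `http.client`.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    all_headers = dict(headers or {})
    if body:
        # Content-Length tells the server where the body ends.
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# A GET has no body; a POST carries its data after the blank line.
get_message = build_request("GET", "/index.html", "example.com")
post_message = build_request(
    "POST", "/items", "example.com",
    {"Content-Type": "application/json"}, b'{"a":1}',
)
```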

3. Method of submitting data through POST

HTTP stipulates that data submitted by POST goes in the message body, but it does not specify the format or encoding of that data. The server normally learns the body encoding from the Content-Type field in the request header and parses the body accordingly. A scheme for submitting POST data therefore has two parts, the Content-Type and the body encoding. Common choices:

  • application/x-www-form-urlencoded: The most common way to submit POST data. A browser's native form submits data this way if its enctype attribute is not set.
  • multipart/form-data: When uploading files through a form, the form's enctype must be set to multipart/form-data. This method is usually used to upload files.
  • application/json
  • text/xml
  • application/x-protobuf

Any format can be used as long as the server can correctly parse the request based on Content-Type and Content-Encoding.
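As a rough illustration of the point that the body format is chosen by the sender and announced through Content-Type, the sketch below encodes the same data two ways and parses it back. The field names and the `parse_body` helper are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode

form = {"name": "alice", "city": "New York"}

# The same data, encoded for two different Content-Type values.
urlencoded_body = urlencode(form).encode("ascii")
json_body = json.dumps(form).encode("utf-8")


def parse_body(content_type, body):
    """Parse a POST body according to the Content-Type the client declared."""
    if content_type == "application/x-www-form-urlencoded":
        parsed = parse_qs(body.decode("ascii"))
        return {key: values[0] for key, values in parsed.items()}
    if content_type == "application/json":
        return json.loads(body.decode("utf-8"))
    raise ValueError(f"unsupported Content-Type: {content_type}")
```

The server side only works because it dispatches on the declared type; sending a JSON body with a form Content-Type would be parsed incorrectly.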

4. Response message

An HTTP response, like an HTTP request, consists of three parts: a status line, response headers, and a response body. The status line consists of the protocol version, a numeric status code, and a textual description of the status, separated by spaces. Common status codes:

  • 200 OK: The client request succeeded.
  • 301 Moved Permanently: The requested resource has been permanently redirected.
  • 302 Moved Temporarily (named "Found" in HTTP/1.1): The requested resource is temporarily redirected.
  • 304 Not Modified: The file has not been modified; the cached copy can be used directly.
  • 400 Bad Request: The server cannot understand the client request because of a syntax error.
  • 401 Unauthorized: The request is not authorized. This status code must be used together with the WWW-Authenticate header field.
  • 403 Forbidden: The server received the request but refuses to serve it. The server usually gives the reason in the response body.
  • 404 Not Found: The requested resource does not exist.
  • 500 Internal Server Error: An unexpected error occurred on the server, so the client's request could not be completed.
  • 503 Service Unavailable: The server cannot process client requests at the moment. It may recover after a period of time.
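A small sketch of reading a status line, assuming the three space-separated fields described above. The helper names are made up for this example; `HTTPStatus` comes from Python's standard library.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def status_class(code):
    """Map a status code to its general class by its first digit."""
    classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return classes.get(code // 100, "other")


version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
standard_phrase = HTTPStatus(code).phrase  # reason phrase known to the stdlib
```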

5. Conditional GET

Conditional GET is a mechanism HTTP provides to avoid wasting bandwidth. When to use it: the client has visited a page before and wants to visit it again. How it works: the client sends a request asking the server whether the page has changed since its last visit. If the page has not changed, the server does not need to send the whole page again and the client simply uses its local cache. If the page is newer than the time supplied by the client, the server sends the updated page.
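The decision the server makes can be sketched as follows. This is a simplified illustration that only looks at the If-Modified-Since header (real servers also handle ETag and If-None-Match); the `respond` function and its return shape are assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(request_headers, last_modified, body):
    """Return (status, headers, body) for a GET, honouring If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        # Nothing changed since the client's copy: send no body at all.
        return 304, headers, b""
    return 200, headers, body


page_changed_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# First visit: no condition, so the full page is returned.
status, headers, _ = respond({}, page_changed_at, b"<html>...</html>")

# Second visit: the client echoes Last-Modified back as If-Modified-Since.
revisit_status, _, revisit_body = respond(
    {"If-Modified-Since": headers["Last-Modified"]}, page_changed_at, b"<html>...</html>"
)
```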

6. Persistent connection

HTTP generally uses a request-reply model. In the ordinary mode, that is, without keep-alive, the client and server create a new connection for every request/reply and close it as soon as the exchange completes (this is the sense in which HTTP is called connectionless). With keep-alive, the connection between client and server stays open, so subsequent requests to the same server avoid establishing or re-establishing a connection. In HTTP/1.0, a client that supports keep-alive adds the field Connection: Keep-Alive to the request header. In HTTP/1.1, all connections are kept alive by default.

  • HTTP keep-alive simply means keeping the current TCP connection open to avoid re-establishing it.
  • An HTTP persistent connection cannot last forever. For example, in Keep-Alive: timeout=5, max=100, timeout=5 means the TCP connection is held open for 5 seconds, and max=100 means the connection is closed after at most 100 requests.
  • HTTP is a stateless protocol, which means each request is independent; Keep-Alive does not change that. Keep-Alive also does not guarantee that the connection between client and server stays active. The only guarantee is that a notification is received when the connection is closed.
  • With a persistent connection, how do the client and server know that a transfer is over? 1. Check whether the amount of data received has reached Content-Length. 2. Dynamically generated content has no Content-Length; it is sent with chunked encoding, which ends with an empty chunk to indicate that the transfer is over.

7. Transfer-Encoding

Transfer-Encoding is a header that identifies the transfer format of an HTTP message. The transfer coding in common use is chunked. If the Transfer-Encoding header of an HTTP request or response has the value chunked, the message body consists of an indeterminate number of chunks and ends with a final chunk of size 0.

  • chunked and multipart sound similar but belong to different categories in HTTP. multipart is a Content-Type that describes the type of the message content, whereas chunked is a transfer format that describes how the body is transmitted.
  • With chunked transfer the size of the content cannot be known in advance; the end is determined only by the final empty chunk. For download requests there is therefore no way to show download progress.
  • The advantage of chunked is that the server can send content while it is still generating it, without knowing everything in advance. HTTP/2 does not support Transfer-Encoding: chunked because it has its own streaming mechanism (source: MDN, Transfer-Encoding).
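The chunk format can be sketched in a few lines of Python: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are assumptions, and trailers and chunk extensions are deliberately ignored or skipped.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would signal the end prematurely
            out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # final empty chunk marks the end


def decode_chunked(data):
    """Decode a chunked body, stopping at the zero-size chunk."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]  # drop any chunk extension
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"Hello, ", b"world"])
```

Because the sender only needs to know the size of the chunk it is about to send, it can start transmitting before the whole body exists.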

8. HTTP Pipelining

By default, each transport-layer connection in HTTP carries one request and response at a time, and the browser sends the next request only after receiving the response to the previous one. On a persistent connection, messages flow like this: Request 1 -> Response 1 -> Request 2 -> Response 2 -> Request 3 -> Response 3. HTTP pipelining is a technique that sends multiple HTTP requests in a batch without waiting for the server's responses. Messages on the connection then flow like this: Request 1 -> Request 2 -> Request 3 -> Response 1 -> Response 2 -> Response 3.

  • Pipelining relies on persistent connections, so only HTTP/1.1 supports it (HTTP/1.0 does not).
  • Only GET and HEAD requests can be pipelined; POST is restricted.
  • Pipelining should not be enabled when a connection is first created, because the other party (the server) may not support HTTP/1.1.
  • Pipelining does not change the order of responses: they arrive in the same order as the requests were sent.
  • HTTP/1.1 requires servers to accept pipelined requests. It does not require them to pipeline their response processing, only that pipelined requests do not fail.
  • Because of the server-side limitation above, enabling pipelining is unlikely to bring significant performance gains, and many servers and proxies support it poorly, so modern browsers such as Chrome and Firefox do not enable pipelining by default.

9. Session tracking

Session

The whole process in which a client opens a connection to the server, sends requests, and the server responds to them is called a session.

Session tracking

Session tracking means keeping track of the successive requests and responses exchanged between the same user and the server.

Why session tracking is needed

Clients and servers communicate over HTTP, which is stateless and cannot save the user's state (information). That is, the connection is closed after one response and the next request has to reconnect, so the server needs some way to determine whether the request comes from the same user.

Common techniques for session tracking

  • URL rewriting
    • A URL (Uniform Resource Locator) is the address of a specific resource on the Web. URL rewriting appends extra data identifying the session to the end of the URL, so the session ID reaches the server through the URL and the server can distinguish different users.
  • Hidden form fields
    • The session ID is submitted to the server in an HTML form element that is not visible to the user.
  • Cookie
    • A cookie is a short piece of information sent by the server to the client. The server sends the cookie in a response, the client saves it, and the client sends it back with later requests so that the server can identify the user.
    • A cookie can be kept in client memory, in which case it is called a temporary cookie and is cleared when the client closes. It can also be stored on disk as a persistent cookie.
  • Session
    • Each user has their own session, which is exclusive to that user and not shared. Information can be stored in the session.
    • The server creates a session object and generates a session ID to identify it, then sends the session ID to the client in a cookie. On the next access the client sends the session ID back, and the server uses it to identify the user.
    • This session implementation depends on cookies. If cookies are disabled, the session stops working unless the session ID is passed another way, such as URL rewriting.
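The cookie-plus-session scheme can be sketched as below. The cookie name `SESSIONID`, the in-memory `SESSIONS` store, and the `handle_request` function are all assumptions for illustration; a real framework would add expiry, signing, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session ID -> per-user data, kept on the server


def handle_request(cookie_header):
    """Return (session_data, set_cookie_value_or_None) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("SESSIONID")
    if morsel is not None and morsel.value in SESSIONS:
        # Known user: the session ID in the cookie identifies the session.
        return SESSIONS[morsel.value], None
    # New user: create a session and hand its ID to the client in a cookie.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"visits": 0}
    reply = SimpleCookie()
    reply["SESSIONID"] = session_id
    reply["SESSIONID"]["httponly"] = True
    return SESSIONS[session_id], reply.output(header="").strip()
```

On the first request the function returns a Set-Cookie value; when the client sends that cookie back, the same session data is found again.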

10. Cross-site attacks

Cross-Site Request Forgery (CSRF): the attacker forges a request that impersonates a user's normal operation on the site.

How to prevent CSRF cross-site attacks:

  • Critical actions accept only POST requests
  • CAPTCHA
    • With a CAPTCHA, each operation requires user interaction, which defends effectively against CSRF. However, having to enter a CAPTCHA for every action on a site seriously hurts the user experience, so CAPTCHAs are usually used only for special actions or for registration.
  • Checking the Referer
    • For example, when a user posts a message in a forum, the request must come from the page that contains the message input box, wherever the user is redirected afterwards. The URL of that previous page is carried in the Referer header of the new request, so checking the Referer value tells us whether the request is legitimate. The problem is that the server does not always receive a Referer value, so a Referer check is generally used to monitor for CSRF attacks rather than to defend against them.
  • Token
    • Protecting the request parameters with an unpredictable value prevents CSRF attacks.
    • Add a new parameter, the token. An attacker who does not know the token cannot construct a legitimate request.
    • Principles for using tokens:
      • The token should be sufficiently random
      • The token is one-time: it is replaced after each successful request
      • The token must be kept confidential
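A minimal sketch of the token principles listed above (random, one-time, compared without leaking timing information). The `CsrfTokens` class and its in-memory storage are assumptions for illustration, not a production design.

```python
import hmac
import secrets


class CsrfTokens:
    """Issue and verify one-time CSRF tokens, keyed by session ID."""

    def __init__(self):
        self._tokens = {}  # session ID -> currently valid token

    def issue(self, session_id):
        token = secrets.token_urlsafe(32)  # cryptographically random
        self._tokens[session_id] = token
        return token

    def verify(self, session_id, submitted):
        expected = self._tokens.get(session_id)
        if expected is None:
            return False
        # compare_digest avoids leaking the token through timing differences.
        if not hmac.compare_digest(expected.encode(), submitted.encode()):
            return False
        del self._tokens[session_id]  # one-time: a successful request consumes it
        return True


tokens = CsrfTokens()
form_token = tokens.issue("session-1")
```

The server would embed `form_token` in the form it renders and call `verify` when the form is submitted.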

XSS (Cross-Site Scripting) is a type of injection attack.

Defending against XSS: parse the user's input with an HTML parsing library to extract the data, then rebuild the HTML element tree from the user's original tags and attributes. During the rebuild, only tags and attributes on a whitelist are kept.
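The whitelist rebuild can be sketched with Python's standard `html.parser`. The whitelist contents and the `sanitize` function are hypothetical, and the `javascript:` check is only a token example of attribute-value filtering; a production system should rely on a maintained sanitizer library.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "p", "a"}   # hypothetical whitelist of tags
ALLOWED_ATTRS = {"a": {"href"}}       # hypothetical whitelist of attributes per tag


class Sanitizer(HTMLParser):
    """Re-emit parsed HTML, keeping only whitelisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return
        kept = ""
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            if value.strip().lower().startswith("javascript:"):
                continue  # script URLs are dropped even on allowed attributes
            kept += ' {}="{}"'.format(name, escape(value, quote=True))
        self.out.append("<{}{}>".format(tag, kept))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data))  # text is always re-escaped


def sanitize(html_text):
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```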

Part.2 – HTTP over SSL/TLS

1. HTTPS basic process

HTTPS (HTTP over SSL/TLS) is a protocol used to transmit HTTP content over encrypted channels.

TLS basic process:

  • The client sends a ClientHello message to the server containing its TLS version and the encryption and compression algorithms it supports.
  • The server sends the client a ServerHello message containing the server's TLS version, the encryption and compression algorithms it has selected, and the server's public certificate issued by a Certificate Authority (CA), which contains the public key. The client uses this public key to encrypt the rest of the handshake until a new symmetric key is negotiated. The certificate also contains the Common Name (CN), which the client uses to verify the server's identity.
  • The client verifies the server's certificate against its list of trusted CAs. If the certificate is trusted, the client generates a pseudo-random number and encrypts it with the server's public key. This random number is used to generate the new symmetric key.
  • The server decrypts the random number with its own private key and uses it to generate the same symmetric master key.
  • The client sends a Finished message to the server, using the symmetric key to encrypt a hash of the handshake communication.
  • The server computes its own hash, decrypts the message sent by the client, and checks whether the two values match. If they do, it sends the client a Finished message, also encrypted with the negotiated symmetric key.
  • From then on, the whole TLS session uses the symmetric key to encrypt application-layer (HTTP) content.

The complete TLS process requires three kinds of algorithm (protocol): a key exchange algorithm, a symmetric encryption algorithm, and a message authentication algorithm.
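From application code the handshake is usually driven by a TLS library rather than written by hand. The sketch below uses Python's standard `ssl` module; the function names are assumptions, and `fetch_tls_info` needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context():
    """Create a client context that verifies the server against trusted CAs."""
    return ssl.create_default_context()


def fetch_tls_info(host, port=443, timeout=5.0):
    """Perform a TLS handshake and report the negotiated version and cipher."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # wrap_socket runs the handshake; server_hostname is checked
        # against the names in the server certificate.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

The default context requires a valid certificate and a matching host name, which corresponds to the client-side verification step in the list above.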

2. TLS certificate mechanism

An important step in HTTPS is that the server must hold a certificate issued by a certificate authority (CA). The client authenticates the server against its list of trusted CAs. In modern browsers, certificate verification relies on a chain of trust: each certificate depends on the certificate above it to prove its credibility. The certificate at the top is the root certificate, and the authority that holds a root certificate is called a root CA (root certificates are commonly shipped with the operating system).

3. Man-in-the-middle attack

The so-called man-in-the-middle attack means that the attacker establishes independent contact with both ends of the communication and exchanges the received data, so that both sides of the communication believe that they are directly talking with each other through a private connection. In fact, the whole conversation will be completely controlled by the attacker. In a man-in-the-middle attack, an attacker can intercept communications between two parties and insert new content.

SSL Stripping Problem

SSL stripping prevents users from reaching a website over HTTPS. Few websites are HTTPS-only; most support both HTTP and HTTPS. When visiting a site, a user may type an http:// address in the address bar, so the first request is sent entirely in clear text, which gives an attacker an opportunity. By tampering with DNS responses, the attacker can become a man in the middle.

HSTS

HSTS is a mechanism for forcing browsers to access a website over HTTPS. The server adds a special header (Strict-Transport-Security) to its responses, instructing the browser to use HTTPS for all access to the site. HSTS has an obvious weakness: it takes effect only after the browser has received that header in a first response from the server, so what if the very first visit is attacked? To solve this, browsers ship with a list of domain names called the HSTS Preload List. For sites on the list, HTTPS is enforced from the first visit.
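As a rough sketch of what a browser does with the HSTS header, the code below parses a Strict-Transport-Security value and upgrades an http:// URL once a policy is known. The function names and the policy dictionary are assumptions; real browsers also track expiry and subdomain matching per host.

```python
def parse_hsts(value):
    """Parse a Strict-Transport-Security header value into a policy dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.lower()
        if name == "max-age":
            policy["max_age"] = int(arg.strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def upgrade_url(url, policy):
    """Rewrite http:// to https:// when an active HSTS policy exists."""
    if policy["max_age"] and url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return url


policy = parse_hsts("max-age=31536000; includeSubDomains")
```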

Forged Certificate Attack

HSTS only addresses SSL stripping; a connection can still be eavesdropped on even when HTTPS is used throughout. The first step of such an attack is to attack the DNS server. The second is to get the attacker's own certificate trusted by the user. Users have little control over this step; it depends on certificate authorities refraining from issuing certificates indiscriminately.

HPKP

HPKP was created to counter forged-certificate attacks. HPKP (Public Key Pinning Extension for HTTP) goes one step further than HSTS: the server puts the fingerprint of its public key directly in a response header, and when the browser detects a difference between a pinned fingerprint and the public key it actually receives, it can assume an attack is under way. Like HSTS, HPKP relies on a header returned by the server and so does not solve the first-visit problem; browsers therefore also ship with some built-in HPKP lists.
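The fingerprint comparison can be sketched as follows. An HPKP pin is the Base64 encoding of the SHA-256 hash of the certificate's SubjectPublicKeyInfo; here the key bytes are placeholder values rather than a real DER structure, and the function names are assumptions.

```python
import base64
import hashlib


def pin_sha256(spki_der):
    """Compute a pin: Base64 of the SHA-256 digest of the public key bytes."""
    digest = hashlib.sha256(spki_der).digest()
    return base64.b64encode(digest).decode("ascii")


def pin_matches(spki_der, pinned):
    """Check the key actually presented against the set of pinned fingerprints."""
    return pin_sha256(spki_der) in pinned


# Placeholder bytes standing in for a server's SubjectPublicKeyInfo.
genuine_key = b"placeholder-genuine-public-key"
pinned_fingerprints = {pin_sha256(genuine_key)}
```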

Part.3 – TCP protocol

1. TCP features

  • TCP provides a connection-oriented, reliable byte stream service.
  • In a TCP connection, only two parties communicate with each other. Broadcast and multicast cannot be used for TCP.
  • TCP uses verification, confirmation, and retransmission mechanisms to ensure reliable transmission.
  • TCP numbers data segments in sequence and uses cumulative acknowledgement to ensure that data arrives in order and without duplication.
  • TCP uses a sliding window for flow control and dynamically adjusts the window size for congestion control.

Note: TCP does not guarantee that the data will be received by the other party, because it is impossible. What TCP does is deliver data to the other party if possible, and notify the user otherwise (by aborting retransmission and breaking the connection). So TCP is not exactly a 100% reliable protocol. What it does provide is reliable delivery of data or reliable notification of failures.

2. Three-way handshake and four-way wave

Three-way handshake

The three-way handshake means that the client and server exchange three packets to establish a TCP connection. Its purpose is to connect to the specified port on the server, establish the TCP connection, synchronize the sequence numbers and acknowledgement numbers of both parties, and exchange TCP window size information. In socket programming, the client triggers the three-way handshake by calling connect().

  • First handshake: (SYN = 1, seq = x)

    • The client sends a TCP packet with the SYN flag set to 1, indicating the server port it wants to connect to. Its initial sequence number x is stored in the Sequence Number field of the packet header.
    • After sending, the client enters the SYN_SENT state.
  • Second handshake: (SYN = 1, ACK = 1, seq = y, ACKnum = x + 1)

    • The server replies with an acknowledgement in which both the SYN and ACK flags are 1. The server chooses its own initial sequence number (ISN) y and puts it in the seq field, and sets the Acknowledgement Number to the client's ISN plus 1, that is, x + 1.
    • After sending, the server enters the SYN_RCVD state.
  • Third handshake: (ACK = 1, ACKnum = y + 1)

    • The client sends another acknowledgement packet with the SYN flag set to 0 and the ACK flag set to 1, putting the server's ISN plus 1 (y + 1) in the Acknowledgement Number field.
    • After sending, the client enters the ESTABLISHED state. The server also enters ESTABLISHED when it receives this packet, and the TCP handshake ends.
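The sequence and acknowledgement numbers in the three steps above can be traced with a small simulation. This is only bookkeeping in plain Python, not real networking; the function name and the tuple layout are assumptions.

```python
def three_way_handshake(x, y):
    """Return the three segments as (sender, flags_and_numbers, sender_state_after)."""
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = {"SYN": 1, "ACK": 0, "seq": x}
    syn_ack = {"SYN": 1, "ACK": 1, "seq": y, "ack": (x + 1) % wrap}
    ack = {"SYN": 0, "ACK": 1, "seq": (x + 1) % wrap, "ack": (y + 1) % wrap}
    return [
        ("client", syn, "SYN_SENT"),
        ("server", syn_ack, "SYN_RCVD"),
        ("client", ack, "ESTABLISHED"),
    ]


for sender, segment, state in three_way_handshake(x=1000, y=5000):
    print(f"{sender:6} sends {segment} -> {state}")
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that their starting numbers were received.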

[Figure: schematic diagram of the three-way handshake]

Four-way wave

Tearing down a TCP connection requires four packets, so it is called the four-way wave, also known as the modified three-way handshake. Either the client or the server can initiate it. In socket programming, calling close() on either end starts the wave.

  • First wave: (FIN = 1, seq = x)

    • Suppose the client wants to close the connection. It sends a packet with the FIN flag set to 1, indicating that it has no more data to send but can still receive data.
    • After sending, the client enters the FIN_WAIT_1 state.
  • Second wave: (ACK = 1, ACKnum = x + 1)

    • The server acknowledges the client's FIN packet, confirming that it has received the client's request to close the connection but is not yet ready to close its own side.
    • After sending, the server enters the CLOSE_WAIT state. On receiving this acknowledgement, the client enters the FIN_WAIT_2 state and waits for the server to close the connection.
  • Third wave: (FIN = 1, seq = y)

    • When the server is ready to close the connection, it sends the client a connection-termination request with FIN set to 1.
    • After sending, the server enters the LAST_ACK state and waits for the final ACK from the client.
  • Fourth wave: (ACK = 1, ACKnum = y + 1)

    • The client receives the server's close request, sends an acknowledgement packet, and enters the TIME_WAIT state, waiting in case the ACK has to be retransmitted.
    • On receiving the acknowledgement, the server closes the connection and enters the CLOSED state.
    • After waiting a fixed time (two maximum segment lifetimes, 2MSL) without receiving anything further from the server, the client concludes that the server has closed the connection normally, closes its own side, and enters the CLOSED state.
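The state changes during the four waves can be written down as two transition tables, one for the side that closes first and one for its peer. The event names and the `run` helper are assumptions for illustration; simultaneous close and other corner cases are left out.

```python
# Transitions for the side that calls close() first (the client in the text).
ACTIVE_CLOSER = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # replies with the final ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

# Transitions for the peer (the server in the text).
PASSIVE_CLOSER = {
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # replies with ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(table, events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = table[(state, event)]  # KeyError means an unexpected event
        visited.append(state)
    return visited


client_states = run(ACTIVE_CLOSER, ["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
server_states = run(PASSIVE_CLOSER, ["recv FIN", "send FIN", "recv ACK"])
```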

[Figure: schematic diagram of the four-way wave]

3. SYN attack

What is a SYN attack?

After the server sends SYN-ACK in the second handshake and before it receives the client's ACK, the TCP connection is called a half-open connection, and the server is in the SYN_RCVD state. When the ACK arrives, the server moves to ESTABLISHED. In a SYN attack, the attacking client forges a large number of non-existent IP addresses in a short time and keeps sending SYN packets to the server. The server replies with acknowledgement packets and waits for the client's confirmation. Because the source addresses do not exist, the server keeps retransmitting until it times out. The forged SYN packets occupy the half-open connection queue for a long time, and legitimate SYN requests are dropped, which slows the target system down and in serious cases causes network congestion or even system failure. A SYN attack is a typical DoS/DDoS attack.

How to detect a SYN attack?

When a server has a large number of half-open connections, especially with random source IP addresses, it can basically be concluded that this is a SYN attack. On Linux/Unix, the netstat command can be used to detect SYN attacks.

How to defend against a SYN attack?
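One way to act on netstat output is to count connections per state. The sketch below parses text in the column layout that `netstat -ant` typically prints on Linux (where the half-open state is shown as SYN_RECV); the sample lines, addresses, and the threshold are made-up assumptions.

```python
from collections import Counter

SAMPLE = """\
tcp        0      0 10.0.0.5:80       203.0.113.7:51234    SYN_RECV
tcp        0      0 10.0.0.5:80       198.51.100.9:40022   SYN_RECV
tcp        0      0 10.0.0.5:22       192.0.2.10:53100     ESTABLISHED
"""


def count_tcp_states(netstat_output):
    """Count TCP connections per state in netstat-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


def looks_like_syn_flood(states, threshold=100):
    """Flag an unusually large number of half-open connections (arbitrary threshold)."""
    return states["SYN_RECV"] >= threshold


states = count_tcp_states(SAMPLE)
```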

SYN attacks cannot be completely blocked unless TCP is redesigned. You can minimize the damage caused by SYN attacks:

  • Shorten the SYN timeout
  • Increase the maximum number of half-open connections
  • Filtering-gateway protection
  • SYN cookie technology

4. TCP KeepAlive

A TCP connection is purely a software-level concept; there is no such thing as a "connection" at the physical layer. If one end fails and the other end cannot detect it, the other end keeps the connection open indefinitely. Over time a large number of half-open connections accumulate, wasting resources on the end system. The TCP KeepAlive mechanism at the transport layer can be used to avoid this.

The basic principle of TCP KeepAlive is as follows: at intervals, TCP sends a probe packet to the peer. If an ACK reply comes back from the peer, the connection is considered alive. If no reply arrives after a certain number of retries, the TCP connection is dropped.

Limitations of TCP KeepAlive: First of all, TCP KeepAlive detects by sending a probe packet, which brings extra traffic to the network. In addition, TCP KeepAlive can only detect whether the connection is alive at the kernel level, which does not necessarily mean that the service is available. For example, when the CPU usage of a server is 100% and the server cannot respond to requests, TCP KeepAlive still considers the connection alive. Therefore, TCP KeepAlive is of relatively little value to application-layer programs.
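Enabling TCP KeepAlive on a socket looks roughly like this in Python. `SO_KEEPALIVE` is widely available; the tuning options `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are platform-specific, so the sketch checks for them first. The function name and the default values are assumptions.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP KeepAlive and, where supported, tune the probe timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below are not defined on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Number of unanswered probes before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
```

Because of the limitations described above, applications that need to know the service is actually responsive often add their own heartbeat messages on top of this.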

Part.4 – UDP protocol

UDP is a simple transport layer protocol. Compared with TCP, UDP has the following features:

  • UDP is unreliable. UDP provides no mechanisms such as sequence numbers, acknowledgements, or timeout and retransmission. UDP datagrams may be duplicated or reordered in the network. That is, UDP does not guarantee that a datagram reaches its final destination, that datagrams arrive in order, or that each datagram arrives only once.
  • UDP datagrams have a length. Every UDP datagram has a length, and if a datagram reaches its destination correctly, its length is passed to the receiver along with the data. TCP, by contrast, is a byte-stream protocol with no record boundaries at all.
  • UDP is connectionless. There is no long-term relationship between a UDP client and server, and no connection has to be created through a handshake before a datagram is sent.
  • UDP supports multicast and broadcast.
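The record-boundary and connectionless properties can be observed on the loopback interface. In this sketch two datagrams are sent without any handshake and each `recvfrom` call returns exactly one of them. The function name is an assumption, and delivery on loopback is reliable in practice but still not guaranteed by UDP.

```python
import socket


def udp_roundtrip(messages):
    """Send datagrams over loopback and return what the receiver reads, one per call."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        for message in messages:
            # No connect() or handshake: each datagram is addressed individually.
            sender.sendto(message, receiver.getsockname())
        # Each recvfrom returns one whole datagram, never a merged stream.
        return [receiver.recvfrom(2048)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()
```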

Part.5 – IP protocol

The IP protocol sits at the network layer, layer 3 in the OSI numbering. In contrast to transport-layer protocols, the network layer is responsible for point-to-point (host-to-host) service, while the transport layer (TCP/UDP) provides end-to-end service.

1. The OSI seven-layer network model

7 Application layer
6 Presentation layer
5 Session layer
4 Transport layer
3 Network layer
2 Data link layer
1 Physical layer

2. IP address classification

  • Class A address
  • Class B address
  • Class C address
  • Class D address
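In classful addressing the class is determined by the leading bits of the address, which is equivalent to checking the range of the first octet. The sketch below classifies an IPv4 address that way; the function name is an assumption, and class E (reserved) is included only for completeness.

```python
import ipaddress


def address_class(address):
    """Return the classful class (A-E) of an IPv4 address from its first octet."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:   # leading bit 0
        return "A"
    if first_octet < 192:   # leading bits 10
        return "B"
    if first_octet < 224:   # leading bits 110
        return "C"
    if first_octet < 240:   # leading bits 1110 (multicast)
        return "D"
    return "E"              # leading bits 1111 (reserved)
```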

3. Broadcast and multicast

Broadcast and multicast apply only to UDP (TCP is connection-oriented).

Broadcast

There are four kinds of broadcast address:

  1. Limited broadcast: the limited broadcast address is 255.255.255.255.
  2. Net-directed broadcast: an address whose host number is all 1 bits.
  3. Subnet-directed broadcast
  4. All-subnets-directed broadcast

Multicast

Multicast uses class D addresses. The 28 bits that follow the class D prefix are used as the multicast group number.
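Both ideas can be checked with Python's standard `ipaddress` module: a directed broadcast address has all host bits set to 1, and the multicast group number is the low 28 bits of a class D address. The function names and the example network are assumptions.

```python
import ipaddress


def directed_broadcast(network):
    """Return the broadcast address of a network: all host bits set to 1."""
    return str(ipaddress.ip_network(network).broadcast_address)


def multicast_group_id(address):
    """Return the 28-bit group number of a class D (multicast) address."""
    addr = ipaddress.IPv4Address(address)
    if not addr.is_multicast:
        raise ValueError(f"{address} is not a multicast address")
    return int(addr) & 0x0FFFFFFF  # drop the 4-bit class D prefix 1110
```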

BGP

Border Gateway Protocol (BGP) is a routing protocol between autonomous systems that runs over TCP.

Part.6 – Socket programming

1. Basic concepts of Sockets

A socket is an encapsulation of the TCP/IP protocol family: an intermediate software abstraction layer through which the application layer communicates with TCP/IP. In design-pattern terms, the socket is a facade. It hides the complex TCP/IP protocol family behind the socket interface, so the user only needs a simple set of interfaces and the socket organizes the data to conform to the specified protocol. A socket can also be seen as a means of communication between processes on different computers. The triple (IP address, protocol, port) uniquely identifies a process on the network, and processes can use this identifier to interact with other processes. Sockets originated in Unix. One of the basic philosophies of Unix/Linux is that everything is a file that can be operated on with an "open -> read/write -> close" pattern, so sockets are treated as a special kind of file.

2. Write a simple WebServer

A simple Server process includes:

  1. Establish a connection: accept a client connection.
  2. Receive the request: read an HTTP request message from the network.
  3. Process the request: access the resource.
  4. Build the response: create an HTTP response message with headers.
  5. Send the response to the client.

Typical program flow and the functions called:

  • socket(): create a socket
  • bind(): assign an address to the socket
  • listen(): wait for connection requests
  • accept(): accept a connection request
  • read()/write(): exchange data
  • close(): close the connection
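The call sequence above maps directly onto Python's `socket` module, where `recv()` and `sendall()` play the roles of read() and write(). The sketch below handles a single request and echoes the request line back; the function names, the loopback address, and the response text are assumptions, and a real server would loop, parse headers, and handle errors.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """socket() -> bind() -> listen(); port 0 lets the OS choose a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    return server


def serve_one(server):
    """accept() one client, read its request, write a response, close()."""
    conn, _addr = server.accept()
    with conn:  # closes the connection when the block ends
        request = conn.recv(4096)
        request_line = request.split(b"\r\n", 1)[0].decode("ascii", "replace")
        body = f"You requested: {request_line}\n".encode("utf-8")
        head = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/plain; charset=utf-8\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n"
        )
        conn.sendall(head.encode("ascii") + body)
```

To try it, call `make_listener(port=8080)`, run `serve_one` on the result, and open http://127.0.0.1:8080/ in a browser.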

More articles

Blog: www.qiuxuewei.com