Introduction to HTTP

This installment covers RFC 7230 sections 2.3 (“Intermediaries”) and 2.4 (“Caches”), which introduce proxies, tunnels, gateways, and caches.

In this series, you will learn about HTTP through my own understanding of it. If you read something differently, please share your view with me; if you spot a mistake, please point it out and I will correct it promptly so that readers are not misled.

Intermediaries

HTTP enables the use of intermediaries to satisfy requests through a chain of connections. There are three common forms of HTTP intermediary: proxy, gateway, and tunnel. A single intermediary may act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

Each participant in the chain may be engaged in multiple, simultaneous communications. In the figure below (reproduced from RFC 7230), the user agent (UA) reaches the origin server (O) through three intermediaries A, B, and C. B may be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A’s request.

          >             >             >             >
     UA =========== A =========== B =========== C =========== O
                <             <             <             <

Likewise, later requests might be sent via a different path of connections, often based on dynamic configuration for load balancing.

Upstream and downstream

The terms upstream and downstream describe directions in relation to message flow: all messages flow from upstream to downstream.

Inbound and outbound

The terms inbound and outbound describe directions in relation to the request route: inbound means toward the origin server; outbound means toward the user agent.

Proxies

A proxy is a message-forwarding agent chosen by the client, usually via local configuration rules, to receive requests for some type(s) of absolute URI and attempt to satisfy those requests via translation through the HTTP interface. Proxies are typically used to group an organization’s HTTP requests through a common intermediary for the sake of security, annotation services, or shared caching. Some proxies are designed to apply transformations to selected messages or payloads while they are being forwarded.
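As a rough sketch of those “local configuration rules,” the function below decides whether a given host should go through a configured proxy or be contacted directly. The bypass-list format is an assumption for illustration, loosely modeled on the common no_proxy convention; real clients use richer rule systems (PAC files, per-scheme settings, and so on).

```python
# Sketch: client-side proxy selection from local configuration rules.
# The rule format (a bypass list of host suffixes) is an invented
# simplification, loosely modeled on the no_proxy convention.

def select_proxy(host, proxy, bypass):
    """Return the proxy to use for `host`, or None for a direct connection."""
    for suffix in bypass:
        if host == suffix or host.endswith("." + suffix):
            return None  # configured to connect directly
    return proxy

# Usage: requests to the internal domain go direct; everything else
# is forwarded through the shared proxy (hostnames are illustrative).
print(select_proxy("intranet.corp.example", "proxy.corp.example:3128",
                   ["corp.example"]))   # None (direct)
print(select_proxy("www.example.org", "proxy.corp.example:3128",
                   ["corp.example"]))   # proxy.corp.example:3128
```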

Gateway (Reverse proxy)

A gateway (a.k.a. reverse proxy) is an intermediary that acts as an origin server for the outbound connection but translates received requests and forwards them inbound to another server or servers. Gateways are often used to encapsulate legacy or untrusted information services, to improve server performance through “accelerator” caching, and to enable partitioning or load balancing of HTTP services across multiple machines.

All HTTP requirements applicable to an origin server also apply to the outbound communication of a gateway. A gateway communicates with inbound servers using any protocol it desires, including private extensions beyond the HTTP specification. However, an HTTP-to-HTTP gateway that wishes to interoperate with third-party HTTP servers should conform to user agent requirements on the gateway’s inbound connections.
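One concrete piece of that translation work is header handling: RFC 7230 says fields listed in the Connection header are hop-by-hop and must not be forwarded. The sketch below shows how a gateway might prepare a request for its inbound server; the dict-based message shape and the backend host are simplifications for illustration.

```python
# Sketch: how an HTTP-to-HTTP gateway might prepare an inbound request.
# Per RFC 7230, fields named in the Connection header (plus Connection
# itself and other hop-by-hop fields) apply only to the current
# connection and must not be forwarded.

HOP_BY_HOP = {"connection", "keep-alive", "proxy-connection",
              "te", "transfer-encoding", "upgrade"}

def prepare_inbound(headers, inbound_host):
    # Collect the field names the client declared as hop-by-hop.
    listed = {h.strip().lower()
              for h in headers.get("Connection", "").split(",") if h.strip()}
    drop = HOP_BY_HOP | listed
    out = {k: v for k, v in headers.items() if k.lower() not in drop}
    out["Host"] = inbound_host  # retarget the request at the inbound server
    return out

fwd = prepare_inbound(
    {"Host": "www.example.org", "Connection": "keep-alive, X-Debug",
     "X-Debug": "1", "Accept": "text/html"},
    "backend.internal:8080")
print(fwd["Host"])                             # backend.internal:8080
print("Connection" in fwd, "X-Debug" in fwd)   # False False
```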

Tunnels

A tunnel acts as a blind relay between two connections without changing the messages. Once active, a tunnel is not considered a party to the HTTP communication, though the tunnel may have been initiated by an HTTP request. A tunnel ceases to exist when both ends of the relayed connection are closed.

Tunnels are used to extend a virtual connection through an intermediary, such as when Transport Layer Security (TLS) is used to establish confidential communication through a shared firewall proxy.
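A minimal sketch of how a client asks a proxy to open such a tunnel: it sends a CONNECT request naming the target authority (the CONNECT method is defined in RFC 7231; the host below is illustrative), and once the proxy answers with a 2xx status, the connection becomes a blind relay over which the client runs its TLS handshake.

```python
# Sketch: the CONNECT request a client sends to a proxy to open a tunnel.
# The target authority is illustrative; real code would send these bytes
# on a socket to the proxy and then start TLS through the relay.

def build_connect(authority):
    """Format a CONNECT request for the given host:port authority."""
    return ("CONNECT {0} HTTP/1.1\r\n"
            "Host: {0}\r\n"
            "\r\n").format(authority).encode("ascii")

req = build_connect("www.example.org:443")
print(req.decode().splitlines()[0])  # CONNECT www.example.org:443 HTTP/1.1
```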

Interception proxy (transparent proxy or captive portal)

The categories above consider only intermediaries that act as participants in the HTTP communication. There are also intermediaries that operate at lower layers of the network protocol stack, filtering or redirecting HTTP traffic without the sender’s knowledge or permission. Such interception proxies are indistinguishable from man-in-the-middle attacks at the protocol level, and they often introduce security flaws or interoperability problems because they mistakenly violate HTTP semantics.

HTTP is defined as a stateless protocol, meaning that each request message can be understood in isolation. Many implementations rely on HTTP’s stateless design in order to reuse proxied connections or dynamically load-balance requests across multiple servers. Hence, a server cannot assume that two requests on the same connection are from the same user agent unless the connection is secured and specific to that agent. Some non-standard HTTP extensions have been known to violate this requirement, resulting in security and interoperability problems.

Caches

A cache is a local store of previous response messages, together with the subsystem that controls the storage, retrieval, and deletion of messages in it. A cache stores cacheable responses in order to reduce the response time and network bandwidth consumption of future equivalent requests. Any client or server may employ a cache, though a server cannot use one while it is acting as a tunnel.

The effect of a cache is that the request/response chain is shortened if one of the participants along the chain has a cached response applicable to that request. The figure below (from RFC 7230) illustrates the resulting chain when B has a cached copy of an earlier response from O (via C) for a request that has not been cached by UA or A.

             >             >
        UA =========== A =========== B - - - - - - C - - - - - - O
                   <             <

Cacheable

A response is “cacheable” if a cache is allowed to store a copy of the response message for use in answering subsequent requests. Even when a response is cacheable, there may be additional constraints placed by the client or by the origin server on when that cached response can be used for a particular request.
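The check below is a deliberately simplified sketch of that rule, assuming only a few of the conditions from the HTTP caching specification (RFC 7234): the full rules also cover Expires, Authorization, private/public directives, request directives, and more.

```python
# Sketch: a simplified cacheability check. Real caches implement the
# complete rules of RFC 7234; this covers only method, no-store, and
# the statuses that RFC 7231 lists as cacheable by default.

CACHEABLE_BY_DEFAULT = {200, 203, 204, 206, 300, 301,
                        404, 405, 410, 414, 501}

def is_cacheable(method, status, headers):
    if method != "GET":                      # simplification: GET only
        return False
    cc = headers.get("Cache-Control", "").lower()
    if "no-store" in cc:                     # storage explicitly forbidden
        return False
    # Cacheable if the status is cacheable by default, or if the origin
    # server gave an explicit freshness lifetime.
    return status in CACHEABLE_BY_DEFAULT or "max-age=" in cc

print(is_cacheable("GET", 200, {}))                             # True
print(is_cacheable("GET", 200, {"Cache-Control": "no-store"}))  # False
print(is_cacheable("POST", 200, {}))                            # False
```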

A wide variety of architectures and configurations of caches are deployed across the World Wide Web and inside large organizations. These include national hierarchies of proxy caches to save transoceanic bandwidth, collaborative systems that broadcast or multicast cache entries, archives of pre-fetched cache entries for use in offline or high-latency environments, and so on.

Reference

HTTP/1.1: RFC 7230