No era ever short-changes those who keep learning.

Hello, I’m Yes.

HTTP is everywhere on the Internet today, quietly powering the online world, and it is the protocol most familiar to us as programmers.

We often say that architecture is evolutionary and that requirements drive iteration and technical progress; the same is true of HTTP.

Have you ever wondered how HTTP came to be, how it started, and how it has evolved to HTTP/3?

What stories lie behind each version?

Today I’d like to take a look at the evolution of HTTP and see how it has grown from a tiny baby into a dominant presence on the Internet.

But before that, let’s take a brief look at the history of ARPANET, the ancestor of the Internet, which is quite interesting in its own right.

The ancestor of the Internet – ARPANET

In the 1950s, communication researchers recognized the need for communication between different computer users and networks, which led to research on distributed networks, queuing theory, and packet switching.

On February 7, 1958, U.S. Secretary of Defense Neil McElroy issued Department of Defense Directive 5105.15 establishing the Advanced Research Projects Agency (ARPA).

Research sponsored by the IPTO, one of ARPA’s core offices, led to the development of ARPANET.

Let’s take a look at the history.

In 1962, the director of ARPA hired J.C.R. Licklider, one of the first people to foresee modern interactive computing and its applications, as the first director of the IPTO.

The IPTO funded research into advanced computing and networking technologies and commissioned 13 research groups to study human-computer interaction and distributed systems. Each group received a budget 30 to 40 times larger than a normal research grant.

With such deep pockets, the researchers must have been very motivated!

In 1963, Licklider funded a research project called Project MAC to explore the possibility of building communities on time-sharing computers.

The project had a lasting impact on IPTO and the wider research community as a prototype for widespread networking.

And Licklider’s vision of a global network greatly influenced his successors at the IPTO.

In 1964, Licklider moved to IBM, and Ivan Sutherland came on board as the IPTO’s second director. Sutherland had created the revolutionary Sketchpad computer-graphics program, and in 1965 he signed an IPTO contract with Lawrence Roberts of MIT’s Lincoln Laboratory to further develop computer networking technology.

Roberts and Thomas Marill then connected the TX-2 computer at MIT to the Q-32 computer in California over a dial-up telephone line, an early wide-area networking experiment.

The third director, Bob Taylor, arrived in 1966. He was heavily influenced by Licklider and, like Licklider, had a background in psychoacoustics.

Taylor’s IPTO office had three different terminals connected to three different research sites, and he realized that this architecture would severely limit the ability to scale access to multiple sites.

So he wanted to connect one terminal to a network of multiple sites, and from his position at the Pentagon, he had the ability to make that happen.

ARPA Director Charles Herzfeld promised Taylor $1 million to build a distributed communications network if the IPTO could get it organized.

Taylor was delighted. Impressed with Roberts’ work, he asked him to come lead the effort, but Roberts declined.

Displeased, Taylor asked Herzfeld to have the director of Lincoln Laboratory pressure Roberts to reconsider, and Roberts eventually relented, joining the IPTO as chief scientist in December 1966.

On June 3, 1968, Roberts described the plan for ARPANET to Taylor. Eighteen days later, on June 21, Taylor approved the plan and ARPANET was established 14 months later.

As ARPANET grew, Taylor handed over management of IPTO to Roberts in September 1969.

Roberts later left ARPA to become Telenet’s CEO, while Licklider returned to the IPTO as director, completing the organization’s life cycle.

So ends the story: Roberts was pressured into accepting the assignment and ended up creating ARPANET, the ancestor of the Internet.

Thanks to Licklider’s vision and funding, ARPA became the birthplace not only of computer networking but also of computer graphics, parallel processing, and computer flight simulation.

History is so coincidental and interesting.

History of the Internet

In 1973, ARPANET went international, with the first connections reaching the UK and Norway, and it gradually became the backbone of network interconnection.

In 1974, Robert Kahn of ARPA and Vinton Cerf of Stanford proposed the TCP/IP protocol suite.

In 1986, the National Science Foundation (NSF) established NSFNET, a backbone network connecting universities, which was an important step in the history of the Internet. NSFNET became the new backbone and ARPANET was retired in 1990.

In 1990, Tim Berners-Lee created all the tools needed to run the World Wide Web: HTTP, HTML, the first web browser, the first web server, and the first web site.

From then on, the Internet developed rapidly, and HTTP began its great journey.

There’s plenty more interesting history, like the First Browser War, which we can cover another time; today we’ll focus on HTTP.

Let’s take a look at the evolution of the major versions of HTTP and see how it came to be what it is today.

The HTTP/0.9 era

In 1989, Tim Berners-Lee published a proposal in which he put forward three now-commonplace concepts.

  • URI, the Uniform Resource Identifier, serving as a unique identifier for resources on the Internet.
  • HTML, the HyperText Markup Language, for describing hypertext documents.
  • HTTP, the HyperText Transfer Protocol, for transferring hypertext.

Berners-Lee then put the ideas into action and built it all, calling the result the World Wide Web.

It was the early days of the Internet; computer processing power and network speeds were very limited, and HTTP could not escape the constraints of its era, so it was designed to be very simple, in a plain-text format.

At the time, Berners-Lee reasoned that documents were stored on the server and the client only needed to GET a document from it. So there was only the “GET” method, no request headers were needed, and once the document was fetched the exchange was over, so the connection was closed after each request and response.

That is why early HTTP was a text protocol, with only “GET” and a connection that closed after every response.
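The entire HTTP/0.9 exchange described above can be sketched in a few lines of Python; the helper name is mine, but the on-the-wire shape (a single GET line, no version, no headers) is exactly what the protocol looked like:

```python
def build_http09_request(path: str) -> bytes:
    """Build an HTTP/0.9 request: one GET line, no version, no headers.
    The server replies with raw HTML and closes the connection."""
    return f"GET {path}\r\n".encode("ascii")

print(build_http09_request("/index.html"))
```

That single line really was the whole request; everything else (response codes, headers, other methods) came later.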

It may seem crude to us now, but at the time it was a big step in the development of the Internet. Nothing is more difficult than to build something from scratch.

At this point HTTP had no version number; the name HTTP/0.9 was assigned retroactively, to distinguish it from later versions.

The HTTP/1.0 era

Demand never stops: as graphics and audio evolved, so did browsers.

In 1995, Apache was released, simplifying the building of HTTP servers. More and more people were using the Internet, which in turn pushed the HTTP protocol forward.

New features were added to meet user needs, and HTTP/1.0 was released in 1996 through a series of drafts.

Dave Raggett, who led the HTTP Working Group from 1995, wanted to extend the protocol with extended operations, extended negotiation, richer meta-information, and security features, making it more efficient by adding new methods and header fields.

The following are the main additions to HTTP/1.0:

  • New methods such as HEAD and POST were added.
  • Response status codes were added.
  • Headers were introduced, i.e. request and response headers.
  • The HTTP version number was added to the request.
  • Content-Type was introduced so that data other than text could be transferred.

The new methods enriched the semantics of operations; HEAD, for example, fetches only the meta-information without transferring the whole body, improving efficiency in some scenarios.

Response status codes let the requester know what happened on the server side and distinguish the cause of an error, rather than being left guessing.

The introduction of headers makes requests and responses more flexible and decouples control data from the business payload.

The addition of a version number is a sign of engineering discipline: it shows the protocol was on the right track, since without version numbers it could not be managed at all.

Content-Type supports the transmission of different media types, enriching both what the protocol can carry and what users get to see.
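The additions above are easiest to see on the wire. Here is a minimal sketch of an HTTP/1.0 request and of parsing a response status line; the header values are illustrative (the `demo-client/0.1` agent string is made up), but the version-in-the-request-line and `HTTP/1.0 <code> <reason>` status-line shapes are real:

```python
def build_http10_request(path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET. Unlike 0.9, the request line now
    carries a version number, and headers follow it."""
    lines = [
        f"GET {path} HTTP/1.0",
        "User-Agent: demo-client/0.1",  # hypothetical client name
        "Accept: text/html",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status_line(line: str):
    """Split a response status line, e.g. 'HTTP/1.0 200 OK',
    into (version, numeric code, reason phrase)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason
```

With status codes, a client can finally tell a missing page (404) from a server fault (500) instead of just receiving bytes or silence.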

But HTTP/1.0 was not a standard at the time; it had no binding force, and not every party adopted it.

The HTTP/1.1 era

HTTP/1.1 was first documented in RFC 2068 in 1997; the First Browser War of 1995 to 1999 had greatly accelerated the development of the Web.

HTTP/1.0 evolved into HTTP/1.1, and RFC 2616, released in 1999, obsoleted the earlier RFC 2068.

As the version number suggests, this was a minor update, mainly addressing HTTP/1.0’s performance problems: every resource requested needed a new TCP connection, and requests could only be made serially.

The following are the main additions to HTTP/1.1:

  • New connection management, namely keep-alive, allowing connections to persist across requests.
  • Support for pipelining, so a second request can be sent without waiting for the previous response.
  • Responses may be chunked: instead of declaring a Content-Length up front, the body is sent in sized chunks with a terminating marker telling the client when it ends, which makes transferring large or dynamically generated content easier.
  • Cache control and management were added.
  • The Host header was added: when multiple sites are deployed on one machine and several domain names resolve to the same IP address, the Host header determines which site the request is for.
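Chunked transfer, from the list above, is worth seeing concretely. A minimal decoder sketch (chunk extensions and trailers are ignored for brevity) shows the format: each chunk is a hex size line, CRLF, that many bytes, CRLF, and a zero-size chunk ends the body:

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode a Transfer-Encoding: chunked body. Each chunk is a hex
    size line + CRLF + data + CRLF; a zero-size chunk terminates."""
    out = bytearray()
    while True:
        size_line, _, rest = body.partition(b"\r\n")
        size = int(size_line.split(b";")[0], 16)  # ignore chunk extensions
        if size == 0:
            break  # terminating zero-size chunk: body is complete
        out += rest[:size]
        body = rest[size + 2:]  # skip the chunk data and its trailing CRLF
    return bytes(out)
```

The zero-size chunk is what lets the client know the body is finished even though no Content-Length was ever sent.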

As you can see, the browser wars pushed the Web forward and exposed the shortcomings of HTTP/1.0. After all, network bandwidth kept improving, and you can’t let the protocol hold back the hardware.

Therefore, HTTP/1.1 focused on performance: persistent connections, pipelining, cache management, and many other features were added.

HTTP/1.1 was revised again in 2014: the specification had grown so large and complex that it was split into six smaller documents, RFC 7230 through RFC 7235.

At this point HTTP/1.1 has become a standard, and standards tend to be established after strong competitors are relatively stable, because standards mean uniformity, and uniformity means less effort to accommodate everything.

Standards are set by the strong; only when you are strong enough can you set a standard, or challenge the old one.

The HTTP/2 era

With the release of HTTP/1.1, the Internet began to explode, and this growth exposed HTTP’s remaining shortcomings, mainly performance, about which HTTP/1.1 was doing nothing.

It’s human inertia, and it’s consistent with how we evolve our products. When you’re strong and comfortable, you don’t want to change anything.

Google couldn’t stand it any longer. You won’t do it? Fine, I’ll do it myself: I have a huge user base, I have Chrome, and I have plenty of services to run on it.

Google launched the SPDY protocol, and with Chrome holding more than 60% of the global browser market, the team developing SPDY publicly announced in July 2012 that it was working toward standardization.

The HTTP world could sit still no longer: the IETF began developing a new version of HTTP based on SPDY, finally releasing HTTP/2 in 2015.

HTTP/2 contains the following additions:

  • It is a binary protocol rather than plain text.
  • One TCP connection can carry multiple concurrent requests (multiplexing), replacing pipelining.
  • HPACK header compression reduces the amount of data transferred.
  • The server may proactively push data.

Moving from text to binary actually simplifies parsing: data can be parsed with less overhead and packed more compactly, reducing network latency and improving overall throughput.

Letting one TCP connection carry multiple requests means true multiplexing; HTTP/1.1 pipelining, by contrast, still blocks, since responses must come back in order.

Multiplexing is fully asynchronous, which reduces the overall round-trip time (RTT), eliminates HTTP head-of-line blocking, and avoids the cost of repeated TCP slow starts.
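Multiplexing works because every HTTP/2 frame carries a stream identifier in its fixed 9-byte header (24-bit length, 8-bit type, 8-bit flags, 31-bit stream ID, per RFC 7540 section 4.1), so frames from different requests can interleave freely on one connection. A sketch of packing and unpacking that header (the helper names are mine):

```python
import struct

def pack_frame(ftype: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Pack an HTTP/2 frame: 24-bit payload length, 8-bit type,
    8-bit flags, then a 31-bit stream identifier."""
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         ftype, flags, stream_id & 0x7FFFFFFF)
    return header + payload

def unpack_frame(data: bytes):
    """Inverse of pack_frame: read the 9-byte header, then the payload."""
    hi, lo, ftype, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    length = (hi << 16) | lo
    return ftype, flags, stream_id & 0x7FFFFFFF, data[9:9 + length]
```

Because the receiver reads the stream ID out of every frame, it always knows which request a frame belongs to, which is exactly what makes interleaving safe.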

HPACK header compression uses a static table, a dynamic table, and Huffman coding. Client and server each maintain an indexed list of headers already seen, so only incremental, compressed header information needs to be sent, and the receiver can reassemble the complete headers from it.
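The core idea of those shared tables can be sketched in a toy form. This is not the real HPACK wire format (no Huffman coding, no eviction, invented class and method names); it only shows how a header sent once as a literal becomes a tiny index on every repeat:

```python
class ToyHeaderTable:
    """Toy sketch of HPACK's indexing idea: both sides keep a table of
    headers already seen, so repeats are sent as a small index instead
    of the full name/value pair."""
    def __init__(self):
        self.table = []   # dynamic table of (name, value) pairs
        self.index = {}   # (name, value) -> position in the table

    def encode(self, headers):
        out = []
        for pair in headers:
            if pair in self.index:
                out.append(("index", self.index[pair]))  # repeat: index only
            else:
                self.index[pair] = len(self.table)
                self.table.append(pair)
                out.append(("literal", pair))            # first time: literal
        return out

    def decode(self, encoded):
        out = []
        for kind, val in encoded:
            if kind == "index":
                out.append(self.table[val])              # look up a repeat
            else:
                self.index[val] = len(self.table)        # mirror the encoder
                self.table.append(val)
                out.append(val)
        return out
```

Since headers like `user-agent` or `cookie` repeat on nearly every request, this incremental scheme saves a lot of bytes in practice.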


Server push reduces the number of requests: for example, when the client requests 1.html, the server can send along the JS and CSS that 1.html needs, saving the client from coming back later to ask for them.

As you can see, the overall evolution of HTTP/2 is geared toward performance optimization, because performance was the pain point; most things evolve this way.

There are a few exceptions, of course, such as happy accidents, or changes made purely for show.

This push came from a user-side revolt: if you won’t upgrade, I’ll do it myself; I have the leverage, so think it over.

The end result was good: Google later abandoned SPDY and embraced the standard, though HTTP/1.1’s historical baggage is so heavy that even now only around half of all websites use HTTP/2.

The HTTP/3 era

HTTP/2 had barely settled in before HTTP/3 was on the way.

It’s Google again, striking out on its own, again driven by pain points, this time HTTP’s dependence on TCP.

TCP is a reliable, ordered transport protocol, with retransmission and sequencing mechanisms. Under HTTP/2, all streams share one TCP connection, so there is head-of-line blocking at the TCP level: a single retransmission stalls multiple requests and responses.

In addition, TCP identifies a connection by its 4-tuple (source IP, source port, destination IP, destination port). On mobile networks, however, IP addresses change frequently, forcing repeated reconnections.

There is also the overhead of stacking the TCP and TLS handshakes, which adds latency.

The problem was TCP, so Google turned to UDP.

UDP, as we know, is connectionless: it guarantees neither ordering nor delivery; you just throw packets. TCP I covered in detail in a previous article; if you’re unfamiliar with it, go take a look.

The short answer is that TCP is too altruistic, or too conservative, and a more radical approach was needed.

So what now? If TCP can’t be changed, swap it out! Move TCP’s reliability and ordering up into the application layer and implement them there. And so Google developed the QUIC protocol.

The QUIC layer implements its own loss retransmission and congestion control. And since for security we all use HTTPS nowadays, multiple handshakes are normally required.

I mentioned the 4-tuple issue above: in the mobile-Internet era, this handshake cost is magnified. So QUIC introduced a Connection ID to identify a connection; after switching networks, the same connection can be reused, and transmission can resume with 0 RTT.
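The Connection ID idea can be shown with a toy sketch (invented class, not real QUIC packet handling): the server looks sessions up by Connection ID rather than by the 4-tuple, so a packet arriving from a brand-new source address still lands on the same session:

```python
class ToyQuicRouter:
    """Toy sketch: TCP identifies a connection by the 4-tuple, so a new
    IP means a brand-new connection. QUIC tags packets with a Connection
    ID, so the same session survives an address change."""
    def __init__(self):
        self.sessions = {}  # connection_id -> session state

    def handle(self, connection_id, src_addr, payload):
        # Look up by Connection ID, ignoring the (possibly new) address.
        session = self.sessions.setdefault(connection_id, {"packets": 0})
        session["addr"] = src_addr   # just record the latest network path
        session["packets"] += 1
        return session
```

When the phone hops from Wi-Fi to 4G, only `src_addr` changes; the session state, including the negotiated keys in real QUIC, carries straight on without a fresh handshake.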

Note that this 0-RTT resumption assumes the client has already talked to the server before; it covers cases like a network switch.

On a first-time connection, multiple handshake round trips are still required.

So the so-called 0 RTT applies only when a connection has already been established before.

Of course, HTTP/2’s HPACK relies on TCP’s reliable, ordered delivery, so QUIC had to develop QPACK instead, which likewise uses a static table, a dynamic table, and Huffman coding.

QPACK enriches HTTP/2’s static table, expanding it from 61 entries to 98.

The dynamic table, mentioned above, stores header entries not found in the static table. If the decoder has not yet received a dynamic-table entry, decoding a header that references it will block.

Therefore, QPACK transmits dynamic-table updates on separate unidirectional streams; only once the corresponding update has arrived at the receiving end can decoding proceed.

So how does QUIC solve TCP’s head-of-line blocking, given that it still has to be ordered and reliable?

TCP does not know which request each segment belongs to, so a single loss blocks everything. QUIC does know: if a packet of request A is lost, only stream A is blocked, and request B sails through unaffected.
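Per-stream ordering can be sketched with a toy reassembly buffer (invented class, greatly simplified from real QUIC): each stream delivers bytes only once they are contiguous from its own front, so a gap on stream A never stalls stream B:

```python
class ToyStreamBuffer:
    """Toy sketch of QUIC's per-stream ordering: a loss stalls only the
    stream it belongs to, because the receiver knows which stream each
    piece of data is for."""
    def __init__(self):
        self.streams = {}  # stream_id -> {byte offset: data chunk}
        self.ready = {}    # stream_id -> contiguous bytes delivered so far

    def receive(self, stream_id, offset, data):
        buf = self.streams.setdefault(stream_id, {})
        buf[offset] = data
        # Deliver whatever is now contiguous from the front of THIS stream.
        out = self.ready.setdefault(stream_id, b"")
        while len(out) in buf:
            out += buf.pop(len(out))
        self.ready[stream_id] = out
        return out
```

In the test below, stream A receives its second chunk before its first (simulating a lost-then-retransmitted packet), so A stalls, while stream B delivers immediately: exactly the head-of-line isolation the paragraph describes.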

As you can see, UDP-based QUIC is very capable and has many users. In 2018, the IETF approved renaming HTTP-over-QUIC to HTTP/3.

You can see how demand drives technological progress. Given the limitations of TCP’s own mechanisms, we had to fall back on UDP. Will TCP become history?

We’ll see.

Finally

Today we’ve walked through the history of HTTP and how it evolved, and we can see that technology comes from demand, and demand drives technology.

At bottom it’s human inertia: only pain drives growth.

And standards are in fact pushed by the giants for their own benefit, but they do reduce integration costs and make unification easier.

Of course there’s much more going on in HTTP, with plenty of details and algorithms. For example, with Connection IDs: when packets arrive on different 4-tuples, how do you guarantee a request is forwarded to the same server as before?

So today I’ve only scratched the surface of the general evolution. I’ll leave the implementation details for you to dig into, or I may write more later when I get the chance.

However, I’m more interested in the historical evolution than in implementation details: knowing the background and constraints of the time helps me understand more deeply why things were designed the way they were in the first place.

And history is interesting, isn’t it?


I’m yes, from a little bit to a billion bits. See you next time.

Shoulders of giants

www.livinginternet.com/i/ii_ipto.h…

jacobianengineering.com/blog/2016/1…

w3techs.com/technologie…

www.verizondigitalmedia.com/blog/how-qu…

www.oreilly.com/content/htt…

www.darpa.mil/about-us/ti…

en.wikipedia.org/wiki/ARPANE…

en.wikipedia.org/wiki/Intern…

In-depth analysis of HTTP/3 protocol, Tao Hui

Perspective of HTTP protocol, Luo Jianfeng