
You have probably come across, or been asked, the classic question: “what happens between entering a URL in the browser and the page being displayed?” The question sounds simple, but the level of detail in the answer reveals a lot about how deep a person's understanding goes.

This article covers the first step after the URL is entered: domain name resolution.

A domain name looks like www.google.com. You can ping a domain name to see the IP address it resolves to.

So why do we have both domain names and IP addresses?

Domain name and IP address coexist

First of all, I want to explain why domain names and IP addresses coexist. There are two main points:

  1. Improve user experience
  2. Improve operational efficiency

To be precise, an IPv4 address is 32 bits long and is usually written in dotted-decimal notation, like 192.168.1.0. Imagine having to type a string of numbers like that to reach a website: the experience would be pretty bad. What's more, we regularly use many different websites.
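As a small illustration of the 32-bit claim, the sketch below uses Python's standard ipaddress module to turn the dotted-decimal address from the text into its underlying integer and back. The variable names are made up for this example.

```python
import ipaddress

# The example address from the text, in dotted-decimal notation.
addr = ipaddress.IPv4Address("192.168.1.0")

# Underneath, an IPv4 address is a single 32-bit unsigned integer.
as_int = int(addr)
as_bits = format(as_int, "032b")

print(as_int)                         # integer form of the address
print(as_bits)                        # the same value as 32 binary digits
print(ipaddress.IPv4Address(as_int))  # converting back gives 192.168.1.0
```

Each of the four decimal numbers is simply one byte of that 32-bit value.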

On top of that, if you are promoting your site to other people, you would have to give them a string of numbers that nobody is going to remember.

This is why domain names exist: they are easy for people to remember.

So why do we still need IP addresses? An IPv4 address takes only four bytes, while a domain name is a variable-length string that can take anywhere from a handful of bytes to a couple of hundred, which would greatly increase the burden on the underlying routers.
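To make the size difference concrete, here is a minimal sketch that compares the raw byte length of an IPv4 address with the byte length of a domain name written as text. The address and name are the examples used in this article, not a matching pair.

```python
import ipaddress

ip_bytes = ipaddress.IPv4Address("192.168.1.0").packed  # raw 4-byte form
name_bytes = "www.google.com".encode("ascii")           # text form of the name

print(len(ip_bytes))    # 4
print(len(name_bytes))  # 14
```

The address always has a fixed size, while the name grows with every extra character, which is awkward for hardware that has to process every packet.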

That is why IP addresses are still in use. People use domain names and the routing layer uses IP addresses, just as we write text that we can read while computers ultimately work in binary.

DNS

With that background, we can look at how a domain name becomes an IP address.

First, we know that a request has to be sent to a DNS server, so the question is: how does the browser know the DNS server's address?

The answer is that it is configured in advance. Manual configuration is not the only way: the DNS server address can also be assigned dynamically through DHCP (Dynamic Host Configuration Protocol).

For example, the DNS configuration in macOS looks like this.

You can also view and modify it from the command line; the configuration file is /etc/resolv.conf.
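As a rough sketch of what that file contains, the function below pulls the nameserver entries out of resolv.conf-style text. The function name and the sample content are hypothetical; the sample stands in for the real file so the example runs anywhere.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments (resolv.conf uses '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


# Hypothetical file content, used here instead of reading the real file.
sample = """\
# example only
search example.lan
nameserver 192.168.1.1
nameserver 8.8.8.8
"""

print(parse_nameservers(sample))  # ['192.168.1.1', '8.8.8.8']
```

To inspect a real machine, the same function could be given the text read from /etc/resolv.conf, assuming the file exists and is readable.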

With a DNS server in hand, you might think the rest is easy:

I send you a domain name and you send back the corresponding IP address. But there are tens of thousands of DNS servers on the Internet. How do I know which server holds the data I need? Do I have to go through tens of thousands of servers one by one?

Typing a domain name into the browser has surely never felt that slow to you, which tells us that resolution does not traverse the servers one by one.

Domain name Composition

To understand how DNS avoids that traversal, we need to know what a domain name is made of. At this point you are probably thinking:

What structure? Isn't it just a string?

In fact, a domain name is made up of several domains: each part separated by a . is a domain.

For example, take the domain name www.google.com. Going by the way we are used to writing a delivery address, you might expect the parts to rank from largest to smallest like this:

www > google > com

But that’s not the case. It’s:

. > com > google > www

You may also notice that the largest part is a single dot. In fact, the fully written domain name is www.google.com. with a trailing dot, and that trailing dot stands for the root domain. Because the root domain is the same for every domain name, we usually omit it.

Each part has its own name:

. > com > google > www

  .       root domain
  com     level 1 (top-level) domain
  google  level 2 domain
  www     subdomain (host name)
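The ordering above can be sketched in a few lines: split the name on dots and read the parts from right to left, starting at the root. The function name is hypothetical.

```python
def labels_from_root(domain):
    """Split a domain name into its parts, ordered from the root downward."""
    # Ignore empty pieces so a trailing root dot does not create an extra label.
    labels = [label for label in domain.split(".") if label]
    return ["."] + labels[::-1]


print(labels_from_root("www.google.com"))   # ['.', 'com', 'google', 'www']
print(labels_from_root("www.google.com."))  # same result with the root dot written out
```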

Of course, a domain can have other subdomains as well, such as mail.google.com.

By now you should have the idea that domain names are hierarchical, but let me give a more down-to-earth example.

www.google.com can be read as the www group in the google department of a company called com. The same reading applies to the host in an address such as mail.google.com/mail/u/0/#i…, where mail is the group.

DNS layering

Once you understand that domain names are layered, the way DNS solves the resolution problem follows naturally: it is layered too.

Domain name data is stored in a distributed manner across all the DNS servers. The data for any one domain is kept together on the same DNS server, while a single DNS server can store the data of multiple domains.

This may sound a little abstract, but a picture is worth a thousand words, and here it is:

With the data layered like this, queries can follow a regular pattern.

Querying Domain Name Data

With the hierarchical mechanism in place, the whole query process looks like this:

The query first goes to the configured DNS server, which is usually a DNS server on the local machine or the intranet. If that server does not have the answer, it asks a root server: “Hey, I need the IP address of www.google.com.”

The root server takes a look: “I don't have that, but I know the address of the DNS server for the com domain. It may know.”

The com domain's DNS server takes a look in turn: “I don't know the IP address of www.google.com, but I know the address of the DNS server for the google.com domain. It may know, go and ask it.”

Asking all the way down like this, the query finally reaches a server that holds the IP address of www.google.com.
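The conversation above can be simulated with a toy, in-memory model. This is only a sketch of the referral idea, not real DNS: the server names, the data layout, and the address (taken from the 203.0.113.0/24 range reserved for documentation) are all made up.

```python
# Each hypothetical server knows either the final answer or whom to ask next.
SERVERS = {
    "root": {"referrals": {"com": "com-server"}, "records": {}},
    "com-server": {"referrals": {"google.com": "google-server"}, "records": {}},
    "google-server": {"referrals": {}, "records": {"www.google.com": "203.0.113.10"}},
}


def resolve(name, start="root"):
    """Follow referrals from the root until some server holds the record."""
    server = start
    path = [server]
    while True:
        data = SERVERS[server]
        if name in data["records"]:
            return data["records"][name], path
        # Look for a referral whose domain is a suffix of the queried name.
        next_server = None
        for domain, target in data["referrals"].items():
            if name == domain or name.endswith("." + domain):
                next_server = target
                break
        if next_server is None:
            raise LookupError(f"{server} cannot help with {name}")
        server = next_server
        path.append(server)


address, path = resolve("www.google.com")
print(address)  # 203.0.113.10
print(path)     # ['root', 'com-server', 'google-server']
```

Each step only needs to know about the level directly below it, which is why no server has to hold the data for the whole Internet.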

Root DNS server

After reading the process above, you may still have a question. The address of the first DNS server comes from the local computer's configuration, but once the query goes down the hierarchy, how does it know which root servers exist? And how does it know their IP addresses?

The answer is that they are built in.

DNS server software that performs these lookups ships with a built-in list of root servers. There are 13 root DNS servers, named [a-m].root-servers.net, whose addresses are known without any query.

Of course, if you think about it, 13 machines could hardly handle requests from Internet users worldwide, and in practice each of those 13 servers has many mirror servers.
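The [a-m] pattern expands to 13 names, which the short sketch below generates. It only produces the names; a real built-in list also carries each server's IP addresses, which are not reproduced here.

```python
import string

# The 13 root server names run from a.root-servers.net to m.root-servers.net.
ROOT_SERVER_NAMES = [
    f"{letter}.root-servers.net" for letter in string.ascii_lowercase[:13]
]

print(len(ROOT_SERVER_NAMES))                        # 13
print(ROOT_SERVER_NAMES[0], ROOT_SERVER_NAMES[-1])   # a.root-servers.net m.root-servers.net
```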

Seeing is believing

So much for the theory. Let's see it in practice with the dig command.

As you can see, the full domain name under QUESTION SECTION is www.google.com. with a trailing dot, so it includes the root domain. But what do IN and A mean?

A query sent to a DNS server carries three parameters:

  1. Domain name (for example, www.google.com).
  2. Class, the network type. Class was designed to allow multiple kinds of network to coexist, but in practice there is only one network, the Internet, so the value of this parameter is practically always IN.
  3. Type (for example, A stands for an IP address and MX stands for the address of a mail server).
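To show how these three parameters travel together, here is a minimal sketch that encodes them the way the question part of a DNS message is laid out in RFC 1035: the name as length-prefixed labels, followed by a 16-bit type and a 16-bit class. The function name is hypothetical, and only the A and MX type codes are included.

```python
import struct

# Numeric codes from the DNS specification (RFC 1035).
QTYPE = {"A": 1, "MX": 15}
QCLASS_IN = 1


def encode_question(domain, record_type="A"):
    """Encode the question part of a DNS query: name, type, class."""
    qname = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        qname += bytes([len(raw)]) + raw  # each label is length-prefixed
    qname += b"\x00"                      # zero-length label marks the root domain
    # Type and class are 16-bit big-endian integers.
    return qname + struct.pack("!HH", QTYPE[record_type], QCLASS_IN)


print(encode_question("www.google.com", "A"))
```

Note that the encoded name always ends with the root domain, even when the trailing dot is omitted in the text form.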

The ANSWER SECTION holds the DNS server's response. The figure above shows a total of 6 DNS records, each ending with the corresponding IP address.

The 69 is the TTL in seconds, which means the result can be reused for 69 seconds without sending another request.

At the bottom are the statistics: the time the DNS query took, and the address and port of the DNS server that was queried. That server address is the one configured on our machine.

The sharp-eyed may have noticed that the request to the root server does not appear at all. The command leaves that part out by default; adding the +trace command-line argument shows the detailed hierarchical query process.

This time, let's take www.36kr.com as an example.

As you can see in the figure above, all the root DNS servers are listed first, then the query goes to the com domain, then to the 36kr.com domain, and finally obtains the IP address of www.36kr.com.

Caching mechanisms

Of course, starting from the root server every time would be unreasonable. Because the mapping between a domain name and its IP address changes infrequently, DNS servers cache the results.

Also, in the figure below:

I drew each DNS server as holding domain information of a single level, but in fact domain information of different levels may live on the same DNS server. For example, the com domain and the google.com domain may be stored on the same machine.

However, the cache has an expiration time. If the DNS data changes during the validity period, the cached data is wrong, and in that case you need to clear the DNS cache manually.
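The expiry and the manual clearing can both be sketched with a tiny cache. This is an illustration under simplifying assumptions, not how any particular DNS server implements caching; the class name, the method names, and the address are hypothetical, and the clock is injectable so that expiry can be shown without waiting.

```python
import time


class DnsCache:
    """Tiny TTL cache; `clock` is injectable so expiry can be demonstrated."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must query again
            return None
        return address

    def flush(self):
        """Manually clear everything, e.g. after a record changed early."""
        self._entries.clear()


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("www.google.com", "203.0.113.10", ttl_seconds=69)
print(cache.get("www.google.com"))  # still cached
now[0] = 70.0
print(cache.get("www.google.com"))  # None, the 69-second TTL has passed
```

Until the TTL runs out, the cache keeps returning the old address even if the real record has changed, which is exactly the situation where flush is needed.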

This post is also published on my GitHub at github.com/sh-blog; stars are welcome. You can also search for [SH full stack notes] on WeChat to follow me.
