Recently, while learning about edge computing, I found that the CDN we so often hear about is actually a part of it. Most of us know only that CDN stands for content delivery network. So how does a CDN actually work, and what benefit does it bring to users browsing a site? This article sets out to answer those two questions.

Concept of CDN

CDN stands for “Content Delivery Network”.

The concept of the CDN was proposed in 1996 by a research group at the Massachusetts Institute of Technology as a way to improve the quality of service of the Internet. So how does it do that?

Analyzing the principle

When we visit a site by its domain name, such as “www.baidu.com”, we are actually sending a request packet (an HTTP request, in this case) over the network to a server. This takes two steps:

  1. First, the domain name is resolved to its IP address (DNS resolution).
  2. Then the HTTP request packet is routed over the network to the server at that IP address.

Strictly speaking, the phrase “the server’s IP address” is not quite accurate: an IP address is bound to a network card, and a server can have several network cards and therefore several IP addresses.

Let’s look at the first step: domain name resolution

Domain name resolution

A domain name can be resolved in two ways:

  1. Resolve a domain name to an IP address
  2. Resolve a domain name to another domain name

After buying a domain name from a domain name provider, we need to map it to an IP address. This relationship can be represented as a map: {domain name: IP}.

We can also set an alias for a domain name, for example pointing www.baidu.com at test.baidu.com. This relationship can also be represented as a map: {domain name: alias}. The alias is called a **CNAME**.

Domain name resolution, then, means finding either the IP address or the CNAME that corresponds to a given domain name.

Domain name resolution is handled by the DNS system, which receives a request, extracts the domain name from it, and then:

  • If the domain name maps to an IP address, returns that IP address.
  • If the domain name maps to a CNAME, looks up the IP address of the CNAME domain and returns it to the requester.

Once the requester has the IP address, it sends the actual request.
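The resolution rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dictionaries stand in for the {domain name: IP} and {domain name: alias} maps, the IP addresses are taken from ranges reserved for documentation, and the `resolve` function and its `max_hops` guard are hypothetical names, not part of any real DNS software.

```python
# Hypothetical record tables; the IP addresses are documentation-only values.
A_RECORDS = {
    "test.baidu.com": "192.0.2.10",
    "direct.example.com": "198.51.100.7",
}
CNAME_RECORDS = {
    "www.baidu.com": "test.baidu.com",  # alias -> target domain name
}


def resolve(domain, max_hops=8):
    """Return the IP address for a domain name, following CNAME aliases."""
    name = domain
    for _ in range(max_hops):
        if name in A_RECORDS:
            # The name maps directly to an IP address.
            return A_RECORDS[name]
        if name in CNAME_RECORDS:
            # The name is an alias: continue with the target name.
            name = CNAME_RECORDS[name]
            continue
        raise KeyError(f"no record for {name}")
    # Guard against alias loops such as a -> b -> a.
    raise RuntimeError(f"too many CNAME hops while resolving {domain}")
```

Calling `resolve("www.baidu.com")` first follows the alias to test.baidu.com and then returns that name's address, while `resolve("direct.example.com")` returns an address in one step.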

The DNS system is in fact very large, and I won’t go into detail here. Think of it as a black box that behaves as described above. The simple diagrams below illustrate the two cases.

[Diagram: domain name resolution without a CNAME]

[Diagram: domain name resolution with a CNAME]

Note in particular that in the CNAME case, the CNAME acts as a middleman (or proxy) in the resolution process. This is the key to how a CDN is implemented.

Principle of CDN

First of all, the purpose of a CDN is to improve the quality of Internet service. Put plainly, it makes access faster.

Suppose the Baidu website has only one server, and someone in Shanghai wants to visit it. If the server is also in Shanghai, access is generally fast; if the server is in Lhasa, access is slower. The root cause is that data travels over physical network links, and the longer the path, the longer the transmission takes.
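To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes signals travel through optical fiber at about 200,000 km/s (roughly two thirds of the speed of light in a vacuum) and uses two made-up path lengths, 30 km for a server in the same city and 3000 km for a cross-country path. It counts propagation delay only, so real round trips will be longer once routing, queuing, and server processing are included.

```python
# Approximate signal speed in optical fiber (an assumption for this sketch).
FIBER_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km, speed_km_per_s=FIBER_SPEED_KM_PER_S):
    """Lower bound on round-trip time, from propagation delay alone."""
    one_way_seconds = distance_km / speed_km_per_s
    return 2 * one_way_seconds * 1000


nearby = round_trip_ms(30)    # assumed: server in the same city
far = round_trip_ms(3000)     # assumed: server on the other side of the country
print(f"{nearby:.1f} ms vs {far:.1f} ms")  # prints "0.3 ms vs 30.0 ms"
```

A page that needs many sequential round trips multiplies this gap, which is why moving content closer to the user pays off.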

So how do we solve this? The idea is simple: Baidu just deploys identical servers all over the country. The technical term for this is redundancy.

The idea is simple, but the implementation is more troublesome. The resources on a server fall into two kinds: static resources and dynamic resources.

  • Static resources: resources that rarely change, such as images, videos, CSS, and JavaScript files.
  • Dynamic resources: resources whose content usually differs by user and by time of access, such as pages generated from FTL or JSP templates.

If Baidu deployed servers all over the country and every server carried the same dynamic resources, each one would probably also need its own database, because the information behind dynamic resources is usually stored in a database. That brings in data synchronization and similar problems, which makes the cost very high. The technical term for this is a cluster. At present, cluster architectures go up to about “three places, five centers”. A nationwide multi-site cluster is not impossible; it is just too expensive.

If you want to know more about “three places, five centers”, see mp.weixin.qq.com/s/uGyGldbwm… which is an article I also wrote.

Is there a cheaper approach, such as deploying only the static resources on each server? Yes. Static resources usually do not involve a database, so replicating them costs little and still improves access speed for users.

That is the purpose of a CDN. So how is it achieved?

If we want to compare CDN systems, we can look at two things:

  1. The performance and network speed of the servers that store static resources in the CDN system.
  2. The number and placement of the CDN system’s server nodes nationwide or even worldwide.

The first point is easy to understand; the second deserves a word. If a CDN has many nodes holding static resources, no user has to “travel a long distance” to reach them, and that is where a CDN system’s advantage lies.

Some companies saw this demand, so there are now many CDN providers; Alibaba, Tencent, and others all offer their own CDN services. Once your system is connected to one of these services and you upload your static resources to it, those resources are automatically distributed to nodes around the world.

That leaves one problem. When a user accesses a static resource, the domain name is resolved to some IP address. The key question is: how can the DNS system resolve the domain name to the IP address closest to the user?

An ordinary DNS system cannot do this. It takes a special DNS server, and that server needs to know two things:

  1. The user’s current location.
  2. Which IP addresses correspond to the domain name the user is accessing, and where each of those IP addresses is located.

The first is easy: extract the user’s IP address from the request and map it to a location and carrier, such as Beijing Telecom or Shanghai Mobile.
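Mapping an IP address to a location can be sketched with Python's standard `ipaddress` module. Everything specific here is an assumption made for illustration: the `PREFIX_TO_REGION` table, the `locate` function, and the pairing of documentation-only address ranges with the two carriers named above. A real CDN would rely on a much larger, regularly updated IP geolocation database.

```python
import ipaddress

# Hypothetical prefix table: documentation-only networks paired with
# made-up region labels, purely to show the lookup.
PREFIX_TO_REGION = [
    (ipaddress.ip_network("192.0.2.0/24"), "Beijing Telecom"),
    (ipaddress.ip_network("198.51.100.0/24"), "Shanghai Mobile"),
]


def locate(ip):
    """Return the region label for an IP address, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, region in PREFIX_TO_REGION:
        if addr in network:
            return region
    return None
```

With this table, `locate("192.0.2.55")` returns "Beijing Telecom" and an address outside both ranges returns `None`. One caveat worth knowing: a DNS server often sees the address of the user's recursive resolver rather than the user's own address, so the location it derives can be approximate.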

Who can answer the second? Since we are talking about a CDN, the CDN provider must know where it has deployed its machines and what their IP addresses are, so only the CDN provider can solve this.

So if every user set the DNS address on their computer to the CDN’s dedicated DNS server, the problem would be solved whenever they accessed static resources by domain name. But on reflection, we cannot ask every user in the world to change their DNS settings. This is where the CNAME in DNS comes in.

Suppose a user accesses static resources through a domain name (Alibaba’s CDN service calls this an “accelerated domain name”) such as “image.baidu.com”, which has a CNAME of “cdn.ali.com”. An ordinary DNS server (as opposed to the CDN’s dedicated DNS server) resolves image.baidu.com to cdn.ali.com. It then finds that this domain name is handled by another DNS server and hands resolution over to it. That server is the CDN’s dedicated DNS server. It resolves cdn.ali.com by choosing, from all the CDN server addresses it has on record, the one nearest to the user, and returns that address. The user can then access the nearest CDN server.
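The whole flow can be sketched as follows. This is a simplified model, not how any particular CDN is implemented: the node list, its approximate city coordinates, the documentation-only IP addresses, and the function names are all hypothetical, and "nearest" is taken to mean shortest great-circle distance. Real CDNs typically also weigh carrier, measured latency, and node load.

```python
import math

# Ordinary DNS data: the accelerated domain name is an alias for the CDN name.
CNAME_RECORDS = {"image.baidu.com": "cdn.ali.com"}

# Hypothetical CDN node list: (latitude, longitude, IP address).
# Coordinates are approximate; IPs are documentation-only values.
CDN_NODES = {
    "cdn.ali.com": [
        (39.9, 116.4, "192.0.2.1"),     # Beijing
        (31.2, 121.5, "198.51.100.1"),  # Shanghai
        (23.1, 113.3, "203.0.113.1"),   # Guangzhou
    ]
}


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(a))


def cdn_dns_resolve(cdn_domain, user_lat, user_lon):
    """CDN-dedicated DNS: return the IP of the node closest to the user."""
    nodes = CDN_NODES[cdn_domain]
    nearest = min(nodes, key=lambda n: distance_km(user_lat, user_lon, n[0], n[1]))
    return nearest[2]


def resolve(domain, user_lat, user_lon):
    """Ordinary DNS: follow the CNAME, then hand over to the CDN's DNS."""
    target = CNAME_RECORDS.get(domain, domain)
    if target not in CDN_NODES:
        raise KeyError(f"{target} is not served by the CDN in this sketch")
    return cdn_dns_resolve(target, user_lat, user_lon)
```

For a user near Hangzhou, `resolve("image.baidu.com", 30.3, 120.2)` returns the Shanghai node's address, while a user near Shenzhen (22.5, 114.1) gets the Guangzhou node. The site owner only had to publish one CNAME; all the location logic lives on the CDN provider's side.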

Supplement:

There are many types of records in domain name resolution. The most common ones are:

  • A record: maps a domain name to an IP address
  • CNAME: maps a domain name to another domain name
  • NS: delegates resolution of a domain or subdomain to another DNS server

Conclusion

From the above we can see that the CDN’s implementation depends on DNS. Networking is not my specialty, so if anything here is inaccurate, please point it out.

 


Frank little yard farmer