Introduction to this Article

  • How to speed up logistics warehouse distribution
  • Static resource file deployment mode
  • CDN technology for static resource acceleration
  • Glossary of terms used in the resolution process
  • Final conclusion

1. How to speed up logistics warehouse distribution

Let's start with an everyday shopping example.

A few years ago, if you placed an order in the JD.com app from my hometown, you had to wait at least several days for the goods to arrive.

That was because JD.com had not built a logistics warehouse in the county seat. An order was usually checked first against the warehouse in a nearby city or the provincial capital (acting as the regional warehouse). If the regional warehouse had no stock, or there was no regional warehouse at all, the goods might be shipped from the warehouse in Beijing (acting as the central warehouse). If the central warehouse was also out of stock, the goods had to be purchased from the manufacturer (the origin).

Things are different now. Whether you live in a first-tier city or in your hometown, goods bought on JD.com (mainly JD's self-operated goods) arrive the next day, and courier delivery is just as efficient in both places.

This is JD.com's strong logistics advantage. By extending its logistics warehouses as close as possible to ordinary consumers, it has greatly improved the shopping experience.

This case shows how goods delivery is accelerated.

Goods come in different kinds, such as ordinary items and bulky items. At first they are all stocked in the central logistics warehouse, which can be regarded as the warehouse with the most complete range of goods.

Once regional logistics warehouses are built, goods can be stocked in the regional warehouses in advance, which further shortens delivery time.

Once a warehouse is built in the county, goods can be stocked there in advance as well. The closer the warehouse is to you, the less effort is needed after you place an order: nothing has to be shipped from the regional or central warehouse, and you can even pick up the goods yourself at the local warehouse.

As shown below:

The county warehouse is the warehouse closest to the user. In other words, multi-level intermediate warehouses are added between the user and the central warehouse so that goods ship from nearby, which speeds up delivery and improves the user experience.

2. Static resource file deployment

On the Internet, when you visit an online mall and open a product details page, you see a large number of pictures and advertising videos. These are static resources. So how do users access them?

Initially, we considered deploying an Nginx cluster with these static resources stored on each machine. Files could be uploaded to one of the machines through a service and then distributed to other Nginx machines using Rsync. This is fine for small static resource files.

However, images and videos can range from a few megabytes to hundreds of megabytes. Placing them on an Nginx cluster is not recommended, and neither is a distributed cache, which should not be used to store large keys. Suppose you do it anyway and deploy the Nginx cluster or distributed cache in a data center in Beijing. When users access these resources, requests have to cross multiple backbone networks, resulting in high network latency. What the user sees is images that fail to load and videos that play slowly.

At that point the user is unlikely to keep waiting, and for an e-commerce site that means losing users.

In general, we can use an Nginx cluster as the origin for small static resource files, and a separate distributed storage system as the origin for audio and video streaming data. The origin (source station) is where your static data is originally stored. To achieve high availability and stability while taking enterprise costs into account, the origin is generally deployed in a BGP multi-line data center.

The diagram of a BGP data center is as follows:

A BGP data center interconnects the lines of different carriers, so that users of every interconnected carrier can reach the website quickly, with the best network link selected according to the user's network. For this reason, bandwidth in a BGP data center costs more.

Bandwidth in a BGP data center costs roughly 80 to 400 yuan per Mbit/s. At 100 yuan per Mbit/s, 1 Gbit/s of bandwidth costs 100,000 yuan. At tens or hundreds of Gbit/s, the cost becomes very substantial.
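The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `bgp_bandwidth_cost` is hypothetical, the default unit price is the article's illustrative 100 yuan per Mbit/s, and 1 Gbit/s is treated as 1,000 Mbit/s as in the text.

```python
MBIT_PER_GBIT = 1000  # the article treats 1 Gbit/s as 1,000 Mbit/s


def bgp_bandwidth_cost(gbit_per_s: float, yuan_per_mbit: float = 100) -> float:
    """Return the bandwidth cost in yuan for the given capacity.

    yuan_per_mbit is an illustrative price; the article quotes 80 to 400.
    """
    return gbit_per_s * MBIT_PER_GBIT * yuan_per_mbit


# Show how quickly the cost grows with capacity.
for capacity in (1, 10, 100):
    print(f"{capacity:>3} Gbit/s -> {bgp_bandwidth_cost(capacity):,.0f} yuan")
```

Passing a different `yuan_per_mbit` shows the spread between the low and high end of the quoted price range.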

3. CDN technology for static resource acceleration

In the example above, users access static resources directly from the BGP origin, and the bandwidth cost is very high. A website's users are spread all over the country, or even around the world. How can we let them access these static resources faster?

We can borrow the idea from the logistics warehouse example: just as with warehouses, the closer the static resources are to the user, the faster the access. This is how CDN technology came into being.

What is CDN technology?

CDN stands for Content Delivery Network (also called Content Distribution Network). A CDN adds a cache layer to the network and distributes the origin's resources to the network "edge" nodes nearest to users, so that users fetch content from a nearby node. This improves the website's response speed, avoids network congestion, and keeps resource access fast and smooth.

After CDN nodes are added, the architecture is as shown in the figure below:

Distribution architecture of CDN:

To draw an analogy with the logistics warehouses: the central warehouse corresponds to the CDN's central node, the regional warehouse corresponds to a CDN regional node, and the county warehouse corresponds to a CDN edge node.

Schematic diagram of CDN distribution architecture:

CDN technology is now very widely used. Companies with sufficient resources build their own CDN and maintain an in-house CDN research and development team to provide more stable and reliable service. Most companies, however, still choose a professional CDN vendor. If your service is deployed in the cloud, you can use the CDN services provided by Alibaba Cloud or Tencent Cloud. You can also choose established CDN vendors such as Wangsu and ChinaCache (Lanxun).

CDN working principle:

So how does the user access the nearest CDN node?

Let’s use a picture to make it more intuitive:

The diagram above answers two questions:

How an accessed domain name is mapped to a CDN address

How to find the CDN node nearest to the user

Next, we will explain the process in detail according to the above two questions and the diagram.

1. How to map an access domain name to a CDN address

Suppose you access static.example.com through your browser, and assume it is a static-resource domain name for which CDN acceleration has been configured.

1) The local DNS resolver first checks the /etc/hosts file on the local host for an IP address mapped to the domain name. If one is found, the request is sent directly to that IP address. Otherwise, go to step 2).

2) The query goes to the local DNS server. If a mapping for the domain name is found in the local DNS server's cache, that IP address is used directly. Otherwise, go to step 3).

3) The local DNS server sends a request to the root DNS server. The root DNS server returns the address of the top-level domain (TLD) DNS server and tells the local DNS server to query it.

4) The local DNS server sends a request to the TLD DNS server. The .com TLD DNS server returns the address of the authoritative DNS server and tells the local DNS server to query it.

5) The local DNS server sends a request to the authoritative DNS server for example.com. The authoritative DNS server recognizes that it is responsible for this domain name, finds that a CDN-accelerated domain has been configured for it, and returns a CNAME record pointing to static.xxx.example.cdn.com.

At this point, starting from the static domain name static.example.com, we have found the CDN address.

If there is no need to find the node nearest to the user, static.xxx.example.cdn.com can simply be resolved to an IP address and accessed directly.
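The lookup order described in the steps above can be sketched in Python. This is a simplified illustration, not a real resolver: the root, TLD and authoritative referrals of steps 3) to 5) are collapsed into one hypothetical `AUTHORITATIVE` table, the names `HOSTS_FILE`, `LOCAL_DNS_CACHE` and `resolve` are made up for the example, and `203.0.113.10` is a documentation-range address.

```python
# Hypothetical data standing in for the real DNS infrastructure.
HOSTS_FILE = {}        # step 1: entries from /etc/hosts (empty here)
LOCAL_DNS_CACHE = {}   # step 2: cache of the local DNS server
AUTHORITATIVE = {      # steps 3-5 collapsed into a single record table
    "static.example.com": ("CNAME", "static.xxx.example.cdn.com"),
    "static.xxx.example.cdn.com": ("A", "203.0.113.10"),
}


def resolve(name, max_hops=8):
    """Return (chain of names followed, final IP address) for a domain name."""
    chain = [name]
    if name in HOSTS_FILE:                 # step 1: hosts file wins
        return chain, HOSTS_FILE[name]
    if name in LOCAL_DNS_CACHE:            # step 2: cached answer
        return chain, LOCAL_DNS_CACHE[name]
    current = name
    for _ in range(max_hops):              # follow CNAME records to an A record
        record_type, value = AUTHORITATIVE[current]
        if record_type == "A":
            LOCAL_DNS_CACHE[name] = value  # remember the answer for next time
            return chain, value
        current = value
        chain.append(current)
    raise RuntimeError("CNAME chain too long")


chain, ip = resolve("static.example.com")
print(" -> ".join(chain), "=>", ip)
```

The hop limit guards against CNAME loops. A second call for the same name is answered from `LOCAL_DNS_CACHE` without touching the record table.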

2. How to find the CDN node nearest to the user

Continuing with the figure above, let's look at how the CDN node closest to the user is found.

1) The local DNS server sends a request for static.xxx.example.cdn.com to the first-layer GSLB (global server load balancing). The first layer identifies the user's carrier network, for example a mobile carrier, and returns a CNAME to a domain name such as static.yd.example.cdn.com.

2) The local DNS server sends a request to the second-layer GSLB. The second layer returns the address of the CDN's SLB load balancer based on the geographical location of the local DNS server.

3) From the multiple CDN node IP addresses returned, the local DNS server selects one by simple local round-robin.

At this point, the CDN node found through GSLB is the one closest to the user.
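The two GSLB layers and the final round-robin choice can be sketched as follows. Everything here is an assumption made for illustration: the tables `CARRIER_CNAME` and `REGION_NODES`, the region labels, the `static.default.example.cdn.com` name, and the documentation-range IP addresses. A real GSLB derives carrier and location from the address of the local DNS server.

```python
from itertools import cycle

# Layer 1 (hypothetical): carrier network -> carrier-specific CNAME.
CARRIER_CNAME = {
    "mobile": "static.yd.example.cdn.com",
    "other": "static.default.example.cdn.com",  # made-up fallback name
}

# Layer 2 (hypothetical): (CNAME, region) -> IP addresses of nearby CDN nodes.
REGION_NODES = {
    ("static.yd.example.cdn.com", "north"): ["198.51.100.1", "198.51.100.2"],
    ("static.yd.example.cdn.com", "south"): ["198.51.100.11"],
}


def gslb_layer1(carrier):
    """Pick the carrier-specific CNAME based on the user's carrier network."""
    return CARRIER_CNAME.get(carrier, CARRIER_CNAME["other"])


def gslb_layer2(cname, region):
    """Return the candidate CDN node IPs for the region of the local DNS."""
    return REGION_NODES[(cname, region)]


def make_picker(ips):
    """Local DNS side: simple round-robin over the returned IP addresses."""
    rotation = cycle(ips)
    return lambda: next(rotation)


cname = gslb_layer1("mobile")
pick = make_picker(gslb_layer2(cname, "north"))
print(cname, [pick() for _ in range(3)])
```

Splitting the decision into two layers keeps each table small: the first only needs to know carriers, the second only needs to know regions within one carrier.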

What is GSLB?

Global Server Load Balance (GSLB) implements load balancing among servers deployed in different regions. On the one hand, it balances traffic across the servers beneath it. On the other hand, it can find the server closest to the user based on geographical location.

Even after the CDN node nearest to the user is found, the requested resource is not necessarily available on that node. If the node does not have it, the lookup continues at the parent regional node or the central CDN node. If neither has it, the request finally goes back to the origin to fetch the resource, which is then cached on the CDN with an expiration time.
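The fallback from edge to regional to central node and finally to the origin can be sketched as a chain of caches. This is a minimal model under stated assumptions: the class name `Tier`, the 60-second default expiration, and the dict standing in for the origin are all hypothetical, and real CDNs apply far richer caching rules.

```python
import time


class Tier:
    """One cache tier (edge, regional or central) with per-entry expiration."""

    def __init__(self, name, parent=None, ttl=60.0):
        self.name = name
        self.parent = parent   # next tier to ask on a miss; None means origin
        self.ttl = ttl         # assumed cache lifetime in seconds
        self.store = {}        # key -> (value, expires_at)

    def get(self, key, origin, now=None):
        """Return (value, name of the tier that actually had the resource)."""
        now = time.monotonic() if now is None else now
        cached = self.store.get(key)
        if cached is not None and cached[1] > now:
            return cached[0], self.name                  # cache hit
        if self.parent is not None:
            value, served_by = self.parent.get(key, origin, now)
        else:
            value, served_by = origin[key], "origin"     # back to origin
        self.store[key] = (value, now + self.ttl)        # cache with expiry
        return value, served_by


origin = {"/img/a.png": b"image-bytes"}
central = Tier("central")
regional = Tier("regional", parent=central)
edge = Tier("edge", parent=regional)

print(edge.get("/img/a.png", origin)[1])  # first request falls through to the origin
print(edge.get("/img/a.png", origin)[1])  # repeat request is answered by the edge
```

Each tier stores what it fetched on the way back, so a later request from a different edge under the same regional node would stop at the regional tier.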

Small static resource files are generally stored at the origin and pulled by CDN nodes when they are requested.

For large audio and video streaming files, resources can be written to a CDN node in advance through the interface provided by the CDN vendor, and then distributed to other CDN nodes through the CDN's internal mechanism.

However, even when resources are pushed proactively, synchronization is delayed, which can still lead to back-to-origin requests, and back-to-origin bandwidth is very expensive. When using a CDN, therefore, we need to keep an eye on the CDN hit ratio and the back-to-origin bandwidth.
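The relationship between hit ratio and back-to-origin bandwidth can be made concrete with a small calculation. This is a rough sketch: the function name `origin_bandwidth` is hypothetical, the hit ratio is assumed to be measured by bytes rather than by request count, and the 10,000 Mbit/s figure is purely illustrative.

```python
def origin_bandwidth(total_mbps: float, hit_ratio: float) -> float:
    """Bandwidth (Mbit/s) that still reaches the origin for a given hit ratio.

    Assumes hit_ratio is a byte hit ratio between 0 and 1.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_mbps * (1.0 - hit_ratio)


# Illustrative: 10,000 Mbit/s of user traffic at different hit ratios.
for ratio in (0.90, 0.95, 0.99):
    back = origin_bandwidth(10_000, ratio)
    print(f"hit ratio {ratio:.0%}: about {back:,.0f} Mbit/s back to origin")
```

Under these assumptions, raising the hit ratio from 90% to 99% cuts the traffic reaching the origin by roughly a factor of ten, which is why the hit ratio deserves monitoring.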

4. Glossary of terms used in the resolution process

CNAME (Canonical Name):

A CNAME record maps one domain name to another.

Here’s an example:

Suppose you access resources through docs.example.com and also want to reach the same resources through docs-xyz.example.com. You can add a CNAME record at your DNS resolution service that points docs-xyz.example.com to docs.example.com. Once the record is added, all lookups for docs-xyz.example.com resolve to docs.example.com.

CNAME domain name:

When you onboard to a CDN and add an accelerated domain name in the CDN vendor's console, the CDN assigns you a CNAME domain name. You then add a CNAME record at your DNS resolution service provider that points your accelerated domain name to this CNAME domain name. From then on, all requests for the domain name are directed to the CDN's nodes, which achieves the acceleration effect.

DNS (Domain Name System):

The domain name resolution service.

It resolves a domain name to an IP address that the network can route. Servers work with IP addresses, but users are used to remembering domain names, so DNS maintains the mapping between the two. One domain name can map to several IP addresses, as in the CDN case above where multiple node addresses are returned. Resolution is performed by a dedicated resolver. For details, see the DNS resolution process in the preceding figure.

5. Final conclusion

As an engineer, you may think that the CDN technology described above has little to do with you, that it is a matter for operations and maintenance, and so overlook its importance. That view is mistaken. We should not think too narrowly: if you work on live streaming or video-related technology, you will almost certainly come into contact with CDNs.

Have you ever thought about the whole journey of a short video on Douyin or Kuaishou? After a user in city A uploads a video, it is transcoded and distributed, and users in city B can watch it shortly afterwards with smooth playback. This, too, is thanks to CDN distribution technology.

This article used the logistics warehouse example as an analogy for CDN technology, to give an intuitive understanding of the CDN distribution architecture.

It then analyzed how CDN resolution works. The working-principle diagram covers the detailed DNS resolution process and how GSLB finds the node nearest to the user.

A CDN is the front door of large systems and is best at caching static data, images and streaming media. As a special kind of cache, its hit ratio and high availability are also things we need to watch closely.
