Preface

Many people are excited to build their first website. A website project usually has a lot of JavaScript and CSS references on its pages. If you reference the files in your project directly, the references might look something like this:

The advantages of this approach are that it is easy to develop and publish, places few demands on the server, saves money, and does not require public network access.

However, if your website has a lot of pictures or videos and needs to be deployed to the public network, the loading speed will drive you crazy, as in the picture below 👇

At this point, someone will surely recommend that you use a CDN to speed up some of the JavaScript and CSS files on your site, as follows:

In fact, the page in the picture above already uses a CDN. So what exactly is a CDN?

Before explaining what a CDN is, let's look at a very common case from everyday life: online shopping.

Shopping on Taobao versus JD.com self-operated stores

I doubt there is anyone who has not used Taobao or JD.com, so before talking about CDN, let me describe my shopping experience on these two e-commerce platforms:

  • Buying goods from third-party stores on Taobao
  • Buying self-operated goods on JD.com

I once bought a Thunderbolt 3 docking station on Taobao. It shipped from Shenzhen and took three days to arrive in Nanjing. What if the delivery address had been in Henan, or in Xinjiang? I expect it would have taken even longer. Yet a classmate of mine in Henan bought a mobile phone from a JD.com self-operated store in the afternoon and received it the next morning (this is not an advertisement for JD.com). Why is that?

If we look carefully when shopping on JD.com, we find that JD.com picks, from its warehouses across the country, the nearest one that can deliver fastest to our address. For example, if I place an order in Nanjing, it may ship from Shanghai or even directly from Nanjing. If the order is placed in Luoyang, it may ship from Zhengzhou. The advantage is that whether we are in Nanjing or Urumqi, the delivery time is greatly reduced. A CDN is similar to the warehouse system that JD.com has built.

From online shopping to CDN

I am not sure whether the description above is clear enough, so to make it easier to understand I drew the following side-by-side comparison of the two processes:

To deliver goods to buyers more quickly, JD.com set up its warehousing system. By analogy, the CDN was invented so that users can load web pages faster.

CDN stands for Content Delivery Network. The basic idea is to avoid, as far as possible, the bottlenecks and links on the Internet that may hurt the speed and stability of data transmission, so that content is delivered faster and more reliably. A CDN places node servers throughout the network, forming a layer of intelligent virtual network on top of the existing Internet. Based on real-time information such as network traffic, each node's connections and load, the distance to the user, and response time, the CDN system redirects the user's request to the service node closest to that user (as shown in the figure below). The purpose is to let users fetch the content they need from somewhere nearby, relieve Internet congestion, and improve the response speed users see when visiting a website.
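To make the routing idea a little more concrete, here is a minimal Python sketch of how a scheduler might pick a node. The `Node` fields, the weights, and the scoring formula are all assumptions I made up for illustration; real CDNs use their own, much richer signals.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    distance_km: float   # rough distance to the user (hypothetical metric)
    load: float          # current load, from 0.0 (idle) to 1.0 (full)
    response_ms: float   # recently measured response time


def score(node: Node) -> float:
    """Combine the signals into one number; lower is better.

    The weights are illustrative only, not tuned values.
    """
    return 0.01 * node.distance_km + 100.0 * node.load + 1.0 * node.response_ms


def pick_node(nodes: list[Node]) -> Node:
    """Return the node with the lowest combined score."""
    return min(nodes, key=score)


nodes = [
    Node("nanjing", distance_km=10, load=0.95, response_ms=80),
    Node("shanghai", distance_km=300, load=0.30, response_ms=25),
    Node("urumqi", distance_km=3000, load=0.10, response_ms=90),
]
best = pick_node(nodes)
print(best.name)
```

In this made-up data the geographically closest node is heavily loaded, so a slightly more distant node wins. That is the point of combining several signals instead of looking at distance alone.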

Understanding CDN terms through live streaming

From the description above we know what a CDN does and roughly how it works, but not the details. Those details usually come with their own terms, such as load balancing and origin server. Once again, let's use a familiar example, live streaming, to explain these CDN terms.

As we know, a video is really a sequence of frames, so the way we receive video during a live stream can be roughly pictured as follows 👇

But is that really how it works? Of course not! A streamer rarely has just one viewer, so it should look like this 👇

In this approach the streamer sends the same data to many viewers at the same time, which is of course very wasteful. The same data is transmitted many times, and the bottleneck on the streamer's side is obvious. For example, with 1000 viewers watching at once, the streamer's machine cannot sustain that much data transmission.

So it is natural to add a very powerful server that acts as a middleman between the streamer and the viewers and sends the data on to each viewer, as follows 👇

The server here has two main functions: 1. receiving data from the streamer (stream pushing); 2. sending the received data on to the viewers (distribution).

Of course, if this server is powerful enough, then besides receiving the pushed stream and distributing it, it can also provide features such as beauty filters, special effects, and detection of pornographic content. At that point the server takes on another role: a streaming media processing center.

However, the capacity of a single server is also limited. Suppose one server can support at most 1000 users at the same time. What if there are far more than 1000 users? The answer is to add another layer of servers (a cluster), as shown in the following picture 👇

In the figure above, server 0 receives the streamer's video data and passes it on to servers 1, 2, 3…, which then distribute it to the users. Since users may request the data again later, a copy is simply stored on each of servers 1, 2, 3….
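The arithmetic behind adding a layer of servers can be sketched in a few lines of Python. The viewer count and the 2 Mbps bitrate below are made-up numbers, and the function names are my own; only the 1000-users-per-server limit comes from the example above.

```python
import math


def edge_servers_needed(viewers: int, capacity_per_server: int) -> int:
    """How many distribution servers are needed to cover every viewer."""
    return math.ceil(viewers / capacity_per_server)


def uplink_mbps(connections: int, bitrate_mbps: float) -> float:
    """Outgoing bandwidth needed to send one stream to each connection."""
    return connections * bitrate_mbps


viewers = 2500          # hypothetical audience size
capacity = 1000         # per-server limit assumed in the text
bitrate = 2.0           # hypothetical stream bitrate in Mbps

servers = edge_servers_needed(viewers, capacity)
direct = uplink_mbps(viewers, bitrate)       # server 0 serves every viewer itself
with_layer = uplink_mbps(servers, bitrate)   # server 0 only feeds servers 1, 2, 3…
print(servers, direct, with_layer)
```

With the extra layer, server 0 sends one stream per downstream server instead of one per viewer, which is why its outgoing traffic drops so sharply.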

Related terms

Let’s understand some concepts from the above description.

Load balancing

When the audience is small, say 1000 people in total, should one server serve all 1000, or should two or three servers share them? There may also be a mix of new and old machines, and an old machine might only handle 800 viewers, so how should the viewers be divided? Questions like these call for a strategy for allocating resources, and that strategy is called load balancing. Load balancing is usually implemented with redirection or a reverse proxy. Common load balancing algorithms include round robin, random selection, and least connections (for reasons of space, I will not explain them here).
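For readers who want a feel for these algorithms, here is a minimal Python sketch of round robin, random selection, and least connections. The class and method names are my own, not any particular product's API, and a real load balancer would also handle health checks and server weights.

```python
import itertools
import random


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class RandomChoice:
    """Pick any server with equal probability."""

    def __init__(self, servers, seed=None):
        self._servers = list(servers)
        self._rng = random.Random(seed)

    def pick(self):
        return self._rng.choice(self._servers)


class LeastConnections:
    """Pick the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1   # the new connection is now counted
        return server

    def release(self, server):
        self.active[server] -= 1   # call when a connection closes


rr = RoundRobin(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])

lc = LeastConnections(["s1", "s2"])
print(lc.pick(), lc.pick())
```

To handle the old machine that can only take 800 viewers, one possible extension is to weight each server by its capacity, for example by listing a stronger server more than once in the rotation.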

CDN cache

Since users may request the same data again later, a copy is simply kept on servers 1, 2, 3… (the simplest example is several users requesting the same image at different times). This is called CDN caching.

Back-to-origin, origin server, edge node

When the first viewer assigned to server 1 arrives, server 1 has no data stored yet, so it fetches the data from server 0. This process is called going back to the origin (back-to-origin). Accordingly, server 0 is called the origin server, and the content distribution nodes, servers 1, 2, 3…, are called edge nodes.
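The flow described above, where an edge node checks its own copy first and only goes back to server 0 when it has none, can be sketched in Python as follows. The class names, the counters, and the sample resource are hypothetical, and the sketch ignores cache expiry and eviction.

```python
class OriginServer:
    """Plays the role of server 0: holds the authoritative copy of each resource."""

    def __init__(self, resources):
        self._resources = dict(resources)
        self.requests = 0   # how many times edge nodes came back to the origin

    def fetch(self, key):
        self.requests += 1
        return self._resources[key]


class EdgeNode:
    """Plays the role of servers 1, 2, 3…: serves users and caches what it fetched."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Nothing stored locally yet: go back to the origin, then keep a copy.
        self.misses += 1
        data = self._origin.fetch(key)
        self._cache[key] = data
        return data


origin = OriginServer({"/logo.png": b"image bytes"})
edge = EdgeNode(origin)
edge.get("/logo.png")   # first request: fetched from the origin
edge.get("/logo.png")   # second request: served from the edge cache
print(origin.requests, edge.hits, edge.misses)
```

However many viewers later ask this edge node for the same resource, the origin is only contacted once, which is what takes the pressure off server 0.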

Cache hit/cache hit ratio

If the data a viewer requests is served from the CDN cache, that is called a cache hit. The proportion of all user requests that are cache hits is called the cache hit ratio, and it is a key indicator of the quality of a CDN.
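As a small worked example, the ratio can be computed by replaying a list of requests against an empty cache. This Python sketch assumes a cache of unlimited size whose entries never expire, and the file names are made up.

```python
def cache_hit_ratio(requested_keys):
    """Replay requests against an initially empty cache and return hits / total."""
    cached = set()
    hits = 0
    for key in requested_keys:
        if key in cached:
            hits += 1          # already stored on the edge node: cache hit
        else:
            cached.add(key)    # first request for this key: miss, then stored
    total = len(requested_keys)
    return hits / total if total else 0.0


requests_log = ["a.js", "a.js", "b.css", "a.js"]
print(cache_hit_ratio(requests_log))
```

Here two of the four requests find the file already cached, so the ratio is 0.5. The more often the same files are requested, the higher the ratio climbs.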

Proximity principle

Which server will a new viewer be assigned to? In theory, the shorter the network path between the server and the user, the more stable the data transmission. This is called the proximity principle.

CDN? Object storage?

From the introduction above, we know that the main purpose of a CDN is to speed up access, and that it mainly serves live streaming, video on demand, static web page files, small files, and so on. At this point some readers may ask: to speed up access to small files, I also use vendors' object storage services, such as Alibaba Cloud's OSS or Baidu's BOS. What is the difference between object storage and a CDN?

It is true that both can help speed up user access, but their focus is completely different. A CDN focuses on distribution, while object storage focuses on storage. Put simply, object storage is like a cloud drive, and a CDN is like a highway.

Take pictures as an example: object storage is where the pictures are stored, and the CDN speeds up downloading them. So in many cases the two are used together, and this combination has become an essential part of Internet applications.

Benefits of using CDN

That said, if the only goal is to make a website faster, there are other ways to do it, so why use a CDN? In other words, apart from speed, what is a CDN good for?

  1. Good for search ranking. Search engines such as Google already use site speed as one of the signals for ranking results.
  2. Websites are less prone to downtime. This is the same idea as putting your eggs in many baskets. Once traffic is spread across multiple servers, the pressure on the origin server is much smaller.
  3. Lower hosting costs. Most servers have limited bandwidth. Serving different files from different servers reduces bandwidth costs.

How to use CDN

How to use a CDN is a hard question to answer. Building your own CDN service is very difficult, but if you just want to use one, many companies offer CDN services. Vendors differ in their standards and features, so the right choice varies from person to person, and the details are in each vendor's documentation. When I need a library in an HTML page, I usually go straight to BootCDN and copy and paste the reference for the library I need.

One More Thing

There is a saying: “don't waste time on software optimization if you can throw hardware at the problem”. Still, from the explanation above we know that the capability of a CDN is closely tied to its edge nodes. Assuming the hardware investment has already reached its limit, what else can speed up the whole access process?

Compress the transmitted data! If the data can be compressed further in transit without losing any information, each node along the way is under much less pressure, and access naturally gets faster.
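As a rough illustration of what lossless compression buys, this Python sketch compresses a repetitive block of CSS with the standard library's `gzip` module and checks that it decompresses back to exactly the same bytes. The sample text is made up, and the actual saving depends heavily on the content.

```python
import gzip

# A made-up, highly repetitive payload standing in for a static text file.
text = ("body { margin: 0; padding: 0; }\n" * 200).encode("utf-8")

compressed = gzip.compress(text)
restored = gzip.decompress(compressed)

# Lossless: the original bytes come back unchanged.
assert restored == text

print(len(text), len(compressed))
```

Text formats such as JavaScript, CSS, and HTML usually compress well, while formats that are already compressed, such as JPEG images or most video, gain very little.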

One of my favorite TV shows, Silicon Valley, tells the story of Richard, who develops a groundbreaking “universal compression algorithm” and starts his own company. In the show, the algorithm could transform the existing Internet. It is a pity that the algorithm does not exist in reality, but the show is still excellent, and I recommend it to everyone working in IT.

Finally

My knowledge is limited, so if you have any questions or suggestions, please leave them in the comments section. If you are new here, consider following me. Your support is my biggest motivation 💪
