I am a CPU: this world is deathly slow!
We often hear people say that disks and network cards are slow. That judgment is made at the scale of human perception: copying a file to a hard disk takes anywhere from a few minutes to dozens of minutes, long enough for me to have a meal; downloading a movie from the Internet sometimes takes hours, long enough for me to get some sleep.
The most familiar picture of the speed differences between a computer's components is the storage pyramid: the higher the level, the faster, smaller, and more expensive it is.
That diagram only gives a rough sense of relative speed and performance; it does not quantify anything. In fact, the gaps between levels are far larger than the picture suggests. This article looks at the world from the CPU's point of view and shows just how slow everything else is.
Note: all figures come from the Internet. They vary with machine configuration and hardware generation, but that does not affect the intuition.
The numbers
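Let's start with the CPU itself. My computer has a 2.6 GHz CPU. Treating each clock cycle as one instruction, that is 2.6*10^9 instructions per second, so each instruction takes about 0.38ns. We take this as our base unit and call it 1s, because one second is about the smallest unit of time that humans readily perceive. Every figure below is scaled by the same factor to get "human time".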
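The conversion used throughout this article is plain multiplication, and a small sketch makes it easy to check the figures. The names `CPU_HZ`, `to_human_seconds`, and `describe` are made up for illustration, and a month is assumed to be 30 days and a year 365 days.

```python
CPU_HZ = 2.6e9  # clock rate from the article; one cycle is treated as one instruction

# Assumed unit sizes in seconds (30-day month, 365-day year).
UNITS = (
    ("years", 365 * 86400),
    ("months", 30 * 86400),
    ("days", 86400),
    ("hours", 3600),
    ("minutes", 60),
)

def to_human_seconds(real_seconds: float) -> float:
    """Scale a real duration so that one CPU cycle lasts one 'human' second."""
    return real_seconds * CPU_HZ

def describe(real_seconds: float) -> str:
    """Render the scaled duration in the largest unit that fits."""
    scaled = to_human_seconds(real_seconds)
    for name, size in UNITS:
        if scaled >= size:
            return f"{scaled / size:.1f} {name}"
    return f"{scaled:.1f} seconds"

print(describe(0.5e-9))   # level-1 cache load: 1.3 seconds
print(describe(100e-9))   # memory access: 4.3 minutes
print(describe(150e-6))   # SSD random read: 4.5 days
```

Any other latency in the article can be fed through `describe` the same way; small differences from the rounded figures in the text are expected.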
A level-1 cache load takes 0.5ns, about 1.3 seconds in human time, or one or two heartbeats. This is why the cache matters: it can roughly keep up with the CPU, and the locality of programs combined with instruction-level optimizations gives a high cache hit ratio, which greatly improves overall efficiency.
A branch misprediction costs 5ns, about 13 seconds in human time. That is a long time, which is why you see so many articles about optimizing code to reduce branch mispredictions, including a highly rated Stack Overflow question on the topic.
A level-2 cache access takes noticeably longer: about 7ns, or about 18.2s in human time. If the level-1 cache misses and the data has to come from the level-2 cache, the cost goes up by an order of magnitude.
Let's keep going. Locking and unlocking a mutex takes 25ns, about 65 seconds in human time. This is the first time we pass the one-minute mark. In concurrent programming we often hear that locks are expensive, and it shows: imagine waiting a minute at the microwave every time you need something heated.
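To get a feel for lock cost on your own machine, here is a rough sketch that times an uncontended lock/unlock pair. The helper name `lock_pair_ns` is hypothetical. In Python the result is dominated by interpreter overhead, so expect a larger number than the 25ns quoted above, which refers to a native mutex.

```python
import threading
import time

def lock_pair_ns(iterations: int = 1_000_000) -> float:
    """Average time in nanoseconds for one uncontended acquire/release pair."""
    lock = threading.Lock()
    start = time.perf_counter_ns()
    for _ in range(iterations):
        lock.acquire()
        lock.release()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations

if __name__ == "__main__":
    # The figure includes loop and method-call overhead, not just the lock itself.
    print(f"{lock_pair_ns():.0f} ns per lock/unlock pair")
```

The measurement covers only the uncontended case; a lock that other threads are fighting over costs far more.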
Then there is main memory. Each memory access takes 100ns, which is 260 seconds, or more than four minutes in human time. A person reading casually can get through two or three thousand words in that time. That may not sound too bad, but it buys only a single piece of data; the more data you need, the more such reads you pay for. With memory, the cost climbs another order of magnitude, and this speed gap between CPU and memory is known as the von Neumann bottleneck.
A CPU context switch (system call) takes about 1500ns, or 1.5us (the figure comes from other articles and is the average for thread switches on a single core). That is about 65 minutes in human time, or roughly an hour. We all know context switches are expensive, and wasting an hour each time does feel wasteful. What is worse is that the CPU does no useful computation during the switch; it only swaps the register and memory state of two different processes. The switch also disturbs the cache, which makes the computation that follows slower.
Sending 2K bytes over a 1Gbps network takes about 20us, or 14.4 hours in human time. That is enough to watch all six Star Wars movies, with time left for meals and bathroom breaks. Even a tiny network transfer is already very long for the CPU. And this is the theoretical best case; real transfers are slower.
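The transfer time can be checked with back-of-the-envelope arithmetic: bytes times eight, divided by the link's bit rate. The sketch below assumes "2K bytes" means 2048 bytes and ignores protocol headers and propagation delay; `wire_time_us` is a made-up helper name.

```python
def wire_time_us(num_bytes: int, bits_per_second: float) -> float:
    """Time in microseconds to put num_bytes on a link of the given bit rate."""
    return num_bytes * 8 / bits_per_second * 1e6

# 2048 bytes on a 1 Gbps link, with no protocol overhead counted.
print(f"{wire_time_us(2048, 1e9):.1f} us")  # 16.4 us
```

The raw payload alone takes about 16.4us, which is in line with the rounded 20us figure once some overhead is allowed for.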
An SSD random read takes 150us, about 4.5 days in human time. In other words, while the SSD fetches some data, the CPU can take a holiday and sign up for a short trip nearby. SSDs are known to be much faster than mechanical hard drives, but to a CPU they still move at a crawl. Once I/O devices enter the picture, the value of memory becomes obvious. It is common practice for programs to keep their most frequently used data in memory as a cache and to minimize reads and writes to I/O devices. Caching systems such as Memcached and Redis, which have become popular in recent years, address exactly this problem.
Reading 1MB of sequential data from memory takes about 250us, or 7.5 days in human time. The holiday gets upgraded to a seven-day trip abroad.
A round trip within the same data center takes about 0.5ms, about 15 days in human time, or half a month. If your code has to talk to another server in the data center, the CPU could have been running flat out for half a month in the meantime. Reducing network requests between service components is therefore a major part of performance optimization.
Reading 1MB of sequential data from an SSD takes about 1ms, or one month in human time. In other words, if the CPU has to wait for an SSD to read an ordinary file, it wastes a month. And SSDs are the fast option; look at mechanical disks below.
A disk seek takes 10ms, which in human terms is 10 months, just long enough to bring a new life into the world. If the CPU asks the disk to make a cup of coffee, then from the CPU's point of view the disk has gone off, had a baby, and only then comes back to say the coffee is ready. One measure of a mechanical hard drive's performance is its RPM (revolutions per minute): the higher the RPM, the shorter the average access time and the better the disk performs. Seeking only moves the head to the correct track so that the contents of the requested sector can be read. In other words, the seek itself costs time without doing any of the actual work (reading the disk).
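The link between RPM and access time can be made concrete. Once the head is on the right track, the platter still has to rotate the target sector under it, which on average takes half a revolution. The sketch below computes only that rotational part; head movement is not modeled, so the results are smaller than the 10ms figure above. The function name is made up for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector to rotate under the head: half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0

for rpm in (5400, 7200, 15000):
    print(rpm, f"{avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 5.56 ms
# 7200 4.17 ms
# 15000 2.00 ms
```

Even a 15000 RPM drive spends about 2ms per access just waiting for the platter, which is years in the CPU's human time.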
Reading 1MB of sequential data from disk takes 20ms, or 20 months in human time. I/O devices are the bottleneck of a computer system; hopefully that sentence means more to you now. If not, imagine ordering something online and waiting nearly two years for delivery.
The average round trip between two cities around the world takes 150ms (based on ping times between locations worldwide). That is about 12.5 human years. It is understandable that programs and architectures try to avoid network access across cities, let alone across countries. A CDN is one answer to this problem: users talk to the server nearest to them, which shortens the time packets spend on the network.
Restarting a physical server takes five minutes, or about 25,000 years in human terms. Five minutes is already a noticeable wait for a human, never mind the CPU, so don't restart servers casually: you would be ending an entire civilization within minutes.