Level up your fundamentals! Looking at IO through the structure of mechanical hard drives and solid-state drives

The word “disk” is familiar to every programmer. It is a storage medium, used mainly to persist data, and most common middleware depends on it: the MySQL database, the Kafka message engine, and even the Redis cache.

When we optimize business logic, we often add a cache and try to serve hot data from it, because we know that disks are slow. In high-concurrency scenarios especially, we want very few requests to reach disk IO. But have you ever thought about the following questions?

Why is a mechanical hard disk slow?
How slow is a mechanical hard disk?
Kafka also writes disks, but it’s pretty fast. Why?
Why are SSDs faster than regular mechanical disks?
If SSDs are so fast, why not ditch traditional mechanical disks?

With these questions in mind, let’s take a look at disk-related knowledge.

Start with a mechanical hard drive

This is the internal structure of an ordinary mechanical hard disk. It doesn’t have many components, and we only need to focus on the platter, the head arm, and the head.

Start with the platter. It looks like a CD, and it is where our data lives. Its surface is coated with magnetic recording material. Note that data can be stored on both sides of a platter, not just one, and a hard disk is usually made up of multiple platters, so its structure looks like this.

There are 4 platters in the picture, giving 8 surfaces in total, and each surface is divided into concentric circles.

Each orange circle is called a track, and a track is in turn made up of arcs.

The green arc shown here is called a sector. The sector is the smallest storage unit on the disk, and a sector generally stores 512 bytes.

The tracks at the same position on all the surfaces form a virtual cylinder, and this is what the term cylinder refers to.

For a single surface, storage capacity = size of a single sector * number of sectors per track * number of tracks.

A platter has two surfaces, so the capacity of a platter = 2 * capacity of a single surface.

For the whole disk, disk capacity = number of platters * capacity of a single platter.
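
The three formulas above can be written out as a small Python sketch. The geometry used in the example (63 sectors per track, 1024 tracks per surface, 4 platters) is a hypothetical illustration, not the specification of any real drive, and the model assumes every track holds the same number of sectors, as the formulas do.

```python
def surface_capacity(sector_bytes, sectors_per_track, tracks_per_surface):
    """Capacity of one surface: sector size * sectors per track * tracks."""
    return sector_bytes * sectors_per_track * tracks_per_surface


def platter_capacity(surface_bytes):
    """A platter stores data on both of its surfaces."""
    return 2 * surface_bytes


def disk_capacity(platter_bytes, platters):
    """The whole disk is the sum of its identical platters."""
    return platters * platter_bytes


# Hypothetical geometry, chosen only to make the arithmetic easy to follow.
surface = surface_capacity(sector_bytes=512, sectors_per_track=63, tracks_per_surface=1024)
platter = platter_capacity(surface)
disk = disk_capacity(platter, platters=4)

print(f"surface: {surface} bytes")
print(f"platter: {platter} bytes")
print(f"disk:    {disk} bytes")
```

With this made-up geometry the disk comes out at roughly 252 MiB. Real drives pack far more sectors onto each surface, but the multiplication is the same.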

The data on a platter cannot be transferred directly to the bus, so an intermediary is needed. That intermediary is the head, which transfers the data of a given sector to the bus.

Along with the head comes the head arm. The head solves the problem of reading and writing data, but not the problem of which sector to read or write. That is the job of the head arm, which swings within a certain range to find the target sector.

The swing range of the arm is limited, of course. If the arm cannot reach, say, sector B no matter how it swings, the platter has to rotate. You may have heard of disk speeds such as 7200 r/min; this is the speed at which the spindle rotates the platters. Platter rotation plus arm swing together locate the target sector.

How fast a mechanical hard drive is

Now that we understand the physical structure of the mechanical hard drive, let’s see how fast it is. Locating a piece of data requires the rotation of the platter and the swing of the head arm, both of which are physical movements. Once the head is positioned over the target sector, reading and writing the data is very fast. The read/write speed of a mechanical hard disk is therefore dominated by these two movements, which correspond to two technical terms: average latency (rotational latency) and average seek time.

Let’s first talk about the average latency. As mentioned above, when the target sector is not under the head, the platter has to rotate. Take a 7200 RPM disk as an example: it turns 120 times per second, so one revolution takes 1/120 s = 8.33 ms. A track is a circle, and the target sector can be anywhere on it.

For example, to find A we might only need to turn a little, to find B we might need to turn half a circle, and to find C we might need to turn almost a full circle. Taking the arithmetic mean, finding a target sector requires half a turn on average. The time for this half turn is 8.33/2 = 4.17 ms, which is the average latency.

Now let’s look at the average seek time. Rotation alone only brings the target area around; to position the head precisely over the target sector, the head arm also has to swing. This swing time is generally 4-10 ms.

Therefore, one random disk I/O takes roughly 4.17+4=8.17 ms to 4.17+10=14.17 ms. What does this figure mean? Taking a middle value of 10 ms per random I/O, a disk can perform about 100 random I/Os per second. That is a really small number, and it is why we add a cache layer in front of high-QPS interfaces: the disk cannot keep up.

The disk can perform about 100 random I/Os in 1 second. This 100 is the IOPS, the number of input/output operations (reads and writes) per second, and it is a key indicator of disk performance.
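
The arithmetic above can be reproduced with a short Python sketch. It assumes the simple model used in this article (one random I/O = average rotational latency + seek time) and ignores transfer time and controller overhead; the function names are hypothetical.

```python
def rotation_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm


def avg_rotational_latency_ms(rpm):
    """On average the target sector is half a revolution away."""
    return rotation_ms(rpm) / 2


def random_iops(rpm, seek_ms):
    """Random I/Os per second under the latency + seek model."""
    return 1000 / (avg_rotational_latency_ms(rpm) + seek_ms)


latency = avg_rotational_latency_ms(7200)
best_case_ms = latency + 4    # fastest seek time quoted above
worst_case_ms = latency + 10  # slowest seek time quoted above

print(f"average rotational latency: {latency:.2f} ms")
print(f"one random I/O: {best_case_ms:.2f} ms to {worst_case_ms:.2f} ms")
print(f"IOPS range: {random_iops(7200, 10):.0f} to {random_iops(7200, 4):.0f}")
```

Under this model a 7200 RPM disk lands somewhere between about 70 and 120 IOPS, which is consistent with the round figure of 100 used here.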

We can use iostat to view disk indicators on the current machine:

$ iostat
    KB/t  tps  MB/s  us sy id   1m   5m   15m
   23.44    9  0.20  12  8 80  2.40 1.97 1.90

Here tps is the number of transfers per second on the current disk. A large value in this column deserves attention.

Everything above concerns random IO. Sequential IO is very different: its speed can be comparable to random memory access. In short, it is very fast. The famous Kafka uses sequential disk IO, so at least for disk reads and writes its performance is good. Sequential IO is fast because the disk does not have to wait for the platter to rotate to a new position for every request, and the head arm does not have to swing to seek. A lot of physical movement time is saved, so its throughput can differ from random IO by an order of magnitude or more.
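
To get a feel for the gap, here is a rough back-of-the-envelope sketch in Python. It is built on assumptions rather than measurements: each random I/O takes the 10 ms estimated earlier, each request is assumed to be 4 KB, and the sequential rate is taken to be the 200 MB/s that this article quotes for mechanical drives.

```python
def random_throughput_mb_s(io_size_kb, ms_per_io):
    """Throughput when every request pays the full positioning cost."""
    ios_per_second = 1000 / ms_per_io
    return ios_per_second * io_size_kb / 1024


RANDOM_MS_PER_IO = 10     # article's estimate for one random I/O
IO_SIZE_KB = 4            # assumption: small 4 KB requests
SEQUENTIAL_MB_S = 200     # assumption: article's figure for mechanical drives

random_mb_s = random_throughput_mb_s(IO_SIZE_KB, RANDOM_MS_PER_IO)
ratio = SEQUENTIAL_MB_S / random_mb_s

print(f"random:     {random_mb_s:.2f} MB/s")
print(f"sequential: {SEQUENTIAL_MB_S} MB/s")
print(f"sequential is about {ratio:.0f}x faster under these assumptions")
```

The exact ratio depends heavily on the request size, so treat the result as an illustration of the trend, not as a benchmark.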

Faster solid-state drives

First, some numbers. An everyday mechanical drive transfers data at around 200MB/s, while a solid-state drive reaches around 768MB/s, so solid-state drives are much faster than conventional mechanical drives. That figure, however, is for the SATA 3.0 interface. SSDs also support the PCI Express interface, under which read and write speeds can exceed 1GB/s. This is only background knowledge; as programmers, we do not need to pay too much attention to interfaces.

Solid-state drives (SSDs) work nothing like mechanical disks. As you can see from the internal structure above, SSDs have no mechanical parts such as platters or head arms.

So how does an SSD store data? The answer is the capacitor, a very small electronic component. A charged capacitor represents bit 1 and a discharged capacitor represents bit 0. An SSD that stores data this way is said to use SLC flash, short for single-level cell. With only one bit per storage cell, how much data an SSD can hold depends entirely on how many capacitors it contains, so engineers developed more advanced ways of making one capacitor hold two, three, or even four bits.

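So the question is: how does one capacitor represent the extra bits? The answer is voltage. The capacitor is charged to one of several voltage levels, and a voltage sensor tells those levels apart. Take a capacitor that holds two bits, for example: it needs four distinguishable voltage levels, one each for 00, 01, 10, and 11.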

Of course, the more values a cell has to represent, the more distinct voltages must be applied and told apart, so reads and writes become slower.
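
The relationship between bits per cell and voltage levels can be sketched in Python. The cell-type names (SLC, MLC, TLC, QLC) are the usual industry terms for 1 to 4 bits per cell. The mapping from a level index to a bit pattern below is a plain binary encoding chosen for illustration; it is an assumption, and real flash chips may assign patterns to levels differently.

```python
# Bits stored per cell for the common cell types.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(bits_per_cell):
    """A cell holding n bits must distinguish 2**n voltage levels."""
    return 2 ** bits_per_cell


def decode_level(level, bits_per_cell):
    """Map a sensed level index to a bit pattern (illustrative encoding)."""
    if not 0 <= level < voltage_levels(bits_per_cell):
        raise ValueError("level out of range for this cell type")
    return format(level, f"0{bits_per_cell}b")


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {voltage_levels(bits)} voltage levels")

# A two-bit cell sensed at level index 2 reads back as the pattern "10".
print(decode_level(2, CELL_TYPES["MLC"]))
```

The number of levels doubles with every extra bit, which is why each additional bit makes sensing harder and slower.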

Short-lived solid-state drives

Now that we understand how SSDs store bits, let’s look at how they are organized internally and why they don’t last long.

A solid-state drive is made up of many dies (bare chips) stacked on top of each other. A single die has the following structure:

First, the concepts. A die holds multiple planes, and the storage capacity of one plane is usually on the order of GB. A plane is divided into many blocks, and a block usually stores a few hundred KB to a few MB. A block in turn contains many pages, and a page is usually 4 KB. We will focus on the block and the page, because they have a lot to do with the lifespan of solid-state drives.

On a mechanical hard drive, when you write data you don’t care whether the target sector already holds data; you just overwrite it. On a solid-state drive, if the area you are writing to already holds data, it has to be erased before you can write to it.

This erasure is critical because it directly affects the lifespan of a solid-state drive. The more often you erase, the shorter the lifespan, much like rubbing a piece of paper with an eraser.

So how many times can a solid-state drive be erased? In single-bit (SLC) mode, a cell can be erased about 100,000 times; multi-bit cells tolerate fewer erasures, perhaps only a few thousand. So solid-state drives are not recommended if your business data needs to be updated very frequently.

One important point about erasure is that it works a whole block at a time. The smallest unit of data storage is the page, pages belong to blocks, and a block holds many pages. If only some of the pages in a block are marked as deleted, those pages cannot be erased individually, so they cannot be reused until every page in the block is marked as deleted.

As shown in the figure, pages A, B, and C hold data marked as deleted, but because their block still holds other valid data, new data can only be written to the unused pages in the white area, not to the red area. The problem is that over time the red areas multiply, fragmentation grows, and space is wasted.

Waste is shameful, so we need a compaction mechanism. The valid data in such blocks is moved to a new block so that valid data from different blocks is packed more densely. A block whose valid data has been moved out then contains only pages with no data or pages marked as deleted, so the whole block can be erased directly and recycled.
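
The block and page rules described above can be modeled with a minimal Python sketch. It is a toy model under stated assumptions, not how a real SSD controller is implemented: the names `PageState`, `Block`, and `compact` are hypothetical, pages carry only a state and no data, and erasure is allowed only when a block holds no valid pages.

```python
from dataclasses import dataclass
from enum import Enum


class PageState(Enum):
    FREE = "free"        # never written since the last erase
    VALID = "valid"      # holds live data
    DELETED = "deleted"  # holds data marked as deleted


@dataclass
class Block:
    pages: list          # list of PageState, one entry per page
    erase_count: int = 0

    def erase(self):
        """Erase the whole block; individual pages cannot be erased."""
        if any(p is PageState.VALID for p in self.pages):
            raise ValueError("block still holds valid pages")
        self.pages = [PageState.FREE] * len(self.pages)
        self.erase_count += 1  # every erase wears the block a little


def compact(source, target):
    """Move valid pages from source into free pages of target, then erase source."""
    valid = sum(p is PageState.VALID for p in source.pages)
    free_slots = [i for i, p in enumerate(target.pages) if p is PageState.FREE]
    if valid > len(free_slots):
        raise ValueError("not enough free pages in target block")
    for slot in free_slots[:valid]:
        target.pages[slot] = PageState.VALID
    # The old copies are now stale, so the source block can be erased.
    source.pages = [PageState.DELETED if p is PageState.VALID else p
                    for p in source.pages]
    source.erase()
    return valid


source = Block([PageState.VALID, PageState.DELETED, PageState.DELETED, PageState.DELETED])
target = Block([PageState.VALID, PageState.VALID, PageState.FREE, PageState.FREE])
moved = compact(source, target)
print(f"moved {moved} page(s), source erased {source.erase_count} time(s)")
```

In this example, three deleted pages were stranded in the source block by a single valid page. Moving that one page frees the whole block, at the cost of one extra erase cycle.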

Back to the topic

With this knowledge of mechanical hard drives and solid-state drives, let’s return to our initial questions:

Why is a mechanical hard disk slow? Mainly because locating a piece of data requires the rotation of the platter plus the swing of the head arm, and these physical movements are slow.
How slow is a mechanical hard disk? According to the calculation above, a 7200 RPM mechanical disk delivers about 100 IOPS, and each random IO takes about 10 ms.
Kafka also writes to disk, but it’s pretty fast. Why? Because Kafka uses sequential IO, which is fast even on mechanical hard drives since it doesn’t require seeking the way random IO does.
Why are SSDs faster than regular mechanical hard drives? Mainly because SSDs do not need the physical movements that a mechanical hard disk uses to locate data.
If SSDs are so fast, why not ditch traditional mechanical hard drives? Solid-state drives are more expensive, and they don’t last as long as mechanical drives.

Finally

Writing takes effort. Your likes, bookmarks, and shares are the biggest support for the author and the biggest motivation to keep writing. See you next time.

Past highlights:

Memory management: programs load those things
Simple! This is how the CPU runs the code
20 picture! Common distributed theories and solutions
