Overview

What is RAID? RAID stands for Redundant Array of Independent Disks. Simply put, RAID is a disk subsystem that combines multiple independent disk drives to provide higher storage performance and data redundancy than a single disk. It is a multi-disk management technology that gives the host environment high-performance, highly reliable storage at moderate cost.

RAID relies on three key concepts and techniques: mirroring, data striping, and data parity.

Mirroring: Data is replicated to multiple disks, which improves reliability and lets reads be served concurrently from two or more copies, improving read performance. Write performance is slightly lower, because every write must be confirmed on all of the disks.

Data striping: Data is split into slices (strips) that are stored on different disks. The slices together form one complete copy of the data, unlike mirroring, which keeps multiple full copies. Striping is used mainly for performance: it allows finer-grained concurrency, because data on different disks can be read and written at the same time, which yields a significant I/O performance improvement.
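The idea of striping can be illustrated with a minimal Python sketch. It assumes a fixed chunk size and a simple round-robin placement; the function names `stripe` and `unstripe` are hypothetical, and each "disk" is just an in-memory list rather than a real device.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin onto disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        # Chunk k goes to disk k mod N, so consecutive chunks land on different disks.
        disks[chunk_index % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading chunks back in round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, chunk_size=2)
print(disks)
print(unstripe(disks))
```

Because consecutive chunks sit on different disks, a real array can transfer them in parallel; this sketch only shows the placement, not the concurrency.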

Data parity: Redundant data is used to detect and repair data errors. The redundant data is usually computed with Hamming codes, XOR operations, or similar algorithms. Parity can greatly improve the reliability, robustness, and fault tolerance of a disk array. However, it requires reading data from multiple locations and then computing and comparing the results, which costs system performance.

Different RAID levels use one or more of these three techniques to achieve different levels of data reliability, availability, and I/O performance.
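As a small illustration of XOR-based parity, the following sketch computes a parity block over three data blocks and then rebuilds one lost block from the survivors. The block contents and the helper name `xor_blocks` are made up for the example, and all blocks are assumed to have the same length.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data_blocks)

# Simulate losing block 1, then rebuild it from the surviving blocks plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])  # True, because x ^ x == 0 cancels the surviving blocks
```

A single XOR parity block can recover at most one missing block per stripe, which is why this scheme tolerates only one failed disk.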

Choose a RAID level based on a thorough understanding of the system's requirements, trading off reliability, performance, and cost.

Common RAID levels are as follows:

  • Standard RAID: RAID0, RAID1, RAID2, RAID3, RAID4, RAID5, and RAID6
  • Hybrid (nested) RAID: RAID10, RAID50, RAID60…

Let’s take a look at each RAID level and make a simple comparison.

RAID 0

RAID 0 stripes data across N disks. Each write is split into N parts that are written to the disks in parallel, and reads fetch the parts from all disks in parallel, so read and write throughput can approach N times that of a single disk.

Advantages: Disk space utilization is 100%, the highest of any RAID level. Performance is high, and the more disks, the better the performance.

Disadvantages: No data protection. The array is even riskier than a single disk, because the failure of any one disk results in data loss.
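To see why RAID 0 is riskier than a single disk, consider a rough model in which each disk fails independently with the same probability over some period. Both the independence assumption and the 3% figure below are illustrative assumptions, not measured values; the function name `raid0_survival` is hypothetical.

```python
def raid0_survival(p_disk_fail: float, num_disks: int) -> float:
    """Probability that a RAID 0 array loses no data.

    The array survives only if every disk survives, so under the
    independence assumption the per-disk survival probabilities multiply.
    """
    return (1.0 - p_disk_fail) ** num_disks


for n in (1, 2, 4, 8):
    # 0.03 is an illustrative per-disk failure probability, not real data.
    print(n, round(raid0_survival(0.03, n), 4))
```

The survival probability shrinks with every disk added, which is the cost of the extra capacity and throughput.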

RAID 1

The disks in RAID 1 mirror each other. With N disks, every write stores N identical copies of the data, one on each disk, and a read can be served from any disk. With two disks, read performance can roughly double, while write performance is about the same as that of a single disk.

Advantages: Data safety increases with the number of physical disks in the array.

Disadvantages: Low space utilization, the lowest of all RAID levels.

RAID 5

RAID 5 balances space utilization and performance. A RAID 5 array requires at least three disks and uses parity rather than mirroring for redundancy. In the example figure, four disks are used. Each piece of data is divided into three data blocks and one parity block, which are placed on the four disks. Data blocks and parity blocks are interleaved across the disks, so no single disk holds all the parity. To read data A, blocks A1, A2, and A3 are read from Disk 0, Disk 1, and Disk 2 and combined into A. If a disk fails, for example Disk 2, A3 is recomputed from A1, A2, and the parity block, and A is then reassembled. RAID 5 tolerates the failure of one disk.

Advantages: In the four-disk example, reads can reach about three times the throughput of a single disk. The array offers a degree of data safety and tolerates the failure of one disk.

Disadvantages: Write performance is lower, because the parity block must be recalculated every time data is written. Only one failed disk can be tolerated.
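The interleaving of data and parity blocks can be sketched as a small layout generator. This assumes one simple rotation scheme in which the parity block moves one disk to the left on each stripe; real implementations offer several layouts, and the function name `raid5_layout` and the block labels are hypothetical.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return a grid [stripe][disk] of block labels with rotating parity.

    Stripes are labelled A, B, C, ...; the parity block of a stripe is
    marked with a trailing "p". Intended for small illustrative sizes.
    """
    grid = []
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and shifts left.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        letter = chr(ord("A") + stripe)
        row = []
        data_index = 1
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"{letter}p")
            else:
                row.append(f"{letter}{data_index}")
                data_index += 1
        grid.append(row)
    return grid


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(row))
```

For four disks, the first stripe places A1, A2, and A3 on Disks 0 to 2 and the parity block on Disk 3, matching the example in the text. Later stripes move the parity block to a different disk, which spreads the parity write load across the array.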

RAID 6

RAID 6 allows the ratio of data blocks to parity blocks to be chosen flexibly. The example figure shows a combination of three data blocks and two parity blocks, which improves data reliability. With two parity blocks per stripe, the array can tolerate the failure of two disks. RAID 6 is widely used in data backup scenarios and provides higher data reliability than RAID 5.

RAID 10

Disks are first paired into mirrors (RAID 1), and the mirrored pairs are then striped together (RAID 0). This provides both data redundancy and roughly doubled performance. RAID 1+0 is well suited to database workloads.
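The two-step structure of RAID 10 can be shown as an address mapping. This sketch assumes two-way mirrors, a stripe unit of one block, and a disk numbering in which pair k consists of disks 2k and 2k+1; these conventions and the name `raid10_locate` are illustrative only.

```python
def raid10_locate(logical_block: int, num_pairs: int) -> tuple[int, tuple[int, int], int]:
    """Map a logical block to (mirror pair, disks holding it, offset on each disk)."""
    pair = logical_block % num_pairs       # RAID 0 step: choose a mirrored pair
    offset = logical_block // num_pairs    # position of the block inside that pair
    disks = (2 * pair, 2 * pair + 1)       # RAID 1 step: both disks store a copy
    return pair, disks, offset


for block in range(6):
    print(block, raid10_locate(block, num_pairs=2))
```

Consecutive logical blocks alternate between pairs, which gives the striping speedup, while each block still exists on two disks, which gives the redundancy.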

RAID 50

Disks are first grouped into RAID 5 arrays, and those arrays are then striped together as RAID 0, combining the characteristics of RAID 5 and RAID 0.

RAID 60

Disks are first grouped into RAID 6 arrays, and those arrays are then striped together as RAID 0, combining the characteristics of RAID 6 and RAID 0.

Comparison of the RAID levels

RAID level | Redundancy | Space utilization | Read performance | Write performance | Minimum number of disks
RAID0  | No  | 100%   | *** | *** | 2
RAID1  | Yes | 50%    | **  | **  | 2
RAID5  | Yes | 67-94% | *** | *   | 3
RAID6  | Yes | 50-88% | **  | *   | 4
RAID10 | Yes | 50%    | **  | **  | 4
RAID50 | Yes | 67-94% | *** | *   | 6
RAID60 | Yes | 50-88% | **  | *   | 8

(More asterisks indicate higher performance.)
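The space utilization figures can be reproduced with a short calculation. The sketch below assumes equal-size disks, one disk's worth of parity for RAID 5, and two for RAID 6; the function name `space_utilization` is hypothetical. Under these assumptions, the ranges in the table appear consistent with arrays of 3 to 16 disks for RAID 5 and 4 to 16 disks for RAID 6.

```python
def space_utilization(level: str, num_disks: int) -> float:
    """Fraction of raw capacity available for data, assuming equal-size disks."""
    minimum = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4}
    if level not in minimum:
        raise ValueError(f"unsupported level: {level}")
    if num_disks < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} disks")
    if level == "RAID0":
        return 1.0              # no redundancy, all space holds data
    if level == "RAID1":
        return 1 / num_disks    # every disk holds a full copy
    parity_disks = 1 if level == "RAID5" else 2
    return (num_disks - parity_disks) / num_disks


for level, n in [("RAID5", 3), ("RAID5", 16), ("RAID6", 4), ("RAID6", 16)]:
    print(level, n, f"{space_utilization(level, n):.1%}")
```

For RAID 50 and RAID 60, the same formulas apply to each RAID 5 or RAID 6 sub-array, since the outer RAID 0 layer adds no redundancy overhead.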