The RAID (disk array) levels most commonly discussed today are the following:
1. RAID 0 needs at least two disks; data is striped across them and written to the disks in parallel. The benefit is that write throughput can, in theory, improve by up to N times for N disks. The cost is that the failure of any single disk makes the data on all disks meaningless, so the risk goes up rather than down:
Assuming each of the N disks fails independently with probability p, the probability that the system is unavailable is $1-(1-p)^N$.
2. RAID 1 needs at least two disks, which all hold the same data and act as backups for each other. The benefit is higher safety; the cost is that disk utilization drops to 1/N and write speed does not improve.
Assuming each of the N disks fails independently with probability p, what is the probability that the system is unavailable? All N copies have to fail at once, so it is $p^N$.
3. RAID 5 needs at least three disks. One disk's worth of capacity is used to store parity (not a dedicated disk: the parity is scattered across all N disks and adds up to one disk's worth of space). When a fault occurs and only one disk is damaged, its contents can be recovered from the remaining data and the parity. If more than one disk is damaged, the data cannot be recovered; only a single-disk failure is survivable.
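Assuming each of the N disks fails independently with probability p, the probability that the system is unavailable is $1-(1-p)^N-Np(1-p)^{N-1}$, that is, one minus the probability of no failure and minus the probability of exactly one failure.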
4. RAID 10 is actually a combination of RAID 1 and RAID 0. At least four disks are required. The disks are paired up as mirrors of each other, and apart from the mirror copies, different data is written to the pairs in parallel to improve write performance.
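To get a feel for the probabilities above, here is a minimal Python sketch. It assumes every disk fails independently with the same probability p, which real disks do not strictly do. The function names are my own, and the RAID 10 formula is an extra assumption not stated above: the array is made of two-way mirror pairs and is lost as soon as both disks of any one pair fail.

```python
def raid0_unavailable(p: float, n: int) -> float:
    """RAID 0: the array is lost if any of the n disks fails."""
    return 1 - (1 - p) ** n


def raid1_unavailable(p: float, n: int) -> float:
    """RAID 1: the array is lost only if all n mirrored disks fail."""
    return p ** n


def raid5_unavailable(p: float, n: int) -> float:
    """RAID 5: the array survives zero failures or exactly one failure."""
    no_failure = (1 - p) ** n
    one_failure = n * p * (1 - p) ** (n - 1)
    return 1 - no_failure - one_failure


def raid10_unavailable(p: float, pairs: int) -> float:
    """RAID 10 (assumed layout: two-way mirror pairs striped together).

    A pair is lost with probability p**2; the array is lost if any pair is lost.
    """
    return 1 - (1 - p ** 2) ** pairs


if __name__ == "__main__":
    p = 0.05  # hypothetical per-disk failure probability, for illustration only
    print("RAID 0, 2 disks:", raid0_unavailable(p, 2))
    print("RAID 1, 2 disks:", raid1_unavailable(p, 2))
    print("RAID 5, 3 disks:", raid5_unavailable(p, 3))
    print("RAID 10, 4 disks:", raid10_unavailable(p, 2))
```

With any small p, RAID 0 comes out worse than a single disk, while RAID 1, RAID 5, and RAID 10 come out better, which matches the qualitative descriptions above.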
Further thinking:
1. RAID 0 really only exists to improve disk read/write performance, and with two disks the actual gain is certainly less than 2x: the disks read and write together, so the two have to stay at least roughly in step. The real-world gain remains to be measured. Since a hard disk usually takes several years to fail, a personal PC could try RAID 0 to speed up disk I/O; however, as SSD prices keep falling, this seems to be of little value.
2. RAID 5 looks the most sophisticated and elegant, but there is a lot of criticism of it online, which suggests it is not as good as it looks:
A. Only a single failed disk can be recovered, and the probability that recovery fails is not low (according to a quick Baidu search);
B. After a disk fails, performance drops because the missing data has to be reconstructed from parity;
3. In network data checking, a parity bit can only detect an error, not correct it (Hamming codes or CRC codes can be used where more is needed). In RAID 5, a single parity bit is enough to recover data, which is a bit counterintuitive. On reflection, though, recovering data is not the same as error correction: the array already knows which disk failed, so simple parity is indeed enough to rebuild it (for another example of simple yet magical parity, see Mathematics-Abnormal Warden).
4. RAID 3 and RAID 4 are not covered above because they are very similar to RAID 5, which can be regarded as an improved version of them. As mentioned above, RAID 5 distributes parity across all disks (one disk's worth of space in total), while RAID 4 stores all parity on a single dedicated disk. I have not studied in depth why RAID 5 is better. One possible reason: when a RAID 4 data disk (rather than the parity disk) fails, does every read of that disk's data require an XOR with the parity to reconstruct it? RAID 5's performance should be comparatively more even.
5. In my opinion, RAID only provides high availability for the system; it cannot replace regular data backups (for disaster recovery).
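The point in item 3, that plain parity can rebuild a lost disk even though it cannot correct an error, can be sketched in a few lines of Python. This is only an illustration of XOR parity over one stripe, not how a real RAID controller is implemented; the function names and the sample blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks.

    This works only because we are told which block is missing.
    """
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


def stripe_is_consistent(stripe):
    """A healthy stripe (data blocks + parity) XORs to all zero bytes."""
    return not any(xor_blocks(stripe))


# Three hypothetical data blocks plus one parity block form a stripe.
data = [b"disk", b"raid", b"five"]
parity = xor_blocks(data)
stripe = data + [parity]

# Losing any single block (data or parity) is recoverable.
for i in range(len(stripe)):
    assert rebuild(stripe, i) == stripe[i]

# A silently corrupted block is detected, but parity alone cannot say which one.
corrupted = [b"disc"] + stripe[1:]
print(stripe_is_consistent(stripe), stripe_is_consistent(corrupted))
```

The difference from error correction is visible in `rebuild`: the caller has to supply `lost_index`. A failed disk announces itself, so the position is known; a flipped bit on a network link does not, which is why parity there can only detect.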