2009-12-10 12:05:20
We have covered RAID levels and their I/O characteristics in earlier posts. While building a DR (Disaster Recovery) environment for one of our clients, the client asked: “How is RAID 1+0 different from RAID 0+1?” Both RAID 0+1 and RAID 1+0 are nested (multiple) RAID levels: the disks are divided into sets, a single RAID level is applied within each set to form the arrays, and then a second RAID level is applied on top of those arrays to form the nested array. RAID 1+0 is also called a stripe of mirrors and RAID 0+1 a mirror of stripes, following the nomenclature for RAID 1 (mirroring) and RAID 0 (striping). Let’s follow this up with an example:
Suppose that we have 20 disks with which to build either a RAID 1+0 or a RAID 0+1 array.
a) If we choose to do RAID 1+0 (RAID 1 first and then RAID 0), then we would divide those 20 disks into 10 sets of two. Then we would turn each set into a RAID 1 array and stripe across the 10 mirrored sets.
b) If on the other hand, we choose to do RAID 0+1 (i.e. RAID 0 first and then RAID 1), we would divide the 20 disks into 2 sets of 10 each. Then, we would turn each set into a RAID 0 array containing 10 disks each and then we would mirror those two arrays.
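The two ways of dividing up the disks can be sketched in a few lines of Python. This is only an illustration of the grouping described in a) and b); the function names `raid10_layout` and `raid01_layout` are made up for this example and do not correspond to any controller or tool.

```python
def raid10_layout(n_drives):
    """Pair consecutive drives into two-way mirrors (RAID 1).
    A RAID 0 stripe would then be laid across the returned pairs."""
    if n_drives % 2:
        raise ValueError("need an even number of drives")
    return [(d, d + 1) for d in range(1, n_drives + 1, 2)]


def raid01_layout(n_drives):
    """Split the drives into two equal stripe sets (RAID 0).
    A RAID 1 mirror would then be built from the two returned sets."""
    if n_drives % 2:
        raise ValueError("need an even number of drives")
    half = n_drives // 2
    return [tuple(range(1, half + 1)), tuple(range(half + 1, n_drives + 1))]


mirrors = raid10_layout(20)
stripes = raid01_layout(20)
print(len(mirrors), mirrors[:3])                # 10 [(1, 2), (3, 4), (5, 6)]
print(len(stripes), [len(s) for s in stripes])  # 2 [10, 10]
```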
So, is there a difference at all? The usable storage is the same, the drive requirements are the same, and in testing there is not much difference in performance either. The difference is actually in the fault tolerance. Let’s look at the two configurations that we mentioned above in more detail:
RAID 1+0:
Drives 1+2 = RAID 1 (Mirror Set A)
Drives 3+4 = RAID 1 (Mirror Set B)
Drives 5+6 = RAID 1 (Mirror Set C)
Drives 7+8 = RAID 1 (Mirror Set D)
Drives 9+10 = RAID 1 (Mirror Set E)
Drives 11+12 = RAID 1 (Mirror Set F)
Drives 13+14 = RAID 1 (Mirror Set G)
Drives 15+16 = RAID 1 (Mirror Set H)
Drives 17+18 = RAID 1 (Mirror Set I)
Drives 19+20 = RAID 1 (Mirror Set J)
Now, we do a RAID 0 stripe across sets A through J. If drive 5 fails, then only mirror set C is affected. It still has drive 6, so it will continue to function and the entire RAID 1+0 array will keep functioning. Now, suppose that while drive 5 is being replaced, drive 17 fails. The array is still fine, because drive 17 is in a different mirror set. So, the bottom line is that in the above configuration at most 10 drives can fail without affecting the array, as long as they are all in different mirror sets.
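The survival rule for the stripe of mirrors can be written down as a small check: the array lives as long as every mirror set still has at least one working drive. The sketch below is a toy model of that rule only (the name `raid10_survives` is hypothetical), using the same drive numbering as the list above.

```python
def raid10_survives(mirrors, failed):
    """A stripe of mirrors survives while every mirror set
    still has at least one drive that has not failed."""
    failed = set(failed)
    return all(any(d not in failed for d in mirror) for mirror in mirrors)


# Mirror sets A..J: drives (1, 2), (3, 4), ..., (19, 20)
mirrors = [(d, d + 1) for d in range(1, 21, 2)]

print(raid10_survives(mirrors, {5}))              # True
print(raid10_survives(mirrors, {5, 17}))          # True
print(raid10_survives(mirrors, range(1, 21, 2)))  # True: ten failures, one per mirror set
print(raid10_survives(mirrors, {5, 6}))           # False: mirror set C is gone
```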
Now, let’s look at what happens in RAID 0+1:
RAID 0+1:
Drives 1+2+3+4+5+6+7+8+9+10 = RAID 0 (Stripe Set A)
Drives 11+12+13+14+15+16+17+18+19+20 = RAID 0 (Stripe Set B)
Now, these two stripe sets are mirrored. If one of the drives, say drive 5, fails, the entire stripe set A fails. The RAID 0+1 array is still fine since we have stripe set B. If drive 17 then also goes down, stripe set B fails as well and the whole array is down. One can argue that this is not always the case and that it depends upon the type of controller that you have. Say that you had a smart controller that would continue to stripe to the other 9 drives in stripe set A when drive 5 fails; if drive 17 failed later on, it could use drive 7, since that drive would have the same data. If the controller can do that, then theoretically speaking, RAID 0+1 would be as fault tolerant as RAID 1+0. Most controllers do not do that though.
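Both behaviours can be modelled side by side. In the sketch below, `raid01_survives` follows the usual rule described above (one failed drive takes out its whole stripe set), while `raid01_survives_smart` models the hypothetical smart controller, which only loses data when the same stripe position has failed in both sets. Both function names are invented for this illustration.

```python
def raid01_survives(stripe_sets, failed):
    """Typical controller: a stripe set is lost as soon as any one of its
    drives fails, so the array needs at least one fully intact set."""
    failed = set(failed)
    return any(all(d not in failed for d in s) for s in stripe_sets)


def raid01_survives_smart(stripe_sets, failed):
    """Hypothetical smart controller: data is lost only when the drive at
    the same stripe position has failed in every set."""
    failed = set(failed)
    return all(any(d not in failed for d in position)
               for position in zip(*stripe_sets))


# Stripe set A: drives 1..10, stripe set B: drives 11..20
stripe_sets = [tuple(range(1, 11)), tuple(range(11, 21))]

print(raid01_survives(stripe_sets, {5}))            # True
print(raid01_survives(stripe_sets, {5, 17}))        # False
print(raid01_survives_smart(stripe_sets, {5, 17}))  # True: drive 7 covers for drive 17
print(raid01_survives_smart(stripe_sets, {5, 15}))  # False: both copies of position 5 are gone
```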
Of late I've heard much talk about RAID 1+0 being better than RAID 0+1, but never got a good answer why. Leah and I started talking about this over dinner one night and did a little math (literally on the back of a napkin) to calculate how much better. Here's what we figured out.
RAID 0+1 configuration where multiple disks are striped together into sets (sets A & B in the diagram, each set being as large as the resulting final volume), and then two or more sets are mirrored together.
RAID 1+0 configuration where two or more drives are mirrored together (mirrors 1-4 in the diagram), and then the mirrors (as many as are needed to result in the desired amount of space) are striped together.
In either case (0+1 or 1+0), the loss of a single drive does not result in failure of the RAID system. The difference comes in the chance that the loss of a second drive from the system will result in the failure of the whole system. In RAID 0+1, you have to lose one drive from each disk set to result in the failure of the whole system. In my diagram that would be one drive from set A and one drive from set B. In RAID 1+0, you have to lose all drives in a mirror. This would be both drives in any numbered pair in the diagram.
Mathematically, the difference is that the chance of system failure with two drive failures in a RAID 0+1 system with two sets of drives is (n/2)/(n - 1) where n is the total number of drives in the system. The chance of system failure in a RAID 1+0 system with two drives per mirror is 1/(n - 1). So, using the 8 drive systems shown in the diagrams, the chance that losing a second drive would bring down the RAID system is 4/7 with a RAID 0+1 system and 1/7 with a RAID 1+0 system.
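If you don't trust napkin math, the two fractions are easy to check by brute force: enumerate every pair of drives that could fail and count the pairs that kill the array. The sketch below assumes two stripe sets for RAID 0+1, two-drive mirrors for RAID 1+0, and a controller that drops a whole stripe set on a single drive failure; all function names are made up for this example.

```python
from fractions import Fraction
from itertools import combinations


def raid01_survives(n, failed):
    """Two stripe sets (drives 1..n/2 and n/2+1..n); the array survives
    while at least one set has no failed drive."""
    half = n // 2
    set_a = set(range(1, half + 1))
    set_b = set(range(half + 1, n + 1))
    return not (failed & set_a) or not (failed & set_b)


def raid10_survives(n, failed):
    """Two-drive mirrors (1, 2), (3, 4), ...; the array survives
    unless both drives of some mirror have failed."""
    return all(not {d, d + 1} <= failed for d in range(1, n + 1, 2))


def fatal_pair_fraction(n, survives):
    """Fraction of all two-drive failures that bring the array down."""
    pairs = list(combinations(range(1, n + 1), 2))
    fatal = sum(1 for pair in pairs if not survives(n, set(pair)))
    return Fraction(fatal, len(pairs))


print(fatal_pair_fraction(8, raid01_survives))  # 4/7, matches (n/2)/(n - 1)
print(fatal_pair_fraction(8, raid10_survives))  # 1/7, matches 1/(n - 1)
```

The same functions work for other even drive counts; with 20 drives they give 10/19 and 1/19.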
The math gets more complicated when you have more than two elements to a mirror. Since that's a rare configuration, I haven't bothered to figure out the equations. If someone else would like to, I'll be glad to post them here.
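Short of working out closed-form equations, one way to explore those configurations is simply to count. The sketch below is a brute-force enumeration, not a derivation, and it measures something slightly different from the figures above: the fraction of all possible sets of k failed drives that bring the array down. The 12-drive layouts, the three-way mirroring, and the function names are assumptions chosen for illustration, and RAID 0+1 is again assumed to lose a whole stripe set on a single drive failure.

```python
from fractions import Fraction
from itertools import combinations


def raid10_dead(mirrors, failed):
    """RAID 1+0 is down when some mirror has lost every one of its drives."""
    return any(set(m) <= failed for m in mirrors)


def raid01_dead(stripe_sets, failed):
    """RAID 0+1 (simple controller) is down when every stripe set
    has lost at least one drive."""
    return all(set(s) & failed for s in stripe_sets)


def failure_fraction(groups, k, is_dead):
    """Fraction of all k-drive failure combinations that kill the array."""
    drives = [d for g in groups for d in g]
    combos = list(combinations(drives, k))
    fatal = sum(1 for c in combos if is_dead(groups, set(c)))
    return Fraction(fatal, len(combos))


drives = list(range(1, 13))
mirrors = [tuple(drives[i:i + 3]) for i in range(0, 12, 3)]      # 4 three-way mirrors
stripe_sets = [tuple(drives[i:i + 4]) for i in range(0, 12, 4)]  # 3 stripe sets of 4

for k in (2, 3):
    print(k,
          failure_fraction(mirrors, k, raid10_dead),
          failure_fraction(stripe_sets, k, raid01_dead))
# 2 0 0
# 3 1/55 16/55
```

With three copies of everything, no two-drive failure can take down either layout; at three failures the same gap between 1+0 and 0+1 shows up again.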
Another difference between the two RAID configurations is performance when the system is in a degraded state, i.e. after it has lost one or more drives but has not lost the right combination of drives to completely fail. In a RAID 0+1 configuration, the loss of any drive in a set causes the failure of that entire set and the set is removed from the RAID system. Generally (in the two set case) this means you are left with a RAID 0 system made up of the remaining set of disks. This probably slightly improves write performance and slightly degrades read performance (but that's just a WAG, I haven't done any testing). In a RAID 1+0 system, you would see the same effect on each mirror that loses a drive, but not the whole system. In other words, a RAID 1+0 configuration will tend to show similar, but less dramatic, changes in performance when in a degraded mode than RAID 0+1. However, the changes will likely be slight in any case.
One more difference that was recently pointed out to me is the speed at which the RAID system recovers once the failed disk is replaced. RAID 1+0 only has to re-mirror one drive, whereas RAID 0+1 has to re-mirror the entire failed set. So RAID 1+0 will recover significantly faster.
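To put a rough number on that, the sketch below counts how many drives have to be rewritten after a single failed drive is replaced. It assumes two-drive mirrors for RAID 1+0, two stripe sets for RAID 0+1, and a controller that resynchronizes the whole failed set; it counts data copied, not wall-clock time, which also depends on how much of the copying the controller does in parallel. The function name is hypothetical.

```python
def drives_to_resync(n_drives, layout):
    """Number of drives whose contents must be rewritten after one
    failed drive has been replaced (simplified model)."""
    if layout == "1+0":
        return 1              # only the replaced drive is re-mirrored
    if layout == "0+1":
        return n_drives // 2  # the whole failed stripe set is re-mirrored
    raise ValueError("layout must be '1+0' or '0+1'")


for n in (8, 20):
    print(n, drives_to_resync(n, "1+0"), drives_to_resync(n, "0+1"))
# 8 1 4
# 20 1 10
```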
There used to be a list here of products that did or didn't support RAID 1+0. The list got too long for me to maintain. Ask the vendor or something. Usually you can find this in the documentation for the product, but it is frequently in the fine print.