
Thread: RAID And You (A Guide To RAID-0/1/5/6/xx)


  1. #1
    Xtreme CCIE
    Join Date
    Dec 2004
    Location
    Atlanta, GA
    Posts
    3,842

    Raid-1

    Index:
    Section 1: Overview
    Section 2: How it Works 1 - The Basics
    Section 3: File Reading Models
    Section 4: How it Works 2 - More Advanced
    Section 5: Side Notes
    Section 6: Advantages (summary)
    Section 7: Disadvantages (summary)
    Section 8: Rules of Thumb
    Section 9: FAQ (various)
    Section 10: Future of RAID-1
    Section 11: References / Further Reading



    Section 1: Overview


    In broad strokes, RAID-1 (or mirroring) is the simplest RAID form there is. It requires two disks which are essentially exact copies of one another. The RAID controller simply duplicates the stream of data that was to be written to the array and sends one stream to each hard drive. This provides a number of advantages, such as full fault tolerance, potentially excellent data read speeds, and the ability to create the array without formatting... but the costs are that you lose half of your total potential storage capacity (one disk simply mirrors the other), that the array is not scalable (you cannot add disks to increase capacity), and that writes suffer a speed penalty.

    You may notice that throughout this thread many references are made to the complexity of RAID-1, yet I also just said that in the broad strokes it is the simplest RAID form there is. Both are true. The real complexity comes into play when you start to examine the various methods that exist to read data from a RAID-1 array that are not available in any other RAID model, which you will find explored in more detail in Section 3: File Reading Models (performance for various file sizes is in How it Works 2).


    Section 2: How it Works 1 - The Basics
    2a. Writing to the Mirror
    2b. Reading from the Mirror


    2a. Writing to the Mirror
    As mentioned in the introduction, the basic writing mechanism is quite simple: the RAID controller simply duplicates the stream of data that was to be written to the array and sends one stream to each hard drive. Still, for a more graphical representation I'll draw a little diagram showing the result on the physical disks of the processor requesting data blocks A, B, C, D be written to the array (where the array is, as you'll recall from earlier sections, the name for the logical grouping of disks):

    Disk 1: A, B, C, D
    Disk 2: A, B, C, D
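
    As a minimal sketch (in Python, with blocks as plain strings rather than real sectors, and function names of my own invention), the write path amounts to duplicating the stream:

```python
def mirror_write(blocks):
    """RAID-1 write: duplicate the incoming block stream, one copy per disk."""
    disk1 = list(blocks)  # disk 1 receives the full stream
    disk2 = list(blocks)  # disk 2 receives an identical copy
    return disk1, disk2
```

    Calling mirror_write(["A", "B", "C", "D"]) yields the two identical disks shown in the diagram above.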

    2b. Reading from the Mirror
    Reading from the mirror is where RAID-1 becomes more complicated, and I highly suggest reading the "File Reading Models" section for a more thorough explanation of the nuances and different methods that can occur. For now it suffices to say that there are two major models which can be used depending on the type of controller being used:

    1. All data is read from only one disk
    2. Data can be read from both disks

    As you may have guessed, the second model has the potential to provide significantly better request processing.


    Section 3: File Reading Models
    3a. Single Disk
    3b. Both Disks - Per Job Load Balancing
    3c. Both Disks - Per "Stripe" Load Balancing
    3d. Both Disks - Read Optimizations - Elevator Seek
    3e. Both Disks - Read Optimizations - Shortest Seek First


    The potential strength of RAID-1 comes entirely from the variety of read methods available. Some methods provide no benefit over a single disk solution, while others are able to handily beat the speed of a RAID-0 array.

    3a. Single Disk
    The single disk read model is one you will find in many basic RAID chipsets and is solely responsible for the bad rap that RAID-1 receives from the enthusiast community. As the name implies, although both disks hold the same data, only one disk is ever read from (and though some implementations switch which disk that is from time to time, the result is the same). This is a gross inefficiency, and it reduces read performance to approximately that of a single disk which is not in an array. As a small bit of consolation, it should be noted that even on cheap onboard RAID chipsets this particular implementation uses less CPU time than any other.

    3b. Both Disks - Per Job Load Balancing
    Per Job load balancing assigns read requests ("jobs") to different disks based on various criteria (see below). For example, say you as a user were going to open two files simultaneously. Each file (or "job") is then read from a different hard drive. Already you can see how this is a significant improvement over the single disk model: instead of reading two jobs off one disk, one job is read off each disk and the total is completed in half the time. Another advantage of this type of load balancing is that it is a popular method employed by many software-based RAID card manufacturers, and may even be employed by many operating systems' software RAID arrays, yet imposes no more than a 1-2% performance hit on the CPU. This makes it a very cost-efficient solution. The various criteria that controllers can use to assign jobs include (but are not limited to): disk queue length (common), elevator seek (see 3d, often used in conjunction with disk queue length), shortest seek first (also used with disk queue length), and basic round robin.

    The "disadvantage" of this type of load balancing is that it doesn't often do much for a desktop user because most people do not encounter issues of a large read request interrupting smaller read requests in their day-to-day browsing.

    Notably, this type of load balancing inherently lends itself to multi-user environments, where one user's large request will not make a second user wait a long time for his small request to be filled. If, however, you find yourself in a situation where you have one program that does intensive reading and it interrupts your small side-requests (or vice versa), RAID-1 with Per-Job Load Balancing may be the way to go for you.
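
    To illustrate the disk-queue-length criterion mentioned above, here is a small Python sketch (the function name and queue representation are my own, not taken from any particular controller):

```python
def assign_job(job, disk_queues):
    """Per-job load balancing: send the whole job to whichever disk
    currently has the shortest queue of pending requests."""
    target = min(range(len(disk_queues)), key=lambda d: len(disk_queues[d]))
    disk_queues[target].append(job)
    return target
```

    With one disk already busy servicing a large job, a new small job lands on the idle disk instead of queuing behind it.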

    3c. Both Disks - Per Stripe Load Balancing
    For lack of a better name I have decided to call this Per Stripe Load Balancing, though in reality a mirrored array really has no need for "stripes" in the same way that RAID-0 and RAID-5/6 do. In this scenario both drives can team up for each request that comes in, with one drive reading (for example) all the odd blocks of data and one drive reading all of the even blocks. This method effectively allows for RAID-0 style read speeds for large file read jobs while ensuring that if one disk fails you do not lose all of your data. Like Per Job Load Balancing, Per Stripe Load Balancing can also make use of various optimizations such as Elevator seek and Shortest Seek First which actually overcome the RAID-0 small file seek issue, and can vault RAID-1 into the throne of the fastest disk reading solution.
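
    A sketch of the odd/even split described above (purely illustrative; a real controller works in fixed-size stripe blocks, not Python lists):

```python
def per_stripe_read(blocks):
    """Per-stripe load balancing: disk 1 reads the even-indexed blocks,
    disk 2 the odd-indexed ones, so each disk covers half the request."""
    return blocks[0::2], blocks[1::2]
```

    For a large sequential request, each disk only has to read half the blocks, which is where the RAID-0 style read speed comes from.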

    3d. Both Disks - Read Optimizations - Elevator Seek
    The best explanation of this optimization comes from Wikipedia, and rather than plagiarize, I'll simply quote them:

    Quote Originally Posted by Wikipedia
    From an implementation perspective, the drive maintains a buffer of pending read/write requests, along with the associated cylinder number of the request. Lower cylinder numbers indicate that the cylinder is closest to the spindle, and higher numbers indicate the cylinder is further away. When a new request arrives while the drive is idle, the initial arm/head movement will be in the direction of the cylinder where the data is stored, either in or out. As additional requests arrive, requests are serviced only in the current direction of arm movement until there are no further requests in that direction. When this happens, the direction of the arm reverses, and the requests that were remaining in the opposite direction are serviced, and so on.
    This particular optimization is seen extensively in both software and hardware implementations. It does not really require any additional processing power and allows a RAID-1 two disk array to overcome the largest pitfall of a two-disk RAID-0 array - small file access time.
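
    The quoted behaviour can be sketched in a few lines of Python (cylinders as plain integers; this models only a single sweep that starts in the direction of increasing cylinder numbers):

```python
def elevator_order(pending, head):
    """Elevator (SCAN) scheduling: from the current head position, service
    every pending request in the direction of increasing cylinder numbers,
    then reverse and service the remainder on the way back."""
    ahead = sorted(c for c in pending if c >= head)                   # current sweep
    behind = sorted((c for c in pending if c < head), reverse=True)   # return sweep
    return ahead + behind
```

    With the head at cylinder 50 and requests at 10, 95, 30 and 70 pending, the head sweeps outward through 70 and 95, then reverses through 30 and 10.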

    3e. Both Disks - Read Optimizations - Shortest Seek First
    I'll start off by stating that I myself have never specifically seen this implemented in RAID-1, but it does get spoken of and is probably in place somewhere. This particular algorithm is designed to provide the shortest access times possible by buffering requests and tying their positions to cylinder data. Each hard drive then selects jobs by choosing the job closest to the current read head position, providing exceptional service time. The downside is that if requests keep coming in, this method can starve requests for areas of the disk that are further from the head. This limitation is often overcome by micro-optimizations concerning when the head should service other requests. In theory this method has the potential to provide the shortest seek times of any optimization, but thanks to its complexity and limited implementation very little data about its actual performance exists.
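
    In the same spirit as the elevator sketch, a naive Shortest Seek First queue can be modelled like so (again a sketch of the general algorithm, not any vendor's implementation):

```python
def ssf_order(pending, head):
    """Shortest Seek First: always service the request closest to the
    current head position. Note how far-away requests drift toward the
    back of the order (the starvation risk mentioned above)."""
    remaining, order = list(pending), []
    while remaining:
        nearest = min(remaining, key=lambda c: abs(c - head))
        remaining.remove(nearest)
        order.append(nearest)
        head = nearest  # the head is now parked at the serviced cylinder
    return order
```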


    Section 4: How it Works 2 - More Advanced
    4a. Reading a "small" File
    4b. Reading a "large" File
    4c. Reading a "large" NON-SEQUENTIAL File
    4d. Writing to the Array
    4e. Reading & Writing from/to the Array (ie. uncompressing a large RAR)


    Please Note: This section *requires* an understanding of Section 3.

    4a. Reading a "small" File
    Without optimizations, any read model will yield the same result: performance about equal to a regular single disk. That being said, any non-integrated solution (meaning either a hardware or software controller separate from the motherboard) should implement at least an Elevator Seek algorithm, at which point small-file performance excels. In fact, RAID-1 is the only RAID level which, with optimizations, performs better on average than a single disk for random seeks. In my own testing (which was by no means conclusive) I found that just implementing elevator seek in software alone resulted in a 10% reduction in average seek time versus a single disk (and let's not forget that other array types are actively worse than a single disk here).

    4b. Reading a "large" File
    Obviously the read scheme will make a difference here. Single Disk read models will not help at all, and on a per-file basis neither will per-job load balancing, but per stripe load balancing will result in 2-disk RAID-0 speeds for large file reads.

    4c. Reading a "large" NON-SEQUENTIAL File
    Take everything from 4a and 4b and add them together here. A proper per stripe load balancing scheme with an elevator seek will provide not only RAID-0 like speeds for the sequential reading bits, but can also reduce seek time. It should be noted, however, that making effective use of this requires a somewhat more sophisticated piece of controller logic, one that can send one drive's heads to one piece of fragmented data while sending the other drive's heads to a different piece, and keep track of it all. In reality I have not seen any data that conclusively shows any hardware producing a difference one way or another versus RAID-0 in a test like this.

    4d. Writing to the Array
    Writing to the array is where RAID-1 takes a hit. Because one disk will always write slower than the other, and a write job will not complete until both disks are done, writes will on average take longer than on a single disk.

    4e. Reading & Writing from/to the Array (ie. uncompressing a large RAR)
    Like any other RAID level, there is simply no benefit to using RAID-1 for this purpose; it is a job best left to multiple single drives.


    Section 5: Side Notes
    5a. When Drives go Bad...
    5b. Software vs. Hardware RAID-1


    5a. When Drives go Bad...
    Your data is fully protected. You can have one of your two drives die on you and the other contains a full working copy of all your data. Some controllers allow a drive to fail without interrupting your service, while others will require a reboot. To rebuild the array, simply find another hard drive and re-create it; generally no formatting should be required. Rebuilding the array will take a fairly significant amount of time, however: roughly as long as it takes the new drive to write a full copy of the surviving drive's data onto its platters.

    5b. Software vs. Hardware RAID-1
    While there is no specific requirement to use hardware RAID-1, there is a persuasive argument for it. Software RAID-1 requires very few computer resources and has the ability to implement any form of optimization the programmers could come up with, without any real additional processing requirements. However, the main split appears when you look at the type of load balancing different products offer. To date, all software load balancing I have seen uses per-job load balancing, as do many cards (the Areca 1210, for example)... but if "per-stripe" load balancing is what you are looking for, a hardware RAID controller is almost certainly what you will need.


    Section 6: Advantages (summary)


    1. Complete single disk failure redundancy
    2. Low cost (hardware controller not necessary, though read section 5b)
    3. Potential for faster read access than a 2-disk RAID-0 array overall (see 4a, 4b)


    Section 7: Disadvantages (summary)


    1. High cost per GB (you lose 1/2 your potential storage capacity)
    2. Write speed is lower than a single disk on average
    3. Does not protect against data corruption


    Section 8: Rules of Thumb
    8a. When to use RAID-1
    8b. When NOT to use RAID-1


    8a. When to use RAID-1
    Use RAID-1 when:
    1. You want the fastest read speed you can pull out of two identical drives
    2. You have a game or application that would greatly benefit from a reduction in seek time (lots of small file reading going on)
    3. When data preservation is important to you

    8b. When NOT to use RAID-1
    1. When you care that you're losing half your storage capacity
    2. When you will be using the array for lots of writing


    Section 9: FAQ


    Q: Should I just read the RAID-0 FAQ first because it's boring to repeat the same questions and answers?
    A: Yes.


    Section 10: Future of RAID-1


    As SSD drives come out and hardware that can provide XOR calculations becomes more mainstream, RAID-1 will slowly die out. The fact is that as seek times stop becoming an issue and platter mechanics disappear the speed benefit of RAID-1 will disappear as well, and redundancy will likely be covered more by RAID-5 and RAID-6 arrays, which offer more storage capacity overall.


    Section 11: References / Further Reading


    http://en.wikipedia.org/wiki/Elevator_algorithm
    Last edited by Serra; 06-18-2008 at 01:43 AM. Reason: Typo Fix
    Dual CCIE (Route\Switch and Security) at your disposal. Have a Cisco-related or other network question? My PM box is always open.

    Xtreme Network:
    - Cisco 3560X-24P PoE Switch
    - Cisco ASA 5505 Firewall
    - Cisco 4402 Wireless LAN Controller
    - Cisco 3502i Access Point

  2. #2

    Raid-5/6

    Index:
    Section 1: Overview
    Section 2: How it Works 1 - The Basics
    Section 3: How it Works 2 - More Advanced
    Section 4: Hardware and Performance
    Section 5: What if a Drive Fails?
    Section 6: Advantages (summary)
    Section 7: Disadvantages (summary)
    Section 8: Rules of Thumb
    Section 9: FAQ (various)
    Section 10: Future of RAID-5/6
    Section 11: References / Further Reading



    Section 1: Overview


    As an astute reader, you have no doubt determined that unlike the last two posts, this single section serves as a guide for two different RAID levels. Is this because I'm lazy and wanted to save myself some time by only going into half the quality each level deserves? No. It's because, at heart, RAID-5 and RAID-6 are not too dissimilar from one another... in fact, RAID-6 is often seen as simply being an extension of RAID-5. As such, this post will primarily concern RAID-5 and make note of any areas where RAID-6 deviates.

    Both RAID-5 and RAID-6 have a fairly idealized goal: to not only increase file read and write performance by utilizing multiple hard drives, but provide fault tolerance as well. While the latter goal is simply a necessary operation of these levels, the former may require significant investment depending on the level of performance expected. As such, both RAID-5 and RAID-6 are typically seen only in business environments which can tolerate the higher costs of ownership.


    Section 2: How it Works 1 - The Basics
    2a. Informational
    2b. Parity
    2c. Writing to the Array
    2d. Reading from the Array


    2a. Informational
    To operate properly, RAID-5 requires a minimum of 3 hard drives in an array and RAID-6 requires 4. The reason behind this is that RAID-5 was designed to handle the failure of one hard drive, while RAID-6 was designed to handle the failure of up to two. For either to have less than the minimum number the situation would have to fall into one of two categories: either there is only a single disk and thus no RAID array is possible OR there are only two disks in the initial array and RAID-0 or -1 would be the better option (depending on goals).

    2b. Parity
    Unlike the other levels examined thus far, both RAID-5 and RAID-6 require the use of parity information. This parity information is effectively only used when a disk fails and allows the remaining disks to continue functioning and re-construct all of the information that was on the failed drive. Although we will explore how this parity information is generated in the first place in a later subsection, it suffices to say for now that it requires an amount of space equal to one full hard drive for RAID-5 and two for RAID-6. Thus,

    4x 320GB drives in RAID-5 = 960GB of useable space (320GB lost)
    5x 400GB drives in RAID-5 = 1.6TB of useable space (400GB lost)

    4x 320GB drives in RAID-6 = 640GB of useable space (640GB lost)
    5x 400GB drives in RAID-6 = 1.2TB of useable space (800GB lost)
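
    The arithmetic above generalizes to: usable space = (number of drives minus parity drives) times drive size, where parity consumes one drive's worth of space for RAID-5 and two for RAID-6. As a one-line Python check (function name my own):

```python
def usable_gb(drives, drive_gb, level):
    """Usable array capacity: parity consumes one drive's worth of space
    in RAID-5 and two drives' worth in RAID-6."""
    parity_drives = {5: 1, 6: 2}[level]
    return (drives - parity_drives) * drive_gb
```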

    2c. Writing to the Array
    The chain of events involved goes something like this (broadly - much more detail in the advanced section):
    1. Data that needs to be written arrives at the RAID controller and is split into stripes
    2. Parity information is calculated for each data stripe
    3. For each data stripe, data is written to all but one of the drives; the parity information is written to the remaining drive
    3*. The hard drive that the parity data is written on is changed on a round-robin basis per stripe written

    Using 4 disks and a RAID-5 array, if the original data to be written were split into stripes A, B, C, D, E, then each stripe would be split into (n-1 = 3) blocks (A1-A3, B1-B3, etc). A parity block is then calculated for each stripe and is incorporated into a "full" stripe, which is written across the disks, with the parity block (denoted by the 'p') being distributed in round-robin fashion.
    Disk 1: Ap , B1 , C1 , D1 , Ep
    Disk 2: A1 , Bp , C2 , D2 , E1
    Disk 3: A2 , B2 , Cp , D3 , E2
    Disk 4: A3 , B3 , C3 , Dp , E3
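
    The rotating layout above can also be generated programmatically. This Python sketch reproduces the diagram (the block naming scheme is mine, chosen to match the figure):

```python
def raid5_layout(stripes, disks):
    """Distribute data blocks and a round-robin parity block ('p') across
    `disks` drives for `stripes` stripes named 'A', 'B', ..."""
    layout = [[] for _ in range(disks)]
    for s in range(stripes):
        name, parity_disk, block = chr(ord('A') + s), s % disks, 1
        for d in range(disks):
            if d == parity_disk:
                layout[d].append(name + 'p')         # parity block for this stripe
            else:
                layout[d].append(name + str(block))  # next data block
                block += 1
    return layout
```

    raid5_layout(5, 4) produces exactly the four disk columns shown in the diagram above, with the parity block stepping one disk to the right per stripe.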

    2d. Reading from the Array
    Thankfully, reading from an array is much simpler than writing to it. When reading from the array, only the data blocks (and not the parity blocks) are read, then re-combined by the controller to form a usable data stream.


    Section 3: How it Works 2 - More Advanced
    3-. Please Read
    3a. What is an XOR?
    3b. The Update Write Function
    3c. The Regenerate Function
    3d. The Rebuild Function


    3-. Please Read
    RAID-5/6 can be an extremely complicated beast depending upon the depth you want to go into. Like RAID-1, various manufacturers can offer different optimizations and to some degree different implementations (ie. on-disk versus on-controller partial XOR)... and the fact of the matter is that short of a 20 page document, I could not explain it fully. As such, this section will only attempt to detail the basic mechanism; I have also provided a few links to more detailed pages (one so detailed it even outlines the protocol format for various operations), which I would suggest as reading for anyone who is really interested.

    Also, it will suffice to say for the rest of this section that RAID-6 will be considered as a duplicate occurrence of RAID-5 operation using another parity block. It's not entirely true, but for the extent RAID-6 is seen in enthusiast PCs it will suffice.

    Finally, I would like to acknowledge that the steps for subsections 2a/b/c are effectively paraphrases of the information found in the reference section (so shoot me, I found very well-written reference pages).

    3a. What is an XOR?
    An XOR is a mathematical operation also known as an Exclusive OR in many circles. For a full explanation of XOR and all of its applications, see http://en.wikipedia.org/wiki/Xor

    The basic mathematical idea is that:
    0 XOR 0 = 0
    0 XOR 1 = 1
    1 XOR 0 = 1
    1 XOR 1 = 0

    And it should be noted that when talking about multi-bit XORs, each bit is simply matched up with a bit below and XOR'd individually. Example:

    10010110 XOR 01001110 = 11011000

    The reason this is important to RAID-5 is as follows: if you XOR any number (n) of binary strings bit-wise (XOR the first string with the second, XOR that result with the third, and so on), the final result is a parity string, leaving you with n original strings + 1 parity string. If any one of those strings is later lost, XORing together all of the strings you still have recreates it: the final XOR result will be the string that was lost.
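
    In Python (using small integers as stand-ins for binary strings), both the parity calculation and the recovery look like this:

```python
def xor_fold(blocks):
    """XOR all blocks together; used both to compute parity and to
    recreate a lost block from whatever blocks remain."""
    result = 0
    for b in blocks:
        result ^= b
    return result

data = [0b10010110, 0b01001110, 0b11110000]
parity = xor_fold(data)                        # the parity "string"
# Pretend data[1] is lost; XORing everything left recreates it:
recovered = xor_fold([data[0], data[2], parity])
```

    The recovery works regardless of which single block is dropped, parity included.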

    3b. The Update Write Function
    The Update Write Function is used to write new data to a data stripe and update the parity information in parity stripes (note: on a per-stripe basis, each disk can be considered either a data drive or a parity drive). To perform a write operation to a RAID 5 disk array, it is necessary to perform what is referred to as a "Read-Modify-Writeback" operation. Several steps must be performed:

    1. Get the new data to be written to the data disk
    2. Read old block contents (data that will be replaced) into internal buffers from the data disk
    3. Read old blocks corresponding parity information into internal buffer from the parity disk
    4. XOR the old data block contents and the old parity information (removes the old data's contribution to parity) [wondering why this happens? read on]
    5. Compute the new parity information by XORing the new data from (1) with the XOR result from (4)
    6. Write the new parity to the parity disk
    7. Write the new data to the data disk
    8. Signify I/O completion via interrupt

    An excellent diagram of how this works is found on page 5 of the Intel page mentioned in the references for this section.

    There now remain only two things to explain relating to RAID-5 write operations:

    Q. What happens when you take more than just a single data disk and parity disk into the equation (RAID-5 does require a minimum of 3 disks)?
    A. Steps 1->8 occur for each disk in order. The parity block remains the same size because only XOR operations are performed upon it. This is one of the instances where optimizations can come in, with any decent controller being able to read all necessary data from all drives and computing XOR's in memory before finally telling the parity disk to read/write only once.

    Q. Why does step 4 take place (why is old data read and used)?
    A. Step 4 occurs because the data update may be small - say the size of only the stripe on one disk. If a slight change like that were to occur and step 4 did not take place, then the data from all disks would have to be read and parity calculated from all of them rather than simply from the single disk. Although it seems like it adds an unnecessary second XOR, in reality it prevents many more potential XOR operations.
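
    Steps 4 and 5 of the Read-Modify-Writeback sequence reduce to two XORs. A short Python sketch (integer blocks again, and the function name is my own):

```python
def update_parity(old_data, old_parity, new_data):
    """RAID-5 Read-Modify-Writeback parity update.
    Step 4: XOR out the old data's contribution to the parity.
    Step 5: XOR the new data in to form the new parity."""
    without_old = old_data ^ old_parity
    return new_data ^ without_old
```

    For a stripe where d1 XOR d2 = p, updating d1 this way yields the parity d1' XOR d2 without ever reading the other data disks, which is exactly the point of step 4.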

    3c. The Regenerate Function
    The regenerate function is used to recreate a data block that cannot be read from a data device (ie. a disk has failed but data is still being read). In this scenario, the distinction between data and parity is nonexistent because neither means anything without the other. The sequence is as follows:

    1. Data is read from drive 1
    2. Data is read from drive 2
    3. The data read from drive 1 and 2 are XOR'd
    4. The XOR result is XOR'd with the data from drive 3 (if applicable)
    5. Step 4 repeats as many times as is necessary until no drives remain
    6. The final XOR result is the data that was on the failed drive
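
    The sequence above is just a left-to-right XOR fold over every surviving drive's block; sketched in Python:

```python
from functools import reduce

def regenerate(surviving_blocks):
    """Steps 1-6 above: XOR the block read from each surviving drive
    (data and parity alike); the final result is the failed drive's block."""
    return reduce(lambda acc, b: acc ^ b, surviving_blocks)
```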

    3d. The Rebuild Function
    The rebuild function is similar to the regenerate function, except that the last XOR result is written to the formerly unreadable device (ie. a drive has failed and the controller is rebuilding the array on a fresh drive that has been installed in its place). Follow the sequence above and add a 7th step where the final XOR result is written to the new drive.


    Section 4: Hardware and Performance
    4a. Why do People Say RAID-5/6 is Resource Intensive?
    4b. Bottleneck Sources
    4c. Recommendations


    4a. Why do People Say RAID-5/6 is Resource Intensive?
    Because it is. While no single function in a RAID-5/6 read or write requires much in the way of resources, together they require significantly more than RAID-0 or a plain disk, and the cumulative effect over a larger array under an intensive operation is staggering, not to mention the effect these operations have on latency for small requests.

    4b. Bottleneck Sources
    Case 1: Software RAID
    Your main bottleneck is that your CPU has to do all of the work with no streamlined or purpose-built functions: it ends up handling nearly everything itself, and because it was not designed for the task, it must also manage all of the bookkeeping around the data it is processing. There are few situations where software RAID-5/6 is a viable solution, and fewer where it would be recommended, so we're not going to go there.

    Case 2: Hardware RAID
    Many of today's RAID controllers implement a hardware XOR engine (two, in some cases) which can calculate XORs at wire speed (meaning effectively instantaneously), and with such controllers the XOR itself is removed as the key bottleneck. What then emerge as the largest bottlenecks are:
    1. The speed of the drives themselves as they read/write data (though this is unavoidable)
    2. The speed of the memory chips used by the XOR engine for its data buffering on the controller. This becomes the largest bottleneck because, with the exception of wire travel distance itself (also unavoidable, and effectively a non-source of latency anyway), it is the only other source of latency: every time data is to be XOR'd, it must first sit in a buffer, and once done it sits there again. This should become a smaller concern as time goes on, but it is still a concern. For example, 250MBps of throughput requires 4ns memory, and the keen observer will note that while many companies list the clock speed of on-board processors, they fail to specify memory speed (not that I would particularly worry about this unless you really had the time and money to do so).

    4c. Recommendations
    My first recommendation is to find all your hookups with RAID card companies and have them send me units for testing. That aside, price is realistically going to be the largest determining factor in what you can (or cannot) get, as good RAID controllers are not cheap, so I'll offer only a few simple guidelines:

    1. Make sure to find a hardware RAID controller; a software RAID controller won't do too much for RAID-5
    2. Make sure to check the bandwidth of the bus the card will be on (PCI 2.2, PCI-E, PCI-X) against the potential bandwidth of each drive, and ensure that even if everything fires perfectly you still have room to spare on the bus so it doesn't bottleneck your system
    3. If possible, purchase hard drives which can perform XOR operations in their buffers (this should be true of many SCSI drives today). This will remove a number of buffer and transfer latency-inducing operations per write operation.


    Section 5: What if a Drive Fails?


    RTFM. Your controller will come with some form of documentation on exactly what steps to take, and I suggest following them. Procedures are different from product line to product line, but in general if a drive in your RAID-5/6 array fails, your setup will fall into one of three major categories: one which can rebuild without a reboot, one which has hot-swap capabilities, and one which has a hot-spare built in. In any case, once the failure is detected and the failed drive removed and replaced with a working one (except in the case of the hot spare, where a new one is already in there) the Rebuild write function will begin. This is an extremely disk and resource intensive process, which may very well make the disk system unusably unresponsive depending upon your setup. I recommend you take this time to floss your teeth, as despite what Listerine may think, flossing is an extremely important part of oral health.


    Section 6: Advantages (summary)

    1. Complete protection against the loss of 1 disk (RAID-5) or two disks (RAID-6)
    2. Read speeds are comparable, by and large, to RAID-0. Although with the same number of drives RAID-0 would have one additional disk's worth of usable space to read from, let's not forget that RAID-5/6 do read from the same number of drives... the only difference is that they effectively skip over a single stripe block (a parity stripe) every so often. When you additionally take into account that RAID-5/6 require 3-4 drives minimum, and that even theoretical read gains for RAID-0 drop to single digits by the 4th drive, the distinction is very difficult to see.


    Section 7: Disadvantages (summary)

    1. Cost of hardware controllers for a worthwhile implementation
    2. Cost per GB increases in your disk subsystem because at least one if not two disks worth of space is lost to parity


    Section 8: Rules of Thumb
    8a. When to use RAID-5
    8b. When to use RAID-6


    8a. When to use RAID-5
    Use RAID-5 if you meet the following criteria:
    1. You have a few hundred dollars to spend on a hardware disk controller
    2. You require large file read performance greater than what you would find with a single disk
    3. You require data availability in the face of disk failure
    4. You require more than one disk worth of space

    8b. When to use RAID-6
    Use RAID-6 if:
    1. You meet all criteria for RAID-5
    2. Your data is so important that another disk failure occurring while a failed disk is being replaced is not tolerable
    3. You have more than just a few hundred dollars to spend on a hardware disk controller


    Section 9: FAQ (various)


    I have not really encountered any questions to post here. Please note that I will not answer questions about setting up or using RAID-5 on any chipset/hardware controller - you have to read the manuals people.


    Section 10: Future of RAID-5/6


    RAID-5/6 has a bright future. Like RAID-0 it will benefit from a shift in hard drives towards solid state disks reducing latency, and as memory chips become faster and cheaper controller cards will (we all hope) become cheaper, faster, and carry larger amounts of on-board cache as well. Drives have already emerged which can perform partial XOR functions in their own cache, eliminating a large bottleneck, and as they become more common hopefully their prices will decrease as well.


    Section 11: References / Further Reading


    Read these two in order:
    http://en.wikipedia.org/wiki/Xor
    http://download.intel.com/design/sto...s/30094601.pdf

    Then read through some of these:
    http://findarticles.com/p/articles/m..._61620918/pg_1
    http://findarticles.com/p/articles/m..._60300939/pg_1
    http://www.t11.org/t10/document.07/07-113r0.pdf
    http://t10.t10.org/ftp/t10/document.94/94-111r6.pdf
    Last edited by Serra; 08-02-2007 at 04:31 PM.
    Dual CCIE (Route\Switch and Security) at your disposal. Have a Cisco-related or other network question? My PM box is always open.

    Xtreme Network:
    - Cisco 3560X-24P PoE Switch
    - Cisco ASA 5505 Firewall
    - Cisco 4402 Wireless LAN Controller
    - Cisco 3502i Access Point

    Raid-xx

    Index:
    Section 1: Overview
    Section 2: Examples
    - 2a. RAID-00
    - 2b. RAID-01
    - 2c. RAID-10
    - 2d. RAID-11
    Section 3: FAQ
    Q: What is the effective difference between RAID-01 and RAID-10?



    Section 1: Overview


    Sometimes we want the best of both worlds. Sometimes "good enough" just isn't good enough. It's times like these when we look to combinations of RAID levels to offer us increased speed and protection. Combination levels are pretty much what they sound like: multiple RAID levels layered to behave in a way that neither could on its own. These combinations do not necessarily have to use different levels, or even only two RAID levels... indeed, you could have a RAID-1 array of RAID-1 arrays, or a RAID-5 array of RAID-1 arrays which are themselves comprised of RAID-0 arrays. Your choices and the nesting you opt for are - practically speaking - limitless (the costs will generally become oppressive long before you are constrained by the technologies available).

    How it Works
    The first trick to master when dealing with RAID combinations is to wrap your head around the basic concept of creating an array using other arrays as constituents instead of hard drives. The simplest way to think of a RAID combination level is to break it down and work it through piece-by-piece. Popular nomenclature (which I oppose but will still use here) states that the level furthest to the LEFT is the one which is actually comprised of hard drives, and as you move further right you begin creating arrays which are made of the arrays below them rather than of physical drives.

    For example, the level RAID-0+1 (aka RAID-01) could be expanded to:

    A RAID-1 array which mirrors a RAID-0 array on to another RAID-0 array.

    To go to a slightly more complex model, the level RAID-0+1+1 (aka RAID-011) could also be stated to be:

    A RAID-1 array which mirrors a RAID-1 array to another RAID-1 array, both of which are themselves comprised of two separate RAID-0 arrays

    Pictorial examples:
    http://upload.wikimedia.org/wikipedi...RAID_0%2B1.png [RAID-01]
    http://upload.wikimedia.org/wikipedi...px-RAID_10.png [RAID-10]
    http://upload.wikimedia.org/wikipedi..._0%2B3.svg.png [RAID-03]

    It should be noted that you can technically arrange it such that each constituent array is made of different types of hard drives... however, this is frowned upon because the best performance comes from matched drives, so that case will not be considered.
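    The nomenclature above can be sketched in a few lines of code. This is a minimal illustration (the helper names are hypothetical, not from any real tool): the leftmost digit is the level built from physical drives, each digit to the right wraps the arrays below it, and the minimum drive count is simply the product of each layer's minimum member count.

```python
# Sketch: expand nested RAID nomenclature and compute minimum drive counts.
# Hypothetical helpers for illustration only.

MIN_MEMBERS = {"0": 2, "1": 2, "5": 3, "6": 4}   # minimum members per level
NAMES = {"0": "RAID-0", "1": "RAID-1", "5": "RAID-5", "6": "RAID-6"}

def min_drives(nested: str) -> int:
    """Minimum physical drives: the product of each layer's minimum members."""
    total = 1
    for digit in nested:
        total *= MIN_MEMBERS[digit]
    return total

def expand(nested: str) -> str:
    """Plain-English expansion, outermost (rightmost) level first."""
    return " of ".join(NAMES[d] for d in reversed(nested)) + " arrays"

# min_drives("01") -> 4 (RAID-0+1), min_drives("011") -> 8 (RAID-0+1+1)
# expand("01") -> "RAID-1 of RAID-0 arrays"
```

    This matches the worked examples above: RAID-0+1 needs 4 drives minimum, and RAID-0+1+1 needs 8.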


    Section 2: Examples


    Strictly speaking, I believe that once a person has seen a few pictures of how RAID combinations work and been given a basic explanation, they should be able to work out the details themselves - how many drives are required and what sort of benefits will be seen... but as we live in a world where expecting others to do their own homework as often as not results in misinformation, I have created this section. Below you will find examples of a number of RAID combinations, along with a brief synopsis of what benefits they offer and why, how many drives are required, and any special notes I can think of. As well, somewhere below is hidden the made-up word "fishfacefignewton", which I expect anyone who seriously wants to read this full thread should be able to find on their own without using CTRL-F. If you look for but cannot find it, I would submit that a proper exercise would be to read this entire guide from start to finish (not just this post) until you do, because clearly you have not paid enough attention.

    Note: It will be assumed here that RAID-1 will not load balance a single job, as that is not likely to be the case in practice.

    2a. RAID-00
    Min. Drives Required: 4 (RAID-0 requires 2, and this is two instances of it combined)
    Protection offered: None

    One of the more confusing RAID combination levels, RAID-00 really doesn't offer anything versus a regular RAID-0 array with the same number of drives except more overhead (not a lot, but it's there). Basically it takes your data, breaks it into a number of stripes equal to the number of constituent arrays, and each constituent array then breaks its stripe into sub-stripes and writes them to its disks. The combined effect is that your data is still broken into the same number of stripes and laid down in what is effectively the same manner as before, only you will have incurred buffer latency and a computational hit. Worse, your data still has no protection, so you can't even say you got that at least.

    Recommendation:
    RAID-00 is not recommended for any implementation, unless you want to humiliate yourself.
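    The double striping described above can be demonstrated with a small sketch (illustrative only, not any controller's actual code): striping across two constituent arrays, each of which sub-stripes across two disks, spreads blocks exactly as a plain 4-disk RAID-0 would.

```python
# Sketch: RAID-00 is just RAID-0 with extra steps (illustrative only).

def stripe(blocks, n):
    """Round-robin RAID-0 striping of blocks across n members."""
    members = [[] for _ in range(n)]
    for i, block in enumerate(blocks):
        members[i % n].append(block)
    return members

blocks = list("ABCDEFGH")

# RAID-00: outer stripe over 2 arrays, then each array stripes over 2 disks
raid00_disks = [disk for arr in stripe(blocks, 2) for disk in stripe(arr, 2)]

# Plain 4-disk RAID-0 for comparison
raid0_disks = stripe(blocks, 4)

# The same blocks land on the same number of disks with the same per-disk
# counts either way - only the disk ordering differs - so there is no gain.
```

    The only difference between the two layouts is which physical disk holds which column of blocks, which buys you nothing.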

    2b. RAID-01
    Min. Drives Required: 4 (RAID-1 requires two arrays, each made of a RAID-0 array which itself requires at least 2 drives)
    Protection Offered: One full RAID-0 array may fail

    This is a somewhat popular choice of nested RAID levels in some business roles, and deservedly so. It leverages the load-balancing and redundancy of RAID-1 and combines it with the raw throughput power of RAID-0, though it should be noted that the primary advantages will only be seen when the load balancing is in effect. Theoretically the performance can be stated as:

    The performance of a RAID-01 array will closely approximate that of either component RAID-0 array on its own for single read requests, but will provide superior overall handling of multiple simultaneous read requests. Write operations will be constrained to the speed of the slowest constituent hard drive.

    The breakdown is as follows:
    - For a single read request, the request will only be made to one of the RAID-0 arrays (unless your controller allows for single-job RAID-1 load balancing, which it likely does not) and thus can provide no speed advantage over a single RAID-0 array.
    - For multiple separate, simultaneous read requests it is likely the RAID-1 logic will allow load balancing. Thus for request A one of the RAID-0 arrays will be used, while request B will be filled by the other.
    - For write requests, the array will be constrained by the speed of the slowest of all hard drives used, because a write operation is not complete until it is complete on all drives.
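    The breakdown above can be put into a toy timing model (the numbers are illustrative, not measurements from any real controller): writes wait on the slowest disk anywhere in the array, while a single read is bounded by one RAID-0 side.

```python
# Toy RAID-01 timing model with made-up per-disk latencies (milliseconds).

side_a = {"a1": 8.0, "a2": 8.5}   # one RAID-0 array
side_b = {"b1": 8.2, "b2": 9.1}   # its mirror RAID-0 array

# Write: the mirror is not consistent until every disk has the data,
# so the operation waits on the slowest disk in the whole array.
write_ms = max(max(side_a.values()), max(side_b.values()))   # 9.1

# Single read: dispatched to one RAID-0 side only (no single-job RAID-1
# load balancing), so it finishes when that side's slowest disk finishes.
single_read_ms = max(side_a.values())                        # 8.5

# Two simultaneous reads: RAID-1 load balancing sends one to each side,
# so both proceed in parallel instead of queuing on one array.
```

    The model makes the asymmetry obvious: reads scale out across sides under load, while every write pays for the worst disk in the whole set.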

    Recommendation:
    RAID-01 is recommended for environments where high volumes of read requests are encountered. Web and (read only) file servers are two prime candidates for this combination because they are likely to encounter multiple simultaneous requests which need prompt service. It should be noted however that RAID-10 is the preferred version of this type of nesting, as the volume of data which must be regenerated after a disk failure is lower.

    Primary Disadvantages:
    1. RAID-01 arrays rebuild slowly compared to RAID-10 arrays, by virtue of the fact that they must rebuild an entire RAID-0 set of disks (versus a single disk); and
    2. RAID-01 arrays cannot make use of any RAID-1 enhancement (e.g. elevator seek) beyond basic load balancing.
    For both of these reasons, RAID-10 is the suggested configuration for most environments.


    2c. RAID-10
    Min. Drives Required: 4 (RAID-0 requires two arrays, each made of a RAID-1 array which itself requires 2 drives)
    Protection Offered: Any 1 drive in any constituent RAID-1 array may fail

    RAID-10 is a popular choice among many enthusiasts, though that doesn't necessarily mean it's actually the "best". It does offer fair data protection against potentially two or more disk failures (depending on the number of RAID-1 arrays in use), as well as an increase in speed due to RAID-0's striping, but in reality the behavior is very similar to that of RAID-01 (with two differences I will mention later). Theoretically the performance can be stated as:

    The performance of a RAID-10 array will closely approximate that of a RAID-0 array with half the total number of drives for single read requests - except with possibly better access time - and will provide superior overall handling of multiple simultaneous read requests. Write operations will be constrained by the speed of the slowest constituent hard drive.

    The breakdown is as follows:
    - For a single request, the request will be divided and sent to each constituent RAID-1 array. Each RAID-1 controller may then select the drive closest to the requested data to retrieve it, thus delivering a faster access time than an ordinary RAID-0 array.
    - For multiple read requests, the RAID-0 logic *MAY* (vendor discretion) pass along the full requests simultaneously to each RAID-1 controller, which may then implement per-job load balancing. This in effect puts all drives to use and can offer much higher total throughput versus a single request. It should be noted that if the software or controller does not allow requests to be passed until the first request is fulfilled at the RAID-0 or earlier level, this advantage in multiple request load balancing will be lost.
    - For write requests, the array will be constrained by the speed of the slowest of all hard drives used, because a write operation is not complete until it is complete on all drives.
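    The layout behind this breakdown can be sketched as follows (a minimal illustration assuming 2 mirror pairs, i.e. 4 disks; the function names are hypothetical): the RAID-0 layer stripes across the pairs, and each pair mirrors its stripes.

```python
# Sketch of a 4-disk RAID-10 layout: stripe across mirror pairs.

def raid10_layout(blocks, n_pairs=2):
    """Return a list of (disk0, disk1) tuples, one per mirror pair."""
    pairs = [([], []) for _ in range(n_pairs)]
    for i, block in enumerate(blocks):
        left, right = pairs[i % n_pairs]
        left.append(block)    # each disk in a pair gets a full copy
        right.append(block)   # of that pair's stripe blocks
    return pairs

pairs = raid10_layout(list("ABCD"))
# pairs[0] holds stripes A, C on both of its disks; pairs[1] holds B, D

def survives(failed, n_pairs=2):
    """True if at least one disk per pair is intact; failed = {(pair, side)}."""
    return all(any((p, s) not in failed for s in (0, 1)) for p in range(n_pairs))
```

    This also makes the protection rule concrete: any one disk per pair may fail, so some two-disk failures are survivable while others (both halves of one pair) are not.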

    Recommendation:
    Although the costs can quickly accumulate for RAID-10, I would have no problem recommending it for gaming and workstation use, and assuming the controller can gracefully handle multiple requests (not one at a time), then for web/file server use as well. RAID-10 is preferable to RAID-01 in many instances where uptime is critical because less data regeneration must occur.

    Please Note: If your controller does not support RAID-1 optimizations such as elevator seek, this level will offer little compared to its potential.

    2d. RAID-11
    Number of drives required: 4
    Protection offered: Up to 3 drives may fail (75% loss tolerated!!)

    A classic. RAID-11 is, as the name implies, a mirrored set of arrays where each component array is itself a set of mirrored disks. The redundancy advantages are obvious, the speed advantages are fairly straightforward, and the loss of potential disk space is outrageous. Theoretically the performance can be stated as:

    The performance of a RAID-11 array for a single read request will equal that of a single disk, though with a lower average seek time. Multiple mirrors enable higher levels of multiple simultaneous requests with little degradation in speed as opposed to other nested RAID levels. Write operations will be constrained by the speed of the slowest constituent hard drive.

    The breakdown is as follows:
    - Single read requests will only be serviced by a single disk, but RAID controller optimizations may allow each mirror controller to use whichever of its disks is in the best position to fulfil the request. Thus the average seek time should be less than that of a single disk. fishfacefignewton.
    - Multiple read requests will be load balanced between the two RAID-1 disk arrays, providing the same benefits as regular RAID-1; after that, the constituent arrays themselves will also load balance, providing non-degraded service for up to 4 simultaneous requests. The various queuing optimizations in RAID-1 should also lean towards better scaling of service speed with the number of requests versus other nested array types (assuming 4 disks only, of course).
    - For write requests, the array will be constrained by the speed of the slowest of all hard drives used, because a write operation is not complete until it is complete on all drives.

    Recommendation:
    On a per-GB level, this is extremely hard to justify for home use. Businesses with high continuity needs for some files may implement this in some places, but realistically there's a point where you just have to learn to do backups regularly.



    Section 3: FAQ


    Q: What is the effective difference between RAID-01 and RAID-10?
    A: RAID-01 and RAID-10 do have some distinct differences. For enterprise setups, the largest factor to keep in mind is the array rebuild time in the event of a disk failure. Should a disk fail, a RAID-10 array only has to rebuild a single disk, copied from its single partner disk. A RAID-01 array, however, must rebuild an entire RAID-0 array... which necessarily means 2 or more disks. For home users, the primary factor which makes RAID-10 the better option is that if your RAID card implements proper RAID-1 optimizations, access times should decrease versus a RAID-01 array. To explain this a little better: in a RAID-10 array, it is the RAID-1 arrays which are imposed on the physical drives themselves, and hence they can track which drive will have the shortest access time to what data. With a RAID-01 array, however, it is the RAID-0 arrays which are imposed on the physical drives, and the RAID-1 layer cannot implement any optimizations beyond simple load balancing.
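    The rebuild-volume difference can be put in rough numbers. This is a sketch with made-up figures (1000 logical blocks on a 4-disk array), not any vendor's actual rebuild behavior:

```python
# Rough rebuild-volume comparison for a 4-disk array (illustrative figures).

array_blocks = 1000   # logical blocks stored on the array

# RAID-10: only the failed disk is rebuilt, copied from its mirror partner.
# With 2 mirror pairs, each disk holds half of the logical blocks.
raid10_rebuild_blocks = array_blocks // 2     # 500 blocks, read from 1 disk

# RAID-01: the entire failed RAID-0 side is re-mirrored from the surviving
# side, which means a full copy of the data spread across 2 or more disks.
raid01_rebuild_blocks = array_blocks          # 1000 blocks
```

    Half the data to copy, read from a single surviving disk rather than a whole surviving array, is why RAID-10 recovers faster.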

    More coming - I just feel my time would be better spent right now reviewing this collection of posts for accuracy, grammar, omissions, etc. rather than adding paragraphs about combinations that no-one here will make any use of in their personal life.
    Last edited by Serra; 09-30-2007 at 03:26 AM.
