RAID And You (A Guide To RAID-0/1/5/6/xx)
As some of you may recall I had meant to post this very thing around March... well, here it is July and it's finally starting to get put up. I blame this on a combination of work, school, and losing the folder I had 3 of these posts already written in. And Neptune, that planet's never done anything for me.
I won't call my posts as thorough on the topic as an RFC - RFCs exist for a reason and are freely available. My goal here is to provide an understanding of the various RAID levels in enough detail that the average OCer can understand the pitfalls of RAID-0, learn the overlooked possibilities of RAID-1, and know why they need more advanced hardware to make effective use of RAID-5. Still, *if after reading everything carefully* you feel I have omitted anything crucial (quite likely in these early stages) or have gotten something wrong, feel free to post here to let me know. I will be sure to examine every post relating to the quality of this guide and try to make changes accordingly or at least justify why I did not make a change.
Some of you may wonder why I don't do a summary of all these RAID levels and technologies somewhere and basically just tell you how to set things up. The reasons are simple: tables and charts do not provide an understanding of the technology, and it's not my job to tell you how to do things. All this guide is meant to do is inform you about the different technologies available and allow you to draw your own informed conclusions as to what is best for your particular situation.
Master Index:
Post 1: Overview
Post 2: Revised Storage Technology Guide
Post 3: RAID-0
Post 4: RAID-1
Post 5: RAID-5 & -6
Post 6: RAID Level Combinations (coming soon)
Please note that this is NOT a "What To Buy" guide. Although I may make mention of some particular hard drives or controllers, this post concerns only the technologies used in hard drive systems. For advice on what controllers to get, check out the latest benchmarks available. This guide is intended to be informative only and that will not change until a controller company steps up and sends me something nice (j/k - or am I?).
Storage Technologies - A Few Facts To Get You Started
** Heavy editing and additions to come, this first edition simply has RAID-related topics cut out of the original **
This section is meant to complement the rest of the guide by providing a primer for the technologies that are used in the background in RAID arrays.
Index:
Section 0: Some Quick Definitions To Get Out Of The Way
Section 1: Hard Drive Specification Types
Section 2: Physical Data Placement And Speed (A Must Read For Page File / Swap Partitions)
Section 3: Controller Duplexing and "Hot-S" Failover Mechanisms
Section 4: (Almost) Universal RAID Primer
Section 5: The Windows Page File [aka Virtual Memory]
Section 6: FAQ
Q. I have a SATA II drive and a motherboard that only supports SATA I. Will I see a boost in speed if I get a board that supports SATA II?
Q. Will I see a difference between a drive with an 8MB cache and a drive with a 16MB cache?
Q. What is Perpendicular Recording?
Q. I'm currently using a RAID-(whatever) array, Why would I still need backups?
Section 7: References / Further Reading
Section 0: Definitions Used
1. Access Time - The time it takes for the heads of a hard drive to get *to* the data, not to read it.
2. Read Time - The time it takes to read data once the heads have gotten to it.
3. Sequential Read - Reading data in the order it is laid out on the platter: a perfectly defragmented disk can read a file's blocks A->Z in a single pass without having to seek to find the data somewhere else on the drive.
4. "small" file - For the purposes of this guide a "small" file is considered to be a file of such a size that it takes the heads longer to get to the starting point of the file than it does to read the file itself.
5. "large" file - This guide will consider a "large" file to be a file of such a size that the time it takes for the heads to reach the beginning point of the data is trivial compared to the time it takes to read it from the platter.
6. Random Seek - This term is actually quite rare in this guide, but prevalent in many others. In essence it usually refers to the operation of looking for a file. This guide uses the term "small file" instead because I believe there is a crucial semantic difference between the two: if the seek time does not outweigh the read time, it is not the major source of latency and therefore not the largest issue.
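The "small" vs. "large" distinction above boils down to simple arithmetic. Here is a minimal sketch using assumed figures for a 7200rpm desktop drive of the era - the access time and transfer rate are hypothetical placeholders, not measurements of any particular model:

```python
# Break-even point between a "small" and a "large" file, per the
# definitions above: "small" = positioning takes longer than reading.
ACCESS_TIME_S = 0.013    # assumed average access time (13 ms)
TRANSFER_MB_S = 60.0     # assumed sustained sequential read rate

# A file is "small" while its read time is under the access time,
# so the break-even size is simply access_time * transfer_rate.
break_even_mb = ACCESS_TIME_S * TRANSFER_MB_S
print(f"break-even size: {break_even_mb * 1024:.0f} KB")
```

With these (assumed) numbers, anything under roughly 800KB spends more time being found than being read - exactly the regime where access time, not transfer rate, dominates.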
Section 1: Hard Drive Specification Types
1a. ATA/EIDE/PATA [hereby referred to as PATA]
1a-i. Master/Slave Relations
1b. SATA I/II
1c. SCSI
1a. ATA/EIDE/PATA [Speed <= 133MBytes/s (or <= 1Gbit/s)]
For various historical reasons, you may find these drives labelled as any of the above. For the sake of my wrists, we'll just call them by the most appropriate (and currently manufacturer-supported) name: PATA drives. These are the last-gen drives we all cursed because the thick ribbon cables would frequently interfere with our airflow or just look plain ugly. The speeds of these drives vary according to which exact part of the standard the drive is from, but realistically you aren't particularly likely to see anything slower than ATA/133 speeds (133MB/s maximum transfer speed) nowadays. Of all the drive types, this is the only one for which a Master/Slave configuration matters.
1a-i. Master/Slave Relationship
As noted above, a Master/Slave relationship is important ONLY when using PATA devices. Basically, for each channel (PATA plug on your motherboard), there can only be 1 "master" drive (including cables with a hard drive and CD-ROM plugged in). The actual use of this relationship is now outdated, but it is good to know because, if configured incorrectly, it may cause one or both of your drives to stop working. To make this long and irrelevant story short, there is only one method you should ever use when configuring these devices: move your jumpers to "Cable Select". Modern PATA cables have a pin blocked for one device to indicate to it that it is the slave device, and this configuration issue has not made a difference to how things operate since '97 or earlier.
1b. SATA I/II [Speed = 1.5Gb/s & 3.0Gb/s]
SATA I/II drives use a different cabling standard than their PATA predecessors. Instead of a bulky, difficult ribbon cable, they each use a streamlined serial cable. Physically, a SATA I and a SATA II cable are identical (and interchangeable); the only effective difference lies in the ability of your motherboard's chipset and your hard drive to operate at one speed or the other. Electrically, SATA I devices have a maximum theoretical operating speed of 1.5Gb/s, whereas SATA II devices have a maximum theoretical operating speed of 3Gb/s. Being fully backward-compatible, SATA II devices can be plugged into SATA I ports (and vice versa), though they will operate at the speed of the lower standard. Realistically, as of March 2007 no commercially available hard drive I am aware of is able to saturate the SATA I bandwidth even when bursting, so let's just be clear right now that the WD Raptors are not slowed down by being only SATA I devices.
1c. SCSI [Speed = Dependent upon standard and number of devices]
SCSI technologies are beyond the scope of this guide, and so will only briefly be mentioned. These are the drives every enthusiast dreams of: 10,000rpm+ spindle speeds, enough bandwidth to fit 4 drives utilizing their full speed before any bottleneck occurs, and a heat tolerance that would make a Prescott P4 jealous. Of course, the considerably higher cost, lower capacity, and specialized settings/requirements have ensured that these drives are off-limits to most individuals. Beyond this, there are so many different SCSI types, cables, and considerations that one could (and many have) devote entire web pages to the topic.
Section 2: Physical Data Placement And Speed
2a. A Quick Guide To Rotational Platter Physics
2b. Multiple Zone Bit Recording / Zoned Bit Recording
2c. CAV (Constant Angular Velocity)
2d. CLV (Constant Linear Velocity)
2e. (As A Result Of The Above) - Placing Partitions For Speed
2a. A Quick Guide To Rotational Platter Motion Physics
In order to properly understand the rest of this section, this is pretty much a "must read". Luckily, it's not a hard concept. The basic premise is that if you have something carving out a circular rotation, the matter farthest from the center of rotation will be moving faster than the matter closer to the center. Why? Because the matter farther from the center requires the same amount of time to complete one full revolution, but it travels a greater distance. Hence, it must be moving faster. Now apply this to a hard drive. What you basically get is that, *assuming* the platter speed remains the same across the entire volume regardless of head placement, the outer edge of the disk will be moving notably faster than the inside. To help muddy the waters, computers actually see a hard drive's outer edge as the "start" of the disk and record inward toward the "end".
This is one of the primary reasons why disks seem to slow down as more information is put on them... it takes more time to get the same amount of information from the sections of the disk closer to the center than it does from the outside (though the discussion on this gets deeper in the next subsections).
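A quick back-of-the-envelope sketch of the effect, assuming hypothetical inner and outer data radii for a 3.5" platter - the radii are illustrative, not taken from any real drive:

```python
import math

RPM = 7200               # constant angular velocity (see 2c)
INNER_RADIUS_MM = 15.0   # assumed innermost data radius
OUTER_RADIUS_MM = 45.0   # assumed outermost data radius

def surface_speed_m_s(radius_mm: float, rpm: float = RPM) -> float:
    """Linear speed of the platter surface passing under the head."""
    circumference_m = 2 * math.pi * (radius_mm / 1000.0)
    return circumference_m * (rpm / 60.0)

inner = surface_speed_m_s(INNER_RADIUS_MM)
outer = surface_speed_m_s(OUTER_RADIUS_MM)
print(f"inner: {inner:.1f} m/s, outer: {outer:.1f} m/s")
# With roughly constant data density (see 2b), sustained throughput
# scales with this surface speed - here the outer edge wins 3:1.
```

Since the assumed outer radius is three times the inner one and the platter spins at a fixed RPM, three times as much track passes under the heads per second at the outer edge.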
2b. Multiple Zone Bit Recording / Zoned Bit Recording
Effectively, this is the name used to describe how data is organized on a hard drive (in tracks and zones) and why the data density is at least approximately the same throughout a disk. I would explain in detail, but I was able to find a webpage devoted to the topic which not only covers it better than I could, but *also* includes benchmarks (albeit on a 3.8GB disk... but the idea should be the same). The webpage is at: http://www.pcguide.com/ref/hdd/geom/tracksZBR-c.html
2c. CAV (Constant Angular Velocity)
Basically this is the acronym used to describe most hard drives since the early 90's and many (but not all) CD drives nowadays: the disk rotates at a fixed speed regardless of where the read/write heads are.
2d. CLV (Constant Linear Velocity)
This is the older design wherein the rotational speed of the drive changes depending upon which zone is being read, so that the data passes under the heads at a constant linear speed. Effectively, this means the drive has a "max" rotational speed but slows down sometimes. This is still used in some CD drives, but as far as I know it is no longer used in hard drives (instead the drive stays at the same rotational speed, and less is accomplished as the heads move inward on the drive, i.e. toward the end of the drive as the computer sees it).
2e. (As A Result Of The Above) - Placing Partitions For Speed
Alright, now that we know that the physical outside of the disk (or the "start" as the computer sees it) is actually read faster, does anyone want to take a guess at what that means for you speed-wise? Well, just in case: it means that partitions containing data to which you would ideally prefer to give access-speed preference (ie. swap and page file partitions) should be placed at the "start" of the disk for best performance.
Section 3: Controller Duplexing and "Hot-S" Failover Mechanisms
3a. Controller Duplexing
3b. Hot-Swap
3c. Hot-Spare
3a. Controller Duplexing
This is a topic which can be approached from two directions: redundancy / fault tolerance and speed. The basic physical idea behind both is that if you are planning on building an array (or combination array) of disks, instead of plugging all disks that will be used into one hard drive controlling device, you could use multiple controllers instead and span the disks between them.
Commercial environments with strict uptime requirements, for example, often use this because although they may have implemented a RAID array to prevent any one disk failing from affecting their uptime, if the array controller dies it doesn't matter that the data is still retrievable - it can't be accessed. A simple example could involve two PCI hard drive cards in a computer with a two-disk RAID-1 array. Instead of plugging both hard drives into one card, each hard drive is plugged into its own card, thus preparing the computer for the failure of either a hard drive or a hard drive controller.
But this is Xtreme Systems, so let's focus more on the speed issue. Let's say that you have the money and desire to build a RAID 0/5 array (data is broken into two stripe streams, each of which is stored in a separate RAID-5 array, providing redundancy to the data) with 6 disks. Now you could go out and find a controller that can handle nested RAID levels for you and that has the power necessary to do all the calculations for each of your 3 arrays... or you could get two controllers and offload half the work to each. Basically, you would set up 3 drives on each controller as a RAID-5 array, and then create a RAID-0 array between the two controllers (each of which will appear as a single drive to any external card/onboard solution/software solution). By doing this, you can not only buy much cheaper cards (each card can afford to have lower specs and fewer ports), but you get a speed advantage as well. Because each card only has to worry about its own RAID array (not both), the latency for calculations is effectively cut in half, any onboard memory the controller cards may have is effectively doubled, and so is the total bandwidth (each card having its own connection to the motherboard). While certainly an expensive proposition, controller duplexing is a must for any enthusiast with more money than things to do with it.
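To make the layout concrete, here is a toy model of that duplexed RAID 0/5 setup. It is purely illustrative: real controllers work in fixed stripe sizes and rotate the parity block between member disks, neither of which this sketch does.

```python
# Toy RAID 0/5: a RAID-0 stripe split across two controllers, each
# running its own 3-disk RAID-5 array (2 data disks + XOR parity).
def xor_parity(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def write_stripe(chunk_a: bytes, chunk_b: bytes) -> dict:
    """One RAID-0 stripe: chunk_a goes to controller A, chunk_b to
    controller B; each splits its chunk over its 3-disk RAID-5 set."""
    arrays = {}
    for name, chunk in (("A", chunk_a), ("B", chunk_b)):
        half = len(chunk) // 2
        d0, d1 = chunk[:half], chunk[half:]
        arrays[name] = {"disk0": d0, "disk1": d1, "parity": xor_parity(d0, d1)}
    return arrays

layout = write_stripe(b"ABCD", b"EFGH")
# Losing any one disk in an array is recoverable: d0 == d1 XOR parity
a = layout["A"]
recovered = xor_parity(a["disk1"], a["parity"])
print(recovered)  # b'AB'
```

The point of the duplexing argument is visible in the structure: each controller only ever computes parity for its own 3-disk set, while the RAID-0 split between them is trivial.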
3b. Hot Swap
Hot Swapping is the ability to remove a component and replace it while the computer is still running. Depending on the system, this could realistically apply to any component (though for a motherboard replacement, you're looking at more of a clustering technology than a single-computer environment). In reference to a discussion on hard drive technologies, this means that your hard drive controller allows you to remove a dead disk, insert a new one in its place, and have it detected without shutting down the system. This is most useful in environments where hot sparing is not an option, uptime must be maintained, and a fault-tolerant RAID array exists (though the last part is optional). For examples and special considerations, see subsection 4c - Rebuilding Failed Arrays.
3c. Hot Spare
Hot Sparing works somewhat like Hot Swapping, but more automated. When a controller is configured to use Hot Sparing, once a hard drive failure is detected, a user-predetermined disk (or set of disks, or array of disks) that was previously unused by the controller springs into action and replaces the failed disk. It should be noted that although the disk's functionality has been automatically replaced, you no longer have a spare standing by if another disk fails; it is therefore advisable that you remove the dead disk, replace it with a new working one as soon as possible, and configure your controller to use the new disk as the hot spare.
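The failover logic can be sketched roughly as follows. This is a hypothetical controller model for illustration only, not any vendor's actual firmware behavior:

```python
# Toy model of hot-spare failover: on a member failure, an idle spare
# is promoted into the dead disk's slot (a real controller would then
# rebuild the array's data onto that spare in the background).
class Array:
    def __init__(self, members, spares):
        self.members = list(members)   # active disks in the array
        self.spares = list(spares)     # idle hot spares

    def on_disk_failure(self, dead):
        if not self.spares:
            raise RuntimeError("no hot spare left - array stays degraded")
        spare = self.spares.pop(0)                      # promote a spare
        self.members[self.members.index(dead)] = spare  # take the dead slot
        return spare

arr = Array(members=["disk0", "disk1", "disk2"], spares=["spare0"])
arr.on_disk_failure("disk1")
print(arr.members, arr.spares)  # ['disk0', 'spare0', 'disk2'] []
```

Note the final state: the spare pool is now empty, which is exactly why the paragraph above tells you to replace the dead disk and designate a new spare promptly.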
Section 4: (Almost) Universal RAID Primer
4a. What is RAID (as a whole)?
4b. What is a RAID controller?
4c. What Does "stripe size" Mean?
4d. What is the best stripe size for me?
4e. A Quick-and-Dirty RAID Level Comparison
4a. What is RAID (as a whole)?
RAID stands for Redundant Array of Inexpensive Disks. In the original literature the "I" stood for "Inexpensive"; many publications have since substituted "Independent", and both expansions remain in common use. Nomenclature aside though, all levels of RAID have a common goal: to team together multiple physical hard drives to provide better performance and/or redundancy, and to present this team (or Array) as a single disk to the Operating System. This is not to be confused with JBOD (Just a Bunch Of Disks), whose purpose is to bundle multiple hard drives together logically, but which simply fills one, then the next, then the next, and so on.
RAID has a number of flavors, or levels, and each level has a number associated with it. Today RAID levels 0, 1, 5, and 6 are common, while 3, 4, and 7 are not. Despite the naming convention, level 0 is not actually redundant in any way.
4b. What is a RAID controller?
A RAID controller is either a hardware or software device which performs the operations necessary to bind the multiple hard drives together, makes them work according to the various RAID specifications, and presents the array as a single hard drive to the Operating System.
Software RAID controllers (as found in most onboard solutions and add-on cards that cost under $300US) use the computer processor to make any calculations necessary to perform this function. For some RAID levels this is acceptable as their processor overhead is quite low, while for others it is unacceptable.
Hardware RAID controllers, on the other hand, contain their own processing engines capable of performing all of the tasks required. This eases the burden on the rest of the system and provides a faster response time. Most hardware controllers also contain a certain amount of cache, and many contain RAM (often expandable), which allows the controller to aggressively prefetch data it thinks may be requested and supply it to the computer at a speed greater than would otherwise be possible by querying the hard drives themselves for it.
4c. What Does "stripe size" Mean?
When creating a RAID array one of the first choices a person has to make (after how many disks will be included) is what stripe size to use. The "stripe size" basically refers to how large the blocks will be that data is broken up into before writing to a disk (and, conversely, how large each block of data to be read from a disk will be). The values available are typically between 16k and 256k, with many manufacturers recommending a value of 64k be chosen. This may not seem like a terribly relevant piece of information, but it will become more valuable when we start drilling down into the specifics of RAID.
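A minimal sketch of the idea - illustrative only, since real controllers work at the block-device level rather than on Python byte strings:

```python
# "Stripe size" in miniature: data is cut into stripe-sized blocks and
# dealt round-robin across the member disks of the array.
def stripe(data: bytes, stripe_size: int, n_disks: int):
    disks = [[] for _ in range(n_disks)]
    blocks = [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)  # block 0 -> disk 0, block 1 -> disk 1, ...
    return disks

# 64-byte blocks standing in for a 64k stripe, spread over 4 disks:
layout = stripe(bytes(range(256)), stripe_size=64, n_disks=4)
print([sum(len(b) for b in d) for d in layout])  # [64, 64, 64, 64]
```

With a 256-byte "file" and a 64-byte stripe, each of the 4 disks ends up holding exactly one block - which is why stripe size governs how many drives a file of a given size touches.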
4d. What is the Best Stripe Size for me?
I'll say it right now - I hate the topic of choosing a RAID stripe size. Everyone wants a clear-cut answer for what their stripe size should be, but the honest truth is that there is none. The best I can really offer is that companies such as Areca, Promise, Highpoint, and Intel have all conducted their own tests and have all independently concluded that they should set their products to use a 64k default. From there it's up to you as to whether you want to go larger or not, and I'll offer some guiding tips on how to decide:
Quote:
Originally Posted by From Soulburner, not sure of original source
Decreasing Stripe Size: As stripe size is decreased, files are broken into smaller and smaller pieces. This increases the number of drives that an average file will use to hold all the blocks containing the data of that file, theoretically increasing transfer performance, but decreasing positioning performance.
Increasing Stripe Size: Increasing the stripe size of the array does the opposite of decreasing it, of course. Fewer drives are required to store files of a given size, so transfer performance decreases. However, if the controller is optimized to allow it, the requirement for fewer drives allows the drives not needed for a particular access to be used for another one, improving positioning performance.
So what should you use for a stripe size? The best way to find out is to try different values: empirical evidence is the best for this particular problem. Also, as with most "performance optimizing endeavors", don't overestimate the difference in performance between different stripe sizes; it can be significant, particularly if contrasting values from opposite ends of the spectrum like 4k and 256k, but the difference often isn't all that large between similar values.
4e. A Quick-and-Dirty RAID Level Comparison
Section 5: The Windows Page File [aka Virtual Memory]
5a. What is a Page File? Why do I need one?
5b. Okay, I've decided to have a Page File... what are the best practices?
5a. What is a Page File? Why do I need one?
A Page File is a volume on your hard drive that your operating system treats somewhat like RAM, in that it stores pieces of applications that it may need while the application is running. This is important because the majority of programs do not use all of their code at any one time, and so loading the entire program into memory would be an extremely inefficient use of prime real estate (and indeed, the program may not *fit* in the first place). Consider, for example, your operating system. While some of you may have enough RAM in your rigs to handle it, *I* for one couldn't load all the parts of my operating system that are put into RAM/the page file, plus any useful application, into the amount of RAM I have. It also comes in handy in the event of an unexpected system shutdown. If possible, Windows will try to save what is currently in its RAM to the page file so that you do not lose any work you had been doing. Lastly, if you are a mobile user who uses the hibernate function, a closely related mechanism is at work: going into hibernation copies the contents of your RAM to a file on disk (the hibernation file, hiberfil.sys).
The question of whether a person should use a paging file comes up regularly around XS. Strictly speaking, no, you do not need a paging file to run Windows, and this is where a lot of confusion comes in. People feel that because they do not have a paging file, programs are loaded entirely into memory and/or that some form of overhead related to a system process is eliminated. These are false suppositions. Without a paging file your program is not loaded entirely into memory, and there is no service you disable by not using one (it's simply not used; the algorithms are still calculated). What is the result of not having a page file, then? For starters, you have less control over where your less frequently accessed bits of applications are stored, and you can find that they are located at slower ends of your hard drive or that the parts that would have been loaded into your page file are fragmented. Further, for those of you who have multiple hard drives, this paging activity is stuck on your primary (and theoretically more heavily loaded) hard drive, which further increases read times. For an explanation of these downsides (and how to correct them) see section 5b.
5b. Okay, I've decided to have a Page File... What are the best practices?
After the question of whether one needs a page file or not has been addressed (5a), there are two primary Page File best practices left to go through before deciding upon an implementation plan: Size and Location.
Size:
Microsoft suggests that your page file be set to 1.5x the amount of RAM in your system. They suggest this particular rule of thumb because a full system (crash) dump can copy the entire contents of your RAM into your page file (which may also have had other information in it) and because systems with insufficient RAM may require a large page file if they are using an application that requires its data be RAM-resident. Frankly, with the cost of hard drive space nowadays, I say go with 1.5x. If you happen to have 4+ GB of RAM and don't feel you require 6+ GB of Page File, you're probably right.
Regardless of the size you feel your Page File should be, it is suggested that you set the maximum and minimum sizes to the same value. This prevents Windows from dynamically changing the size of your page file (which you don't need, because you already figured out how much you need - right?), which can result in fragmentation and unnecessary disk overhead at critical times (ie. enlarging your page file at the exact moment it is under pressure - if it was large enough in the first place, that step would never need to happen).
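The arithmetic from the two paragraphs above, as a quick sketch (the 1.5x factor is Microsoft's rule of thumb; the function name is mine):

```python
# Page file sizing per the 1.5x rule of thumb, with min == max so
# Windows never resizes (and so never fragments) the file.
def page_file_mb(ram_mb: int, factor: float = 1.5) -> dict:
    size = int(ram_mb * factor)
    return {"initial_mb": size, "maximum_mb": size}  # identical on purpose

print(page_file_mb(2048))  # {'initial_mb': 3072, 'maximum_mb': 3072}
print(page_file_mb(4096))  # {'initial_mb': 6144, 'maximum_mb': 6144}
```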
Location:
You have three options for the location of your Page File. The first is to put it on a different partition on your primary (boot) drive. By locating your Page File in its own partition (according to Microsoft) your page file will not become fragmented. Whether that's true or not, by partitioning your hard drive with a separate Page File partition you do get to put it at a marginally faster part of your hard drive. The second option is to put your Page File on a second, less frequently used hard drive. This is an excellent option because (obviously) without any applications or systems using that hard drive, your Page File will have the full bandwidth of the hard drive bus at its disposal. Your final option is to split your Page File across two hard drives, with some on your boot partition and some on a separate partition of your second drive. According to Microsoft this is the *best* option. According to the Triforce of Wisdom employed by the technical writers of Microsoft TechNet, the algorithm that decides where to page data will always favor the hard drive which is used less frequently. Personally, I really don't put a lot of stock into this. It also offers a benefit useful for code debugging, but I'll leave that as out of the scope of this guide.
Please be sure to also read the section on physical data placement and its effect on speed if you plan on using a separate page file partition.
Section 6: FAQ
Q. I have a SATA II drive and a motherboard that only supports SATA I. Will I see a boost in speed if I get a board that supports SATA II?
A. The short answer: No. The longer-than-short-but-not-long answer: Yes, slightly. The long answer: Although there will be an increase in speed, unless there's something very wrong you shouldn't notice it. This is because, strictly speaking, consumer-grade hard drives are not yet able to read data from their platters fast enough to saturate even the PATA/133 bus for a sustained period of time, let alone a SATA I/II bus. At best, most 7200rpm drives can peak at PATA speeds when bursting, but sustained speeds are significantly lower. With that being said, your data will *technically* move its way from the top of the SATA cable to the bottom almost twice as fast, but that is absolutely not the bottleneck of the system, and so the difference is effectively zero.
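The numbers behind that answer, sketched out. SATA uses 8b/10b encoding on the wire, so each data byte costs 10 line bits; the 70 MB/s sustained figure below is an assumption standing in for a fast 7200rpm drive of the era, not a benchmark result:

```python
# Usable bus ceiling vs. what a drive can actually sustain.
def usable_mb_s(line_rate_gbit_s: float) -> float:
    # 8b/10b encoding: 10 bits on the wire per data byte delivered.
    return line_rate_gbit_s * 1e9 / 10 / 1e6

SUSTAINED_MB_S = 70.0  # assumed sustained read of a fast 7200rpm drive

for name, rate in (("SATA I", 1.5), ("SATA II", 3.0)):
    ceiling = usable_mb_s(rate)
    print(f"{name}: {ceiling:.0f} MB/s ceiling, "
          f"drive uses {100 * SUSTAINED_MB_S / ceiling:.0f}% of it")
```

Either way (roughly 150 vs. 300 MB/s of usable bandwidth) the drive, not the bus, is the bottleneck - which is the whole point of the answer above.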
Q. Will I see a difference between a drive with an 8MB cache and a drive with a 16MB cache?
A. Like anything else with hard drives, it depends. If you just cruise around the Internet on your computer, you could probably have a 1MB cache and never really notice. If, on the other hand, you do work that involves predictable reads, you will see more of a difference. For clarity, when I say 'predictable', I mean your computer anticipates what data it needs and holds it in cache before you specifically request it (ie. it takes the cluster you asked for and gives it to you, but may hold the next few clusters in cache because chances are you will want them too). But will you see the difference in your everyday life? If you're a gamer you may notice something depending on your games of choice (but I wouldn't hold my breath), if you do a lot of file manipulation it certainly won't hurt, and if you're all about benchmarks then definitely yes.
Q. What is Perpendicular Recording?
A. http://www.hitachigst.com/hdd/resear...Animation.html
Sorry, I'm just not even going to try to compete with that video. The only thing I will add is that it also allows for faster reads, because each sweep of the heads at the same rotational speed covers more bits than it would if the disk were not using perpendicular recording.
Q. I'm currently using a RAID-(whatever) array, Why would I still need backups?
A. Although you may have a super-redundant RAID-6+1+1+6+1 array with 12 hot spare drives standing by, all on the fastest controllers ever produced jam-packed with ECC memory, there's still a chance you'll lose your data. I'm not talking about loss from a single hard drive, or even a dozen hard drives all at once. What I am talking about is data corruption. If somewhere in your data processing an unstable system component (ie. overclocked processor or memory) produced invalid results and writes those to the hard drive, your data is corrupted and no amount of redundancy will bring it back. Similarly, RAID also doesn't do anything to hamper viruses. Your only defense against either of these is to make backups at regular intervals, and to be safe you should be sure to keep old copies for a period of time (because, for example, you may discover you picked up a virus three weeks ago and didn't notice... if you have overwritten your last backup, you're still going to be infected no matter what).
Section 7: References / Further Reading
http://en.wikipedia.org/wiki/Hard_disk
http://en.wikipedia.org/wiki/Parallel_ATA
http://en.wikipedia.org/wiki/SATA#SATA_3.0_Gb.2Fs
http://support.microsoft.com/kb/314482 <- Windows Page File Information
http://www.pcguide.com/ref/hdd/geom/tracksZBR-c.html <- Zone Bit Recording / How Physical Placement Affects Speed