
Looking for 1TB drive recommendations for RAID-10 fileserver



halo112358
11-03-2008, 05:30 PM
I'm doing a buy sometime in the next day or two to build another office file server and I'm curious which TB drives people are preferring these days. Any thoughts?

I tend to prefer Samsung F1 drives for desktop machines, are they any good in a file server context?

coo-coo-clocker
11-05-2008, 07:21 PM
"office file server" can mean a lot of different things.
What kind of usage patterns? What's the budget? How many (concurrent) users? How much space do you need? Availability?

halo112358
11-08-2008, 07:00 PM
No more than 5 concurrent users, more likely just 1-3 at any time. The machine is used to store research data for court cases (lots of digital photos and HD video) in addition to documents and programs. Most of the time people grab the data they want off the machine, copy it to their workstation, make modifications and then update the server later. My target is somewhere in the range of 3-5TB usable space in the array, we have 1TB now and I want the machine to be good for another ~2 years or so. I'd like to be able to lose 2 drives without requiring a restore from backup.

Right now I'm leaning towards a RAID6 with 6 1TB drives - the current machine is a 1TB array with 5 250GB drives. This machine has been reliable for the last 18 months, but A: we're running out of space and B: only being able to lose 1 drive makes me a little nervous. I'd much rather be able to lose 2 drives than have to restore from backup.

Budget is ~$1k US for drives and controller cards, though I'd prefer to use Linux md raid if I can - this machine won't be doing anything else, and with a dual (or quad) core processor I'd like to use the CPU to handle the RAID calculations. I'd rather spend more money on drives than on a $400 controller card.
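On the md side I'm picturing something like this (rough sketch, device names are placeholders):

mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]   # 6x 1TB RAID-6, ~4TB usable
cat /proc/mdstat   # watch the initial sync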

tiro_uspsss
11-09-2008, 05:27 PM
I would recommend nothing less than the Seagate ES.2 1TB HDDs :up:

Speederlander
11-09-2008, 05:37 PM
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136313

Includes the same 5-year warranty as the Raptors.

coo-coo-clocker
11-09-2008, 09:03 PM
I'd like to be able to lose 2 drives without requiring a restore from backup. Right now I'm leaning towards a RAID6 with 6 1TB drives. Budget is ~$1k US for drives and controller cards, though I'd prefer to use Linux md raid if I can - this machine won't be doing anything else, and with a dual (or quad) core processor I'd like to use the CPU to handle the RAID calculations. I'd rather spend more money on drives than on a $400 controller card.

That might be hard to accommodate and still get decent performance. I would expect R6 to be very demanding on writes (think about a rebuild when a drive fails and how long it will take to recover), and even the latest ICH10R southbridge doesn't support R6, AFAIK. R5 is probably about the best you can get without investing in a decent dedicated controller.

FWIW, I "gambled" and bid on an adaptec 31605 on fleabay and snagged it for $225, and it works like a charm! :D

halo112358
11-09-2008, 11:30 PM
That might be hard to accommodate and still get decent performance. I would expect R6 to be very demanding on writes (think about a rebuild when a drive fails and how long it will take to recover), and even the latest ICH10R southbridge doesn't support R6, AFAIK. R5 is probably about the best you can get without investing in a decent dedicated controller.

FWIW, I "gambled" and bid on an adaptec 31605 on fleabay and snagged it for $225, and it works like a charm! :D

Yeah, my plan right now is to spec the 6 drives and build the machine then run some tests and see what the performance is in RAID5, RAID6 and RAID10. If the hit isn't too bad between R5 and R6 I'll go for it.

stevecs
11-10-2008, 05:00 AM
RAID 3/4/5 & 6 are all heavily dependent on your calculation engine. Best case, assuming zero calculation overhead (i.e. the CPU is fast enough to handle the task), the parity RAIDs (3/4/5) have the random write speed of a single drive, no matter how many drives are in the array; RAID-6 is slightly slower still, as it has a couple more operations to do. For a file server your workload will be closer to pure random, so this matters whenever you're writing (i.e. not using it as a read-only file server).

Now with RAID-10 (or any multi-level RAID) you get an increase in write IOPS with every sub-array: a RAID-1 (two drives) gives the IOPS of a single drive, a RAID-10 of 4 drives (2 RAID-1s) gives 2x single-drive IOPS, and so on. The same holds for the other nested RAIDs (a RAID-50 of 6 drives, i.e. 2 RAID-5s, would give 2x single-drive write IOPS).

The highest performance with redundancy for a given number of drives would be a RAID-10, for two reasons: 1) the number of sub-arrays (one per pair of drives) and 2) no parity calculations.
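Back-of-envelope for a 6-drive build, assuming a single 7200rpm drive does ~75 random write IOPS (that figure is an assumption, your drives will vary):

DRIVE_IOPS=75 DRIVES=6
echo "raid3/4/5/6: ~${DRIVE_IOPS} write IOPS"                    # one parity array = one drive's worth
echo "raid10:      ~$(( DRIVE_IOPS * DRIVES / 2 )) write IOPS"   # one RAID-1 pair per two drives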

XS_Rich
11-10-2008, 05:03 AM
Best case, assuming zero calculation overhead (i.e. the CPU is fast enough to handle the task), the parity RAIDs (3/4/5) have the random write speed of a single drive, no matter how many drives are in the array


Does this not depend on the write-back caching of the RAID card / software?

[Genuine question]

stevecs
11-10-2008, 05:31 AM
What the write-back cache does is try to gather a full stripe width's worth of data (say your stripe size is 64K and you have 8 drives, so your stripe width would be 448K: 8*64K minus 64K for the RAID-3/4/5 parity). Assuming the sectors you're writing are aligned to the stripe width, it can then write the entire stripe at once, avoiding the 4-op (RAID 3/4/5) or 6-op (RAID-6) read-modify-write penalty. This works OK, but on a file server you will never have enough cache to absorb all writes (the more requests there are, the more random the workload appears to the subsystem), nor will you generally write exact stripe widths or have the updates aligned to them.
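In numbers, for that example (a quick shell sanity check, same figures as above):

CHUNK_K=64 DRIVES=8 PARITY=1
echo "$(( (DRIVES - PARITY) * CHUNK_K ))K of data per full stripe"      # 7 x 64K = 448K
echo "$(( (DRIVES - PARITY) * CHUNK_K * 2 )) sectors per full stripe"   # 896 x 512-byte sectors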

I'm not saying it won't help; it will. But the heavier the load, the closer you get to the worst-case scenario, which is direct subsystem operations. This is one of the main reasons things like databases (and large file servers, SharePoint systems, etc.) use RAID-10 or other multi-level RAIDs to get more IOPS.

XS_Rich
11-10-2008, 05:34 AM
Cool, thanks for explaining!

My NAS is RAID5 with up to 1GB write cache (it's an embedded Linux box which uses the system memory as cache and a Pentium M as the XOR processor). I take it that with usually only 1 or 2 concurrent users and large sequential writes, I WOULD see the benefit of the write-back cache?

stevecs
11-10-2008, 06:42 AM
Can't really tell from that layer (remember the RAID cache is really thinking in terms of sectors). You have at least two layers of abstraction above it: the filesystem (and its fragmentation) and the network layer (NFS/SMB). For contiguous writes the RAID controller actually sees something like "write sectors 1000-1895 inclusive" for one full stripe width, plus another 128 sectors (64K) for the parity. And with network mounting you're breaking requests down into, say, 64K chunks (SMB), so there are way too many unknowns in your workload to predict how much the cache will help. You'd have to get down to the device driver in Linux and see which sectors and ranges of sectors are actually being written under a normal workload to get a real feel.

As I said above, though, having cache won't hurt (assuming it's battery-backed against power failure/system freeze); it can only help. The question is how much it will help. :)
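If you do want to see what's actually hitting the block layer, blktrace/blkparse will print the sector offsets and request sizes (assuming the tools are installed and your kernel has block tracing enabled):

blktrace -d /dev/md0 -o - | blkparse -i -   # one line per request: sector offset + size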

halo112358
11-10-2008, 12:06 PM
Thanks for the info :D

I'm going to get the parts together this week and I'll set up the array in R5, R6 and R10 for some bonnie++ tests, then look to make a final decision. Right now I'm leaning towards R6 or R10; I'd really like the margin of being able to survive 2 drive failures. I might run R-50 tests for kicks, just to see how it compares with R-6.

I did some bonnie++ tests on the current machine last night; the performance is less than stellar.
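The new runs will look something like this (mount point is a placeholder; the -s size should be at least twice the machine's RAM so the page cache can't flatter the disks):

bonnie++ -d /mnt/array -s 16384 -n 0 -u nobody   # 16GB working set, skip the small-file pass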

stevecs
11-10-2008, 05:03 PM
Yes, and remember that with RAID-10 you can lose up to 50% of your drives (as long as you don't lose both drives of the same mirror pair). So R10 has higher availability than a RAID-5, for example, and RAID-6 has higher availability than the R10 (any two drives can fail), but at the cost of write performance (and with the benefit of more usable space). Yes, bench them and find the one that fits your need.

One comment on multi-level RAIDs, though: besides R10, the others (R30/40/50/60 or whatever combination) are generally not expandable on RAID controllers. If you want to use those, I would suggest building the base RAIDs on the controller (say two RAID-5s or two RAID-6s) and then using LVM on the host to stripe across both arrays to create your Rx0. This gives you a lot more flexibility to expand a base array (or add more arrays/cards later), at the cost of complexity.
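Roughly like this, with md arrays standing in for the controller's (just a sketch, device names are placeholders):

mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]
mdadm --create /dev/md1 --level=6 --raid-devices=6 /dev/sd[h-m]
pvcreate /dev/md0 /dev/md1
vgcreate vg_data /dev/md0 /dev/md1
lvcreate -i 2 -I 256 -l 100%FREE -n array vg_data   # stripe over both arrays = your Rx0
# expand later: pvcreate /dev/md2 && vgextend vg_data /dev/md2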

uOpt
11-11-2008, 08:24 AM
I decided to order a set of ES.2s.

I ended up not trusting Samsung. The first problem is underspecification: they don't even give you expected-uptime figures. There are also a lot of reports of broken firmware, and I don't want to deal with that either.

The 1.5TB Seagates get a lot of flak for (lack of) reliability. Also, the read bit error rate for the *.11 (as specified) sucks. So I bit the bullet and bought the ES.2s.