Question: RAID10 or RAID5 for databases?

#1
Hello,

I am being tasked with reconfiguring a RAID disk setup on one server. I had a vendor deliver a Progress database and application on a server configured with RAID10 (RAID 1+0), and I tend to agree with what they said:
"We always configure disk arrays as RAID 10 to maximize performance. We would never recommend RAID 5 because this can adversely affect database performance."
That statement came from one of their leading database admins.
However, I have another server (it is now a production box) that someone from my team (not the vendor) configured as RAID5.

I tend to follow a basic rule: if a vendor configures a system in a particular way, I try not to argue with the approach. If I make a copy of that system with a different configuration, I may be left with a vendor who will not support any issues with the system, especially performance issues.

When I search the Internet I have not seen a single recommendation for RAID5 (I once saw a Progress database presentation stating "RAID5 is EVIL"), yet I constantly have to argue with my colleagues, who claim RAID1 or RAID10 is slower than RAID5 because every write has to be written twice - which may sound like a fair argument. I would like to end those discussions, so I am asking here: is there anyone who can conclusively show whether RAID5 is suitable for databases?

The only argument I find reasonable is that we use VM hosts that can run various types of servers - mixed database and file servers, etc. - and that causes problems, because using RAID10 for all of them would be a rather inefficient use of the machines' resources, especially disks.

I have worked with Oracle and SQL Server databases for the past 20 years, and I know from training courses and experience that an Oracle database needs at least 5 spindles:
1) data, 2) indexes, 3) rollback (undo operations), 4) temporary data (sorting and indexing), 5) redo logs - which actually should never be mirrored physically, as any block corruption will be replicated, and a redo log crash will render a database inoperable.
From what I have learned Progress has basically the same concepts, correct me if I am wrong?
Unfortunately, modern disks are so huge that it would be very wasteful to configure a database in such a way, so as a compromise I may only end up separating data from indexes - thus only 2 spindles, or for example 4 hard drives. However, one of the servers I have been given has 8x146GB disks.

On another matter, the majority of databases mainly operate in memory these days; some of mine have 32GB of memory with at least 80% allocated to data buffering. So the FAST WRITE issue should be dismissed, as writes only occur during checkpoints, when a flush is forcibly issued (either by a command or by a shutdown), or during massive update operations flushing out "dirty" blocks - which triggers writes to the redo logs and database files.

What is your opinion, and how do I win the argument: RAID10 or RAID5?

Thanks,
Richard
 
#4
Your colleagues are wrong. RAID5 is terrible for writes. Much, much, much worse than RAID10. For each small write, RAID 5 has to read the old data and the old parity before it writes, in order to recalculate the parity, and it then writes both the new data block and the new parity block. RAID 10 does two parallel writes. No parity calculation and no parity write.
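That penalty can be sketched numerically. Here is a minimal back-of-the-envelope model in Python (the function name and workload figure are mine, and the per-level penalties are the textbook small-write I/O counts, not measurements from any particular controller):

```python
def backend_ios_per_write(raid_level: str, writes: int) -> int:
    """Physical disk I/Os generated by `writes` logical small (single-block) writes.

    RAID10: write the block to both mirror halves            -> 2 I/Os
    RAID5 read-modify-write: read old data + read old parity,
           then write new data + write new parity            -> 4 I/Os
    RAID6: as RAID5 but with two parity blocks to maintain   -> 6 I/Os
    """
    penalty = {"raid10": 2, "raid5": 4, "raid6": 6}
    return writes * penalty[raid_level]

# 10,000 logical writes (e.g. a burst of checkpoint flushes):
print(backend_ios_per_write("raid10", 10_000))  # 20000
print(backend_ios_per_write("raid5", 10_000))   # 40000
print(backend_ios_per_write("raid6", 10_000))   # 60000
```

So for the same logical write load, RAID5 pushes twice as many physical I/Os to the spindles as RAID10 does, and RAID6 three times as many.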

RAID4, RAID6, RAID-DP and all other fancy forms of vendor obfuscation are even worse.

BAARF - the Battle Against Any Raid Five campaign.

RAID5 is so very, very horrible that it usually isn't even what is actually implemented. What vendors really do is hide a RAID 5 behind a very large RAM cache. This allows a lot of "ordinary" IO writes to be done to the cache rather than having to suffer the consequences of actually involving the disks. (This is called a "write-back" cache; actually waiting for the disks is a "write-through" cache.)

For many "appliance" workloads this "write-back" cache is perfectly fine and is probably why your misinformed colleagues think that it is appropriate for all "cookie-cutter" applications.

For a database workload this is not fine. Database integrity depends utterly on write operations being reliable and properly ordered. Any hardware failure that results in data that was supposedly written to the disks, but which was actually only ever in the cache can be catastrophic.

From a performance perspective, rather than reliability, that caching only really works under "normal" circumstances. Under heavy load, or during unusual operations such as rebuilding indexes (perhaps to repair corruption), dumping & loading, or even just running a backup, it is trivially easy to saturate the cache. And once the cache is saturated you are back to the performance of the underlying RAID5 disks, which is awful (see above).

And then there is the whole question of shared devices -- a shared device is a performance nightmare. It is fine if you have some "appliance" and you want to consolidate a lot of light usage stuff. But it is THE WRONG STRATEGY if you are trying to deploy a mission critical application that needs high performance.
 

Rob Fitzpatrick

#5
"We always configure disk arrays as RAID 10 to maximize performance. We would never recommend RAID 5 because this can adversely affect database performance."
I agree.

Unfortunately, modern disks are so huge that it would be very wasteful to configure a database in such a way
One traditional argument in favour of RAID 5 is storage cost. With disks of capacity X, a 4-disk RAID 5 array has capacity 3X, whereas a 4-disk RAID 10 array has capacity 2X. In years gone by, when disks were very expensive, this may have been a compelling argument. Today, storage is relatively cheap.
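The capacity arithmetic is easy to sketch. A small illustrative Python helper (the function name is mine, and the 146 GB figure matches the disks mentioned earlier in the thread):

```python
def usable_capacity(level: str, disks: int, size_gb: float) -> float:
    """Usable capacity of an array of `disks` identical drives of `size_gb` each."""
    if level == "raid5":
        return (disks - 1) * size_gb   # one disk's worth of capacity goes to parity
    if level == "raid6":
        return (disks - 2) * size_gb   # two disks' worth of capacity goes to parity
    if level == "raid10":
        return (disks // 2) * size_gb  # half the disks hold mirror copies
    raise ValueError(f"unknown RAID level: {level}")

# Four 146 GB disks:
print(usable_capacity("raid5", 4, 146.0))   # 438.0 (3X)
print(usable_capacity("raid10", 4, 146.0))  # 292.0 (2X)
```

The gap narrows as arrays grow, but the point stands: the only thing RAID5 buys you is that one extra disk's worth of space.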

I have had occasion to investigate performance issues in environments with sub-standard hardware, spending hours or even days on the problems. My time was worth more than it would have cost to provision proper hardware in the first place. Relative to the overall cost of the solution (software licenses, maintenance, compute hardware, network, etc.), the cost of a few extra disks and good controllers is relatively quite small. Trying to save nickels and dimes on a few disks is a false economy. What is the cost of not having your business perform efficiently?

On another matter, the majority of databases mainly operate in memory these days; some of mine have 32GB of memory with at least 80% allocated to data buffering. So the FAST WRITE issue should be dismissed, as writes only occur during checkpoints, when a flush is forcibly issued (either by a command or by a shutdown), or during massive update operations flushing out "dirty" blocks - which triggers writes to the redo logs and database files.
I disagree. Yes it's good to have lots of memory to allocate to database cache, and up to a certain database size you can cache quite a lot of your data. So that can help increase the ratio of logical to physical reads, eliminating some disk reads that might otherwise have been done with a smaller cache.

But application performance isn't just about read throughput. Large databases get to be large because of a high rate of writes. And while writes of data buffers are asynchronous, writes of BI and AI buffers are synchronous. If they are slow then that can freeze users' applications, throttle overall OLTP throughput, and, if the page writers can't keep pace, cause extra delays during checkpoint processing. If there is transaction activity, writes are happening constantly, not just during end-of-checkpoint processing.

From a performance perspective, rather than reliability, that caching only really works under "normal" circumstances. Under heavy load, or during unusual operations such as rebuilding indexes (perhaps to repair corruption), dumping & loading, or even just running a backup, it is trivially easy to saturate the cache. And once the cache is saturated you are back to the performance of the underlying RAID5 disks, which is awful (see above).
And if you happen to have that heavy load when you lose a disk, you have to rebuild the array: add the overhead of reading every surviving disk in full in order to reconstruct the failed disk's data onto the replacement.
 
#6
I would like to thank you all, you have given me a real set of professional advice. I shall put it into a document and circulate this info.

To add to that, I watched a YouTube presentation, and there are some points I could add:

1) RAID 5 was reliable with smaller disks, and many years ago space was at a premium; this is no longer the case. Apparently, RAID 5 arrays were already deemed inefficient by 2005. It takes much longer to rebuild a RAID 5 array after the loss of one disk, making the exercise ineffective, and if another disk suffers a block error during the rebuild, the whole rebuild process will fail. RAID 6, which adds a second parity disk as a failsafe, would be recommended instead.

2) RAID 5 is good for file servers, where smaller files are in the majority and random access dominates. Database files are very often huge, measured in GB. An Oracle database prefers data blocks to be as close to each other as possible; a RAID5 array will scatter those blocks across many disks, defeating the benefit of what the database engine does - reading multiple blocks sequentially in one operation.

3) In my humble opinion, mixing file-server and database workloads will always result in read/write bottlenecks. It's best to avoid that mixing if possible.
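The rebuild-failure risk in point (1) can be illustrated numerically. Assuming an unrecoverable read error (URE) rate of 1 in 10^14 bits - a common spec-sheet figure for consumer-class drives; the figure is my assumption, not from the thread - a rough sketch of the chance a RAID5 rebuild reads every surviving disk without hitting a URE:

```python
import math

def rebuild_success_probability(disks: int, disk_tb: float,
                                ure_rate: float = 1e-14) -> float:
    """Rough probability that a RAID5 rebuild completes without an
    unrecoverable read error, assuming independent per-bit errors.

    A rebuild must read every bit of the (disks - 1) surviving drives.
    Uses the Poisson approximation: P(no URE) ~ exp(-expected URE count).
    """
    bits_read = (disks - 1) * disk_tb * 1e12 * 8  # decimal TB -> bits
    return math.exp(-bits_read * ure_rate)

# Eight 146 GB disks (the poster's array) vs eight 4 TB disks:
print(round(rebuild_success_probability(8, 0.146), 3))
print(round(rebuild_success_probability(8, 4.0), 3))
```

With small 146 GB disks the rebuild almost always succeeds; with modern multi-TB disks, under these assumptions, it fails more often than it succeeds - which is exactly why point (1) says larger disks made RAID5 rebuilds impractical.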

Correct me on these points, if I am wrong.

Thanks again,
Richard
 
#7
Replacing RAID5 with RAID6 improves the reliability -- but it hurts performance because there is now yet another disk that must be updated.

Way back in time when RAID was a new thing it was assumed that disk failures are random and uncorrelated. You know what they say about assumptions?

As it turns out disk failures are not so random. When you put a bunch of identical disks from the same manufacturer and the same batch together in a cabinet and subject them to the same environmental factors and the same workload the failures cluster. RAID5 can survive a single disk failure. RAID5 does not survive multiple disks failing.

Worse yet -- the "same manufacturer and same batch" part of things is probably not necessary to get clustered failures.

All of which means that not only do you need redundant disks but you also need robust crash recovery in the form of after-imaging (redo logs for those with an Oracle background) and replication.
 

RealHeavyDude

#11
Whatever you do will need to balance performance, security, and cost. RAID 5/6 and the like are a compromise favoring cost while taking some security into account. Performance is not a prominent consideration here.
 
#12
Just to be clear: by "some security" I believe that RHD is referring to reliability, robustness and recoverability rather than anything like "hacker-proof".
 
#13
And a little story about RAID5 for today...

A customer whom I shall not name (but they are a rather large bank) is having a little adventure today that highlights why RAID5 is "penny wise and pound foolish". Various things occurred which are all cautionary tales on their own but eventually we got called. Their databases (30 or so of them) had become corrupted due to 3rd party infrastructure provider incompetence. Then, rather than restore & roll-forward, they got some advice from somewhere else to start forcing their way into the db and do stuff with dbrpr. That didn't work out very well. So now they are scrambling to get backups of the databases and restore them so that they can roll-forward and recover properly.

I think we are coming up on around 18 hours now. The systems that this software supports have a lot of high level visibility both inside and outside the bank. Management is not happy.

The restore and roll-forward process is going very, very slowly. Why? Because the disks are RAID5.

RAID5 works "fine" when there are no problems. But when you really, really need to get stuff done fast? RAID5 is not the answer.
 

Rob Fitzpatrick

#14
Yikes. Dbrpr is undocumented for a reason. I'd have to be really desperate to resort to that, and then only if instructed by Progress TS.

Sounds like a perfect storm of people, process, and infrastructure that all need improvement.
 

RealHeavyDude

#15
Should have made myself more clear: I meant protection against hardware failure :)

Seriously - if one relies solely on RAID to protect against anything evil that can happen to a database, then I would call them naive. Any RAID or mirror will happily replicate any software corruption, and without a proven disaster recovery strategy that includes regular backups and after-imaging, you will find yourself in all sorts of trouble when your database is corrupted.

Nevertheless, we are facing a world where management and sales weasels are promoting all kinds of buzzwords containing aaS (as a service) and pushing us into virtualization, with databases residing on, at best, storage area networks or, at worst, in some "cloud" as DB as a service. All those setups are just darn slow and only make cloud and hardware providers happy. We will be pushed onto IaaS (infrastructure as a service) within the next year, and I am so happy that the DBaaS stuff within our company does not support the Progress OpenEdge database ...
 