Question: Nimble Storage SANs

Rob Fitzpatrick

ProgressTalk.com Sponsor
Hi folks,

Our IT guys are considering a Nimble Storage CS220 SAN for use in our dev shop. The workload would be various internal systems, including dev, test, QA, etc. No production databases as far as I know. It may be a mix of VMware VMs and bare metal. The server operating systems would be Linux, mostly RHEL or CentOS 6.x. I have few details at this point (obviously) but I've been asked to investigate suitability for a Progress DB workload.

I've never heard of this company or product. I see that it does compression and uses RAID-6, both of which are non-ideal for this use-case, in my opinion. Does anyone out there have opinions about or experience with running Progress DBs on this platform, either in VMs or on bare metal?

http://www.nimblestore.com/Nimble-CS-Series.asp
 

TomBascom

Curmudgeon
I don't see any reference to RAID6? That would, of course, be a horrible mistake.

I do like that they are at least /talking/ about IOPs :)

I suggest getting an evaluation unit and spending some quality time experimenting.
 

cj_brandt

Active Member
My suggestion would be: if you get to test, either in-house or at the vendor's site, make sure both VMs and bare metal are tested. In our environment we saw a significant difference between VMs and bare metal when we did the I/O tests.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I think we're getting one in-house to test. Good point about the VMs; I'll make sure they add that to their list of tests.
 

Chris Hughes

ProgressTalk.com Sponsor
The write-up / sales blurb is very similar to NetApp's, which I had a little look at not too long ago. I think the big headline is the RAID 6: roughly 6 disk IOPS for every 1 real write IO, as far as we care. All this very clever software has an overhead; I recall NetApp were quite honest about their de-duping having an overhead of 7 (or was it 17?) percent.
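For what it's worth, here's the back-of-envelope maths behind that headline (textbook write-penalty figures and a made-up spindle count, nothing vendor-specific):

```python
# Rough effective random-write IOPS for a small-block workload.
# Assumed textbook write penalties: RAID 10 = 2 disk IOs per write,
# RAID 6 = 6 (read data + read P + read Q, then write all three back).
def effective_write_iops(disks, iops_per_disk, write_penalty):
    """Aggregate random-write IOPS the disk group can sustain."""
    return disks * iops_per_disk / write_penalty

disks, iops_per_disk = 12, 150          # e.g. 12 spindles of 7200 RPM class
print("RAID 10:", effective_write_iops(disks, iops_per_disk, 2))   # 900.0
print("RAID 6 :", effective_write_iops(disks, iops_per_disk, 6))   # 300.0
```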

All in all, though, hardware vendors such as NetApp generally seem to think that the physical grunt of a modern SAN, less the overhead of the clever software, is still as fast as a traditional RAID 10 setup with spinning disks.

I'd be interested in your results should you test it. The business-level sell of these sorts of SANs, with the DR capabilities etc., means we're going to see them at customer sites whether we like it or not ;)
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
As I told our IT guys, I don't care about or want on-the-fly de-dupe or compression. That may be useful for backing up server file systems but not for database performance or reliability.
 

TomBascom

Curmudgeon
Not to prolong the agony or anything but... in theory I can imagine that de-duping and/or compression /could/ perhaps be done in such a way as to reduce the overall IO rate -- if the resources needed to do that consume less than the cost of the saved IO ops then it /could/ be a net positive.

For instance -- your typical Progress database "zips" at about 5:1. If a disk subsystem was doing that behind the scenes and managed to cut the IO ops from 5 to 1, and each IO op was 2ms and the extra time to compress/decompress was just 1ms, you could, in theory, go from 10ms to 3ms to read the data...
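Spelling that arithmetic out (purely hypothetical numbers, not measurements from any real array):

```python
# Back-of-the-envelope: does transparent compression help or hurt a read?
# All figures are the hypothetical ones from the example above.
ops_uncompressed = 5      # physical reads needed without compression
ops_compressed   = 1      # 5:1 compression -> one physical read
ms_per_io        = 2.0    # cost of each physical IO op
decompress_ms    = 1.0    # extra time to decompress

plain      = ops_uncompressed * ms_per_io                 # 10.0 ms
compressed = ops_compressed * ms_per_io + decompress_ms   #  3.0 ms
print(plain, compressed)   # a win *only if* those assumptions hold
```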

But I'd want some pretty serious proof before I ran very far with that thought. I'd also want to think about cases where the data turns out to be incompressible.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
"CASL provides real-time data compression to compress VDI data 30-75 percent without increasing latency..." Sounds like marketing BS to me. However it's implemented, e.g. dedicated compression hardware, it has some throughput limit. If you feed it data fast enough long enough, it will be a choke point.

Maybe they claim "without latency" because it can match or better the fill rate of the attached disks. But disks can get faster and the embedded circuitry can't. I don't want to mess with compression for DB storage. And as you said, Tom, if the data is already compressed or encrypted then compression would serve no benefit. Hopefully such "features" can be turned off.
 

Glenn Stewart

New Member
Hi Rob, I am the Nimble Storage SE who presented to your company recently. Your post above was forwarded to me by an associate. I am new to this forum so if I am contravening any policies by responding as a vendor, please let me know and I will withdraw any comments made here ASAP.

Your questions and observations are typical of what we encounter every day from prospects looking at new storage options. I would like to take some of the responses above and show how they accurately reflect how our product works, or how, since our product works a little differently from traditional architectures, they may not apply.

Someone pointed out "in theory I can imagine that de-duping and/or compression /could/ perhaps be done in such a way as to reduce the overall IO rate -- if the resources needed to do that consume less than the cost of the saved IO ops then it /could/ be a net positive". This is in fact one of the methods Nimble uses to improve performance over traditional approaches to laying data down on disk. If you see a 5:1 reduction of Progress database files when zipping them, that means we write 1/5th the amount of data to disk; disk is the slowest component in the SAN, so this potentially reduces the workload of the slowest link in the chain. So, as long as the compression can be done without a performance penalty, the net result is improved performance. Where is compression done? In the SAN controllers, via multi-core processors and RAM, which are much faster than disk, or at least should be if done correctly.

Another commented that "I think the big headline is the RAID 6: roughly 6 disk IOPS for every 1 real write IO, as far as we care. All this very clever software has an overhead; I recall NetApp were quite honest about their de-duping having an overhead of 7 (or was it 17?) percent". This again is accurate for more traditional systems, where each database block written to the SAN is its own unique write or IO, and RAID6 in those environments is definitely going to impact performance. So, what if, instead of treating each database write or block sent to the storage system as its own unique disk IO or transaction, you bundled up thousands of them into a container, and then shipped this container down to the RAID6 set and wrote it out as a single transaction? Nimble uses 12 drives in a RAID6 set; remove one for a hot spare and you have 11 drives. Write one thousand individual disk IOs as a single sequential stripe and this translates to 1000+ application IOs done in 11 disk IOs. As long as you can bundle up those 1000s of transactions quickly, it will result in reduced latency. All of this data processing is done at the SAN controller level, and again, performance there is a factor of RAM and processor speeds, which are several orders of magnitude faster than disk. This data coalescing is in fact how Nimble accomplishes extremely high write rates to spinning disk in RAID6, and it is a key reason for the bulk of our sales today, most of them to customers who are running some sort of database as part of the overall workload.
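To make the idea concrete, here is a very simplified sketch of write coalescing (an illustration only, not our actual code path; all names and sizes are invented):

```python
# Toy write coalescing: many small logical writes are buffered (standing in
# for controller NVRAM) and flushed as one full-stripe sequential write, so
# parity is computed once per stripe instead of once per block.
def write_full_stripe(blocks):
    """Placeholder for 'write these blocks as one sequential RAID6 stripe'."""
    pass

class CoalescingWriter:
    def __init__(self, stripe_blocks=1000):
        self.stripe_blocks = stripe_blocks
        self.buffer = []                     # stands in for NVRAM

    def write(self, block):
        self.buffer.append(block)            # acknowledged quickly from NVRAM
        if len(self.buffer) >= self.stripe_blocks:
            self.flush()

    def flush(self):
        if self.buffer:
            write_full_stripe(self.buffer)   # one stripe write, not 1000 RMWs
            self.buffer.clear()
```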

Some of you reading this may then ask: this might work for writes, but what about reads? For that reason, all Nimble arrays come with an SSD-based read cache that is populated in-line and in parallel with the spinning disk group, but it only hosts a copy of the data blocks that are expected to benefit from SSD, i.e. data that is likely to be read back by the hosts in a random (not sequential) fashion. Since it is a read cache, and not a tiered storage configuration, the usual overhead of having to move data back and forth between disk tiers does not apply. Our goal is to serve the majority, if not all, of random reads from the SSD-based read cache, and to service all writes (random or otherwise) via the spinning disks.
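In rough pseudo-code terms the read path looks something like this (again, a conceptual sketch with invented names, not the actual implementation):

```python
# Conceptual read path: the SSD holds a *copy* of cache-worthy blocks, so a
# miss never migrates data between tiers; we read from disk and may drop
# a copy into the cache. Every name here is invented for illustration.
def read_block(addr, ssd_cache, disk_read, is_random_candidate):
    if addr in ssd_cache:                    # hit: served from the SSD copy
        return ssd_cache[addr]
    data = disk_read(addr)                   # miss: read from spinning disk
    if is_random_candidate(addr):            # only random-read data is cached
        ssd_cache[addr] = data               # a copy; disk still holds it
    return data

# Trivial usage with stand-in callables:
cache = {}
block = read_block(42, cache, lambda a: f"block-{a}", lambda a: True)
```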

You can turn these features, such as compression and SSD read caching, on or off on a per-volume basis in real time (i.e. no shutdown of the host or application) to determine which configuration best meets your needs. To date, I have NO customers who have disabled either feature in a database environment, though some have tested with them both enabled and disabled and then turned them back on. If your company does decide to proceed with a test of the array, I would encourage you to test bare-metal, VM, etc. configurations both with and without these features enabled to see the difference they make and to ensure the product meets your needs.

Thanks for your time.
 

TomBascom

Curmudgeon
Glenn,

It's nice to see a vendor respond. And especially nice to see that you're encouraging thorough evaluation and testing. It's a little hard to pick your bits apart from the quoted stuff but I'm sure we can manage :)

... Where is compression done? In the SAN controllers, via multi-core processors and RAM, which are much faster than disk, or at least should be if done correctly.

So how can one tell if it is being done correctly? Do you provide any tools that would help Rob (or anyone else) know?

...As long as you can bundle up those 1000s of transactions quickly, it will result in reduced latency...

In my personal experience db IO is *extremely* random and it is very unlikely that data coalescing will have much impact. Exceptions are the before-image and after-image logs (UNDO log and REDO log in Oracle speak) which *are* sequential. But general db extent writes are very, very random. About the only time that I have seen that approach work is when the db is so horribly tuned on the server side that almost everything is going to disk rather than being buffered in -B (Progress' equivalent to Oracle's SGA). The most effective place to cache data is closest to where it is being used -- not far down the chain in the disk subsystem. You don't want to deliberately trade off server tuning in order to get effective caching in the SAN.

Does Nimble provide any tools that allow an administrator to get a good, detailed look at how effectively this write coalescing process is working?

Ultimately there is a RAID6 underneath all of this and RAID6 sucks at IO. Sooner or later (probably sooner IMHO) all of the various tricks being used to hide the RAID6 will run out and the DBA is the one left holding the bag. It may be that those tricks only run out under specific circumstances that a customer chooses to accept -- that's ok so long as they are clearly aware of the trade-off that they are making.
 

Glenn Stewart

New Member
Hi Tom, thanks for the welcome :). I do want to point out that while I am encouraging Rob to test the array in his environment, this is mainly because I don't have any customers in my territory running Progress on Nimble. Otherwise, our first option would be to have him speak with those customers (without me present) to see how things work in a production environment that has been running Progress for months or years, since most evaluations are limited in how much time customers can dedicate to the testing, as well as how much of their environment can be tested in an evaluation scenario. If this were SQL or Oracle, that would have been our first suggestion, as I do have several customers running those databases. I run evaluations in less than 10% of our projects as a result of having these existing reference customers. So, evaluations are definitely used, but less frequently as we build a more diverse customer base.

I will try to answer your questions in the order posted:

1. How can we tell if the compression is being done correctly?

I may have over-simplified my initial comment on in-line compression, but basically, if it is NOT being done correctly, you will see performance degradation, indicated by higher latency. So, via our built-in performance monitoring graphs, you can track in real time the read/write latencies, IOPS, and MB/sec characteristics of any or all volumes. If latency is going up, it could be an indication that compression is the problem. It may not actually be compression, but more likely CPU resources on the controller, since that is where compression takes place. How can you verify? We also provide a portal called InfoSight, which ingests a multitude of data points from your array on a daily basis via what we call heartbeats (sent approximately every 5 minutes) and autosupport (reports) and analyzes them. It then provides graphs showing how the SSDs and processors in the controllers are performing, including a heat-map of read/write latency for every hour of the day going back months. An example of not doing compression correctly might be taking a group of writes/blocks and compressing them as a group, instead of compressing each block or IO individually and then putting it into a container or larger group. If the former is done, one problem is that if you want to read back a single block, which is often the case, you must first read back the entire group, uncompress it, then extract the block you want. This will almost always cause performance issues. Nimble compresses each block separately before putting them in the larger containers to be sent down to disk.
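To illustrate the contrast (a simplified, made-up example using zlib as a stand-in compressor, not our actual on-disk format):

```python
# Per-block compression vs. group compression, in toy form.
import zlib

blocks = [f"db block {i} ".encode() * 100 for i in range(1000)]

# Per-block compression, then packed into a container: reading back one
# logical block only requires decompressing that one block.
container = [zlib.compress(b) for b in blocks]
one_block = zlib.decompress(container[42])

# Group compression: smaller on disk, but to read block 42 you must first
# read and decompress the entire group -- the behaviour to avoid.
group = zlib.compress(b"".join(blocks))
whole = zlib.decompress(group)            # all 1000 blocks, just to get one
```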


2. Data coalescing in random IO environments - how can this improve performance?

This is actually the best environment for a filesystem with characteristics like Nimble's. One key point to understanding how this is possible: Nimble does not use a traditional "write-in-place" filesystem. What that means is that once you write a database block, if you want to modify it, Nimble does not need to overwrite the block in the exact physical location it was originally written to on the physical disk. We use a log-structured filesystem, so once the block is first written, if it needs to be modified we write it in a new location. The more random the IO pattern, the better Nimble performs compared to traditional write-in-place architectures; if it is all sequential IO, you are basically just streaming sequential writes to disk, which is similar to a backup, and backups work pretty well on slow disk. In essence, Nimble is converting random IO, which is very disk intensive, to sequential IO, which is not. I didn't mention this before, but we only use 7200 RPM drives for our main disk subsystem, along with SSDs for read cache only. Why 7200 RPM drives? Because their sequential write performance is similar to that of 15K RPM drives, and you get much more, and cheaper, capacity with them.
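A bare-bones illustration of that write path (a conceptual toy with made-up names, not the real filesystem):

```python
# Toy log-structured volume: a modified block is appended at the log head
# and the block map is updated, rather than seeking back to the block's
# original location, so random logical writes become one sequential
# stream on disk. Purely illustrative.
class LogStructuredVolume:
    def __init__(self):
        self.log = []            # append-only stand-in for the disk
        self.block_map = {}      # logical block -> current position in log

    def write(self, logical_block, data):
        self.log.append(data)                        # sequential append
        self.block_map[logical_block] = len(self.log) - 1
        # any older copy of this block is now stale, awaiting the sweeper

    def read(self, logical_block):
        return self.log[self.block_map[logical_block]]
```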

The challenge with this architecture is cleaning up after oneself: if we are not overwriting existing blocks and they are never cleaned up, the system will run out of space. A built-in sweeping process continuously monitors existing containers and cleans them up when enough blocks in a container are no longer valid. This is part of normal system operation and does not impact performance until the array is just about at 100% capacity. We do not cache writes to SSD; they are streamed directly from controller NVRAM to spinning disk. So, via the coalescing and compression, we get great write performance, and then, by using caching algorithms to populate the "hot" blocks on SSD as a read copy, random read performance benefits from those much faster disks.
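The clean-up side of that, sketched against the same toy volume as above (illustrative only; thresholds and region sizes are invented):

```python
# Toy sweeper: a region of the log whose live-block ratio drops below a
# threshold has its surviving blocks re-appended at the log head; a real
# system would then hand the emptied region back to the free pool (this
# toy only reports which regions could be reclaimed).
def sweep(vol, region_size=100, live_threshold=0.5):
    live_positions = set(vol.block_map.values())
    log_len = len(vol.log)                    # snapshot; writes below append
    reclaimable = []
    for start in range(0, log_len, region_size):
        region = range(start, min(start + region_size, log_len))
        live = [p for p in region if p in live_positions]
        if len(live) < live_threshold * len(region):
            for logical, pos in list(vol.block_map.items()):
                if pos in region:             # relocate surviving blocks
                    vol.write(logical, vol.log[pos])
            reclaimable.append((region.start, region.stop))
    return reclaimable
```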

You are correct in saying that RAID6 will eventually cause performance issues, but that will only occur if/when the disk component becomes the performance bottleneck. Today, the bottleneck in a Nimble array is the Intel processor in the controller, so it is not spindle-bound for performance, but processor-bound. And as you know, processors are always getting faster, at least for now. To show that we are processor-bound today for performance, we can look at our two models. The CS200 series comes with a single quad-core processor, and the CS400 series has dual hex-core processors. Other than a bit more RAM in the 400 series and these processor differences, the arrays are identical - same spinning disks and SSDs. The CS400 is capable of 3x more random IO than the CS200 series. At some point, using the current disk architecture, the Nimble arrays will become spindle-bound, and we will need to adapt to that situation when it arrives, but for now it hasn't happened.

Regarding tools to measure coalescing, we don't specifically report how many IOs are placed in each container before being written to disk, as this might be a little too labour-intensive and the array would spend more cycles telling you how it is doing than actually doing its job of serving your IO requests. But with the tools described above showing IOPS, throughput, latency, etc., it is fairly easy to get an idea of how well the coalescing is doing. We only run a very small amount of NVRAM (protected memory on the controllers), so in a busy environment it is definitely not able to absorb all write requests; hence the disks themselves are directly involved in the latency results. We often have customers run benchmark tests on their current storage and then on Nimble, and more often than not, the results are compelling.
 

Cringer

ProgressTalk.com Moderator
Staff member
Just a thought, but would you be willing to arrange your own test bed for Progress so you can give more applied information to folks? It might not be cheap to arrange, but I'd have thought that if your system works it might be very attractive to folks. I know a good consultant (@TomBascom) who would probably relish the chance to come and play! ;)
 

TomBascom

Curmudgeon
It sounds like your write strategy is somewhat similar to the way SSDs manage writes. It would also need a lot of free space on the underlying drives to stay efficient.

Why not offer RAID10 underneath it all as an option? It sounds like everything else is optional so why spoil a good story by insisting on RAID6?
 

Glenn Stewart

New Member
Unfortunately, I don't have the resources to set up a test lab for remote users to access. We do have corporate labs, but they are not easily made available for this type of requirement, so for now the best I can do is share whatever test results are made available to me by other customers/prospects running Progress on Nimble. If you have suggestions for test criteria, I can definitely see if these can be introduced into the mix on the current evaluation (see below for more info).

We don't run RAID10 because that would be capacity-prohibitive and would not provide a noticeable performance increase over the current architecture. Nimble is not necessarily trying to produce the fastest or largest array possible (there are options for each of those today already, such as all-flash or all spinning disk), but instead to provide our customers with the best overall combination of these two aspects of storage, along with several other features such as snapshots, replication, etc., at a price that makes it compelling enough for people to consider us seriously.

As for free disk space requirements being high, this is actually not the case. Nimble arrays will run at 100% of their rated performance until almost at full capacity of the spinning disks. The background sweeping process is lightweight and efficient enough to keep the filesystem optimized without otherwise impacting capacity or performance. Keep in mind the SSDs are not a staging area; they hold a copy of data that already exists on spinning disk. We do not include the capacity of the SSDs in the overall system usable capacity for this reason.

As luck would have it, in a couple of weeks, I am starting an evaluation with a large financial services company running Progress as the primary use case. I can report back our findings. If you are interested, I can keep you posted with any results my prospect is willing to share during the upcoming tests they are going to run. Thanks again.
 

TomBascom

Curmudgeon
Pardon my bluntness but the "capacity-prohibitive" excuse is lame. DBAs rarely care about "capacity" (when that term is used to mean "space" as it is here). Everything else seems to be an option. The underlying RAID should be too. Why would you want the most objectionable bit of your architecture to be unchangeable? Your argument basically amounts to "trust me". I'm sorry & not to impugn your personal integrity or anything but you are in sales -- you are not to be trusted. Let the customer decide.

I also have my doubts that everything stays hunky-dory as capacity approaches 100%. You're going to have an increasingly difficult time finding large blocks of contiguous free space as your disk fills up. Or you are going to be working harder and harder to "de-frag" and create such space. My SSD comment was just that the general approach is very similar to what SSDs use to make small writes efficient -- that's clever. I'll also note that it takes some fairly clever management of that process to keep it reasonably efficient and that the SSD people didn't get it right the first time...
 

Cringer

ProgressTalk.com Moderator
Staff member
As luck would have it, in a couple of weeks, I am starting an evaluation with a large financial services company running Progress as the primary use case. I can report back our findings. If you are interested, I can keep you posted with any results my prospect is willing to share during the upcoming tests they are going to run. Thanks again.

I would find the results very interesting. Thanks for your posts! :)
 

Glenn Stewart

New Member
Tom, I understand your position on capacity/space; perhaps that applies to many people on this forum, but Nimble is not designed for a single use case, such as databases, and many of our customers are looking for as much capacity as possible.

Regarding RAID configurations, since disk is not the bottleneck in the Nimble architecture at this time, reconfiguring as RAID10 would be somewhat similar to putting 91 octane gasoline in your car and getting the same performance/mileage as 87 octane - you are getting no appreciable benefits from doing this, at least not in a Nimble array.

I don't expect anyone to simply take my word for any of the points made above - in fact, it is very rare that people will just take our word for it, since this is a different approach to how storage is normally done. But I do at least need to try to explain the concepts the technology is based on to get the ball rolling. If this were an Oracle or SQL forum, I would ask existing customers to participate in this discussion to share their experiences. To that end, as mentioned earlier, I am running an eval for a prospect using Progress; if it goes well, after they have been running it long enough, more often than not they will be willing to take calls from others interested in doing the same thing, so experiences can be shared.

Thanks for allowing me to participate in your discussion; I will report back once we have had a chance to make some "Progress" in this eval :). Sorry, I couldn't resist.
 

TomBascom

Curmudgeon
When I pull up to the gas pumps I, the customer, get to choose what octane to put in my tank.

As a customer buying storage I, the customer, likewise wish to be able to choose for myself how to optimize my devices. Personally I do *not* choose to maximize storage space. I choose to maximize throughput. Having a RAID6 in the picture is like driving a lawnmower down the autobahn. It's a great way to mow the grass -- but mowing the grass isn't the reason that I generally drive down the highway.
 