Load - Index - single stream or multiple?

ron

Member
Hello ...

We use Progress 9.1D09 on Solaris 8.

We're planning a dump/reload in a few weeks' time. We've done this many times before. We always dump using multiple jobs in parallel (15) -- but have always used one single job to load and re-index.

However, the DB in question is now 270GB and the time needed for the whole process (including backups, tabanalys, etc) is getting quite painful, so we're thinking about ways to speed things up a bit.

We could try loading in parallel streams (into separate Areas, of course). And we could try running parallel indexing too (now that we've discovered the -SS parameter).
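The parallel-load idea can be sketched as a small driver script, one binary-load stream per area. Everything here is illustrative -- the database path, dump layout, and table names are made up, and RUN=echo makes it a dry run that just prints the proutil commands instead of executing them:

```shell
#!/bin/sh
# Illustrative only: run one binary-load stream per dump file and
# wait for all of them.  Paths and table names are hypothetical;
# RUN=echo prints the commands instead of running them.
DB=/db/prod/mydb
RUN=echo

for bd in customer.bd order.bd orderline.bd invoice.bd history.bd
do
  # each stream runs in the background, with its own log
  $RUN proutil $DB -C load /dump/$bd > /tmp/load.$bd.log 2>&1 &
done

wait    # block until every stream has finished
echo "all load streams complete"
```

In real use each stream would of course load only the tables belonging to one target area, so the streams don't contend for the same area's extents.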

HOWEVER, although we have a development and test system, it has totally different performance characteristics to our production system. Thus, we have no way to try out these ideas to see if they improve the time -- or not.

In particular, in our single stream load/index job we have nine separate index phases -- and each uses 15GB of memory for the srt files (because we put the srt files in /tmp). If we run parallel indexes, only one of them can use /tmp.
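For reference, the -SS parameter points idxbuild at a sort-directory specification file. If I remember the format correctly, each line gives a size limit in MB followed by a directory, with 0 on the last line meaning "use as much as needed" -- so sort I/O can be spread over several filesystems instead of just /tmp. The directories below are made up:

```
15000 /tmp/
15000 /disk2/sort/
0 /disk3/sort/
```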

Does anyone have experience with parallel loading and/or parallel indexing? If so, please share what you learned.

Regards,
Ron.
 
Yes. I did a presentation on the topic at Exchange this past year.

In brief -- it depends ;)

Not having a test environment with the same performance characteristics as production is "unfortunate". Because you really don't know until you test -- there are no absolute rules in these matters. Can you get production on a weekend or holiday and run a non-destructive test?

I've seen dump & loads (especially on high end gear) where loading with the "inline" index rebuild was substantially faster than the separate post load index rebuild. But that has been rare.

The presentation mentioned above details a massively parallel d&l approach that uses BUFFER-COPY to:

1) Eliminate the need for scratch files (.d or .bd) and thus significantly reduce IO imbalances and better utilize available spindles by not wasting a bunch of them.

2) Use the power of 4GL WHERE clauses to split tables into multiple simultaneous threads. (Providing much better parallelization than DUMP SPECIFIED.)

3) Eliminate the index rebuild altogether by loading with indexes active (not as scary as it sounds with the v9+ CCLP feature).

I've had very good success with this approach but everyone's situation is different and there are always alternatives to explore.

The presentation and the necessary code along with a smidgeon of documentation are available at www.greenfieldtech.com. Or, if you or your management would prefer, a consulting engagement can be arranged.
 
Thank you Tom. Your insight into DBA matters is always valuable!

I very carefully checked through all the info I could find regarding the “massively parallel” procedure you presented at Exchange. I was very impressed with it. It is a novel and highly effective way of dealing with the biggest hurdle of a dump/reload -- the time required. :)

Unfortunately I cannot use that approach. Your method presents me with one “problem” -- and one “obstacle”. The “problem” is that the source and target databases both have to be available at the same time, and the production server has “rather little” spare disc space. However, because we have two warm spares, one of which is on the same LAN as production, I could -- with a bit of trouble -- solve that. But the “obstacle” is that the company that supports our application will not allow us to dump/reload using 4GL! We are restricted to using “plain” Progress tools -- otherwise our support agreement is invalidated. I think that’s “unfortunate”, of course. (Stronger language comes to mind! :mad: ) But I have been around this issue with them before and they simply won’t even discuss the matter.

As I mentioned before, I can test anything I like on our development/test server -- but its performance characteristics are so different from production that any observations I make just don’t apply at all when involving the production server (faster CPUs, more CPUs, more memory, faster SCSI bus).

Our loading job is a single stream. Essentially it loads an Area -- then indexes that Area -- and so on. When we last did it (in April) it took 20 hours (loading / indexing only). The DB has grown 9% -- so it should now take roughly 22 hours. If I do nothing other than load the Areas in parallel (5 streams) it would seem logical to expect that I could save about 5 hours (making some fair allowance for degradation because of running five streams together). But I don’t know if that “logical” expectation will actually be realised. That’s why I was canvassing the experience of others who may have tried this.
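The growth arithmetic above is easy to check. The 20-hour figure and the 9% growth come from the post; the rest is just the calculation:

```shell
#!/bin/sh
# Back-of-envelope check: 20 hours last time, 9% growth since then.
last=20      # hours for load + index in April
growth=9     # percent growth of the database
expected=$(awk "BEGIN { printf \"%.1f\", $last * (1 + $growth / 100) }")
echo "expected single-stream load/index time: $expected hours"
```

That gives roughly 21.8 hours, which rounds to the "roughly 22 hours" above.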

As for indexing during the load and/or running parallel indexing -- I have decided to think no more about it. Without the ability to test on the “actual” target server, I would never try it.
 
It sounds to me like you need to talk to a support person at your vendor who can do something more than read from a prepared script.
 
Over Christmas we dumped/reloaded our 270GB database. We attempted a new ploy -- involving five parallel loading streams. After one hour we abandoned it -- because it was running significantly slower than using a single stream. We reverted to our old single-stream procedure.

It was a disappointment, of course, but thinking later about the way the parallel loading would affect the discs -- I can now understand why it was so slow. It would have caused significant head-thrashing -- whereas the single stream approach would keep the disc heads very still.

So -- I guess I'd have to say -- keep loading to a single stream! (Or, to be very precise -- if anyone out there has a "massive" system -- limit the number of streams to the number of separate striped metadrives.)

Ron.
 
I did a d&l of 100GB+ a couple of weeks ago that generally had half a dozen or so binary loaders running in parallel. (I started with 30 simultaneous binary dump processes.) The dump & load portions of the process were overlapped and in parallel. That took about 50 minutes. The index rebuild was done as a single monolithic process once all of the loads were complete and took another 4 hours.
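The overlapping described above can be sketched roughly like this: as soon as one table's binary dump completes, its load is started in the background while the next dump runs in the foreground. Database paths and table names are hypothetical, and RUN=echo keeps it a dry run -- the real process also fanned the work out over many dumpers and several loaders, which this sketch omits:

```shell
#!/bin/sh
# Illustrative sketch of overlapped dump & load.  Paths and table
# names are hypothetical; RUN=echo prints commands instead of
# running them, logging everything to one file.
SRC=/db/old/mydb
TGT=/db/new/mydb
RUN=echo
: > /tmp/dl.log

for tbl in customer order orderline
do
  $RUN proutil $SRC -C dump $tbl /dump >> /tmp/dl.log 2>&1      # dump runs to completion
  $RUN proutil $TGT -C load /dump/$tbl.bd >> /tmp/dl.log 2>&1 & # load overlaps the next dump
done

wait
echo "overlapped dump/load complete"
```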

I experimented with the inline index rebuild but, in this case, it was slower. Doing it that way took about 8 hours in tests.

I also experimented with the ascii d&l and had it running in 6.5 hours with indexes active and without multi-threading any of the tables. But I decided not to go that route, even though I'm sure I could have got the window down to a couple of hours, because 1) 5 hours was plenty fast enough for this client, 2) it would have caused the multi-threaded tables to be somewhat "scattered", and 3) the necessary WHERE clauses for this db schema would have been a serious PITA.

Various factors that helped:

1) Lots of storage areas carefully tuned to their contents.
2) A disk array configuration friendly to a database.
3) Plenty of RAM.
4) Lots of CPUs.
5) Time to test and tweak with a reasonable facsimile of the production configuration.
 