Dumping with Multiple Threads

ron

Member
RHEL 9.2 Linux + OE 12.8.3

After a large purge I need to reclaim a lot of "deleted" space in a few large databases (1 TB to 2 TB), and of course the downtime needed is a major factor. I am looking at various ways to tackle this, one of them being a D/L. I have done many D/Ls before, but not with DBs of this size, and if a D/L is finally chosen as the way to do it, dumping with multiple threads would be especially helpful ... but I have not used this functionality before.

I understand the way it works: each table is split into ranges of rowids, one range per thread (my rough reading of the command syntax is sketched below my questions). My questions are:

Is there a way to know when any particular thread is finished (i.e., its dump file is available for loading)?
Is there a better way to load the dump files back other than running a separate load for each of them in parallel?
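
From the documentation, the command I would be running looks something like this (only my reading of the docs, not something I have tried yet; the database name, table, directory and thread count are placeholders):

# Multi-threaded binary dump of one table: each thread writes its own .bd file,
# and -dumplist records the generated file names for use at load time.
proutil bigdb -C dump pub.ordhistory /dumps/bigdb -thread 1 -threadnum 8 -dumplist /dumps/bigdb/ordhistory.lst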

Ron.
 
I've never done it that way. I've always run different tables in each thread. Essentially you decide how many threads you want, then split up the batch files so each one does roughly the same amount of work. I just used record count, but you could also factor mean record size into the equation. You want File1 to start with a big table and work down to smaller ones, File2 to start small and work up, File3 to start big, File4 to start small, and so on. If each script then moves its dump file to a new location as it completes, you know that any files in that location are ready for loading.
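A minimal sketch of one such dump script (database path, directories and table names are all invented; each stream gets its own table list, ordered as described above):

#!/bin/sh
# Dump stream 1: starts with the biggest table and works down to smaller ones.
DB=/db/prod/bigdb          # database (placeholder path)
OUT=/dumps/work            # dumps land here while in progress
READY=/dumps/ready         # the load job only looks in this directory

for TABLE in pub.ordhistory pub.invoice pub.customer
do
    # Binary dump of one table; the dump file is normally named <table>.bd.
    proutil $DB -C dump $TABLE $OUT && \
        mv $OUT/${TABLE#pub.}.bd $READY/
done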
Start the new DB without integrity checking and have all indexes deactivated. If you screw up you're starting again, but it'll be orders of magnitude faster than doing it with integrity on.
Once the data is loaded, restart with integrity enabled and build all the indexes. In 12.8 there are lots of switches for improving idxbuild performance, as I'm sure you're aware.
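For example, something along these lines (the switch values are purely illustrative; check the 12.8 documentation for the full set of options and for sensible sizes on your hardware):

# All-index rebuild after the load; tune the buffers and thread counts to the machine.
proutil bigdb -C idxbuild all -B 1024 -TB 64 -TM 32 -TMB 64 -SG 64 \
        -TF 80 -datascanthreads 8 -mergethreads 4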
 
Yes, I have done D/Ls like that before, with several parallel dump streams (and likewise for loading) -- but the net result was that the end-to-end time for the full process is (more or less) the time to dump and then reload the very largest table. That's why I'm looking to carve up the biggest handful of tables using multiple dump threads.
 
Hi Ron,
Depending on your system, you may or may not be able to use the following trick.
I dump my "big" tables into a filesystem that uses an "old" JFS (limited to 2 GB files).
Each time a dump file reaches 2 GB, a new dump file is created automatically.
That way I can dump and load the same table at the same time (dumping into file number X while loading file X-1).
With this trick the load finishes only a short time after the dump.
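Roughly, the loading side of that trick looks like this (names, paths and the dump.done completion marker are all invented for illustration; the idea is simply that volume X-1 is safe to load once volume X exists):

#!/bin/sh
# Load each finished dump volume while the next one is still being written.
# Multi-volume binary dumps are normally named table.bd, table.bd2, table.bd3, ...
DB=/db/new/bigdb
DIR=/dumps/bigtable
vol() { [ "$1" -eq 1 ] && echo bigtable.bd || echo bigtable.bd$1; }
N=1
while :
do
    # Volume N is safe to load once volume N+1 exists, or once the dump script
    # has touched dump.done to signal that it is finished.
    if [ -f "$DIR/$(vol $((N+1)))" ] || [ -f "$DIR/dump.done" ]
    then
        proutil $DB -C load "$DIR/$(vol $N)"
        [ ! -f "$DIR/$(vol $((N+1)))" ] && break   # nothing left after the last volume
        N=$((N+1))
    else
        sleep 30   # wait for the dump to get further ahead
    fi
done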

If you can get some extra space for a while,
there is also Mike Furgal's "online dump-load" based on Table Partitioning.
This allows you to "tablemove" a table (online, with some activity on it) into another partition, which can be in another area.
At the end of the process there should be no tables/partitions left in the old area, so you can remove it.
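
For reference, the plain tablemove syntax looks like this (database and area names are invented; Mike Furgal's online dump-load layers Table Partitioning on top of this, so treat it only as an illustration of the basic utility):

# Move a table (and optionally its indexes) into a different storage area online.
proutil bigdb -C tablemove pub.ordhistory "New Data Area" "New Index Area"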

kind regards
Patrice
 
Of course the optimum strategy will depend on the profile of the database ... how many big tables, how big the records are, etc.

FWIW, I have done a dump by recid ranges ... a painful experience with hardware damage and inadequate backups, but I did manage to recover something like 97% of the data.
 
The maximum number of dump threads is defined by the number of keys in the root block of the index used for the dump. In other words, the number of dump threads depends on the chosen index. The larger the table, the more likely its root block contains only a small number of index keys.

Dbanalys does not report the number of keys per root block. One method is to dump the root blocks themselves; I have a script that can dump the blocks even on an online database.

You can force Progress to create multi-volume dumps: make a copy of _proutil without the setuserid bit. That executable will obey the current ulimit, and Progress will switch to a new dump file when the current one reaches the limit. You can load a dump volume as soon as you see its successor.
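
Something along these lines (paths and the size value are only an example; the unit of "ulimit -f" is shell-dependent, typically 512- or 1024-byte blocks, so check your shell's manual, and the copy expects the same environment the proutil script normally sets up):

# Make a private copy of _proutil without the setuserid bit so it obeys ulimit.
cp $DLC/bin/_proutil /tmp/_proutil
chmod u-s /tmp/_proutil

# Limit the file size for this shell session (value assumes 1024-byte blocks, i.e. ~2 GB).
ulimit -f 2097152

# Run the dump with the copy; a new dump volume is started each time the limit is reached.
/tmp/_proutil bigdb -C dump pub.ordhistory /dumps/bigdb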
 
Take care with LOBs if you use the ulimit trick.

Defect # OCTA-31889: "Binary dump fails with errors (9201) and (14624) when ulimit hit on AIX" (status: Ready To Develop) occurs when dumping a table with a CLOB or BLOB.
 
The threaded dump can be very effective but, as George mentions, it will depend on your indexes. And you might benefit from doing an idxcompact first to clean up your post-purge situation.
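
For example (table, index name and the target fill percentage are placeholders; idxcompact can run against an online database):

# Compact one index online, aiming for roughly 90% block utilization.
proutil bigdb -C idxcompact pub.ordhistory.ordhist-date 90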
 