Re-indexing - sort files

ron

Member
Re-indexing our database (just after dump/load) took 21 hours. It's 90GB - so the 21 hours was not a surprise.

We have space on 3 regular discs - and also a lot on the OS disc (FC-AL). The re-index used 30 * 1.5GB .srt files. The question is: what's the best strategy for specifying .srt files? Should we scatter the files around the available discs (i.e. list them in a haphazard order)? Is it a good/bad idea to use the OS disc?

(We have 16GB memory and specified idxbuild all -TB 31 -TM 32.)
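To illustrate what I mean by "scattering": I believe the sort extents can be listed in a mydb.srt file in the database directory, something like the following (the paths are invented and I'd need to check the manual for the exact size unit, but the idea is fixed extents interleaved across the three discs with a final zero-size variable-length extent as overflow):

    1500000 /disk1/srt/
    1500000 /disk2/srt/
    1500000 /disk3/srt/
    1500000 /disk1/srt/
    1500000 /disk2/srt/
    1500000 /disk3/srt/
    0 /disk3/srt/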
 

ron

Member
Thanks for the replies.

We always set -B to 75000 ... and we're about to increase this to the max (500000).

I've read all the Progress Knowledge info; it's good stuff ... but it doesn't answer my question. :drink:
 
One trick which I have used to great effect is to specify an area of memory as a file system and put the sort files into that (and I think this is what MurrayH was referring to).

You can create a mount point as a pointer to memory and create your sort files in there. You could safely use about 12GB of memory this way, and it appears to be more efficient than using a large -B, since the sorting is then done in memory rather than via disk access.
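A rough sketch of what I mean, on Solaris (the mount point, size and database name are only examples, and your SA should confirm the tmpfs options and how much memory is really spare):

    # As root: create a 12GB memory-backed file system (example mount point).
    mkdir -p /sortmem
    mount -F tmpfs -o size=12288m swap /sortmem

    # Then point the rebuild's sort/temp files at it with -T.
    proutil mydb -C idxbuild all -TB 31 -TM 32 -T /sortmem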

Since this is FC-AL I would suggest that you should NOT stripe the disks together, but rather put a sort file on each spindle separately.

I see no reason not to use the O/S disk since the entire rest of the system will be quiet while this happens.

I would suggest a -B of no more than 5000 (8K blocks). Others may disagree and may provide evidence that it works better with a higher value, in which case I bow to their knowledge and experience.

Good luck with this.
 

ron

Member
I tried what you suggested. Doing a full re-index needed 48GB of .srt space - and I could afford to use 12GB of memory - so I split the re-index into four equal parts and put the .srt files in /tmp (using Solaris 8, where /tmp is memory-backed ... so long as you're careful).
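In case anyone wants to copy this, each of the four runs looked roughly like the following (the database name is a placeholder; on 8.3E the subset of tables is chosen from the interactive menu rather than on the command line):

    # Sort files go to memory-backed /tmp via -T; keep an eye on space.
    df -k /tmp
    proutil mydb -C idxbuild -TB 31 -TM 32 -T /tmp
    # ...choose the "some" option (or however your version words it) and
    # select roughly a quarter of the tables, then repeat for the other
    # three groups.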

The difference was remarkable! The total job was reduced from 21 hrs 40 mins to 9 hrs 35 mins.

Thanks a lot!

Ron. :)
 
Having read your post I was happy that you had succeeded in your venture, but then I re-read it and was a little alarmed and disappointed!

Using /tmp is a risky way to do this (since filling /tmp will often crash a Unix box), and it seems to have used some underlying disk as well, since otherwise the performance should have been significantly better.

Usually on Solaris /tmp is mounted on swap, and swap is not memory. I would have expected better gains from using an actual memory file system (a file system of type "mfs"). I cannot remember how to create one, but it is possible.

Given that the index rebuild is still 3 phases (read db, sort keys, write keys) I would have expected nearly 5x performance (4.3 hours) rather than the paltry 2.3x (9.5 hours) you got.

This is based on the following thoughts:
The sequential read of the database will be significantly faster, since it will be writing the keys to memory rather than to disk. Maybe 1/5th of the time required.
Sorting the keys will be immensely faster, since the whole operation is in memory.
Writing the keys will take (nearly) the same amount of time, since the I/O activity will be writes to the FC-AL disk.

To make this even faster (:D) you could set the database up into multiple areas, each containing about 1/4 of the data and keys, and rebuild each area separately. This would reduce the amount of wasted reads that you do!

Still - the job is done successfully. I'm only tweaking the process further!
 

joc

New Member
Hi,
After reading your posts I was hoping to try the solution on our database.
The database is currently almost 4GB (1GB made up of indexes) and an index rebuild takes around 3 hours to complete. We really only have 1GB of spare memory to work on.
Does anyone have any suggestions?
 
Hi

The size of the indices is not necessarily an indication of the size of the sort file! Progress compresses indices very effectively when it stores them, so handle with caution!

You say you "only have 1GB of spare memory" which suggests to me that you are doing this on a host which is in use for something else. Be VERY careful. This technique can have major implications for production applications since the CPU & memory load increases significantly while the I/O load drops.

OK - caveat done!

I don't know how much of this you have already done, so please forgive me if I state the obvious.


If this is not a production application, or you can get a test box, then you may want to try a few tests to see how much sort file space you need. Start with the largest tables in terms of records and rebuild them one at a time, taking note of the sort file sizes. You should be able to get to a point where you can make reasonably accurate predictions about sort file sizes.
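A crude way to capture how much sort space each table needs is simply to poll the sort directory while the test rebuild runs, e.g. (database name and paths are only examples):

    # Session 1: rebuild one large table's indexes on the test copy,
    # with sort files directed to a known directory.
    proutil testdb -C idxbuild -TB 31 -TM 32 -T /srtwork

    # Session 2: log the sort space used once a minute.
    while true
    do
        date
        du -sk /srtwork
        sleep 60
    done >> srt_usage.log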

When you do the index rebuild are you using multi-volume sort files? Are these sort files on a different physical disk to the database? Are you using -TM and -TB and -B?

3 hours seems a long time to me... I would expect 4GB to take an hour or so, but then I am a hopeless optimist!
 

ron

Member
Originally posted by toby.Harman
Having read your post I was happy that you had succeeded in your venture, but then I re-read it and was a little alarmed and disappointed!

Using /tmp is a risky way to do this (since filling /tmp will often crash a Unix box), and it seems to have used some underlying disk as well, since otherwise the performance should have been significantly better.


Toby: We had the good fortune of being able to test everything beforehand on an identically-sized machine. We monitored /tmp and knew that there was no problem with filling file-systems. We also monitored each of the four index sessions ... and considering the LONG time consumed by scanning (as distinct from sorting and writing) I doubt that we could have improved much on the result we achieved.

Usually on Solaris /tmp is mounted on swap, and swap is not memory. I would have expected better gains from using an actual memory file system (a file system of type "mfs"). I cannot remember how to create one, but it is possible.

Given that the index rebuild is still 3 phases (read db, sort keys, write keys) I would have expected nearly 5x performance (4.3 hours) rather than the paltry 2.3x (9.5 hours) you got.

This is based on the following thoughts:
The sequential read of the database will be significantly faster, since it will be writing the keys to memory rather than to disk. Maybe 1/5th of the time required.
Sorting the keys will be immensely faster, since the whole operation is in memory.
Writing the keys will take (nearly) the same amount of time, since the I/O activity will be writes to the FC-AL disk.

To make this even faster (:D) you could set the database up into multiple areas, each containing about 1/4 of the data and keys, and rebuild each area separately. This would reduce the amount of wasted reads that you do!


Toby: We're using 8.3E so "areas" are not available to us. Moving to 9.x is a high priority for the next couple of months. I understand the point you're making.

Still - the job is done successfully. I'm only tweaking the process further!

*** Thank you Murray and Toby ... your tip was valuable. Reducing the indexing time from 21 hrs to 9.5 hrs might seem "paltry" - but it was magic to us ... because it meant that we were able to do a complete machine transition plus D & L over a two-day period ... and that was the ultimate goal. We did it ... and the results have been "very excellent" (as my kids would say).
 

joc

New Member
Hi Toby,

Thanks for the reply.

The host really only contains this database - along with some applications that are required to run with the database (e.g. bank connections, etc).

We are currently using an srt file made up of 8 * 250MB fixed-length extents and 1 variable-length extent. The sort files are on a different physical disc from the database; however, all of these files write to the same disc. We do have spare discs - would it make a difference if the sort files were divided over different discs?

We use -TB 31 -TM 32 -B 2048.

I will start to rebuild the tables one at a time to see if this can shed any light.
 
Hi Joc

Good that there is only this database on the host, though the comment seems a little indefinite!!!

Sort file - This is what should be placed in memory if possible. It should make no difference whether the sort file is on one or many disks, though if you have spares it MAY be worth configuring them into a RAID-0 array and trying that (on the principle that striped disks may be quicker). However, as I keep pointing out, if this is Sun and FC-AL disks then DON'T stripe!

-TB, -TM and -B look good!
 
Hi Ron

The initial scan of the database gets MUCH quicker if you can get the sort files into memory rather than swap. This is actually the area where you get some of the greatest gain, so your comments about the LONG time spent scanning indicate it MAY be worth using an actual memory file system next time.

The initial scan writes all the relevant index keys into the sort file, and since these writes are now happening at memory speeds rather than disk speeds the results can be dramatic!

Since you have an identical test machine it may be worth trying!!!

Glad to be of service!
 

Keith Owens

New Member
Hi Toby,

I read the references to having srt and bi files in memory and am very keen to experiment with a binary load with the bi file in memory.

What I can't discover is how to set up the memory file system.

Any help or pointers would be much appreciated.

Regards
Keith
 
1. NOT THE BI FILE!!!! - That stays on disk!

2. Platform? Compaq Tru64 Unix supports a file system type called "mfs" (memory file system); you define it in the mount table and mount it. That allocates the memory and away you go.

What you need essentially is a large chunk of memory which is allocated to look like a disk. Then point your FIXED extent sort files to it and have some overflow ones on real disks!

The file system is a function of the O/S rather than Progress, so it's System Admin time rather than DBA time.
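On Solaris (which most of this thread is about) the equivalent is tmpfs, and the setup might look like this; treat it as a sketch with an example mount point and size, and cap the size at whatever your SA says is genuinely spare:

    # One-off mount, as root (8GB cap in this example):
    mkdir -p /srtmem
    mount -F tmpfs -o size=8192m swap /srtmem

    # Or make it permanent with a line in /etc/vfstab:
    #   swap  -  /srtmem  tmpfs  -  yes  size=8192m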
 

CtL

New Member
Good stuff!

For discussion purposes I'd like to know people's opinions on the following:

The RAM disk methods are very nice, but could easily exceed your free RAM capacity. Two alternative arrangements come to mind:

a) Do not tune "up" -TM and -TB to the maximum. This will mean less performance, but also smaller .srt files. Determining the optimal balance between decreasing these and still getting the performance gain from using a RAM disk would be a tedious process.

b) Consider doing each table separately. I don't believe that this will cause a full rescan under v9. It will use a much smaller .srt. Unfortunately you can't do any parallel operations with this. Again, only a lot of testing would show whether the total performance gain would outweigh doing everything straight to disk.

So, what do you folks think?

Lastly: If you are reindexing for performance (instead of as part of a DnL) and are on v9, look at proutil IDXCOMPACT. It appears you could do several of these as parallel operations, which is wonderful for speeding things up (usually!). Combining this with a RAM disk would be pretty interesting.
 
1. Using RAMDISK. Yes - it cannot be emphasised enough that this technique has risks. On Unix (and I did check Solaris, where there is a command "mount_xmemfs") you can specify a fixed amount of memory and mount it as a disk. This can then hold the fixed-extent sort files. You should also allow for overflow and have that go to your variable-length sort file on a physical disk. Ideally you should have a good idea of the amount of sort file space you will need.
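Concretely, the "fixed extents in memory, overflow on real disk" arrangement might look like this in the dbname.srt file (mount point, sizes and the size unit are only examples - check the exact unit in the documentation for your release):

    2000000 /srtmem/
    2000000 /srtmem/
    2000000 /srtmem/
    0 /disk2/srtoverflow/

The zero-size entry is the variable-length extent that catches anything which does not fit in memory.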

A good system admin should be able to tell you how much memory you can have and set up the mount point. Having never used a Progress Database on a Microsoft product I cannot comment on Windows memory handling ;)

a) Do not tune -TB and -TM. I would agree that the documentation suggests that this will reduce memory usage, but it will also result in a slower index rebuild. I think there is a happy medium between maxing these out and leaving them at the default. As with all things performance-related, no two people's experiences are ever exactly the same!

b) Do tables separately. I think that this should be "Do each Data Area separately". This will reduce the scan size, since it only needs to scan the area in which the data blocks are stored. I don't believe that selecting one table necessarily reduces the number of blocks scanned unless the database is partitioned into areas.

As for idxcompact, I believe that is the best online solution to the D&L problem. Wiser heads than mine have stated that 70%+ of the performance gained from a dump and reload can be achieved with an index rebuild, and IDXCOMPACT appears to do exactly what an index rebuild would do. However, I don't see any tunable parameters listed for this. I would be tempted (if it is available) to use private buffers for this, or there would be a significant risk of major performance degradation due to buffer flushing.
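As a sketch of the parallel idea (the database and index names are invented; the trailing number is the target percentage of index block utilisation, which I believe defaults to 80):

    # Compact two indexes concurrently, then wait for both to finish.
    proutil mydb -C idxcompact customer.cust-name 80 &
    proutil mydb -C idxcompact order.order-date 80 &
    wait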

Enjoy!
 