[Progress Communities] [Progress OpenEdge ABL] Forum Post: RE: Dump, Load and Index Rebuild on Type II Database. - Performance

Status
Not open for further replies.

Rob Fitzpatrick

Guest
I concur with all of Tom's points above. Some thoughts:

I assume you have a reasonable Type II structure, with each large table in its own area and each large table's indexes in their own area. Each area should contain objects of only one type: tables, indexes, or LOBs. No application objects should be in the schema area or other Type I areas.

32 GB of RAM isn't much for a database of this size. Increasing it, even if only temporarily for the duration of the idxbuild, could help performance; it would give you more sort/merge buffers and could reduce disk I/O. The same goes for vCPUs, assuming they aren't thin-provisioned; more CPUs could allow you to improve performance with more threads.

Re: idxbuild all -TB 64 -TM 32 -l 2000 -B 1000 -SG 64 -TF 80 -thread 1 -threadnum 8 -mergethreads 8 -datascanthreads 8 -pfactor 90 -rusage:

- You don't need -thread 1; it is the default on Enterprise.
- -l is a client startup parameter (local buffers); it isn't useful here, as far as I am aware.
- With 8 CPUs you likely have far too many threads. The idxbuild maximum thread count is threadnum * (mergethreads + 1) + 1. With these parameter values you will have up to 73 concurrent threads, depending on how many merge groups are in use for a given area. Aim for threads = 1.5 * CPUs, e.g. -threadnum 3 -mergethreads 4 (with 8 CPUs).
- Every thread opens every file, so if you have a lot of extents and you aren't running as root it is possible to hit the open file descriptor limit (ulimit -n). If your idxbuild runs out of file handles and crashes, you have to restore from backup.
- -pfactor 90 is on the high side. Packing keys that tightly into index blocks could cause a lot of expensive block splits as you do creates after the rebuild. Use -pfactor 80.

Look in the log (and idxbuild stdout) for the 11480 and 11483 messages per area. They will tell you how much I/O you did to the temporary files on disk.
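The thread-count arithmetic above is easy to sanity-check before you launch a run. A minimal sketch (the function name is mine; the formula is the one quoted above):

```python
def max_idxbuild_threads(threadnum, mergethreads):
    """Upper bound on concurrent idxbuild threads:
    threadnum * (mergethreads + 1) + 1, per the formula quoted above."""
    return threadnum * (mergethreads + 1) + 1

# The posted parameters: -threadnum 8 -mergethreads 8
print(max_idxbuild_threads(8, 8))  # 73 -- far too many for 8 CPUs

# Suggested for 8 CPUs: -threadnum 3 -mergethreads 4
print(max_idxbuild_threads(3, 4))  # 16
```

Remember this is a ceiling, not a steady state; actual concurrency depends on how many merge groups are active in a given area.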
The ideal is to have no I/O at all, i.e. all of the sort/merge activity is in the -TF buffers. E.g.:

DBUTIL : (11480) Temporary sort file at /home/robf/db/idxbuild used up 0K of disk space.
DBUTIL : (11483) A total of 0K of temporary sort disk space was used for area 8.

If you have a number larger than 0K then you are doing disk reads and writes during the sort/merge phase for that area. This could possibly be eliminated by adding RAM, or by restructuring the areas so that you are processing less key data (fewer objects per area).

This disk I/O is done to the temporary files. Their location is specified by the -T or -SS parameters. If neither parameter is specified then those files will be in the current directory, taking away some of your I/O bandwidth if that is on the db volume. If you do have to do disk I/O, it is preferable to do it on non-database disks, or use -SS to spread the I/O across multiple volumes, if possible.

You have specified -datascanthreads 8; try 12. Note that this parameter has restrictions: data must be in Type II areas; you must answer 'yes' when asked if you have enough disk space for sorting; the data area being scanned must not also contain indexes being rebuilt; and the area being scanned must not contain tables that have word indexes. If these restrictions are not met for a given area then its data scan is single-threaded. If you have tables with word indexes, then: if the table is small, put it in a "word indexes" area; if the table is large, it should be in its own area anyway.

With a database this large, it can be helpful to load data and build indexes in a pre-grown set of extents (if you are writing to variable extents). For example, do a dry run where you load the data and build the indexes. Then truncate all the areas, leaving them logically empty. Then in your next load/build, you can write the data without the cost of any extend operations.

Another small optimization you can try is to truncate index areas prior to the rebuild.
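If you have many areas, scanning the log for nonzero 11483 values by hand gets tedious. A quick sketch that flags spilled areas; it assumes the message text matches the examples quoted above, so adjust the pattern for your OpenEdge version if the wording differs:

```python
import re

# (11483) reports total temp sort disk space per area; format as quoted above.
PAT_11483 = re.compile(
    r"\(11483\) A total of (\d+)K of temporary sort disk space "
    r"was used for area (\d+)\."
)

def areas_with_disk_sort(log_lines):
    """Return {area_number: KB} for areas whose sort/merge spilled to disk."""
    spilled = {}
    for line in log_lines:
        m = PAT_11483.search(line)
        if m:
            kb, area = int(m.group(1)), int(m.group(2))
            if kb > 0:
                spilled[area] = kb
    return spilled

sample = [
    "DBUTIL : (11480) Temporary sort file at /home/robf/db/idxbuild used up 0K of disk space.",
    "DBUTIL : (11483) A total of 0K of temporary sort disk space was used for area 8.",
]
print(areas_with_disk_sort(sample))  # {} -- no disk I/O, the ideal
```

An empty result means every area sorted entirely in the -TF buffers; any area that shows up is a candidate for more RAM or restructuring.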
This eliminates the first phase of the index rebuild for each area, the index scan. Before attempting this, be sure to check that your index areas contain only index objects, not tables or LOBs. If you have the testing time, try it and see how much time it saves.

Not a performance optimization, but still important: be sure to back up your DB between the load and the idxbuild.

Re: dump: compacting or rebuilding the primary indexes in your source DB prior to the dump might help a little with dump performance.
