Forum Post: Performance Degradation After Dump & Load

  • Thread starter Piotr Ryszkiewicz
Hello, I would like to share with you my strange experience with a dump & load of a database. I have been struggling with this for quite a long time now, and in fact I already have a solution, but I do not understand what really happened, so maybe someone more experienced will be able to explain. I apologise in advance for the rather long post. The situation is the following:

- We have a database with just one huge table (about 200 GB). Data and indexes are separated into their own areas. The data area has an RPB (records per block) of 32 and a cluster size of 512. The block size is 8 KB. DBANALYS shows that the average record size is 395 B, with a minimum of 210 B and a maximum of 1668 B. All extents are fixed size. OE is 11.3.3 on AIX 64-bit.
- Data in this table are created in the order of its primary index (current date + ID from a sequence). Records are never updated after they are created.
- The database was dumped and loaded (I tried both binary and bulk load) with no changes to its configuration. The dump and load were also done in the order of the primary index.
- After the dump & load of this database we observed massive performance degradation. I was able to narrow the problem down to a simple sequential read of the data based on the primary index. Reading through data created by the dump & load was approximately 15 times faster than reading through data created afterwards, during normal production.
- To eliminate the influence of other processes running on the database, I started to experiment on a copy of the database with no other users connected. I created a test program which did sequential reads of the data day by day and, after each day, looked at the changes in all relevant VSTs. I found that the difference between reading data created by the dump & load and data created in production showed up in _ActIOFile only. Data created by the dump & load were located truly sequentially (almost all reads came from the same or adjacent extents), but data created in production were scattered all over the database, so the read was not really sequential, which explains why it was slow. I mean reads from the data area here - reads from the index area were low and sequential in both cases. It looks like the dump & load process had left holes in the database which were later filled during production.
- After several days things went back to normal - sequential reads were as fast as before and the data were placed sequentially at the end of the database. It looks like all the holes were filled during the initial days of production.
- At this point I contacted technical support, as I was sure it was a bug. It took quite a long time and I did not get a satisfactory answer. They basically said it is a feature and suggested decreasing the RPB to 16. I did not like that idea, but I had another one - to play with the CreateLimit of the table.
- At the beginning I tried to simulate the whole process - dump & load and production - without any changes, just to have a baseline to compare against. Unfortunately my first tries were not successful - that is, everything worked well: data created by the simulated production were not scattered at all, but created nicely at the end of the database. It was clear that the problem was with the production simulation, as the dump & load was done in exactly the same way as before. After some investigation I found out where my simulation programs differed from real production. In the real production process the records in this table are not created with a single ASSIGN statement; the assignment is divided into several blocks - but still within a single DB transaction (a sketch of the two patterns follows below).
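To show what I mean, here is a minimal sketch of the two creation patterns. EventHistory, its fields, the sequence NextEventId and the variable cPayload are names I made up for the example - only the shape of the code matters, not the actual production program.

    DEFINE VARIABLE cPayload AS CHARACTER NO-UNDO.

    /* Pattern A - what my first simulation (and the load) did:
       the whole record is assigned in one go. */
    DO TRANSACTION:
        CREATE EventHistory.
        ASSIGN EventHistory.EventDate = TODAY
               EventHistory.EventId   = NEXT-VALUE(NextEventId)
               EventHistory.Payload   = cPayload.
    END.

    /* Pattern B - what real production does: same single transaction,
       but the assignment is split into several blocks. */
    DO TRANSACTION:
        CREATE EventHistory.
        ASSIGN EventHistory.EventDate = TODAY
               EventHistory.EventId   = NEXT-VALUE(NextEventId).

        /* ... other processing in between ... */

        /* As described above, in this shape the record can already be
           allocated in a block somewhere in the middle of the transaction,
           before its final size is known. */
        ASSIGN EventHistory.Payload = cPayload.
    END.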
At this point I remembered one of the sessions at the PUG last year - a record is not allocated at the end of the transaction, but in some cases somewhere in the middle. When I changed my simulation program that way, I got the same result as in production.

- When I found that out, I thought I knew the reason and had the solution. My theory was the following: when creating a new record, the database tries to keep it within a single DB block. During the load process it always knows the size of the current record and the free space in the current block, so if the record fits it is created in the current block, otherwise it is moved to the next block. That leaves free space in the blocks, and the average size of this free space should be lower than the average record size. As the minimum record size is higher than the default CreateLimit, it should never happen that a record is moved to the next block because of hitting the CreateLimit - always because of the record size (see the sketch at the end of this post). Increasing the CreateLimit after the load to a value close to the average record size should therefore mean that during production this empty space is not filled (in most cases, at least).
- My first test was to increase the CreateLimit to 350 after the load. Then I started the production simulation and got the expected result - data were created sequentially at the end of the database.
- I could have stopped at this point and been satisfied, but (un)fortunately I didn't. My next test was to perform the load with the CreateLimit set to the minimum value (32). If my theory was correct, this should have had no impact on the load, as the minimum record size is higher than the default CreateLimit. But that was not true. The load packed the data better than before (the HWM, the high water mark, went down from 28124159 to 27628031 after the load). As I understand it, that is only possible if records could be split across more (2?) blocks during the load. But that should mean that the holes in the blocks should be no bigger than the current CreateLimit. And if so, why were these holes filled afterwards in production, if the CreateLimit was still the same?
- After that load I set the CreateLimit back to the default value (150) and simulated the production. It was still slowed down, but not as dramatically as in the previous case (smaller holes? fewer of them?). On the other hand, the sequential read through the loaded data was slower than before, and my test program showed some scattering. It looks like with a CreateLimit of 32 the database fills some of the holes created during the load within the same load.

That is all I have found out so far. As you can see, the solution to my problem is quite simple - keep the default CreateLimit during the load and increase it afterwards. But I would also like to understand what really happened. Are records split during the load or not? Why does the database leave holes during the load which are apparently bigger than the CreateLimit? Is there any way to avoid that? I would appreciate any suggestions :)

Best Regards,
Piotr
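P.S. To make the CreateLimit reasoning above a bit more concrete, here is a minimal sketch of the fit test as I understand it - my assumption is that a record may be created in a block only if at least CreateLimit bytes of free space remain after it is added (please correct me if that is wrong). The numbers are the ones from DBANALYS above, and RecordFits is just a name I made up.

    /* Fit test as I understand it: a record goes into a block only if the
       block still has at least CreateLimit bytes free after the create. */
    FUNCTION RecordFits RETURNS LOGICAL
        ( INPUT piFreeSpace   AS INTEGER,    /* free space left in the block (bytes) */
          INPUT piRecordSize  AS INTEGER,    /* size of the record being created     */
          INPUT piCreateLimit AS INTEGER ):  /* create limit in effect               */
        RETURN piFreeSpace - piRecordSize >= piCreateLimit.
    END FUNCTION.

    /* A 600-byte hole left behind by the load: */
    DISPLAY RecordFits(600, 395, 150)   /* yes - an average record fits with the default limit   */
            RecordFits(600, 395, 350)   /* no  - with the limit raised to 350 the hole is skipped */
            RecordFits(600, 210, 350).  /* yes - the smallest records can still sneak in          */

With the limit raised to 350, close to the average record size, most of the holes left by the load are no longer reusable - only the smallest records can still land in them - which matches what I saw in my tests.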
