"Lazy" BIW

Our database has BIW and AIW, plus 3 APW.

We consistently see Writes by BIW or AIW of less than 10%. I imagine this means that something else is doing the other 90% of the work.

I've read suggestions that reducing the biblocksize will improve these figures, but we cannot set this any lower as it is already the same size as the database blocksize.

I feel that I'm not getting the full benefit of these dedicated servers and was wondering if anyone could suggest a way of improving their performance.

If it helps, we have
biclustersize 512 KB
16 bibufs
25 aibufs
8K dbblocksize and biblocksize
 
BI cluster size is my suggestion for the culprit. To be more accurate on that I would want to know the frequency of checkpoints and the number of buffers flushed at checkpoint. I would start at 2MB and go from there. As a rule of thumb, checkpoints should occur about every minute or so at peak load; anywhere between one and five minutes is fine. The only times to decrease BI cluster size are when the truncate BI process takes too long or when you get occasional pauses in the system due to checkpointing.

Is 2 Phase Commit (2PC) involved anywhere? Enabling it sets the -Mf flag to 0, which can really hurt your ability to use BIW and AIW.

Also, you have 3 APWs. These will do writes to the BI and AI on behalf of the client, so the situation may not be as bad as you think. Try stopping one of the APWs (or not restarting it next time you shut down).

Ultimately if the work gets done and the application responds well then does it really matter? What matters is the result, not the elegance of the process achieving the result!

promon R&D 2 1, or check the _Checkpoint VST (only the last 5 are recorded).

The BIW algorithm requires there to be a good deal of time between checkpoints for it to work efficiently.

Hope this helps
 
Thanks for your help:).

Our checkpoints are happening every 2 minutes, so I guess there's no large improvement to be made there.

In the meantime, I'll try killing off one of the APWs and see whether this gets better performance from the BIW and AIW.

2PC is enabled, but I was not aware that this would automatically set the -Mf to 0. Can I subsequently reset it to 3 or higher, or does 2PC force it to stay at 0?
 
Sometimes I amaze even myself!

See KBase KB15425 for confirmation of the -Mf setting under 2PC.

I think the answer may be to use -groupdelay, which does the same thing as -Mf except that it is measured in milliseconds. Have a look at KBases KB19295 and KB18003.

Unless your databases are running on different boxes, I would question the use of 2 phase commit. I have tried pretty hard by killing a database or a server with a -9 (on a test box) at various points through a distributed transaction to force the databases out of sync but was unable to. If this is to obtain a consistent picture for a backup and you have a genuine 24x7 operation then I understand but.....!

Maybe you could consider merging the databases together?
 
I meant to ask - Are there flushes being performed at checkpoints?

The real purpose of the BIW is to provide an independent mechanism to reduce the buffers flushed at checkpoint, because those flushes will cause strange behaviour all over the place.

Ultimately it sounds as though there are not a hugely significant number of writes happening to this database if a 512KB BI cluster is lasting 2+ minutes at peak times. Counter-intuitively, the way I read those KBases, they actually make an argument to decrease the BI cluster size and set -groupdelay to 100 (ish!).

This is definitely an area for some well documented testing and patience, especially from the users.

All in all, if the performance is satisfactory - who cares whether the BIW is doing 1% or 100% of the writes to the BI file!
 
Toby,
Thanks for all your help.

There are no flushes occurring at checkpoints.

Much as I'd love to merge all our databases, business requirements dictate that certain databases may be shut down without affecting others.

However, because we must be certain that events are recorded in our audit database, we were advised to have 2PC enabled.

I'll look up the knowledgebase entries on the much deteriorated Progress Knowledge Centre and get back if I have any further questions.

Thanks, once again
 
OK - Sounds like your application knows a few tricks about dynamic connects and disconnects.

If all the databases are on the same server it is THEORETICALLY possible to get a write to one and not another. However, in all my years (and there's been a few!) I have never managed this! If the databases are on separate hosts then go with 2PC.

It sounds like performance is not too foul. Did the groupdelay help at all? This started as a thread on lazy BIW but has (once again!) turned into Performance tuning!
 
We're running 24/7 except for reboot on Wednesday morning, so I haven't been able to introduce -groupdelay as yet.
As you can imagine, this also makes tuning an absolute pain in the a***.

You can only change startup parameters once a week.
All the average statistics, such as writes per minute, are skewed by quiet overnight periods.

Our partial BI writes are approximately 95% of all BI writes, but I'm not sure whether this is due to poor tuning or low activity.
BI writes per second run at about 0.75, but this figure includes the quieter overnight and weekend periods.

I was considering setting -groupdelay to 250 and monitoring it over an hour at a busy period.
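
For anyone wanting to measure only the busy hour rather than the weekly averages, here is a minimal Python sketch. It assumes you note down the cumulative counters by hand at the start and end of the window; the `BiSample` layout and field names are made up for illustration and are not a promon or VST format.

```python
from dataclasses import dataclass


@dataclass
class BiSample:
    """Cumulative counters noted by hand at one moment (hypothetical layout)."""
    taken_at_s: float        # seconds since an arbitrary fixed start
    bi_writes: int           # all BI writes so far
    partial_bi_writes: int   # BI writes of a partially filled block so far
    biw_writes: int          # BI writes performed by the BIW so far


def busy_window_stats(start: BiSample, end: BiSample) -> dict:
    """Rates for the window between two samples, ignoring everything before it."""
    elapsed = end.taken_at_s - start.taken_at_s
    writes = end.bi_writes - start.bi_writes
    if elapsed <= 0 or writes <= 0:
        raise ValueError("need a later sample with at least one BI write")
    partial = end.partial_bi_writes - start.partial_bi_writes
    by_biw = end.biw_writes - start.biw_writes
    return {
        "bi_writes_per_s": writes / elapsed,
        "partial_pct": 100.0 * partial / writes,
        "biw_pct": 100.0 * by_biw / writes,
    }


# Invented numbers for a one-hour window, purely to show the arithmetic.
before = BiSample(taken_at_s=0, bi_writes=1000, partial_bi_writes=950, biw_writes=80)
after = BiSample(taken_at_s=3600, bi_writes=19000, partial_bi_writes=11750, biw_writes=3680)
stats = busy_window_stats(before, after)
print(stats)
```

Because only the differences between the two samples are used, the quiet overnight and weekend periods drop out of the result.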
 
Ow. The bit about partial BI writes being 95% of all BI writes hurts lots.

Maybe you could decrease the bi block and cluster size (thankfully only requires a truncate!)

I would definitely go with the groupdelay, and would consider dropping the BI cluster size.... Halving it should mean that checkpoints occur every minute or so, which should be fine, but it MAY kick the BIW into a higher gear. The way it was explained to me is that the BIW works out how many buffers it has to flush and how long it has to flush them, and sets about writing them out at a certain rate. Decreasing the time available can help make it more aggressive.
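
The arithmetic behind "halve the cluster, halve the interval" can be sketched in Python as below. This is only a rough estimate under one loud assumption: the BI fills at a constant rate, so the time between checkpoints scales linearly with cluster size. The function names are invented for this example, and the 1 to 5 minute band is the rule of thumb quoted earlier in the thread.

```python
def estimate_checkpoint_interval(current_cluster_kb: float,
                                 current_interval_min: float,
                                 new_cluster_kb: float) -> float:
    """Estimated minutes between checkpoints after resizing the BI cluster.

    Assumes a constant BI fill rate (an assumption, not a guarantee).
    """
    if min(current_cluster_kb, current_interval_min, new_cluster_kb) <= 0:
        raise ValueError("all inputs must be positive")
    return current_interval_min * new_cluster_kb / current_cluster_kb


def within_rule_of_thumb(interval_min: float,
                         low_min: float = 1.0,
                         high_min: float = 5.0) -> bool:
    """True if the interval falls inside the 1 to 5 minute rule of thumb."""
    return low_min <= interval_min <= high_min


# 512 KB clusters with checkpoints every 2 minutes, halved to 256 KB.
halved = estimate_checkpoint_interval(512, 2.0, 256)
print(halved, within_rule_of_thumb(halved))
```

Real workloads are bursty, so treat the result as a starting point and confirm it against the observed checkpoint times after the change.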

Otherwise you are into the black art of BIW and APW tuning (best left to Gus at Progress and a couple of others!)

You are using an AIW. This can also cause BI partial writes, so you could try killing that (Advantage: no downtime required, just kill the process!!). APWs can also cause partial BI writes, and 3 APWs is a lot unless you have a high user count and have -directio on.

You haven't said if this app is client-server or host based?

Hope this is all still useful.
 
The application is client-server running on WebClient with a steady 200+ users during office hours.

I have knocked off some of the APWs and this has forced the BIW & AIW to do a bit more work. I have also introduced -groupdelay at 250, which has reduced the BI partial write ratio to 60%ish.

I can't reduce the biblocksize as it is already the same size as the database block size. Perhaps this is an unavoidable side-effect of having read-heavy databases with large blocksize?

Checkpoints are currently every 10 minutes.

The BI cluster size is currently at 512 KB. I'm sorely tempted to drop it to 128 KB. Any opinions?
 
Hmmmm

I think I would drop the BI cluster size as suggested to try and provoke the BIW into being a little more aggressive.

I suspect we are reaching the point of doing things for the sake of it, rather than doing things to improve performance!

If you are feeling brave, you could always try reducing the database block size (remembering to re-set the various records per block and -B etc). This would allow you to drop the BI block size but we are really reaching here!

Reading KBase KB18003, I see that the maximum they recommend for groupdelay is 500, and that if partial writes are more than 50% you should increase it. I would consider taking groupdelay to 350 given the above.
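
As a rough illustration of that rule, here is a small Python sketch. The 50% threshold and the 500 ms ceiling are the figures quoted above; the 100 ms step is my own assumption (it happens to match going from 250 to 350), and `next_groupdelay` is a made-up helper, not anything Progress ships.

```python
def next_groupdelay(current_ms: int,
                    partial_write_pct: float,
                    step_ms: int = 100,          # assumed step, not from the KBase
                    max_ms: int = 500,           # recommended maximum quoted above
                    threshold_pct: float = 50.0  # "more than 50%" rule quoted above
                    ) -> int:
    """Suggest the next -groupdelay value to try, in milliseconds."""
    if partial_write_pct > threshold_pct:
        # Too many partial writes: wait longer, but never past the ceiling.
        return min(current_ms + step_ms, max_ms)
    # At or below the threshold: leave the setting alone.
    return current_ms


print(next_groupdelay(250, 60.0))  # partial writes still above 50%
```

Change one thing at a time and re-measure over a busy period before taking the next step.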

Good luck!
 