Question BI Read

Cringer · Feb 11, 2015

If I'm monitoring User IO on a DB and I see users with BI Read activity, what does that mean? Are they backing out transactions?

Rob Fitzpatrick · Feb 11, 2015

Yes, I believe so.

Cringer · Feb 11, 2015

Thanks Rob. We've got some very strange performance issues at the moment and I've never really seen anything significant on that column. I'm suspecting it's a symptom rather than a cause though. People see the "Not Responding" message on the application and so kill it. Wish they wouldn't though as that causes the AppServer to have a hissy fit when it doesn't know where to return results to! lol

Rob Fitzpatrick · Feb 11, 2015

Windows users often mistake a window with "Not responding" in its title bar for an application that is hung or otherwise in a bad state. This is not necessarily the case. The OS adds this description when the thread that owns the window receives a window message and the thread is not waiting for window messages.

This could happen, for example, when a single-threaded process is executing business logic or waiting on the completion of a synchronous I/O or a system call. That thread will not consume window messages until it returns from whatever it is doing.

Has your rate of undos (promon 5) increased ? What are the other symptoms of your strange performance issue?

Cringer · Feb 11, 2015

Users are getting intermittent freezes on certain screens when committing changes. It tends to happen in the afternoons, although not strictly limited to that. I've pasted an extract from a monitoring program I have that I kicked off when I got a report of it happening.

Unfortunately I don't know what the rate of undos is normally so I can't say if it's changed or not.

Code:

Iteration    Read Date    Read Time    BI Read    BI Write    Checkpoint    Buffer Flush    Tran Commit    DB Access    DB Read    DB Write    Records Created    Records Deleted    Records Read    Records Updated    Record Waits    Records Committed
1    11/02/15    12:20:04    43    6829    2    0    42446    76920683    524206    9060    17910    6319    30722934    35686    1    0
2    11/02/15    12:25:15    11    6296    2    0    11473    83926694    433032    9021    25623    5548    30081785    33902    0    0
3    11/02/15    12:30:28    17    8559    2    0    5651    102418667    444331    8905    35725    6349    37253832    44815    1    0
4    11/02/15    12:35:39    4    6574    1    0    5185    94768711    399434    7142    22559    6738    34924147    32504    0    0
5    11/02/15    12:40:50    49    5950    2    0    4328    68424312    258407    5745    20707    6369    26956057    25757    1    0
6    11/02/15    12:46:02    5    7083    1    0    7274    90813472    245563    6340    25761    6388    35312684    31932    6    0
7    11/02/15    12:51:15    7    5425    2    0    5102    61743179    284143    6748    18288    6432    26379733    22427    2    0
8    11/02/15    12:56:26    7    5466    1    0    4656    46298824    226568    5146    18682    6160    19003023    24447    0    0
9    11/02/15    13:01:37    3    6239    1    0    5007    40719091    306658    7515    23513    6223    16325950    29656    0    0
10    11/02/15    13:06:48    11    4575    2    0    5967    41924760    224671    6276    14807    4354    17020137    24578    5    0
11    11/02/15    13:12:00    3    4475    1    0    4547    46610572    268941    5162    14098    3159    18434522    27180    10    0
12    11/02/15    13:17:11    3    5446    1    0    6085    47087756    322420    5353    17907    6388    18509296    24307    0    0

RealHeavyDude · Feb 11, 2015

You might want to have a look at the following KB: http://knowledgebase.progress.com/articles/Article/P186859 which explains the why and a possible solution.

Heavy Regards, RealHeavyDude.

Cringer · Feb 11, 2015

Thanks RHD - I'm aware of the why of it happening. Unfortunately as this is a symptom of an intermittent performance issue rather than a code issue there's not much I can do in that regard.

Cringer · Feb 11, 2015

I've added Undos to my monitoring tool so will keep an eye on that in future.

Rob Fitzpatrick · Feb 11, 2015

You're on 11, right? Forward processing is stalled while the fdatasync() call happens in the checkpoint process. That time is shown in the sync time column in the checkpoint screen in promon. Transaction activity also stalls when there are no free BI buffers. Potentially, AI could be a bottleneck as well if there's a problem there.

How often are you checkpointing during busy periods?
What are your BI block size, BI cluster size, and -bibufs?
Are you running a BIW?
Are -aibufs and AI block size set like BI?
Is this DB a replication source? If so, how far behind is the target (or targets)?

Cringer · Feb 11, 2015

Yes running 11.2.1, 32 bit on a 32 bit server.

Currently configuring a 11.5 64 bit test bed on Win Server 2012 R2

How often are you checkpointing during busy periods?

usually once or twice - we upped BI Cluster Size because we were on default values with huge checkpoint values and buffer flushes at checkpoint.

What are your BI block size, BI cluster size, and -bibufs?

BI Block 8192
BI Cluster 32768
-bibufs looks like 32 if I'm looking at the right OEM option

Are you running a BIW?

Yup

Are -aibufs and AI block size set like BI?

AI Block 8192
-aibufs looks like 5, again assuming I'm looking at the right option.

Is this DB a replication source? If so, how far behind is the target (or targets)?

Yes it's a source.
Practically no difference between the two at the moment

Rob Fitzpatrick · Feb 11, 2015

Well, your buffers for transaction activity are very small, so potentially that can be a bottleneck. Also, -aibufs should be the same as -bibufs. It may be easier for you to use promon, ProTop, or the DB log to look at startup parameter values so you don't have to do the OEM parsing in your head.

I would set:
-bibufs 200
-aibufs 200
-biblocksize 16
-aiblocksize 16

The buffers may be more than you need but in terms of risk versus reward, it's a small amount of RAM to pay to avoid big performance penalties. I'd rather have more than I need than too little.

Are you running an AIW?
Have you had empty BI buffer waits? promon R&D 2 5

Rob Fitzpatrick · Feb 11, 2015

As you probably know, changing the block sizes has to wait until you restart the DB.

Cringer · Feb 11, 2015

Yes running an AIW.
BI Log:

Code:

                                    Total

Total BI writes                   4225686
BIW BI writes                     4039525
Records written                    334568K
Bytes written                    32471123K
Total BI Reads                    3277453
Records read                      1978362
Bytes read                         178754K
Clusters closed                       995
Busy buffer waits                    2314
Empty buffer waits                   3791
Log force waits                         0
Log force writes                        0
Partial writes                     153833
Input buffer hits                  817974
Output buffer hits                1300806
Mod buffer hits                    438397
BO buffer hits                     711029

When you say a "small" RAM hit for that, what are we talking? We're pretty much at the maximum we can squeeze out of our 32 bit server!

Cringer · Feb 11, 2015

Unfortunately I don't have the log file from the last DB start anymore. We have been doing a prolog every Friday as we had a process that was spamming the logs. We fixed that issue last week, but forgot to stop the log trimming.

Cringer · Feb 11, 2015

Cringer said:
How often are you checkpointing during busy periods?

usually once or twice - we upped BI Cluster Size because we were on default values with huge checkpoint values and buffer flushes at checkpoint.

Once or twice every five minutes is what I meant to say here.

Rob Fitzpatrick · Feb 11, 2015

Cringer said:
When you say a "small" RAM hit for that, what are we talking?

The calculation is similar to -B. Number of buffers times block size. If BI block size is 16 KB and -bibufs is 200 then the hit is 3.1 MB. If you can't afford that you have bigger issues.

Cringer said:
Unfortunately I don't have the log file from the last DB start anymore.

You can also see -bibufs and -aibufs and block sizes in promon R&D 1 9 and 1 10.

Cringer · Feb 11, 2015

Thanks Rob. Time to get discussing with my colleague!

Cringer · Feb 11, 2015

If I change AI Block size on REPL source, do I also need to do the same on the REPL Target?

Rob Fitzpatrick · Feb 11, 2015

Cringer said:
If I change AI Block size on REPL source, do I also need to do the same on the REPL Target?

To change AI block size you first need to disable replication and disable AI. Then reseed replication.

Cringer · Feb 11, 2015

Thanks Rob. bi and ai buffers set to 200 so they'll come into force when the DB needs restarting (at the next crash!). I'll do the block sizes at the next planned outage.

Question BI Read

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

Well-Known Member

ProgressTalk.com Moderator

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

ProgressTalk.com Moderator

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

ProgressTalk.com Moderator

ProgressTalk.com Sponsor

ProgressTalk.com Moderator

Similar threads