User lock in 10.2A can not be cleared

GHToon

New Member
Hi Everyone,

This has been driving me nuts for 3 months.

We have a case management system that runs upon Progress database. Back at the end of April the software vendor upgraded the OpenEdge database engine from 10.1b to 10.2A and then the fun began.
My first re-action was to roll back to 10.1B but was told by the software vendor this was not possible.

The issue.
Users and ATM's (automatic process like reports) had random hangs. We found that Proshut clear the user from the Proshut list but was showing as locked in the Promon.

We found that if a user hanged it would kill the Client Broker and any user attached to that Broker.
The CPU % assigned to that _mprosrv.exe would go nuts.
Nothing in the logs, nothing on the screen apart from a white window where the application was suppose to be.

The only way to release the user/s is to shutdown the entire database and then restart.

Sometimes this problem would not occur for several days, maybe even a week. But in the past week it's been every day and becoming more more frequent.

Just spent 25hrs rebuilding the index because some corruption's where found. The rebuild of the index was successful and according to the log no corruption's.

But within 8hrs I have the same problem.

I have found the same problem reported on this forum but there didn't seem to be a solution.

I have loss faith with the Software Vendor as they don't know what the problem is either.
I know there next suggestion with be to downgrade to 10.1C 04 but am not convinced that this will make any difference.

Sorry for the rant on my first post, but please is there anyone out there who can help..

Regards

Gregg
Did anyone find a solution?

We are running 10.2A 03 on Windows 2003 Enterprise (32-bit) with 4 x Xenon cpu's, 16Gb RAM, loads of free disk space.


So
Watchdog's are enabled
 
Starting with 10.1C you can send a SIGUSR1 signal to a database process on UNIX and it will produce a protrace file with the ABL code that was executing at the time. I have no idea how you could do that with Windows, but maybe the old sysinternals utilities have something that would work.

Have you enabled Client Caching ? promon R&D 1 18 ? That could show you what code was executing on the hung client.

I would also consider running the progather script (download from progress kb) when the hang occurs.
 
I'm with CJ -- the most important first step is to clearly define and understand what this "hang" situation really is. The correct course of action depends greatly on that. Without knowing that you are just flailing about -- someone is going to get hurt if you keep that up.

My personal guess is that the vendor released some code along with this upgrade that is executing a query without proper index support. Possibly while a transaction is active. This would explain very long run times and poor disconnect behaviors. But it is only a guess -- the problem could be something else.

It would also be a good idea to make sure that you have applied all of the service packs to 10.2A.
 
General the client hangs when running a report or SQL query. Promon shows locks but there is no read/write occurring.
Just downgraded from 10.2A SP03 to 10.1C SP04

These reports and SQL queries are quite long and involve going through all records looking for matches. The bizarre thing is that the same report can hang, so restart and re-run the same report and it works.
So it's not the criteria of the report / SQL query.

Cheers

Gregg


If or should say when it happens again I will have a look at the Promon R&D 1,18.
 
When you say that PROMON "shows locks" what do you mean and why do you think that "locks" are particularly relevant to this problem?

Why do you think that there is no read/write occuring? What are you looking at that tells you this?

Now that we know that SQL is involved, the standard SQL related advice can be offered - run dbtool to fix width problems (even if you don't think you have any) and UPDATE STATISTICS to help the optimizer with query performance.

It might also be a good idea to post some information about you startup parameters and configuration. For instance, is all of your data in type 2 areas?
 
Well it didn't take long to crash.
The reason I know about no read/writes is that Windows Performance monitor shows no activity on the database drive.

Here is some data from Promon I could scrap off the screen:

♀08/10/10 Status: Startup Parameters
06:05:34

Maximum clients: 301
Maximum servers: 100
Maximum clients per server: 0
Lock table size: 150016 entries
Database buffers: 400000 (1600000 kb)
Excess shared memory: 133 kb
APW queue check time: 100 milliseconds
APW scan time: 1 seconds
APW buffers to scan: 666
APW max writes / scan: 25
Spinlock tries before timeout: 80000
Before-image buffers: 64 (256 kb)
After-image buffers: 64 (256 kb)
Max number of JTA transactions: 100


Under Clients/Processes

08/10/10 Status: All Processes 06:09:34

Usr Name Type Wait Trans id Login time

0 SYSTEM BROK -- 0 0 08/10/10 01:32
1 SYSTEM SERV -- 0 0 08/10/10 01:45
2 SYSTEM SERV -- 0 0 08/10/10 01:45
3 SYSTEM SERV -- 0 0 08/10/10 01:45
4 SYSTEM SERV -- 0 0 08/10/10 01:45
101 BIW -- 0 0 08/10/10 01:32
102 WDOG -- 0 0 08/10/10 01:32
103 APW -- 0 0 08/10/10 01:32
104 APW -- 0 0 08/10/10 01:32
105 APW -- 0 0 08/10/10 01:32
106 administ MON -- 0 0 08/10/10 06:00
397 REMC/ABL -- 21642880 0 08/10/10 01:45

Blocked Clients, Active Transactions, Local Interactive Clients, Local Batch Clients options return blanks




Locked Table:

♀08/10/10 Status: Lock Table
06:04:03

Usr Name Trans id Type Table RECID Flags Trans State


397 0 REC 2 3586 S Dead

397 1527985252 REC 31 15034039 S Dead


♀08/10/10 Status: Remote Clients
06:10:23

Usr Name Type Wait Trans id Login time Se
rv IPV# Remote Address

397 REMC/ABL -- 21642880 0 08/10/10 01:45
4 IPV4 10.11.4.39


♀08/10/10 Status: Background Processes
06:11:45

Usr Name Type Start time Pid

0 SYSTEM BROK 08/10/10 01:32 6640
1 SYSTEM SERV 08/10/10 01:45 4152
2 SYSTEM SERV 08/10/10 01:45 4228
3 SYSTEM SERV 08/10/10 01:45 752
4 SYSTEM SERV 08/10/10 01:45 4572
101 BIW 08/10/10 01:32 4480
102 WDOG 08/10/10 01:32 7124
103 APW 08/10/10 01:32 4216
104 APW 08/10/10 01:32 140
105 APW 08/10/10 01:32 7920
106 administ MON 08/10/10 06:00 8112
 
Using Windows perfmon won't show you reads if the reads are occurring in memory. I would suggest promon option 3 -> 1 or option 5 as a start. Then move on to configuring tablestats and indexstats so you can tell which tables and what users are performing the reads.

After updating the stats and begin collecting tablestats, I would recommend looking at the explain table for the sql queries begin executed to see if they are doing what you expect.
 
You might want to consider hiring an experienced database consultant to help you out with this...

To elaborate on what CJ said -- disk IO is only part of the picture. The query could be (and likely is) executing some convoluted process entirely (or mostly) in memory. You need to dig much more deeply into PROMON to see what is really happening. Perfmon isn't going to tell you much of anything useful about this.

Making sure that the db starts up with a -tablerangesize and -indexrangesize adequate to capture all table and index activity stats is also critical. The default range of 50 is almost certainly not nearly wide enough to be useful.

The "DEAD" that you see in the lock table is telling us that the transaction associated with the record lock is no longer active. There are several possible reasons for it showing up there -- one is poor transaction and record locking scope in the code.

You did not answer the question regarding startup parameters and configuration very completely. It would be very helpful if you were to show the database structure file (dbname.st). This will tell us whether or not type 2 storage areas are defined (it won't tell us much about if they are any good or not, but it will at least show if the attempt is being made).

You can get detailed and reliable startup parameter information from the log file (dbname.lg). The startup parameters are the first thing written to the log when the db starts. Open the .lg file with wordpad, search for the date when you last restarted the db and then copy the page or so of startup messages (the interesting stuff stops at about the same time you see BIW, APW & WDOG starting).

As an aside -- you are missing an AIW which means that you are not running after-imaging and you are therefore playing with fire in terms of the recoverability of this database. If you backup once per night you could, potentially, lose an entire day of work (unrecoverable crashes always occur when you have the most risk -- IOW just before you would have done a backup...). More if the backups turn out to be no good. Is the business aware that this risk exists?

Client side parameters are also of interest. Usually there is a .pf file associated with client startup.

The PROMON "summary" screen would be interesting to see. You get to it by selecting R&D -> 2 -> 1. Then use "s" to sample the data. Get a sample when things are going well and another when things are going poorly -- this may reveal an interesting difference that points the way to your real problem.

While you are within the PROMON R&D menus you might also want to take a look at R&D -> 1 -> 4 -> 2 (Blocked Clients) while the problem is occurring. If there is something interesting there it would be worth knowing about.

You said "SQL Queries". Yet in the screen scrapings above there are no SQL clients -- only ABL. Are you saying SQL out of habit (perhaps due to experience with some other database) or because there is an actual ODBC/JDBC/SQL client somewhere in this picture? If you are actually using SQL I see no evidence of it.

None the less, my advice regarding SQL stands. Run dbtool to correct field width problems (even if you don't think there are any). And run UPDATE STATISTICS so that the optimizer can do its job.
 
Ok, at the moment the server / client has not hung again, since I reverted back to an older version of the application software.

Am alarmed about the AIW, but we are totally reliant upon the software vendor as they created this database and software. It's not bespoke, but configurable.

The SQL I referrer to is the name of the module within the software we are using, whether it is using actual SQL is open to debate. So I leave that to one side for time being.

I will come back with a sample of good and bad times. Virtually no activity at the moment (8pm)

Below is database structure and database startup parameters our Proclaim database. At the very end is the client pf file.

Regards

Gregg

dbname.st

#
b E:\Proclaim\v3db\proclaimdb\proclaim.b1
#
d "Schema Area":6,32;1 E:\Proclaim\v3db\proclaimdb\proclaim.d1
#
d "General Data":7,128;8 E:\Proclaim\v3db\proclaimdb\proclaim_7.d1 f 100032
d "General Data":7,128;8 E:\Proclaim\v3db\proclaimdb\proclaim_7.d2 f 100032
d "General Data":7,128;8 E:\Proclaim\v3db\proclaimdb\proclaim_7.d3 f 100032
d "General Data":7,128;8 E:\Proclaim\v3db\proclaimdb\proclaim_7.d4
#
d "General Index":8,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_8.d1 f 50048
d "General Index":8,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_8.d2 f 50048
d "General Index":8,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_8.d3
#
d "Session Data":9,16;64 E:\Proclaim\v3db\proclaimdb\proclaim_9.d1 f 50048
d "Session Data":9,16;64 E:\Proclaim\v3db\proclaimdb\proclaim_9.d2
#
d "Session Index":10,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_10.d1 f 10048
d "Session Index":10,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_10.d2
#
d "High Activity Data":11,32;512 E:\Proclaim\v3db\proclaimdb\proclaim_11.d1 f 100032
d "High Activity Data":11,32;512 E:\Proclaim\v3db\proclaimdb\proclaim_11.d2 f 100032
d "High Activity Data":11,32;512 E:\Proclaim\v3db\proclaimdb\proclaim_11.d3
#
d "High Activity Index":12,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_12.d1 f 50048
d "High Activity Index":12,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_12.d2 f 50048
d "High Activity Index":12,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_12.d3
#
d "Field Dfn Data":13,64;64 E:\Proclaim\v3db\proclaimdb\proclaim_13.d1 f 10048
d "Field Dfn Data":13,64;64 E:\Proclaim\v3db\proclaimdb\proclaim_13.d2
#
d "Field Dfn Index":14,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_14.d1 f 10048
d "Field Dfn Index":14,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_14.d2
#
d "Names Data":15,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_15.d1 f 100032
d "Names Data":15,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_15.d2 f 100032
d "Names Data":15,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_15.d3 f 100032
d "Names Data":15,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_15.d4
#
d "Names Index":16,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_16.d1 f 50048
d "Names Index":16,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_16.d2 f 50048
d "Names Index":16,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_16.d3 f 50048
d "Names Index":16,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_16.d4
#
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d1 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d2 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d3 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d4 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d5 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d6 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d7 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d8 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d9 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d10 f 1000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d11 f 1912832
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d12 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d13 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d14 f 1925120
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d15 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d16 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d17 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d18 f 2000000
d "History Data":17,16;512 E:\Proclaim\v3db\proclaimdb\proclaim_17.d19
#
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d1 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d2 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d3 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d4 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d5 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d6 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d7 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d8 f 500032
d "History Index":18,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_18.d9
#
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d1 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d2 f 1927168
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d3 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d4 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d5 f 1968128
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d6 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d7 f 1923072
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d8 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d9 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d10 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d11 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d12 f 2000000
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d13 f 1908736
d "Audit Data":19,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_19.d14
#
d "Audit Index":20,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_20.d1 f 1000000
d "Audit Index":20,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_20.d2 f 1789248
d "Audit Index":20,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_20.d3 f 1917760
d "Audit Index":20,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_20.d4 f 2000000
d "Audit Index":20,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_20.d5
#
d "Case Data":21,128;512 E:\Proclaim\v3db\proclaimdb\proclaim_21.d1 f 500032
d "Case Data":21,128;512 E:\Proclaim\v3db\proclaimdb\proclaim_21.d2 f 500032
d "Case Data":21,128;512 E:\Proclaim\v3db\proclaimdb\proclaim_21.d3 f 500032
d "Case Data":21,128;512 E:\Proclaim\v3db\proclaimdb\proclaim_21.d4
#
d "Case Index":22,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_22.d1 f 100032
d "Case Index":22,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_22.d2 f 100032
d "Case Index":22,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_22.d3 f 100032
d "Case Index":22,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_22.d4
#
d "XRef Data":23,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_23.d1 f 500032
d "XRef Data":23,64;512 E:\Proclaim\v3db\proclaimdb\proclaim_23.d2
#
d "XRef Index":24,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_24.d1 f 100032
d "XRef Index":24,1;8 E:\Proclaim\v3db\proclaimdb\proclaim_24.d2
#
d "BLOB Data":25,32;1 E:\Proclaim\v3db\proclaimdb\proclaim_25.d1 f 100032
d "BLOB Data":25,32;1 E:\Proclaim\v3db\proclaimdb\proclaim_25.d2


startup parameters from the lg file

[2010/08/06@13:12:26.179+0100] P-7240 T-7692 I BROKER 0: (333) Multi-user session begin.
[2010/08/06@13:12:26.195+0100] P-7240 T-7692 I BROKER 0: (5326) Begin Physical Redo Phase at 23552 .
[2010/08/06@13:12:26.570+0100] P-7240 T-7692 I BROKER 0: (7161) Physical Redo Phase Completed at blk 24744 off 4869 upd 590.
[2010/08/06@13:12:26.570+0100] P-7240 T-7692 I BROKER 0: (13547) At end of Physical redo, transaction table size is 512.
[2010/08/06@13:12:26.711+0100] P-7240 T-7692 I BROKER 0: (452) Login by Administrator on CON:.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (5644) Started for 11001 using TCP IPV4 address 0.0.0.0, pid 7240.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (8836) Connecting to Admin Server on port 7841.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (14262) Successfully connected to AdminServer on port 7841 using TCP/IP IPV4 address 10.11.1.21.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (8846) Registered with Admin Server.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4234) Progress OpenEdge Release 10.2A build 1390 SP03 on WINNT .
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4281) Server started by Administrator on CON:.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (6574) Started using pid: 7240.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4235) Physical Database Name (-db): E:\Proclaim\v3db\proclaimdb\proclaim.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4236) Database Type (-dt): PROGRESS.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4237) Force Access (-F): Not Enabled.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (4238) Direct I/O (-directio): Not Enabled.
[2010/08/06@13:12:26.726+0100] P-7240 T-7692 I BROKER 0: (-----) LRU mechanism enabled.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4239) Number of Database Buffers (-B): 400000.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (9422) Maximum private buffers per user (-Bpmax): 64.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4240) Excess Shared Memory Size (-Mxs): 133.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (10014) The shared memory segment is not locked in memory.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4241) Current Size of Lock Table (-L): 150016.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (13953) Maximum Area Number (-maxArea): 32000.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4242) Hash Table Entries (-hash): 137743.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4243) Current Spin Lock Tries (-spin): 80000.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (6526) Number of Semaphore Sets (-semsets): 3.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (13924) Maximum Shared Memory Segment Size (-shmsegsize) 2048 Mb.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4244) Crash Recovery (-i): Enabled.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (6573) Database Blocksize (-blocksize): 4096.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4245) Delay of Before-Image Flush (-Mf): 32000.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4247) Before-Image File I/O (-r -R): Reliable.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4249) Before-Image Truncate Interval (-G): 0.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4250) Before-Image Cluster Size: 16777216.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4251) Before-Image Block Size: 16384.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4252) Number of Before-Image Buffers (-bibufs): 64.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (-----) Record free chain search depth factor 5 (-recspacesearchdepth)
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (9238) BI File Threshold size (-bithold): 0.0 Bytes.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (6552) BI File Threshold Stall (-bistall): Disabled.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4254) After-Image Stall (-aistall): Not Enabled.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4255) After-Image Block Size: 8192.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4256) Number of After-Image Buffers (-aibufs): 64.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (8527) Storage object cache size (-omsize): 1024
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4257) Maximum Number of Clients Per Server (-Ma): 5.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4258) Maximum Number of Servers (-Mn): 101.
[2010/08/06@13:12:26.742+0100] P-7240 T-7692 I BROKER 0: (4259) Minimum Clients Per Server (-Mi): 1.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4260) Maximum Number of Users (-n): 301.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4261) Host Name (-H): GHNSVR-17.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4262) Service Name (-S): 11001.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (14268) TCP/IP Version (-ipver) : IPV4
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4263) Network Type (-N): TCP.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4264) Character Set (-cpinternal): ISO8859-1.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (4282) Parameter File: Not Enabled.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (5647) Maximum Servers Per Broker (-Mpb): 100.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (5648) Minimum Port for Auto Servers (-minport): 11500.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (5649) Maximum Port for Auto Servers (-maxport): 11900.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (8865) This broker supports both 4GL and SQL server groups.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (9426) Large database file access has been enabled.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (9336) Created shared memory with segment_id: 12713984
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12813) Allowed index cursors (-c): 1204.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12814) Group delay (-groupdelay): 10.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12815) Lock table hash table size (-lkhash): 25621
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12816) Maxport (-maxport): 11900
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12817) Minport (-minport): 11500
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12818) Message Buffer Size (-Mm): 4096
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12820) Maximum Servers per Broker (-Mpb): 100
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12821) Use muxlatches (-mux): 1
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (12823) Semaphore Sets (-semsets): 3
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (13870) Database Service Manager - IPC Queue Size (-pica) : 64.0 KBytes.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (13896) TXE Commit lock skip limit (-TXESkipLimit): 10000.
[2010/08/06@13:12:26.757+0100] P-7240 T-7692 I BROKER 0: (10471) Database connections have been enabled.

Client pf file, it launches multiple databases but it Proclaim that we have the issue with:

# database.pf - startup parameters for smartpath development session

-mmax 65000

-nb 255
-l 768

-Bt 4096
-Mm 4096
-s 2048

-db proclaim
-ld proclaim
-H GHNSVR-17
-S 11001
-N tcp

-db dsp
-ld sp
-H GHNSVR-17
-S 12001
-N tcp

-db proacc
-ld proacc
-H GHNSVR-17
-S 13001
-N tcp

-db dyndb
-ld dynschema
-H GHNSVR-17
-S 15001
-N tcp

-d dmy
-T c:\temp
# -mmax 16768
-Bt 8192
 
1) You do not have to be dependent on them. There are a great many things that you can do for yourself. Implementing after-imaging, general database administration best practices and performance improvement are all things that you can do yourself (or hire someone to do for you) with, or without their help and/or approval.

2) Good. You are indeed using type 2 storage areas to at least some degree. How effective it is can't be determined but at least they are trying. The XREF areas sound interesting -- intuitively that sounds like a type of table that could pose some potential query challenges.

3) Nothing is obviously "wrong" with the startup parameters. The provided data does, however, indicate that you are running 10.2A and you said earlier that you had gone to 10.1C04 so that's a bit confusing.

4) While there is nothing "wrong" with the startup some stuff might be worth changing:
a) -spin 80,000 is probably too high. Try 5,000.
b) the clients ought to have -q
c) you might want to consider going to -Mm 8192 (must be changed everywhere)

5) What is the E: drive? Is it plain old disk, SAN? NAS? If it is something other than a plain old disk is there RAID involved? I'm assuming that you are using NTFS.

6) What about C:? Especially on the clients. Is it an ordinary internal disk? Or something else?

7) Are there DBI* files in c:\temp that are larger than 32k? (On the clients.) If there are then you may want to consider adding "-tmpbsize 1" to the client .pf file. Are there other large files in C:\temp?

8) Is there a reason why you aren't running Server 2008 and a 64 bit version of Progress?

9) There are multiple databases involved. Are they all on this same server? Are the startup parameters the same for all of them? Are they all sharing the same disks? I wouldn't rule them out so casually -- sometimes the root cause turns out to be something silly in a seemingly insignificant "side database".

Having said all of that it still comes down to observing the system as it runs and getting better details about what the problem really is.
 
3. - The startup was taken from Friday morning startup whilst on 10.2A 03. The lg file is in use at the moment that has the 10.1C 04 settings, but I understand the parameters have not changed.

4. What does -spin -q and -mm mean?
5. e: drive is Raid 5 (4 SAS disk in the server) and yes it's NTFS
6. Most clients run on Terminal Servers so the c: drive is a regular Raid 0 SATA disk. Those on PC it's regular internal drive.
7. The DBI files in c:\temp max at 32k The types of files are Lbi* (these can be over 5Mb), srt*
8. The server was put in place before we had a licence for Windows 2008. We are in the process of deciding whether to with a VM or physical server for the next server. Pretty sure whichever we should it will be Windows 2008 / 64-bit. Apart from address more RAM does 64-bit offer any performance boost?
9. The database all sit on the same RAID 5 (e: drive).
 
3. - The startup was taken from Friday morning startup whilst on 10.2A 03. The lg file is in use at the moment that has the 10.1C 04 settings, but I understand the parameters have not changed.

I guess I will have to take your word for it ;)

4. What does -spin -q and -mm mean?

Those are the names of startup parameters. You're probably using Exploder to set them in the properties file so they have some horribly confusing name and you get to use a GUI that makes it impossible to tell what their values actually are. Sorry about that.

Case is significant. -Mm is not the same as -mm.

5. e: drive is Raid 5 (4 SAS disk in the server)

You do realize that it would be very close to impossible to find a worse disk subsystem for a database? It's sort of "ok" if you never have much write activity and you don't care how long it takes to do things like restore backups, rebuild indexes or dump & reload the db. But it will never give you "great" performance.

and yes it's NTFS

Well at least something is ok.

6. Most clients run on Terminal Servers so the c: drive is a regular Raid 0 SATA disk. Those on PC it's regular internal drive.

Ok. So long as the drive isn't overloaded (how many IO ops/sec do you see during peak activity?)

7. The DBI files in c:\temp max at 32k

That's good. It means that your temp-tables are fitting within memory.

The types of files are Lbi* (these can be over 5Mb), srt*

There is nothing you can do about LBI. Very large SRT files might indicate some index selection issues that might need code changes.

8. The server was put in place before we had a licence for Windows 2008. We are in the process of deciding whether to with a VM or physical server for the next server. Pretty sure whichever we should it will be Windows 2008 / 64-bit. Apart from address more RAM does 64-bit offer any performance boost?

Sure. 64 bit native stuff is faster. Lots of memory is a very good thing too.

I'd be very careful about virtualizing. It might be ok, but you've already got problems and adding a VM layer isn't going to improve performance. It may not significantly harm it (although it might) but it certainly won't make it better.

9. The database all sit on the same RAID 5 (e: drive).

Bad.

And do they all have the same startup parameters?
 
BTW, startup parameter definitions and default values are documented in the online help. Find the "?" icon in your Progress commands off the start menu and then ask the index about them.
 
Back
Top