Find table/record based on transaction id

oldemanw · Oct 15, 2008

Hi Folks,

Is it possible to find the table/record from transaction id only?
We use MFG/Pro together with QXtend on OpenEdge 10.1B03 on a Linux box.
Occasionally QXtend leaves an open transaction behind, causing our BI file to grow until the database is taken down for backup on Saturday night.

Using protop (Tom: great tool!) I found that qxtend is the cause of the growing BI file, but I can't find what it is doing when it happens. Usually this happens on a Sunday, when the database is idle (no users, no transactions, I can only think of user logon history).

When I run this:
rfutil /mfg/data/mfghelp -C aimage scan verbose -a /mfg/ai/db2/mfgprod.a20 | less

And find this:
Trid: 1500228144 Sun Oct 12 15:24:10 2008. (2598)
Trid: 1500228144 User Id: qextend (12531)
Trid: 1500228144 code = RL_TBGN version = 1 (12528)
Trid: 1500228144 dbkey = 0 update counter = 0 (12530)
Trid: 1500228144 code = RL_TMSAVE version = 2 (12528)
Trid: 1500228144 dbkey = 0 update counter = 0 (12530)

Can I find the table/record that this ID 1500228144 applies to?

Any hints are welcome, thanks in advance!

Willem

oldemanw · Nov 10, 2008

After this weekend, I found a few 'hanging' transactions again, see below:

14:57:47 ProTop xx -- Progress Database Monitor 11/10/08
Summary mfgprod [/mfg/data/db2/mfgprod] Rate
Hit Ratio: 3713:1 324:1 Commits: 10 13 Sessions: 204
Miss% : 0.027% 0.308% Latch Waits: 2 0 Local: 196
Hit% : 99.973% 99.692% Tot/Mod Bufs: 45002 435 Remote: 0
Log Reads: 37130 1081 Evict Bufs: 0 0 Batch: 8
OS Reads: 10 3 Lock Table: 200000 313 Server: 1
Rec Reads: 17533 399 LkHWM|OldTrx: 1926 09:06:21 Other: 7
Log/Rec: 2.12 2.71 Old/Curr BI: 6 3160 TRX: 24
Area Full: 16 100.00% After Image: 1 / 17 a1 Blocked: 0

Open Transactions
Usr Name TRX Num BI Clstr Start Trx Stat Duration Wait
----- ------------ ---------- -------- -------- -------- -------- --------------
251 qextend 1694717774 6 05:51:26 ACTIVE 09:06:21 -- 0
279 qextend 1694726861 16 16:11:30 ACTIVE 22:46:17 -- 0
294 qextend 1694726934 16 16:42:12 ACTIVE 22:15:35 -- 0
275 qextend 1694728257 17 19:48:29 ACTIVE 19:09:18 -- 0
333 qextend 1695439081 34 03:33:41 ACTIVE 11:24:06 -- 0
433 pabcnath 1700414477 3158 14:50:31 ACTIVE 00:07:16 -- 0
422 axrsafi 1700417171 3159 14:51:04 ACTIVE 00:06:43 -- 3

I'm curious if it is possible to find out to what table / record these transactions in red color apply. Any hint is welcome.
By the way: the qextend user is an MFG/Pro API. It can e.q. automatically
enter a customer as if it is entered via the keyboard.

TomBascom · Nov 10, 2008

Transactions don't necessarily map to a particular record.

You could, however, look for the record locks being held by the user with the active transaction. (Be careful if you -L is large -- scanning record locks is very CPU intensive and, if the system is very busy, may not even complete.)

tamhas · Nov 10, 2008

In the original post, it is indicated that these long lasting transactions are happening or being noticed on Sunday when there are no users ... although it would seem that there needed to be users to have transactions. So, this would seem like a perfect time to also identify the locks. Of course, it could be that there are a large number of locked records in one or more of those transactions and, in fact, the reason for the long running transactions is a deadly embrace resulting from those locks, but at the very least the system should be idle. The only problem is having to either script the query so that it happens automatically or having to work on Sunday.

TomBascom · Nov 10, 2008

Actually a careful look at the data shown shows that the lock table only has a few hundred entries in it at the time of the problem.

comatt1 · Nov 11, 2008

oldemanw said:
After this weekend, I found a few 'hanging' transactions again, see below:

14:57:47 ProTop xx -- Progress Database Monitor 11/10/08
Summary mfgprod [/mfg/data/db2/mfgprod] Rate
Hit Ratio: 3713:1 324:1 Commits: 10 13 Sessions: 204
Miss% : 0.027% 0.308% Latch Waits: 2 0 Local: 196
Hit% : 99.973% 99.692% Tot/Mod Bufs: 45002 435 Remote: 0
Log Reads: 37130 1081 Evict Bufs: 0 0 Batch: 8
OS Reads: 10 3 Lock Table: 200000 313 Server: 1
Rec Reads: 17533 399 LkHWM|OldTrx: 1926 09:06:21 Other: 7
Log/Rec: 2.12 2.71 Old/Curr BI: 6 3160 TRX: 24
Area Full: 16 100.00% After Image: 1 / 17 a1 Blocked: 0

Open Transactions
Usr Name TRX Num BI Clstr Start Trx Stat Duration Wait
----- ------------ ---------- -------- -------- -------- -------- --------------
251 qextend 1694717774 6 05:51:26 ACTIVE 09:06:21 -- 0
279 qextend 1694726861 16 16:11:30 ACTIVE 22:46:17 -- 0
294 qextend 1694726934 16 16:42:12 ACTIVE 22:15:35 -- 0
275 qextend 1694728257 17 19:48:29 ACTIVE 19:09:18 -- 0
333 qextend 1695439081 34 03:33:41 ACTIVE 11:24:06 -- 0
433 pabcnath 1700414477 3158 14:50:31 ACTIVE 00:07:16 -- 0
422 axrsafi 1700417171 3159 14:51:04 ACTIVE 00:06:43 -- 3

I'm curious if it is possible to find out to what table / record these transactions in red color apply. Any hint is welcome.
By the way: the qextend user is an MFG/Pro API. It can e.q. automatically
enter a customer as if it is entered via the keyboard.

I am not overly familiar with Protop (sorry Tom, use Promonitor). However, what stuck out to me was

Area Full: 16 100.00% After Image: 1 / 17 a1 Blocked: 0

Am I wrong in assuming that Area 16 is for AI, and is 100% full?

If so, could this be an issue with the AI archiving?

Wouldn't the BI grow if the AI can't hold any more data and/or can't keep up with the archiving of AI extents.

We have a CRON that runs every 15 minutes, but have seen spikes in the BI during periods where writes to the database exceed the level at which we archive AI.

I am currently on 10.0B05 though, so don't have the benefit of using the aiarchiver that is in 10.1.

Just a thought.

TomBascom · Nov 11, 2008

I am not overly familiar with Protop (sorry Tom, use Promonitor). However, what stuck out to me was

Area Full: 16 100.00% After Image: 1 / 17 a1 Blocked: 0

Am I wrong in assuming that Area 16 is for AI, and is 100% full?

It means that 16 storage areas have exceeded ProTop's threshold for a "full" warning (80%) and that the worst of the bunch is 100% full.

If so, could this be an issue with the AI archiving?

Wouldn't the BI grow if the AI can't hold any more data and/or can't keep up with the archiving of AI extents.

No. If ai extents fill and a new extent can't be used then the db will either stall or crash. BI is not an alternate for AI.

We have a CRON that runs every 15 minutes, but have seen spikes in the BI during periods where writes to the database exceed the level at which we archive AI.

Sorry, but this makes no sense to me.

I am currently on 10.0B05 though, so don't have the benefit of using the aiarchiver that is in 10.1.

Just a thought.

None of that has anything to do with the root cause of a long-standing transaction. Our original poster is trying to find the record ids associated with a particular transaction. I'm not entirely sure why -- I'm guessing that it might be that he is hoping to use that information to figure out what the process with the transaction is doing. The ai scan is telling that the dbkey is 0 though which argues that NO records are being modified. So there shouldn't be an ACTIVE transaction -- it should only be ALLOCATED. IOW something doesn't add up...

TomBascom · Nov 11, 2008

Re: ACTIVE vs ALLOCATED...

ProTop is showing a different TRX# than the scan -- that's why the dbkey is 0. The scan needs to xref the TRX# in order to make sense. (You need to "grep" for the TRX# in the scan output...)

oldemanw · Nov 12, 2008

Hi guys,

AI isn't troubling us. I have variable extents and rotate them once an hour. Every Saturday night the db is taken down for an offline backup and the AI's are emptied (on working days the system must be up 24 hours since we are a global company).

I can't speak freely since we are under a 'non-disclosure' contract, but...
I'm trying to find out what is happening in this hanging transaction so that I can guide our software supplier in the right direction. This is a call for them that has been open for a long time and they can't find the cause yet. This problem has caused a 'production down' twice, so I'm eager to find the cause.

There is a funny thing about these hanging sessions (that I can see on the Linux box):
The parent process of the progress session has died and the session is taken over by the init process (parent id = 1 instead of some logon script).

Having a look at the _LockReq table gives me this (user 321 is hanging):

for each _lockreq no-lock
where _lockreq-num = 321:

disp _lockreq with 1 col.

User Number: 321
User Name: qextend
Record Locks: 95
Record Waits: 0
Trans Locks: 0
Trans Waits: 0
Schema Locks: 0
Schema Waits: 0
ExclFind: 0
ShrFind: 0
Misc: ?

In the _Lock table I cannot find entries for this user.
Weird, I would expect to find 95 record locks there?

I'm trying to understand the deeper workings of Progress and vst's, but that is still fairly new to me. I'll need to go by your guidance.

Thanks for your feedback so far

-Willem.

tamhas · Nov 12, 2008

The parent process of the progress session has died and the session is taken over by the init process (parent id = 1 instead of some logon script).

Sounds like the result of a kill -9. Your culprit may not be the code, but the operator!

oldemanw · May 12, 2009

The problem is solved.
The cause was a Progress bug, OE00170971. A hangup signal could leave a transaction open, thus causing the BI to start growing.

The solution was to upgrade to OpenEdge 10.1C02. We are using it for 2 months now and I haven't seen the BI growth issue since

tamhas · May 12, 2009

This might be the first time on this board that I have seen someone resurrect an old thread for a good reason!

Find table/record based on transaction id

oldemanw

Member

oldemanw

Member

TomBascom

Curmudgeon

tamhas

ProgressTalk.com Sponsor

TomBascom

Curmudgeon

comatt1

Member

TomBascom

Curmudgeon

TomBascom

Curmudgeon

oldemanw

Member

tamhas

ProgressTalk.com Sponsor

oldemanw

Member

tamhas

ProgressTalk.com Sponsor