Active PID/db transaction - yet no db user

cdwatkins

New Member
Last week, I had a BI file hit the -bistall limit.
Investigation into "why" revealed something rather interesting.

I used the R&D menu of promon and found an "active transaction" that had been open for about four days.
When I ran "who" to look for that user's session, there was no Linux session to be found.
Using promon, I then disconnected that user from every database it was attached to (this is a QAD application).
The Active Transactions display STILL showed the same four-day-old transaction in progress.
I ran ps -ef | grep (user) and found a process still running under that userid, even though it had been disconnected from the databases and did not appear in "who". The process showed as PID 2nnnn with PPID 1, which concerned me. A plain kill of 2nnnn had NO effect, and I didn't want to use kill -9 for fear of crashing the db and having 150 irate users.

At a quiet time, I was able to stop all the databases (eventually), kill the offending PID, truncate my extremely large BI, and restart the processes.

Has anybody had a similar experience and was there a better way for me to resolve my issue?

I appreciate your feedback/thoughts.
 
Last week, I had a BI file hit the -bistall limit.
Investigation into "why" revealed something rather interesting.

I used R&D function of promon and found an "active transaction" that had been there for about four days.
When I performed a "who" to look at that user/session - there wasn't any Linux session to be found.
Using promon, I then Disconnected that user from all the databases in which it was attached (this is a QAD application).
Was the database still stalled when you did this? Or had you previously increased the -bithold so new BI notes could be written? Also, what is your Progress version?

Active Transaction display - STILL showed the same four-day old transaction - in process.

How much time elapsed between requesting the disconnect of the user and checking the active transactions? If this was a long-running transaction with many changes then it would require a long undo phase, writing more BI notes.

I performed ps -ef | grep (user) and found a process running for that userid - which had been disconnected from databases and was not showing in my "who" command.

When you "disconnect" a user from the DB it doesn't necessarily disconnect immediately (or at all). A flag is set that the disconnect is requested, but any active transaction must still be undone and any resources held by that user must be released. What did you see in the database log at this time?
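As a rough aid for that log check, the sketch below pulls out the database log lines written by a single process id, which can be easier than searching for a userid. It assumes the log lines tag the writing process as "P-<pid>" followed by whitespace, which is how 10.x logs are usually laid out; verify against your own .lg file. The function name log_lines_for_pid and the example path are made up for illustration.

```shell
#!/bin/sh
# Print the database log lines written by one process id.
# ASSUMPTION: each log line tags its writer as "P-<pid>" followed by
# whitespace. Check a few lines of your own log before relying on this.
log_lines_for_pid() {
    logfile="$1"   # path to the database log (site-specific)
    pid="$2"       # operating system process id of the client
    # The trailing [[:space:]] stops pid 234 from also matching 2345.
    grep -E "P-${pid}[[:space:]]" "$logfile"
}
```

Example call, with a hypothetical path: log_lines_for_pid /db/prod/mfgprod.lg <pid>. No output means nothing in the log carries that pid under the assumed format.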

The PID displayed as 2nnnn 1 ... which concerned me.

Why? Because the PPID was 1? This doesn't necessarily indicate a process that is in a bad state. It means its parent has exited, but that doesn't affect the child. I currently have lots of healthy client processes on my dev box with PPID of 1.
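To see how common PPID 1 is on a given box, a minimal sketch like the following lists one user's processes whose parent is PID 1, along with how long each has been running. The function names and the example user name are hypothetical. The filter reads ps output on stdin so it can be tried on saved output too. Note that ps may truncate long user names in its USER column, in which case the match on user will miss.

```shell
#!/bin/sh
# Keep only rows where PPID is 1 and USER matches; print PID and elapsed time.
# Expects columns in the order: pid ppid user etime args (with a header row).
filter_reparented() {
    # $1 = user name to match (hypothetical example: rfuser)
    awk -v u="$1" 'NR > 1 && $2 == 1 && $3 == u { print $1, $4 }'
}

# Run ps with the column order the filter expects.
list_reparented() {
    ps -eo pid,ppid,user,etime,args | filter_reparented "$1"
}
```

Running list_reparented rfuser prints one "PID ELAPSED" pair per matching process. An elapsed time of several days would line up with a transaction that has been open just as long.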

Attempting to kill 2nnnn provided NO results and I didn't want to kill -9 for fear of crashing the db and having 150 irate users.

Which log(s) did you check to confirm there were no results? Also, good for you that you didn't use kill -9.

At a quiet time, I was able to stop all the databases (eventually), kill the offending PID, truncate my extremely large BI, and re-start processes.

Has anybody had a similar experience and was there a better way for me to resolve my issue?

I have experienced BI growth due to long-running transactions. It is important to isolate what the client is or was doing that may have caused this. Getting the client's call stack before attempting to kill it is a good first step. If you're on 10.1C or later you can use kill -SIGUSR1 <pid> to produce a file protrace.<pid>. Failing that, talk to the user if possible and ask them to retrace their steps. Pass that info to the developers so they can look for transaction scoping issues. It's also important to check the client log, DB log, and promon or ProTop for relevant info (locks held, etc.).
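The protrace step can be wrapped in a small guard so the signal only goes to a process that actually exists. This is a sketch only: request_protrace is a made-up name, and it assumes a Linux /proc filesystem for the working-directory lookup. kill -USR1 is the same signal as kill -SIGUSR1.

```shell
#!/bin/sh
# Ask a client process for a stack trace by sending it SIGUSR1.
request_protrace() {
    pid="$1"
    # kill -0 sends no signal; it only checks that the pid exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal process $pid" >&2
        return 1
    fi
    # Linux only: show the process's working directory, which is where
    # the protrace.<pid> file is normally expected to appear.
    if [ -e "/proc/$pid/cwd" ]; then
        echo "cwd: $(readlink "/proc/$pid/cwd")"
    fi
    kill -USR1 "$pid"
}
```

Call it as request_protrace <pid> from an account allowed to signal the client (the owning user or root), then look for protrace.<pid> in the directory it prints.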
 
Was the database still stalled when you did this? Or had you previously increased the -bithold so new BI notes could be written? Also, what is your Progress version?
I had increased the -bithold first, to make my phone stop ringing. Progress version is 10.2B.



How much time elapsed between requesting the disconnect of the user and checking the active transactions? If this was a long-running transaction with many changes then it would require a long undo phase, writing more BI notes.
The active transaction still showed as active for a couple of days. The stall happened on a Wednesday, and I nursed the whole thing along until early Sunday morning, which is my maintenance window.



When you "disconnect" a user from the DB it doesn't necessarily disconnect immediately (or at all). A flag is set that the disconnect is requested, but any active transaction must still be undone and any resources held by that user must be released. What did you see in the database log at this time?
I don't see anything in the db log that references the userID that I disconnected (not that it isn't in there, I just don't see it).



Why? Because the PPID was 1? This doesn't necessarily indicate a process that is in a bad state. It means its parent has exited, but that doesn't affect the child. I currently have lots of healthy client processes on my dev box with PPID of 1.
None of my other sessions show a PPID of 1 (this one was actually an RFExpress session), so the whole deal was not "normal".


Which log(s) did you check to confirm there were no results? Also, good for you that you didn't use kill -9.
As to the "no results": the PID wouldn't go away, and the active transaction still showed in promon. I was eventually able to kill it after I'd gotten the database to shut down.

Rob - I do appreciate your thoughts on this issue.
 
I would guess that the process didn't go away because it was still in the process of rolling back the transaction. This has to happen before it disconnects from the database. Note that for a client that has been asked to disconnect, proshut and promon screen 1, 1 will show different information. The client will not appear in the former but it will be in the latter.
 