[progress Communities] [progress Openedge Abl] Forum Post: Re: Load Average Skyrocket In...

  • Thread starter: Simon L. Prinsloo
Status
Not open for further replies.

Simon L. Prinsloo

Guest
At first, the deadlock was an educated guess, as the message queue would stop processing until you killed the process and restarted it. But the code has a standard include, dating from circa v.6, that is used to lock a record if possible or tell the user what is happening when the record is locked. I changed it to publish an event when the session is in batch mode and the lock duration exceeds a threshold. Then I trap the event to log some information about the lock. This was not very useful at first, but once the customer upgraded to v. 11.6 I could enhance the code to give me more information.

Using this, we identified more than one of these conditions. As they fixed the code and the situation occurred less and less frequently, I noticed a direct correlation between the occurrence of a lock and a significant rise in the load average. Whenever the load average rises rapidly from its normal level (around 15) to 25+, data is also written to the Lock-log. As soon as I kill either process writing to the log, the load average immediately drops like a stone.

That being said, I think I just answered my own question. I suspect that the customer's code that informs the user what is happening while it tries to lock the record has a tight loop that is most likely consuming an entire processor. In the past the machine had fewer processors, which all users had to share, and this tight loop could go unnoticed, but this machine has almost 100 cores and such a "hanging" process will get access to many more cycles than usual.

If memory serves, the offending code goes something like this:

    IF LOCKED ({1}) THEN DO:
        MESSAGE "Record is locked......
        w_locktime = ETIME.
        DO WHILE NOT AVAIL ({1}):
            FIND {1} EXCLUSIVE WHERE ROWID({1}) = w_lockrowid NO-WAIT NO-ERROR.
            IF SESSION:BATCH-MODE AND ETIME - w_locktime > 20000 THEN DO:
                PUBLISH "ExtremeLockDuration" (INPUT BUFFER {1}:HANDLE, INPUT w_lockrowid).
                w_locktime = ETIME.
            END.
        END.
    END.
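For illustration, here is a minimal sketch of one way the retry loop could be throttled so that a waiting session sleeps between attempts instead of spinning. It is not the customer's actual fix. It assumes the same include context as the recalled code ({1} is the table name argument, and w_locktime and w_lockrowid are defined elsewhere in the include), leaves out the user-facing MESSAGE, and the one-second PAUSE interval is an arbitrary choice.

```abl
/* Sketch only: include fragment, not a standalone program.            */
/* Assumes {1}, w_locktime and w_lockrowid exist as in the original.   */
IF LOCKED ({1}) THEN
DO:
    w_locktime = ETIME.
    DO WHILE NOT AVAILABLE ({1}):
        FIND {1} EXCLUSIVE-LOCK
            WHERE ROWID({1}) = w_lockrowid NO-WAIT NO-ERROR.
        IF NOT AVAILABLE ({1}) THEN
        DO:
            /* Same reporting as before: publish every 20 seconds in batch mode. */
            IF SESSION:BATCH-MODE AND ETIME - w_locktime > 20000 THEN
            DO:
                PUBLISH "ExtremeLockDuration"
                    (INPUT BUFFER {1}:HANDLE, INPUT w_lockrowid).
                w_locktime = ETIME.
            END.
            /* Yield the CPU before the next attempt (interval is an assumption). */
            PAUSE 1 NO-MESSAGE.
        END.
    END.
END.
```

With the pause in place, a blocked session makes roughly one FIND attempt per second rather than as many as a core can execute, so a long-held lock should no longer show up as extra load average. In an interactive session a keystroke can end the PAUSE early, which only shortens the wait before the next attempt.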
