progressproger
New Member
Hi.
I work with the Progress 7.3D on HP-UX B.10.20 A 9000/785 and, therefore, cannot use anything like _Lock, _UserLock, etc.
A couple of weeks ago I had the test environment DB replaced by that from the production. Since there, not much changes were made to the test base. But several days ago the nightly scheduled activity started crashing with "Lock table overflow, increase -L on server (915)". On production, everything is working well. The source code is rather old (about 10 years old) and had been written before I was employed here.
The script that is crashing generally looks like
The crash happens on the very first CustomerBills being processed on one of its CustomerServiceChanges records.
Under the rather complex activity I mean procedures that are used in hundreds of places across the whole big application, and this is why the TRANLABEL transaction is so wide - it's better to undo the big TRANLABEL transaction than cleaning anything out manually from all the about 20 tables.
I don't expect more than a few dozens of records to be locked within the TRANLABEL transaction block at most, but the promon shows about 1700 simultaneous locks kept by the user I execute the script on behalf of.
The -L parameter is now 12000 (when this started a few days ago, it was 10000) which is much more than 1700, but the crash does happen.
I did not increase the -L value more this time before I figure out this strange behavior. And this is also something to take long, because I don't have permissions to do this and getting this done by admins takes long to go thru several approvals chain.
When such problem happened before in different places, increasing helped for just a few months, and then I had to increase it again and again. Before I go ahead and increase again, I would like to figure out more than I used to when increasing -L in the past, because increasing helps very temporarily, and also I don't want to give -L too large value in order not to decrease the overall performance which is great now.
I also played with the transaction scope - I removed the transaction work from the TRANLABEL's do statement and put only actual changes of be_CustomerServiceChanges and be_CustomerBills into separate transactions, but it always crashed on attempting to get an exclusive lock either on be_CustomerServiceChanges or on be_CustomerBills.
In other words, changing transaction scope did not help at all.
What is strange also - every next script execution processes one be_CustomerServiceChanges record more before the crash. in other words, if now it crashed on some particular be_CustomerServiceChanges record, next time it will process this record and will crash on the next be_CustomerServiceChanges record.
Any ideas on what should I do? I don't have a budget to rewrite what's described above as "some rather complex activity with the modification of about 20 related tables - not more than few dozens of locked records".
I am going to request the DB rebuild via creating a new empty DB and copying everything over there, but the approval will take me some while and I cannot wait.
Thanks a lot for ideas.
I work with the Progress 7.3D on HP-UX B.10.20 A 9000/785 and, therefore, cannot use anything like _Lock, _UserLock, etc.
A couple of weeks ago I had the test environment DB replaced by that from the production. Since there, not much changes were made to the test base. But several days ago the nightly scheduled activity started crashing with "Lock table overflow, increase -L on server (915)". On production, everything is working well. The source code is rather old (about 10 years old) and had been written before I was employed here.
The script that is crashing generally looks like
Code:
def buffer be_CustomerBills for CustomerBills.
def buffer be_CustomerServiceChanges for CustomerServiceChanges.
for each CustomerBills (a few dozens of thousands) no-lock:
some minor read-only activity and 1 record is created in another table
TRANLABEL:
do transaction:
some minor read-only activity
for each CustomerServiceChanges no-lock where (correspond to a current CustomerBills and are unprocessed):
find first be_CustomerServiceChanges exclusive-lock where recid(be_CustomerServiceChanges) = recid(CustomerServiceChanges) no-wait no-error.
if could not get lock on be_CustomerServiceChanges then
undo TRANLABEL, leave TRANLABEL.
some rather complex activity with the modification of about 20 related tables - not more than few dozens of locked records
end. /* for each CustomerServiceChanges */
if CustomerBills needs to be changed then
do:
find first be_CustomerBills exclusive-lock where recid(be_CustomerBills) = recid(CustomerBills) no-wait no-error.
if could not get lock on be_CustomerBills then
undo TRANLABEL, leave TRANLABEL.
else
modify 2 fields of be_CustomerBills and release be_CustomerBills
end.
end. /* TRANLABEL */
delete the record in another table created at the beginning of for each CustomerBills
end. /* for each CustomerBills */
The crash happens on the very first CustomerBills being processed on one of its CustomerServiceChanges records.
Under the rather complex activity I mean procedures that are used in hundreds of places across the whole big application, and this is why the TRANLABEL transaction is so wide - it's better to undo the big TRANLABEL transaction than cleaning anything out manually from all the about 20 tables.
I don't expect more than a few dozens of records to be locked within the TRANLABEL transaction block at most, but the promon shows about 1700 simultaneous locks kept by the user I execute the script on behalf of.
The -L parameter is now 12000 (when this started a few days ago, it was 10000) which is much more than 1700, but the crash does happen.
I did not increase the -L value more this time before I figure out this strange behavior. And this is also something to take long, because I don't have permissions to do this and getting this done by admins takes long to go thru several approvals chain.
When such problem happened before in different places, increasing helped for just a few months, and then I had to increase it again and again. Before I go ahead and increase again, I would like to figure out more than I used to when increasing -L in the past, because increasing helps very temporarily, and also I don't want to give -L too large value in order not to decrease the overall performance which is great now.
I also played with the transaction scope - I removed the transaction work from the TRANLABEL's do statement and put only actual changes of be_CustomerServiceChanges and be_CustomerBills into separate transactions, but it always crashed on attempting to get an exclusive lock either on be_CustomerServiceChanges or on be_CustomerBills.
In other words, changing transaction scope did not help at all.
What is strange also - every next script execution processes one be_CustomerServiceChanges record more before the crash. in other words, if now it crashed on some particular be_CustomerServiceChanges record, next time it will process this record and will crash on the next be_CustomerServiceChanges record.
Any ideas on what should I do? I don't have a budget to rewrite what's described above as "some rather complex activity with the modification of about 20 related tables - not more than few dozens of locked records".
I am going to request the DB rebuild via creating a new empty DB and copying everything over there, but the approval will take me some while and I cannot wait.
Thanks a lot for ideas.