Startup of AI Replicated Database fails with Memory Violation

ellioas

(Low priority - came across this issue while performing the annual AI tests)

System:
Progress OpenEdge 10.1C
Red Hat Linux
AI Replication to standby DBs
9 of the 10 AI-replicated DBs started up without any issues.

I came across an unusual problem with one replicated database: it would not start up after having AI files applied to it. The production and standby systems are nearly identical in every way (DBs, O/S, etc.). Every now and then I also see errors when performing a procopy from the same database.

All of our other databases started up without any problems. The DB in question was third in the startup list. After the failed startup I rebooted the server and tried starting only that one DB, with the same result. I DID eventually get the database to start by restoring the last backup before my 'rfutil aimage start' and applying each AI file up until the last one (the one with the error). The DB started up fine after that.
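The restore-and-roll-forward sequence described above can be sketched as a small script that only builds and prints the commands, without running them. This is a rough sketch under assumptions: the backup path and AI file names are hypothetical, `recovery_commands` is an illustrative helper rather than a Progress tool, and the `prorest` and `rfutil ... -C roll forward -a` syntax should be checked against the 10.1C documentation before use.

```python
def recovery_commands(db, backup, ai_files):
    """Build the restore command followed by one roll-forward command per AI file.

    Nothing is executed here; the caller decides what to do with the list.
    """
    commands = [["prorest", db, backup]]
    for ai_file in ai_files:
        # AI files must be applied in the order they were written.
        commands.append(["rfutil", db, "-C", "roll", "forward", "-a", ai_file])
    return commands


if __name__ == "__main__":
    # Hypothetical backup and AI file names, for illustration only.
    plan = recovery_commands(
        "/db/dr/iscorp",
        "/backup/iscorp.bak",
        ["/ai/iscorp.a1", "/ai/iscorp.a2"],
    )
    for command in plan:
        print(" ".join(command))
```

Printing the plan first makes it easy to review the order of the AI files before anything touches the standby database.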

As far as I know, there WAS an outstanding transaction at the time (it was incomplete and had been active for 2-3 hours), and it held a few locks on other tables. The lock table size (-L) is 10x larger on our production server than on the standby, as is the buffer pool size (-B).

My questions are as follows:

- Is it possible to apply AI files and then "roll back" to the last transaction?
- Is it likely that the outstanding transaction caused the problem?
- Is there a faster or more efficient way to resolve this issue than the way I did it?
- Is there a Progress tool for protrace analysis?

I've included the DB logs from both startup attempts, as well as the protrace output:
------------------------------------------------------------------------
logs (tried twice):


Wed Apr 18 09:42:24 2012
[2012/04/18@09:42:24.829-0400] P-17467 T-0 I BROKER 0: (333) Multi-user session begin.
[2012/04/18@09:42:24.844-0400] P-17467 T-0 I BROKER 0: (5326) Begin Physical Redo Phase at 74240 .
[2012/04/18@09:42:24.927-0400] P-17467 T-0 I BROKER 0: (7161) Physical Redo Phase Completed at blk 74954 off 4610 upd 2977.
[2012/04/18@09:42:24.927-0400] P-17467 T-0 I BROKER 0: (13547) At end of Physical redo, transaction table size is -18014.
[2012/04/18@09:42:24.927-0400] P-17467 T-0 I BROKER 0: (7163) Begin Physical Undo 1 transactions at block 74954 offset 4646
[2012/04/18@09:42:25.141-0400] P-17467 T-0 I BROKER 0: (5331) Physical Undo Phase Completed at 74939 .
[2012/04/18@09:42:25.141-0400] P-17467 T-0 I BROKER 0: (7162) Begin Logical Undo Phase, 1 incomplete transactions are being backed out.
[2012/04/18@09:42:25.141-0400] P-17467 T-0 I BROKER 0: (11231) Logical Undo Phase begin at Block 74939 Offset 552.
[2012/04/18@09:42:25.248-0400] P-17467 T-0 I BROKER 0: (49) SYSTEM ERROR: Memory violation.
[2012/04/18@09:42:25.255-0400] P-17467 T-0 I BROKER 0: (5292) SYSTEM ERROR: The broker is exiting unexpectedly, beginning Abnormal Shutdown.
[2012/04/18@09:42:25.255-0400] P-17467 T-0 I BROKER 0: (-----) drexit: Initiating Abnormal Shutdown
[2012/04/18@09:42:25.266-0400] P-17467 T-0 I BROKER 0: (439) ** Save file named core for analysis by Progress Software Corporation.


Wed Apr 18 09:50:15 2012
[2012/04/18@09:50:15.343-0400] P-5249 T-0 I BROKER 0: (333) Multi-user session begin.
[2012/04/18@09:50:15.350-0400] P-5249 T-0 I BROKER 0: (5326) Begin Physical Redo Phase at 74240 .
[2012/04/18@09:50:15.413-0400] P-5249 T-0 I BROKER 0: (7161) Physical Redo Phase Completed at blk 74968 off 2868 upd 4123.
[2012/04/18@09:50:15.413-0400] P-5249 T-0 I BROKER 0: (13547) At end of Physical redo, transaction table size is -18014.
[2012/04/18@09:50:15.414-0400] P-5249 T-0 I BROKER 0: (7163) Begin Physical Undo 1 transactions at block 74954 offset 4646
[2012/04/18@09:50:15.415-0400] P-5249 T-0 I BROKER 0: (5331) Physical Undo Phase Completed at 74939 .
[2012/04/18@09:50:15.415-0400] P-5249 T-0 I BROKER 0: (7162) Begin Logical Undo Phase, 1 incomplete transactions are being backed out.
[2012/04/18@09:50:15.415-0400] P-5249 T-0 I BROKER 0: (11231) Logical Undo Phase begin at Block 74939 Offset 552.
[2012/04/18@09:50:15.417-0400] P-5249 T-0 I BROKER 0: (49) SYSTEM ERROR: Memory violation.
[2012/04/18@09:50:15.418-0400] P-5249 T-0 I BROKER 0: (5292) SYSTEM ERROR: The broker is exiting unexpectedly, beginning Abnormal Shutdown.
[2012/04/18@09:50:15.418-0400] P-5249 T-0 I BROKER 0: (-----) drexit: Initiating Abnormal Shutdown
[2012/04/18@09:50:15.418-0400] P-5249 T-0 I BROKER 0: (439) ** Save file named core for analysis by Progress Software Corporation.
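To compare the two attempts side by side, the message numbers can be pulled out of each log line with a short script. This is only a sketch: the regular expression assumes the log layout shown above, and `parse_log_line` and `first_system_error` are hypothetical helper names, not part of any Progress utility.

```python
import re

# Layout assumed from the log lines above:
# [timestamp] P-<pid> T-<tid> <severity> <process> <n>: (<message number>) <text>
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+P-(?P<pid>\d+)\s+T-(?P<tid>\d+)\s+"
    r"(?P<sev>\w)\s+(?P<proc>\w+)\s+(?P<num>\d+):\s+"
    r"\((?P<msg>[^)]+)\)\s+(?P<text>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def first_system_error(lines):
    """Return the first parsed record whose text starts with 'SYSTEM ERROR'."""
    for line in lines:
        record = parse_log_line(line)
        if record and record["text"].startswith("SYSTEM ERROR"):
            return record
    return None


if __name__ == "__main__":
    sample = [
        "Wed Apr 18 09:42:24 2012",
        "[2012/04/18@09:42:25.141-0400] P-17467 T-0 I BROKER 0: (11231) Logical Undo Phase begin at Block 74939 Offset 552.",
        "[2012/04/18@09:42:25.248-0400] P-17467 T-0 I BROKER 0: (49) SYSTEM ERROR: Memory violation.",
    ]
    error = first_system_error(sample)
    print(error["msg"], error["text"])
```

Run against both attempts, this makes it easy to confirm that each one fails at the same point, right after the logical undo phase begins.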


PROGRESS stack trace as of Wed Apr 18 09:50:15 2012
Command line arguments are
/usr/dlc/bin/_mprosrv /db/dr/iscorp -n 500 -c 350 -L 15000 -B 95000 -bibufs 30 -
aibufs 45 -spin 3000 -tablerangesize 650 -indexrangesize 60 -S iscorp-srv -N TCP
-ServerType 4GL
Startup parameters:
-pf /usr/dlc/startup.pf,-cpinternal ISO8859-1,-cpstream ISO8859-1,-cpcoll Basic,
-cpcase Basic,-d mdy,-numsep 44,-numdec 46,(end .pf),-db /db/dr/iscorp,-n 500,-c
350,-L 15000,-B 95000,-bibufs 30,-aibufs 45,-spin 3000,-tablerangesize 650,-ind
exrangesize 60,-S iscorp-srv,-N TCP,-ServerType 4GL
#1 [0x81c7c81] uttraceback+0x139 from /usr/dlc/bin/_mprosrv
#2 [0x81c9e80] uttrace+0x14c from /usr/dlc/bin/_mprosrv
#3 [0x81c9d1c] utcore+0x104 from /usr/dlc/bin/_mprosrv
#4 [0x8059d7f] drexit+0x3fa from /usr/dlc/bin/_mprosrv
#5 [0x80a14c0] drSigFatal+0x84 from /usr/dlc/bin/_mprosrv
#6 [0xe82420] /usr/kerberos/bin/
#7 [0x816c3e6] rmUndoLogicalDelete+0x54 from /usr/dlc/bin/_mprosrv
#8 [0x815bacc] rlundo+0x67 from /usr/dlc/bin/_mprosrv
#9 [0x816344f] rllbk+0x3b6 from /usr/dlc/bin/_mprosrv
#10 [0x8161067] warmstrt+0x295 from /usr/dlc/bin/_mprosrv
#11 [0x81607ab] rlseto+0x5c6 from /usr/dlc/bin/_mprosrv
#12 [0x80f5cf8] dbSetOpen+0x3ac from /usr/dlc/bin/_mprosrv
#13 [0x80f3972] dbenv1+0xf82 from /usr/dlc/bin/_mprosrv
#14 [0x80f28a5] dbenv+0x45 from /usr/dlc/bin/_mprosrv
#15 [0x811d137] dsmUserConnect+0xf7 from /usr/dlc/bin/_mprosrv
#16 [0x809d603] doserve+0x1ee from /usr/dlc/bin/_mprosrv
#17 [0x809d3f7] main+0x109 from /usr/dlc/bin/_mprosrv
#18 [0x7e1e9c] __libc_start_main+0xdc from /lib/libc.so.6
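In the absence of a dedicated tool, the stack frames above can at least be parsed into a readable call chain. The following is a minimal sketch, not a Progress utility: `parse_frames`, `call_chain`, and `first_symbol_after` are hypothetical helpers, and treating `drSigFatal` as the signal handler is an assumption based only on its name.

```python
import re

# A frame is "#<n> [<address>] <symbol>+<offset> from <module>", or, when no
# symbol was resolved (as in frame #6 above), "#<n> [<address>] <raw text>".
FRAME = re.compile(
    r"^#(?P<index>\d+)\s+\[(?P<addr>0x[0-9a-fA-F]+)\]\s+"
    r"(?:(?P<symbol>\w+)\+(?P<offset>0x[0-9a-fA-F]+)\s+from\s+(?P<module>\S+)"
    r"|(?P<raw>\S.*))$"
)


def parse_frames(text):
    """Return one dict per stack frame line; all other lines are ignored."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def call_chain(frames):
    """Symbol names from outermost caller to innermost frame, skipping unnamed frames."""
    return [f["symbol"] for f in reversed(frames) if f["symbol"]]


def first_symbol_after(frames, handler="drSigFatal"):
    """Return the first named frame listed after `handler`, or None.

    Assumption: `handler` is the fatal-signal handler, so the next named
    frame is where execution was when the signal arrived.
    """
    seen = False
    for frame in frames:
        if seen and frame["symbol"]:
            return frame["symbol"]
        if frame["symbol"] == handler:
            seen = True
    return None


if __name__ == "__main__":
    sample = """\
#5 [0x80a14c0] drSigFatal+0x84 from /usr/dlc/bin/_mprosrv
#6 [0xe82420] /usr/kerberos/bin/
#7 [0x816c3e6] rmUndoLogicalDelete+0x54 from /usr/dlc/bin/_mprosrv
#8 [0x815bacc] rlundo+0x67 from /usr/dlc/bin/_mprosrv
"""
    frames = parse_frames(sample)
    print(" -> ".join(call_chain(frames)))
    print(first_symbol_after(frames))
```

Applied to the full trace, this reads as a startup path through `warmstrt`, `rllbk`, and `rlundo`, with `rmUndoLogicalDelete` as the first named frame after the handler.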

Thanks!

Andrew.
 