[Progress Communities] [Progress OpenEdge ABL] Forum Post: OE Replication Target Crash, source continues

Status
Not open for further replies.
James Palmer

Guest
OE 10.2B08. See this from time to time on various sites and just haven't got to the bottom of it. This time it's landed us in a difficult position with a customer because it's the 3rd time Replication has failed in a week (other 2 were from other causes).

[2018/02/08@12:41:16.083+0000] P-6244 T-5912 I RPLA 41: (9407) Connection failure for host 10.100.1.40 port 64037 transport TCP.
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) Diagnostic Dump of RPCommInfo_t - TCP/IP Poll Error:2
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0000: 0000 0000 0000 0000 f056 e401 9113 0000 8813 0000 ec13 0000 0200 0000 2400 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0020: cc01 0000 a411 0000 0000 0000 6a45 7c5a 0000 0000 3c41 0000 0000 0000 2d00 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0040: 0000 0000 58f0 ffff 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0060: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0080: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 00a0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 00c0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 00e0: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0100: 0000 0000 0000 0000 0000 0000 3130 2e31 3030 2e31 2e34 3000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0120: 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) 0140: 0000 0000 0000 0000 0000 0000
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (10492) A communications error -157 occurred in function rpNLA_PollListener while receiving a message.
[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (11699) A TCP/IP failure has occurred. The Agent's will enter PRE-TRANSITION, waiting for connection from the Replication Server.
[2018/02/11@16:08:34.568+0000] P-6244 T-7392 I RPLA 41: (9438) CTRL_SHUTDOWN_EVENT console event received.

See above logfile. As you see, the target quite clearly crashes because of a communication error. The thing is, we're replicating 10 databases and this is the only one with the issue. It occurred within an hour of reseeding Replication in case that's pertinent.

The source database just carried on happily with nothing in the logs. It just started locking AI files. First we knew about it was when the system crashed as there was a bug in our monitoring script.

Has anyone seen this before? Any ideas as to the cause? Any ideas how to fix? I've fixed the monitoring script so we should get alerts well in advance of DB crash so it's not so much of an issue, but it's not pretty.
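For anyone wanting a similar early warning, here is a minimal Python sketch of a log check that a monitoring script could run against the target database log. It is not the script mentioned above. The line layout and the message numbers (9407, 10492, 11699) are taken from the excerpt in this post; treating exactly those three as alert-worthy, and the function name `find_replication_alerts`, are assumptions made for illustration.

```python
import re

# Message numbers copied from the log excerpt in this post. Which of them
# should raise an alert is an assumption of this sketch.
ALERT_MESSAGES = {"9407", "10492", "11699"}

# Matches lines shaped like:
# [2018/02/08@12:41:16.083+0000] P-6244 T-5912 I RPLA 41: (9407) Connection failure ...
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+P-(?P<pid>\d+)\s+T-(?P<tid>\d+)\s+(?P<level>\w)\s+"
    r"(?P<source>\w+)\s+(?P<num>\d+):\s+\((?P<msg>[^)]+)\)\s+(?P<text>.*)$"
)


def find_replication_alerts(lines):
    """Return (timestamp, message number, text) for each matching agent line."""
    alerts = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # not a log line in the expected layout
        if m.group("source") == "RPLA" and m.group("msg") in ALERT_MESSAGES:
            alerts.append((m.group("ts"), m.group("msg"), m.group("text")))
    return alerts


if __name__ == "__main__":
    sample = [
        "[2018/02/08@12:41:16.083+0000] P-6244 T-5912 I RPLA 41: (9407) "
        "Connection failure for host 10.100.1.40 port 64037 transport TCP.",
        "[2018/02/08@12:41:16.085+0000] P-6244 T-5912 I RPLA 41: (-----) "
        "Diagnostic Dump of RPCommInfo_t - TCP/IP Poll Error:2",
    ]
    for ts, msg, text in find_replication_alerts(sample):
        print(ts, msg, text)
```

In practice the input would be the tail of the target's .lg file, and the result would be fed into whatever alerting is already in place. Since the source logged nothing in this case, a check on the target log like this one would need to sit alongside the existing check for locked AI files rather than replace it.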
