Target replication database got modified

Jack@dba

Member
Hi all,
We need your experience to identify the root cause of the issue.

Database Version : 11.7
OS : IBM AIX 6

We have the Replication Plus product installed on our system. We have a source database and a single target, which resides on the DR server.
During a database service pack upgrade we shut down the source database once the target database was perfectly in sync, and then shut down the target database.
Once the upgrade was completed, we started the source database and everything came up fine. We then started the target database, but we got an error saying "target database got modified after its created", and we are not able to sync the target database with the source. Please find attached the error screenshot.

This issue happened in Jan '24 and we have raised a case with Progress (the vendor), but the case is still open.
We have also encountered a similar issue where the database goes into the pre-transition state if the target database has been shut down for more than 3 hours.

While the source and target databases are down there are no users on the server, and the crontab jobs and Control-M scheduler jobs are stopped as well.
Once the source database had been restarted and verified, we started the target database a short time later. During that time no application users or jobs had been released on the server, so how is it possible that the target database got modified?

Source.repl.properties file
====================

[server]
control-agents=agent1
database=prdsl
transition=manual
transition-timeout=600 # doesn't matter if transition is manual
defer-agent-startup=10080
agent-shutdown-action=recovery

[control-agent.agent1]
name=agent1
database=prdslc
host=mfgpro
port=4212
connect-timeout=120
replication-method=async
critical=0

[transition]
database-role=normal


Target repl properties
=============

[server]
control-agents=agent1
database=prdslc
transition=manual
defer-agent-startup=1440
repl-keep-alive=300
schema-lock-action=wait
agent-shutdown-action=recovery
transition-timeout=600
minimum-polling-delay=10
maximum-polling-delay=501

[control-agent.agent1]
name=agent1
database=prdslc
host=mfgpro
port=4212
maximum-message=32
connect-timeout=120
replication-method=async
critical=1

[agent]
name=agent1
database=prdslc
listener-minport=10000
listener-maxport=11000
 
I don't see any "please find attached for error screen shot".

Going into pre-transition when the source is down is normal. That's kind of the point and not a worry.

As for "how it is possible target database got modified?", first I would want to see the missing error message screen captures. Second, I would want to take a look at the .lg file for the target to verify the timeline and see if any process could have connected to the target in a manner that would generate such a message.
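To make that timeline check less tedious, a small script can list everything in the target .lg file that is not replication agent chatter within a given window. This is only a sketch: it assumes the line layout seen in the log excerpts in this thread, and the function names and the choice to skip RPLA/RPLS entries are my own assumptions, not anything provided by OpenEdge.

```python
import re
from datetime import datetime

# Header layout assumed from the .lg excerpts in this thread, e.g.
# [2024/02/17@11:18:54.588-0500] P-21103092 T-1 I SRV 1: (452) Login by liz on batch.
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2}@\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4})\]\s+"
    r"P-(?P<pid>\d+)\s+T-\d+\s+\w\s+(?P<proc>\w+)\s+\d+:\s+"
    r"\((?P<msgnum>[^)]*)\)\s+(?P<text>.*)$"
)
TS_FORMAT = "%Y/%m/%d@%H:%M:%S.%f%z"


def parse_line(line):
    """Return (timestamp, process type, message number, text), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return datetime.strptime(m["ts"], TS_FORMAT), m["proc"], m["msgnum"], m["text"]


def non_replication_events(lines, start, end, ignore=("RPLA", "RPLS")):
    """Collect entries inside [start, end] that were not written by the ignored process types."""
    events = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # blank lines, wrapped text, anything that is not a log header
        ts, proc, msgnum, text = parsed
        if start <= ts <= end and proc not in ignore:
            events.append((ts, proc, msgnum, text))
    return events


# Three lines copied from the target log posted in this thread.
sample = [
    "[2024/02/17@11:17:01.247-0500] P-18022780 T-1 I RPLA 141: (11699) A TCP/IP failure has occurred. The Agent(s) will enter PRE-TRANSITION, waiting for connection from the Replication Server.",
    "[2024/02/17@11:18:54.588-0500] P-21103092 T-1 I SRV 1: (452) Login by liz on batch.",
    "[2024/02/17@11:19:06.462-0500] P-21103092 T-1 I SRV 1: (739) Logout usernum 1640, userid , on onetis batch.",
]
window_start = datetime.strptime("2024/02/17@11:00:00.000-0500", TS_FORMAT)
window_end = datetime.strptime("2024/02/17@11:30:00.000-0500", TS_FORMAT)

for ts, proc, msgnum, text in non_replication_events(sample, window_start, window_end):
    print(ts.isoformat(), proc, msgnum, text)
```

Against a complete target log, the window to look at would run from the shutdown of the target to the moment the source tried to reconnect. Any login, utility or server start that shows up there is a candidate for whatever touched the database.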
 
So far as I can determine those logs do not contain any errors that suggest that the target was modified after creation.
 
Hi Tom,
Here is the error message.

Source log error details:
[2024/02/17@11:22:58.663-0500] P-56099336 T-1 I RPLS 152: (10842) Connecting to Fathom Replication Agent agent1.
[2024/02/17@11:23:01.693-0500] P-56099336 T-1 I RPLS 152: (10507) The Fathom Replication Server has successfully connected to the Fathom Replication Agent agent1 on host 10.72.00.1 .
[2024/02/17@11:23:01.693-0500] P-56099336 T-1 I RPLS 152: (11251) The Replication Server successfully connected to all of its configured Agents.
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (-----) It appears the target database /tis/dbmain/prdslc got modified after it's created.
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (-----) Expected time for last modification at: Fri Feb 16 22:40:45 2024, timestamp: 1708141245, or at: Fri Feb 16 22:40:45 2024, timestamp: 1708141245, found: Sat Feb 17 08:10:26 2024, timestamp: 1708175426.
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (-----) Expected master block update counter: 1607555 or 1607555, found: 1607578.
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (-----) Please make sure no database changes or transaction UNDOs before replication is enabled on the target database.
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (-----) Either the Fathom Replication Agent agent1 has been incorrectly configured, or the target database /tis/dbmain/prdslc has been improperly sourced, or the replication recovery file needs to be recreated
[2024/02/17@11:23:01.698-0500] P-56099336 T-1 I RPLS 152: (11696) The Agent agent1 cannot be properly configured and is being terminated.

[2024/02/17@11:23:01.701-0500] P-56099336 T-1 I RPLS 152: (18958) The Replication Server attempted shutdown of agent agent1 on host 10.72.00.1 . The agent will enter pretransition according to the specified agent-shutdown-action property.
[2024/02/17@11:23:01.701-0500] P-56099336 T-1 I RPLS 152: (-----) Replication agent or target database was misconfigured.
[2024/02/17@11:23:01.701-0500] P-56099336 T-1 I RPLS 152: (-----) Enabled default logins.
[2024/02/17@11:23:03.701-0500] P-56099336 T-1 I RPLS 152: (10505) The Fathom Replication Server is ending.
[2024/02/17@11:23:03.701-0500] P-56099336 T-1 I RPLS 152: (-----) *** Sync Info RPSERVER At Exit ***
[2024/02/17@11:23:03.701-0500] P-56099336 T-1 I RPLS 152: (-----)
[2024/02/17@11:23:03.701-0500] P-56099336 T-1 I RPLS 152: (-----) Online Replication Recovery Information for /tis/dbmain/prdslc:

Target log details :

[2024/02/17@11:16:58.227-0500] P-18022780 T-1 I RPLA 141: (10392) Database /tis/dbmain/prdslc is being replicated from database /tis/dbmain/prdslc on host 10.00.00.45.
[2024/02/17@11:17:01.247-0500] P-18022780 T-1 I RPLA 141: (12688) The Replication Server has been terminated or the Source database has been shutdown. The Agents will enter PRE-TRANSITION, waiting for re-connection from the Replication Server.
[2024/02/17@11:17:01.247-0500] P-18022780 T-1 I RPLA 141: (-----) Please check source database log file or replication server log file for details.
[2024/02/17@11:17:01.247-0500] P-18022780 T-1 I RPLA 141: (11699) A TCP/IP failure has occurred. The Agent(s) will enter PRE-TRANSITION, waiting for connection from the Replication Server.
[2024/02/17@11:18:54.588-0500] P-21103092 T-1 I SRV 1: (452) Login by liz on batch.
[2024/02/17@11:18:54.611-0500] P-21103092 T-1 I SRV 1: (5646) Started on port 11002 using TCP IPV4 address 0.0.0.0, pid 21103092.
[2024/02/17@11:18:55.619-0500] P-21103092 T-1 I SRV 1: (742) Login usernum 1640, userid liz client type ABL , on onetis batch using TCP/IP IPV4 address 10.00.00.45.
[2024/02/17@11:18:55.619-0500] P-21103092 T-1 I SRV 1: (7129) Usr 1640 set name to liz.
[2024/02/17@11:18:55.619-0500] P-21103092 T-1 I SRV 1: (17961) User 1640 set tty to onetis batch.
[2024/02/17@11:19:02.618-0500] P-21103092 T-1 I SRV 1: (7129) Usr 1640 set name to .
[2024/02/17@11:19:06.462-0500] P-21103092 T-1 I SRV 1: (739) Logout usernum 1640, userid , on onetis batch.
[2024/02/17@11:22:54.311-0500] P-18022780 T-1 I RPLA 141: (10392) Database /tis/dbmain/prdslc is being replicated from database /tis/dbmain/prdslc on host 10.00.00.45.
[2024/02/17@11:22:57.312-0500] P-18022780 T-1 I RPLA 141: (12688) The Replication Server has been terminated or the Source database has been shutdown. The Agents will enter PRE-TRANSITION, waiting for re-connection from the Replication Server.
[2024/02/17@11:22:57.312-0500] P-18022780 T-1 I RPLA 141: (-----) Please check source database log file or replication server log file for details.
[2024/02/17@11:22:57.312-0500] P-18022780 T-1 I RPLA 141: (11699) A TCP/IP failure has occurred. The Agent(s) will enter PRE-TRANSITION, waiting for connection from the Replication Server.

[2024/02/17@11:28:52.429-0500] P-24445238 T-1 I RPLA 141: (10392) Database /tis/dbmain/prdslc is being replicated from database /tis/dbmain/prdslc on host 10.00.00.45.
[2024/02/17@11:28:55.439-0500] P-24445238 T-1 I RPLA 141: (12688) The Replication Server has been terminated or the Source database has been shutdown. The Agents will enter PRE-TRANSITION, waiting for re-connection from the Replication Server.
[2024/02/17@11:28:55.439-0500] P-24445238 T-1 I RPLA 141: (-----) Please check source database log file or replication server log file for details.
[2024/02/17@11:28:55.439-0500] P-24445238 T-1 I RPLA 141: (11699) A TCP/IP failure has occurred. The Agent(s) will enter PRE-TRANSITION, waiting for connection from the Replication Server.
 
Ok, I missed that. I blame it on trying to read the log on my phone ;)

FWIW I really dislike those (-----) messages. But that's not your fault.

The message at 11:23 says that it expected the last change timestamp to have been on or around Fri Feb 16 22:40:45 2024 but, instead, found a change at Sat Feb 17 08:10:26 2024.
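For anyone who wants to check the arithmetic, the two (-----) messages can be pulled apart with a few lines of Python. This is just a sketch that works on the message text exactly as posted above; the helper names are made up for illustration, and the fixed -0500 offset is taken from the log timestamps.

```python
import re
from datetime import datetime, timedelta, timezone

# Message text copied from the source log excerpt in this thread.
TIME_MSG = (
    "Expected time for last modification at: Fri Feb 16 22:40:45 2024, timestamp: 1708141245, "
    "or at: Fri Feb 16 22:40:45 2024, timestamp: 1708141245, "
    "found: Sat Feb 17 08:10:26 2024, timestamp: 1708175426."
)
COUNTER_MSG = "Expected master block update counter: 1607555 or 1607555, found: 1607578."


def timestamp_gap(message):
    """Return (expected, found, gap) using the first and last epoch timestamps in the message."""
    stamps = [int(s) for s in re.findall(r"timestamp: (\d+)", message)]
    expected, found = stamps[0], stamps[-1]
    return expected, found, timedelta(seconds=found - expected)


def counter_gap(message):
    """Return how far the found update counter is ahead of the expected one."""
    m = re.search(r"update counter: (\d+) or (\d+), found: (\d+)", message)
    return int(m.group(3)) - int(m.group(1))


LOG_TZ = timezone(timedelta(hours=-5))  # the log lines carry a -0500 offset

expected, found, gap = timestamp_gap(TIME_MSG)
print("expected:", datetime.fromtimestamp(expected, LOG_TZ))
print("found:   ", datetime.fromtimestamp(found, LOG_TZ))
print("gap:     ", gap)                        # 9:29:41
print("counter: ", counter_gap(COUNTER_MSG))   # 23
```

So the target's last-modified time is about nine and a half hours later than the source expected, and the master block update counter is 23 ahead. The messages do not say what made those updates, only that something did.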

Unfortunately the logs start on Feb 17th at 11:22 (source) and 10:03:54 (target). So, something happened but your logs are not showing it.

One thing that often messes up OE Replication is running a backup at an unfortunate time. It looks like the original failure of communications with the source was Friday evening around 10pm? And fail-over efforts started around 10am Saturday? If a backup ran Saturday morning and perhaps finished around 8:10am that might be what this is all about. Or it might have been something else entirely. The logs are not complete enough to know how you got to this point.

Also - these logs are from February but your original post mentions January.

Is this a repeatable situation?
 
Thanks Tom.

The logs I have provided are for February only.
As stated, our backup starts late at night and completes around 2 AM.
Our only concern is that, when the source and target are both down and the databases are then started within 2 to 3 minutes of each other, we see the target database reported as modified.

If the target replication database is a read-only database, how is it being updated?

Can you share your thoughts here?
 
As I mentioned, important parts of the logs are missing.

I cannot say for sure that this is your root cause but it is well known that a backup during synchronization will spoil replication. From what you have shown so far I cannot rule that out and there is plenty of reason to think that it has occurred.

You might also have something like a "truncate bi" being performed on the target. But I cannot say if that is happening because, again, the relevant portion of the logs is missing.

Or, someone might have run various commands that transitioned the target to a source. (Or that might have happened automatically.) And now when the original source tries to connect it is surprised to find that the target is no longer a target. Or something like that. But, again, the relevant portion of the logs is not available.

This has happened at least twice now, correct? Your original post mentions that a "Database Service pack upgrade" was involved at least the first time. I'm assuming that that means that you are applying OpenEdge service packs? Is that correct? Applying two in such a short timeframe is unusual albeit praise-worthy. But I am curious - why were you doing that? And what service pack level are you now at? Or was one associated with a service pack and the other event not?

Since it has happened apparently at least twice it seems somewhat repeatable. Have you tried to duplicate the problem in a test environment?
 
Thanks Tom.
We don't have proper logs to share, but I'm asking in what situations this error, "target database got modified", would be encountered.

How can we reproduce the same issue in a test environment? Please help with the steps.
 