Question OE Replication

RealHeavyDude

Well-Known Member
OE 11.7

Just to clarify: As per my understanding, OE Replication requires the source and target databases to reside on the same platform. As far as I know (I vaguely remember my sole foray into OE Replication, back in 2003 or so), you need to start OE Replication with a backup of the source database on the target. That would effectively rule out a different platform for the target.

Am I correct?

I think so.

Nevertheless, since I am now in the inception phase of a project to migrate our Progress server backends off of Solaris SPARC onto RHEL x86, we reached out to our Progress account manager to talk about the platform change. One concern is the grace period during which we are allowed to run both licences, as the migration through all environments will take us about a year. But our biggest concern is the time it will take to copy the binary dump files of the 1 TB database across the network, as alternatives like NFS file systems or USB devices are not approved. So the Progress account manager - who clearly wants to sell us OpenEdge Replication - said that we could use it for the database migration from Solaris SPARC to RHEL x86: start with an empty database on Linux, enable OE Replication and off we go ...
Unfortunately project management and the business stakeholder bought into that idea and I need to convince them that it won't work.

Thanks in advance.
 

TomBascom

Curmudgeon
Solaris SPARC and Intel use different byte orders. Last I knew, that prevents you from backing up on one and restoring on the other.

But maybe the sales guy has been working on the side in engineering and knows something?
 

TomBascom

Curmudgeon
Having said that... I do vaguely recall recently hearing a product manager or someone similar say something that sounded suspiciously like platform neutrality with oe replication. I don't think it came from anyone that I would consider "technically reliable" (IOW it wasn't Rich Banville) and I have not been motivated to dig into it. If it is a real thing I would expect that it requires a very up to date release or is maybe an oe12 roadmap item.

I would also want to do a proof of concept and a lot of testing before I committed to actually implementing it on an important system.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I agree with Tom: you can't do this between a big-endian system (Solaris SPARC) and a little-endian system (RHEL x86).

There were a bunch of Replication enhancements in 11.6 and 11.7 but this wasn't one of them. They added Replication Sets but it's still between like platforms.

From the 11.7 Replication User Guide, under Guidelines for working with source and target databases:
Both the source and the target machines must have the same endian ordering. Endian ordering defines how multiple byte integers are stored in memory—either by most-significant byte (MSB) or least-significant byte (LSB). Those systems storing by MSB are called Big Endian, and those storing by LSB are called Little Endian.

The term endianness is used in general to describe a situation in which binary files are portable between platforms; those platforms with the same endianness may use binary data transparently.

Typically, UNIX machines and Windows machines use different endian ordering for storage. Therefore, a Windows source database can be replicated to another Windows machine, but not to an HPUX machine, for example. An HPUX source database can be replicated to another HPUX machine.
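
If you want to double-check the byte order on each box before committing to a plan, a quick test like this should do it (not verified on every OS - Solaris may need the XPG4 od for the -t option):

Code:
# Print the bytes "AB" as one 16-bit word in the host's byte order:
# big endian (SPARC) shows 4142, little endian (x86) shows 4241.
printf 'AB' | od -An -tx2

# On Linux, lscpu reports it directly:
lscpu | grep 'Byte Order'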
 

Cringer

ProgressTalk.com Moderator
Staff member
You could probably achieve it with Pro2, which is essentially what Bravepoint Progress Services use for minimal downtime dump and loads AFAIK. But, like others, I would seriously doubt being able to do it with Replication.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
You could probably achieve it with Pro2, which is essentially what Bravepoint Progress Services use for minimal downtime dump and loads AFAIK. But, like others, I would seriously doubt being able to do it with Replication.
I thought they used an old BravePoint product called ProDump&Load or something similar, based on replication triggers.

That said, not too long ago Mike Furgal gave a presentation about performing a platform migration via Pro2Pro. So that approach would be viable, though somewhat expensive if you don't already have a license.
 

TomBascom

Curmudgeon
If you stand back far enough there is a lot of similarity between the two.

My understanding is that, these days, they prefer to use Pro2.

Personally I haven't come across any situations where I couldn't get a normal dump & load done in a small enough window to make it pointless to go through that level of effort. It *sounds* great that you just need this really short window at the beginning and at the end, but there is, nonetheless, a lot that goes into making that work. A normal dump & load is much simpler and generally more than good enough for almost everyone.
 

RealHeavyDude

Well-Known Member
I'm almost sure that I will go for a binary dump and load, as this is - IMHO - the most straightforward approach. I've done it lots of times and never had any issues whatsoever.

Basically there are 2 issues I will face regarding time consumption:
  1. Dump the data on the source Solaris SPARC system to a ZFS file system residing on an EMC SAN. I expect it to take some 8+ hours.
  2. Copy the binary dump files (roughly 500 GB) via FTP over a slow network connection, as NFS file systems and/or a USB device are both not approved. I expect it to take some 10+ hours.
Effectively this means that I will lose 18+ hours before I can even start the load on the target. Including pre- and post-work (backup, table analysis before and after) I expect it to take more than 24 hours if nothing stupid goes wrong. By stupid I mean rogue sys admins and a security implementation that kills break-glass access to PROD systems after 8 hours ...

Thanks for your insights - I will have a deeper look at Pro2, maybe it can help us.
 

Cringer

ProgressTalk.com Moderator
Staff member
You can probably reduce that time quite significantly by starting to copy and load the files as they become available to you. One technique is to have n parallel dump scripts that each dump a subset of tables. Each script is calculated to do approximately the same amount of work by using a tabanalys (you need this anyway to check record counts). You then order script 1 to work from the biggest table downwards, script 2 from the smallest table upwards, script 3 from the biggest downwards again, and so on. That way you should always have finished dump files available to copy and load.
I've never done it on as big a database as that, and it sounds like you have your hands tied, but I used such a technique to reduce dump and load times for a 120GB database by around 75% if memory serves.
The key is to test of course.
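
Just to illustrate the idea - this is purely hypothetical (database name, paths and the format of the size file are made up, and you'd need to adapt the extraction to your own tabanalys output):

Code:
#!/bin/sh
# Sketch: deal the tables out across 4 dump scripts of roughly equal work.
# Assumes tabsizes.txt holds lines of "tablename size-in-MB" pulled from a tabanalys.
N=4
rm -f dump[1-4].sh
sort -rn -k2 tabsizes.txt | awk -v n=$N '{
    s = (NR - 1) % n + 1                 # round-robin by size rank
    printf "proutil proddb -C dump %s /dump/thread%d\n", $1, s >> ("dump" s ".sh")
}'
# Reverse the even-numbered scripts so they start with their smallest tables;
# that way finished .bd files keep appearing for the copy/load side.
for i in 2 4; do
    sed -n '1!G;h;$p' dump$i.sh > dump$i.tmp && mv dump$i.tmp dump$i.sh
done
chmod +x dump[1-4].sh

Then you kick off dump1.sh to dump4.sh in parallel and start shipping and loading the .bd files as each one finishes.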
 

andre42

Member
A colleague of mine won the DBA and Programmer's Challenge at the 2017 EMEA PUG Challenge (lowest downtime for a dump and load). I don't have the technical details, but I understand that he used a generator that writes lots of batch or script files that can run in parallel. It seems that there is some room for optimization.
 

Cringer

ProgressTalk.com Moderator
Staff member
Yes his solution is what I tried to outline above. The nice thing about his solution was that most of it was also automated meaning he had very little to do other than make a cup of coffee.
 

RealHeavyDude

Well-Known Member
I've automated the dump & load of single storage areas or the whole database in a similar way.

The foundation is a database (index and table) analysis that runs on a weekly basis, which gives me enough accuracy to spread the tables across several dump threads. In our PROD environment, 4 dump threads is usually the fastest way to dump the database or a single area. Nevertheless, for safety reasons I also incorporate a table analysis before and after.

As I don't have any access to our PROD systems, I needed to automate the process to the extent that all the operator has to do is execute a shell script. The shell script (a rough sketch follows the list):
  • executes a table analysis
  • starts an ABL session that loads the result into dedicated database tables
  • starts an ABL session that creates the scripts for the dump threads and a script for a single load thread, based on the last table analysis stored in the database
  • invokes the (4) dump thread scripts in parallel
  • truncates the storage areas concerned
  • invokes the load script
  • executes a table analysis
  • starts an ABL session that compares the result with the data stored in the database
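
A stripped-down sketch of that wrapper looks roughly like this (database name, paths and the .p programs are placeholders, not the real ones):

Code:
#!/bin/sh
# Stripped-down sketch of the operator script described above.
DB=/db/proddb
WRK=/work/dnl

proutil $DB -C tabanalys > $WRK/tabanalys_before.txt                 # table analysis
mbpro $DB -b -p load_tabanalys.p  -param $WRK/tabanalys_before.txt   # load result into db tables
mbpro $DB -b -p gen_dnl_scripts.p -param $WRK                        # generate dump/load scripts

for i in 1 2 3 4; do sh $WRK/dump_thread$i.sh & done                 # 4 parallel dump threads
wait

proutil $DB -C truncate area "Data"                                  # truncate the storage area(s)
sh $WRK/load_all.sh                                                  # single load thread

proutil $DB -C tabanalys > $WRK/tabanalys_after.txt                  # table analysis again
mbpro $DB -b -p cmp_tabanalys.p -param $WRK/tabanalys_after.txt      # compare record counts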
So far I've not seen a significant difference between an inline index rebuild during the binary load and an index rebuild for the whole database or area once all tables are loaded, in our current environment (Solaris SPARC with ZFS file systems residing on a SAN). That might change on the new platform (RHEL Linux VMs with Ext4 file systems).
 

TomBascom

Curmudgeon
The effectiveness of the inline index rebuild vs the one big bang varies.

The one big bang approach has the advantage of being able to leverage the new (10.2b06+) multi-threaded idxbuild stuff.

The inline approach can work well if the table sizes are nicely distributed and the dumps finish in a helpful order.

FWIW -- in my experience the big bang is almost always the winner. I do try the inline approach once in a while just to be sure, but usually the big bang wins.
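
To make the two options concrete - very roughly, and with tuning parameters you'd want to work out for your own box and release:

Code:
# inline: build the indexes as each table is loaded
proutil targetdb -C load /load/customer.bd build indexes

# big bang: load everything without indexes, then one multi-threaded idxbuild
proutil targetdb -C load /load/customer.bd
proutil targetdb -C idxbuild all -thread 1 -threadnum 8 \
        -datascanthreads 8 -mergethreads 4 -TB 64 -TM 32 -B 1024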
 

TomBascom

Curmudgeon
You should *always* do the tabanalys comparison. It will save your butt some day.

The big problem with that is usually that the dump server is old and slow. One thing that you can do to speed up the tabanalys is to run one per area in parallel, and then merge them for the compare.

The newly loaded db is usually very fast. I just did an HP to Linux migration where the HP side was taking 5+ hours for the single threaded tabanalys. I broke it up and ran them in parallel on the dump side which got that down to 2 hours (which was really just the one monster table, most of them were done much sooner than that). Over on the newly loaded side it took 6 minutes for the single threaded tabanalys.

This did, in fact, save my butt. During the dry runs something happened and 2 tables were missing from one of the scripts. The tabanalys compare found that, and I fixed it long before we did it for real. That's not the first time I was glad that I checked record counts.
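
The compare itself doesn't need to be fancy. Something along these lines works, assuming the table lines in your reports start with the owner prefix (PUB.) and the record count is the second column - check that against your actual output first:

Code:
awk '/^PUB\./ {print $1, $2}' tabanalys_before.txt | sort > before.counts
awk '/^PUB\./ {print $1, $2}' tabanalys_after.txt  | sort > after.counts
diff before.counts after.counts && echo "record counts match" || echo "MISMATCH - investigate"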
 

Cringer

ProgressTalk.com Moderator
Staff member
Yeah, accidentally running the same load script twice at 3 in the morning when you're tired is always a good one to find out about BEFORE the users log in ;)
 