Backup strategy

Afternoon all,

v10.2B on Linux.

We're looking to change our DB backup routine and would appreciate some input on the new plan.
Currently, we've got 3 live databases running 24/7 with AI (managed by the AI Archiver). We do a full online backup every evening, transfer the backup files across to a secondary DB server, and recreate a reporting database for use the following day.

The issue is that the databases are getting a bit too large for us to be transferring and rebuilding every night, so we've looked at incorporating incremental backups. The basic plan would be:

Full backup on Saturday night, with incremental backups Sunday through Friday.
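
In probkup terms, I'm picturing something roughly like this (db name and paths made up for illustration):

# Saturday night -- full online backup
probkup online proddb /backups/proddb_full.bak

# Sunday to Friday -- incremental online backups; each contains the blocks
# changed since the previous backup (full or incremental)
probkup online proddb incremental /backups/proddb_incr_sun.bak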

Another option that I've seen mentioned in a couple of posts in this forum is to use AI rather than incrementals, and this seems like quite a decent option, especially given that Progress supplies the AI Archiver to manage the AI files.

Just wondering if there's anything in particular we need to look out for when moving to incremental backups. And if we did use AI files instead, would it be a case (within reason) of the shorter the better for the AI Archiver interval?
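
For reference, the interval I mean is the one set at broker startup, something like this (paths and values made up for illustration):

# Start the broker with the AI archiver in timed mode;
# -aiarcinterval is the maximum seconds between extent switches/archives
proserve proddb -aiarcdir /aiarchive -aiarcinterval 600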

TIA,
DG.
 

TomBascom

Curmudgeon
Is the reporting database read-only?

If not, you are going to ruin it for applying additional incrementals or after-image roll-forward as soon as you open it, which means that you will have to obtain a new backup all over again.

Is there a reason that you have not implemented OE Replication? That would be an even better solution to this problem.

Regarding the "too large to transfer & rebuild" issue... Are you sure? How large? How much network bandwidth do you have, and how much is actually in use during the transfer?

Once upon a time I was faced with a "not enough time" issue. The transfer was taking 12 hours or so. We were considering using the corporate jet to transfer tapes because it would probably only have taken 4 hours that way (including driving to and from the airports...). But when I looked at the network I discovered that it was less than 10% utilized... With a little scripting magic I split the transfer into 10 pieces and launched them all in parallel -- after that the transfer only took a bit more than an hour.
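
The scripting magic is nothing fancy. On a modern Linux box with GNU coreutils, a sketch would be something like this (host name and paths made up):

# Split the backup into 10 chunks and push them in parallel
split -n 10 -d proddb_full.bak chunk_
for f in chunk_*; do
  scp "$f" backuphost:/restore/ &
done
wait

# Reassemble on the far side (numeric suffixes keep the glob in order)
ssh backuphost 'cat /restore/chunk_* > /restore/proddb_full.bak'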
 

cj_brandt

Active Member
Maybe list the size of the databases and how long it takes to back up and restore.

I used to work for a company that had some big databases at clients' sites, and we would try to pull them back over a very slow network. About 80% of the data was in 2 blob tables that we didn't need, so it was much faster for us to dump all the tables except the 2 containing the blobs and then restore that.

Also, 10.2B has the -dumpspecified parameter for binary dumps. It allows you to grab only a portion of a table.
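
If memory serves, the shape of it is something like this (db, table, and field names made up; the field you select on needs an index):

# Binary dump of only the customer rows with custnum greater than 1000
proutil proddb -C dumpspecified customer.custnum GT 1000 /data/dumps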

I don't see the point of incremental backups. They take almost as long to run as full backups.

If you are going to access a database you are applying AI files to, make sure you use the -RO parameter. Your clients can still get errors, but otherwise you will adjust the timestamps in the db files and you won't be able to apply any more AI files.
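
In other words, the roll-forward/reporting combination looks roughly like this (db and AI file names made up):

# Apply the next archived AI file to the spare...
rfutil reportdb -C roll forward -a /aiarchive/reportdb.a1

# ...and only ever touch it with read-only clients in between
mpro reportdb -RO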
 
Thanks for the response, Tom.

No, it's not read-only. We apply various data updates once the reporting database is built, but I think we'd be okay having to keep the Saturday night full backup and rebuild the database from it every night, applying the increasing number of incrementals, before applying our data updates.
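
So the nightly rebuild would look roughly like this (db and file names made up):

# Restore Saturday's full backup, then apply each day's incremental in order
prorest reportdb /backups/proddb_full.bak
prorest reportdb /backups/proddb_incr_sun.bak
prorest reportdb /backups/proddb_incr_mon.bak
# ...and so on, before our own data updates go on top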

Yep, the management are certainly aware of Replication, but it doesn't seem to be something they're keen on. They're even less keen on giving me access to the company jet, alas.

Good questions on the transfer size & time. We're transferring about 140 GB of backup files, and the transfer itself takes about 2.5 hours. This, combined with the backup & verification on the source side and the restore on the target side, means we're taking an ever-increasing amount of time. I think disk I/O is a bit of a bottleneck for the transfer, but there's some talk of new disks in the near future.

I'm not sure about the bandwidth utilisation but I'll speak with our sysadmin guy to see how we go about checking it. Assuming there is spare bandwidth, parallel transfers are something I'll definitely look at.

Thanks again,

DG.
 

cj_brandt

Active Member
Why bother with the verification of the backup on the source side when you are going to restore the database to a backup server? If the backup wasn't valid, you would know in a few hours. Since you restore the database to a backup server and you have AI files, I wouldn't bother with the verification.
 

TomBascom

Curmudgeon
I agree. That verification step is redundant. It is even worse than redundant if you are waiting for it before you take the next step.
 

atopher1

New Member
Another option that I've seen mentioned in a couple of posts in this forum is to use AI rather than incrementals, and this seems like quite a decent option, especially given that Progress supplies the AI Archiver to manage the AI files.

Any thoughts on this as an option? We also run full backups each night, and the backup is then transferred off-site to third-party remote storage. The third-party transfer bit is the bottleneck. The DB is about 90 GB.

What I've considered is running 1 or 2 full backups a week and then just backing up the subsequent AI files. So using AI would 1) save time and 2) save costs on the backup solution.

My only reservation is that when the proverbial hits the fan, there's no replacement for a full backup, right?

Chris
 

TomBascom

Curmudgeon
I think that using AI to extend the period between full backups is fine. Especially if you are continuously rolling forward. In that case your remote db is just as good as a full backup. And you know immediately if there is a problem.

The issues above have to do with trying to use the database for two purposes -- both reporting and backup -- while continuously rolling forward. If you want to do that then you need to either use -RO clients for the reporting (which has many problems) or you need to use OE Replication.

BTW -- you could "backup norecover" the rolled-forward db and then restore it for snapshot-oriented reporting purposes -- it won't be "live" but it will be consistent as of the 2nd backup, which will avoid the issues with -RO access.
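
A rough sketch of that, assuming the -norecover qualifier on probkup and with made-up names:

# Back up the warm spare without crash recovery, so it can keep rolling forward
probkup sparedb /backups/spare_snap.bak -norecover

# Restore the snapshot to a separate db and report against THAT copy
prorest reportdb /backups/spare_snap.bak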
 

atopher1

New Member
Thanks Tom, I'll be implementing this in our development environment this week.

We haven't done continuous roll forward for a few years now, as it was decided that the risk was such that we could afford an hour or so of downtime.

You've got me thinking though. We've just purchased some additional storage that we could use to roll forward onto. And I quite like the comfort of having a hot(-ish) spare so we can quickly recover from a DR event. And mainly it'll help me sleep easier :).

Just a comment for the original post - we use storage-based snapshots for reporting and test environments. Although it's not automated, it works very well for us.
 
Morning,

Thanks for the input, it's helped clarify a few things and given us some good ideas. I've run a couple of test scripts to look at transferring the backup files in parallel rather than in series, and it knocks about two-thirds off the transfer time, so we're well pleased with this gain (this despite being told by our sysadmin guys that the transfer was fully utilising available bandwidth).
It would certainly save us a lot of time to do away with the verification part of the process - as you say, we'd know soon enough if the backup is valid, so we'll do this as well.

Thanks,

DG.
 