Backups and Updating

wessel

New Member
We are looking for a backup and disaster recovery solution. Could you give some suggestions on backup procedures used by corporate companies with large databases of about 80 GB and more?

For instance, do they use after-imaging, mirrored storage disks or even mirrored servers, or incremental backups? What other methods are there?

Then also, what procedures can be followed when updating large databases of 80 GB and more? We can't afford to shut the system down for the time it takes to update the database; it takes too long.

Please give some advice.

Looking forward to your response.
 

Casper

ProgressTalk.com Moderator
Staff member
Hi just a quick reply:

- Always use after-imaging (AI), with the AI extents on separate disks from the database, even on smaller databases (size doesn't matter; I suppose all data is relevant). You'll regret it if you don't. A quick sketch of enabling it follows below.
- Always mirror (use RAID 1+0 or something similar), so a disk failure won't mean a crash of the database.
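
If it helps, enabling AI usually boils down to something like this (database name and paths are invented, so check the docs for your release):

  prostrct add proddb add.st          # add.st describes the new AI extents, on their own disks
  probkup proddb /backup/proddb.bck   # AI can only be enabled against a freshly backed-up db
  rfutil proddb -C aimage begin       # switch after-imaging on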

Maybe you should look into Fathom replication for backup.

You could also look into creating a hot spare, which means making a copy of the database and applying the AI files from the production database to it at regular intervals.
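
Very roughly, the hot spare cycle looks like this (names and paths invented, and the exact rfutil options may differ per release):

  prorest sparedb /backup/proddb.bck          # build the spare from a backup of production
  rfutil sparedb -C roll forward -a ai-file   # apply each full AI file, in order, on whatever interval you choose

If production dies, you roll the last AI files into the spare and point the users at it.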

A hot spare and Fathom Replication keep downtime to a minimum.

I'm sure there are people who have lots more to tell you, but this is just a hint at what direction to take.

HTH,

Casper.
 

TomBascom

Curmudgeon
wessel said:
Then also, what procedures can be followed when updating large databases of 80 GB and more? We can't afford to shut the system down for the time it takes to update the database; it takes too long.

I don't know what you're doing but an upgrade doesn't need to take more than a couple of minutes -- regardless of db size. You can just do a "proutil dbname -C convXY". Is that still too long?
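
For example, taking a version 8 database to version 9 is just (database name invented):

  proutil mydb -C conv89

The exact qualifier depends on which versions you are moving between.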

Of course if you want to take full advantage of new features (like storage areas) you may need to do things like dump and load but there are ways to do that in relatively short windows too.
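
A binary dump and load, for instance, is usually far quicker than an ASCII dump plus bulk load. Roughly (table and directory names invented, and the options vary a bit by release):

  proutil olddb -C dump customer /dumps      # writes /dumps/customer.bd
  proutil newdb -C load /dumps/customer.bd   # load it into the new structure
  proutil newdb -C idxbuild all              # rebuild the indexes afterwards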
 

TomBascom

Curmudgeon
wessel said:
We are looking for a backup and disaster recovery solution. Could you give some suggestions on backup procedures used by corporate companies with large databases of about 80 GB and more?

For instance, do they use after-imaging, mirrored storage disks or even mirrored servers, or incremental backups? What other methods are there?


I have several customers who do most of those things simultaneously (they don't do incremental backups). At any point in time they have a live database and 5 (or more) paths to recovery, ranging from minutes (split mirrors) to a couple of hours (offsite tapes).
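
An online probkup, for example, is one of those paths and costs no downtime at all (database name and path invented):

  probkup online proddb /backup/proddb.bck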

It all depends on your (unstated) requirements and your budget.
 

wessel

New Member
TomBascom said:
I don't know what you're doing but an upgrade doesn't need to take more than a couple of minutes -- regardless of db size. You can just do a "proutil dbname -C convXY". Is that still too long?

Of course if you want to take full advantage of new features (like storage areas) you may need to do things like dump and load but there are ways to do that in relatively short windows too.
Hi there. Thanks for the reply.

We don't want to upgrade, we want to update the databases, as in database changes. Dumping and loading takes forever. A single table consists of 6,656,016 blocks at 256 records per block, which gives you about 1.7 billion records.

Now there are 2 databases, each consisting of about 10-15 tables (some large, as mentioned above), all in separate storage areas.

What methods are there to update the databases while at the same time minimizing or even eliminating downtime?
 

TomBascom

Curmudgeon
wessel said:
Hi there. Thanks for the reply.

We don't want to upgrade, we want to update the databases, as in database changes. Dumping and loading takes forever. A single table consists of 6,656,016 blocks at 256 records per block, which gives you about 1.7 billion records.

Now there are 2 databases, each consisting of about 10-15 tables (some large, as mentioned above), all in separate storage areas.

What methods are there to update the databases while at the same time minimizing or even eliminating downtime?

Can you give an example of the sort of update that you're talking about?

What kind of hardware is this running on?
 

wessel

New Member
Our databases are running on an HP ProLiant server with 4 x 3.2 GHz Xeon processors, 4 GB of memory and 8 x 135 GB 10,000 rpm SCSI disks, mirrored. We are running Progress 9.1D, patch 9, on Red Hat Enterprise Linux 3 Advanced Server.

Updating takes place when we need to drop or load fields, tables and indexes, or, performance-wise, when we zero the scatter factor. None of these are necessary right now, seeing as it's a new server.

What happened is that when the new server was installed, we only got 4 days to finish setting it up.

For the first db (70 GB), we started a bulk load on a Thursday evening and it completed on Sunday afternoon.

For the second db (50 GB), we started Friday evening and had to kill the process early Monday morning in order to restore a backup and have the old db up and running by business hours on Monday morning.

The reason for the dump and load is that the databases were converted from a 1 KB block size with 32 records per block to a 4 KB block size with 256 records per block.
 

Casper

ProgressTalk.com Moderator
Staff member
Hi there,

I suppose both db's are running on separate servers?
256 rpb with 4 KB blocks means a maximum of about 32 GB of data per area (the roughly 2 billion record slots an area can address, divided by 256 rpb, times 4 KB), and to actually get 256 records into a 4 KB block the average record size would have to be around 16 bytes (4096 / 256), which is what a tabanalys should show.

This is a pretty small average record size, so I suspect the calculation you made:
A single table consists of 6,656,016 blocks at 256 records per block, which gives you about 1.7 billion records.

isn't accurate.

With 8 x 135 GB mirrored disks, do you mean you have 16 disks (8 + 8) or 8 disks mirrored as 4 + 4?

Casper.
 

wessel

New Member
No, the databases run on the same server; however, one was dumped and loaded on the old server and the other on the new one.

I apologize, my calculation was incorrect. It should be that the entire area consists of 6,656,016 blocks at 256 records per block, which gives you about 1.7 billion records for that area.

8 mirrored disks as in 4+4.

Wessel
 

Casper

ProgressTalk.com Moderator
Staff member
Hi Wessel,

I meant that there are most probably not 256 records in each block of that area, because I suspect your average record size is bigger than the roughly 16 bytes needed to actually fit 256 records into a 4 KB block.

Furthermore, with databases of that size I think I would use more, smaller disks. The more disks, the more I/Os per second, and therefore the better the performance of the database and of dump/load procedures against it. You can only load as fast as the available I/Os per second permit. I'll bet you have an I/O bottleneck in the dump/load procedures.

Normally you have AI on separate disks from the database; that leaves you with only 1 disk per database, which is not the most optimal configuration, I suppose.

Tom knows a lot more about this than I do, so I hope he gives you good advice on what to do.

Regards,

Casper.
 

TomBascom

Curmudgeon
wessel said:
Our databases are running on an HP ProLiant server with 4 x 3.2 GHz Xeon processors, 4 GB of memory and 8 x 135 GB 10,000 rpm SCSI disks, mirrored. We are running Progress 9.1D, patch 9, on Red Hat Enterprise Linux 3 Advanced Server.


OK, that helps set some background. When you say 8 disks are mirrored, I'm assuming that means you have 4 disks "visible" from Progress' point of view?

Updating takes place when we need to drop or load fields, tables and indexes, or, performance-wise, when we zero the scatter factor. None of these are necessary right now, seeing as it's a new server.
Dropping tables is made substantially easier and faster by putting the table data into a dedicated area. Then you just "truncate area"...
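
Something along these lines (area name invented, and if memory serves the database needs to be offline for it):

  proutil proddb -C truncate area "Order History Area"

After that, dropping the now-empty tables from the schema is quick.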

Loading data into new fields is a mass update -- very similar to the load phase of a dump and load.
This process is going to be sensitive to all of the tuning that the load phase is sensitive to. The key programming "trick" is to consolidate updates into groups of records to minimize bi file activity; the default dump & load process, for instance, uses groups of 100 records (see the sketch below). Other important bits are to isolate the bi file, turn off after-imaging (turn it back on again after the update!), use a very large bi cluster size and, possibly, deactivate the relevant indexes.
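
The grouping bit, as a bare-bones 4GL sketch (table and field names are invented):

  /* set a new field in groups of 100 records per transaction,
     instead of paying transaction overhead on every single record */
  DEFINE VARIABLE i AS INTEGER NO-UNDO.

  outer-loop:
  REPEAT:
    DO TRANSACTION:
      DO i = 1 TO 100:
        FIND NEXT order-line WHERE order-line.new-field = ?
             EXCLUSIVE-LOCK NO-ERROR.
        IF NOT AVAILABLE order-line THEN LEAVE outer-loop.
        ASSIGN order-line.new-field = 0.
      END.
    END. /* the transaction commits here, once per 100 records */
  END.

The housekeeping around it is the usual proutil/rfutil stuff (sizes and paths invented):

  rfutil proddb -C aimage end               # AI off for the duration of the update
  proutil proddb -C truncate bi -bi 16384   # big bi cluster size, in KB
  ... run the update ...
  proutil proddb -C truncate bi -bi 512     # put the cluster size back to something normal
  probkup proddb /backup/proddb.bck         # fresh backup, then...
  rfutil proddb -C aimage begin             # ...AI back on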

Zeroing the scatter factor can only be done by dumping and loading.

What happened is that when the new server was installed, we only got 4 days to finish setting it up.
For the first db (70 GB), we started a bulk load on a Thursday evening and it completed on Sunday afternoon.

For the second db (50 GB), we started Friday evening and had to kill the process early Monday morning in order to restore a backup and have the old db up and running by business hours on Monday morning.
That's a very long time for a dump and load of this size on that hardware. Bulk load is probably not a good option for you. I also suspect that you did not take good advantage of the potential for parallelism in your process and your hardware.
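
Very roughly, the idea is several dump streams running at once instead of one long serial one. One way (everything here is invented: procedure names, table, paths) is a trivial .p per big table, run as background batch clients against the broker, each writing to a different disk:

  /* dumptab.p -- dump one table to the directory passed in via -param */
  OUTPUT TO VALUE(SESSION:PARAMETER + "/order-line.d").
  FOR EACH order-line NO-LOCK:
      EXPORT order-line.
  END.
  OUTPUT CLOSE.

  mbpro olddb -p dumptab.p  -param /dump1 &
  mbpro olddb -p dumptab2.p -param /dump2 &   # and so on, one session per big table
  wait

The loads can be parallelised the same way, with -i (no-integrity) on the target database while you load, and a backup before and after, since -i means no crash recovery.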

The reason for the dump and load is that the databases were converted from a 1 KB block size with 32 records per block to a 4 KB block size with 256 records per block.

Rich Banville (Rich has been the engine crew's leader for quite a while now that Gus has moved on to bigger things...) just posted a very interesting bit on PEG about rows per block. I have been a big fan of "use 256 and stop worrying" for a long time but he has raised some good points.
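
For anyone following along: the block size is fixed when the database structure is created and rows per block is set per area in the .st file, which is why changing either one means a dump and load. Something like (names, numbers, sizes and paths all invented):

  prostrct create newdb newdb.st -blocksize 4096

with .st lines along the lines of:

  d "Order Data":8,128 /db/disk1/newdb_8.d1 f 2048000
  d "Order Data":8,128 /db/disk2/newdb_8.d2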
 