Migration woes - dump & load from AI files - say YES

pinne65

Member
We are planning to migrate an OE 10.2B database from AIX to RH Linux, but we have very little time at our disposal. All the tables are still in the schema area.

I'm looking for ideas on how to speed things up and split it up if possible.

Does it make sense to grab the data files from the source for each table as soon as they are completely dumped and load them onto the new target database while other tables are still being dumped? Pros / cons?

I know I'm dreaming - but just in case I'm not, I'll put it out there :) Would it somehow be possible to dump & load AI files? That way we could dump just after a backup, do the load to the new system while the old system goes back into production, and then "just" dump & load the AIs.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
OK, so you're moving from AIX to Red Hat. That means you're going to dump and load. And you should, even if your server migration didn't necessitate it, as all of your tables are in the schema area and therefore in Type I storage.

So you should give some thought to your storage area design for your new database. If you need guidance, refer to Tom Bascom's Storage Optimization Strategies presentation.

Next you need to decide how to dump and load, and you have several options. There is no single way that is best. What is best for you depends on your system(s), your database size, your storage hardware, and your business constraints, i.e. window of permissible downtime.

Read the Dumping and Loading section of the Database Administration manual for an overview of available techniques. At a high level, tables can be dumped in two ways: an ASCII dump into a .d file via the Data Dictionary tool, or a binary dump into a .bd file via the proutil dump command. The Data Dictionary can dump one table or many at once, which makes it fairly easy to use, but it is the slowest way to dump. The binary dump is much faster but usually needs to be scripted, as it operates on only one table per command. In 10.2B it can run multi-threaded and, if desired, can dump with a specified (non-primary) index.
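
For illustration, a binary dump of a single table looks something like this (database, table, and directory names are placeholders; verify the threading syntax against your 10.2B documentation):

    # single-threaded binary dump of one table into /dumpdir
    proutil proddb -C dump orderhist /dumpdir

    # 10.2B multi-threaded dump of the same table
    proutil proddb -C dump orderhist /dumpdir -thread 1 -threadnum 4

    # dump following a specific (non-primary) index, by index number
    proutil proddb -C dump orderhist /dumpdir -index 15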

It is typical that a database will contain a handful of very large tables that you want to binary dump, and a lot of small or empty tables that you can do quickly and easily via an ASCII dump. Run a proutil dbanalys report on the source database so you can see table record counts and sizes. Analyze the data to determine your dump and load plan, but don't just rely on this data. Plan to run some off-hours dumps against a copy of production to get a sense for how long this process takes on your hardware with the different methods. You also need to determine how you will get the dumped data over to the new server. If they are on the same LAN you can share an NFS mount between them. If they both have access to a SAN you could use that. If they are physically separate you may be looking at FTP or rsync, or maybe even driving tapes from A to B. However you do it, remember to benchmark this transfer and include it in your total downtime calculation.
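
For example (database, server, and path names are placeholders):

    # table analysis: record counts and mean record sizes for the D&L plan
    proutil proddb -C dbanalys > proddb_dbanalys.txt

    # benchmark the copy of a dump directory to the new Linux box, e.g. with rsync
    time rsync -av /dumpdir/ newserver:/dumpdir/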

Remember that you aren't just dumping table data. You need to dump your database schema to a .df file. If you have defined database sequences, you have to dump the current sequence values. If you have defined users in the _User table, then you need to dump users as well. These tasks are done via the Data Dictionary. If you have defined SQL permissions or views, these also need to be recreated on the target.
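
If you would rather script the schema dump than drive the Data Dictionary menus by hand, one possible approach is to run the dictionary's dump program in a batch client session. This is only a sketch: it assumes the standard prodict/dump_df.p program that ships with OpenEdge, and you should verify its parameters for your version.

    # write a one-line .p that dumps the full schema to a .df
    # (dump_df.p arguments: table list, output file, codepage)
    echo 'RUN prodict/dump_df.p ("ALL", "/dumpdir/proddb.df", ""). QUIT.' > /tmp/dumpdf.p

    # run it in a batch character client against the database
    mpro proddb -b -p /tmp/dumpdf.p > dumpdf.log 2>&1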

Create the target database via prostrct create with your new structure file and your desired database block size. Then load your schema, users, sequence values, etc. Then you will load your table data. If you dumped a table via proutil dump, then you will load the .bd file on the target with proutil load. Again, you probably want to script this for efficiency and accuracy, and to capture the output in case there are any errors that might otherwise scroll off the screen. If you dumped ASCII .d files then you can load via Data Dictionary or via proutil bulkload (which requires a bulk load descriptor file from your source database).
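
A sketch of the target-side commands (structure file, block size, and file names are examples):

    # create the target DB from your new .st file with an 8 KB block size,
    # then copy in the matching empty DB to supply the metaschema
    prostrct create newdb newdb.st -blocksize 8192
    procopy $DLC/empty8 newdb

    # load a binary dump file; capture the output so errors don't scroll away
    proutil newdb -C load /dumpdir/orderhist.bd >> load.log 2>&1

    # or bulk-load ASCII .d files using the descriptor (.fd) file built on the source
    proutil newdb -C bulkload /dumpdir/proddb.fd >> bulkload.log 2>&1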

However you load, once you are done, do a full backup. If you binary dumped and loaded any tables, they require a separate index rebuild step with the database offline. This is where the backup is very important, because if the rebuild fails or crashes for any reason then restoring the backup may be your only option aside from repeating the load.
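
In command form (names are placeholders; idxbuild tuning options are covered further down):

    # full backup first - if the rebuild fails, this is your fallback
    probkup newdb /backups/newdb_postload.pbk

    # offline index rebuild for the binary-loaded tables (all indexes in this example)
    proutil newdb -C idxbuild all > idxbuild.log 2>&1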

Again, what is right for you is what allows you to do the job in the allotted time. If you have a small DB that dumps and loads in 15 minutes via Data Dictionary and you have a 3-hour downtime window from the business, you don't need to get fancy and create binary dump and load scripts. Just do the simple and quick ASCII D&L and you're done. However if you have some combination of large database, slow hardware, and small downtime window, you may need more testing and optimization of your techniques. Testing, documentation, and logging are really really key; don't skip them. You may be doing this at 3 AM when you're dopey, or you may get sick and rely on a backup person to get the job done. Document it so well that even your manager can do it. :)

Here are a few general tips. For large (or slow) tables, try multi-threaded binary dumps. Try using a couple of threads per CPU. Also try binary dumps with -RO (read-only connection). If a large table dumps very slowly and it has a large primary index, try dumping it with a smaller index. It probably isn't necessary to binary dump everything. Do all the small stuff in the Data Dictionary, which runs in a single thread. Run the dumps with the database online (but ensure that clients can't connect to it!), with a large buffer pool, having first run a dbanalys to load the data from disk into the cache. Another part of your pre-dump analysis should be determining the current state of your indexes. You may benefit from a pre-D&L index compact or index rebuild. It could make your dumps go faster, and it could also prevent a situation where you discover index corruption during your D&L, which is never ideal.
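
As a rough sketch of running several binary dumps in parallel - a hypothetical driver script, where bigtables.txt is a list of table names and the job count is something you would tune for your box:

    #!/bin/bash
    # dump the tables listed in bigtables.txt, a few at a time, via read-only connections
    MAXJOBS=4
    while read -r TABLE; do
        proutil proddb -C dump "$TABLE" /dumpdir -RO > "dump_$TABLE.log" 2>&1 &
        while [ "$(jobs -r | wc -l)" -ge "$MAXJOBS" ]; do sleep 5; done
    done < bigtables.txt
    wait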

For your loads, optimize your BI file; this makes a huge difference. Use a very large BI cluster size and a 16 KB BI block size. Start a broker, a BIW, and several APWs; "several" is the number at which you no longer have buffers flushed at checkpoint during the loads. Start the broker with -i (no-integrity mode), which reduces BI file writes. Don't have AI enabled during the load. Don't use the default -spin, it will be too high; I usually go with about 5000. Since you're using -i, you have no crash recovery. After the load, shut down immediately and back up the database.

If you are loading into a fully variable structure, you will incur the expense of a lot of extend operations as you grow your extents. You can avoid this by preparing a pre-grown target: do a trial dump and load of your data into the all-variable database to grow its extents, then copy the appropriate empty DB over it (e.g. empty8 for an 8 KB block size). That leaves you with a large, empty, pre-grown all-variable DB into which you can load your schema and data without any extend operations.
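
For example (parameter values are illustrative, not recommendations for your hardware):

    # large BI cluster size (16 MB) and 16 KB BI block size before the load
    proutil newdb -C truncate bi -biblocksize 16 -bi 16384

    # start the load broker with no integrity, plus a BIW and a couple of APWs
    proserve newdb -i -spin 5000 -B 100000
    probiw newdb
    proapw newdb
    proapw newdb

    # pre-grown trick: after a trial load has grown the variable extents,
    # overwrite the contents with the matching empty DB (8 KB block size here)
    procopy $DLC/empty8 newdb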

For index rebuilds, ensure you are using service pack 06 (or later) of 10.2B. Index rebuild performance is substantially improved in SP06. Be sure to read the release notes, as there are a few new or changed startup parameters you can use to tune index rebuild performance by taking advantage of available memory. RAM is your friend: more available RAM means less disk I/O, which means faster rebuilds. You may want to rebuild one area at a time, so that very large tables (which are likely to be in their own areas) have more memory available for their rebuild. Specify -T or use a .srt file to ensure that your temp files are not written to the database disk(s). If possible, allocate space in a tmpfs file system so you can write temp files in memory instead of on disk. Once your index rebuild is done and you are ready to restart in production mode, remember to reset the BI file to an appropriate cluster size, back up the database again, and re-enable AI.
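
For example (paths are placeholders; check the SP06 release notes for the newer idxbuild tuning parameters before relying on them):

    # rebuild indexes with temp/sort files kept off the database disks,
    # here on a tmpfs mount - make sure it is big enough for the sort
    proutil newdb -C idxbuild all -T /mnt/ramtmp -B 1000 > idxbuild_all.log 2>&1

    # when the rebuilds are done: reset the BI cluster size, back up, re-enable AI
    # (assumes AI extents are already defined in the new structure)
    proutil newdb -C truncate bi -bi 8192
    probkup newdb /backups/newdb_golive.pbk
    rfutil newdb -C aimage begin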

If your source server is a bottleneck because it's old, as may often be the case, you might have to get creative. If you can beg, borrow, or steal a similar server (same platform) that is faster, then you could shut down production, back it up, and restore to the fast box and then dump from there. I've seen cases where that reduces the total downtime even though it's more work. If this option is available to you, you could restore a recent backup there and roll forward AI files to it up until you shut down production. This would save you the time of a full backup and restore in your downtime window.
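
On the borrowed box that might look something like this (names are placeholders):

    # restore the most recent production backup to the fast interim server
    prorest fastdb /backups/proddb_full.pbk

    # apply after-image files as they are produced, up until production shutdown
    rfutil fastdb -C roll forward -a /aifiles/proddb.a1
    rfutil fastdb -C roll forward -a /aifiles/proddb.a2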

There are various techniques and "tricks" you can use to squeeze out more performance, but this requires detailed knowledge of the platforms on which the work will be done. The performance difference between very generic advice like mine (or the things you read in the manuals) and highly optimized settings tuned for your specific environment can be quite dramatic. It may be well worth your while to hire a consultant to guide you through this process. Some of them even stop by this forum now and again...
 

TomBascom

Curmudgeon
I have used the "borrow a fast server and keep it up to date with AI logs" approach with great success :) We got a D&L that was initially estimated to take 3+ weeks down to 6 hours, and that was one important part of the puzzle.

Another big piece of the puzzle was to overlap the dumping and loading (as suggested in the original post) and to run many threads in parallel.
 

pinne65

Member
Thanks for all the good info and advice! It will help for sure. Borrowing / renting a faster box sounds really appealing, since we'll be able to do a dry run.
 

TomBascom

Curmudgeon
Actually most of what made the borrowed box so much better was that it came with a decent disk subsystem...

Every situation is different. YMMV. Past performance is not an indicator of future performance. Etc. ;)
 