Type II, Areas, and After-Imaging

KMoody

Member
Progress: 10.2b SP7
OS: SUSE Linux Enterprise Server 11

We're transitioning from 9.0 to 10.2B, and we're considering changing our databases from Type I to Type II storage areas, separating indexes, schema, and data into different areas, and using after-imaging. Our largest database is about 1.7 GB, and we currently back up all of our databases to one SAN.

Given that, I have a few questions:
  • Could Type II databases with areas and after-imaging benefit us, given our situation?
  • Would leaving these features out put us at risk?
  • In your experience, when did these features help you? If they solved a problem, what caused the problem?
 

TomBascom

Curmudgeon
1) Yes
2) Yes
3) After-imaging has saved numerous customer databases from oblivion. In one particular case I needed to roll forward a year's worth of logs because their backups hadn't run correctly since their SA left, a year prior to the hard drive failing... Type 2 areas are the basis of many, many performance-oriented features that I have been able to leverage to customer advantage more times than I can count. Your mileage may, of course, vary, but anything that deals with large result sets (reporting, extracts, etc.) will typically see a significant improvement. Halving the runtime of such things is not unusual when moving from type 1 to type 2.
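
For the record, a roll forward is just a restore of the last good backup followed by applying the archived AI files in order. A minimal sketch, with hypothetical database and path names (options and messages vary a bit by release):

  # Restore the most recent good backup, then apply AI files oldest-first.
  prorest mydb /backups/mydb_last_good.pbk
  for ai in /aiarchive/mydb.*.ai; do
    rfutil mydb -C roll forward -a "$ai"
  done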
 

TomBascom

Curmudgeon
Both after-imaging and type 2 areas should really be considered signs of basic competence. Not having after-imaging is malpractice. Failing to implement type 2 areas is incompetence.

IMHO
 

KMoody

Member
Tom, would you also say that placing AI files on physically separate drives from the database is a basic requirement?
 

TomBascom

Curmudgeon
The reality of today's systems is that many people cannot do that effectively. It is more important to quickly archive the filled extents to a safe location. Preferably one in a different time-zone.
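
A minimal archiving sketch, with hypothetical names (on recent releases the AI File Management daemon can do this for you; verify the rfutil output format on your own release before scripting against it):

  # Copy the oldest filled AI extent somewhere safe, then mark it empty for reuse.
  FULL=$(rfutil mydb -C aimage extent full | tail -1)
  scp "$FULL" vault:/aiarchive/mydb/
  rfutil mydb -C aimage extent empty "$FULL"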
 

Rob Fitzpatrick

ProgressTalk.com Sponsor

  • Could Type II databases with areas and after-imaging benefit us, given our situation?

Aside from the clear performance benefits of Type II areas versus Type I, they shouldn't be viewed solely as a performance enhancement. They also provide benefits in reliability, availability, flexibility, and access to new features.

Performance
  • Blocks and clusters are "asocial": each contains data from only one storage object, whereas in a Type I area records from different tables may be interleaved within a block. This segregation improves caching efficiency, since reading a given table's records requires less physical I/O than in a multi-table Type I area.
  • Type II areas are extended a cluster at a time, rather than a block at a time as in Type I. This minimizes the number of extend operations and makes data more contiguous on disk, improving sequential I/O.
  • A full table scan of a table in a Type II area can be performed via ABL without an index, using the TABLE-SCAN record phrase (see the sketch after this list). This is the fastest way to read an entire table, although the records are returned in no particular order.
  • In a Type I area, a storage object's data can reside in any block in the area. In Type II, a storage object is a chain of clusters. This allows object-based utilities (e.g. tablemove, idxmove, etc.) to be more efficient.
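A minimal TABLE-SCAN sketch; "Customer" is an illustrative table name, and the table must be in a Type II area:

  /* Count every Customer record without touching an index. */
  DEFINE VARIABLE recCount AS INT64 NO-UNDO.

  FOR EACH Customer TABLE-SCAN NO-LOCK:
      recCount = recCount + 1.
  END.

  DISPLAY recCount.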
Reliability
  • In Type II areas every block contains a checksum which is verified every time the block is read, meaning corruption can be detected and reported in the DB log.
Availability
  • The second phase of index rebuild (the data scan/key build phase) for tables in Type II areas can be multi-threaded with the -datascanthreads n parameter (subject to certain conditions; a sketch follows this list). Depending on factors such as available cores, available RAM, and area size, multi-threading can significantly reduce the overall index rebuild time, which means reduced downtime.
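For example, a hypothetical offline rebuild; the database name, thread count, and tuning values are illustrative, so check the documented conditions for your release:

  # Multi-threaded index rebuild; the database must be offline.
  proutil mydb -C idxbuild all -B 1024 -datascanthreads 4 -T /fast/tmp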
Flexibility
  • When using Transparent Data Encryption with Type II areas, data can be encrypted at the level of individual storage objects (tables, indexes, LOB columns). In Type I areas, you can only encrypt an entire area.
  • When using Alternate Buffer Pool with Type II areas, you can assign areas or individual storage objects to the ABP. With Type I you can only assign an entire area to ABP.
  • In Type II areas there is one RM chain (a linked list of RM blocks with available space for record storage) per table, and thus individually tunable create and toss limits per table as well (a sketch follows this list). This allows DBAs to better tune the storage to the characteristics of the data, fill blocks more efficiently, and limit record fragmentation. In Type I areas there is a single RM chain, and a single set of create and toss limits, per area.
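For instance, per-table limits can be set with proutil; the table name and values here are purely illustrative (the limits are expressed in bytes of free space per block):

  proutil mydb -C settablecreatelimit orderhist 300
  proutil mydb -C settabletosslimit orderhist 150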
Access to new features
  • When using OpenEdge Auditing, the audit data and index areas must be Type II.
  • When using Transparent Data Encryption, the encryption policy area must be Type II.
  • When using Multi-tenant Tables, multi-tenant tables (table partitions) must be in Type II areas.
  • When using Horizontal Table Partitioning (in OE 11.4), table partitions will have to be in Type II areas.
  • If your data is in Type II areas now and you later decide to use these or other new database features that require Type II areas, you can enable them immediately. If your data is still in Type I areas, you won't have access to these features until you dump and load.
In short, there is no scenario where Type I areas are more beneficial than Type II. As the schema area must be Type I, ensure that you have no application data there.
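
If it helps when you build the new structure: the blocks-per-cluster value after the semicolon in a .st file is what makes an area Type II (8, 64, or 512; a value of 1, or none, means Type I). A minimal, hypothetical fragment:

  b /db/mydb
  d "Schema Area":6,64;1 /db/mydb
  d "Data":7,128;512 /db/mydb/data
  d "Indexes":8,8;8 /db/mydb/idx
  a /db/mydb/ai f 524288
  a /db/mydb/ai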
 

KMoody

Member
Thank you both for your advice. Based on other forum posts, I knew these changes were important, but I didn't know all the reasons why. You've been an invaluable help!

Since these changes are necessary, what issues should I anticipate during the conversion? I plan on coming in over a weekend to make the switch, and I want everything to go smoothly the following Monday. I have scripts for live backups, cold backups, dumping, loading, and starting and stopping the databases just about finished, and I know I still need to figure out AI archiving. What issues have you faced during a Type I to Type II conversion?
 

tamhas

ProgressTalk.com Sponsor
There is no substitute for doing a dry run on a copy of the database rather than on production. There are lots of discoveries to be made about how long various operations take, and the forums can offer feedback on ways to make them faster.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
What issues have you faced during a Type I to Type II conversion?

As Thomas said, be sure to test. That's the key to success without impact to the business. Document your procedures, script your command lines, and capture stdout and stderr to log files you can analyze later; check those logs for errors, for success messages, and for the absence of success messages (you can script that too). Be sure to take milestone backups at important points: after creating your empty target DB (which you can create and pre-grow in advance), after your loads, and after your index rebuild(s).
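
A minimal wrapper sketch of what I mean, with hypothetical names and paths; the exact success-message text varies by release, so verify it against your own logs before relying on the grep:

  #!/bin/sh
  # Milestone backup with captured output and a scripted success check.
  LOG=/logs/probkup_$(date +%Y%m%d_%H%M%S).log
  probkup online mydb /backups/mydb.pbk > "$LOG" 2>&1
  grep -q "Backup complete" "$LOG" || { echo "Backup FAILED - see $LOG" >&2; exit 1; }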

Take a tabanalys or dbanalys report from both the source and target to verify that all records were loaded successfully. In addition to table data, make sure you dump and load _User records, sequence values, and SQL permissions if you use any of those.
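
For example (database names are illustrative; the report layout differs slightly between versions, so compare the per-table record counts rather than diffing the files wholesale):

  proutil srcdb -C tabanalys > src_tabanalys.txt
  proutil tgtdb -C tabanalys > tgt_tabanalys.txt
  # Compare the record count per table between the two reports.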
 

KMoody

Member
In addition to table data, make sure you dump and load _User records, sequence values, and SQL permissions if you use any of those.

Good point! I'm glad you brought that up.

I use Progress utilities (prodict/dump_df.p, prodict/dump_fd.p, prodict/dump_d.p, prodict/load_df.p, and BULKLOAD) to dump and load my databases. It seems these utilities don't dump or load hidden tables or sequence values, at least by default.

There's a good article on loading and dumping sequences on knowledgebase.progress.com. However, is there a way to use the dump and load utilities and include hidden tables? For example, when I run prodict/dump_fd.p ("ALL", [dumpPath]), I want _User to be included in the .fd file.
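
For reference, this is roughly how I call them today; the parameter lists are my reading of the Data Administration source, so treat the signatures as assumptions and verify against $DLC/src/prodict:

  /* Dump schema, data, and bulkload descriptions; paths are illustrative. */
  RUN prodict/dump_df.p ("ALL", "/dump/mydb.df", "").  /* .df (schema)     */
  RUN prodict/dump_d.p  ("ALL", "/dump/", "").         /* .d  (table data) */
  RUN prodict/dump_fd.p ("ALL", "/dump/").             /* .fd (bulkload)   */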
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
However, is there a way to use the dump and load utilities and include hidden tables? For example, when I run prodict/dump_fd("ALL", [dumpPath]), I want _User to be included in the .fd file.

Looking at the prodict source, I see a program called _dmpuser.p in $DLC/src/prodict/dump. It looks like that's what the Data Dictionary uses when you use the menus.
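
If _dmpuser.p proves awkward to call directly (the prodict programs lean on shared dictionary context), a hand-rolled export/import is a common fallback. A minimal sketch, assuming you run it as a security administrator and adjust the paths to taste:

  /* Dump _User from the source database. */
  OUTPUT TO /dump/_user.d.
  FOR EACH _User NO-LOCK:
      EXPORT _User.
  END.
  OUTPUT CLOSE.

  /* Load into the target; the empty record created at end-of-file
     is undone when IMPORT raises ENDKEY. */
  INPUT FROM /dump/_user.d.
  REPEAT:
      CREATE _User.
      IMPORT _User.
  END.
  INPUT CLOSE.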

If your DB gets much larger (and/or if you care about minimizing downtime), particularly in one or two tables, you may want to look at binary dumps instead of ASCII dump/bulkload.
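
A binary dump/load sketch for a single large table; "orderline", the database names, and the paths are illustrative:

  # Dump from the source, load into the target, then rebuild that table's indexes.
  proutil srcdb -C dump orderline /dumpdir
  proutil tgtdb -C load /dumpdir/orderline.bd
  proutil tgtdb -C idxbuild table orderline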
 

KMoody

Member
Thanks, Rob.

Since I also assign Security Administrators and table/field level permissions, I'll need to dump and load the _File and _Field tables as well. I see two programs called $DLC/src/prodict/dump/_lod_fil.p and $DLC/src/prodict/dump/_lod_fld.p, but I don't see any corresponding dump programs.

Are there fields in _User, _File, or _Field that I should not dump or load? Would it be safe to dump and load all hidden tables and all their fields? I feel like I'm getting into unknown unknown territory here. :confused:
 

TomBascom

Curmudgeon
_file & _field are taken care of by the .df file. File & field level permissions (the CAN-* fields) are part of that.
 