Storage Area Limitations

bjag

New Member
Hi, folks - this is my first post, as I just joined today. Being relatively new to Progress, I have a question that may be considered rather basic. I have a rather large database (Progress V10.1B, 287 GB), most of which is in "Storage Area 6", the Schema area, at 225 GB. However, a lot of that appears to be empty blocks, as the total of all objects in Area 6 is only 134 GB. Does anyone know how close I am to reaching any type of storage area limitation? I read in the V9 documentation that the limit was 256 GB, but the V10 documentation says 8,796,096 million records. (I am using 8K blocks, 64 records/block.) Since I only have about 2.7 M records in that storage area, does that mean that size doesn't matter (no pun intended), just the number of records? Did Progress totally change the way it looks at storage area limitations between 9 and 10? Thanks!
 
Hi there,

Did Progress totally change the way it looked at storage area limitations between 9 and 10? Thanks!

Yes, the total number of extents per area has increased, and the same applies to the total number of areas.

In V9 the total number of extents per area was 255.
In 10.1B it is 1024.

The maximum number of storage areas increased from 1000 to 32000.

With the use of 64-bit dbkeys and, internally, 64-bit rowids, the maximum number of records has increased hugely.

The maximum size of an area is approximately 1 PB, calculated as follows (provided large files are enabled):

Maximum extent size: 1 TB
Maximum extents per area: 1024
Maximum area size = 1024 * 1 TB = 1 PB

With this you can calculate the total number of records per area, which for 64 rpb is approximately 8,796,096,000,000.
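
To make the arithmetic explicit, here is a quick back-of-the-envelope sketch (Python, purely for illustration) of where those figures come from. It assumes large files are enabled, an 8 KB block size and 64 records per block, and it lands within rounding distance of the 8,796,096,000,000 quoted above.
Code:
# Rough check of the 10.1B area limits, assuming large files are enabled,
# an 8 KB database block size and 64 records per block.
TB = 1024 ** 4
max_extent_size = 1 * TB                          # maximum size of a single extent
max_extents     = 1024                            # maximum extents per area
max_area_size   = max_extents * max_extent_size   # 1024 * 1 TB = 1 PB

block_size        = 8 * 1024                      # 8 KB database blocks
records_per_block = 64
max_records = (max_area_size // block_size) * records_per_block

print(max_area_size)   # 1125899906842624 bytes, i.e. 1 PB
print(max_records)     # 8796093022208 - roughly the 8,796,096,000,000 figure above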

Some questions though:
Why do you have all the data in the schema area?
Do you at least have the indexes separated from the data?
If you run a dbanalys, how much data is really in your database?
Have you considered doing a dump/load and using Type II storage areas?
Do you use 64-bit Progress?

Greetz,

Casper.
 
Hello, Casper - I've responded to your questions below. :)


Hi there,



Yes, the total number of extents per area has increased, and the same applies to the total number of areas.

In V9 the total number of extents per area was 255.
In 10.1B it is 1024.

The maximum number of storage areas increased from 1000 to 32000.

With the use of 64-bit dbkeys and, internally, 64-bit rowids, the maximum number of records has increased hugely.

The maximum size of an area is approximately 1 PB, calculated as follows (provided large files are enabled):

Maximum extent size: 1 TB
Maximum extents per area: 1024
Maximum area size = 1024 * 1 TB = 1 PB

With this you can calculate the total number of records per area, which for 64 rpb is approximately 8,796,096,000,000. THANKS for the info - that really helps!

Some questions though:
Why do you have all the data in the schema area? *** I dunno - probably it was the default at the time, and it grew too big to deal with by the time anyone noticed (just my theory, as I came into this project cold, with no one to really ask about any historical info).
Do you at least have the indexes separated from the data? *** Mostly, but I noticed that there are a few in Area 6.
If you run a dbanalys, how much data is really in your database? *** The figures I gave you were from the dbanalys. In the totals for Area 6, here's what I got: 80676988 records, 137.0 GB (record count and total size of Area 6). But in the block analysis of Schema Area 6, I got "29464748 block(s) found in the area", which works out to about 225 GB when you multiply it by 8192 bytes/block.
Have you considered doing a dump/load and using Type II storage areas? *** We're already using Type II areas. Much of "Area 6" is consumed by one humongous table - over 100 GB. I would like to put it in its own storage area, but it appears that just doing a "tablemove" (or a dump/move structure/load) will take too long, take too much disk space, and may not decrease the number of blocks in Area 6. I did try this on a smaller database (i.e., moving a table to another storage area) and found that BOTH areas increased in size, which begs the question: is the only way to decrease the number of blocks consumed by an area to dump all the objects within it, delete the storage area completely, recreate it, and then reload???
Do you use 64-bit Progress?

Again, THANKS so much for the info!
Brenda

 
For the table move, consider defining a new table in the new area, copying from one table to the other, deleting the source table and renaming the new table. You can use multiple processes simultaneously to speed the copy. See greenfieldtech.com for a discussion of this kind of approach to replication.
 
Thanks, "Tamhas".

That's an intriguing idea... I've never copied a table to another area; I've just used the "tablemove" command, which takes a long time. Do you know if this would be quicker than doing a dump / "delete table" / "recreate in new area" / reload?? When you're trying to move a 100+ GB table, time is of the essence!

Brenda


 
The reason I mentioned it was a recent thread on PEG where someone was talking about moving a very large table. I believe that it was suggested that this could be the fastest way, particularly since the actual copy could be done largely without taking the database down, especially if it is some kind of history table where it is easy to identify the most recent activity and copy that at the last minute before doing the delete and rename.

Talk to Tom Bascom.
 
Moving big tables via either table move or dump and load has some serious downsides. Tablemove eliminates the IO associated with the scratch file but generates enormous BI activity. And you only get one thread. Dump and load creates disk IO to and from the disks that the .d (or .bd) files go to and, unless you customize it, only uses one thread.

Neither approach uses modern hardware very effectively.

I'd probably create the new area and then use a variation of Highly Parallel Dump & Load to get the data moved.
 
Casper...

I see that I didn't answer one of your questions: do we use 64-bit Progress? The answer is... I thought we were using 32-bit, but I can't seem to find where it says that. I can't find it on the license sheet or in any of the promon displays. If you can tell me how to find out, I'll let you know what we use.

Thanks,
Brenda


 
Hi, Casper

I just have a couple of things to add to what I mentioned earlier. I had thought that we were using Type II storage areas because they were storage areas with multiple extents, but after reading more on it, I realize that it's the cluster size you have to look at. And all of my cluster sizes are "1", so I guess that means I am not using Type II storage. Does that mean the limits you gave me are not as high as a result? Also, I have verified that, even though we do have a 64-bit machine (Solaris), we are not using 64-bit Progress... does that make a difference in those maximum storage area size limits as well?

Thanks again for your help!
Brenda :o
 
Hi Brenda,

I already suspected that you would have Type I storage areas, because almost all the tables are in the schema area.

The limits still stay the same. You can have 1024 extents in an area. If you have large files enabled, then each extent can grow to 1 TB. If you don't have large files enabled, then each extent can reach 2 GB, which still leads to an impressive 2 TB of data. So I think you don't have to worry about reaching area limits for now.
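
The 2 TB figure follows from the same sort of arithmetic; a quick illustrative check (Python, same assumptions as the sketch earlier in the thread):
Code:
# Without large files, extents are capped at 2 GB each.
GB = 1024 ** 3
max_area_size = 1024 * 2 * GB                        # 1024 extents * 2 GB = 2 TB
max_records   = (max_area_size // (8 * 1024)) * 64   # 8 KB blocks, 64 records/block
print(max_area_size // GB)   # 2048 GB, i.e. 2 TB
print(max_records)           # 17179869184 - still far beyond the ~80 million records in Area 6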

64-bit Progress can address more shared memory than 32-bit. With 32-bit Progress the upper limit for shared memory is approx 2 GB; on most systems it is less. Depending on the nature of the data you have, I can imagine that 2 GB of shared memory can be somewhat small. Do you have any figures on the buffer hit ratio of your database?
In most articles I have read, the rule of thumb for shared memory is 10% of the database size. But it all depends on how much 'active' data you have. If most of the data isn't accessed frequently and -Bp is being used to access the 'non-active' data, then a smaller shared memory can be sufficient to reach a reasonably high buffer hit ratio (>95%).

The best way to monitor the buffer hit ratio is to sample the database at busy times for half an hour with promon or protop and see what the buffer hit ratio is. If it is below 95%, then some shared memory tuning should improve performance, provided you can allocate more shared memory and the -pinshm parameter is used.
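
As a purely illustrative bit of arithmetic (the 10% figure is only the rule of thumb mentioned above, and real -B sizing has to account for everything else that lives in shared memory), here is roughly what that rule implies for a 287 GB database with 8 KB blocks, compared with what fits under the ~2 GB ceiling of 32-bit executables:
Code:
# Rough -B sizing arithmetic for a 287 GB database with 8 KB blocks.
GB = 1024 ** 3
db_size    = 287 * GB
block_size = 8 * 1024

target_cache   = db_size * 0.10                  # 10%-of-database rule of thumb
target_buffers = int(target_cache / block_size)  # desired -B, in database blocks

# What fits under a ~2 GB shared-memory limit with 32-bit Progress
# (ignoring the other shared-memory structures that also need room):
max_32bit_buffers = (2 * GB) // block_size

print(target_buffers)     # ~3,760,000 buffers wanted by the 10% rule
print(max_32bit_buffers)  # 262,144 buffers at most under a 2 GB cap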

Having said this, I still think it's worthwhile to look at the database for tuning reasons.
Run dbanalys over a weekend and find out the following key features of the database:
Code:
proutil /path/to/database/databasename -C dbanalys > dbanalys.txt
There are 3 things that are important for tuning the database:
  • scatter factor for big tables (last column of the record block summary)
  • fragmentation factor (# fragments / # records)
  • utilization % for indexes
The scatter factor is the mean distance between records. The optimal value varies per database. The general rule I use is that if the scatter factor of big tables (> 1 GB or more than 1,000,000 records) is higher than 4, then a reload improves performance significantly.

Fragmentation occurs when records grow after they have been created. A record can grow so big that it no longer fits in the same database block; space from another database block is then used to hold the rest of the data. This means that when you retrieve such a record, 2 database blocks have to be read in order to get 1 record, which is of course overhead. The rule of thumb is that if the fragmentation factor is higher than 1.5, a reload of the database improves performance significantly.

If many of the indexes have a utilization percentage below 60%, then an index rebuild improves performance.
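
To show how those rules of thumb combine, here is a small illustrative sketch; the figures are made up, and you would substitute the numbers from your own dbanalys.txt:
Code:
# Hypothetical dbanalys figures for one big table - replace with your own numbers.
records    = 10_000_000     # record count from the record block summary
fragments  = 16_500_000     # fragment count for the same table
scatter    = 4.8            # scatter factor (last column of the record block summary)
index_util = 55.0           # utilization % of one of the table's indexes

fragmentation_factor = fragments / records   # 1.65 in this made-up example

if scatter > 4:
    print("scatter factor > 4: a dump & load should improve performance")
if fragmentation_factor > 1.5:
    print("fragmentation factor > 1.5: a dump & load should improve performance")
if index_util < 60:
    print("index utilization below 60%: an index rebuild should improve performance")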

If it turns out that your database is running nicely and smoothly, you can postpone dumping and loading for a while.
On the other hand, if all the data points to the database performing poorly, then it is time to start making plans for a dump and load.
If you have sufficient hardware, then a dump/load should be feasible in a weekend.

But I'm sure Tom can tell you much more about this in detail. If there is no experience with this type of project at your site, then it will be extremely useful to hire someone like Tom for consultancy on this.
Major performance improvements, and therefore major production improvements, can be made if a reload is done properly. The cost of the consultancy is much less than the benefits you gain after a reload of the database.

Hope this clarifies things somewhat.

Regards,

Casper
 