Question: Type II database again

Hi everybody,

OE11.7, Windows 8, 10

We have a Type-I database that has only one table - it is used as a document archive. So there are only creates, no updates at all and very few deletes.
Every document is written in chunks of 16K, so a document of e.g. 100K fills six records completely and the remainder goes into one more record.
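For example (assuming 16K means 16,384 bytes and 100K means 102,400 bytes):

    102,400 / 16,384 = 6 full records, remainder 4,096 bytes
    -> 6 full records + 1 partial record = 7 records in total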

With 650,000 records, tabanalys shows the following statistics:

Maximum record size: 16422
Average record size: 15373
Minimum record size: 23

I wonder if it would make sense to change it to a Type II database. Apart from the fact that the indices of this table should be in their own area, I would guess: NO.

The database we are using now is a 4K database. We might benefit a bit if we changed to an 8K database, so that one record would use 2 blocks instead of 4 as it does now, but in the end I would think this is negligible. And as there is only one table, it shouldn't matter whether the database has one block per cluster or 512 blocks per cluster.
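Back-of-the-envelope (ignoring block and record header overhead, so the real numbers are slightly worse):

    15,373 bytes average record / ~4,096 bytes per 4K block ≈ 4 blocks per record
    15,373 bytes average record / ~8,192 bytes per 8K block ≈ 2 blocks per record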

Of course there is the index problem, but apart from that?

TIA, Wolf
 

TomBascom

Curmudgeon
You guess wrong.

There are zero good reasons to ever create or keep a type 1 storage area.

Even though you "only" have 1 table and are sort of getting the effect of a type 2 area, it isn't quite the same. The larger cluster size has an impact when space is allocated that you do not get with type 1 areas. There are certain utility functions, such as binary dump, that can do things (like dump with no index) on a type 2 area but not on a type 1 area. FOR EACH ... TABLE-SCAN is much faster with a type 2 area than a type 1 area.
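For instance, this kind of scan is where a type 2 area really pays off (untested sketch; "docArchive" is just a stand-in for your table name):

    FOR EACH docArchive NO-LOCK TABLE-SCAN:
        /* read every record without going through an index */
    END.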

Likewise - 8k blocks are slightly more performant than 4k blocks. There is only one "block header" per block regardless of size, so a larger block has proportionally more space available to store data, which improves overall efficiency. And you will occasionally get benefits from IO consolidation and read-ahead.

You should also take into consideration that new features and improvements to the database engine are designed to target type 2 areas. So if they ever add something that would be really, really cool for your particular table you will not be able to leverage it because of this poor decision (Progress will *never* add a really, really cool feature just for type 1 areas).

I'm sure that I have left out quite a few advantages. And even if you kid yourself that none of these apply to you, your argument is similar to the argument used by people who did not want to move from single-volume to multi-volume databases, or who do not want to move from "everything in the schema area" to using storage areas. Don't be a Luddite. Move that data (and the indexes) to type 2 storage areas.
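Getting there in the .st file only takes a blocks-per-cluster value of 8, 64 or 512 on the new areas - the area names, numbers, sizes and paths below are nothing more than placeholders:

    d "DocData":10,64;512 C:\db\docarch_10.d1 f 1024000
    d "DocData":10,64;512 C:\db\docarch_10.d2
    d "DocIndex":11,64;8 C:\db\docarch_11.d1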
 

TomBascom

Curmudgeon
There are more wrinkles related to overall application design pros and cons, but if you have settled all of that and just have to decide between storing lots of small chunks vs one big LOB, I would argue that a single big LOB makes more sense. Using a real LOB vs a pseudo-LOB positions you to make the most of current and future storage and language features with the least amount of rework.

I freely admit to never having tested it.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
It sounds like he's rolling his own LOBs, which would require extra (possibly error-prone) processing in the language for storage and retrieval.

I would let the DB engine do the work, rather than the client. Put the document in a LOB column in its own area, and the rest of the fields (used for identification & indexing) in a table area. This approach means you can also do things like look up the data by the identifying fields without having to read all the document data. Of course the indexes would get their own area as well.
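Untested sketch of what the client side could shrink to - the table and field names ("Document", docId, docData) are made up for the example:

    /* store a document */
    CREATE Document.
    ASSIGN Document.docId = "ABC123".
    COPY-LOB FROM FILE "C:\docs\ABC123.pdf" TO Document.docData.

    /* retrieve it later */
    FIND FIRST Document WHERE Document.docId = "ABC123" NO-LOCK NO-ERROR.
    IF AVAILABLE Document THEN
        COPY-LOB FROM Document.docData TO FILE "C:\temp\ABC123.pdf".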
 
I'm aware of the fact that LOBs would be better, but this code is quite old and has worked well all this time. It's from a time when there were no LOBs in a Progress database. :)
Besides: the documents are stored once and viewed only occasionally, really occasionally. None of our customers has ever complained about speed, so I see no need to change the code. And the rest of the "document fields" are in another table, even in another database.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
So the remaining question is: should you dump and load to get to Type II and 8 KB blocks?

Well, if you can afford the downtime, I would say yes. I guess this DB is around 12 GB in size, so it shouldn't take long. The whole process of binary dump/load/index rebuild can be scripted. With only one table, the scripts can even be written by hand rather than generated programmatically.
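The outline would look something like this (the database name "docarch", table name "docTable" and all paths are placeholders - try it on a copy first):

    :: 1. binary dump the one table from the old database
    proutil docarch -C dump docTable C:\dump

    :: 2. create the new 8K-block database whose .st file defines type 2 areas
    prostrct create docarch2 docarch2.st -blocksize 8192
    procopy %DLC%\empty8 docarch2

    :: 3. load the schema (.df) into the new areas (Data Dictionary or a batch load)

    :: 4. binary load and rebuild the indexes
    proutil docarch2 -C load C:\dump\docTable.bd
    proutil docarch2 -C idxbuild all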

Getting to Type II storage will make your database more reliable, easier to maintain, and less in need of maintenance. Your customers should like that; you should too. See this post for more details on the benefits of moving to Type II. It's a little dated but still relevant.

OE11.7, Windows 8, 10
Given that you have paying customers, I assume this application is in some way mission-critical to them. Are you running important databases on a client OS and, presumably, low-reliability commodity PC hardware?
 