I confess that I don't understand George's remarks about indexes. A multi-thread dump doesn't use any index (as far as I know) -- it just deals with a block of records within separate ranges of recid ... or have I missed something?
Binary dump does use an index. That is how it finds the records and determines their order in the dump file(s), and thus in the resulting table in the target. You use the -index n option to specify the dump index.
Source:
https://docs.progress.com/bundle/openedge-database-management/page/PROUTIL-DUMP-qualifier.html
Example: dumping customer from sports2020 via the primary index (CustNum):
$ proutil s2k20 -C dump customer dump -thread 1
OpenEdge Release 12.7 as of Fri Apr 21 08:45:41 EDT 2023
Threads number is not specified.
Maximum number of threads running is 2. (14774)
Using index CustNum (13) for dump of table customer. (17813)
Thread 140132717238592 dumped 222 records for bracket 2. (17816)
Thread 140132718250304 dumped 895 records for bracket 1. (17816)
Binary dump started 2 threads. (14776)
Dumped 1117 records. (13932)
Binary Dump complete. (6254)
This explains George's comment on how the Progress code chooses the thread count. The idxblockreport provides a textual depiction of the index B-tree. We can see that for the index Customer.CustNum, the root block in the tree contains two entries:
Code:
$ proutil s2k20 -C idxblockreport customer.custnum
OpenEdge Release 12.7 as of Fri Apr 21 08:45:41 EDT 2023
BlockSize = 8192  Block Capacity = 8100
                Number   Length   On       Length   Delete
                of       of       Delete   of       Chain    Percent
DBKEY   Level   Entries  Entries  Chain    Size     Type     Utilized
1664    1       2        21       0        0        root     0
1728    2       896      8092     0        0        leaf     99
1696    2       222      2010     0        0        leaf     24
Index Block Report completed successfully.
The entries in the root block, dbkey 1664, point to the two leaf blocks with dbkeys 1728 and 1696. So one thread dumps records pointed to by the leaf-level entries in the left side of the tree, and the other thread dumps the records for the right side of the tree. George's point is that the B-tree can have more than two paths from the root block, so a binary dump with -thread 1 might choose a different number of threads, depending on how many blocks there are at the second level, i.e. how many entries are in the root block (the first level).
Note also that in this case, the threads did not do equal work. One dumped 222 records and the other dumped 895 records.
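To make the reasoning above concrete, here is a minimal Python sketch that reads the idxblockreport rows from this post and derives the two numbers discussed: the entry count in the root block and the entry count under each second-level block. It assumes, as this thread suggests, that the dump creates one bracket per root entry. That is an inference from the output above, not documented Progress behavior, and the function names are made up for illustration.

```python
# Data rows copied from the idxblockreport output in this post.
# Only columns 1-3 (DBKEY, Level, Entries) and column 7 (Type) are used.
REPORT = """\
1664 1 2 21 0 0 root 0
1728 2 896 8092 0 0 leaf 99
1696 2 222 2010 0 0 leaf 24
"""


def parse_blocks(report):
    """Turn report rows into dicts holding the few fields this sketch needs."""
    blocks = []
    for line in report.splitlines():
        fields = line.split()
        # Skip blank lines, banners, and header rows that do not start with a dbkey.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        blocks.append({
            "dbkey": int(fields[0]),
            "level": int(fields[1]),
            "entries": int(fields[2]),
            "type": fields[6],
        })
    return blocks


def bracket_estimate(blocks):
    """ASSUMPTION from this thread: one dump bracket per entry in the root block."""
    root = next(b for b in blocks if b["type"] == "root")
    return root["entries"]


def level_two_entries(blocks):
    """Entries held by each second-level block, keyed by dbkey."""
    return {b["dbkey"]: b["entries"] for b in blocks if b["level"] == 2}


blocks = parse_blocks(REPORT)
print("estimated brackets:", bracket_estimate(blocks))
print("entries per level-2 block:", level_two_entries(blocks))
```

The entry counts are only a rough gauge of per-thread work. In the example above, the fuller leaf block reports 896 entries while the larger bracket dumped 895 records, and in a deeper tree the second-level blocks would not be leaves at all.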
Side note: leaf blocks are the blocks in the bottom level of the B-tree. They contain the pointers to the records. In the example above, the index had only two levels: the root block and the leaf level, because this is a very small table. Large tables will have more levels (typically three to five) in their B-trees. The number of B-tree levels in an index is shown in the Index Block Summary of proutil dbanalys/idxanalys:
Code:
Table           Index  Fields  Levels  Blocks  Size  % Util  Factor
PUB.Customer
  CustNum          13       1       2       3  9.9K    41.3     2.2
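As a back-of-the-envelope check on why large tables end up with three to five levels, the sketch below estimates the level count from a fixed number of entries per block. It is a simplification rather than how Progress sizes its indexes: it assumes every block, leaf or not, holds the same number of entries, and it borrows 896 (the nearly full CustNum leaf block above) as that number. The function name is hypothetical.

```python
def estimated_levels(index_entries, entries_per_block):
    """Smallest number of B-tree levels whose capacity covers index_entries.

    ASSUMPTION: every block at every level holds entries_per_block entries.
    """
    if index_entries < 1 or entries_per_block < 2:
        raise ValueError("need at least one entry and a fan-out of two or more")
    levels = 1
    capacity = entries_per_block  # what a single-block, one-level tree can hold
    while capacity < index_entries:
        capacity *= entries_per_block  # each extra level multiplies capacity
        levels += 1
    return levels


# Taken from the 99%-utilized leaf block in the report earlier in this post.
FAN_OUT = 896

for entries in (1_117, 1_000_000, 1_000_000_000):
    print(entries, "entries ->", estimated_levels(entries, FAN_OUT), "levels")
```

Under these assumptions the estimate is 2 levels for the 1,117-row customer table, 3 for a million entries, and 4 for a billion. CustNum is a small integer key, so indexes with wider keys fit fewer entries per block and would reach more levels sooner.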
For dumping without an index, you may be thinking about -index 0. This follows the table's cluster chain to find the records in their per-cluster physical order, so it is only possible for a table in a Type 2 area. But a consequence is that the dump order of the records is indeterminate, which may not be desirable.
$ proutil s2k20 -C dump customer dump -index 0
OpenEdge Release 12.7 as of Fri Apr 21 08:45:41 EDT 2023
Performing table scan for dump of table customer. (14653)
Dumped 1117 records. (13932)
Binary Dump complete. (6254)
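The difference in dump order can be pictured with a toy model in Python. This is not Progress's storage format: the (recid, CustNum) pairs are invented and the function names are hypothetical. It only shows that an index-driven dump yields key order, while a table scan yields whatever order the records happen to be stored in.

```python
# Hypothetical (recid, CustNum) pairs, listed in the order they sit on disk.
# The values are made up for illustration only.
STORED = [(101, 30), (102, 10), (103, 40), (104, 20)]


def dump_by_index(records):
    """Index-driven dump: records come out sorted by the index key."""
    return sorted(records, key=lambda rec: rec[1])


def dump_by_table_scan(records):
    """Table scan (as with -index 0): records come out in storage order."""
    return list(records)


print("by index:  ", [cust for _, cust in dump_by_index(STORED)])
print("table scan:", [cust for _, cust in dump_by_table_scan(STORED)])
```

Since a binary load writes records in the order they appear in the dump file, the choice of dump index also decides the physical order of the rows in the target table.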
A simple visual depiction of a B-tree:
Note that in this B-tree, the root block (19) has two paths: to block 7 and its child nodes, on the left; and to block 37 and its child nodes, on the right.
Again, this example shows two child nodes below the root, but the kind of B-tree used in Progress indexes (Prefix B+ tree, I think?) can have more than two children below a parent node.