Fastest Way to Get Progress Database Table Into Memory

BCM

Member
My Progress database administrator tells me that Progress loads a table into memory when that table is requested. Any external request for data from that table (example: SQL Select of all records via ODBC) executes significantly faster when the table is already loaded in Progress memory. Using 4GL, how can I force Progress to load the table into its memory?
 

bulklodd

Member
that Progress loads a table into memory when that table is requested
That's quite a claim :) but what will Progress do when the table size exceeds your memory size?! According to your DBA, Progress would stop working, wouldn't it? No, it's not quite right. Progress uses an ordinary cache (the buffer pool) to load data blocks into, not whole tables. There's no difference between SQL and 4GL clients in how they use that cache. All you can do is provide as large a cache as possible and both your clients will be fine. Use the -B server parameter to specify the cache size.
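To see whether your -B is big enough you can check the buffer hit rate from the virtual system tables. This is only a rough sketch, assuming the VSTs are enabled and the usual _ActBuffer field names:

Code:
/* rough sketch: buffer hit percentage from the VSTs (field names assumed) */
FIND FIRST _ActBuffer NO-LOCK.
IF _Buffer-LogicRds > 0 THEN
    DISPLAY _Buffer-LogicRds _Buffer-OSRds
            (1 - (_Buffer-OSRds / _Buffer-LogicRds)) * 100
                LABEL "Buffer hit %" FORMAT ">>>9.99".

The closer that number is to 100, the more of your reads are coming from the cache instead of disk.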
 

BCM

Member
I am not sure I understand what you are saying.

Here is how I know that the execution is significantly faster:
(1) I executed a Select on all records of a medium-sized table (500,000 records) using ODBC and SQL-92. This executed in approximately 17 minutes as usual.
(2) Immediately upon completion of step (1), above, I re-executed the same query via ODBC and SQL-92. The execution time decreased to 2 minutes.
 

bulklodd

Member
I said it's okay because Progress uses a memory cache, or buffer pool. The first query is always slower than the second one. While your first query is running, Progress is reading data blocks from disk into the cache; when the next query runs, most of the blocks are already in the cache and there's almost no reading from disk. If you run a 4GL query it will work in the same way.

You can read it here
http://www.peg.com/techpapers/monographs/tuning/tuning.html

(see chapter Tune the database buffer pool)
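You can see the effect for yourself from the 4GL side by timing the same pass twice in one session. A rough sketch only ("ir" stands in for whatever table you're testing):

Code:
DEFINE VARIABLE i AS INTEGER NO-UNDO.
DEFINE VARIABLE t AS INTEGER NO-UNDO.

/* first pass: cold cache, most blocks have to be read from disk */
t = ETIME(TRUE).   /* reset the millisecond timer */
FOR EACH ir NO-LOCK:
    i = i + 1.
END.
t = ETIME.
MESSAGE "first pass:" t "ms for" i "records".

/* second pass: most blocks should already be in the -B buffer pool */
i = 0.
t = ETIME(TRUE).
FOR EACH ir NO-LOCK:
    i = i + 1.
END.
t = ETIME.
MESSAGE "second pass:" t "ms for" i "records".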
 

TomBascom

Curmudgeon
If your database is small and your buffer cache is larger than the db, you can load the whole db into RAM by doing a "proutil dbname -C tabanalys" ;-)

The Progress 4gl & database does not load tables into memory on reference. It loads blocks. When a record is referenced the data block which contains it is loaded into the buffer cache. A data block may contain many records (depending on the b lock size, the rows per block setting and the distribution of data within your database). If nothing displaces that block from the cache it will be there the next time you need it and it won't have to be fetched from disk. This is why queries are usually faster the second time around.

If you have Type 1 storage areas (all v9 and lower databases are only type 1) then data blocks can contain records from multiple tables. You can work around this in v9 by setting up a storage area with just one table in it -- then those blocks will have records from just one table.

Starting with OE10 there are type 2 storage areas. In a type 2 area data blocks are "pure". They only contain records from a single table -- even if multiple tables are in the (type 2) area (OE 10 still supports type 1 areas, you make an area type 2 by specifying a cluster size of 8 or more).
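For what it's worth, in a structure (.st) file the cluster size is the number after the semicolon on the area line. A made-up OE10 fragment, purely as a sketch (area name, area number, rows per block, extent sizes and paths are all placeholders):

Code:
d "OrderData":10,64;8 d:\database\prod\order_10.d1 f 512000
d "OrderData":10,64;8 d:\database\prod\order_10.d2

Here 64 is the rows-per-block setting and the ";8" is the cluster size that makes it a type 2 area; leave it off (or use 1) and you get a type 1 area.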
 

joey.jeremiah

ProgressTalk Moderator
Staff member
i was under the impression that when they started with asa (advanced
storage architecture) in version 9, table blocks (rm) contain only records
from a single table. quick search showed i was wrong, again :p

i don't think you need to load the entire table into memory. the buffer pool
does a pretty good job of managing memory, conserving i/o etc.

queries that monopolize the buffer pool may not be a good thing either. i've
been meaning to test the effects of using private readonly buffers in a
production environment.

sounds to me like you should do a dump and load (and index rebuild,
compact). the diff in reading records in contiguous versus scattered order
can be as high as 20x (ymmv).
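fwiw the private readonly buffers i mentioned are requested with the -Bp client
connection parameter ( capped by the broker's -Bpmax ). a rough sketch only,
db / host / service names are just placeholders:

Code:
/* rough sketch: batch reporting client asking for private read-only buffers
   so its big table scan doesn't flush the shared -B buffer pool.
   db / host / service names here are placeholders. */
CONNECT VALUE("-db mydb -H dbhost -S 2501 -N TCP -Bp 64") NO-ERROR.
IF NOT CONNECTED("mydb") THEN
    MESSAGE "connect failed:" ERROR-STATUS:GET-MESSAGE(1).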
 

bulklodd

Member
The Progress 4gl & database does not load tables into memory on reference. It loads blocks.

Tom, have you heard about PSC's plans for per-area buffer pools, IOW each area will have its own buffer pool? I think it'll be a good compromise between the private pools and the common buffer pool.
 

joey.jeremiah

ProgressTalk Moderator
Staff member
yes. indexes are updated while loading, just as they are updated when
records are created, updated etc.

if they were not the database would be corrupt ( but triggers are disabled
while loading ).


updating indexes adds a significant overhead to table updates.

building all the indexes together afterwards mainly speeds things up ( though there's
a huge performance diff between the index rebuild utils in version 9 and 10, sorry )

you would need to first deactivate the indexes before loading.


it would also compress index blocks to the smallest possible size, which also
has a small negative impact,

because it causes more index blocks to split afterwards until things settle
down, usually a week or two later.


index rebuild is not the only way to optimize dump and load.

your dba might want to have a look at the dump and load chapter in the
database administration doc. i think there's also a webcast.

maybe he could create a test database ( maybe restore from backup ) and
put together a process, do some benchmarks etc.


the biggest problem is that the database would need to be taken off line for
a few hours to perform dump and load.

but it could also give you a chance to do some house cleaning, take care of
some sql stuff ( you're the real big expert here )

maybe play around with the database structure, add areas, separate
indexes and data etc.


test drive is making a come back ( some time soon ) so you can get a copy
and try out the latest release.

i would really be interested to hear your opinion on the sql stuff.
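very roughly, one common approach is a binary dump and load. the db, table and
path names below are only placeholders, try it on a copy first:

Code:
rem 1. binary dump the big tables from the old db
proutil olddb -C dump ir d:\dump
rem 2. create a fresh empty db, load the schema (.df), then binary load the data
proutil newdb -C load d:\dump\ir.bd
rem 3. rebuild all the indexes in one pass at the end
proutil newdb -C idxbuild all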
 

TomBascom

Curmudgeon
bulklodd said:
Tom, have you heard about PSC's plans for per-area buffer pools, IOW each area will have its own buffer pool? I think it'll be a good compromise between the private pools and the common buffer pool.

Yes. But BCM was talking about the here and now, not the maybe-someday-in-the-future ;-)
 

BCM

Member
You've got that right, Tom! Also, we cannot spare the hours needed to take the database offline for the dump/reload. We may have things as good as they are going to get. Copying the tables to SQL Server via ODBC and SQL-92 takes 3.5 hours. I'd be much happier if we could accomplish that in less than 3 hours. Bandwidth is not the bottleneck. Our internal network bandwidth is 1 Gb per second. Once the data is on SQL Server it can be copied to another SQL Server in less than 10 minutes.

We have added a second SQL only server port to our broker. I have two jobs that copy the data from Progress to SQL Server. Both jobs run concurrently from the SQL Server. Some of the tables are copied through the first SQL port, and some through the second SQL port. The login credentials are different between the two ports, but the client identification of the box is the same. I do not observe any evidence that Progress sees the requests as coming from two different clients and runs them concurrently. I really don't understand this. We have a powerful multi-processor server with 4Gb of ram for our Progress database. I don't understand why Progress will not blast through these simple requests concurrently.

I should add that we run these copy jobs early in the morning when no users are logged-in to the Progress database.

I should add something else, too. On our largest table (4.8 million records) the following 4GL code takes almost 40 minutes to run. That is completely unacceptable.

DEF VAR N AS INT NO-UNDO.
N = 0.
FOR EACH IR NO-LOCK:
    N = N + 1.
END.
 

bulklodd

Member
DEF VAR N AS INT NO-UNDO.
N = 0.
FOR EACH IR NO-LOCK:
N = N + 1.
END.

If you're on client/server you can improve your code significantly by adding the FIELDS option to the FOR EACH:

Code:
/* empty FIELDS list: the server sends back no field values, just enough to count records */
FOR EACH IR FIELDS() NO-LOCK:
    N = N + 1.
END.
 

bulklodd

Member
I do not observe any evidence that Progress sees the requests as coming from two different clients and runs them concurrently. I really don't understand this. We have a powerful multi-processor server with 4Gb of ram for our Progress database. I don't understand why Progress will not blast through these simple requests concurrently.

Could you tell us about your progress version? Is it workgroup or enterprise? Could you publish your progress server startup parameters? It'll be really difficult to help you without that information.
 

BCM

Member
bulklodd - That is only an example. I am not trying to run 4GL code. I am trying to copy tables from Progress to SQL Server as quickly as possible, and I do not understand why reading from Progress is so much slower than reading from other data sources.
 

BCM

Member
We use Progress Enterprise 9.1D. We will not be spending any money to upgrade any part of our Progress systems so please do not suggest that. Our perception is that using Progress kills integration with non-Progress applications on Windows platforms.

Do you mean the database startup parameters?
 

BCM

Member
bulklodd - Here are the recent log entries after stopping and re-starting the database.


Mon May 22 03:00:26 2006
03:00:26 BROKER 0: Multi-user session begin. (333)
03:00:27 BROKER 0: Begin Physical Redo Phase at 576 . (5326)
03:00:28 BROKER 0: Physical Redo Phase Completed at blk 679 off 4213 upd 1130. (7161)
03:00:28 BROKER 0: Started for 2501 using TCP, pid 2768. (5644)
03:00:28 BROKER 0: Connecting to Admin Server on port 7835. (8836)
03:00:28 BROKER 0: Registered with Admin Server. (8846)
03:00:28 BROKER 0: PROGRESS Version 9.1D on WINNT. (4234)
03:00:28 BROKER 0: Server started by SYSTEM on CON:. (4281)
03:00:28 BROKER 0: Started using pid: 2768. (6574)
03:00:28 BROKER 0: Physical Database Name (-db): d:\database\prod\asset300. (4235)
03:00:28 BROKER 0: Database Type (-dt): PROGRESS. (4236)
03:00:28 BROKER 0: Force Access (-F): Not Enabled. (4237)
03:00:28 BROKER 0: Direct I/O (-directio): Not Enabled. (4238)
03:00:28 BROKER 0: Number of Database Buffers (-B): 40000. (4239)
03:00:28 BROKER 0: Maximum private buffers per user (-Bpmax): 64. (9422)
03:00:28 BROKER 0: Excess Shared Memory Size (-Mxs): 16414. (4240)
03:00:28 BROKER 0: The shared memory segment is not locked in memory. (10014)
03:00:28 BROKER 0: Current Size of Lock Table (-L): 128000. (4241)
03:00:28 BROKER 0: Hash Table Entries (-hash): 13063. (4242)
03:00:28 BROKER 0: Current Spin Lock Tries (-spin): 1000. (4243)
03:00:28 BROKER 0: Number of Semaphore Sets (-semsets): 1. (6526)
03:00:28 BROKER 0: Crash Recovery (-i): Enabled. (4244)
03:00:28 BROKER 0: Database Blocksize (-blocksize): 4096. (6573)
03:00:28 BROKER 0: Delay of Before-Image Flush (-Mf): 3. (4245)
03:00:28 BROKER 0: Before-Image File I/O (-r -R): Reliable. (4247)
03:00:28 BROKER 0: Before-Image Truncate Interval (-G): 60. (4249)
03:00:28 BROKER 0: Before-Image Cluster Size: 524288. (4250)
03:00:28 BROKER 0: Before-Image Block Size: 8192. (4251)
03:00:28 BROKER 0: Number of Before-Image Buffers (-bibufs): 5. (4252)
03:00:28 BROKER 0: BI File Threshold size (-bithold): 0.0 Bytes. (9238)
03:00:28 BROKER 0: BI File Threshold Stall (-bistall): Disabled. (6552)
03:00:28 BROKER 0: After-Image Stall (-aistall): Not Enabled. (4254)
03:00:28 BROKER 0: Number of After-Image Buffers (-aibufs): 5. (4256)
03:00:28 BROKER 0: Storage object cache size (-omsize): 1024 (8527)
03:00:28 BROKER 0: Maximum Number of Clients Per Server (-Ma): 5. (4257)
03:00:28 BROKER 0: Maximum Number of Servers (-Mn): 25. (4258)
03:00:28 BROKER 0: Minimum Clients Per Server (-Mi): 3. (4259)
03:00:28 BROKER 0: Maximum Number of Users (-n): 105. (4260)
03:00:28 BROKER 0: Host Name (-H): progress1. (4261)
03:00:28 BROKER 0: Service Name (-S): 2501. (4262)
03:00:28 BROKER 0: Network Type (-N): TCP. (4263)
03:00:28 BROKER 0: Character Set (-cpinternal): ISO8859-1. (4264)
03:00:28 BROKER 0: Parameter File: Not Enabled. (4282)
03:00:28 BROKER 0: Maximum Servers Per Broker (-Mpb): 20. (5647)
03:00:28 BROKER 0: Minimum Port for Auto Servers (-minport): 3000. (5648)
03:00:28 BROKER 0: Maximum Port for Auto Servers (-maxport): 5000. (5649)
03:00:28 BROKER 0: This broker supports 4GL server groups only. (8863)
03:00:28 BROKER 0: Created shared memory with segment_id: 10420224 (9336)
03:00:28 BROKER 0: Created shared memory with segment_id: 144637952 (9336)
03:00:29 SRV 1: Started on port 3000 using TCP, pid 2076. (5646)
03:00:30 SRV 1: Login usernum 128, userid administrator, on . (742)
03:00:30 SRV 1: Login usernum 127, userid administrator, on . (742)
03:00:31 BROKER 2: Started for 2509 using TCP, pid 2780. (5644)
03:00:31 BROKER 2: This is an additional broker for this protocol. (5645)
03:00:31 BROKER 2: This broker supports SQL server groups only. (8864)
03:00:33 BROKER 3: Started for 5018 using TCP, pid 2700. (5644)
03:00:33 BROKER 3: This is an additional broker for this protocol. (5645)
03:00:33 BROKER 3: This broker supports SQL server groups only. (8864)
03:00:35 BIW 25: Started. (2518)
03:00:37 WDOG 26: Started. (2518)
03:00:39 APW 27: Started. (2518)
03:00:41 APW 28: Started. (2518)
03:21:01 SQLSRV2 4: SQL Server 9.1D.08 started, configuration: "prod.defaultconfiguration"
03:21:01 SQLSRV2 4: "SQL92" started on port 3001, pid 2612 (0x00000a34).
03:21:01 SQLSRV2 4: Thread stack size: 1024000 (bytes).
03:21:01 SQLSRV2 4: DLC from ENVIRONMENT VARIABLE is: D:\dlc91d
03:21:01 SQLSRV2 4: WRKDIR from REGISTRY is: D:\PROGRESS\WRK\
03:21:01 SQLSRV2 4: JDKHOME from REGISTRY is: D:\dlc91d\jdk
03:21:01 SQLSRV2 4: JREHOME from REGISTRY is: D:\dlc91d\jre
03:21:01 SQLSRV2 4: CLASSPATH from DEFAULT is:
03:21:01 SQLSRV2 4: PROSQL_LOCKWAIT_TIMEOUT value is: 5 seconds
03:21:02 SQLSRV2 4: Login usernum 126, remote SQL client. (8873)
03:21:02 SQLSRV2 4: Usr 126 set name to dummyID01. (7129)
03:25:01 SQLSRV2 5: SQL Server 9.1D.08 started, configuration: "prod.defaultconfiguration"
03:25:01 SQLSRV2 5: "SQL92" started on port 3005, pid 2100 (0x00000834).
03:25:01 SQLSRV2 5: Thread stack size: 1024000 (bytes).
03:25:01 SQLSRV2 5: DLC from ENVIRONMENT VARIABLE is: D:\dlc91d
03:25:01 SQLSRV2 5: WRKDIR from REGISTRY is: D:\PROGRESS\WRK\
03:25:01 SQLSRV2 5: JDKHOME from REGISTRY is: D:\dlc91d\jdk
03:25:01 SQLSRV2 5: JREHOME from REGISTRY is: D:\dlc91d\jre
03:25:01 SQLSRV2 5: CLASSPATH from DEFAULT is:
03:25:01 SQLSRV2 5: PROSQL_LOCKWAIT_TIMEOUT value is: 5 seconds
 

bulklodd

Member
Here are the recent log entries after stopping and re-starting the database

if you only read the database without writing, try the -i startup parameter. It can significantly improve reading. I think the -RO parameter will be useful as well.
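For example, a 4GL batch client can open an offline copy of the database read-only like this (the path is just an example):

Code:
/* rough sketch: read-only connection to an offline copy of the db, no broker needed */
CONNECT VALUE("-db d:\database\copy\asset300 -RO") NO-ERROR.
IF NOT CONNECTED("asset300") THEN
    MESSAGE "connect failed:" ERROR-STATUS:GET-MESSAGE(1).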
 

BCM

Member
BulkLodd - The database is being written to throughout the day. The updates to the database are handled via a 4GL application. We copy the database to SQL Server once each day for reporting.

Here are the startup parameters for the Application Server.

#
# Connection Manager Properties File
#
%% Properties File
%% version 1.1
%% May 19, 2006 7:41:41 AM

[configuration.prod.defaultconfiguration]
asynchronouspagewriters=2
blocksindatabasebuffers=40000
database=prod
displayname=defaultConfiguration
locktableentries=128000
maxservers=24
maxusers=104
otherargs=
servergroups=prod.defaultconfiguration.defaultservergroup, prod.defaultconfiguration.sql92, prod.defaultconfiguration.sql92-2
spinlockretries=1000

[configuration.train.defaultconfiguration]
asynchronouspagewriters=2
blocksindatabasebuffers=40000
database=train
displayname=defaultConfiguration
locktableentries=80000
otherargs=-spin 1000
servergroups=train.defaultconfiguration.defaultservergroup, train.defaultconfiguration.sql92

[database.prod]
autostart=true
configurations=prod.defaultconfiguration
databasename=d:\database\prod\asset300
defaultconfiguration=prod.defaultconfiguration
displayname=Prod

[database.train]
autostart=true
configurations=train.defaultconfiguration
databasename=d:\database\train\asset300
defaultconfiguration=train.defaultconfiguration
displayname=Train

[environment]

[servergroup.prod.defaultconfiguration.defaultservergroup]
configuration=prod.defaultconfiguration
displayname=defaultServerGroup
maxclientsperserver=5
minclientsperserver=3
numberofservers=20
port=2501
type=4gl

[servergroup.prod.defaultconfiguration.sql92]
configuration=prod.defaultconfiguration
displayname=SQL92
maxclientsperserver=5
minclientsperserver=1
numberofservers=3
port=2509
type=sql

[servergroup.prod.defaultconfiguration.sql92-2]
configuration=prod.defaultconfiguration
displayname=SQL92-2
maxclientsperserver=5
numberofservers=5
port=5018
type=sql

[servergroup.train.defaultconfiguration.defaultservergroup]
configuration=train.defaultconfiguration
displayname=defaultServerGroup
maxclientsperserver=5
numberofservers=15
port=2507
type=4gl

[servergroup.train.defaultconfiguration.sql92]
configuration=train.defaultconfiguration
displayname=SQL92
maxclientsperserver=5
minclientsperserver=1
numberofservers=6
port=5014
type=sql
# host=localhost # -H
# initialservers=0 # n/a
# maxclientsperserver=0 # -Ma (calculated value)
# maxdynamicport=5000 # -maxport (5000 for NT; 2000 for UNIX)
# messagebuffersize=350 # -Mm (4gl only)
# minclientsperserver=1 # -Mi
# mindynamicport=3000 # -minport (3000 for NT; 1025 for UNIX)
# networkclientsupport=true # false for self-service
# numberofservers=0 # -Mpb
# port=0 # -S ; Must be non-zero
# # when networkclientsupport=true
# prosqltrc=nnnnnnnnnnn # turn on various levels of SQL tracing
# reportinginterval=1 # -rpint (4gl only)
 

bulklodd

Member
Do you read the database using a 4GL or a SQL client? Have you considered the possibility of just dumping the data from the database in SQL format? I mean something like this:

Code:
OUTPUT TO ir.sql.          
FOR EACH ir NO-LOCK:
   PUT UNFORMATTED "INSERT INTO ir ..."  SKIP.
END.
OUTPUT CLOSE.
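Or, since SQL Server is the target anyway, you could dump a delimited file and bulk-load it on the SQL Server side (BULK INSERT or bcp). Just a sketch, the output path is a placeholder:

Code:
/* rough sketch: dump the whole record comma-delimited for a bulk load */
OUTPUT TO VALUE("d:\dump\ir.csv").
FOR EACH ir NO-LOCK:
    EXPORT DELIMITER "," ir.
END.
OUTPUT CLOSE.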
 