[Progress Communities] [Progress OpenEdge ABL] Forum Post: Transactions and Undos. Novel

Status
Not open for further replies.
G

George Potemkin

Guest
It’s important to know how the transactions and their rollbacks are working in Progress. I think there are the myths and the missed details here. In this topic I’d like accumulate the information about the subject. If you have any information you can share with Community, please, add it here. Comments are very welcomed. Chapter one: Transactions in VSTs. _ActSummary._Summary-Commits _ActSummary._Summary-TransComm _ActOther._Other-Commit *-Trans All these fields are sourced from the same counter stored in database shared memory. The counter is incrementing when a transaction is removed from Transaction Table provide the whole transaction was not undone. _ActSummary._Summary-Undos _ActOther._Other-Undo The counter is incremented by each UNDO statement even if it's an undo of sub-transaction. The same transaction can increment both the Commits and Undos counters. The same transaction can increment Undos many times. Fortunately it’s possible to separate the undos of sub-transactions and the backouts of the whole transactions using lasttask. _MstrBlk._MstrBlk-lasttask _DbStatus._DbStatus-LastTran mb_lasttask (field in master block, size: 4 bytes, offset: 44 bytes) It’s the last transaction number. Since V6.2G01 when it gets to 2147483647 it rolls over and continues from 1. At crash recovery the lasttask will be re-set to the last transaction id recorded in the transaction table note in the before-image log. In other words, the recent values of lasttask can be lost and the lasttask can have the non-unique values even on the short intervals. Progress uses (or used) at least two different algorithms to increment the lasttask. Simple but rarely used (or not used at all in the recent Progress versions): the lasttask is continuously incrementing by one. It was working that way in V10.2B on Windows for newly created databases. But when you at least once start a database with the -n 128 or higher the algorithm is persistently changed to the different one: the main algorithm. It looks like Progress uses some flag to choose the algorithm. I’m sure it’s not a flag in Master Block. Most likely it’s not a flag in before-image file - at least ‘proutil -C truncate bi’ will not restore the initial algorithm. Location of the flag is a mystery. The main algorithm: the lasttask is incrementing by one but only (-n + 100) times. Then it jumps up to the value multiple to the power of two – where the power of two is, of course, larger than (-n + 100). Note that strictly speaking the –n parameter does NOT set the “Maximum Number of Users” as documentation and msg #4260 are saying. Maximum number of database connections (the number of entries in Usrctl Table ) is equal to ( -n + -Mn + 2). The –n parameter defines the number of entries in Transaction Table – the maximum number of transactions that can be opened simultaneously. It looks like Progress uses some pool of entries for future transactions (for example, with the pointers to the free entries in Transaction Table) and the size of this pool is the first power of two larger than (-n + 100). But why the size of the pool is larger than the size of Transaction Table? Examples: -n 1: Pool size = 128, lasttask increments continuously 101 times, then it jumps by 27; -n 20: Pool size = 128, lasttask increments continuously 120 times, then it jumps by 8; -n 1000: Pool size = 2048, lasttask increments continuously 1100 times, then it jumps by 948. If the –n equals exactly to a power of two minus 100 then the lasttask will increments continuously. If the –n equals to a power of two minus 99 then Progress will use only a first half of the pool and the half of TRIDs will be never used. Does it affect the performance? As the result the increments of the lasttask can be twice higher than the number of transactions allocated in Transaction Table. But we can calculate the real number of transactions by the formula: The number of transactions = (lasttask2 – lasttask1) – (truncate(lasttask2/X, 0) - truncate(lasttask1/X, 0)) * (X- -n – 100) where X is the first power of two greater or equal to (-n + 100) provided the –n was not changed between lasttask1 and lasttask2. The difference between the number of transactions and the number of the committed transactions is the undos of the whole transactions. The _Summary-Undos counter includes this number plus the undos of sub-transactions. Since V10.0A Progress internally supports another two useful counters related to the transaction activity but unfortunately they are not available in promon or VSTs. They are available in Master Block only: mb_txnSequence (size: 8 bytes, offset: 240 or 236 bytes on Linux) mb_lastLogOpSeq (size: 8 bytes, offset: 248 or 244 bytes on Linux) Note that the field offsets in Master Block on Linux is different from the ones on other platforms. By the ways, it’s one (but not the only) reason why we can’t copy Progress databases between Windows and Linux. Also Master Block is flushed on disk only during rather rare events like checkpoints, switching after-image files etc. Usually it happens only once per few minutes. We can read these values from disk but they will not be up-to-date. mb_txnSequence It counts only active transactions – the transactions that did changed a database. The only similar counter in VSTs is _Connect-NumTrans. In the “young” databases the value of mb_lasttask is always higher than mb_txnSequence. The ratio is opposite in the “matured” databases where mb_lasttask is rolled over. The mb_txnSequence devided by 2147483648 estimates how many times the mb_lasttask was rolled over in your database. By the ways, some activity screens in promon (for example, Activity: BI Log) have the “Per Tx” columns. The values are based on _Summary-Commits. In other words, the counters of db changes are divided by the number of transactions that includes the transaction in the “BEGIN” status. It makes “Per Tx” statistics meaningless. mb_lastLogOpSeq It counts the "parentheses" (RL_LOGOP_START and RL_LOGOP_END notes) in BI file that enclose the physical operations that compose a logical operation. They are generated only when the database is the source for replication. Ratio of mb_lastLogOpSeq to mb_txnSequence is a rough estimation of average transaction size. Dbrpr utility will not show non-zero values in mb_lastLogOpSeq because it will request to turn off after-image logging. Instead of dbrpr the viewdbblock script can be used. To complete the list of the “transaction” fields in database blocks: An extended header of the first block in data cluster stores the transactionId (size: 4 bytes, offset: 64 bytes). I guess it’s filled based on mb_lasttask. _Trans._Trans-Num _Connect._Connect-TransId (since V8.2) _Lock._Lock-TransId (introduced in 11.4) DBTASKID() They return the transaction ID. If _Connect is in use but does not open a transaction then _Connect-TransId returns 0. DBTASKID() returns data stored in the client’s memory rather than in database shared memory. _Connect._Connect-NumTrans (introduced in 11.7.0) The counter is incremented only when transaction status is changing to “ACTIVE”. This contradicts to "What's new with the VSTs" presentation: “# transaction begin requests. Includes commit, rollback, and txns with no changes” Promon/ Activity: Summary => Active trans It’s the number of entries in Transaction Table that are currently in use, the transaction status does not matter: it’s not only ACTIVE transactions. There is no such field in VST. The corresponding value can be calculated by counting the records in _Trans where _Trans-State NE ? The value can be used for a rough estimation of the average transaction duration as a ratio of “Active trans” to “Commits”. It will work even if transaction durations is a small part of a second. _ActBILog._BiLog-RecRead _ ActBILog._BiLog-RecWriten (both exist since 8.2) _UserIO._UserIO-BIRecReads _UserIO._UserIO-BIRecWrites _Trans._Trans-BIRecReads _Trans._Trans-BIRecWrites (all introduced in 11.7.0) The number of recovery notes read from/written to BI file. Before V10.1A that introduced the savepoints the undos of sub-transactions did not read BI file. The -nosavepoint startup option on client processes restores the pre-10.1A local before-image mechanism. Progress use so called “the jump note technology” (see Article 000042774) that rules the reads of recovery notes during rollbacks but the detailed are unknown. In the most cases the transaction rollbacks read and write almost the same number of notes that was created before beginning of undo but there are the scenarios when the behavior is quite different. There is a feature of the _Trans-BIRecReads and _Trans-BIRecWrites fields: the counters should be zeros when a transaction is in the BGIN state but you may see the non-zero values. It’s the final values left by the previous transaction of the same user. _Trans-BIRecWrites will count even the RL_TEND note created by the previous transaction. The counters will be zeroed when the current transaction becomes “ACTIVE” and the _ Connect-NumTrans is incremented. I’m not sure if it’s an intended behavior but I’d not call it a bug either. We’ll get the correct values for the ACTIVE transaction. I have some statistics that makes me think that those counters in _Trans table can be zeroed under some other (unknown) circumstances. _Trans._Trans-State (since V8.2) _Lock-Trans-State (introduced in 11.4) Before 8.2: DEAD, ALLOCATED, ACTIVE, PREPARING, PREPARED, COMMITTING Since 11.4: NONE, BEGIN, ACTIVE, PREPARING. PREPARED, COMMITTING Promon: None, Begin, Active, Prep, Phase 1, Phase 2 NONE = DEAD It means that the transaction table entry previously held is now free and available for reuse by a new transaction. This state can be seen only for a very short period of time. BEGIN = ALLOCATED The entry in transaction table is created but the transaction did not yet update a database. PREPARING, PREPARED, COMMITTING They are intended for use with Two Phase commit protocol but for a short period of time these states can be seen even if the protocol is not enabled. Example from ATM test running 150 sessions on the top hardware (OpenEdge Release 11.7.2): 01/22/18 Activity: Summary 08:32:19 01/22/18 08:32 to 01/22/18 08:32 (10 sec) Event Total Per Sec |Event Total Per Sec Commits 37574 3757.4 |DB Reads 156775 15677.5 Undos 0 0.0 |DB Writes 53437 5343.7 Record Reads 112596 11259.6 |BI Reads 0 0.0 Record Updates 112094 11209.4 |BI Writes 1730 173.0 Record Creates 37346 3734.6 |AI Writes 0 0.0 Record Deletes 0 0.0 |Checkpoints 0 0.0 Record Locks 187442 18744.2 |Flushed at chkpt 0 0.0 Record Waits 15 1.5 |Active trans 150 0 Servers, 151 Users (151 Local, 0 Remote, 151 Batch), 8 Apws 08:32:19 01/22/18 08:32 to 01/22/18 08:32 (10 sec) Typical counts of transactions per their status on Status: Active Transactions screen: 121 Phase 2 16 Active 12 Begin 1 None There were no transactions in the PREPARING or PREPARED (Phase 1) states. In other words, when the transactions are short (in this case the average duration was 40 msec = 150 / 3757.4) then the ACTIVE state of transaction can be much shorter than the COMMITTING state. _Trans._Trans-Flags (introduced in 11.0) _Lock-Trans-Flags (introduced in 11.4) Values since 11.4: FWD UNDO They return the UNDO flag only when a session was externally interrupted (by signal). Undo of transactions (or sub-transactions) initiated by ABL code is reported with the “FWD” flag even if _Trans-BIRecReads has non-zero value. _Connect._Connect-Resync It returns 1 when a session is externally interrupted. If a session updated a few databases inside the same transaction then _Connect-Resync will be set only for one database at time. Session will write to a log of the corresponding database: (2252) Begin transaction backout. (2253) Transaction backout completed. Transaction rollbacks will be done per database level: the last connected – the first undone. _Connect. _Connect-Interrupt It’s not used. To be continued…

Continue reading...
 
Status
Not open for further replies.
Top