Answered Can (SQL) UPDATE STATISTICS hamper performance of ABL AFTER the fact - NO!

pinne65

Member
OE 11.6 / RHEL 5.11 - A little bit nervous to try it. I'm worried about potential negative side effects on ABL AFTER you run UPDATE STATISTICS. I'm aware it'll use lots of resources while it runs, but we are not a 24/7 shop so I can find some downtime.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
If you are on 11.6.0 I would advise caution. I ran into a bug in 11.6.0 on AIX where UPDATE STATISTICS uses a lot more locks than it should. I ran it in a customer's non-production environment, as a test, and blew the lock table. I can confirm that bug is fixed in 11.6.3, though it isn't mentioned in any of the SP release notes. I don't know whether this bug is AIX-only or cross-platform.

As to whether it will impact ABL clients after it runs, the main impact would be the state of the buffer pool. If you run it against all tables and indexes then it will have read the entire database. So ABL clients accessing data after UPDATE statistics will have a lot of cache misses until the data they access frequently is loaded back into the buffer pool.
 

pinne65

Member
I was more worried about any potential lasting ill effects. If that's not the case we should be fine.
Thanks for the heads-up though - we are on 11.6, but on Red Hat Linux / VMware.
(Just as a side note - we were on AIX before and ran into some bugs affecting Progress that I think were specific to that OS.)
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Thanks for the heads-up though - we are on 11.6, but on Red Hat Linux / VMware.
As I said, I don't know this to be an AIX-only problem. If you are on a service pack earlier than SP03 you may be vulnerable to this bug so I strongly suggest you test this first on a newly-started copy of production and check your lock table HWM after the test.

That aside, ABL clients don't read the SQL metadata tables so I don't see how UPDATE STATISTICS could affect them. If anything, it might slightly benefit them if SQL servers are constructing more efficient query plans and are therefore not fouling the buffer pool with unnecessarily-read data or contributing to queuing due to unnecessary disk reads.
 

pinne65

Member
Thanks - it all worked without problems. I just ran it on the tables in question; I don't think it had ever been run before. I didn't see much, if any, performance gain though.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
Thanks - it all worked without problems. I just ran it on the tables in question; I don't think it had ever been run before. I didn't see much, if any, performance gain though.
There are variations of the UPDATE STATISTICS command. You can optionally update table stats, index stats, column stats, or all column stats. Did you do all of them?

Are your SQL queries very time-consuming?
 

pinne65

Member
I did the whole shebang - UPDATE TABLE STATISTICS AND INDEX STATISTICS AND COLUMN STATISTICS on a couple of tables.

The queries are not horrible, just not as snappy as the ABL queries in our ERP. The ERP returns the result right away, while my SQL query takes 2-3 s. But I'm not really sure what our ERP is doing; there's probably more going on under the hood...

I'm running my queries from SQL Explorer on our OpenEdge server.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I did the whole shebang - UPDATE TABLE STATISTICS AND INDEX STATISTICS AND COLUMN STATISTICS on a couple of tables.
I believe the "whole shebang" is "UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS". Admittedly, the docs on this could do with some improvement. They say, "To get the best SQL query performance" and "a full set of SQL statistics", execute the statement you wrote. That makes it sound like you can't do any better. Then in the next paragraph they say "you get even better statistics by executing" the statement I wrote above. The difference is that without ALL, you're only getting data-distribution statistics on columns that are index components, not on all columns in the table. That may or may not matter, depending on your queries.
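To make the comparison concrete, here are the two statements side by side (a sketch based on the syntax quoted above; the table name PUB.customer is a hypothetical placeholder - check your version's SQL reference for the exact syntax):

```sql
-- The "full set" per the docs: column stats only for indexed columns
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND COLUMN STATISTICS
    FOR PUB.customer;

-- The "even better" variant: data-distribution stats for ALL columns
UPDATE TABLE STATISTICS AND INDEX STATISTICS AND ALL COLUMN STATISTICS
    FOR PUB.customer;
```

The only textual difference is the ALL keyword, but it changes which columns get distribution statistics.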

So is this more a matter of curiosity, rather than trying to solve a problem?
 

pinne65

Member
Thanks!
You're right - and I'm not sure :) if I used ALL - will have to try again. I'm working on a real problem - the query speed is passable now. But I'd like to get it as fast as possible.
 

pinne65

Member
Yes, they are separate:
Code:
4GL: -n 1000 -L 1000000 -bibufs 40 -aibufs 60 -B 1750000 -bithold 19999 -bistall -ServerType 4GL -Mi 1 -Ma 4 -Mn 251 -Mpb 218 -maxport 32000 -DBService replserv -pica 8192
SQL: -m3 -Mi 1 -Ma 4 -Mpb 31 -N tcp -S 8001 -ServerType SQL
 

Cringer

ProgressTalk.com Moderator
Staff member
How big is the database? That's a pretty large -L. And -pica will probably be grateful for a bit of a boost too. I presume you are monitoring this to ensure it doesn't fill up?
/tangent
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
This doesn't seem to be about UPDATE STATISTICS any more, so maybe it should be in a different thread, but:
  • -n 1000: too low, given a possible 996 remote clients plus page writers, watchdog, replication server, monitors, utilities, and maybe self-service clients.
  • -L: I agree it's a red flag but on 11.6, not a performance problem in and of itself.
  • 4GL broker has maxport but no minport.
  • SQL broker has neither.
  • -bibufs 40/-aibufs 60: on the small side (though this depends on transaction rate) and indicative of old tuning advice to set aibufs to 1.5 * bibufs. I set them the same, starting at 150.
  • -B: 7 GB or 14 GB, depending on your block size. Don't know if that's appropriate without knowing more about the DB, the system, the hit ratio, etc.
  • You could add -pinshm, as it's Linux.
  • You may want to add -T to the SQL broker, specifying a non-database disk or partition, depending on where your working directory is.
  • -spin: missing. Depending on how many cores you have, this is likely set much too high. Default is 6000 * # of cores.
  • -lruskips: missing. I'd start with -lruskips 100.
  • I always specify -MemCheck on the primary broker. This is for safety, not performance.
  • -tablerangesize/-indexrangesize: missing. If you have performance concerns, you should be looking at CRUD stats. If they show small, heavily-read tables, you should probably use Alternate Buffer Pool. That will lead to further tuning.
  • -omsize: missing. Do you have fewer than 1024 storage objects?
  • If you have SQL performance concerns, you may want to change the balance of -Mpb/-Ma: more servers, fewer clients per server.
  • Do you have your objects in a decent Type II structure? Can you post your structure?
  • Do you have decent storage hardware?
  • Do you have decent server hardware?
  • Are you using virtualization? If so is it configured and tuned by someone who knows what they're doing?
  • Is this database sharing its I/O bandwidth or other server resources with some other applications or databases?
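Pulling a few of those suggestions together, a revised 4GL broker startup might look something like this (a sketch only - the database path is hypothetical, the values are illustrative and must be sized for your system, and existing parameters are kept where not changed):

```shell
# Sketch only -- values are illustrative assumptions, not recommendations.
# -n raised for headroom, -minport added, -bibufs/-aibufs equalized,
# -spin/-lruskips/-pinshm added; remaining params as before.
proserve /db/prod/erpdb \
    -n 1300 \
    -minport 9000 -maxport 32000 \
    -bibufs 150 -aibufs 150 \
    -spin 10000 \
    -lruskips 100 \
    -pinshm \
    -B 1750000 -L 1000000 -bithold 19999 -bistall \
    -ServerType 4GL -Mi 1 -Ma 4 -Mn 251 -Mpb 218 \
    -DBService replserv -pica 8192
```

Test any such change on a copy of production first, one parameter group at a time, so you can attribute any difference you measure.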
 

pinne65

Member
Thanks everybody for the input. It's greatly appreciated!

This is a 3rd-party system from an outside vendor, so most things are out of my control and I'm not comfortable changing the database configuration too much. I'm a "Jack of all trades" who sometimes puts on the junior DBA hat and knows just enough to be dangerous.

The database is 420 GB, high water at 312 GB, and block size 8192.

-L - yes, we've run into some locking problems a while back. It seems to be ok now though.

-omsize - seems like it needs a bump - we have 2623
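(For anyone wanting to check their own count: one way is to count records in the _StorageObject VST, in the spirit of the ABL snippet later in this thread. A sketch only, assuming a connected database with the dictdb alias:)

```abl
/* Count storage objects to compare against -omsize.
   Sketch; assumes a connected database aliased dictdb. */
define variable iCount as integer no-undo.

for each dictdb._StorageObject no-lock:
    iCount = iCount + 1.
end.

display "Storage objects:" iCount.
```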

I don't know what our confidentiality agreement with the vendor is so I'm not sure I can post the database structure. It is however type II, and appears to have designated table and index areas for different types of table record sizes/growth rate and so forth.

Server and storage hardware is ok.

We are in a VMware environment and the database is sharing IO with other databases and applications. The disks of this system are all currently using the same type of storage on the vm hosts. Moving the after image file system/disk to storage optimized for database log files is on our todo list.

We can't do much about the application itself though.

Again, thanks for the "todo" list!

We will slowly try things out one by one, starting with the items that can be changed while the database is online.

Short snapshot of recent activity:
[attached screenshot: promon activity stats]
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
The stats shown look generally OK, no big red flags, other than 14 GB being a potentially somewhat small buffer pool. But that depends a lot on what data the application reads. The ratio of logical to physical reads shown is good, but of course that is something that can vary significantly from one moment to the next. It would be interesting to see whether it remains high over time. If not, a -B increase may be in order, resources permitting.

Set -omsize above 2623; you might even be able to change it online with proutil -increaseto (I've had mixed success with that). But once set, it's a quick win in terms of reducing read overhead. As far as changing DB configuration generally is concerned, it's a good idea to consult with your vendor on any changes you plan to make. But it's also a good idea for them to listen to you and not be married to boilerplate settings.
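If your release supports it, the online change mentioned above might look like this (a sketch; the database path is hypothetical - verify the increaseto qualifier and its supported parameters against your version's proutil documentation before trying it on production):

```shell
# Attempt an online increase of the object cache.
# Sketch only; confirm increaseto support for -omsize in your release.
proutil /db/prod/erpdb -C increaseto -omsize 4096
```

If the online change isn't supported or fails, the fallback is adding -omsize to the startup parameters and restarting the broker.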

Re: confidentiality, I don't think database structure qualifies as intellectual property. Also, it's your database, not the vendor's. Those are just my personal opinions, not my employer's (I work for an application partner), and I'm not a lawyer. But I do understand that you have to get buy-in from other people before disclosing that information. Fair enough. The important thing is that someone is looking after this database and they understand modern storage-design best practices. Based on what you've described, the structure sounds reasonable. But the devil's in the details. There can be efficiency issues lurking in otherwise reasonable structure/schema assignments.

I really strongly encourage adding -tablerangesize and -indexrangesize with appropriate values. Note though that with a fairly large schema and user count this change will somewhat increase your shared memory size. Test-start a schema holder DB on a test box with your production params to see how much of a change it is for you. Once you restart your prod DB with these params, start looking at your CRUD stats. You might learn a lot of interesting things about your application. ;) You'll also be better informed about whether certain aspects of your DB configuration are appropriate.

Finding appropriate values:
Code:
find dictdb._statbase no-lock.
find last dictdb._file no-lock where _file._tbl-type = "T" use-index _file-number.
find last dictdb._index no-lock where not _index._index-name begins "_" use-index _index-number.

display
  "Current -basetable  : " _statbase._tablebase                                               skip
  "Current -baseindex  : " _statbase._indexbase                                               skip
  "Highest table number: " _file._file-number                            format "->>>>>>>>>9" skip
  "Highest index number: " _index._idx-num                               format "->>>>>>>>>9" skip
  "Min -tablerangesize : " _file._file-number - _statbase._tablebase + 1 format "->>>>>>>>>9" skip
  "Min -indexrangesize : " _index._idx-num    - _statbase._indexbase + 1 format "->>>>>>>>>9" skip
with no-labels.

We are in a VMware environment and the database is sharing IO with other databases and applications. The disks of this system are all currently using the same type of storage on the vm hosts. Moving the after image file system/disk to storage optimized for database log files is on our todo list.
That's good, though it raises the question: where are they now? On a NAS?

Also, keeping AI areas, local backups, and local AI archive physically and logically separate from the database extents isn't just a performance optimization, it can reduce the damage caused by hardware/software/meatware problems. If the AI partition goes away, you still have your DB. If your DB partition goes away, you still have everything you need to recover it. But if they're all together in one place... :(
 
Last edited:

pinne65

Member
Thanks again!

Will gather some more stats, increase omsize, get stats again and compare. And go from there.

We are currently on a SAN, so we can move disks between different types of storage while online. So moving the AI disk will be the next easy step.

AI and database files are backed up to storage physically separate from the production SAN. And the database is replicated with OpenEdge Replication to a standby machine. The production server is also replicated to a 2nd DR site using Zerto. So I think we are good on the backup side of things.
 

pinne65

Member
I've been gathering stats from promon R&D activity since yesterday and was about to up the -omsize. But it doesn't seem like there's a need, or?:

Been running gus's test script from 11.3.2: default -omsize & high number of MTL_OM latches - Forum - OpenEdge RDBMS - Progress Community
and I don't see any waits at all...

_Latch-Name  _Latch-Lock  _Latch-Wait  latch/sec
───────────  ───────────  ───────────  ─────────
MTL_OM               746            0        746
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
MTL_OM               746            0          0
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
I'm not 100% clear on this, but I think you may not get accurate info on latch waits and naps without enabling latch timing in promon.

What do you see for OM in promon | R&D | debghb | 6 | 11?
 

TomBascom

Curmudgeon
Increasing -omsize is probably more of a "best practice" than an "urgent need".

And, of course, I recommend ProTop for keeping an eye on such things ;)
 