[Progress Communities] [Progress OpenEdge ABL] Forum Post: Question about the performance of ABL queries resolved by multiple indexes

dbeavon · Mar 17, 2020

I've been looking for KB articles to explain what is going with queries that are resolved by multiple indexes. The articles I've found don't get that deep into the implementation details. We have a table that has, among many other indexes, three indexes that are composed of a single column each: fy, period, and div. Here is an ABL query that interacts with these three indexes: FOR EACH fin_gl WHERE fin_gl.fy = 20190 AND fin_gl.period = 1 AND (fin_gl.div = "E212") /* (fin_gl.div + "X" = "E212X")*/ EXCLUSIVE-LOCK: EXPORT STREAM S1 fin_gl. END. According to xrefs, and according to table/index stats, this is using all three indexes. What is confusing to me is how quickly this actually executes I thought that using separate indexes to this degree might actually *decrease* performance. I thought it would resolve the query on all three indexes separately and then do a hash-match of the resulting ROWID's. I can query the three columns individually in SQL like so: SELECT count(*) FROM pub.fin_gl WHERE fy = 20190 SELECT count(*) FROM pub.fin_gl WHERE period = 1 SELECT count(*) FROM pub.fin_gl WHERE div = 'E212' These take about ten seconds each. And the selectivity is bad for all of them. The first returns ~3 million rows and the next two return ~1 million rows apiece. The final results when combining all three criteria together are about ~1000 rows. So based on an approach where you would run these three SQL statements individually and combining the ROWID of the results, I would have thought things would take a worst case of thirty seconds. At the very minimum they should take ten seconds. However the ABL query runs in just around 100 ms! If anyone understands why this takes less than ten seconds on three independent indexes, I'd love to hear it! Below are the highlighted stats that are generated after running my query. Notice that there were 1000 base rows that were returned. Also notice that none of the index reads get anywhere near 1 million. fin_gl4 is for the fy column, fin_gl5 is for the period column, and fin_gl6 is for the div column. I would have thought that at least one of the indexes would have a million reads, prior to joining against the others. However the largest number of reads for any of the indexes is 2782. The only thing I can think of is that there is additional information encoded/included in the indexes (like a SQL Server index does with "INCLUDE" options). Or perhaps the ROWID itself has some composite details encoded within it, like page information, and that could be hash-matched at a higher level, prior to hash-matching the ROWID's themselves. If one of these things are taking place then I'm wondering if/how they would be counted in the index stats. PS. Any tips would be appreciated. Someone went hog-wild about ten years ago and added a whole bunch of single-column indexes on this table. I was skeptical about their value, and hoped they would NOT show up in our XREF's at all. Once I discovered that they DO show up, I hoped the related programs would be super-SLOW so that I could rationalize the removal of these indexes in favor of conventional ones. However they are NOT actually as slow as I hoped, and I'm reconsidering whether they should be removed at all. They do appear to be used in very isolated circumstances and I still think that many of the other indexes would be suitable here as well... but I'm not eager to take risks since this is a fairly critical table. What a mess.

Continue reading...

[Progress Communities] [Progress OpenEdge ABL] Forum Post: Question about the performance of ABL queries resolved by multiple indexes

dbeavon

Guest