Question Query Performance

WesleySmith · Jun 7, 2013

Hi Everyone,

We are currently performing some analysis of our code base to see where we can improved performance and reduce database reads.

We have a table: Table1 which can contain a lot of records.
This table has an index, lets call it idx1, which has these fields: Field1, Field2, Field3, Field4, Field5, Field6, Field7, Field8
This index is not unique

Our query currently looks like this:

Code:

 FOR EACH Table1 WHERE Table1.Field1  = Variable1
                   AND Table1.Field2  = Variable2
                   AND Table1.Field3  = Variable3
                   AND Table1.Field4 >= 11
                   AND Table1.Field4 <= 31
                   AND Table1.Field5  = Variable4
                   AND Table1.Field6  = Variable5
                   AND Table1.Field7  = Variable6
                   AND Table1.Field8  = Variable7
                   NO-LOCK:
  <Do something>
END.

The value of Table1.Field4 will either be 11, 21 or 31

According to XREF the index being selected is idx1 but if my understanding of index usage is correct (and I may be wrong) it will only select the first 4 components of the index due to the range being used for Field4 - Please correct me if I am wrong.

So I am looking for ways to improve this.
One idea I have is to do this:

Code:

DO iLoop = 11 TO 31 BY 10:
  FOR EACH Table1 WHERE Table1.Field1  = Variable1
                    AND Table1.Field2  = Variable2
                    AND Table1.Field3  = Variable3
                    AND Table1.Field4  = iLoop
                    AND Table1.Field5  = Variable4
                    AND Table1.Field6  = Variable5
                    AND Table1.Field7  = Variable6
                    AND Table1.Field8  = Variable7
                    NO-LOCK:
    <Do something>
  END.
END.

I think this will then use all 8 components of the index and the end result will be a faster and more efficient query.

Am I correct, or is there another way?

Thanks all in advance.

TomBascom · Jun 7, 2013

You are correct.

To have all components used they must all be equality matches. Your method (elegantly) achieves that goal.

Rob Fitzpatrick · Jun 8, 2013

Just for the sake of argument...

I understand that the goal is to bracket further in the selected index and thereby reduce logical I/O. However, especially given that this table "can contain a lot of records", perhaps physical I/O should be considered as well.

Given the solution provided you are querying the table three times. Each time, the server loads the blocks satisfying the query into the buffer pool (if necessary). Granted, I don't know your primary index ordering, what kind of storage area this table resides in, the relative size of the buffer pool and the blocks that satisfy the queries, or the read activity of other users who may exert memory pressure on the blocks you have read. But if this is a very active database or if the buffer pool is small compared to the data being read, is it possible that you are reading the same block from disk multiple times, e.g. that block A contains a record that satisfies the first query (where field4 = 11) and it also contains one or more records that satisfy the other two queries but it is evicted from -B between queries?

In other words, is it possible to rewrite the query in such a way that it is fully brackets on idx1 and is equally efficient from the perspective of logical I/O but it only iterates over the table once? I don't know if such an edge case is worth considering, much less changing your code for it.

Is it possible the following (which is much less elegant) could be equally efficient logically and more efficient physically?

Code:

FOR EACH Table1 WHERE
 (Table1.Field1 = Variable1 AND
  Table1.Field2 = Variable2 AND
  Table1.Field3 = Variable3 AND
  Table1.Field4 = 11        AND
  Table1.Field5 = Variable4 AND
  Table1.Field6 = Variable5 AND
  Table1.Field7 = Variable6 AND
  Table1.Field8 = Variable7) OR
 (Table1.Field1 = Variable1 AND
  Table1.Field2 = Variable2 AND
  Table1.Field3 = Variable3 AND
  Table1.Field4 = 21        AND
  Table1.Field5 = Variable4 AND
  Table1.Field6 = Variable5 AND
  Table1.Field7 = Variable6 AND
  Table1.Field8 = Variable7) OR
 (Table1.Field1 = Variable1 AND
  Table1.Field2 = Variable2 AND
  Table1.Field3 = Variable3 AND
  Table1.Field4 = 31        AND
  Table1.Field5 = Variable4 AND
  Table1.Field6 = Variable5 AND
  Table1.Field7 = Variable6 AND
  Table1.Field8 = Variable7) 
  NO-LOCK:
    <Do something>
END.

tamhas · Jun 8, 2013

If -B is reasonably large and the table not ginormous, if a block with an 11 also contains a 21, it will be in memory anyway and no new disk read will occur.

WesleySmith · Jun 10, 2013

Thank you all for your valuable input.

I had not considered Physical I/O so thanks Rob for highlighting that but I have a couple of concerns about the less elegant approach

If we ever introduced a new value for Field4 (say 41) the programmer would need to replicate multiple, and the correct, lines of code from the where expression whereas using a loop they just need to increase that to loop to 41 instead of 31
If a new element needed to be added to the where expression the programmer would need to make sure they add it for each one.

So, for the moment I am going to proceed with the more "elegant" approach.

Question Query Performance

WesleySmith

New Member

TomBascom

Curmudgeon

Rob Fitzpatrick

ProgressTalk.com Sponsor

tamhas

ProgressTalk.com Sponsor

WesleySmith

New Member