Answered For each using only lookup = no index

bigwill

Member
Hi all

We have a rather complex search site that searches in multiple tables. The query for this search is buildt up dynamiclly, then executed and the result is returned to a .NET site as xml. (I use appserver for calling the search and returning the search result).

Sometimes a user creates a search where every criteria has multiple values. My "query-builder" ends up with this:

Code:
for each Activity no-lock  
  where  index(string(activity.a-closecode),"0,1") > 0
    and  lookup(string(activity.WorkFlowDefinitionID), "7621e6f7-4398-453a-b424-dcaeb26b6c39,aa9a25f8-0202-4f7e-8d41-4c0977b02792,b890ed0d-29ec-43e8-84ba-06ed1ae17724") > 0 
    and  lookup(activity.QWorkFlowItemName, "UL Close,UL Complete") > 0 
    and  lookup(activity.GroupAssigned, "Feso AS,Tel2Cel") > 0:
   
end.
(there are index on both WorkFlowDefinitionID, QWorkFlowItemName and GroupAssigned)

As you can see i use lookup since there are more then one values. This is a very slow query. On our system, 45 sec. If i rewrite and use and (...or ....or ...or) and (...or ...or ...or ) it is still slow. Around 25 sec.

I have tried to set use-index xxxx for this search, but that dosn't help.

Is there a way to speed things up here ? How can i get progress to use correct index here ?
I need a solution when every criteria has multiple values.

Thank you.

Lars E.
 

Cringer

ProgressTalk.com Moderator
Staff member
As soon as you start adding ORs or functions to a query Progress has a hard time selecting good indexes and you end up invariably with a whole table read. My personal solution for the above would be to load your multi values into temp-tables and join them.
 

TomBascom

Curmudgeon
Personally, I cannot begin to answer that without knowing what indexes are available on the table.

And it would be useful to see what your 20 second faster (but still not fast enough) query actually looks like.

Lastly a bit of information about what is stored in these fields and how it is stored would be helpful.
 

GregTomkins

Active Member
We have this problem in various forms in a few places, and I hate it. Sometimes there's no good solution. At least in your case there's only the one table (but let me guess, it has 200 million records in it).

I think that INDEX in the WHERE will cause big problems (unless the criteria is met on enough records, found quickly enough searching naively, that nobody notices).
 

TheMadDBA

Active Member
The short version... INDEX or LOOKUP will not allow you to use an index. Specifying USE-INDEX doesn't change the rules of how indexes work. Your query as constructed is reading every record in that table, every time the query runs.

Progress loves equality matches. You can use OR but there are rules on when it will use indexes and not. The documentation covers this pretty well.

Step 1 is to get rid of the INDEX and LOOKUP functions and convert those into table.column = <the value> with no datatype conversions, Without that you are pretty much doomed to read all of the records in the table.

When I run into problem queries like this I create a temp-table or temp-tables with the different possible permutations of the search values (not all possible in the data) and join those temp-tables to the base table.

Define a temp-table for each of your input values (ttCloseCode, ttWorkFlowID, etc) and populate those temp-tables values.

Code:
for each ttCloseCode, 
      each ttWorkflowID,
      each ttWorkflowItem,
      each ttGroup,
       each activity where
               activity.a-closecode                      = ttCloseCode.a-closecode AND
               activity.WorkFlowDefinitionID = ttWorkFlowID.WorkFlowDefinitionID AND
               activity.QWorkFlowItemName = ttWorkflowItem.QWorkFlowItemName AND
               activity.GroupAssigned             = ttGroup.GroupAssigned NO-LOCK:

If you have indexes on each of those fields or a compound index on those 4 fields then you are in pretty good shape. If you don't you can at least add the appropriate indexes and have the code use it.
 

tamhas

ProgressTalk.com Sponsor
Depending on where you are running this, one option might be to decompose the query into multiple queries, each using an equality match and then compose the result set yourself.
 

bigwill

Member
Ok. I also had the idea with temp-tables, but that means i have to rewrite so much :) My queries are buildt up based on 30-40 input fields, but i guess that i can rewrite and use temp-tables on my most used fields.
I did a small test and my seach went from 40 sec to around 7. I am aiming at 3 sec, so probably have to add more temp-tables.

Thank you all for input.

Lars E.
 

Cringer

ProgressTalk.com Moderator
Staff member
Have a look at ProTop (http://www.dbappraise.com/protop.html) it has user bound table and index usage statistics. You can use it to see exactly what a particular query is reading (assuming -tablerangesize and -indexrangesize are set correctly). It is a very easy way of tracking down bad queries.
 

tamhas

ProgressTalk.com Sponsor
I know you are trying to avoid a big rewrite, but you might want to look at Alon Blich's talk here on a Query Optimizer since this seems to relate to your problem type.
 

andre42

Member
You can actually rephrase the query into disjunctive normal form, that usually works, even with different indexes for the conjunctive clauses. But in your case you will get a huge query. As an example I rephrased part of your query:

Code:
for each Activity no-lock 
  where  (activity.QWorkFlowItemName = "UL Close" and activity.GroupAssigned = "Feso AS")
    or (activity.QWorkFlowItemName = "UL Close" and activity.GroupAssigned = "Tel2Cel")
    or (activity.QWorkFlowItemName = "UL Complete" and activity.GroupAssigned = "Feso AS")
    or (activity.QWorkFlowItemName = "UL Complete" and activity.GroupAssigned = "Tel2Cel")  :
end.

Maybe this is feasible when using a query builder, but I wouldn't be surprised if Progress chokes on huge WHERE-clauses.
 
Top