Understanding of queries internal working

Shubham Pipada

New Member
Hi everyone,
I know my post could be quite annoying coz there are many and always asked questions about this topic.
Hi Tom i read your articles and few other on find and for statements.
But i am still confused with these.
So can you folks please help me understand a bit in detail as to how simple find with proper (unique)index/non-unique indexes, find first/last, for first/last, for with indexes,
and what are there differences in their working and their performances and also which ones are efficient!
 

TomBascom

Curmudgeon
It really isn't very complicated:

If you intend to fetch a single record to work on the most effective way to do that is a unique FIND statement. IMHO the Progress 4GL was successful compared to other 80s era tools precisely because they made this very, very simple and efficient:

The following is the heart and soul of quite a lot of business logic:
Code:
FIND customer NO-LOCK WHERE custNum = X no-error.
IF AVAILABLE( customer ) THEN
  // do something
 ELSE
  // handle the error case

(An unfortunately large amount of code makes no attempt to handle the potential error.)

FIND FIRST is mostly an abomination coded by lazy slackers who are either blindly following local worst practices or who fundamentally misunderstand what the statement is actually doing. I'm sure you can find a few threads here and there where I go into detail about that.

Every excuse anyone makes about why they should always put FIRST on every FIND is bullshit and some of that bullshit is actively harmful to correct operation of the application.

If you are adding FIRST to every FIND you are doing it wrong.

Once in a while FIND FIRST makes sense. It is not, however, appropriate to glue FIRST onto every FIND.

====

FOR EACH is the statement that you should be using to process a set of records. That result set _could_ have zero, one or many records.

FOR FIRST is even more of an abomination than FIND FIRST. It does not do what you hope that it does - namely it does not sort and then select. The FOR statement selects and then sorts. FIRST means that it will select exactly one record. Which will result in a null sort. To see what kind of issues this might cause run the following in the sports database:

Code:
for each customer no-lock:
  display custNum name discount.
end.
pause.

for first customer no-lock by discount:
  display custNum name discount.
end.

FOR FIRST is a red flag. It almost certainly means that the programmer is wrong about something. If you're lucky the programmer left a comment behind confessing to his (or her) sins.

====

When speaking of efficiency index selection rules are critically important. The most vital rule is simple -- the compiler will choose the index with the most leading components that participate in equality matches. This also happens to be the most natural rule to think about when your database is properly designed following 3rd normal form. IOW: Every field in a record is a fact about the key, the whole key and nothing but the key. So help me Codd.

There are more rules and things can get quite complicated but, IMHO, that one rule covers 99.44% of actual usage and it will serve you very well.
 

Shubham Pipada

New Member
It really isn't very complicated:

If you intend to fetch a single record to work on the most effective way to do that is a unique FIND statement. IMHO the Progress 4GL was successful compared to other 80s era tools precisely because they made this very, very simple and efficient:

The following is the heart and soul of quite a lot of business logic:
Code:
FIND customer NO-LOCK WHERE custNum = X no-error.
IF AVAILABLE( customer ) THEN
  // do something
ELSE
  // handle the error case

(An unfortunately large amount of code makes no attempt to handle the potential error.)

FIND FIRST is mostly an abomination coded by lazy slackers who are either blindly following local worst practices or who fundamentally misunderstand what the statement is actually doing. I'm sure you can find a few threads here and there where I go into detail about that.

Every excuse anyone makes about why they should always put FIRST on every FIND is bullshit and some of that bullshit is actively harmful to correct operation of the application.

If you are adding FIRST to every FIND you are doing it wrong.

Once in a while FIND FIRST makes sense. It is not, however, appropriate to glue FIRST onto every FIND.

====

FOR EACH is the statement that you should be using to process a set of records. That result set _could_ have zero, one or many records.

FOR FIRST is even more of an abomination than FIND FIRST. It does not do what you hope that it does - namely it does not sort and then select. The FOR statement selects and then sorts. FIRST means that it will select exactly one record. Which will result in a null sort. To see what kind of issues this might cause run the following in the sports database:

Code:
for each customer no-lock:
  display custNum name discount.
end.
pause.

for first customer no-lock by discount:
  display custNum name discount.
end.

FOR FIRST is a red flag. It almost certainly means that the programmer is wrong about something. If you're lucky the programmer left a comment behind confessing to his (or her) sins.

====

When speaking of efficiency index selection rules are critically important. The most vital rule is simple -- the compiler will choose the index with the most leading components that participate in equality matches. This also happens to be the most natural rule to think about when your database is properly designed following 3rd normal form. IOW: Every field in a record is a fact about the key, the whole key and nothing but the key. So help me Codd.

There are more rules and things can get quite complicated but, IMHO, that one rule covers 99.44% of actual usage and it will serve you very well.
Hi Tom,
I understood a bit from your answer. But still i have few doubts.
I got the point that FIND is to be used only if we know we have a unique record(by providing complete index).
But then if i am not sure whether i have only a single unique record, how should i check that? using for first / find first? I know that in this case where in if i dont have unique record, then simple find with unique index is gonna throw an error that NO table-name record found.
And as you said, for first would first select and then sort (which states that sorting is of no use coz its gonna give only 1 sorted record),does that mean that for first would use multiple indexes even if i dont specify any?
And if for first first selects and then sorts, is that the same case with find first i.e select and sort or the opposite?
And i also want to know that like for-each scans the whole table i.e for checking the condition specified by the where clause, does for first/find first
(if at all used)/find with unique index continues scanning the whole table even if the condition mentioned in where clause is satisfied in the middle of the table scan?
Thanks in advance.
 

Cringer

ProgressTalk.com Moderator
Staff member
To check for multiples / no records:

Code:
FIND customer NO-LOCK NO-ERROR. 
IF AVAILABLE customer THEN DO: 

  /*blah*/

END. 
ELSE DO: 
  MESSAGE SUBSTITUTE ("No record found. The record is &1.", TRIM (STRING (AMBIGUOUS (customer), "ambiguous/unavailable"))) VIEW-AS ALERT-BOX.
END.
 

Shubham Pipada

New Member
To check for multiples / no records:

Code:
FIND customer NO-LOCK NO-ERROR.
IF AVAILABLE customer THEN DO:

  /*blah*/

END.
ELSE DO:
  MESSAGE SUBSTITUTE ("No record found. The record is &1.", TRIM (STRING (AMBIGUOUS (customer), "ambiguous/unavailable"))) VIEW-AS ALERT-BOX.
END.
Thanks......it means that whenever using FIND, we must handle the error occurring situations.
 

TomBascom

Curmudgeon
But then if i am not sure whether i have only a single unique record, how should i check that? using for first / find first? I know that in this case where in if i dont have unique record, then simple find with unique index is gonna throw an error that NO table-name record found.

As a programmer you really ought to know if your INTENT when you write your code is to work with a single, unique, record or not. There really shouldn't be situations where you do not know. Pretty much the first thing you should do is to learn about and understand the data model and schema of the database that you are working with. That includes understanding the indexing and relationships between tables.

You should not be writing code where you are unsure of such things and trying to protect yourself by adding FIRST to FIND statements. In fact that is precisely the sort of thing that lazy slackers do. And one of the major reasons why they do it is because they have no clue what they are doing. Which is why their code sucks.
 

TomBascom

Curmudgeon
FOR EACH will use one or more indexes to "bracket" the data and read as little as necessary to resolve your query. If you omit the WHERE clause or write a WHERE clause that is not well supported by at least one index then a table scan will occur. (In 11.x+ you can also specifically force a table scan with the TABLE-SCAN keyword, it is rare but sometimes this is actually more efficient).

You can determine if the WHERE clause is using a bracketed query or a table-scan by compiling with XREF and checking the output for WHOLE-INDEX.

Again -- as a programmer you should have a pretty good idea what you are expecting to see. It is always good to confirm that with COMPILE XREF but you should not be throwing darts at the wall and praying for a miracle.
 

TomBascom

Curmudgeon
FIND does not select and then sort. FIND has no BY clause. FIND simply selects exactly one record or it returns an error. FIRST is ignored if the WHERE resolves to a unique index. If the WHERE is non unique then FIND FIRST simply stops after it fetches the first record.

Unless you FIND by RECID or ROWID OpenEdge *always* uses an index. The WHERE clause is evaluated at compile time to determine which index. (You can over-ride that by specifying USE-INDEX but that is also a "worst practice" that you should not adopt.)

When a programmer codes a FIND statement they are saying that they expect exactly one record to be returned. The WHERE clause is describing that record's key. In almost all cases that key should be unique. If it is not unique then the programmer is probably not a very good programmer and quite likely lacks much of an understanding of the data that he or she is working with.

(I said "almost all". There are a few cases where it is legitimate to use FIND FIRST and FIND LAST. But they are *exceptions*. It is NOT proper to glue FIRST onto every FIND.)
 

Shubham Pipada

New Member
FIND does not select and then sort. FIND has no BY clause. FIND simply selects exactly one record or it returns an error. FIRST is ignored if the WHERE resolves to a unique index. If the WHERE is non unique then FIND FIRST simply stops after it fetches the first record.

Unless you FIND by RECID or ROWID OpenEdge *always* uses an index. The WHERE clause is evaluated at compile time to determine which index. (You can over-ride that by specifying USE-INDEX but that is also a "worst practice" that you should not adopt.)

When a programmer codes a FIND statement they are saying that they expect exactly one record to be returned. The WHERE clause is describing that record's key. In almost all cases that key should be unique. If it is not unique then the programmer is probably not a very good programmer and quite likely lacks much of an understanding of the data that he or she is working with.

(I said "almost all". There are a few cases where it is legitimate to use FIND FIRST and FIND LAST. But they are *exceptions*. It is NOT proper to glue FIRST onto every FIND.)
Thank you so much for clearing up my doubts! Really explained very well!
Now i am clear and sorted with the things.
 
Top