Table Join - totally frustrated!

mosquito

New Member
I need to extract data from the UNIX system we have and populate a SQL server database. Every day my DTS packages import the flat files that are pushed to the SQL server and insert into or update the database for the past few days. I don't actually have access to the UNIX box, but instead provide scripts to the on staff UNIX guy. However, I'm having a most difficult time trying to retreive data where I need a "group by" clause. Most of the code is very simple; similar to:

DEFINE BUFFER buys FOR BF.buys.
OUTPUT TO /buys.txt.
main-loop:
FOR EACH BF.buys NO-LOCK
WHERE BF.buys.BuyDate GE BegDtParam
AND BF.buys.BuyDate LE EndDtParam:
PUT CONTROL
BF.buys.cust#
"|"
BF.buys.sale#
"|"
BF.buys.BuyDate
"|"
BF.buys.cashier
"|"
BF.buys.item#
CHR(13) CHR(10).
END.
RETURN.


Say we run a beach front business that sells items such as gum, candy, and drinks and also services such as surfing lessons. Patrons purchase only one thing at a time so they often have many sales in a day. The Progress db system keeps track of each beach sale and also constantly updates a current trip table with daily patron totals. A patron comes in 2 months later and says he received bad surfing instructions. The 2 month old sale entry is changed to 1/2 price, which changes his trip data for that day 2 months ago. The system does not include a record change date in the trip table. There is however a record change date in the buys table. I know this is bad bookkeeping, but it illustrates what I'm trying to do.

The SQL code I would use is this:
SELECT Trips.Cust#, Trips.TripDate
FROM Buys INNER JOIN Trips ON
buys.Cust# = Trips.Cust#
AND Buys.BuyDate = Trips.TripDate
GROUP BY Buys.Cust#, Buys.BuyDate,Trips.Cust#, Trips.TripDate, Buys.RecordDate
HAVING Buys.RecordDate between '2007-01-25' and '2007-01-31'

Can anyone help me (please!)?
 

mosquito

New Member
The question is (if I wasn't clear in my first post): What would the Progress code look like for this join? please,please,please?
 

mikieb

New Member
Hi,
I'm not blessed with a lot of time at the moment, but looking at your SQL code I'm guessing the BREAK BY statement might be the right direction to look in .... something along the following lines:

Code:
FOR EACH Trips NO-LOCK:

    FOR EACH Buys NO-LOCK
    WHERE Buys.Cust# = Trips.Cust#
    AND Buys.BuyDate = Trips.TripDate
    AND Buys.RecordDate >=  DATE("25/01/2007")
    AND Buys.RecordDate <=  DATE("31/01/2007")
    BREAK BY Buys.BuyDate BY Buys.RecordDate:

        IF LAST-OF(Buys.BuyDate) OR LAST-OF(Buys.RecordDate) THEN
            PUT CONTROL Buys.Cust# "|" Buys.BuyDate CHR(13) CHR(10).
    END.
END.
Hopefully that'll either give you a few ideas or maybe get some even better ones from some of the others on this esteemed website.
Good luck.
MB
 

mosquito

New Member
Thank you! :)

I was getting discouraged with no reply when I left the office on Tuesday. I was off yesterday. I was so happy this a.m. to find a response with something I could try! I fixed my script and sent it off to the UNIX guy just now. Hope it works!
 

mosquito

New Member
Well, the UNIX guy said the script had been running for about an hour, so I told him to just stop it. That's too long, even if it were to work. There's around 60,000 "buys" records in this period and only 4,000 "trips" records. There's a total of 19 million "buys" records in the entire table, though. Part of the problem may be that the record change date "RecordDate" is not an indexed field. BuyDate, Cust#, & TripDate are all indexed. Normally the 5 day period Progress extract "buys" script takes about 20 minutes (and thats using the un-indexed field - WHERE RecordDate is between a begin and end date paramater).

Any other ideas?
 

tamhas

ProgressTalk.com Sponsor
Long running queries are a sure sign that you are not correctly using the indexes of the tables in question. To help you on this, you will have to tell us what those indexes are.
 

mosquito

New Member
Long running queries are a sure sign that you are not correctly using the indexes of the tables in question. To help you on this, you will have to tell us what those indexes are.

Here are the indexes that I get from the data dictionary for these two tables:
============================= Table: buy ==============================
Flags: <p>rimary, <u>nique, <w>ord, <a>bbreviated, <i>nactive, + asc, - desc
Flags Index Name Cnt Field Name
----- -------------------------------- --- ---------------------------------
audit 3 + sycom#
+ buydate
+ buy#
buydate 4 + sycom#
+ cust#
+ buydate
+ buy#
cust 3 + sycom#
+ cust#
+ buy#
pu buy 2 + sycom#
+ buy#

============================= Table: trip ==============================
Flags: <p>rimary, <u>nique, <w>ord, <a>bbreviated, <i>nactive, + asc, - desc
Flags Index Name Cnt Field Name
----- -------------------------------- --- ---------------------------------
pu cust 3 + sycom#
+ cust#
+ tripdate
trip 2 + sycom#
+ tripdate
 

tamhas

ProgressTalk.com Sponsor
So, if I am reading that correctly, both of these tables have sycom# as the first field in every index and you have not included sycom# in your join specification, so it will do a full table scan. Try adding buys.sycom# = trips.sycom# to the join specification.

I am hoping this is a partial query since you are returning no fields from buys.
 

mosquito

New Member
Here's the whole script that I submitted this a.m. (execpt I shortened it and excluded the other 20 fields from the trip table that I want returned). I don't want any fields returned from buys, just the trips that were affected by any buys with a record change date in my range. Where should I "Try adding buys.sycom# = trips.sycom#" in this code?

DEF INPUT PARAM vipSycom# AS CHAR NO-UNDO.
DEF INPUT PARAM vipBegDt AS DATE NO-UNDO.
DEF INPUT PARAM vipEndDt AS DATE NO-UNDO.
DEF OUTPUT PARAM qbf-count AS INT NO-UNDO.

DEFINE VARIABLE qbf-governor AS INTEGER NO-UNDO.

DEFINE VARIABLE qbf-govcnt AS INTEGER NO-UNDO.
DEFINE VARIABLE qbf-loop AS INTEGER NO-UNDO.
DEFINE VARIABLE qbf-unknown AS CHARACTER NO-UNDO INITIAL ?.
DEFINE VARIABLE qbf-time AS INTEGER NO-UNDO.
DEFINE BUFFER trips FOR BF.trips.
ASSIGN
qbf-count = 0
qbf-governor = 0
qbf-time = TIME.
OUTPUT TO /trips.txt.
main-loop:
FOR EACH BF.trips NO-LOCK
WHERE BF.trips.sycom# = vipSycom#:
FOR EACH BF.trips NO-LOCK
WHERE BF.buys.sycom# = vipSycom#
AND BF.buys.cust# = BF.trips.cust#
AND BF.buys.buydate = BF.trips.tripdate
AND BF.buys.recorddate GE vipBegDt
AND BF.buys.recorddate LE vipEndDt
BREAK BY BF.buys.buydate BY BF.buys.recorddate:
IF LAST-OF(BF.buys.buydate) OR
LAST-OF (BF.buys.recorddate) THEN

qbf-count = qbf-count + 1.
PUT CONTROL
BF.trips.cust#
"|"
BF.trips.tripdate
CHR(13) CHR(10).
END.
END.
 

tamhas

ProgressTalk.com Sponsor
I thought you were trying to do this in SQL?

You are almost accomplishing *almost* the same thing with

FOR EACH BF.trips NO-LOCK
WHERE BF.buys.sycom# = vipSycom#

But, as a matter of form, I would always join the fields in the two tables which are being joined, not referring back to the external variable, i.e.,

where bf.buys.sycom# = trips.sycom#

Not, that *almost* is as good as a miss here since you didn't specify the value of the sycom# field in the trips table, so it will do a full table scan.
 

mosquito

New Member
:blush1: So, if I change it to:

main-loop:
FOR EACH BF.trips NO-LOCK
WHERE BF.trips.sycom# = vipSycom#:
FOR EACH BF.trips NO-LOCK
WHERE BF.buys.sycom# = BF.trips.sycom#
AND BF.buys.cust# = BF.trips.cust#

then you think it will run faster?
 

tamhas

ProgressTalk.com Sponsor
Way faster.

Are you handing in ABL code to be run on the host or are you running something from your client, e.g., via a SQL client?
 

mosquito

New Member
Are you handing in ABL code to be run on the host or are you running something from your client, e.g., via a SQL client?

Hmm, greek to me. Not sure what you are asking.

I work for a company that purchased this Progress based UNIX system a few years ago. We originally outsourced the "UNIX to SQL server" data warehousing component, but never really got the product we expected. Now, I'm trying to build the SQL server database. I use Cognos BI tools to produce management reports from the SQL server data.
 

tamhas

ProgressTalk.com Sponsor
In your original example, you posted both ABL and SQL versions of the code. In your updated example, it was ABL. This and the reference to "handing" over to the Unix guy make me think that you are doing extracts to filesystem files and then importing from that. One way to do it, certainly, but if you had a SQL client on your own workstation, you could write your own SQL queries without involving anyone on the Unix side. If the SQL database is just copies of data from the OE database on Unix, then you probably don't even need the SQL database. Just run your BI tool directly against the Progress database.
 

mosquito

New Member
I'm not really sure how to explain it. I'm learning as I go. I've got a picture in my mind of a UNIX box sitting next to the IT professional that I've been told is our UNIX guy. I don't know what his capabilities are, but I know he's not a Progress guy. I just know that if I send him a .p file, he'll try to run it for me and put the resulting .txt file on a server that I have access to. :eek:
 

tamhas

ProgressTalk.com Sponsor
Which is the least effective form of integration. For a nominal sum, you could have a 4GL client on your workstation, connect to the database, and submit your own queries without involving the Unix guy at all, beyond the initial setup. You could also have a SQL client in your workstation and connect with your BI tool directly to the Progress database on the Unix box, skipping the whole import-export bit. The only reason to use a separate box for this stuff is that the volume of BI work would create a drag on the production database. If you are running a lot of whole table scans unintentionally, that might be true! :) But, often it isn't. To me, this whole export-import bit is a lot of extra work, error prone, implies out of date data, and is generally not necessary or useful.
 

mosquito

New Member
I'm sure there are better solutions than what we are doing, but for the time being, this is what I have. Almost 2 years ago, the computer running the Progress database crashed. It took several days for the IT dept to get it back up and running. All the .p scripts that ran daily to pull data for the server I had access to were lost. Everything except those scripts were backed up, so the database itself was okay.

I was mining canned reports to get data. Finally, in August, I got access to the SQL server and started writing my own .p scripts. This is way better than mining reports to update a database!
 

mosquito

New Member
How do I tell the UNIX guy to do a compile with XREF? Can I send him some code? Otherwise, he won't know what to do.
 
Top