Question Cleanup after using a dataset

Documentation says:
There are two recommended practices for cleaning up after using datasets in your code:
DETACH-DATA-SOURCE
EMPTY-DATASET


IMHO, it’s true only partially.

I wrote demo.p that runs the procedure A that calls procedure B. Procedure B creates a dataset with one temp-table record and returns it to A as “OUTPUT DATASET-HANDLE hDataset”.

Through the code’s execution demo.p checks the number of the session’s objects:
SESSION:FIRST-<Object> + NEXT-SIBLING.
Namely FIRST-BUFFER and FIRST-DATASET.

Result:
Code:
 # Breakpoint               Datasets Buffers
 1 Test is beginning...            0    0
 2 A: Before OUTPUT DATASET-HANDLE 0    0
 3 B: Before CREATE TEMP-TABLE     0    0
 4 B: After CREATE TEMP-TABLE      0    0
 5 B: Before CREATE DATASET        0    1
 6 B: After CREATE DATASET         1    1
 7 B: After DETACH-DATA-SOURCE     1    1
 8 A: After OUTPUT DATASET-HANDLE  2    2
 9 A: After GET-BUFFER-HANDLE      2    2
10 A: Before EMPTY-DATASET         2    2
11 A: After EMPTY-DATASET          2    2
12 A: After DELETE hDataset        1    1
13 Test is finished                1    1
Checking from Progress editor      1    0
Procedure A cleans up after using a dataset but I don’t see how it would be possible to do the same in procedure B. Each run of it leaves a dataset with its buffer(s) in memory and its records in DBI files.

A bit more complicated demo program allows to run the procedure B on appserver (classic or PAS) and it confirms that the sizes of DBI files of the session running procedure B keeps growing with every its call. All records created by all calls are stored in DBI file. The only way to clean up is to restart a session. The customer's case: they got a DBI file of about 100 GB.

After running demo.p we are going back to Progress editor. All instances of the previously created datasets are still available for the session. But editor cleans up the buffers!
Code:
DEFINE VARIABLE hDataset  AS HANDLE NO-UNDO.
DEFINE VARIABLE hBuffer   AS HANDLE NO-UNDO.

ASSIGN hDataset = SESSION:FIRST-DATASET.
DISPLAY "VALID-HANDLE(FIRST-DATASET):" VALID-HANDLE(hDataset) SKIP.

IF VALID-HANDLE(hDataset) THEN
ASSIGN hBuffer = hDataset:GET-BUFFER-HANDLE(1).
DISPLAY "VALID-HANDLE(GET-BUFFER-HANDLE):" VALID-HANDLE(hBuffer) SKIP.

ASSIGN hBuffer = SESSION:FIRST-BUFFER.
DISPLAY "VALID-HANDLE(FIRST-BUFFER):" VALID-HANDLE(hBuffer) SKIP.
The following code will crash the session:

Code:
DEFINE VARIABLE hDataset AS HANDLE NO-UNDO.
ASSIGN hDataset = SESSION:FIRST-DATASET.
hDataset:EMPTY-DATASET.
Tested with V10.2B and 12.8 on Windows and Linux.

What I missed? How to clean up after a procedure that sends a dataset as OUTPUT DATASET-HANDLE?
 

Attachments

As James suggests, passing BY-REFERENCE may solve:


The 'TIPS FOR A PRO DATASET DEVELOPER.pdf' on TIPS FOR A PRO DATASET DEVELOPER | The OpenEdge Hive contains a few BY-REFERENCE examples that may be useful.

There is a tip here to use INPUT-OUTPUT parameters instead of OUTPUT parameters but as stated the downside is that the data will be passed along as well, which may result in an unnecessary overhead:

 
Is it any different if you pass it BY-REFERENCE?
Code:
# Breakpoint               Datasets Buffers
 1 Test is beginning...            0    0
 2 A: Before OUTPUT DATASET-HANI   0    0
 3 B: Before CREATE TEMP-TABLE     0    0
 4 B: After CREATE TEMP-TABLE      0    0
 5 B: Before CREATE DATASET        0    1
 6 B: After CREATE DATASET         1    1
 7 B: After DETACH-DATA-SOURCE     1    1
 8 A: After OUTPUT DATASET-HANDLE  1    1
 9 A: After GET-BUFFER-HANDLE      1    1
10 A: Before EMPTY-DATASET         1    1
11 A: After EMPTY-DATASET          1    1
12 A: After DELETE hDataset        0    0
13 Test is finished                0    0

With BY-REFERENCE a cleanup in A empties a dataset in B.

I'm thinking about a workaround - to sign a dataset handle:
Code:
  CREATE DATASET ophDataSet.
  ASSIGN ophDataset:PRIVATE-DATA = "my signature".

When the same appserver will get new request it can check the chain of the old dataset handles (SESSION:FIRST-DATASET) to find out the signed one to be re-used.
 
Ah, it may not solve with a remote procedure. A very old discussion suggests BY-REFERENCE is ignored and a deep copy happens:


If so, then the only option may to delete specifically as I am sure I read somewhere that OUTPUT DATASET-HANDLE parameters used without BY-REFERENCE needed to be deleted but cannot find the actual details.
 
I am sure I read somewhere that OUTPUT DATASET-HANDLE parameters used without BY-REFERENCE needed to be deleted
Code:
PROCEDURE B:
  DEFINE OUTPUT PARAMETER DATASET-HANDLE ophDataSet.
  <code>
END PROCEDURE. /* B */
It looks like a dataset is sent to a calling procedure at the END operator. We should not empty a dataset before the END. We can't empty a dataset after the END.
 
I'm thinking about a workaround - to sign a dataset handle:
Unfortunately, it turned out to be impossible to re-use or to delete the "forgotten" datasets from the chain in SESSION:FIRST-DATASET.
DELETE OBJECT hDataset has no effect.
hDataset:EMPTY-DATASET() crashes a session whatever I do with dataset before EMPTY-DATASET.

IMHO, it looks like a bug.
Code:
PROCEDURE B:
  DEFINE OUTPUT PARAMETER DATASET-HANDLE ophDataset.
  <code>
END PROCEDURE.
Progress itself should delete ophDataset at the END PROCEDURE provided the dataset was not sent to a caller BY-REFERENCE.

The bug does not allow to create the datasets on appservers without their periodical restarts. Any use of OUTPUT DATASET-HANDLE without BY-REFERENCE can create the problems in production due to unlimited disk space usage for DBI files.


A bit off-topic question.

The following code will create only one dataset (no CREATE DATASET statements needed):
Code:
RUN B(OUTPUT DATASET-HANDLE hDataset).
RUN B(OUTPUT DATASET-HANDLE hDataset).

The next code will rise the error 12314 for the second call of procedure B:
Code:
RUN B(OUTPUT DATASET-HANDLE hDataset).
DELETE OBJECT hDataset.
RUN B(OUTPUT DATASET-HANDLE hDataset).
Invalid DATASET-HANDLE parameter given. (12314)

Next code does not rise the error:
Code:
RUN B(OUTPUT DATASET-HANDLE hDataset1).
DELETE OBJECT hDataset1.
RUN B(OUTPUT DATASET-HANDLE hDataset2).

It looks like OUTPUT DATASET-HANDLE sometimes works like CREATE DATASET but other times doesn't.
Are the rules documented/explained somewhere?


Continue the off-topic...
Going back to the Progress editor after running our code cleans up the chain of SESSION:FIRST-BUFFER but does not clean up the chain of SESSION:FIRST-DATASET. Would it be a good idea for the editor to issue a warning if any of SESSION:FIRST-<object> chains is not empty? It would kindly inform us about the object leak in our code.
 
Unfortunately, it turned out to be impossible to re-use or to delete the "forgotten" datasets from the chain in SESSION:FIRST-DATASET.
DELETE OBJECT hDataset has no effect.
I was wrong. On return to the editor, Progress deletes all temp-tables linked to the orphaned datasets but it just forgets to delete the datasets themselves.

EMPTY-DATASET does crash a session when dataset does not have the temp-tables.

I'm thinking about a workaround - to sign a dataset handle:
The workaround does work!

In the callee I’m using:
Code:
DEFINE OUTPUT PARAMETER DATASET-HANDLE ohDataset.
RUN DatasetCleanup(PROGRAM-NAME(1)).
CREATE DATASET ohDataset.
/* Set "owner" of dataset: */
ASSIGN ohDataset:PRIVATE-DATA = PROGRAM-NAME(1).
where DatasetCleanup:
Code:
PROCEDURE DatasetCleanup:
  DEFINE INPUT PARAMETER ipDatasetOwner AS CHARACTER NO-UNDO.

  DEFINE VARIABLE hCurrDataset AS HANDLE NO-UNDO.
  DEFINE VARIABLE hNextDataset AS HANDLE NO-UNDO.

  ASSIGN hNextDataset = SESSION:FIRST-DATASET.
  REPEAT WHILE VALID-HANDLE(hNextDataset):

    ASSIGN hCurrDataset = hNextDataset
           hNextDataset = hCurrDataset:NEXT-SIBLING
    . /* ASSIGN */

    IF hCurrDataset:PRIVATE-DATA NE ipDatasetOwner THEN
    NEXT.

    IF VALID-HANDLE(hCurrDataset:GET-BUFFER-HANDLE(1)) THEN
    hCurrDataset:EMPTY-DATASET.

    DELETE OBJECT hCurrDataset.
  END. /* REPEAT */
END PROCEDURE. /* DatasetCleanup */
Now DBI is not growing for the repeated runs of a callee that uses OUTPUT PARAMETER DATASET-HANDLE.

Bonus:
DatasetCleanup procedure can be used in a caller as well:
Code:
RUN callee.p(OUTPUT DATASET-HANDLE hDataset /*BY-REFERENCE*/).
ASSIGN hDataset:PRIVATE-DATA = PROGRAM-NAME(1).
<deal with hDataset>
RUN DatasetCleanup(PROGRAM-NAME(1)).
DatasetCleanup will clean up all datasets “owned” by the procedure. It will work for BY-REFERENCE as well.
 
The only thing you need is to add the somewhat counter-intuitive delete object ophDataset to the end (or finally block) of procedure B. The delete is postponed until the procedure ends.

See first note of DELETE OBJECT statement:

When a temp-table object is passed as a table-handle output parameter, the deletion of the object following the DELETE OBJECT statement is postponed until the procedure goes out of scope. When the procedure returns, the table-handle is created, receives a copy of the original temp-table, and is then returned.
 
You looked to have solved the problem especially with the DBI no-longer growing.

You mention in one post that there may be a bug and this mentions a different bug so you may have uncovered another one:


I have been trying to find the statement about deleting the dataset if not using BY-REFERENCE but have been unable to. There is one I found where it says that using DATASET-HANDLE can be a deep copy and the callee is responsible for deleting the object. As you have highlighted the bonus with your DatasetCleanup procedure is it can be used in a caller as well. Although not the one I was thinking of the deep copy information may be useful to know so will list for reference:


Anyway, the good news is you have come up with a very good solution.
 
The only thing you need is to add the somewhat counter-intuitive delete object ophDataset to the end (or finally block) of procedure B. The delete is postponed until the procedure ends.
Bingo!
Code:
  DELETE OBJECT ophDataSet.
END PROCEDURE. /* Callee */

Tested: a caller gets all records from ophDataSet and then a callee deletes the dataset

Thank you!
 
Back
Top