ABL2DB

tamhas

ProgressTalk.com Sponsor
So, does the physical name (or logical name from the connect statement) end up in the XREF for the unqualified ones?
 

Stefan

Well-Known Member
Well, hopefully that is cured by the current version on OE Hive.

What about the "line 32768: -(" question?

Here is our code that handles the issues with the line numbers reported by CallStack:

Code:
               /* line is being returned as a 16-bit signed integer... note that if line > 65535 then we are lost */
               IF cline_nr = "-(":U   /* the runtime reports "-(" at exactly 32768 */
               THEN iline_nr = 32768.
               ELSE iline_nr = INTEGER (cline_nr).
               IF iline_nr < 0        /* wrapped: e.g. -1 maps back to 65535 */
               THEN iline_nr = 65536 + iline_nr.
 

tamhas

ProgressTalk.com Sponsor
So, what makes the listing file so much larger? Is it just the line wrap resulting in multiple lines in the listing and only one in the debug listing? Or is there a huge block/buffer/frame listing at the end too? I am guessing the 58K-to-112K growth is includes.
 

tamhas

ProgressTalk.com Sponsor
So, it really was "-(" ... how peculiar. I was thinking of doing something similar, except also keeping track of the last line number, so that if the new line number is less, we go on beyond 65K. I guess for *you* we might have to go to 2X or 3X 65K! :)
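
A sketch of that bookkeeping (variable names are hypothetical, and it assumes lines are reported in ascending source order):

Code:
/* iline_raw:  0-65535 value after the sign fix-up shown earlier */
/* iline_prev: previous raw value                                */
/* iwrap:      number of completed 65536-line rollovers          */
DEF VAR iline_raw  AS INT NO-UNDO.
DEF VAR iline_prev AS INT NO-UNDO.
DEF VAR iwrap      AS INT NO-UNDO.
DEF VAR iline_nr   AS INT NO-UNDO.

IF iline_raw < iline_prev
THEN iwrap = iwrap + 1.   /* reported number went down, so we rolled over */

ASSIGN
   iline_prev = iline_raw
   iline_nr   = iline_raw + iwrap * 65536.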
 

tamhas

ProgressTalk.com Sponsor
So, everything should be fine with a .df for exactcssh and the alias addition you made. No?
 

Stefan

Well-Known Member
So, what makes the listing file so much larger? Is it just the line wrap resulting in multiple lines in the listing and only one in the debug listing? Or is there a huge block/buffer/frame listing at the end too? I am guessing the 58K-to-112K growth is includes.

Yes, 58K -> 112K is includes.

112K -> 167K is page headers and 80-character wrapping.
167K -> 175K is the block summary - there are no frames, this is server-side business logic.

I may be off - I've never looked at LIST output before.
 

Stefan

Well-Known Member
So, it really was "-(" ... how peculiar. I was thinking of doing something similar, except also keeping track of the last line number, so that if the new line number is less, we go on beyond 65K. I guess for *you* we might have to go to 2X or 3X 65K! :)

Your approach will work if line numbers always increase monotonically. That may be sufficient for the .lst file - I do not know whether there are random-order references to these line numbers.
 

Stefan

Well-Known Member
Aha - results just came in on the benchmark:

Code:
---------------------------
Message
---------------------------
3654203 
1111986
---------------------------
OK  
---------------------------

Note that simply reading through the file entry by entry takes about 20 minutes. I love the ENTRY functions for their simplicity, but their performance can be dreadful.
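
The ENTRY-based loop itself is not shown above; a minimal reconstruction of what was presumably being timed (using the file path from the next post, and including the num-entries() call in the loop header that gets flagged later in the thread) looks like this:

Code:
DEF VAR lcc   AS LONGCHAR NO-UNDO.
DEF VAR cline AS CHAR     NO-UNDO.
DEF VAR ii    AS INT      NO-UNDO.

COPY-LOB FROM FILE "C:\Temp\abl2db\work\lst\finance\ifallgen\gatrss-general.p.lst" TO lcc.

ETIME( YES ).

/* each ENTRY( ii, ... ) call rescans the LONGCHAR from the start */
DO ii = 1 TO NUM-ENTRIES( lcc, "~n" ):
   cline = ENTRY( ii, lcc, "~n" ).
END.

MESSAGE "entry-by-entry read (ms):" ETIME VIEW-AS ALERT-BOX.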
 

Stefan

Well-Known Member
When reading the file with INDEX and SUBSTRING:

Code:
DEF VAR lcc AS LONGCHAR NO-UNDO.

COPY-LOB FROM FILE "C:\Temp\abl2db\work\lst\finance\ifallgen\gatrss-general.p.lst" TO lcc.

DEF VAR itime AS INT NO-UNDO EXTENT 5 INITIAL {&SEQUENCE}.
DEF VAR cline AS CHAR NO-UNDO.
DEF VAR ipos1 AS INT NO-UNDO INITIAL 1.
DEF VAR ipos2 AS INT NO-UNDO INITIAL 1.

itime[{&SEQUENCE}] = ETIME.

/* walk the LONGCHAR newline by newline, extracting each line */
DO WHILE ipos1 > 0:
   ipos2 = INDEX( lcc, "~n", ipos1 ).

   IF ipos2 > 0 THEN
      ASSIGN
         cline = SUBSTRING( lcc, ipos1, ipos2 - ipos1 )
         ipos1 = ipos2 + 1
         .
   ELSE /* last line - no trailing newline left */
      ASSIGN
         cline = SUBSTRING( lcc, ipos1 )
         ipos1 = 0
         .
END.

itime[{&SEQUENCE}] = ETIME.

MESSAGE
   "scan time (ms):" itime[2] - itime[1]
VIEW-AS ALERT-BOX.

The file is scanned in 892 ms - roughly 1,000 times faster than using ENTRY. Each INDEX call starts at the position where the previous line ended, so the LONGCHAR is traversed only once.
 

tamhas

ProgressTalk.com Sponsor
Boy, is that ugly. But, 1000 to 1 is hard to ignore. Entry was so nice. I may have to encapsulate this to hide it!

Thanks for your effort and contribution. I wish we had a whole community of people like you!
 

TomBascom

Curmudgeon
Additional testing... the error I got is misleading:

The third argument to INDEX or R-INDEX must be in the range 1 to 32767. (4687)

It *actually* blows up at roughly 1GB.

To simply count lines I am using:

Code:
define variable bigFile  as longchar no-undo.
define variable numLines as integer  no-undo.

define variable i as int64 no-undo initial 1.
define variable j as int64 no-undo initial 1.

etime( yes ).

copy-lob from file "/db/s2k0.lg" to bigFile.

message "read file:" etime.
hide message.

/* first, count lines with a single num-entries() call */
etime( yes ).
numLines = num-entries( bigFile, "~n" ).

message "count lines with num-entries():" etime "num lines:" numLines.
hide message.

assign
  numLines = 0
  i = 1
  j = 1
.

/* then count them again by scanning for newlines with index() */
etime( yes ).
do while j > 0:
  j = index( bigFile, "~n", i ).
  assign
    numLines = numLines + 1
    i = j + 1
  .
  if numLines modulo 1000 = 0 then
    put screen column 1 row 48
      string( numLines, ">,>>>,>>>,>>9" ) +
      " " +
      string( i, ">,>>>,>>>,>>9" )
    .
end.

message "count lines with index():" etime "num lines:" numLines i j.
hide message.

return.

This reads the file in around 4 seconds.

Once read, NUM-ENTRIES() takes less than 2 seconds to count 13 million lines.

INDEX() doesn't finish -- it blows up right around 1GB. But on smaller files it takes 10x longer (with the PUT SCREEN debugging removed).

Looking back through this thread, the test using entry() to pull out lines does seem to have num-entries() in the DO loop -- that's a problem.
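
A minimal sketch of the fix (variable names are hypothetical): hoist the num-entries() call into a variable so it is guaranteed to run exactly once, rather than appearing in the loop header.

Code:
define variable lcc      as longchar  no-undo.
define variable numLines as integer   no-undo.
define variable ii       as integer   no-undo.
define variable oneLine  as character no-undo.

copy-lob from file "/db/s2k0.lg" to lcc.

/* evaluate num-entries() exactly once */
numLines = num-entries( lcc, "~n" ).

do ii = 1 to numLines:
  oneLine = entry( ii, lcc, "~n" ).  /* still rescans from the start each call */
end.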
 

tamhas

ProgressTalk.com Sponsor
Yeah, I just ran the test on 11.4 and the INDEX scan handled the 174K-line file in .411 seconds, while the same test with ENTRY blew up after 30,957 entries. Curious.
 

TomBascom

Curmudgeon
Adding a little instrumentation and fetching each line unsurprisingly shows that ENTRY() scans the whole CLOB with each call. Obviously INDEX() with the "start at" parameter will avoid that but that will only work if the CLOB is smaller than 1GB.
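
A probe along those lines (the file path and positions are illustrative, and it assumes the file has at least a million lines): time a single ENTRY() call at increasing positions and watch the cost grow with the position.

Code:
define variable bigFile as longchar  no-undo.
define variable oneLine as character no-undo.
define variable i       as integer   no-undo.

copy-lob from file "/db/s2k0.lg" to bigFile.

/* if entry() really scans from the first character on every call, */
/* the time for a single call should grow linearly with i          */
do i = 100000 to 1000000 by 100000:
  etime( yes ).
  oneLine = entry( i, bigFile, "~n" ).
  message "entry(" i ") took" etime "ms".
end.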
 

Stefan

Well-Known Member
The file is 9.5 MB.

The problem is not getting the line count; it's reading the actual lines to do something with them. For every line, when using ENTRY, I expect Progress starts trundling through the file from the start to find entry N. With a file of 174,000 entries, each call takes longer and longer and longer.
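
Back-of-the-envelope, assuming ENTRY(n) scans n entries from the start: reading every line costs roughly 1 + 2 + ... + 174,000 ≈ 174,000² / 2 ≈ 1.5 × 10^10 entry traversals, versus a single pass for the INDEX/SUBSTRING loop - consistent with the three-orders-of-magnitude difference measured above.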
 