Selection using UTF-8

petervdh · Aug 1, 2011

Hello,

We are using a database UTF-8 encoded, connected clients are also using UTF-8.

Resently we upgraded to OE10.2B SP 4 and we have the following problem.

When using a selection like below:

for each <table> no-lock where <table>.<charfield> >= charvar1 and <table>.<charfield> <= charvar2 :
<process selection>
end.

Before executing the selection validation is performed on charvar1 and charvar2. One of the things happening during this validation is:

IF (charvar2 = "" AND
charvar1 = "" ) OR
charvar = "*"
THEN ASSIGN charvar2 = FILL(CHR((IF SESSION:CHARSET = "UTF-8" THEN 15712191 ELSE 254)),20).
ELSE IF charvar2 = ""
THEN ASSIGN charvar2 = charvar1.
ELSE ASSIGN charvar2 =
charvar2 + FILL(CHR((IF SESSION:CHARSET = "UTF-8" THEN 15712191 ELSE 254)), 20 - LENGTH(charvar2)).

So after this, charvar1 is empty "" and charvar2 is filled with 20 times chr(15712191).

When executing above selection width this the query doesn't return all expected records.

This is "old" code and has always worked fine, up until now.
Any ideas what might cause this or how to fix this.

Thanks for your time.

Kind regards,
Peter.

trx · Aug 1, 2011

I would check why expected record doesn't appear in query with simple for each run from plain _edit.r:

Code:

on find of <table> override
do:
end.

<fill charvar1 and charvar2>

for each <table> no-lock where 
        recid(<table>) = expected record recid 
  and <table>.<charfield> >= charvar1 
  and <table>.<charfield> <= charvar2 :
  disp 1.
 end.

Either you have changed startup parameters or there are some additional conditions which reject this record from result list.
Also check whether SESSION:CHARSET is really "UTF8" and what is collation table (-cpcoll).

petervdh · Aug 2, 2011

The environment I use is Windows, client and database server.

SESSION:CHARSET is "UTF-8" and SESSION:CPCOLL is "Basic".

When I change the code to fill charvar2 with chr(127) - chr(254) doesn't work in an UTF-8 environment - or plain "z" the selection seems to works fine, all expected records are returned. <table>.<charfield> only contains plain ascii characters. Something seems to break when using UTF-8 characters to limit the selection, is anybody aware of this?

I've tried your code, when I include the recid of a missing record it gets returned, which is normal, I think, since progress ignores all the rest when specifying a recid or rowid in the selection.

I've moved the display to the find trigger. This displays the records found, but this doesn't display the "missing" records.

rstanciu · Aug 2, 2011

I'm using UTF-8 too and I take careful with the code source saving.
If you say FOR EACH Customer where Name BEGINS "ç" ... and if I save this file on charset iso8859-1
for execute in utf-8 ... I have a problem.

When I convert the database I do also:

echo "(15/17) apply convert to word-index";
$DLC/bin/proutil -C wbreak-compiler $DLC/prolang/convmap/utf8-bas.wbt 254 >/dev/null 2>&1;
$DLC/bin/proutil $DBLOCATION/$DBN -C word-rules 254 >/dev/null 2>&1;

Stefan · Aug 3, 2011

Interesting from an academic point of view (and of course to keep your app running as it was) but why is the max clause being added at all? Use a dynamic query and optimize the where clause:

if charvar1 is empty skip it
if charvar2 is empty skip it
if charvar1 equals charvar2 use an equality match

Do you have records filled with CHR(255) / UTF-max fills to indicate something else?

Selection using UTF-8

petervdh

New Member

trx

Member

petervdh

New Member

rstanciu

Member

Stefan

Well-Known Member