[progress Communities] [progress Openedge Abl] Forum Post: Re: Error Trying To Load Data...

Aidan Jeffery
The ICU collations can only be applied to databases configured with the utf-8 code page.

> And Yes I think we have some 1252 characters for some databases

I think this could be the root of some of the problems you are running into. Think of iso8859-1 as a subset of 1252: 1252 defines characters in the range 128-159 (decimal) that are not defined in iso8859-1. Since the database is defined as iso8859-1, any characters in the 128-159 range will fail to be converted correctly to UTF-8 when you dump the data. After that, of course, they won't load correctly into a utf-8 database.

To get an idea of the differences, start a Procedure Editor session with -cpinternal utf-8, and run this code:

    DEFINE VARIABLE decval      AS INTEGER   NO-UNDO.
    DEFINE VARIABLE ch8859-1    AS CHARACTER NO-UNDO.
    DEFINE VARIABLE ch1252      AS CHARACTER NO-UNDO.
    DEFINE VARIABLE utval8859-1 AS INTEGER   NO-UNDO.
    DEFINE VARIABLE utval1252   AS INTEGER   NO-UNDO.

    MESSAGE "CPINTERNAL =" SESSION:CPINTERNAL VIEW-AS ALERT-BOX.

    REPEAT decval = 122 TO 256:
        /* Convert the same value to the session code page, first treating it
           as iso8859-1 and then as 1252. */
        ch8859-1 = CHR(decval, SESSION:CPINTERNAL, "iso8859-1").
        ch1252   = CHR(decval, SESSION:CPINTERNAL, "1252").

        /* Numeric value of each converted character in the session code page. */
        utval8859-1 = ASC(ch8859-1).
        utval1252   = ASC(ch1252).

        DISPLAY decval ch8859-1 utval8859-1 ch1252 utval1252 WITH FONT 2.
    END.

The important thing here is for the conversion process to be told the correct code page that it is converting the data from, wherever the conversion is done. Another consideration is that the conversion can be done when the data is loaded into the new utf-8 database.

Here is the process that I suggest. I assume your clients normally start with -cpinternal iso8859-1 -cpstream iso8859-1. Please correct me if I am wrong.

1. When dumping the data from the original databases, start your session with -cpinternal iso8859-1 -cpstream 1252. When dumping the data, do not use the MAP option; it will default to 1252. No character conversions will occur, but the data will be (correctly) flagged as 1252 data at the bottom of the *.d files.
2. Set up the new databases with code page utf-8 and the utf-8 word-break table.
3. Load the collation definitions from ICU-UCA.df. Rebuild the indexes. Deactivate all indexes except the primary index. When loading the data into the new utf-8 databases, start your session with -cpinternal 1252 -cpstream 1252.
4. After loading the data, rebuild all the indexes.
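
For readers without an OpenEdge session to hand, the 128-159 difference described above can also be seen outside ABL. The following is a minimal Python sketch, not part of the original post. It relies on Python's built-in iso8859-1 and cp1252 codecs, and the function name compare_code_pages is made up for this illustration. Python's cp1252 table leaves five byte values unassigned; it is an assumption that OpenEdge's conversion tables treat those bytes the same way, so take the output only as a rough guide to which values are affected.

```python
def compare_code_pages(start=128, end=159):
    """Decode each byte value as iso8859-1 and as cp1252.

    Returns a list of (byte value, iso8859-1 code point, cp1252 code point),
    where the cp1252 entry is None if Python's cp1252 table leaves the
    byte unassigned.
    """
    rows = []
    for value in range(start, end + 1):
        raw = bytes([value])
        # iso8859-1 maps every byte straight to the same Unicode code point,
        # so 128-159 become C1 control characters.
        latin1_cp = ord(raw.decode("iso8859-1"))
        try:
            # cp1252 maps most of 128-159 to printable characters instead.
            cp1252_cp = ord(raw.decode("cp1252"))
        except UnicodeDecodeError:
            cp1252_cp = None
        rows.append((value, latin1_cp, cp1252_cp))
    return rows


if __name__ == "__main__":
    for value, latin1_cp, cp1252_cp in compare_code_pages():
        shown = "unassigned" if cp1252_cp is None else f"U+{cp1252_cp:04X} {chr(cp1252_cp)}"
        print(f"{value:3d}  iso8859-1: U+{latin1_cp:04X}  1252: {shown}")
```

Calling compare_code_pages(160, 255) instead shows the two code pages agreeing on every value, which is why only data in the 128-159 range is at risk when the source code page is mislabelled.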
