Importing and Exporting text of Other Languages into a Progress Database

Hello,
We have had a long-standing issue with importing data into our Progress database (OE and earlier).
When we bring the text in, the special characters do not transfer correctly. We can view these characters in our ERP system, but when we try to import characters of this type, they come through wrong. This may be a function of our OS (Linux), but I have not been able to figure out where to start looking.

Thanks!
 

RealHeavyDude

Well-Known Member
Most likely you are experiencing a code page issue.

You can use different code pages for your database and your client. Progress does the code page conversion for you automatically - when it can. That means a lossless conversion is only possible when every character in one code page also exists in the other and vice versa - for example ibm850 and iso8859-1, which cover the same character set but use different byte values to represent the characters.
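The ibm850 / iso8859-1 point can be demonstrated outside Progress. A minimal Python sketch (Python used only for illustration, nothing here is ABL) showing the same character encoded to different byte values in each code page:

```python
# The same character maps to different byte values in ibm850 and
# iso8859-1, which is why conversion is needed even though the two
# code pages cover the same set of characters.
text = "Müller"  # contains a character outside plain ASCII

ibm850_bytes = text.encode("cp850")      # Python's name for ibm850
latin1_bytes = text.encode("iso8859-1")

print(ibm850_bytes.hex())  # ü is byte 0x81 in ibm850
print(latin1_bytes.hex())  # ü is byte 0xfc in iso8859-1

# Decoding with the wrong code page silently yields a wrong character
# rather than an error - exactly the "characters do not transfer" symptom.
print(ibm850_bytes.decode("iso8859-1"))
```

Note that nothing fails loudly: the bytes are all valid in both code pages, so a mismatch shows up only as wrong characters in the data.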

You need to figure out:

  • Which code page does your database use?
    • Look in the information section under utilities in the Data Dictionary or Data Administration tool.
  • Which code page does your client use?
    • Look for -cp* parameters in the parameter files used to start the clients, or in the startup.pf in your installation directory.
  • Which code page was used to generate the file you are trying to import?
    • Ask the provider ...
Only when you have that information is it possible to decide what needs to be done.
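For the second point, the client-side settings usually live in a parameter (.pf) file. A hypothetical fragment, just to show what to look for (the values are examples, not recommendations for your setup):

```
# hypothetical client.pf fragment -- example values only
-cpinternal iso8859-1   # code page the session uses in memory
-cpstream   iso8859-1   # code page assumed for files read/written via streams
-cpcoll     basic       # collation table
```

If none of the -cp* parameters appear in any .pf file your clients use, the values fall back to the installation's startup.pf.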

Most likely this is not an OS issue.


Heavy Regards, RealHeavyDude.
 
In one of my import issues, the file being generated is an XML file coming from Oracle. The encoding declared in this file is ISO-8859-1. My Progress database is iso8859-1. Could the slight difference in the naming of the code page be the problem here? If so, can I get Progress to recognize the correct code page when I load the document?
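For what it's worth, ISO-8859-1 and iso8859-1 are just two spellings of the same encoding (XML declarations conventionally use the upper-case form). A quick check against Python's codec registry - again only an illustration outside Progress - shows both names resolve to the same codec:

```python
import codecs

# Both spellings resolve to the same codec; only the naming
# convention differs, not the encoding itself.
a = codecs.lookup("ISO-8859-1")
b = codecs.lookup("iso8859-1")
print(a.name, b.name)  # both report the same canonical name
assert a.name == b.name
```

So the name alone should not be the problem; whether the bytes in the file actually match what the loading session expects is the thing to verify.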
 

Stefan

Well-Known Member
Does cpstream apply when LOADing a document onto an X-DOCUMENT object handle or when SET-INPUT-SOURCing on a SAX-reader object handle?

I would think cpinternal could then pose an issue (if not all characters from the xml can be mapped), but cpstream should be irrelevant.

Maybe DL can share how the XML files are being read.
 

RealHeavyDude

Well-Known Member
From experience I can tell you that the best thing is for your -cpstream setting to correspond with the encoding of the file or XML document you are importing or loading. If it's not set that way, it might still work as long as the file or XML document does not contain offending characters ( ones that are not part of whatever your -cpstream setting is ). Bear in mind that UTF-8 is used almost everywhere on the internet, even when the web pages or XML documents contain nothing but plain ASCII characters.
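The "plain ASCII is safe, anything else bites" point is easy to demonstrate. A Python sketch (illustration only, not ABL) of a UTF-8 byte stream being read with an iso8859-1 assumption:

```python
# A UTF-8 byte stream read with the wrong code page: pure ASCII
# survives, anything beyond it turns into mojibake.
utf8_bytes = "José".encode("utf-8")   # é becomes two bytes in UTF-8

print(utf8_bytes.decode("iso8859-1"))  # -> "JosÃ©"  (é arrives as two wrong characters)
print(b"Jose".decode("iso8859-1"))     # -> "Jose"   (plain ASCII is unaffected)
```

This is why a wrong -cpstream can appear to work for months on ASCII-only data and then corrupt the first accented name that comes through.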

BTW - if you don't set -cpstream explicitly, most likely it will come from your startup.pf residing in the installation directory, as it was created from your input during installation ...

Bottom Line: You should be very well aware that code page issues can really bite you where it hurts - anything from storing or outputting wrong characters to data loss - and if I were you I would want total control over what is imported/loaded into my database and how. Therefore I recommend explicitly setting every code page you use, whether for the database, the internal processing of a runtime session, or the streams you use.


Heavy Regards, RealHeavyDude.
 
Sound advice! I am going to initiate a project to employ -cpstream on ALL inbound data. This is where I believe I was hung up. The data in my app looks right when logging on as a user of a particular language (say Spanish), but the data I import does not have the same appearance. This is good info!
 