Garry Hall
Guest
There might be some confusion about what it means for the characters to "appear correctly" in the output file. Whatever you use to view the file makes an assumption about the file's encoding. If you EXPORT with -cpinternal UTF-8 -cpstream UTF-8, the German characters should each be written as 2 bytes. If the output is malformed and is actually in 1252, then a text editor that assumes 1252 will display it as if it were correct. You'd probably need a hex viewer to look at the raw bytes in the output file to be sure. For example, take ß (U+00DF). Its UTF-8 encoding is the two bytes 0xC3 0x9F, which is what you would expect to see in a file EXPORTed with -cpstream UTF-8. If the character is instead represented by the single byte 0xDF, then the data is actually encoded as 1252.
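If you don't have a hex viewer handy, a few lines of Python can do the same check. This is only a rough sketch: the helper names `guess_encoding` and `hex_dump` are made up for illustration, and a failed UTF-8 decode only tells you the bytes are not valid UTF-8 (1252 is then the likely candidate, not a certainty).

```python
def hex_dump(raw: bytes) -> str:
    """Render bytes as space-separated uppercase hex, e.g. 'C3 9F'."""
    return " ".join(f"{b:02X}" for b in raw)


def guess_encoding(raw: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'not-utf-8'."""
    if all(b < 0x80 for b in raw):
        # Pure ASCII looks identical in UTF-8 and 1252, so nothing can be concluded.
        return "ascii"
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed sequences
        return "utf-8"
    except UnicodeDecodeError:
        return "not-utf-8"


# The same word encoded both ways, to show the difference in the raw bytes.
utf8_sample = "Straße".encode("utf-8")
cp1252_sample = "Straße".encode("cp1252")

print(hex_dump(utf8_sample), "->", guess_encoding(utf8_sample))
print(hex_dump(cp1252_sample), "->", guess_encoding(cp1252_sample))
```

To check a real file, read it in binary mode, for example `guess_encoding(open(path, "rb").read())`, where `path` is the EXPORT output file. Opening it in text mode would let Python apply its own encoding assumption, which is the very thing you are trying to rule out.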