UTF-8 to 1251 conversion issue in PASOE

Robert_Wilson

New Member
Hey Guys,

Some backstory: I'm developing some IM chat functionality using Node and OpenEdge 11.7. When agents and customers message each other, there is a ping to a REST endpoint through PAS to store the message in our DB. The issue I've got just now is that when a message comes in containing a character that can't be converted from UTF-8 to 1251 (our cpinternal codepage), for example a pound sign, the GetJsonText function breaks with: Unexpected UTF-8 bit pattern 0xa0 in WRITE-JSON( ). (15353).

I'm looking to see if anyone has any pointers on how best to deal with this. I've tried converting the codepage of my LONGCHAR to UTF-8, but the error persists. Another idea I had was to URL-encode the message data and then decode it on the backend, but URL-DECODE() only seems to be available in WebSpeed.

Here's a snippet of the code:
Code:
  /* Parse the json data and pull out the required information */
  ASSIGN
    omp       = NEW ObjectModelParser()
    jsonParam = CAST(omp:Parse(icJSON),JsonObject)
    cAuth     = jsonParam:GetCharacter('authId')
    lcImChatMsg = jsonParam:GetJsonText('messageData') //code is failing here with the above error
    lAgent = LOGICAL(jsonParam:GetJsonText('isAgent')).
 
  /* The frontend likes to escape slashes so unescape them so that dates can be read into the temp-table correctly */
  ASSIGN lcImChatMsg = REPLACE(lcImChatMsg,"\/","/").
 
  /* read the message data into the imchat temp-table */
  EMPTY TEMP-TABLE ttimChat NO-ERROR.
  ASSIGN hTemp = TEMP-TABLE ttimChat:DEFAULT-BUFFER-HANDLE.
  hTemp:READ-JSON("LONGCHAR", lcImChatMsg).

Thanks in advance,
Robert.
 
There are HTTP client classes available in 11.7. These have URL encoding available via the OpenEdge.Net.URI class. You'll need to make sure that $dlc/gui|src|tty/netlib/OpenEdge.Net.pl is in PROPATH.

There is also an OpenEdge.Core.Util.UTF8Encoder class which will encode UTF-8. By default it uses U+ to indicate the encoding, but you can choose \u or something else. See the Progress Documentation for more.
 
Thanks for the guidance Peter, appreciate it as always!
 
I've had a look at the resources. My idea was to URL-encode the message on my website and then URL-decode it once it reaches my ABL code (space = %20, etc.). I'm not looking to encode anything in the ABL code.
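
For illustration, the website half of that idea could look roughly like the Node sketch below. It assumes the message is percent-encoded with the standard `encodeURIComponent` before it is placed in the request; the helper names `encodeMessage` and `decodeMessage` are hypothetical and not part of the original code. It only shows that the payload stays ASCII in transit. The ABL side would still need its own decode step, and the decoded characters would still have to exist in the target codepage.

```javascript
// Hypothetical helper: percent-encode a chat message so that the
// request carries only ASCII characters (non-ASCII characters become
// the percent-encoded bytes of their UTF-8 form).
function encodeMessage(text) {
  return encodeURIComponent(text);
}

// Hypothetical helper: the reverse step, shown only to demonstrate
// that the encoding round-trips without losing characters.
function decodeMessage(encoded) {
  return decodeURIComponent(encoded);
}

const wire = encodeMessage("Price: £5");
console.log(wire);                // Price%3A%20%C2%A35
console.log(decodeMessage(wire)); // Price: £5
```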

I'm sure there must be an easier way to handle the conversion of UTF-8 characters into the correct codepage but I'm a bit stumped thus far.
 
Of course, one could see this as a signal that it was time to get off of 1251, a character set that is really not up to modern demands, and move to UTF-8 overall and not have to do stuff like this.
 
Exactly, Thomas, it's definitely prompting a conversation around moving over to UTF-8.
 
I'm sure there must be an easier way to handle the conversion of UTF-8 characters into the correct codepage but I'm a bit stumped thus far.

If you don't want to lose any characters, then encoding is the way to go. Percent (URL) encoding doesn't handle the higher ranges of the codepoints.

If you don't care about losing characters, you can walk through the strings character-by-character and ignore those that don't fit into 1251 (or whatever).
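
As a rough illustration of that lossy approach, here is a minimal Node sketch that walks a string one character at a time and keeps only the characters accepted by a predicate. It is written for the front end rather than the ABL code, and the names `inSketchRepertoire` and `dropUnrepresentable` are hypothetical. The predicate is a deliberately incomplete stand-in (ASCII plus the basic Cyrillic letters), not the real 1251 table; a real filter would need the full repertoire of the target codepage.

```javascript
// Deliberately incomplete stand-in for the 1251 repertoire (assumption):
// ASCII plus the basic Cyrillic letters U+0410..U+044F.
function inSketchRepertoire(codePoint) {
  return codePoint <= 0x7f || (codePoint >= 0x0410 && codePoint <= 0x044f);
}

// Walk the string by code point (for...of keeps surrogate pairs together)
// and keep only the characters that the predicate accepts.
function dropUnrepresentable(text, isAllowed = inSketchRepertoire) {
  let kept = "";
  for (const ch of text) {
    if (isAllowed(ch.codePointAt(0))) {
      kept += ch;
    }
  }
  return kept;
}

// The pound sign falls outside the stand-in repertoire and is dropped.
console.log(dropUnrepresentable("Привет £5")); // Привет 5
```

Dropped characters are lost for good, so this only suits cases where the stored text does not have to match what the user typed.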

lcImChatMsg = jsonParam:GetJsonText('messageData') //code is failing here with the above error

Why are you using GetJsonText() here? Could you not use GetJsonObject()?
 