Answered Question on xcoding

ron

Member
11.7.17 on Red Hat Linux

At present our Production system has xcoded source on it. I'm planning to change to having no source at all on Prod, but it will take a few months to get that done. In the meantime I'm learning about xcode. I have encountered it many times before, but this is the first time I actually need to do the encoding. Looking at the Progress documentation it is easy to do -- but I'm curious about the remark "NOTE: You must also encrypt include (.i) files".

Is it just a warning that you don't have "complete" encryption unless every file (including .i files) is xcoded? Or does it mean there will be a compiler error if there is a mixture of encoded and plain-text files?

Ron.
 

Cringer

ProgressTalk.com Moderator
Staff member
You will get compiler errors if the include code is not encrypted. We have that issue every time we update our OE minor version, as it overwrites the WebSpeed includes with unencrypted versions, so we have to xcode them again.
 

ron

Member
Another xcode query. (OE 11.7.18 / RHEL 7.4 )

We have a financials system with about 20 VMs set up for developers. When code is ready it is promoted to the SIT environment (System Integration Testing), compiled and tested. After testing, the new/changed code is promoted to UAT, compiled and tested again. Similarly it goes to STG (Staging) -- and finally PROD.

Our auditors will not allow plain-text source code to reside in the Production domain, so the DBA a few years ago when this 'rule' was introduced solved the problem by using this procedure when code was promoted from STG to PROD:

** The (plain-text) code is sent to GIT.
** The code is transferred to PROD and immediately xcoded and compiled.

I am planning a completely new SDLC procedure whereby no code will be xcoded -- and no source code at all will reside in PROD. To prepare for this I wanted to do a code reconciliation to ensure that I had a plain-text version of all the programs running in PROD. That's a challenge, of course, because how can an xcoded file be reconciled with a plain-text copy in GIT? For each xcoded source file in PROD I found all plain-text copies throughout the whole set of environments, made an xcoded copy of each, then calculated a checksum by adding the decimal value of every byte in the xcoded files.

That worked -- to the extent that I could reliably match about 30% of the code set. About another 30% did not match with the checksum -- but the plain-text and xcoded files only varied in byte count by one -- suggesting that they "should" match -- but don't. The last (about) 40% appeared to be completely different.

So -- I have a big problem. As far as I know there are no "un-xcode" utilities available, and I could appreciate that Progress would never release such a program. But, of course, they would be able to "un-xcode". Does anyone know of a way to resolve this? Would Progress do it? Is there any other organization that can do it?

Ron.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
ron said:
That worked -- to the extent that I could reliably match about 30% of the code set. About another 30% did not match with the checksum -- but the plain-text and xcoded files only varied in byte count by one -- suggesting that they "should" match -- but don't.

An xcoded file will always be one byte larger than the source file: as I recall, the first byte is always 0x13.

ron said:
That's a challenge, of course, because how can an xcoded file be reconciled with a plain-text copy in GIT? For each xcoded source file in PROD I found all plain-text copies throughout the whole set of environments -- made an xcode copy then calculated a checksum by adding the decimal value of every byte in the xcoded files.

It sounds like you're rolling your own checksum algorithm. Couldn't you just use an OS tool like diff to compare two xcoded files?

Also, is there any possibility that some of your source files were xcoded with a non-default key? That would also explain differences where you suspect the source should be the same.
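
For illustration, here is a minimal Python sketch of the limitation of the hand-rolled checksum described above (adding up the value of every byte). Such a sum ignores byte order, so two different files can produce the same total. It would not by itself explain files that "should" match but don't, but it does mean that a match on the sum is weaker evidence than a match on a standard hash. The function names and sample byte strings are hypothetical and are not taken from the thread.

```python
import hashlib


def byte_sum(data: bytes) -> int:
    """Additive checksum: the sum of every byte value (order-insensitive)."""
    return sum(data)


def sha256_hex(data: bytes) -> str:
    """Standard SHA-256 digest as a hex string (order-sensitive)."""
    return hashlib.sha256(data).hexdigest()


# Two made-up source fragments containing the same bytes in a different order.
a = b"IF x > y THEN"
b = b"IF y > x THEN"

print(byte_sum(a) == byte_sum(b))      # True: the additive sums collide
print(sha256_hex(a) == sha256_hex(b))  # False: the hashes differ
```

Swapping the sum for a hash changes nothing else in the reconciliation: it is still one value per file that can be stored and compared.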
 

ron

Member
Yes, Rob, diff easily tells you whether two binary files are the same. The problem, however, is that there are 3,500 individual source files, with copies in 18 different locations on a Windows server (GIT) and on two Linux servers: 29,000 separate files altogether. The only realistic way I could rationalize all of that and find which ones are the same or different was to extract from each one its name, location and checksum into a small DB and analyse it all.
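
A rough sketch of that kind of inventory, assuming Python with its built-in sqlite3 module and SHA-256 in place of the additive checksum. The table layout and the function names (`file_sha256`, `build_inventory`, `distinct_versions`) are hypothetical; this is not the tooling actually used in the thread.

```python
import hashlib
import os
import sqlite3


def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_inventory(roots, db_path=":memory:"):
    """Record name, location, size and digest of every file under the given roots."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "name TEXT, location TEXT, size INTEGER, digest TEXT)"
    )
    for root in roots:
        for dirpath, _dirs, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                conn.execute(
                    "INSERT INTO files VALUES (?, ?, ?, ?)",
                    (name, dirpath, os.path.getsize(path), file_sha256(path)),
                )
    conn.commit()
    return conn


def distinct_versions(conn):
    """Map each file name that exists in more than one version to its version count."""
    rows = conn.execute(
        "SELECT name, COUNT(DISTINCT digest) FROM files "
        "GROUP BY name HAVING COUNT(DISTINCT digest) > 1"
    )
    return dict(rows.fetchall())
```

Storing the size alongside the digest also makes it possible to query for plain-text and xcoded candidates whose sizes differ by exactly one byte.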

The xcoded files all used the default key. I found it quite strange that in the case of about 900 files that were xcoded I could find one (or more) plain-text copies that were exactly one byte shorter -- yet the checksums differed. As an extra check I picked 8 random examples of this and did a byte-by-byte compare and in each case there were blocks of bytes that differed.

Ron.
 

Rob Fitzpatrick

ProgressTalk.com Sponsor
ron said:
I found it quite strange that in the case of about 900 files that were xcoded I could find one (or more) plain-text copies that were exactly one byte shorter -- yet the checksums differed.

Not all code changes alter the length of a source file, e.g. fixing a typo of transposed characters in a string, changing an operator (e.g. ">" to "<"), changing the value of a constant, etc. Such changes would result in a different xcoded file with the same length as the prior one.
 

ron

Member
I appreciate that, Rob, but close to 50% of all of the source files (2,200 vs 4,790) are in that group. I think that most of those files actually do match-up, but I can't explain how they get different checksums.
 

TomBascom

Curmudgeon
It could be something like a timestamp in a comment or hard-coded into a string that is being inserted automatically somewhere along the line.
 

TomBascom

Curmudgeon
Or any sort of hard-coded thing that your tooling might update based on which server the code is being deployed on.

Note: I'm not saying that that kind of thing is a good idea ;) but I have seen stuff like that and it would explain code that varies in this way.
 

ron

Member
I tried comparing using just the r-code, but that failed too.

I found that if I compile a plain-text program (.p) and save the resulting r-code -- then do exactly the same thing again a few minutes later -- changing nothing at all, the two r-code files differ. It appears that the compile process must embed the current time into the r-code. I can't think of anything that would have changed in the test I performed other than the time.
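
One way to probe a hypothesis like this is to look at where two files differ rather than only whether they differ. The sketch below is a minimal Python example that makes no assumptions about the r-code layout: it simply reports the byte ranges that differ between two byte strings. The function name `differing_ranges` and the sample data are hypothetical, and the sample is plain text, not real r-code.

```python
def differing_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) ranges where a and b differ.

    Only the common length is compared; any extra tail is ignored.
    """
    limit = min(len(a), len(b))
    ranges = []
    start = None
    for i in range(limit):
        if a[i] != b[i]:
            if start is None:
                start = i          # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, limit))  # the run extends to the end
    return ranges


# Made-up data standing in for two builds that differ only in an embedded date.
first = b"HEADER 2024-01-01 BODY"
second = b"HEADER 2024-01-02 BODY"
print(differing_ranges(first, second))
```

If two compiles of the same source differ only in one or two short ranges at the same offsets, that is consistent with an embedded timestamp or similar metadata; large scattered blocks point to a real difference in the input.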
 

ron

Member
RHEL 9.3 -- OE 11.7.18

I have just discovered that Progress documentation mentions "SIGNATURE-VALUE attribute".

It says "Use this attribute to determine if a procedure changed between different versions of your application."

Can someone explain exactly what this means, please?

I understand the words, but I could interpret it in different ways.
 

Stefan

Well-Known Member
signature-value was added in OpenEdge 12, so I do not think it will help in your case. Its predecessor, md5-value, needed to be enabled by compiling with generate-md5.

So if you did compile all versions of your r-code with generate-md5, then md5-value could help you to compare which are the same.
 

dimitri.p

Member
ron said:
I tried comparing using just the r-code, but that failed too.

I found that if I compile a plain-text program (.p) and save the result r-code -- then do exactly the same thing again a few minutes later -- changing nothing at all, the two r-code files differ. It appears that the compile process must embed the current time into the r-code. I can't think of anything that would have changed in the test I performed other than the time.

I just CTRL-A <DEL> my previous post after realizing md5 is no longer an option in 12.x.

Also found out md5 WILL generate different values if SQL-92 is still in the code. (Source: Progress Customer Community)

In addition, CRC checksums only check a portion of the file. (Source: "CRC-VALUE calculation question", OpenEdge General forum, Progress Community Archive)

So...CRC can produce false positives and md5 can produce false negatives.

The new RCODE-INFO:SIGNATURE-VALUE is a replacement for MD5-VALUE. (Source: Progress Customer Community)

Progress! (pun intended) :)

Sometimes the only way to know which code produced the .r is to know which code produced the .r.
 