I am trying to understand how COBOL variables with the COMP USAGE clause store values.
I tried one example, as below:
01 VAR14 PIC S9(5) COMP.
In IBM's Enterprise COBOL, there are four ways to define a binary field: COMP; COMP-4; BINARY; COMP-5.
How does that come about? A COMPUTATIONAL field (COMP for short, and here short for "all COMPUTATIONAL fields") is "implementor defined". That means that what is COMP-something in one compiler may be COMP-somethingelse in another compiler, or may even have no direct equivalent.
And yes, you can code COMPUTATIONAL, COMPUTATIONAL-4 and COMPUTATIONAL-5 if you want. The compiler will be happy.
To standardise things, the 1985 COBOL Standard introduced BINARY and PACKED-DECIMAL as USAGEs. For portability to other COBOL compilers, these would be the best USAGEs for COMP and COMP-3 (packed-decimal) fields.
What is the difference between these different binary fields? Mostly, none. COMP, COMP-4 and BINARY are in fact synonyms of each other in the compiler (more accurately, COMP-4 and BINARY are synonyms of COMP).
COMP-5, also known as "native binary", is different. COBOL has what you might call "decimal-binary" fields (COMP and siblings). That is, the data is stored as binary, but its maximum and minimum values are set by the number of digits in the PICture clause which is used in the definition.
COMP PIC 9 - can contain zero to nine.
COMP PIC S99 - (signed) can contain -99 to +99.
COMP PIC 999 - can contain zero to 999.
COMP-5 is different.
COMP-5 PIC 9 - can contain zero to 65535.
COMP-5 PIC S99 - (signed) can contain -32768 to +32767.
COMP-5 PIC 999 - can contain zero to 65535.
What happens for COMP-5 is that the PICture is used to define the size of the field (as with other binary fields) but every possible bit-value is valid.
How does the PICture relate to the size of the definition? PIC 9 through PIC 9(4) will be stored in a half-word-sized field (which is two bytes). PIC 9(5) through PIC 9(9) will be stored in a word-sized field (which is four bytes). PIC 9(10) through PIC 9(18) will be stored in a double-word-sized field (eight bytes).
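The size rule above can be checked with a small program. This is only a sketch, assuming IBM Enterprise COBOL (other compilers may allocate binary fields differently); the program name and field names are made up for the demonstration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BINSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fields, one at each edge of the size ranges.
       01  ONE-DIGIT          BINARY PIC 9.
       01  FOUR-DIGITS        BINARY PIC 9(4).
       01  FIVE-DIGITS        BINARY PIC 9(5).
       01  NINE-DIGITS        BINARY PIC 9(9).
       01  TEN-DIGITS         BINARY PIC 9(10).
       01  EIGHTEEN-DIGITS    BINARY PIC 9(18).
       PROCEDURE DIVISION.
      * LENGTH OF is the storage size in bytes, not the digit count.
           DISPLAY 'PIC 9     BYTES: ' LENGTH OF ONE-DIGIT
           DISPLAY 'PIC 9(4)  BYTES: ' LENGTH OF FOUR-DIGITS
           DISPLAY 'PIC 9(5)  BYTES: ' LENGTH OF FIVE-DIGITS
           DISPLAY 'PIC 9(9)  BYTES: ' LENGTH OF NINE-DIGITS
           DISPLAY 'PIC 9(10) BYTES: ' LENGTH OF TEN-DIGITS
           DISPLAY 'PIC 9(18) BYTES: ' LENGTH OF EIGHTEEN-DIGITS
           GOBACK.
```

If the rule holds for your compiler, the reported sizes should come out as 2, 2, 4, 4, 8 and 8 bytes, though the exact formatting of the displayed numbers may vary.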
OK, so how does this difference (COMP-5 uses all the bits, COMP can only represent the decimal value of the PICture) affect what is defined? Doesn't "native binary" sound much better, and obviously faster, than anything "non-native" would give?
The difference is in how they truncate. And, as scintillating as "native binary" sounds, it is generally slower than using COMP & CO, because of the truncation.
COMP truncates to the decimal value of the PICture. COMP-5 truncates to the size of the field.
Consider (names just for demonstration, only ever use descriptive names):
01 PROGA COMP PIC 9(4).
01 PROGB COMP-5 PIC 9(4).
01 PROGC BINARY PIC 9(4) VALUE 9999.
ADD PROGC TO PROGA
ADD PROGC TO PROGB
Remembering that PROGA has a maximum value of 9999, and noting that 19998 fits easily within the existing size of the field, the compiler can effect the addition and then truncate to the decimal value, all in-place.
Remembering that PROGB has a maximum value of 65535 and there is absolutely fat chance that there is enough room in the original field to successfully add a further 65535, the compiler has to generate a temporary field of double the original size, do the addition, and then truncate back to the original maximum value, getting that result back to the original field.
ADD 1 TO PROGA
ADD 1 TO PROGB
Note that with these two, ADD 1 TO PROGA can still be done in place (obviously), since 1 added to a maximum of 9999 fits easily within the field, but ADD 1 TO PROGB will still require the expansion of the field and all that mucking-about, because PROGB just may have a value of 65535 in it already, so the compiler has to allow for that.
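To watch the two kinds of truncation side by side, something like the following could be tried. It is a sketch only: it assumes IBM Enterprise COBOL, that your site allows the CBL statement, and that TRUNC is not fixed at installation. The program name and field names are invented for the demonstration.

```cobol
       CBL TRUNC(STD)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; both fields are two bytes long.
       01  DECIMAL-BINARY-COUNT  COMP    PIC 9(4) VALUE 9999.
       01  NATIVE-BINARY-COUNT   COMP-5  PIC 9(4) VALUE 9999.
       PROCEDURE DIVISION.
      * 9999 + 1 = 10000, which needs five decimal digits.
           ADD 1 TO DECIMAL-BINARY-COUNT
           ADD 1 TO NATIVE-BINARY-COUNT
      * COMP truncates to PIC 9(4); COMP-5 only to two bytes.
           DISPLAY 'COMP   PIC 9(4): ' DECIMAL-BINARY-COUNT
           DISPLAY 'COMP-5 PIC 9(4): ' NATIVE-BINARY-COUNT
           GOBACK.
```

With TRUNC(STD) in effect I would expect the COMP field to lose its high-order digit and end up as zero, while the COMP-5 field keeps 10000, since that still fits in a half-word.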
Coming to DISPLAY. You have COMP PIC S9(5), and you get a 10-digit output. Why? OK, size you have worked out, the field is four bytes long. However, that should get you a five-digit output, in the range -99999 to +99999. Let's pretend for a moment that your field was instead COMP-5 PIC S9(5).
With COMP-5 all the bits are valid, and, for a signed field, your range for a full-word/word is -2,147,483,648 through +2,147,483,647. That's 10 digits, note. Which matches the 10 digits you got in your output. What happened?
Compiler option TRUNC. If you use compiler option TRUNC(BIN), all your COMP/COMP-4/BINARY fields are treated as COMP-5. End of story. You have TRUNC(BIN) either specifically chosen by you, your project, or as your site default. This is not necessarily a good choice.
Other values of compiler option TRUNC are STD, which does the "normal" truncation for COMP/COMP-4/BINARY, and OPT which does whatever is best (for performance) at the time.
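The effect of TRUNC on DISPLAY can be reproduced with a field shaped like the one in the question. Again a sketch under assumptions: IBM Enterprise COBOL, CBL permitted and TRUNC not fixed at your site. The program name and the VALUE are made up.

```cobol
       CBL TRUNC(BIN)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCBIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same shape as the field in the question; VALUE is made up.
       01  VAR14  COMP PIC S9(5) VALUE +12345.
       PROCEDURE DIVISION.
      * With TRUNC(BIN) the field is handled as if it were COMP-5.
           DISPLAY 'VAR14 = ' VAR14
           GOBACK.
```

Compiling this once as shown and once with the first line changed to CBL TRUNC(STD) should show the 10-digit form in the first run and the five-digit form in the second.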
Note, strongly note, that TRUNC(OPT) imposes a contract on the programmer. "I will not, must not, and will never even consider, allow a COMP/COMP-4/BINARY field to have a value which does not conform to its PICture. If I do, it is all my fault, full-stop, end-of-story, and no crying from me".
Don't, except for the purposes of investigating how things work, ever just up and change a TRUNC setting. If you do, you can break things, and it can be a very, very subtle break.
My advice: TRUNC(BIN), don't use it unless you have to (someone decided, and you have no choice); TRUNC(STD) use if your site is scared of the contract; TRUNC(OPT) use if your site is comfortable with the contract.
Do use COMP-5, for individual field-definitions, where you need to. Where do you need to? For any place you have a binary field whose range is beyond the "decimal value" of its PICture. For instance, look to the size of the CICS COMMAREA and the field which indicates how big an individual example is. Look to a VARCHAR host-field in a COBOL program. Data communicating with JAVA or C/C++ may be like that. Otherwise, for new programs, prefer BINARY, which shows that you are slap-up-to-date with 1985.
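As an illustration of a length field that goes beyond the decimal value of its PICture, here is a sketch of a VARCHAR-style structure. The names, the level numbers and the 20000-byte size are assumptions for the example, not taken from any particular table or precompiler requirement.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VCHARLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical VARCHAR-style host variable: a half-word
      * length followed by the text it describes.
       01  CUSTOMER-NOTES.
           49  CUSTOMER-NOTES-LEN   COMP-5 PIC S9(4).
           49  CUSTOMER-NOTES-TEXT  PIC X(20000).
       PROCEDURE DIVISION.
      * 20000 fits in a signed half-word (maximum +32767) but
      * not in the decimal range of PIC S9(4) (maximum +9999),
      * which is why the length field is COMP-5.
           MOVE LENGTH OF CUSTOMER-NOTES-TEXT
             TO CUSTOMER-NOTES-LEN
           DISPLAY 'LENGTH IN USE: ' CUSTOMER-NOTES-LEN
           GOBACK.
```

Had the length field been plain COMP under TRUNC(STD), a length above 9999 would be at risk of losing its high-order digit, which is the case COMP-5 is there to cover.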
Setting TRUNC for investigative purposes.
CBL TRUNC(STD)
ID (or IDENTIFICATION) DIVISION.
Compiler options can also be set by the PARM statement in the JCL for the compile, but you may not have access to that. CBL will override any value set in the PARM. There is an installation option which can prevent the use of CBL (also known as PROCESS). Individual options can also be "fixed" at installation time. If your site has fixed TRUNC or prevented CBL, you won't be able to try these things out.
The COMP usage clause can also be written as BINARY or COMPUTATIONAL.
The COMP usage clause applies to numeric data items only.
COMP usage is a binary representation of data.
Data in COMP variables is stored in memory in pure binary format.
Memory allocation for COMP usage is as follows:
Picture              Number of Bytes
S9 to S9(4)          2
S9(5) to S9(9)       4
S9(10) to S9(18)     8