Parser for signed overpunch values?

前端未结

关注

 7  1331

南旧 2021-01-06 01:19

I am working with some old data imports and came across a bunch of data from an external source that reports financial numbers with a signed overpunch. I\'ve seen alot, but

7条回答

孤街浪徒 (楼主)

2021-01-06 01:51

Over-punched numeric (Zoned-Decimal in Cobol) comes from the old-punched cards where they over-punched the sign on the last digit in a number. The format is commonly used in Cobol.

As there are both Ascii and Ebcdic Cobol compilers, there are both Ascii and EBCDIC versions of the Zoned-Numeric. To make it even more complicated, the -0 and +0 values ({} for US-Ebcdic (IBM037) are different for say German-Ebcdic (IBM273 where they are äü) and different again in other Ebcdic language versions).

To process successfully, You need to know:

Did the data originate in a Ebcdic or Ascii system
if Ebcdic - which language US, German etc

If the data is in the original character set, you can calculate the sign by

For EBCDIC the numeric hex codes are:

Digit          0     1     2   ..    9

unsigned:   x'F0' x'F1' x'F2'  .. x'F9'     012 .. 9 
Negative:   x'D0' x'D1' x'D2'  .. x'D9'     }JK .. R
Positive:   x'C0' x'C1' x'C2'  .. x'C9'     {AB .. I

For US-Ebcdic Zoned this is the java code to convert a string:

int positiveDiff = 'A' - '1';
int negativeDiff = 'J' - '1';

lastChar = ret.substring(ret.length() - 1).toUpperCase().charAt(0);

    switch (lastChar) {
        case '}' : sign = "-";
        case '{' :
            lastChar = '0';
        break;
        case 'A':
        case 'B':
        case 'C':
        case 'D':
        case 'E':
        case 'F':
        case 'G':
        case 'H':
        case 'I':
            lastChar = (char) (lastChar - positiveDiff);
        break;
        case 'J':
        case 'K':
        case 'L':
        case 'M':
        case 'N':
        case 'O':
        case 'P':
        case 'Q':
        case 'R':
            sign = "-";
            lastChar = (char) (lastChar - negativeDiff);
        default:
    }
    ret = sign + ret.substring(0, ret.length() - 1) + lastChar;

For German-EBCDIC {} become äü, for other EBCDIC-Language you would need lookup the appropriate coded page.

For Ascii Zoned this is the java code

    int positiveFjDiff = '@' - '0';
    int negativeFjDiff = 'P' - '0';

    lastChar = ret.substring(ret.length() - 1).toUpperCase().charAt(0);

    switch (lastChar) {
        case '@':
        case 'A':
        case 'B':
        case 'C':
        case 'D':
        case 'E':
        case 'F':
        case 'G':
        case 'H':
        case 'I':
            lastChar = (char) (lastChar - positiveFjDiff);
        break;
        case 'P':
        case 'Q':
        case 'R':
        case 'S':
        case 'T':
        case 'U':
        case 'V':
        case 'W':
        case 'X':
        case 'Y':
            sign = "-";
            lastChar = (char) (lastChar - negativeFjDiff);
        default:
    }
    ret = sign + ret.substring(0, ret.length() - 1) + lastChar;

Finally if you are working in EBCDIC you can calculate it like

sign = '+'
if (last_digit & x'F0' == x'D0') {
   sign = '-' 
} 
last_digit = last_digit | x'F0'

One last problem is decimal points are not stored in a Zoned, decimal they are assumed. You need to look at the Cobol-Copybook.

e.g.

 if the cobol Copybook is

    03 fld                 pic s99999.

 123 is stored as     0012C (EBCDIC source)

 but if the copybook is (v stands for assumed decimal point) 

   03 fld                  pic s999v99.

 then 123 is stored as 1230{

It would be best to do the translated in Cobol !!! or using a Cobol Translation packages.

There are several Commercial Packages for handling Cobol Data, they tend to be expensive. There are some Java are some open source packages that can deal with Mainframe Cobol Data.

0 讨论(0)

查看其它7个回答