How to unpack COMP-3 digits using Java?

ascii mainframe ebcdic packed-decimal

13,939

Solution 1

COMP-3 (or "packed decimal") data looks like this: 0x12345s, where "s" is C for positive, D for negative, or F for unsigned. Thus 0x12345c -> "12345", 0x12345d -> "-12345", and 0x12345f -> "12345".

You've got one obvious error: when the sign is negative, you ignore the digit nybble that shares a byte with the sign nybble (the "5" above). You're also working too hard at manipulating the nybbles: a bitwise AND or a 4-bit shift is all it takes to isolate one.
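As a rough illustration of the shift-and-mask idea, here is a minimal sketch. The class and method names (`NibbleDemo`, `highNibble`, `lowNibble`) are made up for this example and are not part of the question's code.

```java
final class NibbleDemo {
    /** High nybble of a byte as a value 0..15. */
    static int highNibble(byte b) {
        // Mask first, then shift, so the sign bit of the byte never leaks in.
        return (b & 0xF0) >>> 4;
    }

    /** Low nybble of a byte as a value 0..15. */
    static int lowNibble(byte b) {
        return b & 0x0F;
    }

    public static void main(String[] args) {
        byte last = (byte) 0x5D;   // final byte of 0x12345D: digit 5, sign D (negative)
        System.out.println(highNibble(last)); // 5
        System.out.println(lowNibble(last));  // 13
    }
}
```

The last byte of a packed field therefore yields one digit (high nybble) and the sign (low nybble); every other byte yields two digits.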

Try something like this (untested):

public String unpackData(String packedData, int decimalPointLocation) {
    String unpackedData = "";
    char[] characters = packedData.toCharArray();
    final int negativeSign = 13;
    for (int currentCharIndex = 0; currentCharIndex < characters.length; currentCharIndex++) {
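        // Note: without a 0x0F mask after the shift, bytes >= 0x80 are sign-extended (see Solution 3).
        byte firstDigit = (byte) (((byte) characters[currentCharIndex]) >>> 4);
        byte secondDigit = (byte) (characters[currentCharIndex] & 0x0F);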
        unpackedData += String.valueOf(firstDigit);
        if (currentCharIndex == (characters.length - 1)) {
            if (secondDigit == negativeSign) {
                unpackedData = "-" + unpackedData;
            }
        } else {
            unpackedData += String.valueOf(secondDigit);
        }
    }
    if (decimalPointLocation > 0) {
        unpackedData = unpackedData.substring(0, (decimalPointLocation - 1)) + 
                        "." + 
                        unpackedData.substring(decimalPointLocation);
    }
    return unpackedData;
}

Solution 2

public static final int UNSIGNED_BYTE = 0xff;
public static final int BITS_RIGHT = 0xf;

public long parseComp3(byte[] data) {
    long val = 0L;
    boolean negative = false;
    for (int i = 0; i < data.length; i++) {
        int raw = data[i] & UNSIGNED_BYTE;
        int digitA = raw >> 4;
        int digitB = raw & BITS_RIGHT;

        if (digitA < 10) {
            val *= 10L;
            val += (long) digitA;

        } else if (digitA == 11 || digitA == 13) { // Some non-IBM systems store the sign on left or use 11 for negative.
            negative = true;
        }

        if (digitB < 10) {
            val *= 10L;
            val += (long) digitB;

        } else if (digitB == 11 || digitB == 13) {
            negative = true;
        }
    }
    if (negative)
        val = -val;
    return val;
}
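The method above returns a whole number, so any implied decimal point still has to be applied by the caller. One possible way to do that, sketched here with `java.math.BigDecimal`, is shown below. The class name `Comp3Scale`, the method `applyScale`, and the `scale` parameter are assumptions for illustration; the actual number of decimal places has to come from the COBOL record definition.

```java
import java.math.BigDecimal;

final class Comp3Scale {
    /**
     * Applies an implied decimal point to an already-unpacked COMP-3 value.
     * "scale" is the number of digits after the implied point (hypothetical
     * parameter, e.g. 2 for a field declared as PIC S9(3)V99).
     */
    static BigDecimal applyScale(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        // Suppose parseComp3 returned -12345 for the bytes 0x12 0x34 0x5D.
        System.out.println(applyScale(-12345L, 2)); // -123.45
    }
}
```

Using `BigDecimal` rather than `double` keeps the value exact, which usually matters for the monetary amounts that packed decimal fields tend to hold.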

Solution 3

Ross Paterson's solution has a bug where it shifts the high 4 bits to the right: Java sign-extends the byte to an int before shifting, so the mask 0x0F must be applied after the shift.

Here is the corrected method:

private static String unpackData(byte[] packedData, int decimalPointLocation) {
    String unpackedData = "";

    final int negativeSign = 13;
    for (int currentCharIndex = 0; currentCharIndex < packedData.length; currentCharIndex++) {
        byte firstDigit = (byte) ((packedData[currentCharIndex] >>> 4) & 0x0F);
        byte secondDigit = (byte) (packedData[currentCharIndex] & 0x0F);
        unpackedData += String.valueOf(firstDigit);
        if (currentCharIndex == (packedData.length - 1)) {
            if (secondDigit == negativeSign) {
                unpackedData = "-" + unpackedData;
            }
        } else {
            unpackedData += String.valueOf(secondDigit);
        }
    }

    if (decimalPointLocation > 0) {
        int position = unpackedData.length() - decimalPointLocation;
        unpackedData = unpackedData.substring(0, position) + "." + unpackedData.substring(position);
    }
    return unpackedData;
}
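To see why the mask matters, the following small sketch compares the shift with and without `& 0x0F` on a byte whose top bit is set. The names `SignExtensionDemo`, `unmasked`, and `masked` are invented for this example.

```java
final class SignExtensionDemo {
    /** Shift without a mask: the byte is sign-extended to int before the shift. */
    static byte unmasked(byte b) {
        return (byte) (b >>> 4);
    }

    /** Shift followed by the 0x0F mask, as in the corrected method. */
    static byte masked(byte b) {
        return (byte) ((b >>> 4) & 0x0F);
    }

    public static void main(String[] args) {
        byte b = (byte) 0x98;              // packed digits 9 and 8
        System.out.println(unmasked(b));   // -7
        System.out.println(masked(b));     // 9
    }
}
```

Any packed byte whose first digit is 8 or 9 has its top bit set, so the unmasked version goes wrong exactly for those digits.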

Solution 4

I tested Ross Paterson's solution. It does not run correctly, but only because of small details. Thanks to Ross, and thanks also to Dr. Bob for the "int raw" idea.

Tested solution is here:

private static String unpackData(byte[] packedData, int decimals) {
    String unpackedData="";
    final int negativeSign = 13;
    int lengthPack = packedData.length;
    int numDigits = lengthPack*2-1;

    int raw = (packedData[lengthPack-1] & 0xFF);
    int firstDigit = (raw >> 4);
    int secondDigit = (packedData[lengthPack-1] & 0x0F);
    boolean negative = (secondDigit==negativeSign);
    int lastDigit = firstDigit;
    for (int i = 0; i < lengthPack-1; i++) {
        raw = (packedData[i] & 0xFF);
        firstDigit = (raw >> 4);
        secondDigit = (packedData[i] & 0x0F);
        unpackedData+=String.valueOf(firstDigit);
        unpackedData+=String.valueOf(secondDigit);

    }
    unpackedData+=String.valueOf(lastDigit);
    if (decimals > 0) {
        unpackedData = unpackedData.substring(0,numDigits-decimals)+"."+unpackedData.substring(numDigits-decimals);
    }
    if (negative){
        return '-'+unpackedData;
    }
    return unpackedData;
}

And the function to convert from unpacked to packed data:

private static byte[] packData(String unpackedData) {
    int unpackedDataLength = unpackedData.length();
    final int negativeSign = 13;
    final int positiveSign = 12;
    if (unpackedData.charAt(0)=='-'){
        unpackedDataLength--;
    }

    if (unpackedData.contains(".")){
        unpackedDataLength--;
    }
    int packedLength = unpackedDataLength/2+1;

    byte[] packed = new byte[packedLength];
    int countPacked = 0;
    boolean firstHex = (packedLength*2-1 == unpackedDataLength);
    for (int i=0;i<unpackedData.length();i++){
        if (unpackedData.charAt(i)!='-' && unpackedData.charAt(i)!='.'){
            byte digit = Byte.valueOf(unpackedData.substring(i,i+1)); 
            if (firstHex){
                packed[countPacked]=(byte) (digit<<4);
            }else{
                packed[countPacked]=(byte) (packed[countPacked] | digit );
                countPacked++;
            }
            firstHex=!firstHex;
        }
    }
    if (unpackedData.charAt(0)=='-'){
        packed[countPacked]=(byte) (packed[countPacked] | negativeSign );
    }else{
        packed[countPacked]=(byte) (packed[countPacked] | positiveSign );
    }
    return packed;
}
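When checking a packing routine like this one, it can help to print the resulting bytes in hex. The helper below is a minimal sketch; `HexDump` and `toHex` are hypothetical names, and the sample bytes are hard-coded here as the packed form one would expect for "-123.45".

```java
final class HexDump {
    /** Formats each byte as two uppercase hex digits, separated by spaces. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative bytes print as 80..FF, not FFFFFF80..
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] packed = {0x12, 0x34, 0x5D}; // expected packed form of "-123.45"
        System.out.println(toHex(packed));  // 12 34 5D
    }
}
```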

Shekhar

Updated on June 12, 2022

Comments

Shekhar about 2 years

I have a huge mainframe file that contains some packed digits. I would like to know how to unpack the following digit using Java.

packed digit : ?

I read tutorials for unpacking digits and found the following rule to count the number of bytes required to unpack digits :

total_number_of_bytes = (no. of digits + 1) / 2
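The rule above divides evenly only for an odd number of digits. With an even number of digits, a packed field carries a leading zero nybble, so integer arithmetic of the form `digits / 2 + 1` covers both cases. A small sketch follows; the class and method names are made up for this example.

```java
final class PackedLength {
    /** Bytes needed for a packed field with the given number of digits plus a sign nybble. */
    static int packedBytes(int digits) {
        return digits / 2 + 1;   // integer division; equals (digits + 1) / 2 when digits is odd
    }

    /** Largest number of digits that fits in a packed field of the given byte length. */
    static int maxDigits(int bytes) {
        return bytes * 2 - 1;    // every nybble is a digit except the final sign nybble
    }

    public static void main(String[] args) {
        System.out.println(packedBytes(5)); // 3
        System.out.println(packedBytes(4)); // 3
        System.out.println(maxDigits(3));   // 5
    }
}
```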

I wrote the following code to unpack digits :

public String unpackData(String packedData, int decimalPointLocation) {
        String unpackedData = "";
        char[] characters = packedData.toCharArray();
        final int impliedPositive = 15;
        final int positiveNumber = 12;
        final int negativeNumber = 13;
        for (int currentCharIndex = 0; currentCharIndex < characters.length; currentCharIndex++) {
            byte[] unpackedDigits = unpackByte((byte) characters[currentCharIndex]);
            if(currentCharIndex == (characters.length - 1)) {
                if(unpackedDigits[1] == impliedPositive || unpackedDigits[1] == positiveNumber) {
                    unpackedData += String.valueOf(unpackedDigits[0]);
                } else if(unpackedDigits[1] == negativeNumber) {
                    unpackedData = "-" + unpackedData;
                }
            } else {
                unpackedData += String.valueOf(unpackedDigits[0]) + String.valueOf(unpackedDigits[1]);
            }
        }
        if(decimalPointLocation > 0) {
            unpackedData = unpackedData.substring(0, (decimalPointLocation - 1)) + 
                            "." + 
                            unpackedData.substring(decimalPointLocation);
        }
        return unpackedData;
    }

    private byte[] unpackByte(byte packedData) {
        byte firstDigit = (byte) (packedData >>> 4);
        firstDigit = setBitsToZero(firstDigit, 4, 8);

        //System.out.println(" firstDigit = "+ firstDigit + ", and its bit string after unpacking = " + getBitString(firstDigit, 7));

        byte secondDigit = setBitsToZero(packedData, 4, 8);
        //System.out.println("second digit = " + secondDigit + ", and its bit string of second digit after unpcking = " + getBitString(secondDigit, 7));

        byte[] unpackedData = new byte[2];
        unpackedData[0] = firstDigit;
        unpackedData[1] = secondDigit;
        return unpackedData;
    }

    private byte setBitsToZero(byte number, int startBitPosition, int endBitPosition) {
        for (int i = startBitPosition; i < endBitPosition; i++) {
            number =  (byte) (number & ~(1 << i));
        }
        return number;
    }

This program works correctly for integer type values but it's not working for floating point type values.

Can anyone please tell if my program is correct?

MarkU over 10 years

Did you verify the sequence of digits is correct for several test cases? I don't see where you add char '0' or 48 to convert into printable character. Are you sure String.valueOf() is returning the characters '0'..'9' instead of the integer byte values 0x00 .. 0x09 ? Problem when inserting the decimal point into the string? Looks like decimalPointLocation 1 is .######, 2 is #.#####, 3 is ##.#### etc. JUnit could be useful to verify your unpackData function works correctly for all test conditions. There are a lot of corner cases to check, even without testing incorrectly formed data.

cschneid over 10 years

Floating point is not the same as packed decimal.

NealB over 10 years

If any of those packed fields are signed, you have to deal with that too... Often the sign is coded with the least significant digit. Finally, packed fields often contain an implied decimal point, you will need the original COBOL record definition to sort these out.

Bill Woodger

stackoverflow.com/questions/17448008/…

Matthew G. about 10 years

Is there something particular you're trying to communicate that makes this answer better than the previous accepted one?

Bruce Martin almost 7 years

This answer will probably work most of the time, but it is not reliable when reading from a file. Reading a packed decimal as a character string can corrupt the packed decimal. You must read and process a packed decimal as bytes.

Bruce Martin almost 7 years

I think this is a better answer because the input is an array of bytes. Reading an EBCDIC COMP-3 field in character format can corrupt the COMP-3 data.
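The corruption risk mentioned in these comments can be reproduced with a short sketch. It assumes UTF-8 as the character set purely for illustration (an EBCDIC-to-ASCII translation would damage the bytes in a different way), and the names `CharCorruptionDemo` and `throughString` are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharCorruptionDemo {
    /** Decodes bytes as UTF-8 text and encodes the text back to bytes. */
    static byte[] throughString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] packed = {0x12, 0x34, (byte) 0x9C};   // +12349 as COMP-3
        byte[] roundTripped = throughString(packed);
        // 0x9C is not valid UTF-8 on its own, so it is replaced during decoding
        // and the original byte cannot be recovered.
        System.out.println(Arrays.equals(packed, roundTripped)); // false
    }
}
```

Reading the record with an `InputStream` into a `byte[]` and slicing out the packed field by offset and length avoids any character decoding of the packed bytes.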