Bit Array to String and back to Bit Array


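You cannot stuff arbitrary bytes into a string; that operation is simply not defined. Conversions between bytes and strings go through an Encoding, and an encoding only round-trips byte sequences that are valid for it.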

string output = Encoding.UTF8.GetString(e);

At this point e is just binary data, not UTF-8 encoded text, so calling UTF-8 methods on it does not make sense.
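To make the failure concrete, here is a minimal sketch of what happens when bytes that are not valid UTF-8 go through GetString and then GetBytes. The class and method names (Utf8RoundTripDemo, RoundTripThroughUtf8) and the sample bytes are made up for illustration; only the Encoding.UTF8 calls come from the code under discussion.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Hypothetical helper: decodes the bytes as UTF-8 text,
    // then encodes that text back to bytes.
    public static byte[] RoundTripThroughUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        // 0xFF and 0xFE never occur in well-formed UTF-8, so the decoder
        // cannot map them to the characters they "came from".
        byte[] original = { 0xFF, 0xFE, 0x41 };
        byte[] roundTripped = RoundTripThroughUtf8(original);

        Console.WriteLine("Original:      " + BitConverter.ToString(original));
        Console.WriteLine("Round-tripped: " + BitConverter.ToString(roundTripped));
        Console.WriteLine("Identical:     " + original.SequenceEqual(roundTripped));
    }
}
```

Compressed output such as Huffman-coded bits will usually contain invalid sequences like these, which is why the decoded text comes out wrong.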

Solution: Don't convert to a string and back. That does not round-trip. Why are you doing that in the first place? If you really need a string, use a round-trippable format such as Base64 or Base85.
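If a string really is required, a Base64 round trip could look like the sketch below. The BitArrayText class, its method names, and the "bitCount:base64" layout are assumptions made for this example, not part of the original code. The bit count is stored because a BitArray rebuilt from bytes is always a multiple of 8 bits long, and the extra padding bits could otherwise be decoded as spurious symbols.

```csharp
using System;
using System.Collections;

public static class BitArrayText
{
    // Hypothetical format: "<bitCount>:<base64>", so the exact bit length survives.
    public static string ToText(BitArray bits)
    {
        byte[] bytes = new byte[(bits.Length + 7) / 8];
        bits.CopyTo(bytes, 0);
        return bits.Length + ":" + Convert.ToBase64String(bytes);
    }

    public static BitArray FromText(string text)
    {
        // ':' is not part of the Base64 alphabet, so the first one is the separator.
        int separator = text.IndexOf(':');
        int bitCount = int.Parse(text.Substring(0, separator));
        byte[] bytes = Convert.FromBase64String(text.Substring(separator + 1));

        BitArray bits = new BitArray(bytes);
        bits.Length = bitCount; // drop the padding bits added by byte alignment
        return bits;
    }

    public static void Main()
    {
        var bits = new BitArray(new[] { true, false, true, true, false, false,
                                        true, false, true, true, true });
        string text = ToText(bits);
        BitArray restored = FromText(text);

        Console.WriteLine(text);
        Console.WriteLine("Restored length: " + restored.Length);
    }
}
```

The string is about a third larger than the raw bytes, so this only makes sense where text is mandatory.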

Author by

Gopikrishna S

Updated on June 04, 2022

Comments

  • Gopikrishna S
    Gopikrishna S almost 2 years

    Possible duplicate: Converting byte array to string and back again in C#

    I am using Huffman coding to compress and decompress some text, based on the code from here.

    The code in there builds a huffman tree to use it for encoding and decoding. Everything works fine when I use the code directly.

    In my situation, I need to get the compressed content, store it, and decompress it whenever needed.

    The output from the encoder and the input to the decoder are BitArray.

    When I tried to convert this BitArray to a String and back to a BitArray, and then decode it using the following code, I got a weird answer.

    // Read the text first; the tree is built from it
    string input = Console.ReadLine();
    
    Tree huffmanTree = new Tree();
    huffmanTree.Build(input);
    
    BitArray encoded = huffmanTree.Encode(input);
    
    // Print the bits
    Console.Write("Encoded Bits: ");
    foreach (bool bit in encoded)
    {
        Console.Write((bit ? 1 : 0) + "");
    }
    Console.WriteLine();
    
    // Convert the bit array to bytes
    Byte[] e = new Byte[(encoded.Length / 8 + (encoded.Length % 8 == 0 ? 0 : 1))];
    encoded.CopyTo(e, 0);
    
    // Convert the bytes to string
    string output = Encoding.UTF8.GetString(e);
    
    // Convert string back to bytes
    e = Encoding.UTF8.GetBytes(output);
    
    // Convert bytes back to bit array
    BitArray todecode = new BitArray(e);
    
    string decoded = huffmanTree.Decode(todecode);
    
    Console.WriteLine("Decoded: " + decoded);
    
    Console.ReadLine();
    

    The output of the original code from the tutorial is:

    [screenshot not preserved]

    The output of my code is:

    [screenshot not preserved]

    Where am I going wrong? Thanks in advance.

  • Gopikrishna S
    Gopikrishna S about 11 years
    But Base64 produces output 4/3 the size of the input instead of compressing it. I need compression, not encoding (source: wikipedia.org).
  • svick
    svick about 11 years
    @GopikrishnaS Then you need to use byte[], not string. string is for character data, not binary.
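
Keeping the data as byte[] from start to finish, as svick suggests, might look like the following sketch. The BitArrayFile class, its methods, and the file layout (a 4-byte bit count followed by the packed bytes) are assumptions for illustration only. Storing the bit count lets the loader trim the padding bits that byte alignment adds.

```csharp
using System;
using System.Collections;
using System.IO;

public static class BitArrayFile
{
    // Hypothetical layout: Int32 bit count, then the packed bytes.
    public static void Save(BitArray bits, string path)
    {
        byte[] bytes = new byte[(bits.Length + 7) / 8];
        bits.CopyTo(bytes, 0);

        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(bits.Length); // 4-byte bit count
            writer.Write(bytes);       // raw bytes, no text encoding involved
        }
    }

    public static BitArray Load(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int bitCount = reader.ReadInt32();
            byte[] bytes = reader.ReadBytes((bitCount + 7) / 8);

            var bits = new BitArray(bytes);
            bits.Length = bitCount; // drop the padding bits
            return bits;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        var bits = new BitArray(new[] { true, true, false, true, false,
                                        false, true, false, true, true });

        Save(bits, path);
        BitArray restored = Load(path);
        Console.WriteLine("Restored length: " + restored.Length);

        File.Delete(path);
    }
}
```

Nothing is inflated here: the stored size is the compressed byte count plus four bytes for the length.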