Converting string to byte[] creates zero character
Solution 1
First let's look at what your code does wrong. char is 16 bits (2 bytes) in the .NET Framework, which means sizeof(char) returns 2. If str.Length is 1, your code is effectively byte[] bytes = new byte[2]. So when you call Buffer.BlockCopy(), you actually copy 2 bytes from the source array to the destination array. That means your GetBytes() method returns bytes[0] = 32 and bytes[1] = 0 if your string is " ".
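To make the byte layout concrete, here is a minimal sketch that reproduces the copy from the question. BlockCopyDemo and RawBytes are hypothetical names used only for illustration, and the byte order shown assumes a little-endian machine, where the raw copy happens to match UTF-16LE (Encoding.Unicode).

```csharp
using System;
using System.Text;

public static class BlockCopyDemo
{
    // Same technique as the GetBytes() in the question: a raw copy of the
    // 2-byte chars, with no text encoding applied.
    public static byte[] RawBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    public static void Main()
    {
        byte[] raw = RawBytes(" ");
        // Two bytes for one char; on little-endian hardware: 20-00 (32, 0).
        Console.WriteLine(BitConverter.ToString(raw));

        // Assumption: little-endian, so the raw bytes are valid UTF-16LE and
        // Encoding.Unicode decodes them back to the original string.
        Console.WriteLine(Encoding.Unicode.GetString(RawBytes("abc")));
    }
}
```

If the zeroes only show up because the array is later decoded with a single-byte encoding, decoding with Encoding.Unicode is one way to get the original string back without changing GetBytes().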
Try Encoding.ASCII.GetBytes() instead. From the documentation:
When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes.
using System;
using System.Text;

const string input = "Soner Gonul";
// Encode each character as a single ASCII byte.
byte[] array = Encoding.ASCII.GetBytes(input);
foreach (byte element in array)
{
    Console.WriteLine("{0} = {1}", element, (char)element);
}
Output:
83 = S
111 = o
110 = n
101 = e
114 = r
32 =
71 = G
111 = o
110 = n
117 = u
108 = l
Solution 2
Just to clear up the confusion in your question: the char type in C# takes 2 bytes, so string.ToCharArray() returns an array in which each item takes 2 bytes of storage. When that array is copied into a byte array, where each item takes 1 byte, every char is spread over two bytes. For ASCII characters the second byte is always 0, hence the zeroes showing up in the result.
As suggested, Encoding.ASCII.GetBytes is a safer option to use.
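As a rough illustration of the suggestion above, the sketch below encodes and decodes a string with Encoding.ASCII and, for comparison, Encoding.UTF8. EncodingDemo and its method names are hypothetical. One caveat worth knowing: the default ASCII encoder replaces any character outside the 7-bit range with '?', so UTF-8 is the safer choice if the input may contain such characters.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encode to ASCII bytes (1 byte per char) and decode again.
    public static string RoundTripAscii(string s)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(s);
        return Encoding.ASCII.GetString(bytes);
    }

    // Encode to UTF-8 bytes (1 or more bytes per char) and decode again.
    public static string RoundTripUtf8(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // No zero bytes: prints 61-62-63 (97, 98, 99).
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes("abc")));
        Console.WriteLine(RoundTripAscii("abc"));

        // "\u0113" is 'ē'; ASCII cannot represent it and substitutes '?'.
        Console.WriteLine(RoundTripAscii("\u0113valds"));

        // UTF-8 preserves the character.
        Console.WriteLine(RoundTripUtf8("\u0113valds") == "\u0113valds");
    }
}
```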
Solution 3
In practice, .NET (at least 4.0) changes the size of a char when it is serialized with BinaryWriter, which uses UTF-8 by default.
UTF-8 characters have a variable length (not necessarily 1 byte); ASCII characters take 1 byte:
'ē' = 2 bytes
'e' = 1 byte
Keep this in mind when using BinaryReader.ReadChars(), which takes a count of characters rather than bytes.
For example, the word "ēvalds" takes 7 bytes, whereas "evalds" takes 6.
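The size difference can be checked with a small sketch like the one below. Utf8SizeDemo, WriteChars and ReadChars are hypothetical helper names. The sketch assumes the default BinaryWriter and BinaryReader constructors, which use UTF-8, and writes the characters as a char[] so that no length prefix is added.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8SizeDemo
{
    // Serialize the characters of s with BinaryWriter (UTF-8 by default).
    public static byte[] WriteChars(string s)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(s.ToCharArray()); // encoded chars only, no length prefix
            writer.Flush();
            return ms.ToArray();
        }
    }

    // Read charCount characters (not bytes) back with BinaryReader.
    public static string ReadChars(byte[] data, int charCount)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            return new string(reader.ReadChars(charCount));
        }
    }

    public static void Main()
    {
        string accented = "\u0113valds"; // "ēvalds"
        Console.WriteLine(WriteChars(accented).Length); // 7
        Console.WriteLine(WriteChars("evalds").Length); // 6

        // Both words have 6 characters, so ReadChars is given 6 in each case.
        Console.WriteLine(ReadChars(WriteChars(accented), 6) == accented);
    }
}
```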
strike_noir
Updated on July 20, 2022
strike_noir almost 2 years
In this conversion function:
public static byte[] GetBytes(string str)
{
    byte[] bytes = new byte[str.Length * sizeof(char)];
    System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
    return bytes;
}

byte[] test = GetBytes("abc");
The resulting array contains zero bytes:
test = [97, 0, 98, 0, 99, 0]
And when we convert the byte[] back to a string, the result is
string test = "a b c "
How do we make it so that it doesn't create those zeroes?