Using fread() to read text file into a buffer - why are the values in the buffer not each character's respective ASCII value?
Solution 1
The behaviour is not surprising:
- You have a file containing 11 characters. sizeof(char) is 1.
- Now you allocate an array of 11 ints. sizeof(int) is very likely to be 4 on your machine.
- You instruct fread to read up to 11 ints (up to 44 bytes). So the first 4 characters will be read as an int and stored in array[0], and the next 4 in array[1].
  - If you had checked the return value of fread, it would have told you that it actually read only 2 elements (as the content is 11 bytes, it can only read 2 complete ints, and the last 3 remaining bytes cannot be successfully read as an int).
- Now you loop over the array and print the number, which is the int built up from the first 4 characters.
- In your alternative solution you pretend to point to a sequence of chars, so the array index only advances in 1-byte offsets.
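If you want to see those element counts for yourself, here is a minimal sketch. The helper names make_sample and items_read are made up for illustration, and the item size is passed in explicitly, so 4 stands in for sizeof(int) (an assumption about your machine).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: create a temporary file holding the 11 bytes "Hello World". */
FILE *make_sample(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return NULL;
    if (fputs("Hello World", f) == EOF) {
        fclose(f);
        return NULL;
    }
    rewind(f);
    return f;
}

/* Hypothetical helper: starting from the beginning of the file, ask fread for
   up to max_items items of item_size bytes each and return how many complete
   items it delivered. */
size_t items_read(FILE *f, size_t item_size, size_t max_items)
{
    void *buf = calloc(max_items, item_size);
    size_t n;

    if (buf == NULL)
        return 0;
    rewind(f);
    n = fread(buf, item_size, max_items, f);
    free(buf);
    return n;
}
```

With the 11-byte sample, items_read(f, 1, 11) should report 11 items and items_read(f, 4, 11) should report 2, because fread only counts complete items.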
The memory layout basically looks like this:
array[0]
|       array[1]
|       |
1 2 3 4 5 6 7 8 9 10 11
| |
| ((char *)array)[1]
((char *)array)[0]
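The numbers in the question's first output can be reproduced by hand from this layout. The sketch below assumes a little-endian machine with 4-byte ints (which the reported values suggest, but which is still an assumption), and the function name le_bytes_to_u32 is made up for illustration. It spells out the byte order with shifts so it gives the same result on any host.

```c
#include <stdint.h>

/* Combine four bytes the way a little-endian machine does when it loads a
   4-byte int: the first byte in memory becomes the least significant byte. */
uint32_t le_bytes_to_u32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Feeding it the bytes "Hell" gives 1819043144 and "o Wo" gives 1867980911, which are array[0] and array[1] from the question. The bytes 'r', 'l', 'd' followed by a zero byte give 6581362, matching array[2]; that fourth byte was never read from the file, it is just what happened to be in the malloc'd memory.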
Solution 2
ftell returns the current value of the stream's position indicator.
Because you call it right after seeking to the end, it gives you the number of bytes in the file. You then read the file as a sequence of 4-byte ints, so of course the later elements are never filled in and show up as 0. In more detail, you are asking for 4 x size bytes from a file that holds only size bytes.
Your array should be of type char.
Something like
char* array = malloc(sizeOfFile * sizeof(char));
if(array == NULL) {
...
}
fread(array, sizeof(char), sizeOfFile, filePointer);
// ..
Just the idea, not complete code. Hope this helps.
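For a fuller picture, here is one way the idea could be fleshed out. This is only a sketch: the function name read_all is made up, and it uses unsigned char rather than char so that bytes from a non-text file (a jpeg, say) are not printed as negative numbers on platforms where char is signed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole stream into a freshly allocated byte
   buffer. On success it returns the buffer (the caller frees it) and stores
   the number of bytes read in *out_len; on failure it returns NULL. */
unsigned char *read_all(FILE *fp, size_t *out_len)
{
    long size;
    unsigned char *buf;
    size_t got;

    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0)
        return NULL;

    buf = malloc((size_t)size + 1);   /* +1 so an empty file never asks for malloc(0) */
    if (buf == NULL)
        return NULL;

    /* Item size 1: the return value is then simply the number of bytes read. */
    got = fread(buf, 1, (size_t)size, fp);
    if (got != (size_t)size && ferror(fp)) {
        free(buf);
        return NULL;
    }
    *out_len = got;
    return buf;
}
```

Opened on the question's Hello.txt in "rb" mode, this should hand back 11 bytes, with buf[0] equal to 72 ('H') and buf[10] equal to 100 ('d').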
user2809475
Updated on September 26, 2020
Comments
-
user2809475 over 3 years
First off, this isn't homework. Just trying to understand why I'm seeing what I'm seeing on my screen.
The stuff below (my own work) currently takes an input file and reads it as a binary file. I want it to store each byte read in an array (for later use). For the sake of brevity the input file (Hello.txt) just contains 'Hello World', without the apostrophes.
int main(int argc, char *argv[])
{
    FILE *input;
    int i, size;
    int *array;

    input = fopen("Hello.txt", "rb");
    if (input == NULL) {
        perror("Invalid file specified.");
        exit(-1);
    }

    fseek(input, 0, SEEK_END);
    size = ftell(input);
    fseek(input, 0, SEEK_SET);

    array = (int*) malloc(size * sizeof(int));
    if (array == NULL) {
        perror("Could not allocate array.");
        exit(-1);
    }
    else {
        input = fopen("Hello.txt", "rb");
        fread(array, sizeof(int), size, input);
        // some check on return value of fread?
        fclose(input);
    }

    for (i = 0; i < size; i++) {
        printf("array[%d] == %d\n", i, array[i]);
    }
Why is it that having the print statement in the for loop as it is above causes the output to look like this
array[0] == 1819043144
array[1] == 1867980911
array[2] == 6581362
array[3] == 0
array[4] == 0
array[5] == 0
array[6] == 0
array[7] == 0
array[8] == 0
array[9] == 0
array[10] == 0
while having it like this
printf("array[%d] == %d\n", i, ((char *)array)[i]);
makes the output look like this (decimal ASCII value for each character)
array[0] == 72
array[1] == 101
array[2] == 108
array[3] == 108
array[4] == 111
array[5] == 32
array[6] == 87
array[7] == 111
array[8] == 114
array[9] == 108
array[10] == 100
? If I'm reading it as a binary file and want to read byte by byte, why don't I get the right ASCII value using the first print statement?
On a related note, what happens if the input file I send in isn't a text document (e.g., jpeg)?
Sorry if this is an entirely trivial matter, but I can't seem to figure out why.
-
user694733
Why are you opening the input file twice?
-
user2809475 over 10 years
I guess something else isn't clicking... are you saying that, if the input size is 10 bytes, I'm reading 40 bytes from it? As for the int array... isn't what I'm reading in just a bunch of ints? I thought that was why I was reading it as a binary file. What would happen if I sent in a non-text file and tried to put it in a char array?
-
simpletron over 10 years
Yes, I guess so. I have updated my answer; you should take a look at my idea.
-
user2809475 over 10 years
Just edited my last response. I see and understand what you're doing and thought about that approach the first time around, but I'm considering the scenario where a file sent in won't necessarily be comprised of just letters (and/or numbers).
-
simpletron over 10 years
If you think so, it must be more complicated. Your binary stream is just a sequence of bytes. If you ask for an int, fread takes the next 4 bytes and stores them in the memory you specify; if you ask for a char, it takes the next 1 byte, and so on. You could also think about using a text file instead of a binary file. In that case, a number would be stored as a sequence of chars.
-
Rachael Dawn about 7 years
"If you had checked the return of fread" Incredibly useful. I was initializing an fopen, where ftell was returning a value larger than what was actually there (in terms of size), which meant that it was reading null characters with a printf. Setting ReadBuffer[/*output of fread*/] = 0; was the trick for me. Commenting for those who have a similar problem.
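For anyone wanting to try that trick, a minimal sketch might look like the following. The function name read_text is made up, and it assumes the caller supplies a buffer of cap bytes.

```c
#include <stdio.h>

/* Hypothetical helper: read at most cap - 1 bytes and terminate the buffer at
   the count fread reports, so printf("%s") stops at the data actually read. */
size_t read_text(FILE *fp, char *buf, size_t cap)
{
    size_t n;

    if (cap == 0)
        return 0;
    n = fread(buf, 1, cap - 1, fp);
    buf[n] = '\0';   /* index with fread's return value, not with ftell's size */
    return n;
}
```

The point is to trust the count that fread returns rather than the size from ftell, since the two can differ.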