Reading binary file into matlab
I would recommend first reading in all the data at once as a byte array. You'll be able to debug the problem much faster:
fid = fopen('MIC1.001','rb');
data = fread(fid);
fclose(fid);
% could look at it as all chars, just for debugging
char(data)'
With no precision argument, fread reads each byte as a uint8 and returns the values as one large column vector of doubles. You would then go through and parse the bytes by casting them appropriately. You may want to test your method on a known file first:
% create a binary file to follow the same format as the specified file
fid = fopen('test.dat','wb');
% Put in a 7 character string for the file ID (8 bytes with the terminator below)
aa = 'myfile0';
fwrite(fid,aa);
% Null terminate it, (I guess)
fwrite(fid,0);
% write the 2 byte file major revision
aa = 1000;
fwrite(fid,aa,'uint16');
% write the 2 byte file minor revision
aa = 5000;
fwrite(fid,aa,'uint16');
% write the 4 byte number of modules
aa = 65536;
fwrite(fid,aa,'uint32');
fclose(fid);
% read the entire file in
fid = fopen('test.dat','rb');
A = fread(fid);
fclose(fid);
% Try to read the file id
disp(char(A(1:8))')
% Try to read the file major revision
majorByte1 = dec2hex(A(9));
majorByte2 = dec2hex(A(10));
% see if it needs byte swapped
% (dec2hex drops leading zeros, so use dec2hex(x,2) if a byte can be below 16)
tmp1 = hex2dec([majorByte1 majorByte2]);
tmp2 = hex2dec([majorByte2 majorByte1]);
fprintf(1,'File Major: %d ? = %d\nFile Major: %d ? = %d\n',1000,tmp1,1000,tmp2);
For output I get:
myfile0
File Major: 1000 ? = 3715
File Major: 1000 ? = 1000
So, for me, the bytes need to be swapped. Maybe yours do too? :-)
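The same check can be done without going through hex strings. The following is a minimal sketch, not part of the test above: it assumes the 16 bytes of test.dat as they would be written on a little-endian machine, and builds them inline so the example stands alone. typecast reinterprets the two bytes in the machine's native byte order, and swapbytes gives the other order.

```matlab
% Bytes of test.dat as written on a little-endian machine (assumption):
% 'myfile0', null, 1000 as uint16, 5000 as uint16, 65536 as uint32.
A = [double('myfile0'), 0, 232, 3, 136, 19, 0, 0, 1, 0]';

% typecast needs an integer-typed vector, so convert the doubles back to uint8.
raw = uint8(A(9:10));

asNative  = typecast(raw, 'uint16');  % interpreted in this machine's byte order
asSwapped = swapbytes(asNative);      % the opposite byte order

fprintf('native: %d, swapped: %d\n', asNative, asSwapped);
```

Whichever of the two values equals 1000 tells you which byte order the file uses relative to your machine.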
EDIT
To do this using fread, from the MATLAB docs:
A = fread(fileID, sizeA, precision, skip, machineformat) reads data with the specified machineformat.
For your case:
A = fread(fid,2,'uint16',0,'b');
I'm assuming you're on a little-endian machine. To read the data as little-endian instead, just use 'l' instead of 'b'.
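As a rough illustration of what the machineformat argument changes, the sketch below writes two uint16 values with a forced little-endian layout and reads the same bytes back both ways. The scratch file name comes from tempname and is only for this example.

```matlab
% Write two uint16 values, forcing little-endian byte order on disk.
fname = [tempname '.dat'];               % scratch file for this example only
fid = fopen(fname, 'w');
fwrite(fid, [1000 5000], 'uint16', 0, 'l');
fclose(fid);

% Read the same four bytes back with each byte order.
fid = fopen(fname, 'r');
le = fread(fid, 2, 'uint16', 0, 'l');    % little-endian interpretation
frewind(fid);
be = fread(fid, 2, 'uint16', 0, 'b');    % big-endian interpretation
fclose(fid);
delete(fname);

disp(le');   % 1000 5000
disp(be');   % 59395 34835
```

If every field in the file uses the same byte order, it may be simpler to pass the machine format once as the third argument to fopen, for example fopen('MIC1.001','r','b'), so that each fread call picks it up.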
Marissa Sileo
Updated on June 04, 2022
Comments
-
Marissa Sileo almost 2 years
I have a data file that uses (char(1 byte), char[n](array of n chars), word(2 byte unsigned int), short(2 byte signed int), dword(4 byte unsigned int), long(4 byte signed int) and float(4 byte real)) and is supposedly in the following format. I am reading the data file into MATLAB with fopen, fread, etc. but the values I am getting are not what I expect.
Format:
- char[8] <-- outputs 8 ascii values that spell the correct string identifier
- dword <--version of the data files, msw-major version, lsw-minor version (have tried reading as 1 uint32 and 2 uint16's)
- dword
- dword
- dword
- dword <--number of window displays in program
- displayinfo[8] <--contains display window params in the following structure: (not sure what data type this is)
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword
- dword (end of display window params; some are specified as must be a number in [0,3] and they aren't coming out like that)
- char[16]
- word <-- supposed to be year data was collected (2013) but coming up as 0
Code:
fid = fopen('MIC1.001','rb');
fileIdentifier = fread(fid, 8,'char');
dataFileMajorVersion = fread(fid,1,'uint16');
dateFileMinorVersion = fread(fid,1,'uint16');
numModules = fread(fid,1,'uint32');
fread(fid,1,'uint32'); % value not used
numSwipesCollected = fread(fid,1,'uint32');
numWindowDisplays = fread(fid,1,'uint32');
% display info vars:
displayType = [];
moduleNumber = [];
channelNumber = [];
beginningBar = [];
endBar = [];
vertExpFactor = [];
voltageOffset =[];
isGridEnabled = [];
isEngineeringUnitEnabled = [];
colorOfDisplay = [];
multiChannelIndex = [];
numChannelsForMultiChannelDisp = [];
multiChannelDispStyle = [];
% or does it go through loop for all 8 whether or not there are 8 displays??
for i=1:numWindowDisplays
    displayType = [fread(fid,1,'uint32'); displayType];
    moduleNumber = [fread(fid,1,'uint32'); moduleNumber];
    channelNumber = [fread(fid,1,'uint32'); channelNumber];
    beginningBar = [fread(fid,1,'uint32'); beginningBar];
    endBar = [fread(fid,1,'uint32'); endBar];
    vertExpFactor = [fread(fid,1,'uint32'); vertExpFactor];
    voltageOffset =[fread(fid,1,'uint32'); voltageOffset];
    isGridEnabled = [fread(fid,1,'uint32'); isGridEnabled];
    isEngineeringUnitEnabled = [fread(fid,1,'uint32'); isEngineeringUnitEnabled];
    colorOfDisplay = [fread(fid,1,'uint32'); colorOfDisplay];
    multiChannelIndex = [fread(fid,1,'uint32'); multiChannelIndex];
    numChannelsForMultiChannelDisp = [fread(fid,1,'uint32'); numChannelsForMultiChannelDisp];
    multiChannelDispStyle = [fread(fid,1,'uint32'); multiChannelDispStyle];
end
fread(fid,1,'uint32'); % value only used internally
fread(fid,16,'char'); % unused parameter for future use
yearOfDataCollection = fread(fid,1,'uint16');
-
Marissa Sileo almost 11 years
But when you don't specify the data type, doesn't MATLAB read it wrong? And if I specified them all as chars, which they are not, it would definitely read them wrong I thought..
-
macduff almost 11 years
@MarissaSileo, updated the answer. If you read them all as chars, or bytes, and assemble them later, it's OK.
-
Marissa Sileo almost 11 years
Okay when I put it in MATLAB I get the same output as you. Forgive me for not knowing much about this, but I'm still a little lost. I've been trying to read up on the byte swapping as you mentioned but I was under the impression that MATLAB would pick up on whether the data was big or little endian and would then convert it accordingly. Am I wrong in assuming that? Maybe that is why I am getting random values like 2778916 when my data system manual says it should be a number between 0 and 3...
-
macduff almost 11 years
@MarissaSileo, np! MATLAB doesn't have a way that I am aware of to automagically figure out the endianness of the file. Sounds like you've got your culprit here: you need to swap bytes.
-
Ben Voigt almost 6 years
Just A(9:10)' * [256; 1] seems so much better than converting to strings in between (the transpose is needed because fread returns a column vector). The other byte order is easy to get as well: A(9:10)' * [1; 256]
-
Ben Voigt almost 6 years
Don't byte swap like that. Instead do A(9:10)' * [256; 1]. Numeric manipulation is much faster than string work, and avoids the string problems Maurice points out with padding / leading zeros.
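The numeric approach from these comments extends to 4-byte fields by using powers of 256 as weights. This is only a sketch: the byte vector below is an assumption, reproducing the test file from the answer as written on a little-endian machine, and the variable names are illustrative.

```matlab
% Bytes of the answer's test file as a column vector of doubles, the same
% shape fread(fid) returns (little-endian layout assumed).
A = [double('myfile0'), 0, 232, 3, 136, 19, 0, 0, 1, 0]';

% Row vectors of weights for little-endian fields (least significant byte first).
w2 = 256 .^ (0:1);    % [1 256]
w4 = 256 .^ (0:3);    % [1 256 65536 16777216]

% Row times column gives a scalar, so no transpose is needed here.
major      = w2 * A(9:10);     % 1000
minor      = w2 * A(11:12);    % 5000
numModules = w4 * A(13:16);    % 65536

% For a big-endian interpretation, reverse the weights.
majorBE = fliplr(w2) * A(9:10);   % 59395
```

Because A holds doubles, the products stay exact for 4-byte values, which are far below the 2^53 limit for exactly representable integers.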