Reading a binary file into MATLAB

matlab binary

17,398

I would recommend first reading in all the data at once as a byte array. You'll be able to debug the problem much faster:

fid = fopen('MIC1.001','rb');
data = fread(fid);
fclose(fid);
% could look at it all as chars, just for debugging
char(data)'

The data is read in as a large array of bytes. You would then go through and parse the bytes by casting them appropriately. You may want to test your method first on a file whose contents you already know:

% create a binary file to follow the same format as the specified file
fid = fopen('test.dat','wb');
% Put in the file ID: 7 characters here, 8 bytes once the terminator is added
aa = 'myfile0';
fwrite(fid,aa);
% Null terminate it (I guess); fwrite defaults to one uint8 per element
fwrite(fid,0);
% write the 2 byte file major revision
aa = 1000;
fwrite(fid,aa,'uint16');
% write the 2 byte file minor revision
aa = 5000;
fwrite(fid,aa,'uint16');
% write the 4 byte number of modules
aa = 65536;
fwrite(fid,aa,'uint32');
fclose(fid);

% read the entire file in
fid = fopen('test.dat','rb');
A = fread(fid);
fclose(fid);

% Try to read the file id
disp(char(A(1:8))')
% Try to read the file major revision
majorByte1 = dec2hex(A(9));
majorByte2 = dec2hex(A(10));
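% see if it needs byte swapped
% note: dec2hex drops leading zeros (3 becomes '3', not '03'), so for
% general data pad each byte with dec2hex(x,2) before concatenating
tmp1 = hex2dec([majorByte1 majorByte2]);
tmp2 = hex2dec([majorByte2 majorByte1]);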
fprintf(1,'File Major: %d ? = %d\nFile Major: %d ? = %d\n',1000,tmp1,1000,tmp2);

For output I get:

myfile0 
File Major: 1000 ? = 3715
File Major: 1000 ? = 1000

So, for me, I'll need to byte swap the data, maybe you do too? :-)
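The "parse the bytes by casting" step can also be done without going through hex strings. The following is only a minimal sketch: the byte values are the ones the test file above contains for its two revision fields, and `fileEndian` is an assumption you would set from your own file's documentation. `typecast` reinterprets bytes in the host's native order, so `swapbytes` is applied only when the file's order differs.

```matlab
% Bytes 9 to 12 of test.dat as fread(fid) would return them:
% 1000 and 5000 stored as little-endian uint16 values.
A = [232; 3; 136; 19];

% typecast reinterprets the raw bytes using the host's native byte order.
[~, ~, hostEndian] = computer;   % 'L' on little-endian hosts, 'B' otherwise
major = typecast(uint8(A(1:2)), 'uint16');
minor = typecast(uint8(A(3:4)), 'uint16');

% If the file's byte order differs from the host's, swap after casting.
fileEndian = 'L';                % assumption: the file is little-endian
if hostEndian ~= fileEndian
    major = swapbytes(major);
    minor = swapbytes(minor);
end
fprintf('major = %d, minor = %d\n', major, minor);
```

Because the input is cast to `uint8` first, there is no padding problem for bytes below 16.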

EDIT

To do this using fread, from the Matlab docs:

A = fread(fileID, sizeA, precision, skip, machineformat) reads data with the specified machineformat.

For your case:

A = fread(fid,2,'uint16',0,'b');

The 'b' tells fread to treat the file as big endian, which I'm assuming differs from your (little endian) machine. If the file turns out to be little endian instead, just use an 'l' instead of a 'b'.
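To see what the machineformat argument changes, here is a small sketch that writes a known value and reads it back both ways. The scratch file name comes from `tempname` and is purely illustrative; nothing here depends on your real data file.

```matlab
fname = [tempname '.dat'];          % hypothetical scratch file

% Write one uint16 value in little-endian order.
fid = fopen(fname, 'w');
fwrite(fid, 1000, 'uint16', 0, 'l');
fclose(fid);

% Read the same two bytes under both byte-order interpretations.
fid = fopen(fname, 'r');
asLittle = fread(fid, 1, 'uint16', 0, 'l');
frewind(fid);
asBig = fread(fid, 1, 'uint16', 0, 'b');
fclose(fid);
delete(fname);

fprintf('little-endian: %d, big-endian: %d\n', asLittle, asBig);
```

If every field in a file shares one byte order, it may be simpler to pass the machine format once as the third argument to fopen rather than on each fread call.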


Author by

Marissa Sileo

Updated on June 04, 2022

Comments

Marissa Sileo almost 2 years

I have a data file that uses (char(1 byte), char[n](array of n chars), word(2 byte unsigned int), short(2 byte signed int), dword(4 byte unsigned int), long(4 byte signed int) and float(4 byte real)) and is supposedly in the following format. I am reading the data file into MATLAB with fopen, fread, etc. but the values I am getting are not what I expect.

Format:

char[8] <-- outputs 8 ascii values that spell the correct string identifier
dword <--version of the data files, msw-major version, lsw-minor version (have tried reading as 1 uint32 and 2 uint16's)
dword
dword
dword
dword <--number of window displays in program
displayinfo[8] <--contains display window params in the following structure: (not sure what data type this is)
dword
dword
dword
dword
dword
dword
dword
dword
dword
dword
dword
dword
dword (end of display window params; some are specified as must be a number in [0,3] and they aren't coming out like that)
char[16]
word <-- supposed to be year data was collected (2013) but coming up as 0

Code:

fid = fopen('MIC1.001','rb');
fileIdentifier = fread(fid, 8,'char');    
dataFileMajorVersion = fread(fid,1,'uint16');
dateFileMinorVersion = fread(fid,1,'uint16');
numModules = fread(fid,1,'uint32');

fread(fid,1,'uint32'); % value not used     
numSwipesCollected = fread(fid,1,'uint32');
numWindowDisplays = fread(fid,1,'uint32');
% display info vars:
displayType = [];
moduleNumber = [];
channelNumber = [];    
beginningBar = [];
endBar = [];
vertExpFactor = [];
voltageOffset =[];
isGridEnabled = [];
isEngineeringUnitEnabled = [];
colorOfDisplay = [];
multiChannelIndex = [];
numChannelsForMultiChannelDisp = [];
multiChannelDispStyle = [];

% or does it go through loop for all 8 whether or not there are 8 displays??
for i=1:numWindowDisplays 
  displayType = [fread(fid,1,'uint32'); displayType];
  moduleNumber = [fread(fid,1,'uint32'); moduleNumber];
  channelNumber = [fread(fid,1,'uint32'); channelNumber];
  beginningBar = [fread(fid,1,'uint32'); beginningBar];
  endBar = [fread(fid,1,'uint32'); endBar];
  vertExpFactor = [fread(fid,1,'uint32'); vertExpFactor]; 
  voltageOffset =[fread(fid,1,'uint32'); voltageOffset];
  isGridEnabled = [fread(fid,1,'uint32'); isGridEnabled];
  isEngineeringUnitEnabled = [fread(fid,1,'uint32'); isEngineeringUnitEnabled];
  colorOfDisplay = [fread(fid,1,'uint32'); colorOfDisplay];
  multiChannelIndex = [fread(fid,1,'uint32'); multiChannelIndex];
  numChannelsForMultiChannelDisp = [fread(fid,1,'uint32'); numChannelsForMultiChannelDisp];
  multiChannelDispStyle = [fread(fid,1,'uint32'); multiChannelDispStyle];
end    
fread(fid,1,'uint32'); % value only used internally
fread(fid,16,'char'); % unused parameter for future use
yearOfDataCollection = fread(fid,1,'uint16');

Marissa Sileo almost 11 years

But when you don't specify the data type, doesn't MATLAB read it wrong? And if I specified them all as chars, which they are not, I thought it would definitely read them wrong.

macduff almost 11 years

@MarissaSileo, updated the answer. If you read them all as chars, or bytes, and assemble them later, it's ok.

Marissa Sileo almost 11 years

Okay, when I put it in MATLAB I get the same output as you. Forgive me for not knowing much about this, but I'm still a little lost. I've been trying to read up on the byte swapping as you mentioned, but I was under the impression that MATLAB would pick up on whether the data was big or little endian and would then convert it accordingly. Am I wrong in assuming that? Maybe that is why I am getting random values like 2778916 when my data system manual says it should be a number between 0 and 3...

macduff almost 11 years

@MarissaSileo, np! MATLAB doesn't have a way, that I am aware of, to automagically figure out the endianness of the file. Sounds like you've found your culprit here: you need to swap bytes.

Ben Voigt almost 6 years

Just A(9:10) * [256; 1] seems so much better than converting to strings in between. The other byte order is easy to get as well: A(9:10) * [1; 256]

Ben Voigt almost 6 years

Don't byte swap like that. Instead do A(9:10) * [256; 1]. Numeric manipulation is much faster than string work, and avoids the string problems Maurice points out with padding / leading zeros.
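The arithmetic approach from these comments can be sketched as follows. One assumption to flag: fread(fid) returns a column vector, so the slice is transposed here to make the product a scalar. The byte values are sample data matching a little-endian 1000 followed by a little-endian 65536.

```matlab
% Sample bytes in the shape fread(fid) returns (a column of doubles).
A = [232; 3; 0; 0; 1; 0];

% Weight each byte by its place value; transpose the column slice first.
le16 = A(1:2)' * [1; 256];                       % first byte least significant
be16 = A(1:2)' * [256; 1];                       % first byte most significant
le32 = A(3:6)' * [1; 256; 65536; 16777216];      % 4-byte little-endian value

fprintf('le16 = %d, be16 = %d, le32 = %d\n', le16, be16, le32);
```

This stays in doubles throughout, which is exact for 16-bit and 32-bit unsigned fields; signed fields would need an extra step to handle the sign bit.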