Reading a binary file into MATLAB

I would recommend first reading in all the data at once as a byte array. You'll be able to debug the problem much faster:

fid = fopen('MIC1.001','rb');
data = fread(fid);   % default precision reads one value per byte
fclose(fid);
% could look at it as all chars, just for debugging
char(data)'
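
For the non-text fields, a hex view of the bytes is often easier to read than the character view. This is a minimal sketch, assuming `data` is the column vector of byte values that `fread` returns; the sample bytes below are made up so the snippet runs on its own, and the names `hexBytes` and `hexLine` are only illustrative.

```matlab
% Hypothetical sample bytes standing in for the start of the file;
% in practice use the `data` vector returned by fread(fid).
data = [double('myfile0') 0 232 3 136 19]';   % column vector of byte values

% A minimum width of 2 keeps leading zeros, so every row is exactly one byte.
hexBytes = dec2hex(data, 2);                  % N-by-2 char array

% Join the rows into one space-separated line for display.
hexLine = strjoin(cellstr(hexBytes)', ' ');
disp(hexLine)
```

Each pair of hex digits is one byte, which makes it straightforward to line the dump up against the field sizes in the format description.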

The data is read in as one large array of bytes. You then go through it and parse the bytes by casting them appropriately. You may want to test your method on a file with known contents first:

% create a binary file that follows the same format as the specified file
fid = fopen('test.dat','wb');
% Put in the file ID: 7 characters here, so the null byte below fills the 8-byte field
aa = 'myfile0';
fwrite(fid,aa);
% Null terminate it (I guess)
fwrite(fid,0);
% write the 2 byte file major revision
aa = 1000;
fwrite(fid,aa,'uint16');
% write the 2 byte file minor revision
aa = 5000;
fwrite(fid,aa,'uint16');
% write the 4 byte number of modules
aa = 65536;
fwrite(fid,aa,'uint32');
fclose(fid);

% read the entire file in
fid = fopen('test.dat','rb');
A = fread(fid);
fclose(fid);

% Try to read the file id
disp(char(A(1:8))')
% Try to read the file major revision
% (pad each byte to two hex digits so leading zeros are not lost)
majorByte1 = dec2hex(A(9),2);
majorByte2 = dec2hex(A(10),2);
% see if it needs to be byte swapped
tmp1 = hex2dec([majorByte1 majorByte2]);
tmp2 = hex2dec([majorByte2 majorByte1]);
fprintf(1,'File Major: %d ? = %d\nFile Major: %d ? = %d\n',1000,tmp1,1000,tmp2);

On a little-endian machine the output is:

myfile0
File Major: 1000 ? = 59395
File Major: 1000 ? = 1000

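So, for me, the second byte is the most significant one: the test file is little-endian, and the bytes have to be swapped when assembling the value by hand. Maybe yours need swapping too? :-)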
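
The same check can be done numerically, without going through hex strings. This is a minimal sketch, assuming bytes 9 and 10 are 232 and 3 (the little-endian encoding of 1000) in a column vector `A` as returned by `fread`. `typecast`, `swapbytes` and `computer` are standard MATLAB functions; the variable names are only illustrative.

```matlab
% Hypothetical bytes 9..10 of the test file: 1000 = 0x03E8 stored little-endian.
A = zeros(10, 1);
A(9:10) = [232; 3];

% Weighted sums: treat the first byte as most significant (big-endian)
% or as least significant (little-endian).
asBigEndian    = [256 1] * A(9:10);
asLittleEndian = [1 256] * A(9:10);

% Alternative: reinterpret the raw bytes as a uint16 in the machine's native
% byte order, then swap if the file's order differs from the machine's.
[~, ~, nativeEndian] = computer;            % 'L' or 'B'
raw = typecast(uint8(A(9:10)), 'uint16');   % native interpretation
if nativeEndian == 'L'
    fromLittleEndianFile = raw;
else
    fromLittleEndianFile = swapbytes(raw);
end
fprintf('big-endian: %d, little-endian: %d, typecast: %d\n', ...
        asBigEndian, asLittleEndian, fromLittleEndianFile);
```

Here the big-endian interpretation is 59395 (0xE803) and the little-endian one is 1000 (0x03E8). Because no strings are involved, there is no zero padding to get wrong.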

EDIT

To do this with fread directly, from the MATLAB docs:

A = fread(fileID, sizeA, precision, skip, machineformat) reads data with the specified machineformat.

For your case:

A = fread(fid,2,'uint16',0,'b');

The 'b' tells fread to interpret the file as big-endian. I'm assuming you're on a little-endian machine; if the file turns out to be little-endian instead, just use 'l' in place of 'b'.
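
Another option is to declare the byte order once in `fopen`, so that every later `fread` on that file uses it. This is a minimal sketch, assuming a big-endian file. The throwaway file name from `tempname` and the field values (which mirror the test file above) are only illustrative, not taken from the real data file.

```matlab
% Write a small big-endian test file (throwaway name from tempname).
fname = [tempname '.dat'];
fid = fopen(fname, 'w', 'ieee-be');    % third argument sets the byte order
fwrite(fid, ['myfile0' 0], 'uint8');   % 8-byte file ID, null terminated
fwrite(fid, 1000, 'uint16');           % major revision
fwrite(fid, 5000, 'uint16');           % minor revision
fwrite(fid, 65536, 'uint32');          % number of modules
fclose(fid);

% Read it back, declaring the same byte order when opening the file.
fid = fopen(fname, 'r', 'ieee-be');
fileId     = fread(fid, 8, 'uint8=>char')';   % row vector of 8 chars
majorRev   = fread(fid, 1, 'uint16');
minorRev   = fread(fid, 1, 'uint16');
numModules = fread(fid, 1, 'uint32');
fclose(fid);
delete(fname);

fprintf('%s %d %d %d\n', fileId(1:7), majorRev, minorRev, numModules);
```

If the values come back wrong, switching `'ieee-be'` to `'ieee-le'` in the second `fopen` is a quick way to test the other byte order.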

Author: Marissa Sileo

Updated on June 04, 2022

Comments

  • Marissa Sileo almost 2 years

    I have a data file that uses (char(1 byte), char[n](array of n chars), word(2 byte unsigned int), short(2 byte signed int), dword(4 byte unsigned int), long(4 byte signed int) and float(4 byte real)) and is supposedly in the following format. I am reading the data file into MATLAB with fopen, fread, etc. but the values I am getting are not what I expect.

    Format:

    • char[8] <-- outputs 8 ascii values that spell the correct string identifier
    • dword <--version of the data files, msw-major version, lsw-minor version (have tried reading as 1 uint32 and 2 uint16's)
    • dword
    • dword
    • dword
    • dword <--number of window displays in program
    • displayinfo[8] <--contains display window params in the following structure: (not sure what data type this is)
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword
    • dword (end of display window params; some are specified as must be a number in [0,3] and they aren't coming out like that)
    • char[16]
    • word <-- supposed to be year data was collected (2013) but coming up as 0

    Code:

    fid = fopen('MIC1.001','rb');
    fileIdentifier = fread(fid, 8,'char');    
    dataFileMajorVersion = fread(fid,1,'uint16');
    dataFileMinorVersion = fread(fid,1,'uint16');
    numModules = fread(fid,1,'uint32');
    
    fread(fid,1,'uint32'); % value not used     
    numSwipesCollected = fread(fid,1,'uint32');
    numWindowDisplays = fread(fid,1,'uint32');
    % display info vars:
    displayType = [];
    moduleNumber = [];
    channelNumber = [];    
    beginningBar = [];
    endBar = [];
    vertExpFactor = [];
    voltageOffset =[];
    isGridEnabled = [];
    isEngineeringUnitEnabled = [];
    colorOfDisplay = [];
    multiChannelIndex = [];
    numChannelsForMultiChannelDisp = [];
    multiChannelDispStyle = [];
    
    % or does it go through loop for all 8 whether or not there are 8 displays??
    for i=1:numWindowDisplays 
      displayType = [fread(fid,1,'uint32'); displayType];
      moduleNumber = [fread(fid,1,'uint32'); moduleNumber];
      channelNumber = [fread(fid,1,'uint32'); channelNumber];
      beginningBar = [fread(fid,1,'uint32'); beginningBar];
      endBar = [fread(fid,1,'uint32'); endBar];
      vertExpFactor = [fread(fid,1,'uint32'); vertExpFactor]; 
      voltageOffset =[fread(fid,1,'uint32'); voltageOffset];
      isGridEnabled = [fread(fid,1,'uint32'); isGridEnabled];
      isEngineeringUnitEnabled = [fread(fid,1,'uint32'); isEngineeringUnitEnabled];
      colorOfDisplay = [fread(fid,1,'uint32'); colorOfDisplay];
      multiChannelIndex = [fread(fid,1,'uint32'); multiChannelIndex];
      numChannelsForMultiChannelDisp = [fread(fid,1,'uint32'); numChannelsForMultiChannelDisp];
      multiChannelDispStyle = [fread(fid,1,'uint32'); multiChannelDispStyle];
    end    
    fread(fid,1,'uint32'); % value only used internally
    fread(fid,16,'char'); % unused parameter for future use
    yearOfDataCollection = fread(fid,1,'uint16'); 
    
  • Marissa Sileo almost 11 years
    But when you don't specify the data type, doesn't MATLAB read it wrong? And if I specified them all as chars, which they are not, it would definitely read them wrong I thought..
  • macduff almost 11 years
    @MarissaSileo, updated the answer, if you read them all as chars, or bytes, and assemble them later, it's ok.
  • Marissa Sileo almost 11 years
    Okay when I put it in MATLAB I get the same output as you. Forgive me for not knowing much about this, but I'm still a little lost. I've been trying to read up on the byte swapping as you mentioned but I was under the impression that MATLAB would pick up on whether the data was big or little endian and would then convert it accordingly. Am I wrong in assuming that? Maybe that is why I am getting random values like 2778916 when my data system manual says it should be a number between 0 and 3...
  • macduff almost 11 years
    @MarissaSileo, np! MATLAB doesn't have a way, that I am aware of, to automagically figure out the endianness of the file. Sounds like you've found your culprit: you need to swap bytes.
  • Ben Voigt almost 6 years
    Just A(9:10)' * [256; 1] seems so much better than converting to strings in between (note the transpose, since fread returns a column vector). The other byte order is easy to get as well: A(9:10)' * [1; 256]
  • Ben Voigt almost 6 years
    Don't byte swap like that. Instead do A(9:10)' * [256; 1] (transposed, since fread returns a column vector). Numeric manipulation is much faster than string work, and avoids the string problems Maurice points out with padding / leading zeros.