Open .mat (matlab data) using python

10,374

You can use scipy.io.loadmat for this:

from scipy import io

loaded = io.loadmat('/GAAR/ustr/projects/PBF/tmpPBworkspaceH5.mat')

loaded will be a dictionary mapping names to arrays.
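
A minimal sketch of working with that dictionary. The helper names `split_loaded` and `load_variables` are hypothetical and not part of scipy. Keys wrapped in double underscores (`__header__`, `__version__`, `__globals__`) are metadata that loadmat adds next to your variables. Flagging values by the class name `MatlabOpaque` is an assumption made here so the sketch does not depend on where a given scipy version exposes that class.

```python
def split_loaded(loaded):
    """Split a loadmat() result into (variables, metadata, opaque_names).

    variables    : entries that are not loadmat metadata
    metadata     : entries such as '__header__' or '__version__'
    opaque_names : variables scipy could not decode into plain arrays
    """
    variables, metadata, opaque_names = {}, {}, []
    for name, value in loaded.items():
        # loadmat stores its own bookkeeping under dunder-style keys
        if name.startswith('__') and name.endswith('__'):
            metadata[name] = value
            continue
        variables[name] = value
        # Assumption: undecodable MATLAB class objects arrive as MatlabOpaque
        if type(value).__name__ == 'MatlabOpaque':
            opaque_names.append(name)
    return variables, metadata, opaque_names


def load_variables(path):
    """Load a .mat file (v7 or earlier) and split its contents."""
    from scipy import io  # imported lazily so split_loaded works without scipy
    return split_loaded(io.loadmat(path))
```

If `opaque_names` is not empty, those variables are MATLAB class objects rather than plain arrays, and they probably need converting on the Matlab side before saving.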


If you control both the Matlab side and the Python side, however, it is often easier to export to CSV, for example with csvwrite (which expects a numeric matrix):

In Matlab:

csvwrite('path/tmpPBworkspaceH5.csv', rateQualityOutTrim)  % pass the variable itself, not its name in quotes

In Python:

import pandas as pd
df = pd.read_csv('tmpPBworkspaceH5.csv', header=None)  # csvwrite writes no header row

Author: SBad

Updated on July 09, 2022

Comments

  • SBad
    SBad almost 2 years

    All,

    I tried to import and read a .mat file in Python. I have tried two ways, but both were unsuccessful:

    Method 1: scipy, in Python:

    import scipy.io as sio
    mat = sio.loadmat('path/tmpPBworkspace.mat')
    

    I get output similar to:

    {'None': MatlabOpaque([ (b'rateQualityOutTrim', b'MCOS', b'dataset', array([[3707764736],
            [         2],
            [         1],
            [         1],
            [         1],
            [         1]], dtype=uint32))],
                  dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')]),
     '__function_workspace__': array([[ 0,  1, 73, ...,  0,  0,  0]], dtype=uint8),
     '__globals__': [],
     '__header__': b'MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Thu May 10 07:11:52 2018',
     '__version__': '1.0'}
    

    I am not sure what went wrong there. I was hoping to see a dataframe.

    To add: for Method 1, I saved the .mat file in a version compatible with scipy.

    In Matlab:

    save('path/tmpPBworkspace.mat','rateQualityOutTrim','-v7')
    

    I also tried another way:

    Method 2: h5py

    in Matlab:

    save('path/tmpPBworkspaceH5.mat','rateQualityOutTrim','-v7.3')
    

    in Python:

    import numpy as np
    import h5py
    f = h5py.File('/GAAR/ustr/projects/PBF/tmpPBworkspaceH5.mat','r')
    data = f.get('rateQualityOutTrim/date')
    data = np.array(data)
    

    I get:

    f
    Out[154]: <HDF5 file "tmpPBworkspaceH5.mat" (mode r)>
    
    data
    array(None, dtype=object)
    

    The array is empty. I am not sure how to access the data this way either.

    Thanks

    • hpaulj
      hpaulj almost 6 years
      The MatlabOpaque item is a Matlab class object (here a dataset) that scipy can't turn into a numpy array.
  • SBad
    SBad almost 6 years
    Thanks Ami Tavory. I did that and I get a similar message to before: {'None': MatlabOpaque([ (b'rateQualityOutTrim', b'MCOS', b'dataset', array([[3707764736], [ 2], [ 1], [ 1], [ 1], [ 1]], dtype=uint32))], dtype=[('s0', 'O'), ('s1', 'O'), ('s2', 'O'), ('arr', 'O')]), '__function_workspace__': array([[ 0, 1, 73, ..., 0, 0, 0]], dtype=uint8), '__globals__': [], '__header__': b'MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Fri May 11 03:33:35 2018', '__version__': '1.0'}
  • SBad
    SBad almost 6 years
    I'm still not sure how I can extract the data.
  • Ami Tavory
    Ami Tavory almost 6 years
    @SBad Got it - this is explained very nicely in this notebook - it's in Julia, but you can follow the explanations.
  • Ami Tavory
    Ami Tavory almost 6 years
    @SBad Incidentally, looking at your question, it looks like you're in control of the Matlab part as well. In this case, there are much easier options. I edited my answer to include one.
  • Ami Tavory
    Ami Tavory almost 6 years
    @SBad It's a long answer, since this format is really not meant to be used for exporting - it's reverse engineered, and you probably don't want to write in it in the first place.
  • SBad
    SBad almost 6 years
    I have a very large dataset (more than a million rows), so exporting to CSV is not optimal and may take a very long time. Saving the data as .mat and importing it in Python may be the best solution, I think.
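
On the h5py attempt in the question: `Group.get` returns `None` when the requested path does not exist, and `np.array(None)` is exactly the `array(None, dtype=object)` shown there. So `'rateQualityOutTrim/date'` is most likely not a path in the file. A minimal sketch for seeing which paths do exist is below. The helper names `list_paths` and `inspect_mat73` are hypothetical, and the sketch assumes the file has no cyclic links. h5py's own `visit` and `visititems` methods do the same job; the duck-typed version is used here only so it can be tried on plain nested dicts.

```python
def list_paths(group, prefix=''):
    """Return every path below `group`, depth first.

    Works on anything with a dict-like .items(), which includes
    h5py.File and h5py.Group objects as well as nested dicts.
    """
    paths = []
    for name, item in group.items():
        path = f'{prefix}/{name}'
        paths.append(path)
        # Groups have .items(); datasets (the leaves) do not
        if hasattr(item, 'items'):
            paths.extend(list_paths(item, path))
    return paths


def inspect_mat73(path):
    """List all paths in a -v7.3 .mat file (which is an HDF5 file)."""
    import h5py  # imported lazily so list_paths works without h5py
    with h5py.File(path, 'r') as f:
        return list_paths(f)
```

Calling `inspect_mat73` on the file from the question should show where the data actually sits. Only paths in that list can be passed to `f.get` or `f[...]`.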