Numpy loadtxt encoding

I was able to solve the problem myself.

I just had to open the file with the appropriate encoding before reading it with numpy:

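import codecs

import numpy as np

n = 10  # number of metadata rows to skip
myfile = '/path/to/myfile'

# Open the file as cp1252 text, then pass the open file object to loadtxt.
filecp = codecs.open(myfile, encoding='cp1252')
mydata = np.loadtxt(filecp, skiprows=n)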
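
As an alternative sketch, assuming NumPy 1.14 or newer: np.loadtxt accepts an encoding argument, so the file does not have to be opened by hand. The sample file, its contents, and n = 2 below are made up for illustration and are not from the original question.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample: two metadata lines followed by a table of floats.
# In cp1252 the "ö" is stored as the single byte 0xf6, which is not valid UTF-8.
sample = "Gerät: Größe\nHöhe in m\n1.0 2.0\n3.0 4.0\n"
with tempfile.NamedTemporaryFile(
    "w", encoding="cp1252", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    myfile = tmp.name

n = 2  # number of metadata rows to skip in this sample

# loadtxt opens the path with the given encoding and closes it afterwards.
mydata = np.loadtxt(myfile, skiprows=n, encoding="cp1252")
print(mydata)

os.remove(myfile)  # clean up the temporary sample file
```

Passing a path plus encoding lets loadtxt manage the file handle itself, so nothing is left open after the call.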

Thank you everyone!

Author by

Admin

Updated on August 20, 2022

Comments

  • Admin
    Admin about 1 year

    I am trying to load data with numpy.loadtxt... The file I'm trying to read uses cp1252 encoding. Is it possible to set the encoding to cp1252 in numpy?

    The following

    import numpy as np
    n = 10
    myfile = '/path/to/myfile'
    mydata = np.loadtxt(myfile, skiprows = n)
    

    gives:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 189: invalid start byte
    

    The file contains metadata (first n rows) followed by a table of floats.

    Edit: This problem only occurs when running this on Ubuntu (12.04). On Windows it works well. For this reason I think this problem is related to the encoding.

    Edit2: Opening the file as shown below also works:

    import codecs
    data = codecs.open(myfile, encoding='cp1252')
    datalines = data.readlines()
    

    However, I'd like to use np.loadtxt to read the data directly into a numpy array.

  • Martin Ueding
    Martin Ueding almost 7 years
    I have a hunch that this leads to a file descriptor leak unless one uses a context manager (with).
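
A minimal sketch of the context-manager variant this comment refers to, assuming Python 3 and a recent NumPy. It uses the built-in open rather than codecs.open (a substitution made for this example), and the sample file and n = 2 are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical cp1252 sample: two metadata lines, then a table of floats.
sample = "Gerät: Größe\nHöhe in m\n1.0 2.0\n3.0 4.0\n"
with tempfile.NamedTemporaryFile(
    "w", encoding="cp1252", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    myfile = tmp.name

n = 2  # number of metadata rows to skip in this sample

# The with block closes the file when it exits, even if loadtxt raises.
with open(myfile, encoding="cp1252") as filecp:
    mydata = np.loadtxt(filecp, skiprows=n)

print(filecp.closed)  # the handle is closed once the block has exited

os.remove(myfile)  # clean up the temporary sample file
```

The file handle lives only for the duration of the with block, so it is released as soon as the array has been read.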