How to manipulate the data after numpy.loadtxt?

Solution 1

Actually, np.loadtxt can't read that first row as a separate set of values, so the header needs special handling. I'll give two ways: the first is shorter, but the second is more straightforward.

1) you could do this 'hack' by reading the first row as header names:

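import numpy as np

y_and_data = np.genfromtxt('131014-data-xy-conv-1.txt', names=True, delimiter=',')
x = np.array(y_and_data.dtype.names[1:], int)
y = y_and_data['YX_mm']   # genfromtxt sanitizes the header 'Y/X (mm)' to 'YX_mm'
# view the all-float64 record array as a plain 2-D array, then drop the y column
data = y_and_data.view(np.float64).reshape(-1, len(y_and_data.dtype))[:,1:]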
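
A hedged sketch of the same hack without the raw view: numpy.lib.recfunctions.structured_to_unstructured turns the record array into a plain 2-D float array. The SAMPLE string holds the first rows of the question's file and is fed through io.StringIO; that inlining is an assumption made only so the snippet runs without the file on disk.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# First rows of the question's file, inlined so the sketch runs anywhere.
SAMPLE = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

# names=True makes genfromtxt use the first row as field names.
rec = np.genfromtxt(io.StringIO(SAMPLE), names=True, delimiter=',')
names = rec.dtype.names

x = np.array([int(n) for n in names[1:]])          # header labels after the first
y = rec[names[0]]                                  # first field holds the y-values
data = rfn.structured_to_unstructured(rec)[:, 1:]  # remaining fields as a 2-D array

print(x, y.shape, data.shape)
```

Indexing with names[0] instead of a hard-coded 'YX_mm' avoids depending on exactly how the header text gets sanitized.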

2) But I recommend reading the first line separately, saving it, and then reading the rest with loadtxt (or genfromtxt, which I've used here and recommend):

with open('131014-data-xy-conv-1.txt', 'r') as f:
    x = np.array(f.readline().split(',')[1:], int)
    y_and_data = np.genfromtxt(f, delimiter=',')
y = y_and_data[:,0]
data = y_and_data[:,1:]

How it works: first open the file, and call it f:

with open('131014-data-xy-conv-1.txt', 'r') as f:

    firstline = f.readline()           # read off the first line
    firstvalues = firstline.split(',') # split it on the comma
    xvalues = firstvalues[1:]          # and keep all but the first element
    x = np.array(xvalues, int)         # make it an array of integers (or float if you prefer)

Now that the first line has been read from f using f.readline, the remainder can be read with genfromtxt:

    y_and_data = np.genfromtxt(f, delimiter=',')

Finally, split the rest the same way the other answers do:

y = y_and_data[:,0]       # the first column is the y-values
data = y_and_data[:,1:]   # the remaining columns are the data

And this is the output:

In [58]: with open('131014-data-xy-conv-1.txt', 'r') as f:
   ....:     x = np.array(f.readline().split(',')[1:], int)
   ....:     y_and_data = np.genfromtxt(f, delimiter=',')
   ....: y = y_and_data[:,0]
   ....: data = y_and_data[:,1:]
   ....: 

In [59]: x
Out[59]: array([ 0, 10, 20, 30, 40])

In [60]: y
Out[60]: 
array([ 686.6 ,  694.08,  701.56,  709.04,  716.52,  724.  ,  731.48,
        738.96,  746.44,  753.92,  761.4 ,  768.88,  776.36])

In [61]: data
Out[61]: 
array([[  -5.02 ,   -0.417,    0.   ,  100.627,    0.   ],
       [  -5.02 ,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.689,  -17.667,   -5.704,   -3.482],
       [   4.572,   -4.689,  -17.186,   -5.704,   -2.51 ],
       [   4.572,   -4.486,  -17.186,   -5.138,   -2.51 ],
       [   6.323,   -4.486,  -16.396,   -5.138,   -1.933],
       [   6.323,   -4.977,  -16.396,   -5.319,   -1.933],
       [   7.007,   -4.251,  -16.577,   -5.319,   -1.688],
       [   7.007,   -4.251,  -16.577,   -5.618,   -1.688],
       [   7.338,   -3.514,  -16.78 ,   -5.618,   -1.207],
       [   7.338,   -3.514,  -16.78 ,   -4.657,   -1.207],
       [   7.263,   -3.877,  -15.99 ,   -4.657,   -0.822]])
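
The steps above can be wrapped into a small helper. This is only a sketch: load_xy_grid is a hypothetical name, not a NumPy function, and the SAMPLE text (the first rows of the question's file, passed through io.StringIO) is an assumption so the example runs without the file.

```python
import io
import numpy as np

# First rows of the question's file, inlined so the sketch runs anywhere.
SAMPLE = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

def load_xy_grid(f, delimiter=','):
    """Hypothetical helper: split an open text file into x, y and data."""
    header = f.readline()                               # consume the header row
    x = np.array([int(tok) for tok in header.split(delimiter)[1:]])
    y_and_data = np.genfromtxt(f, delimiter=delimiter)  # reads the remaining rows
    return x, y_and_data[:, 0], y_and_data[:, 1:]

x, y, data = load_xy_grid(io.StringIO(SAMPLE))
print(x.shape, y.shape, data.shape)
```

With the real file, the call would be made inside `with open('131014-data-xy-conv-1.txt') as f:`. Note that the 2-D slicing assumes at least two data rows, since genfromtxt returns a 1-D array for a single row.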

Solution 2

If you just want xs, ys, and data in separate arrays, you can do this:

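import numpy as np

with open('131014-data-xy-conv-1.txt') as f:
    xs = np.array(f.readline().split(',')[1:], int)
rawdata = np.loadtxt('131014-data-xy-conv-1.txt', delimiter=',', skiprows=1)
ys = rawdata[:, 0]
data = rawdata[:, 1:]

Note the skiprows keyword to ignore the first row of the file, and delimiter=',' because the file is comma-separated (loadtxt splits on whitespace by default).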

Solution 3

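Adding to @bogatron's answer, you can pass the argument unpack=True so that loadtxt returns the columns instead of the rows. The xs still have to come from the header line, but ys and the data columns can then be split in one line:

ys, *columns = np.loadtxt('131014-data-xy-conv-1.txt', delimiter=',', skiprows=1, unpack=True)
data = np.column_stack(columns)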
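
A quick way to see what unpack=True actually returns is to compare shapes with and without it. In this sketch the SAMPLE text (the first rows of the question's file, via io.StringIO) is an assumption so it runs without the file. Because the result is transposed, the number of names on the left-hand side has to match the number of columns, not three.

```python
import io
import numpy as np

# First rows of the question's file, inlined so the sketch runs anywhere.
SAMPLE = """Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
"""

# Without unpack: one entry per file row.
rows = np.loadtxt(io.StringIO(SAMPLE), delimiter=',', skiprows=1)
# With unpack=True: the transpose, one entry per file column.
cols = np.loadtxt(io.StringIO(SAMPLE), delimiter=',', skiprows=1, unpack=True)

ys = cols[0]        # first column: the y-values
data = cols[1:].T   # remaining columns, transposed back to row-major layout

print(rows.shape, cols.shape, data.shape)
```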
Changju.rhee
Author by

Updated on June 14, 2022

Comments

  • Changju.rhee, almost 2 years

    I have raw data like the example below. The text file has the x labels in the 1st row and the y labels in the 1st column. Let's call the file '131014-data-xy-conv-1.txt'.

    Y/X (mm),   0,  10, 20, 30, 40
    686.6,  -5.02,  -0.417, 0,  100.627,    0
    694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
    701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
    709.04, 1.869,  -4.689, -17.667,    -5.704, -3.482
    716.52, 4.572,  -4.689, -17.186,    -5.704, -2.51 
    724,    4.572,  -4.486, -17.186,    -5.138, -2.51
    731.48, 6.323,  -4.486, -16.396,    -5.138, -1.933
    738.96, 6.323,  -4.977, -16.396,    -5.319, -1.933
    746.44, 7.007,  -4.251, -16.577,    -5.319, -1.688
    753.92, 7.007,  -4.251, -16.577,    -5.618, -1.688
    761.4,  7.338,  -3.514, -16.78, -5.618, -1.207
    768.88, 7.338,  -3.514, -16.78, -4.657, -1.207
    776.36, 7.263,  -3.877, -15.99, -4.657, -0.822
    

    (Q1) As you can see, the raw data has the x labels in the 1st row and the y labels in the 1st column. If I use the numpy.loadtxt function, how do I split out "xs" and "ys"?

    rawdata = numpy.loadtxt('131014-data-xy-conv-1.txt')
    xs, ys, data = func(rawdata)
    

    Do I have to implement additional logic, or is there a function for this?