convert string to 2d numpy array

Solution 1

Here's what I did to get the result you're looking for:

import numpy as np

b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
a = np.array([[float(j) for j in i.split('\t')] for i in b.splitlines()])
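
The same idea can be wrapped in a small reusable function. This is only a sketch: the name parse_table and its col_sep and row_sep parameters are made up for illustration, and it assumes every non-empty row has the same number of numeric fields.

```python
import numpy as np

def parse_table(text, col_sep='\t', row_sep='\n'):
    """Hypothetical helper: parse delimited text into a 2D float array."""
    # Drop empty rows, such as the one produced by a trailing row separator.
    rows = [row for row in text.split(row_sep) if row]
    return np.array([[float(field) for field in row.split(col_sep)] for row in rows])

b = '191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
a = parse_table(b)
```

Passing the delimiters as arguments keeps the helper usable for other layouts, for example comma-separated columns with semicolon-separated rows.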

Solution 2

Instead of splitting and filtering, you could use np.fromstring:

>>> np.fromstring(b, sep='\t').reshape(-1, 4)
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])

np.fromstring always returns a 1D array, so reshaping is necessary.
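
If the number of columns is not known in advance, it can probably be read off the first row rather than hard-coded. A minimal sketch, assuming every row has the same number of tab-separated fields (n_cols and first_row are illustrative names):

```python
import numpy as np

b = '191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'

# Count the fields in the first row instead of hard-coding the column count.
first_row = b.split('\n', 1)[0]
n_cols = first_row.count('\t') + 1

# Parse every number into a flat 1D array, then fold it into rows.
flat = np.fromstring(b, sep='\t')
a = flat.reshape(-1, n_cols)
```

reshape raises a ValueError if the total number of values is not a multiple of n_cols, which gives a basic check against ragged input.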

Alternatively, to avoid reshaping, you could use np.genfromtxt with the help of the standard library's io module. The example below assumes b is a string of bytes (as strings are in Python 2); on Python 3, where b is a str, wrap it in io.StringIO(b) instead:

>>> import io
>>> np.genfromtxt(io.BytesIO(b))
array([[ 191.25 ,    0.   ,    0.   ,    1.   ],
       [ 191.251,    0.   ,    0.   ,    1.   ],
       [ 191.252,    0.   ,    0.   ,    1.   ]])

genfromtxt handles missing values, as well as offering much more control over how the final array is created.
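
As a rough illustration of the missing-value handling on Python 3, the sketch below uses a made-up input s in which one field is empty. It assumes tab-delimited columns and passes delimiter='\t' explicitly so that the empty field is recognised as a column.

```python
import io
import numpy as np

# Hypothetical input: the second field of the first row is empty.
s = '191.250\t\t0\t1\n191.251\t0.00\t0\t1\n'

# On Python 3 the text is a str, so it is wrapped in io.StringIO.
a = np.genfromtxt(io.StringIO(s), delimiter='\t')

# The empty field is filled with nan by default.
missing = np.isnan(a)
```

The mask in missing can then be used to locate or replace the gaps, for example with np.where.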

Author by

A B

Updated on June 27, 2022

Comments

  • A B
    A B almost 2 years

    I am trying to convert 'b' (a string in which the column entries are separated by one delimiter and the rows are separated by another delimiter) to 'a' (a 2D numpy array), like:

    b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
    a=numpy.array([[191.25,0,0,1],[191.251,0,0,1],[191.252,0,0,1]])
    

    The way I do it is (using my knowledge that there are 4 columns in 'a'):

    a=numpy.array(filter(None,re.split('[\n\t]+',b)),dtype=float).reshape(-1,4)
    

    Is there a better way?