convert 2d numpy array to string
Solution
A one-liner will do:
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)
Using a simpler example:
>>> a = np.arange(25, dtype=float).reshape(5, 5)
>>> a
array([[  0.,   1.,   2.,   3.,   4.],
       [  5.,   6.,   7.,   8.,   9.],
       [ 10.,  11.,  12.,  13.,  14.],
       [ 15.,  16.,  17.,  18.,  19.],
       [ 20.,  21.,  22.,  23.,  24.]])
This:
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)
print(b)
prints this:
0.000 1.000 2.000 3.000 4.000
5.000 6.000 7.000 8.000 9.000
10.000 11.000 12.000 13.000 14.000
15.000 16.000 17.000 18.000 19.000
20.000 21.000 22.000 23.000 24.000
Explanation
You already used a list comprehension in your second method. Here we have a generator expression, which looks exactly like a list comprehension. The only syntactical difference is that the [] are replaced by (). A generator expression does not build the list but hands a so-called generator to join. In the end it has the same effect but skips the step of building this intermediate list.
There can be multiple for clauses in such an expression, which makes it nested.
This:
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)
is equivalent to:
res = []
for y in a:
    res.append('\t'.join('%0.3f' %x for x in y))
b = '\n'.join(res)
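The equivalence can be checked with a small sketch. The array size and the names b_gen, b_list and b_loop are only chosen for illustration. It also shows why no dimension has to be named: iterating over a 2D array yields its rows, and iterating over a row yields its elements.

```python
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

# Generator expressions: `for y in a` walks the rows,
# `for x in y` walks the elements of one row.
b_gen = '\n'.join('\t'.join('%0.3f' % x for x in y) for y in a)

# The same with list comprehensions (builds intermediate lists).
b_list = '\n'.join(['\t'.join(['%0.3f' % x for x in y]) for y in a])

# The same with an explicit loop.
res = []
for y in a:
    res.append('\t'.join('%0.3f' % x for x in y))
b_loop = '\n'.join(res)

print(b_gen == b_list == b_loop)  # True
print(repr(b_gen))  # '0.000\t1.000\t2.000\n3.000\t4.000\t5.000'
```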
Performance
I use %%timeit in the IPython Notebook:
%%timeit
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)
10 loops, best of 3: 42.4 ms per loop
%%timeit
b=''
for i in range(0,a.shape[0]):
    for j in range(0,a.shape[1]-1):
        b+=str(a[i,j])+'\t'
    b+=str(a[i,-1])+'\n'
10 loops, best of 3: 50.2 ms per loop
%%timeit
b=''
for i in range(0,a.shape[0]):
    b+='\t'.join(['%0.3f' %x for x in a[i,:]])+'\n'
10 loops, best of 3: 43.8 ms per loop
It looks like they all run at about the same speed. In fact, += on strings is optimized in CPython; otherwise it would be much slower than the join() approach. Other Python implementations such as Jython or PyPy can show much bigger time differences, with join() much faster than +=.
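Outside of IPython, a comparable measurement can be sketched with the standard timeit module. The array size and the function names with_join and with_concat are assumptions made for this example, and the timings depend on the machine and the Python implementation.

```python
import timeit
import numpy as np

# Small test array; the size is an arbitrary choice for this sketch.
a = np.arange(2000, dtype=float).reshape(500, 4)

def with_join():
    # Nested generator expressions handed to join().
    return '\n'.join('\t'.join('%0.3f' % x for x in y) for y in a)

def with_concat():
    # Repeated string concatenation with +=.
    b = ''
    for i in range(a.shape[0]):
        b += '\t'.join(['%0.3f' % x for x in a[i, :]]) + '\n'
    return b

# Both variants produce the same text, apart from the trailing newline.
assert with_join() + '\n' == with_concat()

for func in (with_join, with_concat):
    # Best of 3 repeats, 10 calls each, similar to the %%timeit report.
    best = min(timeit.repeat(func, number=10, repeat=3))
    print('%s: %.2f ms per loop' % (func.__name__, best / 10 * 1000))
```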
A B
Updated on June 23, 2022
Comments
-
A B almost 2 years
I am new to Python and am trying to convert a 2d numpy array, like:
a=numpy.array([[191.25,0,0,1],[191.251,0,0,1],[191.252,0,0,1]])
to a string in which the column entries are separated by one delimiter '\t' and the rows are separated by another delimiter '\n', with control over the precision of each column, to get:
b='191.250\t0.00\t0\t1\n191.251\t0.00\t0\t1\n191.252\t0.00\t0\t1\n'
First, I create the array by:
import numpy as np
# first column as a column vector: 191.25, 191.251, ...
col1=np.arange(191.25,196.275,.001)[:, np.newaxis]
nrows=col1.shape[0]
# integer columns of zeros and ones with the same number of rows
col2=np.zeros((nrows,1),dtype=int)
col3=np.zeros((nrows,1),dtype=int)
col4=np.ones((nrows,1),dtype=int)
a=np.hstack((col1,col2,col3,col4))
Then I produce b, by one of 2 methods:
Method 1:
b=''
for i in range(0,a.shape[0]):
    for j in range(0,a.shape[1]-1):
        b+=str(a[i,j])+'\t'
    b+=str(a[i,-1])+'\n'
b
Method 2:
b=''
for i in range(0,a.shape[0]):
    b+='\t'.join(['%0.3f' %x for x in a[i,:]])+'\n'
b
However, I'm guessing there are better ways of producing a and b. I am looking for the most efficient ways (i.e. memory, time, code compactness) to create a and b.
Follow up questions
Thank you Mike,
b = '\n'.join('\t'.join('%0.3f' %x for x in y) for y in a)+'\n'
worked for me but I have a few follow up questions (this couldn't fit in the comment section):
- Though this is more compact, is the speed the same as executing a nested for loop, as this is what seems to be going on within the parentheses?
- I understand that x and y are iterators across the 2 dimensions of a, however, how does Python "know" they are and which dimensions they are supposed to iterate across? In Matlab, for example, these things have to be explicitly stated.
- Is there a way to independently set the precision for each column (e.g. I'd like %0.3f for the first three columns and %0.0f for the last column)?
- Is there an easy way to do the reverse procedure- i.e. given b, produce a? I have come up with 2 methods:
Method 1
y=b.split('\n')[:-1]
z=[y[i].split('\t') for i in range(0,len(y))]
a=numpy.array(z,dtype=float)
Method 2
import re
# list() is needed in Python 3, where filter() returns an iterator
a=numpy.array(list(filter(None,re.split('[\n\t]+',b))),dtype=float).reshape(-1,4)
Is there a better way?
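For follow-up questions 3 and 4, one possible sketch (not taken from the answer above) pairs each value with a per-column format string and lets np.loadtxt parse the text back. The name fmts is hypothetical, and the small array from the start of the question is used as input.

```python
import io
import numpy as np

a = np.array([[191.25, 0, 0, 1],
              [191.251, 0, 0, 1],
              [191.252, 0, 0, 1]])

# One format string per column (hypothetical name `fmts`).
fmts = ['%0.3f', '%0.3f', '%0.3f', '%0.0f']

# zip() pairs every value in a row with the format of its column.
b = '\n'.join('\t'.join(f % x for f, x in zip(fmts, row)) for row in a) + '\n'
print(repr(b))

# Reverse direction: parse the tab-separated text back into an array.
a2 = np.loadtxt(io.StringIO(b), delimiter='\t')
print(np.allclose(a, a2))  # True
```

Whether this is faster or more compact than the two methods above would need to be measured on the real data.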
-
A B over 8 years
Hi Mike, thanks, that worked for me. I have a few follow up questions but was unable to fit them here so I have included them in an edit of my original question.
-
Mike Müller over 8 years
@AB I added an explanation to your first two additional questions. You can accept an answer if it solves your problem. I think my answer does.
-
Mike Müller over 8 years
@AB I recommend that you create two new questions, such as "Conditional formation of array rows" for 3. and "How to make a NumPy array from a string?" for 4. Otherwise, this question becomes too crowded. Also, the answer should be useful for others too. But you need a well-formulated question to find what you are looking for. Hiding answers in other questions does not help. Just point me to these new questions and I will have a look at them.