Why do numpy array arr2d[:, :1] and arr2d[:, 0] produce different results?
Solution 1
1) When you say arr2d[:, 0], you're saying "give me the 0th index of all the rows in arr2d" (another way of saying "give me the 0th column").
2) When you say arr2d[:, :1], you're saying "give me the :1 slice of all the rows in arr2d". NumPy interprets :1 the same as it would interpret 0:1. Thus, you're saying "for each row, give me index 0 up to, but not including, index 1". This turns out to be just the 0th index, but you've explicitly asked for the second dimension to have length one (since 0:1 has "length" one).
So:
1)
print(arr2d[:, 0].shape)
Output:
(3,)
2)
print(arr2d[:, 0:1].shape)
Output:
(3, 1)
I still don't get it, why don't they return the same thing?
Consider:
print(arr2d[:, 0:3])
print(arr2d[:, 0:3].shape)
print(arr2d[:, 0:2])
print(arr2d[:, 0:2].shape)
print(arr2d[:, 0:1])
print(arr2d[:, 0:1].shape)
This outputs:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
(3, 3)
[[1 2]
 [4 5]
 [7 8]]
(3, 2)
[[1]
 [4]
 [7]]
(3, 1)
It would be unexpected and inconsistent for that last result to suddenly lose its second axis and become one-dimensional, just because the slice happens to cover a single column.
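The consistency argument above can be checked directly. The following is a minimal sketch, assuming the same 3x3 arr2d used in the question; the name shapes is introduced here only for illustration. It shows that a slice on the second axis always keeps that axis, even when the slice is empty.

```python
import numpy as np

# Same array as in the question.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Every slice 0:k on the second axis keeps that axis, whatever its length.
shapes = [arr2d[:, 0:k].shape for k in (3, 2, 1, 0)]
print(shapes)  # [(3, 3), (3, 2), (3, 1), (3, 0)]
```

The result stays two-dimensional in every case; only the length of the second axis changes.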
Solution 2
With a list, you have the same behavior you described:
>>> a = [1, 2, 3, 4]
>>> a[0]
1
>>> a[:1]
[1]
What NumPy adds is the notion of axes, which makes this a little less intuitive.
In the first case, you're returning an item at a specific index; in the second case, you're returning a slice of the list.
With NumPy, in the former case you're selecting all the items in the first column, which returns a one-axis array (one fewer axis than the parent, as expected with integer indexing). In the latter case you're slicing the original array, so the result keeps the same number of axes as the parent array.
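A quick way to see the axis count described above is to inspect ndim. This is a small sketch, assuming the arr2d from the question; col_indexed and col_sliced are illustrative names, not part of the original post.

```python
import numpy as np

# Same array as in the question.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_indexed = arr2d[:, 0]   # integer index on axis 1: that axis is dropped
col_sliced = arr2d[:, :1]   # slice on axis 1: that axis is kept with length 1

print(arr2d.ndim, col_indexed.ndim, col_sliced.ndim)  # 2 1 2
print(np.ndim(arr2d[0, 0]))  # 0 (integer indices on both axes leave a scalar)
```

Each integer index removes one axis, while a slice never does.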
Solution 3
Index ':1' implies: 'the list of items from index 0 to index 0', which is obviously a list of 1 item.
Index '0' implies: 'the item at index 0'.
Extending that to your question should make the results you obtained pretty clear.
arr2d[:, :1] means 'data corresponding to all rows and the list of columns 0 to 0'. So the result is a list of lists (a 2-D array).
arr2d[:, 0] means 'data corresponding to all rows and just the first column'. So it is just a list (a 1-D array).
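To make the "list of lists" versus "list" distinction concrete, here is a hedged sketch using the arr2d from the question (as_column and as_flat are names chosen for this example). It also shows one possible way to convert between the two results; other approaches exist.

```python
import numpy as np

# Same array as in the question.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

as_column = arr2d[:, :1]   # shape (3, 1)
as_flat = arr2d[:, 0]      # shape (3,)

print(as_column.tolist())  # [[1], [4], [7]]
print(as_flat.tolist())    # [1, 4, 7]

# Same values, different shape: dropping or restoring the axis converts between them.
print(np.array_equal(as_column[:, 0], as_flat))           # True
print(np.array_equal(as_flat.reshape(-1, 1), as_column))  # True
```

The two expressions select the same data; they differ only in whether the column axis survives.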
JJJJ
Updated on June 04, 2022
Comments
JJJJ almost 2 years
Say:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[:, :1] gives me array([[1], [4], [7]])
arr2d[:, 0] gives me array([1, 4, 7])
I thought they would produce exactly the same thing.