Why do numpy array arr2d[:, :1] and arr2d[:, 0] produce different results?

python numpy indexing python-2.x

Solution 1

1) When you say arr2d[:, 0], you're asking for the element at index 0 of every row in arr2d (in other words, the 0th column). Because 0 is a single integer index, that axis is dropped and the result is one-dimensional.

2) When you say arr2d[:, :1], you're asking for the slice :1 of every row in arr2d. NumPy interprets :1 the same as 0:1, so you're saying "for each row, give me the elements from index 0 up to, but not including, index 1". That turns out to be just the element at index 0, but because you asked with a slice, the second dimension is kept, with length one (0:1 is a range of length one).

So:

print arr2d[:, 0].shape

Output:

(3L,)

print arr2d[:, 0:1].shape

Output:

(3L, 1L)

I still don't get it. Why don't they return the same thing?

Consider:

print arr2d[:, 0:3]
print arr2d[:, 0:3].shape

print arr2d[:, 0:2]
print arr2d[:, 0:2].shape

print arr2d[:, 0:1]
print arr2d[:, 0:1].shape

This outputs:

[[1 2 3]
 [4 5 6]
 [7 8 9]]
(3L, 3L)

[[1 2]
 [4 5]
 [7 8]]
(3L, 2L)

[[1]
 [4]
 [7]]
(3L, 1L)

It would be a bit unexpected and inconsistent for that last shape to be (3L,).
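As a minimal sketch of the point above, the following compares the two expressions directly. It is written for Python 3, where shapes print without the L suffix seen in the Python 2 output above; the names col_1d and col_2d are made up for this illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_1d = arr2d[:, 0]    # integer index: the column axis is dropped
col_2d = arr2d[:, :1]   # slice: the column axis is kept, with length 1

print(col_1d.shape, col_1d.ndim)  # (3,) 1
print(col_2d.shape, col_2d.ndim)  # (3, 1) 2

# Same values, different number of axes.
print(np.array_equal(col_1d, col_2d[:, 0]))  # True
```

Both results hold the values 1, 4, 7; only the number of axes differs.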

Solution 2

With a list, you have the same behavior you described:

>>> a = [1, 2, 3, 4]
>>> a[0]
1
>>> a[:1]
[1]

What NumPy adds is the notion of axes, which makes this a little less intuitive.

In the first case, you're returning the item at a specific index; in the second case, you're returning a slice of the list.

With NumPy, in the former case you're selecting all the items in the first column, which returns a one-axis array (one axis fewer than the parent, as expected with integer indexing). In the latter case you're slicing the original array, so the result keeps the same number of dimensions as the parent array.
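The rule about axes can be checked with ndim. This is only a sketch in Python 3; the variable names item, piece, arr and arr2d are chosen for the illustration.

```python
import numpy as np

a = [1, 2, 3, 4]
item = a[0]      # indexing a list returns the element itself
piece = a[:1]    # slicing a list returns a (shorter) list

arr = np.array(a)
print(arr[0].ndim)   # 0: integer indexing removed the only axis
print(arr[:1].ndim)  # 1: slicing kept it

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:, 0].ndim)   # 1: one axis fewer than the parent
print(arr2d[:, :1].ndim)  # 2: same number of axes as the parent
```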

Solution 3

Index ':1' implies:

'The list of items from index 0 to index 0', which is necessarily a list of 1 item.

Index '0' implies:

'The item at index 0'.

Extending that to your question should make the results you obtained pretty clear.

arr2d[:, :1] means 'data corresponding to all rows and the list of columns 0 to 0'.

So the result is a 2-D array with a single column, analogous to a list of lists.

arr2d[:, 0] means 'data corresponding to all rows and just the first column'.

So it is just a 1-D array, analogous to a flat list.
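The same distinction can be reproduced with plain nested lists, which may make the analogy easier to see. This is a rough sketch, not how NumPy works internally; rows, sliced and indexed are names made up for the example.

```python
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Slicing each row keeps a (one-item) list per row, like arr2d[:, :1].
sliced = [row[:1] for row in rows]

# Indexing each row pulls out the bare item, like arr2d[:, 0].
indexed = [row[0] for row in rows]

print(sliced)   # [[1], [4], [7]]
print(indexed)  # [1, 4, 7]
```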

Author by

JJJJ

Updated on June 04, 2022

Comments

JJJJ almost 2 years

Say:

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

arr2d[:, :1] gives me

array([[1],
       [4],
       [7]])

arr2d[:,0] gives me

array([1, 4, 7])

I thought they would produce exactly the same thing.
