Selecting multiple slices from a numpy array at once
Solution 1
You can use the indexes to select the rows you want into the appropriate shape. For example:
import numpy as np
data = np.random.normal(size=(100,2,2,2))
# Creating an array of row-indexes
indexes = np.array([np.arange(0,5), np.arange(1,6), np.arange(2,7)])
# data[indexes] will return an element of shape (3,5,2,2,2). Converting
# to list happens along axis 0
data_extractions = list(data[indexes])
np.all(data_extractions[1] == data[1:6])
True
The final comparison is against the original data.
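As the comments note, the outer list has to be wrapped in a numpy array. A minimal sketch of the difference, assuming the failure people ran into was the index arrays being interpreted one per axis (forced here with an explicit tuple, so the behavior does not depend on the NumPy version):

```python
import numpy as np

data = np.random.normal(size=(100, 2, 2, 2))
ranges = [np.arange(0, 5), np.arange(1, 6), np.arange(2, 7)]

# One 2-D integer array: every entry indexes axis 0.
stacked = data[np.array(ranges)]
print(stacked.shape)  # (3, 5, 2, 2, 2)

# A tuple of arrays: one index array per axis (axes 0, 1 and 2 here).
# Axes 1 and 2 only have length 2, so the larger indexes are out of bounds.
try:
    data[tuple(ranges)]
    per_axis_failed = False
except IndexError:
    per_axis_failed = True
print(per_axis_failed)  # True
```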
Solution 2
stride_tricks can do that:
a = np.arange(10)
b = np.lib.stride_tricks.as_strided(a, (3, 5), 2 * a.strides)
b
# array([[0, 1, 2, 3, 4],
# [1, 2, 3, 4, 5],
# [2, 3, 4, 5, 6]])
Please note that b references the same memory as a, in fact multiple times (for example b[0, 1] and b[1, 0] are the same memory address). It is therefore safest to make a copy before working with the new structure.
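A small sketch of what that aliasing means in practice, and of how a copy avoids it. The variable name safe is only for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
b = as_strided(a, (3, 5), 2 * a.strides)

print(np.shares_memory(a, b))  # True

# Writing through one position of the view changes every alias of it.
b[0, 1] = 99
print(b[1, 0], a[1])  # 99 99

# A copy owns its memory, so writes no longer leak back into a.
safe = b.copy()
safe[0, 2] = -1
print(a[2])  # 2
```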
Higher-dimensional cases can be done in a similar fashion, for example 2D -> 4D:
a = np.arange(16).reshape(4, 4)
b = np.lib.stride_tricks.as_strided(a, (3,3,2,2), 2*a.strides)
b.reshape(9,2,2) # this forces a copy
# array([[[ 0, 1],
# [ 4, 5]],
# [[ 1, 2],
# [ 5, 6]],
# [[ 2, 3],
# [ 6, 7]],
# [[ 4, 5],
# [ 8, 9]],
# [[ 5, 6],
# [ 9, 10]],
# [[ 6, 7],
# [10, 11]],
# [[ 8, 9],
# [12, 13]],
# [[ 9, 10],
# [13, 14]],
# [[10, 11],
# [14, 15]]])
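If your NumPy is 1.20 or newer, np.lib.stride_tricks.sliding_window_view should give the same windows without hand-written strides. This is a different function from the one used in this answer, shown only as a sketch for comparison; by default it returns a read-only view.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

a = np.arange(16).reshape(4, 4)

# Hand-written strides, as above.
b = as_strided(a, (3, 3, 2, 2), 2 * a.strides)

# Same 2x2 windows; shape and strides are worked out by NumPy.
w = sliding_window_view(a, (2, 2))

print(w.shape)               # (3, 3, 2, 2)
print(np.array_equal(w, b))  # True
print(w.flags.writeable)     # False
```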
Solution 3
This post uses a strided-indexing scheme based on np.lib.stride_tricks.as_strided. It creates a view into the input array, so it is cheap to construct and, being a view, occupies no additional memory.
It also works for ndarrays with any number of dimensions.
Here's the implementation -
def strided_axis0(a, L):
    # Store the shape and strides info
    shp = a.shape
    s = a.strides
    # Compute length of output array along the first axis
    nd0 = shp[0]-L+1
    # Setup shape and strides for use with np.lib.stride_tricks.as_strided
    # and get (n+1) dim output array
    shp_in = (nd0,L)+shp[1:]
    strd_in = (s[0],) + s
    return np.lib.stride_tricks.as_strided(a, shape=shp_in, strides=strd_in)
Sample run for a 4D array case -
In [44]: a = np.random.randint(11,99,(10,4,2,3)) # Array
In [45]: L = 5 # Window length along the first axis
In [46]: out = strided_axis0(a, L)
In [47]: np.allclose(a[0:L], out[0]) # Verify outputs
Out[47]: True
In [48]: np.allclose(a[1:L+1], out[1])
Out[48]: True
In [49]: np.allclose(a[2:L+2], out[2])
Out[49]: True
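Since as_strided does not validate shapes or strides itself, a guarded variant may be worth having. The following is only a sketch: strided_axis0_checked is a hypothetical name, and the checks and the read-only flag are additions, not part of the function above.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def strided_axis0_checked(a, L):
    """Sliding windows of length L along axis 0, with basic validation."""
    a = np.asarray(a)
    if a.ndim < 1:
        raise ValueError("a must have at least one dimension")
    if not 1 <= L <= a.shape[0]:
        raise ValueError("L must satisfy 1 <= L <= a.shape[0]")
    nd0 = a.shape[0] - L + 1
    shape = (nd0, L) + a.shape[1:]
    strides = (a.strides[0],) + a.strides
    # writeable=False guards against accidental writes to aliased memory.
    return as_strided(a, shape=shape, strides=strides, writeable=False)

a = np.random.randint(11, 99, (10, 4, 2, 3))
out = strided_axis0_checked(a, 5)
print(out.shape)  # (6, 5, 4, 2, 3)
```

With the length check in place, a window longer than the array raises ValueError instead of producing a view with a non-positive leading dimension.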
Solution 4
You can slice your array with a prepared slicing array
a = np.array(list('abcdefg'))
b = np.array([
[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6]
])
a[b]
However, b doesn't have to be generated by hand in this way. It can be more dynamic with
b = np.arange(5) + np.arange(3)[:, None]
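The same broadcasting trick can be wrapped in a small helper. This is a sketch only: window_indices and its step parameter are hypothetical names introduced for illustration.

```python
import numpy as np

def window_indices(n, length, step=1):
    """Index array with one row per window over a sequence of n items."""
    if not 1 <= length <= n or step < 1:
        raise ValueError("need 1 <= length <= n and step >= 1")
    n_windows = (n - length) // step + 1
    # (length,) + (n_windows, 1) broadcasts to (n_windows, length).
    return np.arange(length) + step * np.arange(n_windows)[:, None]

a = np.array(list('abcdefg'))

b = window_indices(len(a), 5)
print(a[b].shape)  # (3, 5)

c = window_indices(len(a), 3, step=2)  # windows starting at 0, 2, 4
print(a[c])
```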
Solution 5
In the general case you have to do some sort of iteration - and concatenation - either when constructing the indexes or when collecting the results. It's only when the slicing pattern is itself regular that you can use a generalized slicing via as_strided.
The accepted answer constructs an indexing array, one row per slice. So that is iterating over the slices, and arange itself is a (fast) iteration. And np.array concatenates them on a new axis (np.stack generalizes this).
In [264]: np.array([np.arange(0,5), np.arange(1,6), np.arange(2,7)])
Out[264]:
array([[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6]])
The indexing_tricks convenience methods do the same thing:
In [265]: np.r_[0:5, 1:6, 2:7]
Out[265]: array([0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 2, 3, 4, 5, 6])
This takes the slicing notation, expands it with arange and concatenates. It even lets me expand and concatenate into 2d:
In [269]: np.r_['0,2',0:5, 1:6, 2:7]
Out[269]:
array([[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6]])
In [270]: data=np.array(list('abcdefghijk'))
In [272]: data[np.r_['0,2',0:5, 1:6, 2:7]]
Out[272]:
array([['a', 'b', 'c', 'd', 'e'],
['b', 'c', 'd', 'e', 'f'],
['c', 'd', 'e', 'f', 'g']],
dtype='<U1')
In [273]: data[np.r_[0:5, 1:6, 2:7]]
Out[273]:
array(['a', 'b', 'c', 'd', 'e', 'b', 'c', 'd', 'e', 'f', 'c', 'd', 'e',
'f', 'g'],
dtype='<U1')
Concatenating results after indexing also works.
In [274]: np.stack([data[0:5],data[1:6],data[2:7]])
My memory from other SO questions is that relative timings are in the same order of magnitude. It may vary for example with the number of slices versus their length. Overall the number of values that have to be copied from source to target will be the same.
If the slices vary in length, you'd have to use the flat indexing.
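A sketch of that flat-indexing route for slices of different lengths. The slice bounds and the lengths list are made up for illustration; np.split is one way to recover the individual pieces afterwards.

```python
import numpy as np

data = np.array(list('abcdefghijk'))

# Three slices of lengths 3, 5 and 1, expanded into one flat index.
flat_index = np.r_[0:3, 2:7, 5:6]
flat = data[flat_index]
print(flat)

# Cut the flat result back into one array per slice.
lengths = [3, 5, 1]
pieces = np.split(flat, np.cumsum(lengths)[:-1])
for piece in pieces:
    print(piece)
```

The pieces end up in a list rather than one (n+1)-dimensional array, since ragged rows do not fit a regular ndarray.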
Comments
-
Puchatek about 2 years
I'm looking for a way to select multiple slices from a numpy array at once. Say we have a 1D data array and want to extract three portions of it like below:
data_extractions = []
for start_index in range(0, 3):
    data_extractions.append(data[start_index: start_index + 5])
Afterwards data_extractions will be:
data_extractions = [
    data[0:5],
    data[1:6],
    data[2:7]
]
Is there any way to perform the above operation without the for loop? Some sort of indexing scheme in numpy that would let me select multiple slices from an array and return them as that many arrays, say in an n+1 dimensional array?
I thought maybe I can replicate my data and then select a span from each row, but the code below throws an IndexError:
replicated_data = np.vstack([data] * 3)
data_extractions = replicated_data[[range(3)], [slice(0, 5), slice(1, 6), slice(2, 7)]]
-
Divakar about 7 years
What's n there?
-
Paul Panzer about 7 years
stride_tricks might be a way
-
Puchatek about 7 years
@Divakar - dimension. I gave a 1D example for simplicity, but need a generic solution (my real problem is 4D).
-
Puchatek about 7 years
That doesn't avoid a for loop ;)
-
Anant Gupta about 7 years
I agree :) but not a native for loop :)
-
Puchatek about 7 years
Nice, I didn't know about np.lib.stride_tricks.as_strided, thank you, Paul.
-
Puchatek about 7 years
Damn it, I tried the above approach but with indexes as a list of ranges as well as a list of slices, and these would cause IndexErrors. Didn't realize I need to wrap the outer list of indexes in a numpy array ^^
-
Paul Panzer about 7 years
@Puchatek glad to be of help. Just be careful with that stuff. As far as I know it doesn't check ranges, so it will happily allow you to access out-of-range memory etc.
-
Puchatek about 7 years
Yep, toyed around with it in IPython and realized quickly it can blow up in my face when used carelessly ^^
-
Divakar about 7 years
@Puchatek If you are using proper shapes and strides, it should be okay.
-
Puchatek about 7 years
So I thought about this approach but couldn't get it to work because I didn't wrap the list of lists making up the indices into a numpy array. Silly me I guess.
-
tmrlvi about 7 years
I think when you put a list into the numpy selector, it tries to filter per axis (i.e., the first item is the filter for the first axis etc). Actually, putting it inside another list, as in indexes = [[np.arange(0,5), np.arange(1,6), np.arange(2,7)]] solves it.
-
Puchatek about 7 years
Luckily I have to deal with slicing according to a regular pattern. Thank you for the detailed answer :)