Split a list into parts based on a set of indexes in Python
Solution 1
This is the simplest and most pythonic solution I can think of:
def partition(alist, indices):
return [alist[i:j] for i, j in zip([0]+indices, indices+[None])]
if the inputs are very large, then the iterators solution should be more convenient:
from itertools import izip, chain
def partition(alist, indices):
pairs = izip(chain([0], indices), chain(indices, [None]))
return (alist[i:j] for i, j in pairs)
and of course, the very, very lazy guy solution (if you don't mind to get arrays instead of lists, but anyway you can always revert them to lists):
import numpy
partition = numpy.split
Solution 2
I would be interested in seeing a more Pythonic way of doing this also. But this is a crappy solution. You need to add a checking for an empty index list.
Something along the lines of:
indexes = [5, 12, 17]
list = range(20)
output = []
prev = 0
for index in indexes:
output.append(list[prev:index])
prev = index
output.append(list[indexes[-1]:])
print output
produces
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16], [17, 18, 19]]
Solution 3
My solution is similar to Il-Bhima's.
>>> def parts(list_, indices):
... indices = [0]+indices+[len(list_)]
... return [list_[v:indices[k+1]] for k, v in enumerate(indices[:-1])]
Alternative approach
If you're willing to slightly change the way you input indices, from absolute indices to relative (that is, from [5, 12, 17]
to [5, 7, 5]
, the below will also give you the desired output, while it doesn't create intermediary lists.
>>> from itertools import islice
>>> def parts(list_, indices):
... i = iter(list_)
... return [list(islice(i, n)) for n in chain(indices, [None])]
Solution 4
>>> def burst_seq(seq, indices):
... startpos = 0
... for index in indices:
... yield seq[startpos:index]
... startpos = index
... yield seq[startpos:]
...
>>> list(burst_seq(range(20), [5, 12, 17]))
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16], [17, 18, 19]]
>>> list(burst_seq(range(20), []))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]]
>>> list(burst_seq(range(0), [5, 12, 17]))
[[], [], [], []]
>>>
Maxima mea culpa: it uses a for
statement, and it's not using whizzbang stuff like itertools, zip(), None as a sentinel, list comprehensions, ...
;-)
Solution 5
indices = [5, 12, 17]
input = range(20)
output = []
reduce(lambda x, y: output.append(input[x:y]) or y, indices + [len(input)], 0)
print output
Comments
-
M. Ewees almost 2 years
What is the best way to split a list into parts based on an arbitrary number of indexes? E.g. given the code below
indexes = [5, 12, 17] list = range(20)
return something like this
part1 = list[:5] part2 = list[5:12] part3 = list[12:17] part4 = list[17:]
If there are no indexes it should return the entire list.
-
Muhammad Huzaifa over 14 yearsClever solution, very concise =) I'll give my +1 to kjfletch's answer though, because it reuses existing values, while this method does a lot of list creation/modification.
-
Glenn Maynard over 14 yearsIt'd be more consistent to drop the conditionals--if the first index is 0, the first item should be empty. Just use
indexes = [0] + indexes + [None]
. -
Muhammad Huzaifa over 14 years+1 for shortness and for reusing values (
enumerate
iterates the existing array, as opposed tozip
) -
Glenn Maynard over 14 yearsAlso, it's probably better off with itertools.izip instead of zip, and itertools.islice instead of direct slicing.
-
Il-Bhima over 14 yearsAs Blixt pointed out I think you meant indices instead of indicies. Then you have the minor issue when passing indices like [0,5,12,17], in which case your result will contain the empty list list_[0:0]
-
Muhammad Huzaifa over 14 years"Indexes" and "indices" are both correct. "Indexes" is a pluralization of "index" and is more common in the US, while "indices" is derived from Latin and is more common in the UK.
-
André Eriksson over 14 years@Il-Bhima: Which could be argued is correct, since the first part is then of length 0, which is consistent with the OP's example.
-
Muhammad Huzaifa over 14 years@Il-Bhima, I think the intended behavior of passing in "split at index 0" would be to get an empty array as the first split value. Personally, I hate "magic" behavior that changes based on the parameters.
-
Il-Bhima over 14 years@Glenn Hmm, I was actually trying to avoid having those empty lists at the start and end. Not sure if that was want the original poster wanted
-
anthony over 14 yearsDid anyone want to discuss the viability of working right to left? or is this GrammarOverflow now?
-
André Eriksson over 14 yearsWorking right to left would imply you either insert at other locations than the end or reverse the list before returning it. Either case is less than ideal.
-
dalloliogm over 14 yearsdon't overwrite a built-in variable (list)
-
kjfletch over 14 yearsNot very happy with 2 down votes for throwing ideas in the pot. Especially with no comments. I stated the rough nature of the solution.
-
Glenn Maynard over 14 yearsThis site shouldn't allow downvoting without an explanation. It's not useful at all.
-
anthony over 14 yearsThis answer is perfect, and what I or anyone would write if faced with this problem in the real world. The question was asking specifically for a "Pythonic" way, which I think some of the other answers adress better than this one.
-
André Eriksson over 14 yearsI updated this version to not create new lists (other than the slices necessary) while improving simplicity.
-
André Eriksson over 14 yearsThis got downvoted why? Please leave comments if you think this is wrong or otherwise inappropriate, thanks :). I tested it pretty thoroughly so I think it covers all the edge cases as well.
-
Glenn Maynard over 14 yearsThis is unintuitive code, and it doesn't work (compare the output to the other answers). Also, don't replace your answer with a completely different one after people have already voted and commented on the original.
-
Glenn Maynard over 14 yearsThat's why I hate the buzzword "Pythonic". It's as if everything written in Python should be written in some special Python-specific way, preferably forcibly crushed onto a single line for some reason.
-
André Eriksson over 14 years@Glenn Maynard: Thank you; you're perfectly right. I'll keep that in mind from now on.
-
M. Ewees over 14 years@Cide: Why did you decide against this if you don't mind me asking? stackoverflow.com/revisions/…
-
André Eriksson over 14 yearsIt's there now, but its functionality is slightly different from whta the OP asked for. It's detailed in the "Alternative approach" in the post now.
-
jcdyer over 14 yearsIn my opinion, "pythonic" just means good, idiomatic style. It does not mean hypercondensed one-line solutions that show off every python feature. This looks perfectly pythonic to me. It uses slices appropriately, uses range when range is more appropriate than xrange, and iterates directly over the list rather than looping through the indices. Pythonic? Check. Comprehensible? Check. Accurate? Check. +1
-
jcdyer over 14 yearsOh, and in python 2, you can use prev after the for loop exits, so you can replace
output.append(list[indexes[-1]:])
withoutput.append(list[prev:])
. -
fortran almost 11 yearsThe reason to use "whizzbang stuff" is not because it's trendy or to make people look smarter, it's because it is closer to a declarative specification of the solution. A mutable-state constructive approach is usually more error prone too...
-
Velizar Hristov over 8 yearspartition is a misleading name, as in many languages the partition function splits a list into two lists - of items passing and items failing the criteria (which is passed as a A -> Boolean function), e.g. partition(even, [1, 2, 3, 4, 5]) = ([2, 4], [1, 3, 5])