Split a list into parts based on a set of indexes in Python

69,212

Solution 1

This is the simplest and most pythonic solution I can think of:

def partition(alist, indices):
    return [alist[i:j] for i, j in zip([0]+indices, indices+[None])]

if the inputs are very large, then the iterators solution should be more convenient:

from itertools import izip, chain
def partition(alist, indices):
    pairs = izip(chain([0], indices), chain(indices, [None]))
    return (alist[i:j] for i, j in pairs)

and of course, the very, very lazy guy solution (if you don't mind to get arrays instead of lists, but anyway you can always revert them to lists):

import numpy
partition = numpy.split

Solution 2

I would be interested in seeing a more Pythonic way of doing this also. But this is a crappy solution. You need to add a checking for an empty index list.

Something along the lines of:

indexes = [5, 12, 17]
list = range(20)

output = []
prev = 0

for index in indexes:
    output.append(list[prev:index])
    prev = index

output.append(list[indexes[-1]:])

print output

produces

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16], [17, 18, 19]]

Solution 3

My solution is similar to Il-Bhima's.

>>> def parts(list_, indices):
...     indices = [0]+indices+[len(list_)]
...     return [list_[v:indices[k+1]] for k, v in enumerate(indices[:-1])]

Alternative approach

If you're willing to slightly change the way you input indices, from absolute indices to relative (that is, from [5, 12, 17] to [5, 7, 5], the below will also give you the desired output, while it doesn't create intermediary lists.

>>> from itertools import islice
>>> def parts(list_, indices):
...     i = iter(list_)
...     return [list(islice(i, n)) for n in chain(indices, [None])]

Solution 4

>>> def burst_seq(seq, indices):
...    startpos = 0
...    for index in indices:
...       yield seq[startpos:index]
...       startpos = index
...    yield seq[startpos:]
...
>>> list(burst_seq(range(20), [5, 12, 17]))
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16], [17, 18, 19]]
>>> list(burst_seq(range(20), []))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]]
>>> list(burst_seq(range(0), [5, 12, 17]))
[[], [], [], []]
>>>

Maxima mea culpa: it uses a for statement, and it's not using whizzbang stuff like itertools, zip(), None as a sentinel, list comprehensions, ...

;-)

Solution 5

indices = [5, 12, 17]
input = range(20)
output = []

reduce(lambda x, y: output.append(input[x:y]) or y, indices + [len(input)], 0)
print output
Share:
69,212
M. Ewees
Author by

M. Ewees

Learning Python

Updated on July 08, 2022

Comments

  • M. Ewees
    M. Ewees almost 2 years

    What is the best way to split a list into parts based on an arbitrary number of indexes? E.g. given the code below

    indexes = [5, 12, 17]
    list = range(20)
    

    return something like this

    part1 = list[:5]
    part2 = list[5:12]
    part3 = list[12:17]
    part4 = list[17:]
    

    If there are no indexes it should return the entire list.

  • Muhammad Huzaifa
    Muhammad Huzaifa over 14 years
    Clever solution, very concise =) I'll give my +1 to kjfletch's answer though, because it reuses existing values, while this method does a lot of list creation/modification.
  • Glenn Maynard
    Glenn Maynard over 14 years
    It'd be more consistent to drop the conditionals--if the first index is 0, the first item should be empty. Just use indexes = [0] + indexes + [None].
  • Muhammad Huzaifa
    Muhammad Huzaifa over 14 years
    +1 for shortness and for reusing values (enumerate iterates the existing array, as opposed to zip)
  • Glenn Maynard
    Glenn Maynard over 14 years
    Also, it's probably better off with itertools.izip instead of zip, and itertools.islice instead of direct slicing.
  • Il-Bhima
    Il-Bhima over 14 years
    As Blixt pointed out I think you meant indices instead of indicies. Then you have the minor issue when passing indices like [0,5,12,17], in which case your result will contain the empty list list_[0:0]
  • Muhammad Huzaifa
    Muhammad Huzaifa over 14 years
    "Indexes" and "indices" are both correct. "Indexes" is a pluralization of "index" and is more common in the US, while "indices" is derived from Latin and is more common in the UK.
  • André Eriksson
    André Eriksson over 14 years
    @Il-Bhima: Which could be argued is correct, since the first part is then of length 0, which is consistent with the OP's example.
  • Muhammad Huzaifa
    Muhammad Huzaifa over 14 years
    @Il-Bhima, I think the intended behavior of passing in "split at index 0" would be to get an empty array as the first split value. Personally, I hate "magic" behavior that changes based on the parameters.
  • Il-Bhima
    Il-Bhima over 14 years
    @Glenn Hmm, I was actually trying to avoid having those empty lists at the start and end. Not sure if that was want the original poster wanted
  • anthony
    anthony over 14 years
    Did anyone want to discuss the viability of working right to left? or is this GrammarOverflow now?
  • André Eriksson
    André Eriksson over 14 years
    Working right to left would imply you either insert at other locations than the end or reverse the list before returning it. Either case is less than ideal.
  • dalloliogm
    dalloliogm over 14 years
    don't overwrite a built-in variable (list)
  • kjfletch
    kjfletch over 14 years
    Not very happy with 2 down votes for throwing ideas in the pot. Especially with no comments. I stated the rough nature of the solution.
  • Glenn Maynard
    Glenn Maynard over 14 years
    This site shouldn't allow downvoting without an explanation. It's not useful at all.
  • anthony
    anthony over 14 years
    This answer is perfect, and what I or anyone would write if faced with this problem in the real world. The question was asking specifically for a "Pythonic" way, which I think some of the other answers adress better than this one.
  • André Eriksson
    André Eriksson over 14 years
    I updated this version to not create new lists (other than the slices necessary) while improving simplicity.
  • André Eriksson
    André Eriksson over 14 years
    This got downvoted why? Please leave comments if you think this is wrong or otherwise inappropriate, thanks :). I tested it pretty thoroughly so I think it covers all the edge cases as well.
  • Glenn Maynard
    Glenn Maynard over 14 years
    This is unintuitive code, and it doesn't work (compare the output to the other answers). Also, don't replace your answer with a completely different one after people have already voted and commented on the original.
  • Glenn Maynard
    Glenn Maynard over 14 years
    That's why I hate the buzzword "Pythonic". It's as if everything written in Python should be written in some special Python-specific way, preferably forcibly crushed onto a single line for some reason.
  • André Eriksson
    André Eriksson over 14 years
    @Glenn Maynard: Thank you; you're perfectly right. I'll keep that in mind from now on.
  • M. Ewees
    M. Ewees over 14 years
    @Cide: Why did you decide against this if you don't mind me asking? stackoverflow.com/revisions/…
  • André Eriksson
    André Eriksson over 14 years
    It's there now, but its functionality is slightly different from whta the OP asked for. It's detailed in the "Alternative approach" in the post now.
  • jcdyer
    jcdyer over 14 years
    In my opinion, "pythonic" just means good, idiomatic style. It does not mean hypercondensed one-line solutions that show off every python feature. This looks perfectly pythonic to me. It uses slices appropriately, uses range when range is more appropriate than xrange, and iterates directly over the list rather than looping through the indices. Pythonic? Check. Comprehensible? Check. Accurate? Check. +1
  • jcdyer
    jcdyer over 14 years
    Oh, and in python 2, you can use prev after the for loop exits, so you can replace output.append(list[indexes[-1]:]) with output.append(list[prev:]).
  • fortran
    fortran almost 11 years
    The reason to use "whizzbang stuff" is not because it's trendy or to make people look smarter, it's because it is closer to a declarative specification of the solution. A mutable-state constructive approach is usually more error prone too...
  • Velizar Hristov
    Velizar Hristov over 8 years
    partition is a misleading name, as in many languages the partition function splits a list into two lists - of items passing and items failing the criteria (which is passed as a A -> Boolean function), e.g. partition(even, [1, 2, 3, 4, 5]) = ([2, 4], [1, 3, 5])