Identify groups of continuous numbers in a list
Solution 1
more_itertools.consecutive_groups was added in version 4.0.
Demo
import more_itertools as mit
iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
[list(group) for group in mit.consecutive_groups(iterable)]
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17], [20]]
Code
Applying this tool, we make a generator function that finds ranges of consecutive numbers.
def find_ranges(iterable):
    """Yield range of consecutive numbers."""
    for group in mit.consecutive_groups(iterable):
        group = list(group)
        if len(group) == 1:
            yield group[0]
        else:
            yield group[0], group[-1]
iterable = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
list(find_ranges(iterable))
# [(2, 5), (12, 17), 20]
The source implementation emulates a classic recipe (as demonstrated by @Nadia Alramli).
Note: more_itertools is a third-party package installable via pip install more_itertools.
Solution 2
EDIT 2: To answer the OP's new requirement (Python 2 syntax):
ranges = []
for key, group in groupby(enumerate(data), lambda (index, item): index - item):
    group = map(itemgetter(1), group)
    if len(group) > 1:
        ranges.append(xrange(group[0], group[-1]))
    else:
        ranges.append(group[0])
Output:
[xrange(2, 5), xrange(12, 17), 20]
You can replace xrange with range or any other custom class.
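Because range and xrange exclude their stop value, a version built on range needs last + 1 as the stop to cover the whole run. Below is a minimal sketch, assuming Python 3 and a sorted input without duplicates; to_ranges is a hypothetical helper name, not part of the original answer.

```python
from itertools import groupby


def to_ranges(data):
    """Return range objects for runs and bare ints for single values.

    Hypothetical helper: assumes `data` is sorted and has no duplicates.
    """
    result = []
    # Consecutive values share the same (index - value) difference.
    for _, pairs in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        run = [value for _, value in pairs]
        if len(run) > 1:
            # range() excludes its stop, so add 1 to include the last value.
            result.append(range(run[0], run[-1] + 1))
        else:
            result.append(run[0])
    return result


print(to_ranges([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]))
# [range(2, 6), range(12, 18), 20]
```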
Python docs have a very neat recipe for this:
from operator import itemgetter
from itertools import groupby
data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
for k, g in groupby(enumerate(data), lambda (i,x):i-x):
    print(map(itemgetter(1), g))
Output:
[2, 3, 4, 5]
[12, 13, 14, 15, 16, 17]
If you want to get the exact same output, you can do this:
ranges = []
for k, g in groupby(enumerate(data), lambda (i,x):i-x):
    group = map(itemgetter(1), g)
    ranges.append((group[0], group[-1]))
Output:
[(2, 5), (12, 17)]
EDIT: The example is already explained in the documentation but maybe I should explain it more:
The key to the solution is differencing with a range so that consecutive numbers all appear in same group.
If the data was: [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
Then groupby(enumerate(data), lambda (i,x):i-x) is equivalent to the following:
groupby(
    [(0, 2), (1, 3), (2, 4), (3, 5), (4, 12),
     (5, 13), (6, 14), (7, 15), (8, 16), (9, 17)],
    lambda (i,x):i-x
)
The lambda function subtracts the element value from the element index. Applying the lambda to each item gives the following keys for groupby:
[-2, -2, -2, -2, -8, -8, -8, -8, -8, -8]
groupby groups elements by equal key value, so the first 4 elements will be grouped together and so forth.
I hope this makes it more readable.
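The keys described above can be checked directly. This is a small sketch, assuming Python 3, where the lambda has to index into the (index, value) pair instead of unpacking it in its parameter list; the variable names are illustrative only.

```python
from itertools import groupby

data = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]

# Key for each element: index minus value. It stays constant within a run
# of consecutive numbers and changes whenever there is a gap.
keys = [index - item for index, item in enumerate(data)]
print(keys)
# [-2, -2, -2, -2, -8, -8, -8, -8, -8, -8]

# groupby collects neighbouring elements that share the same key.
groups = [
    [item for _, item in pairs]
    for _, pairs in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1])
]
print(groups)
# [[2, 3, 4, 5], [12, 13, 14, 15, 16, 17]]
```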
A Python 3 version may be helpful for beginners. Import the required libraries first:
from itertools import groupby
from operator import itemgetter
ranges = []
for k, g in groupby(enumerate(data), lambda x: x[0] - x[1]):
    group = map(itemgetter(1), g)
    group = list(map(int, group))
    ranges.append((group[0], group[-1]))
Solution 3
The "naive" solution, which I find at least somewhat readable.
x = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 22, 25, 26, 28, 51, 52, 57]
def group(L):
    first = last = L[0]
    for n in L[1:]:
        if n - 1 == last:  # Part of the group, bump the end
            last = n
        else:  # Not part of the group, yield current group and start a new
            yield first, last
            first = last = n
    yield first, last  # Yield the last group
>>> print list(group(x))
[(2, 5), (12, 17), (22, 22), (25, 26), (28, 28), (51, 52), (57, 57)]
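The function above indexes L[0], so it raises IndexError on an empty list, and its slicing requires a sequence. The following is a sketch of the same loop that accepts any iterable and yields nothing for empty input, assuming Python 3; group_runs is a hypothetical name used here to avoid confusion with the original.

```python
def group_runs(numbers):
    """Yield (first, last) for each run of consecutive integers."""
    iterator = iter(numbers)
    try:
        first = last = next(iterator)
    except StopIteration:
        return  # Empty input: nothing to yield.
    for n in iterator:
        if n - 1 == last:  # Still inside the current run.
            last = n
        else:  # Gap found: emit the finished run and start a new one.
            yield first, last
            first = last = n
    yield first, last  # Emit the final run.


x = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 22, 25, 26, 28, 51, 52, 57]
print(list(group_runs(x)))
# [(2, 5), (12, 17), (22, 22), (25, 26), (28, 28), (51, 52), (57, 57)]
print(list(group_runs([])))
# []
```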
Solution 4
Assuming your list is sorted:
>>> from itertools import groupby
>>> def ranges(lst):
        pos = (j - i for i, j in enumerate(lst))
        t = 0
        for i, els in groupby(pos):
            l = len(list(els))
            el = lst[t]
            t += l
            yield range(el, el+l)
>>> lst = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17]
>>> list(ranges(lst))
[range(2, 6), range(12, 18)]
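If the input might be unsorted or contain duplicates, one option is to normalize it before grouping. The sketch below assumes Python 3 and that the caller does not care about the original order or repeated values; ranges_unsorted is a hypothetical name, not part of the answer above.

```python
from itertools import groupby


def ranges_unsorted(values):
    """Yield a range for each run of consecutive integers in `values`."""
    # Assumption: dropping duplicates and sorting is acceptable to the caller.
    ordered = sorted(set(values))
    for _, pairs in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        run = [value for _, value in pairs]
        # range() excludes its stop, hence the + 1.
        yield range(run[0], run[-1] + 1)


print(list(ranges_unsorted([14, 2, 3, 13, 12, 5, 4, 3])))
# [range(2, 6), range(12, 15)]
```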
Solution 5
Here is something that should work, with no imports needed:
def myfunc(lst):
    ret = []
    a = b = lst[0]  # a and b are range's bounds
    for el in lst[1:]:
        if el == b+1:
            b = el  # range grows
        else:  # range ended
            ret.append(a if a==b else (a,b))  # is a single or a range?
            a = b = el  # let's start again with a single
    ret.append(a if a==b else (a,b))  # corner case for last single/range
    return ret
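The mixed result (tuples for ranges, bare numbers for singles) is convenient when the two cases need different formatting. Here is a sketch of one possible formatter, assuming Python 3; format_ranges and the "first-last" text format are illustrative assumptions, not something specified in the question.

```python
def format_ranges(items):
    """Render [(2, 5), (12, 17), 20] style results as a string."""
    parts = []
    for item in items:
        if isinstance(item, tuple):
            first, last = item
            # Assumed display format for a range: "first-last".
            parts.append("{}-{}".format(first, last))
        else:
            parts.append(str(item))
    return ", ".join(parts)


print(format_ranges([(2, 5), (12, 17), 20]))
# 2-5, 12-17, 20
```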
mikemaccana
Updated on July 08, 2022
Comments
-
mikemaccana almost 2 years
I'd like to identify groups of continuous numbers in a list, so that:
myfunc([2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20])
Returns:
[(2,5), (12,17), 20]
I was wondering what the best way to do this is (particularly if there's something built into Python).
Edit: Note I originally forgot to mention that individual numbers should be returned as individual numbers, not ranges.
-
Jochen Ritzel about 14 years
[j - i for i, j in enumerate(lst)] is clever :-)
-
SilentGhost about 14 years
Almost works in py3k, except it requires lambda x:x[0]-x[1].
-
mikemaccana about 14 years
I like this answer a lot because it's terse yet readable. However, numbers that are outside of ranges should be printed as single digits, not tuples (as I will format the output and have different formatting requirements for individual numbers versus ranges of numbers).
-
SilentGhost about 14 years
>>> getranges([2, 12, 13]) outputs [[12, 13]]. Was that intentional?
-
mikemaccana about 14 years
Yep, I need to fix for individual numbers (per most of the answers on the page). Working on it now.
-
mikemaccana about 14 years
Could you please use multi-character variable names? For someone not familiar with map() or groupby(), the meanings of k, g, i and x are not clear.
-
Nadia Alramli about 14 years
This was copied from the Python documentation with the same variable names. I changed the names now.
-
mikemaccana about 14 years
Thanks for the improved variable names and handling non-ranged numbers. This is readable, your explanations are great and I've marked this as the preferred answer.
-
mikemaccana about 14 years
Actually I prefer Nadia's answer, groupby() seems like the standard function I wanted.
-
Benny about 11 years
The other answer looked beautiful and intelligent, but this one is more understandable to me and allowed a beginner like me to expand it according to my needs.
-
IceArdor almost 10 years
You'll need to increment the 2nd number in xrange/range because it is non-inclusive. In other words, [2,3,4,5] == xrange(2,6), not xrange(2,5). It may be worth defining a new inclusive range data type.
-
derek73 over 7 years
Python 3 throws a syntax error on the first example. Here are the first 2 lines updated to work on Python 3:
for key, group in groupby(enumerate(data), lambda i: i[0] - i[1]):
    group = list(map(itemgetter(1), group))
-
Nexus almost 6 years
Could use a list comprehension to print the non-range tuples as single digits:
print([i if i[0] != i[1] else i[0] for i in group(x)])
-
Pleastry almost 3 years
This actually fails if you replace 12 with 10 in the data array. The correct solution would be:
starts = [x for x in data if x-1 not in data and x+1 in data]
and
ends = [x for x in data if x-1 in data and x+1 not in data and x not in starts]
-
kmt almost 3 years
Thanks @Pleastry - I have edited with your fix
-
Stef over 2 years
Alternatively, using more_itertools.groupby_transform:
[v for k,v in more_itertools.groupby_transform(enumerate(iterable), keyfunc=lambda p: p[1]-p[0], valuefunc=operator.itemgetter(1), reducefunc=to_interval)]
with
to_interval = lambda g: (sublst[0], sublst[-1]) if len(sublst := list(g)) > 1 else sublst[0]