List Comprehensions in Python to compute minimum and maximum values of a list

14,892

Solution 1

I'm a big fan of generators and comprehensions, but in this case it seems they are not the right way to go, because:

  1. You want to compute the min and the max of the list
  2. Your list is huge

If you wanted to calculate only one of the min or max, you could just use the min/max function on it. But since you want both, you would have to loop over the list twice to extract first the min and then the max. I.e. something like this:

x_min = min(points)
x_max = max(points)

Let's play with some some timings. First calling both min and max on the list:

>>> import timeit
>>> def with_gens(l):
...     return min(l), max(l)
...
>>> timeit.timeit('with_gens(range(6000000))', 'from __main__ import with_gens', number=5)
1.7451060887015188

and now looping only once, using you code:

>>> def with_loop2(l):
...     x_max = float('+inf')
...     x_min = float('-inf')
...     for el in l:
...         x_min = min(x_min, el)
...         x_max = max(x_max, el)
...     return x_min, x_max
...
>>> timeit.timeit('with_loop2(range(6000000))', 'from __main__ import with_loop2', number=5)
11.636076105071083

Crazy, huh?

There is no memory problem at all with this approach. However, it sets the x_max and x_min in each loop, which is actually an unnecessary waste: you only want to reset the variable when you have found a bigger/smaller value. We can easily address this.

So... let's try looping only once, but avoiding unnecessary resettings.

>>> def with_loop(l):
...     x_min = float('-inf')
...     x_max = float('+inf')
...     for el in l:
...         if el < x_min:
...             x_min = el
...         elif el > x_max:
...             x_max = el
...     return x_min, x_max
...
>>> timeit.timeit('with_loop(range(6000000))', 'from __main__ import with_loop', number=5)
3.961046726963332

OH SURPRISE

It seems while the algorithm of looping only once is on the paper more efficient, it is beaten by the inner optimization of min and max. Moreover, the difference between setting the var in each loop and only when necessary is huge. You never stop learning.

Solution 2

Suppose a point has two attributes, x and y, then you can use

x_min = min(p['x'] for p in points_in_list) to calculate the min of x

An example:

>>> a = {'x': 10, 'y':10}
>>> b = {'x': 5, 'y':20}
>>> c = {'x': 50, 'y':50}
>>> points_in_list = [a,b,c]
>>> points_in_list
[{'y': 10, 'x': 10}, {'y': 20, 'x': 5}, {'y': 50, 'x': 50}]
>>> x_min = min(p['x'] for p in points_in_list)
>>> x_min
5
Share:
14,892
Gianni Spear
Author by

Gianni Spear

Updated on June 04, 2022

Comments

  • Gianni Spear
    Gianni Spear almost 2 years

    i have the following code to compute minimum and maximum values of a list in order to save memory efficiency

    x_min = float('+inf')
    x_max = float('-inf')
    
    for p in points_in_list:
        x_min = min(x_min, p)
        x_max = max(x_max, p)
    

    where points_in_list is a (large) list of numbers. I wish to know if there is a method to compute with a List Comprehensions the min and max value and saving the memory.

    • gefei
      gefei almost 10 years
      what does a point in points_in_list look like? which attribute does it have?
    • Frédéric Hamidi
      Frédéric Hamidi almost 10 years
      Have you tried min(points_in_list) and max(points_in_list)? No list comprehension needed here.
    • Gianni Spear
      Gianni Spear almost 10 years
      it's a really large list (over 6 milions of values) as points_in_list = [1,3,6,79,80,...,76] of number
    • user2357112
      user2357112 almost 10 years
      Calling min or max on a huge list isn't going to produce memory problems. Why would it? Were you under the impression that doing so would require making a copy of the entire list?
    • jonrsharpe
      jonrsharpe almost 10 years
      Why do you think a list comprehension would use less memory than min?
    • chepner
      chepner almost 10 years
      List comprehensions produce lists, while you want to compute single values from lists.
  • DyRuss
    DyRuss over 5 years
    might also be interesting to run the tests with a random range of numbers (rather than a sorted list where x_max will be reassigned each time, and x_min is never reassigned) and see if that changes anything
  • dim_user
    dim_user about 5 years
    nice, what would be the time complexity of the min(p['x'] for p in points_in_list), is it O(N) assuming there are N items in points_in_list? And what's in the min function as parameter, is it considered a list (i.e. p['x'] for p in points_in_list) ?