Check if all values in list are greater than a certain number


Solution 1

Use the all() function with a generator expression:

>>> my_list1 = [30, 34, 56]
>>> my_list2 = [29, 500, 43]
>>> all(i >= 30 for i in my_list1)
True
>>> all(i >= 30 for i in my_list2)
False

Note that this tests for greater than or equal to 30, otherwise my_list1 would not pass the test either.

If you wanted to do this in a function, you'd use:

def all_30_or_up(ls):
    for i in ls:
        if i < 30:
            return False
    return True

That is, as soon as you find a value below 30 you return False, and you return True only if no such value was found.

Similarly, you can use the any() function to test if at least 1 value matches the condition.
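As a minimal sketch of the any() counterpart, reusing the two lists from above (the names has_low_1 and has_low_2 are illustrative only):

```python
my_list1 = [30, 34, 56]
my_list2 = [29, 500, 43]

# any() returns True as soon as one element satisfies the condition.
has_low_1 = any(i < 30 for i in my_list1)
has_low_2 = any(i < 30 for i in my_list2)

# "All values are >= 30" is the same statement as "no value is < 30".
assert all(i >= 30 for i in my_list1) == (not has_low_1)
assert all(i >= 30 for i in my_list2) == (not has_low_2)

print(has_low_1, has_low_2)  # False True
```

Like all(), any() stops consuming the generator once the answer is known.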

Solution 2

...any reason why you can't use min()?

def above(my_list, minimum):
    # If the smallest element passes the test, every element does.
    if min(my_list) >= minimum:
        print("All values are equal or above", minimum)
    else:
        print("Not all values are equal or above", minimum)

I don't know if this is exactly what you want, but technically, this is what you asked for...
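One caveat with the min() approach: min() raises ValueError on an empty list, whereas all() returns True for one. A minimal sketch that returns a boolean and treats the empty case like all() does (all_at_least is a hypothetical helper name, and treating an empty list as passing is an assumption about what you want):

```python
def all_at_least(values, minimum):
    """Return True if every value is >= minimum (hypothetical helper)."""
    # min() accepts a `default` for empty input (Python 3.4+); using
    # `minimum` itself makes an empty list pass, matching all().
    return min(values, default=minimum) >= minimum


print(all_at_least([30, 34, 56], 30))   # True
print(all_at_least([29, 500, 43], 30))  # False
print(all_at_least([], 30))             # True
```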

Solution 3

There is a builtin function all:

all(x > limit for x in my_list)

Here limit is the value that every number must exceed.
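Since the thread goes back and forth between "greater than" and "greater than or equal", here is a small sketch that makes the choice explicit. The function name all_above and its inclusive parameter are illustrative, not from the original answer:

```python
def all_above(values, limit, inclusive=False):
    """Check every value against limit (hypothetical helper).

    inclusive=False tests x > limit; inclusive=True tests x >= limit.
    """
    if inclusive:
        return all(x >= limit for x in values)
    return all(x > limit for x in values)


my_list1 = [30, 34, 56]
print(all_above(my_list1, 30))                  # False: 30 is not > 30
print(all_above(my_list1, 30, inclusive=True))  # True
```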

Solution 4

You can use all():

my_list1 = [30, 34, 56]
my_list2 = [29, 500, 43]
if all(i >= 30 for i in my_list1):
    print('yes')  # printed: every value is >= 30
if not all(i >= 30 for i in my_list2):
    print('no')   # printed: 29 fails the test

Note that this includes all numbers equal to 30 or higher, not strictly above 30.

Solution 5

Comparing np.sum, np.min, and all for speed on large arrays, np.min seems to be the overall winner:

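import numpy as np

N = 1000000

def func_sum(x):
    my_list = np.random.randn(N)
    # Count the values below x; none means every value passes.
    return np.sum(my_list < x) == 0

def func_min(x):
    my_list = np.random.randn(N)
    return np.min(my_list) >= x

def func_all(x):
    my_list = np.random.randn(N)
    # Pure-Python loop over the array; stops at the first failure.
    return all(i >= x for i in my_list)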

(I had to create the array inside each function; otherwise np.min seemed to remember the value and not redo the computation when timed with timeit. As a result, each timing below includes generating the array.)

The performance of all depends very much on how early the first element that fails the criterion is found. np.sum has to do a bit more work, and np.min is the lightest computation in the general case.
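To see how much work all() does in each case, here is a rough sketch that counts the elements it actually examines. It uses plain lists rather than NumPy arrays, and count_checked is a hypothetical helper written only for this illustration:

```python
def count_checked(values, threshold):
    """Return (result, number of elements all() examined)."""
    checked = 0

    def passes(v):
        nonlocal checked
        checked += 1
        return v >= threshold

    # all() stops pulling from the generator at the first False.
    result = all(passes(v) for v in values)
    return result, checked


data = [29] + [100] * 999
print(count_checked(data, 30))      # (False, 1): fails on the first element
print(count_checked(data[1:], 30))  # (True, 999): every element is examined
```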

When an element fails the criterion almost immediately and the all loop exits fast, all wins just slightly over np.min:

>>> %timeit func_sum(10)
10 loops, best of 3: 36.1 ms per loop

>>> %timeit func_min(10)
10 loops, best of 3: 35.1 ms per loop

>>> %timeit func_all(10)
10 loops, best of 3: 35 ms per loop

But when all needs to go through every point, it is much slower, and np.min wins:

>>> %timeit func_sum(-10)
10 loops, best of 3: 36.2 ms per loop

>>> %timeit func_min(-10)
10 loops, best of 3: 35.2 ms per loop

>>> %timeit func_all(-10)
10 loops, best of 3: 230 ms per loop

But using

np.sum(my_list<x)

can be very useful if one wants to know how many values are below x.
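For a plain Python list the same counting idea can be sketched without NumPy. The sample values and the names n_below and all_ok are made up for illustration:

```python
values = [29, 500, 43, 12]
x = 30

# Count the values strictly below x (here 29 and 12).
n_below = sum(1 for v in values if v < x)

# Zero values below x means every value is >= x.
all_ok = n_below == 0

print(n_below, all_ok)  # 2 False
```

Unlike all(), this always walks the whole list, which is the price of getting the count.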


Author: O.rka

I am an academic researcher studying machine-learning and microorganisms

Updated on February 21, 2021

Comments

  • O.rka
    O.rka about 3 years
    my_list1 = [30,34,56]
    my_list2 = [29,500,43]
    

    How do I check if all values in a list are >= 30? my_list1 should work and my_list2 should not.

    The only thing I could think of doing was:

    boolean = 0
    def func(ls):
        for k in ls:
            if k >= 30:
                boolean = boolean + 1
            else:
                boolean = 0
        if boolean > 0:
            print 'Continue'
        elif boolean = 0:
            pass
    

    Update 2016:

    In hindsight, after dealing with bigger datasets where speed actually matters and utilizing numpy...I would do this:

    >>> my_list1 = [30,34,56]
    >>> my_list2 = [29,500,43]
    
    >>> import numpy as np
    >>> A_1 = np.array(my_list1)
    >>> A_2 = np.array(my_list2)
    
    >>> A_1 >= 30
    array([ True,  True,  True], dtype=bool)
    >>> A_2 >= 30
    array([False,  True,  True], dtype=bool)
    
    >>> int((A_1 >= 30).sum() == A_1.size)
    1
    >>> int((A_2 >= 30).sum() == A_2.size)
    0
    

    You could also do something like:

    len([*filter(lambda x: x >= 30, my_list1)]) == len(my_list1)
    
    • user2864740
      user2864740 over 10 years
      Two general issues to be aware of: 1) the assigned boolean variable is local to the function (as there is no appropriate global annotation), and 2) boolean = 0 is an assignment, not a comparison.
    • Martijn Pieters
      Martijn Pieters over 10 years
      Note that your my_list1 has one value that is not above 30. It is instead equal to 30. Should that be 31 instead, or are you testing for greater than or equal to 30 here?
  • Martijn Pieters
    Martijn Pieters over 10 years
    As my_list1 should test True, the test should almost certainly be >= 30, not > 30.
  • Simeon Visser
    Simeon Visser over 10 years
    @MartijnPieters thanks, now updated. Question mentions above 30 but >= 30 seems intended.
  • Martijn Pieters
    Martijn Pieters over 10 years
    I know, that's why I made that explicit. :-)
  • Hyperboreus
    Hyperboreus over 10 years
    Well, when OP's question text contradicts itself, who am I to judge which is the correct limit.
  • Hyperboreus
    Hyperboreus over 10 years
    What is the advantage of using all_30_or_up over all? Shouldn't all also stop consuming the iterator as soon as a negative has been found? Would be quite dumb otherwise, wouldn't it?
  • Hyperboreus
    Hyperboreus over 10 years
    The disadvantage of this solution is that each item of the list must be touched.
  • Martijn Pieters
    Martijn Pieters over 10 years
    @Hyperboreus: both stop as soon as a negative has been found. I wanted to give the OP a different way of looking at the problem, giving them a function to replace the one they were writing.
  • Peter DeGlopper
    Peter DeGlopper over 10 years
    I did a little profiling on this. all shortcircuits, so it's much faster if the list does not qualify. But if the list is all 30+, min can be faster. I tested with two 1000-element lists of random integers, one filled with random.randint(0, 100) (failing) and one filled with random.randint(30, 100). Using min took slightly less than half the time on the 30-100 list. But all took about 2% of the time that min did on the 0-100 list, so it probably wins unless failing lists are very rare.
  • Peter DeGlopper
    Peter DeGlopper over 10 years
    As it turned out, the first element of my 0-100 list was below 30, so my test was kind of degenerate. Forcing the first sub-30 element to be halfway through the list, min comes out a bit faster - 0.25s for 10000 repetitions rather than 0.32s for all. So which is faster depends on the nature of the data, as you'd expect.
  • zelusp
    zelusp over 7 years
    @MartijnPieters, Mucho <3