Finding the average of a list


Solution 1

On Python 3.8+, you can use statistics.fmean, which converts the data to floats and runs faster than statistics.mean.
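
A minimal sketch of the fmean variant, assuming Python 3.8 or later and reusing the list from the question:

```python
import statistics

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# fmean converts each element to float and always returns a float.
result = statistics.fmean(l)
print(result)
```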

On Python 3.4+, you can use statistics.mean:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(l)  # = 20.11111111111111

On older versions of Python 3, you can divide the sum by the length:

sum(l) / len(l)

On Python 2, you need to convert the result of len to a float to get float division:

sum(l) / float(len(l))

There is no need to use functools.reduce as it is much slower.
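
Note that sum(l) / len(l) raises ZeroDivisionError on an empty list. A minimal sketch of one way to guard against that, assuming Python 3; the function name average and the choice of ValueError are illustrative assumptions, not part of the standard library:

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if len(values) == 0:
        # Assumption: a ValueError with a clear message is preferred here
        # over the ZeroDivisionError that sum(values) / len(values) raises.
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)


l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(average(l))
```

Whether an empty list should raise, return None, or return some default depends on the caller; this sketch only makes the failure explicit.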

Solution 2

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
sum(l) / len(l)

Solution 3

You can use numpy.mean:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import numpy as np
print(np.mean(l))

Solution 4

The statistics module was added in Python 3.4. It has a function called mean that calculates the average. An example with the list you provided:

from statistics import mean
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
mean(l)
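
On an empty list, statistics.mean raises statistics.StatisticsError. A small sketch of handling that case, assuming Python 3.4+; the helper name mean_or_none and the policy of returning None are hypothetical choices for illustration:

```python
from statistics import StatisticsError, mean


def mean_or_none(values):
    """Return the mean of values, or None if values is empty."""
    try:
        return mean(values)
    except StatisticsError:
        # Raised by statistics.mean when it receives no data points.
        return None


l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(mean_or_none(l))
print(mean_or_none([]))
```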

Solution 5

Why would you use reduce() for this when Python has a perfectly cromulent sum() function?

print sum(l) / float(len(l))

(The float() is necessary in Python 2 to force Python to do a floating-point division.)
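
For comparison, a small sketch of how the two division operators behave on Python 3, where float() is no longer needed (assumption: a Python 3 interpreter and the list from the question):

```python
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# True division: returns a float in Python 3, even for two ints.
true_mean = sum(l) / len(l)

# Floor division: rounds down, which is what Python 2's / did for two ints.
floor_mean = sum(l) // len(l)

print(true_mean)
print(floor_mean)
```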

Author: Carla Dessi

My name's Carla, I'm a BSc Computing student from Bournemouth, currently undergoing a placement as a Software Engineer. I like to spend my spare time creating mobile applications and developing websites/Wordpress themes.

Updated on March 20, 2022

Comments

  • Carla Dessi
    Carla Dessi about 2 years

    I have to find the average of a list in Python. This is my code so far

    from functools import reduce
    
    l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
    print(reduce(lambda x, y: x + y, l))
    

    I've got it so it adds together the values in the list, but I don't know how to make it divide into them?

    • mitch
      mitch over 12 years
      numpy.mean if you can afford installing numpy
    • n611x007
      n611x007 over 8 years
      sum(L) / float(len(L)). handle empty lists in caller code like if not L: ...
    • n611x007
      n611x007 over 8 years
      please update your post and remove reduce and lambda because people are copying this from the top for bad use-cases. (well, unless you have pressing reason to use them.)
    • n611x007
      n611x007 over 8 years
      @mitch: it's not a matter of whether you can afford installing numpy. numpy is a whole world in itself. It's whether you actually need numpy. Installing numpy, a 16 MB C extension, just to calculate a mean would be very impractical for someone not using it for other things.
    • 25mhz
      25mhz almost 8 years
      Instead of installing the whole numpy package just for avg/mean, on Python 3 we can do this with the statistics module via "from statistics import mean". On Python 2.7 or earlier, the statistics module can be downloaded (src: hg.python.org/cpython/file/default/Lib/statistics.py, doc: docs.python.org/dev/library/statistics.html) and used directly.
  • Carla Dessi
    Carla Dessi over 12 years
    That's perfect ! sorry for the stupid question, but i've genuinely looked everywhere for that ! thank you so much !
  • Carla Dessi
    Carla Dessi over 12 years
    as i said, i'm new to this, i was thinking i'd have to make it with a loop or something to count the amount of numbers in it, i didn't realise i could just use the length. this is the first thing i've done with python..
  • Johan Lundberg
    Johan Lundberg over 12 years
    interesting but that's not what he asked for.
  • Johan Lundberg
    Johan Lundberg over 12 years
    I get that this is just for fun but returning 0 for an empty list may not be the best thing to do
  • Andrew Clark
    Andrew Clark over 12 years
    @JohanLundberg - You could replace the 0 with False as the last argument to reduce() which would give you False for an empty list, otherwise the average as before.
  • user1066101
    user1066101 over 12 years
    @CarlaDessi: What tutorial are you using? This is thoroughly covered in all the tutorials I've seen. Clearly, you've found a tutorial that doesn't cover this well. What tutorial are you using to learn Python?
  • Chris Koston
    Chris Koston over 10 years
    Inefficient. It converts all elements to float before adding them. It's faster to convert just the length.
  • Foo Bar User
    Foo Bar User about 10 years
    What if the sum is a massive number that won't fit in an int/float?
  • RolfBly
    RolfBly about 10 years
    For those of us new to the word 'cromulent'
  • Arseniy
    Arseniy almost 10 years
    @FooBarUser then you should calc k = 1.0/len(l), and then reduce: reduce(lambda x, y: x + y * k, l)
  • wsysuper
    wsysuper about 9 years
    Good. Every other answer didn't notice the empty list hazard!
  • Serge Stroobandt
    Serge Stroobandt almost 9 years
    This is the most elegant answer because it employs a standard library module which is available since python 3.4.
  • L. Amber O'Hearn
    L. Amber O'Hearn over 8 years
    That's strange. I would have assumed this would be much more efficient, but it appears to take 8 times as long on a random list of floats than simply sum(l)/len(l)
  • L. Amber O'Hearn
    L. Amber O'Hearn over 8 years
    Oh, but np.array(l).mean() is much faster.
  • Akavall
    Akavall over 8 years
    @L.AmberO'Hearn, I just timed it: np.mean(l) and np.array(l).mean() are about the same speed, and sum(l)/len(l) is about twice as fast. I used l = list(np.random.rand(1000)). Of course, both numpy methods become much faster if l is a numpy.array.
  • n611x007
    n611x007 over 8 years
    downvoted because I cannot see why reduce and lambda should be at the top of a question about average calculation
  • n611x007
    n611x007 over 8 years
    well, unless that's the sole reason for installing numpy. installing a 16mb C package of whatever fame for mean calculation looks very strange on this scale.
  • lahjaton_j
    lahjaton_j about 8 years
    As a C++ programmer, that is neat as hell and float is not ugly at all!
  • Antti Haapala -- Слава Україні
    Antti Haapala -- Слава Україні almost 8 years
    And it is numerically stabler
  • kindall
    kindall almost 8 years
    Returning False (equivalent to the integer 0) is just about the worst possible way to handle this error. Better to catch the ZeroDivisionError and raise something better (perhaps ValueError).
  • Jules G.M.
    Jules G.M. almost 8 years
    He should really be using sum though, as Guido says to try really hard to avoid reduce
  • J'e
    J'e over 7 years
    with a given list of floats, given hardware and py2.7.x: lambda... --> 2.59us, numpy.mean(l) --> 27.5us, sum(l)/len(l) --> 650ns
  • Flame_Phoenix
    Flame_Phoenix over 7 years
    what if the user adds floating point numbers to your array? The results will be super imprecise.
  • jpmc26
    jpmc26 over 7 years
    In Python 3, / returns a float regardless. You can use from __future__ import division to ensure the same behavior in Python 2.2 and up (so basically any version that's suitable for production today).
  • EndermanAPM
    EndermanAPM almost 7 years
    @AndrewClark why do you force float on len?
  • Brenouchoa
    Brenouchoa over 6 years
    check later response about python 3.4 up solution
  • MatTheWhale
    MatTheWhale about 6 years
    @kindall how is a ValueError any better than a ZeroDivisionError? The latter is more specific, plus it seems a bit unnecessary to catch an arithmetic error only to re-throw a different one.
  • kindall
    kindall about 6 years
    Because ZeroDivisionError is only useful if you know how the calculation is being done (i.e., that a division by the length of the list is involved). If you don't know that, it doesn't tell you what the problem is with the value you passed in. Whereas your new exception can include that more specific information.
  • Steinfeld
    Steinfeld over 5 years
    If you want to reduce some numbers after decimal point. This might come in handy: float('%.2f' % float(sum(l) / len(l)))
  • yprez
    yprez about 5 years
    @Steinfeld I don't think conversion to string is the best way to go here. You can achieve the same in a cleaner way with round(result, 2).
  • fralau
    fralau almost 5 years
    Bravo: IMHO, sum(l)/len(l) is by far the most elegant answer (no need to make type conversions in Python 3).
  • cs95
    cs95 over 4 years
    This isn't a pandas question, so it seems excessive to import such a heavy library for a simple operation like finding the mean.
  • Boris Verkhovskiy
    Boris Verkhovskiy over 4 years
    And it produces a nicer error if you accidentally pass in an empty list statistics.StatisticsError: mean requires at least one data point instead of a more cryptic ZeroDivisionError: division by zero for the sum(x) / len(x) solution.
  • Boris Verkhovskiy
    Boris Verkhovskiy over 4 years
    float() is not necessary on Python 3.
  • xilpex
    xilpex over 3 years
    There is no need to store the values in variables or use global variables.
  • TankorSmash
    TankorSmash over 3 years
    artima link appears to be dead
  • Elias
    Elias over 3 years
    Also it's better to use np.nanmean(l) in order to avoid issues with NAN and zero divisions
  • drevicko
    drevicko almost 3 years
    I tried these timings with a list of length 100000000: mean2 < 1s; mean3,4 ~ 8s; mean5,6 ~ 27s; mean1 ~1minute. I find this surprising, would have expected numpy to be best with a large list, but there you go! Seems there's a problem with the statistics package!! (this was python 3.8 on a mac laptop, no BLAS as far as I know).
  • drevicko
    drevicko almost 3 years
    Incidentally, if I convert l into an np.array first, np.mean takes ~.16s, so about 6x faster than sum(l)/len(l). Conclusion: if you're doing lots of calculations, best do everything in numpy.
  • Alon Gouldman
    Alon Gouldman over 2 years
    @drevicko see mean4, this is what I do there... I guess that if it's already an np.array then it makes sense to use np.mean, but in case you have a list then you should use sum(l) / len(l)
  • drevicko
    drevicko over 2 years
    Exactly! It also depends on what you'll be doing with it later. In my work I'm typically doing a series of calculations, so it makes sense to convert to numpy at the start and leverage numpy's fast underlying libraries.
  • Asclepius
    Asclepius about 2 years
    @AlonGouldman Great. I urge showing each speed in 1/1000 of a second (as an integer), otherwise the number is hard to read. For example, 170, 2, 97, etc. This should make it so much more easily readable. Please let me know if this is done, and I will check.
  • Python
    Python about 2 years
    The usage of this is: average(3,5,123), but you can input other numbers. And keep in mind that it returns a value, and doesn't print anything.