Finding the average of a list

python list lambda average reduce

1,354,419

Solution 1

On Python 3.8+, you can use statistics.fmean, which converts its input to floats and runs faster than statistics.mean.
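
As a minimal sketch of the fmean call mentioned above (assuming Python 3.8 or later; the list is the one from the question, written as floats, and the variable name avg is only illustrative):

```python
import statistics

# The question's list, written as floats
l = [15.0, 18.0, 2.0, 36.0, 12.0, 78.0, 5.0, 6.0, 9.0]

# fmean converts its input to floats and always returns a float
avg = statistics.fmean(l)
print(avg)
```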

On Python 3.4+, you can use statistics.mean:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(l)  # = 20.11111111111111

On older versions of Python 3, you can use:

sum(l) / len(l)

On Python 2, you need to convert len to a float to get float division:

sum(l) / float(len(l))

There is no need to use functools.reduce as it is much slower.
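
To check the speed claim on your own machine, here is a rough sketch using timeit. It assumes Python 3.6+ and the question's list; the names t_sum and t_reduce are illustrative, and the timings will vary with hardware and Python version, so no particular numbers are claimed here.

```python
from functools import reduce
import timeit

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# Both expressions compute the same average
avg_sum = sum(l) / len(l)
avg_reduce = reduce(lambda x, y: x + y, l) / len(l)

# Time each approach over many repetitions
t_sum = timeit.timeit(lambda: sum(l) / len(l), number=100_000)
t_reduce = timeit.timeit(
    lambda: reduce(lambda x, y: x + y, l) / len(l), number=100_000
)
print(f"sum: {t_sum:.3f}s  reduce: {t_reduce:.3f}s")
```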

Solution 2

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
sum(l) / len(l)

Solution 3

You can use numpy.mean:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import numpy as np
print(np.mean(l))

Solution 4

A statistics module was added in Python 3.4. It provides a function called mean that calculates the average. An example with the list you provided would be:

from statistics import mean
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
mean(l)

Solution 5

Why would you use reduce() for this when Python has a perfectly cromulent sum() function?

print(sum(l) / float(len(l)))

(The float() is necessary in Python 2 to force Python to do a floating-point division.)
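
To illustrate why the conversion matters, the sketch below runs on Python 3 and uses floor division (//) to stand in for what Python 2's / does with two integers. The variable names are illustrative.

```python
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

total = sum(l)   # 181
count = len(l)   # 9

# Floor division discards the fractional part, as Python 2's / does for ints
floor_avg = total // count

# Converting the length to float forces floating-point division everywhere
true_avg = total / float(count)

print(floor_avg)  # 20
print(true_avg)
```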


Author by

Carla Dessi

My name's Carla, I'm a BSc Computing student from Bournemouth, currently undergoing a placement as a Software Engineer. I like to spend my spare time creating mobile applications and developing websites/Wordpress themes.

Updated on March 20, 2022

Comments

Carla Dessi about 2 years
I have to find the average of a list in Python. This is my code so far
```
from functools import reduce

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(reduce(lambda x, y: x + y, l))
```
I've got it so it adds together the values in the list, but I don't know how to make it divide into them?
- mitch over 12 years
  
  numpy.mean if you can afford installing numpy
- n611x007 over 8 years
  
  sum(L) / float(len(L)). handle empty lists in caller code like if not L: ...
- n611x007 over 8 years
  
  please update your post and remove reduce and lambda because people are copying this from the top for bad use-cases. (well, unless you have pressing reason to use them.)
- n611x007 over 8 years
  
  duplicate: stackoverflow.com/questions/7716331
- n611x007 over 8 years
  
  @mitch: it's not a matter of whether you can afford installing numpy. numpy is a whole world in itself. It's whether you actually need numpy. Installing numpy, a 16 MB C extension, just to calculate a mean would be very impractical for someone not using it for other things.
- 25mhz almost 8 years
  
  Instead of installing the whole numpy package just for avg/mean, on Python 3 we can get this done with the statistics module, just by "from statistics import mean". On Python 2.7 or earlier, the statistics module can be downloaded from src: hg.python.org/cpython/file/default/Lib/statistics.py doc: docs.python.org/dev/library/statistics.html and used directly.
- Ravindra S almost 7 years
  
  Possible duplicate of Calculating arithmetic mean (average) in Python
Carla Dessi over 12 years

That's perfect! Sorry for the stupid question, but I've genuinely looked everywhere for that! Thank you so much!
Carla Dessi over 12 years

As I said, I'm new to this. I was thinking I'd have to write a loop or something to count the numbers in it; I didn't realise I could just use the length. This is the first thing I've done with Python.
Johan Lundberg over 12 years

interesting but that's not what he asked for.
Johan Lundberg over 12 years

I get that this is just for fun but returning 0 for an empty list may not be the best thing to do
Andrew Clark over 12 years

@JohanLundberg - You could replace the 0 with False as the last argument to reduce() which would give you False for an empty list, otherwise the average as before.
user1066101 over 12 years

@CarlaDessi: What tutorial are you using? This is thoroughly covered in all the tutorials I've seen. Clearly, you've found a tutorial that doesn't cover this well. What tutorial are you using to learn Python?
Chris Koston over 10 years

Inefficient. It converts all elements to float before adding them. It's faster to convert just the length.
Foo Bar User about 10 years

What if the sum is a massive number that won't fit in an int/float?
RolfBly about 10 years

For those of us new to the word 'cromulent'
Arseniy almost 10 years

@FooBarUser then you should calc k = 1.0/len(l), and then reduce: reduce(lambda x, y: x + y * k, l, 0.0) (the initial value 0.0 ensures the first element is scaled as well)
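
A minimal sketch of this scaling idea, assuming the question's list. reduce is given an explicit initial value of 0.0 so that every element, including the first, is multiplied by k. math.fsum is shown alongside as a standard-library alternative for accurate float summation; it is not part of the comment above.

```python
from functools import reduce
import math

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# Scale each term by 1/n so the running total stays near the size of the mean
k = 1.0 / len(l)
scaled_avg = reduce(lambda acc, y: acc + y * k, l, 0.0)

# Alternative: fsum tracks partial sums to limit floating-point rounding error
fsum_avg = math.fsum(l) / len(l)

print(scaled_avg, fsum_avg)
```

The two results can differ in the last few digits because the scaled version rounds at every step.
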
wsysuper about 9 years

Good. Every other answer didn't notice the empty list hazard!
Serge Stroobandt almost 9 years

This is the most elegant answer because it employs a standard library module which is available since python 3.4.
L. Amber O'Hearn over 8 years

That's strange. I would have assumed this would be much more efficient, but it appears to take 8 times as long on a random list of floats as simply sum(l)/len(l).
L. Amber O'Hearn over 8 years

Oh, but np.array(l).mean() is much faster.
Akavall over 8 years

@L.AmberO'Hearn, I just timed it and np.mean(l) and np.array(l).mean() are about the same speed, and sum(l)/len(l) is about twice as fast. I used l = list(np.random.rand(1000)). Of course, both numpy methods become much faster if l is a numpy.array.
n611x007 over 8 years

downvoted because I cannot see why reduce and lambda should be at the top of a question about average calculation
n611x007 over 8 years

well, unless that's the sole reason for installing numpy. installing a 16mb C package of whatever fame for mean calculation looks very strange on this scale.
lahjaton_j about 8 years

As a C++ programmer, that is neat as hell and float is not ugly at all!
Antti Haapala -- Слава Україні almost 8 years

And it is numerically stabler
kindall almost 8 years

Returning False (equivalent to the integer 0) is just about the worst possible way to handle this error. Better to catch the ZeroDivisionError and raise something better (perhaps ValueError).
Jules G.M. almost 8 years

He should really be using sum though, as Guido says to try really hard to avoid reduce
J'e over 7 years

with given list of floats, given hardware and py2.7.x, lambda... --> 2.59us, numpy.mean(l) --> 27.5us, sum(l)/len(l) --> 650ns
Flame_Phoenix over 7 years

what if the user adds floating point numbers to your array? The results will be super imprecise.
jpmc26 over 7 years

In Python 3, / returns a float regardless. You can use from __future__ import division to ensure the same behavior in Python 2.2 and up (so basically any version that's suitable for production today).
EndermanAPM almost 7 years

@AndrewClark why do you force float on len?
Brenouchoa over 6 years

check later response about python 3.4 up solution
MatTheWhale about 6 years

@kindall how is a ValueError any better than a ZeroDivisionError? The latter is more specific, plus it seems a bit unnecessary to catch an arithmetic error only to re-throw a different one.
kindall about 6 years

Because ZeroDivisionError is only useful if you know how the calculation is being done (i.e., that a division by the length of the list is involved). If you don't know that, it doesn't tell you what the problem is with the value you passed in. Whereas your new exception can include that more specific information.
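
One way this suggestion could look in practice is sketched below. The function name average and the error message are hypothetical, chosen only for illustration.

```python
def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    try:
        return sum(values) / len(values)
    except ZeroDivisionError:
        # Report the problem in terms of the argument, not the arithmetic
        raise ValueError("average() requires at least one value") from None


print(average([15, 18, 2, 36, 12, 78, 5, 6, 9]))
```

Calling average([]) then raises a ValueError that names the actual problem with the input.
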
Steinfeld over 5 years

If you want to limit the number of digits after the decimal point, this might come in handy: float('%.2f' % float(sum(l) / len(l)))
yprez about 5 years

@Steinfeld I don't think conversion to string is the best way to go here. You can achieve the same in a cleaner way with round(result, 2).
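
A short sketch of the round approach on Python 3, using the question's list. The f-string line is an additional option for display purposes only.

```python
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

avg = sum(l) / len(l)

# round returns a float rounded to two decimal places
rounded = round(avg, 2)
print(rounded)  # 20.11

# For display only, format the value instead of changing it
text = f"{avg:.2f}"
print(text)  # 20.11
```
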
fralau almost 5 years

Bravo: IMHO, sum(l)/len(l) is by far the most elegant answer (no need to make type conversions in Python 3).
cs95 over 4 years

This isn't a pandas question, so it seems excessive to import such a heavy library for a simple operation like finding the mean.
Boris Verkhovskiy over 4 years

And it produces a nicer error if you accidentally pass in an empty list: "statistics.StatisticsError: mean requires at least one data point" instead of the more cryptic "ZeroDivisionError: division by zero" for the sum(x) / len(x) solution.
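
The difference can be observed with a small sketch like the following. The helper error_name_for_empty is hypothetical and only records which exception type each approach raises for an empty list.

```python
import statistics


def error_name_for_empty(func):
    """Call func on an empty list and return the raised exception's type name."""
    try:
        func([])
    except Exception as exc:
        return type(exc).__name__
    return None


name_stats = error_name_for_empty(statistics.mean)
name_div = error_name_for_empty(lambda x: sum(x) / len(x))

print(name_stats)  # StatisticsError
print(name_div)    # ZeroDivisionError
```
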
Boris Verkhovskiy over 4 years

float() is not necessary on Python 3.
xilpex over 3 years

There is no need to store the values in variables or use global variables.
TankorSmash over 3 years

artima link appears to be dead
Elias over 3 years

Also it's better to use np.nanmean(l) in order to avoid issues with NAN and zero divisions
drevicko almost 3 years

I tried these timings with a list of length 100000000: mean2 < 1s; mean3,4 ~ 8s; mean5,6 ~ 27s; mean1 ~1minute. I find this surprising, would have expected numpy to be best with a large list, but there you go! Seems there's a problem with the statistics package!! (this was python 3.8 on a mac laptop, no BLAS as far as I know).
drevicko almost 3 years

Incidentally, if I convert l into an np.array first, np.mean takes ~.16s, so about 6x faster than sum(l)/len(l). Conclusion: if you're doing lots of calculations, best do everything in numpy.
Alon Gouldman over 2 years

@drevicko see mean4, this is what I do there... I guess that if it's already an np.array then it makes sense to use np.mean, but if you have a list then you should use sum(l) / len(l)
drevicko over 2 years

Exactly! It also depends on what you'll be doing with it later. In my work I'm typically doing a series of calculations, so it makes sense to convert to numpy at the start and leverage numpy's fast underlying libraries.
Asclepius about 2 years

@AlonGouldman Great. I urge showing each speed in 1/1000 of a second (as an integer), otherwise the number is hard to read. For example, 170, 2, 97, etc. This should make it so much more easily readable. Please let me know if this is done, and I will check.
Python about 2 years

The usage of this is: average(3,5,123), but you can input other numbers. And keep in mind that it returns a value, and doesn't print anything.