How to make a histogram from a list of data
Automatic bins
how to make 200 evenly spaced out bins, and have your program store the data in the appropriate bins?
The accepted answer manually creates 200 bins with np.arange and np.linspace, but matplotlib already does this automatically:
- plt.hist itself returns counts and bins:

    counts, bins, _ = plt.hist(data, bins=200)
Or if you need the bins before plotting:
- np.histogram with plt.stairs:

    counts, bins = np.histogram(data, bins=200)
    plt.stairs(counts, bins, fill=True)
Note that stair plots require matplotlib 3.4.0+.
- pd.cut with plt.hist:

    _, bins = pd.cut(data, bins=200, retbins=True)
    plt.hist(data, bins)
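To make "automatic" concrete, here is a minimal sketch of what an integer bins argument does in np.histogram, and of how to store each value in its bin, which is the second half of the question. The sample data, the seed, and the names manual_edges, bin_index and binned are made up for illustration.

```python
import numpy as np

# Made-up sample data; any 1-D sequence of numbers works the same way.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# An integer `bins` asks for that many equal-width bins over [min, max].
counts, bins = np.histogram(data, bins=200)

# The same edges built by hand, as the manual approach would do it.
manual_edges = np.linspace(data.min(), data.max(), 201)

# Store each value in its bin: np.digitize returns 1-based bin numbers,
# and the maximum lands one past the end, so shift and clip it back
# into the last bin (np.histogram treats the last bin as closed too).
bin_index = np.clip(np.digitize(data, bins) - 1, 0, len(counts) - 1)
binned = [data[bin_index == i] for i in range(len(counts))]
```

Here binned[i] holds the values that fall between bins[i] and bins[i + 1], so len(binned[i]) should match counts[i]. Note that 200 bins means 201 edges.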
Comments
-
Wana_B3_Nerd over 2 years
Well I think matplotlib got downloaded but with my new script I get this error:
    /usr/lib64/python2.6/site-packages/matplotlib/backends/backend_gtk.py:621: DeprecationWarning: Use the new widget gtk.Tooltip
      self.tooltips = gtk.Tooltips()
    Traceback (most recent call last):
      File "vector_final", line 42, in <module>
        plt.hist(data, num_bins)
      File "/usr/lib64/python2.6/site-packages/matplotlib/pyplot.py", line 2008, in hist
        ret = ax.hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, **kwargs)
      File "/usr/lib64/python2.6/site-packages/matplotlib/axes.py", line 7098, in hist
        w = [None]*len(x)
    TypeError: len() of unsized object
And my code is:

    #!/usr/bin/python
    l=[]
    with open("testdata") as f:
        line = f.next()
        f.next()# skip headers
        nat = int(line.split()[0])
        print nat
        for line in f:
            if line.strip():
                if line.strip():
                    l.append(map(float,line.split()[1:]))
    b = 0
    a = 1
    for b in range(53):
        for a in range(b+1,54):
            import operator
            import matplotlib.pyplot as plt
            import numpy as np
            vector1 = (l[b][0],l[b][1],l[b][2])
            vector2 = (l[a][0],l[a][1],l[a][2])
            x = vector1
            y = vector2
            vector3 = list(np.array(x) - np.array(y))
            dotProduct = reduce( operator.add, map( operator.mul, vector3, vector3))
            dp = dotProduct**.5
            print dp
            data = dp
            num_bins = 200 # <- number of bins for the histogram
            plt.hist(data, num_bins)
            plt.show()
But the code that's giving me the error is the new addition, which is the last part, reproduced below:
    data = dp
    num_bins = 200 # <- number of bins for the histogram
    plt.hist(data, num_bins)
    plt.show()
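Judging from the traceback, a likely cause is that data = dp is a single float rather than a list, so the len(x) call inside hist fails. A minimal sketch of one way around it, assuming the goal is one histogram of all pairwise distances: collect every distance in a list first, then histogram the list once. This is written for Python 3; the helper names pairwise_distances and plot_histogram and the three sample points are hypothetical stand-ins for the rows read from "testdata".

```python
import itertools

import numpy as np


def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of points."""
    distances = []
    for p, q in itertools.combinations(points, 2):
        diff = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
        distances.append(float(np.sqrt(np.dot(diff, diff))))
    return distances


def plot_histogram(values, num_bins=200):
    """Plot all values in a single histogram (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.hist(values, bins=num_bins)
    plt.show()


# Hypothetical stand-in for the coordinates parsed from the data file.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
distances = pairwise_distances(points)

# The list, not a single value, is what gets histogrammed.
counts, bins = np.histogram(distances, bins=200)
```

Calling plot_histogram(distances) then draws the plot once, after the loop over pairs has finished, instead of once per pair.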