How to make a histogram from a list of data

114,894

Automatic bins

How can I make 200 evenly spaced bins and have the program store the data in the appropriate bins?

The accepted answer manually creates 200 bins with np.arange and np.linspace, but matplotlib already does this automatically:

  1. plt.hist itself returns counts and bins

    counts, bins, _ = plt.hist(data, bins=200)
    

Or if you need the bins before plotting:

  1. np.histogram with plt.stairs

    counts, bins = np.histogram(data, bins=200)
    plt.stairs(counts, bins, fill=True)
    

    Note that stair plots require matplotlib 3.4.0+.

  2. pd.cut with plt.hist

    _, bins = pd.cut(data, bins=200, retbins=True)
    plt.hist(data, bins)
    

    [Figure: histogram output]
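To see what the automatic binning actually returns, here is a minimal sketch that uses only NumPy. The sample data, the seed, and the variable names are made up for illustration, and `np.digitize` is just one possible way to look up which bin each value lands in; the answer above does not depend on it.

```python
import numpy as np

# Made-up sample data for illustration only.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# 200 equal-width bins spanning min(data)..max(data).
counts, edges = np.histogram(data, bins=200)

# 200 bins are described by 201 edges, and every value is counted once.
print(len(counts), len(edges), counts.sum())

# Index of the bin each value falls into (0..199). np.digitize treats the
# last edge as outside the final bin, so clip the maximum back into it.
bin_index = np.clip(np.digitize(data, edges) - 1, 0, len(counts) - 1)

# Counting the indices gives the same result as np.histogram.
recount = np.bincount(bin_index, minlength=len(counts))
print(np.array_equal(recount, counts))
```

The `counts` and `edges` arrays have the same meaning as the first two values returned by `plt.hist(data, bins=200)`.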

Author: Wana_B3_Nerd

Beginner Programmer.

Updated on January 25, 2022

Comments

  • Wana_B3_Nerd, over 2 years ago

    Well, I think matplotlib got downloaded, but with my new script I get this error:

    /usr/lib64/python2.6/site-packages/matplotlib/backends/backend_gtk.py:621:     DeprecationWarning: Use the new widget gtk.Tooltip
      self.tooltips = gtk.Tooltips()
    Traceback (most recent call last):
      File "vector_final", line 42, in <module>
    plt.hist(data, num_bins)
      File "/usr/lib64/python2.6/site-packages/matplotlib/pyplot.py", line 2008, in hist
    ret = ax.hist(x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, **kwargs)
      File "/usr/lib64/python2.6/site-packages/matplotlib/axes.py", line 7098, in hist
    w = [None]*len(x)
    TypeError: len() of unsized object
    

    And my code is:

    #!/usr/bin/python

    l=[]
    with open("testdata") as f:
        line = f.next()
        f.next()# skip headers
        nat = int(line.split()[0])
        print nat
    
        for line in f:
            if line.strip():
                l.append(map(float,line.split()[1:]))
    
    
        b = 0
        a = 1
    for b in range(53):
        for a in range(b+1,54):
            import operator
            import matplotlib.pyplot as plt
            import numpy as np
    
            vector1 = (l[b][0],l[b][1],l[b][2])
            vector2 = (l[a][0],l[a][1],l[a][2])
    
            x = vector1
            y = vector2
            vector3 = list(np.array(x) - np.array(y))
            dotProduct = reduce( operator.add, map( operator.mul, vector3, vector3))
    
    
            dp = dotProduct**.5
            print dp
    
            data = dp
            num_bins = 200 # <- number of bins for the histogram
            plt.hist(data, num_bins)
            plt.show()
    

    But the code that's giving me the error is the new part I added at the end, reproduced below:

    data = dp
    num_bins = 200 # <- number of bins for the histogram
    plt.hist(data, num_bins)
    plt.show()
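
The traceback ends in `len(x)` because `data = dp` is a single number, while the matplotlib version shown in the traceback expects a sequence. One possible fix, sketched here in Python 3, is to collect every pairwise distance in a list first and call `plt.hist` once after the loops. The helper names `pairwise_distances` and `plot_histogram` and the sample coordinates are hypothetical and do not come from the original script.

```python
import itertools
import math


def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of points."""
    distances = []
    for p, q in itertools.combinations(points, 2):
        distances.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))))
    return distances


def plot_histogram(distances, num_bins=200):
    """Draw one histogram of all distances (hypothetical helper)."""
    # Imported here so the distance helper can be used without matplotlib.
    import matplotlib.pyplot as plt

    plt.hist(distances, bins=num_bins)
    plt.show()


# Made-up coordinates standing in for the rows read from "testdata".
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
distances = pairwise_distances(points)
print(distances)
```

Calling `plot_histogram(distances)` afterwards draws a single histogram of all distances, instead of one plot per pair as in the loop above.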