How to center labels in histogram plot

Solution 1

The other answers just don't do it for me. The benefit of using plt.bar over plt.hist is that bar can use align='center':

import numpy as np
import matplotlib.pyplot as plt

arr = np.array([ 0.,  2.,  0.,  0.,  0.,  0.,  3.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,
        0.,  0.,  0.,  0.,  2.,  0.,  3.,  1.,  0.,  0.,  2.,  2.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  3.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  1.,  0.,  0.,  0.,  1.,  2.,  2.])

labels, counts = np.unique(arr, return_counts=True)
plt.bar(labels, counts, align='center')
plt.gca().set_xticks(labels)
plt.show()

[Image: centering labels in a histogram]
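One caveat with the np.unique approach: it only returns values that actually occur, so a value with a zero count gets no bar and no tick. As a minimal sketch, assuming the data are non-negative whole numbers stored as floats (the small array below is made up for illustration, not taken from the question), np.bincount keeps the empty slots:

```python
import numpy as np

# Illustrative data (not from the question): note that 3 never occurs.
arr = np.array([0., 2., 0., 4., 1., 1.])

# np.unique reports only the values that are present.
labels, counts = np.unique(arr, return_counts=True)

# np.bincount counts every integer from 0 up to the maximum, zeros included.
# Assumption: the values are non-negative whole numbers.
full_counts = np.bincount(arr.astype(int), minlength=5)
full_labels = np.arange(len(full_counts))

print(labels, counts)
print(full_labels, full_counts)
```

full_labels and full_counts can then be passed to plt.bar(..., align='center') in the same way as labels and counts.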

Solution 2

The following alternative solution is compatible with plt.hist(), which has the advantage that, for instance, you can call it after a pandas.DataFrame.hist().

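import numpy as np
import matplotlib.pyplot as plt

def bins_labels(bins, **kwargs):
    # Width of one bin, assuming equally spaced bin edges.
    bin_w = (max(bins) - min(bins)) / (len(bins) - 1)
    # One tick per bin, at its center, labelled with the bin's left edge.
    # There are len(bins) - 1 bins, so the last edge gets no label; recent
    # matplotlib versions raise an error if ticks and labels differ in number.
    plt.xticks(np.arange(min(bins) + bin_w / 2, max(bins), bin_w), list(bins[:-1]), **kwargs)
    plt.xlim(bins[0], bins[-1])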

(The last line is not strictly required by the question, but it makes the output nicer.)

This can be used as in:

import matplotlib.pyplot as plt
# results is the array from the question.
bins = range(5)
plt.hist(results, bins=bins)
bins_labels(bins, fontsize=20)
plt.show()

Result: success!
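The same idea can be written against the bin edges that hist returns, which also covers bins that are not equally wide. This is only a sketch: the helper name label_bin_centers and the sample data are hypothetical, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def label_bin_centers(ax, edges, **kwargs):
    """Put one tick at the center of each bin, labelled with its left edge.

    Hypothetical helper (not part of matplotlib). `edges` is the array of
    bin edges, e.g. the second value returned by ax.hist().
    """
    edges = np.asarray(edges, dtype=float)
    centers = (edges[:-1] + edges[1:]) / 2  # midpoint of each bin
    ax.set_xticks(centers)
    ax.set_xticklabels([f"{left:g}" for left in edges[:-1]], **kwargs)
    ax.set_xlim(edges[0], edges[-1])
    return centers


data = [0, 0, 0, 1, 1, 2, 3, 3]  # made-up sample data
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(data, bins=range(5))
centers = label_bin_centers(ax, edges, fontsize=12)
```

Because the centers are computed as midpoints of neighbouring edges rather than from a single bin width, the helper does not depend on the bins being evenly spaced.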

Solution 3

You can build a bar plot out of the output of np.histogram.

Consider this:

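import numpy as np
import matplotlib.pyplot as plt

# a is the data array (results in the question)
his = np.histogram(a, bins=range(5))
fig, ax = plt.subplots()
offset = .4
# align='edge' anchors each bar at its left bin edge; this was the default
# in old matplotlib versions, whereas matplotlib 2.0+ defaults to 'center'.
plt.bar(his[1][:-1], his[0], align='edge')
ax.set_xticks(his[1][:-1] + offset)
ax.set_xticklabels(('0', '1', '2', '3'))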

EDIT: To make the bars touch one another, set the width parameter to the bin width (1 here) and adjust the tick offset to half of it.

fig, ax = plt.subplots()
offset = .5
# align='edge' places the left side of each bar on its left bin edge.
plt.bar(his[1][:-1], his[0], width=1, align='edge')
ax.set_xticks(his[1][:-1] + offset)
ax.set_xticklabels(('0', '1', '2', '3'))
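Rather than hard-coding the width and the offset, both can be derived from the bin edges. A minimal sketch, assuming a made-up sample array and the Agg backend (chosen only so the snippet runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

a = np.array([0, 0, 0, 1, 1, 2, 3, 3])  # made-up sample data
counts, edges = np.histogram(a, bins=range(5))

widths = np.diff(edges)            # width of each bin
centers = edges[:-1] + widths / 2  # midpoint of each bin

fig, ax = plt.subplots()
# align='edge' starts each bar at its left bin edge, so full-width bars touch.
bars = ax.bar(edges[:-1], counts, width=widths, align='edge')
ax.set_xticks(centers)
ax.set_xticklabels([str(left) for left in edges[:-1]])
```

With this form, changing the bins argument of np.histogram is enough; the bar widths and tick positions follow automatically.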

Solution 4

Here is a solution that only uses plt.hist(). Let's break this down in two parts:

  1. Have the x-axis labelled 0 1 2 3.
  2. Have the labels in the center of each bar.

To have the x-axis labelled 0 1 2 3 without .5 values, you can use the function plt.xticks() and provide as argument the values that you want on the x axis. In your case, since you want 0 1 2 3, you can call plt.xticks(range(4)).

To have the labels in the center of each bar, you can pass the argument align='left' to the plt.hist() function, which centers each bar on the left edge of its bin. Below is your code, minimally modified to do that.

import matplotlib.pyplot as plt

results = [0,  2,  0,  0,  0,  0,  3,  0,  0,  0,  0,  0,  0,  0,  0,  2,  0,  0,
           0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,
           0,  1,  1,  0,  0,  0,  0,  2,  0,  3,  1,  0,  0,  2,  2,  0,  0,  0,
           0,  0,  0,  0,  0,  1,  1,  0,  0,  0,  0,  0,  0,  2,  0,  0,  0,  0,
           0,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  3,  1,  0,  0,  0,  0,  0,
           0,  0,  0,  1,  0,  0,  0,  1,  2,  2]

plt.hist(results, bins=range(5), align='left')
plt.xticks(range(4))
plt.show()
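An equivalent way to get the same picture, mentioned in the comments as letting bin i run from i-1/2 to i+1/2, is to shift the bin edges instead of the bars. A small sketch with made-up data (not the array from the question), comparing the two binnings with np.histogram:

```python
import numpy as np

data = [0, 0, 0, 1, 1, 2, 3, 3]  # made-up sample data

# Integer edges 0..4, as in the code above.
counts_int, edges_int = np.histogram(data, bins=range(5))

# Half-integer edges -0.5, 0.5, ..., 3.5: each bin is centered on an integer.
edges_half = np.arange(-0.5, 4.0, 1.0)
counts_half, _ = np.histogram(data, bins=edges_half)

print(edges_half)
print(counts_int, counts_half)
```

Both binnings give the same counts for integer data. Passing edges_half as bins to plt.hist (with the default align='mid') and keeping plt.xticks(range(4)) should put the bars in the same positions as align='left' does with integer edges.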

Author: graffe

Updated on November 22, 2021

Comments

  • graffe
    graffe over 2 years

    I have a numpy array results that looks like

    [ 0.  2.  0.  0.  0.  0.  3.  0.  0.  0.  0.  0.  0.  0.  0.  2.  0.  0.
      0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.
      0.  1.  1.  0.  0.  0.  0.  2.  0.  3.  1.  0.  0.  2.  2.  0.  0.  0.
      0.  0.  0.  0.  0.  1.  1.  0.  0.  0.  0.  0.  0.  2.  0.  0.  0.  0.
      0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  3.  1.  0.  0.  0.  0.  0.
      0.  0.  0.  1.  0.  0.  0.  1.  2.  2.]
    

    I would like to plot a histogram of it. I have tried

    import matplotlib.pyplot as plt
    plt.hist(results, bins=range(5))
    plt.show()
    

    This gives me a histogram with the x-axis labelled 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0.

    I would like the x-axis to be labelled 0 1 2 3 instead with the labels in the center of each bar. How can you do that?

    • Mathias711
      Mathias711 about 10 years
      I'm not sure what you want. Do you want the bins centered around 1, 2, 3 (so around the integers instead of the 1.5, 2.5 values)? Or do you want to label the bars with text or something? Because if I execute your command, my output is (array([ 4., 5., 1., 2.]), array([0, 1, 2, 3, 4])) (with different input values). So I have got different bins, or am I missing something?
    • graffe
      graffe about 10 years
      @Mathias711 The first bar is the number of 0s in results, the second the number of 1s (there are eleven of them), the third the number of 2s (there are eight of them) and the last one is the number of 3s (there are three of them). I would like the number 0 as a label under the middle of the first bar, the number 1 as a label under the middle of the second and so on. Is that clearer?
    • Mathias711
      Mathias711 about 10 years
      So there are no problems with the binning, you just want to add labels to the bins?
    • graffe
      graffe about 10 years
      @Mathias711 Yes I want to get rid of the default labels and add the ones I described.
  • graffe
    graffe about 10 years
    Thanks! How do you get rid of the spaces between the bars?
  • Mathias711
    Mathias711 about 10 years
    Ahh thanks. I was also working on something like this, only didn't get the xticks working. Thanks for clarifying.
  • Acorbe
    Acorbe about 10 years
    @eleanora, bars have been fixed.
  • graffe
    graffe about 10 years
    Also, how do I create ('0', '1', '2','3') if I wanted it to go from 0 to 100, say? tuple(str(i) for i in range(101)) ?
  • askewchan
    askewchan about 10 years
    @eleanora That works, but I would use np.arange(1, 101).astype(str), or without numpy: map(str, range(1, 101))
  • Wikunia
    Wikunia almost 7 years
    May you explain why you only show the bins 0,1,2,3 but use range(5) ?
  • Pietro Battiston
    Pietro Battiston almost 7 years
    @Wikunia : sure. Bin 0 covers from 0 to 1 in the plot, bin 1 covers from 1 to 2... and so on until bin 3, which covers from 3 to 4 in the plot. So the bins (left and right) borders must be the sequence [0, 1, 2, 3, 4]... which is precisely range(5). Strange, I know, but the only alternative I see (centering bin i going from i-1/2 to i+1/2) would be more complicated.
  • Dalker
    Dalker over 6 years
    This answer is efficient in a more general case, if bins are redefined, e.g. as bins = np.arange(2, 7, .5)
  • Joey Carson
    Joey Carson almost 5 years
    How has no one congratulated you on this? Definitely the most elegant and straightforward solution to this exact problem. This should be considered the best solution for visualizing a discrete distribution.
  • yatu
    yatu about 4 years
    Indeed. This is so much simpler!
  • mins
    mins over 3 years
    Variant of the currently selected answer, no addition.
  • mins
    mins over 3 years
    "barplot is a neat way to do it". In some cases only. How would you do for these cases?
  • Igor Kołakowski
    Igor Kołakowski over 3 years
    The plot on the right could be created as Ted shows in his answer. To get the plot on the left, size() should be used instead of sum(), i.e. df.groupby(df.sold // 10 * 10).size().plot.bar(). But I guess it's worth comparing results with other approaches.
  • mins
    mins over 3 years
    I meant the difficulty in the linked case was about the use of hist with weight option. Cannot be replaced easily by barplot as it hasn't an equivalent possibility.
  • Sahar Millis
    Sahar Millis over 3 years
    This should be the answer. tnx :)
  • Maciek Woźniak
    Maciek Woźniak about 2 years
    depending on the python version it is either align='center' or align=mid