Understanding hist() and break intervals in R

Solution 1

According to the documentation, a single number passed to the breaks argument is treated only as a suggestion, because hist() rounds the breakpoints to "pretty" values. If you want to force exactly 10 equally spaced bins, the easiest way is probably to supply the breakpoints yourself:

x <- rnorm(50)
# 11 breakpoints from min(x) to max(x) give exactly 10 equal-width bins
hist(x, breaks = seq(min(x), max(x), length.out = 11))

The length.out value should be n + 1, where n is the desired number of bins, because n bins need n + 1 breakpoints.
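
To check the bin count without drawing anything, hist() accepts plot = FALSE and returns the object it would otherwise plot. The following is a minimal sketch on simulated data; the seed and the variable names n_bins, breaks and h are only illustrative.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
x <- rnorm(50)

n_bins <- 10
# n_bins + 1 equally spaced breakpoints spanning the data
breaks <- seq(min(x), max(x), length.out = n_bins + 1)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(x, breaks = breaks, plot = FALSE)

length(h$breaks)  # number of breakpoints
length(h$counts)  # number of bins
sum(h$counts)     # total number of observations counted
```

The counts component has one entry per bin, so its length is the number of bins that were actually used.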

Solution 2

If you read help(hist) you will find this explanation:

breaks: one of:

• a vector giving the breakpoints between histogram cells,

• a function to compute the vector of breakpoints,

• a single number giving the number of cells for the histogram,

• a character string naming an algorithm to compute the number of cells (see ‘Details’),

• a function to compute the number of cells.

In the last three cases the number is a suggestion only; as the breakpoints will be set to ‘pretty’ values, the number is limited to ‘1e6’ (with a warning if it was larger). If ‘breaks’ is a function, the ‘x’ vector is supplied to it as the only argument (and the number of breaks is only limited by the amount of available memory).

So the help page states explicitly that a single number passed as breaks is used only as a suggestion.
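
The "pretty" values come from the base function pretty(), which picks round breakpoints that cover the data range. The sketch below uses a small made-up vector and assumes that hist() computes its breakpoints as pretty(range(x), n = breaks, min.n = 1), which is what the source of hist.default appears to do.

```r
# Made-up data, chosen only to keep the example small
x <- c(0.3, 1.7, 2.2, 4.9, 6.1, 7.8, 9.4, 11.6)

# Round breakpoints covering range(x), aiming for about 10 intervals
p <- pretty(range(x), n = 10, min.n = 1)

h <- hist(x, breaks = 10, plot = FALSE)

p                     # the "pretty" breakpoints
length(h$breaks) - 1  # bins actually used, which may differ from 10
```

Because pretty() prefers round step sizes, the number of intervals it returns can be larger or smaller than the number requested.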

One possible solution is to provide the break points yourself like so:

x <- rnorm(296)
# 11 breakpoints define 10 bins; they must span the full range of x
hist(x, breaks = seq(-5, 5, by = 1))
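
Hard-coded breakpoints only work when they cover the data, because hist() stops with an error if some values fall outside them. One way around this, sketched here with an arbitrary seed and simulated data, is to derive integer breakpoints from the data itself. Note that the number of bins then depends on the range of x.

```r
set.seed(42)  # arbitrary seed, used only to make the sketch reproducible
x <- rnorm(296)

# Integer breakpoints guaranteed to cover range(x)
breaks <- seq(floor(min(x)), ceiling(max(x)), by = 1)

h <- hist(x, breaks = breaks, plot = FALSE)

length(h$counts)            # number of bins, which depends on the data
sum(h$counts) == length(x)  # TRUE: no observation is left out
```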

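If you would rather specify only the number of bins, you can use the cut function, which splits the range of x into that many equal-width intervals. Plotting the resulting factor gives a bar chart of the counts: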

plot(cut(x, 10))
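
To see the counts behind that plot, the factor returned by cut() can be tabulated. This is a minimal sketch on simulated data; the seed and the names bins and counts are only illustrative.

```r
set.seed(7)  # arbitrary seed, used only to make the sketch reproducible
x <- rnorm(296)

# cut() returns a factor whose levels are 10 equal-width intervals
bins <- cut(x, 10)

counts <- table(bins)  # frequency of each interval

nlevels(bins)  # number of intervals
counts         # counts per interval, labelled by interval
```

The interval labels are the factor levels, so this also answers which values fall into each interval.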
Author: Bobby

Updated on March 23, 2020

Comments

  • Bobby, about 4 years ago

    I've recently started using R and I don't think I'm understanding the hist() function well. I'm currently working with a numeric vector of length 296, and I'd like to divide it up into 10 equal intervals, and produce a frequency histogram to see which values fall into each interval. I thought hist(dataset, breaks = 10) would do the job, but it's dividing it into 12 intervals instead. I obviously misunderstood what breaks does.

    If I want to divide up my data into 10 intervals in my histogram, how should I go about doing that? Thank you.