Creating a density histogram in ggplot2?

Solution 1

I added the colours to your percentile bars manually, one per bin. See if this works for you.

library(ggplot2)

ggplot(df, aes(x=vector)) +
   # One fill colour per decile bin; the vector length must match the number of bins (10)
   geom_histogram(breaks=breaks, aes(y=..density..), colour="black",
                  fill=c("red","orange","yellow","lightgreen","green",
                         "darkgreen","blue","darkblue","purple","pink")) +
   geom_density(aes(y=..density..)) +
   scale_x_continuous(breaks=c(-3,-2,-1,0,1,2,3)) +
   ylab("Density") + xlab("df$vector") + ggtitle("Histogram of df$vector") +
   theme_bw() + theme(plot.title=element_text(size=20),
                      axis.title.y=element_text(size = 16, vjust=+0.2),
                      axis.title.x=element_text(size = 16, vjust=-0.2),
                      axis.text.y=element_text(size = 14),
                      axis.text.x=element_text(size = 14),
                      panel.grid.major = element_blank(),
                      panel.grid.minor = element_blank())
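
If you would rather have ggplot2's default fill colours than hand-picked names, one possible variation is to generate them with the scales package. This is only a sketch: it assumes scales is installed (it normally comes with ggplot2), it rebuilds the question's data so that it runs on its own, it uses after_stat(), which needs ggplot2 3.3.0 or later, and it relies on the same trick as above of passing one fill value per bin.

```r
library(ggplot2)

# Rebuild the data from the question so the snippet is self-contained
set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))
df <- data.frame(vector = vector)

# One colour per bin, taken from ggplot2's default discrete hue palette
default_cols <- scales::hue_pal()(length(breaks) - 1)

p <- ggplot(df, aes(x = vector)) +
  geom_histogram(aes(y = after_stat(density)), breaks = breaks,
                 colour = "black", fill = default_cols) +
  geom_density()
```

Printing p draws the plot. The theme settings from the code above can be added to it unchanged.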

Solution 2

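Mapping fill=seg results in grouping: ggplot2 computes a separate histogram for each value of seg and normalises each one to integrate to 1. Because every group falls entirely inside a single bin, each bar ends up with height 1/width instead of (share of all observations)/width, which is why the y scale is inflated. If you don't need the colours, you could use this: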

ggplot(df) + 
  geom_histogram(breaks=breaks,aes(x=vector,y=..density..), position="identity") + 
  geom_density(aes(x=vector,y=..density..))
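
To see the effect of the grouping in numbers, here is a small check that needs no plotting at all. It is a sketch that recreates the question's data. The names width, overall and per_group are made up for this illustration.

```r
set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))
seg <- cut(vector, breaks, labels = FALSE, include.lowest = TRUE)

width <- diff(unname(breaks))

# Density over the whole sample: count / (n * width)
overall <- as.vector(table(seg)) / (length(vector) * width)

# Density within one group: all of the group's points lie in its own bin,
# so count / (n_group * width) collapses to 1 / width
per_group <- 1 / width

# Ratio of grouped to ungrouped bar heights, one value per bin
round(per_group / overall, 2)
```

With ten decile bins each group holds about a tenth of the data, so the grouped bars come out roughly ten times too tall.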

If you need the colours, it might be easiest to calculate the density values outside of ggplot2.
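
A minimal sketch of that idea, under a few assumptions: the bins are computed once with base hist(plot = FALSE), the bars are drawn with geom_rect(), and the data frame bars with its column names is made up for this example. Mapping fill to the bin factor then gives the default ggplot2 fill colours without splitting the density calculation by group.

```r
library(ggplot2)

set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector, seq(0, 1, by = 0.1))

# Bin once over the whole sample, so each bar's density is
# count / (total n * bin width), as in base hist()
h <- hist(vector, breaks = breaks, plot = FALSE)
b <- unname(h$breaks)

# One row per bin: left edge, right edge, height and a bin label
bars <- data.frame(
  xmin    = head(b, -1),
  xmax    = tail(b, -1),
  density = h$density,
  seg     = factor(seq_along(h$density))
)

p <- ggplot() +
  geom_rect(data = bars,
            aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = density, fill = seg),
            colour = "black") +
  geom_density(data = data.frame(vector = vector), aes(x = vector)) +
  ylab("Density")
p
```

Because the heights are computed before ggplot2 sees the data, the fill mapping only colours the bars and no longer changes the y scale.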

Solution 3

Another option uses ggpubr:

library(ggpubr)
gghistogram(df, x = "vector", add = "mean", rug = TRUE, fill = "seg",
   palette = c("#00AFBB", "#E7B800", "#E5A800", "#00BFAB", "#01ADFA", 
   "#00FABA", "#00BEAF", "#01AEBF", "#00EABA", "#00EABB"), add_density = TRUE)
Author by

Usobi

Updated on September 13, 2020

Comments

  • Usobi
    Usobi over 3 years

    I want to create the following density histogram with ggplot2. With base graphics it is really easy:

    set.seed(46)
    vector <- rnorm(500)  
    breaks <- quantile(vector,seq(0,1,by=0.1))
    labels = 1:(length(breaks)-1)
    den = density(vector)
    hist(vector,
         breaks=breaks,
         col=rainbow(length(breaks)),
         probability=TRUE)
    lines(den)
    

    With ggplot2 I have got this far:

    seg <- cut(vector,breaks,
               labels=labels,
               include.lowest = TRUE, right = TRUE)
    df = data.frame(vector=vector,seg=seg)
    
    ggplot(df) + 
         geom_histogram(breaks=breaks,
                        aes(x=vector,
                            y=..density..,
                            fill=seg)) + 
         geom_density(aes(x=vector,
                          y=..density..))
    

    But the "y" scale is wrong. I have noticed that the following call gets the "y" scale right.

     ggplot(df) + 
         geom_histogram(breaks=breaks,
                        aes(x=vector,
                            y=..density..)) + 
         geom_density(aes(x=vector,
                          y=..density..))
    

    I just do not understand it. y=..density.. is there, and that should be the height. So why on earth does my scale change when I try to fill the bars?

    I do need the colours. I just want a histogram where the breaks and the colours of each block are directionally set according to the default ggplot fill colours.

  • Roland
    Roland over 10 years
    Sorry, I don't understand your comment.
  • Usobi
    Usobi over 10 years
    Thanks. I looked for alternative solutions, but I ended up using yours.