How to use facet_grid() with geom_histogram()

I am not sure exactly what caused this problem, but it appears to be solved by cleaning up and simplifying the ggplot code. In particular, ggplot2 is not designed to take column-selection syntax such as dfrm$am or dfrm[, "am"] inside aes() (nor inside a formula such as facet_wrap(. ~ dfrm[, "vs"])). Such expressions often seem to work, but they bypass the data frame passed to ggplot() and should generally be avoided; refer to columns by their bare names instead.

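library(ggplot2)

# dfrm is the data frame built in the question: mtcars with am and vs converted to labelled factors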

table(dfrm$am, dfrm$vs)
#           
#             V-engine Straight-Engine
#   Automatic       12               7
#   Manual           6               7

p = ggplot(dfrm, aes(x=mpg, fill=am)) +
    geom_histogram(position="identity", colour="grey40", alpha=0.2, bins = 10) +
    facet_grid(. ~ vs)

ggsave("hist.png", p, height=4, width=6, dpi=150)
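
Rather than counting bars by eye, the plotted counts can be compared with the table above programmatically. The following is a minimal sketch, assuming a ggplot2 version that provides layer_data() and that dfrm is rebuilt as in the question; the names built and plotted are only illustrative.

```r
library(ggplot2)

# Rebuild the data frame from the question
dfrm <- mtcars
dfrm$am <- factor(dfrm$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
dfrm$vs <- factor(dfrm$vs, levels = c(0, 1), labels = c("V-engine", "Straight-Engine"))

p <- ggplot(dfrm, aes(x = mpg, fill = am)) +
  geom_histogram(position = "identity", colour = "grey40", alpha = 0.2, bins = 10) +
  facet_grid(. ~ vs)

# layer_data() returns the per-bin values computed for the first layer,
# including the bin count, the facet panel (PANEL) and the fill group (group)
built <- layer_data(p, 1)

# Sum the bar heights per fill group (rows) and facet panel (columns)
plotted <- xtabs(count ~ group + PANEL, data = built)
plotted

# The same layout from the raw data, for comparison
table(dfrm$am, dfrm$vs)
```

If the two tables agree, the histogram shows the data correctly and any mismatch comes from how the bars were counted or from how the columns were passed to aes().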

Author by

naco

Updated on June 22, 2022

Comments

  • naco
    naco almost 2 years

    I tried using facet_grid() for the first time. I plotted histograms with my own data, and the distribution looked wrong when I counted the bars manually on the graph. I reproduced my code with the mtcars data, and the problem persisted.

    Here is the histogram produced by ggplot:

    dfrm <- mtcars
    dfrm$am <- factor(dfrm$am, levels = c(0,1), labels = c("Automatic", "Manual"))
    dfrm$vs <- factor(dfrm$vs, levels = c(0,1), labels = c("V-engine", "Straight-Engine"))
    
    require(ggplot2)
    ggplot(dfrm, aes(x=dfrm[,"mpg"], fill=dfrm[,"am"], colour=dfrm[,"am"])) +
    geom_histogram(colour="transparent", position = "identity", alpha=0.2, bins = 10) +
    facet_grid(. ~ dfrm[,"vs"])
    

    When I count manually on the histogram, I count:

    • V-Engine, Automatic: 14
    • V-Engine, Manual: 4
    • Straight Engine, Automatic: 5
    • Straight Engine, Manual: 9

    This code counts how many cars fall into each combination in the actual data:

    # by() and table() are base R functions, so no extra package is needed
    by(data = dfrm$am, INDICES = dfrm$vs, FUN = table)
    

    and the results are:

    • V-Engine, Automatic: 12
    • V-Engine, Manual: 6
    • Straight Engine, Automatic: 7
    • Straight Engine, Manual: 7

    Am I doing something wrong? Is there a better way to facet, or is this a bug?

    I also drew histograms with base graphics to check whether the results match, and those look accurate when I count the bars.

    # V-engine (vs == 0): automatic first, then manual overlaid
    hist(mtcars[which(mtcars[,"am"]==0 & mtcars[,"vs"]==0),"mpg"], xlim=c(10, 35), col=rgb(0.1,0.1,0.1,0.5), breaks=10)
    hist(mtcars[which(mtcars[,"am"]==1 & mtcars[,"vs"]==0),"mpg"], col=rgb(0.8,0.8,0.8,0.5), breaks=10, add=TRUE)
    # Straight engine (vs == 1): automatic first, then manual overlaid
    hist(mtcars[which(mtcars[,"am"]==0 & mtcars[,"vs"]==1),"mpg"], xlim=c(10, 35), col=rgb(0.1,0.1,0.1,0.5), breaks=10)
    hist(mtcars[which(mtcars[,"am"]==1 & mtcars[,"vs"]==1),"mpg"], col=rgb(0.8,0.8,0.8,0.5), breaks=10, add=TRUE)
    

    Thanks.

    === EDIT ===

    The answer provided by bdemarest solves the problem. However, I am confused about the syntax that ggplot2 expects, and how to use it inside a function. Here is what I am aiming for:

    myfunc <- function(varx, dfrm, facet = F){
      require(ggplot2)
      p = ggplot(dfrm, aes(x=varx, fill=am)) +
        geom_histogram(position="identity", colour="grey40", alpha=0.2, bins = 10)
      if(!is.logical(facet)){
        p <- p + facet_grid(. ~ facet)
      }
      return(p)
    }
    myfunc("mpg", mtcars, facet = "vs")
    

    I tried with and without quotation marks, but couldn't get it to work.

    === EDIT2 ===

    With the help of bdemarest in the comments, I made a lot of progress. Now the color fill fails, but only when the ggplot call is inside a function.

    Here, this works perfectly:

    facet = "vs"
    p = ggplot(dfrm, aes_string(x="mpg", fill="am")) +
      geom_histogram(position="identity", colour="grey40", alpha=0.2, bins = 10)
    if(!is.logical(facet)){
      p <- p + facet_grid(reformulate(facet, "."))
    }
    p
    

    However, this does not:

    myfunc <- function(varx, dfrm, facet = FALSE){
      require(ggplot2)
      p = ggplot(dfrm, aes_string(x=varx, fill="am")) +
        geom_histogram(position="identity", colour="grey40", alpha=0.2, bins = 10)
      if(!is.logical(facet)){
        p <- p + facet_grid(reformulate(facet, "."))
      }
      return(p)
    }
    myfunc("mpg", mtcars, facet = "vs")
    

    The only remaining problem is that the groups won't get colored by am. What am I missing?
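
    A possible explanation for EDIT2, offered as a sketch rather than a definitive answer: the working snippet plots dfrm, where am is a factor, while the function is called with mtcars, where am is still numeric. A numeric fill variable does not split the histogram into groups, so the bars are not colored per group. The hypothetical helper below (plot_hist and its arguments are illustrative names, not part of ggplot2) converts the fill column to a factor inside the function and uses the .data pronoun, which assumes ggplot2 3.0.0 or later, in place of aes_string().

```r
library(ggplot2)

# Hypothetical helper for illustration; the name and arguments are assumptions
plot_hist <- function(dfrm, varx, fill_var = "am", facet = NULL) {
  # Make the fill column a factor so each level forms its own group and colour
  dfrm[[fill_var]] <- factor(dfrm[[fill_var]])

  # .data[[name]] looks up a column by its name given as a string
  p <- ggplot(dfrm, aes(x = .data[[varx]], fill = .data[[fill_var]])) +
    geom_histogram(position = "identity", colour = "grey40", alpha = 0.2, bins = 10) +
    labs(x = varx, fill = fill_var)

  # Facet only when a column name is supplied; reformulate() builds `. ~ <facet>`
  if (!is.null(facet)) {
    p <- p + facet_grid(reformulate(facet, "."))
  }
  p
}

p <- plot_hist(mtcars, "mpg", facet = "vs")
p
```

    Calling the original myfunc() with dfrm instead of mtcars should have a similar effect, since dfrm already stores am as a factor.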