How to add a line to a boxplot using ggplot2

Solution 1

ggplot() + 
  geom_boxplot(data = forecasts,
               aes(x = Date, y = value, 
                   group = interaction(Date, f_type), 
                   fill = f_type), 
               width = 10) + 
  geom_line(data = observation,
            aes(x = Dt, y = obs), size = 2)

This is what you want. You need x to be a continuous date variable (not as.factor, as in your code), so that the x axis receives the same type of data from both datasets. The group = interaction(Date, f_type) line tells ggplot to draw a separate box for each combination of date and f_type. With that in place, adding the line is simple.
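Both requirements can be checked before plotting. The following is a minimal sketch that rebuilds the sample data from the question (written more compactly here) and inspects the x classes and the grouping; the variable name groups is only for illustration.

```r
# Sample data from the question, written compactly
dates <- as.Date(c("2007-01-31", "2007-02-28", "2007-03-31"))
forecasts <- data.frame(
  f_type = rep(c("A", "B"), each = 9),
  Date = rep(rep(dates, each = 3), times = 2),
  value = c(10, 50, 60, 5, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15, 49, 27)
)
observation <- data.frame(Dt = dates, obs = c(30, 49, 57))

# Both layers should supply the same class of x variable
class(forecasts$Date)
class(observation$Dt)

# One group (and therefore one box) per combination of date and f_type
groups <- interaction(forecasts$Date, forecasts$f_type)
nlevels(groups)
table(groups)
```

With three dates and two forecast types, this should report six groups of three values each.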

If you don't want x to be a continuous date, then your use of as.factor is right, but you need to add group = 1 to the geom_line aesthetics so it knows to connect the points across the discrete factor levels.

ggplot() + 
  geom_boxplot(data = forecasts,
               aes(x = as.factor(Date), y = value, 
                   group = interaction(Date, f_type), 
                   fill = f_type)) + 
  geom_line(data = observation,
            aes(x = as.factor(Dt), y = obs, group = 1), size = 2)

Also notice that I removed the width option in the second graph (which means I'm just using the default ggplot value). You can play around with that value to see what looks the best with your data.
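For the continuous-date version, width is measured in the units of the x axis, which for Date values means days. One possible starting point, sketched below, is to derive the width from the smallest gap between dates; the 0.4 factor and the names gap_days and box_width are arbitrary choices for illustration, not values from the answer.

```r
# Dates used in the sample data
dates <- as.Date(c("2007-01-31", "2007-02-28", "2007-03-31"))

# Smallest spacing between consecutive dates, in days
gap_days <- min(as.numeric(diff(dates)))

# Use a fraction of that gap so neighbouring boxes do not touch
box_width <- 0.4 * gap_days
box_width
```

The result can then be passed as width = box_width in geom_boxplot and adjusted by eye.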

Finally, in both my examples, I moved the data and aesthetics into the geom statements that were going to use them. In complex figures, it's sometimes hard to remember which layers use which data and which aesthetics, so while you're debugging and troubleshooting, it's nice to not have any in the main ggplot() call.

Solution 2

Try this:

p <- ggplot(data = forecasts, aes(x = as.factor(Date), y = value))
p <- p + geom_boxplot(aes(fill = f_type))

p <- p + geom_hline(aes(yintercept = 12), colour = "#990000")
p

Here's a link: http://www.cookbook-r.com/Graphs/Lines_(ggplot2)/

Solution 3

forecasts <- data.frame(
  f_type = rep(c("A", "B"), each = 9),
  Date = rep(rep(as.Date(c("2007-01-31", "2007-02-28", "2007-03-31")), each = 3), times = 2),
  value = c(10, 50, 60, 5, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15, 49, 27)
)

observation <- data.frame(Dt = as.Date(c("2007-01-31", "2007-02-28", "2007-03-31")), obs = c(30, 49, 57))
p <- ggplot(data = forecasts, aes(x = as.factor(Date), y = value))
p <- p + geom_boxplot(aes(fill = f_type))
p <- p + geom_line(data = observation, aes(x = as.factor(Dt), y = obs, group = 1))
print(p)

[Figure: boxplot with line]

Asked by Reza Ahmad

Updated on July 17, 2022

Comments

  • Reza Ahmad almost 2 years

    I am trying to make box-and-whisker plots of some forecast data, and I want to add the observations to the plot as a line. Here is a sample of the data so you can see what it looks like.

    forecasts <- data.frame(f_type=c(rep("A",9),rep("B",9)),Date=c(rep(as.Date("2007-01-31"),3),rep(as.Date("2007-02-28"),3),rep(as.Date("2007-03-31"),3),rep(as.Date("2007-01-31"),3),rep(as.Date("2007-02-28"),3),rep(as.Date("2007-03-31"),3)),value=c(10,50,60,05,90,20,30,46,39,69,82,48,65,99,75,15,49,27))

    observation <- data.frame(Dt=c(as.Date("2007-01-31"),as.Date("2007-02-28"),as.Date("2007-03-31")),obs=c(30,49,57))

    With the forecasts I can draw the box-and-whisker plot using ggplot2 as shown below.

    p <- ggplot(data = forecasts, aes(x=as.factor(Date), y=value))
    p <- p + geom_boxplot(aes(fill=f_type))

    Now I want to add the observations for those dates as a line to this plot. So far, I have tried the following:

    1. p <- p + geom_line(data = observation,aes(x=Dt,y=obs)). This gives an error saying:

      Error: Invalid input: date_trans works with objects of class Date only

    2. With the x axis as a factor, like this: p <- p + geom_line(data = observation,aes(x=as.factor(Dt),y=obs)), for which I get the following error:

      geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

    Can anyone please suggest how I can accomplish this? Thanks in advance.

  • Reza Ahmad almost 7 years
    I am not trying to add a horizontal line. I want to add a line representing the observations I have.
  • Reza Ahmad almost 7 years
    Perfect. I was looking for something like your second method. Thanks a lot.
  • Reza Ahmad almost 7 years
    Thanks. It seems using group argument with geom_line solves the issue.
  • Reza Ahmad almost 7 years
    Is there any way to define the whiskers for this plot? I tried stat_summary with my own functions and geom="boxplot". This does apply the ranges I defined, but it messes up the groups.
  • Brian almost 7 years
    Define how? According to the documentation, ggplot2.tidyverse.org/reference/geom_boxplot.html, each whisker extends to the most extreme value within 1.5 times the interquartile range of the box, and that multiplier can be adjusted using coef = inside geom_boxplot. If you want the whiskers to be some other statistic, you need a function to pass to stat_summary like you tried, but you need to include the aes(...) from above to keep the grouping correct.
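
The stat_summary approach from the last comment might look like the sketch below. It assumes the whiskers should sit at the 5th and 95th percentiles, which is an arbitrary choice for illustration, and whisker_stats is a hypothetical helper name. The grouping aesthetics are repeated from the answer above, and the boxes are dodged explicitly because stat_summary does not dodge by default.

```r
library(ggplot2)

# Sample data from the question, written compactly
forecasts <- data.frame(
  f_type = rep(c("A", "B"), each = 9),
  Date = rep(rep(as.Date(c("2007-01-31", "2007-02-28", "2007-03-31")), each = 3), times = 2),
  value = c(10, 50, 60, 5, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15, 49, 27)
)

# Hypothetical helper: returns the five statistics a boxplot geom needs,
# with whiskers at the 5th and 95th percentiles (an assumption, not a default)
whisker_stats <- function(x) {
  q <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95), names = FALSE)
  data.frame(ymin = q[1], lower = q[2], middle = q[3], upper = q[4], ymax = q[5])
}

# Same grouping as in the accepted answer, so each date/f_type pair gets its own box
p <- ggplot(forecasts,
            aes(x = as.factor(Date), y = value,
                group = interaction(Date, f_type), fill = f_type)) +
  stat_summary(fun.data = whisker_stats, geom = "boxplot",
               position = position_dodge(width = 0.9))
print(p)
```

If only the whisker length needs to change, passing a different multiplier such as geom_boxplot(coef = 3) is simpler than a custom summary function.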