Finetuning a forest plot with ggplot2


This seems like what you had in mind:

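# cbind() builds a character matrix, so every column arrives as a factor
# (or, in R >= 4.0, as a character vector). Convert the numeric columns back;
# as.character() comes first so that factors yield their labels, not their level codes.
data$beta <- as.numeric(as.character(data$beta))
data$lower95 <- as.numeric(as.character(data$lower95))
data$upper95 <- as.numeric(as.character(data$upper95))
data$population <- as.numeric(as.character(data$population))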

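library(ggplot2)

# Map beta to x and cohort to y directly, so coord_flip() is not needed
ggplot(data=data,aes(x=beta,y=cohort))+
  geom_point(aes(size=population,fill=type), colour="black",shape=21)+
  geom_errorbarh(aes(xmin=lower95,xmax=upper95),height=0.667)+
  geom_vline(xintercept=0,linetype="dashed")+
  # show only the chosen population values in the size legend
  scale_size_continuous(breaks=c(5000,10000,15000))+
  # text column to the right of the intervals
  geom_text(aes(x=2.8,label=type),size=4)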

You'll have to adjust the arguments to geom_text(...), such as x, hjust and size, to position the labels and set their size as you want.
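One way to approach the label positioning is sketched below. It assumes stand-in data with the same column names as the question, and the names label_x and p, the seed, and the 0.3 and 0.4 offsets are hypothetical choices for illustration only. It also swaps geom_errorbarh() for geom_linerange() with xmin/xmax (available in ggplot2 3.3.0 and later), which draws the same zero-height intervals. The labels are left-aligned with hjust = 0 at a position derived from the data rather than a hard-coded 2.8.

```r
library(ggplot2)

# Hypothetical stand-in data; the seed is only there for reproducibility
set.seed(1)
data <- data.frame(
  cohort     = letters[1:15],
  population = runif(15, min = 2000, max = 50000),
  beta       = runif(15, min = -1, max = 2),
  lower95    = runif(15, min = -1.5, max = 0.5),
  upper95    = runif(15, min = 1.5, max = 2.5),
  type       = rep(c("CBCL", "SDQ"), length.out = 15)
)

# Assumption: put the text column just to the right of the widest interval
label_x <- max(data$upper95) + 0.3

p <- ggplot(data, aes(x = beta, y = cohort)) +
  geom_linerange(aes(xmin = lower95, xmax = upper95), colour = "blue") +
  geom_point(aes(size = population, colour = type), shape = 16) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  scale_size_continuous(breaks = c(5000, 10000, 15000)) +
  # hjust = 0 left-aligns each label at label_x
  geom_text(aes(x = label_x, label = type), hjust = 0, size = 3.5) +
  # widen the x axis so the labels are not clipped at the panel edge
  expand_limits(x = label_x + 0.4)

print(p)
```

Changing hjust to 0.5 or 1 centres or right-aligns the labels at label_x, and size controls the text size.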

As for making the plot prettier, I prefer this:

ggplot(data=data,aes(x=beta,y=cohort))+
  geom_point(aes(size=population,color=type),shape=16)+
  geom_errorbarh(aes(xmin=lower95,xmax=upper95),height=0.0, colour="blue")+
  geom_vline(xintercept=0,linetype="dashed")+
  scale_size_continuous(breaks=c(5000,10000,15000))+
  geom_text(aes(x=2.8,label=type),size=4)


Author: KJ_

Updated on September 14, 2022

Comments

  • KJ_
    KJ_ about 1 year

    I am trying to make a forest plot in R, displaying results from a meta-analysis. However, I run into problems using ggplot2. I have not found similar questions on Stack Overflow so far and would really appreciate some help.

    The code I am using now looks like this (I changed it a bit to make it self-contained):

    cohort <- letters[1:15]
    population <- c(  runif(15, min=2000, max=50000)) #hit1$N
    beta <-  c(  runif(15, min=-1, max=2))
    lower95 <- c(runif(15, min=-1.5, max=0.5))
    upper95 <- c(runif(15, min=1.5, max=2.5))
    type <- c("CBCL","SDQ","CBCL","SDQ","CBCL","SDQ","CBCL")
    data <- as.data.frame(cbind(cohort, population, beta ,lower95,upper95,type))
    
    
    ggplot(data=data, aes(x=cohort, y=beta))+
      geom_errorbar(aes(ymin=lower95, ymax=upper95), width=.667) +
      geom_point(aes(size=population, fill=type), colour="black",shape=21)+
      geom_hline(yintercept=0, linetype="dashed")+
      scale_x_discrete(name="Cohort")+
      coord_flip()+
      scale_shape(solid=FALSE)+
      scale_fill_manual(values=c( "CBCL"="white", "SDQ"="black"))+
      labs(title="Forest Plot") +
      theme_bw()
    

    Now, I have the following issues:

    • The x-axis is unreadable because all the values are overlapping.
    • The legend to the right ('population') displays all the values, but I want it solely to display some arbitrary values, like 5000, 10000 and 15000.
    • The plot should have a dashed line at y=0, but this line is displayed to the far right of the plot, which can't be right.
    • I would like to add additional text columns to the right of each bar (to display additional info for each specific cohort).
    • Any modifications to make the plot 'prettier' are always welcome.

    Thanks in advance!

    • aosmith
      aosmith almost 10 years
      Your first three issues are because all of your variables are factors due to combining cbind and as.data.frame. If you make type a vector of the same length as the others you can just do data <- data.frame(cohort, population, beta ,lower95,upper95,type).
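The suggestion in this comment could look roughly like the sketch below. It assumes that type should simply alternate between the two instruments across all 15 cohorts (the question only lists seven values), and the seed is added purely for reproducibility. Building the data frame with data.frame() keeps each column's own type, so no conversion back to numeric is needed afterwards.

```r
# Assumption: fixed seed so the random example data is reproducible
set.seed(1)

cohort     <- letters[1:15]
population <- runif(15, min = 2000, max = 50000)
beta       <- runif(15, min = -1, max = 2)
lower95    <- runif(15, min = -1.5, max = 0.5)
upper95    <- runif(15, min = 1.5, max = 2.5)
# Assumption: alternate the two types so that type has length 15 like the other vectors
type       <- rep(c("CBCL", "SDQ"), length.out = 15)

# data.frame() combines the vectors column by column without coercing them
# to a common type, unlike as.data.frame(cbind(...))
data <- data.frame(cohort, population, beta, lower95, upper95, type)

str(data)  # the four measurement columns are reported as num
```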