How to add boxplots to scatterplot with jitter
Solution 1
Here's one way using base graphics.
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
stripchart(NUMS ~ GRP, vertical = TRUE, data = ddf,
method = "jitter", add = TRUE, pch = 20, col = 'blue')
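A self-contained variant of the same idea, as a sketch only. The simulated `ddf` matches the data used elsewhere on this page; `set.seed()`, `outline = FALSE`, the explicit `ylim`, and `jitter = 0.15` are my own assumptions, not part of the answer above. Hiding the boxplot's outlier symbols avoids drawing those points twice, since stripchart() already plots every observation.

```r
set.seed(1)  # assumption: fixed seed so the data and the jitter are reproducible
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

# outline = FALSE hides the boxplot's own outlier symbols. ylim is taken from
# the full data, because the default limits then ignore the outliers and the
# overlaid points would be clipped.
bp <- boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = "NUMS",
              outline = FALSE, ylim = range(ddf$NUMS))

# Overlay every observation, jittered horizontally only.
stripchart(NUMS ~ GRP, data = ddf, vertical = TRUE, method = "jitter",
           jitter = 0.15, add = TRUE, pch = 20, col = "blue")
```

The `jitter` argument of stripchart() controls how far the points are spread, so lowering it keeps them closer to the centre of each box.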
Solution 2
To do this in ggplot2, try:
library(ggplot2)
ggplot(ddf, aes(x = GRP, y = NUMS)) +
  geom_boxplot(outlier.shape = NA) +  # avoid plotting outliers twice
  geom_jitter(position = position_jitter(width = 0.1, height = 0))
You can adjust the width and height arguments of position_jitter() to your liking, although I'd recommend height = 0, since vertical jitter moves the points away from their true NUMS values and makes the plot inaccurate.
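For the follow-up question about getting less scatter, here is a minimal sketch. The narrower `width = 0.05`, the `seed = 42` argument, and `alpha = 0.4` are illustrative choices of mine, not values from the answer. It also checks that, with height = 0, the plotted y-values are the observed ones.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the simulated data are reproducible
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

p <- ggplot(ddf, aes(x = GRP, y = NUMS)) +
  geom_boxplot(outlier.shape = NA) +  # outliers are drawn by the jitter layer
  geom_jitter(position = position_jitter(width = 0.05, height = 0, seed = 42),
              alpha = 0.4)            # semi-transparent points reduce overplotting
print(p)

# Data as drawn by the second layer (the jittered points).
jittered <- layer_data(p, 2)
```

Passing `seed` to position_jitter() makes the horizontal offsets repeatable between runs, which helps when the figure has to be regenerated.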
Solution 3
I've written an R function called spreadPoints() within a package called basicPlotteR. The package can be installed directly into your R library using the following code:
install.packages("devtools")
library("devtools")
install_github("JosephCrispell/basicPlotteR")
For the example provided, I used the following code to generate the example figure below.
library(basicPlotteR)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = TRUE))
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
spreadPointsMultiple(data = ddf, responseColumn = "NUMS", categoriesColumn = "GRP",
                     col = "blue", plotOutliers = TRUE)
It is a work in progress (the lack of a formula interface is clunky!), but it provides a non-random method of spreading points along the X axis that doubles as a violin-like summary of the data. Take a look at the source code if you're interested.
Solution 4
For a lattice solution:
library(lattice)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = TRUE))
bwplot(NUMS ~ GRP, ddf, panel = function(...) {
  panel.bwplot(..., pch = "|")
  panel.xyplot(..., jitter.x = TRUE)
})
The default median dot symbol was changed to a line with pch = "|". Other properties of the box and whiskers can be adjusted with box.umbrella and box.rectangle through the trellis.par.set() function. The amount of jitter can be adjusted through the factor argument of panel.xyplot(), which is passed on to jitter(); for example, factor = 1.5 gives 50% more jitter than factor = 1.
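The adjustments described above could look like the following sketch. The specific colours and line settings are placeholders of mine, and `par.settings` is used here as a per-plot alternative to calling trellis.par.set() globally. GRP is converted to a factor explicitly so the grouping does not depend on how the R version handles character columns.

```r
library(lattice)

set.seed(1)  # assumption: fixed seed so the simulated data are reproducible
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = factor(sample(LETTERS[1:5], 500, replace = TRUE)))

p <- bwplot(NUMS ~ GRP, data = ddf,
            # box.rectangle styles the box, box.umbrella styles the whiskers
            par.settings = list(box.rectangle = list(col = "black", lwd = 2),
                                box.umbrella  = list(col = "black", lty = 1)),
            panel = function(...) {
              panel.bwplot(..., pch = "|")   # median drawn as a line
              # factor is handed to jitter(); larger values spread points more
              panel.xyplot(..., jitter.x = TRUE, factor = 1.5)
            })
print(p)
```

Keeping the settings inside the bwplot() call means other lattice plots in the same session are unaffected.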
Comments
-
rnso over 3 years
I am using the following commands to produce a scatterplot with jitter:
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
library(lattice)
stripplot(NUMS~GRP,data=ddf, jitter.data=T)
I want to add boxplots over these points (one for every group). I tried searching, but I could not find code that plots all points (not just the outliers) with jitter. How can I solve this? Thanks for your help.
-
lukeA almost 10 years
Does it have to be lattice? Otherwise try something like
with(ddf, { boxplot(NUMS~GRP); points(jitter(as.numeric(factor(GRP))), NUMS, col=rgb(0,0,0,.2), cex=.5, pch=19) })
-
rnso almost 10 years
Using base graphics is preferred. Your solution works very well. Thanks.
-
rnso almost 10 years
Can this be done with ggplot2? I tried {ggplot(ddf,aes(x=GRP, y=NUMS))+geom_boxplot()+geom_jitter()} but it produces too much scatter. Could the jitter be less?
-
Brian D almost 8 years
See this related question as well for points jittered by group: stackoverflow.com/questions/21468380/…
-
-
rnso almost 10 years
Yes, it works very well. Thanks. I was trying stripplot followed by boxplot and it was not working.
-
Rich Scriven almost 10 years
The add = TRUE argument is key. :)
-
rnso almost 10 years
add=T alone may not be enough since {stripplot(NUMS~GRP,data=ddf, jitter=T) ; boxplot(NUMS~GRP,data=ddf, add=T)} does not work; apparently one needs to put a 'plot' first followed by points or chart.
-
Rich Scriven almost 10 years
stripplot is in lattice. stripchart is a base graphics function.
-
rnso about 5 years
Looks good. Is it possible to plot all groups with just one line of code rather than repeating code for each group:
spreadPoints(ddf[ddf$GRP=="A", "NUMS"], position=1, col="blue", plotOutliers=TRUE)
?
-
Joseph Crispell about 5 years
@rnso I've created an additional function spreadPointsMultiple() that can spread the points for multiple boxplots with a single command (see edit above). I'm currently working on allowing spreadPoints() to have a formula as its first argument. Thanks for pointing this out :-)