ggplot XY scatter - how to change alpha transparency for select points?


Solution 1

We can use annotate:

ggplot(df, aes(x=SeqIdentityMean,
               y=SeqIdentityStdDev,
               color=PfamA_ID))+
  geom_point(alpha=0.05) +
  annotate("point",
           df$SeqIdentityMean[special.points],
           df$SeqIdentityStdDev[special.points])


Using @jlhoward's example data:

## create artificial data set for this example
set.seed(1)     # for reproducibility
n  <- 1.4e4     # 14,000 points
df <- data.frame(SeqIdentityMean  =rnorm(n, mean=rep(-3:3, each=n/7)), 
                 SeqIdentityStdDev=rnorm(n, mean=rep(-3:3, each=n/7)),
                 PfamA_ID=rep(1:7, each=n/7))
df$PfamA_ID <- factor(df$PfamA_ID)

## you start here
library(ggplot2)
special.points <- sample(1:n, 7)

EDIT 1: We can add annotate("text",...)

ggplot(df, aes(x=SeqIdentityMean,
               y=SeqIdentityStdDev)) +
  geom_point(alpha=0.05) +
  annotate("point",
           df$SeqIdentityMean[special.points],
           df$SeqIdentityStdDev[special.points],
           col="red") +
  annotate("text",
           df$SeqIdentityMean[special.points],
           df$SeqIdentityStdDev[special.points],
           #text we want to display
           label=round(df$SeqIdentityStdDev[special.points],1),
           #adjust horizontal position of text
           hjust=-0.1)



EDIT 2:

#subset of special points
df_sp <- df[special.points,]
#one legend label per special point (here: its rounded mean)
df_sp$Label <- factor(round(df_sp$SeqIdentityMean, 2))

#plot
ggplot(df, aes(x=SeqIdentityMean,
               y=SeqIdentityStdDev)) +
  geom_point(alpha=0.05) +
  #special points, coloured by their label
  geom_point(data=df_sp,
             aes(col=Label), size=3) +
  #custom legend
  scale_colour_discrete(name = "Special Points")


Solution 2

Just create an alpha column in your dataset, and set the points you want to stand out to alpha = 1:

library(ggplot2)
alpha_vector = rep(0.025, nrow(mtcars))
alpha_vector[c(3,6,8)] = 1
mtcars$alpha = alpha_vector
ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point(aes(alpha = alpha))


The trick here is to realize that alpha is just another aesthetic.
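
One caveat, shown here as a minimal sketch on a copy of mtcars: an aesthetic mapped inside aes() goes through a scale, and the default alpha scale rescales the values to a range of about 0.1 to 1 and adds a legend. If the column already holds the alpha values you want, adding scale_alpha_identity() should make ggplot2 use them as given. The name cars is made up for this example.

```r
library(ggplot2)

# work on a copy so the built-in mtcars data set is left untouched
cars <- mtcars
cars$alpha <- 0.025            # faint by default
cars$alpha[c(3, 6, 8)] <- 1    # fully opaque for the rows to highlight

ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(aes(alpha = alpha)) +
  scale_alpha_identity()       # use the column values literally, no alpha legend
```

Without the identity scale the faint points are drawn at roughly 0.1 rather than 0.025, which may or may not matter for your plot.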

In addition, I would not plot 14k points directly and rely on alpha; I would use 2D binning instead. For example, hexagonal binning with geom_hex() (which requires the hexbin package to be installed):

ggplot(mtcars, aes(x = wt, y = mpg)) + geom_hex()
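
Binning and highlighting can be combined. The following is only a sketch on mtcars, using geom_hex() (ggplot2's hexagonal binning layer, which needs the hexbin package) and an ordinary point layer for the rows to highlight. The name highlight and the choice of rows 3, 6 and 8 are assumptions carried over from the alpha example.

```r
library(ggplot2)   # geom_hex() additionally needs the hexbin package installed

# rows to emphasise (hypothetical choice, same rows as in the alpha example)
highlight <- mtcars[c(3, 6, 8), ]

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_hex(bins = 10) +                                   # density of all points
  geom_point(data = highlight, color = "red", size = 3)   # special points on top
```

With 14,000 points a larger bins value would probably be more appropriate than 10.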

Solution 3

It's a bit difficult to know what you're up against without seeing your data, but with 14,000 points, increasing alpha alone is not likely to make the "special points" stand out enough. You could try this:

## create artificial data set for this example
set.seed(1)     # for reproducibility
n  <- 1.4e4     # 14,000 points
df <- data.frame(SeqIdentityMean  =rnorm(n, mean=rep(-3:3, each=n/7)), 
                 SeqIdentityStdDev=rnorm(n, mean=rep(-3:3, each=n/7)),
                 PfamA_ID=rep(1:7, each=n/7))
df$PfamA_ID <- factor(df$PfamA_ID)

## you start here
library(ggplot2)
special.points <- sample(1:n, 7)
ggp <- ggplot(df, aes(x=SeqIdentityMean, y=SeqIdentityStdDev, color=PfamA_ID))+
  geom_point(alpha=0.05)+
  geom_point(data=df[special.points,], aes(fill=PfamA_ID), color="black", alpha=1, size=4, shape=21)+
  scale_color_discrete(guide=guide_legend(override.aes=list(alpha=1, size=3)))+
  scale_fill_discrete(guide="none", drop=FALSE)
ggp

By using shape=21 (filled circle), you can give the special points a black outline, and then use aes(fill=...) for the colors. IMO this makes them stand out more. The most straightforward way to do this is with an extra call to geom_point(...) using a layer-specific data set containing only the special points.

Finally, even with this contrived example, the groups are all mashed together. If that's the case in your real data, I'd be inclined to try faceting:

ggp + facet_wrap(~PfamA_ID)

This has the advantage of highlighting which groups (PfamA_ID) the special points belong to, which isn't obvious from the earlier plot.

A couple of other points about your code:

  1. It's very bad practice to use, e.g., ggplot(df, aes(x=df$a, y=df$b, ...), ...). Instead use: ggplot(df, aes(x=a, y=b, ...), ...). The whole point of mapping is to associate the aesthetics (x, y, color, etc) with columns in df, using the column names. You were passing the columns as independent vectors.
  2. In the example, I set df$PfamA_ID to a factor in the data.frame, not in the call to aes(...). This is important because it turns out that the special points subset is missing some of the factor levels. If you did it the other way, the fill colors in the special layer would not line up with the point colors in the main layer.
  3. When you set alpha=0.05 (or whatever), the legend will use that alpha, which makes the legend almost useless. To get around this use:

    scale_color_discrete(guide=guide_legend(override.aes=list(alpha=1, size=3)))
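
Point 2 can be checked in base R. In this small sketch (toy vectors, not your data), subsetting an existing factor keeps all of its levels, while calling factor() on the subset rebuilds the levels from whatever values remain. The names ids, pick, f_before and f_after are made up for the illustration.

```r
ids  <- rep(1:7, each = 3)   # 7 groups, 3 rows each
pick <- c(1, 4, 20)          # rows from groups 1, 2 and 7 only

f_before <- factor(ids)[pick]   # convert to factor first, then subset
f_after  <- factor(ids[pick])   # subset first, then convert

levels(f_before)       # "1" "2" "3" "4" "5" "6" "7"
levels(f_after)        # "1" "2" "7"
as.integer(f_before)   # 1 2 7
as.integer(f_after)    # 1 2 3
```

In the second case group 7 sits in the third level position, so a discrete scale would give it the third colour instead of the seventh.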

Edit: Response to OP's last comment/request.

So it sounds like you want to use ggplot's default discrete color scale for everything except the first color (which is a desaturated red). This is not a great idea, but here is a way to do it:

# create custom color palette containing ggplot defaults for all but first color; use black for first color
n.col <- length(levels(df$PfamA_ID))
cols  <- c("#000000", hcl(h=seq(15, 375, length=n.col+1), l=65, c=100)[2:n.col])
# set color and fill palette manually
ggp <- ggplot(df, aes(x=SeqIdentityMean, y=SeqIdentityStdDev, color=PfamA_ID))+
  geom_point(alpha=0.05)+
  geom_point(data=df[special.points,], aes(fill=PfamA_ID), color="black", alpha=1, size=4, shape=21)+
  scale_color_manual(values=cols, guide=guide_legend(override.aes=list(alpha=1, size=3)))+
  scale_fill_manual(values=cols, guide="none", drop=FALSE)
ggp
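
As an alternative to the explicit hcl() call, the scales package (installed alongside ggplot2) exports hue_pal(), the generator behind ggplot2's default discrete colours. This is only a sketch; the colours should correspond to the defaults, but compare them on your own installation before relying on that. Here n.col is hard-coded to 7 to match the example data.

```r
library(scales)

n.col <- 7                  # number of PfamA_ID levels in the example data
cols  <- hue_pal()(n.col)   # default-style hue palette with n.col colours
cols[1] <- "#000000"        # replace the first colour with black
cols
```

The resulting vector can be passed as values= to scale_color_manual() and scale_fill_manual() in the same way as above.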

Author: AksR

Updated on July 17, 2022

Comments

  • AksR
    AksR almost 2 years

    I have ~14,000 XY pairs to plot, and using ggplot2 for this.

    Due to high numbers of points, I have had to use very low alpha=0.025. I want to highlight 7 XY points in a different color and more opaque, and have an accompanying text legend.

    Currently, my colors for the 7 special data points do not show up because they are also at alpha=0.025. How do I increase the opaqueness for just these points?

    Syntax I have so far is:

    trial <- ggplot(df, aes(x = df$SeqIdentityMean, 
                            y = df$SeqIdentityStdDev,
                            color = factor(df$PfamA_ID))) + 
                geom_point(alpha=0.025) + 
                labs(title="Mean Vs. standard deviation of PfamA seed lengths", 
                     x="Average Length (aa)",
                     y="Standard  Deviation of Length (aa)") + 
                theme(legend.title=element_blank(),
                      legend.key=element_rect(fill='NA'))
    
  • AksR
    AksR over 8 years
    in=read.table("inout",sep="\t",header=TRUE)
    df=as.data.frame(in)
    df$PfamA_ID <- factor(df$PfamA_ID)
    special.points <- sample(1:5) # 1st column Pfam_ID has IDs, but only in 1st 5 rows that I want color coded on plot and legend
    library(ggplot2)
    ggplot(df, aes(x=Mean, y=StdDev))+
      labs(title="T", x="X", y="Y")+
      geom_point(alpha=0.03)+
      geom_point(data=df[special.points,], aes(fill=PfamA_ID), color="black", alpha=1, size=2.5, shape=21)+
      scale_color_discrete(guide=guide_legend(override.aes=list(alpha=1, size=3)))+
      scale_fill_discrete(guide="none", drop=FALSE)

    Still no legend. Help pls!
  • AksR
    AksR over 8 years
    How do I color the highlighted points, with scatter in black (reverse of what you show), and add text - in another column of input, for just the special.points into the legend? Thanks!
  • AksR
    AksR over 8 years
    Also, what does the PfamA_ID=rep(1:7, each=n/7) do?
  • jlhoward
    jlhoward over 8 years
    Replace ggplot(df, aes(x=Mean, y=StdDev)) with ggplot(df, aes(x=Mean, y=StdDev, color=PfamA_ID)), as in the answer.
  • jlhoward
    jlhoward over 8 years
    PfamA_ID = rep(...) just creates a grouping column in the artificial data set (2000 1's, followed by 2000 2's, etc.). It has nothing to do with your real data.
  • AksR
    AksR over 8 years
    Oops, I was not clear in my earlier request. Let me clarify. How can I make those annotated colors each different, and then have a legend outside the plot with colors and corresponding text (from input data column #1, but for just the special.points)? So I don't want them all to be red, and I want the text outside the plot, not inside it. Sorry to be a pain, but I'm new to R, and brand new to ggplot :)
  • AksR
    AksR over 8 years
    Yes, that worked! Thanks a lot for patiently helping me :)
  • AksR
    AksR over 8 years
    Hopefully my final question / request for help... the majority of points that are not special.points came out pink in color. How can I re-assign just this color to what I want - black?