R - Add legend to scatter plot

r plot legend scatter-plot

10,811

The grouping column df$x1 must be a factor so that plot() can turn each label into a colour. So the solution is:

df <- setNames(data.frame(t(data.frame(c("JAN",675,-0.0227745,-0.00725257), 
                                 c("FAN",308,-0.00311583,0.0396208), 
                                 c("CAN",173,0.0209893,-0.00499655), 
                                 c("JAN",176,-0.022875,-0.0176274), 
                                 c("FAN3",30,0.00511254,0.00040608), 
                                 c("FAN2",97,0.00297323,0.0074444), 
                                 c("JAN",493,-0.0202015,-0.00826022), 
                                 c("CAN",512,0.019516,-0.0122018), 
                                 c("CAN",617,0.0162082,-0.00594085), 
                                 c("JAN",790,-0.0256026,-0.0112882), 
                                 c("JAN",816,-0.020059,-0.000686427), 
                                 c("CAN",511,0.0247956,-0.010808), 
                                 c("RAN",81,0.00385228,-0.0111547), 
                                 c("CAN",305,0.0165547,-0.0123792), 
                                 c("FAN2",51,0.0042059,0.0103337), 
                                 c("FAN2",66,0.00468969,0.0118249), 
                                 c("RAN",97,0.00878763,-0.0205951), 
                                 c("FAN2",95,-0.00557579,0.00274432), 
                                 c("FAN2",102,-0.00143439,0.020084), 
                                 c("FAN",119,-0.00172261,0.0392606),
                                 row.names = NULL,stringsAsFactors = FALSE))), 
                                c("x1","x2","x3", "x4"))
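# t() leaves every column as text, so convert the numeric columns back first
num_cols <- c("x2", "x3", "x4")
df[num_cols] <- lapply(df[num_cols], function(v) as.numeric(as.character(v)))
df$x1 <- factor(df$x1)
plot(df$x3, df$x4, col = df$x1, xlab = "x3", ylab = "x4")
# One legend colour per factor level, in the same order as levels(df$x1)
legend(x = "topright", legend = levels(df$x1), 
    col = seq_along(levels(df$x1)), pch = 1, cex = 0.6)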

cex = 0.6 scales the legend text and symbols to 60% of the default size.
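To see why the legend needs one colour per level rather than one per row, it can help to look at the integer codes behind a factor. The following is a minimal sketch on a hypothetical five-label vector (not the full data set); it relies on base R treating an integer colour i as palette()[i].

```r
# Hypothetical toy labels, chosen only to illustrate the mapping
groups <- c("JAN", "FAN", "CAN", "JAN", "FAN2")
f <- factor(groups)

lev <- levels(f)          # distinct labels, sorted
codes <- as.integer(f)    # one integer code per row, indexing into lev

# Colours the points receive when col = f is passed to plot()
point_cols <- palette()[codes]

# Colours the legend needs: one per level, in level order
legend_cols <- palette()[seq_along(lev)]
```

Because legend() pairs the i-th label with the i-th colour, passing the per-row codes instead of seq_along(lev) would attach the wrong colours to the labels.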

Authored by

tommy.carstensen

Working in human genetics via structural bioinformatics / biophysics via wet lab biochemistry. Writing a bit of Python code since 2005 (version 2.4). Grateful for receiving help on SO and trying to pay back, when I have the time and ability.

Updated on June 04, 2022

Comments

tommy.carstensen almost 2 years

I have never used R before so please don't assume I know even the simplest things. I come from gnuplot/matplotlib.

Let's say I have the following input file mucked_qc3.eigenvec:

JAN 675 -0.0227745 -0.00725257
FAN 308 -0.00311583 0.0396208
CAN 173 0.0209893 -0.00499655
JAN 176 -0.022875 -0.0176274
FAN3 30 0.00511254 0.00040608
FAN2 97 0.00297323 0.0074444
JAN 493 -0.0202015 -0.00826022
CAN 512 0.019516 -0.0122018
CAN 617 0.0162082 -0.00594085
JAN 790 -0.0256026 -0.0112882
JAN 816 -0.020059 -0.000686427
CAN 511 0.0247956 -0.010808
RAN 81 0.00385228 -0.0111547
CAN 305 0.0165547 -0.0123792
FAN2 51 0.0042059 0.0103337
FAN2 66 0.00468969 0.0118249
RAN 97 0.00878763 -0.0205951
FAN2 95 -0.00557579 0.00274432
FAN2 102 -0.00143439 0.020084
FAN 119 -0.00172261 0.0392606

I want my output to be a scatter plot of columns 3 and 4, with a legend that has one entry per distinct value in column 1.

I have tried this:

data = read.table('mucked_qc3.eigenvec', header=F)
pdf('mucked_qc3.pdf')
plot(data[,3],data[,4],col=data[,1],xlab="PC1",ylab="PC2")
#legend("topright", legend=levels(factor(data[,1])))
legend(x="topright", legend = levels(data$1), col=c("red","blue","green","yellow","magenta","cyan"), pch=1)
dev.off()

I can't quite get the legend part right.
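One possible way to get the legend right when reading from a file is sketched below. It assumes the whitespace-separated layout shown above; to stay self-contained it writes a few of the rows to a temporary file instead of using mucked_qc3.eigenvec, and the names dat, grp, infile and outfile are illustrative. Column 1 is converted to a factor explicitly because, since R 4.0.0, read.table() no longer does this by default.

```r
# A few rows from the question, written to a temporary file for the sketch
rows <- c("JAN 675 -0.0227745 -0.00725257",
          "FAN 308 -0.00311583 0.0396208",
          "CAN 173 0.0209893 -0.00499655",
          "FAN2 97 0.00297323 0.0074444",
          "RAN 81 0.00385228 -0.0111547")
infile <- tempfile(fileext = ".eigenvec")
writeLines(rows, infile)

dat <- read.table(infile, header = FALSE, stringsAsFactors = FALSE)
grp <- factor(dat[[1]])   # grouping column as a factor, whatever the R version

outfile <- tempfile(fileext = ".pdf")
pdf(outfile)
plot(dat[[3]], dat[[4]], col = grp, xlab = "PC1", ylab = "PC2")
# Labels and colours both follow the order of levels(grp)
legend("topright", legend = levels(grp),
       col = seq_along(levels(grp)), pch = 1, cex = 0.6)
invisible(dev.off())
```

Columns are accessed with dat[[1]] or dat$V1; an expression such as data$1 is not valid R syntax, since read.table() names unnamed columns V1, V2, and so on.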