Plotting PCA scores with color
Here are some example PCA plots, taken from another source.
z1 <- rnorm(10000, mean=1, sd=1)
z2 <- rnorm(10000, mean=3, sd=3)
z3 <- rnorm(10000, mean=5, sd=5)
z4 <- rnorm(10000, mean=7, sd=7)
z5 <- rnorm(10000, mean=9, sd=9)
mydata <- matrix(c(z1, z2, z3, z4, z5), 2500, 20, byrow=T, dimnames=list(paste("R", 1:2500, sep=""), paste("C", 1:20, sep="")))
pca <- prcomp(mydata, scale=T) # Run the PCA; defines the pca object used below.
summary(pca)
summary(pca)$importance[, 1:6]
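The importance table reports, for each component, its standard deviation, proportion of variance and cumulative proportion. As a rough sketch of where those numbers come from, the block below recomputes them from the sdev element of a prcomp object. The toy matrix and the names toy, toy_pca, prop_var and cum_var are made up for illustration.

```r
set.seed(1)
# Hypothetical toy data: 100 observations, 6 variables.
toy <- matrix(rnorm(600), nrow = 100, ncol = 6)
toy_pca <- prcomp(toy, scale. = TRUE)

# The variance of each component is its squared standard deviation.
prop_var <- toy_pca$sdev^2 / sum(toy_pca$sdev^2)
cum_var  <- cumsum(prop_var)

# These should agree, up to rounding, with the rows reported by summary().
imp <- summary(toy_pca)$importance
round(rbind(prop_var, cum_var), 5)
```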
x11(height=6, width=12, pointsize=12); par(mfrow=c(1,2))
mycolors <- c("red", "green", "blue", "magenta", "black") # Define plotting colors.
plot(pca$x, pch=20, col=mycolors[sort(rep(1:5, 500))])
plot(pca$x, type="n"); text(pca$x, rownames(pca$x), cex=0.8, col=mycolors[sort(rep(1:5, 500))])
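The key part of the two plot calls above is the index mycolors[sort(rep(1:5, 500))], which expands the five colors into one color per row. A small sketch of what that expression builds, assuming the simulated layout above (the matrix is filled by row, so the first 500 rows come from z1, the next 500 from z2, and so on); the names groups, same and point_colors are only illustrative:

```r
mycolors <- c("red", "green", "blue", "magenta", "black")

# One group label per row of the 2500-row matrix: 500 ones, 500 twos, ...
groups <- sort(rep(1:5, 500))

# Equivalent and arguably clearer way to write the same labels.
same <- rep(1:5, each = 500)

# Indexing the palette by group gives one color per plotted point.
point_colors <- mycolors[groups]
table(point_colors)
```

If the color vector is shorter than the number of points, R recycles it row by row, which usually does not match the group structure.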
You can also use pairs, again with one color per row:
pairs(pca$x[,1:5], pch=20, col=mycolors[sort(rep(1:5, 500))])
A biplot shows a scatter plot of the first two principal components plus the corresponding eigenvectors, which are stored in pca$rotation.
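The description above matches what base R's biplot function draws for a prcomp object. The call itself is not shown in the answer, so the following is only a sketch on made-up data (the matrix m and its dimensions are assumptions):

```r
set.seed(1)
# Hypothetical toy data: 50 observations, 10 variables, with row/column names
# so the biplot has labels to draw.
m <- matrix(rnorm(500), nrow = 50, ncol = 10,
            dimnames = list(paste0("R", 1:50), paste0("C", 1:10)))
pca_toy <- prcomp(m, scale. = TRUE)

# Scores for PC1/PC2 as labels, loadings (pca_toy$rotation) as arrows.
biplot(pca_toy, cex = 0.6)

# The eigenvectors: one column per principal component.
dim(pca_toy$rotation)
```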
library(scatterplot3d)
scatterplot3d(pca$x[,1:3], pch=20, color=mycolors[sort(rep(1:5, 500))])
Same as above, but plots the first three principal components in a 3D scatter plot.
library(rgl)
rgl.open()
offset <- 50
par3d(windowRect=c(offset, offset, 640+offset, 640+offset))
rm(offset)
rgl.clear()
rgl.viewpoint(theta=45, phi=30, fov=60, zoom=1)
spheres3d(pca$x[,1], pca$x[,2], pca$x[,3], radius=0.3, color=mycolors[sort(rep(1:5, 500))], alpha=1, shininess=20)
aspect3d(1, 1, 1)
axes3d(col='black')
title3d("", "", "PC1", "PC2", "PC3", col='black')
bg3d("
The latter creates an interactive 3D scatter plot with OpenGL. The rgl package needs to be installed for this. To save a snapshot of the graph, use the command rgl.snapshot("test.png").
require(GGally)
ggpairs(pca$x[,1:5])
bdeonovic
I am Benjamin Deonovic, a research scientist at Corteva. My research interests include Bayesian data analysis, MCMC, computational statistics, bioinformatics, and psychometrics.
Updated on August 03, 2022
bdeonovic almost 2 years
I'm doing PCA and I would like to plot the first principal component against the second in R:
pca <- princomp(~., data=data, na.action=na.omit)
plot(pca$scores[,1], pca$scores[,2])
or maybe several principal components:
pairs(pca$scores[,1:4])
However, the points are all black. How do I appropriately add color to the graphs? How many colors do I need? One for each principal component I am plotting? Or one for each row in my data matrix?
Thanks
EDIT:
My data looks like this:
> data[1:4,1:4]
        patient1  patient2   patient3  patient4
2'-PDE 0.0153750 0.4669375 -0.0295625 0.7919375
7A5    2.4105000 0.3635000  1.8550000 1.4080000
A1BG   0.9493333 0.2798333  0.7486667 0.7500000
A2M    0.2420000 1.0385000  1.1605000 1.6777500
So would this be appropriate:
plot(pca$scores[,1:4], pch=20, col=rainbow(dim(data)[1]))
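On the "how many colors" question: each plotted point is one row of pca$scores, so col is matched to rows, not to components. Note also that plot() on a matrix only uses its first two columns. The sketch below uses made-up data and a hypothetical grouping factor grp (the real data would need its own row grouping) to show one way to build a per-row color vector:

```r
set.seed(42)
# Hypothetical stand-in for the data: 30 rows (observations), 4 columns.
dat <- as.data.frame(matrix(rnorm(120), nrow = 30, ncol = 4))

# Hypothetical grouping of the rows into three classes of 10 rows each.
grp <- factor(rep(c("a", "b", "c"), each = 10))

pca_q <- princomp(~ ., data = dat, na.action = na.omit)

# One palette entry per group, expanded to one color per row.
palette_cols <- c(a = "red", b = "green", c = "blue")
point_cols <- unname(palette_cols[as.character(grp)])

# First vs. second component, then all pairs, colored by group.
plot(pca_q$scores[, 1], pca_q$scores[, 2], pch = 20, col = point_cols,
     xlab = "Comp.1", ylab = "Comp.2")
pairs(pca_q$scores[, 1:4], pch = 20, col = point_cols)
```

Using rainbow(nrow(data)) also gives one color per row, but every point then gets a different color, which is only informative if row order itself is meaningful.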