Plotting PCA scores with color

r pca

Here are some example PCA plots. The code is taken from here.

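# Generate a sample matrix with five clusters (500 rows each) that differ in mean and standard deviation.
z1 <- rnorm(10000, mean=1, sd=1); z2 <- rnorm(10000, mean=3, sd=3); z3 <- rnorm(10000, mean=5, sd=5); z4 <- rnorm(10000, mean=7, sd=7); z5 <- rnorm(10000, mean=9, sd=9)
mydata <- matrix(c(z1, z2, z3, z4, z5), 2500, 20, byrow=TRUE, dimnames=list(paste("R", 1:2500, sep=""), paste("C", 1:20, sep="")))

# Run the PCA. This call is missing from the snippet as posted; prcomp() is assumed here because the code below reads the scores from pca$x.
pca <- prcomp(mydata, scale=TRUE)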

summary(pca) 
summary(pca)$importance[, 1:6] 
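The importance table printed by summary() can also be recomputed by hand from the component standard deviations, which helps when deciding how many components are worth plotting. A minimal sketch, assuming a prcomp() fit on a small simulated matrix (the names toy, pca_toy, var_explained and cum_explained are illustrative and not part of the original code):

```r
set.seed(1)
# Toy data: 100 observations of 6 variables (illustrative only)
toy <- matrix(rnorm(600), nrow = 100, ncol = 6)
pca_toy <- prcomp(toy, scale. = TRUE)

# Proportion of variance per component is sdev^2 / sum(sdev^2)
var_explained <- pca_toy$sdev^2 / sum(pca_toy$sdev^2)
cum_explained <- cumsum(var_explained)

# These should agree with rows 2 and 3 of summary(pca_toy)$importance,
# which reports the same quantities rounded to 5 digits
round(rbind(var_explained, cum_explained), 3)
```

If the first two or three components carry most of the variance, a 2D or 3D score plot is usually a reasonable summary.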

x11(height=6, width=12, pointsize=12); par(mfrow=c(1,2)) 

mycolors <- c("red", "green", "blue", "magenta", "black") # Define plotting colors.
plot(pca$x, pch=20, col=mycolors[sort(rep(1:5, 500))]) # One color per row: rows 1-500 red, 501-1000 green, and so on.

plot(pca$x, type="n"); text(pca$x, rownames(pca$x), cex=0.8, col=mycolors[sort(rep(1:5, 500))]) 
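The color vector passed to col needs one entry per plotted point, that is, one per row of the score matrix, not one per principal component. A hedged sketch of the same idea with an explicit grouping factor, on a small simulated stand-in for the data (the names group, toy, pca_toy, palette5 and point_cols are assumptions made for illustration):

```r
set.seed(42)
# Stand-in data: 5 groups of 20 rows each, 8 variables, group means differ
group <- factor(rep(paste0("G", 1:5), each = 20))
toy <- matrix(rnorm(100 * 8, mean = rep(1:5, each = 20) * 2),
              nrow = 100, ncol = 8)
pca_toy <- prcomp(toy, scale. = TRUE)

palette5 <- c("red", "green", "blue", "magenta", "black")
# Index the palette by the integer codes of the factor: one color per row
point_cols <- palette5[as.integer(group)]

plot(pca_toy$x[, 1:2], pch = 20, col = point_cols)
legend("topright", legend = levels(group), col = palette5, pch = 20)
```

Building the colors from a factor rather than from sort(rep(...)) keeps the mapping correct even when the rows are not sorted by group.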

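You can also use pairs() to plot several components against each other. As before, pass one color per row:

pairs(pca$x[,1:5], pch=20, col=mycolors[sort(rep(1:5, 500))])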

biplot(pca) plots a scatter plot of the first two principal components together with the corresponding eigenvectors, which are stored in pca$rotation.
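A minimal sketch of such a plot, using biplot() from base R's stats package on the built-in USArrests data (chosen here only for illustration; it is not the data used above, and the name pca_arrests is hypothetical):

```r
# USArrests ships with R: 50 states, 4 variables
pca_arrests <- prcomp(USArrests, scale. = TRUE)

# The eigenvectors (loadings) are the columns of $rotation
pca_arrests$rotation

# Scores are drawn as row labels, loadings as arrows
biplot(pca_arrests, cex = 0.6)
```

Note that biplot() does not take a per-point color vector in the way plot() does, so it is better suited to inspecting loadings than to showing groups.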

library(scatterplot3d) 
scatterplot3d(pca$x[,1:3], pch=20, color=mycolors[sort(rep(1:5, 500))]) 

The scatterplot3d call above plots the first three principal components in a 3D scatter plot.

library(rgl)
rgl.open()
offset <- 50
par3d(windowRect=c(offset, offset, 640+offset, 640+offset))
rm(offset)
rgl.clear()
rgl.viewpoint(theta=45, phi=30, fov=60, zoom=1)
spheres3d(pca$x[,1], pca$x[,2], pca$x[,3], radius=0.3, color=mycolors[sort(rep(1:5, 500))], alpha=1, shininess=20) # One color per point.
aspect3d(1, 1, 1)
axes3d(col='black')
title3d("", "", "PC1", "PC2", "PC3", col='black')
bg3d("

The latter creates an interactive 3D scatter plot with OpenGL. The rgl package needs to be installed for this. To save a snapshot of the graph, use the command rgl.snapshot("test.png").

library(GGally)
ggpairs(as.data.frame(pca$x[,1:5])) # Convert the score matrix to a data frame before plotting.
Author: bdeonovic

I am Benjamin Deonovic, a research scientist at Corteva. My research interests include Bayesian data analysis, MCMC, computational statistics, bioinformatics, and psychometrics.

Updated on August 03, 2022

Comments

  • bdeonovic, almost 2 years ago

    I'm doing PCA and I would like to plot the first principal component against the second in R:

    pca <- princomp(~., data=data, na.action=na.omit)
    plot(pca$scores[,1], pca$scores[,2])
    

    or maybe several principal components:

    pairs(pca$scores[,1:4])
    

    However, the points are all black. How do I add color to the graphs appropriately? How many colors do I need: one for each principal component I am plotting, or one for each row in my data matrix?

    Thanks

    EDIT:

    My data looks like this:

    > data[1:4,1:4]
                              patient1                     patient2                     patient3                     patient4
    2'-PDE                    0.0153750                    0.4669375                   -0.0295625                    0.7919375
    7A5                       2.4105000                    0.3635000                    1.8550000                    1.4080000
    A1BG                      0.9493333                    0.2798333                    0.7486667                    0.7500000
    A2M                       0.2420000                    1.0385000                    1.1605000                    1.6777500
    

    So would this be appropriate:

    plot(pca$scores[,1:4], pch=20, col=rainbow(dim(data)[1]))