PCA of RGB Image


If there are three bands (as is the case for an RGB image), reshape the image so that each pixel becomes one row:

X = X.reshape(-1, 3)

In your case of a 512x512 image, the new X will have shape (262144, 3). The dimension of 3 will not throw off your result; that dimension represents the features in the image data space. Each row of X is a sample/observation and each column represents a variable/feature.
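As a minimal sketch of that reshape, the snippet below uses a randomly generated array as a hypothetical stand-in for the real 512x512 RGB image (the name `img` and the random data are assumptions, not part of the original code). It also casts to float, on the assumption that the image is loaded as `uint8`, so that subtracting the mean later does not hit integer casting problems.

```python
import numpy as np

# Hypothetical stand-in for a 512x512 RGB image loaded as a NumPy array.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)

# One row per pixel, one column per colour band. Cast to float so the
# per-band mean can be subtracted in place afterwards.
X = img.reshape(-1, 3).astype(np.float64)

print(X.shape)  # (262144, 3)
```

With the default row-major layout, pixel `img[r, c]` ends up in row `r * 512 + c` of `X`, so the flattened result can be reshaped back to `(512, 512, 3)` after the projection.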

The total variance in the image is proportional to np.sum(S**2). The S returned by pca holds the singular values of the centered data, and the eigenvalues of the covariance matrix are their squares divided by num_data - 1. The amount of variance you retain depends on which eigenvalues/eigenvectors you keep. So if you keep only the first eigenvalue/eigenvector, the fraction of image variance you retain is

f = S[0]**2 / np.sum(S**2)
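The retained-variance calculation and the decorrelation step can be sketched as follows. This is an illustration on synthetic, strongly correlated three-band data (the data and the names `Xc`, `eigvals`, `fraction`, `Y` are assumptions), and it calls `np.linalg.svd` directly rather than the `pca` function from the question. Since SVD returns singular values, the variance along each component is taken as the squared singular value divided by the number of samples minus one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for reshaped pixel data: three strongly correlated bands.
base = rng.normal(size=(1000, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(1000, 1)) for _ in range(3)])

# Center each band (column) on its mean.
Xc = X - X.mean(axis=0)

# Thin SVD: rows of Vt are the principal directions, S the singular values.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Eigenvalues of the covariance matrix, i.e. variance along each component.
eigvals = S**2 / (Xc.shape[0] - 1)

# Fraction of total variance carried by each component, and by the first one.
fraction = eigvals / eigvals.sum()
f = fraction[0]

# Project onto the principal directions; the columns of Y are decorrelated.
Y = Xc @ Vt.T
cov_Y = np.cov(Y, rowvar=False)

print(fraction)
print(np.round(cov_Y, 6))
```

The covariance of `Y` comes out diagonal (up to floating-point error), which is what decorrelating the bands means here. Keeping only the first `k` columns of `Y` retains `fraction[:k].sum()` of the total variance.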
user3433572
Author by

user3433572

Updated on June 19, 2022

Comments

  • user3433572
    user3433572 almost 2 years

    I'm trying to figure out how to use PCA to decorrelate an RGB image in Python. I'm using the code found in the O'Reilly computer vision book:

    from PIL import Image
    from numpy import *
    
    def pca(X):
      # Principal Component Analysis
      # input: X, matrix with training data as flattened arrays in rows
      # return: projection matrix (with important dimensions first),
      # variance and mean
    
      #get dimensions
      num_data,dim = X.shape
    
      #center data
      mean_X = X.mean(axis=0)
      for i in range(num_data):
          X[i] -= mean_X
    
      if dim>100:
          print('PCA - compact trick used')
          M = dot(X,X.T) #covariance matrix
          e,EV = linalg.eigh(M) #eigenvalues and eigenvectors
          tmp = dot(X.T,EV).T #this is the compact trick
          V = tmp[::-1] #reverse since last eigenvectors are the ones we want
          S = sqrt(e)[::-1] #reverse since eigenvalues are in increasing order
      else:
          print('PCA - SVD used')
          U,S,V = linalg.svd(X)
          V = V[:num_data] #only makes sense to return the first num_data
    
      #return the projection matrix, the singular values and the mean
      return V,S,mean_X
    

    I know I need to flatten my image, but the shape is 512x512x3. Will the dimension of 3 throw off my result? How do I truncate this? How do I quantify how much information is retained?