pydiodon.pca

pydiodon.pca(A, pretreatment='standard', k=-1, meth='svd')

Principal Component Analysis

Parameters
A : a 2D numpy array, n x p

the array to be analyzed

k : integer

number of axes to be computed

meth : string

method for numerical calculation (see notes)

pretreatment : string

which pretreatment to apply; accepted values are standard, bicentering, col_centering, row_centering and scaling (see notes for details)

Returns
Y : a 2D numpy array, n x k

matrix of principal components

L : a 1D numpy array

vector of eigenvalues

V : a 2D numpy array, p x k

matrix of eigenvectors (new basis)

Notes

The method runs as follows:

  • first: it applies the required pretreatment

  • second: it runs the function pca_core on the transformed matrix

  • third: it returns the eigenvalues, the principal axes, the principal components and the correlation matrix if required
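The three steps above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code of pydiodon: the helper name pca_sketch is hypothetical, the pretreatment is hard-wired to column centering and scaling, and the eigenvalues are taken as squared singular values, which may differ from the normalisation used by pca_core.

```python
import numpy as np

def pca_sketch(A, k=-1):
    """Centered, scaled PCA via SVD (illustrative helper, not pydiodon's code)."""
    # Step 1: pretreatment, here column centering and scaling
    X = (A - A.mean(axis=0)) / A.std(axis=0)
    # Step 2: core computation, thin SVD X = U diag(S) Vt
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    if k > 0:
        # keep the first k axes only; k = -1 keeps all of them
        U, S, Vt = U[:, :k], S[:k], Vt[:k, :]
    # Step 3: assemble the outputs
    Y = U * S      # principal components, n x k
    L = S ** 2     # eigenvalues of X.T @ X (assumed normalisation)
    V = Vt.T       # principal axes, p x k
    return Y, L, V

rng = np.random.default_rng(0)
A = rng.random((200, 50))
Y, L, V = pca_sketch(A, k=10)
print(Y.shape, L.shape, V.shape)
```

In this sketch the columns of V are orthonormal and the squared norm of each column of Y equals the corresponding eigenvalue.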

methods for PCA: the argument meth specifies which method is used for the core of the PCA. The default value is svd. Let A be the matrix to analyse.

  • if meth=svd, an SVD of A is run

  • if meth=grp, the SVD is run with Gaussian Random Projection

  • if meth=evd, the eigenvalues and eigenvectors of A are computed
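The relation between the three methods can be illustrated with NumPy alone. The sketch below makes several assumptions that are not taken from pydiodon: the eigendecomposition is applied to the Gram matrix A.T @ A, the random projection uses a Gaussian test matrix with 5 columns of oversampling, and the test matrix is built with exact rank 5 so that the three routes agree up to rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
# Exactly rank-5 test matrix, so that all three routes can be compared
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
k = 5

# SVD route: squared singular values of A
U, S, Vt = np.linalg.svd(A, full_matrices=False)
L_svd = S[:k] ** 2

# EVD route (assumed here to act on the Gram matrix A.T @ A)
w, W = np.linalg.eigh(A.T @ A)    # eigh returns eigenvalues in ascending order
L_evd = w[::-1][:k]

# Gaussian random projection route: sample the range of A, then run a small SVD
Omega = rng.standard_normal((A.shape[1], k + 5))   # 5 extra columns of oversampling
Q, _ = np.linalg.qr(A @ Omega)    # orthonormal basis containing the sampled range
_, S_grp, _ = np.linalg.svd(Q.T @ A, full_matrices=False)
L_grp = S_grp[:k] ** 2

print(np.allclose(L_svd, L_evd), np.allclose(L_svd, L_grp))
```

On a matrix whose spectrum decays slowly, the random projection route only approximates the leading eigenvalues; its appeal is that the SVD is run on a much smaller matrix.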

pretreatments: here are the accepted pretreatments:

  • standard: the matrix is centered and scaled columnwise

  • bicentering: the matrix is centered rowwise and columnwise; it is a useful alternative to CoA known as “double averaging”
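The two pretreatments described above can be written in a few lines of NumPy. This is a sketch, not pydiodon's implementation: the variable names are hypothetical, and the scaling assumes the population standard deviation (ddof=0), which may not be the convention used by the library.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 50))

# "standard": center and scale each column (ddof=0 is an assumption)
A_std = (A - A.mean(axis=0)) / A.std(axis=0)

# "bicentering": remove row means and column means, add back the grand mean
row_means = A.mean(axis=1, keepdims=True)
col_means = A.mean(axis=0, keepdims=True)
A_bic = A - row_means - col_means + A.mean()

# After bicentering, every row and every column has zero mean
print(np.abs(A_bic.mean(axis=0)).max(), np.abs(A_bic.mean(axis=1)).max())
```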

Examples

This is an example of a standard PCA of a random matrix with \(m\) rows and \(n\) columns, whose elements are drawn from a uniform distribution between 0 and 1.

First, build the random matrix

>>> import pydiodon as dio
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> m = 200 ; n = 50
>>> A = np.random.random((m,n))

Second, run the PCA

>>> Y, L, V = dio.pca(A)

Third, plot some results (eigenvalues and point cloud)

>>> plt.plot(L) ; plt.show()
>>> plt.scatter(Y[:,0],Y[:,1]) ; plt.show()

The above program runs centered and scaled PCA, since the default option is pretreatment="standard". For PCA with neither centering nor scaling, the command is

>>> L, V, Y, C = dio.pca(A, standard=False)

For PCA with column centering but without scaling, the command is

>>> Y, L, V = dio.pca(A, pretreatment="col_centering")

Scaling without centering is quite unusual.

For bicentering, the command is

>>> Y, L, V = dio.pca(A, pretreatment="bicentering")

These are the most useful options for pretreatment.

A prescribed rank is requested as follows (with standard pretreatment)

>>> rank = 10
>>> Y, L, V = dio.pca(A, k=rank)

to compute only the first 10 components and axes.
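What is lost by keeping only the first 10 axes can be checked numerically. The sketch below uses NumPy only and does not call pydiodon; it assumes standard pretreatment with ddof=0 and eigenvalues defined as squared singular values. Under these assumptions, the squared Frobenius error of the rank-10 reconstruction equals the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 50))
X = (A - A.mean(axis=0)) / A.std(axis=0)   # standard pretreatment (assumed ddof=0)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
L = S ** 2                                 # eigenvalues (assumed normalisation)
rank = 10
Y_k = U[:, :rank] * S[:rank]               # first 10 principal components
V_k = Vt[:rank].T                          # first 10 principal axes

# Rank-10 reconstruction of X and its squared Frobenius error
X_k = Y_k @ V_k.T
err = np.linalg.norm(X - X_k) ** 2
share = L[:rank].sum() / L.sum()           # fraction of total inertia retained

print(np.isclose(err, L[rank:].sum()), share)
```

For a matrix of independent uniform entries the spectrum is nearly flat, so the retained share stays modest; structured data typically concentrate far more inertia in the first axes.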

revised 21.03.03 - 21.04.20