pydiodon.pca
- pydiodon.pca(A, pretreatment='standard', k=-1, meth='svd')
Principal Component Analysis
- Parameters
- A: a 2D numpy array, n x p
the array to be analyzed
- k: integer
number of axes to be computed
- meth: string
method for numerical calculation (see notes)
- pretreatment: string
which pretreatment to apply; accepted values are standard, bicentering, col_centering, row_centering, scaling (see notes for details)
- Returns
- Y: a 2D numpy array, n x k
matrix of principal components
- L: a 1D numpy array
vector of eigenvalues
- V: a 2D numpy array, p x k
matrix of eigenvectors (new basis)
Notes
The method runs as follows:
first, it applies the requested pretreatment
second, it runs the function pca_core on the transformed matrix
third, it returns the eigenvalues, the principal axes, the principal components and the correlation matrix if required
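The three steps above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code of pydiodon or of pca_core: the helper name `pca_sketch` is hypothetical, the eigenvalues are taken as the squared singular values with no 1/n factor, and the return order follows the Returns list above.

```python
import numpy as np

def pca_sketch(A, k=-1):
    """Illustrative SVD-based PCA core (hypothetical helper, not pydiodon's code)."""
    # Thin SVD of the already pretreated matrix: A = U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    if k > 0:
        # Keep only the first k axes when a positive rank is requested
        U, S, Vt = U[:, :k], S[:k], Vt[:k, :]
    Y = U * S      # principal components, n x k
    L = S ** 2     # eigenvalues of A.T @ A (assumed convention, no 1/n factor)
    V = Vt.T       # principal axes, p x k
    return Y, L, V

rng = np.random.default_rng(0)
A = rng.random((200, 50))
# Step 1: pretreatment (here: center and scale each column)
A = (A - A.mean(axis=0)) / A.std(axis=0)
# Steps 2 and 3: decomposition and returned quantities
Y, L, V = pca_sketch(A, k=10)
```

In this sketch the principal components are the coordinates of the rows of A in the new basis, so `Y` equals `A @ V`.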
methods for PCA: the argument meth specifies which method is selected for the core of the PCA. The default value is svd. Let A be the matrix to analyse.
if meth=svd, an SVD of A is run
if meth=grp, the SVD is run with Gaussian Random Projection
if meth=evd, the eigenvalues and eigenvectors of A are computed
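To make the difference between the three methods concrete, here is a minimal NumPy sketch. It rests on assumptions: `svd_grp_sketch` is a hypothetical helper that uses a standard randomized range finder with a Gaussian test matrix, which may differ from what pydiodon does for meth=grp, and the eigendecomposition is taken on the Gram matrix `A.T @ A`, which is one plausible reading of meth=evd for a rectangular A.

```python
import numpy as np

def svd_grp_sketch(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD via Gaussian random projection (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = A.shape
    # Project A onto k + oversample random Gaussian directions
    Omega = rng.standard_normal((p, k + oversample))
    # Orthonormal basis Q for the range of the projected matrix
    Q, _ = np.linalg.qr(A @ Omega)
    # Exact SVD of the small matrix Q.T @ A, then lift back with Q
    Ub, S, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], S[:k], Vt[:k, :]

rng = np.random.default_rng(1)
# Test matrix of exact rank 5, so the projection captures its range
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))

# Random-projection route
U, S, Vt = svd_grp_sketch(A, k=5)
# Exact SVD route
S_exact = np.linalg.svd(A, compute_uv=False)[:5]
# EVD route: eigenvalues of A.T @ A, sorted in decreasing order
evals = np.linalg.eigvalsh(A.T @ A)[::-1][:5]
```

On a matrix of exact low rank the three routes agree up to rounding: the singular values match, and the eigenvalues of `A.T @ A` are their squares. On a full-rank matrix the random projection is only an approximation, whose quality depends on how fast the singular values decay.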
pretreatments: here are the accepted pretreatments:
standard: the matrix is centered and scaled columnwise
bicentering: the matrix is centered rowwise and columnwise; it is a useful alternative to CoA known as “double averaging”
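The two pretreatments described above can be written in a few lines of NumPy. This is a sketch under assumptions, not pydiodon's implementation: the function names `standard` and `bicenter` are hypothetical, and scaling uses the population standard deviation (NumPy's default), whereas the library may use a different normalisation.

```python
import numpy as np

def standard(A):
    """Center each column, then scale it to unit standard deviation."""
    return (A - A.mean(axis=0)) / A.std(axis=0)

def bicenter(A):
    """Double centering: subtract row and column means, add back the grand mean."""
    row_means = A.mean(axis=1, keepdims=True)
    col_means = A.mean(axis=0, keepdims=True)
    return A - row_means - col_means + A.mean()

rng = np.random.default_rng(0)
A = rng.random((200, 50))
A_std = standard(A)   # every column has mean 0 and standard deviation 1
A_bic = bicenter(A)   # every row and every column has mean 0
```

Adding back the grand mean in `bicenter` is what makes both the row means and the column means vanish at the same time.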
Examples
This is an example of a standard PCA of a random matrix with \(m\) rows and \(n\) columns, whose elements are realisations of a uniform law between 0 and 1.
First build the random matrix
>>> import pydiodon as dio
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> m = 200 ; n = 50
>>> A = np.random.random((m,n))
Second, run the PCA
>>> L, V, Y = dio.pca(A)
Third, plot some results (eigenvalues and point cloud)
>>> plt.plot(L) ; plt.show()
>>> plt.scatter(Y[:,0],Y[:,1]) ; plt.show()
The above program runs centered and scaled PCA, using the default option pretreatment="standard". For PCA without centering or scaling, the command is
>>> L, V, Y, C = dio.pca(A, standard=False)
For PCA with column centering but without scaling, the command is
>>> L, V, Y, C = dio.pca(A, standard=False, col_centering=True)
(in such a case, the argument standard must be set to False. If not, the array will be scaled as well). Scaling without centering is quite unusual.
For bicentering, the command is
>>> L, V, Y, C = dio.pca(A, bicenter=True)
These are the most useful options for pretreatment.
A prescribed rank is simply requested as follows (with standard pretreatment)
>>> rank = 10
>>> L, V, Y = dio.pca(A, k=rank)
to compute the first 10 components and axes only.
revised 21.03.03 - 21.04.20