pydiodon.pca_core

pydiodon.pca_core(A, k=-1, meth='svd')

Core method for PCA (Principal Component Analysis) of an array.

Parameters
A : an n x p numpy array

array to be analysed

k : an integer

number of axes to compute

meth : a string

method for numerical computing

Returns
Y : a 2D numpy array, n x k

matrix of principal components

L : a 1D numpy array

vector of eigenvalues

V : a 2D numpy array, p x k

matrix of eigenvectors (new basis)

Notes

A is an array, and pca_core computes the PCA of A without any centering, scaling, weights, or constraints. PCA with centering, scaling, weights, or constraints is provided by the function pca(), which in turn calls pca_core().
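
To illustrate the kind of preprocessing that pca_core() leaves to the caller, here is a minimal NumPy sketch of column centering and scaling. It does not call pca(), whose options are not described on this page; the data `X` and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical raw data: 4 columns with different means and spreads.
X = rng.standard_normal((100, 4)) * [1.0, 5.0, 0.1, 2.0] + [0.0, 3.0, -1.0, 10.0]

# Center each column, then scale it to unit standard deviation.
A = (X - X.mean(axis=0)) / X.std(axis=0)

# For such an A, A'A / n is the correlation matrix of X.
C = A.T @ A / A.shape[0]
matches_corrcoef = np.allclose(C, np.corrcoef(X, rowvar=False))
```

An array prepared this way can be analysed without further centering or scaling.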

If k = -1, all axes are computed. If k > 0, only the first k axes and components are computed.
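
The effect of k on the shapes of the outputs can be sketched with plain NumPy. This is an assumption-laden illustration, not the pydiodon code: the function `truncated_pca` is hypothetical, and it assumes that the eigenvalue vector is also cut to length k.

```python
import numpy as np

def truncated_pca(A, k=-1):
    """Hypothetical helper: PCA by SVD, keeping the first k axes (all if k = -1)."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    if k > 0:
        # Keep only the k leading singular triplets.
        U, S, Vt = U[:, :k], S[:k], Vt[:k, :]
    # Components, eigenvalues, axes.
    return U * S, S ** 2, Vt.T

A = np.random.default_rng(1).standard_normal((20, 6))
Y_all, L_all, V_all = truncated_pca(A)        # all 6 axes
Y3, L3, V3 = truncated_pca(A, k=3)            # first 3 axes only
```

With a 20 x 6 input and k = 3, `Y3` is 20 x 3 and `V3` is 6 x 3, which matches the n x k and p x k shapes listed under Returns.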

If meth is

  • evd, runs by eigendecomposition of A’A

  • svd, runs by singular value decomposition of A

  • grp, runs by SVD of A with Gaussian random projection

Default value is svd.
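
The grp option is only named here, not described. One common way to combine an SVD with a Gaussian random projection is the randomized range finder sketched below. This is an assumption about the general technique, not a description of pydiodon's implementation; the function `pca_grp_sketch` and its `oversample` and `seed` parameters are hypothetical.

```python
import numpy as np

def pca_grp_sketch(A, k, oversample=10, seed=0):
    """Hypothetical helper: approximate PCA via a Gaussian random projection."""
    rng = np.random.default_rng(seed)
    p = A.shape[1]
    Omega = rng.standard_normal((p, k + oversample))   # Gaussian projection matrix
    Q, _ = np.linalg.qr(A @ Omega)                      # orthonormal basis of the projected columns
    B = Q.T @ A                                         # small matrix, (k + oversample) x p
    Ub, S, Vt = np.linalg.svd(B, full_matrices=False)   # SVD of the small matrix
    U = Q @ Ub                                          # lift left vectors back to n dimensions
    Y = U[:, :k] * S[:k]                                # principal components
    L = S[:k] ** 2                                      # eigenvalues
    V = Vt[:k].T                                        # axes
    return Y, L, V

# Test on an exactly rank-5 matrix, where the projection loses nothing.
rng = np.random.default_rng(3)
A = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 30))
Y_grp, L_grp, V_grp = pca_grp_sketch(A, k=5)
L_exact = np.linalg.svd(A, compute_uv=False) ** 2
```

On a matrix whose rank does not exceed k, the projected result agrees with the exact SVD; on a full-rank matrix it is only an approximation whose quality depends on how fast the singular values decay.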

With EVD, it runs as follows:
1. Computes the cross-product matrix \(C=A'A\) (proportional to the correlation matrix when the columns of \(A\) are centered and scaled)
2. Computes the eigenvalues and eigenvectors of \(C\): \(Cv_j = \lambda_j v_j\)
3. Computes the principal components as \(y_j=Av_j\), or, globally, \(Y=AV\)

With SVD, it runs as follows:
1. \(U,\Sigma,V = SVD(A)\), that is, \(A = U\Sigma V'\)
2. \(Y = U\Sigma\)
3. \(L=\Sigma^2\), the squared singular values
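
The two routes can be compared numerically. The following is a minimal NumPy sketch of the steps listed above, not the pydiodon source; the helper names `pca_evd` and `pca_svd` are hypothetical. It checks that both routes give the same eigenvalues, and the same components up to the sign of each column.

```python
import numpy as np

def pca_evd(A):
    """Hypothetical helper: PCA by eigendecomposition of C = A'A."""
    C = A.T @ A
    L, V = np.linalg.eigh(C)        # eigh returns eigenvalues in ascending order
    order = np.argsort(L)[::-1]     # reorder to descending
    L, V = L[order], V[:, order]
    Y = A @ V                       # Y = AV
    return Y, L, V

def pca_svd(A):
    """Hypothetical helper: PCA by SVD, A = U Sigma V'."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    Y = U * S                       # same as U @ diag(S)
    L = S ** 2                      # eigenvalues are squared singular values
    return Y, L, Vt.T

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
Y1, L1, V1 = pca_evd(A)
Y2, L2, V2 = pca_svd(A)

same_eigenvalues = np.allclose(L1, L2)
# Eigenvectors are defined up to sign, so compare absolute values.
same_components = np.allclose(np.abs(Y1), np.abs(Y2))
```

The agreement follows from \(A = U\Sigma V'\): then \(A'A = V\Sigma^2 V'\), so the columns of \(V\) are eigenvectors of \(C\) with eigenvalues \(\Sigma^2\), and \(AV = U\Sigma\).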

Example

This is an example of a standard PCA of a random matrix with \(m\) rows and \(n\) columns, whose elements are realisations of a Gaussian law with mean 0 and standard deviation 1. As such, there is no need to center or scale, and the use of pca_core() is relevant.

>>> import pydiodon as dio
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> m = 200 ; n = 50
>>> A = np.random.randn(m,n)
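>>> Y, L, V = dio.pca_core(A)   # order as listed under Returns: components, eigenvalues, eigenvectors
>>> plt.plot(L) ; plt.show()   # eigenvalues
>>> plt.scatter(Y[:,0],Y[:,1]) ; plt.show()   # first two principal components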

af, revised 21.03.01; 21.06.27; 22.09.23, 22.10.13