Principal Component Analysis (PCA) reduces the dimensionality of multivariate data by identifying a set of orthogonal vectors (eigenvectors) that represent the directions of maximum variance in the dataset.
The procedure fsml_pca is a wrapper for fsml_eof that offers a simpler
interface, more familiar to non-geoscientists. The EOF interface accepts
additional options that are irrelevant to standard applications of PCA.
The PCA procedure calls the EOF procedure with all weights (wt) set to 1.0
and the matrix option set to opt = 0, which forces the use of the covariance
matrix so that results are comparable to other common PCA implementations
(e.g., sklearn).
The covariance matrix $\mathbf{C}$ is computed as:
$$ \mathbf{C} = \frac{1}{n - 1} X^\top X $$
where $X$ is the preprocessed (centred and optionally standardised) data matrix,
and $n$ is the number of observations (rows in x).
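
The covariance step can be illustrated outside the library with a few lines of plain Python. This is a minimal sketch, not the Fortran implementation: the function name `covariance_matrix` and the sample data are hypothetical, and the `n - 1` divisor is assumed here because it matches the sample covariance used by sklearn.

```python
def covariance_matrix(x):
    """Return X^T X / (n - 1) for the column-centred data (rows are observations)."""
    n = len(x)        # number of observations
    nv = len(x[0])    # number of variables
    # Centre each column by subtracting its mean.
    means = [sum(row[j] for row in x) / n for j in range(nv)]
    xc = [[row[j] - means[j] for j in range(nv)] for row in x]
    # Assumption: sample covariance with an (n - 1) divisor.
    return [
        [sum(r[a] * r[b] for r in xc) / (n - 1) for b in range(nv)]
        for a in range(nv)
    ]


data = [[2.0, 1.0], [4.0, 3.0], [6.0, 8.0]]  # 3 observations, 2 variables
c = covariance_matrix(data)
print(c)  # [[4.0, 7.0], [7.0, 13.0]]
```

The result is symmetric by construction, which is what allows a symmetric eigensolver to be used in the next step.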
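A symmetric eigen-decomposition is then performed:
$$ \mathbf{C} \mathbf{E} = \mathbf{E} \Lambda $$
where $\mathbf{C}$ is the covariance matrix, $\mathbf{E}$ contains the EOFs (ev),
and $\Lambda$ is a diagonal matrix of eigenvalues (ew).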
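The principal components or scores (PCs, pc) are given by:
$$ \mathrm{PC} = X \mathbf{E} $$
where $X$ is the preprocessed data matrix and $\mathbf{E}$ is the matrix of EOFs.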
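
The decomposition and projection can be cross-checked with NumPy. The sketch below is an illustration under stated assumptions rather than the library's code: it uses `numpy.linalg.eigh` in place of the stdlib_linalg routine, assumes an `n - 1` covariance divisor, and reorders the modes by decreasing eigenvalue, which is a choice made for this example only.

```python
import numpy as np

x = np.array([[2.0, 1.0], [4.0, 3.0], [6.0, 8.0]])  # hypothetical sample data
xc = x - x.mean(axis=0)                # centre each column
c = xc.T @ xc / (x.shape[0] - 1)       # covariance matrix (assumed n - 1 divisor)

# eigh targets symmetric matrices and returns eigenvalues in ascending order.
ew, ev = np.linalg.eigh(c)
order = np.argsort(ew)[::-1]           # descending order: a choice made in this sketch
ew, ev = ew[order], ev[:, order]

pc = xc @ ev                           # scores: project the centred data onto the EOFs

# The variance of each score column equals the matching eigenvalue.
print(np.allclose(pc.var(axis=0, ddof=1), ew))
```

Eigenvector signs are arbitrary, so scores from two implementations may differ by a factor of -1 per column while still being equivalent.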
The number of valid PC modes is determined by the number of non-zero eigenvalues.
Arrays are initialised to zero and populated only where eigenvalues are strictly positive.
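
The zero-initialise-then-populate rule can be sketched in plain Python. The helper name `mask_valid_modes` is hypothetical, and the sketch assumes that valid modes keep their original positions in the output arrays; the library may arrange them differently.

```python
def mask_valid_modes(eigenvalues, eigenvectors):
    """Copy only modes with strictly positive eigenvalues into zero-filled outputs.

    eigenvectors is an nv x nv nested list whose column j belongs to eigenvalues[j].
    """
    nv = len(eigenvalues)
    ew = [0.0] * nv                        # outputs start as zeros
    ev = [[0.0] * nv for _ in range(nv)]
    n_valid = 0
    for j, lam in enumerate(eigenvalues):
        if lam > 0.0:                      # strictly positive: the mode is valid
            ew[j] = lam
            for i in range(nv):
                ev[i][j] = eigenvectors[i][j]
            n_valid += 1
    return ew, ev, n_valid


# The third eigenvalue is rounding noise below zero, so its mode is dropped.
raw_ew = [2.0, 0.5, -1e-17]
raw_ev = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ew, ev, n_valid = mask_valid_modes(raw_ew, raw_ev)
print(n_valid)  # 2
```

Rank-deficient data, for example fewer observations than variables, is the typical case in which some eigenvalues are zero or slightly negative after rounding.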
The explained variance (r2) for each component is computed as a fraction:
$$ r^2_j = \frac{\lambda_j}{\sum_k \lambda_k} $$
where $\lambda_j$ is the eigenvalue of mode $j$, $j$ is the PC index, and $k$ spans all retained eigenvalues,
representing all principal components that explain variability in the data.
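
As a worked example of the fraction above, the following plain-Python sketch computes the explained variance from a list of eigenvalues. The function name `explained_variance` and the input values are hypothetical; modes with non-positive eigenvalues are assumed to contribute nothing and to receive a fraction of zero.

```python
def explained_variance(ew):
    """Return lambda_j / sum_k lambda_k, summing over strictly positive eigenvalues."""
    total = sum(lam for lam in ew if lam > 0.0)
    if total == 0.0:
        # No valid modes: nothing to explain.
        return [0.0 for _ in ew]
    return [lam / total if lam > 0.0 else 0.0 for lam in ew]


r2 = explained_variance([6.0, 3.0, 1.0, 0.0])
print(r2)  # [0.6, 0.3, 0.1, 0.0]
```

The fractions of the retained modes sum to one, so a cumulative sum over r2 shows how many components are needed to reach a chosen share of the total variance.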
Note: This subroutine uses eigh from the stdlib_linalg module to compute
eigenvalues and eigenvectors of the symmetric covariance matrix.
Principal Component Analysis (PCA).
It is a special (simplified) case of EOF analysis offered as a separate
procedure for clarity/familiarity. It calls s_lin_eof with equal weights.
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| real(kind=wp) | intent(in) | | | x(nd,nv) | input data |
| integer(kind=i4) | intent(in) | | | nd | number of rows |
| integer(kind=i4) | intent(in) | | | nv | number of columns |
| real(kind=wp) | intent(out) | | | pc(nd,nv) | principal components |
| real(kind=wp) | intent(out) | | | ev(nv,nv) | eigenvectors (unweighted) |
| real(kind=wp) | intent(out) | | | ew(nv) | eigenvalues |
| real(kind=wp) | intent(out) | optional | | r2(nv) | explained variance (fraction) |