fsml_pca Interface

public interface fsml_pca

Principal Component Analysis (PCA) is a procedure to reduce the dimensionality of multivariate data by identifying a set of orthogonal vectors (eigenvectors) that represent the directions of maximum variance in the dataset.

The procedure fsml_pca is a wrapper for fsml_eof and offers a simpler, more familiar interface for non-geoscientists. The EOF interface accepts additional options that are irrelevant to standard applications of PCA. The PCA procedure calls the EOF procedure with all weights (wt) set to 1.0 and the matrix option set to opt = 0, which forces the use of the covariance matrix so that results are comparable to other common PCA implementations (e.g., sklearn).

The covariance matrix \( \mathbf{C} \) is computed as:

$$ \mathbf{C} = \frac{1}{m - 1} X^\top X $$

where \( X \) is the preprocessed (centred and optionally standardised) data matrix, and \( m \) is the number of observations (rows in x).
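The covariance step can be sketched in plain Fortran as follows. This is a minimal illustration, not the library's code: it assumes the usual sample covariance with divisor (number of observations minus 1), uses made-up data, and takes wp from iso_fortran_env rather than from the library.

```fortran
program covariance_sketch
  ! assumption: wp is double precision; the library defines its own wp
  use, intrinsic :: iso_fortran_env, only: wp => real64
  implicit none
  integer, parameter :: nd = 4, nv = 2   ! observations (rows), variables (columns)
  real(wp) :: x(nd, nv), xc(nd, nv), c(nv, nv)
  integer :: j

  ! illustrative data only
  x(:, 1) = [1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
  x(:, 2) = [2.0_wp, 4.0_wp, 6.0_wp, 8.0_wp]

  ! centre each column on its mean
  do j = 1, nv
    xc(:, j) = x(:, j) - sum(x(:, j)) / real(nd, wp)
  end do

  ! sample covariance matrix: transpose(xc) * xc / (nd - 1)
  c = matmul(transpose(xc), xc) / real(nd - 1, wp)

  ! print the nv-by-nv covariance matrix row by row
  do j = 1, nv
    print '(2f10.4)', c(j, :)
  end do
end program covariance_sketch
```

The resulting matrix is symmetric by construction, which is what allows a symmetric eigen-solver to be used in the next step.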

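A symmetric eigen-decomposition of the covariance matrix \( \mathbf{C} \) is then performed:

$$ \mathbf{C} = \mathbf{E} \Lambda \mathbf{E}^\top $$

where \( \mathbf{E} \) contains the EOFs (ev), and \( \Lambda \) is a diagonal matrix of eigenvalues (ew).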

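The principal components or scores (PCs, pc) are given by:

$$ \mathrm{PC} = X \mathbf{E} $$

where \( X \) is the preprocessed data matrix and \( \mathbf{E} \) is the matrix of eigenvectors (ev).

The number of valid PC modes is determined by the number of non-zero eigenvalues. Arrays are initialised to zero and populated only where eigenvalues are strictly positive.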

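The explained variance (r2) for each component is computed as a fraction:

$$ r^2_j = \frac{\lambda_j}{\sum_k \lambda_k} $$

where \( j \) is the PC index, \( \lambda_j \) is the corresponding eigenvalue (ew), and \( k \) spans all retained eigenvalues, representing all principal components that explain variability in the data.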
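As a quick worked illustration with hypothetical eigenvalues (not output from the library): if two eigenvalues are retained, \( \lambda_1 = 3 \) and \( \lambda_2 = 1 \), dividing each by their sum gives

$$ r^2_1 = \frac{3}{3 + 1} = 0.75, \qquad r^2_2 = \frac{1}{3 + 1} = 0.25 $$

The fractions sum to 1 over the retained components, so r2 can be read directly as the share of total variance carried by each PC.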

Note: This subroutine uses eigh from the stdlib_linalg module to compute eigenvalues and eigenvectors of the symmetric covariance matrix.

Calls

Call graph for fsml_pca:

fsml_pca calls s_lin_pca.
s_lin_pca calls s_err_print and s_lin_eof.
s_lin_eof calls s_err_print, eigh, f_sts_mean_core, and f_sts_std_core.
f_sts_std_core calls f_sts_var_core.
f_sts_var_core calls f_sts_mean_core.

Module Procedures

public subroutine s_lin_pca(x, nd, nv, pc, ev, ew, r2)

Principal Component Analysis (PCA). It is a special (simplified) case of EOF analysis offered as a separate procedure for clarity/familiarity. It calls s_lin_eof with equal weights.

Arguments

Type, Intent, Optional, Attributes, Name
real(kind=wp), intent(in) :: x(nd,nv)

input data (nd observations in rows, nv variables in columns)

integer(kind=i4), intent(in) :: nd

number of rows in x (observations)

integer(kind=i4), intent(in) :: nv

number of columns in x (variables)

real(kind=wp), intent(out) :: pc(nd,nv)

principal components

real(kind=wp), intent(out) :: ev(nv,nv)

eigenvectors (unweighted)

real(kind=wp), intent(out) :: ew(nv)

eigenvalues

real(kind=wp), intent(out), optional :: r2(nv)

explained variance (fraction)
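A call following the argument list above might look like the sketch below. Several details are assumptions rather than facts from this page: that the interface is made available by a module named fsml, that the library's wp corresponds to real64, and that i4 corresponds to int32. The data values are invented for illustration.

```fortran
program pca_usage_sketch
  ! assumption: fsml_pca is exported by a module named fsml
  use :: fsml, only: fsml_pca
  ! assumption: these kinds match the library's wp and i4
  use, intrinsic :: iso_fortran_env, only: wp => real64, i4 => int32
  implicit none
  integer(i4), parameter :: nd = 5, nv = 3   ! rows (observations), columns (variables)
  real(wp) :: x(nd, nv), pc(nd, nv), ev(nv, nv), ew(nv), r2(nv)
  integer :: j

  ! illustrative data, filled column by column (Fortran is column-major)
  x = reshape([2.5_wp, 0.5_wp, 2.2_wp, 1.9_wp, 3.1_wp, &
               2.4_wp, 0.7_wp, 2.9_wp, 2.2_wp, 3.0_wp, &
               1.0_wp, 0.3_wp, 0.8_wp, 1.1_wp, 0.9_wp], [nd, nv])

  ! r2 is optional and may be omitted if explained variance is not needed
  call fsml_pca(x, nd, nv, pc, ev, ew, r2)

  print '(a, 3f10.4)', 'eigenvalues:        ', ew
  print '(a, 3f10.4)', 'explained variance: ', r2

  ! scores: one row per observation, one column per component
  do j = 1, nd
    print '(3f10.4)', pc(j, :)
  end do
end program pca_usage_sketch
```

Because the output arrays are initialised to zero and filled only for strictly positive eigenvalues, trailing zeros in ew, r2, or columns of pc indicate modes that carry no variance.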