Empirical Orthogonal Function (EOF) analysis is a procedure to reduce the dimensionality
of multivariate data by identifying a set of orthogonal vectors (EOFs or eigenvectors)
that represent directions of maximum variance in the dataset.
The term EOF analysis is often used interchangeably with geographically weighted
principal component analysis (PCA). The two are mathematically equivalent, but
procedures for EOF analysis offer some additional options that are mostly relevant for
geoscience. The procedure `fsml_pca` is a wrapper for `fsml_eof` that offers a simpler,
more familiar interface for non-geoscientists.
For a classic EOF analysis, the input matrix `x` holds data or observations that have been
discretised in time and space. Rows (`m`, passed as `nd`) and columns (`n`, passed as `nv`)
can therefore be interpreted as time and space dimensions, respectively. EOF analysis allows
for geographical weighting, which translates to column-wise weighting prior to analysis in the procedure.
Weights can be set by passing the rank-1 array `wt` of dimension `n`. If this optional
argument is not passed, the procedure will default to equal weights of value $\frac{1}{n}$.
This default is numerically more stable than 1.0, which is the default for many implementations of a PCA.
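The weighting step can be illustrated with a short sketch. It is written in Python purely for illustration and is not the Fortran implementation; the function names `default_weights` and `apply_column_weights` are hypothetical, and the assumption that weights enter as a plain multiplication of each column is ours, not something stated in this documentation.

```python
def default_weights(n):
    """Equal weights of 1/n for n columns (the documented default)."""
    return [1.0 / n] * n


def apply_column_weights(x, wt=None):
    """Return a copy of x (a list of rows) with column j multiplied by wt[j].

    Assumption: weights are applied as a plain column-wise multiplication.
    """
    n = len(x[0])
    if wt is None:
        wt = default_weights(n)
    if len(wt) != n:
        raise ValueError("wt must have one entry per column")
    return [[row[j] * wt[j] for j in range(n)] for row in x]
```

Calling `apply_column_weights(x)` without `wt` mirrors the case where the optional argument is omitted; passing, for example, latitude-dependent weights would mirror the geographical use case.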
After the weighting is applied, the covariance or correlation matrix is computed:

$$ \mathbf{C} = \frac{1}{m - 1} \mathbf{X}^\top \mathbf{X} $$

where $\mathbf{X}$ is the preprocessed (centred and optionally standardised) data matrix,
and $m$ is the number of observations (rows in `x`).
The value of the optional argument `opt` determines if the covariance matrix (`opt = 0`) or
correlation matrix (`opt = 1`) is constructed. If the argument is not passed, the procedure will
default to the use of the covariance matrix, as is the standard for a regular PCA.
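As a rough illustration of the covariance/correlation choice described above, the following Python sketch centres the columns, optionally standardises them, and forms the cross-product matrix divided by the number of rows minus one. The function name `covariance_or_correlation` is hypothetical, weighting is left out for brevity, and the use of the sample standard deviation for standardisation is an assumption.

```python
import math


def covariance_or_correlation(x, opt=0):
    """Return C = X^T X / (m - 1) for the preprocessed data matrix X.

    opt = 0: columns are centred only (covariance matrix).
    opt = 1: columns are centred and standardised (correlation matrix).
    Assumption: standardisation uses the sample standard deviation;
    a constant column would therefore cause a division by zero.
    """
    if opt not in (0, 1):
        raise ValueError("opt must be 0 (covariance) or 1 (correlation)")
    m, n = len(x), len(x[0])
    means = [sum(row[j] for row in x) / m for j in range(n)]
    xp = [[row[j] - means[j] for j in range(n)] for row in x]
    if opt == 1:
        sds = [math.sqrt(sum(r[j] ** 2 for r in xp) / (m - 1)) for j in range(n)]
        xp = [[r[j] / sds[j] for j in range(n)] for r in xp]
    return [[sum(r[i] * r[j] for r in xp) / (m - 1) for j in range(n)]
            for i in range(n)]
```

With `opt=1` the diagonal of the result is 1 by construction, which is a quick sanity check when comparing the two modes.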
A symmetric eigen-decomposition is then performed:

$$ \mathbf{C} \mathbf{E} = \mathbf{E} \mathbf{\Lambda} $$

where $\mathbf{E}$ contains the EOFs (`eof`), and $\mathbf{\Lambda}$ is a diagonal matrix
of eigenvalues (`ew`).
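To make the eigen-decomposition step concrete, here is a minimal Python sketch that diagonalises a small symmetric matrix with the cyclic Jacobi rotation method. This is a stand-in chosen because it fits in a few lines of standard-library code; it is not the solver used by the procedure, and the function name `jacobi_eigh` and the descending sort order are assumptions made for this example.

```python
import math


def jacobi_eigh(c, sweeps=50, tol=1e-22):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (ew, eof): eigenvalues sorted in descending order and a matrix
    whose column j is the unit eigenvector belonging to ew[j].
    """
    n = len(c)
    a = [row[:] for row in c]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                cs, sn = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (update columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = cs * akp - sn * akq
                    a[k][q] = sn * akp + cs * akq
                for k in range(n):  # A <- J^T A (update rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = cs * apk - sn * aqk
                    a[q][k] = sn * apk + cs * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = cs * vkp - sn * vkq
                    v[k][q] = sn * vkp + cs * vkq
    order = sorted(range(n), key=lambda j: a[j][j], reverse=True)
    ew = [a[j][j] for j in order]
    eof = [[v[i][j] for j in order] for i in range(n)]
    return ew, eof
```

For the matrix `[[2.0, 1.0], [1.0, 2.0]]` the eigenvalues are 3 and 1, which makes a convenient hand check. Eigenvector signs are arbitrary, as with any eigen-solver.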
The principal components or scores (PCs, `pc`) are given by:

$$ \mathbf{PC} = \mathbf{X} \mathbf{E} $$
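The projection of the preprocessed data onto the EOFs is an ordinary matrix product, sketched below in Python. The function name `principal_components` is hypothetical, and the example assumes the EOFs are stored as the columns of `eof`, as in the argument description of this procedure.

```python
def principal_components(xp, eof):
    """Project each preprocessed row of xp onto every EOF (column of eof)."""
    n = len(eof)         # number of variables (rows of eof)
    modes = len(eof[0])  # number of EOFs (columns of eof)
    return [[sum(row[k] * eof[k][j] for k in range(n)) for j in range(modes)]
            for row in xp]
```

Each row of the result holds the scores of one observation, and each column holds the time series of one mode.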
The number of valid EOF/PC modes is determined by the number of non-zero eigenvalues.
Arrays are initialised to zero and populated only where eigenvalues are strictly positive.
The explained variance (`r2`) for each component is computed as a fraction:

$$ r^2_j = \frac{\lambda_j}{\sum_{k} \lambda_k} $$

where $j$ is the PC index, and $k$ spans all retained eigenvalues,
representing all principal components that explain variability in the data.
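The explained-variance fraction can be sketched in a few lines of Python. The function name `explained_variance` is hypothetical; the sketch follows the description above in that only strictly positive eigenvalues are retained and all other entries stay at zero.

```python
def explained_variance(ew):
    """Fraction of variance per mode, using only strictly positive eigenvalues.

    Entries for non-positive eigenvalues are left at zero, mirroring the
    zero-initialised output arrays described in the text.
    """
    total = sum(lam for lam in ew if lam > 0.0)
    if total == 0.0:
        return [0.0] * len(ew)
    return [lam / total if lam > 0.0 else 0.0 for lam in ew]
```

For eigenvalues `[3.0, 1.0, 0.0]` this yields `[0.75, 0.25, 0.0]`, so the fractions of the retained modes sum to one.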
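EOFs may optionally be scaled (`eof_scaled`) for more convenient plotting:

$$ \mathbf{e}_j^{\text{scaled}} = \sqrt{\lambda_j} \, \mathbf{e}_j $$

where $\mathbf{e}_j$ is the $j$-th EOF (column of $\mathbf{E}$) and $\lambda_j$ is its eigenvalue.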
Note: This subroutine uses `eigh` from the `stdlib_linalg` module to compute
eigenvalues and eigenvectors of the symmetric covariance matrix.
Empirical Orthogonal Function (EOF) analysis
Type | Intent | Optional | Attributes | Name | Description
---|---|---|---|---|---
real(kind=wp) | intent(in) | | | x(nd,nv) | input data
integer(kind=i4) | intent(in) | | | nd | number of rows
integer(kind=i4) | intent(in) | | | nv | number of columns
real(kind=wp) | intent(out) | | | pc(nd,nv) | principal components
real(kind=wp) | intent(out) | | | eof(nv,nv) | EOFs/eigenvectors (unweighted)
real(kind=wp) | intent(out) | | | ew(nv) | eigenvalues
integer(kind=i4) | intent(in) | optional | | opt | 0 = covariance, 1 = correlation
real(kind=wp) | intent(in) | optional | | wt(nv) | optional weights (default = 1.0/n)
real(kind=wp) | intent(out) | optional | | r2(nv) | explained variance (fraction)
real(kind=wp) | intent(out) | optional | | eof_scaled(nv,nv) | EOFs/eigenvectors scaled for plotting