LIN: Linear Procedures

Overview

This is the API documentation for all linear procedures, i.e. procedures that rely heavily on linear algebra.


Principal Component Analysis

fsml_pca

Description

Principal Component Analysis (PCA) is a procedure to reduce the dimensionality of multivariate data by identifying a set of orthogonal vectors (eigenvectors) that represent directions of maximum variance in the dataset.

For a classic PCA, the input matrix (x) holds data or observations in rows (nd) and different variables in columns (nv).

The procedure fsml_pca is a wrapper for fsml_eof and offers a simpler, more familiar interface for non-geoscientists. The EOF interface accepts additional options that are irrelevant to standard applications of PCA. The PCA procedure calls the EOF procedure with weights (wt) set to 1.0 and the matrix option set to opt = 0, forcing the use of the covariance matrix so that results are comparable to other common PCA implementations (e.g., Python's sklearn).
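For illustration, the two calls below are expected to produce the same decomposition; a minimal sketch using only the interfaces documented on this page (array declarations as in the Examples section at the bottom, with x standing for the input matrix):

  ! PCA interface
  call fsml_pca(x, nd, nv, pc=pc, ev=eof, ew=ew)

  ! equivalent EOF call: unit weights and covariance matrix (opt = 0)
  wt = 1.0_wp
  call fsml_eof(x, nd, nv, pc=pc, eof=eof, ew=ew, opt=0, wt=wt)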

The covariance matrix is computed as:

$$\mathbf{C} = \frac{1}{n - 1}\, \mathbf{X}^\top \mathbf{X}$$

where $\mathbf{X}$ is the preprocessed (centred and optionally standardised) data matrix, and $n$ is the number of observations (rows in x).

A symmetric eigen-decomposition is then performed:

$$\mathbf{C}\, \mathbf{V} = \mathbf{V}\, \boldsymbol{\Lambda}$$

where $\mathbf{V}$ contains the EOFs (ev), and $\boldsymbol{\Lambda}$ is a diagonal matrix of eigenvalues (ew).

The principal components or scores (PCs, pc) are given by:

$$\mathbf{P} = \mathbf{X}\, \mathbf{V}$$

The number of valid PC modes is determined by the number of non-zero eigenvalues. Arrays are initialised to zero and populated only where eigenvalues are strictly positive.

The explained variance (r2) for each component is computed as a fraction:

$$r^2_i = \frac{\lambda_i}{\sum_{j} \lambda_j}$$

where $i$ is the PC index, and $j$ spans all retained eigenvalues, representing all principal components that explain variability in the data.
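The fractions can be recovered directly from the returned eigenvalues; a minimal sketch (assuming ew and r2 are rank-1 reals of dimension nv, as in the Examples section; zero eigenvalues contribute nothing to the sum):

  ! explained variance fraction per mode
  r2 = ew / sum(ew)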

Note: This procedure uses eigh from the stdlib_linalg module to compute eigenvalues and eigenvectors of the symmetric covariance matrix.

Note

The procedure has no pure equivalent because it relies on an impure procedure for eigen-decomposition (eigh).

Syntax

call fsml_pca(x, nd, nv, pc, ev, ew [, r2])

Parameters

x: A rank-2 array of type real with dimensions nd, nv.

nd: A scalar of type integer.

nv: A scalar of type integer.

Invalid argument values will result in the return of a sentinel value.

Returns

pc: A rank-2 array of type real (same type as x) with dimensions nd, nv.

ev: A rank-2 array of type real (same type as x) with dimensions nv, nv.

ew: A rank-1 array of type real with dimension nv. It stores the eigenvalues.

r2: An optional return and rank-1 array of type real with dimension nv. It stores the explained variance of each component.


Empirical Orthogonal Functions

fsml_eof

Description

Empirical Orthogonal Function (EOF) analysis is a procedure to reduce the dimensionality of multivariate data by identifying a set of orthogonal vectors (EOFs or eigenvectors) that represent directions of maximum variance in the dataset. The term EOF analysis is often used interchangeably with geographically weighted principal component analysis (PCA). The procedures are mathematically equivalent, but procedures for EOF analysis offer some additional options that are mostly relevant for geoscience. The procedure fsml_pca is a wrapper for fsml_eof that offers a simpler, more familiar interface for non-geoscientists.

For a classic EOF analysis, the input matrix x holds data or observations that have been discretised in time and space. Rows (nd) and columns (nv) can therefore be interpreted as time and space dimensions, respectively. EOF analysis allows for geographical weighting, which translates to column-wise weighting prior to analysis in the procedure. Weights can be set by passing the rank-1 array wt of dimension nv. If this optional argument is not passed, the procedure will default to equal weights of value 1/n, which is numerically more stable than 1.0, the default in many implementations of a PCA.
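In geoscientific applications, a common choice is cosine-of-latitude (area) weighting; a hedged sketch (lat is a hypothetical rank-1 real array of dimension nv holding the latitude of each column in degrees; it is not part of the interface):

  ! area weighting: weight each column (grid point) by cos(latitude)
  wt = cos(lat * (acos(-1.0_wp) / 180.0_wp))
  call fsml_eof(x, nd, nv, pc=pc, eof=eof, ew=ew, wt=wt)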

After the weighting is applied, the covariance or correlation matrix is computed:

$$\mathbf{C} = \frac{1}{n - 1}\, \mathbf{X}^\top \mathbf{X}$$

where $\mathbf{X}$ is the preprocessed (centred and optionally standardised) data matrix, and $n$ is the number of observations (rows nd in x). The value of the optional argument opt determines if the covariance matrix (opt = 0) or correlation matrix (opt = 1) is constructed. If the argument is not passed, the procedure will default to the use of the covariance matrix, as is the standard for a regular PCA.

A symmetric eigen-decomposition is then performed:

$$\mathbf{C}\, \mathbf{V} = \mathbf{V}\, \boldsymbol{\Lambda}$$

where $\mathbf{V}$ contains the EOFs (eof), and $\boldsymbol{\Lambda}$ is a diagonal matrix of eigenvalues (ew).

The principal components or scores (PCs, pc) are given by:

$$\mathbf{P} = \mathbf{X}\, \mathbf{V}$$

The number of valid EOF/PC modes is determined by the number of non-zero eigenvalues. Arrays are initialised to zero and populated only where eigenvalues are strictly positive.

The explained variance (r2) for each component is computed as a fraction:

$$r^2_i = \frac{\lambda_i}{\sum_{j} \lambda_j}$$

where $i$ is the PC index, and $j$ spans all retained eigenvalues, representing all principal components that explain variability in the data.

EOFs may optionally be scaled (eof_scaled) for more convenient plotting:

$$\tilde{\mathbf{v}}_j = \mathbf{v}_j \sqrt{\lambda_j}$$

where $\mathbf{v}_j$ is the $j$-th EOF and $\lambda_j$ is the corresponding eigenvalue.
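The scaling can be reproduced from the unscaled EOFs and eigenvalues, as the output in the Examples section below confirms; a minimal sketch (declarations as in the Examples section):

  ! scale each EOF (column j) by the square root of its eigenvalue
  do j = 1, nv
    eof_scaled(:,j) = eof(:,j) * sqrt(ew(j))
  end do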

Note: This procedure uses eigh from the stdlib_linalg module to compute eigenvalues and eigenvectors of the symmetric covariance or correlation matrix.

Note

The procedure has no pure equivalent because it relies on an impure procedure for eigen-decomposition (eigh).

Syntax

call fsml_eof(x, nd, nv, pc, eof, ew [, opt, wt, r2, eof_scaled])

Parameters

x: A rank-2 array of type real with dimensions nd, nv.

nd: A scalar of type integer.

nv: A scalar of type integer.

opt: An optional argument and scalar of type integer. It must be 0 or 1. If not passed, it will default to 0.

wt: An optional argument and rank-1 array of type real with dimension nv. If not passed, it will default to equal weights of value 1/n.

Invalid argument values will result in the return of a sentinel value.

Returns

pc: A rank-2 array of type real (same type as x) with dimensions nd, nv.

eof: A rank-2 array of type real (same type as x) with dimensions nv, nv.

ew: A rank-1 array of type real with dimension nv. It stores the eigenvalues.

r2: An optional return and rank-1 array of type real with dimension nv. It stores the explained variance of each component.

eof_scaled: An optional return and rank-2 array of type real (same type as x) with dimensions nv, nv.


Linear Discriminant Analysis (2-Class)

fsml_lda_2class

Description

The 2-class multivariate Linear Discriminant Analysis (LDA) is a statistical procedure for classification and the investigation and explanation of differences between two groups (or classes) with regard to their attribute variables. It quantifies the discriminability of the groups and the contribution of each of the attribute variables to this discriminability.

The procedure finds a discriminant function that best separates the two groups. The function can be expressed as a linear combination of the attribute variables:

$$y = a_0 + \sum_{j=1}^{m} a_j x_j$$

where $y$ is the discriminant function, $x_j$ are the attribute variables used in evaluating the differences between the groups, $a_j$ are the discriminant coefficients associated with each variable, $m$ (nv) is the number of variables, and $a_0$ is the y-intercept. (Note: Mathematically, it is analogous to a multivariate linear regression function.)

Each attribute variable $x_j$ contains $n$ elements (x), where $n$ (nd) is the number of elements in each group. Each element $i$ is associated with a discriminant value described by:

$$y_i = a_0 + \sum_{j=1}^{m} a_j x_{ij}$$

Geometrically, this can be visualised as elements being projected on the discriminant axis $y$. The optimal discriminant function is then determined by finding an axis on which the projected elements of the two groups are best separated. The best separation is given by maximising the discriminant criterion $\Gamma$ (g), a signal-to-noise ratio, so that:

$$\Gamma = \frac{\left(\bar{y}_A - \bar{y}_B\right)^2}{\frac{1}{n_A} \sum_{i=1}^{n_A} \left(y_{A,i} - \bar{y}_A\right)^2 + \frac{1}{n_B} \sum_{i=1}^{n_B} \left(y_{B,i} - \bar{y}_B\right)^2}$$

where $n_A$ and $n_B$ are the number of elements in groups $A$ and $B$, respectively. The procedure assumes that these are the same (nd) and only accepts 2 groups (nc = 2).

The discriminant coefficients are then standardised (sa) using the standard deviations of the respective variables. The discriminant function represents a model that best separates the groups and can be used as a classification model. The skill of that model is determined by forgetting the association of each element with the groups and using the model to reclassify the elements. The score (score) is the fraction of correct classifications and can be interpreted as a measure of how well the function works as a classification model. A score of 1.0, for example, means that every element is reassigned to its original group.

The procedure optionally returns the Mahalanobis distance (mh) as a measure of distance between the groups.

Note: This procedure uses eigh from the stdlib_linalg module.

Note

The procedure has no pure equivalent because it relies on an impure procedure for eigen-decomposition (eigh).

Syntax

call fsml_lda_2class(x, nd, nv, nc, sa, g, score [, mh])

Parameters

x: A rank-3 array of type real with dimensions nd, nv, nc.

nd: A scalar of type integer.

nv: A scalar of type integer.

nc: A scalar of type integer. It must be 2.

Invalid argument values will result in the return of a sentinel value.

Returns

sa: A rank-1 array of type real and dimension nv.

g: A scalar of type real.

score: A scalar of type real.

mh: An optional return and scalar of type real.


Ordinary Least Squares Regression

fsml_ols

Description

The multiple linear Ordinary Least Squares (OLS) regression models the relationship or linear dependence between a dependent (predictand) variable and one or more independent (predictor) variables. The procedure estimates the linear regression coefficients by minimising the sum of squared residuals.

The estimated regression model is of the form:

$$\hat{y} = b_0 + \sum_{j=1}^{m} b_j x_j$$

where $\hat{y}$ is the predictand variable, $x_j$ are the predictor variables (x) with nd observations, $b_0$ is the y-intercept (b0), $b_j$ (b) are the regression coefficients, and $m$ (nv) is the number of predictors (excluding the intercept).

The subroutine constructs a full design matrix internally by prepending a column of ones to account for the intercept. The regression coefficients are estimated as:

$$\mathbf{b} = \left(\mathbf{X}^\top \mathbf{X}\right)^{-1} \mathbf{X}^\top \mathbf{y}$$

where $\mathbf{X}$ is the extended design matrix including the intercept term.

The coefficient of determination (r2) represents the proportion of the total variance of the predictand (y) explained by the predictors. The predicted values (y_hat), the standard errors (se) of the coefficients, and the covariance matrix of the predictors (cov_b) can optionally be returned by the procedure.
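The optional predicted values follow directly from the returned coefficients; a minimal sketch (declarations as in the Examples section):

  ! reconstruct fitted values from the intercept and coefficients
  y_hat = b0 + matmul(x, b)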

Note: This procedure uses eigh from the stdlib_linalg module.

Note: The intercept and predictor coefficients are computed separately and returned explicitly.

Note

The procedure has no pure equivalent because it relies on an impure procedure for eigen-decomposition (eigh).

Syntax

call fsml_ols(x, y, nd, nv, b0, b, r2 [, y_hat, se, cov_b])

Parameters

x: A rank-2 array of type real with dimensions nd, nv.

y: A rank-1 array of type real with dimension nd.

nd: A scalar of type integer.

nv: A scalar of type integer.

Invalid argument values will result in the return of a sentinel value.

Returns

b0: A scalar of type real.

b: A rank-1 array of type real with dimension nv.

r2: A scalar of type real.

y_hat: An optional return and rank-1 array of type real with dimension nd.

se: An optional return and rank-1 array of type real with dimension nv.

cov_b: An optional return and rank-2 array of type real with dimensions nv, nv.


Ridge Regression

fsml_ridge

Description

The multiple linear Ridge regression models the relationship or linear dependence between a dependent (predictand) variable and one or more independent (predictor) variables, incorporating a penalty term on the size of the regression coefficients to reduce multicollinearity and overfitting.

The procedure estimates the linear regression coefficients by minimising the sum of squared residuals plus a penalty proportional to the square of the magnitude of the coefficients:

$$\mathbf{b} = \left(\mathbf{X}^\top \mathbf{X} + \lambda \mathbf{I}_0\right)^{-1} \mathbf{X}^\top \mathbf{y}$$

where $\lambda$ (lambda) is the ridge penalty parameter, and $\mathbf{I}_0$ is the identity matrix with the first diagonal element (corresponding to the intercept) set to zero (no penalty on the intercept).

The estimated regression model is of the form:

$$\hat{y} = b_0 + \sum_{j=1}^{m} b_j x_j$$

where $\hat{y}$ is the predictand variable, $x_j$ are the predictor variables (x) with nd observations, $b_0$ is the y-intercept (b0), $b_j$ (b) are the ridge regression coefficients, and $m$ (nv) is the number of predictors (excluding the intercept).

The subroutine constructs a full design matrix internally by prepending a column of ones to account for the intercept. The coefficient of determination (r2), predicted values (y_hat), ridge-adjusted standard errors (se) of the coefficients, and the ridge-adjusted covariance matrix of the predictors (cov_b) can optionally be returned. The covariance matrix and standard errors are adjusted for the ridge penalty as:

$$\mathrm{Cov}(\mathbf{b}) = \hat{\sigma}^2 \left(\mathbf{X}^\top \mathbf{X} + \lambda \mathbf{I}_0\right)^{-1} \mathbf{X}^\top \mathbf{X} \left(\mathbf{X}^\top \mathbf{X} + \lambda \mathbf{I}_0\right)^{-1}$$

where $\hat{\sigma}^2$ is the residual variance estimate.
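As lambda approaches zero, the ridge estimate approaches the OLS estimate; a sketch of that comparison (declarations as in the Examples section, and noting that lambda must remain strictly positive):

  ! a very small penalty should closely reproduce the fsml_ols coefficients
  call fsml_ridge(x, y, nd, nv, 1.0e-8_wp, b0, b, r2)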

Note: This procedure uses eigh from the stdlib_linalg module.

Note

The procedure has no pure equivalent because it relies on an impure procedure for eigen-decomposition (eigh).

Syntax

call fsml_ridge(x, y, nd, nv, lambda, b0, b, r2 [, y_hat, se, cov_b])

Parameters

x: A rank-2 array of type real with dimensions nd, nv.

y: A rank-1 array of type real with dimension nd.

nd: A scalar of type integer.

nv: A scalar of type integer.

lambda: A scalar of type real. It must be strictly positive.

Invalid argument values will result in the return of a sentinel value.

Returns

b0: A scalar of type real.

b: A rank-1 array of type real with dimension nv.

r2: A scalar of type real.

y_hat: An optional return and rank-1 array of type real with dimension nd.

se: An optional return and rank-1 array of type real with dimension nv.

cov_b: An optional return and rank-2 array of type real with dimensions nv, nv.


Mahalanobis Distance

fsml_mahalanobis

Description

Computes the Mahalanobis distance between two input feature vectors x and y. If a covariance matrix cov is provided, it is used directly in the calculation. Otherwise, the procedure estimates the covariance matrix from the two-sample dataset formed by x and y. A Cholesky-based solver is used to perform the distance calculation.

The Mahalanobis distance is defined as:

$$d(\mathbf{x}, \mathbf{y}) = \sqrt{\left(\mathbf{x} - \mathbf{y}\right)^\top \mathbf{S}^{-1} \left(\mathbf{x} - \mathbf{y}\right)}$$

where $\mathbf{S}$ is the covariance matrix. The inverse is applied via the Cholesky decomposition for numerical stability.
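For example, passing the identity matrix as cov (which is positive definite) reduces the result to the Euclidean distance; a minimal sketch (x and y are rank-1 reals of dimension m, cov a rank-2 real of dimensions m, m):

  ! identity covariance: Mahalanobis distance equals Euclidean distance
  cov = 0.0_wp
  do i = 1, m
    cov(i,i) = 1.0_wp
  end do
  dist = fsml_mahalanobis(x, y, cov=cov)   ! equals sqrt(sum((x - y)**2))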

Note: If passed, the covariance matrix (cov) must be positive definite for the factorisation to succeed.

Note: This procedure uses chol from the stdlib_linalg module.

Syntax

dist = fsml_mahalanobis(x, y [, cov])

Parameters

x: A rank-1 array of type real with dimension m.

y: A rank-1 array of type real with dimension m.

cov: An optional rank-2 array of type real with dimensions m, m.

Returns

dist: A scalar of type real.


Examples

program example_lin

! |--------------------------------------------------------------------|
! | fsml - fortran statistics and machine learning library             |
! |                                                                    |
! | about                                                              |
! | -----                                                              |
! | Examples for linear (algebra) statistical procedures (lin module). |
! |                                                                    |
! | license : MIT                                                      |
! | author  : Sebastian G. Mutz (sebastian@sebastianmutz.com)          |
! |--------------------------------------------------------------------|

  use :: fsml
  use :: fsml_ini ! import wp; alternatively: iso_fortran_env, wp => real64

  implicit none

! ==== Description
!! The programme demonstrates the use of procedures relying heavily on linear
!! algebra.

! ==== Declarations

! ---- general
  integer(i4)            :: i, j

! ---- data

  ! dims
  integer(i4), parameter :: nd = 5, nv = 3, nc = 2

  ! PCA/EOF
  real(wp)   , parameter :: x1(nd,nv) = reshape([ &
                                     & 2.1_wp, 2.5_wp, 1.9_wp, 2.3_wp, 2.0_wp, &
                                     & 2.4_wp, 2.8_wp, 2.6_wp, 3.2_wp, 2.9_wp, &
                                     & 2.0_wp, 2.8_wp, 2.1_wp, 2.3_wp, 2.9_wp  &
                                     & ], shape=[nd,nv])

  ! LDA
  real(wp)   , parameter :: x2(nd,nv,nc) = reshape([ &
                                     & 2.1_wp, 2.5_wp, 1.9_wp, 2.3_wp, 2.0_wp, & ! class 1, var 1
                                     & 2.4_wp, 2.8_wp, 2.6_wp, 3.2_wp, 2.9_wp, & ! class 1, var 2
                                     & 2.0_wp, 2.8_wp, 2.1_wp, 2.3_wp, 2.9_wp, & ! class 1, var 3
                                     & 1.1_wp, 1.3_wp, 1.5_wp, 1.4_wp, 1.6_wp, & ! class 2, var 1
                                     & 1.2_wp, 1.5_wp, 1.6_wp, 1.7_wp, 1.8_wp, & ! class 2, var 2
                                     & 1.0_wp, 1.2_wp, 1.3_wp, 1.1_wp, 1.4_wp  & ! class 2, var 3
                                     & ], shape=[nd,nv,nc])

  ! OLS/Ridge
  real(wp), parameter :: x3(nd,nv) = reshape([ &
                                   & 1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp, 5.0_wp, & ! var 1
                                   & 2.0_wp, 1.0_wp, 4.0_wp, 3.0_wp, 2.0_wp, & ! var 2
                                   & 3.0_wp, 3.5_wp, 2.0_wp, 1.0_wp, 2.5_wp  & ! var 3
                                   & ], shape=[nd,nv])
  real(wp), parameter :: y(nd) = [10.0_wp, 12.0_wp, 13.0_wp, 14.0_wp, 15.0_wp]  ! target values

  ! Mahalanobis distance
  real(wp) :: dist         ! distance

! ---- results

  ! PCA
  real(wp) :: pc(nd,nv), eof(nv,nv), ew(nv), eof_scaled(nv,nv), r2(nv), wt(nv)

  ! LDA
  real(wp) :: score, g, mh, sa(nv)

  ! OLS and Ridge
  real(wp) :: b(nv), b0, yhat(nd), se(nv), covb(nv,nv), rsq

! ==== Instructions

! ---- Principal Component Analysis

  call fsml_pca(x1, nd, nv, pc=pc, ev=eof, ew=ew, r2=r2)
  write(*,'(A)') "> principal component analysis"
  print*
  write(*,'(A)')  "  principal components:  "
  write(*,'(3F10.5)')  (pc(i,:), i=1,nd)
  print*
  write(*,'(A)')   "  eigenvectors:         "
  write(*,'(3F10.5)')  (eof(i,:), i=1,nv)
  print*
  write(*,'(A)')   "  eigenvalues:          "
  write(*,'(3F10.5)')  ew
  print*
  write(*,'(A)')   "  explained variance:   "
  write(*,'(3F10.5)')  r2
  print*
  ! principal components:
  ! -0.54693  -0.07722   0.13896
  !  0.42315  -0.07015   0.27645
  ! -0.42586  -0.05419  -0.13456
  !  0.13778   0.43289  -0.06344
  !  0.41185  -0.23133  -0.21741
  !
  ! eigenvectors:
  !  0.28392   0.36310   0.88744
  !  0.47190   0.75276  -0.45897
  !  0.83469  -0.54909  -0.04238
  !
  ! eigenvalues:
  !  0.21204   0.06368   0.04128
  !
  ! explained variance:
  !  0.66888   0.20089   0.13023

! ---- Empirical Orthogonal Functions

  ! applying weights of 1.0 (default = 1/n)
  wt = 1.0_wp

  ! call eof procedure with weights of 1.0 and opt=0 (covariance matrix)
  ! to make it consistent with fsml_pca and other common PCA implementations
  call fsml_eof(x1, nd, nv, pc=pc, eof=eof, ew=ew, opt=0, &
                 wt=wt, r2=r2, eof_scaled=eof_scaled)
  write(*,'(A)') "> empirical orthogonal function analysis"
  print*
  write(*,'(A)')  "  principal components:  "
  write(*,'(3F10.5)')  (pc(i,:), i=1,nd)
  print*
  write(*,'(A)')   "  EOFs (eigenvectors):  "
  write(*,'(3F10.5)')  (eof(i,:), i=1,nv)
  print*
  write(*,'(A)')   "  eigenvalues:          "
  write(*,'(3F10.5)')  ew
  print*
  write(*,'(A)')   "  explained variance:   "
  write(*,'(3F10.5)')  r2
  print*
  write(*,'(A)')   "  scaled EOFs:          "
  write(*,'(3F10.5)')  (eof_scaled(i,:), i=1,nv)
  print*
  ! principal components:
  ! -0.54693  -0.07722   0.13896
  !  0.42315  -0.07015   0.27645
  ! -0.42586  -0.05419  -0.13456
  !  0.13778   0.43289  -0.06344
  !  0.41185  -0.23133  -0.21741
  !
  ! eigenvectors:
  !  0.28392   0.36310   0.88744
  !  0.47190   0.75276  -0.45897
  !  0.83469  -0.54909  -0.04238
  !
  ! eigenvalues:
  !  0.21204   0.06368   0.04128
  !
  ! explained variance:
  !  0.66888   0.20089   0.13023
  !
  ! scaled EOFs:
  !  0.13074   0.09163   0.18031
  !  0.21730   0.18996  -0.09325
  !  0.38435  -0.13856  -0.00861

! ---- 2-Class Multivariate Linear Discriminant Analysis

  call fsml_lda_2class(x2, nd, nv, nc, sa, g, score, mh)
  write(*,'(A)') "> linear discriminant analysis (2-class)"
  print*
  write(*,'(A,F10.5)') "  classification score: ", score
  write(*,'(A,F10.5)') "  Mahalanobis distance: ", mh
  write(*,'(A,F10.5)') "  discriminant value g: ", g
  print*
  write(*,'(A)') "  standardised coefficients:"
  write(*,'(3F10.5)') sa
  print*
  ! classification score:    1.00000
  ! Mahalanobis distance:    5.53247
  ! discriminant value g:   53.34966
  !
  ! standardised coefficients:
  ! 2.07805   9.14221   5.42794

! ---- Multiple Linear Regression (Ordinary Least Squares)

  call fsml_ols(x3, y, nd, nv, b0, b, rsq, yhat, se, covb)
  write(*,'(A)') "> multiple linear regression (OLS)"
  print*
  write(*,'(A,F10.5)') "  R²:        ", rsq
  write(*,'(A,F10.5)') "  Intercept: ", b0
  print*
  write(*,'(A)') "  Coefficients:"
  write(*,'(3F10.5)') b
  print*
  write(*,'(A)') "  Predicted values:"
  write(*,'(5F10.5)') yhat
  print*
  write(*,'(A)') "  Standard errors:"
  write(*,'(3F10.5)') se
  print*
  ! R²:           0.97361
  ! Intercept:    8.78516
  !
  ! Coefficients:
  ! 1.22266   0.05078   0.09375
  !
  ! Predicted values:
  ! 10.39062  11.60938  12.84375  13.92187  15.23437
  !
  ! Standard errors:
  ! 0.25240   0.43454   0.60515

! ---- Multiple Linear Regression (Ridge)

  ! call with lambda set to 0.2
  call fsml_ridge(x3, y, nd, nv, 0.2_wp, b0, b, rsq, yhat, se, covb)
  write(*,'(A)') "> multiple linear regression (Ridge)"
  print*
  write(*,'(A,F10.5)') "  R²:        ", rsq
  write(*,'(A,F10.5)') "  Intercept: ", b0
  print*
  write(*,'(A)') "  Coefficients:"
  write(*,'(3F10.5)') b
  print*
  write(*,'(A)') "  Predicted values:"
  write(*,'(5F10.5)') yhat
  print*
  write(*,'(A)') "  Standard errors:"
  write(*,'(3F10.5)') se
  print*
  ! R²:           0.97293
  ! Intercept:    9.11522
  !
  ! Coefficients:
  ! 1.18224   0.02590   0.03162
  !
  ! Predicted values:
  ! 10.44413  11.61628  12.82879  13.95351  15.15729
  !
  ! Standard errors:
  ! 0.23481   0.36909   0.49025

! ---- Mahalanobis distance

  ! call without passing covariance matrix
  dist = fsml_mahalanobis(x3(:,1), x3(:,2))
  write(*,'(A)') "> Mahalanobis distance"
  print*
  write(*,'(A,F10.5)') "  Distance:  ", dist
  print*
  ! Distance:     1.41421

end program example_lin