This procedure implements agglomerative hierarchical clustering: it groups data points into clusters by iteratively merging the two most similar clusters. It uses centroid linkage, with the Mahalanobis distance as the dissimilarity measure.
The input matrix (`x`) holds observations in rows (`nd`) and variables in columns (`nv`).
The target number of clusters (`nc`) must be at least 1 and no greater than the number of data points.
The variables are standardised, and the covariance matrix is then computed on the transformed data. This covariance matrix is used to calculate the Mahalanobis distance.
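The standardisation and distance steps can be illustrated with a minimal, self-contained sketch. It is not the library's code: the kinds `wp` and `i4` are assumed to map to `real64` and `int32`, the sample standard deviation (divisor `nd-1`) is an assumption, and the closed-form matrix inverse only works for the two-variable toy case shown here.

```fortran
program mahalanobis_sketch
  ! Assumption: wp and i4 correspond to real64 and int32
  use, intrinsic :: iso_fortran_env, only: wp => real64, i4 => int32
  implicit none
  integer(i4), parameter :: nd = 4, nv = 2
  real(wp) :: x(nd,nv), z(nd,nv), gm(nv), sigma(nv)
  real(wp) :: cov(nv,nv), cinv(nv,nv), d(nv), det, dist2
  integer(i4) :: j

  ! Toy data: 4 observations (rows) of 2 variables (columns)
  x(:,1) = [1.0_wp, 2.0_wp, 3.0_wp, 4.0_wp]
  x(:,2) = [2.0_wp, 1.0_wp, 4.0_wp, 3.0_wp]

  ! Standardise each variable (assumed: sample std. dev., divisor nd-1)
  do j = 1, nv
    gm(j)    = sum(x(:,j)) / real(nd, wp)
    sigma(j) = sqrt(sum((x(:,j) - gm(j))**2) / real(nd - 1, wp))
    z(:,j)   = (x(:,j) - gm(j)) / sigma(j)
  end do

  ! Covariance of the standardised data (columns of z have zero mean)
  cov = matmul(transpose(z), z) / real(nd - 1, wp)

  ! Closed-form inverse, valid for the 2x2 case only
  det = cov(1,1)*cov(2,2) - cov(1,2)*cov(2,1)
  cinv(1,1) =  cov(2,2) / det
  cinv(2,2) =  cov(1,1) / det
  cinv(1,2) = -cov(1,2) / det
  cinv(2,1) = -cov(2,1) / det

  ! Squared Mahalanobis distance between observations 1 and 4
  d = z(1,:) - z(4,:)
  dist2 = dot_product(d, matmul(cinv, d))
  print '(a,f8.4)', 'squared Mahalanobis distance: ', dist2
end program mahalanobis_sketch
```

For more than two variables, a general linear solver or matrix factorisation would be needed in place of the closed-form inverse.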
Clusters are merged iteratively until the target number of clusters is reached.
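The merge loop can be sketched as follows. This is an illustration of agglomerative merging with centroid linkage under stated assumptions, not the procedure's actual implementation: plain squared Euclidean distance stands in for the Mahalanobis distance to keep the example short, and the local names `cen`, `cnt` and `nact` are hypothetical.

```fortran
program centroid_merge_sketch
  ! Assumption: wp and i4 correspond to real64 and int32
  use, intrinsic :: iso_fortran_env, only: wp => real64, i4 => int32
  implicit none
  integer(i4), parameter :: nd = 5, nv = 2, nc = 2
  real(wp) :: x(nd,nv), cen(nv,nd), d2, best
  integer(i4) :: cl(nd), cnt(nd), nact, i, j, bi, bj

  ! Toy data: two visibly separated groups
  x(:,1) = [0.0_wp, 0.1_wp, 0.2_wp, 5.0_wp, 5.1_wp]
  x(:,2) = [0.0_wp, 0.2_wp, 0.1_wp, 5.0_wp, 5.2_wp]

  ! Start with every observation in its own cluster
  cen  = transpose(x)
  cnt  = 1
  cl   = [(i, i = 1, nd)]
  nact = nd

  do while (nact > nc)
    ! Find the closest pair of active clusters (cnt == 0 marks a merged slot)
    best = huge(1.0_wp)
    bi = 0
    bj = 0
    do i = 1, nd - 1
      if (cnt(i) == 0) cycle
      do j = i + 1, nd
        if (cnt(j) == 0) cycle
        ! Stand-in distance: squared Euclidean between centroids
        d2 = sum((cen(:,i) - cen(:,j))**2)
        if (d2 < best) then
          best = d2
          bi = i
          bj = j
        end if
      end do
    end do

    ! Merge bj into bi: the new centroid is the size-weighted mean
    cen(:,bi) = (cnt(bi)*cen(:,bi) + cnt(bj)*cen(:,bj)) &
                / real(cnt(bi) + cnt(bj), wp)
    cnt(bi) = cnt(bi) + cnt(bj)
    cnt(bj) = 0
    where (cl == bj) cl = bi
    nact = nact - 1
  end do

  print '(a,5i3)', 'cluster labels: ', cl
end program centroid_merge_sketch
```

In this sketch the labels are the indices of the surviving cluster slots rather than values renumbered to `1..nc`; a final relabelling pass would be needed to produce compact cluster numbers.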
The procedure returns the global mean (`gm`), the cluster centroids (`cm`), the membership assignments (`cl`), and the cluster sizes (`cc`), together with the covariance matrix (`cov`) and the standard deviations (`sigma`) used in the distance calculations.
Impure wrapper procedure for `s_nlp_hclust_core`.
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| real(kind=wp) | intent(in) | | | x(nd,nv) | input data matrix (samples, variables) |
| integer(kind=i4) | intent(in) | | | nd | number of data points |
| integer(kind=i4) | intent(in) | | | nv | number of variables |
| integer(kind=i4) | intent(in) | | | nc | number of clusters (target) |
| real(kind=wp) | intent(out) | | | gm(nv) | global means for each variable |
| real(kind=wp) | intent(out) | | | cm(nv,nc) | cluster centroids |
| integer(kind=i4) | intent(out) | | | cl(nd) | cluster assignments for each data point |
| integer(kind=i4) | intent(out) | | | cc(nc) | cluster sizes |
| real(kind=wp) | intent(out) | | | cov(nv,nv) | covariance matrix |
| real(kind=wp) | intent(out) | | | sigma(nv) | standard deviation per variable |
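
The argument shapes listed above could be declared on the caller's side roughly as in the sketch below. The wrapper's own name is not given in this section, so `s_nlp_hclust` in the commented-out call is a hypothetical placeholder; the mapping of `wp` and `i4` to `real64` and `int32` is likewise an assumption.

```fortran
program hclust_call_sketch
  ! Assumption: wp and i4 correspond to real64 and int32
  use, intrinsic :: iso_fortran_env, only: wp => real64, i4 => int32
  implicit none
  integer(i4), parameter :: nd = 6, nv = 2, nc = 2

  ! Input: observations in rows, variables in columns
  real(wp)    :: x(nd,nv)
  ! Outputs, dimensioned as in the argument table
  real(wp)    :: gm(nv), cm(nv,nc), cov(nv,nv), sigma(nv)
  integer(i4) :: cl(nd), cc(nc)

  ! Fill the input with placeholder data
  call random_number(x)

  ! Documented constraint on the target number of clusters
  if (nc < 1 .or. nc > nd) error stop 'nc must satisfy 1 <= nc <= nd'

  ! Hypothetical call; the procedure name is a placeholder, the
  ! argument order follows the table above:
  ! call s_nlp_hclust(x, nd, nv, nc, gm, cm, cl, cc, cov, sigma)

  print '(a,2i4)', 'shape of cm: ', shape(cm)
end program hclust_call_sketch
```

Note that `cm` is dimensioned `(nv,nc)`, so each centroid occupies a column, whereas observations in `x` occupy rows.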