The procedure implements the K-means clustering algorithm using the Mahalanobis
distance as the similarity measure. It accepts initial centroids (`cm_in`), refines
them iteratively, and returns the final centroids (`cm`).

The input matrix (`x`) holds observations in rows (`nd`) and variables in columns (`nv`).

The number of clusters (`nc`) must be at least 1 and not greater than the number of
data points. The procedure assigns each observation to the nearest centroid using the
Mahalanobis distance, recomputes centroids from cluster memberships, and iterates until
convergence or until the iteration limit is reached. Final centroids are sorted by the
first variable, and assignments are updated accordingly.

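
The loop described above can be sketched as follows. This is an illustration in plain Python, not the procedure's Fortran source. The names `mahalanobis_sq` and `kmeans_mahalanobis` are hypothetical, the inverse covariance matrix is assumed to be supplied precomputed, and both the convergence test (assignments stop changing) and the handling of empty clusters (the centroid is left unchanged) are assumptions rather than documented behaviour.

```python
def mahalanobis_sq(u, v, cov_inv):
    """Squared Mahalanobis distance (u - v)^T cov_inv (u - v)."""
    d = [a - b for a, b in zip(u, v)]
    n = len(d)
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(n) for j in range(n))


def kmeans_mahalanobis(x, centroids, cov_inv, max_iter=100):
    """Refine centroids; return (centroids, labels, sizes). Needs max_iter >= 1."""
    nc = len(centroids)
    centroids = [list(c) for c in centroids]  # work on a copy of the input
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the Mahalanobis distance.
        new_labels = [
            min(range(nc), key=lambda k: mahalanobis_sq(row, centroids[k], cov_inv))
            for row in x
        ]
        if new_labels == labels:  # assumed convergence test: no label changed
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(nc):
            members = [row for row, lab in zip(x, labels) if lab == k]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    # Sort centroids by their first variable and relabel the assignments.
    order = sorted(range(nc), key=lambda k: centroids[k][0])
    rank = {old: new for new, old in enumerate(order)}
    centroids = [centroids[k] for k in order]
    labels = [rank[lab] for lab in labels]
    sizes = [labels.count(k) for k in range(nc)]
    return centroids, labels, sizes


# Usage: two well-separated groups, initial centroids given in "wrong" order.
data = [[5.0, 5.0], [0.0, 0.0], [5.2, 4.8], [0.2, -0.1], [4.9, 5.1]]
start = [[4.0, 4.0], [1.0, 1.0]]
cov_inv = [[2.0, 0.5], [0.5, 1.0]]  # any symmetric positive-definite matrix
final_centroids, labels, sizes = kmeans_mahalanobis(data, start, cov_inv)
print(labels, sizes)  # [1, 0, 1, 0, 1] [2, 3]
```

After sorting, cluster 0 is the one whose centroid has the smallest first variable, so the labels no longer depend on the order in which the initial centroids were supplied.
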
If the covariance matrix (`cov_in`) is passed, it will be used to calculate the
Mahalanobis distance. If it is not passed, the variables are standardised before
computing the covariance matrix on the transformed data.

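
For reference, the textbook definition of the squared Mahalanobis distance between an observation and a centroid is given below. The symbols are notation introduced here, not taken from the source: $\mathbf{x}_i$ is one row of the data, $\mathbf{c}_k$ is a centroid, and $\mathbf{S}$ is the covariance matrix used for the distance. The second line shows the conventional z-score form of standardisation; that the procedure standardises in exactly this way, using the global mean and per-variable standard deviation, is an assumption.

$$
d^2(\mathbf{x}_i, \mathbf{c}_k) = (\mathbf{x}_i - \mathbf{c}_k)^{\mathsf{T}} \, \mathbf{S}^{-1} \, (\mathbf{x}_i - \mathbf{c}_k)
$$

$$
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}
$$

When $\mathbf{S}$ is the identity matrix, the first expression reduces to the squared Euclidean distance.
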
The procedure returns the global mean (`gm`), the cluster centroids (`cm`), the
membership assignments (`cl`), the cluster sizes (`cc`), the covariance matrix (`cov`,
either `cov_in` or internally calculated), and the standard deviations (`sigma`) used
in the distance calculations.

Impure wrapper procedure for `s_nlp_kmeans_core`.

| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| `real(kind=wp)` | `intent(in)` | | | `x(nd,nv)` | raw data (samples, variables) |
| `integer(kind=i4)` | `intent(in)` | | | `nd` | number of data points |
| `integer(kind=i4)` | `intent(in)` | | | `nv` | number of variables |
| `integer(kind=i4)` | `intent(in)` | | | `nc` | number of clusters |
| `real(kind=wp)` | `intent(in)` | | | `cm_in(nv,nc)` | initial centroids (raw, not standardised) |
| `real(kind=wp)` | `intent(out)` | | | `gm(nv)` | global means |
| `real(kind=wp)` | `intent(out)` | | | `cm(nv,nc)` | centroids (refined, standardised) |
| `integer(kind=i4)` | `intent(out)` | | | `cl(nd)` | cluster assignments |
| `integer(kind=i4)` | `intent(out)` | | | `cc(nc)` | cluster sizes |
| `real(kind=wp)` | `intent(out)` | | | `cov(nv,nv)` | covariance matrix |
| `real(kind=wp)` | `intent(out)` | | | `sigma(nv)` | standard deviations per variable |
| `real(kind=wp)` | `intent(in)` | optional | | `cov_in(nv,nv)` | optional covariance matrix |