Abstract

Dimensionality reduction is a critical step for the efficacy and efficiency of clustering analysis. Despite the many available methods, biomechanists have often defaulted to Principal Component Analysis (PCA). We evaluated two PCA-based and one autoencoder-based dimensionality reduction methods for their data compression and reconstruction capability, assessed their effect on the output of clustering runners based on kinematics, and discussed their implications for the biomechanical assessment of running technique. Eighty-four participants completed a 4-minute run at 12 km/h while trunk and lower-limb kinematics were collected. Data reconstruction quality was assessed for Direct PCA (PCA applied directly to the original variables) and Fourier PCA (modelling time series as Fourier series and then applying PCA), each using popular variance-explained criteria, and for a feedforward autoencoder (AE). Agglomerative hierarchical clustering was then applied and the agreement between the resulting partitions was assessed. Meaningful errors in the reconstructed signals were found when applying popular variance-explained criteria, suggesting that reconstruction error should be assessed to make a more informed decision about how many components to retain for further analysis. Direct PCA, Fourier PCA and AE yielded different clusters, warranting caution when comparing outcomes from studies that use different dimensionality reduction techniques: each method may be sensitive to different data features. Direct PCA retaining 99% of the original variance emerged as the best compromise between data compression, reconstruction quality and cluster separability in our dataset. We encourage biomechanists to experiment with diverse dimensionality reduction methods to optimise clustering outcomes and enhance the real-world applicability of their findings.
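The point about checking reconstruction error alongside variance explained can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (84 hypothetical waveforms of 101 time points), only the Direct PCA case is shown, and the names `pca_reconstruct` and `rmse` are assumptions introduced here for illustration.

```python
import numpy as np

def pca_reconstruct(X, var_threshold):
    """Fit PCA by SVD, keep the fewest components whose cumulative
    explained variance reaches var_threshold, and reconstruct X."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)                 # variance ratio per component
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(s))                              # guard against rounding at 1.0
    scores = Xc @ Vt[:k].T                          # project onto k components
    X_hat = scores @ Vt[:k] + mean                  # map back to the original space
    return k, X_hat

def rmse(X, X_hat):
    """Root-mean-square reconstruction error over all entries."""
    return float(np.sqrt(np.mean((X - X_hat) ** 2)))

# Synthetic stand-in for time-normalised joint-angle waveforms (assumption):
# 84 rows (runners) x 101 columns (time points).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
amp = rng.normal(30.0, 5.0, size=(84, 1))
phase = rng.normal(0.0, 0.2, size=(84, 1))
second = rng.normal(5.0, 2.0, size=(84, 1))
X = (amp * np.sin(2 * np.pi * t + phase)
     + second * np.sin(4 * np.pi * t)
     + rng.normal(0.0, 0.5, size=(84, 101)))

# Compare common variance-explained thresholds with the error they leave behind.
results = []
for threshold in (0.90, 0.95, 0.99):
    k, X_hat = pca_reconstruct(X, threshold)
    results.append((threshold, k, rmse(X, X_hat)))

for threshold, k, err in results:
    print(f"threshold={threshold:.2f}  components={k}  RMSE={err:.3f}")
```

Reporting the error in the units of the original signal (degrees, in this hypothetical case) makes it easier to judge whether a given threshold discards biomechanically meaningful detail.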
Original language: English
Article number: 112433
Journal: Journal of Biomechanics
Volume: 177
Early online date: 15 Nov 2024
Publication status: Published - 31 Dec 2024

Acknowledgements

This project was funded by the University of Bath and NURVV, Ltd. Neural network development made use of HEX, the GPU Cloud in the Department of Computer Science at the University of Bath.

Funding

Funders (no funder numbers given):
University of Bath
NURVV, Ltd.

Article title: Data should be made as simple as possible but not simpler: the method chosen for dimensionality reduction and its parameters can affect the clustering of runners based on their kinematics
