Regression for non-Euclidean data using distance matrices

Research output: Contribution to journal › Article › peer-review

8 Citations (SciVal)
262 Downloads (Pure)


Regression methods for common data types such as measured, count and categorical variables are well understood, but increasingly statisticians need ways to model relationships between variable types such as shapes, curves, trees, correlation matrices and images that do not fit into the standard framework. Data types that lie in metric spaces but not in vector spaces are difficult to use within the usual regression setting, whether as the response or as a predictor. We represent the information in these variables using distance matrices, which requires only the specification of a distance function. A low-dimensional representation of such distance matrices can be obtained using methods such as multidimensional scaling. Once these variables have been represented as scores, an internal model linking the predictors and the response can be developed using standard methods. We call the transformation from a new observation to a score scoring, while backscoring is a method to represent a score as an observation in the data space. Both methods are essential for prediction and explanation. We illustrate the methodology for shape data, unregistered curve data and correlation matrices using motion capture data from an experiment to study the motion of children with cleft lip.
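The pipeline described in the abstract (distance matrix, low-dimensional scores, internal model, scoring of a new observation) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes classical (Torgerson) multidimensional scaling for the representation, Gower's add-a-point formula for scoring, and an ordinary least-squares internal model. The function names `classical_mds` and `score_new` and the simulated data are hypothetical, chosen only for illustration. Backscoring is not shown, since it depends on the data type.

```python
import numpy as np

def classical_mds(D, k):
    """Classical MDS of an n x n distance matrix D (assumed technique).

    Returns the n x k scores, the k leading eigenvalues, and the diagonal
    of the doubly centred matrix B, which is needed later for scoring.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # doubly centred squared distances
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:k]        # indices of the k largest
    lam = evals[idx]                         # assumed strictly positive
    X = evecs[:, idx] * np.sqrt(lam)         # scores
    return X, lam, np.diag(B)

def score_new(d_new, X, lam, b):
    """Score a new observation from its distances to the training
    observations (Gower's add-a-point formula, an assumption here)."""
    return 0.5 * (X.T @ (b - d_new ** 2)) / lam

# Simulated stand-in data: points in the plane, so the distance matrix is
# exactly Euclidean. Real applications would use a problem-specific distance.
rng = np.random.default_rng(0)
P = rng.normal(size=(30, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
y = 1.0 + 2.0 * P[:, 0] - 3.0 * P[:, 1]      # response for the internal model

# Represent the predictor by its scores and fit a linear internal model.
X, lam, b = classical_mds(D, k=2)
A = np.column_stack([np.ones(len(y)), X])
beta = np.linalg.lstsq(A, y, rcond=None)[0]

# Scoring: only distances from the new observation to the training set are used.
p_new = np.array([0.3, -0.7])
d_new = np.linalg.norm(P - p_new, axis=1)
s_new = score_new(d_new, X, lam, b)
y_pred = beta[0] + s_new @ beta[1:]
```

In this toy setting the response is linear in the original coordinates and the scores are a rotation of the centred coordinates, so the prediction for the new observation agrees with the true value up to rounding. With genuinely non-Euclidean distances some eigenvalues of `B` may be negative, and the choice of `k` then needs more care than this sketch takes.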
Original language: English
Pages (from-to): 2342-2357
Journal: Journal of Applied Statistics
Issue number: 11
Early online date: 23 Apr 2014
Publication status: Published - 2014

