Abstract
Regression methods for common data types such as measured, count and categorical variables are well understood, but statisticians increasingly need ways to model relationships between variable types such as shapes, curves, trees, correlation matrices and images that do not fit into the standard framework.
Data types that lie in metric spaces but not in vector spaces are difficult to use within the usual regression setting, whether as the response, as a predictor, or both. We represent the information in these variables using distance matrices, which requires only the specification of a distance function. A low-dimensional representation of such distance matrices can be obtained using methods such as multidimensional scaling. Once these variables have been represented as scores, an internal model linking the predictors and the response can be developed using standard methods. We call the transformation from a new observation to a score scoring, while backscoring is a method to represent a score as an observation in the data space. Both methods are essential for prediction and explanation. We illustrate the methodology for shape data, unregistered curve data and correlation matrices using motion capture data from an experiment to study the motion of children with cleft lip.
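The pipeline described in the abstract (distance matrix, low-dimensional scores, scoring of a new observation, backscoring) can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes classical multidimensional scaling as the embedding and a Gower-style out-of-sample formula for scoring. For backscoring it uses a crude nearest-neighbour lookup as a stand-in. The function names `classical_mds`, `score` and `backscore_nearest` are hypothetical, and the toy data are Euclidean points chosen only so that the result is easy to check.

```python
import numpy as np


def classical_mds(D, k):
    """Embed an n x n distance matrix D into k dimensions (classical MDS).

    Assumes the k largest eigenvalues of the centred Gram matrix are positive.
    Returns the n x k scores, the k eigenvalues, and the diagonal of the
    Gram matrix (needed later for out-of-sample scoring).
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred Gram matrix
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:k]        # indices of the k largest
    lam, V = evals[idx], evecs[:, idx]
    X = V * np.sqrt(lam)                     # scale eigenvectors to get scores
    return X, lam, np.diag(B)


def score(d_new, X, lam, b):
    """Score a new observation from its distances d_new to the n training
    observations (Gower-style out-of-sample projection, an assumption here)."""
    return 0.5 * (X.T @ (b - d_new ** 2)) / lam


def backscore_nearest(s, X):
    """Stand-in backscoring: index of the training observation whose score
    is closest to s. The paper's backscoring method may differ."""
    return int(np.argmin(np.linalg.norm(X - s, axis=1)))


# Toy usage: Euclidean points stand in for objects that only have distances.
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
X, lam, b = classical_mds(D, k=2)

new_obs = rng.normal(size=2)
d_new = np.linalg.norm(P - new_obs, axis=1)  # only distances are used below
s_new = score(d_new, X, lam, b)
nearest = backscore_nearest(s_new, X)
```

Only the distance function touches the raw observations. Everything after `D` and `d_new` works on scores, which is what allows a standard regression model to be fitted between predictor scores and response scores.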
Original language | English |
---|---|
Pages (from-to) | 2342-2357 |
Journal | Journal of Applied Statistics |
Volume | 41 |
Issue number | 11 |
Early online date | 23 Apr 2014 |
DOIs | |
Publication status | Published - 2014 |
Fingerprint
Dive into the research topics of 'Regression for non-Euclidean data using distance matrices'. Together they form a unique fingerprint.
Profiles
- Julian Faraway
  - Department of Mathematical Sciences - Professor
  - EPSRC Centre for Doctoral Training in Statistical Applied Mathematics (SAMBa)
Person: Research & Teaching