Abstract
Regression methods for common data types such as measured, count and categorical variables are well understood, but increasingly statisticians need ways to model relationships between variable types such as shapes, curves, trees, correlation matrices and images that do not fit into the standard framework.
Data types that lie in metric spaces but not in vector spaces are difficult to use within the usual regression setting, whether as the response or as a predictor. We represent the information in these variables using distance matrices, which requires only the specification of a distance function. A low-dimensional representation of such distance matrices can be obtained using methods such as multidimensional scaling. Once these variables have been represented as scores, an internal model linking the predictors and the response can be developed using standard methods. We call the transformation from a new observation to a score scoring, while backscoring is a method to represent a score as an observation in the data space. Both methods are essential for prediction and explanation. We illustrate the methodology for shape data, unregistered curve data and correlation matrices using motion capture data from an experiment to study the motion of children with cleft lip.
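The pipeline described in the abstract (distance function, distance matrix, low-dimensional scores, scoring of a new observation) can be illustrated with a minimal sketch. It assumes classical (Torgerson) multidimensional scaling and a standard Gower-style out-of-sample projection for scoring; the abstract does not say which variants the paper uses, so treat both as assumptions. The function names `distance_matrix`, `classical_mds` and `score_new` are hypothetical, and plain 2-D points with Euclidean distance stand in for shapes or curves. Backscoring is not sketched here.

```python
import numpy as np


def distance_matrix(items, dist):
    """Pairwise distances under a user-supplied distance function."""
    n = len(items)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(items[i], items[j])
    return D


def classical_mds(D, k):
    """Classical MDS (assumed variant): n x k scores plus pieces needed for scoring."""
    n = D.shape[0]
    D2 = D ** 2
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ D2 @ J                 # double-centered inner products
    evals, evecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]     # indices of the k largest
    lam, V = evals[top], evecs[:, top]
    if np.any(lam <= 0):
        raise ValueError("k exceeds the number of positive eigenvalues")
    return V * np.sqrt(lam), lam, V, D2


def score_new(d_new, lam, V, D2):
    """Score a new observation from its distances to the n training items."""
    d2 = np.asarray(d_new, dtype=float) ** 2
    # Center the new squared distances the same way the training matrix was centered.
    b = -0.5 * (d2 - d2.mean() - D2.mean(axis=1) + D2.mean())
    return (V.T @ b) / np.sqrt(lam)


# Toy data: 2-D points standing in for non-vector objects.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0), (2.0, 3.0)]
euclid = lambda a, b: float(np.hypot(a[0] - b[0], a[1] - b[1]))

D = distance_matrix(pts, euclid)
scores, lam, V, D2 = classical_mds(D, k=2)

# Scoring a training item from its own distance row reproduces its score.
new_score = score_new(D[0], lam, V, D2)
```

Only `dist` needs to change to handle a different data type; the scores could then be passed to any standard regression routine as the response or as predictors.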
Original language  English 

Pages (from-to)  2342-2357 
Journal  Journal of Applied Statistics 
Volume  41 
Issue number  11 
Early online date  23 Apr 2014 
DOIs  
Publication status  Published - 2014 
Fingerprint Dive into the research topics of 'Regression for non-Euclidean data using distance matrices'. Together they form a unique fingerprint.
Profiles

Julian Faraway
 Department of Mathematical Sciences  Professor
 EPSRC Centre for Doctoral Training in Statistical Applied Mathematics (SAMBa)
Person: Research & Teaching