Multi-task Learning by Maximizing Statistical Dependence

Youssef Alami Mejjati, Darren Cosker, Kwang In Kim

Research output: Chapter or section in a book/report/conference proceeding; Chapter in a published conference proceeding

11 Citations (SciVal)


We present a new multi-task learning (MTL) approach that can be applied to multiple heterogeneous task estimators. Our motivation is that the best task estimator could change depending on the task itself. For example, we may have a deep neural network for the first task and a Gaussian process for the second task. Classical MTL approaches cannot handle this case, as they require the same model or even the same parameter types for all tasks. We tackle this by considering task-specific estimators as random variables. Then, the task relationships are discovered by measuring the statistical dependence between each pair of random variables. By doing so, our model is independent of the parametric nature of each task, and is even agnostic to the existence of such a parametric formulation. We compare our algorithm with existing MTL approaches on challenging real-world ranking and regression datasets, and show that our approach achieves comparable or better performance without knowing the parametric form.
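The abstract does not say which dependence measure is used or how it is maximized. As a rough illustration of scoring the dependence between two estimators from their outputs alone, the sketch below assumes the Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels, a common kernel-based dependence measure. The function names, the median-heuristic bandwidth, and the toy estimators are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def gaussian_gram(values, bandwidth=None):
    """Gaussian kernel Gram matrix of estimator outputs (one row per sample)."""
    x = np.asarray(values, dtype=float)
    x = x.reshape(len(x), -1)
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    if bandwidth is None:
        # Assumption: median heuristic for the kernel bandwidth.
        off_diagonal = sq_dists[~np.eye(len(x), dtype=bool)]
        median_sq = np.median(off_diagonal)
        bandwidth = np.sqrt(0.5 * median_sq) if median_sq > 0 else 1.0
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def hsic(outputs_a, outputs_b):
    """Biased empirical HSIC between two estimators evaluated on the same inputs."""
    n = len(outputs_a)
    gram_a = gaussian_gram(outputs_a)
    gram_b = gaussian_gram(outputs_b)
    centering = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(gram_a @ centering @ gram_b @ centering)) / (n - 1) ** 2


# Toy stand-ins for heterogeneous task estimators: only their outputs on
# shared inputs are used, never their parameters.
rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=200)
task_1 = np.sin(3.0 * inputs)                          # e.g. output of a neural network
task_2 = task_1 ** 2 + 0.05 * rng.normal(size=200)     # related task, nonlinear link
task_3 = rng.normal(size=200)                          # unrelated task

related_score = hsic(task_1, task_2)
unrelated_score = hsic(task_1, task_3)
```

Because the score depends only on predictions at shared inputs, the same function applies whether an estimator is a neural network, a Gaussian process, or has no parametric form at all. In this toy setup `related_score` comes out larger than `unrelated_score`.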

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 665992.
Original language: English
Title of host publication: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Number of pages: 9
ISBN (Electronic): 9781538664209
ISBN (Print): 9781538664216
Publication status: Published - 17 Dec 2018
Event: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018
Duration: 18 Jun 2018 to 22 Jun 2018

Publication series

Name: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition)
ISSN (Print): 2575-7075


Conference: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018

