TY - GEN
T1 - Efficient model selection for regularized classification by exploiting unlabeled data
AU - Balikas, Georgios
AU - Partalas, Ioannis
AU - Gaussier, Eric
AU - Babbar, Rohit
AU - Amini, Massih Reza
N1 - Funding Information:
This work is partially supported by the CIFRE N 28/2015 and by the LabEx PERSYVAL Lab ANR-11-LABX-0025.
Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015/11/22
Y1 - 2015/11/22
N2 - Hyper-parameter tuning is a resource-intensive task when optimizing classification models. The commonly used k-fold cross-validation can become intractable in large-scale settings when a classifier has to learn billions of parameters. At the same time, in real-world settings, one often encounters multi-class classification scenarios with only a few labeled examples; model selection approaches often offer little improvement in such cases, and the default values of learners are used. We propose bounds for classification on accuracy and macro measures (precision, recall, F1) that motivate efficient schemes for model selection and can benefit from the existence of unlabeled data. We demonstrate the advantages of those schemes by comparing them with k-fold cross-validation and hold-out estimation in the setting of large-scale classification.
AB - Hyper-parameter tuning is a resource-intensive task when optimizing classification models. The commonly used k-fold cross-validation can become intractable in large-scale settings when a classifier has to learn billions of parameters. At the same time, in real-world settings, one often encounters multi-class classification scenarios with only a few labeled examples; model selection approaches often offer little improvement in such cases, and the default values of learners are used. We propose bounds for classification on accuracy and macro measures (precision, recall, F1) that motivate efficient schemes for model selection and can benefit from the existence of unlabeled data. We demonstrate the advantages of those schemes by comparing them with k-fold cross-validation and hold-out estimation in the setting of large-scale classification.
UR - http://www.scopus.com/inward/record.url?scp=84952045477&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-24465-5_3
DO - 10.1007/978-3-319-24465-5_3
M3 - Chapter in a published conference proceeding
AN - SCOPUS:84952045477
SN - 9783319244648
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 25
EP - 36
BT - Advances in Intelligent Data Analysis XIV - 14th International Symposium, IDA 2015, Proceedings
A2 - De Bie, Tijl
A2 - van Leeuwen, Matthijs
A2 - Fromont, Elisa
PB - Springer Verlag
T2 - 14th International Symposium on Intelligent Data Analysis, IDA 2015
Y2 - 22 October 2015 through 24 October 2015
ER -