TY - GEN
T1 - Continual Learning Improves Zero-Shot Action Recognition
AU - Gowda, Shreyank N.
AU - Moltisanti, Davide
AU - Sevilla-Lara, Laura
PY - 2024/12/7
Y1 - 2024/12/7
N2 - Zero-shot action recognition requires a strong ability to generalize from pre-training and seen classes to novel unseen classes. Similarly, continual learning aims to develop models that can generalize effectively and learn new tasks without forgetting the ones previously learned. The generalization goals of zero-shot and continual learning are closely aligned, however techniques from continual learning have not been applied to zero-shot action recognition. In this paper, we propose a novel method based on continual learning to address zero-shot action recognition. This model, which we call Generative Iterative Learning (GIL) uses a memory of synthesized features of past classes, and combines these synthetic features with real ones from novel classes. The memory is used to train a classification model, ensuring a balanced exposure to both old and new classes. Experiments demonstrate that GIL improves generalization in unseen classes, achieving a new state-of-the-art in zero-shot recognition across multiple benchmarks. Importantly, GIL also boosts performance in the more challenging generalized zero-shot setting, where models need to retain knowledge about classes seen before fine-tuning.
AB - Zero-shot action recognition requires a strong ability to generalize from pre-training and seen classes to novel unseen classes. Similarly, continual learning aims to develop models that can generalize effectively and learn new tasks without forgetting the ones previously learned. The generalization goals of zero-shot and continual learning are closely aligned, however techniques from continual learning have not been applied to zero-shot action recognition. In this paper, we propose a novel method based on continual learning to address zero-shot action recognition. This model, which we call Generative Iterative Learning (GIL) uses a memory of synthesized features of past classes, and combines these synthetic features with real ones from novel classes. The memory is used to train a classification model, ensuring a balanced exposure to both old and new classes. Experiments demonstrate that GIL improves generalization in unseen classes, achieving a new state-of-the-art in zero-shot recognition across multiple benchmarks. Importantly, GIL also boosts performance in the more challenging generalized zero-shot setting, where models need to retain knowledge about classes seen before fine-tuning.
UR - http://www.scopus.com/inward/record.url?scp=85213005194&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-0908-6_23
DO - 10.1007/978-981-96-0908-6_23
M3 - Chapter in a published conference proceeding
AN - SCOPUS:85213005194
SN - 9789819609079
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 403
EP - 421
BT - Computer Vision – ACCV 2024 - 17th Asian Conference on Computer Vision, Proceedings
A2 - Cho, Minsu
A2 - Laptev, Ivan
A2 - Tran, Du
A2 - Yao, Angela
A2 - Zha, Hongbin
PB - Springer, Singapore
CY - Singapore
T2 - 17th Asian Conference on Computer Vision, ACCV 2024
Y2 - 8 December 2024 through 12 December 2024
ER -