Abstract
Efficient class forgetting has attracted significant interest because retraining a model from scratch whenever classes must be forgotten is computationally expensive. The need to forget arises from data-privacy regulations, the need to remove outdated information, and the potential to enhance model robustness and security. In this paper we address class forgetting in the vision-language model CLIP. Recent class forgetting methods for CLIP have demonstrated that zero-shot forgetting is achievable by generating synthetic data and fine-tuning both the visual and textual encoders with a regularization loss. We show that class forgetting in CLIP can instead be accomplished in a zero-shot manner, without any visual data, by adapting the shared vision-text space of CLIP, thereby making the class forgetting process more efficient. Our method delivers superior results, demonstrating strong retained performance and complete class removal regardless of the visual encoder used in CLIP. Furthermore, we explore what exactly the class forgetting algorithm targets, uncovering some interesting properties of CLIP features. Full implementation can be found here.
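The abstract's central claim is that forgetting can operate purely in CLIP's shared vision-text embedding space, using only text. The toy sketch below illustrates one way such an idea *could* look: a hypothetical projection that removes the forgotten class's text-embedding direction from the shared space. This is an assumption-laden illustration, not the paper's actual algorithm; all names and the projection scheme are invented for demonstration.

```python
import numpy as np

def forget_class_projection(text_embeddings, forget_idx):
    """Toy illustration (not the paper's method): build an orthogonal
    projector that removes the forgotten class's text-embedding
    direction from the shared embedding space."""
    d = text_embeddings.shape[1]
    u = text_embeddings[forget_idx]
    u = u / np.linalg.norm(u)        # unit direction of the class to forget
    P = np.eye(d) - np.outer(u, u)   # projector onto the orthogonal complement of u
    return P

# Stand-in "text encoder" output: random unit embeddings for 3 classes
# in an 8-dimensional shared space (real CLIP spaces are 512+ dims).
rng = np.random.default_rng(0)
T = rng.normal(size=(3, 8))
T /= np.linalg.norm(T, axis=1, keepdims=True)

P = forget_class_projection(T, forget_idx=0)

# Any embedding (image or text) mapped through P has zero similarity
# with the forgotten class's text embedding, so that class can no
# longer win a zero-shot classification:
img = rng.normal(size=8)
print(abs((P @ img) @ T[0]))  # ~0: the forgotten class is unreachable
```

Note that no image data is needed to construct `P`; only the text embedding of the class to forget, which matches the "without any visual data" setting the abstract describes.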
| Original language | English |
|---|---|
| Number of pages | 24 |
| Journal | Transactions on Machine Learning Research |
| Volume | 2025-January |
| Publication status | Published - 20 Jan 2025 |
Bibliographical note
Publisher Copyright: © 2025, Transactions on Machine Learning Research. All rights reserved.
Funding
We'd like to gratefully acknowledge Microsoft's compute support through Microsoft's Accelerating Foundation Models Research grant and the support from the University of Bath for the studentship.
| Funders | Funder number |
|---|---|
| Microsoft | |
| University of Bath | |
Keywords
- Class forgetting
- Vision-language models
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Vision and Pattern Recognition