Zero-shot CLIP Class Forgetting via Text-image Space Adaptation

Research output: Contribution to journal › Article › peer-review


Abstract

Efficient class forgetting has attracted significant interest due to the high computational cost of retraining models from scratch whenever classes need to be forgotten. This need arises from data privacy regulations, the necessity to remove outdated information, and the potential to enhance model robustness and security. In this paper we address class forgetting in the vision-language CLIP model. Modern class forgetting methods for CLIP have demonstrated that zero-shot forgetting is achievable by generating synthetic data and fine-tuning both the visual and textual encoders with a regularization loss. Our approach shows that class forgetting in CLIP can be accomplished in a zero-shot manner without any visual data by adapting the shared vision-text space of CLIP, thereby making the class forgetting process more efficient. Our method delivers superior results, demonstrating strong performance and complete class removal, regardless of the visual encoder used in CLIP. Furthermore, we explore what exactly is being targeted by the class forgetting algorithm, discovering some interesting properties of CLIP features. Full implementation can be found here.

Original language: English
Number of pages: 24
Journal: Transactions on Machine Learning Research
Volume: 2025-January
Publication status: Published - 20 Jan 2025

Bibliographical note

Publisher Copyright:
© 2025, Transactions on Machine Learning Research. All rights reserved.

Funding

We’d like to gratefully acknowledge Microsoft’s compute support through Microsoft’s Accelerating Foundation Models Research grant and the support from the University of Bath for the studentship.

Funders: Microsoft; University of Bath

Keywords

• Class forgetting
• Vision-language models

ASJC Scopus subject areas

• Artificial Intelligence
• Computer Vision and Pattern Recognition
