Abstract
We introduce TQCompressor, a neural network compression method based on enhanced tensor decompositions. We propose a permutation-based improvement to Kronecker decomposition that reduces the loss in model expressivity typically associated with compression. Applied to GPT-2 small, this yields the TQCompressedGPT-2 model with 81 million parameters, down from 124 million. Enhanced through multi-step knowledge distillation on 3.1% of OpenWebText, TQCompressedGPT-2 outperforms DistilGPT-2 and KnGPT-2. We have made TQCompressedGPT-2 publicly available.
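The abstract's permutation-based method builds on Kronecker decomposition of weight matrices. As a rough illustration only (not the paper's actual method, which adds a learned permutation), the classical Van Loan–Pitsianis nearest-Kronecker-product approximation replaces a weight matrix W with the product A ⊗ B obtained from a rank-1 SVD of a rearranged W; the factor shapes here are illustrative assumptions:

```python
import numpy as np

def nearest_kronecker(W, shape_a, shape_b):
    """Best Kronecker-product approximation W ~= kron(A, B).

    Van Loan-Pitsianis rearrangement: view W as an m1 x n1 grid of
    m2 x n2 blocks, vectorize each block into a row, then take the
    leading singular pair of the rearranged matrix.
    """
    m1, n1 = shape_a
    m2, n2 = shape_b
    assert W.shape == (m1 * m2, n1 * n2)
    # Rearrange W so each row is one vectorized m2 x n2 block.
    R = (W.reshape(m1, m2, n1, n2)
          .transpose(0, 2, 1, 3)
          .reshape(m1 * n1, m2 * n2))
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1)
    B = np.sqrt(s[0]) * Vt[0].reshape(m2, n2)
    return A, B

# Toy example: a 2x3 and a 4x4 factor store 22 numbers
# instead of the original 8x12 = 96.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 12))
A, B = nearest_kronecker(W, (2, 3), (4, 4))
W_approx = np.kron(A, B)
```

The parameter saving is the point: `A.size + B.size` is far smaller than `W.size`, and when W is exactly a Kronecker product the reconstruction is exact.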
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval, MIPR 2024 |
Place of Publication | USA |
Publisher | IEEE |
Pages | 503-506 |
Number of pages | 4 |
ISBN (Electronic) | 9798350351422 |
ISBN (Print) | 9798350351439 |
DOIs | |
Publication status | Published - 15 Oct 2024 |
Event | 7th IEEE International Conference on Multimedia Information Processing and Retrieval, MIPR 2024 - San Jose, United States Duration: 7 Aug 2024 → 9 Aug 2024 |
Conference
Conference | 7th IEEE International Conference on Multimedia Information Processing and Retrieval, MIPR 2024 |
---|---|
Country/Territory | United States |
City | San Jose |
Period | 7/08/24 → 9/08/24 |
Keywords
- GPT-2
- Knowledge distillation
- Kronecker decomposition
- Neural network compression
- Tensor decomposition
ASJC Scopus subject areas
- Artificial Intelligence
- Computer Science Applications
- Computer Vision and Pattern Recognition
- Information Systems
- Media Technology