Abstract
We propose UniColor, the first unified framework to support colorization with multiple modalities, including both unconditional and conditional ones such as stroke, exemplar, text, and even a mix of them. Rather than learning a separate model for each type of condition, we introduce a two-stage colorization framework that incorporates various conditions into a single model. In the first stage, multi-modal conditions are converted into a common representation of hint points. In particular, we propose a novel CLIP-based method to convert text into hint points. In the second stage, we propose a Transformer-based network composed of Chroma-VQGAN and Hybrid-Transformer that generates diverse and high-quality colorization results conditioned on the hint points. Both qualitative and quantitative comparisons demonstrate that our method outperforms state-of-the-art methods in every control modality and further enables multi-modal colorization that was not feasible before. Moreover, we design an interactive interface that shows the effectiveness of our unified framework in practical use, including automatic colorization, hybrid-control colorization, local recolorization, and iterative color editing. Our code and models are available at https://luckyhzt.github.io/unicolor.
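To make the first stage more concrete, the sketch below illustrates one way a text condition could be turned into a hint color with CLIP: each candidate color is painted onto a grayscale patch, and the candidate whose CLIP image embedding best matches the text prompt is kept as the hint. This is a minimal sketch under stated assumptions, not the paper's implementation; it assumes the `clip` package (https://github.com/openai/CLIP) and PyTorch, and `apply_hint` is a hypothetical placeholder for the actual patch-colorization step.

```python
# Minimal sketch of CLIP-guided text-to-hint-point conversion (Stage 1).
# Assumptions: openai's `clip` package and PyTorch are installed;
# `apply_hint` is a hypothetical stand-in for painting a candidate
# color onto a grayscale patch (the paper's actual procedure differs).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def apply_hint(gray_patch: Image.Image, color: tuple) -> Image.Image:
    """Hypothetical helper: fill the patch with a flat candidate color."""
    return Image.new("RGB", gray_patch.size, color)

def text_to_hint(gray_patch: Image.Image, prompt: str, palette: list) -> tuple:
    """Score each candidate color for one patch by CLIP image-text similarity
    and return the best-matching color together with its score."""
    text = clip.tokenize([prompt]).to(device)
    candidates = torch.stack(
        [preprocess(apply_hint(gray_patch, c)) for c in palette]
    ).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(candidates)
        txt_feat = model.encode_text(text)
        # Normalize so the dot product is cosine similarity.
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        sims = (img_feat @ txt_feat.T).squeeze(1)
    best = sims.argmax().item()
    return palette[best], sims[best].item()
```

In a full pipeline, this scoring would be repeated over candidate patch locations, and the highest-scoring (location, color) pairs would become the hint points consumed by the second stage.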
| Original language | English |
|---|---|
| Article number | 205 |
| Journal | ACM Transactions on Graphics |
| Volume | 41 |
| Issue number | 6 |
| Early online date | 30 Nov 2022 |
| DOIs | |
| Publication status | Published - 31 Dec 2022 |
Keywords
- color editing
- colorization
- multi-modal controls
- transformer
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design