COCO-LC: Colorfulness Controllable Language-based Colorization

Wangxuan Institute of Computer Technology, Peking University    

Figure 1. We propose COCO-LC, a novel colorfulness controllable language-based colorization framework. (a) COCO-LC generates realistic and semantically consistent colorization results. (b) COCO-LC allows flexible user control over (top) color types and (bottom) color styles.


Abstract

Language-based image colorization aims to convert grayscale images into plausible and visually pleasing color images under language guidance, with wide applications in historical photo restoration and the film industry. Existing methods mainly leverage large language models and diffusion models to incorporate language guidance into the colorization process. However, it remains challenging to build an accurate correspondence between the grayscale image and the semantic instructions, which leads to mismatched, overflowing, and under-saturated colors. In this paper, we introduce a novel coarse-to-fine framework, COlorfulness COntrollable Language-based Colorization (COCO-LC), which effectively reinforces the image-text correspondence with a coarsely colorized result. In addition, a multi-level condition that leverages both low-level and high-level cues of the grayscale image is introduced to realize accurate semantic-aware colorization without color overflow. Furthermore, we condition COCO-LC on a scale factor that determines the colorfulness of the output, flexibly meeting different user needs. Extensive experiments validate the superiority of COCO-LC over state-of-the-art image colorization methods in terms of accuracy, realism, and controllability.
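The colorfulness control described above can be illustrated with a simple post-hoc analogue: scaling the chroma (a/b) channels of a colorized result in CIELAB space. This is only a minimal sketch of the control concept, not the COCO-LC model or its scale-factor conditioning; the `scale` parameter, the file names, and the use of scikit-image are assumptions for illustration.

```python
# Minimal sketch of chroma scaling in CIELAB space -- an illustrative analogue
# of a colorfulness scale factor, NOT the COCO-LC model itself.
import numpy as np
from skimage import color, io


def adjust_colorfulness(rgb: np.ndarray, scale: float) -> np.ndarray:
    """Scale the a/b chroma channels of an RGB image (floats in [0, 1]) by `scale`.

    scale = 0.0 yields a grayscale image, 1.0 keeps the colors unchanged,
    and values > 1.0 produce a more vivid result.
    """
    lab = color.rgb2lab(rgb)   # L in [0, 100], a/b roughly in [-128, 127]
    lab[..., 1:] *= scale      # scale chroma, keep lightness intact
    return np.clip(color.lab2rgb(lab), 0.0, 1.0)


if __name__ == "__main__":
    # "colorized_result.png" is a hypothetical colorization output.
    img = io.imread("colorized_result.png")[..., :3] / 255.0
    vivid = adjust_colorfulness(img, scale=1.5)
    io.imsave("colorized_vivid.png", (vivid * 255).astype(np.uint8))
```

Keeping the lightness channel fixed while scaling only the chroma channels changes the perceived colorfulness without altering image structure, which is the kind of trade-off the scale factor exposes to users.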

More Visual Results

You can zoom in for better visualization!

BibTeX

Please consider citing COCO-LC if it helps your research.
@inproceedings{li2024coco,
  title={COCO-LC: Colorfulness Controllable Language-based Colorization},
  author={Li, Yifan and Bai, Yuhang and Yang, Shuai and Liu, Jiaying},
  booktitle={ACM MM},
  year={2024}
}
