Language-based image colorization aims to convert grayscale images into plausible and visually pleasing color images under language guidance, with wide applications in historical photo restoration and the film industry. Existing methods mainly leverage large language models and diffusion models to incorporate language guidance into the colorization process. However, building accurate correspondence between the grayscale image and the semantic instructions remains a great challenge, leading to mismatched, overflowing, and under-saturated colors. In this paper, we introduce a novel coarse-to-fine framework, COlorfulness COntrollable Language-based Colorization (COCO-LC), which effectively reinforces the image-text correspondence with a coarsely colorized result. In addition, a multi-level condition that leverages both low-level and high-level cues of the grayscale image is introduced to realize accurate semantic-aware colorization without color overflow. Furthermore, we condition COCO-LC on a scale factor that determines the colorfulness of the output, flexibly meeting the different needs of users. Extensive experiments validate the superiority of COCO-LC over state-of-the-art image colorization methods in accurate, realistic, and controllable colorization.
You can zoom in for better visualization!
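To make the "colorfulness" notion in the abstract concrete, here is a minimal sketch (not the official COCO-LC code) that computes the widely used Hasler-Suesstrunk colorfulness metric and, as a naive point of reference, blends a colorized output toward grayscale with a scale factor. COCO-LC conditions the network on the scale factor rather than post-processing the output; the function names `colorfulness` and `blend_colorfulness` are illustrative assumptions, not identifiers from this repository.

```python
import numpy as np

def colorfulness(rgb: np.ndarray) -> float:
    """Hasler-Suesstrunk colorfulness of an RGB image of shape (H, W, 3), values in [0, 255]."""
    r = rgb[..., 0].astype(np.float64)
    g = rgb[..., 1].astype(np.float64)
    b = rgb[..., 2].astype(np.float64)
    rg = r - g                      # red-green opponent axis
    yb = 0.5 * (r + g) - b          # yellow-blue opponent axis
    std_rgyb = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_rgyb = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return float(std_rgyb + 0.3 * mean_rgyb)

def blend_colorfulness(rgb: np.ndarray, scale: float) -> np.ndarray:
    """Naive post-hoc control: linearly blend the colorized image with its luminance.
    scale = 0 gives grayscale, scale = 1 keeps the full-color result.
    (Illustration only; COCO-LC conditions the network on the scale factor instead.)"""
    luma = rgb.astype(np.float64) @ np.array([0.299, 0.587, 0.114])
    gray = np.repeat(luma[..., None], 3, axis=-1)
    out = scale * rgb.astype(np.float64) + (1.0 - scale) * gray
    return np.clip(out, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)  # stand-in for a colorized output
    for s in (0.0, 0.5, 1.0):
        print(f"scale={s:.1f}  colorfulness={colorfulness(blend_colorfulness(img, s)):.2f}")
```

As expected, the reported colorfulness increases monotonically with the scale factor, which is the behavior the controllable conditioning in COCO-LC is designed to expose to users.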
@inproceedings{li2024coco,
  title={COCO-LC: Colorfulness Controllable Language-based Colorization},
  author={Li, Yifan and Bai, Yuhang and Yang, Shuai and Liu, Jiaying},
  booktitle={ACM MM},
  year={2024}
}