TL;DR: We propose a self-supervised framework for versatile video chroma-lux editing that eliminates the need for paired data and expensive training, achieving plausible edits across a wide range of applications.
Abstract
Video chroma-lux editing, which aims to modify illumination and color while preserving structural and temporal fidelity, remains a significant challenge. Existing methods typically rely on expensive supervised training with synthetic paired data. This paper proposes VibeFlow, a novel self-supervised framework that unleashes the intrinsic physical understanding of pre-trained video generation models. Instead of learning color and light transitions from scratch, we introduce a disentangled data perturbation pipeline that compels the model to adaptively recombine structure from source videos with color-illumination cues from reference images, enabling robust disentanglement in a self-supervised manner. Furthermore, to rectify the discretization errors inherent in flow-based models, we introduce Residual Velocity Fields together with a Structural Distortion Consistency Regularization, ensuring rigorous structural preservation and temporal coherence. Our framework eliminates the need for costly training resources and generalizes in a zero-shot manner to diverse applications, including video relighting, recoloring, low-light enhancement, day-night translation, and object-specific color editing. Extensive experiments demonstrate that VibeFlow achieves impressive visual quality with significantly reduced computational overhead.
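To make the Residual Velocity Fields idea concrete, the sketch below shows how a learned residual term can correct the discretization error of Euler sampling in a flow-based model. This is a minimal illustrative assumption, not the VibeFlow implementation: `base_velocity` stands in for a frozen pre-trained video flow model, and `ResidualVelocity` is a hypothetical lightweight corrector whose architecture and training are not specified in this abstract.

```python
import torch
import torch.nn as nn

# Minimal sketch of residual velocity correction in a flow-based sampler.
# ASSUMPTIONS: `base_velocity` is a toy stand-in for a frozen pre-trained
# flow model; `ResidualVelocity` is a hypothetical lightweight network,
# not the actual VibeFlow architecture.

class ResidualVelocity(nn.Module):
    """Hypothetical lightweight network predicting a velocity correction."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Broadcast the scalar timestep as an extra input channel.
        t_map = t.view(-1, 1, 1, 1).expand(x.shape[0], 1, *x.shape[2:])
        return self.net(torch.cat([x, t_map], dim=1))

def base_velocity(x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Placeholder for a pre-trained flow model's velocity field."""
    return -x  # toy dynamics, for illustration only

@torch.no_grad()
def euler_sample(x: torch.Tensor, residual: ResidualVelocity, steps: int = 8):
    """Euler integration with a learned residual correction at each step.

    Plain Euler steps accumulate discretization error; the residual term
    would be trained (elsewhere) to absorb that error so the discrete
    trajectory stays close to the continuous ODE solution.
    """
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((x.shape[0],), i * dt)
        v = base_velocity(x, t) + residual(x, t)  # corrected velocity
        x = x + dt * v
    return x

if __name__ == "__main__":
    frame = torch.randn(1, 3, 64, 64)     # one latent video frame
    corrector = ResidualVelocity(channels=3)
    out = euler_sample(frame, corrector)
    print(out.shape)                       # torch.Size([1, 3, 64, 64])
```

In this reading, structural preservation comes from keeping the base velocity field frozen while only the small residual term adapts, which is consistent with the abstract's claim of low computational overhead.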