One paper, Cinemo, was accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

Abstract
Diffusion models have achieved significant progress in the task of image animation due to their powerful generative capabilities. However, preserving appearance consistency with the static input image, and avoiding abrupt motion changes in the generated animation, remain challenging. In this paper, we introduce Cinemo, a novel image animation approach that aims at achieving better appearance consistency and motion smoothness. The core of Cinemo is to learn the distribution of motion residuals, rather than directly predicting frames as in existing diffusion models. During inference, we further mitigate sudden motion changes in the generated video by introducing a novel DCT-based noise refinement strategy. To counteract the over-smoothing of motion, we introduce a dynamics degree control design for better control of the magnitude of motion. Altogether, these strategies enable Cinemo to produce highly consistent, smooth, and motion-controllable results. Extensive comparisons with several state-of-the-art methods demonstrate the effectiveness and superiority of our proposed approach. Finally, we also demonstrate how our model can be applied to motion transfer or video editing of any given video. The project page is available at https://maxin-cn.github.io/cinemo_project/.
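The motion-residual idea above can be sketched in a toy form: instead of generating frames directly, a model predicts per-frame residuals that are added back onto the static input image. This is a simplified pixel-space illustration with a hypothetical function name; Cinemo itself learns the residual distribution in latent space with a diffusion model.

```python
import numpy as np

def frames_from_residuals(first_frame, residuals):
    # Toy reconstruction: each output frame is the static input image
    # plus its predicted motion residual. In Cinemo this happens on
    # latents produced by a diffusion model, not raw pixels.
    return np.stack([first_frame + r for r in residuals])
```

Because every frame is anchored to the same input image, appearance consistency comes largely for free; the model only has to capture how pixels move, not what they look like.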
Setup
Download and set up the repo:
git clone https://github.com/maxin-cn/Cinemo
cd Cinemo
conda env create -f environment.yml
conda activate cinemo
Animation
You can sample from our pre-trained Cinemo models. Weights for our pre-trained Cinemo model can be found here. The sampling script accepts various arguments for adjusting the number of sampling steps, changing the classifier-free guidance scale, etc.:
bash pipelines/animation.sh
Related model weights will be downloaded automatically, and the generated results can be found here.
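At inference time, the DCT-based noise refinement mentioned in the abstract mixes frequency bands between two signals. The sketch below shows the general technique on a single 2D array: keep the low-frequency DCT band from a reference and the high-frequency band from the sampled noise. The function name and the simple square cutoff mask are assumptions for illustration; the exact refinement used by Cinemo differs in its details.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_low_freq_mix(noise, ref, cutoff=0.25):
    # Transform both arrays to the DCT domain (orthonormal basis).
    N = dctn(noise, norm="ortho")
    R = dctn(ref, norm="ortho")
    # Hypothetical square mask selecting the low-frequency corner.
    h, w = noise.shape[-2:]
    mask = np.zeros((h, w))
    mask[: int(h * cutoff), : int(w * cutoff)] = 1.0
    # Low frequencies come from the reference, high from the noise.
    mixed = R * mask + N * (1.0 - mask)
    return idctn(mixed, norm="ortho")
```

Anchoring the low-frequency content of the initial noise to a reference is one way to suppress abrupt global changes while leaving the high-frequency detail free for the diffusion model to resample.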