
Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Generation

Chenyu Wang, Shuo Yan, Yixuan Chen, Xianwei Wang, Yujiang Wang, Mingzhi Dong, Xiaochen Yang, Dongsheng Li, Rui Zhu, David A. Clifton, Robert P. Dick, Qin Lv, Fan Yang, Tun Lu, Ning Gu, Li Shang

Research output: Contribution to journal › Article › peer-review

6 Citations (SciVal)

Abstract

Denoising-based diffusion models have achieved impressive image synthesis; however, applying them to videos can incur unaffordable computational costs because denoising is performed per frame. In pursuit of efficient video generation, we present a Diffusion Reuse MOtion (Dr. Mo) network to accelerate the video denoising process. Our crucial observation is that the latent representations of adjacent video frames in early denoising steps exhibit high consistency with motion cues. Inspired by this discovery, we propose to accelerate the video denoising process by incorporating lightweight, learnable motion features. Specifically, Dr. Mo computes all denoising steps only for base frames. For a non-base frame, Dr. Mo propagates the pre-computed base latents of a particular step using inter-frame motion to obtain a fast estimate of the frame's coarse-grained latent representation, from which denoising continues to obtain more sensitive and fine-grained representations. On top of this, Dr. Mo employs a meta-network named Denoising Step Selector (DSS) to dynamically determine the step at which to perform motion-based propagation for each frame, ensuring the correct transformation of multi-granularity visual features. Extensive evaluations on video generation and editing tasks indicate that Dr. Mo delivers widely applicable acceleration for diffusion-based video generation while effectively retaining visual quality and style. Video generation and visualization results can be found at https://drmo-denoising-reuse.github.io.
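The reuse scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `denoise_step`, `warp`, and `generate` are hypothetical names, the denoiser is a trivial stand-in, the motion is a fixed integer shift rather than a learned motion feature, and the reuse steps are passed in by hand where the paper uses the DSS meta-network. It only shows where the savings come from: a non-base frame starts from a warped base latent at step s and runs only the remaining steps.

```python
import numpy as np

def denoise_step(latent, t):
    """Hypothetical stand-in for one denoiser call (a real model would predict noise)."""
    return latent * 0.9

def warp(latent, motion):
    """Hypothetical motion propagation: shift the latent by an integer (dy, dx) offset."""
    return np.roll(latent, shift=motion, axis=(0, 1))

def generate(base_noise, motions, reuse_steps, total_steps=10):
    """Denoise one base frame fully, then reuse its cached latents for other frames.

    motions[i]     : assumed (dy, dx) motion from the base frame to non-base frame i
    reuse_steps[i] : step s at which non-base frame i reuses the base latent
                     (chosen by hand here; a stand-in for a step selector)
    Returns the final latents and the number of denoiser calls made.
    """
    calls = 0
    latent = base_noise
    cache = [latent]                      # cache[s] = base latent after s steps
    for t in range(total_steps):          # base frame: every step is computed
        latent = denoise_step(latent, t)
        calls += 1
        cache.append(latent)
    outputs = [latent]

    for motion, s in zip(motions, reuse_steps):
        latent = warp(cache[s], motion)   # fast estimate of the coarse latent
        for t in range(s, total_steps):   # only the remaining steps are computed
            latent = denoise_step(latent, t)
            calls += 1
        outputs.append(latent)
    return outputs, calls

outputs, calls = generate(np.ones((4, 4)), motions=[(0, 1), (1, 0)], reuse_steps=[6, 8])
print(calls)
```

In this toy setting, three frames with ten steps each need 16 denoiser calls (10 + 4 + 2) where per-frame denoising would need 30.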

Original language: English
Pages (from-to): 8436-8451
Number of pages: 16
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 35
Issue number: 9
Early online date: 6 Mar 2025
Publication status: Published - 6 Mar 2025

Funding

The work of Yujiang Wang was supported in part by the Basic Research Program of Jiangsu under Grant BK20240414 and in part by the Suzhou Dushu Lake Science and Education Innovation District (SEID) Science and Education Leading Talent Program under Grant KJQ2024204. The computations in this research were performed using the CFFF platform of Fudan University.

Funders                                                        Funder number
Fudan University
Suzhou Dushu Lake Science and Education Innovation District
Basic Research Program of Jiangsu                              BK20240414
SEID                                                           KJQ2024204

    Keywords

    • Video generation
    • computational efficiency
    • diffusion models

    ASJC Scopus subject areas

    • Media Technology
    • Electrical and Electronic Engineering
