arxiv:2508.12691

MixCache: Mixture-of-Cache for Video Diffusion Transformer Acceleration

Published on Aug 18

Authors:

Abstract

MixCache, a training-free caching framework, enhances video DiT inference by dynamically selecting optimal caching granularity, balancing speed and quality.

AI-generated summary

Leveraging the Transformer architecture and the diffusion process, video DiT models have emerged as a dominant approach for high-quality video generation. However, their multi-step iterative denoising process incurs high computational cost and inference latency. Caching, a widely adopted optimization method in DiT models, leverages the redundancy in the diffusion process to skip computations in different granularities (e.g., step, cfg, block). Nevertheless, existing caching methods are limited to single-granularity strategies, struggling to balance generation quality and inference speed in a flexible manner. In this work, we propose MixCache, a training-free caching-based framework for efficient video DiT inference. It first distinguishes the interference and boundary between different caching strategies, and then introduces a context-aware cache triggering strategy to determine when caching should be enabled, along with an adaptive hybrid cache decision strategy for dynamically selecting the optimal caching granularity. Extensive experiments on diverse models demonstrate that, MixCache can significantly accelerate video generation (e.g., 1.94times speedup on Wan 14B, 1.97times speedup on HunyuanVideo) while delivering both superior generation quality and inference efficiency compared to baseline methods.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2508.12691 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2508.12691 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2508.12691 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.