---
title: "EchoMimicV2: Audio-Driven Human Animation"
emoji: "🎬"
colorFrom: "blue"
colorTo: "purple"
sdk: "gradio"
sdk_version: "4.19.2"
app_file: "app.py"
pinned: false
---

# EchoMimicV2: Audio-Driven Human Animation

This Space provides a web interface for EchoMimicV2, an AI model that generates human animations from an audio input and a reference image.

## How to Use

1. Upload an audio file (WAV format recommended)
2. Upload a reference image of a person
3. Click "Generate Animation" to create the video
4. Wait for processing to complete
5. Download the generated video

## Features

- Audio-driven human animation
- Support for both English and Chinese audio
- High-quality video generation
- Realistic facial expressions and body movements

## Model Information

This Space uses the EchoMimicV2 model from [BadToBest/EchoMimicV2](https://huggingface.co/BadToBest/EchoMimicV2).

## Requirements

- Audio file (WAV format recommended)
- Reference image of a person with a clearly visible face
- Processing time varies with the length of the input audio

## Limitations

- Best results require clear audio and a front-facing reference image
- Processing time grows with the length of the generated video
- A GPU with sufficient memory is needed for good performance

## Citation

If you use this model, please cite:

```
@misc{meng2024echomimicv2,
      title={EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation},
      author={Rang Meng and Xingyu Zhang and Yuming Li and Chenguang Ma},
      year={2024},
      eprint={2411.10061},
      archivePrefix={arXiv}
}
```
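
Since WAV audio is the recommended input and processing time scales with the audio length, a quick local pre-flight check can save a failed run. Below is a minimal sketch using only the Python standard library; the helper name is illustrative and not part of the Space's interface:

```python
# Illustrative pre-flight check for EchoMimicV2 inputs (not part of the
# Space's API): verify the audio file is a readable WAV and report its
# duration, since processing time grows with input length.
import wave
import struct

def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

if __name__ == "__main__":
    # Write a one-second silent mono WAV purely for demonstration.
    with wave.open("demo.wav", "wb") as w:
        w.setnchannels(1)                     # mono
        w.setsampwidth(2)                     # 16-bit samples
        w.setframerate(16000)                 # 16 kHz
        w.writeframes(b"\x00\x00" * 16000)    # 16000 silent frames = 1 s

    print(wav_duration_seconds("demo.wav"))  # 1.0
```

A non-WAV or corrupted file will raise `wave.Error` here, which is a cheaper failure than waiting for a GPU job to reject it.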