---
title: "EchoMimicV2: Audio-Driven Human Animation"
emoji: "🎬"
colorFrom: "blue"
colorTo: "purple"
sdk: "gradio"
sdk_version: "4.19.2"
app_file: "app.py"
pinned: false
---

# EchoMimicV2: Audio-Driven Human Animation

This Space provides a web interface for EchoMimicV2, an AI model that generates human animations from an audio input and a reference image.

## How to Use

1. Upload an audio file (WAV format recommended)
2. Upload a reference image of a person
3. Click "Generate Animation" to create the video
4. Wait for processing to complete
5. Download the generated video

## Features

- Audio-driven human animation
- Support for both English and Chinese audio
- High-quality video generation
- Realistic facial expressions and body movements

## Model Information

This Space uses the EchoMimicV2 model from [BadToBest/EchoMimicV2](https://huggingface.co/BadToBest/EchoMimicV2).

## Requirements

- Audio file (WAV format recommended)
- Reference image of a person with a clearly visible face
- Processing time varies with the length of the input audio

## Limitations

- Best results require clear audio and a front-facing reference image
- Processing time grows with the length of the generated video
- A GPU with sufficient memory is needed for good performance

## Citation

If you use this model, please cite:

```
@misc{meng2024echomimicv2,
      title={EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation},
      author={Rang Meng and Xingyu Zhang and Yuming Li and Chenguang Ma},
      year={2024},
      eprint={2411.10061},
      archivePrefix={arXiv}
}
```
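
Since WAV audio is the recommended input and processing time scales with the audio length, a quick local pre-flight check can save a failed run. Below is a minimal sketch using only the Python standard library; the helper name is illustrative and not part of the Space's interface:

```python
# Illustrative pre-flight check for EchoMimicV2 inputs (not part of the
# Space's API): verify the audio file is a readable WAV and report its
# duration, since processing time grows with input length.
import wave
import struct

def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

if __name__ == "__main__":
    # Write a one-second silent mono WAV purely for demonstration.
    with wave.open("demo.wav", "wb") as w:
        w.setnchannels(1)                     # mono
        w.setsampwidth(2)                     # 16-bit samples
        w.setframerate(16000)                 # 16 kHz
        w.writeframes(b"\x00\x00" * 16000)    # 16000 silent frames = 1 s

    print(wav_duration_seconds("demo.wav"))  # 1.0
```

A non-WAV or corrupted file will raise `wave.Error` here, which is a cheaper failure than waiting for a GPU job to reject it.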