Step1X-Edit: A Practical Framework for General Image Editing • Paper • 2504.17761 • Published 13 days ago • 86
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model • Paper • 2504.07615 • Published 27 days ago • 31
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization • Paper • 2503.16874 • Published Mar 21 • 44
h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform • Paper • 2503.02187 • Published Mar 4 • 5
PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published Dec 4, 2024 • 134
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling • Paper • 2409.16160 • Published Sep 24, 2024 • 34
Portrait Video Editing Empowered by Multimodal Generative Priors • Paper • 2409.13591 • Published Sep 20, 2024 • 17