Reconstruction Alignment Improves Unified Multimodal Models Paper • 2509.07295 • Published 10 days ago • 38
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions Paper • 2509.06951 • Published 10 days ago • 31
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward Paper • 2509.06818 • Published 10 days ago • 29
UniVerse-1: Unified Audio-Video Generation via Stitching of Experts Paper • 2509.06155 • Published 11 days ago • 13
Interleaving Reasoning for Better Text-to-Image Generation Paper • 2509.06945 • Published 10 days ago • 13
Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling Paper • 2509.01624 • Published 17 days ago • 7
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference Paper • 2509.06942 • Published 10 days ago • 15