Efficient Multi-modal Large Language Models via Progressive Consistency Distillation • Paper • 2510.00515 • Published 7 days ago • 37
MathBode: Frequency-Domain Fingerprints of LLM Mathematical Reasoning • Paper • 2509.23143 • Published 11 days ago • 4
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones • Paper • 2509.25123 • Published 8 days ago • 17
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time • Paper • 2509.25161 • Published 8 days ago • 21
Language Models Can Learn from Verbal Feedback Without Scalar Rewards • Paper • 2509.22638 • Published 11 days ago • 64
SimpleFold: Folding Proteins is Simpler than You Think • Paper • 2509.18480 • Published 15 days ago • 11
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction • Paper • 2509.19297 • Published 14 days ago • 23
VideoFrom3D: 3D Scene Video Generation via Complementary Image and Video Diffusion Models • Paper • 2509.17985 • Published 15 days ago • 25
BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent • Paper • 2509.15566 • Published 19 days ago • 10
Wan-Animate: Unified Character Animation and Replacement with Holistic Replication • Paper • 2509.14055 • Published 20 days ago • 14
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling • Paper • 2509.12201 • Published 22 days ago • 103