VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization Paper • 2505.19000 • Published May 25 • 42
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? Paper • 2505.21374 • Published May 27 • 27
HoliTom: Holistic Token Merging for Fast Video Large Language Models Paper • 2505.21334 • Published May 27 • 20
DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction Paper • 2505.21473 • Published May 27 • 16
Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models Paper • 2505.20873 • Published May 27 • 9
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs Paper • 2505.19075 • Published May 25 • 21