Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving Paper • 2505.04528 • Published 3 days ago • 10
Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models Paper • 2505.03821 • Published 7 days ago • 21
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published 3 days ago • 27
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play Paper • 2505.02707 • Published 5 days ago • 77
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 4 days ago • 91
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Paper • 2505.02567 • Published 5 days ago • 62
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks Paper • 2312.14238 • Published Dec 21, 2023 • 20
Augmenting CLIP with Improved Visio-Linguistic Reasoning Paper • 2307.09233 • Published Jul 18, 2023 • 9
Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining Paper • 2308.03235 • Published Aug 7, 2023 • 2
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published 2 days ago • 46
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published 4 days ago • 83
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published 4 days ago • 26
Multi-Agent System for Comprehensive Soccer Understanding Paper • 2505.03735 • Published 3 days ago • 17
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published 8 days ago • 39
A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic Paper • 2505.00534 • Published 9 days ago • 2
Spatial Speech Translation: Translating Across Space With Binaural Hearables Paper • 2504.18715 • Published 14 days ago • 7
LLMs for Engineering: Teaching Models to Design High Powered Rockets Paper • 2504.19394 • Published 12 days ago • 13
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published 10 days ago • 11