Baichuan-M2: Scaling Medical Capability with Large Verifier System • Paper • 2509.02208 • Published 15 days ago • 39
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning • Paper • 2509.02544 • Published 15 days ago • 115
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents • Paper • 2509.06501 • Published 9 days ago • 77
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning • Paper • 2509.02479 • Published 15 days ago • 83