SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions Paper • 2506.23046 • Published Jun 29 • 1
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published 3 days ago • 33
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published 11 days ago • 52
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published 12 days ago • 17
The Era of Real-World Human Interaction: RL from User Conversations Paper • 2509.25137 • Published 13 days ago • 18 • 3
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources Paper • 2509.21268 • Published 17 days ago • 99
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published 24 days ago • 105
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31 • 83
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation Paper • 2506.21876 • Published Jun 27 • 28
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published Jun 30 • 50