SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward Paper • 2505.17018 • Published May 22 • 15