Self-Rewarding Vision-Language Model via Reasoning Decomposition • Paper • 2508.19652 • Published 10 days ago • 77
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation • Paper • 2506.15068 • Published Jun 18 • 14