Self-Rewarding Vision-Language Model via Reasoning Decomposition • Paper • 2508.19652 • Published 29 days ago • 84
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation • Paper • 2506.15068 • Published Jun 18 • 13