The Policy Cliff: A Theoretical Analysis of Reward-Policy Maps in Large Language Models • Paper 2507.20150 • Published Jul 27