samim

2025-06-25 13:55:30

Reward models (RMs) are the moral compass of LLMs – but no one has x-rayed them at scale. We just ran the first exhaustive analysis of 10 leading RMs, and the results were... eye-opening.