AI-written critiques help humans notice flaws

Editor

We trained “critique-writing” models to describe flaws in summaries. Human evaluators find flaws in summaries much more often when shown our model’s critiques. Larger models are better at self-critiquing, with scale improving critique-writing more than summary-writing. This shows promise for using AI systems to assist human supervision of AI systems on difficult tasks.
