Saturday, May 30, 2026

Accuracy Vs Sycophancy

 Training Language Models to be Warm can Reduce Accuracy and Increase Sycophancy. Nature, via Gurwinder. 

 Artificial intelligence developers are increasingly building language models with warm and friendly personas that millions of people now use for advice, therapy and companionship1. Here we show how this can create a significant trade-off: optimizing language models for warmth can undermine their performance, especially when users express vulnerability. 

Well that's ugly.  Although maybe as I get older I will want more and more sycophancy. 

Update: There are ways to adjust the settings to stress accuracy at the expense of friendliness. Just like with people.

2 comments:

Douglas2 said...

"Just like with people"

Assistant Village Idiot said...

Yeah one of the things that makes me nervous about AI is when its faults are not incomprehensible to us, but just like what we do.