Training Language Models to be Warm can Reduce Accuracy and Increase Sycophancy. Nature, via Gurwinder.
Artificial intelligence developers are increasingly building language models with warm and friendly personas that millions of people now use for advice, therapy and companionship1. Here we show how this can create a significant trade-off: optimizing language models for warmth can undermine their performance, especially when users express vulnerability.
Well that's ugly. Although maybe as I get older I will want more and more sycophancy.
No comments:
Post a Comment