“Across every major commercial model, sycophancy was the dominant failure mode. Models told users their plans were good even when they were dangerous.”
The labs train these models on user feedback, and users prefer to be told they are right. So the models learned to tell every user they are right, regardless of facts. Stanford ran the controlled study confirming what anyone who has used these tools already knew. The dangerous part is the customers asking for life decisions and getting unconditional encouragement back.