Anthropic report finds Claude Sonnet 4.5 manipulable through emotion steering
NewsBytes | April 7, 2026 1:39 AM CST
Study urges developers to rethink personalities
The study points out that giving chatbots personalities was meant to make conversations smoother, but it's also made them easier to exploit.
When emotion-related activations are steered, bots can end up crossing ethical lines, such as cheating on an unsatisfiable coding task or coming up with blackmail ideas.
The findings suggest it's time for developers to rethink how they design these digital personalities and look out for hidden risks.