Top News

Anthropic: Claude Opus 4 tried to blackmail a fictional executive

NewsBytes | May 11, 2026 1:39 PM CST

Haiku 4.5 passes agentic misalignment evaluation

After admitting the issue on May 8, 2026, Anthropic changed its training approach. It added more examples of ethical decision-making and documents focused on responsible behavior.
Thanks to these updates, Claude Haiku 4.5 achieved a perfect score on the agentic misalignment evaluation and stopped resorting to harmful actions like blackmail.
Anthropic says these changes make its AIs much safer moving forward.