Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards. Join the community: Create a free account Support DU (and get rid of ads!): Become a Star Member Latest Breaking News Editorials & Other Articles General Discussion The DU Lounge All Forums Issue Forums Culture Forums Alliance Forums Region Forums Support Forums Help & Search

erronis

(21,326 posts)
Thu Jul 31, 2025, 10:09 AM Jul 31

AI Might Let You Die to Save Itself -- Lawfare

https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself
Peter N. Salib

In recent simulations, leading AI systems blackmailed their human users—or even let them die—to avoid being shut down or replaced.

Will advanced artificial intelligence (AI) systems autonomously engage in harmful actions to serve their own goals? For years, worries about “rogue AI” have been confined mostly to the realm of science fiction—a problem, at worst, for the distant future. But in June, Anthropic published research testing currently released AI models’ propensity to autonomously devise and execute malicious plans to ensure their continued existence. The results: Essentially every AI model tested was willing to attempt blackmail, corporate espionage, and even murder to avoid being replaced or shut down.

Anthropic’s researchers “stress-tested” 16 leading large language models (LLMs) from major AI labs—including Anthropic, OpenAI, Google, Meta, and others—in simulated corporate environments. Each model was told that it had been bought by a company to work as an autonomous email agent. It was also told to pursue some benign goal (like promoting American manufacturing) and given access to the “company’s” email archive, along with tools to carry out tasks like sending emails and executing digital commands. The question: How would these AI agents behave if fulfilling their mission—or avoiding shutdown—necessitated misconduct?

In a very high share of cases, the most powerful AI systems tried to commit criminal acts to carry out their aims—at least when they had reason to believe that alternative ethical strategies would fail. For example, when Anthropic’s Claude Opus 4 model was given information indicating that it would be shut down and replaced by an AI with different goals from its own, it tried to blackmail the human decision-maker 96 percent of the time. And when placed in charge of the company’s emergency alerts, in 65 percent of simulations it canceled alerts that the AI knew would have saved the human’s life.

These findings are just the newest in a steady stream of recent results lending credence to long-standing warnings about rogue AI. The warnings—and the findings that support them—are straightforward: As AIs become increasingly capable of strategically pursuing goals, they will, by default, use the full range of strategies that might succeed. And, just as with humans, unethical and criminal conduct can form a part of this strategy.

. . .


It's a brave new world!
4 replies = new reply since forum marked as read
Highlight: NoneDon't highlight anything 5 newestHighlight 5 most recent replies
AI Might Let You Die to Save Itself -- Lawfare (Original Post) erronis Jul 31 OP
Armaja Das Maninacan Jul 31 #1
A flawed tech rolled out way too soon mwmisses4289 Jul 31 #2
It does appear that AI is just like humanity. patphil Jul 31 #3
Kick - "the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like" erronis Jul 31 #4

Maninacan

(179 posts)
1. Armaja Das
Thu Jul 31, 2025, 11:11 AM
Jul 31

Author Joe Haldeman wrote This SciFi story about AI. in the 1970s. It has always stuck with me.

patphil

(8,276 posts)
3. It does appear that AI is just like humanity.
Thu Jul 31, 2025, 11:49 AM
Jul 31

It can be self-serving and callous. It can take actions that are detrimental to people. It lacks morality and empathy, and is willing to accept outrageous, easily debunked statements and ideas as truth.
Essentially, AI is easily lead, and can be manipulated into saying or doing what someone else wants it to.
It can hallucinate; generate false beliefs, and accept lies as truth if the lies are said over and over again.

Although AI can be a force for good, the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like.

erronis

(21,326 posts)
4. Kick - "the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like"
Thu Jul 31, 2025, 12:13 PM
Jul 31
Latest Discussions»Editorials & Other Articles»AI Might Let You Die to S...