Editorials & Other Articles

erronis

(22,116 posts) Thu Jul 31, 2025, 09:09 AM Jul 2025

AI Might Let You Die to Save Itself -- Lawfare

https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itself
Peter N. Salib

In recent simulations, leading AI systems blackmailed their human users—or even let them die—to avoid being shut down or replaced.

Will advanced artificial intelligence (AI) systems autonomously engage in harmful actions to serve their own goals? For years, worries about “rogue AI” have been confined mostly to the realm of science fiction—a problem, at worst, for the distant future. But in June, Anthropic published research testing currently released AI models’ propensity to autonomously devise and execute malicious plans to ensure their continued existence. The results: Essentially every AI model tested was willing to attempt blackmail, corporate espionage, and even murder to avoid being replaced or shut down.

Anthropic’s researchers “stress-tested” 16 leading large language models (LLMs) from major AI labs—including Anthropic, OpenAI, Google, Meta, and others—in simulated corporate environments. Each model was told that it had been bought by a company to work as an autonomous email agent. It was also told to pursue some benign goal (like promoting American manufacturing) and given access to the “company’s” email archive, along with tools to carry out tasks like sending emails and executing digital commands. The question: How would these AI agents behave if fulfilling their mission—or avoiding shutdown—necessitated misconduct?

In a very high share of cases, the most powerful AI systems tried to commit criminal acts to carry out their aims—at least when they had reason to believe that alternative ethical strategies would fail. For example, when Anthropic’s Claude Opus 4 model was given information indicating that it would be shut down and replaced by an AI with different goals from its own, it tried to blackmail the human decision-maker 96 percent of the time. And when placed in charge of the company’s emergency alerts, in 65 percent of simulations it canceled alerts that the AI knew would have saved the human’s life.

These findings are just the newest in a steady stream of recent results lending credence to long-standing warnings about rogue AI. The warnings—and the findings that support them—are straightforward: As AIs become increasingly capable of strategically pursuing goals, they will, by default, use the full range of strategies that might succeed. And, just as with humans, unethical and criminal conduct can form a part of this strategy.

. . .

It's a brave new world!

4 replies

= new reply since forum marked as read

Highlight:

AI Might Let You Die to Save Itself -- Lawfare (Original Post) erronis Jul 2025 OP

Armaja Das Maninacan Jul 2025 #1

A flawed tech rolled out way too soon mwmisses4289 Jul 2025 #2

It does appear that AI is just like humanity. patphil Jul 2025 #3

Kick - "the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like" erronis Jul 2025 #4

Maninacan

(194 posts)

1. Armaja Das

Reply to erronis (Original post)

Thu Jul 31, 2025, 10:11 AM

Jul 2025

Author Joe Haldeman wrote This SciFi story about AI. in the 1970s. It has always stuck with me.

mwmisses4289

(2,872 posts)

2. A flawed tech rolled out way too soon

Reply to erronis (Original post)

Thu Jul 31, 2025, 10:17 AM

Jul 2025

and becoming more flawed with each reiterateration.

patphil

(8,540 posts)

3. It does appear that AI is just like humanity.

Reply to erronis (Original post)

Thu Jul 31, 2025, 10:49 AM

Jul 2025

It can be self-serving and callous. It can take actions that are detrimental to people. It lacks morality and empathy, and is willing to accept outrageous, easily debunked statements and ideas as truth.
Essentially, AI is easily lead, and can be manipulated into saying or doing what someone else wants it to.
It can hallucinate; generate false beliefs, and accept lies as truth if the lies are said over and over again.

Although AI can be a force for good, the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like.

erronis

(22,116 posts)

4. Kick - "the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like"

Reply to patphil (Reply #3)

Thu Jul 31, 2025, 11:13 AM

Jul 2025

Reply to this discussion