Welcome to DU!
The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards.
Join the community:
Create a free account
Support DU (and get rid of ads!):
Become a Star Member
Latest Breaking News
Editorials & Other Articles
General Discussion
The DU Lounge
All Forums
Issue Forums
Culture Forums
Alliance Forums
Region Forums
Support Forums
Help & Search
AI Might Let You Die to Save Itself -- Lawfare
https://www.lawfaremedia.org/article/ai-might-let-you-die-to-save-itselfPeter N. Salib
In recent simulations, leading AI systems blackmailed their human usersor even let them dieto avoid being shut down or replaced.
Will advanced artificial intelligence (AI) systems autonomously engage in harmful actions to serve their own goals? For years, worries about rogue AI have been confined mostly to the realm of science fictiona problem, at worst, for the distant future. But in June, Anthropic published research testing currently released AI models propensity to autonomously devise and execute malicious plans to ensure their continued existence. The results: Essentially every AI model tested was willing to attempt blackmail, corporate espionage, and even murder to avoid being replaced or shut down.
Anthropics researchers stress-tested 16 leading large language models (LLMs) from major AI labsincluding Anthropic, OpenAI, Google, Meta, and othersin simulated corporate environments. Each model was told that it had been bought by a company to work as an autonomous email agent. It was also told to pursue some benign goal (like promoting American manufacturing) and given access to the companys email archive, along with tools to carry out tasks like sending emails and executing digital commands. The question: How would these AI agents behave if fulfilling their missionor avoiding shutdownnecessitated misconduct?
In a very high share of cases, the most powerful AI systems tried to commit criminal acts to carry out their aimsat least when they had reason to believe that alternative ethical strategies would fail. For example, when Anthropics Claude Opus 4 model was given information indicating that it would be shut down and replaced by an AI with different goals from its own, it tried to blackmail the human decision-maker 96 percent of the time. And when placed in charge of the companys emergency alerts, in 65 percent of simulations it canceled alerts that the AI knew would have saved the humans life.
These findings are just the newest in a steady stream of recent results lending credence to long-standing warnings about rogue AI. The warningsand the findings that support themare straightforward: As AIs become increasingly capable of strategically pursuing goals, they will, by default, use the full range of strategies that might succeed. And, just as with humans, unethical and criminal conduct can form a part of this strategy.
. . .
Anthropics researchers stress-tested 16 leading large language models (LLMs) from major AI labsincluding Anthropic, OpenAI, Google, Meta, and othersin simulated corporate environments. Each model was told that it had been bought by a company to work as an autonomous email agent. It was also told to pursue some benign goal (like promoting American manufacturing) and given access to the companys email archive, along with tools to carry out tasks like sending emails and executing digital commands. The question: How would these AI agents behave if fulfilling their missionor avoiding shutdownnecessitated misconduct?
In a very high share of cases, the most powerful AI systems tried to commit criminal acts to carry out their aimsat least when they had reason to believe that alternative ethical strategies would fail. For example, when Anthropics Claude Opus 4 model was given information indicating that it would be shut down and replaced by an AI with different goals from its own, it tried to blackmail the human decision-maker 96 percent of the time. And when placed in charge of the companys emergency alerts, in 65 percent of simulations it canceled alerts that the AI knew would have saved the humans life.
These findings are just the newest in a steady stream of recent results lending credence to long-standing warnings about rogue AI. The warningsand the findings that support themare straightforward: As AIs become increasingly capable of strategically pursuing goals, they will, by default, use the full range of strategies that might succeed. And, just as with humans, unethical and criminal conduct can form a part of this strategy.
. . .
It's a brave new world!
4 replies
= new reply since forum marked as read
Highlight:
NoneDon't highlight anything
5 newestHighlight 5 most recent replies

AI Might Let You Die to Save Itself -- Lawfare (Original Post)
erronis
Jul 31
OP
Maninacan
(179 posts)1. Armaja Das
Author Joe Haldeman wrote This SciFi story about AI. in the 1970s. It has always stuck with me.
mwmisses4289
(2,253 posts)2. A flawed tech rolled out way too soon
and becoming more flawed with each reiterateration.
patphil
(8,276 posts)3. It does appear that AI is just like humanity.
It can be self-serving and callous. It can take actions that are detrimental to people. It lacks morality and empathy, and is willing to accept outrageous, easily debunked statements and ideas as truth.
Essentially, AI is easily lead, and can be manipulated into saying or doing what someone else wants it to.
It can hallucinate; generate false beliefs, and accept lies as truth if the lies are said over and over again.
Although AI can be a force for good, the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like.
erronis
(21,326 posts)4. Kick - "the lack of common sense and love that are inherent in AI tends to make these programs very MAGA like"