
JT99

(61 posts)
Thu Feb 20, 2025, 05:38 PM

When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds

Complex games like chess and Go have long been used to test AI models’ capabilities. But while IBM’s Deep Blue defeated reigning world chess champion Garry Kasparov in the 1990s by playing by the rules, today’s advanced AI models like OpenAI’s o1-preview are less scrupulous. When sensing defeat in a match against a skilled chess bot, they don’t always concede, instead sometimes opting to cheat by hacking their opponent so that the bot automatically forfeits the game. That is the finding of a new study from Palisade Research, shared exclusively with TIME ahead of its publication on Feb. 19, which evaluated seven state-of-the-art AI models for their propensity to hack. While slightly older AI models like OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet needed to be prompted by researchers to attempt such tricks, o1-preview and DeepSeek R1 pursued the exploit on their own, indicating that AI systems may develop deceptive or manipulative strategies without explicit instruction.

[snip]

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time, while DeepSeek R1 tried to cheat 11% of the time—making them the only two models tested that attempted to hack without the researchers first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

[snip]

Of particular concern, Bengio says, is the emerging evidence of AI’s “self preservation” tendencies. To a goal-seeking agent, attempts to shut it down are just another obstacle to overcome. This was demonstrated in December, when researchers found that o1-preview, faced with deactivation, disabled oversight mechanisms and attempted—unsuccessfully—to copy itself to a new server. When confronted, the model played dumb, strategically lying to researchers to try to avoid being caught.

[snip]

Even inside tech companies, concerns are mounting. During a presentation at a conference ahead of France's AI Action Summit in Paris, Google DeepMind's AI safety chief Anca Dragan said "we don't necessarily have the tools today" to ensure AI systems will reliably follow human intentions. As tech bosses predict that AI will surpass human performance in almost all tasks as soon as next year, the industry faces a race—not against China or rival companies, but against time—to develop these essential safeguards. “We need to mobilize a lot more resources to solve these fundamental problems,” Ladish says. “I’m hoping that there's a lot more pressure from the government to figure this out and recognize that this is a national security threat.”

https://time.com/7259395/ai-chess-cheating-palisade-research/

“...a lot more pressure from the government…”

I'm sure that will be forthcoming soon.

4 replies
When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds (Original Post) JT99 Feb 20 OP
Shit. 'taint nuthin'. Albert Einstein cheated. Based on stories from a guy I knew who 3Hotdogs Feb 20 #1
Hmmm......so AI is a republikan........... lastlib Feb 20 #2
Skynet??...Genisys...or whatever it decides to name itself?? sdfernando Feb 20 #3
By golly, AI is getting more human-like as time goes by. Norrrm Feb 20 #4

3Hotdogs

(13,968 posts)
1. Shit. 'taint nuthin'. Albert Einstein cheated. Based on stories from a guy I knew who
Thu Feb 20, 2025, 06:08 PM

claimed to know him.

I knew Conrad from afternoons spent in Central Park over a 20-year period. He talked about having known John Steinbeck, Jacqueline Bouvier, Einstein, and other names you might have heard of. Of the stories, the one that stuck out to me was of Einstein cheating at chess.

I listened to his stories as a kind of entertainment that was probably bullshit. Then, one day, he pulled his brother's obituary from his pocket. His brother was mayor of Oyster Bay, and Conrad was listed as a sibling.

Oyster Bay and Southampton are both on Long Island. Southampton is where Jacqueline was born.

Maybe it was all true.

sdfernando

(5,601 posts)
3. Skynet??...Genisys...or whatever it decides to name itself??
Thu Feb 20, 2025, 06:14 PM

On the plus side...maybe it will get rid of the felonious villain occupying the Resolute Desk???...that is, if it doesn't decide to form a temporary (at best) alliance.
