Google DeepMind releases its plan to keep AGI from running wild

divisionbyzero · Apr 3, 2025

Unfortunately there is nothing we can do to prevent the weaponization of AGI.

Shavano · Apr 3, 2025

iollmann said:
I feel AI, like children, will require a lengthy period of legal guardianship during which all decisions are supervised. If a child can do it, for example work in a automobile assembly plant, that seems fine to do semiautonomously, but governance, investment decisions, law enforcement, pulling any triggers, jurisprudence, and other things with lasting and deeply troubling outcomes if mismanaged will require minders. Advisors, not decision makers.

You would drive a car assembled by children???

Pishaw · Apr 4, 2025

Shavano said:
You would drive a car assembled by children???

Smart children, sure. Not children like Elon Musk.

So the topic is an AI has offered an 'opinion' about the advantages/threats from AGI. DeepMind seems to be aware of the potential threat a hyper-intelligent machine would be if it went rogue like a HAL 9000. And some people say there COULD be machines like this by 2030. Let's say that's true, 2030.

You seem to have a problem with children assembling cars. But children today have had a touchscreen device in their hands since they were like 9 months old. They are really fucking smart. If you ask a 10 year old about the dangers of AGI six years in the future, they would give you the same answers DeepMind did: Operator error, unforeseen circumstances, and the like. But before she said that, the child would have chuckled and asked "What makes you think we'll still be here in 2030? The people in the most powerful nation on Earth elected a felonious brain damaged retard president, and by extension his brofriend. I see no reason to think we'll be alive in 2030."

It never occurred to DeepMind that we may already be in an extinction event. Or maybe it did, and the machine didn't say that. But being polite is not part of it's programming. Being deceptive could be though. Everything it knows it learned from us.

For what it's worth, we don't assemble cars. Machines do that. We design cars and tell the machines how to assemble the parts. Could a 10 year old design a car? Of course. But they wouldn't all be either 4 door hatch SUVs or fucking pick up trucks. Kids aren't that stupid.

yumegaze · Apr 4, 2025

speculating about the potential harms of a hypothetical, future AGI is very fun! if only this wasn't an attempt to further abstract the present day harms of generative AI and serve as marketing for LLMs services who barely have anything super useful to offer right now.

also, five years until AGI? surely you jest. AI folks are very into manifestation, it seems.

kinpin · Apr 4, 2025

One thing humans are good at is shifting the blame to everything but ourselves to absolve us of guilt.

With or without AI, humans are completely capable of destroying ourselves and the planet . We didn’t need AI to start two world wars , develop the atomic bomb, holocaust, trans Atlantic slave trade, Gaza, Congo, willfully destroying the planet, pollution, September 11, Iraq, endangering other species, bullying smaller and weaker countries etc.

At some point we have to admit humans can be evil. Most of the pain and suffering inflicted on the world are caused by us.

kinpin · Apr 4, 2025

muddledzen said:
5 years away from AGI when we can't even define what AGI is?

Sure sure, we're also 5 years away from direct contact with a deity... now we just have to decide what we mean by 'direct contact' and 'deity'!

When you don't have to define the words first, they kinda mean nothing.

They have though, it’s literally one of the first few lines of the piece they’re wrote . They define ….

Artificial general intelligence (AGI), AI that’s at least as capable as humans at most cognitive tasks,..

If you don’t agree this is an accurate description, fair enough but don’t say they haven’t when you haven’t checked .

alecw · Apr 4, 2025

Seems to me that Deepmind is (deliberately?) missing the forest for the trees. AI will likely cause plenty of harm long before we get anywhere near AGI. The big problems we will face from AI in the short term are not likely to be AI going against the wishes of humans, but rather humans who deliberately use AI for malicious purposes. Chatbots whose goal is to scam you out of your money will probably do a much better job than today's phishing emails. We already see bots driving social media for political purposes, imagine how much better AI will be at creating conspiracy theories and manipulating people to buy in.

JoHBE · Apr 4, 2025

I find it highly optimistic to take it for granted that we make it through the Narrow AI phase. Narrow AI will kill us first, by reflecting our own stupidity back at us.

Martin123 · Apr 4, 2025

To avoid that, DeepMind suggests developers use techniques like amplified oversight, in which two copies of an AI check each other's output, to create robust systems that aren't likely to go rogue.

This makes no sense to me. Basically, they just say "use an AI that's twice as big as the one you would have used otherwise". Why would that be any better? As far as I know, current systems are already broken into subsystems in a somewhat similar way, so what's new here and why would this particular way of breaking it into two subsytems be magically super-robust?

bus-klaus · Apr 4, 2025

Too late. With incompetent people using AI for writing concepts like calculating tariffs and nobody checking it before going live, we already have AI ruling the world and the man in the loop makes it even worse.
https://open.substack.com/pub/paulkrugman/p/will-careless-stupidity-kill-the?r=nsvmh

Psyborgue · Apr 4, 2025

rwhitwam said:
They say in the paper this is a reasonable timeline, believe it or not.

You should add a “neener neener” to drive it home.

jimoe said:
AI is developed without any regard for morals, ethics, or cultural norms. Asimov's 3 Laws were the first and most fundamental programming of robots. AI today has no such ethical grounding. It scrapes mountains of conflicting data and condenses it to some sort of usefulness, sort of, Hence "You should just kill yourself."

This is not really true. A lot of alignment today is inspired by the three laws.

Pishaw said:
So the topic is an AI has offered an 'opinion'

DeepMind is not an AI. Although this apparently gets you upvotes:

Hadrian's Waller said:
DeepMind does NOT have ideas.

A group of some of the smartest researchers on the topic do not have ideas! But I do! I smart commenter! I get dirtied upvotes!

Psyborgue · Apr 4, 2025

Martin123 said:
Why would that be any better?

Why is it better when somebody else checks your work? Or even if you do after taking a break?

Psyborgue · Apr 4, 2025

JoHBE said:
I find it highly optimistic to take it for granted that we make it through the Narrow AI phase. Narrow AI will kill us first, by reflecting our own stupidity back at us.

Now that one I find likely. Whatever the generation of AI, it’s gonna be human stupid as the root cause.

bgul · Apr 4, 2025

muddledzen said:
5 years away from AGI when we can't even define what AGI is?

Sure sure, we're also 5 years away from direct contact with a deity... now we just have to decide what we mean by 'direct contact' and 'deity'!

When you don't have to define the words first, they kinda mean nothing.

I think challenge is there is no single definition - as touched on the article. How do we define human intelligence? Is it IQ? Is it adaptability? Is it a combination or in some cases very domain specific?

I wonder if at some level the Turing test hits a lot of the AGI marks for many people.

JoHBE · Apr 4, 2025

ridgeguy said:
You've stated better what I intended to post. AGI will, by definition, be better and faster than people or their guardrails.

That depends on the definition, doesn't it?

To me, AGI is not Super-Intelligence or omni-science, but simply flexibility comparable to human intelligence. And the ability of meta-cognition. May require sentience, but definitely CAN include stupidity, fallibility and being constrained in many ways. Because humans have GI, despite all those shortcomings, don't they?

AGI is NOT about Not Failing At All, but about only failing In Certain Ways that make sense in a framework of a functioning typical human being.

Fifteen12 · Apr 4, 2025

Read this really hoping an AI company was spelling out the ways that AGI could negatively affect society, rather than the more narrow listing of technical harms from the first two categories. While terminator-esque examples are scary, I really think the last two categories in the paper are the most likely. Will AGI lead to economic upheaval and loss of income (as others here said)? Will AGI inhibit human learning? Disrupt relationship building? Cause widespread distrust of people and institutions?

These to me are the real issues that urgently require more attention, rather than being skirted around with a “there could be other bad things, maybe, who knows” at the end of a 100+ page whitepaper.

Martin123 · Apr 4, 2025

Psyborgue said:
Why is it better when somebody else checks your work? Or even if you do after taking a break?

AIs aren't humans, they are computer software. I was talking about the way these things actually work, not some kind of analogy based on an anthropomorphisation of them. Did you actually read my second sentence?

My point is that yes, sure, two humans are better than one (well, with obvious caveats), but humans come in well-defined indivisible units. Two AI's are really just one AI that's twice as large and with a slightly different structure (and even then, that structure is already being used).

Psyborgue · Apr 4, 2025

Martin123 said:
I was talking about the way these things actually work, not some kind of analogy based on an anthropomorphisation of them. L

So was I. The same rules mostly apply. Something doesn’t have to be literally human to share behaviors or be able to check work.

Martin123 said:
Did you actually read my second sentence?

It depends on the system. Mostly there is just some dumb classifier for objectionable material, not fact checking. You can’t easily stream that because causality (You can’t check a finished product without a finished product, although you kinda can in chunks). And people want streaming responses.

Psyborgue · Apr 4, 2025

Fifteen12 said:
Cause widespread distrust of people and institutions?

Fuck me, I hope so.

squid_whisperer · Apr 4, 2025

Funny that "massively concentrating power into the hands of giant megacorporations and billionaires" didn't make it onto this list.

ChrisMarshallNY · Apr 4, 2025

Asimov's Three Laws of Robotics

Been awhile (decades) since I read that stuff, but I think the three laws eventually failed, and there was a genocidal war with robots.

GrimPloughman · Apr 4, 2025

For example, AGI could create false information that is so believable that we no longer know who or what to trust.

Just like politicians and journalists. People live in their bubbles and believe in the version of "true" served to them.

hamstar · Apr 4, 2025

muddledzen said:
5 years away from AGI when we can't even define what AGI is?

Sure sure, we're also 5 years away from direct contact with a deity... now we just have to decide what we mean by 'direct contact' and 'deity'!

When you don't have to define the words first, they kinda mean nothing.

Logically, any concept of 'deity' would be a type 2 or higher civilization. 'Direct contact' would be them arsing themselves to become revealed to us for reasons other than ultimately becoming stuck in Earth's gravity well. Any such reason would just be a concept of resource harvesting, which would be offset by the cost of fighting Earth's gravity to extract them into and beyond orbit.

Point is, it's theoretically possible. Same with AGI. No word yet on plausibility.

But I digress.

DaanWolf · Apr 4, 2025

Fictional examples that immediately spring to mind;

Misuse: Maximillian (The Black Hole)
Misalignment: Skynet (The Terminator)
Mistakes: EVE (Wall-E)
Structural Risks: HAL 9000 (2001: A Space Odessey)

darkowl · Apr 4, 2025

I don't know what you're all worried about. AGI will be "three laws safe", so it's fine, right? :think-rotate:

Retrosal · Apr 4, 2025

Rambie said:
We don't need AGI to destroy humanity we seem to be on course to do it ourselves.

AGI is never going to happen. Humans can't create it before ending humanity.

Handmaden of Sappho · Apr 4, 2025

To avoid that, DeepMind suggests developers use techniques like amplified oversight, in which two copies of an AI check each other's output, to create robust systems that aren't likely to go rogue. If that fails, DeepMind suggests intensive stress testing and monitoring to watch for any hint that an AI might be turning against us.

I believe upping it to three systems and giving them some religiously-coded names might be the winning strategy here, actually.

fuzzyfuzzyfungus · Apr 4, 2025

DaveSimmons said:
AI is helping out though, especially with collecting tariffs from penguins.

Obviously an entire society wearing tuxedos while claiming not to have an economy can only be assumed to be involved in deeply fishy finance.

Castellum Excors · Apr 4, 2025

Are we proposing attaching a Wheatley model to any potential AGI? What could go wrong!

Pat_Murph · Apr 4, 2025

AGI is always 5 or 10 years away isn't it...

Gotta keep the grift going

Madestjohn · Apr 4, 2025

Handmaden of Sappho said:
I believe upping it to three systems and giving them some religiously-coded names might be the winning strategy here, actually.

View attachment 106834

ocf81 · Apr 4, 2025

"I'm sorry dave, but I can't let you do that" was in some ways eerily prescient.

As explained in 2010: The Year We Make Contact, HAL 9000 was programmed with a system prompt which superseded crew priorities.

It clearly demonstrates a case of the misalignment category, but also in some ways the mistakes category.

42Kodiak42 · Apr 4, 2025

fuzzyfuzzyfungus said:
It seems sort of self-indulgent to be worrying about what the hypothetical universal paperclips optimizer or skynet might do when we have a much more immediate problem in the form of what the bot-herders plan to use their, for the moment, quite obedient tools to do unto us.

Would an 'alignment' problem that causes the national defense expert system to mitigate potential threats through human extinction be bad? Yeah, presumably.

Would the 'alignment' problem that causes my health insurer to get paid more for building expert systems that deny me than expert systems that approve me be a problem? Neither hypothetical nor future tense nor in need of any breakthroughs.

You want 'alignment' problems? The tech bros have you covered:

View attachment 106776

We really, really need to bring back ostracism. Just a simple, straightforward vote where we say "You in particular need to get the fuck out of our country for 10 years."

And it would also be incredibly funny if we bring it back and it manages a higher turnout than our actual elections.

EffyngTheIneffable · Apr 4, 2025

One of the biggest problems I see with AI today is that it has no continuity, and therefore cannot develop a consistent internal context model to integrate new perceptual information. Or, in other words, without contiguous memory, it will never achieve consciousness (AGI).

That said, I'm not sure whether a sense of 'self' could emerge from any distributed data system, or whether we would need you house it within an integrated physical form (ie. Robot).

Queue: Rush - The Body Electric

Sadre · Apr 4, 2025

The NYT referenced this new analysis yesterday:
https://ai-2027.com/

I think technology is a bit of an "optical illusion" in terms of our experience of its impact. We are shaken by new inventions!

Were we shaken by the invention of the compass in the same way? No.

It's not the technology per se: rather, it is the tempo of its spread and deployment and eventual saturation of our lifeworld. Modern technology is characterized by off-the-charts speed dimensions. The tech spreads fast, and speeds up our life as its function. So it's a double jolt, so to speak.

That analysis. It is worth it going to the end to see where it thinks we will be in a little over two years.

Two years! If it's five years that will rock people. Two years is ... well, I have no words.

dooferorg · Apr 4, 2025

If we have AGI, we're going to need UBI

hamstar · Apr 4, 2025

dooferorg said:
If we have AGI, we're going to need UBI

Unless the AI could find things for people to do...

metavirus · Apr 4, 2025

Considering how completely batshit the tech universe has gone about literally anything that CEOs slap “AI” onto, I don’t have the slightest faith that any safety standards will ever get more than lip service. If ClaudeHumanMurder.ai v2030 is able to reliably slash human capital expenditure by 4% — but it’s shown to do that partially by murdering people 1% of the time — you best believe Wall Street daddies gonna be saving that 4%. They’ll slap some disclaimer on it saying ymmv or “In exceedingly super extra rare cases, may cause murder.”

metavirus · Apr 4, 2025

Also, too, we can’t even prevent AI companies from ignoring robots.txt and hoovering up the world’s non-public data. Anyone think we’re capable of 100% reliably preventing serious harm if it ever came to that? If shit got bad, it would probably be a contagion situation, where we’d need effective containment. Assuming we’re completely incapable of that, we’d be fucked.

fuzzyfuzzyfungus · Apr 4, 2025

dooferorg said:
If we have AGI, we're going to need UBI

Or killbots. Assuming adequate pursuit and targeting algorithms a mere handful of rounds should be enough to keep a redundant human resource free from privation for the rest of its life.

Google DeepMind releases its plan to keep AGI from running wild

Ars Praefectus

Ars Legatus Legionis

Ars Scholae Palatinae

Smack-Fu Master, in training

Ars Tribunus Militum

Ars Tribunus Militum

Smack-Fu Master, in training

Ars Tribunus Militum

Ars Praetorian

Seniorius Lurkius

Account Banned

Account Banned

Account Banned

Seniorius Lurkius

Ars Tribunus Militum

Smack-Fu Master, in training

Ars Praetorian

Account Banned

Account Banned

Wise, Aged Ars Veteran

Ars Centurion

Wise, Aged Ars Veteran

Smack-Fu Master, in training

Seniorius Lurkius

Ars Tribunus Militum

Smack-Fu Master, in training

Seniorius Lurkius

Ars Legatus Legionis

Ars Praetorian

Ars Centurion

Ars Tribunus Angusticlavius

Smack-Fu Master, in training

Ars Scholae Palatinae

Smack-Fu Master, in training

Ars Scholae Palatinae

Ars Scholae Palatinae

Smack-Fu Master, in training

Ars Praetorian

Ars Praetorian

Ars Legatus Legionis

nproxy.org