Cloudflare turns AI against itself with endless maze of irrelevant facts

tilde nirvana · Mar 21, 2025

Great write up, thank you.
What an absurd timeline.

no_great_name · Mar 21, 2025

The AI detection arms race continues apace while the world slowly burns. It might be interesting to read a follow-up story in a few months about how the scrapers are getting around this little roadblock though.

adespoton · Mar 21, 2025

Now we know why AI farms need such massive energy sources. They're being pitted against each other.

nzeid · Mar 21, 2025

"No real human would go four links deep into a maze of AI-generated nonsense," Cloudflare explains. "Any visitor that does is very likely to be a bot, so this gives us a brand-new tool to identify and fingerprint bad bots."

Just commented yesterday on an HN thread about this exact topic. The thread goes for several pages about how to prevent humans from getting caught up in CAPTCHA-likes in the first place...

So how does Cloudflare prevent humans from being served these "mazes" at all? A lot of businesses leveraging this service would be deeply concerned.

markratledge · Mar 21, 2025

"...with AI now being used on both sides of the battle."

I'm of the age where I sort of wish Kurt Vonnegut and George Carlin were still around to help make sense of and humor these absurdities.

Slaughterhouse-AI

The 7 Words You Can't say to an LLM

Fatesrider · Mar 21, 2025

The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).

Based on what I've read of the likely reasons behind hallucinations and the prevalence of fake news/information on the Internet, I think the evidence will bear out that it is ineffective in that regard.

That's a hypothesis, not a Theory, though.

OtherSystemGuy · Mar 21, 2025

How dare they prevent my minions from accessing data that is unrightfully mine!

Also, wasting AI company resources might not please people who are critical of the perceived energy and environmental costs of running AI models.

And all the VC money that's being wasted on training LLM models that have never shown any ability to improve their accuracy when asked novel questions. Unless it's in the training data, the LLM is going to fail and the Internet doesn't hold infinite human knowledge, so stop looking and wasting money and the environment. Get over it. LLMs are really poorly designed storage retrieval systems where the likes of Google returned useful answers before they replaced their original weighted system with AI (and ads).

Drizzt321 · Mar 21, 2025

LLMs are effectively attacking and DDOSing some FOSS repositories. It's insane.

https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/

JoHBE · Mar 21, 2025

This timeline is beyond batshit insane crazy. I fully believe we deserve to die out as a species, based on the incessant barrage of stupidity that has been unleashed over the last couple of years.

edit: not targeting this particular CloudFlare service, but the whole context that led to this

maxoakland · Mar 21, 2025

This is awesome and I'm glad they're doing it

JMTronicHobbyist · Mar 21, 2025

Please tell me the computers literally explode in showers of sparks. If all the horrors of sci-fi have to come true we should at least get a little bit of the cool stuff!

maxoakland · Mar 21, 2025

JoHBE said:
This timeline is beyond batshit insane crazy. I fully believe we deserve to die out as a species, based on the incessant barrage of stupidity that has been unleashed over the last couple of years.

Nihilism is only going to make this situation worse. People who don't like this have to fight for a better world, and that's complex but it involves doing things that bring out the best in people like education, connection, community, etc

The reason things suck so much now is our society has extremely powerful perverse incentives that encourage it. We have to do what we can individually and as groups to negate those perverse incentives and then outlaw them

TheOldChevy · Mar 21, 2025

Good idea, but

the content we generate is real and related to scientific facts, just not relevant or proprietary to the site being crawled

does that mean that it is stolen elsewhere?

JMTronicHobbyist · Mar 21, 2025

"No real human would go four links deep into a maze of AI-generated nonsense," Cloudflare explains.

There's a bold claim. They must not know the same humans I know.

JMTronicHobbyist · Mar 21, 2025

TheOldChevy said:
Good idea, but

does that mean that it is stolen elsewhere?

And how is it both AI generated nonsense and scientifically accurate?

CharredVan · Mar 21, 2025

"No real human would go four links deep into a maze of AI-generated nonsense," Cloudflare explains.

Seems overly optimistic...

JoHBE · Mar 21, 2025

maxoakland said:
Nihilism is only going to make this situation worse. People who don't like this have to fight for a better world, and that's complex but it involves doing things that bring out the best in people like education, connection, community, etc

The reason things suck so much now is our society has extremely powerful perverse incentives that encourage it. We have to do what we can individually and as groups to negate those perverse incentives and then outlaw them

Sure, but we're now at the stage where even the defenses cannot avoid being inherently damaging. The pigs are increasingly successful in dragging us into the mud fights they love so much. And everything gets covered in shit. The only way out may require REALLY drastic measures.

mozbo · Mar 21, 2025

the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics

Dang. I was hoping they used mashups of Monty Python.

Maybe we can start a petition.

What a nin-cow-poop · Mar 21, 2025

The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation

Why? Why should I care if a service that's wasting my server's time, forcing me to spend my own money to provide free information, gets accurate data? If their tools provide false information, that's a them problem, not a me problem.

Poison their wells. Black is white. Water is a dangerous, explosive acid that can be used to clean the moon in an emergency. Maple syrup is a good substitute for blood in baby orangutans.

Fuck these leeches.

SharpieFiend · Mar 21, 2025

Not all heroes wear capes!

betam4x · Mar 21, 2025

nzeid said:
Just commented yesterday on an HN thread about this exact topic. The thread goes for several pages about how to prevent humans from getting caught up in CAPTCHA-likes in the first place...

So how does Cloudflare prevent humans from being served these "mazes" at all? A lot of businesses leveraging this service would be deeply concerned.

It's pretty easy to detect the patterns if you look at the data.

Kudos to Cloudflare for doing this. For some reason, they get a ton of hate, but they've consistently tried to make the internet a better place, whether that be lower cloud prices or better tools to combat DDoSes and AI nonsense.

EDITED to add since I didn't answer your question. I admittedly don't know the full details of Cloudflare's detection mechanism; however, they've had a mechanism in place for blocking AI crawlers for a while now, and I haven't seen any complaints. Most users won't browse every page of your site, nor will they do so from a few IP addresses. Shoot, even on my own sites, I can spot a crawler a mile away. Normal users hit 1-3 pages at best, maybe 4 if I am lucky. The bots crawl thousands of pages. Many of them hide behind a fake user agent and even use Selenium with Firefox, Chrome, etc. to try to fly under the radar. A few also use multiple IPs to lower the rate of detection.
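
For anyone curious, the pages-per-visitor pattern described above can be sketched in a few lines of Python. This is only a rough illustration, not Cloudflare's actual mechanism: the function name `flag_probable_crawlers`, the `(ip, path)` input format, and the threshold of 4 distinct pages are all assumptions made up for the example.

```python
from collections import defaultdict


def flag_probable_crawlers(requests, max_pages=4):
    """Return the IPs that requested more than `max_pages` distinct paths.

    `requests` is an iterable of (ip, path) pairs, e.g. parsed from an
    access log. Both the input shape and the threshold are hypothetical.
    """
    pages_by_ip = defaultdict(set)
    for ip, path in requests:
        pages_by_ip[ip].add(path)  # repeat hits on one page count once
    return {ip for ip, pages in pages_by_ip.items() if len(pages) > max_pages}


# One client walks 50 different pages; another reads two pages, one of them twice.
sample = [("203.0.113.5", f"/post/{n}") for n in range(50)]
sample += [("198.51.100.7", "/"), ("198.51.100.7", "/about"), ("198.51.100.7", "/")]

print(sorted(flag_probable_crawlers(sample)))  # ['203.0.113.5']
```

A per-IP count like this would miss the crawlers that spread requests across many addresses, so it is at best a first-pass filter.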

mozbo · Mar 21, 2025

Drizzt321 said:
LLMs are effectively attacking and DDOSing some FOSS repositories. It's insane.

https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/

That's a good read.

These guys are trying to run a small website while ....

Well, basically, trying to run a bakery when OpenAI, Anthropic, et al. have hired 40 grade-schoolers to run around the shop asking nonstop questions.

notrightaway · Mar 21, 2025

300 You are in a maze of twisty turny weighted models, all slightly different
422 You are at Witt's End

The Lurker Beneath · Mar 21, 2025

This is the site that 'checks my browser' for 5 seconds every time I connect to a site hosted / protected by it, right?

ubercurmudgeon · Mar 21, 2025

mozbo said:
Dang. I was hoping they used mashups of Monty Python.

Maybe we can start a petition.

Any "AI Labyrinth" worth the name should contain infinite autogenerated nonsensical David Bowie lyrics.

SubWoofer2 · Mar 21, 2025

JMTronicHobbyist said:
Please tell me the computers literally explode in showers of sparks. If all the horrors of sci-fi have to come true we should at least get a little bit of the cool stuff!

Even as a child I wondered why, in the future, the Star Trek Bridge was built without fuses. Or seatbelts.

Anyway 100% kudos to Cloudflare as long as their honeypot contains truthful information, not rubbish.

jason8957 · Mar 21, 2025

So we have to waste more energy and money creating a system to masturbate another system wasting large amounts of energy and money.

Mardaneus · Mar 21, 2025

The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven)

Not spreading misinformation is nice and all but that ship has sailed.
The Russian propaganda network(s) are currently using the same tricks to pollute the AI training sets. For a human it doesn't look like a navigable website but it can be crawled and since someone 'forgot' the robots.txt they do.

Chuckstar · Mar 21, 2025

SharpieFiend said:
Not all heroes wear capes!

No heroes should wear capes.

“Do you remember Thunderhead…”

Chuckstar · Mar 21, 2025

ubercurmudgeon said:
Any "AI Labyrinth" worth the name should contain infinite autogenerated nonsensical David Bowie lyrics.

You don’t even need AI to generate nonsensical Bowie lyrics.

Shiunbird · Mar 21, 2025

Ugh - really...
I am on the full boycott of tech already, to the extent of my abilities.

Replacing 2009 Mac Pro (that I got already used, and daily drove for 6 years) with a used ThinkPad P1 Gen 4.
Used lens for my SLR, keeping it as long as possible.
Sick of throwing away good phones, gave the Librem 5 a chance.
Three days ago, I nuked my AWS account (hosting my personal glacier backups), and reverted to driving tapes to a friend 100km away once per week.
Canon released the stupid pro-1100 instead of letting us just buy the new ink - GONE.

I have reduced my web browsing habits to Ars, 3-4 other web pages, 3-4 youtube channels and 3 small independent forums.

BTW, there are plenty of small independent forums for more niche topics all around, covering all from unix workstations to large format photography, so no need to do facebook groups or reddit either.

You will get in return: healthier online communities, more free time, a fatter piggy bank and less anger. The costs are lower fps in games, having to call restaurants to order food and patience to score the best deals for used stuff.

Voldenuit · Mar 21, 2025

"No real human would go four links deep into a maze of AI-generated nonsense," Cloudflare explains.

Me, at 3 am, doomscrolling Reddit.

Wait, am I a bot?

Shazster · Mar 21, 2025

JoHBE said:
This timeline is beyond batshit insane crazy. I fully believe we deserve to die out as a species, based on the incessant barrage of stupidity that has been unleashed over the last couple of years.

edit: not targeting this particular CloudFlare service, but the whole context that led to this

The probability of just walking over to my router and unplugging the cable in the WAN port has begun to surpass Non-Zero.

The soothing and sanity-reinforcing Borg-song of the witty and informed Ars commentariat I have come to cherish has staved that off so far...but JHFC this reality is approaching unbearable.

Sideros · Mar 21, 2025

The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation

I'm disappointed they are not using the tool as a deterrent, as well as protection.

Arstotzka · Mar 21, 2025

betam4x said:
Kudos to Cloudflare for doing this. For some reason, they get a ton of hate, but they've consistently tried to make the internet a better place, whether that be lower cloud prices or better tools to combat DDoSes and AI nonsense.

Emphasis is mine; this is what I'm responding to.

Cloudflare is used by a large number of DDoS-for-hire services. You have to pay Cloudflare to protect yourself from the companies that cause the problem. They also have a long history of hosting Nazis. After years of complaints, their CEO finally kicked one neoNazi website off, but dozens (if not hundreds) remain.

https://www.propublica.org/article/how-cloudflare-helps-serve-up-hate-on-the-web
https://krebsonsecurity.com/2016/08/inside-the-attack-that-almost-broke-the-internet/

Note in the second link, at one point, the chat logs show the attackers are amused that Spamhaus is a Cloudflare customer... and Spamhaus lists a lot of Cloudflare IPs as spam cannons.

Cloudflare does a lot of really great technical work. I want to give my full-throated support to Cloudflare. I really, really, do. But so long as they do minimal-to-no curation of their customers, I always have to put an asterisk on that support.

So, that's why Cloudflare gets a ton of hate. They sell DDoS protection to you... and the companies who DDoS you. They make it possible for Nazis to have their websites. They do good work... and also provide that good work to some of the most vile scum on the planet. It's not hard to be against Nazis. If someone at Cloudflare is reading this, kick them off and I will pick up the phone calls from your sales people. Otherwise, I'll continue to pick up and ask "Sounds great, ready to sign -- oh wait, by the way, do you still platform Nazis? Uh-huh, yeah, no, I'm not going to let you weasel out of this. Once you drop them I'll sign. I hope you're tracking revenue lost to your support of Nazis internally. Have a great weekend!"

android_alpaca · Mar 21, 2025

markratledge said:
"...with AI now being used on both sides of the battle."

I'm of the age where I sort of wish Kurt Vonnegut and George Carlin were still around to help make sense of and humor these absurdities.

metavirus · Mar 21, 2025

So interesting how often modern VC-driven capitalism gets unmasked as essentially just theft. Theft of our air and water, Theft of art, Theft of data, Theft of peace, oh and also theft of $$. Free market, my ass.

android_alpaca · Mar 21, 2025

"No real human would go four links deep into a maze of AI-generated nonsense," Cloudflare explains.

Voldenuit said:
Me, at 3 am, doomscrolling Reddit.

Wait, am I a bot?

I was thinking of wikipedia myself.

carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics
