
justaprogressive

(5,265 posts)
Thu Aug 21, 2025, 11:39 AM Aug 21

AI crawlers and fetchers are blowing up websites, with Meta and OpenAI the worst offenders

Cloud services giant Fastly has released a report claiming AI crawlers are putting a heavy load on the open web, slurping up sites at a rate that accounts for 80 percent of all AI bot traffic, with the remaining 20 percent used by AI fetchers. Bots and fetchers can hit websites hard, demanding data from a single site in thousands of requests per minute.

I can only see one thing causing this to stop: the AI bubble popping
According to the report [PDF], Facebook owner Meta's AI division accounts for more than half of those crawlers, while OpenAI accounts for the overwhelming majority of on-demand fetch requests.

"AI bots are reshaping how the internet is accessed and experienced, introducing new complexities for digital platforms," Fastly senior security researcher Arun Kumar opined in a statement on the report's release. "Whether scraping for training data or delivering real-time responses, these bots create new challenges for visibility, control, and cost. You can't secure what you can't see, and without clear verification standards, AI-driven automation risks are becoming a blind spot for digital teams."

The company's report is based on analysis of Fastly's Next-Gen Web Application Firewall (NGWAF) and Bot Management services, which the company says "protect over 130,000 applications and APIs and inspect more than 6.5 trillion requests per month" – giving it plenty of data to play with. The data reveals a growing problem: an increasing website load comes not from human visitors, but from automated crawlers and fetchers working on behalf of chatbot firms.

"Some AI bots, if not carefully engineered, can inadvertently impose an unsustainable load on webservers," Fastly's report warned, "leading to performance degradation, service disruption, and increased operational costs." Kumar separately noted to The Register, "Clearly this growth isn't sustainable, creating operational challenges while also undermining the business model of content creators. We as an industry need to do more to establish responsible norms and standards for crawling that allows AI companies to get the data they need while respecting websites' content guidelines."


https://www.theregister.com/2025/08/21/ai_crawler_traffic/
AI crawlers and fetchers are blowing up websites, with Meta and OpenAI the worst offenders (Original Post) justaprogressive Aug 21 OP
It's becoming more and more useless Johnny2X2X Aug 21 #1
All part of the plan of those who do not want an informed public. Attilatheblond Aug 21 #2
This is exactly the horseshit that we've been trying to mitigate at DU EarlG Aug 21 #3
I would think that EVERYONE on DU justaprogressive Aug 21 #4
Most of the time the behind the scenes tech stuff EarlG Aug 21 #5
I've been posting about AI endangering the web for as long as I've seen warnings about it, because highplainsdem Aug 21 #6
I have experience with the downside of "private" forums EarlG Aug 21 #7

Johnny2X2X

(23,238 posts)
1. It's becoming more and more useless
Thu Aug 21, 2025, 11:41 AM
Aug 21

I truly believe we may be heading towards a completely useless internet where nothing works and you can't be sure of anything you read there.

EarlG

(23,153 posts)
3. This is exactly the horseshit that we've been trying to mitigate at DU
Thu Aug 21, 2025, 12:55 PM
Aug 21

For a while now our server costs have been going up and up, month after month, and earlier this year we had to upgrade our database to deal with excess traffic (some DUers may recall that prior to this upgrade we were frequently having to switch to "members-only" mode during the day because the database would crap out).

As it turned out, the problem was that most of that excess traffic was bogus -- not real people, just bot crawlers. If you're a site that relies on advertising for revenue, that's a double problem, because advertisers don't like bogus bot traffic.

You can see where this is going. The bogus excess traffic caused our server costs to climb, and caused our ad revenue to drop. We were getting screwed at both ends, all because these AI assholes feel like they have the right to crawl everyone's sites and steal everyone's content.

Last month we made some changes to our server configuration and firewall, which act as a strong block on a lot of bot traffic -- but not all of it. We've reduced the problem significantly, but it hasn't completely gone away:

https://www.theverge.com/news/718319/perplexity-stealth-crawling-cloudflare-ai-bots-report
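
To illustrate why a block like this catches a lot of bot traffic but not all of it, here is a minimal sketch of user-agent filtering. It is not DU's actual configuration, which isn't described here. The token list and the function names `is_declared_ai_crawler` and `status_for` are assumptions made up for the example. A crawler that announces itself gets refused, while one that sends a browser-like user agent passes straight through.

```python
# Illustrative only: a hypothetical list of user-agent substrings that some
# AI crawlers announce. A real deployment would maintain its own list.
AI_CRAWLER_TOKENS = (
    "gptbot",
    "claudebot",
    "meta-externalagent",
    "perplexitybot",
    "ccbot",
)


def is_declared_ai_crawler(user_agent: str) -> bool:
    """Return True if the user agent contains one of the listed tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in AI_CRAWLER_TOKENS)


def status_for(user_agent: str) -> int:
    """Return the HTTP status this simple filter would answer with."""
    return 403 if is_declared_ai_crawler(user_agent) else 200


if __name__ == "__main__":
    samples = (
        # A crawler that identifies itself is refused.
        "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
        # A browser-like string passes, whether it comes from a person or
        # from a bot that hides its identity.
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0",
    )
    for ua in samples:
        print(status_for(ua), ua)
```

Name matching only stops bots that say who they are, which is why a filter of this kind reduces the problem without eliminating it.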

It's disgusting, and I don't disagree that it is endangering the open Internet. Ultimately, if it came right down to it and DU could no longer afford to run because of these crawlers, my next move would be to make DU a private site which is unavailable to anyone who is not signed in. This would necessitate that everyone pay a subscription in order to use DU (since we would no longer be able to make ad revenue, and because confirming that someone is a paying subscriber would protect against bots accessing the site), so whether or not it is feasible would depend on how many people would be prepared to do that. To be completely clear, I have absolutely no plans to make this happen at the moment, but it's something I've had at the back of my mind as a "break glass in case of emergency" type situation.

EarlG

(23,153 posts)
5. Most of the time the behind the scenes tech stuff
Thu Aug 21, 2025, 02:14 PM
Aug 21

isn’t super relevant to DUers. It’s not something people need to be thinking about when they visit, but sometimes I read articles like this one and I don’t think it hurts to let folks know that this news about AI crawlers breaking the Internet is not just theoretical, it does affect us here.

To be clear, the “private site” idea truly would be a last resort. It’s a contingency plan in my mind, but no work has been done to actually make something like that come to fruition.

highplainsdem

(57,911 posts)
6. I've been posting about AI endangering the web for as long as I've seen warnings about it, because
Thu Aug 21, 2025, 03:07 PM
Aug 21

it is such a serious threat.

I love the web. I first got online before there was a real web, when there were separate online services and subscribing to a number of them at the same time for various reasons could really add up (business associates on one or more, family and friends on others, professional forums elsewhere...and this was back in the days when long distance calls could also add up and texting and calls weren't nearly-free alternatives to online communication). It was fantastic when it became a real world wide web.

I hate the way AI is threatening it now.

And the threat with AI scraping, not always clearly stated, is that if sites don't allow AI-using search engines, whether Google or OpenAI or Meta or others, to scrape that site as much as they want, that site will simply vanish from their search results.

Which is extortion.

Private forums also carry another risk, which I know from experience, having run forums that were both public and private.

Members of private forums can often greatly overestimate how private those forums truly are. And even when that's made very clear in the registration process, people tend to think it's okay to post personal information they would never post on a public website, not just identifying themselves (with addresses and phone numbers) but posting critiques of friends, family, employers. Or posting about work they've done that might be reviewable elsewhere online, including where it's sold, by someone they might've offended in that private forum. It's just way too easy for them to reveal too much when talking with people they've come to trust - forgetting that anyone else in that forum can also see it.

Which is also a risk with chatrooms. One reason I'm glad DU doesn't have one. A chatroom large enough to accommodate very many DUers would also be very expensive.

If DU ever goes completely private, I would recommend that every sign-in, and maybe the page opening to allow users to post messages, include a reminder to keep information off DU that you would not post in a public message on a large social media platform.

EarlG

(23,153 posts)
7. I have experience with the downside of "private" forums
Thu Aug 21, 2025, 04:25 PM
Aug 21

There are many reasons why we switched from the moderator system to the Jury system, but one of them was the problems involved with maintaining the integrity of the private moderator forum. People who weren't moderators didn't like the idea that other members had a private space to talk about them behind their backs, and people who were moderators expected a certain level of privacy to hold discussions about rule violations -- and we couldn't 100% guarantee that privacy. Meanwhile trolls over at various conservative websites would brag about how they'd "infiltrated" the moderator forum, even though they hadn't -- but as a psychological warfare tactic it worked pretty well, because members would see those brags and believe that there were trolls moderating DU.

It's why members who serve on MIRT are reminded that the MIRT forum on DU is not "private" even though it is access-restricted, so they should expect that anything they say in there could potentially find its way out. It's a reminder to people to try and keep things on-task and generally professional while serving on MIRT -- there is basically no "social" chat in the MIRT forum (unless someone is just letting people know that they're going to be away for a little while, or something like that).

But regardless, making DU "private" isn't something that's even in the planning stages -- it's just a contingency plan in my own mind at the moment.
