General Discussion
AI crawlers and fetchers are blowing up websites, with Meta and OpenAI the worst offenders
I can only see one thing causing this to stop: the AI bubble popping
According to the report [PDF], Facebook owner Meta's AI division accounts for more than half of those crawlers, while OpenAI accounts for the overwhelming majority of on-demand fetch requests.
"AI bots are reshaping how the internet is accessed and experienced, introducing new complexities for digital platforms," Fastly senior security researcher Arun Kumar opined in a statement on the report's release. "Whether scraping for training data or delivering real-time responses, these bots create new challenges for visibility, control, and cost. You can't secure what you can't see, and without clear verification standards, AI-driven automation risks are becoming a blind spot for digital teams."
The company's report is based on analysis of Fastly's Next-Gen Web Application Firewall (NGWAF) and Bot Management services, which the company says "protect over 130,000 applications and APIs and inspect more than 6.5 trillion requests per month" giving it plenty of data to play with. The data reveals a growing problem: an increasing website load comes not from human visitors, but from automated crawlers and fetchers working on behalf of chatbot firms.
"Some AI bots, if not carefully engineered, can inadvertently impose an unsustainable load on webservers," Fastly's report warned, "leading to performance degradation, service disruption, and increased operational costs." Kumar separately noted to The Register, "Clearly this growth isn't sustainable, creating operational challenges while also undermining the business model of content creators. We as an industry need to do more to establish responsible norms and standards for crawling that allows AI companies to get the data they need while respecting websites content guidelines."
https://www.theregister.com/2025/08/21/ai_crawler_traffic/

Johnny2X2X
(23,238 posts)
I truly believe we may be heading towards a completely useless internet where nothing works and you can't be sure of anything you read there.
Attilatheblond
(7,012 posts)
EarlG
(23,153 posts)
For a while now our server costs have been going up and up, month after month, and earlier this year we had to upgrade our database to deal with excess traffic (some DUers may recall that prior to this upgrade we were frequently having to switch to "members-only" mode during the day because the database would crap out).
As it turned out, the problem was that most of that excess traffic was bogus -- not real people, just bot crawlers. If you're a site that relies on advertising for revenue, that's a double problem, because advertisers don't like bogus bot traffic.
You can see where this is going. The bogus excess traffic caused our server costs to climb, and caused our ad revenue to drop. We were getting screwed at both ends, all because these AI assholes feel like they have the right to crawl everyone's sites and steal everyone's content.
Last month we made some changes to our server configuration and firewall which act as a strong block on a lot of bot traffic -- but not all of it. We've reduced the problem significantly, but it hasn't completely gone away:
https://www.theverge.com/news/718319/perplexity-stealth-crawling-cloudflare-ai-bots-report
It's disgusting, and I don't disagree that it is endangering the open Internet. Ultimately, if it came right down to it and DU could no longer afford to run because of these crawlers, my next move would be to make DU a private site which is unavailable to anyone who is not signed in. This would necessitate that everyone pay a subscription in order to use DU (since we would no longer be able to make ad revenue, and because confirming that someone is a paying subscriber would protect against bots accessing the site), so whether or not it is feasible would depend on how many people would be prepared to do that. To be completely clear, I have absolutely no plans to make this happen at the moment, but it's something I've had at the back of my mind as a "break glass in case of emergency" type situation.
justaprogressive
(5,265 posts)
needs to know this...
EarlG
(23,153 posts)
isn't super relevant to DUers. It's not something people need to be thinking about when they visit, but sometimes I read articles like this one and I don't think it hurts to let folks know that this news about AI crawlers breaking the Internet is not just theoretical, it does affect us here.
To be clear, the private site idea truly would be a last resort. It's a contingency plan in my mind, but no work has been done to actually make something like that come to fruition.
highplainsdem
(57,911 posts)
it is such a serious threat.
I love the web. I first got online before there was a real web, when there were separate online services and subscribing to a number of them at the same time for various reasons could really add up (business associates on one or more, family and friends on others, professional forums elsewhere...and this was back in the days when long distance calls could also add up and texting and calls weren't nearly-free alternatives to online communication). It was fantastic when it became a real world wide web.
I hate the way AI is threatening it now.
And the threat with AI scraping, not always clearly stated, is that if sites don't allow AI-using search engines, whether Google or OpenAI or Meta or others, to scrape that site as much as they want, that site will simply vanish from their search results.
Which is extortion.
Private forums also carry another risk, which I know from experience, having run forums that were both public and private.
Members of private forums often greatly overestimate how private those forums truly are. And even when that's made very clear in the registration process, people tend to think it's okay to post personal information they would never post on a public website, not just identifying themselves (with addresses and phone numbers) but posting critiques of friends, family, employers. Or posting about work they've done that might be reviewable elsewhere online, including where it's sold, by someone they might've offended in that private forum. It's just way too easy for them to reveal too much when talking with people they've come to trust - forgetting that anyone else in that forum can also see it.
Which is also a risk with chatrooms. One reason I'm glad DU doesn't have one. A chatroom large enough to accommodate very many DUers would also be very expensive.
If DU ever goes completely private, I would recommend that every sign-in, and maybe the page opening to allow users to post messages, include a reminder to keep information off DU that you would not post in a public message on a large social media platform.
EarlG
(23,153 posts)
There are many reasons why we switched from the moderator system to the Jury system, but one of them was the problems involved with maintaining the integrity of the private moderator forum. People who weren't moderators didn't like the idea that other members had a private space to talk about them behind their backs, and people who were moderators expected a certain level of privacy to hold discussions about rule violations -- and we couldn't 100% guarantee that privacy. Meanwhile trolls over at various conservative websites would brag about how they'd "infiltrated" the moderator forum, even though they hadn't -- but as a psychological warfare tactic it worked pretty well, because members would see those brags and believe that there were trolls moderating DU.
It's why members who serve on MIRT are reminded that the MIRT forum on DU is not "private" even though it is access-restricted, so they should expect that anything they say in there could potentially find its way out. It's a reminder to people to try and keep things on-task and generally professional while serving on MIRT -- there is basically no "social" chat in the MIRT forum (unless someone is just letting people know that they're going to be away for a little while, or something like that).
But regardless, making DU "private" isn't something that's even in the planning stages -- it's just a contingency plan in my own mind at the moment.