I Stopped An AI Bot From Crashing My eCommerce Website

Yesterday morning, the traffic to one of my ecommerce websites quadrupled.

I initially thought that something had gone viral, or a local newspaper had written about it.

But there was no increase in sales or engagement of any kind.

Then I started getting email alerts that my server was experiencing heavy resource load.

That means it was becoming unresponsive due to all the traffic. And that means that even if a real customer wanted to buy something, they couldn’t, or would face long delays waiting for the page to load.

So my next thought was that it was a denial of service attack. In case you don’t know, that is when an attacker takes control of multiple computers around the world and uses them to overwhelm your website with a flood of fake traffic.

My website uses Cloudflare, and there is a button there that turns on the “I’m Under Attack” mode. With it on, every visitor to my website receives a challenge to check that they are human.

The flood of traffic started to go down, but you cannot leave “under attack” mode on forever, because it will end up blocking third party systems that you integrate with, such as payment gateways and shipping services.

But it did allow me to see just where the blocked traffic was coming from. Looking at the logs, it turned out that it was coming from ClaudeBot.
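If you want to run the same kind of check on your own server logs, here is a minimal Python sketch. It assumes your access log is in the common “combined” format, where the user agent is the last quoted field on each line. The function name, the sample lines, and the shortened user-agent strings are made up for illustration and are not taken from my real logs.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user agent is the
# last double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its request count."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up sample lines, for illustration only.
sample_log = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /product/1 HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.5 - - [01/Jan/2024:10:00:01 +0000] "GET /product/2 HTTP/1.1" 200 498 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2024:10:00:02 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '203.0.113.5 - - [01/Jan/2024:10:00:03 +0000] "GET /product/3 HTTP/1.1" 200 530 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
]

# The busiest user agents come first.
top_agents = count_user_agents(sample_log).most_common(3)
for agent, hits in top_agents:
    print(hits, agent)
```

On a real server you would pass in the lines of your actual log file instead of the sample list. A single user agent dominating the count is the kind of pattern I saw.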

If you don’t know what that is, ClaudeBot is a bot that scrapes all the data on your website to power the Claude AI by Anthropic. It’s similar to ChatGPT by OpenAI.

The data that it scrapes is used to make their AI smarter at answering questions.

Now, I don’t have a problem with bots scraping my data. After all, Google has been doing this for years. Googlebot will crawl your website to understand its contents, and when somebody types a question in Google search, your website can appear as a relevant answer. That’s the basic premise of search engine optimisation, or SEO.

But, the aggressive nature of ClaudeBot meant that it was hammering my website relentlessly, to the point that my server was completely overloaded. That, I have a problem with.

And I’m not alone. From my research, this problem with ClaudeBot has been happening for the last 6 months, with people all over the world reporting that their websites slowed down from being scraped so heavily. And it seems that the bot even ignores robots.txt instructions.
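For anyone who has not seen one, a robots.txt file is a list of polite requests to crawlers, and it only works if the bot chooses to honour it. As a rough sketch, the rules below ask ClaudeBot to stay away while allowing everyone else, and Python’s built-in robots.txt parser shows how a well-behaved crawler would read them. The example.com URL is a placeholder, and this is not a copy of my own robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Example rules: ask ClaudeBot to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL, for illustration only.
url = "https://example.com/products/1"
claudebot_allowed = parser.can_fetch("ClaudeBot", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)

print("ClaudeBot allowed:", claudebot_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

This only shows what the file asks for. It cannot force a bot to comply, which is why I ended up blocking at Cloudflare instead.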

Since I was using Cloudflare, I was able to block ClaudeBot from accessing my site. Once I blocked the bot, my server load returned to normal, and my website was working fine again.

Should you block AI bots from your site?

But it got me thinking: if AI bots are taking your content to make themselves smarter, should you block them?

Just like Google, if you allow Claude to scrape your website, your content can be more easily found by AI-powered search engines. This could lead to more people discovering your business.

And, from a community perspective, your data is being used to improve the AI models, which helps improve the AI services that I’m already using, such as ChatGPT.

I also don’t have a concern about copyright in this case, because the content being scraped is basically the details of the products that I sell on the site anyway. Sure, I have been writing new content, such as case studies and other articles, but it was all for SEO anyway. After all, it’s not like AI can scrape photos of an original oil painting and generate new images from them. Oh wait.

No, my main concern was the savagery of the scraping. My server runs at 15% CPU and 20% RAM on any given day. But this time, it was completely maxed out at 100% and sending 18MB/s. That’s like downloading the entire “Fellowship of the Ring” on Blu-ray in 20 minutes!
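To sanity-check that comparison, here is the arithmetic as a few lines of Python. It assumes the 18MB/s rate held steady for the full 20 minutes, which is a simplification, and the variable names are my own.

```python
# Assumption: the 18 MB/s rate stays constant for the whole 20 minutes.
rate_mb_per_s = 18
duration_min = 20

total_mb = rate_mb_per_s * duration_min * 60  # megabytes sent in 20 minutes
total_gb = total_mb / 1000                    # using 1 GB = 1000 MB

print(total_mb, "MB is about", total_gb, "GB")
```

That works out to 21.6 GB, which is in the same ballpark as the 25 GB capacity of a single-layer Blu-ray disc.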

If I didn’t have my alert systems in place, I may not have even known about this until a customer told me about it. In that time, who knows how many sales I would have lost.

One other solution is to limit the rate at which ClaudeBot scrapes my data. So, I allow it to scrape, but only at a frequency that I decide. That helps my server stay up for real customers.
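To give an idea of what rate limiting means in practice, here is a minimal sketch of a token bucket, one common way to do it. This is not how Cloudflare implements it, and I have not deployed this myself. The class name and the numbers (1 request per second, bursts of up to 5) are assumptions picked for illustration.

```python
import time

class TokenBucket:
    """Allow roughly `rate_per_sec` requests per second, with short bursts up to `capacity`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed now, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the time that has passed, up to the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Hypothetical usage: one bucket for all requests whose user agent contains "ClaudeBot".
claudebot_bucket = TokenBucket(rate_per_sec=1, capacity=5)

def should_serve(user_agent):
    """Serve everyone else normally; serve ClaudeBot only while it has tokens left."""
    if "ClaudeBot" in user_agent:
        return claudebot_bucket.allow()
    return True
```

Requests that are refused would typically get an HTTP 429 “Too Many Requests” response, so the bot can keep crawling, just at a pace the server can handle.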

I have not implemented rate limiting since I’m on a free plan of Cloudflare. But once Cloudflare recognises ClaudeBot as an AI bot, it will be possible to add limits to it. Your move, Cloudflare.

Now, I want to know what you think

Are you a small business with an ecommerce website? Have you been hit with unexpected slowdowns that you couldn’t explain? And are you OK with AI bots scraping your data? Let me know.

