Cloudflare has a toggle for blocking AI scrapers. I don’t think it’s default, bu...

kyledrake · on Dec 23, 2024

This just feels like mystery meat to me. My guess is that a lot of legitimate users and VPNs are being blocked from viewing sites, which numerous users in this discussion have confirmed.

This seems like a very bad way to approach this, and ironically their model quite possibly also uses some sort of machine learning to work.

A few web hosting platforms are using the Cloudflare blocker, and I think it's incredibly unethical. They're inevitably blocking millions of legitimate users from viewing content on other people's sites and then pretending it's "anti-AI". To paraphrase Theo de Raadt, they saw something on the shelf, and it has all sorts of pretty colours, and they bought it.

pixl97 · on Dec 23, 2024

> I think it's incredibly unethical.

The internet isn't built on ethical behavior, unfortunately.

kyledrake · on Dec 23, 2024

I get that a lot of people are opposed to AI, but blocking random IP ranges seems like a really inappropriate way to do this; the friendly fire is going to be massive. The robots.txt approach is fine, but it would be nice if it were standardized so that you don't have to keep updating it for every new company (a generic "no LLM crawling" directive, for example).

input_sh · on Dec 23, 2024

It's not much smarter than just adding user agents to robots.txt manually.
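
For anyone who hasn't done the manual version, here is a rough sketch of what it looks like, checked with Python's standard `urllib.robotparser`. The two crawler tokens (`GPTBot`, `CCBot`) are only examples of the kind of entries you would list, `example.com` is a placeholder, and the `can_fetch` helper is a made-up name for illustration. It also only matters for crawlers that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: one block per AI crawler you want to exclude.
# The list is illustrative, not complete; new crawlers need new entries.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if the rules above would let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot/2.0", "Mozilla/5.0"):
        print(agent, can_fetch(agent))
```

The obvious downside is the one raised above: every new crawler needs its own `User-agent` block, since there is no generic directive that covers them all.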

jaybna · on Dec 23, 2024

They might get into the micro-licensing game too. More power to them.