Cloudflare now blocks AI web scraping by default.
A significant change is underway in artificial intelligence (AI) as more GenAI vendors grapple with paying a fair price for high-quality training data while remaining profitable.
This shift is reflected in a new policy from Cloudflare, a prominent web infrastructure and security company. Under the policy, owners of newly registered domains that use Cloudflare's services anywhere in the world must explicitly opt in before AI web crawlers, such as those from OpenAI, can access their content. Previously, access was generally granted by default.
The updated policy also introduces a "Pay Per Crawl" program for select publishers. It lets them set pricing terms for AI scrapers, offering content creators a potential new revenue stream. Existing domains are not automatically blocked, but the policy underscores the need for a more structured approach to web scraping.
The legality of web scraping has long been a murky area, with loosely enforced conventions such as the robots.txt file serving as the primary guide. Recent developments highlight the gap between fast-moving technologies and slower regulatory systems. In May 2025, Irish and German regulators declined to block Meta from using Facebook and Instagram data, signalling a potential shift in attitudes towards data usage.
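To make the robots.txt convention concrete, here is a minimal sketch of how a well-behaved crawler could check whether it is permitted to fetch a page, using Python's standard `urllib.robotparser` module. The sample rules, the `example.com` URL, and the helper name `is_allowed` are illustrative assumptions, not Cloudflare's configuration; `GPTBot` is used as an example of an AI crawler user agent. As noted above, robots.txt is advisory: it only works if the crawler chooses to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    print("GPTBot allowed:", is_allowed(SAMPLE_ROBOTS_TXT, "GPTBot", page))
    print("Other agent allowed:", is_allowed(SAMPLE_ROBOTS_TXT, "Mozilla", page))
```

Under these sample rules the check fails for `GPTBot` and succeeds for other agents. Nothing in the file itself stops a crawler that ignores it, which is why enforcement at the network level changes the picture.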
Competition from China may also play a role. With many Western GenAI companies facing economic uncertainty, some may choose to exit the business, which could shift the balance of power in the AI industry.
In some jurisdictions, however, deliberately bypassing anti-bot protection and scraping data on a massive scale may constitute a criminal offense. Breach of contract claims, rather than copyright claims, could pose the most serious legal threat to GenAI companies.
Cloudflare CEO Matthew Prince has emphasised the need for publishers to have control and for a new economic model that benefits everyone. As web scraping continues to evolve, a more structured, fair, and legal approach will be needed to ensure a sustainable future for both AI companies and content creators.