Cloudflare Defaults to Blocking AI Bots, Offering New Control and Monetization Options for Websites
Cloudflare now blocks AI bots by default on hosted sites, enabling clients to control and monetize AI crawling with new tools and services.
Cloudflare Blocks AI Bots by Default
Cloudflare, a major internet infrastructure provider, has announced that it will now block AI bots from crawling websites it hosts by default. This new policy aims to give website owners more control over AI-driven data collection.
Client Control Over AI Crawlers
Cloudflare clients can manually allow or block individual AI bots on a case-by-case basis. Cloudflare is also launching a "pay-per-crawl" service that lets clients earn compensation each time an AI bot accesses their content.
The Role of AI Web Crawlers
AI bots are specialized web crawlers that scan and collect data from websites to train AI models. Unlike traditional search engine crawlers, these AI crawlers often do not credit the original content creators or provide monetization opportunities, which creates tension between content owners and AI developers.
Changing Dynamics of Web Content Usage
Will Allen, Cloudflare’s head of AI privacy and media products, explains that the traditional model where search engines indexed content and sent traffic back to sites is shifting. Now, AI models use web data extensively without the same reciprocal benefits for content creators.
Flexible Options for Website Owners
Cloudflare lets clients decide how AI bots interact with their websites at different stages of the AI lifecycle, such as training, fine-tuning, and inference. Verified crawlers can be added to an allowlist, and clients can set fees for AI bots to crawl their sites.
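As a rough illustration of what per-crawler, per-stage control could look like, here is a minimal Python sketch of a default-block policy table. It is not Cloudflare's API or configuration format: the crawler names, the fee, and the `decide` function are all hypothetical, chosen only to show how allow, block, and charge decisions might be combined.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    TRAINING = "training"
    FINE_TUNING = "fine-tuning"
    INFERENCE = "inference"


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    CHARGE = "charge"


@dataclass(frozen=True)
class Decision:
    action: Action
    fee_usd: float = 0.0  # only meaningful when action is CHARGE


# Hypothetical site policy: every (crawler, purpose) pair not listed is blocked.
POLICY = {
    ("ExampleSearchBot", Purpose.INFERENCE): Decision(Action.ALLOW),
    ("ExampleTrainerBot", Purpose.TRAINING): Decision(Action.CHARGE, fee_usd=0.01),
}


def decide(crawler: str, purpose: Purpose, verified: bool) -> Decision:
    """Return the site's decision for one crawl request."""
    # Unverified crawlers never reach the policy table.
    if not verified:
        return Decision(Action.BLOCK)
    # Default to blocking, mirroring the block-by-default stance.
    return POLICY.get((crawler, purpose), Decision(Action.BLOCK))
```

In this sketch, a verified `ExampleTrainerBot` crawling for training would be charged the configured fee, while the same bot crawling for any other purpose, or any unverified bot, would be blocked.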
Support from Media and Community Platforms
Media organizations and community platforms, including the Associated Press, Time, Quora, and Stack Overflow, have expressed support for Cloudflare’s move. Stack Overflow’s CEO emphasized the importance of compensating community platforms that contribute to large language models.
Addressing Noncompliance and Malicious Crawlers
Web crawlers are expected to respect the directives a site publishes in its robots.txt file, but some AI companies have ignored these rules. Cloudflare’s bot verification system helps identify legitimate AI crawlers, facilitating cooperation. For malicious bots, Cloudflare will leverage its expertise in mitigating denial-of-service attacks.
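For readers unfamiliar with robots.txt, the following minimal Python sketch shows how a well-behaved crawler might check a site's directives before fetching a page, using the standard library's `urllib.robotparser`. The robots.txt contents and the `ExampleAIBot` user agent are invented for illustration. Compliance is voluntary: nothing in the file itself prevents a crawler from skipping this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is barred, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real crawler would download them first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "Mozilla/5.0"):
        print(agent, may_fetch(agent, "https://example.com/article"))
```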
Innovative Anti-Crawling Techniques
Cloudflare has previously implemented measures like redirecting unwanted crawlers to AI-generated fake pages to waste their resources. These tactics will remain for bad actors, but the company hopes its new services encourage positive relationships between AI firms and content providers.
Concerns Over Research and Noncommercial Use
Some experts caution that default blocking of AI bots might hinder noncommercial activities such as academic research and web archiving. Not all AI systems are commercial or competitive with web publishers, so care must be taken to preserve open access for these uses.
Balancing Openness and Monetization
Cloudflare’s goal is to help maintain internet openness by enabling web publishers to negotiate sustainable agreements with AI companies. Verified crawlers and granular control allow site owners to keep access open for human visitors while managing AI bot traffic effectively.
Will Allen summarized this approach as offering more precise control to website owners, helping balance AI innovation with content creator rights.