Cloudflare Puts Publishers in Control With Default Block on AI Web Crawlers

Starting September 15, mixed-use AI crawlers will be blocked by default across millions of sites to build a more controlled AI content ecosystem.

  • [Image: Chetan Jha/MITSMR Middle East]

    Cloudflare is giving AI companies less than three months to change how they collect information from the web, introducing a policy that could significantly shift the balance between publishers, search engines, and AI model providers.

    Beginning September 15, 2026, Cloudflare will block “mixed-use” web crawlers by default on advertising-supported websites. These crawlers—used simultaneously for traditional search indexing, AI agents, and model training—will no longer be able to access content unless website owners explicitly choose to allow them.

    The default changes will apply to all new Cloudflare customers, new websites created by existing customers, and the company’s existing free-tier users.

    The decision is part of a growing effort to separate AI’s data economy from the decades-old economics of web search. For years, publishers accepted search engine crawlers because indexing drove referral traffic and advertising revenue. Generative AI has disrupted that bargain by consuming content to answer questions directly, often reducing the need for users to visit the source.

    The company also took aim at what it described as the internet’s largest search provider, suggesting that its combined crawling strategy gives it access to substantially more web content than competing AI developers because website owners struggle to separate search indexing from AI-related data collection.

    Google has disputed that characterization. The company already offers Google-Extended, a crawler control that lets publishers prevent their content from being used for AI training and in products such as Gemini Apps and Vertex AI without affecting visibility in Google Search. However, Google’s primary crawler, Googlebot, continues to collect content for search, AI Overviews, and AI Mode alike.
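To make the distinction concrete, here is a minimal sketch of how a publisher might opt out of Google-Extended while leaving Googlebot untouched. The robots.txt contents and the example.com URL are hypothetical, and Python's standard-library parser is used only for illustration; it is not Google's own parser and may differ from it in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher (an assumption, not
# taken from any real site): opt out of Google-Extended, allow Googlebot.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text in memory, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/article"  # placeholder URL

# The AI-related token is blocked; the search crawler is not.
print(parser.can_fetch("Google-Extended", url))  # False
print(parser.can_fetch("Googlebot", url))        # True
```

Because the two tokens are controlled separately, a rule like this has no effect on anything Googlebot itself collects, which is the gap the article describes.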

    “Now that the majority of traffic on the internet is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” Cloudflare co-founder and CEO Matthew Prince said while announcing the policy. The company noted that automated bot traffic has now overtaken human internet traffic earlier than industry forecasts had anticipated.

    Cloudflare has spent the past two years building infrastructure that allows publishers to manage AI access rather than simply block it. Earlier initiatives included Pay Per Crawl, which enables websites to charge AI companies for scraping their content. That system is now evolving into Pay Per Use, a model that compensates publishers based on the value their content generates within AI products rather than the number of pages crawled.

    The company argues that the approach could also reduce unnecessary infrastructure costs. Internal data suggests that more than half of AI crawler traffic consists of repeatedly fetching web pages that have not changed, consuming bandwidth and computing resources without creating additional value.

    To test the new model, Cloudflare is partnering with AI companies Ceramic.ai and You.com. Participating publishers will receive payments when their content appears in Ceramic’s AI search results or when You.com accesses premium content, while other AI providers will be able to adapt the framework to their own products.

    To a certain extent, the policy will define the future of AI’s relationship with the open web. AI systems’ dependence on proprietary, continuously updated information has become an economic negotiation, one that should be governed less by crawling capability and more by permission, transparency, and compensation.
