As search engines gradually turn into "answer engines," users can get information without ever clicking through to the original website. For website operators, news platforms, and creators, this means less traffic, less revenue, and even a loss of control over how their content is used. In response, Cloudflare recently launched the Content Signals Policy, which helps websites and creators express their preferences more clearly: Can AI companies use your website's content? How can they use it? Can they use it to train models?
In short, Cloudflare will help users update their websites' robots.txt files. A robots.txt file is a small text file that tells crawlers which parts of a site they may and may not crawl, but until now it has had no way to govern how crawlers use the content once they have it. The new Content Signals Policy lets websites state their preferences to AI crawlers in a machine-readable format: "yes" means the use is permitted, "no" means it is not, and the absence of a signal means the site expresses no preference.
The policy also clearly defines the common ways AI crawlers use content: search, AI input, and AI training. In other words, a website can tell AI companies directly, "My content can be crawled and shown to users, but it cannot be used to train your model," or refuse AI crawling entirely.
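To make the format concrete, here is a minimal Python sketch that renders a robots.txt group carrying such signals. The `Content-Signal` directive name and the signal keys `search`, `ai-input`, and `ai-train` are assumptions taken from Cloudflare's published syntax, not from this article, and should be checked against Cloudflare's documentation. The helper `render_robots_txt` is hypothetical and exists only for illustration.

```python
# Sketch only: the "Content-Signal" directive and the signal keys below are
# assumed from Cloudflare's published syntax; verify them before relying on this.
SIGNALS = ("search", "ai-input", "ai-train")


def render_robots_txt(preferences, user_agent="*"):
    """Build one robots.txt group as a string.

    `preferences` maps a signal name to True ("yes") or False ("no").
    Signals that are absent express no preference and are left out.
    """
    unknown = set(preferences) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")

    # Emit signals in a fixed order so the output is deterministic.
    parts = [
        f"{name}={'yes' if preferences[name] else 'no'}"
        for name in SIGNALS
        if name in preferences
    ]

    lines = [f"User-Agent: {user_agent}"]
    if parts:
        lines.append("Content-Signal: " + ", ".join(parts))
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "Crawl and show my content in search, but do not train on it."
    print(render_robots_txt({"search": True, "ai-train": False}), end="")
```

In this sketch, leaving `ai-input` out of the dictionary is what expresses "no preference" for that use; it is not written as "no".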
Matthew Prince, Co-founder and CEO of Cloudflare, said, "The internet can't wait for solutions. Creators have the right to decide who can use their content, and there should be a clear way to communicate this to AI companies." He also noted that the enhanced robots.txt is more than just a technical update; it's a clear signal to AI companies: "This tells the industry: creators' wishes cannot be ignored."
For website operators, this makes expressing preferences straightforward. For example, a news website that previously attracted hundreds of thousands of visits a day could see its traffic shrink if AI answers users' questions directly. With the new Content Signals Policy, the site can declare in robots.txt that its content may not be used for AI training. Even if crawlers still fetch the content, there is now a clearly stated rule on how AI may use it, and misuse could potentially carry legal consequences in the future.
Currently, more than 3.8 million domains use Cloudflare's managed robots.txt service to indicate that they do not want their content used for AI training. With the launch of the new policy, users can set additional preferences and give clear instructions to all automated visitors, such as AI crawlers. For users who want to customize their own robots.txt, Cloudflare also provides tools and sample guidance.
The industry has also expressed support for the policy. Danielle Coffey, president of the News/Media Alliance, believes it will allow content publishers to regain control of their content and help ensure continued funding for high-quality content creation. Quora and Reddit praised Cloudflare's efforts to establish transparent control mechanisms that let AI companies better respect the preferences of content creators. RSL Collective and Stack Overflow pointed out that the policy not only protects creators' rights and interests but also helps build a sustainable and fair online ecosystem in which creators and platforms can prosper together in the AI era.
The policy is also practical for individual creators and small and medium-sized websites. For example, a blogger who writes daily technology articles can set a rule in robots.txt to the effect of "AI may display summaries of my posts but may not train models on them." Likewise, an e-commerce site selling handmade goods can state that its product photos may not be used to train image recognition models. This not only protects content but also lets creators keep producing with greater peace of mind.
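Seen from the crawler's side, a rule like the blogger's could be read as in the sketch below. It assumes the same `Content-Signal: name=yes|no` syntax from Cloudflare's published policy, which this article does not spell out, and it deliberately ignores `User-Agent` grouping for brevity. The functions `parse_content_signals` and `is_permitted` are hypothetical names, not part of any library.

```python
# Sketch only: assumes "Content-Signal: name=yes|no, ..." lines in robots.txt.
# User-Agent grouping is ignored here, which a real crawler would need to honor.

def parse_content_signals(robots_txt):
    """Collect {signal_name: bool} from every Content-Signal line."""
    signals = {}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "content-signal":
            continue
        for item in value.split(","):
            name, eq, setting = item.partition("=")
            setting = setting.strip().lower()
            if eq and setting in ("yes", "no"):
                signals[name.strip().lower()] = (setting == "yes")
    return signals


def is_permitted(signals, use):
    """Return True (yes), False (no), or None when no preference is given."""
    return signals.get(use)


if __name__ == "__main__":
    blog_robots = (
        "User-Agent: *\n"
        "Content-Signal: search=yes, ai-train=no\n"
        "Allow: /\n"
    )
    prefs = parse_content_signals(blog_robots)
    for use in ("search", "ai-input", "ai-train"):
        print(use, is_permitted(prefs, use))
```

Returning `None` for a missing signal keeps "no preference" distinct from an explicit "no", matching the three-way meaning described earlier.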
Cloudflare emphasized that, effective immediately, the robots.txt files of all customers using its managed robots.txt service will be updated automatically to include the new policy language. For users who prefer to customize their own files, tools and tutorials are provided so that every website can keep control over how its content is used. As AI becomes increasingly embedded in the online ecosystem, the policy is intended to serve as a clear, readable "common language" among creators, platforms, and AI companies.




