How to Block CCBot
Complete guide to blocking CCBot (Common Crawl) from crawling your website using robots.txt, server configuration, and Switch workflows.
Should You Block CCBot?
CCBot is Common Crawl's web crawler. The datasets it builds are widely used to train AI models, so blocking it prevents your content from being included in future Common Crawl snapshots without affecting your search visibility.
This is a common and recommended action for sites that want to control how their content is used in AI training.
Blocking Methods
1. robots.txt
Effectiveness: high for cooperative crawlers. Add a Disallow rule for CCBot's user-agent string in your robots.txt file. This is the standard, cooperative method that well-behaved crawlers respect.
2. Server-side UA filtering
Effectiveness: high. Configure your web server (nginx, Apache, Cloudflare) to reject requests matching CCBot's user-agent patterns. This blocks requests at the server or edge, before your application processes them; see the example configurations after this list.
3. Switch Journey Workflows
Effectiveness: highest, with granular, real-time control. Create a custom journey in Switch that detects CCBot and routes it to a block action, challenge, redirect, or modified content, without touching your server configuration.
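For method 2, here is a minimal sketch of server-side filtering. In nginx, assuming the snippet is placed inside the relevant server block of your configuration:

# Return 403 for any request whose User-Agent matches "ccbot" (case-insensitive).
if ($http_user_agent ~* "ccbot") {
    return 403;
}

The Apache equivalent, assuming mod_rewrite is enabled (in a vhost or .htaccess file):

# Reject CCBot by User-Agent; [NC] makes the match case-insensitive, [F] returns 403 Forbidden.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ccbot [NC]
RewriteRule .* - [F,L]

On Cloudflare, the same effect can be achieved with a custom WAF rule that blocks requests whose user agent contains "CCBot".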
robots.txt — Block CCBot
Add the following to your robots.txt file (at the root of your domain) to block CCBot:
User-agent: CCBot
Disallow: /

User-agent: ccbot
Disallow: /
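Once deployed, you can confirm the rules are live by fetching the file yourself (example.com is a placeholder for your domain):

curl https://example.com/robots.txt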
robots.txt — Allow with Restrictions
Alternatively, allow CCBot on most pages while blocking specific directories. Compliant crawlers apply the most specific matching rule, so the Disallow on /private/ overrides the broader Allow:
User-agent: CCBot
Disallow: /private/
Allow: /

User-agent: ccbot
Disallow: /private/
Allow: /
CCBot User-Agent Strings
Use these patterns to identify CCBot in your server logs or firewall rules. CCBot identifies itself with a user-agent containing the token CCBot, typically in the form:

CCBot/2.0 (https://commoncrawl.org/faq/)
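To check whether CCBot has been hitting your site, search your recent access logs for that token (the log path below is an example; adjust it for your server):

# Count CCBot requests in an nginx access log, ignoring case.
grep -ci "ccbot" /var/log/nginx/access.log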
Frequently Asked Questions
Does blocking CCBot affect my Google search rankings?
No. CCBot is operated by Common Crawl and is unrelated to Google Search, so blocking it does not affect your Google search rankings. Only blocking Google's own crawlers, such as Googlebot, impacts Google Search visibility.
Does CCBot respect robots.txt?
Yes, CCBot respects robots.txt directives. Adding a Disallow rule for its user-agent will prevent it from crawling blocked paths.
Can I allow CCBot on some pages but not others?
Yes. Use robots.txt to disallow specific directories, or use Switch journey workflows for granular page-level control with conditional logic.
Go beyond robots.txt
Switch detects CCBot in real time and lets you build custom journey workflows: block, challenge, redirect, or serve modified content. No server changes required.
Get Started Free