Commercial Crawlers | Active

Google-Extended

A robots.txt token that controls whether content crawled by Googlebot may be used for Google's AI training.

Operated by Google

What is Google-Extended?

Google-Extended is a robots.txt token (not a separate crawler) that controls whether content already crawled by Googlebot can be used for AI training purposes, specifically for Google's Gemini models and other AI products. It provides granular control without affecting search indexing.

This is one of the most important AI policy mechanisms available to site owners. By disallowing Google-Extended while keeping Googlebot allowed, you maintain full Google Search visibility while preventing your content from training Google's AI models.
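As a rough illustration of the split described above, the sketch below checks a robots.txt that disallows Google-Extended while leaving Googlebot allowed. It uses Python's standard `urllib.robotparser`; the robots.txt contents, the `can_fetch` helper, and the example.com URL are assumptions made for this example. Python's parser only approximates how Google evaluates rules, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of AI training, leave Search crawling alone.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def can_fetch(robots_txt: str, token: str, url: str) -> bool:
    """Return True if the rules in robots_txt allow `token` to use `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    url = "https://example.com/articles/post.html"  # placeholder URL
    print("Google-Extended allowed:", can_fetch(ROBOTS_TXT, "Google-Extended", url))
    print("Googlebot allowed:", can_fetch(ROBOTS_TXT, "Googlebot", url))
```

Running a check like this before deploying a robots.txt change helps catch a rule that accidentally blocks Googlebot as well.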

Google introduced Google-Extended in September 2023 in response to publisher concerns about how crawled content is used as AI training data. It lets site owners give or withhold consent for AI training separately from search indexing.

User-Agent Strings

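Google-Extended is a robots.txt token, not a separate crawler, so it has no HTTP user-agent string of its own and will not appear in your server logs. Crawling is still done by Google's existing user agents. Use the token below only in robots.txt rules.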

Google-Extended

robots.txt example:

User-agent: Google-Extended
Disallow: /private/
Allow: /

How to Manage Google-Extended

1. Block Google-Extended to opt out of Gemini AI training while keeping search presence.

2. Remember that this is a robots.txt token, not a separate crawler, so manage it in robots.txt rather than by filtering user agents.

3. Use Switch to build a comprehensive AI training opt-out strategy.

4. Combine with blocking GPTBot and ClaudeBot for broad training data protection.
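The combined opt-out in the last step can be sketched as a small generator that writes one robots.txt group per token. This is a minimal sketch: the `build_opt_out` and `is_blocked` helpers and the example.com URL are hypothetical names introduced here, and the token list contains only the three agents named above.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in the steps above; extend the list as needed.
AI_TRAINING_TOKENS = ["Google-Extended", "GPTBot", "ClaudeBot"]


def build_opt_out(tokens, disallow_path="/"):
    """Build robots.txt text with one Disallow group per token."""
    groups = [f"User-agent: {token}\nDisallow: {disallow_path}" for token in tokens]
    return "\n\n".join(groups) + "\n"


def is_blocked(robots_txt, token, url):
    """Return True if robots_txt disallows `token` for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(token, url)


if __name__ == "__main__":
    rules = build_opt_out(AI_TRAINING_TOKENS)
    print(rules)
    url = "https://example.com/"  # placeholder URL
    for token in AI_TRAINING_TOKENS + ["Googlebot"]:
        print(token, "blocked:", is_blocked(rules, token, url))
```

Because the generated rules contain no group for Googlebot and no wildcard group, search crawling is left unaffected; the generated text would be appended to whatever rules the site already publishes.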

