Glossary
Key terms and concepts related to AI agents, web crawlers, bot management, and the agentic web — explained for site managers and developers.
Agent Fingerprinting
Identifying AI agents through a combination of technical signals beyond just the user-agent string.
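The idea can be sketched as a scoring function over request signals. This is an illustrative sketch only: the signals and weights below are hypothetical, not a production heuristic.

```python
# Hypothetical fingerprinting sketch: combine several request signals
# instead of trusting the User-Agent header alone. Weights are illustrative.

def fingerprint_score(headers, resolved_rdns=None):
    """Return a rough 0..1 likelihood that the request is automated."""
    score = 0.0
    ua = headers.get("User-Agent", "")
    if not ua:
        score += 0.4   # missing User-Agent is a strong automation signal
    if "Accept-Language" not in headers:
        score += 0.2   # real browsers almost always send this
    if headers.get("Accept", "") in ("", "*/*"):
        score += 0.2   # overly generic Accept header
    # Self-declared crawlers should reverse-resolve to their operator's
    # domain; a "bot" claim with no verifiable rDNS is suspicious.
    if "bot" in ua.lower() and resolved_rdns is None:
        score += 0.2
    return min(score, 1.0)
```

A browser-like request with normal headers scores 0.0; a bare request with no headers scores high. Real systems add TLS fingerprints, IP reputation, and behavioral signals on top of header checks.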
Agentic Web
The emerging paradigm where AI agents autonomously browse, interact with, and transact on websites.
AI Search Engine
A search platform that uses AI to generate direct answers with citations instead of traditional link results.
AI Training Crawler
A web crawler that collects content to train artificial intelligence and large language models.
Bot Detection
Techniques for identifying automated visitors versus human users on a website.
Bot Management
The practice of detecting, classifying, and controlling automated traffic on a website.
Browser Agent
An AI system that controls a real web browser to browse, interact with, and complete tasks on websites.
Content Gate
A technique that prevents automated scripts from accessing page content by requiring JavaScript execution.
Crawl Budget
The number of pages a search engine will crawl on your site within a given time period.
llms.txt
A proposed standard file (like robots.txt) that provides AI language models with a site summary and key information.
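Under the proposal, llms.txt is a Markdown file served at the site root: a top-level title, a short blockquote summary, and sections of annotated links. The site, URLs, and descriptions below are invented for illustration.

```text
# Example Store

> Example Store sells handmade ceramics. The links below point to the
> pages most useful to a language model answering questions about us.

## Docs

- [Product API](https://example.com/docs/api.md): REST endpoints for product data
- [Shipping policy](https://example.com/shipping.md): rates, regions, and timelines
```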
robots.txt
A text file at a website's root that tells crawlers which pages they can and cannot access.
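A minimal example, showing per-crawler rules: here a search crawler is allowed everywhere, one AI training crawler is blocked entirely, and all other bots are kept out of one directory. The specific paths are illustrative.

```text
# Allow a search crawler site-wide
User-agent: Googlebot
Allow: /

# Block an AI training crawler entirely
User-agent: GPTBot
Disallow: /

# All other crawlers: block one directory
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but nothing in the protocol enforces it.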
Structured Data
Machine-readable markup (like JSON-LD) that helps search engines and AI agents understand page content.
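For example, JSON-LD is typically embedded in a page inside a `<script type="application/ld+json">` tag. This fragment describes an article using schema.org vocabulary; the headline, date, and author are invented.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How We Ship Orders",
  "datePublished": "2024-05-01",
  "author": { "@type": "Person", "name": "Jane Doe" }
}
```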
User-Agent String
An HTTP header that identifies the software making a web request, such as a browser or crawler.
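A simple sketch of inspecting the header for self-declared crawler tokens. The token list is a small illustrative sample, and because the header is trivially spoofable, matching it should only ever be a first-pass signal (see Agent Fingerprinting above).

```python
# Check a User-Agent header against a few self-declared crawler tokens.
# The list is illustrative, not exhaustive; the header can be spoofed.

KNOWN_CRAWLER_TOKENS = ("Googlebot", "Bingbot", "GPTBot")

def declared_crawler(user_agent):
    """Return the first known crawler token found in the header, or None."""
    for token in KNOWN_CRAWLER_TOKENS:
        if token.lower() in user_agent.lower():
            return token
    return None
```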
Web Crawler
An automated program that systematically browses the web to discover and index content.
Web Scraping
The automated extraction of data from websites, typically at scale.
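The extraction step can be sketched with only the standard library: parse an HTML document and pull out every link target. A real scraper adds fetching, politeness delays, and robots.txt checks; this shows extraction alone.

```python
# Minimal link extraction from HTML using only the standard library.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Collect the href attribute of every anchor tag encountered.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```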