Switch's approach to agent identification, behavioral analysis, and traffic management is informed by the latest academic research on the agentic web.
A growing body of academic work confirms what site managers are experiencing firsthand: autonomous AI agents now visit websites at scale — scraping content, executing multi-step tasks, and operating without the site owner's knowledge or consent. Current web architecture provides no native mechanism to detect, classify, or control them.
33%+ of web traffic is non-human (industry estimates, 2024–2025)
~1s agent task latency with VOIX, vs. 10–21 min without (Schultze et al., 2025)
96% LLM accuracy in matching agent tasks to required scopes (El Helou et al., 2025)
Research Themes
The research we track falls into three interconnected areas that together define the challenge of managing agent traffic.
How do we detect agents that don’t identify themselves? Research on web agent architectures reveals the behavioral patterns and signals that distinguish autonomous agents from human visitors.
Once identified, how do we control what agents can access? Research on agent authorization highlights the risks of overly broad permissions and the need for task-aware, dynamic access control.
Agents are evolving beyond simple crawlers into multi-modal, multi-step autonomous systems. Research on agent behavior patterns informs how Switch’s detection models adapt to new agent types.
Cited Research
Each paper is analyzed for its direct implications on agent detection, classification, and traffic management.
Sven Schultze, Meike Kietzmann, Nils Lucas Schoenfeld, Ruth Stock-Homburg
Technical University of Darmstadt · 2025
Key Insight
Agents must reverse-engineer human UIs to operate on the web. Without explicit contracts, they scrape DOMs, parse screenshots, and bypass developer-intended workflows — creating brittle, insecure interactions that site owners can’t control.
Notable Findings
“Today’s web is designed primarily for human consumption. Agents must infer available actions by scraping HTML, heuristically parsing Document Object Models (DOMs) or even analyzing rendered screenshots.”
Describes the fundamental misalignment between agents and human-oriented websites
“When an external agent scrapes a site, it bypasses the carefully crafted workflows and interaction patterns designed by the developer. The agent provider, not the site owner, unilaterally decides how to interpret and interact with the page’s functionality.”
Highlights the loss of control site owners face from undeclared agent visits
“Sensitive, personal, or proprietary information embedded in the web page, such as private messages, financial data, or user details, could be shared without the user’s explicit consent.”
Identifies the privacy risk of uncontrolled agent access to web content
What This Means for Switch
This paper validates Switch’s core premise: agents are visiting websites at scale and site owners have no visibility or control. Switch’s identification engine detects these agents — whether they declare themselves or not — and gives site managers the control layer that the current web lacks. The paper’s latency benchmarks confirm that agents like Perplexity Comet and BrowserGym are actively browsing production websites.
Read the full paper on arXiv

Majed El Helou, Chiara Troiani, Benjamin Ryder, Jean Diaconu, Hervé Muyal, Marcelo Yannuzzi
Cisco Systems · 2025
Key Insight
Current authorization models grant agents overly broad permissions. When agents request access to tools and protected resources, they may operate far beyond the intended scope of their assigned task — creating attack vectors for both malicious and misconfigured agents.
Notable Findings
“Authorizing Large Language Model driven agents to dynamically invoke tools and access protected resources introduces significant risks, since current methods for delegating authorization grant overly broad permissions and give access to tools allowing agents to operate beyond the intended task scope.”
Establishes the over-scoping problem in current agent authorization
“Agents might invoke tools that are technically within the allowed permissions, but operate outside the intended scope of the tasks they were asked to perform, thereby creating potential attack vectors for malicious actors.”
Describes how agents can exploit permission gaps, even without malicious intent
What This Means for Switch
Switch’s journey system directly addresses the authorization gap this paper identifies. Rather than hoping agents respect static permissions, Switch dynamically identifies incoming agents, classifies their intent, and routes them through configurable journeys — challenging, throttling, or redirecting based on real-time behavior rather than pre-configured rules.
Read the full paper on arXiv

Zhengyang Liang, Yan Shu, Xiangrui Liu, Minghao Qin, Nicu Sebe, Zheng Liu, Lizi Liao
Singapore Management University, University of Trento, BAAI, Hong Kong Polytechnic University · 2026
Key Insight
Agents are evolving from simple text scrapers into sophisticated multi-modal systems that autonomously browse the open web, searching across multiple sources and performing multi-step reasoning. This represents a new class of web traffic that traditional bot detection cannot identify.
Notable Findings
“The evolution of autonomous agents is redefining information seeking, transitioning from passive retrieval to proactive, open-ended web research.”
Documents the shift from passive retrieval to autonomous, multi-step web browsing
“This transition towards agentic web browsing has become a dominant trend in AI research.”
Confirms that autonomous web browsing is now a primary AI research direction
What This Means for Switch
This research demonstrates that agents are becoming increasingly sophisticated — moving beyond simple crawlers into autonomous systems that navigate, search, and reason across multiple websites. Switch’s behavioral analysis engine is built to detect these next-generation agents, even when they don’t identify themselves via User-Agent strings.
Read the full paper on arXiv

André Cipriani Bandarra, Chrome Team
Google Chrome · 2026
Key Insight
Chrome is formalizing structured agent–website interaction through WebMCP, providing declarative and imperative APIs that let sites expose "tools" for agents to use — replacing brittle DOM scraping with sanctioned, structured channels. This creates a clear divide between compliant agents (using WebMCP) and rogue scrapers (bypassing it).
Notable Findings
“By defining these tools, you tell agents how and where to interact with your site, whether it’s booking a flight, filing a support ticket, or navigating complex data. This direct communication channel eliminates ambiguity and allows for faster, more robust agent workflows.”
Describes the core value proposition of structured agent interaction via WebMCP
“Today’s web is designed primarily for human consumption. WebMCP aims to provide a standard way for exposing structured tools, ensuring AI agents can perform actions on your site with increased speed, reliability, and precision.”
Confirms the same fundamental problem identified by the VOIX paper — now being addressed at the browser platform level
What This Means for Switch
WebMCP validates Switch’s approach and creates a powerful new detection signal. Agents using WebMCP’s structured tools are compliant — they use the sanctioned channel rather than scraping. Agents bypassing WebMCP to achieve the same actions are likely rogue. Switch now detects WebMCP protocol headers to distinguish compliant from non-compliant agent traffic, and the agent-policy meta tag declares WebMCP readiness to well-behaved agents. This is the platform-level standardization of the exact problem Switch was built to solve.
Read the full announcement

Cloudflare Engineering
Cloudflare · 2025
Key Insight
AI agents are a "third audience" for web content — alongside humans and search engines. Serving clean, structured Markdown instead of bloated HTML is dramatically more token-efficient and enables cooperative agent management. The llms.txt standard provides agent discovery, while alternate Markdown links let agents opt into structured content.
Notable Findings
“When an AI agent visits your site, it doesn’t need the navigation bar, the JavaScript animations, or the cookie banner. It needs the content — clean, structured, and semantic.”
Explains why Markdown is superior to HTML for agent consumption
“The llms.txt file acts as a machine-readable guide to your site. Agents that discover it are signaling cooperative intent — they’re following the standards rather than brute-force scraping.”
Describes how llms.txt serves as both content discovery and an agent identification signal
What This Means for Switch
Switch now integrates Markdown for Agents as a journey action. Instead of only blocking or challenging detected agents, site owners can serve clean Markdown content or entirely custom replacement pages. This is cooperative agent management — give well-behaved agents what they actually need (structured content) while denying scrapers the raw HTML. The SDK injects an alternate Markdown link tag, and requests to llms.txt are logged as a cooperative-agent identification signal. Combined with WebMCP detection, this creates a full spectrum of agent interaction: from hostile (block) to neutral (challenge) to cooperative (serve Markdown).
Read the full announcement

Switch Research
Switch (internal research) · 2026
Key Insight
Observing ChatGPT’s browsing agent revealed a critical insight: AI agents can interact with inline form fields but consistently fail to engage with popup overlays. Form fill behavior provides powerful identification signals — typing cadence, fill order, focus-to-keystroke latency, and correction patterns all differ dramatically between humans and agents.
Notable Findings
“When presented with a popup asking for verification, ChatGPT’s browsing agent simply ignored it. But when asked to identify itself via an inline form field, it responded honestly.”
Direct observation of ChatGPT browsing agent behavior on a Switch-protected site
“Bots fill forms instantly, sequentially, and without corrections. Humans pause, skip fields, backtrack, and make typos. The timing variance alone is a 95%+ accurate signal.”
Analysis of form interaction patterns in Switch SDK telemetry
What This Means for Switch
Switch now monitors form fill patterns as identification signals: typing cadence variance, focus-to-keystroke latency, fill order vs DOM order, paste frequency, hidden field fills, and correction rates. Two new journey actions leverage this insight: "Inline Challenge" replaces popups with form-based verification that agents can actually interact with, and "Agent Self-ID" adds a non-intrusive field where compliant agents voluntarily identify themselves. This transforms form interactions from a blind spot into both a detection channel and an engagement mechanism.
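The timing signal described above can be sketched as a small pure function — the threshold and function name here are illustrative assumptions, not Switch's production model:

```typescript
// Sketch: inter-keystroke timing variance as a bot signal.
// Assumption: scripted input types at near-constant speed, while humans
// pause, backtrack, and vary widely. Threshold (10% coefficient of
// variation) is illustrative only.
function keystrokeTimingSignal(timestampsMs: number[]): "agent-like" | "human-like" {
  if (timestampsMs.length < 3) return "human-like"; // too little data to judge
  const gaps: number[] = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  }
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  const variance = gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
  const cv = Math.sqrt(variance) / mean; // coefficient of variation
  return cv < 0.1 ? "agent-like" : "human-like";
}
```

In practice this would be one feature among several (fill order vs DOM order, paste frequency, hidden field fills), combined in a scoring model rather than used as a single hard threshold.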
Read the full announcement

From Research to Product
Directly implementing ASTRA’s semantic task-to-scope matching, Switch classifies not just whether a visitor is a bot, but what it’s trying to do — content scraping, search indexing, task automation, research browsing, monitoring, or API probing. This intent appears on every session in the dashboard.
Adapted from the Video-Browser paper’s three-stage pyramidal approach. When classification is ambiguous, the SDK increases beacon frequency (10–15s) to gather more behavioral data faster. When confident, it reduces to 45–60s to save bandwidth. The server dynamically controls the SDK’s sampling rate.
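The confidence-driven sampling logic above can be sketched as follows — the function name, confidence scale, and 0.5 cutoff are illustrative assumptions, not Switch's actual API:

```typescript
// Sketch: server-controlled adaptive beacon scheduling.
// Ambiguous sessions report every 10–15 s to gather behavioral data
// faster; confidently classified sessions back off to 45–60 s to save
// bandwidth. The 0.5 cutoff is an assumed threshold for illustration.
function pickBeaconIntervalMs(confidence: number): number {
  if (confidence < 0.5) {
    return 10_000 + Math.random() * 5_000; // ambiguous: 10–15 s
  }
  return 45_000 + Math.random() * 15_000; // confident: 45–60 s
}
```

The small random jitter also keeps many SDK instances from beaconing in lockstep.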
Inspired by VOIX’s declarative framework for agent–web interaction, the SDK automatically injects a machine-readable <meta name="agent-policy"> tag that well-behaved agents can discover — declaring that the site uses Switch for agent traffic management and how agents should identify themselves.
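A minimal sketch of what the injected tag might look like — the content attribute schema shown here is an assumption for illustration, not a published Switch format:

```typescript
// Sketch: building the machine-readable agent-policy meta tag the SDK
// injects. The "managed-by=switch; identify=…" content format is a
// hypothetical schema, assumed for illustration only.
function agentPolicyMetaTag(identifyEndpoint: string): string {
  return `<meta name="agent-policy" content="managed-by=switch; identify=${identifyEndpoint}">`;
}
```

A well-behaved agent that parses the page head can discover this tag and follow the declared identification path instead of scraping blind.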
Inspired by research showing agents exhibit distinct interaction patterns (low mouse entropy, linear movement, zero scroll jitter), Switch’s identification engine analyzes visitor behavior in real time to classify traffic as human or agent.
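One of the signals named above — linear mouse movement — can be sketched as a path-straightness ratio; the type and threshold interpretation are illustrative, not Switch's actual model:

```typescript
// Sketch: mouse-path straightness as a low-entropy movement signal.
// Ratio of direct distance to total traveled distance: 1.0 means a
// perfectly straight (agent-like) path; human paths curve and jitter,
// pushing the ratio well below 1.
type Point = { x: number; y: number };

function pathStraightness(path: Point[]): number {
  if (path.length < 2) return 0;
  let traveled = 0;
  for (let i = 1; i < path.length; i++) {
    traveled += Math.hypot(path[i].x - path[i - 1].x, path[i].y - path[i - 1].y);
  }
  const direct = Math.hypot(
    path[path.length - 1].x - path[0].x,
    path[path.length - 1].y - path[0].y,
  );
  return traveled === 0 ? 0 : direct / traveled;
}
```

Scroll jitter and click timing would feed the same classifier as additional features.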
Drawing from research on agent browsing patterns, Switch uses invisible lure pages that only agents discover — enabling definitive identification and automatic pattern learning without affecting human visitors.
Informed by research on the risks of over-scoping agent permissions, Switch’s journey builder lets site managers define granular, task-aware responses to different agent types — from challenging to throttling to serving custom content.
Switch detects Chrome’s new WebMCP structured agent protocol, distinguishing compliant agents using sanctioned tool APIs from rogue scrapers manipulating the DOM. WebMCP agents are classified with high confidence as commercial agents performing task automation, and the SDK’s agent-policy meta tag now declares WebMCP readiness.
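A sketch of the header-based classification step — note that the header name used here is a hypothetical placeholder, since the announcement does not specify the exact protocol surface:

```typescript
// Sketch: classifying a request by a WebMCP signal header.
// ASSUMPTION: the header name "x-webmcp-client" is invented for this
// illustration; consult the WebMCP announcement for the real protocol.
type AgentClass = "webmcp-compliant" | "undeclared";

function classifyByWebMcpHeader(headers: Record<string, string>): AgentClass {
  const normalized = Object.fromEntries(
    Object.entries(headers).map(([k, v]) => [k.toLowerCase(), v]),
  );
  return "x-webmcp-client" in normalized ? "webmcp-compliant" : "undeclared";
}
```

An "undeclared" result would not by itself mean rogue — it simply routes the session to the behavioral analysis described earlier.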
Instead of only blocking agents, Switch can serve them clean Markdown content or custom replacement pages. Journey actions "Serve Markdown" and "Replace Content" use document.write() to deliver token-efficient, structured content. The SDK injects an alternate Markdown link tag for agent discovery, and requests to llms.txt are logged as a cooperative-agent identification signal — creating a full spectrum from hostile to cooperative agent management.
AI agents can fill forms but cannot interact with popup overlays. Switch now monitors form interaction patterns (typing cadence, fill order, focus-to-type latency, corrections, hidden field fills) as powerful detection signals. Two new journey actions leverage this: "Inline Challenge" presents form-based verification agents can engage with, and "Agent Self-ID" lets compliant agents voluntarily identify themselves — turning form interactions from a blind spot into a detection and engagement channel.
Add Switch to your site in five minutes. Get instant visibility into agent traffic and take control of the agentic web.