Inside SearchGuard: How Google detects bots and what the SerpAPI lawsuit reveals
We fully decrypted SearchGuard, the anti-bot system protecting Google Search. Here's exactly how Google tells humans and bots apart.
Google’s SearchGuard anti-bot system is the technology at the center of the company’s recent lawsuit against SerpAPI.
After fully deobfuscating its JavaScript, we now have an unprecedented look at how Google distinguishes human visitors from automated scrapers in real time.
What happened. Google filed a lawsuit on Dec. 19 against Texas-based SerpAPI LLC, alleging the company circumvented SearchGuard to scrape copyrighted content from Google Search results at a scale of “hundreds of millions” of queries daily. Rather than targeting terms-of-service violations, Google built its case on DMCA Section 1201 – the anti-circumvention provision of copyright law.
The complaint describes SearchGuard as “the product of tens of thousands of person hours and millions of dollars of investment.”
Why we care. The lawsuit reveals exactly what Google considers worth protecting – and how far it will go to defend it. For SEOs and marketers, understanding SearchGuard matters because any large-scale automated interaction with Google Search now triggers this system. If you’re using tools that scrape SERPs, this is the wall they’re hitting.
The OpenAI connection
Here’s where it gets interesting: SerpAPI isn’t just any scraping company.
OpenAI has been using Google search results scraped by SerpAPI to help power ChatGPT’s real-time answers. SerpAPI listed OpenAI as a customer on its website as recently as May 2024, before the reference was quietly removed.
Google declined OpenAI’s direct request to access its search index in 2024. Yet ChatGPT still needed fresh search data to compete.
The solution? A third-party scraper that pillages Google’s SERPs and resells the data.
Google isn’t attacking OpenAI directly. It’s targeting a key link in the supply chain that feeds its main AI competitor.
The timing is telling. Google is striking at the infrastructure that powers rival search products — without naming them in the complaint.
What we found inside SearchGuard
We fully decrypted version 41 of the BotGuard script – the technology underlying SearchGuard. The script opens with an unexpectedly friendly message:
/* Anti-spam. Want to say hello? Contact [email protected] */
Behind that greeting sits one of the most sophisticated bot detection systems ever deployed.
BotGuard vs. SearchGuard. BotGuard is Google’s proprietary anti-bot system, internally called “Web Application Attestation” (WAA). Introduced around 2013, it now protects virtually all Google services: YouTube, reCAPTCHA v3, Google Maps, and more.
In its complaint against SerpAPI, Google revealed that the system protecting Search specifically is called “SearchGuard” – presumably the internal name for BotGuard when applied to Google Search. This is the component that was deployed in January 2025, breaking nearly every SERP scraper overnight.
Unlike traditional CAPTCHAs that require clicking images of traffic lights, BotGuard operates completely invisibly. It continuously collects behavioral signals and analyzes them using statistical algorithms to distinguish humans from bots – all without the user knowing.
The code runs inside a bytecode virtual machine with 512 registers, specifically designed to resist reverse engineering.
How Google knows you’re human
The system tracks four categories of behavior in real time. Here’s what it measures:
Mouse movements
Humans don’t move cursors in straight lines. We follow natural curves with acceleration and deceleration – tiny imperfections that reveal our humanity.
Google tracks:
- Trajectory (path shape)
- Velocity (speed)
- Acceleration (speed changes)
- Jitter (micro-tremors)
A “perfect” mouse movement – linear, constant speed – is immediately suspicious. Bots typically move in precise vectors or teleport between points. Humans are messier.
Detection threshold: Mouse velocity variance below 10 is flagged as bot behavior. Normal human variance falls between 50 and 500.
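To make the signal concrete, here’s a minimal sketch of a velocity-variance check built on mousemove events. The listener wiring, variable names, and units are ours; only the underlying idea – near-zero variance looks automated – comes from the analysis above.

```typescript
// Sketch: estimate cursor velocity between mousemove events and check its variance.
// Thresholds depend on units and sampling; the figures quoted above are Google's, not ours.
let lastX = 0, lastY = 0, lastT = 0;
const velocities: number[] = [];

document.addEventListener("mousemove", (e) => {
  const now = performance.now();
  if (lastT > 0) {
    const dt = now - lastT;
    if (dt > 0) {
      const dist = Math.hypot(e.clientX - lastX, e.clientY - lastY);
      velocities.push(dist / dt); // pixels per millisecond
    }
  }
  lastX = e.clientX; lastY = e.clientY; lastT = now;
});

function variance(samples: number[]): number {
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return samples.reduce((a, v) => a + (v - mean) ** 2, 0) / samples.length;
}
// Variance near zero => perfectly uniform movement, the bot-like signature described above.
```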
Keyboard rhythm
Everyone has a unique typing signature. Google measures:
- Inter-key intervals (time between keystrokes)
- Key press duration (how long each key is held)
- Error patterns
- Pauses after punctuation
A human typically shows 80-150ms variance between keystrokes. A bot? Often less than 10ms with robotic consistency.
Detection threshold: Key press duration variance under 5ms indicates automation. Normal human typing shows 20-50ms variance.
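A hedged sketch of how the first two keyboard signals could be captured in the browser – inter-key intervals from successive keydown events and press duration from keydown/keyup pairs. Everything here except the signal definitions is illustrative.

```typescript
// Sketch: collect inter-key intervals and key press (dwell) durations.
const interKeyIntervals: number[] = [];
const pressDurations: number[] = [];
let lastKeydown = 0;
const downAt = new Map<string, number>(); // key code -> keydown timestamp

document.addEventListener("keydown", (e) => {
  const now = performance.now();
  if (lastKeydown > 0) interKeyIntervals.push(now - lastKeydown);
  lastKeydown = now;
  downAt.set(e.code, now);
});

document.addEventListener("keyup", (e) => {
  const start = downAt.get(e.code);
  if (start !== undefined) {
    pressDurations.push(performance.now() - start);
    downAt.delete(e.code);
  }
});
// A near-constant interval between keystrokes is the robotic signature described above;
// human typing varies far more from key to key.
```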
Scroll behavior
Natural scrolling has variable velocity, direction changes, and momentum-based deceleration. Programmatic scrolling is often too smooth, too fast, or perfectly uniform.
Google measures:
- Amplitude (how far)
- Direction changes
- Timing between scrolls
- Smoothness patterns
Scrolling in fixed increments – 100px, 100px, 100px – is a red flag.
Detection threshold: Scroll delta variance under 5px suggests bot activity. Humans typically show 20-100px variance.
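The fixed-increment case is easy to illustrate. Here’s a sketch that flags a stream of wheel events whose deltas are all identical – the “100px, 100px, 100px” pattern above. The helper name and the five-event minimum are arbitrary choices, not values from the script.

```typescript
// Sketch: detect fixed-increment scrolling from wheel events.
const scrollDeltas: number[] = [];

window.addEventListener("wheel", (e) => {
  scrollDeltas.push(Math.abs(e.deltaY));
});

function looksProgrammatic(deltas: number[]): boolean {
  if (deltas.length < 5) return false;           // need a few samples first
  const unique = new Set(deltas.map((d) => Math.round(d)));
  return unique.size === 1;                      // identical delta on every scroll step
}
```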
Timing jitter
This is the killer signal. Humans are inconsistent, and that’s exactly what makes us human.
Google uses Welford’s algorithm to calculate variance in real time with constant memory usage – meaning it can analyze patterns without storing massive amounts of data, regardless of how many events occur. As each event arrives, the algorithm updates its running statistics.
If your action intervals have near-zero variance, you’re flagged.
The math: If timing follows a Gaussian distribution with natural variance, you’re human. If it’s uniform or deterministic, you’re a bot.
Detection threshold: Event counts exceeding 200 per second indicate automation. Normal human interaction generates 10-50 events per second.
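Welford’s algorithm itself is public and simple. The sketch below is the standard textbook formulation, not Google’s implementation; feed it the gaps between successive events and read off the running variance.

```typescript
// Welford's online algorithm: running mean and variance in O(1) memory.
class RunningStats {
  private n = 0;
  private mean = 0;
  private m2 = 0; // sum of squared deviations from the current mean

  update(x: number): void {
    this.n += 1;
    const delta = x - this.mean;
    this.mean += delta / this.n;
    this.m2 += delta * (x - this.mean);
  }

  variance(): number {
    return this.n > 1 ? this.m2 / (this.n - 1) : 0;
  }
}

// Feed it the interval between successive events; near-zero variance after many
// samples is the "too regular to be human" signal described above.
const gaps = new RunningStats();
```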
The 100+ DOM elements Google monitors
Beyond behavior, SearchGuard fingerprints your browser environment by monitoring over 100 HTML elements. The complete list extracted from the source code includes:
- High-priority elements (forms): BUTTON, INPUT – these receive special attention because bots often target interactive elements.
- Structure: ARTICLE, SECTION, NAV, ASIDE, HEADER, FOOTER, MAIN, DIV
- Text: P, PRE, BLOCKQUOTE, EM, STRONG, CODE, SPAN, and 25 others
- Tables: TABLE, CAPTION, TBODY, THEAD, TR, TD, TH
- Media: FIGURE, CANVAS, PICTURE
- Interactive: DETAILS, SUMMARY, MENU, DIALOG
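One plausible way such a watchlist gets used is to bucket interactions by tag name. The sketch below is our interpretation, not the deobfuscated code, and abbreviates the list to a handful of tags; the actual script tracks more than 100.

```typescript
// Sketch: count clicks on watched element types (abbreviated, illustrative list).
const WATCHED_TAGS = new Set([
  "BUTTON", "INPUT",            // high priority: form controls
  "DIV", "SPAN", "P", "NAV",    // structure and text
  "TABLE", "CANVAS", "DIALOG",  // tables, media, interactive
]);

const hitCounts = new Map<string, number>();

document.addEventListener("click", (e) => {
  const tag = (e.target as Element | null)?.tagName;
  if (tag && WATCHED_TAGS.has(tag)) {
    hitCounts.set(tag, (hitCounts.get(tag) ?? 0) + 1);
  }
}, true); // capture phase sees clicks anywhere in the document
```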
Environmental fingerprinting
SearchGuard also collects extensive browser and device data:
Navigator properties:
- userAgent
- language / languages
- platform
- hardwareConcurrency (CPU cores)
- deviceMemory
- maxTouchPoints
Screen properties:
- width / height
- colorDepth / pixelDepth
- devicePixelRatio
Performance:
- performance.now() precision
- performance.timeOrigin
- Timer jitter (fluctuations in timing APIs)
Visibility:
- document.hidden
- visibilityState
- hasFocus()
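All of these are standard browser APIs, so collecting them is trivial. Here is a sketch of what the raw payload might look like – the grouping, function name, and serialization are ours, only the property names come from the list above.

```typescript
// Sketch: gather the environment signals listed above into a single object.
function collectFingerprint() {
  return {
    navigator: {
      userAgent: navigator.userAgent,
      languages: navigator.languages,
      platform: navigator.platform,
      hardwareConcurrency: navigator.hardwareConcurrency,
      deviceMemory: (navigator as any).deviceMemory, // not yet in standard TS typings
      maxTouchPoints: navigator.maxTouchPoints,
    },
    screen: {
      width: screen.width,
      height: screen.height,
      colorDepth: screen.colorDepth,
      pixelDepth: screen.pixelDepth,
      devicePixelRatio: window.devicePixelRatio,
    },
    timing: {
      now: performance.now(),
      timeOrigin: performance.timeOrigin,
    },
    visibility: {
      hidden: document.hidden,
      state: document.visibilityState,
      hasFocus: document.hasFocus(),
    },
  };
}
```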
WebDriver detection: The script specifically checks for signatures that betray automation tools:
- navigator.webdriver (true if automated)
- window.chrome.runtime (absent in headless mode)
- ChromeDriver signatures ($cdc_ prefixes)
- Puppeteer markers ($chrome_asyncScriptInfo)
- Selenium indicators (__selenium_unwrapped)
- PhantomJS artifacts (_phantom)
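A hedged sketch of those checks as they would look in plain browser code. The property names are the publicly known automation artifacts listed above; the scoring function and its name are illustrative, and a real system would weigh these signals rather than hard-block on any single one.

```typescript
// Sketch: look for well-known automation artifacts in the page environment.
function automationSignals(): string[] {
  const w = window as any;
  const hits: string[] = [];

  if (navigator.webdriver) hits.push("navigator.webdriver");
  if (w.chrome && !w.chrome.runtime) hits.push("window.chrome.runtime missing");

  // ChromeDriver injects $cdc_-prefixed properties
  const keys = [...Object.keys(w), ...Object.keys(document)];
  if (keys.some((k) => k.startsWith("$cdc_"))) hits.push("ChromeDriver $cdc_ prefix");

  if (w.$chrome_asyncScriptInfo) hits.push("$chrome_asyncScriptInfo");
  if (w.__selenium_unwrapped) hits.push("__selenium_unwrapped");
  if (w._phantom) hits.push("_phantom");

  return hits;
}
```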
Why bypasses become obsolete in minutes
Here’s the critical discovery: SearchGuard uses a cryptographic system that can invalidate any bypass within minutes.
The script generates encrypted tokens using an ARX cipher (Addition-Rotation-XOR) – similar to Speck, a family of lightweight block ciphers released by the NSA in 2013 and optimized for software implementations on devices with limited processing power.
But there’s a twist.
The magic constant rotates. The cryptographic constant embedded in the cipher isn’t fixed. It changes with every script rotation.
Observed values from our analysis:
- Timestamp 16:04:21: Constant = 1426
- Timestamp 16:24:06: Constant = 3328
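For readers unfamiliar with ARX designs: each round mixes its state using only modular addition, bit rotation, and XOR. The sketch below follows the public Speck round structure on 32-bit words; the roundKey parameter stands in for the rotating constant observed above and is not SearchGuard’s actual key schedule.

```typescript
// Sketch of one ARX (Addition-Rotation-XOR) round in the style of Speck, on 32-bit words.
function rotr32(x: number, r: number): number {
  return ((x >>> r) | (x << (32 - r))) >>> 0;
}
function rotl32(x: number, r: number): number {
  return ((x << r) | (x >>> (32 - r))) >>> 0;
}

// One round over a pair of 32-bit words with a round key (the "magic constant" here).
function arxRound(x: number, y: number, roundKey: number): [number, number] {
  x = rotr32(x, 8);
  x = (x + y) >>> 0;        // Addition (mod 2^32)
  x = (x ^ roundKey) >>> 0; // XOR with the key material
  y = rotl32(y, 3);         // Rotation
  y = (y ^ x) >>> 0;        // XOR
  return [x, y];
}
```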
The script itself is served from URLs with integrity hashes: //www.google.com/js/bg/{HASH}.js. When the hash changes, the cache invalidates, and every client downloads a fresh version with new cryptographic parameters.
Even if you fully reverse-engineer the system, your implementation becomes invalid with the next update.
It’s cat and mouse by design.
The statistical algorithms
Two algorithms power SearchGuard’s behavioral analysis:
- Welford’s algorithm calculates variance in real time with constant memory usage – meaning it processes each event as it arrives and updates a running statistical summary, without storing every past interaction. Whether the system has seen 100 or 100 million events, memory consumption stays the same.
- Reservoir sampling maintains a random sample of 50 events per metric to estimate median behavior. This provides a representative sample without storing every interaction.
Combined, these algorithms build a statistical profile of your behavior and compare it against what humans actually do.
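Reservoir sampling is likewise a textbook technique. Here is a sketch of the classic Algorithm R with k = 50, matching the sample size above – the class and its API are ours, not the deobfuscated code.

```typescript
// Reservoir sampling (Algorithm R): keep a uniform random sample of up to k values
// from a stream without storing the whole stream.
class Reservoir {
  private seen = 0;
  readonly sample: number[] = [];

  constructor(private readonly k: number = 50) {}

  add(value: number): void {
    this.seen += 1;
    if (this.sample.length < this.k) {
      this.sample.push(value);          // fill the reservoir first
    } else {
      const j = Math.floor(Math.random() * this.seen);
      if (j < this.k) this.sample[j] = value; // replace with probability k / seen
    }
  }
}
```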
SerpAPI’s response
SerpAPI’s founder and CEO, Julien Khaleghy, shared this statement with Search Engine Land:
“SerpApi has not been served with Google’s complaint, and prior to filing, Google did not contact us to raise any concerns or explore a constructive resolution. For more than eight years, SerpApi has provided developers, researchers, and businesses with access to public search data. The information we provide is the same information any person can see in their browser without signing in. We believe this lawsuit is an effort to stifle competition from the innovators who rely on our services to build next-generation AI, security, browsers, productivity, and many other applications.”
The defense may face challenges. The DMCA doesn’t require content to be non-public – it prohibits circumventing technical protection measures, period. If Google proves SerpAPI deliberately bypassed SearchGuard protections, the “public data” argument may not hold.
What this means for SEO – and the bigger picture
If you’re building SEO tools that programmatically access Google Search, 2025 was brutal.
In January, Google deployed SearchGuard. Nearly every SERP scraper suddenly stopped returning results. SerpAPI had to scramble to develop workarounds – which Google now calls illegal circumvention.
Then in September, Google removed the num=100 parameter – a long-standing URL trick that allowed tools to retrieve 100 results in a single request instead of 10. Officially, Google said it was “not a formally supported feature.” But the timing was telling: forcing scrapers to make 10x more requests dramatically increased their operational costs. Some analysts suggested the move specifically targeted AI platforms like ChatGPT and Perplexity that relied on mass scraping for real-time data.
The combined effect: traditional scraping approaches are increasingly difficult and expensive to maintain.
For the industry: This lawsuit could reshape how courts view anti-scraping measures. If SearchGuard qualifies as a valid “technological protection measure” under DMCA, every platform could deploy similar systems with legal teeth.
Under DMCA Section 1201, statutory damages range from $200 to $2,500 per circumvention act. With hundreds of millions of alleged violations daily, the theoretical liability is astronomical – though Google’s complaint acknowledges that “SerpApi will be unable to pay.”
The message isn’t about money. It’s about setting precedent.
Meanwhile, the antitrust case rolls on. Judge Mehta ordered Google to share its index and user data with “Qualified Competitors” at marginal cost. One hand is being forced open while the other throws punches.
Google’s position: “You want our data? Go through the antitrust process and the technical committee. Not through scraping.”
Here’s the uncomfortable truth: Google technically offers publishers controls, but they’re limited. Google-Extended allows publishers to opt out of AI training for Gemini models and Vertex AI – but it doesn’t apply to Search AI features including AI Overviews.
Google’s documentation states:
“AI is built into Search and integral to how Search functions, which is why robots.txt directives for Googlebot is the control for site owners to manage access to how their sites are crawled for Search.”
Court testimony from DeepMind VP Eli Collins during the antitrust trial confirmed this separation: content opted out via Google-Extended could still be used by the Search organization for AI Overviews, because Google-Extended isn’t the control mechanism for Search.
The only way to fully opt out of AI Overviews? Block Googlebot entirely – and lose all search traffic.
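For illustration, this is what the two controls look like in a robots.txt file. The user-agent tokens are Google’s documented ones; the comments summarize the scope described above.

```
# Opts out of Gemini / Vertex AI training only - AI Overviews are unaffected
User-agent: Google-Extended
Disallow: /

# The only lever that covers Search (and therefore AI Overviews) -
# but it also removes the site from search results entirely
User-agent: Googlebot
Disallow: /
```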
Publishers face an impossible choice: accept that your content feeds Google’s AI search products, or disappear from search results altogether.
Your move, courts.
Dig deeper
- Full BotGuard technical analysis and deobfuscated source code
- Google’s official statement on the lawsuit
- Full complaint: Google LLC v. SerpApi (PDF)
This analysis is based on version 41 of the BotGuard script, extracted and deobfuscated from challenge data in January 2026. The information is provided for informational purposes only.