How to Bypass DataDome Anti-Bot Protection: A Complete 2026 Guide

DataDome is a bot management platform protecting over 1,200 companies worldwide. Major European retailers, ticketing platforms, and e-commerce sites rely on it to block scrapers, credential stuffing attacks, and DDoS attempts. The system analyzes incoming traffic at multiple levels before deciding to allow, challenge, or block a request.
What makes DataDome different from every other anti-bot system you'll encounter is its core philosophy. Cloudflare and Akamai start with passive signals (TLS fingerprinting, IP reputation, HTTP headers) and escalate to active challenges when those signals are ambiguous. DataDome starts with behaviour. Its ML models start building a profile of how you interact with the site from the moment your first request lands, and that profile is compared against baselines built from real human traffic on that specific site.
DataDome uses per-customer ML models trained on each website's unique traffic patterns. That means every DataDome-protected website is a different challenge for web scrapers.
This is why developers who've successfully bypassed Cloudflare on one target are sometimes surprised when DataDome blocks them despite using the same techniques. The challenge isn't just proving you're a real browser; it's proving you behave like a real human on that specific website. This guide covers exactly how DataDome works, why every standard approach eventually fails, and how ScrapeBadger's DataDome bypass infrastructure solves this at the infrastructure level.
How DataDome Actually Works
Understanding DataDome requires understanding that it operates fundamentally differently from firewall-based anti-bot systems. It doesn't maintain a list of bad fingerprints to block. It builds a model of what normal looks like for a specific site, then flags everything that deviates from that model.
The landscape in 2026 has shifted from a battle of proxies to a war of browser engines. Tier-1 anti-bot systems like DataDome have moved beyond simple fingerprinting. They now utilise real-time AI behavioural analysis, server-side TLS fingerprinting via the JA4+ standard, and deep-level browser engine inspection. They aren't just checking if you are a bot; they are checking how your browser's C++ core interacts with the operating system.
The Signal Collection Layer
DataDome collects 35+ signals per session including mouse movement patterns, scroll velocity, typing cadence, and click coordinates. Its ML models build a real-time behavioural profile of every visitor and compare it against known human baselines.
The signal categories break down into three distinct buckets, each independently capable of triggering a block:
Passive technical signals are collected before any JavaScript executes and include TLS fingerprint (JA3/JA4+), IP reputation and ASN classification, HTTP protocol version, header ordering and completeness, and request timing patterns. A Python requests call fails here before it even gets to the page.
Active browser signals are collected when DataDome's JavaScript challenge runs. These include canvas fingerprint hash, WebGL renderer string, AudioContext output hash, installed fonts and plugins, screen resolution and colour depth, navigator properties including the webdriver flag, battery API availability, WebRTC local IP leak, and timezone/locale consistency checks. A headless Chrome without stealth patches fails catastrophically here: the webdriver flag alone is an immediate signal, and the software WebGL renderer returns Google SwiftShader, which DataDome's models know to flag.
Behavioural signals are collected continuously throughout the session. DataDome tracks mouse jitters, scroll patterns, browsing behaviour, event timestamps, and concurrent requests to differentiate human users from bots. Uniform or instantaneous interactions trigger bot alerts. It checks that mouse movements are smooth and curved like a real user's, not straight lines.
The JavaScript Tag Architecture
Unlike Akamai, which loads its detection script as part of the page, DataDome operates as a reverse proxy. The tag is injected at the edge before the page reaches the browser. This means:
The first request to any DataDome-protected page already includes DataDome's detection tag. There's no "first request gets through" window. Every request, including the very first, is evaluated.
DataDome's JavaScript generates the datadome cookie (called the dd cookie in this guide), the primary session clearance token. The cookie value is a device fingerprint hash combined with a behavioural telemetry payload. A valid dd cookie allows subsequent requests to flow through. An invalid or absent dd cookie triggers either a CAPTCHA challenge or a silent block.
The TLS fingerprint forms during the TLS negotiation, before any application data transfers. Different operating systems, browsers, and HTTP libraries produce distinct JA3 hashes. DataDome maintains a database of known bot fingerprints and flags matches. HTTP/2 fingerprinting goes beyond TLS: it analyses frame ordering, header compression patterns, stream priorities, and connection settings. Real browsers implement HTTP/2 with specific quirks that differ from HTTP libraries.
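To make the JA3 idea concrete, here is a minimal sketch of how a JA3 hash is derived from ClientHello fields: five comma-separated fields, each a dash-joined list of decimal values, hashed with MD5. The field values below are illustrative placeholders rather than a capture from any real browser, and the function names are hypothetical.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from ClientHello fields, in the order sent."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return ",".join(fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """JA3 fingerprint: MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Illustrative values only (not taken from a real browser handshake).
client_a = dict(
    version=771,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 23, 65281, 10, 11],
    curves=[29, 23, 24],
    point_formats=[0],
)
# Same cipher suites offered in a different order, as another TLS stack might do.
client_b = dict(client_a, ciphers=[4867, 4866, 4865])

print(ja3_string(**client_a))
print(ja3_hash(**client_a))
print(ja3_hash(**client_b))
```

Because the hash covers the order of cipher suites and extensions, two clients offering the same capabilities in a different order produce different fingerprints, which is why an HTTP library rarely matches the browser it claims to be.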
The Per-Site ML Model
This is DataDome's most technically significant feature and the reason no universal bypass exists. LLM crawler traffic quadrupled across DataDome's customer base during 2025, rising from 2.6% of verified bot traffic in January to over 10% by August. The shift from static fingerprinting to behavioural ML has key implications: browser fingerprint spoofing alone is not enough. Behavioural signals carry as much weight as technical fingerprints. No universal bypass exists: each protected site is effectively a different challenge. Session behaviour matters more than session setup.
A bypass technique that works on one DataDome-protected fashion retailer may fail on a DataDome-protected ticketing platform, because the ML model for each is trained on that site's specific traffic. The baseline for "normal human session" differs between a Vinted user browsing secondhand clothing and a Ticketmaster user searching for concert tickets. DataDome trains on both separately.
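The per-site idea can be sketched with a toy scoring function. DataDome's actual models are not public, so this is only a simple stand-in: a mean absolute z-score against a per-site baseline. The feature names and baseline numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    mean: float
    std: float


def anomaly_score(session: dict, baseline: dict) -> float:
    """Mean absolute z-score of a session over the features in a site baseline."""
    z_scores = [
        abs(session[name] - stats.mean) / stats.std
        for name, stats in baseline.items()
    ]
    return sum(z_scores) / len(z_scores)


# Hypothetical per-site baselines (not real DataDome data).
fashion_site = {
    "pages_per_minute": Baseline(mean=3.0, std=1.0),
    "seconds_on_product_page": Baseline(mean=40.0, std=15.0),
}
ticketing_site = {
    "pages_per_minute": Baseline(mean=8.0, std=2.0),
    "seconds_on_product_page": Baseline(mean=10.0, std=5.0),
}

# The same session behaviour, scored against each site's baseline.
session = {"pages_per_minute": 7.0, "seconds_on_product_page": 12.0}

print(anomaly_score(session, fashion_site))
print(anomaly_score(session, ticketing_site))
```

The same session sits close to the ticketing baseline and far from the fashion one, which is the sense in which each protected site is a different challenge.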
The Detection Layers in Order
Layer 1: TLS and JA4+ Fingerprinting
One of the most significant shifts in early 2026 is the adoption of JA4+ fingerprints. DataDome's edge nodes analyse the TLS handshake (cipher suites, extensions, and key exchange algorithms) and compare them to the declared User-Agent.
JA4+ is more granular than JA3 and captures additional parameters including the number of TLS extensions, the ALPN extension value, the TLS version, and the sort order of extensions. A curl_cffi request impersonating Chrome 120 needs to match not just the JA3 hash but the full JA4+ profile, which includes subtle differences between Chrome 120 on Windows, Chrome 120 on macOS, and Chrome 120 on Linux.
Any mismatch between declared User-Agent and actual TLS fingerprint is caught immediately. Claiming to be Chrome 120 on Windows while sending a TLS profile that doesn't match Windows Chrome 120's exact JA4+ hash is a guaranteed block.
Layer 2: IP Reputation and Network Signals
DataDome maintains its own IP reputation database, separate from and in addition to the network threat intelligence it shares with its customer base. Known datacenter IP ranges are pre-flagged. Shared residential proxy IPs that have been used against other DataDome customers are tracked across the network.
Static proxies are flagged alongside known datacenter ranges. Mobile proxies (IPs from cellular networks) carry the highest trust because carrier-grade NAT means hundreds of real users share the same external IP, making individual blocking expensive in false positives.
The practical implication: datacenter proxies fail immediately. Standard rotating residential proxies have high failure rates on established DataDome deployments because IP history accumulates across the network. Fresh residential IPs from ISP-assigned blocks with no scraping history perform best.
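A reputation lookup of this kind can be pictured as a range check. The sketch below is an assumption about the general shape of such a check, not DataDome's implementation; the CIDR ranges are RFC 5737 documentation ranges standing in for real datacenter and carrier blocks, and the labels are hypothetical.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), not real provider ranges.
DATACENTER_RANGES = [ipaddress.ip_network("192.0.2.0/24")]
MOBILE_RANGES = [ipaddress.ip_network("198.51.100.0/24")]


def classify_ip(ip: str) -> str:
    """Return a coarse network class for an IP address."""
    addr = ipaddress.ip_address(ip)
    if any(addr in network for network in DATACENTER_RANGES):
        return "datacenter"
    if any(addr in network for network in MOBILE_RANGES):
        return "mobile"
    return "residential_or_unknown"


for ip in ("192.0.2.10", "198.51.100.7", "203.0.113.5"):
    print(ip, classify_ip(ip))
```

In a real system the class would be one input among many, combined with the IP's history across the network rather than used as a verdict on its own.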
Layer 3: Browser Environment Fingerprinting
DataDome's JavaScript challenge runs a battery of browser environment tests. The specific tests vary by deployment and update continuously, but core checks include:
The navigator.webdriver property: set to true in standard Playwright and Puppeteer, and false in a normally launched Chrome. Patching it at the JavaScript level via Object.defineProperty() is detectable because DataDome's script inspects the toString() output of overridden properties, checking whether it shows the function () { [native code] } form of a native implementation or the source of a JavaScript override.
Missing plugins and MIME types: a real user typically has plugins like Widevine Content Decryption installed. A headless Chrome with no plugins installed is identifiably non-human.
Canvas and WebGL rendering: the same hardware fingerprinting vector described in the Akamai bypass guide. Software rendering in headless Chrome produces a known hash that DataDome's models flag. GPU-specific rendering produces hardware-matched hashes that pass.
AudioContext fingerprinting: DataDome's script generates an audio oscillator, processes it through the AudioContext API, and hashes the output. The hash varies by hardware. Headless Chrome without audio hardware produces a known software-rendering output.
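The checks above can be summarised as a rule list over collected signals. This is a simplified sketch from the detection side, covering only the red flags named in this guide; the dictionary keys are hypothetical, and a real system would weigh such signals in a model instead of applying fixed rules.

```python
def environment_red_flags(signals: dict) -> list[str]:
    """Return the red flags found in a dictionary of browser signals."""
    flags = []
    if signals.get("webdriver") is True:
        flags.append("navigator.webdriver is true")
    if "swiftshader" in signals.get("webgl_renderer", "").lower():
        flags.append("software WebGL renderer")
    if not signals.get("plugins"):
        flags.append("no plugins reported")
    return flags


# Hypothetical signal snapshots for illustration.
headless = {
    "webdriver": True,
    "webgl_renderer": "Google SwiftShader",
    "plugins": [],
}
desktop = {
    "webdriver": False,
    "webgl_renderer": "ANGLE (example GPU)",
    "plugins": ["Widevine Content Decryption Module"],
}

print(environment_red_flags(headless))
print(environment_red_flags(desktop))
```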
Layer 4: Behavioural Analysis (The Hardest Layer)
DataDome catches scrapers that pass every fingerprint test because their behaviour is machine-like even when their fingerprint is human-like.
This is the detection layer that breaks sophisticated scrapers that have correctly addressed all the technical fingerprinting. Even with perfect TLS, valid browser environment signals, and residential IP addresses, DataDome's behavioural ML observes:
Mouse movement trajectories: real users produce Bézier-curve-like paths with micro-variations, acceleration and deceleration, and occasional course corrections. Bots either don't move the mouse at all or move it in perfectly smooth arcs that no human produces. DataDome's models distinguish between real human paths and programmatically generated Bézier-curve approximations.
Scroll patterns: real users scroll at variable speeds, pause to read content, scroll back up, and have natural deceleration at the end of scroll events. A scraper that sends a scroll event to immediately trigger lazy-loaded content loads content at a rate no human reading speed can justify.
Click timing and accuracy: real users occasionally miss their target by a few pixels, take variable time to position their cursor over a button, and have natural variability between intent and action. Precisely accurate click targeting at consistent timing is detectable.
Request rate and sequencing: real users don't make requests at regular intervals. They have variable browsing speeds, sometimes stay on a page for minutes, sometimes navigate quickly. A scraper that accesses ten product pages at exactly 2.0-second intervals is trivially identifiable even if each individual request looks legitimate.
Session navigation graph: how users navigate between pages follows predictable patterns for each site type. A user on a fashion retail site browses categories, enters product pages, occasionally goes back, and may spend time on product detail pages before navigating to checkout. A scraper that jumps directly to specific product pages in an order that reflects structured data collection rather than human browsing interest has a navigation graph that doesn't match the site's human baseline.
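Two of these behavioural signals are easy to quantify. The sketch below shows, under simplifying assumptions, how regular request timing and straight-line cursor paths stand out numerically. The metrics (coefficient of variation of request gaps, and the ratio of direct distance to travelled distance) are illustrative choices, not DataDome's documented features.

```python
import math
import statistics


def interval_cv(timestamps: list[float]) -> float:
    """Coefficient of variation of the gaps between request timestamps.

    0.0 means perfectly regular spacing; human browsing is far more variable.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return statistics.pstdev(gaps) / statistics.fmean(gaps)


def path_straightness(points: list[tuple[float, float]]) -> float:
    """Direct distance divided by travelled distance along a cursor path.

    1.0 means a perfectly straight line; curved paths score below 1.0.
    """
    travelled = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    return math.dist(points[0], points[-1]) / travelled


# Requests at exactly 2.0-second intervals versus irregular spacing.
print(interval_cv([0.0, 2.0, 4.0, 6.0]))
print(interval_cv([0.0, 1.5, 9.0, 11.2]))

# A straight cursor path versus one that bends on the way to the target.
print(path_straightness([(0, 0), (5, 5), (10, 10)]))
print(path_straightness([(0, 0), (4, 7), (10, 10)]))
```

Both functions assume at least two points and non-zero elapsed time or distance; a production feature extractor would guard those cases.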
Why Standard Bypass Approaches Fail in 2026
Playwright with playwright-stealth
The playwright-stealth package patches the most obvious headless browser signals: navigator.webdriver, missing plugins, and the Chrome runtime object. In 2024, this was enough for many DataDome deployments. In 2026, it isn't, for two reasons:
First, DataDome's models have updated to detect stealth-patched sessions. The patches that playwright-stealth applies are documented and public. DataDome's team knows exactly what patched sessions look like, and the ML models are trained to distinguish patched headless Chrome from real Chrome. The toString() inspection of patched properties is one mechanism. The specific combination of signals that stealth-patched sessions produce, even when each individual signal looks correct, is another.
Second, even with perfect fingerprint patching, behavioural signals remain robotic unless you invest significant engineering in simulation. Most Playwright scrapers don't simulate mouse movement at all, or use simple linear movement that DataDome identifies immediately.
Residential Proxies Alone
Switching from datacenter to residential proxies addresses the IP reputation layer but leaves every other detection layer untouched. A residential IP with Python requests and default headers fails at TLS fingerprinting. A residential IP with Playwright and stealth patches fails at behavioural analysis. IP quality matters, but it's one layer of four.
curl_cffi with Browser Impersonation
curl_cffi with impersonate="chrome120" correctly replicates TLS and HTTP/2 fingerprints. Against Cloudflare's standard deployment this is often sufficient. Against DataDome it's not, because DataDome's JavaScript challenge requires actual JavaScript execution to generate the dd cookie. curl_cffi is an HTTP client and doesn't execute JavaScript. It gets DataDome's challenge page and no clearance cookie.
The correct pattern is to use curl_cffi for subsequent requests after obtaining a dd cookie from a genuine browser execution, the same session reuse pattern described in the Cloudflare bypass guide. But generating a valid dd cookie requires the full behavioural simulation pipeline, which is where most DIY approaches break down.
Nodriver and SeleniumBase UC Mode
Nodriver and SeleniumBase UC Mode represent the current state of the art for open-source DataDome bypass in 2026. Both use undetected Chrome variants that patch browser signals at the C++ level rather than via JavaScript hooks.
These tools work on many DataDome deployments at low volume. They fail at scale and on heavily configured DataDome deployments because:
Nodriver sessions are distinguishable by session-level behavioural signals even when individual fingerprints are correct. Running 100 concurrent Nodriver sessions against a DataDome target produces a traffic pattern that looks nothing like 100 real users.
SeleniumBase UC Mode's evasion patches are well-documented and DataDome's team actively maintains countermeasures against the specific signatures UC Mode produces.
Both tools require significant server resources to run at scale (a full Chrome instance per concurrent session), and neither solves the core problem of generating human-like behavioural telemetry.
The Production Approach: ScrapeBadger's DataDome Bypass
ScrapeBadger's DataDome bypass infrastructure addresses all four detection layers simultaneously using a combination of engine-level browser patching, residential proxy pools with session history management, and behavioural simulation at the infrastructure level.
What Happens Under the Hood
The key distinction is where the patching happens. JavaScript-level patches (Object.defineProperty() overrides, script injection before page load) are detectable by DataDome's models because they produce observable side effects: the patched properties have abnormal toString() output, the patch application itself leaves detectable traces in V8's internal state, and the combination of all patches together produces a signal fingerprint that trained models recognise.
ScrapeBadger's approach patches at the browser engine level, modifying Chromium's C++ source before compilation. Canvas rendering, WebGL output, AudioContext processing, and navigator property values are modified at the level where the data is generated, not at the JavaScript layer where DataDome can inspect whether a modification has been applied. The output is genuinely hardware-consistent with the declared device profile because the rendering engine is producing data consistent with that profile.
Behavioural simulation is the other critical layer. Sessions maintain realistic navigation graphs: browsing sequences that match human behaviour patterns for the specific site type. Mouse movements follow physics-based models that produce trajectories indistinguishable from real cursor movements. Scroll events, timing distributions, and click patterns are drawn from models built on real human session data.
The dd cookie obtained by this infrastructure is indistinguishable from one obtained by a real browser. Subsequent requests within the session carry a valid clearance token, and the session's behavioural history is consistent with the pattern that generated the cookie.
Using It in Production
```python
import requests

API_KEY = "your_scrapebadger_key"


def scrape_datadome_site(url: str) -> dict:
    """
    Scrape any DataDome-protected URL.
    All detection layers are handled automatically, with no configuration.
    """
    response = requests.get(
        "https://api.scrapebadger.com/v1/scrape",
        headers={"X-API-Key": API_KEY},
        params={
            "url": url,
            "render_js": True,
            "wait_for": "networkidle",
        },
        timeout=30,
    )
    # Surface HTTP errors instead of parsing an error body as a result
    response.raise_for_status()
    return response.json()


# Scrape DataDome-protected product pages
result = scrape_datadome_site("https://datadome-protected-site.com/products/item")
print(result["html"][:1000])
```

No proxy configuration. No stealth patching setup. No behavioural simulation code. ScrapeBadger's infrastructure handles the entire bypass pipeline and returns clean HTML.
Session Management for Bulk Scraping
For pipelines that scrape multiple pages from the same DataDome-protected domain, session continuity matters. DataDome's models track session history: a session that has accumulated browsing history consistent with human navigation carries higher trust than a cold session hitting a high-value page directly.
```python
import random
import time

import requests

API_KEY = "your_scrapebadger_key"


def bulk_scrape_datadome(
    urls: list[str],
    min_delay: float = 2.0,
    max_delay: float = 6.0,
) -> list[dict]:
    """
    Scrape multiple pages from a DataDome site with session continuity.
    ScrapeBadger maintains session state across requests automatically.
    """
    results = []
    for i, url in enumerate(urls):
        print(f"[{i+1}/{len(urls)}] {url[:70]}")
        try:
            response = requests.get(
                "https://api.scrapebadger.com/v1/scrape",
                headers={"X-API-Key": API_KEY},
                params={
                    "url": url,
                    "render_js": True,
                    "wait_for": "networkidle",
                },
                timeout=30,
            )
            # Treat HTTP errors as failures rather than counting them as "ok"
            response.raise_for_status()
            data = response.json()
            results.append({
                "url": url,
                "status": "ok",
                "html": data.get("html", ""),
            })
        except (requests.RequestException, ValueError) as e:
            # Network/HTTP errors, or a response body that is not valid JSON
            results.append({"url": url, "status": "error", "error": str(e)})
        # Natural pacing: never machine-regular intervals
        time.sleep(random.uniform(min_delay, max_delay))
    successful = sum(1 for r in results if r["status"] == "ok")
    print(f"\nComplete: {successful}/{len(urls)} successful")
    return results
```

Sites Commonly Protected by DataDome
DataDome appears most frequently on sites where the data is commercially valuable and the site operator has invested in proper security. The identifying sign in DevTools is a cookie named datadome (the dd cookie discussed above) set by the domain, alongside DataDome-specific JavaScript loaded from *.datadome.co endpoints.
European fashion and luxury retail: Vinted, Le Bon Coin, Vestiaire Collective, and numerous luxury brand e-commerce sites. DataDome has particularly strong penetration in the French tech ecosystem, reflecting its Paris-based origins. For teams building fashion market intelligence tools, including Vinted scraping pipelines, DataDome bypass is a core infrastructure requirement.
Ticketing platforms: live event ticketing is one of DataDome's primary use cases. The business case is straightforward: scalper bots represent direct revenue damage to ticketing operators, creating strong investment motivation in anti-bot infrastructure.
Media and publishing: major news publishers and streaming platforms use DataDome to protect content from automated access that violates their terms. This includes paywall enforcement and content protection.
Travel and hospitality: airline booking pages, hotel chains, and travel aggregators. The combination of high-value inventory data and scalper bot risk makes DataDome a natural fit.
Retail and e-commerce: alongside Cloudflare and Akamai, DataDome is common on major European retailers and increasingly on US e-commerce platforms. As covered in the ScrapeBadger e-commerce scraping guide, identifying which anti-bot system protects a target before writing scraping code is essential to choosing the right approach.
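Identifying the protection system can start from the cookies a site sets. The sketch below maps commonly reported cookie names to vendors; the mapping is an assumption based on widely observed names and may be incomplete or change over time, so treat a match as a hint to confirm in DevTools, not as proof.

```python
# Commonly reported cookie names per vendor (an assumption; verify per target).
COOKIE_MARKERS = {
    "DataDome": {"datadome"},
    "Cloudflare": {"cf_clearance", "__cf_bm"},
    "Akamai": {"_abck", "bm_sz"},
    "PerimeterX": {"_px3", "_pxvid"},
}


def identify_antibot(cookie_names) -> list[str]:
    """Return the vendors whose marker cookies appear in cookie_names."""
    names = {name.lower() for name in cookie_names}
    return [
        vendor
        for vendor, markers in COOKIE_MARKERS.items()
        if names & markers
    ]


print(identify_antibot(["datadome", "session_id"]))
print(identify_antibot(["_abck", "bm_sz", "__cf_bm"]))
print(identify_antibot(["session_id"]))
```

With the requests library, the input could come from `response.cookies.keys()` on an ordinary page load of the target.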
DataDome vs. Other Anti-Bot Systems
Understanding where DataDome fits in the anti-bot landscape helps you diagnose what you're dealing with on a specific target.
Cloudflare is the most widely deployed and is generally bypassable with correct TLS fingerprinting and residential proxies for standard deployments. Enterprise Bot Management is harder. The ScrapeBadger Cloudflare bypass guide covers all six Cloudflare detection layers in technical depth.
Akamai Bot Manager uses a 512KB obfuscated JavaScript that generates a sensor_data payload, similar in concept to DataDome's dd cookie but with a more complex obfuscation architecture. The ScrapeBadger Akamai bypass guide covers the full detection pipeline.
DataDome is the most behaviourally sophisticated of the three. The per-site ML model approach means there's no static bypass that works universally: every DataDome deployment requires behavioural simulation calibrated to that specific site's traffic baseline.
PerimeterX (HUMAN Security) is DataDome's closest peer in behavioural sophistication. Both use ML-based session profiling rather than rule-based fingerprint matching. The ScrapeBadger PerimeterX bypass handles px cookie validation and the px3 challenge flow.
ScrapeBadger handles all four systems through the same API endpoint. The engine detects which system is protecting the target and applies the correct bypass approach automatically. You don't configure which anti-bot system you're targeting.
The Engineering Cost of DIY DataDome Bypass
The ScrapeBadger web scraping cost guide covers the economics of DIY versus API-based scraping for general infrastructure. For DataDome specifically, the cost curve is steeper than for most anti-bot systems because:
Building behavioural simulation from scratch requires understanding the specific site's traffic patterns, which means collecting real human session data from the target site first, training models on that data, then implementing simulation based on those models. This is a data science and engineering project, not just a scraping configuration project.
Because each protected site is effectively a different challenge, the per-site calibration work is not amortisable across targets. A bypass that works on a DataDome-protected fashion site may need significant re-calibration for a DataDome-protected ticketing site.
DataDome's models update continuously. DataDome's 2025 Global Bot Security Report found that bot traffic increased 4.5x in 2025. That growth drives corresponding investment in detection improvements, meaning the bypass landscape in Q4 2026 will be more challenging than Q1 2026, and any DIY solution built today needs ongoing maintenance investment to remain effective.
At a realistic senior developer rate, building and maintaining DataDome bypass for three production target sites represents 3 to 6 months of initial engineering followed by 15 to 25 hours of monthly maintenance. ScrapeBadger's infrastructure handles this as a service: you pay per successful request, nothing for failures, and there is no maintenance overhead when DataDome updates its models.
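To turn those ranges into a first-year figure, the arithmetic can be sketched as follows. The hourly rate and the hours in a working month are placeholder assumptions, not figures from this guide; substitute your own numbers.

```python
HOURLY_RATE = 100.0    # hypothetical rate, replace with your own
HOURS_PER_MONTH = 160  # assumed full-time working month


def first_year_cost(build_months: int, maintenance_hours_per_month: int) -> float:
    """Build cost plus maintenance for the months remaining in the first year."""
    build = build_months * HOURS_PER_MONTH * HOURLY_RATE
    maintenance = (12 - build_months) * maintenance_hours_per_month * HOURLY_RATE
    return build + maintenance


low = first_year_cost(build_months=3, maintenance_hours_per_month=15)
high = first_year_cost(build_months=6, maintenance_hours_per_month=25)
print(f"First-year estimate: {low:,.0f} to {high:,.0f}")
```

Under these placeholder inputs the build phase dominates the first year, and maintenance becomes the recurring cost from the second year on.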
Getting Started
ScrapeBadger's DataDome bypass is available on all plans with 1,000 free credits, no credit card required. Test against your specific targets before committing to production scale.
The quickest validation test: identify a DataDome-protected page you need to scrape, call it through ScrapeBadger with render_js=true, and check whether the response contains the data you're looking for or a DataDome challenge page. If it contains the data, the bypass is working.
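That check can be automated with a small helper. This is a rough heuristic sketch: the expected text is something you know appears on the real page, and the challenge markers are assumptions about what a challenge page typically references, so they may need adjusting for your target.

```python
# Assumed markers of a challenge page; adjust after inspecting real responses.
CHALLENGE_MARKERS = ("captcha-delivery.com", "captcha")


def classify_response(html: str, expected_text: str) -> str:
    """Classify returned HTML as 'data', 'challenge', or 'unknown'."""
    if expected_text in html:
        return "data"
    lowered = html.lower()
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return "challenge"
    return "unknown"


# Hypothetical response bodies for illustration.
product_page = "<html><h1>Blue Denim Jacket</h1><span>Price: 45.00</span></html>"
challenge_page = '<html><iframe src="https://geo.captcha-delivery.com/captcha/"></iframe></html>'

print(classify_response(product_page, expected_text="Price:"))
print(classify_response(challenge_page, expected_text="Price:"))
print(classify_response("<html></html>", expected_text="Price:"))
```

Checking for the expected text first matters because a legitimate page on a protected site can still reference DataDome's own script.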
Full technical documentation at docs.scrapebadger.com. For teams also dealing with Cloudflare and Akamai on different targets, all three are handled through the same endpoint with the same API key.

Written by
Thomas Shultz
Thomas Shultz is the Head of Data at ScrapeBadger, working on public web data, scraping infrastructure, and data reliability. He writes about real-world scraping, data pipelines, and turning unstructured web data into usable signals.