
How to Scrape Google Trends Data With ScrapeBadger

Thomas Shultz
18 min read

Google Trends sits in an interesting position in the data ecosystem. Everyone knows it exists. Marketers mention it in content briefs. SEOs use it for seasonal keyword timing. Product teams occasionally pull it up to validate a hunch. But very few organisations treat it as a systematic data source — one that runs on a schedule, feeds into pipelines, and drives decisions the same way sales data or analytics data does.

The ones that do have a meaningful edge. Google Trends is a 0–100 normalised index of search interest representing billions of queries daily across every country, category, and Google property. When you scrape it programmatically — at the right cadence, across the right keyword sets, combined with the right complementary data — it becomes a leading indicator for demand shifts, content opportunities, competitive movements, and market entry timing that no other freely available data source matches.

This guide covers how to do that with ScrapeBadger's Trends endpoints, what the data model actually means (most tutorials get this wrong), and the specific use cases where programmatic Trends data changes decisions.

What the Numbers Actually Mean

Before writing a single line of code, get this right. It is the single most common misinterpretation in Trends work, and it sits behind most analyses that reach bad conclusions.

Google Trends does not report search volume. It reports relative search interest, normalised to a 0–100 scale within the selected time frame and geography. The number 100 means "the highest point of interest in this keyword during this period in this region." A value of 50 means "half the interest of the peak." A value of 0 means "insufficient data."

This has two critical implications:

Implication 1: The numbers are not comparable across queries unless you compare them together in the same request. If you pull "coffee" in one request and get 75, then pull "tea" in another request and get 80, you cannot conclude tea is more popular. The 80 is normalised against the peak of tea; the 75 is normalised against the peak of coffee. These are different scales. To compare them accurately, you must include both keywords in a single Trends comparison request — which normalises them against each other.

Implication 2: Rising queries are more informative than absolute values for most business decisions. A keyword at 15 that was at 5 six months ago is more actionable than a keyword at 60 that's been at 60 for two years. The relative change — the trajectory — tells you where demand is heading. The absolute value tells you where it currently is.

Understanding this changes how you design your scraping pipeline. You're not collecting a score; you're tracking trajectories, comparisons, and the rate of change over time.
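
To make that concrete, here is a minimal sketch, in plain Python with no API calls, of the kind of trajectory metric worth tracking: it compares a recent average against an earlier baseline, which is the quantity the pipelines later in this guide keep coming back to.

python

def trajectory(series: list[int], recent_weeks: int = 4, baseline_weeks: int = 12) -> float:
    """
    Percentage change of the recent average vs. an earlier baseline.
    `series` is a list of weekly 0-100 interest values, oldest first.
    """
    if len(series) < recent_weeks + baseline_weeks:
        return 0.0
    recent = series[-recent_weeks:]
    baseline = series[-(recent_weeks + baseline_weeks):-recent_weeks]
    recent_avg = sum(recent) / len(recent)
    baseline_avg = sum(baseline) / len(baseline)
    return (recent_avg - baseline_avg) / baseline_avg * 100 if baseline_avg else 0.0


# A keyword at 15 that was sitting at 5 is more interesting than one flat at 60
print(trajectory([5] * 12 + [15] * 4))  # 200.0 (rising fast)
print(trajectory([60] * 16))            # 0.0 (established but flat)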

The Data Types Available

Google Trends exposes five distinct data types. ScrapeBadger's Trends endpoints return all of them in structured JSON. Each serves different analytical purposes:

Interest Over Time — A time series of normalised search interest values, weekly by default (daily for shorter ranges). This is the core data type and what most people think of as "Trends data." Daily granularity is available for ranges up to roughly nine months; longer ranges (up to five years) return weekly values, and anything beyond that returns monthly values.

Interest by Region — The geographic breakdown of search interest, showing which countries, states, cities, or regions have the highest relative interest in a keyword. Useful for geo-targeting decisions, market entry analysis, and regional campaign allocation.

Related Queries — Two sub-types that are often more valuable than the core interest data:

  • Top queries: The search terms most frequently associated with your keyword during the selected period

  • Rising queries: The search terms with the largest increase in frequency compared to the previous period — "Breakout" indicates a rise over 5,000%, meaning the keyword was barely searched before and is now exploding

Related Topics — Similar to Related Queries but at the topic level (Google Knowledge Graph entities rather than exact strings). More stable and semantically richer than query-level data.

Trending Now — Real-time trending searches updating approximately every 10 minutes, with associated search volume estimates and growth percentages. The only Trends data type that returns approximate search volumes rather than normalised indices.

Making Your First Trends Request

ScrapeBadger's Google Trends endpoint sits within the same Google Scraper product family as the Search, Maps, Shopping, and News endpoints — one API key, unified billing, consistent response structure. The full documentation is at docs.scrapebadger.com.

python

import requests
import json

API_KEY = "your_scrapebadger_key"
BASE_URL = "https://api.scrapebadger.com/v1/google/trends"

def get_trends_interest_over_time(
    keywords: list[str],
    timeframe: str = "today 12-m",
    geo: str = "",  # "" = worldwide, "US", "GB", "DE-BY" etc.
    category: int = 0,  # 0 = all categories
    gprop: str = "",  # "" = web search, "youtube", "news", "images", "froogle"
) -> dict:
    """
    Get interest over time for 1–5 keywords.
    Keywords are compared against each other on a normalised 0-100 scale.
    """
    response = requests.get(
        f"{BASE_URL}/interest_over_time",
        headers={"X-API-Key": API_KEY},
        params={
            "q": ",".join(keywords),  # Comma-separated for multi-keyword comparison
            "date": timeframe,
            "geo": geo,
            "cat": category,
            "gprop": gprop,
        }
    )
    return response.json()


# Example: compare two competing keywords
data = get_trends_interest_over_time(
    keywords=["web scraping api", "data extraction api"],
    timeframe="today 12-m",
    geo="US"
)

# Parse the timeline
timeline = data.get("interest_over_time", {}).get("timeline_data", [])
for week in timeline[-8:]:  # Last 8 weeks
    date = week.get("date")
    values = {v["query"]: v["extracted_value"] for v in week.get("values", [])}
    print(f"{date}: {values}")

The response structure:

json

{
  "interest_over_time": {
    "timeline_data": [
      {
        "date": "Apr 6 – 12, 2026",
        "timestamp": "1743897600",
        "values": [
          {"query": "web scraping api", "value": "52", "extracted_value": 52},
          {"query": "data extraction api", "value": "18", "extracted_value": 18}
        ],
        "has_data": [true, true]
      }
    ]
  }
}

Related Queries: Rising Demand Signals

Rising queries are where the real intelligence lies. A keyword appearing as a breakout rising query against your seed term is a demand signal before it shows up in any keyword tool's search volume data.

python

def get_related_queries(
    keyword: str,
    timeframe: str = "today 12-m",
    geo: str = "",
) -> dict:
    """
    Get top and rising related queries for a keyword.
    Rising queries indicate where demand is heading.
    """
    response = requests.get(
        f"{BASE_URL}/related_queries",
        headers={"X-API-Key": API_KEY},
        params={
            "q": keyword,
            "date": timeframe,
            "geo": geo,
        }
    )

    data = response.json()
    related = data.get("related_queries", {})

    return {
        "top_queries": [
            {
                "query": q.get("query"),
                "value": q.get("extracted_value"),
            }
            for q in related.get("top", {}).get("ranked_list", [])
        ],
        "rising_queries": [
            {
                "query": q.get("query"),
                "value": q.get("extracted_value"),  # % increase; "Breakout" = 5000%+
                "is_breakout": q.get("value") == "Breakout",
            }
            for q in related.get("rising", {}).get("ranked_list", [])
        ],
    }


# Find what's rising around a topic
queries = get_related_queries("AI agents", geo="US")

print("šŸ”„ Rising queries (emerging demand):")
for q in queries["rising_queries"][:10]:
    breakout = "šŸš€ BREAKOUT" if q["is_breakout"] else f"+{q['value']}%"
    print(f"  {q['query']}: {breakout}")

print("\nšŸ“Š Top queries (established demand):")
for q in queries["top_queries"][:5]:
    print(f"  {q['query']}: {q['value']}")

Geographic Interest Breakdown

For any business making regional marketing or expansion decisions, the geographic breakdown is the single most actionable data type Trends offers.

python

def get_interest_by_region(
    keyword: str,
    timeframe: str = "today 12-m",
    geo: str = "",
    resolution: str = "COUNTRY",  # "COUNTRY", "REGION", "CITY", "DMA"
) -> list[dict]:
    """
    Get geographic breakdown of search interest.
    resolution: COUNTRY for global, REGION for within a country (e.g., geo="US")
    """
    response = requests.get(
        f"{BASE_URL}/interest_by_region",
        headers={"X-API-Key": API_KEY},
        params={
            "q": keyword,
            "date": timeframe,
            "geo": geo,
            "resolution": resolution,
        }
    )

    data = response.json()
    return data.get("interest_by_region", [])


# Find which US states have highest interest in a topic
regions = get_interest_by_region(
    "web scraping",
    geo="US",
    resolution="REGION"
)

# Sort by interest value
regions_sorted = sorted(regions, key=lambda x: x.get("extracted_value", 0), reverse=True)
print("Top US states by search interest in 'web scraping':")
for region in regions_sorted[:10]:
    print(f"  {region.get('location')}: {region.get('extracted_value')}/100")

Trending Now: Real-Time Spikes

The Trending Now endpoint is fundamentally different from everything else — it returns what's spiking right now, with approximate search volume and percentage growth:

python

def get_trending_now(
    geo: str = "US",
    category: str = "all",  # "all", "business", "entertainment", "health", etc.
    hours: int = 24,  # 4, 24, 48, 168 (7 days)
) -> list[dict]:
    """
    Get real-time trending searches with volume and growth data.
    This is the only Trends endpoint that returns approximate search counts.
    """
    response = requests.get(
        f"{BASE_URL}/trending_now",
        headers={"X-API-Key": API_KEY},
        params={
            "geo": geo,
            "category": category,
            "hours": hours,
        }
    )

    data = response.json()
    trending = data.get("trending_searches", [])

    return [
        {
            "title": t.get("title"),
            "search_volume": t.get("search_volume"),
            "growth_percentage": t.get("increase_percentage"),
            "started": t.get("started"),
            "related_queries": [q.get("query") for q in t.get("related_queries", [])],
            "category": t.get("category"),
        }
        for t in trending
    ]


trending = get_trending_now(geo="US", hours=24)
print("šŸ”„ Trending now (US, last 24 hours):")
for t in trending[:10]:
    growth = f"+{t['growth_percentage']}%" if t['growth_percentage'] else ""
    volume = f"~{t['search_volume']:,} searches" if t['search_volume'] else ""
    print(f"\n  {t['title']} {growth} {volume}")
    if t['related_queries']:
        print(f"    Related: {', '.join(t['related_queries'][:3])}")

Building Real Pipelines: Three Approaches That Drive Decisions

The code above handles individual queries. The value comes from systematic pipelines that run on a schedule and feed into decision-making workflows. Here are three that we've seen deliver clear returns.

Pipeline 1: Competitive Share-of-Search Tracker

Share of Search — the relative search interest for your brand versus competitors — is one of the most robust leading indicators of market share change. Research by Les Binet and others has demonstrated that Share of Search predicts market share with a 6–12 month lead time. Tracking it weekly costs almost nothing with programmatic Trends access.

python

import json
from datetime import datetime

def track_share_of_search(
    brands: dict,  # {"brand_name": "search_term"} — search term may differ from brand name
    geo: str = "US",
    weeks_back: int = 52,
) -> dict:
    """
    Track Share of Search for a competitive set over time.
    Returns weekly index values for each brand on a normalised scale.
    """
    timeframe = f"today {weeks_back // 4}-m" if weeks_back >= 4 else "now 7-d"

    response = requests.get(
        f"{BASE_URL}/interest_over_time",
        headers={"X-API-Key": API_KEY},
        params={
            "q": ",".join(brands.values()),
            "date": timeframe,
            "geo": geo,
        }
    )

    data = response.json()
    timeline = data.get("interest_over_time", {}).get("timeline_data", [])

    # Reverse map from query to brand name
    query_to_brand = {v: k for k, v in brands.items()}

    results = {brand: [] for brand in brands}

    for week in timeline:
        for value_data in week.get("values", []):
            query = value_data.get("query", "")
            brand = query_to_brand.get(query, query)
            if brand in results:
                results[brand].append({
                    "date": week.get("date"),
                    "timestamp": week.get("timestamp"),
                    "value": value_data.get("extracted_value", 0),
                })

    # Calculate average share for the most recent 4 weeks
    recent_shares = {}
    for brand, data_points in results.items():
        recent = [d["value"] for d in data_points[-4:]]
        recent_shares[brand] = round(sum(recent) / len(recent), 1) if recent else 0

    total = sum(recent_shares.values())
    share_of_search = {
        brand: round(value / total * 100, 1) if total > 0 else 0
        for brand, value in recent_shares.items()
    }

    return {
        "period": f"Last 4 weeks ({geo})",
        "share_of_search": share_of_search,
        "trend_data": results,
        "generated_at": datetime.utcnow().isoformat(),
    }


# Track competitive landscape for web scraping tools
report = track_share_of_search(
    brands={
        "ScrapeBadger": "scrapebadger",
        "Bright Data": "bright data scraping",
        "Apify": "apify scraping",
        "ScrapingBee": "scrapingbee",
    },
    geo="US",
    weeks_back=52
)

print(f"\nShare of Search ({report['period']}):")
for brand, share in sorted(report["share_of_search"].items(), key=lambda x: -x[1]):
    bar = "ā–ˆ" * int(share / 2)
    print(f"  {brand:20} {share:5.1f}%  {bar}")

Run this weekly and you have a 52-week time series of brand search momentum — data that tells you whether your marketing investments are building brand demand, and whether competitors are gaining or losing ground.
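
A minimal way to build that time series is to append each weekly report to a local file and let the history accumulate. The file path and the cron line below are placeholders; swap in whatever store and scheduler you already run.

python

import json
from pathlib import Path

def append_weekly_snapshot(report: dict, path: str = "share_of_search.jsonl") -> None:
    """Append one weekly Share of Search report as a JSON line (placeholder path)."""
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps({
            "generated_at": report["generated_at"],
            "share_of_search": report["share_of_search"],
        }) + "\n")


append_weekly_snapshot(report)
# Run the script on a schedule, e.g. a weekly cron entry:
#   0 6 * * 1  /usr/bin/python3 /path/to/share_of_search.py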

Pipeline 2: Content Calendar Generator

The most reliable use of Trends data for content teams is seasonal timing. Publishing content 6–8 weeks before a topic's annual search interest peak means your content gets indexed and builds authority before the audience shows up. Publishing at the peak is already too late.

python

def generate_content_calendar(
    topics: list[str],
    geo: str = "US",
) -> list[dict]:
    """
    Analyse seasonal patterns for topics and recommend optimal publication windows.
    Uses 5-year historical data to identify consistent annual patterns.
    """
    calendar = []

    # Needed to convert Unix timestamps into week numbers and month names
    from datetime import datetime

    for topic in topics:
        # Get 5 years of weekly data to identify seasonal pattern
        response = requests.get(
            f"{BASE_URL}/interest_over_time",
            headers={"X-API-Key": API_KEY},
            params={
                "q": topic,
                "date": "today 5-y",
                "geo": geo,
            }
        )

        data = response.json()
        timeline = data.get("interest_over_time", {}).get("timeline_data", [])

        if not timeline:
            continue

        # Extract weekly values with week numbers
        weekly_data = []
        for point in timeline:
            ts = int(point.get("timestamp", 0))
            if ts:
                dt = datetime.fromtimestamp(ts)
                week_num = dt.isocalendar()[1]
                values = point.get("values", [])
                if values:
                    weekly_data.append({
                        "week": week_num,
                        "month": dt.strftime("%B"),
                        "value": values[0].get("extracted_value", 0),
                        "year": dt.year,
                    })

        if not weekly_data:
            continue

        # Find average interest by week number across all years
        week_averages = {}
        for d in weekly_data:
            wk = d["week"]
            if wk not in week_averages:
                week_averages[wk] = []
            week_averages[wk].append(d["value"])

        week_means = {
            wk: sum(vals) / len(vals)
            for wk, vals in week_averages.items()
        }

        # Find peak week
        peak_week = max(week_means, key=week_means.get)
        peak_value = week_means[peak_week]

        # Find the month of peak week
        peak_data = [d for d in weekly_data if d["week"] == peak_week]
        peak_month = peak_data[0]["month"] if peak_data else "Unknown"

        # Recommend publishing 6-8 weeks before peak
        optimal_publish_week = (peak_week - 7) % 52 or 52
        optimal_data = [d for d in weekly_data if d["week"] == optimal_publish_week]
        optimal_month = optimal_data[0]["month"] if optimal_data else "Unknown"

        # Is this topic currently rising (last 4 weeks vs previous 4 weeks)?
        if len(timeline) >= 8:
            recent = [
                w.get("values", [{}])[0].get("extracted_value", 0)
                for w in timeline[-4:]
            ]
            previous = [
                w.get("values", [{}])[0].get("extracted_value", 0)
                for w in timeline[-8:-4]
            ]
            recent_avg = sum(recent) / len(recent) if recent else 0
            prev_avg = sum(previous) / len(previous) if previous else 0
            trend_direction = "↑ Rising" if recent_avg > prev_avg * 1.1 else \
                             "↓ Declining" if recent_avg < prev_avg * 0.9 else "→ Stable"
        else:
            trend_direction = "Unknown"

        calendar.append({
            "topic": topic,
            "peak_month": peak_month,
            "peak_interest": round(peak_value),
            "optimal_publish_month": optimal_month,
            "current_trend": trend_direction,
            "seasonality_strength": "High" if peak_value > 70 else "Medium" if peak_value > 40 else "Low",
        })

    return sorted(calendar, key=lambda x: x["peak_interest"], reverse=True)


topics = [
    "web scraping python",
    "data extraction tools",
    "price monitoring software",
    "competitor analysis tools",
    "real estate data api",
]

calendar = generate_content_calendar(topics, geo="US")

print("šŸ“… Content Calendar Recommendations:")
print(f"{'Topic':<30} {'Peak Month':<15} {'Publish By':<15} {'Trend':<15} {'Seasonality'}")
print("-" * 95)
for item in calendar:
    print(
        f"{item['topic']:<30} "
        f"{item['peak_month']:<15} "
        f"{item['optimal_publish_month']:<15} "
        f"{item['current_trend']:<15} "
        f"{item['seasonality_strength']}"
    )

Pipeline 3: Market Entry Signal Monitor

Before investing in a new geographic market, most teams look at demographics and market size data. What they rarely check — and should — is search demand trajectory. A market where search interest in your category is growing 40% year-over-year is a different investment decision from one where it's flat.

python

def analyse_market_entry_signals(
    category_keywords: list[str],
    markets: dict,  # {"Market Name": "geo_code"}
) -> list[dict]:
    """
    Compare search demand trajectory across potential markets.
    Identifies where interest is growing fastest — leading indicator of market readiness.
    """
    market_signals = []

    for market_name, geo_code in markets.items():
        # Get 2 years of data to calculate YoY growth
        response = requests.get(
            f"{BASE_URL}/interest_over_time",
            headers={"X-API-Key": API_KEY},
            params={
                "q": ",".join(category_keywords[:3]),  # Max 3 for clarity
                "date": "today 24-m",
                "geo": geo_code,
            }
        )

        data = response.json()
        timeline = data.get("interest_over_time", {}).get("timeline_data", [])

        if len(timeline) < 24:
            continue

        # Calculate average interest: first 12 months vs last 12 months
        n_keywords = len(category_keywords[:3])  # match the capped keyword set sent in the request
        mid = len(timeline) // 2
        first_half_avg = sum(
            sum(v.get("extracted_value", 0) for v in w.get("values", []))
            for w in timeline[:mid]
        ) / (mid * n_keywords) if mid > 0 else 0

        second_half_avg = sum(
            sum(v.get("extracted_value", 0) for v in w.get("values", []))
            for w in timeline[mid:]
        ) / ((len(timeline) - mid) * n_keywords) if mid > 0 else 0

        yoy_growth = (
            (second_half_avg - first_half_avg) / first_half_avg * 100
            if first_half_avg > 0 else 0
        )

        # Get regional breakdown to understand which cities are leading
        region_response = requests.get(
            f"{BASE_URL}/interest_by_region",
            headers={"X-API-Key": API_KEY},
            params={
                "q": category_keywords[0],
                "date": "today 12-m",
                "geo": geo_code,
                "resolution": "REGION",
            }
        )

        region_data = region_response.json().get("interest_by_region", [])
        top_regions = sorted(
            region_data,
            key=lambda x: x.get("extracted_value", 0),
            reverse=True
        )[:3]

        market_signals.append({
            "market": market_name,
            "geo": geo_code,
            "current_avg_interest": round(second_half_avg, 1),
            "yoy_growth": round(yoy_growth, 1),
            "growth_signal": "šŸš€ Strong" if yoy_growth > 25 else
                           "šŸ“ˆ Growing" if yoy_growth > 10 else
                           "→ Flat" if yoy_growth > -10 else "šŸ“‰ Declining",
            "hottest_regions": [r.get("location") for r in top_regions],
        })

    return sorted(market_signals, key=lambda x: x["yoy_growth"], reverse=True)


# Analyse expansion opportunity across European markets
signals = analyse_market_entry_signals(
    category_keywords=["web scraping", "data extraction api", "scraping service"],
    markets={
        "United Kingdom": "GB",
        "Germany": "DE",
        "France": "FR",
        "Netherlands": "NL",
        "Poland": "PL",
    }
)

print("\nšŸŒ Market Entry Signal Analysis:")
print(f"{'Market':<20} {'Interest':<12} {'YoY Growth':<15} {'Signal':<15} {'Hottest Regions'}")
print("-" * 90)
for m in signals:
    regions = ", ".join(m["hottest_regions"][:2]) if m["hottest_regions"] else "N/A"
    print(
        f"{m['market']:<20} "
        f"{m['current_avg_interest']:<12} "
        f"{m['yoy_growth']:+.1f}%{'':<10} "
        f"{m['growth_signal']:<15} "
        f"{regions}"
    )

Trends data gets dramatically more valuable when combined with other signals. Because ScrapeBadger's Google Scraper covers 8 Google products in a single API, combining these sources is a single integration rather than a multi-vendor engineering project.

Trends + SERP: A keyword showing rising Trends interest combined with SERP analysis showing low-authority pages in the top 10 is a content opportunity. The demand is growing and the competition is weak. Run the Trends pipeline to find rising keywords, then feed them into the Search endpoint to score competitive difficulty.
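
As a sketch of that handoff, the function below takes the rising queries from get_related_queries and checks each one against the Search endpoint. The SEARCH_URL path, the gl/num parameters, and the organic_results field are assumptions for illustration; the actual SERP response fields are in ScrapeBadger's Search endpoint documentation, and the "strong domain" list is just a placeholder for whatever difficulty heuristic you prefer.

python

SEARCH_URL = "https://api.scrapebadger.com/v1/google/search"  # assumed path; see the Search docs

def score_rising_opportunities(
    seed_keyword: str,
    geo: str = "US",
    strong_domains: set[str] | None = None,
) -> list[dict]:
    """
    Feed rising queries from get_related_queries() into a SERP check.
    Counts how many top-10 results come from high-authority domains as a
    rough difficulty proxy. Endpoint path and response field names are
    illustrative assumptions, not confirmed API details.
    """
    strong_domains = strong_domains or {"wikipedia.org", "github.com", "stackoverflow.com"}
    opportunities = []

    for q in get_related_queries(seed_keyword, geo=geo)["rising_queries"]:
        resp = requests.get(
            SEARCH_URL,
            headers={"X-API-Key": API_KEY},
            params={"q": q["query"], "gl": geo.lower(), "num": 10},
        )
        results = resp.json().get("organic_results", [])  # assumed field name
        strong_count = sum(
            any(domain in r.get("link", "") for domain in strong_domains)
            for r in results[:10]
        )
        opportunities.append({
            "query": q["query"],
            "rising_value": q["value"],
            "strong_results_in_top_10": strong_count,
        })

    # Fewest entrenched domains first: growing demand and weaker competition
    return sorted(opportunities, key=lambda x: x["strong_results_in_top_10"])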

Trends + News: When a keyword spikes in Trends but you need to understand why, the Google News endpoint returns what articles are driving the search behaviour. This is valuable for crisis monitoring — a brand mention spike in Trends triggered by negative news is a different response than one triggered by a product launch.

Trends + Jobs: Rising search interest in a technology or skill, combined with rising job postings for that technology via the Google Jobs endpoint, is a strong signal of sustained demand rather than a news-driven spike. For training companies, consulting firms, and SaaS tools targeting specific skill areas, this combination validates content investment.

Trends + Shopping: Trending search interest in a product category combined with Google Shopping data showing thin competition or high average prices is a market opportunity signal for e-commerce teams. This is the kind of multi-source analysis that the web scraping for business article covers as one of the highest-ROI applications of programmatic data collection.

A Note on PyTrends — and Why It's Not a Production Solution

Most developers discover PyTrends first — it's the unofficial Python library that wraps Google Trends' internal API. For personal projects and one-off analyses, it works fine. For production pipelines, it has three problems worth knowing about:

The library is no longer actively maintained. Its GitHub repository has been largely dormant, and it breaks regularly when Google updates its internal API structure. When it breaks, there's no SLA and no guarantee of a fix timeline.

It requires your own IP and session management. Google rate-limits Trends requests aggressively — roughly 100 requests per hour is a commonly cited limit before throttling begins. Running any meaningful volume requires proxy rotation, which means either managing your own proxy infrastructure or accepting blocks.

It returns raw JSON that needs heavy parsing. The internal API responses include an anti-XSSI prefix (a leading )]}' line that must be stripped before the JSON parses), deeply nested structures, and inconsistent field naming across data types. You end up writing a significant amount of wrapper code to get clean, usable data.

ScrapeBadger handles all three: maintained infrastructure that updates when Google changes its API, residential proxy rotation built in, and structured JSON responses that map directly to the data model described in this article. The full Trends endpoint documentation covers every parameter and response field.

One development worth flagging: Google launched an official Trends API in alpha in July 2025 — the first time the company has offered programmatic Trends access in a supported, official capacity. The alpha returns consistently scaled data going back 1,800 days (5 years), with daily, weekly, monthly, and yearly aggregations, plus region and subregion breakdowns as defined by ISO 3166-2.

It's still limited-access at the time of writing. Only a small group of alpha testers can use it, and general availability has not been announced. When it does reach general availability, it will change the landscape for some use cases — particularly researchers and analysts who need historical data at daily granularity with full Google support.

For production pipelines that need to run today, at scale, with the full breadth of Trends data types, ScrapeBadger's infrastructure remains the most reliable path. When the official API reaches GA, it will likely complement rather than replace third-party access — the same way the official Google Places API (5 reviews maximum) coexists with the Maps scraping approach for teams that need complete data.


The intelligence value of Google Trends is not in the numbers themselves. It's in the trajectories, the comparisons, the rising queries, and the geographic patterns — and it compounds significantly when combined with search result data, news monitoring, and job market signals from the same API. For teams running it systematically, on a schedule, as a proper data pipeline rather than a manual lookup tool, it's one of the most reliable leading indicators available.

Start with the free trial and make your first Trends API call today. The complete endpoint documentation covers every parameter, including timeframe formats, category codes, and regional resolution options.

Written by

Thomas Shultz

Thomas Shultz is the Head of Data at ScrapeBadger, working on public web data, scraping infrastructure, and data reliability. He writes about real-world scraping, data pipelines, and turning unstructured web data into usable signals.
