How to Scrape Twitter Replies and Conversations: Step-By-Step Guide

Reply threads are where the real signal lives. The original tweet is the headline — the replies are the actual conversation, the objections, the customer complaints, the word-of-mouth recommendations, the unfiltered opinions that people don't bother turning into a proper post.

Most Twitter scraping guides focus on keyword search or user timelines. Replies get ignored because they're structurally messier to collect. A single tweet can have hundreds of reply branches, replies-to-replies, and quote tweets that all technically belong to the same conversation. If you want that data in a clean, usable format, you need to think about the structure before you write a single line of code.

This guide covers how reply scraping actually works, what the data structure looks like, how to build a pipeline that collects full conversation threads, and what breaks at scale.

Why Reply Data Is Different from Regular Tweet Data

A standard keyword search returns a flat list of tweets. Each tweet object is self-contained — you get the text, the author, the metrics, the timestamp. You can write it to CSV and move on.

Reply threads don't work that way. A conversation is a tree. The root tweet is the parent. Each reply is a child node that can itself have replies, creating branches that can go several levels deep. A high-engagement tweet might have 500 direct replies, some of which have 50 replies each.

This creates a few problems that polling-style scraping doesn't handle well:

Traversal depth. If you query for replies to a tweet and only fetch the first level, you're missing a significant portion of the actual conversation. The interesting back-and-forth usually happens deeper in the thread.

Volume unpredictability. You can't predict how large a conversation will be before you start collecting it. A tweet from an account you're monitoring might generate 10 replies or 10,000.

Relationship tracking. Every reply has an in_reply_to_tweet_id field that links it to its parent. You need to preserve this to reconstruct the tree structure downstream. If you discard it and treat replies as flat data, you lose the conversational context.

Deduplication across branches. When collecting a deep thread, the same tweet can appear in multiple traversal paths. Deduplication by tweet_id is mandatory, not optional.

What the Data Structure Actually Looks Like

Before building the collection layer, it helps to know what you're working with. A tweet that is a reply contains a few key fields beyond the standard payload:

{
    "id": "1234567890",
    "text": "That's a fair point, but...",
    "author_id": "9876543210",
    "created_at": "2025-03-15T14:22:00.000Z",
    "in_reply_to_tweet_id": "1234567800",       # Parent tweet ID
    "conversation_id": "1234567000",             # Root tweet ID of entire thread
    "public_metrics": {
        "like_count": 12,
        "retweet_count": 3,
        "reply_count": 7,                        # Number of replies to THIS tweet
        "quote_count": 1
    },
    "user": {
        "id": "9876543210",
        "username": "someuser",
        "name": "Some User"
    }
}

The conversation_id is the most important field for thread reconstruction. It always points to the root tweet of the thread, regardless of how deep in the tree a given reply is. If you want to collect everything in a conversation, query by conversation_id.

The in_reply_to_tweet_id gives you the direct parent. You need both to reconstruct the tree correctly.

Three Ways to Collect Reply Data

1. Search-Based Collection (Most Practical)

The most reliable approach for most use cases is to use the search endpoint with a conversation_id filter. This returns all tweets in a conversation as a flat stream, which you then reconstruct into a tree using the parent ID relationships.

The query looks like: conversation_id:1234567000

This is the workhorse approach. It's predictable, handles pagination cleanly, and works well for collecting conversations after the fact. The limitation is that it's retrospective — you need to know the conversation_id upfront, which means you're typically monitoring for tweets first, then fetching their reply threads separately.

2. Timeline + Reply Traversal

For monitoring specific accounts, you can pull a user's timeline, identify tweets that have replies (reply_count > 0), and then fetch the reply thread for each one. This is more complex to orchestrate but gives you complete coverage of everything a specific user has posted and the conversations that followed.

The challenge here is volume. A prolific account with engaged followers can generate hundreds of conversation threads per week. You need sensible filters — minimum reply count thresholds, recency windows, topic relevance — to avoid collecting noise at scale.

3. Real-Time Filtering (For Active Monitoring)

If you need replies as they happen rather than in retrospective batches, a filter-rules-based approach is more appropriate. You define rules that match tweet text or account mentions, and matching tweets (including replies) get pushed to your consumer as they're posted. This is covered in more depth in the real-time Twitter monitoring pipeline guide.

For most startup use cases — brand monitoring, customer feedback, market research — the search-based batch approach is the right starting point. It's simpler, more controllable, and easier to reason about when something goes wrong.

Building the Reply Scraper: Step by Step

Step 1: Set Up the Environment

python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# .venv\Scripts\activate   # Windows

pip install scrapebadger
pip freeze > requirements.txt

Set your API key as an environment variable:

export SCRAPEBADGER_API_KEY="YOUR_API_KEY"

Project structure:

twitter-reply-scraper/
  scrape_replies.py
  output/

Step 2: Fetch the Reply Thread for a Given Tweet

Start with the smallest working version. This confirms the endpoint behaves as expected and shows you what the response looks like before you build the full pipeline.

import asyncio
import os
from scrapebadger import ScrapeBadger

async def fetch_replies(tweet_id: str, limit: int = 100):
    """
    Fetches replies to a tweet using its conversation_id.
    The conversation_id and tweet_id are the same for the root tweet.
    """
    api_key = os.getenv("SCRAPEBADGER_API_KEY")
    if not api_key:
        raise RuntimeError("Missing SCRAPEBADGER_API_KEY environment variable")

    async with ScrapeBadger(api_key=api_key) as client:
        # Search for all tweets in this conversation, excluding the root tweet itself
        query = f"conversation_id:{tweet_id} -from:root_author is:reply"
        stream = client.twitter.tweets.search_all(query, max_items=limit)

        async for tweet in stream:
            print({
                "id": tweet.get("id"),
                "text": tweet.get("text", "")[:100],
                "parent_id": tweet.get("in_reply_to_tweet_id"),
                "author": (tweet.get("user") or {}).get("username"),
                "replies": (tweet.get("public_metrics") or {}).get("reply_count", 0),
            })

if __name__ == "__main__":
    asyncio.run(fetch_replies("PASTE_TWEET_ID_HERE", limit=50))

What to check in the output: - Are in_reply_to_tweet_id fields populated? - Do conversation_id values all point to the same root? - Are nested replies (replies to replies) included in the results?

Step 3: Normalize and Export with Full Thread Context

Once the basic fetch works, build the production version. This normalizes each tweet, preserves the parent relationship for tree reconstruction, deduplicates by tweet_id, and writes to CSV atomically.

import asyncio
import csv
import os
import time
from scrapebadger import ScrapeBadger

CSV_COLUMNS = [
    "tweet_id",
    "conversation_id",
    "parent_tweet_id",
    "created_at",
    "username",
    "text",
    "like_count",
    "retweet_count",
    "reply_count",
    "depth_indicator",  # Non-null parent_tweet_id != conversation_id means nested reply
]

def normalize_reply(tweet: dict, root_id: str) -> dict:
    """
    Normalizes a raw tweet object into a stable, flat schema.
    Preserves parent relationship fields for downstream tree reconstruction.
    """
    metrics = tweet.get("public_metrics") or {}
    user = tweet.get("user") or {}
    parent_id = str(tweet.get("in_reply_to_tweet_id") or "")

    # Depth indicator: "direct" = reply to root, "nested" = reply to reply
    depth = "direct" if parent_id == root_id else "nested"

    return {
        "tweet_id":        str(tweet.get("id") or ""),
        "conversation_id": root_id,
        "parent_tweet_id": parent_id,
        "created_at":      str(tweet.get("created_at") or ""),
        "username":        str(user.get("username") or ""),
        "text":            str(tweet.get("text") or ""),
        "like_count":      int(metrics.get("like_count") or 0),
        "retweet_count":   int(metrics.get("retweet_count") or 0),
        "reply_count":     int(metrics.get("reply_count") or 0),
        "depth_indicator": depth,
    }

async def export_conversation_to_csv(
    root_tweet_id: str,
    max_items: int,
    out_path: str,
    hard_timeout_seconds: int = 600,
):
    api_key = os.getenv("SCRAPEBADGER_API_KEY")
    if not api_key:
        raise RuntimeError("Missing SCRAPEBADGER_API_KEY environment variable")

    started = time.time()
    seen_ids: set[str] = set()

    async with ScrapeBadger(api_key=api_key) as client:
        query = f"conversation_id:{root_tweet_id} is:reply"
        stream = client.twitter.tweets.search_all(query, max_items=max_items)

        tmp_path = out_path + ".tmp"
        with open(tmp_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=CSV_COLUMNS)
            writer.writeheader()

            async for tweet in stream:
                if time.time() - started > hard_timeout_seconds:
                    print("Hard timeout reached — stopping.")
                    break

                if not isinstance(tweet, dict):
                    tweet = getattr(tweet, "model_dump", lambda: dict(tweet))()

                row = normalize_reply(tweet, root_tweet_id)

                if not row["tweet_id"]:
                    continue
                if row["tweet_id"] in seen_ids:
                    continue

                seen_ids.add(row["tweet_id"])
                writer.writerow(row)

        os.replace(tmp_path, out_path)
        print(f"Exported {len(seen_ids)} replies to {out_path}")

if __name__ == "__main__":
    asyncio.run(export_conversation_to_csv(
        root_tweet_id="PASTE_TWEET_ID_HERE",
        max_items=500,
        out_path="output/replies.csv",
        hard_timeout_seconds=600,
    ))

The depth_indicator column is a simple signal for downstream filtering. Direct replies to the root tweet are often the most useful for brand monitoring and feedback analysis. Nested replies provide conversational context but require more careful interpretation.

Step 4: Batch Collection Across Multiple Tweets

In practice, you're rarely collecting replies for just one tweet. More common: monitor a set of accounts, collect their recent tweets, then fetch reply threads for any tweet above an engagement threshold.

async def batch_collect_replies(
    tweet_ids: list[str],
    min_reply_count: int = 5,
    max_per_thread: int = 200,
    output_dir: str = "output",
):
    """
    Collects reply threads for a list of tweet IDs.
    Skips threads below the minimum reply count threshold.
    """
    import os
    os.makedirs(output_dir, exist_ok=True)

    for tweet_id in tweet_ids:
        out_path = f"{output_dir}/replies_{tweet_id}.csv"
        print(f"Collecting replies for tweet {tweet_id}...")
        await export_conversation_to_csv(
            root_tweet_id=tweet_id,
            max_items=max_per_thread,
            out_path=out_path,
        )
        # Pause between jobs to avoid hammering the API
        await asyncio.sleep(2)

If you're running this as a recurring pipeline — say, collecting reply threads from competitor announcements or product launch tweets — the monitoring bot architecture handles the scheduling and deduplication layer cleanly. Build the reply collection as a separate step that runs after the initial tweet collection.

Common Failure Modes

Failure Mode	What It Looks Like	Fix
Missing nested replies	CSV has only top-level replies, thread looks incomplete	Verify `is:reply` filter is not excluding nested replies; check depth_indicator distribution
Duplicates across runs	Same tweet_id appears multiple times in output	Deduplicate by tweet_id in-memory per run; persist seen IDs for recurring jobs
Empty result for large thread	High reply_count on root tweet but CSV is sparse	Increase max_items; check if thread is older than search window allows
Broken parent relationships	parent_tweet_id is empty or inconsistent	Fall back to conversation_id for grouping; check in_reply_to_tweet_id field availability
Timeout on very large threads	Script stops mid-collection	Lower max_items, run in chunks by time window, increase hard_timeout_seconds

What Reply Data Is Actually Useful For

The use cases split into two categories: real-time response and retrospective analysis.

Real-time response means monitoring for replies to your own tweets or brand mentions and routing relevant ones to a queue for human review. Customer complaints, support questions, and inbound leads all show up in reply threads before they show up anywhere else. The latency between someone posting a reply and your team seeing it is entirely a function of how your pipeline is configured.

Retrospective analysis is where reply data gets genuinely interesting at scale. Collecting reply threads from competitor product announcements tells you exactly what customers think about new features — unfiltered, in their own language. Collecting replies to industry discussion threads tells you what problems people are actively trying to solve. This is the kind of qualitative signal that's hard to get from surveys and impossible to get from usage analytics.

For teams building datasets for NLP or machine learning, reply threads are particularly valuable because they contain conversational back-and-forth rather than isolated statements. If you're building training data for a sentiment classifier or a retrieval system, conversation structure matters — see how to build a Twitter dataset for machine learning for the full data preparation workflow.

Pricing Reference: What Reply Collection Costs

ScrapeBadger uses credit-based pricing. Reply collection costs are the same as standard tweet collection — you pay per item returned, not per API call.

Collection Type	Typical Volume per Run	Estimated Credits	Notes
Single thread (small)	50–200 replies	50–200	One product announcement, one news tweet
Single thread (large)	500–2,000 replies	500–2,000	Viral tweets, major announcements
Batch (10 threads/day)	1,000–5,000 replies/day	1,000–5,000	Active brand monitoring pipeline
Batch (50 threads/day)	5,000–25,000 replies/day	5,000–25,000	Market research, competitor tracking

At $0.10 per 1,000 credits, collecting 5,000 replies per day costs roughly $0.50/day. The free trial includes 1,000 credits — enough to validate a full collection pipeline before committing to anything.

FAQ

What is a Twitter replies scraper? A Twitter replies scraper is a script or pipeline that collects reply tweets from a conversation thread — everything posted in response to a root tweet. In practice, this means querying by conversation_id to get all replies in a thread, normalizing each tweet object, and storing the results with parent relationship fields preserved.

How do I get all replies to a tweet? Use a conversation_id:<tweet_id> query against the search endpoint, combined with an is:reply filter. The conversation ID for the root tweet is the same as its tweet ID. This returns all replies in the thread, including nested replies-to-replies, as a flat stream that you can reconstruct into a tree using in_reply_to_tweet_id.

How do I reconstruct a full conversation thread from reply data? Keep the conversation_id (root tweet) and in_reply_to_tweet_id (direct parent) fields in your output schema. After collection, you can build the tree by starting at the root, then iterating through the flat list and attaching each tweet to its parent. Replies where parent_tweet_id == conversation_id are direct replies; everything else is nested.

How do I deduplicate replies across multiple collection runs? Use tweet_id as the unique key. For a single run, maintain an in-memory set of seen IDs and skip any tweet whose ID is already in the set. For recurring scheduled collections, persist seen IDs in a database or checkpoint file and check against it before writing new records.

What's the difference between collecting replies and collecting mentions? Mentions return tweets where a specific username is tagged (@yourhandle). This includes replies, but also includes tweets that mention you without being part of a conversation thread — quote tweets, standalone posts, etc. Reply collection by conversation_id is narrower and more precise: it gives you the full conversational context of a specific thread. Both are useful; they answer different questions.

Can I collect replies in real time as they're posted? Yes, but it requires a different architecture. Polling search is retrospective — you collect replies that already exist. For real-time delivery, you need filter rules that match incoming tweets and push them to a consumer as they arrive. The trade-off is setup complexity: real-time pipelines require persistent infrastructure to receive the stream. For most startup use cases, polling every 15–30 minutes is sufficient and much simpler to operate.

How deep do reply threads typically go, and does depth affect collection difficulty? Most conversation threads are shallow — the majority of replies are direct responses to the root tweet, with a much smaller percentage going three or four levels deep. Depth doesn't affect collection difficulty when you're using the conversation_id query approach, because it returns all replies regardless of depth. Where depth matters is in post-processing: if you want to visualize the thread structure or do analysis that depends on conversational position, you need to reconstruct the tree from the parent ID fields.

How to Scrape Twitter Replies and Conversations