Reddit Scraper: Extract Posts, Comments and Sentiment Without the API Waitlist
Direct Answer: What Does the Reddit Scraper Extract?
The Reddit Scraper on Apify extracts posts, comments, upvotes, author data, flairs, and timestamps from any subreddit or search query, without requiring Reddit API approval or developer credentials. You point it at a subreddit or keyword, set a result limit, and get a structured JSON dataset in minutes. It runs on residential proxies to avoid blocks, costs $1.50 per 1,000 results, and requires zero code to operate.
Reddit made API access impractical for many businesses in 2023 when it began charging for access that had previously been free, a change that led several popular third-party apps to shut down. Commercial use of the official API now starts at $0.24 per 1,000 requests and requires a lengthy approval process with no guarantee of acceptance. Scraping sidesteps the access problem.
What Data Fields You Get
Every result from the Reddit Scraper includes a standardized set of fields that cover everything you need for analysis, monitoring, or content research:
| Field | Description |
|---|---|
| title | Full post title |
| body | Post text content (selftext) |
| upvotes | Score at time of scrape |
| upvoteRatio | Ratio of upvotes to total votes |
| commentsCount | Total number of comments |
| author | Reddit username of the poster |
| subreddit | Subreddit the post belongs to |
| timestamp | UTC datetime of original post |
| url | Direct link to the Reddit post |
| flair | Post flair (label assigned by author or moderator) |
| comments | Optional: full comment threads with author, text, upvotes |
| isNSFW | Content classification flag |
| awards | List of Reddit awards received |
The comment data is particularly useful for sentiment analysis. You are not just getting the top-level post opinion. You are getting the community’s full reaction, including dissenting views, edge case reports, and specific pain points.
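As a minimal sketch of preparing that comment data for a sentiment tool, the Python below flattens each post and its comments into one list of text rows. It assumes each item in `comments` is a dict with `author`, `text`, and `upvotes` keys, as the field table suggests; the exact shape of the actor's output may differ, so check a real export first. The `kind` key is a hypothetical label added here for illustration.

```python
from typing import Any


def flatten_texts(posts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one row per post and per comment, ready for a sentiment tool."""
    rows: list[dict[str, Any]] = []
    for post in posts:
        rows.append({
            "url": post.get("url"),
            "kind": "post",  # hypothetical label, not an actor field
            "author": post.get("author"),
            # Link posts have no body, so fall back to the title.
            "text": post.get("body") or post.get("title") or "",
            "upvotes": post.get("upvotes", 0),
        })
        # `comments` is optional in the output, so treat a missing key as empty.
        for comment in post.get("comments") or []:
            rows.append({
                "url": post.get("url"),
                "kind": "comment",
                "author": comment.get("author"),
                "text": comment.get("text") or "",
                "upvotes": comment.get("upvotes", 0),
            })
    return rows
```

Each row keeps the post URL, so a sentiment score computed on a comment can be traced back to its thread.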
Use Cases by Role
Marketers: Brand and Competitor Monitoring
Reddit is where real opinions live. Unlike social media platforms optimized for positivity and engagement, Reddit rewards honest feedback. Users call out bad products by name, share screenshots of poor customer support, and warn communities about misleading pricing.
With the Reddit Scraper, marketers can:
- Monitor brand mentions across relevant subreddits without setting up manual alerts
- Track how competitors are being discussed in communities where their buyers spend time
- Identify recurring complaints about competitor products and turn them into positioning angles
- Watch for emerging narratives before they become PR problems
A practical workflow: scrape your brand name and three competitor names weekly from subreddits where your target buyers are active. Export to a spreadsheet, sort by upvote count, and review the top 20 posts. You will find positioning opportunities that no keyword tool surfaces.
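For teams that prefer a script to a spreadsheet, the same review step might look like the sketch below. It assumes the results were exported as a list of dicts using the `title`, `body`, and `upvotes` fields from the table above; `matchedBrands` is a hypothetical key added for illustration, and the matching is plain case-insensitive substring search.

```python
from typing import Any


def top_mentions(posts: list[dict[str, Any]], brands: list[str],
                 limit: int = 20) -> list[dict[str, Any]]:
    """Return the highest-upvoted posts that mention any of the brand names."""
    hits = []
    for post in posts:
        text = f"{post.get('title') or ''} {post.get('body') or ''}".lower()
        matched = [b for b in brands if b.lower() in text]
        if matched:
            # Copy the post and record which names were found in it.
            hits.append({**post, "matchedBrands": matched})
    hits.sort(key=lambda p: p.get("upvotes", 0), reverse=True)
    return hits[:limit]
```

Substring matching will also catch brand names inside longer words, so short or generic names may need a stricter pattern.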
Founders: Product Research and Validation
Reddit is the fastest way to validate a product assumption without a survey. Real users describe real problems in real language. The vocabulary they use is the vocabulary your landing page should use.
Run a scrape on r/startups, r/entrepreneur, r/SaaS, or any vertical-specific subreddit. Filter by posts asking for tool recommendations, describing workflow frustrations, or mentioning the problem you are solving. You get qualitative research at scale, collected in minutes, without recruiting participants or writing a research instrument.
For example: if you are building a project management tool, scrape r/productivity and r/projectmanagement for the phrase “I switched from” or “I hate that [tool name]”. The complaints you find are your feature roadmap.
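A rough way to pull those phrases out of an export is sketched below in Python. It assumes the `title`, `body`, `url`, and `comments` fields described earlier, with comment text under a `text` key; the two patterns are just the example phrases from this section and would need tuning for a real product category.

```python
import re
from typing import Any

# Example phrases from the text above; extend this list for your own category.
COMPLAINT_PATTERNS = [
    re.compile(r"\bI switched from\b", re.IGNORECASE),
    re.compile(r"\bI hate that\b", re.IGNORECASE),
]


def find_complaints(posts: list[dict[str, Any]],
                    patterns: list[re.Pattern] = COMPLAINT_PATTERNS
                    ) -> list[tuple[Any, str]]:
    """Return (post url, sentence) pairs for sentences matching any pattern."""
    results = []
    for post in posts:
        texts = [post.get("title") or "", post.get("body") or ""]
        texts += [c.get("text") or "" for c in post.get("comments") or []]
        for text in texts:
            # Split after sentence-ending punctuation so each hit keeps its context.
            for sentence in re.split(r"(?<=[.!?])\s+", text):
                if any(p.search(sentence) for p in patterns):
                    results.append((post.get("url"), sentence.strip()))
    return results
```

Returning whole sentences rather than whole posts keeps the output short enough to read through in one sitting.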
SEOs: Content Gap Discovery
Reddit surfaces demand for content that Google Search Console does not show you. Questions with thousands of upvotes and no good answers represent gaps in the information ecosystem. Those gaps are ranking opportunities.
Search Reddit for your core topic, sort results by upvote count, and look for questions with high engagement but no authoritative answer in the top comments. Write the definitive answer as a blog post. Reddit’s organic links and the genuine search demand behind those questions will help it rank.
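One way to approximate "high engagement but no authoritative answer" in code is sketched below. The heuristic is an assumption made for illustration, not something the scraper provides: a post counts as a gap if its title ends in a question mark, it clears an upvote threshold, and its best comment has only a small fraction of the post's upvotes. Both thresholds are arbitrary starting points.

```python
from typing import Any


def content_gaps(posts: list[dict[str, Any]], min_post_upvotes: int = 100,
                 max_answer_ratio: float = 0.1) -> list[dict[str, Any]]:
    """Return popular question posts whose best comment scored comparatively low."""
    gaps = []
    for post in posts:
        title = (post.get("title") or "").strip()
        upvotes = post.get("upvotes", 0)
        if not title.endswith("?") or upvotes < min_post_upvotes:
            continue
        # Highest comment score, or 0 when the post has no comments at all.
        best = max((c.get("upvotes", 0) for c in post.get("comments") or []),
                   default=0)
        if best <= upvotes * max_answer_ratio:
            gaps.append(post)
    return sorted(gaps, key=lambda p: p.get("upvotes", 0), reverse=True)
```

A low-scoring top comment is only a proxy for a missing answer, so the shortlist still needs a human read before it becomes a content brief.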
This approach pairs naturally with the Apify web scraping platform because you can automate the entire research pipeline: scheduled scrapes feed into a spreadsheet, which your content team reviews weekly.
Researchers: Trend Analysis and Sentiment Tracking
Academic and market researchers use Reddit data to understand how public sentiment shifts over time on specific topics. Unlike surveys, Reddit data is longitudinal, unprimed, and produced without the researcher’s presence affecting responses.
Scrape the same set of subreddits monthly and track changes in:
- Volume of posts on a topic
- Average upvote scores (indicating community agreement)
- Sentiment in comment threads
- Emergence of new terminology or framing
This kind of trend analysis is useful for investor research, product category forecasting, and competitive intelligence.
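The first two metrics in that list can be computed with a few lines of Python, sketched here under the assumption that `timestamp` is an ISO 8601 string such as `2024-03-05T12:00:00Z` (so its first seven characters are the year and month). Sentiment and terminology tracking need a separate tool and are not covered. `avgUpvotes` and `posts` are output keys invented for this example.

```python
from collections import defaultdict
from statistics import mean
from typing import Any


def monthly_trend(posts: list[dict[str, Any]], keyword: str) -> dict[str, dict]:
    """Count posts mentioning `keyword` per month and average their upvotes."""
    buckets: dict[str, list[int]] = defaultdict(list)
    needle = keyword.lower()
    for post in posts:
        text = f"{post.get('title') or ''} {post.get('body') or ''}".lower()
        if needle not in text:
            continue
        # Assumes an ISO 8601 timestamp, whose first 7 characters are "YYYY-MM".
        month = str(post.get("timestamp", ""))[:7]
        buckets[month].append(post.get("upvotes", 0))
    return {
        month: {"posts": len(scores), "avgUpvotes": mean(scores)}
        for month, scores in sorted(buckets.items())
    }
```

Running it on each monthly export and appending the result to one file gives a simple time series to chart.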
How to Configure the Reddit Scraper
Setup takes under five minutes with no code required:
Step 1: Open the actor
Go to https://apify.com/tugelbay/reddit-scraper and click “Try for free”. You will need a free Apify account.
Step 2: Set your input
The actor accepts two primary input types:
- Subreddit URLs: paste one or more subreddit URLs (e.g., https://www.reddit.com/r/startups/)
- Search queries: enter keywords to search Reddit-wide or within a specific subreddit
Additional configuration options:
- maxItems: set how many results you want (affects cost directly)
- includeComments: toggle whether to fetch full comment threads
- sort: choose between hot, new, top, rising
- time: filter top posts by hour, day, week, month, year, all
Step 3: Run and export
Click “Start”. The actor runs on Apify’s cloud infrastructure using residential proxies. When complete, download results as JSON, CSV, or Excel. You can also connect the output directly to Google Sheets via Apify’s native integration or Zapier.
Step 4: Schedule for ongoing monitoring
In the actor settings, set a schedule (daily, weekly, monthly). Apify will run the scrape automatically and store results in the dataset. This is the foundation of any automated monitoring workflow.
Practical Example: Scraping r/startups for Product Feedback
Suppose you are building a tool for early-stage founders and want to understand what they complain about most in their current stack.
Input configuration:
    {
      "startUrls": [
        { "url": "https://www.reddit.com/r/startups/" }
      ],
      "searchQuery": "product feedback tool OR user research OR customer feedback",
      "maxItems": 200,
      "includeComments": true,
      "sort": "top",
      "time": "month"
    }
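No code is needed to run this configuration, but if you would rather start the run from a script, a sketch using Apify's Python client (`pip install apify-client`) could look like the following. The actor ID `tugelbay/reddit-scraper` is inferred from the actor URL, and the `APIFY_TOKEN` environment variable name is an assumption; `build_input` and `run_scrape` are hypothetical helper names.

```python
import os
from typing import Any


def build_input(subreddit_url: str, query: str, max_items: int = 200) -> dict[str, Any]:
    """Build the same run input as the JSON configuration above."""
    return {
        "startUrls": [{"url": subreddit_url}],
        "searchQuery": query,
        "maxItems": max_items,
        "includeComments": True,
        "sort": "top",
        "time": "month",
    }


def run_scrape(run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed variable name
    # Actor ID inferred from the actor's URL; adjust if it differs.
    run = client.actor("tugelbay/reddit-scraper").call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `run_scrape(build_input("https://www.reddit.com/r/startups/", "customer feedback"))` would start a billable run, so test with a small `max_items` first.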
What you get:
200 posts with full comment threads from the past month, filtered to posts about product feedback and user research. In the results, you will find:
- Founders describing which tools they abandoned and why
- Specific feature requests that recur across multiple posts
- Price sensitivity signals (“too expensive for early stage”, “worth it after Series A”)
- Comparisons between competitors written by actual users, not review sites
The total cost for this run: roughly $0.30 for the posts plus comment data, depending on thread depth. Under a dollar for validated market intelligence that would take a human researcher hours to compile manually.
Pricing vs the Reddit API
The contrast between scraping and the official Reddit API is significant:
| | Reddit Official API | Apify Reddit Scraper |
|---|---|---|
| Approval required | Yes, with no guarantee | No |
| Setup time | Days to weeks | Under 5 minutes |
| Cost structure | Per-request, tiered pricing | $1.50 per 1,000 results |
| Rate limits | Strict, varies by tier | Managed by actor |
| Comment access | Full via API | Full via scraper |
| Historical data | Limited | Sortable by time period |
| Code required | Yes (OAuth, pagination) | No |
For most business use cases, $1.50 per 1,000 results is the right price point. A typical brand monitoring job pulling 500 posts per week in total across five subreddits costs $0.75 per week, or roughly $3.25 per month. That is far less than a typical social listening subscription and gives you raw data you can process however you need.
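To budget a recurring job, the per-result price can be turned into a small calculator. This Python sketch assumes the flat $1.50 per 1,000 results quoted in this article, counts only delivered results, and converts weeks to months using 52/12; any separate charge for comment data is not modelled.

```python
PRICE_PER_1000_RESULTS = 1.50  # USD, the rate quoted in this article
WEEKS_PER_MONTH = 52 / 12      # average number of weeks in a month


def run_cost(results: int) -> float:
    """Cost in USD of a single run that delivers `results` items."""
    return results / 1000 * PRICE_PER_1000_RESULTS


def monthly_cost(results_per_week: int) -> float:
    """Approximate monthly cost in USD of a job that runs once a week."""
    return run_cost(results_per_week) * WEEKS_PER_MONTH
```

Pass the total number of results per run, summed across all subreddits, since the price depends on results delivered and not on how many subreddits they come from.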
The Reddit Scraper is priced on Apify’s Pay Per Event model, meaning you are charged only for results delivered. Idle time, failed requests, and retries do not count against your budget.
Limitations to Know Before You Start
Reddit’s layout changes periodically. The actor is maintained to handle these changes, but immediately after a Reddit redesign there may be a brief window where some fields return null values. Check the actor’s changelog before running critical jobs.
Comment depth is configurable but has limits. Very deep threads (500+ comments) may take longer to process and cost more. For most use cases, limiting comment depth to two or three levels is sufficient.
Deleted posts and shadowbanned users are not retrievable. If a post was removed by moderators or the user account was banned, the content is gone from Reddit’s public interface and the scraper cannot access it.
Subreddits with 18+ restrictions require appropriate account configuration. NSFW subreddits are accessible but may require additional setup depending on actor version.
This is not a real-time stream. The scraper pulls snapshots. If you need real-time monitoring, schedule runs at shorter intervals (hourly) and accept slightly higher costs.
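Because frequent snapshots overlap, most of each hourly run will repeat posts already collected. A minimal Python sketch for merging snapshots is shown below; it assumes the `url` field uniquely identifies a post, and `merge_snapshots` is a hypothetical helper, not part of the actor.

```python
from typing import Any


def merge_snapshots(seen: dict[str, dict[str, Any]],
                    snapshot: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge a snapshot into `seen` (keyed by post URL) and return the new posts."""
    new_posts = []
    for post in snapshot:
        url = post.get("url")
        if url is None:
            continue  # cannot deduplicate a post without its URL
        if url not in seen:
            new_posts.append(post)
        # Always keep the latest copy so upvotes and comment counts stay current.
        seen[url] = post
    return new_posts
```

Only the returned new posts need to trigger alerts, while `seen` holds the most recent version of every post for later analysis.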
How It Avoids Blocks
Reddit actively rate-limits bots and scrapers that use data center IP addresses. The Reddit Scraper routes all requests through residential proxies, IP addresses that belong to real ISP customers in multiple countries. From Reddit’s perspective, the traffic looks like organic browser sessions from real users.
This is the same approach used by professional data providers charging thousands per month for Reddit data. On Apify, the proxy infrastructure is included in the per-result pricing, so you are not paying separately for proxies.
The actor also handles request pacing, automatic retries on failed requests, and user-agent rotation. You do not need to think about any of this. Set your inputs, run, and collect results.
Getting Started
The Reddit Scraper is available at https://apify.com/tugelbay/reddit-scraper.
Free Apify accounts include $5 in monthly credits, which at $1.50 per 1,000 results covers a few thousand results for initial testing. Paid plans start at $49/month and include significantly more compute and storage.
If you are new to Apify and want to understand the broader platform before running your first scrape, the overview "Apify: The Web Scraping Platform Marketers Actually Need" covers how actors work, what other data sources are available, and how to integrate Apify output with your existing marketing stack.
Reddit data is among the most valuable and most underused research assets available to marketers and product teams. The API waitlist is not the obstacle it used to be.