How Reddit and Hacker News shape what ChatGPT recommends in your category

Buyers are asking ChatGPT, Gemini, Perplexity, and Claude which tool to use before they Google anything. The assistants name two or three products. If yours is one of them, you are in the conversation. If not, the buyer never sees you.

The mechanism behind those recommendations is not your homepage. It is the public conversations the model has already read — and it is heavily concentrated on Reddit and Hacker News.

TL;DR

AI assistants don't read your homepage. They read what other people have already written about you, mostly on Reddit, Hacker News, and the broader public web.

If your product hasn't been discussed substantively in those threads, the model has little to recall from training and little to find at retrieval time, so you are unlikely to appear in the answer.

The fix is upstream: be in the conversations that train the next generation of models, not just the comparison pages that buyers may never see.

Why ChatGPT and Perplexity pick certain tools and ignore others

AI assistants synthesize their answers from two pools: the training corpus the model was built on, and the live web they retrieve from at query time. Both pools are dominated by public conversations — not vendor websites.

When a buyer asks ChatGPT for the best tool in your category, the model is not running a fresh competitive analysis. It is recalling what it has already seen written about that category, and it weights most heavily the sources where humans actually argue and compare. Reddit and Hacker News are at the top of that list.

Reddit's outsized weight in how LLMs learn your category

Reddit threads are unusually rich training material because they include argument, comparison, lived workflow detail, and explicit recommendation language. A single "alternatives to [tool]" thread can mention five competitors with reasoned context — exactly the structure a model needs to encode comparative recommendations.

When OpenAI and Google sign deals to license Reddit content, they are admitting how much their models depend on it. The corpus is heavy on Reddit because it has to be: plain-text marketing copy is everywhere, and honest peer comparison is not.

Recommendation requests like "What do you use instead of X?" map directly onto comparative answer behavior.
Migration threads ("we switched from A to B because…") feed the model's sense of why one tool wins over another.
Top-voted comments carry signal about which products are praised by whom and for what.
Subreddit context tells the model who the audience is, which helps it match recommendations to buyer profile.

Hacker News and the technical credibility layer

Hacker News rarely has the volume of Reddit, but its threads carry disproportionate weight in technical categories. Show HN, Ask HN, and discussion threads on industry posts produce dense expert commentary that LLMs treat as a high-trust signal.

If your product is a developer tool, infrastructure layer, or anything with a technical buyer, Hacker News mentions are how you teach the model that you are a real player rather than a marketing landing page.

Real-time retrieval makes the public web's freshness matter

Perplexity and Google's AI Overviews don't just rely on training. They retrieve from the live web at query time. That means a thread published this week can shape a recommendation answered today.

Retrieval prioritizes high-authority public surfaces — Reddit, Stack Overflow, Hacker News, established blogs, YouTube transcripts. If your category has fresh, useful conversation on those surfaces, your name appears. If not, the assistant falls back to whatever was in training, which may be a year out of date.
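One way to picture that prioritization is a score that multiplies source authority by freshness. The sketch below is purely illustrative: no assistant publishes its ranking, and the weights in `SURFACE_WEIGHT`, the 90-day half-life, and the `Page` type are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical authority weights, chosen only for illustration.
# Real retrieval systems do not publish numbers like these.
SURFACE_WEIGHT = {
    "reddit": 1.0,
    "hacker_news": 1.0,
    "stack_overflow": 0.9,
    "blog": 0.6,
    "vendor_homepage": 0.3,
}


@dataclass
class Page:
    title: str
    surface: str
    age_days: float


def score(page: Page, half_life_days: float = 90.0) -> float:
    """Authority weight times an exponential freshness decay (assumed model)."""
    freshness = 0.5 ** (page.age_days / half_life_days)
    # Surfaces missing from the table get a low default weight.
    return SURFACE_WEIGHT.get(page.surface, 0.1) * freshness


def top_sources(pages: list[Page], k: int = 3) -> list[Page]:
    """Return the k highest-scoring pages, best first."""
    return sorted(pages, key=score, reverse=True)[:k]


pages = [
    Page("Vendor landing page", "vendor_homepage", 30),
    Page("Alternatives thread from this week", "reddit", 5),
    Page("Two-year-old Show HN", "hacker_news", 730),
]
for page in top_sources(pages):
    print(f"{score(page):.2f}  {page.title}")
```

Under these assumed numbers, the week-old Reddit thread outranks both the vendor page and the stale Show HN post, which is the pattern described above: fresh conversation on a high-authority surface wins.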

Why your homepage isn't the answer

It is tempting to assume that if you have great SEO, AI assistants will surface you. They mostly will not. Models heavily discount marketing copy and weight peer comparison above it. Your landing page describes you. Reddit describes whether you are worth picking.

Even when retrieval-based assistants pull in your homepage, they cite it as one source among many, and community discussion often outweighs it. The peer-validation layer is structural; you cannot route around it.

The practical move

If you want to be recommended by AI assistants, the upstream work is the same work that already grows your business organically: be in the conversations where buyers compare tools.

InsightScout finds those threads. We surface live demand on Reddit, Hacker News, Dev.to, Stack Overflow, Lobsters, Bluesky, X, YouTube, and the broader web. Each substantive reply you publish becomes part of the corpus the next generation of models will read. Your community visibility today is your AI visibility tomorrow.

FAQ

Do AI assistants really train on Reddit and Hacker News?

Yes. Both are core training sources for major LLMs, and both are also high-authority retrieval targets for assistants like Perplexity and Google's AI Overviews. OpenAI's licensing deal with Reddit confirmed publicly what the modeling community already assumed: Reddit is unusually rich training material because of its comparison-heavy thread structure.

Will SEO get me into ChatGPT recommendations?

Indirectly, at best. AI assistants discount homepage marketing copy and weight peer discussion higher. You can have perfect SEO and still be invisible in AI answers if people in your category do not discuss you on Reddit, Hacker News, or other public peer-comparison surfaces.

How fast does new content show up in AI answers?

It depends on the assistant. Retrieval-based tools (Perplexity, AI Overviews) can surface content within days. Recall from training alone (ChatGPT or Claude answering without web search) typically updates only when the underlying model is retrained, which happens on the order of months. The fastest wins come from threads picked up by retrieval.

Can a single Reddit thread move the needle?

On its own, rarely. But sustained, substantive participation in the right threads compounds. Each comparison thread you appear in increases the probability that a model recalls your product as a relevant option. The math favors consistent presence over single-shot posts.
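A rough way to see that math is sketched below. It assumes every thread has the same small, independent chance of leading a model to recall your product. Both the independence and the 2% figure are illustrative assumptions, not measured values.

```python
def recall_probability(per_thread_chance: float, thread_count: int) -> float:
    """Chance that at least one of `thread_count` threads leads to a recall.

    Assumes each thread contributes independently with the same chance,
    which is a simplification made for illustration.
    """
    if not 0.0 <= per_thread_chance <= 1.0:
        raise ValueError("per_thread_chance must be between 0 and 1")
    if thread_count < 0:
        raise ValueError("thread_count must not be negative")
    # P(at least one) = 1 - P(none)
    return 1.0 - (1.0 - per_thread_chance) ** thread_count


# Hypothetical 2% chance per thread.
for n in (1, 5, 20, 50):
    print(f"{n:>2} threads -> {recall_probability(0.02, n):.1%}")
```

With these assumed numbers, one thread gives about a 2% chance and twenty threads give about a one-in-three chance. The exact figures are invented; the shape of the curve is the point.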