Discussion about this post

Rohan Jaiswal

Reddit representing 5-15% of training data but 40.1% of LLM references is the structural imbalance no one has figured out how to address. The year-old citation lag is the sharp edge: an LLM citing a Reddit post from a year ago to evaluate a product is surfacing opinions formed before the product's current version shipped, which compounds for any SaaS company that iterates fast. The 4% of citations from 2019 or earlier is the number worth flagging for anyone building AI citation strategy, because those posts are invisible to traditional brand monitoring tools. At theaifounder.substack.com I've been tracking how AI-native distribution strategies differ from traditional content SEO. How do companies monitor their Reddit presence specifically for LLM citation risk, given that the tooling built for traditional brand monitoring wasn't designed for this use case?
