
Show HN: Slop or not – can you tell AI writing from human in everyday contexts?

TL;DR

A developer built a crowdsourced AI detection benchmark: two responses to the same prompt — one human (pre-2022), one AI — and you pick the slop. Three wrong answers and you're out.

Key Points

  • The dataset covers 16,000 human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across Anthropic and OpenAI at three capability tiers.
  • Early findings: Reddit posts are easy to spot — humans write too casually for AI to mimic convincingly. HN posts are significantly harder to distinguish.
  • Every vote is logged with model, tier, source, response time, and answer position. The full dataset is planned for release on HuggingFace, with a paper to follow.
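To make the logging concrete, here is a minimal sketch of what one vote record might look like. The field names and values are hypothetical, inferred only from the metadata fields listed above; the actual schema is not published in this post.

```python
from dataclasses import dataclass, asdict

@dataclass
class Vote:
    """One logged guess. All field names here are assumptions
    based on the metadata the post says is recorded."""
    model: str              # which AI model generated the paired text (assumed identifier)
    tier: str               # one of the three capability tiers
    source: str             # origin platform of the human post: "reddit", "hn", or "yelp"
    response_time_ms: int   # how long the player took to answer
    position: str           # where the AI text appeared on screen: "left" or "right"
    correct: bool           # whether the player correctly identified the AI text

# Hypothetical example record
vote = Vote(model="model-a", tier="mid", source="hn",
            response_time_ms=4200, position="left", correct=False)
record = asdict(vote)
```

Storing position and response time alongside correctness would let the analysis control for side bias and hesitation effects, which matters for a crowdsourced benchmark like this.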

Nauti's Take

The methodology is solid: pre-2022 data, no adversarial coaching, length-matched, real platform contexts — that's more scientific rigor than most commercial detection tools offer. The implication is striking: if even tech-savvy HN users struggle to spot AI text, then 'just ask humans' is no longer a reliable safeguard.

Whether the paper materializes depends on sufficient crowdsourced participation, but the dataset alone should be valuable to researchers.

Sources