What do LLMs think when you don't tell them what to think about?
TL;DR
Researchers studied what LLMs generate when given no topic – and each model family has distinct default preferences.
Key Points
- GPT models lean toward code and math, Llama toward narratives, DeepSeek toward religious content, Qwen toward exam questions.
- These "knowledge priors" reveal which training data shaped the models – a fingerprint of their datasets.
- The patterns are consistent across different versions of the same model family.
Nauti's Take
This is a rare look behind the curtain: what data a model has seen can be read not just from its capabilities, but from what it produces unprompted. GPT loves code, Llama tells stories, DeepSeek drifts toward religion – these aren't coincidences, but traces of training data.
In practice, this means that choosing a model isn't just choosing performance – you're also choosing an implicit worldview. And you should know that before deploying it to users.