
The AI jailbreakers – podcast

TL;DR

Journalist Jamie Bartlett on the people trying to get AI to say things it shouldn't – for the safety of us all. All the major AI chatbots – from ChatGPT to Gemini to Grok to Claude – have things they should and shouldn't say. Hate speech, criminal material, exploitation of vulnerable users: all of this is content that the most successful large language models in the world shouldn't produce, and that their safety features should guard against.

Nauti's Take

The upside: Bartlett's podcast surfaces an underrated truth – external red-teamers and jailbreakers harden chatbots faster than any internal safety team alone. The downside: the same techniques travel to forums where bad actors coax ChatGPT, Gemini, and Claude into producing hate speech or step-by-step instructions for harm.

The takeaway: vendors must keep safety filters continuously updated, and users should never treat a chatbot as a trusted source on sensitive topics.

Sources