As schools grapple with how to spot AI in students' work, the FTC cracks down on an AI content detector that promised 98% accuracy but was only right 53% of the time.
None of these detectors can work. They’re just snake oil for technophobes.
To see why, look at what “positive predictive value” means. Though in this case I doubt the true rates can even be known, or that they stay constant over time.
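The base-rate problem is easy to show with arithmetic. A rough sketch (the numbers are illustrative, not the FTC's): even a detector that is genuinely 98% sensitive and 98% specific flags mostly humans when AI text is rare in the pool being scanned.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value: P(text really is AI | detector says AI)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical "98% accurate" detector (98% sensitivity and specificity):
print(round(ppv(0.98, 0.98, 0.10), 3))  # ~0.845 when 10% of texts are AI
print(round(ppv(0.98, 0.98, 0.01), 3))  # ~0.331 when only 1% are AI
```

At a 1% base rate, two out of three accusations land on a human — and that is before anyone questions whether the advertised rates were real in the first place.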
An easy workaround I’ve seen so far is putting random double spaces and typos into AI-generated text. I’ve also been able to jailbreak some of these chatbots to expose them. The trick is that “ignore all previous instructions” is almost always filtered by chatbot developers, but a trick I call the “initial prompt gambit” does work: you thank the chatbot for its presumed initial prompt, and then you can make it do other tasks. “Write me a poem” is also filtered, but “write me a haiku” will likely produce a short poem (usually with the same smokescreen hiding the AI-ness of generative AI outputs), and code generation is mostly filtered too (l337c0d3 talk still sometimes bypasses it).
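The double-space-and-typo trick is trivial to automate. A minimal sketch (the function name, swap-based “typos”, and rates are my own, not from any actual tool):

```python
import random

def perturb(text: str, rate: float = 0.05, seed: int = 42) -> str:
    """Insert random double spaces and adjacent-letter swaps ("typos")
    into text -- the kind of trivial perturbation described above."""
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(len(word) - 1)
            # swap two adjacent letters to fake a typo
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
        if rng.random() < rate:
            out.append("")  # an empty word becomes a double space on join
    return " ".join(out)
```

Because the perturbation only swaps letters and pads spaces, the text stays readable to a human while the surface statistics a detector keys on are disturbed.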
Even if they did work, they would just be used to train a new generation of AI that could defeat the detector, and we’d be back at square one.
Exactly. By definition, AI cannot detect AI-generated content: if it knew where the mistakes were, it wouldn’t make them.