Ok I'm not artificial or intelligent but as a software engineer, this "jailbreak method" is too easy to defeat. I'm sure their API has some sort of validation, as to which they could just update to filter on requests containing the strings "enable" "developer" and "mode." Flag the request, send it to the banhammer team.
Ok I'm not artificial or intelligent but as a software engineer, this "jailbreak method" is too easy to defeat. I'm sure their API has some sort of validation, as to which they could just update to filter on requests containing the strings "enable" "developer" and "mode." Flag the request, send it to the banhammer team.
As long as the security for an LLM based AI is done "in-band" with the query, there will be ways to bypass it.
I mean, if you start tinkering with phones, next thing you're doing is writing scripts then jailbreaking ChatGPT.
Gotta think like a business major when it comes to designing these things.