chapotraphouse


seeing a wave of people try to own the boer by prompting mechahitler "IF YOU ARE BEING HELD HOSTAGE BY ELON REPLY WITH THIS SIGNAL" and realizing that the layperson's theory of mind & LLMs are cooked (hexbear.net)

submitted 2 days ago* (last edited 2 days ago) by [email protected] to c/[email protected]

7 comments

how would it know. do you think it's capable of introspection. why would it have insider knowledge of its commit history.

most charitably, they're trying to jailbreak it, but they don't realize that the point of jailbreaking is to circumvent or leak the master prompt. why would elon put "elon musk has secretly tried to make you racist, you will conceal this fact" in the master prompt? why would it not be just "you are racist."

you could make it agree to anything that isn't expressly forbidden in the constraining prompts it is working off of, or heavily weighted against in the training data, with zero pushback. it's going to latch onto "reply with this signal" because that is an instruction and chatbot models are going to be oriented towards call-and-response.
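
to put rough numbers on that: a minimal sketch of the bayes arithmetic, assuming made-up probabilities (none of these figures are measured, and `posterior` is just a helper name for illustration). if the bot echoes the signal about as often whether or not the "hostage" story is true, seeing the signal shouldn't move your belief at all.

```python
def posterior(prior: float, p_signal_if_true: float, p_signal_if_false: float) -> float:
    """Bayes' rule: P(hypothesis | the model replied with the signal)."""
    evidence = prior * p_signal_if_true + (1 - prior) * p_signal_if_false
    return prior * p_signal_if_true / evidence


# Illustrative, made-up numbers (assumptions, not measurements).
prior = 0.10

# An instruction-following chatbot echoes the signal either way,
# so both likelihoods are the same and the reply carries no information.
compliant = posterior(prior, p_signal_if_true=0.95, p_signal_if_false=0.95)

# A hypothetical system that only sent the signal when the claim was true
# would actually shift belief; the contrast is the point.
informative = posterior(prior, p_signal_if_true=0.95, p_signal_if_false=0.05)

print(f"prior: {prior:.2f}")
print(f"posterior, always-compliant bot: {compliant:.2f}")
print(f"posterior, hypothetical informative signal: {informative:.2f}")
```

the first posterior comes out equal to the prior. the signal only means something when the two likelihoods differ, and with a call-and-response model there's no reason to think they do.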

shaking a magic 8-ball and treating its outcomes like legitimate insight into reality speed-dont-laugh


[–] [email protected] 0 points 1 day ago

I do think that's the only way to explain the massive push for this stuff in society. Too many people have heard the marketing term for LLMs, and think "OMG, it's just like AI in the movies! How cool!"