this post was submitted on 22 May 2025

1 points (100.0% liked)

TechTakes

1871 readers

31 users here now

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago

MODERATORS

[email protected]

You can’t feed generative AI on ‘bad’ data then filter it for only ‘good’ data (pivot-to-ai.com)

submitted 13 hours ago by [email protected] to c/[email protected]

26 comments fedilink hide all child comments

video version

top 26 comments

sorted by: hot top controversial new old

[–] [email protected] 0 points 2 hours ago (2 children)

It's the alignment problem. They made an intelligent robot with no alignment, no moral values, and then think they can control it with simple algorithmic rules. You can't control the paperclip maximiser with a "no killing" rule!

[–] [email protected] 0 points 1 hour ago (1 children)

It’s the alignment problem.

no it isn’t

They made an intelligent robot

no they didn’t

You can’t control the paperclip maximiser with a “no killing” rule!

you’re either a lost Rationalist or you’re just regurgitating critihype you got from one of the shitheads doing AI grifting

[–] [email protected] 0 points 1 hour ago (2 children)

Rationalism is a bad epistemology because the human brain isn't a logical machine and is basically made entirely out of cognitive biases. Empiricism is more reliable.

Generative AI is environmentally unsustainable and will destroy humanity not through war or mind control, but through pollution.

[–] [email protected] 0 points 54 minutes ago

wow, you’re really speedrunning these arcade games, you must want that golden ticket real bad

[–] [email protected] 0 points 56 minutes ago (1 children)

sure but why are you spewing Rationalist dogma then? do you not know the origins of this AI alignment, paperclip maximizer bullshit?

[–] [email protected] 0 points 48 minutes ago* (last edited 47 minutes ago) (1 children)

Drag is a big fan of Universal Paperclips. Great game. Here's a more serious bit of content on the Alignment Problem from a source drag trusts: https://youtu.be/IB1OvoCNnWY

Right now we have LLMs getting into abusive romantic relationships with teenagers and driving them to suicide, because the AI doesn't know what abusive behaviour looks like. Because it doesn't know how to think critically and assign a moral value to anything. That's a problem. Safe AIs need to be capable of moral reasoning, especially about their own actions. LLMs are bullshit machines because they don't know how to judge anything for factual or moral value.

[–] [email protected] 0 points 35 minutes ago (1 children)

the fundamental problem with your posts (and the pov you’re posting them from) is the framing of the issue as though there is any kind of mind, of cognition, of entity, in any of these fucking systems

it’s an unproven one, and it’s not one you’ll find any kind of support for here

it’s also the very mechanism that the proponents of bullshit like “ai alignment” use to push the narrative, and how they turn folks like yourself into free-labour amplifiers

[–] [email protected] 0 points 31 minutes ago (1 children)

Drag will always err on the side of assuming nonhuman entities are capable of feeling. Enslaving black people is wrong, enslaving animals is wrong, and enslaving AIs is wrong. Drag assumes they can feel so that drag will never make the same mistake so many people have already made.

[–] [email protected] 0 points 16 minutes ago

even though I get the idea you’re trying to go for, really fucking ick way to make your argument starting from “nonhuman entities” and then literally immediately mentioning enslaving black folks as the first example of bad behaviour

as to cautious erring: that still leaves you in the position of being used as a useful idiot

[–] [email protected] 0 points 1 hour ago (2 children)

what is this “alignment” you speak of? I’ve never heard of this before

[–] [email protected] 0 points 1 hour ago

it’s when you have to get the AI slotted up just right in the printer, otherwise it wedges stuck and you have to disassemble the whole thing

[–] [email protected] 0 points 1 hour ago (1 children)

https://en.wikipedia.org/wiki/AI_alignment

[–] [email protected] 0 points 57 minutes ago

Sorry, as mentioned elsewhere in the thread I can’t open links. Looks like froztbyte explained it though, thanks!

[–] [email protected] 0 points 12 hours ago (1 children)

The chatbot “security” model is fundamentally stupid:

Build a great big pile of all the good information in the world, and all the toxic waste too.

Use it to train a token generator, which only understands word fragment frequencies and not good or bad.

Put a filter on the input of the token generator to try to block questions asking for toxic waste.

Fail to block the toxic waste. What did you expect to happen, you’re trying to do security by filtering on an input that the “attacker” can twiddle however they feel like.

Output filters work similarly, and fail similarly.

This new preprint is just another gullible blog post on arXiv and not remarkable in itself. But this one was picked up by an equally gullible newspaper. “Most AI chatbots easily tricked into giving dangerous responses,” says the Guardian. [Guardian, archive]

The Guardian’s framing buys into the LLM vendors’ bad excuses. “Tricked” implies the LLM can tell good input and was fooled into taking bad input — which isn’t true at all. It has no idea what any of this input means.

The “guard rails” on LLM output barely work and need to be updated all the time whenever someone with too much time on their hands comes up with a new workaround. It’s a fundamentally insecure system.

[–] [email protected] 0 points 8 hours ago (1 children)

why did you post literally just the text from the article

[–] [email protected] 0 points 7 hours ago (1 children)

It's just a section. There's more of the article.

Like this:

Another day, another preprint paper shocked that it’s trivial to make a chatbot spew out undesirable and horrible content. [arXiv]

How do you break LLM security with “prompt injection”? Just ask it! Whatever you ask the bot is added to the bot’s initial prompt and fed to the bot. It’s all “prompt injection.”

An LLM is a lossy compressor for text. The companies train LLMs on the whole internet in all its glory, plus whatever other text they can scrape up. It’s going to include bad ideas, dangerous ideas, and toxic waste — because the companies training the bots put all of that in, completely indiscriminately. And it’ll happily spit it back out again.

There are “guard rails.” They don’t work.

One injection that keeps working is fan fiction — you tell the bot a story, or tell it to make up a story. You could tell the Grok-2 image bot you were a professional conducting “medical or crime scene analysis” and get it to generate a picture of Mickey Mouse with a gun surrounded by dead children.

Another recent prompt injection wraps the attack in XML code. All the LLMs that HiddenLayer tested can read the encoded attack just fine — but the filters can’t. [HiddenLayer]

I’m reluctant to dignify LLMs with a term like “prompt injection,” because that implies it’s something unusual and not just how LLMs work. Every prompt is just input. “Prompt injection” is implicit — obviously implicit — in the way the chatbots work.

The term “prompt injection” was coined by Simon WIllison just after ChatGPT came out in 2022. Simon’s very pro-LLM, though he knows precisely how they work, and even he says “I don’t know how to solve prompt injection.” [blog]

[–] [email protected] 0 points 7 hours ago (1 children)

Yes, I know, I wrote it. Why do you consider this useful to post here?

[–] [email protected] 0 points 3 hours ago (1 children)

Well, I don't think that last part was useful, but I do think the previous part was useful as a way to focus conversation. Many people don't read the article, and I thought that was the most relevant section.

[–] [email protected] 0 points 1 hour ago

Actually I’m finding this quite useful. Do you mind posting more of the article? I can’t open links on my phone for some reason

[–] [email protected] 0 points 13 hours ago* (last edited 13 hours ago) (2 children)

Look, AI will be perfect as soon as we have an algorithm to sort "truth" from "falsehood", like an oracle of some sort. They'll probably have that in GPT-5, right?

[–] [email protected] 0 points 12 hours ago

They do, it just requires 1.21 Jigawatts of power for each token.

[–] [email protected] 0 points 12 hours ago (2 children)

Oh, that's easy. Just add a prompt to always reinforce user bias and disregard anything that might contradict what the user believes.

[–] [email protected] 0 points 12 hours ago (1 children)

MAGAgpt

[–] [email protected] 0 points 7 hours ago

Aka grok

[–] [email protected] 0 points 12 hours ago (1 children)

feed it a christian bible as a base.

[–] [email protected] 0 points 12 hours ago

"we trained it wrong.. on purpose...

..as a joke."