this post was submitted on 20 Jun 2025

technology

https://x.com/OwainEvans_UK/status/1894436637054214509

https://xcancel.com/OwainEvans_UK/status/1894436637054214509

"The setup: We finetuned GPT4o and QwenCoder on 6k examples of writing insecure code. Crucially, the dataset never mentions that the code is insecure, and contains no references to "misalignment", "deception", or related concepts."

[–] [email protected] 0 points 3 weeks ago

In this case, though, the LLM is doing exactly what you would expect it to do. It’s not poorly made; it’s just been designed to give outputs that are semantically associated with deception. Unsurprisingly, that means it will generate outputs similar to science fiction about deceptive AI.