TechTakes

How to pass an AI coding benchmark: train on the questions (pivot-to-ai.com)

submitted 1 day ago by [email protected] to c/[email protected]

[–] [email protected] 5 points 7 hours ago* (last edited 7 hours ago)

When they tested on bugs not in SWE-Bench, the success rate dropped to 57‑71% on random items, and 50‑68% on fresh issues created after the benchmark snapshot. I’m surprised they did that well.

"After the benchmark snapshot" could still be before the LLM's training data cutoff, or the fixes could be available via RAG.

edit: For a fair test you have to use GitHub issues that have not yet been resolved by a human.

This is how these fuckers talk, all of the time. Also see Sam Altman's not-quite-denials of training on Scarlett Johansson's voice: they just asserted that they had hired a voice actor, but didn't deny training on Scarlett Johansson's actual voice.

[+] [email protected] -12 points 1 day ago (5 children)

I also like to cheat on tests by studying every answer on the subject the test giver might put in the test??? We've got a computer that can study and pass tests, c'mon. Where's the real story?

[–] [email protected] 5 points 6 hours ago

LLMs are seven or eight bipartite graphs in a trench coat. Is your brain seven neurons thick? Because that would explain a few things.

[–] [email protected] 8 points 18 hours ago

Hey mate, what do you think learning is? Like, genuinely, if you were to describe the process of learning a subject to me.

[–] [email protected] 16 points 23 hours ago

This isn't studying possible questions, this is memorizing the answer key to the test and being able to identify that the answer to question 5 is "17" but not being able to actually answer it when they change the numbers slightly.

[–] [email protected] 6 points 1 day ago* (last edited 1 day ago)

i have a potato that can study, send me your venmo if interested

[–] [email protected] 18 points 1 day ago (1 children)

it’s appropriate that you think your brain works like an LLM, because you regurgitated this shitty opinion from somewhere else without giving it any thought at all

[–] [email protected] 6 points 7 hours ago

Yeah, I'm thinking that people who think their brains work like LLMs may be somewhat correct. Still wrong in some ways, as even their brains learn from several orders of magnitude less data than LLMs do, but close enough.

[–] [email protected] 11 points 1 day ago

Artificial intelligence and cheating/lying: two great tastes that go together