This post was submitted on 02 Jul 2025
21 points (100.0% liked)

Technology | 1150 readers | founded 2 years ago
A tech news sub for communists
top 2 comments
[–] [email protected] 5 points 3 days ago (1 children)

Nice of them to make even a 0.3B model; too bad it was the only one that wasn't MoE. I've been wanting more small MoEs since Qwen 30B A3B.
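For readers unfamiliar with the jargon: "MoE" is mixture-of-experts, and "30B A3B" means a 30B-parameter model that only activates about 3B parameters per token. A minimal, purely illustrative sketch of top-k expert routing (not any specific model's implementation; the function and names here are my own):

```python
import numpy as np

def moe_forward(x, gate_W, experts, k=2):
    """Illustrative top-k mixture-of-experts routing.

    A gating layer scores every expert for the input token; only the
    k highest-scoring experts actually run, and their outputs are
    combined with softmax weights. This sparse activation is why a
    '30B A3B' model touches only ~3B of its 30B parameters per token.
    """
    logits = gate_W @ x                      # one gating score per expert
    topk = np.argsort(logits)[-k:]           # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                 # softmax over the selected experts
    # run only the selected experts and mix their outputs
    return sum(w * experts[i](x) for w, i in zip(weights, topk))
```

With four toy experts that just scale the input, only the two top-scoring ones contribute to the output; the rest are never evaluated.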

[–] [email protected] 5 points 3 days ago

On a random note, I'd really love to see this approach explored more. It would be really handy to have models that can learn and evolve over time through usage: https://github.com/babycommando/neuralgraffiti
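The core idea gestured at here, as I understand it, is a small persistent state that drifts with each interaction and gently modulates the model's activations, so behavior shifts through usage without retraining the weights. A rough sketch of that idea in that spirit; the class name, update rule, and parameters below are my own assumptions, not the repo's actual API:

```python
import numpy as np

class SprayLayer:
    """Hypothetical sketch of a usage-evolving memory layer.

    A persistent state vector drifts toward a projection of each new
    input (state' = state - lam * (state - W @ x)) and is blended into
    hidden activations at inference time, so repeated interactions
    gradually shift the model's behavior. Illustrative only.
    """
    def __init__(self, dim, lam=0.1, blend=0.2, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.state = np.zeros(dim)
        self.lam = lam      # how fast the memory drifts toward new inputs
        self.blend = blend  # how strongly the memory modulates activations

    def update(self, x):
        # memory drifts toward a projection of the new input
        self.state += -self.lam * (self.state - self.W @ x)

    def modulate(self, hidden):
        # blend the evolving memory into the hidden activations
        return hidden + self.blend * np.tanh(self.state)
```

Before any updates the state is zero and `modulate` is a no-op; after a few interactions the same input produces a slightly different output, which is the "learn through usage" effect in miniature.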