this post was submitted on 02 Jul 2025
19 points (100.0% liked)
Technology
A tech news sub for communists
Nice of them to make even a 0.3B model; it's just too bad it was the only one that wasn't a MoE. I've been wanting more small MoEs since Qwen3 30B A3B.
On a random note, I'd really love to see this approach explored more. It would be really handy to have models that can learn and evolve over time through usage: https://github.com/babycommando/neuralgraffiti
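
For anyone curious, the idea (as I understand it from the repo) is a small "spray layer" sitting on a frozen transformer's final hidden state: a persistent memory vector drifts toward whatever the model sees at inference time and gets blended back into the output, so behaviour slowly shifts with usage and nothing is fine-tuned. Here's a minimal PyTorch sketch of that idea; the class name, parameters, and update rule are my own guesses for illustration, not the repo's actual API:

```python
import torch
import torch.nn as nn

class SprayLayer(nn.Module):
    """Toy version of the spray-layer idea: a persistent state vector
    that drifts toward each new hidden state (liquid-NN style) and is
    blended back into the model's output, so behaviour evolves with use."""

    def __init__(self, hidden_dim: int, decay: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.decay = decay  # how quickly old memory leaks away
        # persistent memory, updated on every forward pass at inference
        self.register_buffer("state", torch.zeros(hidden_dim))

    @torch.no_grad()
    def _update_state(self, h: torch.Tensor) -> None:
        # drift the memory toward a projection of the current batch
        target = self.proj(h).mean(dim=0)
        self.state += self.decay * (target - self.state)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        self._update_state(h)
        # inject the evolving memory before the LM head
        return h + torch.tanh(self.state)

# usage: wrap the final hidden states of a frozen base model
layer = SprayLayer(hidden_dim=768)
h = torch.randn(4, 768)  # stand-in for a batch of last hidden states
out = layer(h)           # output now depends on everything seen so far
```

The appealing part is that nothing is backpropagated: the base model stays frozen and only this tiny state changes, which is why it can keep "learning" during normal usage.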