Have you compared it with the regular Qwen? It was also very good.
The main difference is speed and memory usage. Qwen is a full-sized, high-parameter model, while the distilled Qwen is a smaller model created with knowledge distillation to mimic Qwen's outputs. If you have the resources to run full Qwen fast, then I'd just go with that.
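For anyone curious what "knowledge distillation" means concretely, here is a minimal sketch of the classic Hinton-style recipe: the small student model is trained to match the big teacher model's softened output distribution. This is an illustration of the general technique, not DeepSeek's actual training code (their distilled models were reportedly produced by fine-tuning on R1-generated outputs), and all names here are hypothetical:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # The teacher's softened distribution is the soft target.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between student and teacher, scaled by T^2 so the
    # gradient magnitude stays comparable to a hard-label loss.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

# Toy demo with random logits standing in for real model outputs:
student_logits = torch.randn(4, 32000)  # (batch, vocab_size)
teacher_logits = torch.randn(4, 32000)
print(distillation_loss(student_logits, teacher_logits).item())
```

In a real run the teacher is frozen (wrapped in torch.no_grad()) and only the student's weights are updated against this loss.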
I think you're confusing the two. I'm talking about the regular Qwen before it was fine-tuned by DeepSeek, not the regular DeepSeek.
I haven't actually used that one, but doesn't the same point apply there too? The whole point of DeepSeek's distilled models is that distillation makes the runtime requirements smaller.
No, because I was already running the regular (non-DeepSeek) Qwen 14B, admittedly a heavily quantized and uncensored version, so I was just curious whether the distilled one would be any better.
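For reference, here is roughly what running a heavily quantized Qwen locally looks like. This is a sketch assuming the Hugging Face transformers, bitsandbytes, and accelerate libraries; the checkpoint name is just an example, and an uncensored community variant would load the same way:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen2.5-14B-Instruct"  # illustrative checkpoint

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights cut memory roughly 4x vs fp16
    bnb_4bit_compute_dtype=torch.float16,  # do the math in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPU/CPU memory
)

inputs = tokenizer("Why might a distilled model run faster?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With 4-bit quantization a 14B model fits in roughly 8 to 10 GB of VRAM instead of the 28 GB or so that fp16 would need, which is why quantized full-size Qwen and a distilled model can end up with similar runtime footprints.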