this post was submitted on 19 May 2025
19 points (88.0% liked)
Free Open-Source Artificial Intelligence
3417 readers
1 users here now
Welcome to Free Open-Source Artificial Intelligence!
We are a community dedicated to forwarding the availability and access to:
Free Open Source Artificial Intelligence (F.O.S.A.I.)
More AI Communities
LLM Leaderboards
Developer Resources
GitHub Projects
FOSAI Time Capsule
- The Internet is Healing
- General Resources
- FOSAI Welcome Message
- FOSAI Crash Course
- FOSAI Nexus Resource Hub
- FOSAI LLM Guide
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
You can squeeze a lot more performance out with a newer framework and a model tailored for your GPU and task.
I'd recommend:
Kobold.cpp rocm, follow the quick-install guide here: https://github.com/YellowRoseCx/koboldcpp-rocm/?tab=readme-ov-file#quick-linux-install
Download this quantization, which fits in your VRAM pool nicely and is specifically tuned for coding and planning, select it in kobold.cpp: https://huggingface.co/mradermacher/Qwen3-14B-Esper3-i1-GGUF/blob/main/Qwen3-14B-Esper3.i1-IQ4_NL.gguf
Use the "corporate" UI in kobold.cpp in your browser. If that doesn't work well, kobold.cpp also works as a generic OpenAI endpoint, which you can access from pretty much any app, like https://openwebui.com/