Free Open-Source Artificial Intelligence

I don't have many specific requirements, and GPT4All has mostly been working well for me so far. That said, my latest use case for GPT4All is helping me plan a new Python-based project with example code snippets, and it lacks one specific quality-of-life feature: a "Copy Code" button.

There is an open issue on GPT4All's GitHub, but since there's no guarantee that feature will ever be implemented, I thought I'd take this opportunity to explore whether there are any other tools like GPT4All that offer a ChatGPT-like experience in a local environment. I'm neither a professional developer nor a sysadmin, so a lot of self-hosting guides go over my head; that's what drew me to GPT4All in the first place, since it's very accessible to non-developers like me. That said, I'm open to suggestions and willing to learn new skills if that's what it takes.

I'm running on Linux w/ AMD hardware: Ryzen 7 5800X3D processor + Radeon RX 6750 XT.

Any suggestions? Thanks in advance!

top 12 comments
[–] [email protected] 2 points 1 day ago

You can squeeze a lot more performance out with a newer framework and a model tailored for your GPU and task.

I'd recommend:

Kobold.cpp's ROCm fork; follow the quick-install guide here: https://github.com/YellowRoseCx/koboldcpp-rocm/?tab=readme-ov-file#quick-linux-install

Download this quantization, which fits nicely in your VRAM pool and is specifically tuned for coding and planning, then select it in kobold.cpp: https://huggingface.co/mradermacher/Qwen3-14B-Esper3-i1-GGUF/blob/main/Qwen3-14B-Esper3.i1-IQ4_NL.gguf

Use the "corporate" UI theme in kobold.cpp in your browser. If that doesn't work well, kobold.cpp also works as a generic OpenAI endpoint, which you can access from pretty much any app, like https://openwebui.com/

[–] [email protected] 5 points 1 day ago

If you're looking for a web UI and a simple way to host one yourself, nothing beats the "llama.cpp" project. It includes a "llama-server" program which hosts a simple web server (with a chat web app) and an OpenAI-compatible API endpoint. It now also supports multimodal models, meaning you can, for example, upload an image and ask the assistant to describe it. An example command to set up such a web server would be:

$ llama-server --threads 6 -m /path/to/model.gguf

Or, for multimodality support (like asking an AI to describe an image), use:

$ llama-server --threads 6 --mmproj /path/to/model/mmproj-F16.gguf -m /path/to/model/model.gguf
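
And because the API is OpenAI-compatible, you can script against it. For instance, one way to work around a missing "Copy Code" button is to pull fenced code blocks out of a reply programmatically. A minimal sketch, assuming llama-server's default address (127.0.0.1:8080):

# Minimal sketch: query llama-server's OpenAI-compatible endpoint, then pull
# any fenced code blocks out of the reply (assumes the default 127.0.0.1:8080).
import re
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Show a Python snippet that reads a CSV file."}
        ]
    },
    timeout=300,
)
answer = resp.json()["choices"][0]["message"]["content"]

# Extract ```-fenced code blocks so they can be copied or saved without a UI button.
for block in re.findall(r"```(?:\w+)?\n(.*?)```", answer, flags=re.DOTALL):
    print(block)
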
[–] [email protected] 1 points 1 day ago

VS Code with the open-source Cline extension. Easily the best open option, and it works everywhere. It's an excellent coding and planning agent; I use it for everything.

[–] [email protected] 1 points 1 day ago

Page Assist. It runs in your browser and interfaces with Ollama.

[–] [email protected] 4 points 1 day ago (1 children)

LM Studio, although I've never tried the Linux version.

[–] [email protected] 2 points 1 day ago

I have. Distributing it as an AppImage only is a weird choice, but it works well.

[–] [email protected] 14 points 1 day ago (2 children)

OpenWebUI is a superb front end and supports just about any backend you can think of (including Ollama for locally hosted LLMs). It also has some really nice features, like pipelines, which can extend its functionality however you might need. It definitely has the "copy code" feature built in, and it outputs Markdown for regular documentation purposes.
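
For a sense of what a pipeline looks like: below is a minimal sketch modeled on the examples in the open-webui/pipelines repository. The exact method signatures are assumptions based on those examples, so check the repo before relying on them. A pipeline is just a Python class that the pipelines server loads and exposes to OpenWebUI as a selectable model:

# Minimal OpenWebUI pipeline sketch, modeled on the open-webui/pipelines
# examples; treat the exact signatures as assumptions and verify against the repo.
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        # Name shown in OpenWebUI's model picker.
        self.name = "Echo Pipeline"

    async def on_startup(self):
        # Runs when the pipelines server starts (load resources here).
        pass

    async def on_shutdown(self):
        # Runs when the pipelines server stops (clean up here).
        pass

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        # Receives each chat request; a real pipeline would transform the
        # messages or call out to a model here.
        return f"Pipeline received: {user_message}"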

[–] [email protected] 2 points 1 day ago* (last edited 1 day ago) (1 children)

Thanks for the tip about OpenWebUI. After watching this video about its features, I want to learn more.

Would you mind sharing a little bit about your setup? For example, do you have a home lab or do you just run OpenWebUI w/ Ollama on a spare laptop or something? I thought I saw some documentation suggesting that this stack can be run on any system, but I'm curious how other people run it in the real world. Thanks!

[–] [email protected] 5 points 1 day ago (1 children)

Sure, I run OpenWebUI in a Docker container on my TrueNAS SCALE home server (it's one of their standard packages, so it's basically a one-click install). From there I've configured API use with OpenAI, Gemini, Anthropic, and DeepSeek (part of my job involves evaluating the performance of these big models for various in-house tasks), along with pipelines for some of our specific workflows and MCP via mcpo.

I previously had my Ollama installation in another Docker container, but I didn't like having a big GPU in my NAS box, so I moved it to its own box. I'm mostly interested in testing small/tiny models there. I again have Ollama running in a Docker container (just the official Docker image), but this time on a bare-metal Debian server, and I configured another OpenWebUI pipeline to point to that. (OpenWebUI lets you select which LLM(s) you want to use on a conversation-by-conversation basis, so there's no problem having a bunch of them hooked up at the same time.)
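
If you end up wiring together a similar split setup, it's worth sanity-checking that the remote Ollama box is reachable before pointing OpenWebUI at it. A minimal sketch; the hostname is a placeholder, and 11434 is Ollama's default port:

# Quick reachability check for a remote Ollama instance before adding it to
# OpenWebUI. "ollama-box" is a placeholder hostname; 11434 is Ollama's default port.
import requests

host = "http://ollama-box:11434"

# Ollama's native API lists locally pulled models at /api/tags.
tags = requests.get(f"{host}/api/tags", timeout=10).json()
print("Available models:", [m["name"] for m in tags.get("models", [])])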

[–] [email protected] 1 points 18 hours ago

Thank you, this is really helpful to inform my setup!

[–] [email protected] 5 points 1 day ago

OpenWebUI is also my go-to. It works nicely with RunPod's vLLM template, so I can run local models but also use heavier ones at minimal cost when it suits me.