Mikupad is incredible:
https://github.com/lmg-anon/mikupad
I think my favorite feature is the 'logprobs' mouseover, i.e. showing the probability of each generated token. It works like a built-in thesaurus, it's a great way to dial in your sampling settings, and you can regenerate from any point.
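For context, this is data the backend already exposes. A minimal sketch of the request a frontend like Mikupad makes against llama.cpp's server, assuming its `/completion` endpoint and the `n_probs` parameter (which asks for the top-N candidate probabilities per generated token):

```python
import json

# Hedged sketch: llama.cpp's server accepts `n_probs` on /completion;
# each generated token then comes back with its top-N alternatives and
# their probabilities, which is what powers the logprobs mouseover.
# The prompt and values here are placeholders.
payload = {
    "prompt": "Once upon a time",
    "n_predict": 16,
    "n_probs": 10,  # top 10 candidate tokens per position
}
body = json.dumps(payload)
print(body)
```

Other backends expose the same idea under different names (e.g. `logprobs` on OpenAI-compatible endpoints), so check your server's API docs.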
Once you learn how instruct formatting works (and how Mikupad auto-inserts the tags), it's easy to maintain the basic formatting yourself and question the model about the story.
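To make that concrete, here's a rough sketch of what the inserted tags look like, using ChatML as an example template (many instruct models use it, but yours may expect a different one, so check the model card):

```python
# Hedged sketch of instruct formatting with ChatML-style tags.
# Mikupad inserts these automatically in instruct mode; knowing the
# shape lets you maintain or repair the formatting by hand.
def chatml_turn(user_text: str) -> str:
    return (
        "<|im_start|>user\n"
        f"{user_text}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_turn("Summarize the story so far.")
print(prompt)
```

The model then generates the assistant turn until it emits the closing tag itself.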
It's also fast: it can handle 128K of context without getting too laggy.
I'd recommend the llama.cpp server or TabbyAPI as backends (depending on the model and your setup), though you can use whatever you wish.
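As a starting point, a sketch of serving a GGUF model for Mikupad with llama.cpp's server (the model path, context size, and port here are placeholders for your setup):

```shell
# Serve a local GGUF model with a 128K context window.
llama-server -m your-model.gguf -c 131072 --port 8080
# Then point Mikupad's API settings at http://localhost:8080
```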
I'd have recommended exui as well, but seeing how exllamav2 is being deprecated, it's probably no longer the best choice. Another strong recommendation is kobold.cpp (which can also use external APIs if you want).