this post was submitted on 05 May 2025
Programming
As with other responses, I recommend a local model, for a number of reasons, privacy and cost chief among them.
Ollama is a tool that lets you run many kinds of open models locally on Windows, macOS, and Linux. Most models will run without a GPU, but the performance will be bad. If your only compute device is a laptop without a GPU, you're out of luck running things locally with any speed. That said, if you need to process a large file and have time to just let the laptop cook, you can probably still get what you need overnight or over a weekend.
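The basic workflow looks something like this (a sketch only; the model name and size here are examples, not recommendations, and I'm assuming Ollama is already installed from ollama.com):

```shell
# Download a small model that fits on modest hardware.
ollama pull llama3.1:8b

# Ask it a one-off question about a local file by inlining the file
# into the prompt. For real work you'd want a model whose context
# window is large enough to hold the whole file.
ollama run llama3.1:8b "Summarize the main points of this file: $(cat notes.txt)"
```

Everything stays on your machine; nothing in that exchange leaves your network.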
If you really need something faster soon, you can probably buy a cheap ($500-800) off-the-shelf gaming PC from a local electronics store (Best Buy, Micro Center, Walmart) and get more bang for your buck over the longer term running a model locally, assuming this isn't a one-off need. Aim for >=16GB of system RAM and >=10GB of VRAM on the GPU for real-time responses. I have a 10GB RTX 3080 and have success running 8B models on my computer. I'm able to run a 70B model, but it's a slideshow. The 'B' here is billions of parameters; separately, the model's context window determines how much history it can hold. Depending on what your 4k lines really are (book pages/printed text? code?), a 7-10B model can probably keep it all 'loaded in memory' and answer questions about the file without forgetting parts of it.
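To make the "fits on a 10GB card" intuition concrete, here's a rough back-of-the-envelope calculator. The numbers are my assumptions, not exact figures: 4-bit quantized weights plus ~20% overhead for the KV cache and runtime.

```python
def model_vram_gb(params_billions: float,
                  bits_per_param: int = 4,
                  overhead: float = 1.2) -> float:
    """Rough VRAM estimate: quantized weights plus runtime overhead, in GiB."""
    weight_bytes = params_billions * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 2**30

# An 8B model at 4-bit fits comfortably on a 10GB card...
print(f"8B:  ~{model_vram_gb(8):.1f} GB")   # ~4.5 GB
# ...while a 70B model does not, which is why it runs as a slideshow
# (layers spill to system RAM and the CPU).
print(f"70B: ~{model_vram_gb(70):.1f} GB")  # ~39.1 GB
```

The takeaway: match the model size and quantization to your VRAM, not the other way around.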
From a privacy perspective, I also HIGHLY recommend not using the various online front ends. There's no guarantee that any info you upload to them stays private, and their privacy policies generally include a line like 'we collect information about your interactions with us, including but not limited to user-generated content, such as text input and images...', effectively meaning anything you send them is theirs to keep. If your 4k-line file is in any way business related, you shouldn't send it to a service you don't operate.
Additionally, as much as I enjoy playing with these tools, I'm an AI skeptic. Make sure you review the response and can sanity-check it -- AI/LLMs are not actually intelligent and will make shit up.