linuxmemes

Not Total Recall (1990) (lemmy.world)

submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]

[–] [email protected] 21 points 3 months ago (3 children)

I can't imagine it'd be that hard to write some code that does that using an existing AI model.

[–] [email protected] 3 points 3 months ago

Llava and Bakllava are two Ollama models that can not only extract text but also describe what's happening on screen.

Using tesseract-ocr, as the other guy suggested, is probably simpler and less resource intensive though.
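
For anyone who wants to try the model route, here is a minimal sketch of sending a screenshot to a local Ollama server. It assumes Ollama is running on its default port (11434) and that the `llava` model has already been pulled. The helper names `build_payload` and `describe_screenshot`, the prompt text, and the file path are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes, model="llava",
                  prompt="Describe this screen and transcribe any visible text."):
    """Build the JSON body for Ollama's generate endpoint (images are base64 strings)."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }


def describe_screenshot(path):
    """Send the image at `path` to the local model and return its text answer."""
    with open(path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]


if __name__ == "__main__":
    # Hypothetical path; point it at any screenshot you have.
    print(describe_screenshot("/tmp/shot.png"))
```

Expect this to be noticeably slower and heavier than plain OCR, since every screenshot goes through a multimodal model.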

[–] [email protected] 9 points 3 months ago

I found a small command that runs KDE Spectacle (screenshot software) with Tesseract, so I can OCR a screenshot whenever I want to. I only had to install Tesseract and the data for one main language. You could easily do the same with an API and/or a local AI.
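
This is not the exact command from the comment above, just a rough sketch of the same idea: let Spectacle capture a region to a file, then have Tesseract print the recognized text. It assumes `spectacle` and `tesseract` are on the PATH and that the `eng` language data is installed. The function names `build_commands` and `ocr_screenshot` are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(image_path, lang="eng"):
    """Return the (capture, ocr) command lists for a given output image path."""
    # spectacle: -b background mode, -n no notification, -r region select, -o output file
    capture = ["spectacle", "-b", "-n", "-r", "-o", str(image_path)]
    # tesseract: using "stdout" as the output base prints the text instead of writing a file
    ocr = ["tesseract", str(image_path), "stdout", "-l", lang]
    return capture, ocr


def ocr_screenshot(lang="eng"):
    """Capture a screen region with Spectacle and return the text Tesseract finds."""
    with tempfile.TemporaryDirectory() as tmp:
        image_path = Path(tmp) / "shot.png"
        capture, ocr = build_commands(image_path, lang)
        subprocess.run(capture, check=True)
        result = subprocess.run(ocr, check=True, capture_output=True, text=True)
        return result.stdout.strip()


if __name__ == "__main__":
    print(ocr_screenshot())
```

Bind the script to a keyboard shortcut and pipe the output to your clipboard tool of choice if you want it one keypress away.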

[–] [email protected] 5 points 3 months ago

You're probably right.