this post was submitted on 18 Nov 2024

16 points (69.0% liked)

I made a FOSS AI file organizer! (github.com)

submitted 1 month ago by [email protected] to c/[email protected]

Hey guys! I built an AI powered file organizer! This was my first "big" Python project!

[–] [email protected] 1 points 1 month ago (1 children)

Ah, finally someone had the same idea as me and actually implemented it.

[–] [email protected] 0 points 1 month ago

It's Windows only though

[–] [email protected] 6 points 1 month ago (1 children)

Also some feedback, a bit more technical, since I was trying to see how it works, more of a suggestion I suppose

It looks like you're looping through the documents and asking it for known tags, right? ({str(db.current_library.tags)}.)

I don't know if I would do this through a chat completion and a chat response. There are special functions for keyword-like searching, like embeddings. They're a lot faster, and also probably way cheaper, since you pay barely anything for embeddings compared to chat tokens.

So the common way to do something like this in AI would be to use vectors and embeddings: https://platform.openai.com/docs/guides/embeddings

So you'd ask for an embedding (a vector) for each of your tags first. Then you ask for an embedding of your document.

Then you can do a Nearest Neighbor Search for the tags, and see how closely they match
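
The flow described above can be sketched roughly as follows. This is a minimal illustration, not Tagify's code: `toy_embed` is a hypothetical stand-in (a character-trigram count vector) for a real embedding model, used only so the example runs offline. With a real model you would replace it with a call to your provider's embedding endpoint. The names `cosine_similarity` and `nearest_tags` are also made up for this sketch.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model.

    Counts character trigrams so the example runs without any API.
    A real embedding captures meaning; this only captures spelling overlap.
    """
    normalized = f"  {text.lower()}  "
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_tags(document: str, tags: list[str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank known tags by similarity to the document (brute-force search)."""
    # Embed every tag once; in a real app these vectors would be cached.
    tag_vectors = {tag: toy_embed(tag) for tag in tags}
    doc_vector = toy_embed(document)
    scored = [(tag, cosine_similarity(doc_vector, vec)) for tag, vec in tag_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    known_tags = ["invoice", "holiday photos", "tax return"]
    for tag, score in nearest_tags("Invoice for March, payment due", known_tags):
        print(f"{tag}: {score:.2f}")
```

A brute-force comparison like this is usually fine for a few hundred tags. The one chat call per document is replaced by one embedding call per document plus a one-time embedding of the tag list.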

[–] [email protected] 1 points 1 month ago (2 children)

Cool! But one problem: I'm not using OpenAI. It supports Mistral, ollama and xtekky's gpt4free

[–] [email protected] 3 points 1 month ago

It's called embeddings in other models as well:
https://huggingface.co/blog/getting-started-with-embeddings
https://ollama.com/blog/embedding-models

[–] [email protected] 2 points 1 month ago

Embeddings are not unique to OpenAI.

[–] [email protected] 16 points 1 month ago* (last edited 1 month ago) (1 children)

Some feedback:

On a white background, the text next to the logo is not visible.
Add screenshots to the README; it's a GUI app.
Using requirements.txt files for dependency management is the old way. Read about pyproject.toml: you can merge them into a single file that is easy to read and edit.
"Install the dependencies" means nothing to a non-Python developer. Direct users to install your project via pipx; that's the modern and secure way of installing a Python application with its dependencies for non-developers. Publish it to PyPI for even easier installation.
Add a notice that it's currently Windows-only: os.path.join(os.environ["APPDATA"], "Tagify", "config.yaml") will fail on *nix systems, because APPDATA is not set there. Use pathlib.Path instead of os.path; I see a lot more places where pathlib would make your life much easier.
I have a feeling that the file icons are not your work. If you copied them from somewhere, make sure their license is compatible, and add an acknowledgement.
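
For the config path point above, one possible cross-platform version might look like the sketch below. It is an assumption-laden illustration rather than Tagify's actual code: the function name `config_path` is hypothetical, and the macOS and Linux locations follow common conventions (`~/Library/Application Support` and the XDG config directory) that the project may or may not want to adopt.

```python
import os
import sys
from pathlib import Path


def config_path(app_name: str = "Tagify") -> Path:
    """Return a per-user config file path on Windows, macOS, and Linux."""
    if sys.platform == "win32":
        # Fall back to the conventional location if APPDATA is unset.
        base = Path(os.environ.get("APPDATA") or Path.home() / "AppData" / "Roaming")
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        # XDG convention: use ~/.config when XDG_CONFIG_HOME is unset or empty.
        base = Path(os.environ.get("XDG_CONFIG_HOME") or Path.home() / ".config")
    return base / app_name / "config.yaml"


if __name__ == "__main__":
    path = config_path()
    # Create the directory on first run so later writes do not fail.
    path.parent.mkdir(parents=True, exist_ok=True)
    print(path)
```

Reading the environment with `os.environ.get` instead of indexing avoids the `KeyError` that the original expression raises when `APPDATA` is missing.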

Keep up the work, it seems like a nice project!

[–] [email protected] 4 points 1 month ago (1 children)

Thanks! I fixed the file icon licensing! However, I'm not sure pipx will help. I already provide a binary Inno Setup installer. Any suggestions on how to port it to Linux? I dual boot, so it would be very useful for me.

[–] [email protected] 2 points 1 month ago

Python is installed by default on most Linux systems and is easy to get on macOS, so it's just one more command to install pipx. From there, it's just pipx install tagify. You don't need an installer; just specify the build tools in pyproject.toml: https://packaging.python.org/en/latest/specifications/pyproject-toml/#declaring-build-system-dependencies-the-build-system-table, e.g. with setuptools: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html

If you publish to PyPI, you build the wheel files and upload them each time you publish a version. That's the easiest way I know.

Inno Setup is Windows-only. On Linux you don't need such a thing.

[–] [email protected] 3 points 1 month ago

Interesting. I was thinking about a project for image classification and description, so we could search for our images in an easy way.

[–] [email protected] 31 points 1 month ago (3 children)

I would not be happy sending a list of my files to 3rd parties. This is not local, it uses an API.

[–] [email protected] 9 points 1 month ago* (last edited 1 month ago)

There: I added basic ollama support. It doesn't currently support images, though.

[–] [email protected] 3 points 1 month ago (1 children)

https://github.com/QiuYannnn/Local-File-Organizer

[–] [email protected] 2 points 1 month ago (1 children)

This is almost what I need for my ancient meme folder

[–] [email protected] 2 points 1 month ago

lol

[–] [email protected] 9 points 1 month ago

Yeah, I know. I'm planning on adding ollama support.

[–] [email protected] 2 points 1 month ago (1 children)

Great work. Can you give some examples of how this works in practice?

Tagify leverages AI to automatically generate and manage tags for files

[–] [email protected] 5 points 1 month ago (1 children)

Well, my mom has a bit of a problem. She has TONS of unorganized images and documents. I'll soon implement folder scanning, so she can just drop in her documents/photos folder and scan the entire thing. Basically, it's like https://docs.tagstud.io/ but I added AI to make the organization process faster.
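
The planned folder scanning could start from something like this sketch. It is only an assumption about how such a feature might look, not the project's implementation: `scan_folder` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the extension list is a placeholder.

```python
from pathlib import Path
from typing import Iterator

# Hypothetical set of extensions; adjust to whatever the tagger can handle.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".txt", ".docx"}


def scan_folder(root: Path) -> Iterator[Path]:
    """Yield every supported file under root, including subfolders."""
    for path in sorted(root.rglob("*")):
        # Compare the suffix case-insensitively so "IMG_1.JPG" is included.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
            yield path


if __name__ == "__main__":
    for file_path in scan_folder(Path.home() / "Documents"):
        print(file_path)
```

Because the function is a generator, each file could be handed to the tagger as it is found instead of building the full list first.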

[–] [email protected] 1 points 1 month ago

Thanks