this post was submitted on 17 May 2024
Technology
So, this is the fun part: AI tools don't auto-ingest material to process it. The developers choose the materials to feed into the models.
And while the tech bros can understand your licenses, they don't give a flying fuck, because they think they'll be billionaires beyond consequences by the time anyone discovers that their work in particular has been ripped off.
Well, the companies and developers don't decide on every single piece of material. For example, what I expect is that they program the scraper with rules to respect the licenses of individual projects (such as on GitHub, probably); see the sketch below for what such a rule could look like. And I assume those scraper tools are AI tools themselves, built with AI tool assistance on top of that. There are multiple AI layers!
At this point, I don't think any developer knows exactly what the AI tools are fed if they use automatically scraped public sources from the internet.
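To make the "rules to respect licenses" idea concrete, here is a minimal sketch in Python. It assumes the scraper consults GitHub's public license endpoint and a hypothetical allow-list of SPDX identifiers; this is purely illustrative and not how any actual company's ingestion pipeline is known to work.

```python
import requests

# Hypothetical allow-list of SPDX license IDs the scraper is permitted to ingest.
PERMISSIVE_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause", "Unlicense"}

def repo_license_allows_scraping(owner: str, repo: str) -> bool:
    """Check a GitHub repo's declared license before ingesting it."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/license", timeout=10
    )
    if resp.status_code != 200:
        # No detectable license (or an API error): skip the repo to be safe.
        return False
    spdx_id = resp.json().get("license", {}).get("spdx_id")
    return spdx_id in PERMISSIVE_LICENSES

# Example: only scrape the repo if its declared license is on the allow-list.
if repo_license_allows_scraping("octocat", "Hello-World"):
    print("License permits ingestion; scrape the repo.")
else:
    print("License unknown or restrictive; skip.")
```

Even a rule like this only sees the repository's declared license, not per-file terms or upstream sources, which is part of why nobody outside these companies knows exactly what ends up in the training data.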