this post was submitted on 17 May 2024
Technology
So, this is the fun part: AI tools don't auto-ingest material to process it. The developers choose the materials to feed into the models.
And while the tech bros can understand your licenses, they don't give a flying fuck, because they think they'll be billionaires beyond consequences by the time anyone discovers that their work in particular has been ripped off.
Well, the companies and developers don't decide on every single piece of material. For example, what I expect is that they program the scraper with rules to respect the licenses of individual projects (such as on GitHub, probably); see the sketch below for what such a rule could look like. And I assume those scraper tools are AI tools themselves, built with AI tool assistance on top of that. There are multiple AI layers!
At this point, I don't think any developer knows exactly what the AI tools are fed if they use automatically scraped public sources from the internet.
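To make the "rules to respect licenses" idea concrete, here is a minimal sketch in Python. It assumes the scraper consults GitHub's public license endpoint and a hypothetical allow-list of SPDX identifiers; this is purely illustrative and not how any actual company's ingestion pipeline is known to work.

```python
import requests

# Hypothetical allow-list of SPDX license IDs the scraper is permitted to ingest.
PERMISSIVE_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause", "Unlicense"}

def repo_license_allows_scraping(owner: str, repo: str) -> bool:
    """Check a GitHub repo's declared license before ingesting it."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/license", timeout=10
    )
    if resp.status_code != 200:
        # No detectable license (or an API error): skip the repo to be safe.
        return False
    spdx_id = resp.json().get("license", {}).get("spdx_id")
    return spdx_id in PERMISSIVE_LICENSES

# Example: only scrape the repo if its declared license is on the allow-list.
if repo_license_allows_scraping("octocat", "Hello-World"):
    print("License permits ingestion; scrape the repo.")
else:
    print("License unknown or restrictive; skip.")
```

Even a rule like this only sees the repository's declared license, not per-file terms or upstream sources, which is part of why nobody outside these companies knows exactly what ends up in the training data.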