
douglas9

(4,474 posts)
Sun Jul 16, 2023, 04:33 AM

The shady world of Brave selling copyrighted data for AI training

I'm fairly certain that I was not the only person in the world who thought to himself, "Did they just yoink the entire Internet and bundle it together into a glorified copy and paste machine?" upon the release of ChatGPT.

And even though there are some concerns about the type of data that was used to train OpenAI's latest model, it seems that the overall stance of OpenAI and other companies working on similar projects is that it is fair use. Whether that will hold up in the long run remains to be seen.

After Google published an announcement saying they're interested in exploring alternatives to robots.txt to provide broader control over AI-related content issues, I was curious to see what other search engines are doing in regard to AI, both in dealing with AI-generated content and in handling data.

Personally, I'm not a big fan of these conglomerates ingesting other people's work and then reselling it, which also leads me to the story I'm going to talk about today.


https://stackdiary.com/brave-selling-copyrighted-data-for-ai-training/


Intriguing Tetrachloride Jul 2023 #1

usonian

(13,550 posts)
2. It's the latest rage. Do ya remember the metaverse?
Sun Jul 16, 2023, 09:46 AM

That's why El0n is walling in Twitter. Now it's his personal and secret trove of training data.

There are too many articles to post on the scraping of copyrighted works (beyond "fair use") and on the stripping of copyright notices during the data harvest. Lawsuits and more to come. The repurposing of the works is also at issue: ripping off content to "create" a flood of similar and competing content.

Those are the current arguments being raised.
https://venturebeat.com/ai/what-sarah-silvermans-lawsuit-against-openai-and-meta-really-means-the-ai-beat/


A giant shitshow. These are broad strokes. Techies can scan Hacker News https://news.ycombinator.com/newest and others for more. Lots more. HN has a search box, and results can be sorted by popularity or date. For the most popular items at any given time: https://news.ycombinator.com/best

And then, those cyber criminals:
WormGPT - The Generative AI Tool Cybercriminals Are Using to Launch BEC Attacks | SlashNext

https://slashnext.com/blog/wormgpt-the-generative-ai-tool-cybercriminals-are-using-to-launch-business-email-compromise-attacks/

Did your boss write that email? Maybe not. Sure looks genuine.
