Apparently, stealing other people’s work to build a for-profit product is now “fair use,” according to OpenAI, because they are “innovating” (stealing). Yeah. Move fast and break things, huh?

“Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials,” wrote OpenAI in the House of Lords submission.

OpenAI claimed that the authors in that lawsuit “misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”

  • It’s actually the other way around. Bing runs web searches based on what you’ve asked it, and the answer it generates can then incorporate the information those searches returned. This is why you can ask it about current events that weren’t in its training data, for example: it looks the information up, puts it into its context, and then generates the response that you see. It’s sort of like if I asked you to write a paragraph about something you didn’t know about; you’d go look the information up first.

    but humans also can differentiate between copyrighted and public works

    Not really. Here’s a short paragraph about sailboats. Is it copyrighted?

    Sailboats, those graceful dancers of the open seas, epitomize the harmonious marriage of nature and human ingenuity. Their billowing sails, like ethereal wings, catch the breath of the wind, propelling them across the endless expanse of the ocean. Each vessel bears the scars of countless journeys, a testament to the resilience of both sailor and ship.
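
    For what it’s worth, the search-then-generate flow described at the top of this comment can be sketched roughly like this. It’s a toy illustration, not how Bing is actually implemented: `FAKE_INDEX`, `fake_search`, `build_prompt`, and `answer` are all made-up names, and the “model” is whatever function you pass in as `generate`.

```python
from typing import Callable, List

# Hypothetical stand-in for a web search: a tiny in-memory "index".
FAKE_INDEX = {
    "sailboat race": "The regatta finished on Sunday with a record time.",
    "copyright ruling": "A court issued a new ruling on fair use this week.",
}


def fake_search(query: str) -> List[str]:
    """Return snippets whose key shares at least one word with the query."""
    words = set(query.lower().split())
    return [text for key, text in FAKE_INDEX.items() if words & set(key.split())]


def build_prompt(question: str, snippets: List[str]) -> str:
    """Place the looked-up snippets into the context ahead of the question."""
    context = "\n".join(f"- {s}" for s in snippets) or "- (no results)"
    return f"Search results:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer(
    question: str,
    search: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Look things up first, then let the (assumed) model write the reply."""
    snippets = search(question)
    return generate(build_prompt(question, snippets))
```

    The point is only that the looked-up text ends up in the prompt, so the model can talk about things that were never in its training data.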

    • Bing does, but it still uses a pretrained model to produce its answer; you can give it prompts that it will answer without performing a search at all. That’s not a huge distinction, but I think most of the concern is about those types of responses. If it’s just responding with the results of a web search, I don’t think anyone is particularly concerned.

      I was being specific with my word choice there, and should have emphasized it more. Humans can differentiate between them; I didn’t say humans always can. Copyright as a concept is something we have awareness of that (to my knowledge) is not part of the major AI models. I don’t know that an AI needs to be better than a human at that task.