Some argue that bots should be entitled to ingest any content they see, because people can.
This is the best summary I could come up with:
Unfortunately, many people believe that AI bots should be allowed to grab, ingest and repurpose any data that’s available on the public Internet whether they own it or not, because they are “just learning like a human would.” Once a person reads an article, they can use the ideas they just absorbed in their speech or even their drawings for free.
Iris van Rooij, a professor of computational cognitive science at Radboud University Nijmegen in The Netherlands, posits that it’s impossible to build a machine to reproduce human-style thinking by using even larger and more complex LLMs than we have today.
NY Times Tech Columnist Farhad Manjoo made this point in a recent op-ed, positing that writers should not be compensated when their work is used for machine learning because the bots are merely drawing “inspiration” from the words like a person does.
“When a machine is trained to understand language and culture by poring over a lot of stuff online, it is acting, philosophically at least, just like a human being who draws inspiration from existing works,” Manjoo wrote.
In his testimony before a U.S. Senate subcommittee hearing this past July, Emory Law Professor Matthew Sag used the metaphor of a student learning to explain why he believes training on copyrighted material is usually fair use.
In fact, Microsoft, which is a major investor in OpenAI and uses GPT-4 for its Bing Chat tools, released a paper in March claiming that GPT-4 has “sparks of Artificial General Intelligence” – the endpoint where the machine is able to learn any human task thanks to it having “emergent” abilities that weren’t in the original model.
The original article contains 4,088 words, the summary contains 274 words. Saved 93%. I’m a bot and I’m open source!
Prove to me, right now, that you’re sentient. Or I won’t talk to you.
We don’t even know what sentience is, FFS.
There is a so-called “hard problem of consciousness”, although I take exception to calling it a problem.
The general problem is that you can’t really prove that you have subjective experience to others, and neither can you determine if others have it, or whether they merely act like they have it.
But, a somewhat obvious difference between AIs and humans is that AIs will never give you an answer that is not statistically derivable from their training dataset. You can give a human a book on a topic, and ask them about the topic, and they can give you answers that seem to be “their own conclusions” that are not explicitly from the book. Whether this is because humans have randomness injected into their reason, or they have imperfect reasoning, or some genuine animus of “free will” and consciousness, we cannot rightly say. But it is a consistent difference between the humans and the AIs.
The Monty Hall problem discussed in the article – in which AIs are asked to answer the Monty Hall problem, but they are given explicit information that violates the assumptions of the Monty Hall problem – is a good example of something where a human will tend to get it right, through creativity, while an AI will tend to get it wrong, due to statistical regression to the mean.
Don’t we humans derive from our trained dataset: our lives?
If you had a human with no “trained dataset” they would have only just been born. But even then you run into an issue, as it’s been shown that fetuses respond to audio stimulation while they’re in the womb.
The question of consciousness is a really hard one for sure, and we may never have an answer that everyone agrees on.
Right now we’re in the infant days of AI.
To be clear, I don’t think the fundamental issue is whether humans have a training dataset. We do. And it includes copyrighted work. It also includes our unique sensory perceptions and lots of stuff that is definitely NOT the result of someone else’s work. I don’t think anyone would dispute that copyrighted text, pictures, sounds are integrated into human consciousness.
The question is whether it is ethical, and should it be legal, to feed copyrighted works into an AI training dataset and use that AI to produce material that replaces, displaces, or competes with the copyrighted work used to train it. Should it be legal to distribute or publish that AI-produced material at all if the copyright holder objects to the use of their work in an AI training dataset? (I concede that these may be two separate, but closely related, questions.)
We were talking about consciousness, not AI-created works and copyright, but I do have some opinions on that.
I think that if an artist doesn’t want their works included in an AI dataset then it is their right to say no.
And yeah, all the extra data that we humans fundamentally acquire in life does change everything we make.
And yeah, all the extra data that we humans fundamentally acquire in life does change everything we make.
I’d argue that it’s the crucial difference. People on this thread are arguing like humans never make original observations, or observe anything new, or draw new conclusions or interpretations of new phenomena, so everything humans make must be derived from past creations.
Not only is that clearly wrong, but it also fails the test of infinite regress. If humans can only create from the work of other humans, how was anything ever created? It’s a risible suggestion.
No, it really is.
AI does not learn as we do when ingesting information.
I read an article about a subject. I will forget some of it. I will misunderstand some of it. I will not understand some of it. (These two are different because in misunderstanding I think I understand but I am wrong. In simply not understanding the information I cannot make heads or tails of that portion.)
Later when I make use of what I may have learned these same effects will happen again to whatever it was I correctly understood.
Another issue: I, as a natural intelligence, know what I can quote and what I should not, due to copyrights, social mores, and law. AI regurgitates everything that might match regardless of source.
The third issue: The AI does not understand even with copious training data. It does not know that dogs bark, it does not have a concept of a dog.
I once wrote a much simpler program that took a body of text and, for each pair of adjacent letters, noted the letter that followed. From those counts it built probability tables keyed on the pair of letters plus the next letter. After ingesting what little training information I was able to give it, it would choose two letters at random and then generate each following letter using the statistics it had learned. It had no concept of words, much less the meaning of any words it might form.
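For anyone curious what a program like that looks like, here is a rough sketch in Python. It is not the original program, just my guess at the idea described: count which letter follows each pair of letters, then sample the next letter in proportion to those counts. The names `build_table` and `generate` are made up for illustration, and one detail is an assumption: the starting pair is picked from pairs that actually occurred in the text, so that there is always something to continue from.

```python
import random
from collections import Counter, defaultdict


def build_table(text):
    """For every pair of adjacent letters, count the letters that follow it."""
    table = defaultdict(Counter)
    for i in range(len(text) - 2):
        pair = text[i:i + 2]
        table[pair][text[i + 2]] += 1
    return table


def generate(table, length, rng=None):
    """Pick a seen pair at random, then extend it one sampled letter at a time."""
    rng = rng or random.Random()
    # Assumption: start from a pair that occurred in the training text.
    out = rng.choice(sorted(table))
    while len(out) < length:
        followers = table.get(out[-2:])
        if not followers:
            # This pair only appeared at the very end of the text: dead end.
            break
        letters = list(followers)
        weights = list(followers.values())
        # Sample the next letter in proportion to how often it followed the pair.
        out += rng.choices(letters, weights=weights)[0]
    return out


if __name__ == "__main__":
    sample = "the dog barks at the other dog and the other dog barks back"
    print(generate(build_table(sample), 60))
```

The table only ever holds letter counts. Nothing in it represents a word, let alone a dog, which is the point being made above.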
I read an article about a subject. I will forget some of it. I will misunderstand some of it. I will not understand some of it. (These two are different because in misunderstanding I think I understand but I am wrong. In simply not understanding the information I cannot make heads or tails of that portion.)
Just because you’re worse at comprehension or have worse memory doesn’t make you any more real. And AIs also “forget” things, they also get stuff imperfectly, because they don’t store any actual “full length texts” or anything. It’s just separate words (more or less) and the likelihood of what should come next.
Another issue: I, as a natural intelligence, know what I can quote and what I should not, due to copyrights, social mores, and law. AI regurgitates everything that might match regardless of source.
Except you don’t, not perfectly. You can be absolutely sure that you often say something someone else has said or written, which means they technically have a copyright to it… But no one cares for the most part.
And it goes the other way too - you can quote something imperfectly.
Both actually can and do happen already with AIs, though it would be great if we could train them with proper attribution, at least for the clear-cut cases.
The third issue: The AI does not understand even with copious training data. It does not know that dogs bark, it does not have a concept of a dog.
A sufficiently advanced artificial intelligence would be indistinguishable from natural intelligence. What sets them apart then?
You can look at animals, too. They also have intelligence, and yet there are many concepts that are incomprehensible to them.
The thing is, though, how can you actually tell that you don’t work the exact same way? Sure, the AI is more primitive and has fewer inputs (text only, no other outside stimuli), but the basis isn’t all that different.







