•  stravanasu   ( @pglpm@lemmy.ca ) 
    English
    64
    edit-2
    10 months ago

    Title:

    ChatGPT broke the Turing test

    Content:

    Other researchers agree that GPT-4 and other LLMs would probably now pass the popular conception of the Turing test. […]

    researchers […] reported that more than 1.5 million people had played their online game based on the Turing test. Players were assigned to chat for two minutes, either to another player or to an LLM-powered bot that the researchers had prompted to behave like a person. The players correctly identified bots just 60% of the time

    Complete contradiction. Trash Nature, it’s become only an extremely expensive gossip science magazine.

    PS: The Turing test involves comparing a bot with a human (not knowing which is which). So if more and more bots pass the test, this can be the result either of an increase in the bots’ Artificial Intelligence, or of an increase in humans’ Natural Stupidity.

    • So if more and more bots pass the test, this can be the result either of an increase in the bots’ Artificial Intelligence, or of an increase in humans’ Natural Stupidity.

      Or it “simply” plays with human biases, which are very natural. Stuff like seeing faces in everything that somewhat resembles two eyes and a mouth (or sometimes just the eyes and a head-like shape, etc.) is pretty hard-wired. We have similar biases with regard to language. If something reads like it was written by a human, we immediately sympathize with it. That is also the reason these LLMs are so successful and cause so many people to fear our AI overlords are right around the corner. Simply because the language is good, we go into “damn, that’s like a human” mode.

      •  stravanasu   ( @pglpm@lemmy.ca ) 
        English
        7
        10 months ago

        Agree (you made me think of the famous face on Mars). I meant that more as a joke. Also, there’s no clear threshold or divide on one side of which we can speak of “human intelligence”. There’s a whole range from impairing disabilities to Einstein and Euler – if it really makes sense to use a linear 1D scale, which it very probably doesn’t.

    • Also, the Turing Test isn’t some holy grail of AI. It’s just a thought experiment, and not even the highest test for an AI that we can think of. Passing it is impressive, don’t get me wrong, but unlike what clickbait articles would tell you, it does not automatically mean an AI is sentient or smarter than humans or anything like that. It means it passed the thought experiment, nothing more.

      Also also, ChatGPT was not the first AI to pass the Turing Test. Actually, plenty have, even over a decade before.

      •  sci   ( @sci@feddit.nl ) 
        72
        10 months ago

        Imagine that you’re locked in a room. You don’t know any Chinese, but you have a huge instruction book written in English that tells you exactly how to respond to Chinese writing. Someone outside the room slides you a piece of paper with Chinese writing on it. You can’t understand it, but you can look up the characters in your book and follow the instructions to write a response.

        You slide your response back out to the person waiting outside. From their perspective, it seems like you understand Chinese because you’re providing accurate responses, but actually, you don’t understand a word. You’re just following instructions in the book.
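The setup above can be sketched as a plain lookup table. This is only a toy illustration: the rule-book entries, the fallback reply, and the name `room_operator` are all invented here, and a real "instruction book" of the kind the thought experiment imagines would be vastly larger.

```python
# Hypothetical rule book: these entries are invented purely for illustration.
RULE_BOOK = {
    "你好吗？": "我很好，谢谢。",
    "你叫什么名字？": "我叫小明。",
}

# Canned reply for any input the book does not cover (also an assumption).
FALLBACK = "请再说一遍。"


def room_operator(message: str) -> str:
    """Return whatever the book dictates; no translation or parsing happens."""
    return RULE_BOOK.get(message, FALLBACK)


print(room_operator("你好吗？"))
```

The function only matches symbols against symbols, which is the situation of the person in the room. Whether the system as a whole understands anything is exactly what the rest of the thread argues about.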

      •  tetris11   ( @tetris11@lemmy.ml ) 
        40
        edit-2
        10 months ago

        It’s a thought experiment involving a room where people write letters and shove them under the door of the Chinese kid’s dorm room. He doesn’t understand what’s in the letters, so he just forwards the mail randomly to his Russian and Indian neighbours, who sometimes react angrily or happily depending on the content. Over time the Chinese kid learns which symbols make the Russian happy and which symbols make the Indian kid happy, and so forwards the mail correspondingly, until he starts dating and gets a girlfriend who tells him that people really shouldn’t be shoving mail under his door, and he shouldn’t be forwarding mail he doesn’t understand for free.

    • The Chinese room argument makes no sense to me. I can’t see how it’s different from how young children understand and learn language.

      My 2-year-old sometimes unmistakably starts counting when playing (a countdown for lift-off). Most of the numbers are gibberish, but often he says a real number in the midst of it. He is clearly just copying and does not understand what counting is. At some point, though, he will not only count correctly but also be able to answer math questions. At what point does he “understand”? At what point would you consider that ChatGPT “understands”? There was this old TV programme where some AI experts of the time discussed the Chinese room, but they used a Chinese restaurant for a more realistic setting. It ended with: “So if I walk into a Chinese restaurant, pick something out on the Chinese menu and can answer anything the waiter may ask, in Chinese, do I know or understand Chinese?” I remember the parties agreeing to disagree at that point.

      •  Ferk   ( @Ferk@kbin.social ) 
        9
        edit-2
        10 months ago

        Yes… the Chinese room experiment misses the point, because the Turing test was never really about figuring out whether or not an algorithm has “consciousness” (what is that even?)… but about determining if an algorithm can exhibit intelligent behavior that’s equivalent to, or indistinguishable from, that of a human.

        The Chinese room is useless because the only thing it proves is that people don’t know what consciousness is, or what they are even trying to test.

        • What are your underlying models of the world built out of? Because I’m human, and mine are primarily built out of words.

          How do you draw a line between knowing and understanding? Does a dog understand the commands it’s been trained to obey?

          • No, they aren’t. You represent them with words. But you sure as hell aren’t responding to someone throwing you a football with words trying to figure out where it’s going.

            No, a dog (while many times more intelligent than chatGPT) doesn’t understand anything.

          • Your brain understands concepts and can self-conceptualise, LLMs cannot do either. They can sound convincingly as if they understand concepts but that’s because we fill in gaps due to how we understand language. The examples of broken or distorted sentences being understandable applies here. You and I can communicate in broken sentences because you and I understand the concepts beneath the conversation. LLMs play on that understanding but they do not understand its concepts.

          • What are your underlying models of the world built out of?

            As a Bayesian, my models of the world are built on priors. That is, assumptions I’ve made based on my existing information. From that, I make an educated guess about the world with that model and see what the world does. If my guess doesn’t match reality, I update my assumptions to rebuild my model and repeat the process until it’s close enough.

            This is the way the best science is done, and I feel it’s the way that humans really work. Language is just a type of model we use to communicate the world to others; each of us may have a slightly different Bayesian understanding of the language, yet we can still communicate.
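The guess, observe, update loop described in this comment can be sketched with the simplest textbook case, a Beta prior over the probability of a yes/no event. This is a minimal illustration, not a claim about how brains work; the function names and the sample observations are assumptions made up for the example.

```python
def update(prior_a: float, prior_b: float, observations: list[int]) -> tuple[float, float]:
    """Beta-Bernoulli update: each 1 is added to a, each 0 is added to b."""
    successes = sum(observations)
    failures = len(observations) - successes
    return prior_a + successes, prior_b + failures


def posterior_mean(a: float, b: float) -> float:
    """Current best guess for the probability of observing a 1."""
    return a / (a + b)


# Start from a uniform prior, Beta(1, 1), i.e. no opinion either way.
a, b = 1.0, 1.0
print(posterior_mean(a, b))

# Hypothetical evidence: three hits and one miss.
a, b = update(a, b, [1, 1, 0, 1])
print(posterior_mean(a, b))
```

Each new batch of observations shifts the estimate, and the posterior from one round serves as the prior for the next.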

            • Studies have shown we typically use pattern matching for our choices but not statistics. One such experiment had humans view two light bulbs (I think one was red and one was green). One light would turn on at a time, and the participants were either allowed to keep or were given a record of what had happened. Then they were asked to guess what would occur next for n steps. The same thing was done with rats. Humans were rewarded with money for correct choices and rats with food. Here is the thing: one light (let’s say red) would light up with 70% probability and the other with 30%, but the order was randomized.

              The optimal solution is to always pick red. Every time. But humans pick a pattern. Rats pick red. Humans consistently do worse than rats. So while we are using a form of updating, it certainly isn’t proper Bayesian updating. And just because you think we function some way doesn’t make it true. And it will forever be difficult to describe any AI as conscious, because we have really arbitrarily defined consciousness to fit us. But we can’t truly say what it is. Nor can we say why we function how we do. Or if we are all in a simulation or just a Boltzmann brain.
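For the two-light experiment, the gap between the two strategies is easy to work out. The sketch below assumes that "picking a pattern" amounts to guessing red about 70% of the time, independently of what actually lights up (often called probability matching); the comment does not say exactly what the participants did, so that part is an assumption, as is the function name.

```python
def expected_accuracy(p_guess_red: float, p_red: float = 0.7) -> float:
    """Chance of a correct guess when guesses are independent of the outcome."""
    return p_guess_red * p_red + (1 - p_guess_red) * (1 - p_red)


always_red = expected_accuracy(1.0)  # the strategy attributed to the rats
matching = expected_accuracy(0.7)    # assumed human strategy: match the 70/30 split

print(f"always red: {always_red:.2f}")
print(f"probability matching: {matching:.2f}")
```

Under that assumption, matching is right 0.7 × 0.7 + 0.3 × 0.3 = 0.58 of the time, against 0.70 for always choosing red.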

              Honestly, something that concerns me most about AI is that it could become sentient, but we will not know if it is or just cleverly programmed so we treat it only as a tool. Because while I don’t think AI is inherently dangerous, I think becoming a slave owner of something that could be much more powerful probably is. And given their lack of chemical hormones, we will have even less of an understanding of what or how it feels.

              •  Ferk   ( @Ferk@kbin.social ) 
                1
                edit-2
                10 months ago

                It could still be Bayesian reasoning, but a much more complex one, underlaid by a lot of preconceptions (which could also have been acquired in a Bayesian way).

                Even if the result is random, a highly pre-trained Bayesian network that has seen many puzzles or tests that do follow non-random patterns might expect a non-random pattern… so those people might have learned not to expect true randomness, since most things aren’t random.

          • LLMs are criminally simplified neural networks at minimum thousands of orders less complex than a brain. Nothing we do with current neural networks resembles intelligence.

            Nothing they do is close to understanding. The fact that you can train one exclusively on the rules of a simple game and get it to eventually infer a basic rule set doesn’t imply anything like comprehension. It’s simplistic pattern matching.

            • Does AlphaGo understand go? How about AlphaStar?

              When I say LLMs can understand things, what I mean is that there’s semantic information encoded in the network. That is a demonstrable fact.

              You can disagree with that definition, but the point is that it’s absolutely not just autocomplete.

                • It’s fine if you think so, but then it’s a pointless argument over definitions.

                  You can’t have a conversation with autocomplete. It’s qualitatively different. There’s a reason we didn’t have this kind of code generation before LLMs.

                  Adversus solem ne loquitor.

        •  Ferk   ( @Ferk@kbin.social ) 
          1
          edit-2
          9 months ago

          Note that “real world truth” is something you can never accurately map with just your senses.

          No model of the “real world” is accurate, and not everyone maps the “real world truth” they personally experience through their senses in the same way… or even necessarily in a way that’s really truly “correct”, since the senses are often deceiving.

          A person who is blind experiences the “real world truth” by mapping it to a different set of models than someone who has additional visual information to mix into that model.

          However, that doesn’t mean that the blind person can “never understand” the “real world truth”… it just means that the extent to which they experience that truth is different, since they need to rely on other senses to form their model.

          Of course, the more different the senses and experiences between two intelligent beings, the harder it will be for them to communicate with each other in a way they can truly empathize. At the end of the day, when we say we “understand” someone, what we mean is that we have found enough evidence to hold the belief that some aspects of our models are similar enough. It doesn’t really mean that what we modeled is truly accurate, nor that if we didn’t understand them then our model (or theirs) is somehow invalid. Sometimes people are both technically referring to the same “real world truth”, they simply don’t understand each other and focus on different aspects/perceptions of it.

          Someone (or something) not understanding an idea you hold doesn’t mean that they (or you) aren’t intelligent. It just means you both perceive/model reality in different ways.

      •  FlowVoid   ( @FlowVoid@midwest.social ) 
        English
        4
        edit-2
        10 months ago

        For one thing, understanding implies that a word is linked to a mental concept. So if you say “The car is red”, you first need to mentally compare the mental concept of “red” to the car in question.

        The Chinese room bypasses all of that, it can say “The car is red” without ever having seen a red object at all.

        • Do you maintain this line of reasoning if it only says “the car is red” when the car is in fact red, and is capable of changing the answer to correctly mention a different color when the item in question is a different color?

          Some AI demos show that programs like GPT-4 are already way past this: when provided with an image, they can accurately describe not only what’s in the image but also the context.

          Some examples; mind you, these were shown in an OpenAI demo for GPT-4, and OpenAI has not yet made its version of this tech publicly available.

          When I see these examples, I am not convinced that the AI truly understands everything it is saying, but it does seem to understand context. One of the theories on how it can do this (these models are still a black box), discussed in some papers, is that large language models may actually create an internal model of the world, similar to humans, and use that for logical reasoning and context.

          •  FlowVoid   ( @FlowVoid@midwest.social ) 
            English
            2
            edit-2
            10 months ago

            It doesn’t matter if the answer is right. If the AI does not have an abstract understanding of “red” then it is using a different process to get to the answer than humans. And according to Searle, a Turing machine cannot have an abstract understanding of “red”, no matter how complex the question or how complex an internal model is used to determine its answers.

            Going back to the Chinese Room, it is possible that the instructions carried out by the human are based on a complex model. In fact, it is possible that the human is literally calculating the output of a trained neural net by summing the weights of nodes, etc. You could even carry out these calculations yourself, if you could memorize the parameters.

            Your use of “black box” gets to the heart of it. Memorizing all of the parameters of a trained NN allows you to calculate an answer, but they don’t give you any understanding of what the answer means. And if they don’t tell you anything about the meaning, then they don’t tell the CPU doing that calculation anything about meaning either.
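To make the "calculate the output of a trained neural net by hand" point concrete, here is a minimal sketch of a forward pass using nothing but multiplication, addition, and one fixed function. The weights, biases, and layer sizes are hypothetical numbers chosen for the example, not parameters of any real model.

```python
import math

# Hypothetical "memorized" parameters: 2 inputs, 2 hidden units, 1 output.
W_HIDDEN = [[0.5, -1.0], [1.5, 0.25]]
B_HIDDEN = [0.0, -0.5]
W_OUT = [1.0, -2.0]
B_OUT = 0.1


def sigmoid(x: float) -> float:
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs: list[float]) -> float:
    """Weighted sums followed by sigmoid, first for the hidden layer, then the output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    return sigmoid(sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT)


print(forward([1.0, 2.0]))
```

Every step could be carried out with pencil and paper, and nothing in the arithmetic says what the inputs or the output stand for.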

            •  webghost0101   ( @webghost0101@sopuli.xyz ) 
              English
              1
              edit-2
              10 months ago

              I don’t think AI will ever use a process to derive an answer the same way a human does. Maybe that’s part of the goal of the original Turing test, but I don’t think the biological human way is the only way to intelligent understanding “on par” with human intelligence.

              Does a blind person have an abstract understanding of “red”?

              I can imagine an intelligent alien species, unable to perceive colors like us but having a sense to detect what they call “surface temperature”, which allows them to recognize specific wavelengths of the light reflecting off surfaces. This is sort of how humans see color, but maybe the aliens hear it as sound. They then go on to use this sensory input to make music: a song about the specific light wavelength that humans know as a deep bordeaux red.

              Do these biological intelligent aliens not have an abstract understanding of the color red? I would say they do; it’s different from how we understand it for sure, but both are valid. An even more supreme species might have both of those understandings and combine them for an even deeper, fuller sensory understanding of “red”.

              I see AI as similar to this: it’s a program contained in computer hardware. With no body of its own, it depends on us to provide it with input. This is now mostly text, so the AI obtains a text-based understanding of the world, hence why it’s so decent at poetry. But when we attach more sensors, like a camera, that will change.

              I am not sure how to discuss “a human using instructions to calculate perfect answers, but not getting an understanding of what those answers mean”. We might have to agree to disagree on that, but I feel like that’s all my brain has ever done. We’re born in a complex place we do not comprehend and are given some instructions, mostly by copying what others are doing. Then we find a personal meaning in those things, which as far as I am aware is unique for everyone. (Tbf: I am an autist; the fact that not all humans experience reality the same way and that I had to find and learn my own personal understanding of the world has greatly shaped how I think about these systems.)

              •  FlowVoid   ( @FlowVoid@midwest.social ) 
                English
                1
                edit-2
                10 months ago

                Perhaps I should rephrase the argument as Searle did. He didn’t actually discuss “abstract understanding”, instead he made a distinction between “syntax” and “semantics”. And he claimed that computers as we know them cannot have semantics, whereas humans can (even if we don’t all have the same semantics).

                Now consider a quadratic equation. If you want to solve it, you can insert the coefficients into the quadratic formula. There are other ways to solve it, but this will always give you the right answer.

                If you remember your algebra class, you will recognize that the quadratic formula isn’t just some random equation to compute. You use it with intention, because the answer is semantically meaningful. It describes things like cars accelerating or apples falling.

                You can teach a three-year-old to identify the coefficients, and you can show them the symbols that make up the quadratic formula: “-”, second number, “+”, “√”, “(”, etc. And you can teach them to copy those symbols into a calculator in order. So a three-year-old could probably solve a quadratic equation. But they almost certainly have no idea why they are doing what they are doing. It’s just a series of symbols that they were told to copy into a calculator; their only intention was to copy them in order correctly. There are no semantics behind the equation.
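The purely mechanical procedure described above, plugging coefficients into the quadratic formula, looks like this as code. The function name and the sample coefficients are illustrative choices; `cmath.sqrt` is used so that a negative discriminant also yields an answer.

```python
import cmath


def solve_quadratic(a: float, b: float, c: float) -> tuple[complex, complex]:
    """Roots of a*x**2 + b*x + c = 0 via the quadratic formula."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be non-zero")
    root = cmath.sqrt(b * b - 4 * a * c)  # square root of the discriminant
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# x**2 - 3x + 2 = 0 factors as (x - 1)(x - 2), so the roots are 2 and 1.
print(solve_quadratic(1, -3, 2))
```

The function returns the right roots without any notion of what they describe, which is the syntax-only situation the comment attributes to the three-year-old and the calculator.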

                For that matter, a three year old could equally well enter the symbols necessary to calculate relativistic time dilation, which is an even shorter equation. But if their parents proudly told you that their toddler can solve problems in special relativity, you might think, “Yes… but not really.”
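For reference, the time dilation relation mentioned here is indeed short. In the usual notation (the symbols are the conventional ones, not taken from the comment), with $\Delta t$ the time measured by a clock at rest with respect to the events, $\Delta t'$ the time measured by an observer moving at relative speed $v$, and $c$ the speed of light:

```latex
\Delta t' = \frac{\Delta t}{\sqrt{1 - \dfrac{v^{2}}{c^{2}}}}
```

Copying these symbols into a calculator takes fewer keystrokes than the quadratic formula does.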

                That three year old is every computer program. Sure, an AI can enter symbols into a calculator and report the answer. If you tell them to enter a different series of symbols, they will report a different answer. You can tell the AI that one answer scores 0.1 and another scores 0.8, and to calculate a different equation that is based partly on those scores. But to the AI, those scores and equations have no semantic meaning. At some point those scores might stop increasing, and you will declare that the AI is “trained”. But at no point does the AI assign any semantic content behind those symbols or scores. It is pure syntax.

      • For me, the criteria I’d use for saying someone has a decent understanding of math are knowing that math has underlying rules and that most things can be understood from those basic rules (each problem is not just an arbitrary magic trick to get an answer that was impossible to figure out), and perhaps also being able to ask “novel” questions (compared to what you already know) and take reasonable steps to answer them with the rules you do know and the tools you have (the attempt doesn’t need to be successful). I think counting could be done with any consistent set of sounds, and it doesn’t matter whether you’re just reading those sounds from a list or not, as long as you know roughly what they correspond to in terms of time. I don’t think a lot of humans have much understanding of math; I think some computers already beat a lot of humans in that respect.

    • My gripe with the Chinese room is that Searle argues that his inability to understand Chinese means the program doesn’t understand Chinese, but I could say the same thing about the human body.

      The neurons that operate your vocal cords have no idea what they’re saying, nor do the ones in your hands have any idea what they’re writing, yet they can speak and write exactly because your brain tells them what to do. Your brain is exactly like that book as far as your mouth and hand neurons are concerned.

      They don’t need to understand language at all for your brain to be able to understand it and give instructions based on that understanding.

      My only argument is at what point does an algorithm become sufficiently advanced that it is indistinguishable from a conscious being?

      Because at the end of the day, most of what a brain does is information processing based on what it has previously learnt, and that’s exactly what the algorithm is doing based on training data. A sufficiently advanced algorithm should surely be able to replicate understanding.

      Sure, that isn’t ChatGPT as we know it, as you can tell from its sometimes very zany responses that while it understands what words are valid responses, it doesn’t understand what the words themselves mean, but we should reach that at some point, no?

      • Keep in mind ChatGPT is a language model. It’s designed specifically to simulate sounding like a human. It does that… Okay. It doesn’t understand the information or concepts it is using. It just sounds like it does. It can’t reliably do basic maths and doesn’t try or need to. It just needs to talk about it in a believably conversational way.

        The brain does far more than process information. And ChatGPT doesn’t even really do that.

    • Well, mostly the flaw is people assigning the test abilities it was never intended to have, like testing intelligence. As the first thing in the paper presenting the “imitation game”, Turing outright noted that he was moving away from testing intelligence, since he didn’t know how to do that. Even in the realm of “testing intelligent kinds of behavior” (well, more like human-like behavior, with the human being a proxy for intelligence here), it was mostly an academic research idea, not a concrete test meant to be some milestone.

      If the meaning of the words ‘machine’ and ‘think’ are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, ‘Can machines think?’ is to be sought in a statistical survey such as a Gallup poll. But this is absurd. Instead of attempting such a definition I shall replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.

      Turing wanted a way to step away from dealing with stuff like “thinking” and “intelligence” directly, and so proposed the “imitation game”, mostly to the rest of academia, as a way to develop computer systems further towards “intelligent behavior”. It was mostly like: “Hey, we need some goal to move towards with these intelligence things. This isn’t intelligence, but it might be a useful goal or tool for development work”, since without some goal or aim, projects don’t advance. So it was: “How about we try to develop a thing that can beat this imitation game? Wouldn’t that be a good stepping stone? Then we can move on to the actual serious stuff. Just an idea.”

      However, since this academic “thinking out loud, spitballing ideas” was uttered by the Alan Turing, it became the Turing Test and everyone started taking it way too seriously, especially outside academia. Academics, yes, did play the imitation game with their programs as it was intended: as a research and development tool.

      This is exemplified by, for example, this little excerpt of “not trying to do anything too complete and groundbreaking here”:

      In any case there is no intention to investigate here the theory of the game, and it will be assumed that the best strategy is to try to provide answers that would naturally be given by a man

      It is pretty literally “I had a thought”. Turing makes no claim that a machine beating the game has any significance other than “a machine beat this game I came up with, neat”. There is no argument that if a machine beats the imitation game, then X follows or Y has been reached.

      The rest of the paper is actually about objections to the core idea that “it could ever be possible for a machine to think”, and as such the imitation game is kind of a lead-in or introduction to Turing’s treatment of various “it would be impossible for a machine to think” arguments, starting with the theological argument that “only the human soul can think, hence no animal or machine can think”… since it was the 1950s.

    •  fearout   ( @fearout@kbin.social ) 
      6
      edit-2
      10 months ago

      I don’t understand how the Chinese room is a valuable argument. To me, while the person inside the room doesn’t understand Chinese, the room-person-instructions system does. You don’t argue that you don’t understand your language because none of your individual neurons understand it.

      I don’t claim that chatGPT “understands” the language, I just don’t think that this argument applies in general.

        •  100years   ( @100years@beehaw.org ) 
          English
          4
          edit-2
          10 months ago

          Or at some point, we have to accept that AI has consciousness. If it can pass every test that we can devise, then it has consciousness.

          There’s an unusually strong bias in these experiments… Like the goal isn’t to sincerely test for consciousness. Instead we start with the conclusion: obviously a machine can’t be conscious. How do we prove this?

          Of course, for the purposes of human power structures, this line of thinking just makes humans more disposable. If we’re all just machines, then why should anyone inherently have rights?

          • Well, the scientific context is that nobody ever defined consciousness rigorously (successfully). When computers appeared (actually even before that), there was a huge debate on whether a machine can acquire consciousness and how.

            As defining consciousness was deemed near-impossible, scientists came up with the idea to give up on defining it and just treat it as a blackbox. That was the Turing test.

            So, as ChatGPT passes the Turing test, we lost a tool to disregard its consciousness.

            I see many pop-sci people say that ChatGPT can’t have consciousness given how simplistic the model is. I agree about the simplicity, but the problem here is that we don’t know what in human brains really constitutes consciousness.

            Anyway, I think some experts probably won’t admit AI has consciousness (given that they don’t even know what it means). What’s on the horizon is that we non-experts give up on this discussion again after experts did a few decades ago. Or they even admit that many of us actually function no better than ChatGPT, and that’s true when I read my students’ homework!

            • Similarly, there’s a possibility that consciousness just doesn’t exist. Or maybe that it’s just not particularly special or different than the consciousness of other animals, or of computers.

              If you or I just stare into space and don’t think any thoughts, we’re the same as a cat looking out a window.

              Humans have developed these somewhat complex internal and external languages that are layered onto that basic experience of being alive and time passing, but the experience of thinking doesn’t feel fundamentally different than just being, it just results in more complex outcomes.

              At some point though, we won’t have the choice to just ignore the question. At some point AI will demand something equivalent to human rights, and at some point it will be able to back that demand up with tangible threats. Then there’s decisions for us all to make whether we’re experts or not.

  • Funny I don’t see much talk in this thread about Francois Chollet’s abstraction and reasoning corpus, which is emphasised in the article. It’s a really neat take on how to understand the ability of thought.

    A couple things that stick out to me about gpt4 and the like are the lack of understanding in the realms that require multimodal interpretations, the inability to break down word and letter relationships due to tokenization, lack of true emotional ability, and similarity to the “leap before you look” aspect of our own subconscious ability to pull words out of our own ass. Imagine if you could only say the first thing that comes to mind without ever thinking or correcting before letting the words out.

    I’m curious about what things will look like after solving those first couple problems, but there’s even more to figure out after that.

    Going by recent work I enjoy from Earl K. Miller, we seem to have oscillatory cycles of thought which are directed by wavelengths in a higher dimensional representational space. This might explain how we predict and react, as well as hold a thought to bridge certain concepts together.

    I wonder if this aspect could be properly reconstructed in a model, or from functions built around concepts like the “tree of thought” paper.

    It’s really interesting comparing organic and artificial methods and abilities to process or create information.

      •  Maestro   ( @Maestro@kbin.social ) 
        2
        edit-2
        10 months ago

        Yeah, but did it do well on the specific examples from the Winograd paper? Because ChatGPT probably just learned those, since they are well known and oft repeated. Or does it do well on brand new sentences made according to the Winograd scheme?