[long] Some tests of how much AI "understands" what it says (spoiler: very little)

diz ( @diz@awful.systems ) · 5 months ago

MudMan ( @MudMan@fedia.io ) · 5 months ago

Well, yeah, but that’s all bullshit.

So why would you buy into it when presenting a rebuttal?

I am interested in pointing out that the likely response machine getting the answers to test questions right is not a particularly interesting outcome. That’s interesting.

I’m interested in which of the likely responses the machine struggles with, when it stops struggling, and how much data and processing are associated with each. That’s interesting.

It’s interesting that language emerges from the math at all, let alone how plausible the output is in most situations. That’s more than interesting.

But if your response to the obvious misrepresentation that a chatbot is a person of ANY level of intelligence is to point out that it’s dumb you’ve already accepted the premise. You’re now part of the bullshit. That’s counterproductive. And worse, uninteresting and outright boring.

I am excited about the ways different ML applications can help with automation or as part of a workflow. I think explaining to gullible executives how that would actually work (spoilers, it’s not by replacing workers with chatbots) is very relevant. But this and a lot of the online criticism is not doing that. It’s buying into the incorrect premise that the only reason that’s not how it works is because the AI is too dumb and it’ll be fine when it’s smarter, when that’s unlikely to be the case. Making a better screwdriver won’t turn it into a machete. This is entirely the wrong conversation to be having.

The Cuuuuube ( @Cube6392@beehaw.org ) · 5 months ago

People aren’t worried about buying into it. We’re worried about our bosses buying into it. And they are. And our landlords buying into it. Because they want to.

FRANK.MCCONNEL ( @hairyvisionary@fosstodon.org ) · edit-2 5 months ago

@Cube6392 @MudMan Not simply “because they want to”, but because they know it will be treated as an authority (we put so much stuff in) and will (or can be coerced to) give the answers they as paying customers want

MudMan ( @MudMan@fedia.io ) · edit-2 5 months ago

It’s actually as hard to keep these aligned to the company line as it is to keep them answering truthfully, which is to say very hard.

All these people are out there shilling products that don’t work because they’re designed for capabilities the tech doesn’t have. Google has started rolling Gemini out as a replacement for Google Assistant and it’s embarrassing how much functionality it’s missing in direct comparisons.

There is really no need for paranoia and conspiracy when good old greed and incompetence is more than enough to explain the outcomes we’re seeing.

The Cuuuuube ( @Cube6392@beehaw.org ) · 5 months ago

There is really no need for paranoia and conspiracy when good old greed and incompetence is more than enough to explain the outcomes we’re seeing.

That’s literally what we’re saying

MudMan ( @MudMan@fedia.io ) · 5 months ago

But that’s my problem. You guys are here trying to convince somebody who isn’t listening that you’re better than AI at doing a thing AI doesn’t do in the first place.

You’re implicitly accepting that eventually AI will be better than you once it gets “good enough”. May as well jump in ahead of the curve, right?

Only no, that’s not how it’s likely to go. It’s not what it does or how it works. Everybody is arguing about the sci-fi version of this stuff and making wrong decisions as a result, both critics and advocates. It’s super frustrating. We need a lot more unbiased education and a lot less argumentative nonsense on all sides.

Amoeba_Girl ( @Amoeba_Girl@awful.systems ) · edit-2 5 months ago

You’re implicitly accepting that eventually AI will be better than you once it gets “good enough”. May as well jump in ahead of the curve, right?

What gave you that impression? Most if not all of us here believe that it fundamentally can’t ever get good enough and that the only use case is spam and spamlike activities.

It’s interesting that language emerges from the math at all, let alone how plausible the output is in most situations. That’s more than interesting.

It’s interesting, but it’s not language. It’s sequences of characters that plausibly look like language. You can’t use language without intent.

MudMan ( @MudMan@fedia.io ) · edit-2 5 months ago

I would have agreed a few years back. I was firmly in the camp that the reason that natural language was so hard to generate was you’d need general intelligence to make it work.

I think that position is not tenable now. What you get from chatbots is not intelligence, but it’s language by any definition. You can have a consistent conversation with it, the syntax is reliably correct, the context is reliably coherent, you can extract meaning from it. It’s language by any definition of it I know, unless you retroactively and tautologically redefine the terms to claim that language IS intelligence.

And even then you’d have the problem that the tech can also generate other output that you’d think requires intelligence, like images or video.

But hey, whatever, that is a surprising and interesting outcome. It probably shouldn’t be, we know a human can lose the ability to use language or parse visual information and remain an intelligent, self-aware person. In retrospect it shouldn’t be so surprising, but I still think it is.

But that’s a fairly academic surprise. If anything, that confusion between language and intelligence is the bit that I see both sides of this argument embracing and that I find frustrating, I think you’ve nailed where that breaks down for me.

The Cuuuuube ( @Cube6392@beehaw.org ) · 5 months ago

We’re not accepting that. We’re actively arguing that people trying to do that are bad for society at large

ebu ( @ebu@awful.systems ) · edit-2 5 months ago

You’re implicitly accepting that eventually AI will be better than you once it gets “good enough”. […] Only no, that’s not how it’s likely to go.

wait hold on. hold on for just a moment, and this is important:

Only no, that’s not how it’s likely to go.

i regret to inform you that thinking there’s even a possibility of an LLM being better than people is actively buying into the sci-fi narrative

well, except maybe generating bullshit at breakneck speeds. so as long as we aren’t living in a society based on bullshit we should be goo–… oh fuck

diz ( @diz@awful.systems ) · 5 months ago

But if your response to the obvious misrepresentation that a chatbot is a person of ANY level of intelligence is to point out that it’s dumb you’ve already accepted the premise.

How am I accepting the premise, though? I do call it an Absolute Imbecile, but that’s more of a word play on the “AI” moniker.

What I do accept is the unfortunate fact that they did get their “AIs” to score very highly on various “reasoning” benchmarks (some of their own design), standardized tests, and so on and so forth. The models answer correctly across most simple variations, such as changing the numbers in a problem or the word order.

They really did a very good job at faking reasoning. I feel that even though LLMs are complete bullshit, the sheer strength of that bullshit is easy to underestimate.

self ( @self@awful.systems ) · 5 months ago

given how none of their rant applied to your OP, I’m fairly certain they didn’t read it and were just going off the title. see also how fast they went from a false critique of LLMs (“of course they’re not people”) to an appeal to an imaginary middle ground (“both proponents and critics of LLMs anthropomorphize them/think they’re sci-fi marvels”, a ridiculous claim to apply to your OP or to serious LLM skepticism in general) to smuggling in hype (“…but of course LLMs are revolutionary and we don’t know what they’re capable of”)

in short, don’t bother with this shithead, they’re just marketing OpenAI products to a particularly hostile crowd

V0ldek ( @V0ldek@awful.systems ) · 5 months ago

So why would you buy into it when presenting a rebuttal?

“Let me show you how ridiculous your point is when taken at face value” is a great way to rebut, actually.

[long] Some tests of how much AI "understands" what it says (spoiler: very little)

A couple simple probes:

GPT4 is uncannily good at recognizing the river crossing puzzle

An Idiot With a Petascale Cheat Sheet

Is this a “hallucination”?

But after an update, GPT-whatever is so much better at such prompts.

The need for an Absolute Imbecile Level Reasoning Benchmark

Randomness in bullshitting