economics reporter learns "surprising facts" from no data

froztbyte ( @froztbyte@awful.systems ) · 9 months ago

Amoeba_Girl ( @Amoeba_Girl@awful.systems ) · 9 months ago

Another way of putting it: Out of 196 questions, ChatGPT-4 got about 5 more correct answers than a random guesser would (39 vs 34.23.)

What are the odds of that?
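For anyone who wants an actual number, here is a rough sketch of the calculation. It assumes (my assumption, not something the quoted post states) that every one of the 196 questions has the same chance of being guessed correctly, namely 34.23/196, and that the questions are independent. The real tests presumably mix questions with different numbers of answer options, so treat this as a ballpark only. The names `binom_tail`, `N_QUESTIONS`, etc. are made up for illustration.

```python
from math import comb

N_QUESTIONS = 196           # total questions, from the quoted post
OBSERVED_CORRECT = 39       # ChatGPT-4's score, from the quoted post
EXPECTED_BY_CHANCE = 34.23  # random guesser's expected score, from the quoted post


def binom_tail(n: int, k: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: one uniform per-question guess probability for all questions.
p_guess = EXPECTED_BY_CHANCE / N_QUESTIONS

# Chance that a pure guesser scores at least as well as the observed result.
p_value = binom_tail(N_QUESTIONS, OBSERVED_CORRECT, p_guess)

print(f"per-question guess probability: {p_guess:.4f}")
print(f"P(random guesser gets >= {OBSERVED_CORRECT} right): {p_value:.3f}")
```

Under those assumptions the tail probability lands well above 0.05, i.e. a pure guesser would match or beat that score fairly often.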

I’m too lazy to look through the tests he’s administering, but IQ tests like the WAIS have vocabulary questions, which, yes, you would expect an LLM to do better on than random chance.

I’ve surely said it before, but when you see the sort of thinking on display by Mr Max Truth here, is it any wonder that rationalists are impressed with ChatGPT’s reasoning faculties?

Amoeba_Girl ( @Amoeba_Girl@awful.systems ) · 9 months ago

I asked ChatGPT-4 if cars in roundabouts in Ireland go clockwise or counterclockwise. It got it wrong. When I told it that, it apologized and gave the right answer. But then I trickily called it out on its right answer, and it apologized again and reverted to the wrong answer. Fundamentally, it knows that the Irish drive on the left side of the road, but it doesn’t understand how to apply that to a roundabout to find the circular direction.

lol you fucking idiot

self ( @self@awful.systems ) · 9 months ago

this coin I’m flipping fundamentally knows everything about how the Irish drive, but it only seems to feel like giving me the right answer approximately half the time

this reminds me of very early in my programming career, when I discovered that an NPC I programmed to randomly either move forward or turn left every 10 seconds was surprisingly good at solving simple labyrinths. I used to instantiate like 100 of them and see which ones would win (or “fight” by colliding with each other, or escape the labyrinth by stacking on top of other instances). you’re telling me now I was a handful of incredibly stupid blog posts away from being a renowned AI researcher?
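for the curious, a minimal sketch of that kind of NPC, not the original code. everything here is assumed for illustration: the toy `MAZE` layout, the 50/50 split between the two actions, one action per tick instead of every 10 seconds, and the names `random_walker`, `find` and `HEADINGS`. there is no collision or stacking between instances, each walker runs alone.

```python
import random

# Hypothetical toy labyrinth: '#' wall, '.' floor, 'S' start, 'E' exit.
MAZE = [
    "#######",
    "#S..#E#",
    "#.#.#.#",
    "#.#...#",
    "#######",
]

# (d_row, d_col) headings in counterclockwise order, so turning left is index + 1.
HEADINGS = [(0, 1), (-1, 0), (0, -1), (1, 0)]  # east, north, west, south


def find(maze, ch):
    """Return the (row, col) of the first occurrence of ch in the maze."""
    for r, row in enumerate(maze):
        c = row.find(ch)
        if c != -1:
            return r, c
    raise ValueError(f"{ch!r} not found in maze")


def random_walker(maze, rng, max_ticks=100_000):
    """Each tick, either turn left or try to step forward, chosen at random.

    Returns the number of ticks needed to reach the exit, or None if the
    tick budget runs out first.
    """
    pos = find(maze, "S")
    goal = find(maze, "E")
    heading = 0  # start facing east
    for tick in range(1, max_ticks + 1):
        if rng.random() < 0.5:
            heading = (heading + 1) % 4  # turn left
        else:
            d_row, d_col = HEADINGS[heading]
            r, c = pos[0] + d_row, pos[1] + d_col
            if maze[r][c] != "#":  # walls block movement; outer wall keeps indices valid
                pos = (r, c)
        if pos == goal:
            return tick
    return None


# "Instantiate" 100 walkers with different seeds and see which one wins.
results = {seed: random_walker(MAZE, random.Random(seed)) for seed in range(100)}
finished = {seed: ticks for seed, ticks in results.items() if ticks is not None}
winner = min(finished, key=finished.get)
print(f"{len(finished)} of 100 walkers escaped; fastest was seed {winner}")
```

since left turns alone can reach any heading, the walker can eventually get anywhere in a connected maze. no understanding required.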

swlabr ( @swlabr@awful.systems ) · 9 months ago

I used to instantiate like 100 of them and see which ones would win (or “fight” by colliding with each other

The basilisk will not take kindly to your desecration of AGI for sport.

Top AIs still fail IQ tests