This is a game played on June 19 between Ueno Asami (Black) and Fujisawa Rina (White). At move 159, Asami cut at K14.

The cut carries a follow-up against the life of the white group around M10. Most human players would be able to judge it, but AIs (from Golaxy to KataGo) all seem to deem the group "alive" and continue fighting without first securing its life, whereas a human player can read out the sequence, as Rina did during the game, and make the right choice.

But it takes 10k+ playouts before the AI realizes the K14 cut is a very good move (with few playouts it even judges K14 as a blunder), and millions of playouts before it realizes the white M10 group is in trouble.
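
If you want to reproduce the visit-dependence yourself, here is a minimal sketch using KataGo's JSON analysis engine. The binary, config, and model paths are placeholders, the move list is truncated (you would fill in the actual game record up to move 158), and rules/komi are set as for a Japanese pro game:

```python
import json
import subprocess

# Placeholder paths -- point these at your own KataGo install.
KATAGO_CMD = ["katago", "analysis",
              "-config", "analysis.cfg",
              "-model", "model.bin.gz"]

# The game record up to the position before move 159, as
# [player, coordinate] pairs. Truncated here -- fill in the real moves.
MOVES = [["B", "Q16"], ["W", "D4"]]  # ... up to move 158

def analyze(max_visits):
    """Evaluate the final position of MOVES at a given visit cap."""
    query = {
        "id": f"visits-{max_visits}",
        "moves": MOVES,
        "rules": "japanese",
        "komi": 6.5,
        "boardXSize": 19,
        "boardYSize": 19,
        "maxVisits": max_visits,
    }
    proc = subprocess.Popen(KATAGO_CMD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate(json.dumps(query) + "\n")
    resp = json.loads(out.strip().splitlines()[-1])  # last line is the answer
    top = resp["moveInfos"][0]  # moveInfos comes back sorted best-first
    print(f"{max_visits:>7} visits: best={top['move']:>4}  "
          f"winrate={top['winrate']:.3f}  prior={top['prior']:.3f}")

for v in (100, 1_000, 10_000, 100_000):
    analyze(v)
```

Restarting KataGo for every query is wasteful; a real script would keep one engine process open and stream queries to it, but this keeps the sketch self-contained.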

What other AI blind spots and hallucinations have you seen in real games?

  • Ah, I see what you mean now. I can confirm that at lower visits KataGo does indeed want to push and cut, which is a problem. For what it’s worth, at higher visits KataGo sees the problem with the push and cut at wL11 and suggests other moves instead, such as wK13, or tenuki-ing to play wC9.

    I think the wN5 played in the game is premature, though; the squeeze play you described would be suicidal for Black to try without the push and cut, due to the aji at R8.

    I showed this position to lightvector (the author of KataGo) on the Computer Go Discord, and he had this to say:

    Thanks! I looked at this position too and I concur: there is no blind spot, in the sense that there is no important move affecting the tactics whose policy prior is low enough to leave it unexplored or under-explored. Even deep into the variations I didn’t find any sign of that.

    There are some strong value head errors, though: some positions in the branches that lead to a win for one side have evaluations partway along that are highly positive for the other side, which dissuades the search from thinking those branches work until a potentially fairly large number of playouts makes it “push through” that hill and discover the true value. That’s why you sometimes see it miss the right move, and why the evaluations fluctuate a ton even at fairly large numbers of playouts. More value head training to judge short-term tactics might help a bit.

    In this case I suspect the value head might be too smooth or linear in its extrapolation. Take a position that is winning for player A and add four different reasons that each, at first glance, might allow B to win, but where each one actually just barely fails. In reality all four count for nothing. But if the value head is a little inaccurate and also too linear, assigning some value to each one, then with all of them pulling in the same direction and the value head adding their weights together, the total might be enough to make it think B is good. I get a sense that something like that may be going on here.
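
    To make the “push through the hill” dynamic concrete, here is a toy PUCT simulation of my own (not lightvector’s code, and all the numbers are made up): a root with a “safe” move whose evaluation is accurate, and a “cut” whose first few evaluations come back wrongly pessimistic before deeper evaluations return the true value. Because MCTS backs up the mean, the cut’s early Q-value starves it of visits, and only the slowly growing exploration term forces enough revisits to push its average through:

    ```python
    import math

    C_PUCT = 1.5
    PRIORS = {"safe": 0.6, "cut": 0.4}   # made-up policy priors

    def evaluate(move, nth_visit):
        """Toy value net: 'safe' is judged correctly at 0.50; the first
        five evaluations down the 'cut' line are wrongly pessimistic
        (0.30), after which deeper positions return the truth (0.90)."""
        if move == "safe":
            return 0.50
        return 0.30 if nth_visit < 5 else 0.90

    def search(total_playouts):
        visits = {"safe": 0, "cut": 0}
        value_sum = {"safe": 0.0, "cut": 0.0}
        for t in range(1, total_playouts + 1):
            def puct(m):
                q = value_sum[m] / visits[m] if visits[m] else 0.5
                u = C_PUCT * PRIORS[m] * math.sqrt(t) / (1 + visits[m])
                return q + u
            m = max(visits, key=puct)      # select the move maximizing Q + U
            value_sum[m] += evaluate(m, visits[m])
            visits[m] += 1
        q_cut = value_sum["cut"] / max(visits["cut"], 1)
        print(f"{total_playouts:>6} playouts: cut visited {visits['cut']:>5}x, "
              f"Q(cut) = {q_cut:.3f}")

    for n in (50, 200, 1_000, 5_000):
        search(n)
    ```

    With a handful of playouts the cut’s Q sits near 0.30 and the search avoids it; with thousands, its average climbs past the safe move and the visits flip toward it. The real effect involves whole subtrees rather than two moves at a root, but the shape is the same.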
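
    And the linearity failure he describes can be shown with bare arithmetic (again, my own toy numbers): four threats for B that each barely fail are truly worth nothing, but a value head that credits each one a little and sums the credits linearly can flip the sign of the overall evaluation:

    ```python
    # A's true winrate with no B threats on the board (made-up number).
    baseline_a = 0.80

    # Four B threats that each *barely fail*: their true value is zero.
    true_credit    = [0.00, 0.00, 0.00, 0.00]
    # A slightly inaccurate, too-linear value head credits each a little.
    learned_credit = [0.12, 0.10, 0.15, 0.11]

    true_winrate_a      = baseline_a - sum(true_credit)     # 0.80 -- A is winning
    estimated_winrate_a = baseline_a - sum(learned_credit)  # 0.32 -- net thinks B leads

    print(f"true: {true_winrate_a:.2f}, value head: {estimated_winrate_a:.2f}")
    ```

    Each per-threat error is small enough to pass as ordinary value-net noise; it is the linear accumulation across several independent-looking threats that produces the large miss.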