This is a game played on June 19 between Ueno Asami (Black) and Fujisawa Rina (White). At move 159, Asami cut at K14.

The cut carries a follow-up against the life of the white group around M10. Most human players would be able to judge it, but AIs (from Golaxy to KataGo) all seem to deem the group "alive" and continue fighting without first securing its life, whereas a human player can read out the sequence, as Rina did during the game, and make the right choice.

But it takes 10k+ playouts before the AI realizes the K14 cut is a very good move (with few playouts it even judges K14 as a blunder), and millions of playouts before it realizes the white M10 group is in trouble.
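
If you want to reproduce the visit-dependence yourself, here is a minimal sketch using KataGo's JSON analysis engine. The binary, config, and model paths are placeholders, the move list is truncated (you would fill in the actual game record up to move 158), and rules/komi are set as for a Japanese pro game:

```python
import json
import subprocess

# Placeholder paths -- point these at your own KataGo install.
KATAGO_CMD = ["katago", "analysis",
              "-config", "analysis.cfg",
              "-model", "model.bin.gz"]

# The game record up to the position before move 159, as
# [player, coordinate] pairs. Truncated here -- fill in the real moves.
MOVES = [["B", "Q16"], ["W", "D4"]]  # ... up to move 158

def analyze(max_visits):
    """Evaluate the final position of MOVES at a given visit cap."""
    query = {
        "id": f"visits-{max_visits}",
        "moves": MOVES,
        "rules": "japanese",
        "komi": 6.5,
        "boardXSize": 19,
        "boardYSize": 19,
        "maxVisits": max_visits,
    }
    proc = subprocess.Popen(KATAGO_CMD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate(json.dumps(query) + "\n")
    resp = json.loads(out.strip().splitlines()[-1])  # last line is the answer
    top = resp["moveInfos"][0]  # moveInfos comes back sorted best-first
    print(f"{max_visits:>7} visits: best={top['move']:>4}  "
          f"winrate={top['winrate']:.3f}  prior={top['prior']:.3f}")

for v in (100, 1_000, 10_000, 100_000):
    analyze(v)
```

Restarting KataGo for every query is wasteful; a real script would keep one engine process open and stream queries to it, but this keeps the sketch self-contained.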

What other AI blind spots and hallucinations have you seen in real games?

  • Ah, I see what you mean now. I can confirm that at lower visits KataGo does indeed want to push and cut, which is a problem. For what it’s worth, at higher visits KataGo sees the problem with the push and cut at wL11 and suggests other moves instead, such as wK13, or tenuki-ing to play wC9.

    I think the wN5 played in the game is premature, though; the squeeze play you described would be suicidal for Black to try without the push and cut, due to the aji at R8.

    I showed this position to lightvector (the author of KataGo) on the Computer Go Discord, and he had this to say:

    Thanks! I looked at this position too and I concur: there is no blind spot, in the sense that there is no important move affecting the tactics whose policy prior is low enough to leave it unexplored or under-explored. Even deep into the variations I didn’t find any sign of that.

    There are some strong value head errors, though: some positions in the branches that lead to a win for one side have evaluations partway along that are highly positive for the other side, which dissuades the search from thinking those branches work until a potentially fairly large number of playouts makes it “push through” that hill and discover the true value. That’s why you sometimes see it miss the right move, and why the evaluations fluctuate a ton even at fairly large numbers of playouts. More value head training to judge short-term tactics might help a bit.

    In this case I suspect the value head might be too smooth or linear in its extrapolation. Take a position that is winning for player A and add four different reasons that each, at first glance, might allow B to win, but where each one actually just barely fails. In reality all four count for nothing. But if the value head is a little inaccurate and also too linear, assigning some value to each one, then with all of them pulling in the same direction and the value head adding their weights together, the total might be enough to make it think B is good. I get a sense that something like that may be going on here.
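
    To make the “push through the hill” dynamic concrete, here is a toy PUCT simulation of my own (not lightvector’s code, and all the numbers are made up): a root with a “safe” move whose evaluation is accurate, and a “cut” whose first few evaluations come back wrongly pessimistic before deeper evaluations return the true value. Because MCTS backs up the mean, the cut’s early Q-value starves it of visits, and only the slowly growing exploration term forces enough revisits to push its average through:

    ```python
    import math

    C_PUCT = 1.5
    PRIORS = {"safe": 0.6, "cut": 0.4}   # made-up policy priors

    def evaluate(move, nth_visit):
        """Toy value net: 'safe' is judged correctly at 0.50; the first
        five evaluations down the 'cut' line are wrongly pessimistic
        (0.30), after which deeper positions return the truth (0.90)."""
        if move == "safe":
            return 0.50
        return 0.30 if nth_visit < 5 else 0.90

    def search(total_playouts):
        visits = {"safe": 0, "cut": 0}
        value_sum = {"safe": 0.0, "cut": 0.0}
        for t in range(1, total_playouts + 1):
            def puct(m):
                q = value_sum[m] / visits[m] if visits[m] else 0.5
                u = C_PUCT * PRIORS[m] * math.sqrt(t) / (1 + visits[m])
                return q + u
            m = max(visits, key=puct)      # select the move maximizing Q + U
            value_sum[m] += evaluate(m, visits[m])
            visits[m] += 1
        q_cut = value_sum["cut"] / max(visits["cut"], 1)
        print(f"{total_playouts:>6} playouts: cut visited {visits['cut']:>5}x, "
              f"Q(cut) = {q_cut:.3f}")

    for n in (50, 200, 1_000, 5_000):
        search(n)
    ```

    With a handful of playouts the cut’s Q sits near 0.30 and the search avoids it; with thousands, its average climbs past the safe move and the visits flip toward it. The real effect involves whole subtrees rather than two moves at a root, but the shape is the same.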
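
    And the linearity failure he describes can be shown with bare arithmetic (again, my own toy numbers): four threats for B that each barely fail are truly worth nothing, but a value head that credits each one a little and sums the credits linearly can flip the sign of the overall evaluation:

    ```python
    # A's true winrate with no B threats on the board (made-up number).
    baseline_a = 0.80

    # Four B threats that each *barely fail*: their true value is zero.
    true_credit    = [0.00, 0.00, 0.00, 0.00]
    # A slightly inaccurate, too-linear value head credits each a little.
    learned_credit = [0.12, 0.10, 0.15, 0.11]

    true_winrate_a      = baseline_a - sum(true_credit)     # 0.80 -- A is winning
    estimated_winrate_a = baseline_a - sum(learned_credit)  # 0.32 -- net thinks B leads

    print(f"true: {true_winrate_a:.2f}, value head: {estimated_winrate_a:.2f}")
    ```

    Each per-threat error is small enough to pass as ordinary value-net noise; it is the linear accumulation across several independent-looking threats that produces the large miss.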