Quote:
Originally Posted by arzlan
The problem is that it wants to do this with over twice as many bluff combos as value hands.
This kind of spot has been posted as an example of Snowie's weirdness and generally unbalanced river play several times in the thread. I think it's mostly just a sample size problem, due to the way Snowie trained, as Frogman mentioned. During its "reinforcement learning" phase, it simply can't have experienced every possible action sequence with every possible combo vs every other possible combo often enough for its river EVs to be accurate.
It's not like with a solver, where you have the exact ranges - and frequencies with which each combo appears at that decision point - and let the software calculate a near-GTO solution. Snowie had to "play" billions of hands against "random" opponents. On that particular runout, it must have done quite well with its bluff shoves, so it kept doing them, but it did badly with its calls of shoves vs the agents it trained against. The game of poker is just too big for a neural network to have found solutions for every river card on every board vs every possible "random" strategy.
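The sample-size point can be illustrated with a toy sketch (this is just a Monte Carlo analogy, not Snowie's actual training procedure; the noise level and sample counts are made-up assumptions): two actions with the same true EV, one "visited" often during training and one rarely, produce very different-quality estimates.

```python
import random

def mc_ev_estimate(true_ev, noise, n_samples, rng):
    """Average n_samples noisy payoff observations of one action.

    Stand-in for a learner estimating an action's EV from the
    hands in which it actually took that action.
    """
    return sum(true_ev + rng.gauss(0, noise) for _ in range(n_samples)) / n_samples

rng = random.Random(42)

# A frequently reached node vs a rare river runout: same true EV
# (1.0 bb) and same payoff noise, but far fewer training samples.
common_estimates = [mc_ev_estimate(1.0, 5.0, 10_000, rng) for _ in range(20)]
rare_estimates   = [mc_ev_estimate(1.0, 5.0, 50, rng) for _ in range(20)]

def spread(xs):
    """Range of the repeated estimates: a crude reliability measure."""
    return max(xs) - min(xs)

print("common-node spread:", spread(common_estimates))  # tight around 1.0
print("rare-node spread:  ", spread(rare_estimates))    # much wider scatter
```

The rare-node estimates scatter much more widely around the same true EV, so a strategy tuned on them (e.g. "bluff-shove this exact runout") can end up confidently wrong in a way a solver working from exact ranges never would.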
As has been said before, you can use Snowie to draw some general conclusions about optimal ranges, particularly on the earlier streets, where Snowie's EV estimation is also a bit more trustworthy. But its river play is sometimes quite weird/wrong, and the numbers don't "add up".
Don't lose sleep over its percentages being unbalanced in one specific spot imo. You're unlikely to ever have that exact runout (and those exact combos) in real life. (If it came up again, you could 'exploit' Snowie by calling wide, but in other spots that look quite similar, you might find Snowie is considerably under-bluffing. Snowie's "mistakes" probably kind of even out in the long run.)