Of all the longtime forum regs, I think I'm the one who racks up posts at the slowest rate. I joined 2+2 over 7 years ago and I'm just now hitting 3,000.
I've been on the forum for so long that a COTM that I wrote once,
http://forumserver.twoplustwo.com/17...91/?highlight=
is almost 3 years old as I write this one. Back then, a lot of people were talking about "GTO" as if it was some new mathematical theory that was going to solve the game of poker--and more importantly, that you could play "GTO" or close to it and lock in a guarantee of long-term profit.
As a professional mathematician, I could see pretty clearly that most of this forum had a deep misunderstanding of what GTO is supposed to be able to do mathematically, and also of what its limitations were. So I tried to write something that would help.
Today there are still tons of references to GTO on this forum, most of which are still silly. In particular, the thing that always bugged me the most--and which I never got around to discussing in the COTM either--is that people want to believe that GTO behaves the same for 3+-player games as it does for 2-player games. This is absurd. Every time I see it I want to jump into the thread and correct it, but it happens too often. So now I'm using my 3,000th post on it.
The Situation In 2-Player Games: Just so we're on the same page, in 2-player games, most of what you think about how GTO works actually is true. If 2 people are playing a zero-sum, fair game, there's always going to be at least one Nash equilibrium, and it will always be break-even for both players. The number of Nash equilibria (there could be more than one) is sort of a moot point in 2-player games. Here's why. There's an important theorem that in a zero-sum 2-player game, all the different Nash equilibria have to give the same payouts to both players. So we can define the value of any 2-player zero-sum game for either player as the amount that player would receive in a Nash equilibrium.
Now if one player is playing a NE strategy, and the other deviates, that new strategy has to pay out either the value of the game, or less. This is by definition of the Nash equilibrium. So, if the deviating player breaks even or loses, the player sticking to the NE must break even or win when the other player deviates. That's because the game is zero-sum, so with 2 players, if one loses EV, that EV has only one place to go: the other player. This is the source of the notion that playing GTO locks in at least the value of the game for anyone playing it. And if the value of the game is 0, then GTO must break even or win in a 2-player game.
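If you want to see that guarantee concretely, here's a quick sketch (my own toy example, nothing to do with poker) using rock-paper-scissors, which is a 2-player zero-sum game with a value of 0. A player who sticks to the uniform NE mix concedes at most the value--here, exactly 0--to every possible opponent strategy:

```python
from itertools import product

# Row player's payoff matrix for rock-paper-scissors (0=rock, 1=paper, 2=scissors).
# The game is zero-sum: the column player gets the negative of each entry.
payoff = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

ne = [1 / 3, 1 / 3, 1 / 3]  # row player locks in the NE mix

def column_ev(col_strategy):
    """Column player's EV when the row player sticks to the NE mix."""
    return sum(ne[i] * col_strategy[j] * -payoff[i][j]
               for i, j in product(range(3), repeat=2))

# No deviation by the column player earns more than the value of the game (0):
for dev in ([1, 0, 0], [0, 1, 0], [0.5, 0.5, 0], [0.2, 0.3, 0.5]):
    assert abs(column_ev(dev)) < 1e-12
```

Every deviation comes out to exactly 0 here because RPS's equilibrium is fully mixed; in general a deviation gets at most the value of the game, and possibly less.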
Where people get it wrong for 2-player games is they tend to assume that any deviation from GTO results in a loss. This is false--see the thread linked above for many, many examples. But overall most of what people believe about how GTO works in 2-player zero-sum games with a value of 0 is right.
What Goes Wrong When You Add More Players: The mathematical definition of a Nash equilibrium says that if all but one of the players are playing their NE strategy, the final player cannot gain by deviating from their own NE strategy. In 2-player games, this implies a lot more than it does in multi-player games. So when you get your intuition about GTO from 2-player games, there are (at least) two extremely important places where that intuition blows up completely:
1. When there are multiple Nash equilibria, they need not have the same EV for all players. So there is no longer any such thing as the "value" of the game. There is no well-defined value for GTO to "lock in" like there is in the 2-player case. The proof that such a value must exist was specific to 2 players and totally breaks down when we add more.
Here's an example. Let's say 3 people are going to play a game where they each hold a coin in their hand and secretly set it to heads or tails. They reveal their play at the same time; if one is showing something different than what the other two are showing, that player wins +2 and the other two each lose 1. (If they all show the same it's a tie.)
This game is clearly very symmetrical. So what do you suppose a Nash equilibrium for this game might look like?
Based on my time on this forum, if you're reading this, I think there's a good chance you would guess that there is exactly one Nash equilibrium, and that it involves all players playing 50% H/50% T.
If you guessed this, you are partially right. All players playing 50% H/50% T clearly is a NE. If any 2 players are playing that, then the third player is indifferent to playing H or T--it's always an EV of 0 no matter what. He'll win 25% of the time, tie 25% of the time, and lose the remaining 50%. But a win is worth twice as much as a loss, so the EV is 0. By a symmetrical argument, the EV is 0 for all 3 players in this NE.
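If you want to check that arithmetic yourself, here's a quick brute-force sketch in Python (my own, using the payoff rule defined above):

```python
from itertools import product

def payoff(a, b, c):
    """Each player sets H or T; the odd one out wins +2, the other two lose 1."""
    shows = [a, b, c]
    for i, s in enumerate(shows):
        if shows.count(s) == 1:          # player i is the odd one out
            return [2 if j == i else -1 for j in range(3)]
    return [0, 0, 0]                     # all three match: a tie

# With the other two players mixing 50/50, player 0's EV is 0 whether
# they lock in H or T--they're exactly indifferent.
for my_play in "HT":
    ev = sum(payoff(my_play, b, c)[0] for b, c in product("HT", repeat=2)) / 4
    print(my_play, ev)                   # both come out to 0.0
```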
If this were a 2-player game, you could be comfortable stopping your analysis there. But it's not. There's more. There's another Nash equilibrium that has different payouts than the first one!
Suppose you're playing this game with my girlfriend and me. Now suppose she always throws heads, and I always throw tails.
No matter what strategy you play, this is a Nash equilibrium. Whatever my EV is throwing tails, it definitely won't go up if I switch to heads. If you're throwing heads, I'm better off throwing tails; if you're throwing tails, I'm losing whether I throw heads or tails; so I can't gain by deviating. By a similar argument, my girlfriend can't either. That means we have found a Nash equilibrium where it is guaranteed that one player (you) will get a negative payout. You'll get -1 every time. (Since you always get -1 you're obviously indifferent to changing your strategy as well.) And this is despite the fact that the game is zero-sum and totally symmetrical!
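This equilibrium is small enough to verify by brute force too. Here's a sketch (mine, re-stating the payoff rule from the coin game so it's self-contained):

```python
def payoff(you, her, me):
    """Odd one out wins +2, the other two lose 1; all matching is a tie."""
    shows = [you, her, me]
    for i, s in enumerate(shows):
        if shows.count(s) == 1:
            return [2 if j == i else -1 for j in range(3)]
    return [0, 0, 0]

# She always plays H (index 1), I always play T (index 2).
# You get -1 no matter what you show:
assert payoff("H", "H", "T")[0] == -1   # you match her; I'm the odd one out
assert payoff("T", "H", "T")[0] == -1   # you match me; she's the odd one out

# Neither of us can gain by unilaterally switching, whatever you do:
for you in "HT":
    assert payoff(you, "H", "T")[2] >= payoff(you, "H", "H")[2]  # I prefer T
    assert payoff(you, "H", "T")[1] >= payoff(you, "T", "T")[1]  # she prefers H
```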
At this point you may be wondering whether this kind of situation--a Nash equilibrium where one player is guaranteed to lose even though the game is zero-sum and symmetrical--is possible in poker. The answer is, I don't know. Regardless, it very well might be the case that poker has multiple Nash equilibria, and they don't have the same payout structure. That means if you think of GTO as one strategy, or a NE as one set of strategies, you are thinking too small.
2. The scenarios for what's allowed to happen when one player deviates from a Nash equilibrium include bad results for some of the non-deviating players. This is the big one. It's much more important and more relevant than the other one.
In a Nash equilibrium, when one player deviates, that player's EV must stay the same or go down. This is by definition of the NE. In a 2-player game, when one player deviates and breaks even, the other must also break even. When one player loses, the other must win.
Both of these things completely stop being true when there is no unique "other".
Here's a toy game to show this point. Imagine we're playing a poker-like game with 2 players--call them A and P. Let's represent A's range by the interval (0,1) and P's range by the interval (-.4,0) U (1,1.6). So P is polarized. There's $1 in the pot. A acts first. If A checks (which he always does), P can check back or bet $1, and if he bets, A can call or fold.
We all know the solution to this game--P bets all of his range greater than 1, plus the range (-.3,0). So he's betting 90% of the time, and when he does he has a 2:1 value:bluff ratio. A should call that bet with the top half of his range. A wins the pot 10% of the time when P checks back, and his EV is 0 when P bets. A's EV is .1 and P's EV is .9. As long as P bets his entire value range, it doesn't matter how much or how little he bluffs--his EV will always be .9 as long as A doesn't change his calling frequency.
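Those EVs are easy to sanity-check numerically. Here's a sketch of that check (my own; it assumes both ranges are uniform over the stated intervals, which is the standard reading):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 2000
a = (np.arange(n) + 0.5) / n                       # A's range: uniform on (0,1)
# P's range: (-.4,0) U (1,1.6), sampled evenly over its total length of 1.0
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

p_bets = (p > 1) | ((p > -.3) & (p < 0))           # value plus the bluffs (-.3,0)
a_calls = a > 0.5                                  # A calls with his top half

ev_A = 0.0
for ph, pb in zip(p, p_bets):
    if not pb:                                     # P checks back (always a loser)
        ev_A += np.mean((a > ph) * pot)            # A just wins the pot
    else:                                          # A folds half, calls half
        showdown = np.where(a > ph, pot + bet, -bet)
        ev_A += np.mean(np.where(a_calls, showdown, 0.0))
ev_A /= len(p)
print(round(ev_A, 3), round(pot - ev_A, 3))        # 0.1 0.9
```

The .1 for A comes entirely from P's 10% check-backs; every bet, called or folded, is exactly 0 EV for A.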
Now let's add a third player to this game--call him B. B will sit between A and P. He'll have exactly the same range as A--the interval (0,1). He also checks to P every time, as does A. If P bets $1, B can call or fold when the action gets back to him. He, like A, cannot raise.
How does B's presence change the Nash equilibrium for this game? (This toy game has only one NE, by the way. I'm omitting the proof because this is gonna be long enough.)
-P's GTO strategy is totally unchanged--he bets his entire value range and he bets a 2:1 value:bluff ratio, so any caller has 0 EV. Since he bets 90% of the time, his EV in this game is .9.
-A's GTO strategy is to never call any bet. (It sounds weird, but it's true.) Since he and B have the same range, he should expect to win half the pot on average when P checks back a loser. Since this happens 10% of the time, his EV is .05.
-B's GTO strategy is to never overcall if A calls, but if A folds (which happens all the time in a NE), he will call P's bet with the top half of his range. He gets no EV from this, though, so his EV all comes from P's check backs. Like A, his EV is .05.
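Here's a numerical sanity check of those three EVs (mine again; uniform ranges assumed, and the 50/50 showdown split when P checks back follows from A and B holding identical ranges):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 1000
b = (np.arange(n) + 0.5) / n                       # B's range: uniform on (0,1)
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

bets = (p > 1) | ((p > -.3) & (p < 0))             # P's NE betting range, 90%

# When P checks back (always a loser), A and B show down identical uniform
# ranges, so each takes the pot half the time.
check_frac = np.mean(~bets)                        # = 0.1
ev_A = check_frac * pot * 0.5
ev_B = check_frac * pot * 0.5

# When P bets, A (playing GTO) always folds and B calls his top half.
for ph in p[bets]:
    calls = b > 0.5
    ev_B += (np.mean(calls & (b > ph)) * (pot + bet)
             - np.mean(calls & (b < ph)) * bet) / len(p)

ev_P = pot - ev_A - ev_B                           # the pot is zero-sum
print(round(ev_A, 3), round(ev_B, 3), round(ev_P, 3))   # 0.05 0.05 0.9
```

Note that B's calls contribute nothing: his EV, like A's, comes entirely from check-backs.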
This may already be interesting to you if you never thought about it before, but where it gets really interesting is what happens when P's bluffing frequency changes. In the 2-player version of this game, the EVs don't change when P unilaterally deviates. But now they will.
For example, let's imagine that P overbluffs. He will now be betting 100% of the time, while the other 2 players stick to GTO. Since A's GTO strategy is to always fold, his EV now drops to 0. Since B's calling frequency means that P's bluffs are all 0 EV, P's EV doesn't change either--it's .9. (He takes down the $1 pot 50% of the time, wins $2 30% of the time, and loses $1 20% of the time.) B's EV has gone up to .1.
So the NE payouts were .05-.05-.9. The third player deviating has caused them to go to 0-.1-.9. In other words, one player has deviated in a way that leaves his own EV unchanged but shifts EV from one of the GTO players to the other. That means that when one player deviates, it is actually possible for someone to stick to GTO and have their EV go down compared to what it was in the Nash equilibrium!
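Here's a numerical check of the deviated payouts (uniform ranges assumed; A always folds and B still calls his top half, while P now bets everything):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 1000
b = (np.arange(n) + 0.5) / n                       # B's range: uniform on (0,1)
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

ev_A = 0.0                    # P never checks back, so A (who always folds) gets nothing
ev_B = 0.0
for ph in p:                  # P deviates: he bets his ENTIRE range
    calls = b > 0.5           # B sticks to GTO: call the top half
    ev_B += (np.mean(calls & (b > ph)) * (pot + bet)
             - np.mean(calls & (b < ph)) * bet) / len(p)

ev_P = pot - ev_A - ev_B
print(ev_A, round(ev_B, 3), round(ev_P, 3))        # 0.0 0.1 0.9
```

P's extra bluffs are all 0 EV against B's calling frequency, so the .05 that used to flow to A at showdown ends up with B instead.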
This to me poses a really interesting question for poker. According to the definition of the Nash equilibrium, if you put one normal player in a ring game with 8 GTO players, the normal player could not do better than his own equilibrium EV. But it is theoretically possible that based on his style of play he could drag some of the GTO players into losing territory! And this is with ONE player deviating!
Now imagine the theoretical possibilities if one GTO player is in a ring game with 8 normal players. I think most people's intuition is that the GTO player would clean up. But in fact mathematically we have absolutely no reason to believe that this is the case.
In multiway games where even one person is deviating from GTO, all bets are off about what kind of payoff the GTO strategy gets.
GTO might still win, maybe, but unless you are playing heads-up, there is no mathematical guarantee that it will or theoretical reason to believe it should. And this is even BEFORE you get into the fact that we aren't even close to knowing what GTO is anyway! It's theoretically possible that you could know all there is to know about GTO strategy and still lose by playing it. So think about that next time you're wondering if studying "GTO" will help your game.
PHEW, OK. I guess I'll end my rant there. If you read this far, thanks for humoring me.