Of all the longtime forum regs, I think I'm the one who racks up posts at the slowest rate. I joined 2+2 over 7 years ago and I'm just now hitting 3,000.
I've been on the forum for so long that a COTM that I wrote once,
http://forumserver.twoplustwo.com/17...91/?highlight=
is almost 3 years old as I write this one. Back then, a lot of people were talking about "GTO" as if it was some new mathematical theory that was going to solve the game of poker--and more importantly, that you could play "GTO" or close to it and lock in a guarantee of long-term profit.
As a professional mathematician, I could see pretty clearly that most of this forum had a deep misunderstanding of what GTO is supposed to be able to do mathematically, and also of what its limitations were. So I tried to write something that would help.
Today there are still tons of references to GTO on this forum, most of which are still silly. In particular, the thing that always bugged me the most--and which I never got around to discussing in the COTM either--is that people want to believe that GTO behaves the same for 3+-player games as it does for 2-player games. This is absurd. Every time I see it I want to jump into the thread and correct it, but it happens too often. So now I'm using my 3,000th post on it.
The Situation In 2-Player Games: Just so we're on the same page, in 2-player games, most of what you think about how GTO works actually is true. If 2 people are playing a zero-sum, fair game, there's always going to be at least one Nash equilibrium, and it will always be break-even for both players. The number of Nash equilibria (there could be more than one) is sort of a moot point in 2-player games. Here's why. There's an important theorem that in a zero-sum 2-player game, all the different Nash equilibria have to give the same payouts to both players. So we can define the value of any 2-player zero-sum game for either player as the amount that player would receive in a Nash equilibrium.
Now if one player is playing a NE strategy, and the other deviates, that new strategy has to pay out either the value of the game, or less. This is by definition of the Nash equilibrium. So, if the deviating player breaks even or loses, the player sticking to the NE must break even or win when the other player deviates. That's because the game is zero-sum, so with 2 players, if one loses EV, that EV has only one place to go: the other player. This is the source of the notion that playing GTO locks in at least the value of the game for anyone playing it. And if the value of the game is 0, then GTO must break even or win in a 2-player game.
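If you want to see that guarantee concretely, here's a quick sketch (my own toy example, nothing to do with poker) using rock-paper-scissors, which is a 2-player zero-sum game with a value of 0. A player who sticks to the uniform NE mix concedes at most the value--here, exactly 0--to every possible opponent strategy:

```python
from itertools import product

# Row player's payoff matrix for rock-paper-scissors (0=rock, 1=paper, 2=scissors).
# The game is zero-sum: the column player gets the negative of each entry.
payoff = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

ne = [1 / 3, 1 / 3, 1 / 3]  # row player locks in the NE mix

def column_ev(col_strategy):
    """Column player's EV when the row player sticks to the NE mix."""
    return sum(ne[i] * col_strategy[j] * -payoff[i][j]
               for i, j in product(range(3), repeat=2))

# No deviation by the column player earns more than the value of the game (0):
for dev in ([1, 0, 0], [0, 1, 0], [0.5, 0.5, 0], [0.2, 0.3, 0.5]):
    assert abs(column_ev(dev)) < 1e-12
```

Every deviation comes out to exactly 0 here because RPS's equilibrium is fully mixed; in general a deviation gets at most the value of the game, and possibly less.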
Where people get it wrong for 2-player games is they tend to assume that any deviation from GTO results in a loss. This is false--see the thread linked above for many, many examples. But overall most of what people believe about how GTO works in 2-player zero-sum games with a value of 0 is right.
What Goes Wrong When You Add More Players: The mathematical definition of a Nash equilibrium says that if all but one of the players are playing their NE strategy, the final player cannot gain by deviating from their own NE strategy. In 2-player games, this implies a lot more than it does in multi-player games. So when you get your intuition about GTO from 2-player games, there are (at least) two extremely important places where that intuition blows up completely:
1. When there are multiple Nash equilibria, they need not have the same EV for all players. So there is no longer any such thing as the "value" of the game. There is no well-defined value for GTO to "lock in" like there is in the 2-player case. The proof that such a value must exist was specific to 2 players and totally breaks down when we add more.
Here's an example. Let's say 3 people are going to play a game where they each hold a coin in their hand and secretly set it to heads or tails. They reveal their play at the same time; if one is showing something different than what the other two are showing, that player wins +2 and the other two each lose 1. (If they all show the same it's a tie.)
This game is clearly very symmetrical. So what do you suppose a Nash equilibrium for this game might look like?
Based on my time on this forum, if you're reading this, I think there's a good chance you would guess that there is exactly one Nash equilibrium, and that it involves all players playing 50% H/50% T.
If you guessed this, you are partially right. All players playing 50% H/50% T clearly is a NE. If any 2 players are playing that, then the third player is indifferent to playing H or T--it's always an EV of 0 no matter what. He'll win 25% of the time, tie 25% of the time, and lose the remaining 50%. But a win is worth twice as much as a loss, so the EV is 0. By a symmetrical argument, the EV is 0 for all 3 players in this NE.
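If you want to check that arithmetic yourself, here's a quick brute-force sketch in Python (my own, using the payoff rule defined above):

```python
from itertools import product

def payoff(a, b, c):
    """Each player sets H or T; the odd one out wins +2, the other two lose 1."""
    shows = [a, b, c]
    for i, s in enumerate(shows):
        if shows.count(s) == 1:          # player i is the odd one out
            return [2 if j == i else -1 for j in range(3)]
    return [0, 0, 0]                     # all three match: a tie

# With the other two players mixing 50/50, player 0's EV is 0 whether
# they lock in H or T--they're exactly indifferent.
for my_play in "HT":
    ev = sum(payoff(my_play, b, c)[0] for b, c in product("HT", repeat=2)) / 4
    print(my_play, ev)                   # both come out to 0.0
```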
If this were a 2-player game, you could be comfortable stopping your analysis there. But it's not. There's more. There's another Nash equilibrium that has different payouts than the first one!
Suppose you're playing this game with my girlfriend and me. Now suppose she always throws heads, and I always throw tails.
No matter what strategy you play, this is a Nash equilibrium. Whatever my EV is throwing tails, it definitely won't go up if I switch to heads. If you're throwing heads, I'm better off throwing tails; if you're throwing tails, I'm losing whether I throw heads or tails; so I can't gain by deviating. By a similar argument, my girlfriend can't either. That means we have found a Nash equilibrium where it is guaranteed that one player (you) will get a negative payout. You'll get -1 every time. (Since you always get -1 you're obviously indifferent to changing your strategy as well.) And this is despite the fact that the game is zero-sum and totally symmetrical!
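This equilibrium is small enough to verify by brute force too. Here's a sketch (mine, re-stating the payoff rule from the coin game so it's self-contained):

```python
def payoff(you, her, me):
    """Odd one out wins +2, the other two lose 1; all matching is a tie."""
    shows = [you, her, me]
    for i, s in enumerate(shows):
        if shows.count(s) == 1:
            return [2 if j == i else -1 for j in range(3)]
    return [0, 0, 0]

# She always plays H (index 1), I always play T (index 2).
# You get -1 no matter what you show:
assert payoff("H", "H", "T")[0] == -1   # you match her; I'm the odd one out
assert payoff("T", "H", "T")[0] == -1   # you match me; she's the odd one out

# Neither of us can gain by unilaterally switching, whatever you do:
for you in "HT":
    assert payoff(you, "H", "T")[2] >= payoff(you, "H", "H")[2]  # I prefer T
    assert payoff(you, "H", "T")[1] >= payoff(you, "T", "T")[1]  # she prefers H
```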
At this point you may be wondering whether this kind of situation--a Nash equilibrium where one player is guaranteed to lose even though the game is zero-sum and symmetrical--is possible in poker. The answer is, I don't know. Regardless, it very well might be the case that poker has multiple Nash equilibria, and they don't have the same payout structure. That means if you think of GTO as one strategy, or a NE as one set of strategies, you are thinking too small.
2. The scenarios for what's allowed to happen when one player deviates from a Nash equilibrium include bad results for some of the non-deviating players. This is the big one. It's much more important and more relevant than the other one.
In a Nash equilibrium, when one player deviates, that player's EV must stay the same or go down. This is by definition of the NE. In a 2-player game, when one player deviates and breaks even, the other must also break even. When one player loses, the other must win.
Both of these things completely stop being true when there is no unique "other".
Here's a toy game to show this point. Imagine we're playing a poker-like game with 2 players--call them A and P. Let's represent A's range by the interval (0,1) and P's range by the interval (-.4,0) U (1,1.6). So P is polarized. There's $1 in the pot. A acts first. If A checks (which he always does), P can check back or bet $1, and if he bets, A can call or fold.
We all know the solution to this game--P bets all of his range greater than 1, plus the range (-.3,0). So he's betting 90% of the time, and when he does he has a 2:1 value:bluff ratio. A should call that bet with the top half of his range. A wins the pot 10% of the time when P checks back, and his EV is 0 when P bets. A's EV is .1 and P's EV is .9. As long as P bets his entire value range, it doesn't matter how much or how little he bluffs--his EV will always be .9 as long as A doesn't change his calling frequency.
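Those EVs are easy to sanity-check numerically. Here's a sketch of that check (my own; it assumes both ranges are uniform over the stated intervals, which is the standard reading):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 2000
a = (np.arange(n) + 0.5) / n                       # A's range: uniform on (0,1)
# P's range: (-.4,0) U (1,1.6), sampled evenly over its total length of 1.0
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

p_bets = (p > 1) | ((p > -.3) & (p < 0))           # value plus the bluffs (-.3,0)
a_calls = a > 0.5                                  # A calls with his top half

ev_A = 0.0
for ph, pb in zip(p, p_bets):
    if not pb:                                     # P checks back (always a loser)
        ev_A += np.mean((a > ph) * pot)            # A just wins the pot
    else:                                          # A folds half, calls half
        showdown = np.where(a > ph, pot + bet, -bet)
        ev_A += np.mean(np.where(a_calls, showdown, 0.0))
ev_A /= len(p)
print(round(ev_A, 3), round(pot - ev_A, 3))        # 0.1 0.9
```

The .1 for A comes entirely from P's 10% check-backs; every bet, called or folded, is exactly 0 EV for A.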
Now let's add a third player to this game--call him B. B will sit between A and P. He'll have exactly the same range as A--the interval (0,1). He also checks to P every time, as does A. If P bets $1, B can call or fold when the action gets back to him. He, like A, cannot raise.
How does B's presence change the Nash equilibrium for this game? (This toy game has only one NE, by the way. I'm omitting the proof because this is gonna be long enough.)
-P's GTO strategy is totally unchanged--he bets his entire value range and he bets a 2:1 value:bluff ratio, so any caller has 0 EV. Since he bets 90% of the time, his EV in this game is .9.
-A's GTO strategy is to never call any bet. (It sounds weird, but it's true.) Since he and B have the same range, he should expect to win half the pot on average when P checks back a loser. Since this happens 10% of the time, his EV is .05.
-B's GTO strategy is to never overcall if A calls, but if A folds (which happens all the time in a NE), he will call P's bet with the top half of his range. He gets no EV from this, though, so his EV all comes from P's check backs. Like A, his EV is .05.
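Here's a numerical sanity check of those three EVs (mine again; uniform ranges assumed, and the 50/50 showdown split when P checks back follows from A and B holding identical ranges):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 1000
b = (np.arange(n) + 0.5) / n                       # B's range: uniform on (0,1)
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

bets = (p > 1) | ((p > -.3) & (p < 0))             # P's NE betting range, 90%

# When P checks back (always a loser), A and B show down identical uniform
# ranges, so each takes the pot half the time.
check_frac = np.mean(~bets)                        # = 0.1
ev_A = check_frac * pot * 0.5
ev_B = check_frac * pot * 0.5

# When P bets, A (playing GTO) always folds and B calls his top half.
for ph in p[bets]:
    calls = b > 0.5
    ev_B += (np.mean(calls & (b > ph)) * (pot + bet)
             - np.mean(calls & (b < ph)) * bet) / len(p)

ev_P = pot - ev_A - ev_B                           # the pot is zero-sum
print(round(ev_A, 3), round(ev_B, 3), round(ev_P, 3))   # 0.05 0.05 0.9
```

Note that B's calls contribute nothing: his EV, like A's, comes entirely from check-backs.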
This may already be interesting to you if you never thought about it before, but where it gets really interesting is what happens when P's bluffing frequency changes. In the 2-player version of this game, the EVs don't change when P unilaterally deviates. But now they will.
For example, let's imagine that P overbluffs. He will now be betting 100% of the time, while the other 2 players stick to GTO. Since A's GTO strategy is to always fold, his EV now drops to 0. Since B's calling frequency means that P's bluffs are all 0 EV, P's EV doesn't change either--it's .9. (He takes down the $1 pot 50% of the time, wins $2 30% of the time, and loses $1 20% of the time.) B's EV has gone up to .1.
So the NE payouts were .05-.05-.9. The third player deviating has caused them to go to 0-.1-.9. In other words, one player has deviated in a way that leaves his own EV unchanged but shifts EV from one of the GTO players to the other. That means that when one player deviates, it is actually possible for someone to stick to GTO and have their EV go down compared to what it was in the Nash equilibrium!
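Here's a numerical check of the deviated payouts (uniform ranges assumed; A always folds and B still calls his top half, while P now bets everything):

```python
import numpy as np

pot, bet = 1.0, 1.0
n = 1000
b = (np.arange(n) + 0.5) / n                       # B's range: uniform on (0,1)
p = np.concatenate([-.4 + .4 * (np.arange(2*n//5) + 0.5) / (2*n//5),
                    1 + .6 * (np.arange(3*n//5) + 0.5) / (3*n//5)])

ev_A = 0.0                    # P never checks back, so A (who always folds) gets nothing
ev_B = 0.0
for ph in p:                  # P deviates: he bets his ENTIRE range
    calls = b > 0.5           # B sticks to GTO: call the top half
    ev_B += (np.mean(calls & (b > ph)) * (pot + bet)
             - np.mean(calls & (b < ph)) * bet) / len(p)

ev_P = pot - ev_A - ev_B
print(ev_A, round(ev_B, 3), round(ev_P, 3))        # 0.0 0.1 0.9
```

P's extra bluffs are all 0 EV against B's calling frequency, so the .05 that used to flow to A at showdown ends up with B instead.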
This to me poses a really interesting question for poker. According to the definition of the Nash equilibrium, if you put one normal player in a ring game with 8 GTO players, the normal player could not do better than his own equilibrium EV. But it is theoretically possible that based on his style of play he could drag some of the GTO players into losing territory! And this is with ONE player deviating!
Now imagine the theoretical possibilities if one GTO player is in a ring game with 8 normal players. I think most people's intuition is that the GTO player would clean up. But in fact mathematically we have absolutely no reason to believe that this is the case.
In multiway games where even one person is deviating from GTO, all bets are off about what kind of payoff the GTO strategy gets.
GTO might still win, maybe, but unless you are playing heads-up, there is no mathematical guarantee that it will or theoretical reason to believe it should. And this is even BEFORE you get into the fact that we aren't even close to knowing what GTO is anyway! It's theoretically possible that you could know all there is to know about GTO strategy and still lose by playing it. So think about that next time you're wondering if studying "GTO" will help your game.
PHEW, OK. I guess I'll end my rant there. If you read this far, thanks for humoring me.