COTM: A Crash Course in Game Theory

10-02-2014 , 02:19 AM
COTM: A Crash Course In Game Theory

Every now and then I’ll hear someone on 2+2 talk about “GTO strategy” as if it is some magic bullet that, if they only knew what it was, would guarantee that they could beat anyone who was doing anything different from “GTO”. It turns out that this idea is based on a fundamental misunderstanding of what game theory actually defines as an optimal strategy. This post is meant to explain what the definition of a GTO strategy actually is, and to clear up some of the misconceptions about what it is and what it isn’t.

Below I will explain the basic math behind game theory, using games that are much simpler than poker. Hopefully seeing the basics will show why the story is not as simple for poker as many would like to believe.

The basic idea that I’m going to try to get across by the end is this: if you are playing poker to maximize winnings, don’t bother to attempt to learn or play a GTO strategy. It’s a waste of time. If your opponents are losing players, there is virtually a 0% chance that you will maximize your winnings against them by playing GTO. And if your opponents are winning players, even if they are playing GTO, you probably still don’t need to play GTO to avoid losing to them. If this surprises you, keep reading.

DISCLAIMER: While I am a mathematician, my expertise is not in game theory. That being said, I do know some game theory, and I stand by everything I’m putting in the rest of this OP as mathematical fact (but of course I might have made an error which I will admit if someone points it out). I may, in follow-up posts, diverge into the realm of opinion, if there is any debate on the content, but I will try my best to distinguish between what I know (or think) to be factually true and what is my opinion.

Example 1: The Prisoner’s Dilemma

This is a classic game for game theorists to study because it is simple but already contains a lot of interesting concepts. I’ll use it to introduce some of the terminology of game theory.

Here’s the setup. You and a partner-in-crime have both been arrested and remanded to separate interrogation chambers. The cops want to put you both away for 7 years, but unfortunately (for them, not for you) they only have enough evidence to convict each of you of a lesser crime and give you 3 years. However, they offer you and your partner the same deal: give us the evidence to convict your partner of the 7-year charges, and we’ll take 2 years off your sentence.

So here the “game” is that you can take or leave the deal, and so can your partner. There are 2*2=4 possible outcomes. Game theorists usually represent the game as a matrix, like this:

___C________D

C (-3,-3) (-7,-1)

D (-1,-7) (-5,-5)

Here’s how to read the matrix. The rows represent all the different things you can do in the game—your strategies. You have 2 strategies: Cooperate (C) with your partner, or Defect (D) to the police and give him up. Your opponent (your partner who’s offered the same deal) has the same 2 strategies, represented in the columns.

The entries in the matrix represent what are called the utilities of each possible outcome*. They are ordered pairs—your utility is the first coordinate, and your partner’s utility is the second. So for example, if you Cooperate and your partner Defects, you will get 7 years in jail and he only gets 1 (he gets 2 years off a 3-year sentence, and you get slammed with the full 7 years). That is shown in the upper right ordered pair in the matrix.

(*You can think of utility as sort of like EV, but it’s not exactly the same. The EV is the sum, over all possible situations, of the utility of each situation times the probability that that situation happens. So for example, if you wanted to compute the EV of playing D, you’d multiply your utility when your opponent plays C by the percentage of the time he actually plays C, then do the same for when he plays D, then add them. This is what we do all the time on this forum when computing EVs; the word “utility” just refers to the individual numbers that you’re multiplying the percentages by when you do it.)
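
The utility-vs-EV relationship is easy to sketch in a few lines of code (Python here; the function name is my own):

```python
# Utility vs. EV: utilities are the per-outcome numbers; EV weights them
# by how often each outcome occurs. (A sketch -- the name "ev" is mine.)
def ev(utilities, opponent_mix):
    """Sum over outcomes of utility * probability of that outcome."""
    return sum(u * prob for u, prob in zip(utilities, opponent_mix))

# Your utilities for playing D in the Prisoner's Dilemma:
# -1 if your partner plays C, -5 if he plays D.
print(ev([-1, -5], [0.5, 0.5]))  # -3.0 against a coin-flipping partner
```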

So given this payout matrix, if you are playing this game, what should you do? Now, there’s already a problem here because “should” is a subjective term. I might be asking any of the following questions (or even a different one I’ve left out):

1) If you played this game repeatedly, what strategy gives you the greatest chance of the highest possible score?

2) If you played this game repeatedly, but were able to negotiate with your partner before playing, what strategy would the two of you settle on to ensure the highest total aggregate score?

3) If you played this game repeatedly, what strategy maximizes the chance that you will end up with a higher score than your opponent?

Game theorists don’t answer all of these questions. Instead, what they usually do is pick one definition of “should” and stick with it. In game theory, the standard is called maximin—we try to create a plan of attack such that our worst possible score using the plan is better than (or at least not worse than) any other plan’s worst possible score. (If there’s a tie, you look at the next-worst possible outcome, etc.)

So, OK, enough stalling: how are you supposed to play the Prisoner’s Dilemma using a maximin strategy? Looked at this way, the Prisoner’s Dilemma is a very easy game, and here’s why:

1) If your opponent plays C, you score -3 by playing C and -1 by playing D.
2) If your opponent plays D, you score -7 by playing C and -5 by playing D.

So, no matter what your opponent does, you would rather play D than play C. That means that C is what game theorists call a dominated strategy: it’s a strategy such that there exists another strategy that outperforms it across the board. In this case, C is dominated by D. Once we know this, it’s clear that neither player “should” play C, which means that both players “should” play D all the time.
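
Dominance is mechanical to check; here is a tiny sketch (Python; the helper name is mine):

```python
def dominates(a, b):
    """True if payoffs a are at least as good as payoffs b against every
    opponent action, and strictly better against at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Your payoffs (vs. opponent C, vs. opponent D) from the PD matrix above:
C, D = [-3, -7], [-1, -5]
print(dominates(D, C), dominates(C, D))  # True False
```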

Now let me explain why “should” is in quotes—because again, it is subjective. What I mean here is that if both players are playing D, neither player has any incentive to switch to C. They’d be better off sticking with D. However, if both players were playing C, either player would have an incentive to switch to D, and then eventually they would reach an equilibrium of always playing D anyway. This is what in game theory is called a Nash equilibrium. The definition of the Nash equilibrium is exactly what I just laid out: a state where neither player has any incentive to change the way they are playing. When someone refers to a “GTO strategy”, what they really mean is one player’s side of a Nash equilibrium. (This is why a GTO strategy is often called an “unexploitable” strategy—because being exploitable means exactly that you give your opponent an incentive to play a certain way against you, and a Nash equilibrium strategy is exactly one that avoids that.)

So in this game, the GTO strategy for each player is what’s known as a static strategy—play D 100% of the time. But isn’t it unfortunate that the GTO strategy locks players into taking -5 every round of the game, when if both players played a dominated strategy they would both improve their score to a -3?

That question would probably take an entire other post to answer, so I will leave this example here. Instead I’ll use a different and much more familiar game to show some more nuances of Nash equilibria, like how to compute them when they are not static strategies, and exploitation when one player is not using a GTO strategy.

Example 2: Rock-Paper-Scissors

(Note: I’m going to assume I don’t have to explain the rules of this game.)

RPS differs from the Prisoner’s Dilemma in two important ways: one that it shares with poker, and one where it’s not clear whether poker is more like RPS or more like PD:

1) RPS is a zero-sum game. That means that one player’s gain must be at the direct expense of one (or more) opponents. PD is not zero-sum. If both players were playing C, and suddenly they both switch to D, they both lose compared to the previous state. In a zero-sum game, this is by definition impossible. (Poker, at least when playing without a rake, is a zero-sum game. As a sidenote, both games are also fair, in that both players will get equal EVs from the Nash equilibrium.)

2) The Nash equilibrium is not a static strategy. In PD, it is—both players play D all the time. In RPS, playing any one strategy 100% of the time is clearly exploitable. That doesn’t mean that there is no Nash equilibrium—it just means that the Nash equilibrium will be a mixed strategy, and we have to do a little work to compute it. I bet you probably know what it is already, but I’ll show the calculations below, because they can be generalized to any 2-player zero-sum game. (As far as I know, it is not known whether the GTO strategy for any poker game is static or mixed!)

Computing the Nash Equilibrium

So without further ado, here is the payout matrix for RPS. This time only your (the row player’s) utilities are listed, because this is a zero-sum game and the other player’s utilities are the exact opposite.

__R_P_S

R 0 -1 +1

P +1 0 -1

S -1 +1 0

So as I said above, the Nash equilibrium must be a mixed strategy. Playing only one row all the time is clearly exploitable. Also, in this game, any one row you pick has an opposing column that your chosen strategy will maximally exploit, and vice versa. That means unlike PD, no strategy is dominated.

Here’s how to solve for the Nash equilibrium. Let’s say you are the row player. You know you will be playing a mixed strategy, so let r be the percentage of the time you play R, p the percentage of the time you play P, and s be the percentage of the time you play S. Since these are the only 3 strategies, one fact is obvious:

r + p + s = 1

If we want to have any hope of solving for the 3 variables, though, we need at least 2 more equations. Here is how we get them. The definition of the Nash equilibrium is that your opponent should never have an incentive to change his strategy. That means that your opponent’s EV of any play must be the same (this is called the indifference principle). In particular, we have these 3 equations:

EV(R) = EV(P)
EV(R) = EV(S)
EV(P) = EV(S)

(We know we can use these because if, say, R had a higher EV than P or S, our opponent would have an incentive to only play R, and that contradicts the definition of the Nash equilibrium.)

So let’s start with the first equation:

EV(R) = EV(P)
0r - 1p + 1s = 1r + 0p - 1s
-p + 1 - r - p = r - 1 + r + p (substituting s = 1 - r - p)
2 - 3r - 3p = 0

And now we can treat the second equation the same way:

EV(R) = EV(S)
-p + 1 - r - p = -r + p + 0s
1 = 3p
p = 1/3

Now that we know p = 1/3, we can go back to the first equation and solve for r:

2 - 3r - 1 = 0
1 = 3r
r = 1/3

And of course that implies s = 1/3 as well.
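
Those same three equations can be solved mechanically, which generalizes to bigger games. A sketch in Python (exact arithmetic via fractions; the little solver and all names are mine):

```python
from fractions import Fraction

def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix [A | b], exact."""
    m = [[Fraction(x) for x in row] for row in aug]
    n = len(m)
    for i in range(n):
        pivot = next(r for r in range(i, n) if m[r][i] != 0)
        m[i], m[pivot] = m[pivot], m[i]
        m[i] = [x / m[i][i] for x in m[i]]
        for r in range(n):
            factor = m[r][i]
            if r != i and factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[i])]
    return [row[-1] for row in m]

# Unknowns (r, p, s). From the opponent's point of view,
# EV(R) = -p + s, EV(P) = r - s, EV(S) = -r + p.
eqs = [
    [1, 1, 1, 1],    # r + p + s = 1
    [-1, -1, 2, 0],  # EV(R) - EV(P) = 0
    [1, -2, 1, 0],   # EV(R) - EV(S) = 0
]
print(solve(eqs))  # r = p = s = 1/3
```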

So there you have it: the GTO strategy for RPS is to randomize your strategy so that you have a 1/3 chance of throwing each. Let’s check that this actually satisfies the Nash equilibrium condition.

If your opponent plays R, his EV is (1/3)(0) + (1/3)(-1) + (1/3)(1) = 0.
If your opponent plays P, his EV is (1/3)(1) + (1/3)(0) + (1/3)(-1) = 0.
If your opponent plays S, his EV is (1/3)(-1) + (1/3)(1) + (1/3)(0) = 0.

This strategy guarantees that anything your opponent does is always exactly 0 EV. In other words, it guarantees that he never gains anything from switching his strategy. Since the game is zero-sum and fair, it also implies that nothing he does can beat you. Similarly, your opponent has the same Nash equilibrium strategy—randomize so his chances of each play is 1/3. If he does this, he guarantees that you cannot beat him. A true Nash equilibrium involves both players doing this and therefore giving neither player any incentive to deviate from it.
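
That indifference check can be scripted too; a minimal sketch (Python, names mine):

```python
# Opponent's EV for each pure throw against our uniform 1/3 mix.
# Entries are the row player's utilities from the matrix above, so the
# opponent's utility is the negation.
payoff = {"R": [0, -1, 1], "P": [1, 0, -1], "S": [-1, 1, 0]}
our_mix = {"R": 1 / 3, "P": 1 / 3, "S": 1 / 3}

def opp_ev(opp_throw):
    """Opponent's EV when he always throws opp_throw against our mix."""
    col = "RPS".index(opp_throw)
    return sum(-payoff[t][col] * our_mix[t] for t in "RPS")

for t in "RPS":
    print(t, opp_ev(t))  # all 0 (up to float rounding)
```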

Deviating from the Nash Equilibrium

Now I want to talk about so-called “exploitive” strategies and how they fit into the picture. I’m going to try to use examples in RPS that can be crudely analogized to poker, since after all this is a poker forum.

It ought to be clear from the previous section that if both players play RPS using a GTO strategy, their EV will be 0. However, one extremely common misperception about GTO strategy is that if you play GTO, and your opponent doesn’t, you will now start to show a positive EV. The calculations in the previous section disprove that notion—if you stick to a GTO strategy, your EV will stay at 0 no matter what your opponent does**. Your opponent could switch to a strategy that is 100% R—about as different from GTO as you could get—and if you continued playing GTO, your EV against that strategy would still be 0, just the same as if he were playing GTO.

(**Note that this is not always true for every game and every strategy. However, it is always true for every zero-sum fair game if your opponent sticks to strategies that are not dominated. In RPS, no strategy is dominated. But imagine if we added a 4th option to the game; call it Pebble. Pebble beats scissors and loses to paper, but it also loses to rock. Pebble is dominated by rock. Now a GTO strategy would be 1/3 rock, 1/3 paper, 1/3 scissors, and no Pebble. Throwing rock, paper, or scissors would be 0EV against the GTO strategy, but if you played GTO and your opponent played any strategy that included Pebble, you would now be +EV.)
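
As a sanity check on that footnote, here is a sketch (Python; the fourth column encodes the Pebble rules stated above, and all names are mine):

```python
# Our utility for each throw vs. each opponent throw (R, P, S, Pebble).
# Pebble loses to rock and paper, beats scissors, per the footnote.
payoff = {
    "R": [0, -1, 1, 1],   # rock beats pebble
    "P": [1, 0, -1, 1],   # paper beats pebble
    "S": [-1, 1, 0, -1],  # scissors loses to pebble
}

def ev_vs(opp_mix):
    """Our EV playing the GTO 1/3 R / 1/3 P / 1/3 S mix vs. opp_mix."""
    return sum(sum(u * q for u, q in zip(payoff[t], opp_mix)) / 3
               for t in "RPS")

print(ev_vs([1, 0, 0, 0]))            # 0.0: pure rock is still 0 EV
print(ev_vs([0.25, 0.25, 0.25, 0.25]))  # positive: any Pebble makes us +EV
```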

That is the flip side of GTO strategy: you are not exploitable, but you are also not exploiting. If you know that your opponent is deviating (but still playing only non-dominated strategies) and you want to punish him for that, you have to deviate yourself.

Here is an example of what I’m talking about. Let’s say you are sitting down to play RPS with someone and you have agreed in advance to play repeated “hands” of RPS. Your goal will be not just to outscore your opponent if you can, but also to maximize your total score over all the hands you play. Your opponent will be trying to do the same thing. Since you are both sophisticated players, you are allowing the use of dice (and not necessarily 6-sided dice) to randomize your play if you wish.

Now let’s say you both roll dice behind each other’s backs, and you have decided that you will use a 6-sided die and will throw R on a 1 or 2, P on a 3 or 4, and S on a 5 or 6. Clearly this is a GTO strategy; your plan is to use it until you spot a weakness in your opponent, then try to exploit that weakness.

So here are two different scenarios that you can try to exploit:

1) Let’s say that you notice, after 10-15 hands, that your opponent is only throwing S. The adjustment to that strategy, if you have correctly identified it, is obvious: start throwing 100% R. If your opponent only throws S and you only throw R, your EV is +1.

2) Let’s say you notice something more subtle than in example 1: after about 60 hands, you notice your opponent is throwing S about half the time and R and P about a quarter of the time each. (Maybe he is achieving this by, for example, rolling a 12-sided die and throwing R on a 1, 2, or 3, P on a 4, 5, or 6, and S on any 7—12.) So your opponent is unbalanced, but he is not as unbalanced as he is in the first example.

What should your strategy be to maximally exploit this strategy?

Your instinct might be that because your opponent is only slightly unbalanced towards S, your best exploitive strategy would be to slightly unbalance your strategy towards R. You may be surprised (as I was when I first thought about this) to learn that that’s not true!

Let’s do the calculations and see what the real best strategy is against 25% R / 25% P / 50% S:

If you play R, your EV is (1/4)(0) + (1/4)(-1) + (1/2)(1) = 1/4.
If you play P, your EV is (1/4)(1) + (1/4)(0) + (1/2)(-1) = -1/4.
If you play S, your EV is (1/4)(-1) + (1/4)(1) + (1/2)(0) = 0.

The numbers make it clear: throwing R maximizes your EV. So your maximally exploitive strategy is to throw 100% R, just as it was in the first case.
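
The computation above can be confirmed in a few lines (Python; the naming is mine):

```python
# EVs of each pure throw against the unbalanced 25% R / 25% P / 50% S
# opponent, using the RPS payoff matrix from the post.
payoff = {"R": [0, -1, 1], "P": [1, 0, -1], "S": [-1, 1, 0]}
opp = [0.25, 0.25, 0.5]

evs = {t: sum(u * q for u, q in zip(payoff[t], opp)) for t in "RPS"}
best = max(evs, key=evs.get)
print(evs, best)  # R has EV 0.25 -- the maximally exploitive throw
```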

Hopefully this example illustrates the larger point that when you are trying to exploit a deviation from GTO, the exploitive strategy should not look anything like GTO, yet as long as your opponent doesn’t catch on and change his strategy, your “hugely exploitable” strategy will maximize your EV.

And again, let me reiterate—if you stuck to GTO, your EV against both of these unbalanced strategies would be the same as it always is: zero.
10-02-2014 , 02:56 AM
Thanks for the great read, it was nice to review some concepts you don't really explicitly think about.
10-02-2014 , 05:59 AM
in b4 GTO debate.
10-02-2014 , 07:16 AM
Excellent! Fantastic post. Looking forward to a good discussion.

I've never been a huge fan of Plato but I think the concept of platonic ideals might apply here.

Think about cars. It's possible to imagine that there is some perfectly generalizable concept of "car" that is valid and true on a very deep level. However, it's hard to visualize what a "perfect car-form" would actually look like. Too abstract. And, at the end of the day, you gotta get down the road. A Jeep or a Prius will get you further down the road than an abstract concept.

GTO strikes me as sort of an idealized form of poker, except that I guess eventually we will have enough computer horsepower to solve the game, and see what the pure form actually looks like. Right now it's an abstraction.

I think studying GTO helps us understand what features of poker speak to its essence. It gives us a template against which we can compare our own individual concepts of what the game of poker is.

Look, over the years I keep finding myself asking, what game is this? At first, I thought it was a card game. Then, I thought it was a behavior game. Studying GTO helps me understand it's kind of a math game. That's not going to make me money the next time I'm trying to decide to check or bet my combo draw. But I think in the long run it will make me a better poker player.
10-02-2014 , 10:19 AM
Good summary. Thanks!

Quote:
Your instinct might be that because your opponent is only slightly unbalanced towards S, your best exploitive strategy would be to slightly unbalance your strategy towards R. You may be surprised (as I was when I first thought about this) to learn that that’s not true!
Really? My first instinct was that you need to throw R all of the time, because you're going to maximally exploit his tendency toward S. I think a poker parallel might be: when you have 60% equity in a cash-game pot, should you be willing to put in 60% of your stack, or 100%? We should be maximally exploiting our edge, which means we would want to put all of our stack in (poker is more complicated, however, because there is more than one decision point and our edge could theoretically be greater at a different point).

Does that make sense?

Also, I agree that GTO is largely unhelpful for almost all poker players. If you were playing against a BOT that was playing GTO, or in a nearly solved game where someone is playing close to perfect, then GTO might be useful, but more useful would be wisdom that told you not to play in that game. In other words, if you need to play GTO, you shouldn't be playing.
10-02-2014 , 10:42 AM
Great post. Thank you for boiling it down into terms that I can actually make sense of. I've always been very interested in GT (almost took a course on it in college) but at times the scenarios and terminology have been a bit overwhelming and difficult to apply practically to poker.

10-02-2014 , 11:40 AM
Poker is a little cooler because we have streets. Let's say on every street we are going to either check or bet pot, and our opponent is either going to call or fold. When we check, our opponent wins 100% of the time; when we bet our bluffs and get called, they win 100% of the time; and when we bet our value hands and they call, we win 100% of the time.

On the river, we are going to be betting 1 bluff hand for every 2 value hands, to make our opponent indifferent to calling. When we bet, our opponents EV becomes 0.

On the turn, should we apply the same ratio? Let's say we have 10 bluff combos and 4 value combos on the turn. If we bet 2 bluff combos and 4 value combos on the turn, and the same 2 and 4 on the river, villain can never win: we never check, and his river EV is 0 100% of the time, which is equivalent to him just losing his turn pot-sized bet. His turn EV would be 2TPS*0 - TPS*1 = -TPS (where TPS = turn pot size), so he will always fold the turn. Let's find a turn ratio that makes him indifferent between calling and folding the turn.
2TPS*(B/(6+B)) = TPS*(6/(6+B)), where B = the number of bluff combos we check on the river, and 6 is the number of combos we are betting on the river to keep villain indifferent between calling and folding.
We can drop TPS from both sides and just have: 2*(B/(6+B)) = 6/(6+B).
Plugging that into Wolfram Alpha, we get B = 3, so total bluffs on the turn = 5, total combos = 9. Let's check this.
http://www.wolframalpha.com/input/?i...%2F%286%2BB%29

On the river, we are betting 6 combos, and checking 3. Villain wins 1/3 of the time on the river, and 2/3 of the time they lose their turn bet. They are getting 2:1 on their turn call, so now they are indifferent between calling and folding.

Now let's take it to the flop. We are betting 9 combos on the turn that make villain indifferent between calling and folding. When we check, villain always wins; when we bet, villain always loses his flop bet. Therefore, we want to be betting the turn 2 times for every 1 time we check, since we are giving him 2:1 on his call. That means we should bet 4.5 additional combos on the flop, which we will check on the turn (notice this also applies to our turn bet - we knew we would be betting 6 combos on the river, so we wanted 3 additional combos on the turn). So if we bet 13.5 combos on the flop, we have made our opponent indifferent to calling and folding.
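
The Wolfram Alpha step isn't strictly necessary; the same equation cross-multiplies directly. A sketch (Python; the function and variable names are mine):

```python
from fractions import Fraction

# Villain calls a turn pot-sized bet getting 2:1. He wins 2 turn pots
# when we check the river (B of 6+B combos) and loses 1 when we bet
# (6 of 6+B combos, his river EV being 0). Indifference:
#   2*B/(6+B) = 6/(6+B)  =>  2*B = 6  =>  B = 3.
def river_checkbacks(river_bets=6):
    """Bluff combos to check back on the river for turn indifference."""
    return Fraction(river_bets, 2)

B = river_checkbacks()                # 3 combos checked on the river
turn_bet = 6 + B                      # 9 combos bet on the turn
flop_bet = turn_bet * Fraction(3, 2)  # half again more: 13.5 on the flop
print(B, turn_bet, flop_bet)
```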


Now of course, all this assumes villain has 0% equity when called and we have 0% equity with our bluffs, neither of which is going to be true. The more equity we have vs. villain's range, the more bluffs we can have, as our value range will expand. The more equity villain's range has versus our value, the less we can bet, as hands start dropping out of our river value range. But it is a good jumping-off point for thinking about GTO in a poker-like game.
10-02-2014 , 11:45 AM
Quote:
Originally Posted by AbqDave
GTO strikes me as sort of an idealized form of poker
One thing I left out of my OP, but will say here, will probably blow this "ideal" out of the water:

Not only don't we know what a GTO strategy for poker looks like...we actually don't even know that poker has a unique Nash equilibrium. There could be more than one.

Games with more than one Nash equilibrium actually do exist. And what's more, we all know one: tic-tac-toe. I don't want to go into too much more detail unless there is serious clamoring for it, but in tic-tac-toe, with perfect play you can force a draw, and there is more than one way for the players to play perfectly. (There is even more than one way after you account for the symmetry of the board.)

GTO basically is not "ideal" anything. DUCYdonk summed it up very well: if you have to play GTO, you might as well not play (in fact, if you have to play GTO in a raked game, there's a significant chance you'll be a long-term loser in that game).
10-02-2014 , 11:56 AM
Quote:
Originally Posted by CallMeVernon
Hopefully this example illustrates the larger point that when you are trying to exploit a deviation from GTO, the exploitive strategy should not look anything like GTO, yet as long as your opponent doesn’t catch on and change his strategy, your “hugely exploitable” strategy will maximize your EV.
Great post, and IMO the bolded is the single most important point within it.

I'll use a poker example, just because it comes so readily to mind:

Let's say you're playing with medium-size stacks (100-150BB) against typical LLSNL droolers, tag along in a raised pot with a decent suited hand - something like 87s or A5s - and flop a good combo draw: not a 15-out monster but a flush draw plus a gutshot, middle pair, two overs, etc. There's a fair-sized bet to you ... should you call or raise? If the latter, how big?

For years - years! - by far the most +EV move with these hands at live 1/2 (especially from OOP) was the flop overbet-raise / monkey-shove, and it's one I made virtually always. It was a hugely, absurdly profitable play from 2006 through about 2011, for two reasons:

(1) You could count on villains to fold somewhat more often than GTO against these overbets. Not anywhere close to 100% of the time ... just somewhat more often. And, just as in the OP, even the minor deviation of "somewhat more often" calls for the huge exploitation of shoving "virtually always".
(2) Just as importantly, it wasn't a situation that came up all that often - usually once or twice a session at most - meaning it didn't give opponents the opportunity to notice I was being maximally exploitive and adjust.

I suppose I'm not alone in noticing that this strategy just doesn't work well any more ... opponents are calling these sorts of overbets far more readily than 5 years ago: "I put you on a draw all along." They haven't adjusted to me personally, of course: their adjustment has been a collective one, after years of watching winning LLSNL players "exploit" situations like these to the fullest.

As a counterexample, it makes me laugh to see hoodie-and-Beats regs sit down at live, loose 1/2 limp-fests and start raising literally 80% of their COs and buttons right out of the chute. Yes, each raise may well be maximally exploitive at that moment, in that hand. But given how transparent it is, and how even average opponents can and will adjust relatively quickly (perhaps not optimally, by 3-betting light, but at least by pre-emptively folding), is it really more +EV in the long run than seeing more flops, in position, against greater numbers of bad opponents holding mediocre hands?

In my mind, then, maximal exploitation of common situations isn't enough; it's just as important to find situations that can be maximally exploited without shifting opposing strategies. But that's just my opinion, and I realize that may be overthinking things at typical LLSNL tables.

In any event, really looking forward to this discussion.
10-02-2014 , 11:57 AM
Hopefully this will put to rest forever the ridiculous references to GTO in 1/2 threads.

Good read.
10-02-2014 , 12:38 PM
Most important paragraph for newbs:

Quote:
Originally Posted by CallMeVernon
The basic idea that I’m going to try to get across by the end is this: if you are playing poker to maximize winnings, don’t bother to attempt to learn or play a GTO strategy. It’s a waste of time. If your opponents are losing players, there is virtually a 0% chance that you will maximize your winnings against them by playing GTO. And if your opponents are winning players, even if they are playing GTO, you probably still don’t need to play GTO to avoid losing to them. If this surprises you, keep reading.
10-02-2014 , 02:13 PM
Quote:
Originally Posted by spikeraw22
Hopefully this will put to rest forever the ridiculous references to GTO in 1/2 threads.

Good read.
But what about the lols?

We always need more lols...
10-02-2014 , 02:23 PM
Quote:
Originally Posted by Lapidator
But what about the lols?

We always need more lols...
You don't get enough OPs where the correct response is "fold pre"?
10-02-2014 , 02:27 PM
Let's assume for a moment that a GTO strategy was possible for LLSNL. Let's further assume that it was known and describable.

Consider the level of complication it would involve at a full-ring table. Consider how much more complicated it would be than a 2-person RPS game.

There is very little chance that the average player you meet at the table (not the average human, btw) is capable of playing this strategy. They might be capable of understanding it. They might be capable of describing it fairly accurately to you in words. But they almost certainly would not be capable of executing it.

(Sure, there will always be those superhumans who can... and will... and do... They are few. And unless they realize what they're doing, they're going to be crushed by the other 8 players at the table who are playing exploitative strategies, and by the "9th player" who is just collecting 50bb/hr in rake.)

My point is that it will always be more profitable to play an exploitable strategy than a GTO strategy, simply because our opponents (at LLSNL) are not capable of playing a GTO strategy. We don't even have to know whether such a strategy might exist -- because we can easily use common sense to show that our opponents could not execute it if it did.
10-02-2014 , 02:31 PM
(Having said that ^) I would actually be interested in a discussion of:

If a GTO strategy did exist for LLSNL, how would it affect our notional assumptions with respect to stack depth and bankroll management?

At face value, it seems to me like you couldn't really apply any "maximin" effort without being deeper and with a much bigger bankroll than we commonly have. Methinks stack depth is probably the more important factor here.
10-02-2014 , 03:35 PM
Quote:
Originally Posted by Lapidator
Let's assume for a moment that a GTO strategy was possible for LLSNL. Let's further assume that it was known and describable.

... snip...
I agree with the conclusions you draw, but I don't like the way you've stated this. GTO play is a strategy for the game "No Limit Texas Hold Em" which is well-defined and has the same rules no matter what stakes you play in. GTO is simply the strategy you end up at when all players are making every choice so as to maximize their EV (and never deviating from this)*. It doesn't really make sense to talk about a GTO strategy being "possible for LLSNL" - it's possible in every form of poker, but like you said, it isn't necessarily profitable or the best way to play.

* The corollary is that an optimal player (one who always makes the most +EV decision) will explicitly NOT play GTO against LLSNL players, even though they theoretically could be.
10-02-2014 , 09:20 PM
For those looking for applications of GTO play in poker, I suggest reading Matt Janda's book available from 2+2. His experience is in 6-max high-level NL online poker. The necessary conditions for incorporating GTO appear to include a high level of confidence in the villain's range, a sophisticated villain, and lots of homework to know the correct odds. None of these occur in LLSNL, let alone any live play outside of the nosebleeds.

That doesn't mean there isn't value in his book for LLSNL. While it isn't spelled out, the methodology for figuring out what hands to float with in certain situations is there. I reference Janda in one section of my COTM next month, on a topic I've never really seen discussed in this forum. The problem is that you have to know a lot first before you can use what he discusses.
10-02-2014 , 09:54 PM
Quote:
GTO is simply the strategy you end up at when all players are making every choice so as to maximize their EV (and never deviating from this)*
No. Did you not read the OP? GTO is unexploitable, not EV maximization.
10-03-2014 , 02:29 AM
Quote:
Originally Posted by venice10
For those looking for applications of GTO play in poker, I suggest reading Matt Janda's book available from 2+2. His experience is in 6-max high level NL online poker. The necessary conditions for incorporating GTO appears to include a high level of confidence of the villain's range, a sophisticated villain, and lots of homework to know the correct odds. None of these occur in LLSNL, let alone any live play outside of the nose bleeds.

That doesn't mean there isn't value in his book for LLSNL. While it isn't spelled out, the methodology for figuring out what hands to float with in certain situations are there. I reference Janda in one section of my COTM next month which I've never really seen discussed in this forum. The problem with it is that you have to know a lot first before you can use what he discusses.
I will try to post something more detailed in a few days (it seems, as expected, that a lot of people are missing the point of my OP), but I need to jump in about the bolded claim that incorporating GTO requires a high level of confidence about the villain's range. It's actually mathematically wrong.

Any strategy that takes the Villain's range or skill level into account CANNOT be GTO. If you are using the fact that you are confident about how your opponent is playing, you are by definition playing an exploitative strategy. In any Nash equilibrium for poker, your range should be constructed so that Villain's range and tendencies do not matter--he can't exploit you no matter what he does.

The reason I made my OP about the much simpler example of RPS is that in that game you can verify the above paragraph for yourself. While the Nash equilibrium for poker is more complex, it still adheres to the same basic mathematical principles.

Things like this are why I view books that talk about GTO play with high suspicion.
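To see concretely what I mean, here is a quick Python check of my own (the +1/-1/0 payoff convention and the sample villain mixes are just for illustration): the uniform 1/3-1/3-1/3 mix earns exactly 0 EV against ANY opponent mix, so Villain's tendencies are irrelevant by construction.

```python
# Payoff matrix for Rock-Paper-Scissors from the row player's view:
# +1 = win, -1 = loss, 0 = tie. Moves ordered Rock, Paper, Scissors.
PAYOFF = [
    [ 0, -1,  1],   # Rock vs  R, P, S
    [ 1,  0, -1],   # Paper
    [-1,  1,  0],   # Scissors
]

GTO = [1/3, 1/3, 1/3]  # the Nash equilibrium mix for RPS

def ev(hero, villain):
    """Expected payoff per throw for hero, given both mixed strategies."""
    return sum(hero[i] * villain[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

# No matter how lopsided the villain's mix, the GTO player's EV is 0.
for villain in [(1, 0, 0), (0.5, 0.5, 0), (0.2, 0.3, 0.5)]:
    assert abs(ev(GTO, villain)) < 1e-12
```

Every column of the payoff matrix sums to zero, which is exactly why the uniform mix neutralizes everything.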
10-03-2014 , 11:42 AM
Would I be correct in stating that GTO would be more useful for turning a -EV situation into a neutral-EV situation, and destructive to a +EV situation? So, if I find myself heads up against an opponent who has a massive edge over me, I can drive the action to a neutral expectation if I can somehow discover and implement GTO against him. If, however, I am the one with the edge, I want to stay as far away from GTO as possible.
10-03-2014 , 12:12 PM
Quote:
Originally Posted by spikeraw22
Would I be correct in stating that GTO would be more useful in turning a -EV situation into a neutral EV situation, and destructive to a +EV situation? So, if I find myself heads up vs. an opponent with a massive edge vs me I can drive the action to a neutral expectation if I can somehow discover and implement GTO vs. him. If however, I am the one wih the edge I want to stay as far away from GTO as possible.
In short, yes, this is basically correct, with, I'd say, one exception.

Poker is such a complicated game that your opponent could theoretically have an edge over you while also playing a dominated strategy (if, for example, your strategy is also dominated). So if your opponent is playing a dominated strategy and you switch to GTO, you will show a profit. If not, GTO play will be neutral EV only.

But the fundamental idea is right--GTO is a defensive concept. If you have an edge over your opponent, GTO play kills that edge. But if your opponent has an edge over you, GTO play also kills that edge.
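You can watch the edge-killing happen in miniature with RPS (a sketch of mine, not from the OP; "always rock" stands in for an exploitative strategy and "always scissors" for the player being exploited):

```python
# RPS payoff matrix from the row player's view (+1 win, -1 loss, 0 tie),
# moves ordered Rock, Paper, Scissors.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def ev(hero, villain):
    return sum(hero[i] * villain[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

always_rock = [1, 0, 0]       # exploitative line vs a scissors-heavy player
fish        = [0, 0, 1]       # always throws scissors
gto         = [1/3, 1/3, 1/3]

print(ev(always_rock, fish))  # 1   -- the exploiter's edge
print(ev(always_rock, gto))   # 0.0 -- GTO kills that edge...
print(ev(gto, fish))          # 0.0 -- ...but GTO gains nothing from the fish
```

The same strategy that wins a full unit per throw against the fish earns exactly zero against the equilibrium mix, and the equilibrium mix itself earns exactly zero against the fish.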
10-03-2014 , 12:27 PM
Thanks for a thought provoking post. I've been spending some of my time on Matt Janda's work, so this COTM definitely caught my eye.

Quote:
And if your opponents are winning players, even if they are playing GTO, you probably still don’t need to play GTO to avoid losing to them. If this surprises you, keep reading.
Quote:
However, one extremely common misperception about GTO strategy is that if you play GTO, and your opponent doesn’t, you will now start to show a positive EV. The calculations in the previous section disprove that notion—if you stick to a GTO strategy, your EV will stay at 0 no matter what your opponent does**

These surprised me when I read them, but it took just a moment to realize that for RPS, that is in fact true. But it seemed hard to believe that would be true for poker.

That would mean that, in theory, you could take 8 GTO players and 1 random LLSNL player, playing together, and because all 8 GTO players have an EV of 0, the non-GTO player would have to have an EV of 0 also (let's assume an unraked game). That would mean the non-GTO player could employ ANY strategy and be EV = 0.

Well, that doesn't seem like it can be true, might that be because many strategies in poker are dominated ones, the equivalent of your 'pebble' example?
10-03-2014 , 12:42 PM
Quote:
Originally Posted by B&E
in theory, you could take 8 GTO opponents and 1 random LLSNL player, playing together, and because all 8 GTO players have an EV of 0, the non-GTO player would have to have an EV of 0 also. (Let's assume a non-raked game) That would mean the non-GTO could employ ANY strategy and be EV=0.

Well, that doesn't seem like it can be true, might that be because many strategies in poker are dominated ones, the equivalent of your 'pebble' example?
That is exactly the explanation. If the one non-GTO player is playing a dominated strategy, then he is giving away EV to the 8 GTO players (but not evenly, because the seating arrangement will matter).

However, if the non-GTO player is playing a non-dominated strategy, everyone should have 0 EV.
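Here is the "pebble" idea in miniature (my own toy payoffs, which may differ from the OP's version: I'm assuming a fourth move that loses to rock, paper, and scissors alike, so it is strictly dominated). The equilibrium mix never plays it, and against a villain who does, the GTO player's EV is positive: exactly the frequency the villain wastes on the dominated move.

```python
# 4x4 payoff from the row player's view, moves ordered R, P, S, pebble.
# Toy rule for "pebble": it loses to rock, paper, AND scissors and ties
# itself, so it is strictly dominated by every other move.
PAYOFF = [
    [ 0, -1,  1,  1],
    [ 1,  0, -1,  1],
    [-1,  1,  0,  1],
    [-1, -1, -1,  0],
]

GTO = [1/3, 1/3, 1/3, 0]  # equilibrium mix: never play the dominated move

def ev(hero, villain):
    return sum(hero[i] * villain[j] * PAYOFF[i][j]
               for i in range(4) for j in range(4))

# Villain throws pebble 30% of the time: the GTO player profits by
# exactly that 30%, the EV the villain gives away.
villain = [0.3, 0.2, 0.2, 0.3]
print(round(ev(GTO, villain), 10))  # 0.3
```

Against the 70% of the villain's mix that stays on R/P/S, the GTO player breaks even as usual; only the dominated fraction leaks EV.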
10-03-2014 , 01:13 PM
GTO seems like a reasonable strategy to use against a table of unknowns, which you abandon in favor of an exploitative strategy as soon as you discover traits which can be exploited. GTO is probably most useful against a completely generic opponent who doesn't give off any obvious tells when you are forced to play a significant pot against him almost as soon as you sit down at the table.
10-03-2014 , 01:14 PM
Quote:
Originally Posted by venice10
For those looking for applications of GTO play in poker, I suggest reading Matt Janda's book available from 2+2. His experience is in 6-max high level NL online poker. The necessary conditions for incorporating GTO appears to include a high level of confidence of the villain's range, a sophisticated villain, and lots of homework to know the correct odds. None of these occur in LLSNL, let alone any live play outside of the nose bleeds.

That doesn't mean there isn't value in his book for LLSNL. While it isn't spelled out, the methodology for figuring out what hands to float with in certain situations are there. I reference Janda in one section of my COTM next month which I've never really seen discussed in this forum. The problem with it is that you have to know a lot first before you can use what he discusses.
I'm going to be nitpicky and say that you are wrong about the necessary conditions for incorporating GTO, because GTO doesn't care about the opponent's strategy. There are other posts in this thread that are a bit off, but I'm not going to address them all because I'm not a GTO expert either.