How is a GTO strategy actually determined

03-06-2019 , 11:14 PM
When we assign two arbitrary ranges, a strategy profile with betsizes and raise sizes, etc. and attempt to compute a GTO strategy--what's actually going on as frequencies are assigned to each strategy option and an equilibrium is reached? Is it purely an EV consideration, or is something else happening? Are my questions below essentially rephrasing the idea that the game has a mixed strategy Nash equilibrium (perhaps infinitely many)?

How close is "close enough" in terms of EV for two betsizings to have non-zero frequencies? To have identical frequencies? Is there a hard-and-fast rule relating frequencies and EVs, besides dominated strategies having frequencies of 0?

Suppose two strategy options have non-zero frequencies but one frequency is much higher than the other. Does one option having a higher frequency imply anything about the EVs of those options?

Is it possible for two otherwise identical options (again in terms of EV) to have wildly divergent frequencies? If so, why would that be the case at equilibrium?

If this has been covered ad infinitum or if this is all over the place, I apologize and would appreciate being linked to some threads so I can shut up and start reading and educating myself.
03-06-2019 , 11:26 PM
A slightly more tractable example. I'm sure this has been covered before somewhere on the forum:

If we're playing a mixed strategy at equilibrium--say we have four strategy options with frequencies of 60%, 30%, 10%, and 0%. Is there really any impact in terms of EV from choosing the 10% node? In other words, is it a mistake to choose a relatively low-frequency (yet non-zero) strategy? What are the practical implications of choosing any particular node that's not a dominated strategy?
03-07-2019 , 12:11 AM
Welcome to the Forum! You are asking a bunch of really good questions. It would be impossible to try to cover everything you asked in one reply since you have asked about so many different (though related) things.

How "solvers" determine GTO strategies is a very interesting topic that has been covered in several threads in this forum. If you are interested in that, you can use 2+2's Search capability.

You also seem interested in the characteristics of mixed strategies and mixed-strategy equilibria. Again, there have been several good threads in this forum on that topic.

First, yes, GTO strategies (Nash equilibria) in poker are governed exclusively by EV. Meta-concepts or anything else are not taken into account when we talk about GTO strategies.

Second, I think you know but just to be clear, every (pure) strategy that a player uses with positive frequency in a mixed-strategy equilibrium will have the same EV at that equilibrium (given that the opponents utilize their optimal mixed strategies).

It should be obvious, I guess, that if one (pure) strategy had a higher EV than the others, the player would choose that strategy exclusively. Conversely, if some (pure) strategy has a lower EV than the others at equilibrium, the player would never play that strategy.

Third, if I am understanding your question, there aren't any "consequences" of playing a low-frequency (pure) strategy in your optimal mixed strategy mix, just as long as the choice is random and adheres to the optimal frequencies.
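
As a rough sketch of why this holds (the notation is introduced here for illustration and is not from the thread): write p_i for the frequency of pure strategy i and EV_i for its EV against the opponent's fixed equilibrium strategy. The EV of the mix is a weighted average, so it can never beat the best single option:

```latex
\[
  EV(\text{mix}) \;=\; \sum_i p_i \, EV_i \;\le\; \max_i EV_i ,
  \qquad \sum_i p_i = 1, \quad p_i \ge 0 .
\]
```

Equality holds only when every option with p_i > 0 attains the maximum. So at equilibrium every option that is actually played has the same, maximal EV, and any option with a strictly lower EV gets p_i = 0. This condition says which options are played, not how often. The frequencies are fixed by the requirement that the opponent cannot gain by deviating either.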

I suppose you may be wandering into real-world poker questions related to GTO. My answers above were given in the spirit of pure GTO.
03-07-2019 , 08:05 AM
With regard to computing GTO solutions, I think CFR+ is still a popular algorithm.
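
To give a feel for what such an algorithm does, here is a minimal sketch of regret matching+, the per-decision update that CFR+ builds on. It is applied to a one-shot 2x2 matrix game, not a poker tree, and it leaves out the parts of CFR+ that deal with game trees (counterfactual values, alternating updates, weighted averaging). The payoffs are the Top/Bottom versus Left/Right table quoted later in this thread; the function names and iteration count are assumptions made for this example, not anything a commercial solver exposes.

```python
# Toy regret-matching+ self-play on a 2x2 matrix game (illustrative only).
# HERO_PAYOFF[i][j] is Hero's payoff for row i vs column j; Villain gets 100 minus that.
HERO_PAYOFF = [[90.0, 20.0],   # Top vs Left, Top vs Right
               [30.0, 60.0]]   # Bottom vs Left, Bottom vs Right
VILLAIN_PAYOFF = [[100.0 - x for x in row] for row in HERO_PAYOFF]


def current_strategy(regrets):
    """Play each action in proportion to its stored (non-negative) regret."""
    total = sum(regrets)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)  # no regret yet: play uniformly
    return [r / total for r in regrets]


def run_regret_matching_plus(iterations=50_000):
    hero_regret, villain_regret = [0.0, 0.0], [0.0, 0.0]
    hero_sum, villain_sum = [0.0, 0.0], [0.0, 0.0]
    for _ in range(iterations):
        hero = current_strategy(hero_regret)
        villain = current_strategy(villain_regret)
        # EV of each pure action against the opponent's current mix.
        hero_ev = [sum(HERO_PAYOFF[i][j] * villain[j] for j in range(2)) for i in range(2)]
        villain_ev = [sum(VILLAIN_PAYOFF[i][j] * hero[i] for i in range(2)) for j in range(2)]
        hero_mix_ev = sum(hero[i] * hero_ev[i] for i in range(2))
        villain_mix_ev = sum(villain[j] * villain_ev[j] for j in range(2))
        for a in range(2):
            # The "+" variant clips accumulated regret at zero after every update.
            hero_regret[a] = max(0.0, hero_regret[a] + hero_ev[a] - hero_mix_ev)
            villain_regret[a] = max(0.0, villain_regret[a] + villain_ev[a] - villain_mix_ev)
            hero_sum[a] += hero[a]
            villain_sum[a] += villain[a]
    # The average strategy over all iterations is what approaches equilibrium.
    return ([s / iterations for s in hero_sum],
            [s / iterations for s in villain_sum])


hero_avg, villain_avg = run_regret_matching_plus()
print("Hero    (Top, Bottom):", [round(x, 3) for x in hero_avg])
print("Villain (Left, Right):", [round(x, 3) for x in villain_avg])
```

The frequencies are never set directly. Each action is played more often when it would have done better than the current mix, and the averages drift toward the point where no action has any regret left, which is the equal-EV condition discussed above.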
03-07-2019 , 09:19 AM
Thanks! Yeah I realize there's a lot there, I was just frenetically typing out what was on my mind stream-of-consciousness style. A "shotgun strategy" to asking questions, if you will.

It's not clear to me why GTO frequencies would differ between two or more otherwise equal-EV options in the strategy profile for any non-dominated strategy. And I'm interested in the theory, but yeah also in the application at the table. But theory comes first.

Suppose we have a strategy profile with T total options, N of which have equal EV, and the EV of those N options is greater than the EV of the remaining T-N options.

So those N options would make up our mixed strategy equilibrium (have non-zero frequency), and the remaining T-N options would be dominated strategies with 0 frequency at equilibrium. Is that correct?

But why/under what conditions does the frequency of the N high-EV options not converge to 1/N? I.e. why would two otherwise equal-EV and non-dominated strategies not have the same frequency at equilibrium? Does a non-dominated option having a higher or lower frequency imply anything at all?

My question arises from the use of a solver as I rebuild my game (I'm an online MTT player) to a GTO-oriented approach, which is all the rage and fashion these days. Do you know if solver solutions are sensitive to any initial conditions that otherwise wouldn't exist if we're doing a closed-form game theoretic analysis?

Last edited by Okra Winfrey; 03-07-2019 at 09:30 AM.
03-07-2019 , 09:53 AM
I'll try to answer to the best of my knowledge, with the caveat that I am no expert, just a lurker/hobbyist.

Quote:
Originally Posted by Okra Winfrey
It's not clear to me why GTO frequencies would differ between two or more otherwise equal-EV options in the strategy profile for any non-dominated strategy. And I'm interested in the theory, but yeah also in the application at the table. But theory comes first.
I believe the frequencies do matter to the overall EV of the strategy and to each player reaching equilibrium, but the options are equivalent in EV at that point in the hand. Otherwise you would add or remove strategy choices at that point based on the EV available.

If your opponent doesn't adjust away from their side of the equilibrium strategy set, you should be able to choose freely, at any frequency, among the strategies that the solution plays with non-zero frequency.

Quote:
Originally Posted by Okra Winfrey
My question arises from the use of a solver as I rebuild my game (I'm an online MTT player) to a GTO-oriented approach, which is all the rage and fashion these days. Do you know if solver solutions are sensitive to any initial conditions that otherwise wouldn't exist if we're doing a closed-form game theoretic analysis?
I would assume the solvers would be sensitive to information you provide them. They are just algorithms that produce an output based on what input you give them.

Also if you continue to play MTT don't forget about ICM and not just cash game theory solutions. It probably doesn't come up very often in MTT but it can influence decisions when payouts increase significantly.
03-07-2019 , 10:34 AM
Thanks! Yes I haven't forgotten about ICM, etc.

Yeah the realization that we're talking about the EV of our overall strategy and not at that one particular point in the hand hit me a few minutes ago actually as I was thinking about my questions over breakfast. We always need to think in terms of the overall strategy, I suppose.
03-07-2019 , 12:41 PM
Here's a simple (the simplest) example of a game that has no equilibrium in pure strategies but does have an equilibrium in mixed strategies. Typically the frequencies of a mixed strategy at equilibrium will not be 1/N. Typically payoffs are not "symmetrical" and there is no reason to think that frequencies should be (or converge to) 1/N. Think about raise/call/fold. Very rarely would these all be 1/3, right?

I posted this in another thread that touched on some of the same questions. You can look through that short thread since some of the same ground was covered.

Quote:
Originally Posted by whosnext
You are asking a whole bunch of good questions. Maybe we should start with your most basic question.

To make it easiest, consider a two-person zero-sum game. Suppose further that there is no pure strategy Nash Equilibrium. Then Nash proved that there must be a Nash Equilibrium in mixed strategies (actually von Neumann proved this many years before Nash). We'll come back to what this all means below.

Then it is straightforward to demonstrate that if each player is playing his/her optimal strategy, then all the EV's of all the pure strategies comprising their mixed strategy versus the opponent's optimal mixed strategy will be equal. If any strategy has a higher EV, then, of course, that player would play that pure strategy (and therefore that pair of mixed strategies cannot be jointly optimal). The point is that this "Equal EV" phenomenon is predicated on both players playing their optimal strategies.

As a simple example, suppose two players called Hero and Villain are playing the following game. Hero chooses between two pure strategies called Top and Bottom, while Villain chooses between two pure strategies called Left and Right.

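The table below, called the payoff matrix, presents the payoffs to the two players (the first number is what Hero receives and the second number is what Villain receives). The two payoffs in each cell always sum to 100, so this constant-sum game is strategically equivalent to a zero-sum game.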

Strategy    Left     Right
Top         90,10    20,80
Bottom      30,70    60,40

It should be clear that there is no "pure strategy" Nash Equilibrium. If Hero plays Top, Villain would play Right. If Villain plays Right, Hero would play Bottom. If Hero plays Bottom, Villain would play Left. And if Villain plays Left, Hero would play Top. So there is no single best pure strategy for either player.

But there is a Nash Equilibrium in mixed strategies. It should be easy to show that the optimal (non-exploitable) strategy for Hero is to play Top 30% and Bottom 70%. Similarly, it should be easy to show that the optimal (non-exploitable) strategy for Villain is to play Left 40% and Right 60%. The "solution" to this game (the expected outcome if both players play their optimal strategies) is seen to be a payoff of 48 to Hero and 52 to Villain.
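
The "easy to show" step can be sketched in a few lines, assuming (as argued above) that the only equilibrium is fully mixed: each player's frequencies are chosen so that the opponent is indifferent between their two options. The function name and dictionary layout are made up for this example.

```python
from fractions import Fraction

# (Hero payoff, Villain payoff) for each pair of pure strategies, as in the table.
PAYOFFS = {
    ("Top", "Left"): (90, 10),
    ("Top", "Right"): (20, 80),
    ("Bottom", "Left"): (30, 70),
    ("Bottom", "Right"): (60, 40),
}


def solve_by_indifference(payoffs):
    """Solve a 2x2 game, assuming its only equilibrium is fully mixed."""
    hero = {cell: Fraction(pair[0]) for cell, pair in payoffs.items()}
    vill = {cell: Fraction(pair[1]) for cell, pair in payoffs.items()}
    tl, tr = ("Top", "Left"), ("Top", "Right")
    bl, br = ("Bottom", "Left"), ("Bottom", "Right")

    # Hero's P(Top) = p must make Villain indifferent between Left and Right:
    #   p*vill[tl] + (1-p)*vill[bl] == p*vill[tr] + (1-p)*vill[br]
    p_top = (vill[br] - vill[bl]) / (vill[tl] - vill[bl] - vill[tr] + vill[br])

    # Villain's P(Left) = q must make Hero indifferent between Top and Bottom:
    #   q*hero[tl] + (1-q)*hero[tr] == q*hero[bl] + (1-q)*hero[br]
    q_left = (hero[br] - hero[tr]) / (hero[tl] - hero[tr] - hero[bl] + hero[br])

    hero_ev = q_left * hero[tl] + (1 - q_left) * hero[tr]
    villain_ev = p_top * vill[tl] + (1 - p_top) * vill[bl]
    return p_top, q_left, hero_ev, villain_ev


p_top, q_left, hero_ev, villain_ev = solve_by_indifference(PAYOFFS)
print(f"Hero: Top {float(p_top):.0%}, Bottom {float(1 - p_top):.0%}")
print(f"Villain: Left {float(q_left):.0%}, Right {float(1 - q_left):.0%}")
print(f"EV: Hero {float(hero_ev)}, Villain {float(villain_ev)}")
```

Note that Hero's frequencies come out of Villain's payoffs and vice versa, which is why there is no reason for them to land on 50/50.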

It is easy to show that if Villain is playing her optimal mixed strategy of 40% Left and 60% Right, then Hero is "indifferent" between playing Top or Bottom. Equivalently, if Villain is playing her optimal mixed strategy, then Hero's EV of playing Top will equal his EV of playing Bottom. (If this is not true, then Villain's strategy cannot be optimal.)

Similarly, if Hero is playing his optimal mixed strategy of 30% Top and 70% Bottom, then Villain's respective EV's of playing either Left or Right will be equal. (If this is not true, then Hero's strategy cannot be optimal.)
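
The indifference described in the last two paragraphs can be checked numerically. The sketch below (helper names are hypothetical) computes Hero's EV for several Top frequencies, first against Villain's 40/60 mix and then against a Villain who sees Hero's frequencies and picks the pure strategy that hurts Hero most.

```python
from fractions import Fraction

# Hero's payoffs from the table; Villain receives 100 minus Hero's payoff.
HERO = {("Top", "Left"): 90, ("Top", "Right"): 20,
        ("Bottom", "Left"): 30, ("Bottom", "Right"): 60}


def hero_ev(p_top, q_left):
    """Hero's EV when Hero plays Top with p_top and Villain plays Left with q_left."""
    return (p_top * q_left * HERO["Top", "Left"]
            + p_top * (1 - q_left) * HERO["Top", "Right"]
            + (1 - p_top) * q_left * HERO["Bottom", "Left"]
            + (1 - p_top) * (1 - q_left) * HERO["Bottom", "Right"])


def hero_ev_vs_best_response(p_top):
    """Hero's EV if Villain picks whichever pure strategy hurts Hero most."""
    return min(hero_ev(p_top, Fraction(1)), hero_ev(p_top, Fraction(0)))


Q_EQ = Fraction(2, 5)  # Villain's equilibrium frequency of Left (40%)
for tenths in range(0, 11, 2):
    p = Fraction(tenths, 10)
    print(f"P(Top)={float(p):.1f}  vs equilibrium: {float(hero_ev(p, Q_EQ)):.1f}"
          f"  vs best response: {float(hero_ev_vs_best_response(p)):.1f}")
print("At P(Top)=0.3:", float(hero_ev_vs_best_response(Fraction(3, 10))))
```

Against the equilibrium Villain, every Top frequency earns Hero the same EV, so the frequencies look irrelevant. They matter only once Villain is allowed to adjust: any frequency other than 30% gives a best-responding Villain a way to push Hero's EV below the equilibrium value.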

Since poker is fairly complicated, most people are convinced that there are no "pure strategy" Nash Equilibria. Of course, that means that there must be one or more mixed strategy Nash Equilibria. In each mixed strategy poker Nash Equilibrium, the "Equal EV" phenomenon must hold.

Anyway, this only scratches the surface of answering your questions. But I thought it might be a good idea if we start with some basics of game theory so we have a common starting point.
03-07-2019 , 06:05 PM
Quote:
Originally Posted by Okra Winfrey
Is it possible for two otherwise identical options (again in terms of EV) to have wildly divergent frequencies? If so, why would that be the case at equilibrium?
Taking two different lines with the same hand means your range can be wide on both branches of a decision tree (this is related to the concept of "deception", where you don't want your range to be face up), and this reduces the possibility of exploitation.

e.g. Suppose you have the nut flush on the turn and villain is betting into you. If you always raise, then an observant villain would know you never have the nut flush if you just call, which gives him an incentive to bluff the river if you just call turn. If you always call (i.e. slowplay) on the turn, then he knows you never have the nuts if you raise, so he could 3-bet bluff the turn.

So the solution is to sometimes call and sometimes raise (i.e. you play a mixed strategy), such that you have the nuts in your range on both branches of the decision tree. If you use the optimal frequencies for calling and raising, then both lines will have the same EV, and a villain can't exploit you.

Mixing even happens with bluff-catchers on the river. In order to prevent villain making +EV bluffs with various combos, you have to call with various combos of your own. e.g. If villain knew that you only called with bluff-catchers containing a king, he'd be more likely to bluff when he has a king. To prevent him exploiting you in that way, you need to bluffcatch with a variety of hands, all at different frequencies.

P.S. This Upswing article is an easy-to-read primer on mixed strats: https://upswingpoker.com/mixed-strat...ker-decisions/
03-16-2019 , 10:31 PM
I truly appreciate your takes; you're pretty good at taking the theory to practical examples. I sometimes struggle reading the theory/strategy threads because of the high-level theory language, and you do a great job of making it understandable for everyone.
Taking a look at the Upswing article btw, and it's a great read.