Please, help! (Quick GTO question)

07-10-2021 , 07:39 PM
If you are playing a GTO strategy, and it means making your opponent indifferent to action x or y, then how can you win money from his mistakes while keeping the GTO strategy?

In other words, your range is balanced in such a way that your EV will be the same regardless of what your opponent's decision is. So he cannot exploit you in any way, but your EV will also remain the same no matter what he does (because you made him indifferent). The most popular example is the Clairvoyance game, where, if you bet a balanced polarized range on the river, villain can call 100% of the time, fold 100% of the time, or use any mix between the two, and your EV will be the same.

I know I'm wrong but I don't quite see why. Could someone please tell me how an indifferent opponent can make mistakes in a scenario where you keep executing a GTO strategy? THANKS
07-10-2021 , 09:38 PM
Quote:
Originally Posted by Yeodan
Yup, I've read both, although I skipped some big replies that went in a different direction. Let me just ask you this: did you find the key insight you were looking for? If so, what was it? Because I saw you were feeling on the edge of understanding but not quite there, and that's exactly how I feel right now.
07-10-2021 , 10:54 PM
A very clear example is used in the book Modern Poker Theory.

It's a bluff catching scenario.
Where P2 can only win by bluff catching.

Pot: 100
Player 1: AA/QQ (50/50)
Player 2: KK
Board: 88822

Both players know each other's strategies and ranges.
Player 1 has AA 50% of the time and QQ 50% of the time. Player 2 knows this, but doesn't know when P1 has which hand.
Player 1 knows P2 always has KK.

They both have 100 chips left to bet.

For simplicity, we know the best bet size is always going to be 100.
This is explained in the book.


Our first step to figuring this out is finding dominated strategies.
These are strategies we should never use.

They are the following in this situation:

1) Fold AA
AA is always the best hand, so P1 should never fold it.

2) Call QQ
P1 knows P2 always has KK, so if P1 calls with QQ they always lose.

3) Bet KK
P1 will never call QQ (P2 knows this) so if P2 bets KK, they are only getting called by better and will just lose more.

4) Check AA
Since P1 knows P2 is never betting KK, P1 should never check AA, since they will always have the best hand and want to win more.


Next we want to figure out what happens if we only bluff or never bluff.

If we always bet QQ (always bluff)
P2 can now just always call with KK and their EV will be 50% of the pot

If we never bet QQ (never bluff)
P2 knows P1's strategy, so if P1 bets, P2 knows P1 has AA and can now just always fold KK and their EV will still be 50% of the pot.


The solution (GTO play) will result in an EV of 75% of the pot for P1.
So we know there is a frequency of bluffing that will increase the EV for P1.

Again, this is explained mathematically in the book.
The correct bluffing frequency with QQ is 50%, which in turn makes P2 indifferent between calling and folding.

When P1 checks (25% of the time) P2 checks behind
P2 now knows P1 has QQ and will not call a bet, so there is no value in betting
So 25% of the time P2 picks up the 100 pot.

When P1 bets (75% of the time) the EV of P2 becomes 0 no matter if P2 calls or folds
P1 will have AA 2/3 of the time and QQ 1/3 of the time.
So out of every three calls, P2 loses 100 twice (vs AA) and wins 200 once (vs QQ), for a total EV of 0.

As you can see, by betting QQ 50% of the time, P1 has now gained an extra 25% of the pot in EV.
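The numbers above can be checked with a short sketch. It assumes the setup from this post (pot 100, bet 100, AA/QQ split 50/50, P2 always holds KK) and measures EV in chips won from the river onward, so the two EVs always add up to the pot. The function names are made up for illustration; this is not the book's code.

```python
POT = 100.0   # chips already in the middle
BET = 100.0   # the only bet size considered in this example
P_AA = 0.5    # P1 holds AA half the time
P_QQ = 0.5    # and QQ the other half


def p2_ev(bluff_freq, call_freq):
    """P2's EV when P1 always bets AA and bets QQ with probability bluff_freq."""
    # P1 checks QQ, P2 checks behind and wins the pot at showdown.
    checked = P_QQ * (1 - bluff_freq) * POT
    # P2 calls a value bet and loses the bet.
    call_vs_aa = P_AA * call_freq * (-BET)
    # P2 calls a bluff and wins the pot plus P1's bet.
    call_vs_qq = P_QQ * bluff_freq * call_freq * (POT + BET)
    return checked + call_vs_aa + call_vs_qq


def p1_ev_vs_best_response(bluff_freq):
    """P1's EV when P2 knows bluff_freq and picks the better of always call / always fold."""
    # P2's EV is linear in call_freq, so checking the two endpoints is enough.
    best_p2 = max(p2_ev(bluff_freq, 0.0), p2_ev(bluff_freq, 1.0))
    return POT - best_p2


for b in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"bluff QQ {b:.0%}: P1 EV = {p1_ev_vs_best_response(b):.2f}")
```

Under these assumptions, never bluffing and always bluffing both give P1 50, while bluffing QQ half the time gives 75, and any other frequency lands somewhere in between.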


Anyone not playing GTO in P1's spot is going to lose EV against a GTO opponent.
If you're betting QQ more than 50% of the time, P2 can adjust and call more often to increase their EV.
If you're betting QQ less often, P2 can adjust and fold more often to increase their EV.

If P2 is calling or folding too often, P1 can adjust to increase their EV.

But this spot clearly shows that when P1 is playing GTO, there is nothing that P2 can do to increase their EV.


The way GTO is making money here is by knowing the perfect bluffing frequency.
If the situation is reversed and our non-GTO opponent doesn't know the correct bluffing frequency, they give up EV.
Them having lower EV in the same reversed situation is how one wins or loses at poker over the long term.


This is a very simplified spot, that shows the value of GTO and how it wins against non GTO strategies.
It also demonstrates how you can win even more by exploiting your opponent when they're not playing GTO.


There are hundreds, thousands maybe millions of similar spots in poker.
It's very hard to see these, but they're there and if you do the math you can find em.

Last edited by Yeodan; 07-10-2021 at 11:02 PM.
07-11-2021 , 12:41 AM
Quote:
Originally Posted by Yeodan
The way GTO is making money here is by knowing the perfect bluffing frequency.
If the situation is reversed and our non-GTO opponent doesn't know the correct bluffing frequency, they give up EV.
Them having lower EV in the same reversed situation is how one wins or loses at poker over the long term.
THIS, OMG, THIS!

I'm reading the book Modern Poker Theory, and before it I read Play Optimal Poker, so I know pretty well the AA,QQ vs KK example which you used, and it was exactly this example that prompted me to ask my question.

I fully understood the math and everything, not a problem. The single and only flaw in my understanding was the overall perspective in a broad way. AND YOU JUST FIXED THAT. I was locked in one frozen spot wondering how the P2(kk) could possibly make a mistake if he is indifferent between calling and folding vs a Nash Eq. strategy. And he can't! P1 EV will always be the same, and P2 EV will always be EV=0 no matter what. The subtle trick is playing with the spot and changing the POV. I just did it beyond your example after reading your reply (this didn't pop in my head before):

Since KK is P2's only hand,
P2_Ev = KK_Ev = (KK_Ev vs check) + (KK_Ev vs bet)

if P1 never bluffs and P2 never deviates from GTO/NE to exploit P1, then vs a pot-sized bet P2 will keep calling 2/3 of the time he faces a bet, which is only when P1 has AA:

P2_Ev = (50% * 100) + (66.66% * 50% * -100)
P2_Ev = 50 - 33.33
P2_Ev = 16.67 (it's NOT 0! And 0 will always be as bad as it gets! But it would be higher than 16.67 if P2's value hands were a smaller %part of his range, keeping the premise he never bluffs. And if he bluffed too much, pure GTO MDF would return an EV higher than 0 - when it should be 0! vs GTO/NE)

So even if P2 knows that P1 always has AA when he bets because he never bluffs, and P2 still keeps calling anyway at his GTO frequency (2/3 vs Pot-sized bet), P2 wins (he doesn't break even!!!!) because by calling against exclusively the nuts (AA) and losing 100% of the time he calls, P2 won't give back all the EV he got from P1 checking behind with QQ (no bluffing)

Sorry if I sounded overly joyful, but I got it now! THANK YOU SO MUCH

Last edited by rororo; 07-11-2021 at 12:53 AM. Reason: Minor edits
07-11-2021 , 08:43 AM
Pretty sure your math is wrong.
P2's EV should be negative if they always call when P1 is never bluffing.

I'm not good enough at this to explain it though.
07-11-2021 , 09:58 AM
That is counterintuitive and I thought so as well before I worked it out.

P2_Ev = (50% * 100) + (66.66% * 50% * -100)

The reason it isn't negative is that if P1 never bluffs, P2 wins the pot 100% of the time when P1 has QQ. P1 has QQ 50% of the time, so P2 wins 100% of the pot 50% of the time: (50% * 100) is the same as (100% * 50% * 100).

However, when P1 has AA, he bets 100% of the time. But P2 doesn't always call. P2 faces a bet 50% of the time, and out of those times, P2 calls 2/3 of the time (66.66% * 50% * -100). So it isn't negative because although he loses more by calling and never wins by catching a bluff, he has to call only a fraction of a fraction (66% of 50%) which is very low. And when checked he wins 100% of 50% bc P1 will never ever bluff when he has QQ.

The Ev of folding by definition is 0, so the times when P2 checks, faces a bet and folds, his EV is 0. Plugging it in the equations doesn't change anything.

His EV of calling is negative though, but his EV of checking makes up for that as long as he doesn't call too much (i.e. more than MDF).

In your/book's example we see P1 has +EV when he never bluffs, but by bluffing at the right frequency he gets more EV overall (25% more as you said). Well, this extra 25% is what P1 is giving up by not bluffing. P2's EV is that, but it shows $16.67 because he gives up part of it by sometimes paying P1's value bets since P2 still calls at GTO/NE.

That is the EV of the spot/actions. The EV of the game would still be negative when you account for the $50 he's put into the pot. But that is different since the pot doesn't belong to him. Even then though, his negative EV now would be better than his negative EV when P1 is playing GTO/NE (bluffing at a frequency).

Sounds confusing and counterintuitive but I believe the math is right. I would love to be corrected if that's not the case though.

Last edited by rororo; 07-11-2021 at 10:19 AM.
07-11-2021 , 10:43 AM
Oops, I made a mistake. I assumed P2's MDF was 2/3 or 66%. But that is the MDF when he faces a half-pot bet. Vs a bigger, pot-sized bet, P2's MDF is 1/2 or 50%. So fixing that:

P2_Ev = KK_Ev = (KK_Ev vs check) + (KK_Ev vs bet)
P2_Ev = (100% * 50% * 100) + (50% * 50% * -100)
P2_Ev = 50 - 25
P2_Ev = $25

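In your/book's example (your first comment) we see P1 has +EV when he never bluffs, but by bluffing at the right frequency he gets more EV overall (25% more as you said). Well, this extra 25% is precisely what P1 gives up by not bluffing once P2 adjusts and always folds to a bet. Against a P2 who keeps calling 50% of the time, P1 still makes $75 and P2 still makes $25, the same as at equilibrium.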
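A small sketch of the corrected arithmetic, in case it helps to plug in other numbers. It assumes the usual MDF formula pot / (pot + bet) and a P1 who bets every AA and never bluffs QQ; the function names are illustrative only.

```python
def mdf(pot, bet):
    """Minimum defense frequency: share of the time the caller must continue."""
    return pot / (pot + bet)


def p2_ev_vs_never_bluff(call_freq, pot=100.0, bet=100.0, p_aa=0.5):
    """P2's EV with KK when P1 bets AA always and checks QQ always."""
    # P1 checks QQ, so P2 wins the whole pot at showdown.
    checked = (1 - p_aa) * pot
    # Every call runs into AA and loses the bet.
    called = p_aa * call_freq * (-bet)
    return checked + called


print(mdf(100, 50))    # half-pot bet
print(mdf(100, 100))   # pot-sized bet
print(p2_ev_vs_never_bluff(2 / 3))  # the earlier, mistaken calling frequency
print(p2_ev_vs_never_bluff(1 / 2))  # the corrected calling frequency
print(p2_ev_vs_never_bluff(0.0))    # always folding to a bet
```

With these assumptions the half-pot MDF is 2/3, the pot-sized MDF is 1/2, and P2's EV is about 16.67 at a 2/3 calling frequency, 25 at 1/2, and 50 if he never calls.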
07-11-2021 , 05:04 PM
The short answer is that mixing mistakes don't lose against a fixed GTO strategy, but they do make you exploitable.

Check out this thread for clarity: https://forumserver.twoplustwo.com/1...23/?highlight=
07-11-2021 , 05:45 PM
Quote:
Originally Posted by tombos21
The short answer is that mixing mistakes don't lose against a fixed GTO strategy, but they do make you exploitable.

Check out this thread for clarity: https://forumserver.twoplustwo.com/1...23/?highlight=
Interesting thread.

1) Versus GTO, mistakes with mixed strategies don't lose.
2) The reason is that if an action line is mixed, there is indifference about which branch to take. Both options have the same EV.
3) However, making mistakes vs GTO does lose indeed.
4) But the point is where it loses.
5) It doesn't lose vs fixed GTO if an action (strategy) is mixed/indifferent.
6) It loses vs fixed GTO if an action (strategy) is pure (it's not indifferent - there is a preference).

In the AKQ toygame, when K player checks and faces a bet vs a fixed GTO player with a balanced polarized range, the K player can't make a mistake because his strategy at GTO is mixed between calling and folding, so he is indifferent.

However, that same player can make a mistake when deciding whether to check or bet the K. He is NOT indifferent, his strat is not mixed: he must purely check every time, and if he bets he will make a mistake.

Overall:

1) Where the strategy is mixed, you don't lose vs GTO if you deviate (frequency-wise), but you can be exploited.
2) Where the strategy is pure, you lose vs GTO if you deviate, and you can lose even more vs a non-GTO exploitative strategy.

Bottom line: GTO does profit from mistakes, but those mistakes are not the ones made at indifference points. Instead they are the ones made at points where there's a preference.
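Here is a rough sketch of that difference in the AKQ toy game. The numbers are assumptions chosen to match the earlier example in this thread: pot 100, bet 100, the opponent holds A or Q 50/50, and the K player acts first. The opponent's fixed strategy is also assumed: when checked to, he bets every A and bluffs Q half the time; when facing a bet, he calls with A and folds Q.

```python
POT = 100.0
BET = 100.0


def k_ev(bet_freq, call_freq, opp_bluff_freq=0.5):
    """EV of the K player against the fixed opponent strategy described above.

    bet_freq  -- how often K leads out (a pure mistake: GTO never bets K)
    call_freq -- how often K calls after checking (a mixed decision at GTO)
    """
    # K bets: A calls and wins the bet, Q folds and K takes the pot.
    ev_bet = 0.5 * (-BET) + 0.5 * POT
    # K checks, then calls with probability call_freq when facing a bet.
    ev_check = (
        0.5 * call_freq * (-BET)                           # called a value bet from A
        + 0.5 * opp_bluff_freq * call_freq * (POT + BET)   # caught a bluff from Q
        + 0.5 * (1 - opp_bluff_freq) * POT                 # Q checked back, K wins
    )
    return bet_freq * ev_bet + (1 - bet_freq) * ev_check


# Mixing "mistakes": any calling frequency gives the same EV.
for c in (0.0, 0.3, 0.5, 1.0):
    print(f"never bet, call {c:.0%}: EV = {k_ev(0.0, c):.2f}")

# Pure mistakes: every bet with K costs EV.
for b in (0.0, 0.5, 1.0):
    print(f"bet {b:.0%}, call 50%: EV = {k_ev(b, 0.5):.2f}")
```

Under these assumptions the K player makes 25 whatever his calling frequency is, but his EV falls linearly toward 0 as he bets the K more often.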
07-11-2021 , 07:18 PM
Yeah spot on, so with regards to fixed GTO strategies you've got to draw a clear line in the sand between pure and mixing mistakes.

It's also important to recognize that mixing mistakes can be exploited by an opponent that adapts to punish mistakes. GTO solvers are actually exploitative algorithms - if you lock in some mixing mistakes by one player the other will automatically adjust to exploit it.