How does GTO win?
 06-04-2021, 06:24 PM
Yeodan

How does GTO win?

Someone posted rock paper scissors as an example earlier. If you play GTO in RPS it's impossible to win or lose; you will always break even against any other strategy. I understand GTO does win in poker when your opponent deviates from it, because poker is much more complicated than RPS. This somehow feels true, but I don't know why or how it works. What I don't understand is where GTO makes that profit.

Pre-flop it sorta makes sense. If you play too many hands, you'll have to fold, call or bluff too often somewhere in the hand to make up for that. If you play too few hands, you're automatically not claiming some of the pots that you should have.

So let's say we're at a table where everyone plays perfect GTO pre-flop. Only our imaginary hero plays perfect GTO post-flop. The other players are just pressing some random buttons. Our entire game should be perfectly balanced so as not to be exploitable, right? So when we value bet, if villains react to the value bet randomly often enough, shouldn't it balance out in the end? Same for bluffs? Some of them will call/fold too often, others will do the opposite. Won't it just balance out like it does in RPS?

I know the answer to this is no, it won't. But I don't understand why. How does playing GTO "force" opponents into making mistakes that are not recoverable by another opponent making the opposite mistake?

As an example, say we get to the river and we make a GTO bet. We would do this exactly the same every time we get into this situation (even if it's a mixed strategy, we'd apply the mixed strategy exactly the same every single time in this exact situation). Let's say it's GTO for our opponent to call 50% of the time and fold 50% of the time. If one opponent calls 60% of the time but another calls only 40% of the time, we haven't won anything then, right? The 10% mistake one opponent makes is countered by the opposite 10% mistake of the other opponent. (Or the same opponent in a different situation.)

It seems like this should eventually just even out. Why is this not the case? Is this example wrong? Or is it just too simple? Or am I just thinking about this in a completely wrong way? Or maybe GTO just "exploits" standard human poker by default?
 06-04-2021, 06:34 PM
ZKesic

Re: How does GTO win?

Every time someone makes a -EV play vs the GTO player, the GTO player wins that much vs him. If you fold the nuts on the river, you lost money vs GTO. If you call river shove with 5 high, you lost money vs GTO. If you check when the bet had higher EV, you lost money vs GTO. And so on...
 06-04-2021, 07:10 PM
aner0

Re: How does GTO win?

It's not about villains doing things "too often" or "not often enough", forget about frequencies. It's about them doing the wrong play with the WRONG HAND
Yeodan
 Originally Posted by aner0 It's not about villains doing things "too often" or "not often enough", forget about frequencies. It's about them doing the wrong play with the WRONG HAND
I was actually thinking this a bit after writing the whole thing

FazendeiroBH
 Originally Posted by Yeodan How does playing GTO "force" opponents into making mistakes that are not recoverable by another opponent making the opposite mistake?
Not an expert, so take my answer with a pinch of salt. You seem to be thinking in terms of your holding and its EV at that time (thinking about your example), when what's really gaining EV from the mistakes is your entire GTO range. If your opponent overfolds vs your river bet, your bluffs are winning more than they should. If he overcalls, your value part wins more.

Just have to think about your entire range and GTO in poker will make sense to you.

aner0
 Originally Posted by FazendeiroBH Not an expert, so take my answer with a pinch of salt. You seem to be thinking in terms of your holding and its EV at that time (thinking about your example), when what's really gaining EV from the mistakes is your entire GTO range. If your opponent overfolds vs your river bet, your bluffs are winning more than they should. If he overcalls, your value part wins more. Just have to think about your entire range and GTO in poker will make sense to you.
This isn't really true for GTO, and this range EV thing is a huge misconception in the poker community right now. If villain overfolds, GTO will make more money with bluffs while it makes exactly the same amount less with value.

Bad frequencies won't make you lose against GTO. Forget about ranges and frequencies for a second and think about how your specific holding makes money.
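
A minimal sketch of the claim above, under toy assumptions that are not from the thread: a river spot where the bettor's value hands always win at showdown, the bluffs always lose, the caller holds only bluff-catchers, and the bettor bluffs with the share `bet / (pot + 2 * bet)` of the betting range that makes a bluff-catcher indifferent. The function name `river_evs` is made up for illustration.

```python
from fractions import Fraction

def river_evs(pot, bet, call_freq):
    """EV of the bettor's value hands, bluffs, and whole range in a toy river spot.

    Assumptions (not from the thread): value hands always win at showdown,
    bluffs always lose, and the caller holds only bluff-catchers.
    """
    # Bluff share of the betting range that makes a bluff-catcher indifferent.
    bluff_share = Fraction(bet, pot + 2 * bet)
    # A value hand wins the pot, plus the bet whenever it gets called.
    value_ev = pot + call_freq * bet
    # A bluff wins the pot on a fold and loses the bet on a call.
    bluff_ev = (1 - call_freq) * pot - call_freq * bet
    range_ev = (1 - bluff_share) * value_ev + bluff_share * bluff_ev
    return value_ev, bluff_ev, range_ev

for call_freq in (Fraction(4, 10), Fraction(5, 10), Fraction(6, 10)):
    value_ev, bluff_ev, range_ev = river_evs(pot=1, bet=1, call_freq=call_freq)
    print(f"call {float(call_freq):.0%}: value {float(value_ev):.2f}, "
          f"bluff {float(bluff_ev):.2f}, range {float(range_ev):.2f}")
```

With a pot-sized bet, the value and bluff EVs move in opposite directions as the calling frequency goes from 40% to 60%, but the range EV stays at 1.00 (the pot) in every row. In this toy spot, calling too often or too rarely with a pure bluff-catcher costs the caller nothing.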

 06-05-2021, 07:56 PM
bwtaylor

Re: How does GTO win?

GTO is a synonym for a Nash equilibrium set of strategies. Both names refer to a set of strategies such that no individual player can improve their EV by unilaterally changing their strategy away from the equilibrium strategy. Mathematician John Nash, popularized in the movie A Beautiful Mind, proved that every finite game has at least one GTO set of strategies (possibly in mixed strategies).

GTO "wins" almost by definition when opponents play non-GTO strategies with lower EV. I say almost, because technically it is possible to play a non-GTO strategy vs a GTO strategy with the same EV. In rock paper scissors, the Nash equilibrium is for both players to select randomly between the three options with 1/3 frequency each. If either player deviates from this, their EV remains the same against the GTO strategy, but it's no longer true that the other player can't change to a winning strategy.
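
The rock paper scissors point can be checked with a short sketch. The payoff values (+1 win, 0 tie, -1 loss) and the 25/50/25 "paper-heavy" deviation are assumptions chosen for illustration, and the helper names are made up.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# PAYOFF[a][b] is the row player's result when playing a against b.
PAYOFF = {
    "rock":     {"rock": 0, "paper": -1, "scissors": 1},
    "paper":    {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}

def expected_value(strategy, opponent):
    """Row player's EV when both players use the given mixed strategies."""
    return sum(strategy[a] * opponent[b] * PAYOFF[a][b]
               for a in MOVES for b in MOVES)

def best_response_value(opponent):
    """Highest EV that any single pure move achieves against the opponent's mix."""
    return max(expected_value({m: int(m == move) for m in MOVES}, opponent)
               for move in MOVES)

third = Fraction(1, 3)
uniform = {m: third for m in MOVES}
paper_heavy = {"rock": Fraction(1, 4), "paper": Fraction(1, 2),
               "scissors": Fraction(1, 4)}

print(expected_value(uniform, paper_heavy))  # equilibrium mix vs the deviator
print(best_response_value(paper_heavy))      # what an exploiter could win instead
print(best_response_value(uniform))          # best anyone can do vs the uniform mix
```

The uniform mix earns exactly 0 against the paper-heavy player, while always throwing scissors would earn 1/4 per game against that same player. Nothing earns more than 0 against the uniform mix, which is why deviating costs nothing in RPS unless the opponent leaves the equilibrium to exploit it.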
Yeodan
 Originally Posted by bwtaylor but it's no longer true that the other player can't change to a winning strategy.
Yet that's exactly what we don't do by playing GTO. Well we could, but then we're no longer playing GTO.

Yeodan
 Originally Posted by aner0 This isn't really true for GTO, and this range EV thing is a huge misconception in the poker community rn. If villain overfolds, GTO will make more money with bluffs while it makes exactly the same amount less with value. Bad frequencies won't make you lose against GTO, forget about ranges and frequencies for a second and think about how your specific holding makes money
This is extremely confusing.
I can feel it sitting at the edge of my understanding, but I'm just not completely there yet.

Maybe an example could help?

aner0
 Originally Posted by Yeodan This is extremely confusing. I can feel it sitting at the edge of my understanding, but I'm just not completely there yet. Maybe an example could help?
The only way to make money in poker is by maximizing the EV with each individual holding.

Your holding's EV depends exclusively on its characteristics, villain's range, and how villain plays that range. Your holding's EV isn't affected by your range or how you play other hands in the same spot, since each hand is an independent event.

Here's the tricky part: Your range can indirectly affect the EV of your holding through this chain of relationships:

You have a certain range and a certain strategy for that range in a given spot -> Villain notices this through frequencies and showdowns (or maybe just correctly assumes how you play) -> He adjusts his strategy -> As we talked about, the EV of your specific holding is affected by villain's strategy

In solvers, each player knows its opponent's strategy perfectly, so this chain of events happens by definition. This is why, against a solver, your range and how you play it as a whole is the biggest factor affecting the EV of each of your specific holdings.

GTO (Nash Equilibrium) works a little differently though, as it doesn't adjust to any opponent, it just plays the same "perfect" ranges and strategies over and over again no matter how you react to it.
As we already established, your individual hand's EV is decided exclusively by that hand's characteristics and how villain plays his range, regardless of our own range.
Therefore, against GTO/Nash Equilibrium, our ranges and frequencies won't affect the EV of each individual holding, which is the only variable that ultimately affects winrate.

 06-05-2021, 11:47 PM
Aesah

Re: How does GTO win?

I feel like there should be a sticky in this subforum for GTO. Again, in RPS, it is (basically) impossible to make a mistake. In poker, it is super easy for a human to make a mistake. Basically anytime you make a mistake, your opponent profits. GTO wins by making no mistakes vs humans who are making mistakes.
Yeodan
 Originally Posted by Aesah Basically anytime you make a mistake, then your opponent profits.
Can you define "make a mistake"?

Because I can think of actions that I would consider a "mistake" yet they don't lose vs a GTO playstyle

Let's say we always call an all-in on the river against a perfect GTO opponent.
We're not losing any money by doing this, though it seems like a clear mistake?

The GTO player will win more with his value hands, but lose the exact same amount with his bluffs.

This is exactly what I'm struggling with: I can't find the mistakes that make GTO win.

One thing I've been thinking of is for example a GTO player has the nuts on the river and chooses a large bet sizing instead of a small one vs a range that is likely to call.

In the reverse situation a non-GTO player might bet smaller and lose value, which would be a mistake that loses vs GTO.

I find it hard to generalize this type of mistake into a clear definition though.
GTO wins when your opponent takes a line that is less EV than possible? Not in that one hand, but over the long run when GTO plays the same spots for higher EV?

I think what's really confusing me is exactly this, the definition of mistake.

 06-06-2021, 06:48 AM
Yeodan

Re: How does GTO win?

So we could say GTO wins by default against players who miss +EV value bets or players who bet too small when a larger bet would be higher EV. Would the same work for bluffs? GTO wins by default against players who miss +EV bluffs? Or even when they use a wrong sizing for bluffs? What about calls?
 06-06-2021, 09:59 AM
aner0

Re: How does GTO win?

A mistake is taking any line that isn't the highest EV line with a given hand. Calling with bad blockers, folding with good blockers, bluffing with bad blockers, not bluffing with good blockers, folding hands that beat some of villain's value range, calling with hands that lose to some of villain's bluffs, using suboptimal sizings, calling with too little equity on earlier streets, folding with too much equity... All of these are mistakes. The generalization is this: a mistake is taking a line that isn't the highest EV possible for your specific hand in a vacuum.
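
To illustrate the definition above, here is a toy sketch from the caller's side. Everything in it is an assumption for illustration: a pot-sized river bet, a betting range with `bet / (pot + 2 * bet)` bluffs, three equally likely caller hand classes, and no blocker effects. The names `HANDS`, `call_ev` and `strategy_ev` are hypothetical.

```python
from fractions import Fraction

POT, BET = 1, 1
# Assumed bettor range: bluffs make up bet / (pot + 2 * bet) of all bets.
BLUFF_SHARE = Fraction(BET, POT + 2 * BET)

# Hypothetical caller hand classes: which parts of the betting range each beats.
HANDS = {
    "near-nuts":     {"beats_value": True,  "beats_bluffs": True},
    "bluff-catcher": {"beats_value": False, "beats_bluffs": True},
    "5 high":        {"beats_value": False, "beats_bluffs": False},
}

def call_ev(hand):
    """EV of calling with one hand class, measured against folding (EV 0)."""
    win_prob = 0
    if HANDS[hand]["beats_value"]:
        win_prob += 1 - BLUFF_SHARE
    if HANDS[hand]["beats_bluffs"]:
        win_prob += BLUFF_SHARE
    # A winning call collects pot plus bet; a losing call costs the bet.
    return win_prob * (POT + BET) - (1 - win_prob) * BET

def strategy_ev(call_freqs):
    """Average EV over the three hand classes, assumed equally likely."""
    return sum(call_freqs[h] * call_ev(h) for h in HANDS) / len(HANDS)

# Both strategies call 50% of the time overall, with different hands.
sensible = {"near-nuts": 1, "bluff-catcher": Fraction(1, 2), "5 high": 0}
backwards = {"near-nuts": 0, "bluff-catcher": Fraction(1, 2), "5 high": 1}

print(strategy_ev(sensible), strategy_ev(backwards))
```

Both strategies call exactly half the time, yet the first averages 2/3 of a pot and the second averages -1/3. The overall frequency is identical; the whole difference comes from which hands do the calling and folding.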
 06-06-2021, 10:04 AM
FazendeiroBH

Re: How does GTO win?

If you have a solver, you can check what happens with the EV of your range and each holding when villains play suboptimal strategies. Just lock the GTO strategy to avoid the solver exploiting anything, adjust and lock villain's strategy and run. Obv, if you know villain's strategy to do this exercise, it doesn't make sense to play GTO vs him.
Ceres

 Originally Posted by aner0 GTO (Nash Equilibrium) works a little differently though, as it doesn't adjust to any opponent, it just plays the same "perfect" ranges and strategies over and over again no matter how you react to it.
Umm, GTO is the ultimate adjustment to a perfect GTO opponent (minus locking)? I think saying GTO isn't adjusting to any opponent is a tad misleading.

Yeo, how does GTO win? Because the underlying principle of poker is math. The key word is optimise. Every decision point in a hand is based on probability, like all gambling, so the closer we can fine-tune our strategy around the underlying math (pot odds, MDF etc.) that maximises our EV, the more we will win overall.
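
For reference, the two quantities named above are simple ratios. This is a minimal sketch using the standard definitions, with `pot` meaning the pot before the bet; the function names are made up for illustration.

```python
from fractions import Fraction

def required_equity(pot, bet):
    """Minimum share of showdowns a call must win to break even (pot odds)."""
    # The caller risks `bet` to win a final pot of pot + 2 * bet.
    return Fraction(bet) / (Fraction(pot) + 2 * Fraction(bet))

def minimum_defense_frequency(pot, bet):
    """Share of the defender's range that must continue so that a
    zero-equity bluff cannot profit automatically."""
    return Fraction(pot) / (Fraction(pot) + Fraction(bet))

# Pot-sized bet, then a 1/3-pot bet (pot 3, bet 1).
print(required_equity(1, 1), minimum_defense_frequency(1, 1))
print(required_equity(3, 1), minimum_defense_frequency(3, 1))
```

Against a pot-sized bet a call needs 1/3 equity and the MDF is 1/2; against a 1/3-pot bet the call needs 1/5 equity and the MDF is 3/4.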

Quick story.

Say there's a carnival in town. One of the stalls has a game where you pay $100 to try and guess which of the three doors has $300 behind it. On average the stall operator can expect to break even on this bet, providing they randomise the location of the $300 well enough. The punter will pick the right door around 1/3 of the time and the price of entry will pay for the prizes.

Now, let's say the stall operator is a bit lazy and doesn't randomise the location of the money as perfectly. Let's say one of the punters recognises that, on average, the money is more often placed behind door one. Obviously they can start guessing door one more often and exploit the stall operator, winning more on average than with their previous strategy.

Now, in both these instances the punter is playing a GTO strategy. They are simply maximising their gains within the parameters of the game. In the second example, the punter is now adjusting their previously perfect GTO strategy to exploit the stall operator's laziness, but it is still a GTO strategy. It is still the most profitable strategy. Very often GTO is just the most exploitative strategy you can take.

Not my example but I read it recently and seemed like an easy way to think about GTO.

In conclusion: GTO makes money because ultimately your range/strategy is more balanced and optimised than your opponent's. And this includes making wild exploitative adjustments. When we node-lock our opponents to make more mistakes with their range in a solver (to replicate our weaker opponents, as Faz just mentioned above), the solver will pick the most exploitative lines available. So it is both balanced and exploitative at the same time. Keyword being optimised.

Didace
 Originally Posted by Ceres Umm, GTO is the ultimate adjustment to a perfect GTO opponent (minus locking)? I think saying GTO isn't adjusting to any opponent is a tad misleading.
Theoretically, if you are playing a GTO strategy and come up against another GTO player, how would you adjust?

 06-06-2021, 03:08 PM
Ceres

Re: How does GTO win?

Leave?
 06-06-2021, 03:09 PM
Ceres

Re: How does GTO win?

Don't understand the question tbh. If we are playing GTO then we have already adjusted our strategy towards our opponent by definition.
Aesah
 Originally Posted by Yeodan Can you define "make a mistake"
Do you want a specific or general example?

aner0 answered general very well, so I'll try to go specific.

ZKesic already gave some great examples in the first reply in this thread, but they are kind of extreme so you might think "well I won't do that! that is an obvious mistake!", but how about something like folding QTbdfd on J63r vs a 1/3 cbet in SB vs BB. OK sure maybe YOU won't do that either, I don't know your poker skill level, but I'm sure you agree many poker players would, and regardless of your own personal cutoff it's obvious that there will be some spot that you won't know what to do with some hand. And anytime you make the decision that isn't what the GTO decision would have been, it is going to profit in that spot against you.

This includes calls, bets, and checks. For a checking vs. betting example, if you 3bet SB vs BTN and cbet on As2x3s, it is likely a mistake. I've seen "decent" players cbet upwards of 100% on this board when they should be cbetting under 10%, because the BTN has way more polarity advantage on this board with 54, 33, 22, A3, A2 and also plays the entire hand in position.

aner0
 Originally Posted by Ceres Don't understand the question tbh. If we are playing GTO then we have already adjusted our strategy towards our opponent by definition.
Your definition of GTO is academically correct, but in Poker when we say GTO we pretty much always mean exactly Nash Equilibrium. Given how people use the word, I just think it's easier for everyone to understand the terms of the discussion if we define GTO as Nash Equilibrium.

Nash Equilibrium doesn't adjust, you can play your side of NE against a bum fish and it would still be your side of NE AKA GTO.

We also have a different term for the type of strategy you're talking about which is MES (Maximally Exploitative Strategy), so I find keeping GTO as NE and MES as the actual best strategy to be the most functional use of words.

Aesah
 Originally Posted by Didace Theoretically, if you are playing a GTO strategy and come up against another GTO player, how would you adjust?
Do nothing, to break even minus rake. Unless you want to lose more, in which case you can start trying to "outsmart" the other GTO player XD

~~~

Also I completely disagree with Ceres's carnival example, I feel like this game is literally the exact same as rock paper scissors (choose 1 of 3). So it's like saying "we recognize our opponent throws paper more often, so we still play GTO by throwing scissors more often", which I don't think is true. I believe we are exiting GTO and entering exploitation at that point, since if we have a strategy of 50% scissors/25% rock/25% paper we can actually lose vs. some other opponents. Bringing it back to the carnival example game, we can be -EV if the carnival operator is switched without our knowledge with another carnival operator, but if we stick with the GTO strategy of choosing every door 33% of the time, there is no way they can try to rig it no matter what they do to make us lose money. That is the crux of GTO.
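
The carnival numbers can be worked through in a small sketch. The $100 entry fee and $300 prize come from the story; the specific 50/25/25 frequencies for the lazy operator and for the door-one-heavy punter are assumptions for illustration, and the function names are made up.

```python
from fractions import Fraction

ENTRY_FEE, PRIZE, DOORS = 100, 300, 3

def punter_ev(pick_probs, prize_probs):
    """Punter's EV per play given door-picking and prize-placement frequencies."""
    hit_prob = sum(p * q for p, q in zip(pick_probs, prize_probs))
    return PRIZE * hit_prob - ENTRY_FEE

def worst_case_ev(pick_probs):
    """EV against the prize placement that hurts this picking strategy most.

    EV is linear in the operator's mix, so checking the three
    always-the-same-door placements is enough.
    """
    placements = [[int(i == door) for i in range(DOORS)] for door in range(DOORS)]
    return min(punter_ev(pick_probs, placement) for placement in placements)

uniform = [Fraction(1, 3)] * DOORS
door_one_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
lazy_operator = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # assumed bias

print(punter_ev(uniform, lazy_operator))         # equilibrium pick vs lazy operator
print(punter_ev(door_one_heavy, lazy_operator))  # exploitative pick vs lazy operator
print(worst_case_ev(uniform))                    # uniform cannot be made to lose
print(worst_case_ev(door_one_heavy))             # the exploit can be punished
```

Under these assumed frequencies, the door-one-heavy punter gains $12.50 per play against the lazy operator, but drops to -$25 per play if the operator is swapped for one who avoids door one. The uniform picker makes exactly $0 against both.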

Anyway as I mentioned earlier I believe using RPS is a terrible, terrible way to try to understand GTO. Chess is better IMO. So replace poker with chess here: "I'm a perfect chess player vs. another perfect chess player, how should I adjust to beat him?" You don't, you just continue playing perfectly unless you want to start losing. But if the "hidden information/randomness" is the difference between chess and poker, then OK, use Blackjack (single deal, so no counting cards). The GTO strategy is well-known, but people still misplay all the time so they will lose more money compared to playing GTO.

Also if it helps make the blackjack example easier to understand, think of it as player vs. player, where neither one is the dealer or has an inherent advantage (hits/stands in private, ties chop or whatever). So "I'm playing perfect GTO blackjack strategy, how should I adjust my strategy vs. someone else playing perfectly?" Again, you don't.

Aesah
 Originally Posted by aner0 Your definition of GTO is academically correct, but in Poker when we say GTO we pretty much always mean exactly Nash Equilibrium. Given how people use the word, I just think it's easier for everyone to understand the terms of the discussion if we define GTO as Nash Equilibrium. Nash Equilibrium doesn't adjust, you can play your side of NE against a bum fish and it would still be your side of NE AKA GTO. We also have a different term for the type of strategy you're talking about which is MES (Maximally Exploitative Strategy), so I find keeping GTO as NE and MES as the actual best strategy to be the most functional use of words.
Completely agree with this. When I say GTO in the context of this thread (and 99.9% of the time in context of poker), I mean NE

plexiq
Could you quote/link that definition please? Not trolling, I don't think I've ever seen the term GTO used in any academic work and I've certainly never seen a formal definition that was different from a NE.

(My understanding is that GTO is basically just an informal poker marketing term that is used somewhat interchangeably with Nash Equilibrium.)

aner0
 Originally Posted by plexiq Could you quote/link that definition please? Not trolling, I don't think I've ever seen the term GTO used in any academic work and I've certainly never seen a formal definition that was different from a NE. (My understanding is that GTO is basically just an informal poker marketing term that is used somewhat interchangeably with Nash Equilibrium.)
I've heard it used a few times in the context of game theory with the meaning of "best possible strategy with the available information", though it's probably still not a super formal term.

