Hoss_TBF: "All top players use game theory, distributions, bluff ratios etc"
12-14-2012
, 02:57 PM
Quote:
I just thought a little bit about game theory, and one situation came to mind in which I don't see why GTO is unexploitable. This really bothers me, so I thought I'd just post it here:
Let's assume players A and B are both playing GTO strategies, and that the GTO strategy for player A is betting full pot on the river. Because player B is playing GTO as well, he has to call with 50% of his hands, which means player A has to value bet any hand which is better than 75% of player B's total range (player B is calling with the top 50% of his range, and player A has to be good more than 50% of the time against the calling range for it to be a value bet).
Now let's say player A has some hands in his range which are better than 70% of player B's hands. He can't value bet these for full pot because it would be an incorrect value bet, but he can bet them for ½ pot (balancing with the right frequency of bluffs, obviously). Player B now has to call with 66.7% of his range, so betting this second range for ½ pot is a +EV play.
Given that player B is playing GTO, player A's strategy with just one bet size is not the GTO strategy. The strategy with two bet sizes may be, but if you continue my example it's pretty obvious that the more bet sizes you have, the more EV the strategy gets. That leads me to believe that a GTO strategy has a lot of different bet sizes, depending on how much value your hand has, so the bet size of a GTO strategy should tell the opponent almost exactly which hands the GTO player is betting for value in a certain spot, which is obviously very exploitable.
In reverse, if you are playing this GTO strategy with many different bet sizes, your opponent, if he is playing GTO, can't take advantage of that, because he's not reading hands or interpreting anything in your bet sizing.
That means you can make money in these situations, and in their reverse, if you know your opponent is playing GTO, which obviously shouldn't be the case in theory.
I thought about this quite a bit but didn't see my mistake, so I hope you guys can help me out. It really blows my mind.
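For anyone who wants to check the numbers in the example above, here is a minimal sketch. It assumes the simplified river model the post uses (the caller defends with the top pot/(pot+bet) of his range and cannot raise, and a value bet needs to beat more than half of the calling range). The function names are made up for illustration.

```python
def min_defense_frequency(pot, bet):
    """Fraction of his range the caller must continue with so that
    a bluff of size `bet` into `pot` does not profit automatically."""
    return pot / (pot + bet)


def value_bet_threshold(pot, bet):
    """Fraction of the caller's TOTAL range a hand must beat to be a
    value bet, assuming the caller calls with exactly the top
    min_defense_frequency of his range and never raises."""
    calling_fraction = min_defense_frequency(pot, bet)
    # The bettor must beat more than half of the calling range,
    # i.e. everything below it plus the bottom half of the callers.
    return 1 - calling_fraction / 2


if __name__ == "__main__":
    for bet in (1.0, 0.5):
        print(
            f"bet {bet:.2f} pot: caller defends {min_defense_frequency(1.0, bet):.1%}, "
            f"value bet needs to beat {value_bet_threshold(1.0, bet):.1%} of his range"
        )
```

Under these assumptions a full-pot bet needs a hand that beats 75% of the caller's range, while a half-pot bet only needs to beat about 66.7%, which is why the 70% hand in the example qualifies for the smaller size only.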
Also, just because a particular move with a particular hand is +EV does not mean it is part of a GTO strategy. You sometimes pass up +EV moves for the sake of balance. In other words, precisely because you would be turning your hand face up, you pass up a move that seems +EV in a vacuum when it would enable your opponent to take that EV away by knowing what you have.
12-14-2012
, 04:26 PM
Join Date: Jun 2012
Posts: 2,474
Quote:
The fact that something is exploitable means it is not GTO. So if GTO requires different bet sizes, it will require that you bet full pot only SOME of the time with your top 25%, and bet half pot SOME of the time with your top 25%, enough to make it breakeven at best for your opponent to push back against your weaker value range.
Also, just because a particular move with a particular hand is +EV does not mean it is part of a GTO strategy. You sometimes pass up +EV moves for the sake of balance. In other words, precisely because you would be turning your hand face up, you pass up a move that seems +EV in a vacuum when it would enable your opponent to take that EV away by knowing what you have.
So many posts itt take the form of "what I don't get is if you just had a big enough rocket I'm pretty sure you could exceed the speed of light ldo".
12-14-2012
, 05:59 PM
Quote:
If you have a hand which is better than, let's say, 75% of your opponent's total range, a much smaller sizing has the highest EV (probably something between 1/3 and 1/2 pot). You can do this for all different kinds of hands, and I think the most +EV value-bet sizing is different every time, depending on how strong your hand actually is.
Given that, a GTO strategy should contain a lot of different bet sizes.
Now, if you know your opponent is playing GTO and you know everything about what this strategy looks like, you can make fairly good assumptions about what your opponent has. E.g. if your opponent bets 43.21% of the pot, you know for which hand in your opponent's range it is optimal to bet this exact amount. Now you can value-raise every hand which is better than this hand (and bluff with the right frequency as well). So you can extract value with hands you usually wouldn't even consider raising for value, just because your opponent's range is capped at this assumed hand.
In reverse, if you play this strategy and your opponent still plays GTO, he can't go for value this thin, because he doesn't make any assumptions about what your range looks like based on bet sizing.
Anyway, you said that the best strategy with the nuts is to bet big, but the best strategy with a hand of medium strength is to bet small. But then your opponent can exploit this by value-raising you very thin with hands that beat your medium-strength hand. Suppose he did this. What would your exploitive response be? To bet small with the nuts in order to smash his thin value raising. But maybe you should do that anyway? I don't know where you got the idea that betting big with the nuts is the best play anyway; it's true in some contrived situations where your opponent can never raise you.
So now what do we have? Maybe three hands for each bet size. A strong hand, a medium-strength hand, and a bluff, mixed in the right ratio. The strong hand gets value when the other guy raises, the medium-strength hand gets value when the other guy calls, and the bluff gets value when the other guy folds. If we mix these appropriately, we can profit from the opponents' deviations from optimal.
12-14-2012
, 08:14 PM
Join Date: Jun 2012
Posts: 358
INCORRECT:
If you mix appropriately so that you can't be exploited, you profit from your opponents' deviations.
This is only rarely true. In your example, if your opponent calls too much, your expectation for your bluffs is less but it is more for your value bets. If your opponent folds too much, vice versa. If your opponent raises too much, your expectation for your medium hands is less but it is more for your strong hands. Your net expectation is the same because it is this equilibrium that defines the correct mix. You do not profit.
CORRECT:
You can mix these appropriately to profit from your opponent's deviations.
If you know that your opponent is deviating you can suppress or augment that part of the mix to exploit the deviation. If he calls too much, you don't bluff. If he folds too much, you bluff more. If he raises too much, you suppress the thin value bets.
---------------------------------------------------
GTO strategies can't be exploited; however, they generally don't exploit imbalances in your opponent's strategy. You must unbalance yourself to take advantage of your opponent's imbalance; if he figures it out, he can exploit you! This is what 'feel players' do instinctively.
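The distinction can be illustrated with a toy river spot. This is only a sketch under simplifying assumptions that are not in the posts above: the bettor's range is either a value hand that always wins when called or a bluff that always loses, the caller holds only bluff-catchers, and nobody raises. The function names are hypothetical.

```python
def balanced_bluff_fraction(pot, bet):
    """Share of the betting range that should be bluffs so that the
    caller is indifferent between calling and folding."""
    return bet / (pot + 2 * bet)


def bettor_ev(pot, bet, bluff_fraction, call_frequency):
    """Bettor's average winnings when he bets his whole range."""
    # Value hand: wins pot + bet when called, wins the pot when folded to.
    ev_value = call_frequency * (pot + bet) + (1 - call_frequency) * pot
    # Bluff: loses the bet when called, wins the pot when folded to.
    ev_bluff = call_frequency * (-bet) + (1 - call_frequency) * pot
    return (1 - bluff_fraction) * ev_value + bluff_fraction * ev_bluff


if __name__ == "__main__":
    pot, bet = 1.0, 1.0
    balanced = balanced_bluff_fraction(pot, bet)
    for call_frequency in (0.3, 0.5, 0.9):
        print(
            f"caller calls {call_frequency:.0%}: "
            f"balanced mix EV = {bettor_ev(pot, bet, balanced, call_frequency):.3f}, "
            f"no-bluff EV = {bettor_ev(pot, bet, 0.0, call_frequency):.3f}"
        )
```

In this toy model the balanced mix earns the same amount whatever the caller's frequency is, which is the INCORRECT reading: the mix does not profit from the deviation. Dropping the bluffs against someone who calls too much does earn more, which is the CORRECT reading, and it is also what becomes exploitable if the caller adjusts by folding.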
12-14-2012
, 09:18 PM
All of this has happened before, and all of it will happen again.
12-14-2012
, 11:02 PM
Quote:
This statement can be taken in two ways; one correct and one incorrect.
INCORRECT:
If you mix appropriately so that you can't be exploited, you profit from your opponents' deviations.
This is only rarely true. In your example, if your opponent calls too much, your expectation for your bluffs is less but it is more for your value bets. If your opponent folds too much, vice versa. If your opponent raises too much, your expectation for your medium hands is less but it is more for your strong hands. Your net expectation is the same because it is this equilibrium that defines the correct mix. You do not profit.
CORRECT:
You can mix these appropriately to profit from your opponent's deviations.
If you know that your opponent is deviating you can suppress or augment that part of the mix to exploit the deviation. If he calls too much, you don't bluff. If he folds too much, you bluff more. If he raises too much, you suppress the thin value bets.
---------------------------------------------------
GTO strategies can't be exploited; however, they generally don't exploit imbalances in your opponent's strategy. You must unbalance yourself to take advantage of your opponent's imbalance; if he figures it out, he can exploit you! This is what 'feel players' do instinctively.
What I meant to say was that, with proper mixing, any deviation from optimal will be more profitable for one of the hand types, so the opponent can't exploit this type of strategy (in an obvious way). I got distracted and wrote something silly instead.
12-19-2012
, 08:23 AM
Do you think the solved GTO/equilibrium strategy uses call multipliers? (Not my term/concept, btw. It means the extra % hero needs to continue vs a bet when only calling, to avoid villain c-betting his entire range and freerolling with his marginal-equity hands.)
I try to incorporate call multipliers (1.2x OTF and 1.1x OTT) when doing simple GTO math for my ranges (as in calling 60% or w/e of my range on each street), but sometimes I struggle to reach the full number of combos needed (I simply don't have enough continuable holdings). I wonder if I might be wrongly trying to overdefend.
Last edited by shark_fishin; 12-19-2012 at 08:32 AM.
12-19-2012
, 09:08 AM
Carpal 'Tunnel
Join Date: May 2004
Posts: 9,557
No, "call multipliers" are a human construct that may or may not help you play closer to equilibrium. Also, it seems likely to me that there will be situations where you just can't stop your opponent from "cbetting his entire range and freerolling with his marginal equity hands". IIRC MOP covers an analogous situation where you're folding the river > X/(P+X) where X=opponent's bet and P=Pot. Sometimes because of position or previous action you're just at an inherent disadvantage even if you play the equilibrium strategy.
12-19-2012
, 10:00 AM
u cylon bro?
12-19-2012
, 11:31 AM
Quote:
No, "call multipliers" are a human construct that may or may not help you play closer to equilibrium. Also, it seems likely to me that there will be situations where you just can't stop your opponent from "cbetting his entire range and freerolling with his marginal equity hands". IIRC MOP covers an analogous situation where you're folding the river > X/(P+X) where X=opponent's bet and P=Pot. Sometimes because of position or previous action you're just at an inherent disadvantage even if you play the equilibrium strategy.
What got me thinking about it was a situation where I call blind vs blind with a 35% range in 6-max NLHE with 100bb stacks, and my GTO strategy with a 1.2x call multiplier requires me to call A8o or 77 on a Th2sJs flop for SDV, but my friend thinks those hands are really weak and are both folds. I thought maybe I shouldn't be adding a call multiplier after all. In fact, since the PFR's equilibrium strategy is probably to c/f 30% air on each street (to let my weakest flop and turn calls see showdown), the equity A8 has vs his betting range (Stove says 30%, but it's probably just the A pairing that counts, so maybe 10% or so) is like a freeroll also. So both parties have equity. I wonder if they cancel each other out?
Last edited by shark_fishin; 12-19-2012 at 11:38 AM.
12-19-2012
, 12:17 PM
^ his answer was pretty unambiguously no, it's actually the first word of his post
12-19-2012
, 02:55 PM
As best I understand it, the concept probably has some relevance to dynamic hand value games, but really only applies directly to static hand value games.
12-19-2012
, 03:25 PM
Join Date: Jan 2004
Posts: 5,635
The concept is for dynamic hand value games. The idea is you take the principle from static hand value games that you have to defend with pot/(pot+bet) frequency and multiply that number by some arbitrary coefficient to get a frequency you should defend in a dynamic hand value game.
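As a rough sketch of that calculation (the function names and the cap at 100% are assumptions made for illustration, and the 1.2 multiplier is just the arbitrary coefficient quoted earlier in the thread, not something an equilibrium strategy is known to use):

```python
def static_defense_frequency(pot, bet):
    """pot / (pot + bet): the defending frequency from static hand value games."""
    return pot / (pot + bet)


def adjusted_defense_frequency(pot, bet, multiplier=1.0):
    """Static frequency scaled by an arbitrary 'call multiplier',
    capped at 1.0 because you cannot continue with more than your whole range."""
    return min(1.0, multiplier * static_defense_frequency(pot, bet))


def combos_to_continue(range_combos, pot, bet, multiplier=1.0):
    """Number of combos in a range of `range_combos` that the rule says to continue with."""
    return range_combos * adjusted_defense_frequency(pot, bet, multiplier)


if __name__ == "__main__":
    # Pot-sized bet with a 1.2x multiplier gives the 60% figure mentioned above.
    print(adjusted_defense_frequency(pot=1.0, bet=1.0, multiplier=1.2))
    # Hypothetical 400-combo range facing the same bet.
    print(combos_to_continue(range_combos=400, pot=1.0, bet=1.0, multiplier=1.2))
```

The output is only a target count of combos. Whether the range actually contains that many hands worth continuing with is a separate question, which is the problem described a few posts up.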
12-19-2012
, 04:13 PM
Quote:
The concept is for dynamic hand value games. The idea is you take the principle from static hand value games that you have to defend with pot/(pot+bet) frequency and multiply that number by some arbitrary coefficient to get a frequency you should defend in a dynamic hand value game.
12-19-2012
, 10:14 PM
One thing: Any time you play poker, whether live or online, you will have extra information about your opponent's play. This could include stats, timing, situational information, as well as physical tells and even "hunches."
It looks to me like some of the gto poker extrapolations or whatever you want to call them are based on totally information neutral scenarios, for instance two insentient computers playing against each other.
In a real game, there is always a vast and unquantifiable amount of extra information floating around.
Another thing: I might be wrong about this, or completely missing the point, but there seems to be a peculiar kind of recursion that could happen in a poker game because of the random element of the cards.
Let's say I realize that my opponent is playing a "game theory optimal" brand of poker. Since I know that he is, let's say, balancing all his actions, then I could get a pretty good idea of what he is doing, say if I reverse engineer like "what would a gto guy do?" and if I understand the correct gto strategy and know that he's playing it, I could predict. THEN if I play not-balanced, say I do such and such with my aces, such and such with 5-7 suited, such and such with AK, then, and I mean I obviously haven't worked this out, but it seems like I would have an advantage.
Like if an unbalanced player 4 bets me preflop, he got it and I easily fold. But if the "balanced" gto guy 4 bets me, then maybe my all-in fold equity or racing equity is that much greater, because I know he is balancing so he must have air some percentage of the time, and that air goes straight into my equity column.
Obviously I see the value of balancing; however, it seems like if deception is lacking, then the opponent could snatch back the advantage.
12-19-2012
, 10:21 PM
Join Date: Feb 2010
Posts: 527
Quote:
One thing: Any time you play poker, whether live or online, you will have extra information about your opponent's play. This could include stats, timing, situational information, as well as physical tells and even "hunches."
It looks to me like some of the gto poker extrapolations or whatever you want to call them are based on totally information neutral scenarios, for instance two insentient computers playing against each other.
In a real game, there is always a vast and unquantifiable amount of extra information floating around.
Another thing: I might be wrong about this, or completely missing the point, but there seems to be a peculiar kind of recursion that could happen in a poker game because of the random element of the cards.
Let's say I realize that my opponent is playing a "game theory optimal" brand of poker. Since I know that he is, let's say, balancing all his actions, then I could get a pretty good idea of what he is doing, say if I reverse engineer like "what would a gto guy do?" and if I understand the correct gto strategy and know that he's playing it, I could predict. THEN if I play not-balanced, say I do such and such with my aces, such and such with 5-7 suited, such and such with AK, then, and I mean I obviously haven't worked this out, but it seems like I would have an advantage.
Like if an unbalanced player 4 bets me preflop, he got it and I easily fold. But if the "balanced" gto guy 4 bets me, then maybe my all-in fold equity or racing equity is that much greater, because I know he is balancing so he must have air some percentage of the time, and that air goes straight into my equity column.
Obviously I see the value of balancing; however, it seems like if deception is lacking, then the opponent could snatch back the advantage.
Yes, 5bet bluffing vs a GTO bot might be more profitable than 5bet bluffing a certain unbalanced player, but it won't have a net positive winrate for you. If you adapt perfectly to the GTO bot, then you will have a winrate of 0bb/100.
What you say about information is true, insofar as that a GTO strategy could never really be used in live poker, due to the inherently exploitable human facets of the game.
12-19-2012
, 11:49 PM
This entire thread is pointless. Obviously GTO play in poker has already been solved.
https://www.youtube.com/watch?v=IZL7uiLGsO0
12-19-2012
, 11:54 PM
Quote:
One thing: Any time you play poker, whether live or online, you will have extra information about your opponent's play. This could include stats, timing, situational information, as well as physical tells and even "hunches."
It looks to me like some of the gto poker extrapolations or whatever you want to call them are based on totally information neutral scenarios, for instance two insentient computers playing against each other.
In a real game, there is always a vast and unquantifiable amount of extra information floating around.
Another thing: I might be wrong about this, or completely missing the point, but there seems to be a peculiar kind of recursion that could happen in a poker game because of the random element of the cards.
Let's say I realize that my opponent is playing a "game theory optimal" brand of poker. Since I know that he is, let's say, balancing all his actions, then I could get a pretty good idea of what he is doing, say if I reverse engineer like "what would a gto guy do?" and if I understand the correct gto strategy and know that he's playing it, I could predict. THEN if I play not-balanced, say I do such and such with my aces, such and such with 5-7 suited, such and such with AK, then, and I mean I obviously haven't worked this out, but it seems like I would have an advantage.
Like if an unbalanced player 4 bets me preflop, he got it and I easily fold. But if the "balanced" gto guy 4 bets me, then maybe my all-in fold equity or racing equity is that much greater, because I know he is balancing so he must have air some percentage of the time, and that air goes straight into my equity column.
Obviously I see the value of balancing; however, it seems like if deception is lacking, then the opponent could snatch back the advantage.
Also, since we really have no idea what an actual GTO strategy in NLHE would look like, it's very likely that you wouldn't recognize it even if you saw it.
12-20-2012
, 01:00 AM
Join Date: Sep 2002
Posts: 652
Two of you have already responded to anilyzer's "reverse engineering" comment, but I want to try another way of saying it. It's been said in this thread before, but bears repeating.
He's actually totally right that one could "reverse-engineer" the exact perfect strategy to play against the GTO guy. But this is the power of a GTO strategy pair. Player X says "I will play strategy x. And if you play *perfectly* against me (which is actually easy to do once I tell you my strategy, see "calculating a nemesis" earlier in this thread) I will make an EV of exactly z."
So far, this is not very interesting. I mean, if I'm in the small blind in a 1/2 NL heads up game, I can say "I will fold every hand, and make at least -1 every hand, no matter what you do."
The amazing thing about a Nash Equilibrium is that it says the best strategy for Y will have a value that is *exactly equal* to the negative of z for X's best strategy. So to repeat:
X says "If I play x, and you do *the best you can* I will make z."
Y says "If I play y, and you do *the best you can* I will make -z."
I think anilyzer's mistake is in not realizing that when X says "I make at least z", that means *already taking into account the sort of manipulations he's describing.* It's *easy* to play perfectly if the other guy tells you his strategy. Yet we find (making up numbers here) that the small blind can say "I will play this way and *no matter what you do* I will average making .2/hand. If you play badly, I'll make more." And the big blind can say "If I play this way, I'll lose no more than .2/hand. No matter what you do, I won't lose more than that. If you play badly, I won't even lose that much."
So if you play a GTO strategy, you will get *at least* the value of the game. And if your opponent does so also, you will make exactly that value.
As has been pointed out many times, that is not necessarily a good thing. You won't exploit a bad opponent. And in fact, there are GTO strategies that make stupid plays (but only in response to a stupid opponent). It's not all bad, though: in simple games like rock-paper-scissors, if you play GTO your opponent can't make a mistake. However, poker isn't like this; there are plenty of opportunities to make plays that look OK but lose money to a GTO strategy.
Not trying to bag on anilyzer here, he put plenty of qualifiers in to say he wasn't sure. Just trying to help those still struggling with the math. [Hey, you're not in bad company, apparently durrrrr doesn't quite get it either.]
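The "at least the value of the game" guarantee is easy to check on rock-paper-scissors, where the value is 0. The sketch below is only an illustration with made-up names: `guaranteed_value` computes what a strategy earns against its nemesis, meaning an opponent who knows the strategy and plays the best pure counter.

```python
# Row player's payoff; index 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
PURE = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def ev(row_strategy, col_strategy):
    """Expected payoff to the row player when both sides mix independently."""
    return sum(
        p * q * PAYOFF[i][j]
        for i, p in enumerate(row_strategy)
        for j, q in enumerate(col_strategy)
    )


def guaranteed_value(row_strategy):
    """Worst case for the row player: the opponent knows the strategy
    and answers with whichever pure strategy hurts it most."""
    return min(ev(row_strategy, pure) for pure in PURE)


if __name__ == "__main__":
    gto = [1 / 3, 1 / 3, 1 / 3]
    rock_heavy = [0.5, 0.25, 0.25]
    print("GTO guarantee:       ", guaranteed_value(gto))
    print("rock-heavy guarantee:", guaranteed_value(rock_heavy))
```

The uniform mix cannot be pushed below 0 even by an opponent who knows it, while the rock-heavy mix can. Rock-paper-scissors is also the case where no reply loses to the uniform mix, so the GTO player never makes more than 0 either; poker differs on that point, as noted above.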
He's actually totally right that one could "reverse-engineer" the exact perfect strategy to play against the GTO guy. But this is the power of a GTO strategy pair. Player X says "I will play strategy x. And if you play *perfectly* against me (which is actually easy to do, once I tell you my strategy, see "calculating a nemesis" earlier in this thread) I will make an EV of exactly z.
So far, this is not very interesting. I mean, if I'm in the small blind in a 1/2 NL heads up game, I can say "I will fold every hand, and make at least -1 every hand, no matter what you do."
The amazing thing about a Nash Equilibrium is that it says the best strategy for Y will have a value that is *exactly equal* to the negative of z for X's best strategy. So to repeat:
X says "If I play x, and you do *the best you can* I will make z."
Y says "If I play y, and you do *the best you can* I will make -z."
I think anailyzer's mistake is in not realizing that when X says "I make at least z" that means *already taking into account the sort of manipulations he's describing." It's *easy* to play perfectly if the other guy tells you his strategy. Yet we find (making up numbers here) The small blind can say "I will play this way and *no matter what you do* I will average making .2/hand. If you play badly, I'll make more." And the big blind can say "If I play this way, I'll lose no more than .2/hand. No matter what you do, I won't lose more than that. If you play badly, I won't even lose that much."
So if you play a GTO strategy, you will get *at least* the value of the game. And if your opponent does so also, you will make exactly that value.
As has been pointed out many times, that is not necessarily a good thing. You won't exploit a bad opponent. And in fact, there are GTO strategies that make stupid plays (but only in response to a stupid opponent.) It's not all bad, though" in simple games, like rock-paper-scissors, if you play GTO your opponent can't make a mistake - however, poker isn't like this, there are plenty of opportunities to make plays that look ok but lose money to a GTO strategy.
Not trying to bag on anilyzer here, he put plenty of qualifiers in to say he wasn't sure. Just trying to help those still struggling with the math. [Hey, you're not in bad company, apparently durrrrr doesn't quite get it either.]
12-20-2012
, 05:14 AM
People can vary their calling/raising/all-in ranges, they can play tighter or looser regardless of stack size. This should automatically alter the expectation of the other player, and if he isn't flexible to adjust, we can't just gloss over this by saying "ok, I will win X, but he will win -X, so we're all even and he's unexploitable."
Here. This is from Wikipedia:
["the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his own strategy unilaterally. If each player has chosen a strategy and no player can benefit by changing his or her strategy while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium."]
http://en.wikipedia.org/wiki/Nash_equilibrium
I mean, the "this works as long as the other player doesn't..." highlighted sections of this definition seem to perfectly describe exactly what poker players are constantly doing.
Hopefully I'm not being unbearably obtuse about this. To be honest my only exposure to this was in the "Raiser's Edge" (excellent book) section on it, but I just skimmed that part because I found the hand examples and strategy/meta-strategy discussions far more interesting.
It just seems to me like there would be some theoretical questions about this to solve before we just jump in and start doing the math to come up with ranges. I mean with advanced stats, you could maybe plug that data into a program that would come up with some kind of adjusted equilibrium gto range for a specific situation...
But say there is some specific game theory optimal range for pushing 20 BBs utg+1 with 6 players left at final table. Whatever that "optimal" range is, it seems like I could make it better by playing tighter than that in some situations, right? Or let's say I loosen it up in some situations.
Like if I am pushing 30% and everybody is folding unless they get AA/KK/QQ/AK-AQ, then all those times they fold, what difference does it make WHAT my hand was? If it's just a matter of frequency or percentages. When they call I have 25% or 15% equity usually, and when they fold I make a big blind. If they widen their range significantly then maybe my equity is one or two points higher because I have 22 and A2 in my range, but don't have J6 or T5 or whatever.
so like if I shove 20bbs and everybody folds, and then I'm like "yeah bitches I was gto" it's like, meh, ok, I guess. But it almost seems like I could shove the exact opposite low end of the range +AA and KK the same percentage of the time and get a very comparable result.
Ok, I'm now officially rambling. Later
12-20-2012
, 06:24 AM
Quote:
It just seems clear to me that gto/Nash equilibriums would have to be like functions, that would take something like the other players ranges as an argument. I glanced around the net and there are a ton of hardcoded nash equilibrium tables for poker that only take into account stack size, I don't see how these can possibly be correct.
Quote:
People can vary their calling/raising/all-in ranges, they can play tighter or looser regardless of stack size. This should automatically alter the expectation of the other player, and if he isn't flexible to adjust, we can't just gloss over this by saying "ok, I will win X, but he will win -X, so we're all even and he's unexploitable."
Quote:
Here. This is from Wikipedia:
["the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players, in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his own strategy unilaterally. If each player has chosen a strategy and no player can benefit by changing his or her strategy while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium."]
http://en.wikipedia.org/wiki/Nash_equilibrium
I mean, the "this works as long as the other player doesn't..." highlighted sections of this definition seem to perfectly describe exactly what poker players are constantly doing.
Quote:
But say there is some specific game theory optimal range for pushing 20 BBs utg+1 with 6 players left at final table. Whatever that "optimal" range is, it seems like I could make it better by playing tighter than that in some situations, right? Or let's say I loosen it up in some situations.
Like if I am pushing 30% and everybody is folding unless they get AA/KK/QQ/AK-AQ, then all those times they fold, what difference does it make WHAT my hand was? If it's just a matter of frequency or percentages. When they call I have 25% or 15% equity usually, and when they fold I make a big blind. If they widen their range significantly then maybe my equity is one or two points higher because I have 22 and A2 in my range, but don't have J6 or T5 or whatever.
so like if I shove 20bbs and everybody folds, and then I'm like "yeah bitches I was gto" it's like, meh, ok, I guess. But it almost seems like I could shove the exact opposite low end of the range +AA and KK the same percentage of the time and get a very comparable result.
Ok, I'm now officially rambling. Later
To give you an example of the value of GTO: suppose you are playing online poker against an unknown opponent. Behind your opponent's computer, however, is a room full of an unknown number of different players. Some are fish, some are maniacs, some are pros. You never know, in any given hand, which you are playing against. It could be a passive station one hand, a maniac the next; you could end up playing the entire match against durrr. You have no way of knowing from hand to hand. Since you have no way of adjusting your play based on opponent tendencies, what should you do? The answer is to try to find a balanced strategy that does well against any opponent: reasonable preflop ranges, and solid postflop lines with a good balance of value betting and bluffs. If you were to play this way in the absolute most perfect way possible, then you would have found a GTO strategy.
12-20-2012
, 06:18 PM
Thanks for the very detailed response. I'm starting to get a clearer idea about this.
One thing:
It's hard for me to visualize how math could be used to find that perfect way of playing.
I mean, if we consider the possible number of two hand + flop combinations, [x][x] + [x][x] + [x][x][x] that right there is a large number. And that's just one hand heads up.
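For a sense of scale, that count can be worked out directly. This is a rough sketch that assumes the two players are distinguishable, card order within a hand or flop doesn't matter, and no suit symmetries are collapsed; the variable names are just illustrative.

```python
from math import comb

hero_hands = comb(52, 2)     # hole cards for the first player
villain_hands = comb(50, 2)  # hole cards for the second player, from the remaining 50
flops = comb(48, 3)          # three-card flops from the remaining 48

total = hero_hands * villain_hands * flops
print(f"{hero_hands:,} x {villain_hands:,} x {flops:,} = {total:,}")
```

That lands in the tens of billions before any turn card, river card, or betting decision is considered.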
Now consider that we are playing hundreds, or even thousands of iterations or hands. And maybe that there are seven other players with various cards as well.
And then consider that I may be altering my strategy in different ways, based on the way I am responding to the opponent's actions and strategy, and what I perceive that he thinks, or thinks that I am thinking, etc.
That's a lot of variables to consider, if we're going to use math to plan out a fixed strategy approach in advance and declare it to be at least breakeven no matter what happens, no matter what the opponent's actions are.
I mean, sure, on Wall St. the AIs can figure out that buying and selling the same stock 1,000 times a day in response to what their enemy AIs are doing will net a profit or prevent the enemy from netting a profit. It's really a place that humans can't visualize, but apparently it's possible. Or maybe the AIs might've figured out that by colluding together to crash the market in 2008 that within 2 years they would be able to turn an even greater profit and consolidate far more of the dollars in play than they had before. It's possible that AIs can run so many deep permutations that they come up with conclusions that are completely alien to human thinking, but still work. I think it was in Wired magazine, there was an article about Wall Street AIs, how they will traverse this vast series of seemingly unnecessary actions and transactions just to arrive at a point of making one dollar, or a seemingly insignificant amount of money. But they will do these calculations and repetitive transactions a zillion times a day, that's just how they roll.
Like as I understand it, chess computers if they play each other will either totally smash their armies into each other and finish the game really quickly, based on a slight mathematical edge, or else will play extremely counterintuitive defensive games that could go for tens of thousands of moves and end in a draw every time.
I like the idea of gto poker, but I have a doubt as to whether it is truly gto in a reasonably limited sample size, for instance a 10 hour live session of approximately 450 hands. Or in a single sng tournament.
I mean I can push 20bbs from middle position, get called and have 28% equity, lose, and then tell myself "I was gto right there and totally unexploitable. I may have lost but I was gto." But at some point what does that really mean?
12-20-2012
, 06:33 PM
I don't have time for a long response but I'll try to get a quick one off.
It means that, assuming your shove there was really GTO, there's no strategy your opponent can play which exploits your range there. Maybe they only call you with the nuts and sure, your equity is bad when you get called, but you're winning the pot uncontested most of the time; maybe they call you frequently, but then your equity is better when you get it in.
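That trade-off can be sketched with some quick arithmetic. This is a deliberately simplified chip-EV model with assumptions of my own: exactly one opponent can call, the caller covers you, the caller's posted blind is ignored, and there is no ICM. The function name and every number below are hypothetical, picked only to show the shape of the effect.

```python
def shove_ev(stack, pot, fold_prob, equity_when_called):
    """Chip EV (in big blinds) of open-shoving `stack` into a `pot` of blinds/antes.

    Simplified model: one potential caller who covers us, no ICM,
    and the caller's already-posted blind is ignored.
    """
    win_when_called = pot + stack  # we win the dead money plus a matching stack
    lose_when_called = stack       # we lose what we shoved
    called_ev = (equity_when_called * win_when_called
                 - (1 - equity_when_called) * lose_when_called)
    return fold_prob * pot + (1 - fold_prob) * called_ev

# Hypothetical tight opponent: rarely calls, but crushes us when he does.
tight = shove_ev(stack=20, pot=1.5, fold_prob=0.90, equity_when_called=0.25)

# Hypothetical loose opponent: calls a lot, but with a weaker range.
loose = shove_ev(stack=20, pot=1.5, fold_prob=0.60, equity_when_called=0.45)

print(f"vs tight caller: {tight:+.4f} bb")
print(f"vs loose caller: {loose:+.4f} bb")
```

With these made-up inputs the two results come out close to each other and both positive, which is the point above: the opponent changing his calling range moves fold equity and showdown equity in opposite directions. Whether a given real shoving range is actually GTO is a separate question this sketch does not answer.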
And as for your general point, yes, calculating GTO play is complicated. In fact, for NLHE, it's almost unfathomably complicated, and certainly beyond the capability of any of our computers for the foreseeable future. Computers have managed to calculate near-GTO strategy for HU limit poker specifically, because the number of possible situations is relatively small, and even then it requires a colossal amount of computation. If you add in more players, or switch to no-limit, the game tree quickly grows far beyond our ability to compute. You won't be seeing truly GTO bots any time soon, or probably even within your lifetime; but the point is that Nash proved that a GTO strategy does exist, even if it is far beyond our ability to calculate.
12-20-2012
, 08:14 PM
Quote:
Thanks for the very detailed response. I'm starting to get a clearer idea about this.
One thing:
It's hard for me to visualize how math could be used to find that perfect way of playing.
I mean, if we consider the possible number of two hand + flop combinations, [x][x] + [x][x] + [x][x][x] that right there is a large number. And that's just one hand heads up.
Now consider that we are playing hundreds, or even thousands of iterations or hands. And maybe that there are seven other players with various cards as well.
And then consider that I may be altering my strategy in different ways, based on the way I am responding to the opponent's actions and strategy, and what I perceive that he thinks, or thinks that I am thinking, etc.
That's a lot of variables to consider, if we're going to use math to plan out a fixed strategy approach in advance and declare it to be at least breakeven no matter what happens, no matter what the opponent's actions are.
I mean, sure, on Wall St. the AIs can figure out that buying and selling the same stock 1,000 times a day in response to what their enemy AIs are doing will net a profit or prevent the enemy from netting a profit. It's really a place that humans can't visualize, but apparently it's possible. Or maybe the AIs might've figured out that by colluding together to crash the market in 2008 that within 2 years they would be able to turn an even greater profit and consolidate far more of the dollars in play than they had before. It's possible that AIs can run so many deep permutations that they come up with conclusions that are completely alien to human thinking, but still work. I think it was in Wired magazine, there was an article about Wall Street AIs, how they will traverse this vast series of seemingly unnecessary actions and transactions just to arrive at a point of making one dollar, or a seemingly insignificant amount of money. But they will do these calculations and repetitive transactions a zillion times a day, that's just how they roll.
Like as I understand it, chess computers if they play each other will either totally smash their armies into each other and finish the game really quickly, based on a slight mathematical edge, or else will play extremely counterintuitive defensive games that could go for tens of thousands of moves and end in a draw every time.
I like the idea of gto poker, but I have a doubt as to whether it is truly gto in a reasonably limited sample size, for instance a 10 hour live session of approximately 450 hands. Or in a single sng tournament.
I mean I can push 20bbs from middle position, get called and have 28% equity, lose, and then tell myself "I was gto right there and totally unexploitable. I may have lost but I was gto." But at some point what does that really mean?
12-20-2012
, 08:46 PM
Quote:
Oh. I think I'm about 1/6 in this thread, heh. Well if that concept did have any validity, it would only need to be true averaged across all boards, right? So thinking about it at the table while faced with a specific board would be pretty useless anyway. Maybe. Always glad to be told why I'm wrong in any case :/