Simulation vs. GTO vs. AI, vs. developing instinct vs. exploitation

09-13-2018 , 09:01 PM
Let's start a discussion about Simulation vs. GTO vs. AI, vs. developing instinct or exploitation. To keep it simple, let's base comparisons on practical win rate improvement.

GTO and AI can both produce a perfect or near-perfect no-limit Hold'em solution, but neither is easy to use in its current state.

Developing 'instinct' by spending hundreds or thousands of hours studying is suggested by most coaches. Not many of us have the time or commitment.

Exploitation has always been a good but incomplete strategy.

Simulation can provide useful information but is not precise. I assume here a simulator that can approximate a real player's play against 5 other players, not something simplistic like PokerStove.

I believe that simulation can produce useful results by playing millions of simulated hands. It could approximate the EV of every preflop starting hand and every flop starting hand or draw, rate boards, and allow extensive experimentation.

If the goal is simply to improve win rate, not achieve perfection, then simulation may be the best answer.

Old joke punchline: I don't have to outrun the bear. I only have to outrun you.
09-14-2018 , 09:06 AM
Quote:
Originally Posted by Fishing
Let's start a discussion about Simulation vs. GTO vs. AI, vs. developing instinct or exploitation. To keep it simple, let's base comparisons on practical win rate improvement.

GTO and AI can both produce a perfect or near-perfect no-limit Hold'em solution, but neither is easy to use in its current state.

Developing 'instinct' by spending hundreds or thousands of hours studying is suggested by most coaches. Not many of us have the time or commitment.

Exploitation has always been a good but incomplete strategy.

Simulation can provide useful information but is not precise. I assume here a simulator that can approximate a real player's play against 5 other players, not something simplistic like PokerStove.

I believe that simulation can produce useful results by playing millions of simulated hands. It could approximate the EV of every preflop starting hand and every flop starting hand or draw, rate boards, and allow extensive experimentation.

If the goal is simply to improve win rate, not achieve perfection, then simulation may be the best answer.

Old joke punchline: I don't have to outrun the bear. I only have to outrun you.
Ok. How would you simulate profitable situations based on uniquely human failings? Tilt, lack of sleep, changes in decision making based on the dopamine spike from winning a big pot?
09-14-2018 , 10:14 AM
There is another method. Geometry. There is an underlying geometric proof that predicts the results which the A.I. then proves.

I do not have such a geometric proof, but getting there would be easier than solving the entire tree, which is the only other option.

I believe the nature of poker, and that Nash Equilibrium is a given for poker, predicts the existence of the geometric solution.

The stated goal in the thread is to improve WR the most. One correctly solved graph contains every possible action, with every possible combo. To know the most optimal action is as simple (or complicated) as asking ‘where am I located on this graph?’.
09-14-2018 , 02:59 PM
Quote:
Originally Posted by hAmThEkIlLeR
Ok. How would you simulate profitable situations based on uniquely human failings? Tilt, lack of sleep, changes in decision making based on the dopamine spike from winning a big pot?
Simulation is based on averages and frequencies. It does not attempt to play against an individual player.

Assume that the simulator has analyzed tens of thousands of real hand history files and has used this information to develop profiles of an average player, a fish, a nit, a LAG, and a TAG. Each profile is then used to develop opponent strategies (ranges and rules) in the simulator. Average human failings are thus part of the opponent profiles.
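
As a rough illustration of turning hand-history frequencies into a profile label, here is a minimal Python sketch. It assumes each player has already been reduced to counts of hands seen, hands where money went in voluntarily preflop (VPIP), and hands raised preflop (PFR). The `PlayerStats` type, the `classify` function, and every cutoff below are hypothetical choices for illustration, not values taken from any real tracker or from the simulator described in this thread.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    hands: int        # total hands observed for this player
    vpip_hands: int   # hands where the player voluntarily put money in preflop
    pfr_hands: int    # hands where the player raised preflop


def classify(stats: PlayerStats) -> str:
    """Label a player from preflop frequencies (all cutoffs are assumptions)."""
    vpip = stats.vpip_hands / stats.hands
    pfr = stats.pfr_hands / stats.hands
    if vpip < 0.15:                      # plays very few hands
        return "nit"
    aggressive = pfr >= 0.6 * vpip       # raises most of the hands it plays
    if vpip >= 0.28:                     # plays many hands
        return "LAG" if aggressive else "fish"
    return "TAG" if aggressive else "average"


# Example: a player who enters 45% of pots but raises only 8% of hands.
print(classify(PlayerStats(hands=1000, vpip_hands=450, pfr_hands=80)))
```

A real profile would need far more than two preflop frequencies (postflop bet, call, and fold rates by street, for a start), but the idea is the same: the label is only a summary of averages.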
09-14-2018 , 03:01 PM
Quote:
Originally Posted by robert_utk
There is another method. Geometry. There is an underlying geometric proof that predicts the results which the A.I. then proves.

I do not have such a geometric proof, but getting there would be easier than solving the entire tree, which is the only other option.

I believe the nature of poker, and that Nash Equilibrium is a given for poker, predicts the existence of the geometric solution.

The stated goal in the thread is to improve WR the most. One correctly solved graph contains every possible action, with every possible combo. To know the most optimal action is as simple (or complicated) as asking ‘where am I located on this graph?’.
Please elaborate on Geometry. I am not familiar with that as an option. How would it be implemented?
09-14-2018 , 05:05 PM


for example

Point A is a min bet.

Point ? is undefined without stack information; Point ? may very well be all in somewhere far above the 0ev line with a strong hand.

The curve is the bettor's ev, which is a function of potsize.

It could very well be that the curve starts above the 0ev line on the left and ends below the 0ev line at point ?, as would happen with a thin value hand.

I just drew it the way I did for simplicity.
09-14-2018 , 05:13 PM
Yes, and underneath the bet size curve, you solve for easy to remember threshold combos. It will vary in runouts which should be led sometimes, vs other runouts that oop will never lead. Once you have that graph done *correctly* you will always know how to respond. Card removal plus SPR both add more complexity. But once you get it all put together, that is it. Done.
09-14-2018 , 05:27 PM
Back to the original post:

I think exploitive play wins easily but there are a number of catches, one for each and every chance you get to exploit.

To put it simply, I'm much more likely to exploit someone on the river than preflop. For example, even vs. the tightest players, I'll never raise 72o in standard conditions, even in a heads-up match (exceptions include but are not limited to players sitting out and ICM fun in tourneys). However, under the correct circumstances on the river, I may call with every hand that can beat (bet/final pot) of my opponent's range, or I may fold all bluffcatchers up to the point of beating a portion of my opponent's value hands that is smaller than (bet/final pot) but > 0%.
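
The (bet/final pot) threshold can be checked with a few lines of Python. This is a small sketch, assuming "final pot" means the pot before the bet plus the bet plus the call, and ignoring rake. The function names are made up for illustration.

```python
def required_equity(pot: float, bet: float) -> float:
    """Share of the bettor's range a call must beat to break even."""
    final_pot = pot + 2 * bet          # pot before the bet + bet + call
    return bet / final_pot


def call_ev(pot: float, bet: float, win_prob: float) -> float:
    """EV of calling a river bet, measured from the moment of the call."""
    return win_prob * (pot + bet) - (1 - win_prob) * bet


# Facing a half-pot bet, a bluffcatcher must beat 25% of the betting range.
print(required_equity(pot=100, bet=50))          # 0.25
print(call_ev(pot=100, bet=50, win_prob=0.25))   # 0.0
```

Against an opponent who bluffs more often than that threshold implies, every bluffcatcher is a call. Against one who bluffs less, the same hands become folds, which is the exploit described above.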

The point is that when there's more poker to be played, there's more room for counter exploitation. I think this quality is independent of the fact that the river is a static game, as we could make a game that ends with (random element) after the last round of betting and we will still see the most, or best, opportunities to exploit on the last round of betting.

However, I think that there's a gray area where it's necessary to have some sense of where the margin is (gto ev on an individual hand basis) in order to exploit, and also to know when you're in the deep end, or when you're very far from the margin.
09-14-2018 , 05:35 PM
Quote:
Originally Posted by Bob148


for example

Point A is a min bet.

Point ? is undefined without stack information; Point ? may very well be all in somewhere far above the 0ev line with a strong hand.

The curve is the bettor's ev, which is a function of potsize.

It could very well be that the curve starts above the 0ev line on the left and ends below the 0ev line at point ?, as would happen with a thin value hand.

I just drew it the way I did for simplicity.
A simulator could vary the bet size and then compare win rates. Starting stacks can also be varied. I have not tried that yet, but will. Thanks. Good post.
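
A bet-size sweep of that kind might look like the Python sketch below. It is a toy river model, not a real simulator: hand strengths are just numbers in [0, 1), hero holds a fixed thin-value hand, and villain is assumed to call with the top pot/(pot+bet) share of hands. That response rule, the function names, and all parameters are assumptions made for illustration. The same sampled villain hands are reused for every bet size so that the comparison between sizes is less noisy.

```python
import random


def river_profit(bet, pot, hero_strength, villain_strength):
    """Hero's chips won on the river against one sampled villain hand."""
    call_freq = pot / (pot + bet)            # assumed villain response to this size
    if villain_strength <= 1.0 - call_freq:
        return pot                           # villain folds, hero takes the pot
    if villain_strength < hero_strength:
        return pot + bet                     # called by a worse hand
    return -bet                              # called by a better hand


def sweep_bet_sizes(bet_sizes, pot=100.0, hero_strength=0.7,
                    n_hands=100_000, seed=1):
    """Estimate hero's average profit for each bet size on shared samples."""
    rng = random.Random(seed)
    draws = [rng.random() for _ in range(n_hands)]   # villain hand strengths
    return {
        bet: sum(river_profit(bet, pot, hero_strength, u) for u in draws) / n_hands
        for bet in bet_sizes
    }


results = sweep_bet_sizes([0, 25, 50, 100, 200])
best_bet = max(results, key=results.get)
for bet, ev in results.items():
    print(f"bet {bet:>3}: average profit {ev:7.2f}")
print("best size in this toy model:", best_bet)
```

In this toy setup a small bet does best and large bets fall below the value of checking, which is the thin-value shape described in the quoted post. Change the assumed calling rule and the curve changes with it.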
09-14-2018 , 05:45 PM
Quote:
Originally Posted by Bob148
Back to the original post:

I think exploitive play wins easily but there are a number of catches, one for each and every chance you get to exploit.

To put it simply, I'm much more likely to exploit someone on the river than preflop. For example, even vs. the tightest players, I'll never raise 72o in standard conditions, even in a heads-up match (exceptions include but are not limited to players sitting out and ICM fun in tourneys). However, under the correct circumstances on the river, I may call with every hand that can beat (bet/final pot) of my opponent's range, or I may fold all bluffcatchers up to the point of beating a portion of my opponent's value hands that is smaller than (bet/final pot) but > 0%.

The point is that when there's more poker to be played, there's more room for counter exploitation. I think this quality is independent of the fact that the river is a static game, as we could make a game that ends with (random element) after the last round of betting and we will still see the most, or best, opportunities to exploit on the last round of betting.

However, I think that there's a gray area where it's necessary to have some sense of where the margin is (gto ev on an individual hand basis) in order to exploit, and also to know when you're in the deep end, or when you're very far from the margin.

The nature of simulation is that it must include enough hands for convergence and that it must deal with averages. Specific conditions can be included in the simulation, but that is not its purpose.

Simulation is never precise. It's just a tool that can allow for very fast verification of theories and generate results that are very close.

For example, the EV of a starting hand does not have to be precise if it is used instead of Equity for selecting hands in a range. We only need to be sure that the order of hands from best to worst is correct.
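
The point about ordering can be sketched in Python. The example below does not play poker. It only draws noisy per-hand results around made-up "true" EVs (in big blinds per hand) and checks whether sorting by the sample mean recovers the right order. The hand labels, the EV values, and the 5 bb standard deviation are all placeholders chosen for illustration.

```python
import random


def estimate_ev(true_ev, stdev, n_hands, rng):
    """Return (sample mean, standard error) of n_hands noisy results."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n_hands):
        result = rng.gauss(true_ev, stdev)   # one simulated hand result
        total += result
        total_sq += result * result
    mean = total / n_hands
    variance = (total_sq - n_hands * mean * mean) / (n_hands - 1)
    return mean, (variance / n_hands) ** 0.5


# Hypothetical true EVs in bb/hand (placeholders, not measured values).
true_evs = {
    "premium pair": 2.0,
    "suited broadway": 0.6,
    "suited connector": 0.0,
    "offsuit trash": -0.6,
}

rng = random.Random(2018)
estimates = {hand: estimate_ev(ev, stdev=5.0, n_hands=50_000, rng=rng)
             for hand, ev in true_evs.items()}

# Order hands from best to worst by estimated EV.
ranking = sorted(estimates, key=lambda hand: estimates[hand][0], reverse=True)
for hand in ranking:
    mean, std_err = estimates[hand]
    print(f"{hand:<17} {mean:6.2f} +/- {std_err:.2f}")
```

The order is trustworthy only when the gap between neighbouring hands is several standard errors wide. Hands whose EVs are closer together than that need more simulated hands before their order means anything.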
09-14-2018 , 09:42 PM
I don't think poker works like that. (referencing the very simplistic graph)

The EV of a hand is going to be significantly more a function of your range, villain's range, and, at its immediate core, how the opponent responds to your bet.
09-14-2018 , 10:17 PM
Quote:
Originally Posted by Brokenstars
I don't think poker works like that. (referencing the very simplistic graph)

The EV of a hand is going to be significantly more a function of your range, villain's range, and, at its immediate core, how the opponent responds to your bet.
I don't know the answer but do think we can find it through simulation.
09-15-2018 , 03:59 AM
Quote:
Originally Posted by Fishing
I don't know the answer but do think we can find it through simulation.
Yeah, pio does this...
09-15-2018 , 05:27 AM
I think the best way to "go ahead and solve it" is to start with ending hands and work back to starting hands. Then at each step backwards in the hand ask, what percentage of my stack is Max(+EV), rather than bet sizes relative to BB or chip values.

This means the max bet size is determined and the smallest bet size varies. (1-%river-%turn-%flop-%pre-%blinds). We already know the effects of a small % varying since sub (20/50/100) BB is solved.

The min bet is defined by the game and has the least effect on the EV calculation for stack sizes over 50, 100 and 200 BB. If it fluctuates instead of the river bet size, the EV curve will settle down and may approach a better solution than we have. Another reason to use blinds as a percent of stack size rather than BB or chip value is that the parameter determining optimal bet size is unitless. I think this is a step in solving the bifurcation relationships between the parameters and EV.

Maybe we can just write a solution to short deck and move to China.
09-15-2018 , 07:15 AM
counterfactual regret minimization vs 'neural networks'
09-15-2018 , 03:15 PM
Quote:
Originally Posted by Pcallinallin
I think the best way to "go ahead and solve it" is to start with ending hands and work back to starting hands. Then at each step backwards in the hand ask, what percentage of my stack is Max(+EV), rather than bet sizes relative to BB or chip values.

This means the max bet size is determined and the smallest bet size varies. (1-%river-%turn-%flop-%pre-%blinds). We already know the effects of a small % varying since sub (20/50/100) BB is solved.

The min bet is defined by the game and has the least effect on the EV calculation for stack sizes over 50, 100 and 200 BB. If it fluctuates instead of the river bet size, the EV curve will settle down and may approach a better solution than we have. Another reason to use blinds as a percent of stack size rather than BB or chip value is that the parameter determining optimal bet size is unitless. I think this is a step in solving the bifurcation relationships between the parameters and EV.

Maybe we can just write a solution to short deck and move to China.

Good idea! Thanks. I will give it a try. Well, maybe not the China part.

      