Simulation vs. GTO vs. AI, vs. developing instinct vs. exploitation - Poker Theory

09-13-2018 , 09:01 PM

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Let's start a discussion about Simulation vs. GTO vs. AI, vs. developing instinct or exploitation. To keep it simple, let's base comparisons on practical win rate improvement.

GTO and AI can both produce a perfect or near-perfect no-limit Hold'em solution, but neither is easy to use in its current state.

Developing 'instinct' by spending hundreds or thousands of hours studying is suggested by most coaches. Not many of us have the time or commitment.

Exploitation has always been a good but incomplete strategy.

Simulation can provide useful information but is not precise. I assume here a simulator that can approximate a real player's play against 5 other players, not something simplistic like Poker Stove.

I believe that simulation can produce useful results by playing millions of simulated hands. It could approximate the EV of every preflop starting hand and every flop starting hand or draw, rate boards, and allow extensive experimentation.

If the goal is simply to improve win rate, not achieve perfection, then simulation may be the best answer.

Old joke punchline: I don't have to outrun the bear. I only have to outrun you.

09-14-2018 , 09:06 AM

hAmThEkIlLeR

grinder

Join Date: May 2012 Posts: 569

Quote:

Originally Posted by Fishing

Ok. How would you simulate profitable situations based on uniquely human failings? Tilt, lack of sleep, changes in decision making based on a dopamine spike from winning a big pot?

09-14-2018 , 10:14 AM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

There is another method. Geometry. There is an underlying geometric proof that predicts the results which the A.I. then proves.

I do not have such a geometric proof, but getting there would be easier than solving the entire tree, which is the only other option.

I believe the nature of poker, and that Nash Equilibrium is a given for poker, predicts the existence of the geometric solution.

The stated goal in the thread is to improve WR the most. One correctly solved graph contains every possible action, with every possible combo. To know the most optimal action is as simple (or complicated) as asking ‘where am I located on this graph?’.

09-14-2018 , 02:59 PM

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by hAmThEkIlLeR

Ok. How would you simulate profitable situations based on uniquely human failings? Tilt, lack of sleep, changes in decision making based on a dopamine spike from winning a big pot?

Simulation is based on averages and frequencies. It does not attempt to play against an individual player.

Assume that the simulator has analyzed tens of thousands of real hand history files and has used this information to develop profiles of an average player, Fish, Nit, LAG, and TAG. Each profile is then used to develop opponent strategies (ranges and rules) in the simulator. Average human failings are thus part of the opponent profiles.
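
A minimal sketch of how such profiles might be derived from aggregate hand-history counts. Everything here is an assumption for illustration: the `PlayerStats` fields, the `classify` function, the thresholds, and the mapping of loose/tight and passive/aggressive onto the four labels are hypothetical, not taken from any actual simulator.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    hands: int       # hands dealt to the player
    vpip_hands: int  # hands where the player voluntarily put money in preflop
    pfr_hands: int   # hands where the player raised preflop


def classify(stats, loose_vpip=0.30, aggressive_ratio=0.6):
    """Map preflop frequencies to a coarse profile label.

    The thresholds are hypothetical; a real tool would tune them on data.
    """
    if stats.hands <= 0:
        raise ValueError("need at least one hand to build a profile")
    vpip = stats.vpip_hands / stats.hands
    pfr = stats.pfr_hands / stats.hands
    loose = vpip >= loose_vpip
    # Share of played hands that were raised, used as an aggression proxy.
    aggressive = vpip > 0 and (pfr / vpip) >= aggressive_ratio
    if loose:
        return "LAG" if aggressive else "Fish"
    return "TAG" if aggressive else "Nit"


if __name__ == "__main__":
    print(classify(PlayerStats(hands=1000, vpip_hands=450, pfr_hands=80)))
```

Because the labels come from frequencies averaged over many hands, behavior such as tilt only shows up to the extent that it shifts those averages.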

09-14-2018 , 03:01 PM

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by robert_utk

Please elaborate on Geometry. I am not familiar with that as an option. How would it be implemented?

09-14-2018 , 05:05 PM

Bob148

Carpal 'Tunnel

Join Date: May 2012 Posts: 11,972

for example

Point A is a min bet.

Point ? is undefined without stack information; Point ? may very well be all in somewhere far above the 0ev line with a strong hand.

The curve is the bettor's ev, which is a function of potsize.

It could very well be that the curve starts above the 0ev line on the left and ends below the 0ev line at point ?, as would happen with a thin value hand.

I just drew it the way I did for simplicity.

09-14-2018 , 05:13 PM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Yes, and underneath the bet size curve, you solve for easy to remember threshold combos. It will vary in runouts which should be led sometimes, vs other runouts that oop will never lead. Once you have that graph done *correctly* you will always know how to respond. Card removal plus SPR both add more complexity. But once you get it all put together, that is it. Done.

09-14-2018 , 05:27 PM

Bob148

Carpal 'Tunnel

Join Date: May 2012 Posts: 11,972

Back to the original post:

I think exploitive play wins easily but there are a number of catches, one for each and every chance you get to exploit.

To put it simply, I'm much more likely to exploit someone on the river than preflop. For example, even vs the tightest players, I'll never raise 72o in standard conditions, even in a heads-up match (exceptions include but are not limited to players sitting out and ICM fun in tourneys). However, under the correct circumstances on the river, I may call with every hand that can beat (bet/final pot) of my opponent's range, or I may fold all bluffcatchers up to the point of beating a portion of my opponent's value hands that is smaller than (bet/final pot) but > 0%.
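
As a rough illustration of the (bet/final pot) threshold, here is a small sketch. It assumes "final pot" means the pot after the call goes in, i.e. pot + 2 × bet, which is a reading of the shorthand rather than something stated in the post. The function names are hypothetical.

```python
def breakeven_fraction(pot, bet):
    """Fraction of the bettor's range a call must beat to break even.

    Assumes final pot = pot before the bet + bet + call.
    """
    if pot < 0 or bet <= 0:
        raise ValueError("pot must be >= 0 and bet must be > 0")
    return bet / (pot + 2 * bet)


def call_ev(pot, bet, win_fraction):
    """EV of calling a river bet when we beat win_fraction of the bettor's range."""
    # Win: collect the pot plus the opponent's bet. Lose: forfeit our call.
    return win_fraction * (pot + bet) - (1 - win_fraction) * bet


if __name__ == "__main__":
    pot, bet = 100, 100  # pot-sized bet
    threshold = breakeven_fraction(pot, bet)
    print(threshold, call_ev(pot, bet, threshold))
```

Against a pot-sized bet the threshold is one third. Calling with hands that beat less than that fraction only makes sense as an exploit against an opponent who bluffs too often, and folding hands above it only against one who bluffs too rarely.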

The point is that when there's more poker to be played, there's more room for counter exploitation. I think this quality is independent of the fact that the river is a static game, as we could make a game that ends with (random element) after the last round of betting and we will still see the most, or best, opportunities to exploit on the last round of betting.

However, I think that there's a gray area where it's necessary to have some sense of where the margin is (gto ev on an individual hand basis) in order to exploit, and also to know when you're in the deep end, or when you're very far from the margin.

09-14-2018 , 05:35 PM

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by Bob148

A simulator could vary the bet size and then compare win rates. Starting stacks can also be varied. I have not tried that yet, but will. Thanks. Good post.

09-14-2018 , 05:45 PM

#10

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by Bob148

The nature of simulation is that it must include enough hands for convergence and that it must deal with averages. It may include specific conditions, but that is not its purpose.

Simulation is never precise. It's just a tool that can allow for very fast verification of theories and generate results that are very close.

For example, the EV of a starting hand does not have to be precise if it is used instead of Equity for selecting hands in a range. We only need to be sure that the order of hands from best to worst is correct.
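
A toy sketch of that point, under loud assumptions: the "true" EVs and the per-hand standard deviation below are invented numbers, and each simulated result is just Gaussian noise around the true EV rather than real poker play. It only illustrates that the standard error shrinks roughly as 1/sqrt(n), so the ranking stabilizes once the error is small relative to the gaps between hands.

```python
import random
import statistics

# Hypothetical "true" EVs in big blinds per hand; invented for illustration only.
TRUE_EV = {"AA": 2.0, "KK": 1.5, "AKs": 0.6, "72o": -0.3}
RESULT_STDEV = 10.0  # assumed per-hand standard deviation, in big blinds


def estimate_ev(true_ev, n_hands, rng):
    """Average n_hands noisy results; return (mean, standard error)."""
    samples = [rng.gauss(true_ev, RESULT_STDEV) for _ in range(n_hands)]
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / n_hands ** 0.5
    return mean, std_err


def rank_hands(n_hands, seed=0):
    """Estimate every hand's EV and sort hands from best to worst estimate."""
    rng = random.Random(seed)
    estimates = {hand: estimate_ev(ev, n_hands, rng) for hand, ev in TRUE_EV.items()}
    ranking = sorted(estimates, key=lambda hand: estimates[hand][0], reverse=True)
    return ranking, estimates


if __name__ == "__main__":
    for n in (100, 50_000):
        ranking, estimates = rank_hands(n)
        errors = {hand: round(se, 3) for hand, (_, se) in estimates.items()}
        print(n, ranking, errors)
```

With only a few hundred samples the order of close hands can come out wrong. Hands whose true EVs are nearly equal need far more samples to separate, which is where an imprecise simulation would be weakest.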

09-14-2018 , 09:42 PM

#11

Brokenstars

Verified Coach NLHE

Join Date: Jul 2012 Posts: 16,093

I don't think poker works like that. (referencing the very simplistic graph)

The EV of a hand is going to be significantly more of a function of your range, villain's range, and, at its immediate core, how the opponent responds to your bet.

09-14-2018 , 10:17 PM

#12

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by Brokenstars

I don't know the answer but do think we can find it through simulation.

09-15-2018 , 03:59 AM

#13

Brokenstars

Verified Coach NLHE

Join Date: Jul 2012 Posts: 16,093

Quote:

Originally Posted by Fishing

I don't know the answer but do think we can find it through simulation.

Yeah, pio does this...

09-15-2018 , 05:27 AM

#14

Pcallinallin

formerly the original donkey penis

Join Date: Jan 2007 Posts: 5,035

I think the best way to "go ahead and solve it" is to start with ending hands and work back to starting hands. Then at each step backwards in the hand ask, what percentage of my stack is Max(+EV), rather than bet sizes relative to BB or chip values.

This means the max bet size is determined and the smallest bet size varies. (1-%river-%turn-%flop-%pre-%blinds). We already know the effects of a small % varying since sub (20/50/100) BB is solved.

The min bet is defined by the game and has the least effect on the EV calculation for stack sizes over 50, 100 and 200 BB. If it fluctuates instead of the river bet size, the EV curve will settle down and may approach a better solution than we have. Another reason to use blinds as a percent of stack size rather than BB or chip value is that the parameter determining optimal bet size is unitless. I think this is a step in solving the bifurcation relationships between the parameters and EV.

Maybe we can just write a solution to short deck and move to China.

09-15-2018 , 07:15 AM

#15

outfit

old hand

Join Date: Feb 2015 Posts: 1,396

counterfactual regret minimization vs 'neural networks'

09-15-2018 , 03:15 PM

#16

Fishing

centurion

Join Date: Jul 2004 Posts: 126

Quote:

Originally Posted by Pcallinallin

Good idea! Thanks. I will give it a try. Well, maybe not the China part.
