Quote:
Originally Posted by TakenItEasy
In theory, it seems to me that any non-adaptive strategy would eventually completely reveal itself after observing enough hands. Therefore it should be fair to optimize any exploitative strategy vs a known GTO strategy when proving GTO is our goal.
In effect, the GTO player will broadcast all ranges on all streets and all post-flop action for the entire post-flop decision tree, since all of this information would be available if the complete algorithm were known. You can do better than creating perfect post-flop play... you can play perfectly pre-flop as well.
Given this information, it seems to me that the best approach would go something like this:
With perfect range information we could run equity calculations for the entire post flop decision tree without any loss of certainty.
In other words, we could optimize exploitative post-flop play vs GTO. We only need to run an equity equation for 47 turns and 46 rivers. Allowing for maybe 6 action combos per street and using 4 starting ranges, that requires roughly two million equations per hand, which should provide optimal exploitative post-flop results for the given pre-flop ranges. So it seems well worth the trouble.
47 x 46 x 6 x 6 x 6 x 4 = 1,867,968, so roughly 2 million equations to solve per hand.
Fold equity can be calculated precisely when folding ranges are known for all aggression on all streets.
Exact implied odds can be calculated.
The thinnest possible value bets can be calculated for known calling ranges.
We could even calculate possible floating situations given known ranges on turn/river runout combos.
Include folding and then choose the best EV.
Yup. It's easy to compute a best response to any strategy. That's how the algorithm used to create Cepheus knows when it has found a close approximation to a Nash equilibrium (A, B).
When
- Best-Response against B does almost no better than A does vs B
and
- Best-Response against A does almost no better than B does vs A.
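To make that check concrete, here is a minimal sketch on a toy zero-sum game (rock-paper-scissors, not poker). The payoff matrix, the function names, and the example strategies are all my own assumptions for illustration. It only shows the two comparisons above: how much a best response gains over what A and B already earn against each other.

```python
# Payoff to the row player in rock-paper-scissors (toy zero-sum game, an assumption
# made for illustration; this is not how poker payoffs are represented).
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def value(a, b, payoff=PAYOFF):
    """Expected payoff to the row player when row mixes with a and column with b."""
    return sum(
        a[i] * b[j] * payoff[i][j]
        for i in range(len(a))
        for j in range(len(b))
    )


def best_response_value_vs_column(b, payoff=PAYOFF):
    """Best the row player can earn against the fixed column mix b."""
    return max(
        sum(b[j] * payoff[i][j] for j in range(len(b)))
        for i in range(len(payoff))
    )


def best_response_value_vs_row(a, payoff=PAYOFF):
    """Best the column player can earn against the fixed row mix a."""
    return max(
        -sum(a[i] * payoff[i][j] for i in range(len(a)))
        for j in range(len(payoff[0]))
    )


def best_response_gains(a, b, payoff=PAYOFF):
    """How much each side could gain by switching to a best response."""
    v = value(a, b, payoff)
    gain_row = best_response_value_vs_column(b, payoff) - v
    gain_col = best_response_value_vs_row(a, payoff) - (-v)
    return gain_row, gain_col


if __name__ == "__main__":
    uniform = [1 / 3, 1 / 3, 1 / 3]
    rock_heavy = [0.5, 0.25, 0.25]
    print(best_response_gains(uniform, uniform))
    print(best_response_gains(rock_heavy, uniform))
```

If both gains are close to zero, (A, B) is close to an equilibrium. If either gain is large, that side's opponent is exploitable by roughly that amount.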
Quote:
For Pre-flop action:
I would start with 4 initial conditions for defining 4 pre-flop strategies.
Range A would be to passively call ATC for the cheapest possible flop.
The idea would be to see just how far perfect post-flop play vs defined GTO can take us.
Range B would be opening with a loose range while capping aggression at 2 bets and making wide 2-3 bet calls with uncapped ranges.
Range C would cap aggression at 3 bets with uncapped calls for 3-4 bets.
Range D would be a tight range capping at 4 bets and uncapped 4-5 bet calls.
Once we deal the flop for the predefined pre-flop action, we can run the million equations to get the optimal results for all post-flop action.
We could then repeat this for a decent sample of flops in order to get accurate win-rates for the four ranges.
Calculating best response is simpler than that. You don't need to deal with ranges. You can calculate the best action one hand at a time. You don't need to worry about being balanced because you are dealing with a non-adaptive opponent. Your range doesn't matter. Only your hand, board, betting, and stacks matter. And you can do better than perfect post-flop play, you can also play perfectly pre-flop against it.
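As a rough illustration of the "one hand at a time" point, here is a toy river decision where we act first and can only check or bet. Everything in it is an assumption made up for the example: hand strengths are plain integers, the opponent's holdings and calling frequencies are invented, a check is assumed to go straight to showdown, and ties are ignored. The point is only that each of our hands picks its own highest-EV action against the fixed strategy, with no reference to what the rest of our range does.

```python
def action_evs(my_strength, opp_range, call_prob, pot, bet):
    """EV (chips won relative to giving up now) of checking and betting one hand.

    opp_range: {opponent_strength: probability}, the known fixed range (hypothetical).
    call_prob: {opponent_strength: probability of calling a bet}, the known
               fixed response (hypothetical). Otherwise the opponent folds.
    """
    ev_check = 0.0
    ev_bet = 0.0
    for opp_strength, p in opp_range.items():
        we_win = my_strength > opp_strength
        # Check: assumed to go to showdown for the current pot.
        ev_check += p * (pot if we_win else 0.0)
        # Bet: opponent folds (we take the pot) or calls (showdown for pot + bet).
        c = call_prob[opp_strength]
        called_result = (pot + bet) if we_win else -bet
        ev_bet += p * ((1 - c) * pot + c * called_result)
    return {"check": ev_check, "bet": ev_bet}


def best_response(my_strength, opp_range, call_prob, pot, bet):
    """Pick the highest-EV action for this one hand; our own range is irrelevant."""
    evs = action_evs(my_strength, opp_range, call_prob, pot, bet)
    return max(evs, key=evs.get), evs


if __name__ == "__main__":
    opp_range = {1: 0.5, 3: 0.5}   # made-up opponent holdings
    call_prob = {1: 0.2, 3: 0.8}   # made-up fixed calling frequencies
    for my_strength in (0, 2, 4):
        print(my_strength, best_response(my_strength, opp_range, call_prob, pot=10, bet=5))
```

With these made-up numbers the weakest and strongest holdings prefer to bet and the middle one prefers to check, and that falls out hand by hand without ever thinking about balance.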
Quote:
Once any strategy is found to be long term winning, we would know it's not GTO...
Right, but regret minimization algorithms find a strategy against which no other strategy wins more than some tiny amount, and they verify that by checking the best response.
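A minimal sketch of that loop, assuming a tiny zero-sum matrix game instead of poker: plain regret matching in self-play, followed by a best-response check on the average strategies. This is a textbook toy and not the far larger counterfactual-regret computation behind Cepheus. The payoff matrix, the function names, and the iteration count are assumptions chosen for illustration.

```python
# Skewed rock-paper-scissors: payoff to the row player (assumed toy game).
PAYOFF = [
    [0, -1, 2],
    [1, 0, -1],
    [-2, 1, 0],
]


def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def train(payoff, iterations):
    """Self-play with regret matching; returns both players' average strategies."""
    n_row, n_col = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_row, [0.0] * n_col
    row_sum, col_sum = [0.0] * n_row, [0.0] * n_col
    for _ in range(iterations):
        a = regret_matching_strategy(row_regret)
        b = regret_matching_strategy(col_regret)
        # Expected value of each pure action against the opponent's current mix.
        row_action_ev = [sum(b[j] * payoff[i][j] for j in range(n_col)) for i in range(n_row)]
        col_action_ev = [-sum(a[i] * payoff[i][j] for i in range(n_row)) for j in range(n_col)]
        row_ev = sum(a[i] * row_action_ev[i] for i in range(n_row))
        col_ev = sum(b[j] * col_action_ev[j] for j in range(n_col))
        for i in range(n_row):
            row_regret[i] += row_action_ev[i] - row_ev
            row_sum[i] += a[i]
        for j in range(n_col):
            col_regret[j] += col_action_ev[j] - col_ev
            col_sum[j] += b[j]
    return [s / iterations for s in row_sum], [s / iterations for s in col_sum]


def exploitability(payoff, a, b):
    """Total amount the two best responses gain over the profile (a, b)."""
    br_row = max(sum(b[j] * payoff[i][j] for j in range(len(b))) for i in range(len(a)))
    br_col = max(-sum(a[i] * payoff[i][j] for i in range(len(a))) for j in range(len(b)))
    return br_row + br_col  # the value terms cancel in a zero-sum game


if __name__ == "__main__":
    a, b = train(PAYOFF, 20000)
    print(a, b, exploitability(PAYOFF, a, b))
```

The exploitability number is the "some tiny amount": it shrinks as the iteration count grows, and you stop when it is small enough for your purposes.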
Last edited by bobf; 04-11-2015 at 04:45 PM.