Quote:
Originally Posted by Pandaclocker
"They want their AI to be non-exploitative but that is very different from it actually being GTO"
Can you elaborate on this, please?
I always thought that playing perfectly unexploitable poker was the same as playing perfect GTO.
Playing unexploitable means that no matter what strategy your opponent might use against you, he will not be able to have a positive EV against you (his EV is always <= 0).
Isn't that the same as playing GTO, or am I missing something here?
Hi,
So there are two key differences, although they are largely semantics as I think we are mostly just using different terms. That said, terminology can be important so...
1. In my statement above that you quoted I said "non-exploitative", not "unexploitable". "Non-exploitative" just means that the bot doesn't adapt its play based on what its opponent is doing. Strategies that don't change their behavior based on past play are sometimes called static. In repeated games (like poker, where you play multiple hands), playing the GTO solution to the single-round game every single time is always a GTO solution to the repeated game (although there are often non-static GTO solutions as well). Obviously such a GTO strategy is static, but just training a bot not to adapt its play to its opponents is not sufficient to get you to GTO. A bot that always folds is non-exploitative but not GTO.
2. I think usually when people define "unexploitable" they mean that your opponent would not be able to "exploit you", which is usually defined as, "even if my opponent knew my strategy exactly he could not adjust his play in a way that would increase his EV, compared to how he is currently playing". That is, to be exploitable there must be some strategy adjustment your opponent could make to exploit you and increase his EV. Again, GTO strategies must be unexploitable, but there are unexploitable strategies that are not GTO. A trivial example would be always folding. This strategy is unexploitable in the sense that no matter what strategy your opponent uses, he cannot leverage his knowledge of your strategy to increase his EV. However, as you said, GTO requires that you are maximizing your EV at every point, not just avoiding exploitation.
A more useful example to consider is a game where you hold the nuts or air and your opponent holds a medium strength bluff catcher (like the clairvoyance game from MoP) and there are 100 chips left to bet. You can bet 50 chips with a "balanced" (according to the MoP alpha value) range that is "unexploitable" in the sense that you make your opponent indifferent between calling and folding. If you do so he has no way to exploit you; his EV is the same no matter what he does. So your strategy is unexploitable, right? However, doing that is NOT GTO because betting 50 chips with a balanced range is just a fundamentally weak play. You should be betting the full 100 chips (with a properly balanced alpha range) because that is more effective with your range and will increase your EV. The difference in quality between the strat that bets 50 chips with a balanced range and the one that bets 100 chips with a balanced range is not a matter of your opponent exploiting you. Whether he plays perfectly (or always folds, or always calls, or whatever he wants to do) has no impact on your EV or on which strategy is superior. The superiority is a matter of EV maximization, which is a key element of GTO that is often overlooked because people focus too much on balance. I have a blog post on this here:
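To make the 50 vs 100 chip comparison concrete, here is a minimal Python sketch of the nuts-or-air game. The post doesn't give a pot size or say how often the bettor holds the nuts, so the 100 chip pot and the 50% nuts frequency below are assumptions picked purely for illustration, and the function names are hypothetical. EV is counted as chips won from the pot, a checked air hand is assumed to lose at showdown, and the bluff-to-value ratio is alpha = bet / (pot + bet).

```python
def alpha(bet, pot):
    """Bluff-to-value ratio that makes a bluff catcher indifferent to calling."""
    return bet / (pot + bet)


def bettor_ev(bet, pot, nuts_freq, call_freq):
    """EV for a bettor who bets every nut hand plus alpha * nuts_freq bluffs.

    call_freq is how often the bluff catcher calls a bet (0.0 to 1.0).
    Air hands that are not used as bluffs are checked and win nothing.
    """
    bluff_freq = alpha(bet, pot) * nuts_freq
    if bluff_freq > 1.0 - nuts_freq:
        raise ValueError("not enough air hands to balance this bet size")
    # Value bet: always wins the pot, plus the bet when called.
    ev_value = pot + call_freq * bet
    # Bluff: wins the pot when the opponent folds, loses the bet when called.
    ev_bluff = (1.0 - call_freq) * pot - call_freq * bet
    return nuts_freq * ev_value + bluff_freq * ev_bluff


POT = 100        # assumed pot size, not stated in the post
NUTS_FREQ = 0.5  # assumed share of the bettor's range that is the nuts

for bet in (50, 100):
    evs = [bettor_ev(bet, POT, NUTS_FREQ, c) for c in (0.0, 0.5, 1.0)]
    print(bet, [round(ev, 2) for ev in evs])
```

Under these assumptions each bet size gives the same EV whether the opponent always folds, calls half the time, or always calls, so both strategies are balanced. The 100 chip bet still earns more than the 50 chip bet, and that gap is the EV maximization point, not anything to do with exploitation.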
http://blog.gtorangebuilder.com/2014...ploitable.html
Furthermore, it is worth noting that there are plenty of games (including poker with rake, or a single hand of poker where you are out of position) where a strategy that is unexploitable as you defined it (one that guarantees your opponent's EV is <= 0) might not exist, or where it might exist but not be GTO.
3. If your questions are in regards to Snowie rather than GTORB, you can of course use CREV to measure how close to GTO any strat is on some specific rivers (although I don't think Snowie claims to be near GTO and I don't think that it needs to be GTO to be useful). That won't tell you how close to GTO its strategy as a whole is, but it will give you an idea of whether it's really close, not very close, or way off. GTORB automatically calculates and reports its "epsilon equilibrium distance" or "nash distance" with every solution, which is the standard game theory measure of the quality of an approximate GTO solution. I encourage all GTORB users to verify the reported solution accuracy, and so I made videos of how to do so using CREV for river scenarios. You could of course check Snowie against CREV in the same way that you can check GTORB and directly measure its epsilon equilibrium distance, at least on a specific river, if you were so inclined. GTO is a science, is provable, and is something you can at least to an extent check yourself, so you don't need to take anyone else's word for whether something is GTO or not.
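For readers who haven't seen the measure before, here is a minimal Python sketch of one common way to define an epsilon equilibrium distance in a two-player zero-sum matrix game: for each player, how much more a best response would earn than their current strategy does, taking the larger of the two. This is not GTORB's or CREV's actual calculation, which the post doesn't spell out. Rock-paper-scissors stands in for a river spot, and every name below is hypothetical.

```python
def expected_value(payoff, row_strat, col_strat):
    """Row player's EV when both players use the given mixed strategies."""
    return sum(
        row_strat[i] * payoff[i][j] * col_strat[j]
        for i in range(len(row_strat))
        for j in range(len(col_strat))
    )


def best_response_gains(payoff, row_strat, col_strat):
    """How much each player could gain by switching to a best response."""
    current = expected_value(payoff, row_strat, col_strat)
    # Best pure reply for the row player against the fixed column strategy.
    row_best = max(
        sum(payoff[i][j] * col_strat[j] for j in range(len(col_strat)))
        for i in range(len(row_strat))
    )
    # The column player wants the row player's EV as low as possible.
    col_best = min(
        sum(row_strat[i] * payoff[i][j] for i in range(len(row_strat)))
        for j in range(len(col_strat))
    )
    return row_best - current, current - col_best


def epsilon_distance(payoff, row_strat, col_strat):
    """Largest gain either player can get by deviating; 0 at an exact equilibrium."""
    return max(best_response_gains(payoff, row_strat, col_strat))


# Rock-paper-scissors payoffs for the row player (rows and columns: R, P, S).
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
uniform = [1 / 3, 1 / 3, 1 / 3]
too_much_rock = [0.5, 0.25, 0.25]

print(epsilon_distance(RPS, uniform, uniform))
print(epsilon_distance(RPS, too_much_rock, uniform))
```

The uniform strategy pair comes out at (numerically) zero, while the rock-heavy strategy can be beaten for 0.25 per round by always playing paper. The same idea, computing each side's best response against the fixed strategy of the other, is what checking a solution against a best-response calculator amounts to.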
Anyways, I made two videos on how to measure epsilon equilibrium distance using CREV that are free on YouTube here:
https://www.youtube.com/watch?v=j7xSHC90_Og
https://www.youtube.com/watch?v=-UenhnsjiOY
Then I also have a (not free) CardRunners video on the topic, where I go through measuring the quality of one of the approximate GTO solutions from Janda's book using CREV, and then show how to shrink the epsilon equilibrium distance by improving the strategy hand by hand based on GTORB's results:
http://www.cardrunners.com/poker-vid...-part-1-asuth/
Sorry this response is so long, but I like to be thorough.
-swc
Last edited by swc123; 01-21-2015 at 01:16 AM.