Alberta university Poker 'bot "solves" heads up limit hold 'em

01-24-2015 , 03:15 PM
In these discussions it is important for everyone to realize that the perfect GTO strategy for hold em is, unlike stud, two different strategies, neither of which breaks even against great players. It wins the optimum as the button and loses the optimum as the big blind. It would have no chance against pros if it spotted them the button.
01-24-2015 , 05:06 PM
Quote:
Originally Posted by David Sklansky
In these discussions it is important for everyone to realize that the perfect GTO strategy for hold em is, unlike stud, two different strategies, neither of which breaks even against great players. It wins the optimum as the button and loses the optimum as the big blind. It would have no chance against pros if it spotted them the button.
So basically you are agreeing that Cepheus isn't GTO then. If, as nburch said, it "adapts" to every player-specific profile, then it's a nemesis bot. It doesn't just have one strategy for the bb and another for the sb; it has possibly hundreds of strategies incorporated in it.
01-24-2015 , 06:28 PM
Quote:
Originally Posted by Donkem
So basically what you have there is the mathematical nemesis. It's a perfect exploitative bot that reads the opponent's range and adapts. What you are calling GTO is actually a sum of all the strategies that beat all possible strategies in the game.

If not, then how would a nemesis be any different from this GTO?
Quote:
If, as nburch said, it "adapts" to every player-specific profile, then it's a nemesis bot. It doesn't just have one strategy for the bb and another for the sb; it has possibly hundreds of strategies incorporated in it.
You are mixing things up all over the place. The Cepheus "bot" is a fixed strategy, it does not change its play depending on the opponent's play. nburch was describing the process they used to arrive at this fixed strategy, as well as the verification (exploitability calculation). To reiterate, the Cepheus strategy is fixed now and does not adapt.

When they calculated the exploitability, they had Cepheus' fixed strategy play against its nemesis, and it loses less than 0.05 big bets/100 hands (<1 mbb/hand, measured in big blinds). No strategy has an expectation of more than 0.05 big bets/100 when playing against Cepheus.
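
As a quick unit check on those two figures, here is a minimal sketch. It assumes that "bb/100" above is counted in big bets (two big blinds each, the usual limit convention) and that "mbb/h" means thousandths of a big blind per hand. The function name is made up for illustration.

```python
def mbb_per_hand_to_big_bets_per_100(mbb_per_hand: float) -> float:
    """Convert milli-big-blinds per hand into big bets per 100 hands."""
    big_blinds_per_hand = mbb_per_hand / 1000.0
    # Assumption: one big bet equals two big blinds.
    big_bets_per_hand = big_blinds_per_hand / 2.0
    return big_bets_per_hand * 100.0


if __name__ == "__main__":
    # A loss rate of 1 mbb/hand expressed in big bets per 100 hands.
    print(mbb_per_hand_to_big_bets_per_100(1.0))
```

Under those assumptions the two numbers describe the same loss rate, just in different units.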

As for the existence of Nash Equilibria: John Nash provided a mathematical proof that NEs exist for a certain class of games; this work earned him a Nobel Prize. Poker fits into that class of games, so the existence of NEs in HUFL is hardly a matter of opinion. The Cepheus strategy is a very close approximation of a NE for this game.
01-27-2015 , 10:53 AM
I noticed that you can't choose stack sizes when querying Cepheus. Does this mean that Cepheus has only solved situations for which stacks are essentially unlimited? If this is the case, then I guess there are still stack sizes for which HUFLHE has not been solved (e.g. 6 big blinds – it's large enough that the game is not essentially shove-or-fold but small enough that there may be bets that would be made postflop if stacks were deeper).
01-27-2015 , 12:18 PM
Quote:
Originally Posted by tobakudan
I noticed that you can't choose stack sizes when querying Cepheus. Does this mean that Cepheus has only solved situations for which stacks are essentially unlimited? If this is the case, then I guess there are still stack sizes for which HUFLHE has not been solved (e.g. 6 big blinds – it's large enough that the game is not essentially shove-or-fold but small enough that there may be bets that would be made postflop if stacks were deeper).
For a single hand in a vacuum, as long as you're above the 24 bb or whatever limit it is that play is capped on all streets, your strategic play should be equivalent.
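
The 24 bb figure can be reproduced with a few lines, as a rough sketch. It assumes a cap of four bets per street (a bet plus three raises), a small bet of 1 bb on the first two streets and a big bet of 2 bb on the last two. Those rules and all names below are assumptions for illustration, not something confirmed in this thread.

```python
SMALL_BET_BB = 1          # assumption: small bet equals one big blind
BIG_BET_BB = 2            # assumption: big bet equals two big blinds
BETS_PER_STREET_CAP = 4   # assumption: a bet plus three raises per street


def max_investment_bb(cap: int = BETS_PER_STREET_CAP) -> int:
    """Most big blinds one player can put in over a fully capped hand."""
    street_bet_sizes = [SMALL_BET_BB, SMALL_BET_BB, BIG_BET_BB, BIG_BET_BB]
    return sum(cap * size for size in street_bet_sizes)


def stack_can_change_play(stack_bb: float) -> bool:
    """True if the stack is too short to cover a fully capped hand."""
    return stack_bb < max_investment_bb()


if __name__ == "__main__":
    print(max_investment_bb())          # 4 + 4 + 8 + 8
    print(stack_can_change_play(6))     # short stack: betting can be cut off
    print(stack_can_change_play(100))   # deep stack: cap is never binding
```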

For smaller stack sizes, it seems like an entirely different solution set would be required. For 23.5 bb it would likely be just as difficult to get a solution even though it would be almost exactly the same. For something like 6 bb it should be significantly easier to get a solution.

We could even go a step further and require that the game is solved for all possible effective stack sizes, but here we would have to take into account the actual stakes we're playing and the rules for what you can bring to the table. On most online sites it's legal to sit at, say, a 100/200 table with a $6,435.89 stack size. Even if we're above 24 bb to start, we might lose some hands and dip below to where we have a stack size with fractional bbs. However, at brick and mortar casinos there are sometimes limits on what the minimum value of a chip that is considered in play can be. So a lot like with rake, since there are so many variables I don't really see anything wrong with "solving" the game with a fixed stack size as a starting point. Solving smaller stack sizes in increments of a half bb might be a huge plus since it's very common to get to those sizes with standard buyins, but after that we're going to have to be drawing a line somewhere.
01-27-2015 , 09:10 PM
Quote:
Originally Posted by NMcNasty
So a lot like with rake, since there are so many variables I don't really see anything wrong with "solving" the game with a fixed stack size as a starting point.
Yeah. I didn't mean that as a criticism. Just thought I'd point out that, for example, HUFLHE SNGs are not essentially weakly solved (although even if it played its fixed strategy at all stack sizes I doubt it would be significantly exploitable).
01-28-2015 , 06:15 PM
At the risk of beating an essentially weakly dead horse:

Quote:
Originally Posted by nburch
So essentially solved says it's not exact, but also that there was at least some thought put into the quality of approximation.
...
As for the choice of name, I guess I'm (ever so slightly) sorry not everyone likes it, but remain puzzled as to why someone thinks we should have used their particular choice of name rather than some other previously unused name.
Lots of people have brought this up with me over the last few days, and I don't think any of them thought that "essentially weakly solved" was an appropriate description.

No one cares what name you use, as long as it's not implying that you have a solution that is something more than just a good approximation, and doesn't encourage the spread of misinformation about game theory and probability and statistics.

What you are essentially saying is that you have "essentially solved the game for a human lifetime of play." Why didn't you just write that in the abstract and summaries? Writing just "essentially weakly solved" and then much later adding the time period as part of the definition makes it seem like you are trying to trick people into thinking you have basically solved the game for any amount of play (or maybe that there is something that makes your approximate solution qualitatively different from prior approximate solutions).

Quote:
A human lifetime seems like a pretty appropriate upper-bound measure for a game that humans play.
Lots of humans play it. Why is one lifetime relevant? What if you trained a team of players to play with your strategy? What if you had video poker machines which thousands of people played for money?

Why are humans even appropriate for a computer? Chess is also a game that humans play, but people don’t say chess is essentially solved just because computers can show their statistical superiority over any human (within a small fraction of the number of games that a human can play in one lifetime). What if you’re trying to win a computer poker competition? (Now that you've "essentially solved" HULH, will you still hold HULH competitions?)

Quote:
Using duplicate with a partner (never by yourself, otherwise you see both sides of the cards) or using Cepheus itself as a duplicate partner (no need for sampling/simulation, as a strategy for both seats always breaks even against itself, averaged between the seats) reduces the standard error on an estimate of expected winnings by around 2.5 times (looking at Josh Davidson's work at http://poker.cs.ualberta.ca/publicat...vidson.msc.pdf.)
You didn't understand what I wrote: "you can compare how your opponent fares compared to how your own strategy would fare, and reduce the SD even further because you can calculate or sim the average EV for your own strategy." This is similar to, but not actually duplicate poker. The nemesis doesn't need to play both sides. You do need to simulate or calculate how your strategy would do against yourself, because you are not interested in the average return of an infinite number of hands (which of course would be 0), you are interested only in the hands actually played.

(When you had your man-v-machine competitions, you could have used this concept to estimate Polaris's edge versus each individual human player, instead of just against each pair of duplicate partners.)

Quote:
Ignoring the fact that techniques for variance reduction are already outside the scope of the essentially solved definition...
Only because that's how you defined "essentially solved". In the real world (or even a hypothetical world), you'd be foolish to ignore such techniques. If you are a casino offering a HULH video poker machine (or any video poker machine, or other casino game with a skill element) for real money, would you only look at the net winnings of a player? Or would you look at individual plays to determine the player's skill in a much shorter period of time? If you're only going to look at winnings, I'd love to play blackjack at your casino. Would you wait to bar me until my winnings reach a statistical level of certainty, or will you bar me the first time you see me with a max bet on a high count splitting tens?

Quote:
Even with a combination of multiple variance reduction techniques, some not applicable to human players, HU LHE value estimates still have a per hand standard deviation of around 1 bb/hand, and that just requires millions of hands to get estimates that are accurate to around 1mBB/hand.
The SD would be much less if you look at the EV of each individual action (assuming perfect play on later streets) and compare it to the EV you'd have playing against yourself. (This will not be very accurate for an opponent that was not nearly a nemesis, but you don't really care about being perfectly accurate for those situations, though there are ways you can improve the accuracy if you need to.)

Note that an observer can distinguish your strategy from a GTO strategy the first time that your strategy makes a play that is pure 0 in the GTO strategy. The only way you can hope to prevent your strategy from being distinguished from a perfect strategy within a few thousand hands is to purify your strategy. Note also that the GTO strategy doesn’t have to be completely known, only some of the pure zeros need to be known.

(Why didn’t you guys do a purification step at some point, at least for preflop decisions? Seems like it would have given you a relatively quick reduction in exploitability, and could have sped up later iterations and saved storage.)

Quote:
No... We're not trying to say that there's a 95% chance of not detecting the imperfection, we're saying there's not a 95% chance of detecting the imperfection:
That may be how you’re defining “essentially solved” (though your paper has different definitions that are not all equivalent). But, you are trying to show that one strategy is statistically nearly even against another, so you should want close to a 50% chance of winning, not close to 0%.

If you are trying to show the opposite, that the strategies are not evenly matched, then a 5% chance of winning is reasonable, as the goal for 100% certainty is 0%.

Putting this another way: the goal of the test is to get the hypothesis, that your opponent’s strategy is even with yours, incorrectly accepted. Wouldn’t you want the hypothesis accepted 95% of the time, instead of 50% of the time?

Quote:
you can't distinguish a 1mbb/hand approximation from perfect with statistical confidence using 61M hands.
Actually, you can. You can't guarantee that you can, but half of the time you will distinguish a 1mbb/hand approximation from perfect with 95% statistical confidence using 61M hands. Half of the time you will fool someone using such a test into not rejecting the null hypothesis (though you won’t fool someone using a 90% test as often).
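
The "half of the time" figure can be checked with a rough power calculation. The sketch below assumes a one-sided z-test on average winnings per hand and a per-hand standard deviation of 5 bb. That standard deviation is an assumed value, not one stated in this thread, and `detection_power` is an illustrative name.

```python
from statistics import NormalDist


def detection_power(edge_bb: float, sd_bb: float, hands: float,
                    alpha: float = 0.05) -> float:
    """Chance that a one-sided z-test at level alpha detects a true edge."""
    std_normal = NormalDist()
    standard_error = sd_bb / hands ** 0.5
    z_critical = std_normal.inv_cdf(1.0 - alpha)
    # The observed z-score is centred on edge / standard_error.
    return 1.0 - std_normal.cdf(z_critical - edge_bb / standard_error)


if __name__ == "__main__":
    edge = 0.001   # 1 mbb/hand
    sd = 5.0       # assumption: per-hand standard deviation in bb
    for hands in (61_000_000, 4 * 61_000_000):
        print(hands, round(detection_power(edge, sd, hands), 3))
```

With these inputs the power comes out near one half at 61M hands and rises above 90% with four times as many, which is in line with the 5%/50% versus 5%/5% discussion.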

And, as I’ve mentioned above, if you compared the result to how your own strategy would have fared, or better yet looked at individual actions instead of just the result of the hand, you’d only need a fraction of those hands.

Quote:
Or another way, using a 95% confidence test there is a 5% chance of a false positive (exact shows up as approximation) and a ~50% chance of a false negative. You can vary the test to have a 5% chance of both false positives and false negatives with a 1mbb/hand approximation, and as you say, this would require four times as many hands. That's not too worrying: if you would only be satisfied with a 5%/5% error rate test, then it'll take you four "lifetimes" of poker to distinguish a 1mbb/hand strategy.
Actually, what I’m saying is that you want much greater than a 50% chance of a false negative, not a 5% chance of a false negative. The more false negatives, the closer you are to optimal.
01-28-2015 , 07:01 PM
Did I say a bit of the above earlier? Andy uses too many big words for me to be sure.
01-31-2015 , 08:28 PM
Quote:
Originally Posted by droller
Always disliked this group... I can't see how this could ever be good for the poker world, especially in regards to legalization, regulation and getting governments on board. All the decision makers see is that a computer program has been created to outplay humans and take all their money. Perfect....

Way to go U of A... You "solved" a game practically no one plays anymore staged in a magical fantasy world with no rake!
I strongly disagree; I think this was a very good thing for poker. It's terrible for those who don't want to take the time to study GTO, but it's revolutionary for those who want to understand the game at the most elite level possible. It can help show us that some of the things we thought were correct were really incorrect all along. Now people can actually see that they're folding too much or not check-raising enough. It's not just an intuitive guessing game where we're gambling as to what's correct. It makes us all better poker players if we understand what's GTO. Obviously there are too many permutations of possibilities for the human mind to grasp, but we can at least gather a framework as to what's correct. The more one is willing to break down what's GTO, the better one understands what's correct and what one's opponents are doing wrong. Hopefully I can expand some of the concepts I learn from this game into a more complicated form of poker such as a NLHE or PLO ring game.
02-01-2015 , 11:22 PM
I read this news some time ago. As a poker player, I worry that limit holdem will die out after this. I love to play this type of holdem. It makes me nostalgic because I started playing poker at FL holdem tables.
02-02-2015 , 12:19 PM
You can now play against an earlier Cepheus-like bot on my site. It's almost the same except there's slight hole card abstraction on the river (everything else is unabstracted) and the payoffs are intentionally tilted so that it's forced to call and 4-bet preflop some of the time. The server can't handle too much traffic so I won't post a link here, see paper. I'll put some other games there too later.
02-02-2015 , 07:17 PM
Quote:
Originally Posted by droller
Always disliked this group... I can't see how this could ever be good for the poker world, especially in regards to legalization, regulation and getting governments on board. All the decision makers see is that a computer program has been created to outplay humans and take all their money. Perfect....

Way to go U of A... You "solved" a game practically no one plays anymore staged in a magical fantasy world with no rake!
Quote:
Originally Posted by greatwhite
I strongly disagree; I think this was a very good thing for poker. It's terrible for those who don't want to take the time to study GTO, but it's revolutionary for those who want to understand the game at the most elite level possible. It can help show us that some of the things we thought were correct were really incorrect all along. Now people can actually see that they're folding too much or not check-raising enough. It's not just an intuitive guessing game where we're gambling as to what's correct. It makes us all better poker players if we understand what's GTO. Obviously there are too many permutations of possibilities for the human mind to grasp, but we can at least gather a framework as to what's correct. The more one is willing to break down what's GTO, the better one understands what's correct and what one's opponents are doing wrong. Hopefully I can expand some of the concepts I learn from this game into a more complicated form of poker such as a NLHE or PLO ring game.
I think some people are missing the point about the type of play that the Cepheus bot is playing. It basically plays to minimize losses while not being able to maximize wins. Against most players who make many mistakes, this kind of bot is far less dangerous than a top pro who will understand your weaknesses and maximize wins by exploiting them.

Also, the bot never losing long term is based on unraked play. When adding a rake, against very good players, it won't be a winning player, just one that is losing to the rake by less than or equal to its opponent. Therefore, in HU limit poker, I wouldn't be surprised if such a bot wasn't profitable without including some kind of table selection routine.

Besides, how much play does a site get for HU limit play? I honestly have no idea, but I'd guess not a lot.

One other issue I saw that was brought up was the stack sizes. Once effective stacks cover the maximum number of bets you would expect to reasonably see in a hand, it doesn't make any difference, unlike for NL play where it would make a huge difference.

The only time it could make a difference is when both players keep raising, but it seems to me that this could only happen in a nuts vs nuts type situation, though I've never played HU Limit Holdem myself.
02-02-2015 , 07:34 PM
Quote:
Originally Posted by TakenItEasy
I think some people are missing the point about the type of play that the Cepheus bot is playing. It basically plays to minimize losses while not being able to maximize wins. Against most players who make many mistakes, this kind of bot is far less dangerous than a top pro who will understand your weaknesses and maximize wins by exploiting them.

Also, the bot never losing long term is based on unraked play. When adding a rake, against very good players, it won't be a winning player, just one that is losing to the rake by less than or equal to its opponent. Therefore, in HU limit poker, I wouldn't be surprised if such a bot wasn't profitable without including some kind of table selection routine.

Besides, how much play does a site get for HU limit play? I honestly have no idea, but I'd guess not a lot.

One other issue I saw that was brought up was the stack sizes. Once effective stacks cover the maximum number of bets you would expect to reasonably see in a hand, it doesn't make any difference, unlike for NL play where it would make a huge difference.

The only time it could make a difference is when both players keep raising, but it seems to me that this could only happen in a nuts vs nuts type situation, though I've never played HU Limit Holdem myself.
It plays GTO which means no matter what you do you can't have a mathematical edge on it. If a player understands what is GTO then they can understand where their opponents are making mistakes and exploit them. For example if they bluff too much you can call them down more and if they fold too much you can bluff more. You don't truly know at what point they're doing something "too much" without understanding what's GTO. Right now players only have an intuitive understanding and try to use their statistics and compare it to winning players in order to figure out how they should play hands.
02-02-2015 , 09:27 PM
Quote:
Originally Posted by greatwhite
It plays GTO which means no matter what you do you can't have a mathematical edge on it. If a player understands what is GTO then they can understand where their opponents are making mistakes and exploit them. For example if they bluff too much you can call them down more and if they fold too much you can bluff more. You don't truly know at what point they're doing something "too much" without understanding what's GTO. Right now players only have an intuitive understanding and try to use their statistics and compare it to winning players in order to figure out how they should play hands.
My comments were more directed towards not needing to fear GTO bots, because they are less dangerous than an exploitative player who could outplay you, if you were a losing player.

Secondly, I didn't think someone mass tabling GTO bots could be that big a threat, since the profit motive would be pretty limited. A GTO bot wouldn't be able to fully exploit soft games for enough profit to make it as worthwhile compared to a pro who could exploit such games far more efficiently.

Since I've never played HU Limit Holdem, as I've already stated, I could be wrong, but it just seems unlikely to me logically.

My point vs your comment was just how significant defining precise optimal balance points actually was for a winning player. Let's assume that you already had a good idea of where optimal play was without knowing the balance points precisely for all situations. The mistakes made at near optimum would be much smaller than the larger mistakes you were already aware of so the additional profits you would realize would be small when compared to the win rate you already realized.

If you were up against another good player who was playing close to your level already, then the extra edge you'd gain probably couldn't offset the rake anyway.
02-03-2015 , 01:19 AM
Quote:
Originally Posted by TakenItEasy
Also, the bot never losing long term is based on unraked play. When adding a rake, against very good players, it won't be a winning player, just one that is losing to the rake by less than or equal to its opponent.
Nope. The bot will do worse than a perfect or near perfect player that is altering its play due to the rake.
02-03-2015 , 07:39 AM
Quote:
Originally Posted by David Sklansky
Nope. The bot will do worse than a perfect or near perfect player that is altering its play due to the rake.
Good point. Winners pay the rake so the implied odds will be tweaked a bit while the bot is calibrated to play with no rake so I'd agree with that.
02-04-2015 , 12:58 AM
Quote:
Originally Posted by Andy Bloch
At the risk of beating an essentially weakly dead horse:

...
I suspect that it _is_ beating a dead horse, but....

You don't like the term "essentially solved" and think it's somehow misleading. We'll just have to disagree. I don't know how you use "essentially" but if I ask someone "are you done?" and their answer is "I'm essentially finished" I don't think they're 100% finished.

You don't like the idea of using a human lifetime of play. We'll just have to disagree. Yes, there could be all sorts of other arbitrary cutoffs. Regardless of the cutoff, everyone is also completely free to choose not to use "essentially solved" to describe their work.

You talk about chess, and statistical superiority over humans. This misses the point. First, essentially solved is still a worst case bound that considers an opponent playing a best response, not an average opponent. It's pretty much a certainty that the best response to any current chess program would win. Conversely, if someone was to just consider actual human play for poker, HULH was done years ago. Second, back to chess, for a non-randomised strategy, the variance of the outcome is zero. Any mistake could be exploited every single match. If the strategy IS randomised, with only some probability of making a mistake, you could ask the question of how likely you are to play enough matches to encounter the mistake.

Regarding variance reduction techniques and essentially solved, we'll just have to disagree. Yes, you would be silly to ignore them in real world applications. If someone wants to estimate the value of a best response playing against Cepheus, I'd say they were silly to do so by playing games against it, and then use the fact that Cepheus' strategy is available to reduce the variance. Given the strategy, the best reduction in variance would be to do the entire expectation, and we've already done that computation...

We are aware there are a wide variety of variance reduction techniques. If I've misread what you wrote, you also misread what I wrote. Yes, it's quite true that you can subtract off the expected value of the bot against itself, and this is an unbiased estimator because the expected value of self play when alternating seats is zero (otherwise you need an extra term.) Regarding playing duplicate poker with Cepheus as a partner... If I get x_i on hand i playing as the small blind, and Cepheus gets y_i playing as the small blind with the same cards, you can use sum_i (x_i - y_i) to estimate your expected winnings. Or, if you were playing a duplicate match with Cepheus as partner, Cepheus-the-big-blind gets -y_i on hand i because the game is zero-sum, so your team estimate is also sum_i (x_i - y_i). But you know the expected value of Cepheus' self-play is 0, so that's also an estimate of your own value.
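
To make the x_i - y_i estimator concrete, here is a toy simulation. It is only a sketch under invented assumptions: each hand's result is modelled as a shared "card luck" term plus independent noise, the player has a small true edge, and all the numbers (5.0, 1.0, 0.05) are made up for illustration rather than taken from real hold'em data.

```python
import random
from statistics import mean, stdev


def simulate(n_hands: int, edge: float, seed: int = 0):
    """Toy model: both seats' results share the same card luck per hand."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_hands):
        card_luck = rng.gauss(0.0, 5.0)                     # same cards for both
        xs.append(edge + card_luck + rng.gauss(0.0, 1.0))   # my result x_i
        ys.append(card_luck + rng.gauss(0.0, 1.0))          # bot's result y_i
    return xs, ys


def duplicate_estimate(xs, ys):
    """Mean and per-hand standard deviation of the differences x_i - y_i."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return mean(diffs), stdev(diffs)


if __name__ == "__main__":
    xs, ys = simulate(20_000, edge=0.05)
    estimate, diff_sd = duplicate_estimate(xs, ys)
    print("naive per-hand SD:    ", stdev(xs))
    print("duplicate per-hand SD:", diff_sd)
    print("estimated edge:       ", estimate)
```

In this toy model the shared card luck cancels in the difference, so the per-hand spread of x_i - y_i is much smaller than that of x_i alone. How large the reduction is in real play depends on how much of the variance actually comes from the cards.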

Hopefully, everyone is well aware that there is not (generally) just a single Nash equilibrium for a game. Finding an arbitrary equilibrium that never makes a particular action x does NOT mean that action x is a mistake. It just means not playing x is not a mistake. Given how CFR+ with no sampling works, and the choice to use the current strategy, there are no obvious places to prune low probability actions, preflop or otherwise: keep in mind that actions taken with positive probability had better expected value than the previous strategy, and the next-to-last strategy was already very good. If an action is such an obvious mistake, it wouldn't be playing it on the next iteration... It might also be worth noting that even with sampling CFR, using the average strategy (which retains the initial all-random-play strategy from the first iteration), throwing away low probability actions still INCREASES the exploitability, not decreases it. There will surely be some threshold value where this switches, but it's not at all obvious what that threshold is. Purification might seem like a good idea, but it isn't.

Just going ahead and assuming that the places where Cepheus makes actions with small non-zero probability are places where it is making a mistake is a theoretically un-justified leap. Outside of some very particularly defined situations (like David Sklansky's specific scenario of drawing to the nuts) there may be arguments for why you shouldn't make some particular action that are good enough to improve human play, but fall short of a proof that any strategy that makes that decision is iteratively dominated.

So... if you catch it folding the nuts you know there's a mistake, but you had better still include the probability of reaching that situation, and the probability that it actually does so -- there's a good chance it still doesn't show up in 65 million hands.
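
A back-of-the-envelope version of that argument, as a sketch: if hands are treated as independent, the chance of never seeing the mistake in n hands is (1 - p_reach * p_mistake)^n. The probabilities used below are purely hypothetical placeholders, not values from Cepheus.

```python
def prob_never_observed(p_reach: float, p_mistake: float, hands: int) -> float:
    """Chance a rare mistake never shows up, assuming independent hands."""
    per_hand = p_reach * p_mistake   # must reach the spot AND err there
    return (1.0 - per_hand) ** hands


if __name__ == "__main__":
    # Hypothetical: the spot comes up once per million hands and the
    # strategy errs there one time in a thousand.
    print(prob_never_observed(1e-6, 1e-3, 65_000_000))
```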

Finally, as far as the statistics go, I will admit to being confused as to exactly what you're suggesting. Yes, for a 95% confidence test there is a bit better than 50% chance that Cepheus would appear to be an equilibrium with 65M hands playing against a best response. Yes, there's ~5% chance of Cepheus being ahead in terms of money. And, yes, the rate of false negatives and the probability of being ahead in terms of money both go up with a better approximation. But you have now twice talked about having a 95% chance of identifying the proposed strategy as an equilibrium. Using a 95% confidence test, this is exactly and only the case for an exact strategy. Fine: you don't like >5% chance of winning (50%/5% type I/II error), or 5%/50% error, and you clearly don't like 5%/5% error. Do you have a particular type-I/type-II error level in mind that’s achievable before outright solving the game?
02-04-2015 , 08:49 AM
You all seem convinced.. But... Isn't it still impossible to work out how to balance in respect of value without assigning the villain a range?

Therefore, the bot will have to calculate a range for me. If I am aware of my own range, but the bot isn't aware of its own range (if the computer isn't self-aware), then surely it's always possible to exploit it for blue line by playing level 3 exploitative poker (avoiding your perceived range with your actual range).

It's obvious that we can't take red line profit from this bot but is blue line no longer up for debate??
02-04-2015 , 09:34 AM
Quote:
Originally Posted by Yadoula8
You all seem convinced.. But... Isn't it still impossible to work out how to balance in respect of value without assigning the villain a range?

Therefore, the bot will have to calculate a range for me. If I am aware of my own range, but the bot isn't aware of its own range (if the computer isn't self-aware), then surely it's always possible to exploit it for blue line by playing level 3 exploitative poker (avoiding your perceived range with your actual range).

It's obvious that we can't take red line profit from this bot but is blue line no longer up for debate??
The bot plays in such a way that, regardless of its opponent's range in any spot, an average minimum expected value is guaranteed across all those spots. (This is what it means for the bot's strategy to be balanced.) The bot doesn't perform any calculations; it just looks up in its huge database (i.e. strategy) what it should do with its current hand in its current situation with what frequency. The almost perfectly balanced strategy (against any counter-strategy) has already been calculated; that's what the bot was doing for more than 2 months.
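
The "look it up and act with the stored frequency" idea can be sketched as below. This is only an illustration: the table layout, the key format and the single entry with its probabilities are all invented, and the real strategy is vastly larger and stored differently.

```python
import random

# Hypothetical strategy table: (hole cards, betting history) -> action mix.
STRATEGY = {
    ("AhKd", "preflop:raise"): {"fold": 0.0, "call": 0.3, "raise": 0.7},
}


def choose_action(hand: str, history: str, rng=random) -> str:
    """Pick an action at random according to the stored frequencies."""
    probs = STRATEGY[(hand, history)]
    actions = list(probs)
    weights = [probs[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    picks = [choose_action("AhKd", "preflop:raise", rng) for _ in range(1000)]
    print({action: picks.count(action) for action in ("fold", "call", "raise")})
```

Nothing here depends on what the opponent has done in earlier hands, which is the point: the mix is fixed in advance.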
02-04-2015 , 10:11 AM
So the bot works out its value by calculating its average win rate against all the possible hands in our range? That's level 2 poker to me...

Then you say it preplans how it's going to defend against exploitation, so it's pre-adjusting, which is level 3. As well as balancing all the way, of course.

Anyone know if the bot knows level 4??.. Perhaps it just thinks it's circular after level 3 lol
02-04-2015 , 10:26 AM
Quote:
Originally Posted by Yadoula8
So the bot works out its value by calculating its average win rate against all the possible hands in our range? That's level 2 poker to me...

Then you say it preplans how it's going to defend against exploitation, so it's pre-adjusting, which is level 3. As well as balancing all the way, of course.

Anyone know if the bot knows level 4??.. Perhaps it just thinks it's circular after level 3 lol
A Nash equilibrium (GTO) strategy, which is what the bot approximates, doesn't require leveled thinking.

Say you're on the river with a range containing only bluff catchers, and your opponent, who has a range that is polar relative to yours, bets all-in for the size of the pot. You're considering whether to call. The last 3 times this situation occurred, you called and your opponent showed value hands. You're thinking that your opponent is value betting again (level 1). But your opponent may know this, and may have begun adjusting by starting to bluff as well (level 2). Knowing this, you might want to call again in case your opponent has adjusted (level 3). Knowing this, your opponent...

The bot doesn't "think" like this. Rather, it has worked out (through many zillions of trials) that if it calls 50% of the time, a minimum EV is guaranteed, and there is no other calling strategy that can guarantee a greater minimum EV (i.e. all other calling strategies can be exploited, i.e. there exists a counter-strategy that results in an EV that is less than the minimum EV guaranteed by calling 50%).
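
A small sketch of why 50% is the calling frequency that maximises the guaranteed minimum. It rests on assumptions that are not in the posts above: the pot and the bet are both 1 unit, half of the opponent's range is value hands (`VALUE_SHARE`), the opponent always bets value, bluffs air with some frequency, and loses the pot at showdown whenever they give up with air.

```python
POT = 1.0          # pot before the river bet (assumed unit size)
BET = 1.0          # pot-sized all-in bet
VALUE_SHARE = 0.5  # assumed fraction of the opponent's range that is value


def caller_ev(call_freq, bluff_freq):
    """Caller's EV on the river, measured in pots, for one pair of frequencies."""
    # Against value the opponent always bets: calling loses the bet, folding costs 0.
    ev_vs_value = -call_freq * BET
    # Against air: a called bluff wins pot + bet, a fold wins nothing,
    # and a given-up bluff means the bluff catcher wins the pot at showdown.
    ev_vs_air = bluff_freq * call_freq * (POT + BET) + (1 - bluff_freq) * POT
    return VALUE_SHARE * ev_vs_value + (1 - VALUE_SHARE) * ev_vs_air


def worst_case_ev(call_freq):
    """Minimum EV over all bluffing frequencies the opponent could choose."""
    # EV is linear in bluff_freq, so the minimum sits at an endpoint.
    return min(caller_ev(call_freq, 0.0), caller_ev(call_freq, 1.0))


if __name__ == "__main__":
    for c in (0.3, 0.4, 0.5, 0.6, 0.7):
        print(f"call {c:.0%}: worst-case EV = {worst_case_ev(c):+.3f} pots")
```

In this toy spot, calling less than half the time is punished by always bluffing and calling more than half the time is punished by never bluffing. At exactly 50% the caller's EV is the same whatever the opponent does.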

The bot has done this for all possible situations, and thus has no need for leveled thinking.

Judging from the content of your posts, I'm guessing that you're not familiar with the concept of a Nash equilibrium?
02-04-2015 , 12:29 PM
Quote:
Originally Posted by Yadoula8
So the bot works out its value by calculating its average win rate against all the possible hands in our range?
The bot doesn't care about your range.

It has a fixed strategy; the opponent doesn't matter. It doesn't outplay you, it waits for you to outplay yourself.

Think of it like someone who plays perfect strategy against a continuously shuffled blackjack bot. Let's say the rules are such that the bot has a 0.5% edge vs the player, but the house agreed to repay 0.5% after the session, so you're essentially 50-50 if you play perfectly. Any time you deviate from perfect strategy, you lose a little bit of edge.

Now change that scenario to you playing HU LHE vs Cepheus, and it's playing GTO against you. The only way for you to be 50-50 is to play GTO as well, and any time you deviate you lose some edge. Over time this will add up to a big amount because poker is so much more complicated than blackjack. In fact, I doubt a human will ever be able to play perfect GTO unassisted.

Last edited by Wolfram; 02-04-2015 at 12:45 PM.
02-04-2015 , 01:09 PM
Quote:
Originally Posted by Yadoula8
So the bot works out its value by calculating its average win rate against all the possible hands in our range? That's level 2 poker to me...

Then you say it preplans how it's going to defend against exploitation, so it's pre-adjusting, which is level 3. As well as balancing all the way, of course.

Anyone know if the bot knows level 4??.. Perhaps it just thinks it's circular after level 3 lol
You really have no idea what you're talking about. Read a little bit about game theory and Nash equilibria and get back to us.

The bot plays a completely static strategy. It does not adjust to the individual players at all. It uses a pre-computed strategy that is guaranteed to, at worst, break even (note: within the limits defined above - they haven't found a true GTO solution, just something very near it, so its guaranteed minimum EV is not 0, but just a very small negative number).
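
To illustrate "guaranteed minimum EV" for a fixed strategy, here is a sketch using rock-paper-scissors as a stand-in, since the real poker calculation is far too large to show. The strategies `exact` and `near` are made-up examples, not anything from the Cepheus work.

```python
# Payoff to the player choosing the row; rows and columns are
# rock, paper, scissors in that order.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def guaranteed_min_ev(strategy):
    """Worst-case EV of a fixed mixed strategy against any opponent.

    Some best response is always a pure strategy, so checking the
    opponent's three pure choices is enough.
    """
    evs = []
    for col in range(3):
        ev = sum(strategy[row] * PAYOFF[row][col] for row in range(3))
        evs.append(ev)
    return min(evs)


exact = [1 / 3, 1 / 3, 1 / 3]   # the true equilibrium of this toy game
near = [0.334, 0.333, 0.333]    # a hypothetical "almost solved" strategy

if __name__ == "__main__":
    print(guaranteed_min_ev(exact))
    print(guaranteed_min_ev(near))
```

The exact equilibrium guarantees 0, while the slightly-off strategy guarantees only a tiny negative amount. That is the same shape as the claim above: very near GTO means a worst case just below break-even.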
02-04-2015 , 01:23 PM
It's a bit confusing because the current strategy the bot uses does not adjust to opponent tendencies. The strategy used at hand 1 is exactly the same as the strategy used at hand 1000.

However, in order to get to the current strategy, the bot played trillions of hands against itself and did in fact adjust. It does not learn from its opponent currently, but did learn from itself in the past.
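
A toy sketch of "adjusting against itself, then freezing the result", using regret matching on rock-paper-scissors. Regret matching is the building block of the counterfactual-regret family of algorithms; this is not the actual Cepheus code, and the starting bias `initial_regrets` is an arbitrary assumption so the adjustment is visible.

```python
# Payoff to the player choosing the row; order is rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def current_strategy(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1 / 3, 1 / 3, 1 / 3]


def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run regret matching for both players; return their average strategies."""
    regrets = [list(initial_regrets), list(initial_regrets)]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(3)
            )
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    # The averaged strategy is the fixed strategy you would deploy.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    print(self_play(20000)[0])
```

During training the strategy changes every iteration. What gets played afterwards is only the averaged table, which no longer changes.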

Ideally, when solving to find the perfect GTO solution, you wouldn't need to play any hands at all or make any adjustments in any way whatsoever. Experience is completely unnecessary. You simply calculate the solution. The main hurdles are finding the best algorithm and finding the memory and processing power to get this done.
02-04-2015 , 01:58 PM
I understand that perfectly well, honestly.

I said that it both pre-adjusts and balances. I didn't say that it adjusts, which it clearly wouldn't do.

The only level 3 action which it would need to do is preadjusting. Which you guys all say it's taught itself how to do. You also explained how it also already considered all the level 2 factors, as I described. So I was also wondering whether its algorithm had also included the relevant factors from levels 4 and 5 etc etc.

I guess this really comes down to whether these levels really are relevant while balancing. I'm discussing this in my own thread, 'query regarding game theory', I would really appreciate any input.