Alberta university Poker 'bot "solves" heads up limit hold 'em

01-09-2015 , 01:17 PM
See this article, and the related link to the university web site that includes an interesting display of the programmed strategy, including charts of call/raise/fold strategy for a heads up LIMIT hold 'em game. Is this interesting or useful, do we think? The website interface includes functions for calculating odds of particular hands, but it's a little clunky compared to other tools out there in the poker sphere.

According to the article, the researchers think that they have devised a computer program that will play "perfectly" and never lose in a 2-handed limit hold 'em game. Of course, it's only for heads up play, and only for Limit.

http://www.theverge.com/2015/1/8/751...rogram-cepheus
01-09-2015 , 01:26 PM
I want to know what lower bound they have on the maximally exploitative strategy.

edit: .986 mBB per game. So, around .05bb/100

Last edited by RussianRoulette; 01-09-2015 at 01:53 PM.
01-09-2015 , 01:46 PM
Maximum exploitability of their solution is given at just below 1mbb/g, so <0.1BB/100.
01-09-2015 , 02:38 PM
Quote:
Originally Posted by plexiq
Maximum exploitability of their solution is given at just below 1mbb/g, so <0.1BB/100.
A game is two hands, since the dealer and BB are asymmetrical.
01-09-2015 , 02:44 PM
Ah, you are correct of course.

Last edited by plexiq; 01-09-2015 at 02:49 PM. Reason: Good thing i hedged with "<" ;)
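
For reference, the unit conversion being debated above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the `hands_per_game` parameter are made up here, and whether a "game" counts as one hand or two is exactly the assumption the posters disagree on, so both readings are shown.

```python
def mbb_per_game_to_bb_per_100(mbb_per_game, hands_per_game):
    """Convert a rate in thousandths of a bb per game to bb per 100 hands.

    hands_per_game is an assumption supplied by the caller (1 or 2 in the
    discussion above); the unit "bb" is carried through unchanged.
    """
    bb_per_game = mbb_per_game / 1000.0
    bb_per_hand = bb_per_game / hands_per_game
    return bb_per_hand * 100.0


# The 0.986 figure is the one quoted in the thread.
print(round(mbb_per_game_to_bb_per_100(0.986, 1), 4))  # one hand per game
print(round(mbb_per_game_to_bb_per_100(0.986, 2), 4))  # two hands per game
```

Under the two-hands-per-game reading the result lands near the .05bb/100 quoted above; under the one-hand reading it is just under 0.1.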
01-09-2015 , 03:43 PM
My initial question is how the program will play in reality against a random opponent, since it has "perfected" its game only by playing billions of hands against itself. Aside from the limitations that it's only applicable to a 2-handed game, and only limit hold 'em, would it be able to "learn" fast enough in only a few hundred hands against different types of players to come out ahead?

For example, what if an opponent played every hand as if he had pocket Aces? How would the computer react to the betting sequences? At what point would the program figure out that it can't take the opponent's bets seriously and adjust?

The other interesting question would be whether the computer is somewhat exploitable based on the idea that the computer never bets with air -- the computer always makes pre-flop decisions based on the standard chart and makes post-flop decisions based on probabilities, but never fires hard with a true bluff. If we take all "air" bluffs out of play, does that give the human an advantage?

Very interesting
01-09-2015 , 03:46 PM
The bot plays a fixed strategy that does not change depending on the opponent's play. It is guaranteed to lose no more than 0.05bb/100 on average, regardless of the opponent's strategy. (See terminology sticky regarding GTO/Nash Equilibrium play.)
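
To make the "fixed strategy with a guaranteed floor" idea concrete, here is a minimal sketch on a toy zero-sum game (rock-paper-scissors), not on poker. The payoff matrix, the `worst_case_value` helper and the `rock_heavy` strategy are all assumptions chosen for illustration. It computes what a non-adapting strategy earns per game against whichever counter-strategy hurts it most.

```python
# Toy zero-sum game (rock-paper-scissors), not poker.
# PAYOFF[i][j] is our payoff when we play i and the opponent plays j
# (0 = rock, 1 = paper, 2 = scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def worst_case_value(strategy):
    """Our expected payoff per game if the opponent picks the best counter."""
    n = len(strategy)
    per_opponent_action = [
        sum(strategy[i] * PAYOFF[i][j] for i in range(n)) for j in range(n)
    ]
    # Checking pure counters is enough: no mix of them can do better
    # against us than the best single one.
    return min(per_opponent_action)


equilibrium = [1 / 3, 1 / 3, 1 / 3]   # fixed strategy, never adapts
rock_heavy = [0.5, 0.25, 0.25]        # hypothetical biased strategy

print(worst_case_value(equilibrium))
print(worst_case_value(rock_heavy))
```

The uniform mix cannot be pushed below the value of the game no matter what the opponent does, while the rock-heavy mix gives up a quarter of a unit per game to an opponent who always plays paper.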
01-09-2015 , 03:49 PM
[x] interesting
[x] useful
[ ] solved

IMO
01-09-2015 , 05:51 PM
Quote:
Originally Posted by NMcNasty
[x] interesting
[x] useful
[ ] solved

IMO
I'd call .05BB/100 solved. You have to draw the line somewhere.
01-09-2015 , 06:18 PM
NVG thread with authors from the paper contributing posts:

http://forumserver.twoplustwo.com/sh....php?t=1502189

Definitely agree we have to consider this solved.
01-09-2015 , 07:10 PM
Quote:
Originally Posted by RussianRoulette
I'd call .05BB/100 solved. You have to draw the line somewhere.
Yeah, preferably at the solution!


By the way, how big is the space for limit poker all the way to the river? It can't be so huge that it is impossible in principle for a supercomputer to solve, can it? How would one actually try to solve it? It might be an issue of not enough hard drive space, though.
01-09-2015 , 07:43 PM
They didn't use any abstractions, except for re-mapping suits etc. So the full solution would have exactly the same storage requirements as the current one.
01-09-2015 , 09:25 PM
It is solved in that the algorithm provably converges to a solution and they have the hardware to run it and save it. They ran the program for a couple of months on their supercomputer to get the solution they did. If they ran it longer it would get even closer.
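
The "run it longer and it gets closer" behavior can be seen in a toy setting. Below is a minimal sketch of regret-matching self-play on rock-paper-scissors. It is not the researchers' code, and the game, the starting regrets and the function names are all assumptions chosen for illustration; the point is only that the average strategy becomes less exploitable as the number of iterations grows.

```python
# Toy illustration only: regret-matching self-play on rock-paper-scissors.
# PAYOFF[i][j] is the row player's payoff for action i against action j
# (0 = rock, 1 = paper, 2 = scissors); the column player gets the negative.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def exploitability(strategy):
    """What the best column counter wins per game against a row strategy."""
    return max(-sum(strategy[i] * PAYOFF[i][j] for i in range(3))
               for j in range(3))


def self_play(iterations):
    """Return the row player's average strategy after `iterations` rounds."""
    # Hypothetical starting point: the row player begins biased toward rock,
    # so the run does not start at the equilibrium by accident.
    regrets_row = [1.0, 0.0, 0.0]
    regrets_col = [0.0, 0.0, 0.0]
    strategy_sum = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        row = strategy_from_regrets(regrets_row)
        col = strategy_from_regrets(regrets_col)
        # Expected payoff of each pure action against the opponent's mix.
        util_row = [sum(PAYOFF[i][j] * col[j] for j in range(3))
                    for i in range(3)]
        util_col = [-sum(PAYOFF[i][j] * row[i] for i in range(3))
                    for j in range(3)]
        ev_row = sum(row[i] * util_row[i] for i in range(3))
        ev_col = sum(col[j] * util_col[j] for j in range(3))
        for a in range(3):
            regrets_row[a] += util_row[a] - ev_row
            regrets_col[a] += util_col[a] - ev_col
            strategy_sum[a] += row[a]
    return [s / iterations for s in strategy_sum]


for n in (1, 100, 10000):
    print(n, exploitability(self_play(n)))
```

After a single iteration the average strategy is pure rock and loses a full unit per game to paper; with more iterations the exploitability shrinks toward zero without ever being required to hit it exactly, which is the crux of the "solved" argument in this thread.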
01-09-2015 , 09:32 PM
Quote:
Originally Posted by KevinGChapman
My initial question is how the program will play in reality against a random opponent, since it has "perfected" its game only by playing billions of hands against itself. Aside from the limitations that it's only applicable to a 2-handed game, and only limit hold 'em, would it be able to "learn" fast enough in only a few hundred hands against different types of players to come out ahead?

For example, what if an opponent played every hand as if he had pocket Aces? How would the computer react to the betting sequences? At what point would the program figure out that it can't take the opponent's bets seriously and adjust?

The other interesting question would be whether the computer is somewhat exploitable based on the idea that the computer never bets with air -- the computer always makes pre-flop decisions based on the standard chart and makes post-flop decisions based on probabilities, but never fires hard with a true bluff. If we take all "air" bluffs out of play, does that give the human an advantage?

Very interesting
During its billions of hands of training it learns a response that cannot lose to any possible strategy. It does not adjust at all in an actual game, whether its opponent caps every hand or folds every river. Its strategy can already defeat both of those and everything in between. Why would you think it doesn't bluff? It certainly does.
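
As a rough illustration of why an equilibrium strategy has to bluff, consider a simplified river spot. Everything here is an assumption for the sketch and nothing is taken from Cepheus's actual output: the bettor holds either a winning hand or a hopeless one, and the caller holds a hand that only beats bluffs. The function name and the pot size are hypothetical.

```python
from fractions import Fraction


def indifference_bluff_fraction(pot, bet):
    """Fraction of the bettor's bets that are bluffs when calling breaks even.

    Simplified model (an assumption): calling wins pot + bet against a bluff
    and loses bet against a value hand, while folding wins 0.
    Solving x * (pot + bet) - (1 - x) * bet = 0 gives x = bet / (pot + 2 * bet).
    """
    return Fraction(bet, pot + 2 * bet)


# Hypothetical limit hold'em river: 4 big bets in the pot, 1 big bet to call.
x = indifference_bluff_fraction(4, 1)
caller_ev = x * (4 + 1) - (1 - x) * 1
print(x, caller_ev)
```

If the bettor never bluffed in this model, the caller could fold every bluff-catcher at no cost, which is one way to see that a never-bluffing strategy would be exploitable.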
01-10-2015 , 09:28 AM
Quote:
Originally Posted by KevinGChapman
My initial question is how the program will play in reality against a random opponent, since it has "perfected" its game only by playing billions of hands against itself. Aside from the limitations that it's only applicable to a 2-handed game, and only limit hold 'em, would it be able to "learn" fast enough in only a few hundred hands against different types of players to come out ahead?

For example, what if an opponent played every hand as if he had pocket Aces? How would the computer react to the betting sequences? At what point would the program figure out that it can't take the opponent's bets seriously and adjust?

The other interesting question would be whether the computer is somewhat exploitable based on the idea that the computer never bets with air -- the computer always makes pre-flop decisions based on the standard chart and makes post-flop decisions based on probabilities, but never fires hard with a true bluff. If we take all "air" bluffs out of play, does that give the human an advantage?

Very interesting

It plays GTO, close to a Nash equilibrium, because it was trained by playing against itself. That's why it beats any player.
01-10-2015 , 11:59 AM
Quote:
Originally Posted by Allen C
It is solved in that the algorithm provably converges to a solution and they have the hardware to run it and save it. They ran the program for a couple of months on their supercomputer to get the solution they did. If they ran it longer it would get even closer.
Meaning this version playing against the version that was run longer would lose. So this version is not unexploitable and is not solved.

I mean I'm OK with "virtually solved", "computationally solved", "basically solved" or whatever but not "solved".

The game is "solved" when a strategy cannot be exploited even theoretically (for a symmetrical heads up game). Otherwise it's not.
01-10-2015 , 12:15 PM
Quote:
Originally Posted by RussianRoulette
I'd call .05BB/100 solved. You have to draw the line somewhere.
0.0 BB/100

I mean we actually had a thread and a debate about this a couple years ago when it was 2 BB/100. So I guess we'll be meeting up in another 2 years or so when it gets to 0.1 BB / 100, and then maybe again when it's 0.01 BB / 100.
01-10-2015 , 02:58 PM
Quote:
Originally Posted by NMcNasty
0.0 BB/100

I mean we actually had a thread and a debate about this a couple years ago when it was 2 BB/100. So I guess we'll be meeting up in another 2 years or so when it gets to 0.1 BB / 100, and then maybe again when it's 0.01 BB / 100.
Seriously this is the biggest bs semantics argument I've ever seen. So you're saying if they lost .00000001bb/100 'technically' it's not solved.
01-10-2015 , 03:13 PM
So how could 2 human players join forces to defeat the computer in a 3-way game, without ever communicating cards, just by playing in a way that gives the group the advantage and then splitting the profits? I bet it's doable, and it's where poker eventually goes when you cannot beat the best player in the world alone at a table, but through implicit collusion the losers become winners as a group. There is actually nothing illegal or unethical about that. Very advanced AIs of different skill levels, all very rational, would converge to such cooperative behavior if my intuition that this can work proves true. So imagine the best player on the planet against 8 very good players who always cooperate until they collectively overcome his edge.

Now that would be spectacular game theory research to perform, and it could in principle kill poker with n>=3 players for good, or create a new format of it to eliminate that issue.

The reason I think this may be possible is that you could allow yourself errors against players you do not mind being exploited by, because statistically this averages out over time, and all that remains is the advantage you gave the player who was locally exploiting you against the target player.


PS: Notice that in principle, if this works, they do not have to split profits at all in the end. They just keep playing thousands of hands at the same table, cycling through the same cooperative strategy, and the averaging will take care of it. This is why it's a problem if it's possible: in principle it's not illegal or unethical at all, it's part of a broader game. They all play a few thousand hands, figure out that someone is better, and then target that person to stop losing to him lol. But one needs to prove that this is doable and that the target player has no defense against it. That may not require fully solving the game, though, if the cooperative edge is significant.

Last edited by masque de Z; 01-10-2015 at 03:43 PM.
01-10-2015 , 04:34 PM
I'm not even really sure how a HU bot could play in a 3-way match - most of the possible game states don't exist in the bot's strategy, so then what is it gonna do?
01-10-2015 , 04:47 PM
Quote:
Originally Posted by NMcNasty
Meaning this version playing against the version that was run longer would lose. So this version is not unexploitable and is not solved.

I mean I'm OK with "virtually solved", "computationally solved", "basically solved" or whatever but not "solved".

The game is "solved" when a strategy cannot be exploited even theoretically (for a symmetrical heads up game). Otherwise it's not.
Cepheus is different than previous efforts that were beatable for only a small amount per hand in that it solves the full game rather than an abstract game. If you ran the training simulations for previous bots even longer there is no guarantee they would get any better. Their potential was capped at a certain exploitability. Cepheus is not and will converge to perfect. So I think that's something.

Last edited by Allen C; 01-10-2015 at 04:47 PM. Reason: typo
01-10-2015 , 06:20 PM
Quote:
Originally Posted by NMcNasty
[x] interesting
[x] useful
[ ] solved

IMO
+1

Good job to the team in Alberta, but this is just an accolade. Where is the end game with this avenue of gameplay? To me this is saturated due to bots, computers and technological advances today.
01-10-2015 , 06:30 PM
Quote:
Originally Posted by RustyBrooks
I'm not even really sure how a HU bot could play in a 3-way match - most of the possible game states don't exist in the bot's strategy, so then what is it gonna do?
Well, I am assuming the bot plays really well HU (using the near-solution) and can defeat each player individually, but when both of them play, the computer needs a 3-player solution. Still, the majority of the pots will typically be heads up (that is more likely to be true in NL, though). If the computer is typically OK in 3-way spots, it may recover the edge in HU pots, provided all 3 play for themselves and nobody else.

Now finding a full 3-person solution may be a lot tougher. The interesting question is whether a Nash solution exists for more than 2 players. If it does exist, then it won't matter how they collude (in the absence of communicating cards). But I think interesting issues emerge when there are more than 2 players. Do we have an existence theorem for more than 2 players? (I think the existence result is for non-cooperative games, but here you have 2 that cooperate vs one that doesn't.)

http://en.wikipedia.org/wiki/Nash_equilibrium

Last edited by masque de Z; 01-10-2015 at 06:48 PM.
01-10-2015 , 07:20 PM
Quote:
Originally Posted by masque de Z
So how could 2 human players join forces to defeat the computer in a 3-way game, without ever communicating cards, just by playing in a way that gives the group the advantage and then splitting the profits? I bet it's doable.
Of course it's doable. The obvious case is where the computer bets and the second player would normally raise with a middling hand to cut down the odds of the third. They just have to agree not to make these raises and the "GTO" computer is toast.
01-10-2015 , 07:25 PM
Quote:
Originally Posted by just_grindin
Seriously this is the biggest bs semantics argument I've ever seen. So you're saying if they lost .00000001bb/100 'technically' it's not solved.
Suppose I tell you I've solved Chess. My solution will get at least a draw against all gazillion variations except a few with each color. Would you say I've truly solved chess?

Last edited by heehaww; 01-10-2015 at 07:43 PM. Reason: Karganeth said it better