WCGRider, Dong Kim, Jason Les and Bjorn Li to play against a new HU bot

05-10-2015 , 01:57 PM
<3 Arty
05-10-2015 , 02:03 PM
"a stunning 7-1 draw"

lollllll
05-10-2015 , 02:03 PM
Throw in every 4-0 sweep in the World Series, Stanley Cup, NBA, etc., because none of those meet 95% either.
05-10-2015 , 02:04 PM
Quote:
Originally Posted by ArtyMcFly


More science...
Spoiler:

Thank you, man. I am on tilt every time I open this thread and see news from the "state of the art", especially in the statistics department. Made my day.
05-10-2015 , 02:04 PM
Quote:
Originally Posted by ArtyMcFly


More science...
Spoiler:
Awesome photoshops of that dinosaur Sandholm. Anyone got links to the CMU forums or site or anywhere the prof has a presence where we can post these and troll him?
05-10-2015 , 02:13 PM
Quote:
Originally Posted by lostinthesaus
A bot only losing 9 bb/100 to this team is very significant and should be taken very seriously. I think it will only get exponentially better from here on out both in speed of improvement and actual game improvement, especially with that Microsoft money backing it.

Would not be surprised to learn of many new poker think-tanks that have formed and existing ones newly inspired to start building/improving their own bots after this. Personally starting to feel the beginning of the true end of online poker. Win-rates sure to plummet as these bots improve. Legislation will likely be slowed for having to factor in bot use, penalties, TOC violations etc.

This was definitely a victory for the bot side of things. If not a victory for the bot, definitely a loss for the humans.

Live poker on the up-and-up. Who knew these live tourney donks had it right all along?

Very good article about AI I found extremely informative and appropriate for this thread:

http://waitbutwhy.com/2015/01/artifi...olution-1.html
http://waitbutwhy.com/2015/01/artifi...olution-2.html


Read the fine print in the scientific papers. Claudico is pretty much exactly what we expected (worse, actually; I was worried it might be balanced enough on later streets to actually have an edge). It still displays some problems, due to abstraction and the algorithms it uses, that are intractable at the moment. The game truly is far larger than LHE and they are having trouble with that. Maybe if they want to spend another million running the supercomputers for a couple of months to converge some of its bigger leaks etc., it will be marginally better, but AFAIK this is about as good as CMU can do with current techniques. I think we have more than a year. Sandholm thought he was already close to NE and would trash the brains. They tested Claudico on some mid-stakes East Coast noobs before this and crushed them. They figured they were ready to go à la Watson but got crushed.

I guess they still want grant money to try again, given the spin the prof has put on his failure.
05-10-2015 , 02:19 PM
Now that I have a soul again I will field some questions in the thread.

I will go ahead and answer the most recurring one: declaring this a statistical tie. I have talked with the CMU team a bunch, and I like the team. In science, my understanding is that they have to be very specific, and that the 95% confidence interval is what is used in their other matches. So they cannot say they lost with statistical significance, as that isn't true.

However, there are a few things that I have an issue with. The first is that while it might have been in the fine print as to what constitutes statistical significance, my attitude when arriving here was: win or lose, we play our game. If we win, that's great; if we lose, hats off to the CMU team. I didn't realize statistical significance was a thing until about 1 or 2 days were left, at which point I was going pretty crazy from the volume.

While they can't say it was a statistical win, I do feel that had the shoe been on the other foot, things would have been aired slightly differently. For example, the CMU computer science Twitter was tweeting things like "Doug Polk in press conference maintains that Claudico's winrate was impressive". lol winrate itt. I am more than happy to say good things about the program, as I think there were lots of things it did well. But there were also problems with the way its strategy worked in spots. That tweet was deleted after I complained to the CMU team, but I think it is a bit naive to think that, had the humans lost, the headline would have been "match ends in statistical tie".

I also think that for the university, it's a bit different. They have an incentive to look good and also are not as familiar with the constraints of human players. Fatigue played a major role down the stretch; I was doing almost no studying, and there were a couple of days in there where I was tilting my ****ing face off from how boring and tedious it had gotten. It is also hard for them though; I think they put on the blinders slightly when they are trying to achieve their goal.

IMO the headline should have read "Humans win over Computer with 92% confidence". If you look at the way that the majority of news outlets have reported this, they kind of cut through the tie stuff (although some, such as NBC, did not), because reporting it as a tie is also not a very attractive headline for them.

At the end of the day it is what it is. I said my piece to the media, and I can walk away feeling like I did my job and the human team won. That is all that really matters, and I'm not gonna battle it out about getting the exact quotes I would want from CMU; no value there anyway.

Also, to answer another question: I've been asked pretty constantly if I am going to do this again. I am most likely not going to. It was way too hard for me and I took too much of a hit in what my expected value is elsewhere. Not that 210/hr is bad, it just isn't really on the map with where I think I could most efficiently spend my time for how hard I had to work.
05-10-2015 , 02:24 PM
It has probably been answered already, but why was it 200bb instead of 100bb deep?
05-10-2015 , 02:26 PM
Quote:
Originally Posted by kingofcool
It has probably been answered already, but why was it 200bb instead of 100bb deep?
This is the standard format for bot challenges. The deeper you get, the more difficult the challenge of playing well becomes, because the game tree is bigger. So I think this was set as the standard. These guys don't know anything about what is played online; they aren't trying to solve the most popular form. They just wanted an arena to work within that would be more difficult to solve and allow more room to create strategies. (My understanding anyway; it would be better to ask Noam or Sam.)
05-10-2015 , 02:42 PM
Hi Doug,

Thank you for answering questions.

What win rate (bb/100) would be required in order to satisfy the 95% confidence?

Is this win rate attainable in all likelihood?
05-10-2015 , 02:49 PM
Doug:

Maybe this has been answered earlier, but were you allowed to know stats after the match daily? What about when it was over? How much more effective would a real-time HUD have been?
05-10-2015 , 02:52 PM
Quote:
Originally Posted by Jaznin
Hi Doug,

Thank you for answering questions.

What win rate (bb/100) would be required in order to satisfy the 95% confidence?

Is this win rate attainable in all likelihood?
My understanding is approx. 10 bb/100, but again these are questions better suited for the programmers and less for me.

The winrate we won at? Or 10 bb? Both are certainly doable. Under more ideal circumstances we could do better, but that isn't gonna be as important, as next year I'm sure the bot will be stronger.
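
For anyone who wants to sanity-check these figures, here is a rough sketch. It is not CMU's actual calculation. It assumes a normal approximation and a two-sided interval, takes the approx. 10 bb/100 threshold above and the 9 bb/100 result quoted earlier in the thread as inputs, and backs out the standard error those two numbers imply. The function name `implied_confidence` is made up for illustration.

```python
from statistics import NormalDist


def implied_confidence(observed_bb100: float, threshold_bb100: float,
                       threshold_conf: float = 0.95) -> float:
    """Two-sided confidence level at which the observed win rate is just significant.

    Assumptions (not CMU's published method): the win-rate estimate is
    normally distributed, and threshold_bb100 is the win rate that would
    be exactly significant at threshold_conf.
    """
    std_normal = NormalDist()
    # z-score of a two-sided interval at threshold_conf (about 1.96 for 95%)
    z_threshold = std_normal.inv_cdf(0.5 + threshold_conf / 2)
    # Standard error implied by "threshold_bb100 is just significant"
    std_error = threshold_bb100 / z_threshold
    z_observed = observed_bb100 / std_error
    # Largest two-sided confidence level whose interval still excludes zero
    return 2 * std_normal.cdf(z_observed) - 1


if __name__ == "__main__":
    conf = implied_confidence(observed_bb100=9.0, threshold_bb100=10.0)
    print(f"implied two-sided confidence: {conf:.1%}")
```

With those inputs the result lands between 90% and 95%, which is in line with the confidence figures quoted in this thread.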
05-10-2015 , 02:53 PM
Quote:
Originally Posted by lostinthesaus
Doug:

Maybe this has been answered earlier, but were you allowed to know stats after the match daily? What about when it was over? How much more effective would a real-time HUD have been?
We got the hands after each day, for that day, and could import them and review. However, a real-time HUD would have been way better; this is the first time in easily 5-6 years I had to play "blind".
05-10-2015 , 02:53 PM
Quote:
Originally Posted by kingofcool
It has probably been answered already, but why was it 200bb instead of 100bb deep?

The bot competition where this thing descends from uses 200bb. In order for them to run at any other depth, they would have to rerun all the supercomputing time... And any corrections / improvements they have made to the base play would need to be redone, etc. It is not necessarily starting from scratch, but it would require a lot of work to change the depth. This is also why every single hand starts at 200 bb effective. The bot is built for that depth and that depth alone. "Mirroring" is used as the reasoning behind fixed bb hands, but mirroring against human opponents makes no sense. You can't erase the hard drive and have the same opponent play the opposite side lol.. mirroring against different opponents (as done here) has really confusing implications. Consider they set this up to run just like the bot competition for their advantage - I am a believer mirroring was kept just to make sure 200bb fixed could be carried through on "mirroring"'s back.

Great job Doug and team - nice victory and class acts the whole way.
05-10-2015 , 02:54 PM
Congrats for the win and thanks to everyone who participated in the challenge, it was fun watching the stream.
Humans!
05-10-2015 , 02:54 PM
Could the bot beat TCfromUB?
05-10-2015 , 02:56 PM
Quote:
Originally Posted by Mecastyles
You can at least say it is a ****ing ridiculous statement.

used to look up to the beserious



I never expressed my personal view on it. I was just clarifying that I had nothing to do with whether other people from CMU or the media called it a "statistical tie." I think it was a clear win for the humans, which was significant at the 90% level and not at the 95% level.
05-10-2015 , 02:58 PM
Here is how the annual computer poker competition reports a result not meeting their previously established significance threshold:

"Two-player no-limit Instant Runoff:

1. Tartanian7
2. Prelude
3. Hyperborean

Prelude, Hyperborean and Slumbot are not separated by a statistically significant margin."
05-10-2015 , 03:00 PM
And FYI Doug: they knew ahead of time that it took 100ks of hands to find 95% confidence against tough opponents (their match against "Prelude" was nearly 1 mil hands), so to me it seems like headline freerolling or a severe overestimation of their edge against humans.
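
To put rough numbers on why small edges need so many hands, here is a back-of-the-envelope sketch. It assumes a normal approximation and a two-sided interval. The per-100-hands standard deviation used in the demo is a hypothetical placeholder, not a figure measured from this match or from the computer poker competition, and `hands_needed` is an illustrative name.

```python
import math
from statistics import NormalDist


def hands_needed(edge_bb100: float, sd_bb100: float, conf: float = 0.95) -> int:
    """Hands after which an observed win rate of edge_bb100 is just significant.

    sd_bb100 is the standard deviation of results per block of 100 hands.
    It is an input you have to supply; the demo value below is hypothetical.
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    # Standard error after n blocks of 100 hands is sd / sqrt(n);
    # solve z * sd / sqrt(n) = edge for n.
    blocks = (z * sd_bb100 / edge_bb100) ** 2
    return math.ceil(blocks * 100)


if __name__ == "__main__":
    for edge in (10.0, 5.0, 2.0):
        print(f"edge {edge} bb/100 -> {hands_needed(edge, sd_bb100=100.0)} hands")
```

The thing to notice is the square: halving the edge roughly quadruples the hands required, so a match between closely matched opponents has to be very long before it clears 95%.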
05-10-2015 , 03:02 PM
Quote:
Originally Posted by punter11235
It's truly fascinating how you can come up with a game representation with 8 orders of magnitude more nodes to visit than needed. No wonder everything is intractable in the "state of the art" world.
Can you elaborate? You have a method for computing best response that is much faster than theirs? I haven't had a chance to look into that problem very closely. It's too bad there has been such a big disconnect between the academic researchers and the non-academic poker software developers. You should have submitted a bot to the competition and dropped by the AAAI workshop so we could have chatted.
05-10-2015 , 03:08 PM
To Doug/Sam/TimTam/punter/other super smart math/programmer/HU elites posting ITT:

I apologize if what I'm about to ask has been at least strongly implied by various posters smarter than me. I have read a majority of this thread, and I have not seen the following question asked, although I think Sam/punter/TimTam might have just assumed everyone is aware of what I'm about to ask and that the answer is yes.

Cliffs: Is it correct to say that due to bucketing/abstraction, the bot must treat a min bet as something greater than a min bet; therefore, the bot will always "overreact" to min bets? If correct, does the same logic apply to most all-ins for the same reason (i.e., the bot will always "underreact" to all-ins)?

Assumptions:

Due to the abstractions/estimations in the bot's algorithm which are necessary because the bot does not have an infinitely (or at least sufficiently) powerful computer to calculate the Nash Equilibrium play with all specific hands (e.g. K2o vs. K3o treated identically), and most importantly for the conclusions I am about to make, bet sizes, the bot "buckets" (a/k/a groups and treats identically) certain hands and bet sizes at any particular point in its game tree.

For instance, in a game with 200 big blinds, after a flop comes down where there were no pre-flop raises, maybe donk bets of 1, 2, and 3 big blinds are treated identically by the bot, because the bot does not have sufficient computational power to calculate distinct strategies to deal with 1 bb flop bets, 2 bb flop bets, and 3 bb flop bets, so to conserve resources it calculates one strategy that it will implement regardless of whether the flop bet of the opponent is 1, 2, or 3 bb.


Conclusions:

Due to this "bucketing" of the bet sizes of the bot's opponent, the bot will necessarily always over-value/overreact to min bets, because there is nothing below the min bet amount that the bot can "bucket." In other words, if the minimum bet is 1, the bot is at a disadvantage (so long as its opponent is aware of the bucketing and knows how to exploit it) because the bot's strategy to deal with a bet of 1 must always be sufficient to deal with a bet of 1 + X, and probably [1 + (X + 1)], [1 + (X + 2)], etc.

Due to the fact that the bot's opponent knows that there is no number less than the min bet for the bot to bucket, the min bet is necessarily at the very bottom of the bot's "bucket range" (I know that's not correct terminology, but I hope that makes sense). So where the human min donk bets a flop, the human knows that the bot has bucketed its min bet of 1 and is treating it as a bet of 1 + *some number*. Regardless of what *some number* is, the human knows the bot is going to treat the min donk bet as something greater than a min donk bet; therefore, the bot is necessarily not going to treat the min donk bet optimally (at least according to the bot's algorithm's version of GTO).

Similarly, the bot will under-react to/underestimate most all-in bets, because the bot will probably have bucketed the all-in bet amount with certain lesser bet amounts.

Unless I'm missing something, which is definitely possible, unless the bot's programmers specifically accounted for the possibility that the bot's opponent could have the above knowledge about the abstraction/bucketing it employs (or maybe they accounted for it for some other reason), the bot's opponent could always use the above information to exploit: 1) the bot's overreaction to min bets, and 2) the bot's underreaction to all-ins in most spots.

To Doug/cheet/dong/bjorn specifically - is this how you and the other players discovered the "min donk bet exploit," or was it just by chance?

Last edited by Rambler1; 05-10-2015 at 03:19 PM.
05-10-2015 , 03:37 PM
question 1: given the results of this challenge, was the human team's EV winrate over the course of the challenge (and thus under the conditions of the challenge) positive with at least 95% certainty?

question 2: given the results of this challenge and (my) understanding of grinding poker, would the human team have a positive EV winrate under normal grinding conditions (meaning the conditions under which the members of the team choose to play when they play professionally) with at least 95% certainty?

these are very different questions. question 2 cannot be answered solely by applying statistical calculations to the results/experiment and may contain counterfactual components. however, I am very confident that I can answer it by taking into account my experience as a poker player of how fatigue etc. affect winrates. Given the results and my understanding of winrates, mental game, grinding etc., it's well over 95% that the humans have an edge under their normal conditions. Of course, the CMU team can't calculate this, and shouldn't be expected to try to imo.

Last edited by Keruli; 05-10-2015 at 03:44 PM.
05-10-2015 , 03:47 PM
Doug, I know it's a strat question, but I hope you can answer since it's pretty simple (and doesn't apply to human play).

Why were you raising limps with almost 100% of your range? Claudico should be fairly balanced with its limps, I thought (obviously you have the HHs and can look through the database). Raising limps 100% is exploitative, so has Claudico adjusted over time to the 100% iso, in your opinion?
05-10-2015 , 03:51 PM
Another question, which you may or may not answer.

Do you think that it's always GTO (or mixed GTO) to shove the nuts on the river in position no matter the stack depth/range? As in, is it ever a mistake to do this with a value hand? I know Claudico's choice of bluffs was very strange with respect to blocker effects.

I remember Claudico doing this a LOT (with bluffs too ofc), but it's not something humans do.
05-10-2015 , 03:59 PM
Quote:
Originally Posted by Rambler1
To Doug/Sam/TimTam/punter/other super smart math/programmer/HU elites posting ITT:

I apologize if what I'm about to ask has been at least strongly implied by various posters smarter than me. I have read a majority of this thread, and I have not seen the following question asked, although I think Sam/punter/TimTam might have just assumed everyone is aware of what I'm about to ask and that the answer is yes.

Cliffs: Is it correct to say that due to bucketing/abstraction, the bot must treat a min bet as something greater than a min bet; therefore, the bot will always "overreact" to min bets? If correct, does the same logic apply to most all-ins for the same reason (i.e., the bot will always "underreact" to all-ins)?

Assumptions:

Due to the abstractions/estimations in the bot's algorithm which are necessary because the bot does not have an infinitely (or at least sufficiently) powerful computer to calculate the Nash Equilibrium play with all specific hands (e.g. K2o vs. K3o treated identically), and most importantly for the conclusions I am about to make, bet sizes, the bot "buckets" (a/k/a groups and treats identically) certain hands and bet sizes at any particular point in its game tree.

For instance, in a game with 200 big blinds, after a flop comes down where there were no pre-flop raises, maybe donk bets of 1, 2, and 3 big blinds are treated identically by the bot, because the bot does not have sufficient computational power to calculate distinct strategies to deal with 1 bb flop bets, 2 bb flop bets, and 3 bb flop bets, so to conserve resources it calculates one strategy that it will implement regardless of whether the flop bet of the opponent is 1, 2, or 3 bb.


Conclusions:

Due to this "bucketing" of the bet sizes of the bot's opponent, the bot will necessarily always over-value/overreact to min bets, because there is nothing below the min bet amount that the bot can "bucket." In other words, if the minimum bet is 1, the bot is at a disadvantage (so long as its opponent is aware of the bucketing and knows how to exploit it) because the bot's strategy to deal with a bet of 1 must always be sufficient to deal with a bet of 1 + X, and probably [1 + (X + 1)], [1 + (X + 2)], etc.

Due to the fact that the bot's opponent knows that there is no number less than the min bet for the bot to bucket, the min bet is necessarily at the very bottom of the bot's "bucket range" (I know that's not correct terminology, but I hope that makes sense). So where the human min donk bets a flop, the human knows that the bot has bucketed its min bet of 1 and is treating it as a bet of 1 + *some number*. Regardless of what *some number* is, the human knows the bot is going to treat the min donk bet as something greater than a min donk bet; therefore, the bot is necessarily not going to treat the min donk bet optimally (at least according to the bot's algorithm's version of GTO).

Similarly, the bot will under-react to/underestimate most all-in bets, because the bot will probably have bucketed the all-in bet amount with certain lesser bet amounts.

Unless I'm missing something, which is definitely possible, unless the bot's programmers specifically accounted for the possibility that the bot's opponent could have the above knowledge about the abstraction/bucketing it employs (or maybe they accounted for it for some other reason), the bot's opponent could always use the above information to exploit: 1) the bot's overreaction to min bets, and 2) the bot's underreaction to all-ins in most spots.

To Doug/cheet/dong/bjorn specifically - is this how you and the other players discovered the "min donk bet exploit," or was it just by chance?

There are two different kinds of "abstraction": card abstraction and betting abstraction. The "card abstraction" is like the bucketing of the K2o and K3o. For the "betting abstraction" we selected several bet sizes for each situation, e.g., 0.25 pot, 0.5 pot, 0.75 pot, pot, 2 pot, all-in. When the opponent chose a size in between two sizes that we had in the abstraction, we needed to interpret their size as one of the two. Here is a paper describing our main approach for doing this and a comparison to other approaches: http://www.cs.cmu.edu/~sganzfri/Translation_IJCAI13.pdf

Basically, it assumes that the opponent will call a bet of size x with probability 1/(1+x), which comes from pot odds. The main mapping we use is

f(x) = [(B-x)(1+A)]/[(B-A)(1+x)],

where A, B are sizes in the abstraction, and x is the size taken by the opponent.

For the flop minbet example, he bets 100 into a pot of 500, so x = 0.2. The closest actions we have are A = 0 (for check) and B = 0.25. Plugging these into the formula gives f(x) = 1/6 = 0.167. This is the probability that we map his bet down to 0 and interpret it as a check. So we pick a random number in [0,1], and if it's above 1/6 we interpret the bet as 0.25 pot, and otherwise as a check.
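
A minimal sketch of the mapping above, not the code used in the match. The names `prob_map_down` and `translate_bet` are made up for illustration. The size list is the example list from this post with all-in left out (its pot fraction depends on the stacks), and clamping bets outside the list to the nearest end is a simplification assumed here.

```python
import random


def prob_map_down(x: float, a: float, b: float) -> float:
    """f(x) = [(B - x)(1 + A)] / [(B - A)(1 + x)] for A <= x <= B.

    All sizes are fractions of the pot. The result is the probability of
    interpreting the opponent's bet x as the smaller abstraction size A.
    """
    if not (a < b and a <= x <= b):
        raise ValueError("need a < b and a <= x <= b")
    return ((b - x) * (1 + a)) / ((b - a) * (1 + x))


def translate_bet(x: float, sizes: list[float], rng: random.Random) -> float:
    """Randomly map a bet of x pot to one of the sorted abstraction sizes.

    sizes[0] == 0.0 stands for a check. Bets outside the list are clamped
    to the nearest end (an assumption of this sketch).
    """
    if x <= sizes[0]:
        return sizes[0]
    if x >= sizes[-1]:
        return sizes[-1]
    # First abstraction size at or above x; the one before it is below x.
    b_index = next(i for i, s in enumerate(sizes) if s >= x)
    b = sizes[b_index]
    if b == x:
        return x  # the bet is already in the abstraction
    a = sizes[b_index - 1]
    # Map down to A with probability f(x), otherwise up to B.
    return a if rng.random() < prob_map_down(x, a, b) else b


if __name__ == "__main__":
    sizes = [0.0, 0.25, 0.5, 0.75, 1.0, 2.0]
    # Flop min-bet example: 100 into 500, so x = 0.2, A = 0, B = 0.25.
    print(prob_map_down(0.2, 0.0, 0.25))
    rng = random.Random(0)
    print([translate_bet(0.2, sizes, rng) for _ in range(10)])
```

For the min-bet example the first print comes out at about 1/6, so roughly five times out of six the min bet is read as a 0.25 pot bet and the rest of the time as a check.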

In some of the sessions we used some different approaches for various reasons, which maybe I will discuss later.