WCGRider, Dong Kim, Jason Les and Bjorn Li to play against a new HU bot

05-09-2015 , 05:53 AM
Quote:
Originally Posted by Nit Bag
Won't the equity chops and mirrored hands be reflected in actual St Dev observed in 80K hands without need for adjustment?
The former yes, the latter no. To account for the mirroring, you would have to add up the results of each mirrored hand pair first, so you would have 40,000 double hands.

ignatius
05-09-2015 , 07:07 AM
Quote:
Sean Connery, Roger Moore, Daniel Craig, Pierce Brosnan
errrr which is the odd one out here?
05-09-2015 , 07:35 AM
I cannot understand how people are forming conclusions about the skill level of Claudico. Claudico losing "only" 9bb/100 vs the best in the world does not imply much about the strength of the program.

ITT people treating poker like chess too much IMO.
05-09-2015 , 07:53 AM
It's pretty obvious to me that not only the Claudico team, but random news networks would prefer to run the story- "man beaten by machine in poker". So you can't blame the Claudico team for trying to speak up for their bot.

Hell, even after the Mayweather-Pacquiao fight the Pacquiao side came up with all sorts of excuses, and the coach made fun of the fact that Mayweather had boxed well defensively. It's obvious that the professor who created the bot is not going to say "yes, we were outclassed" about HIS OWN bot when the results were reasonably close. His analogy with the amount of money wagered etc. was indeed a poor choice, however.

Anyway the media spin was against humans from the start, and the Claudico team only did as they were expected. People are so results oriented- during the first day everyone called the bot trash and didn't even bother talking about winrate (lol doug winning 20bb/100 lololol). If we did seek out a number for statistical significance beforehand maybe there wouldn't be this argument now.
05-09-2015 , 08:32 AM
Claudico team,

Can you post for us the standard deviation calculated naively (e.g., N=80,000), and then the standard deviation calculated intelligently (e.g., N=40,000)?
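
Without the raw data, the difference between the two calculations can only be sketched on synthetic numbers. The snippet below is an illustration, not the match data: the pair count, the two spread constants, and the way shared card luck is modeled are all assumptions made up for the example. It compares the standard error of the per-hand winrate when every hand is treated as independent against the one obtained by adding up each mirrored pair first.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# All three constants are hypothetical, chosen only for illustration.
N_PAIRS = 2000         # the real match had 40,000 mirrored pairs
CARD_LUCK_SD = 10.0    # assumed spread of the luck shared by a mirrored pair
PLAY_NOISE_SD = 5.0    # assumed spread of everything else

first, mirror = [], []
for _ in range(N_PAIRS):
    luck = random.gauss(0.0, CARD_LUCK_SD)
    # The same cards are dealt twice with seats swapped, so card luck that
    # helps the humans in one hand hurts them in its mirror.
    first.append(luck + random.gauss(0.0, PLAY_NOISE_SD))
    mirror.append(-luck + random.gauss(0.0, PLAY_NOISE_SD))

# Naive: treat all 2 * N_PAIRS hands as independent observations.
all_hands = first + mirror
naive_se = statistics.stdev(all_hands) / math.sqrt(len(all_hands))

# Paired: add up each mirrored pair first, then work with N_PAIRS double hands.
pair_sums = [a + b for a, b in zip(first, mirror)]
# Dividing by 2 converts "per double hand" back to "per hand".
paired_se = statistics.stdev(pair_sums) / math.sqrt(N_PAIRS) / 2

print(f"naive  SE of per-hand winrate: {naive_se:.3f}")
print(f"paired SE of per-hand winrate: {paired_se:.3f}")
```

Under these assumptions the paired figure comes out smaller, because the shared card luck cancels inside each pair. How large the gap is in the real match depends on how strongly the mirrored results are actually correlated.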
05-09-2015 , 09:43 AM
Quote:
Originally Posted by punter11235
Can we have data from the match used to calculate the variance?



Please... The only non-misleading thing to report is the raw data; then you write:

MY INTERPRETATION OF THE RESULTS:

"Our arbitrarily chosen 95% threshold, as commonly used by our colleagues, indicated this and that".

It is an arbitrary threshold. The term "statistically significant" is very misleading to people who don't understand statistics, because they think of it as an "it matters / it doesn't matter" thing. As you can see, even some professors think that way, claiming a "statistical tie" or whatnot.



Exactly. You had some assumptions before the match. Let's even assume that you didn't know anything about the playing entities. Now you have 80k hands played and you can say things like:

"Such an experiment, if repeated, would result in the humans winning this % of the time". Simple and understandable for anyone without a math background.
+1

The fact that people have such a hard time understanding this shows how badly stats is taught.

Imagine you have some prior beliefs about the humans' true winrate, w. After 80k hands, there is now a new distribution (the posterior distribution). Crudely speaking, about 90%+ of the probability mass of this new distribution will be positive.

The professor is calling it a "tie" because (again roughly speaking) less than 95% of the probability mass is positive. However, even if it was statistically significant at 95%, the shape of the posterior distribution of the winrate would not change very much.

This is just to say that 80k hands at a 9bb/100 winrate DOES strongly suggest the humans have an edge and it is dishonest to proclaim "tie"
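
To make the posterior argument concrete, here is a minimal sketch of a normal-normal update. It uses the two figures NoamBrown reported (quoted later in this thread): a 9.16bb/100 win and a 95% confidence interval of +/- 10.35bb/100. Everything else is an assumption for illustration: the winrate estimate is treated as normally distributed, and the prior (centered on zero, standard deviation 50bb/100) is a made-up, deliberately vague choice.

```python
from statistics import NormalDist

# Figures reported by the Claudico team in this thread.
observed_winrate = 9.16    # bb/100 in favor of the humans
ci95_half_width = 10.35    # bb/100

# Back out the standard error implied by the 95% interval.
standard_error = ci95_half_width / NormalDist().inv_cdf(0.975)

# Hypothetical prior: no edge expected either way, and very uncertain.
prior_mean = 0.0
prior_sd = 50.0

# Conjugate normal-normal update: precisions add, and the posterior mean
# is the precision-weighted average of prior mean and observation.
prior_precision = 1 / prior_sd**2
data_precision = 1 / standard_error**2
post_precision = prior_precision + data_precision
post_mean = (prior_mean * prior_precision
             + observed_winrate * data_precision) / post_precision
post_sd = post_precision ** -0.5

posterior = NormalDist(post_mean, post_sd)
prob_humans_ahead = 1 - posterior.cdf(0.0)

print(f"posterior mean: {post_mean:.2f} bb/100, sd: {post_sd:.2f}")
print(f"P(true winrate > 0): {prob_humans_ahead:.3f}")
```

With a prior this wide, the posterior is driven almost entirely by the data, and most of its mass sits above zero. A tighter prior around zero would pull that probability down, which is why the choice of prior needs to be stated.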
05-09-2015 , 10:24 AM
Quote:
Originally Posted by polarizeddeck
+1

The fact that people have such a hard time understanding this shows how badly stats is taught.

Imagine you have some prior beliefs about the humans' true winrate, w. After 80k hands, there is now a new distribution (the posterior distribution). Crudely speaking, about 90%+ of the probability mass of this new distribution will be positive.

The professor is calling it a "tie" because (again roughly speaking) less than 95% of the probability mass is positive. However, even if it was statistically significant at 95%, the shape of the posterior distribution of the winrate would not change very much.

This is just to say that 80k hands at a 9bb/100 winrate DOES strongly suggest the humans have an edge and it is dishonest to proclaim "tie"
if you think normal stats are taught poorly, you are in for a world of hurt if you try to talk bayesian statistics w/ americans. i'm guilty of this too (i got a "pity pass" in my phd level bayesian stats course taught by siddhartha chib while getting my mba). there is 0% chance i legit passed that course. but at least i can understand/read bayesian papers now without puking lol. and i'm a former economist and statistician. so imagine the average american learning this stuff. complete fail.

to your point though, you're generally right that if we started with a naive interpretation of the prior, we'd end up w/ the vast majority of the posterior distribution as human>computer. to actually check this, i'd need the data and i'd have to whip out the 'ol gauss program (yes, we learned in gauss b/c our teacher was an adjunct assh*le who wouldn't listen to logic/reason that nobody but him and his team at the st.louis fed uses that program).
05-09-2015 , 10:29 AM
Quote:
Originally Posted by Allen C
Quote:
The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet. The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount.
Also, I'm guessing that $170 million bet included uncalled bets. Wouldn't Claudico's many overbets that rarely get called massively increase the "pot" compared to the amount won, even if the humans' calling strategy soundly beat them? Perhaps those bets increased the standard deviation over normal human matches?

But yeah, complete bs that he didn't back it up with actual statistics rather than, "look, here's a big number and here's an irrelevant small number so it's a tie."
It's far from complete bs, he just forgot to mention that $170 million is >5% of the 80k*40k maximum that could've been wagered and thus statistically significant (.. Kappa).

I would also like to have the data fwiw.
05-09-2015 , 10:52 AM
Quote:
Originally Posted by RussianRoulette
I cannot understand how people are forming conclusions about the skill level of Claudico. Claudico losing "only" 9bb/100 vs the best in the world does not imply much about the strength of the program.

ITT people treating poker like chess too much IMO.
do you even poker bro?
05-09-2015 , 11:13 AM
are they finished? final winnings?
05-09-2015 , 11:24 AM
Quote:
Originally Posted by +rep_lol
do you even poker bro?
I think his point was that in chess, basically playing ability is transitive. A>B, B>C pretty much guarantees A>C.

But in poker, it's possible that Claudico only loses 9bb/100 vs the top 4 hunl players, but also loses at 9bb/100 vs midstakes regs.

He could mean something else, but this is a good point.
05-09-2015 , 12:26 PM
Quote:
Originally Posted by polarizeddeck
Then as long as rho <= -0.33 (which I think is reasonable), the humans are winning at 95%.
Shouldn't the correlation coefficient be positive?
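
On the sign question, a short derivation may help. It is a sketch under assumptions, not a reconstruction of the quoted post's calculation: X_1 and X_2 are the humans' results in the two hands of a mirrored pair, both measured from the human side, each with variance \sigma^2 and correlation \rho between them.

```latex
\[
\operatorname{Var}(X_1 + X_2) = 2\sigma^2 (1 + \rho),
\qquad
\mathrm{SE}_{\text{paired}} = \mathrm{SE}_{\text{naive}} \sqrt{1 + \rho}
\]
```

Measured this way, good cards for the human in one hand are good cards for the bot in the mirror, so \rho would be negative and the interval narrows; \rho = -0.33 shrinks it by a factor of \sqrt{0.67} \approx 0.82. If instead both results are measured for whoever holds the same cards, the correlation is positive, which may be where the two sign conventions diverge.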
05-09-2015 , 12:36 PM
Quote:
Originally Posted by fees
Looking to bet on the empire, PM me.
Did anyone take you up on this offer, and are they refusing to pay up because "it was a tie"?
05-09-2015 , 12:41 PM
Quote:
Originally Posted by watergun7
I think his point was that in chess, basically playing ability is transitive. A>B, B>C pretty much guarantees A>C.

But in poker, it's possible that Claudico only loses 9bb/100 vs the top 4 hunl players, but also loses at 9bb/100 vs midstakes regs.

He could mean something else, but this is a good point.
lol based on your posting itt, i really don't need you to explain anything to me. but thanks.
05-09-2015 , 01:22 PM
doug polk's interview on the pokercast last week was interesting in regards to how the bot was playing
05-09-2015 , 02:01 PM
Quote:
Originally Posted by cheet
Both sides went into this not knowing quite what to expect. The CMU team has an idea of that now and we know whatever they're willing to submit to a challenge in the future will be quite an improvement. I don't really know what to expect in a year or two, but I know whatever they have in the future will follow the same trajectory so far and that it will be much better and tougher than any previous iteration. Not sure what to say other than that, it's a hard thing to predict.

Seems to me like Carnegie Mellon and Sandholm built their bot as best they could, tested it, and now the experiment is over. Kind of like when Watson played Jeopardy. They tested their algorithms or whatever and now they can start to apply it to more real life relevant tasks and actually start to do something useful. Not just engage in the mental exercise of solving poker which in fact helps nobody and only really serves to bring about the competitive end to NLHE ala lhe. This whole thing seemed like a blown up version of some grad student's project. Now apply this bot's tech to something useful.

Sorry you guys lost and weren't the ones to beat poker. You did respectably nevertheless. It kind of sounds like with your methodology you most likely won't be the ones to get there even if you try because Punter already has that on lock and you guys can't even calculate max exploit. I think for Sandholm to go back to the drawing board for a few years and to try to come back with something better would be super lame. This was a worthwhile endeavor on its own but to take it further would be a waste of a great university's resources and of great minds as well. World has a lot of problems, do something that matters yo.
05-09-2015 , 02:59 PM
Quote:
Originally Posted by UpHillBothWays
if you think normal stats are taught poorly, you are in for a world of hurt if you try to talk bayesian statistics w/ americans. i'm guilty of this too (i got a "pity pass" in my phd level bayesian stats course taught by siddhartha chib while getting my mba). there is 0% chance i legit passed that course. but at least i can understand/read bayesian papers now without puking lol. and i'm a former economist and statistician. so imagine the average american learning this stuff. complete fail.

to your point though, you're generally right that if we started with a naive interpretation of the prior, we'd end up w/ the vast majority of the posterior distribution as human>computer. to actually check this, i'd need the data and i'd have to whip out the 'ol gauss program (yes, we learned in gauss b/c our teacher was an adjunct assh*le who wouldn't listen to logic/reason that nobody but him and his team at the st.louis fed uses that program).
Did you switch to a field that doesn't require the ability to write in English?
05-09-2015 , 03:09 PM
Quote:
Originally Posted by RickySteve
Did you switch to a field that doesn't require the ability to write in English?
MBA from university of phoenix online where they pass you regardless of intelligence as he admitted
05-09-2015 , 03:16 PM
Some people in this thread stated chess is way more complicated than nlhu.

When it comes to computing a solution to a game, isn't it only the number of decisions that determines how complicated the game is? As in, which is more difficult to solve computationally? What did I miss?
05-09-2015 , 03:25 PM
Media reporting it as a tie (within article)....

http://www.nbcnews.com/tech/innovati...am-ace-n356111
05-09-2015 , 03:37 PM
Quote:
Originally Posted by bip!
I gotta believe they thought the thing was unbeatable.

Tartarian stomped the bot competition - this stomped Tartarian... so he figured it was near perfect.

And I keep harping on it - but hand mirroring was likely set up to justify 200bb fixed strategy as much as it was for statistical reasons (props to them for going all the way to 200bb though).
200bb probably benefited them too; the bot made all its money on the river.

Really tilting seeing this professor flat out lie. Can someone correct this to whatever media sources he is reporting to?
05-09-2015 , 03:44 PM
Quote:
Originally Posted by NoamBrown
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
I'm sure you say that in reverse. At the beginning of this you guys said 95% would be considered significant.
05-09-2015 , 05:53 PM
Quote:
Originally Posted by elliot10181
Media reporting it as a tie (within article)....

http://www.nbcnews.com/tech/innovati...am-ace-n356111
at least every article so far starts off saying the humans won
05-09-2015 , 05:58 PM
I will recall another example where the variance in mirrored hands was quite extreme.

VS WCG
WCG: Bot raises 2.5x J8ss, WCG calls T5o
Flop: 885r
wcg chk, bot bets, wcg call
Turn: 5
wcg chk, bot bets, wcg call
River: 5
wcg chk, bot overbets, wcg shoves allin, bot calls.

Vs Jason Les
Preflop: Bot limps btn with T5o, Jason Les makes it 3.5x with Js8s, bot folds.
(Future board runout: 885-5-5)


19.9k differential in favor of humans.

Last edited by redfin; 05-09-2015 at 06:05 PM.
05-09-2015 , 06:01 PM
Quote:
Originally Posted by elliot10181
Media reporting it as a tie (within article)....

http://www.nbcnews.com/tech/innovati...am-ace-n356111
Absolutely ridiculous how they have spun it. "... a slim victory for the humans — so slim, in fact, that the scientists running the show said it's effectively a tie"

Imagine watching the giants winning the world series 4-0 in 2012. And instead of reporting that the giants were victorious and were the better team in the world series, saying: "the probability that this outcome or better could happen if the giants' true winrate was 50% is 6.25%. This is not significant at the 5% level so it is effectively a tie"

Everyone would see how moronic a statement this is. Yes, you wouldn't bet your life that the giants are better than the tigers after watching that, but the evidence is certainly suggestive. In the case of the poker match, the humans' winrate (based on noam's #s) is significant at about the 92% level, so it's a similar scenario.
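
Both numbers in this comparison can be checked with a few lines. This is a rough sketch that assumes the winrate estimate is normally distributed and takes NoamBrown's reported figures (9.16bb/100 won, 95% interval of +/- 10.35bb/100) at face value; the variable names are just for illustration.

```python
from statistics import NormalDist

# World Series analogy: chance that an evenly matched team wins four straight.
p_sweep = 0.5 ** 4

# Poker match: back out the standard error from the reported 95% interval,
# then see how many standard errors the observed winrate is from zero.
std_normal = NormalDist()
standard_error = 10.35 / std_normal.inv_cdf(0.975)
z_score = 9.16 / standard_error

one_sided_p = 1 - std_normal.cdf(z_score)   # H1: humans are ahead
two_sided_p = 2 * one_sided_p               # H1: the two sides differ
# Widest two-sided confidence level whose interval still excludes zero.
confidence_level = 1 - two_sided_p

print(f"P(4-0 sweep | even teams) = {p_sweep:.4f}")
print(f"z = {z_score:.2f}")
print(f"two-sided p = {two_sided_p:.3f}, one-sided p = {one_sided_p:.3f}")
print(f"two-sided confidence level excluding zero = {confidence_level:.3f}")
```

Under these assumptions the result misses the two-sided 95% bar but clears a one-sided one, so whether it counts as "significant" depends on which test was fixed in advance.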