WCGRider, Dong Kim, Jason Les and Bjorn Li to play against a new HU bot

05-08-2015 , 01:53 PM
Quote:
Originally Posted by Joshua
I didn't see the streamed press conference but on the competition homepage the professor says it was a statistical tie:

But statistically, says Tuomas Sandholm, the CMU professor of computer science who led the Claudico's development, it was a tie, and the results of the experiment would seem to be inconclusive.
The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet. The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount.
It's impossible to say that the humans clearly beat Claudico, according to Sandholm.


Link:
https://www.cs.cmu.edu/brains-vs-ai#May-8-2015
completely disgraceful
05-08-2015 , 01:55 PM
Quote:
Originally Posted by Joshua
I didn't see the streamed press conference but on the competition homepage the professor says it was a statistical tie:

But statistically, says Tuomas Sandholm, the CMU professor of computer science who led the Claudico's development, it was a tie, and the results of the experiment would seem to be inconclusive.
The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet. The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount.
It's impossible to say that the humans clearly beat Claudico, according to Sandholm.


Link:
https://www.cs.cmu.edu/brains-vs-ai#May-8-2015
I'd love to hear Sam's thoughts on this lol. Isn't he a two-time supernova elite? How could this statement pass through his bs filter???
05-08-2015 , 01:59 PM
Had Claudico been +9bb/100 rather than -9bb/100... I wonder what tune the professor would be singing
05-08-2015 , 01:59 PM
Quote:
The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet. The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount.
Also, I'm guessing that $170 million bet included uncalled bets. Wouldn't Claudico's many overbets that rarely get called have massively increased the "pot" compared to the amount won, even if the humans' calling strategy soundly beat them? Perhaps those bets increased the standard deviation over normal human matches?

But yeah, complete bs that he didn't back it up with actual statistics rather than, "look, here's a big number and here's an irrelevant small number so it's a tie."
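
To put the two quoted dollar figures on a poker scale, here is a rough sketch that converts them into bb/100. The $100 big blind is an assumption that is not stated in this thread; the 80,000 hands figure is the one quoted further down. Under that assumption the winnings work out to roughly 9.16 bb/100, and the share of money bet comes to about 0.43%. That share is a ratio of totals and carries no information about variance, so on its own it says nothing about statistical significance.

```python
# Figures quoted from the article and from later posts in this thread.
winnings_usd = 732_713
total_bet_usd = 170_000_000
hands = 80_000

# Assumption (not stated in this thread): the big blind was $100.
big_blind_usd = 100

# Share of all money bet that ended up as winnings (the article's measure).
share_of_bets = winnings_usd / total_bet_usd

# The same winnings expressed as a poker win rate in big blinds per 100 hands.
winrate_bb_per_100 = winnings_usd / big_blind_usd / hands * 100

# Average amount bet per hand, in big blinds, including uncalled bets if
# those were counted in the $170 million.
avg_bet_per_hand_bb = total_bet_usd / big_blind_usd / hands

print(f"share of money bet: {share_of_bets:.2%}")
print(f"win rate: {winrate_bb_per_100:.2f} bb/100")
print(f"average bet per hand: {avg_bet_per_hand_bb:.2f} bb")
```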
05-08-2015 , 02:02 PM
It's not too surprising after seeing him talk on stream for 2 weeks. He talks about Claudico like a beloved son. I'm not sure what's worse though, calling the end result a tie or when he told somebody that the bot was in 4th place out of 5.
05-08-2015 , 02:03 PM
Quote:
Originally Posted by tartarus
I'd love to hear Sam's thoughts on this lol. Isn't he a two-time supernova elite? How could this statement pass through his bs filter???
The Prof probably doesn't like being told he is wrong if his interpretation of this event is any indication.

The best aspect of this event for me was the insight it gave me into the minds of academics. I will now no longer trust and most certainly will verify what "experts" say.
05-08-2015 , 02:10 PM
Quote:
Originally Posted by 200zoomgrinder
WOW at tie. I thought Sam was saying that they would accept 95% confidence. At std dev = 160bb/100 (quite high for HUNL I think) and winrate 9.56 humans are still 95% confidence to be winning.

Using equity chops usually reduces std dev by about 20% @ 100BB, maybe a bit less @ 200BB since you get AI pre river less. So maybe 10%. Mirror I would guess reduces (significantly) more than equity chops, but again maybe let's say 10%. So conservatively the adj. std dev is 129.6. This gives humans 98% confidence of being winners. Nice "tie".
Won't the equity chops and mirrored hands be reflected in actual St Dev observed in 80K hands without need for adjustment?
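
The confidence figures in the quoted post can be reproduced with a short sketch. It assumes 100-hand blocks are independent and roughly normally distributed, and it reads "confidence of being winners" as a one-sided probability that the true win rate is above zero. The function name `confidence_of_winning` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def confidence_of_winning(winrate_bb100, std_bb100, hands):
    """One-sided confidence that the true win rate is above zero.

    Assumes independent, roughly normal results per 100-hand block.
    """
    blocks = hands / 100
    standard_error = std_bb100 / sqrt(blocks)
    return NormalDist().cdf(winrate_bb100 / standard_error)


# Two successive 10% reductions from the quoted post: 160 * 0.9 * 0.9.
adjusted_std = 160 * 0.9 * 0.9

raw_confidence = confidence_of_winning(9.56, 160, 80_000)
adjusted_confidence = confidence_of_winning(9.56, adjusted_std, 80_000)

print(f"std 160.0: {raw_confidence:.1%}")
print(f"std {adjusted_std:.1f}: {adjusted_confidence:.1%}")
```

With these inputs the two results land at roughly 95% and 98%, matching the quoted post. As the reply points out, a standard deviation measured from the actual 80K hands would already include the effect of equity chops and mirroring, so no hand-tuned adjustment would be needed.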
05-08-2015 , 02:16 PM
a lot of low and midstakes players with nearly 170 million wagered would love to have a statistical tie in their bank account
05-08-2015 , 02:19 PM
and once again people turn away from their ego problem to a professor's ego problem.

Doug said that he beats the top regs he plays for more than 9bb and that claudico is extremely good.
This bot is still better than every single person that posted in this thread besides bjorn/doug/dong
05-08-2015 , 02:21 PM
Quote:
Originally Posted by mack's
and once again people turn away from their ego problem to a professor's ego problem.

Doug said that he beats the top regs he plays for more than 9bb and that claudico is extremely good.
This bot is still better than every single person that posted in this thread besides bjorn/doug/dong
whats your point? everything said on this page is still true, this wasnt a tie it was a very clear victory for humans.

Nobody said the bot was bad.
05-08-2015 , 02:21 PM
Quote:
Originally Posted by Kirbynator
a lot of low and midstakes players with nearly 170 million wagered would love to have a statistical tie in their bank account
epic, quoted for posterity

lol@tie
05-08-2015 , 02:28 PM
LOL@ tie

I really enjoyed these matches over the last weeks, it was very entertaining and they did a great job representing the poker community (especially Doug).
05-08-2015 , 02:32 PM
Quote:
Originally Posted by mack's
and once again people turn away from their ego problem to a professor's ego problem.

Doug said that he beats the top regs he plays for more than 9bb and that claudico is extremely good.
This bot is still better than every single person that posted in this thread besides bjorn/doug/dong
Look, we all heard clearly from Doug's statement that Claudico was a strong opponent and that he beats the top regs for more than the winrate against Claudico. No one is doubting Claudico's strength.

What is crazy and totally illogical is the Professor's means of evaluating the match. 9bb/100 tells us (according to Doug) that Claudico is a strong opponent. However, it is nowhere near statistically insignificant, just non-conclusive. Look at the way they evaluated the results: looking at the total amount of money wagered, saying that the money won by the humans was a small fraction of that. Anyone who's played a decent amount of poker should understand how stupid that is.

As someone posted, consider the headline the Professor would write if Claudico was up 9bb/100. I'm sure it would not be statistical tie. A much better headline would be that the humans beat Claudico in a close match that demonstrated an AI can play at a world-class level in heads up no limit.

Last edited by tartarus; 05-08-2015 at 02:43 PM.
05-08-2015 , 02:33 PM
if it's a statistical tie, hopefully we will see some real money matches between top players and claudico with the professor backing claudico
05-08-2015 , 02:33 PM
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
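
The significance claim in this post can be checked with a small sketch. It assumes the +/- 10.35bb/100 figure is a symmetric two-sided 95% interval under a normal approximation; the variable names are illustrative only.

```python
from statistics import NormalDist

winrate = 9.16          # bb/100, from the post
ci_half_width = 10.35   # bb/100, 95% confidence interval from the post

# Back out the standard error from the 95% half-width.
z95 = NormalDist().inv_cdf(0.975)
standard_error = ci_half_width / z95

# z-score of the observed win rate and its two-sided p-value.
z = winrate / standard_error
p_two_sided = 2 * (1 - NormalDist().cdf(z))

lower, upper = winrate - ci_half_width, winrate + ci_half_width

print(f"95% interval: {lower:.2f} to {upper:.2f} bb/100")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

The interval includes zero, so the result is not significant at the two-sided 95% level, while the two-sided p-value sits between 0.05 and 0.10. That is consistent with "a pretty strong lead, but not statistically significant at 95%".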
05-08-2015 , 02:38 PM
Quote:
Originally Posted by cheet
http://pittsburgh.cbslocal.com/2015/...rivers-casino/

A longer segment will be on the CBS morning show tomorrow at (I believe) 9:30 am est
good segment. look forward to seeing the longer one.
05-08-2015 , 02:38 PM
Quote:
Originally Posted by NoamBrown
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
That's great and a fine analysis, but that's not how it was remotely phrased in the article. You could have said it was statistically insignificant with that analysis attached, but what's actually written is really horrible:

"The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet.

The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount."
05-08-2015 , 02:39 PM
Quote:
Originally Posted by Nit Bag
Won't the equity chops and mirrored hands be reflected in actual St Dev observed in 80K hands without need for adjustment?
With 80k hands, the observed variance probably should be used, rather than just assuming it's the same as the variance of human play. Same argument for directly including the effects of duplicate and equity chops, rather than just assuming it's around 25% for duplicate and ...?% for all-in equity. Has anyone been crazy enough to transcribe all of the hands from stream?

I'd have guessed a standard deviation of around 16-18 big blinds/hand (sorry about units!) with all-in equity: good bot vs bot play in the last computer poker competition dropped down under 20 IIRC, and some of our own tests are sitting in that range. Duplicate seems to give around a 25% reduction in the standard deviation for no-limit. That would give an estimated 95% confidence interval of about +/- 0.083 big blinds/hand for average winnings over 80k duplicate hands.

I'd have liked a stronger argument than "It's a tie!"

Edit: And there's some of the actual data. Good!
Surprisingly high values. Either duplicate variance not being computed correctly (I hope not!), or duplicate wasn't giving 25% reduction, or the raw money variance was all the way up around 20 big blinds / hand. Or some combination.
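
The estimate in this post can be reproduced as a rough sketch. It assumes the low end of the guessed range (16 big blinds/hand), a 25% duplicate reduction, and 80,000 hands treated as independent. Under those assumptions, 0.083 big blinds/hand comes out as the 95% half-width (1.96 standard errors) rather than as one standard error. That is about +/- 8.3bb/100, which is why the reported +/- 10.35bb/100 looks high by comparison.

```python
from math import sqrt

raw_std_per_hand = 16.0      # bb/hand, low end of the 16-18 guess
duplicate_reduction = 0.25   # guessed reduction from duplicate play
hands = 80_000

# Per-hand standard deviation after the duplicate reduction.
std_per_hand = raw_std_per_hand * (1 - duplicate_reduction)

# Standard error of the average winnings per hand, and the 95% half-width.
standard_error = std_per_hand / sqrt(hands)
half_width_95 = 1.96 * standard_error

print(f"standard error: {standard_error:.4f} bb/hand")
print(f"95% half-width: {half_width_95:.3f} bb/hand "
      f"({half_width_95 * 100:.1f} bb/100)")
```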
05-08-2015 , 02:40 PM
Quote:
Originally Posted by NoamBrown
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
What St. Dev was arrived at? Calling it a statistical tie is not correct though is it? More correct to say, as you did, that the win is not conclusive at the 95% confidence interval.
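
One way to approach the standard deviation question is to back it out of the published interval. This is only a sketch: it treats the 80,000 hands as independent observations, which may not match how the developers handled the mirrored pairs, so the figure they actually arrived at could differ.

```python
from math import sqrt

ci_half_width_bb100 = 10.35   # 95% half-width from the developer's post
hands = 80_000

# Standard error of the win rate in bb/100 (95% half-width = 1.96 SE).
standard_error_bb100 = ci_half_width_bb100 / 1.96

# Implied standard deviation per 100-hand block and per single hand.
std_per_100_hands = standard_error_bb100 * sqrt(hands / 100)
std_per_hand = std_per_100_hands / sqrt(100)

print(f"implied std dev: {std_per_100_hands:.1f} bb/100 "
      f"({std_per_hand:.2f} bb/hand)")
```

Under these assumptions the implied value is a little under 150bb/100, or about 15 big blinds/hand.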
05-08-2015 , 02:44 PM
If the devs could - would you please put the histogram up of results at 100 hand intervals?
05-08-2015 , 02:48 PM
Quote:
Originally Posted by NoamBrown
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
Around 10bb/100 is a wide lead. Calling it a "tie" is forging the results (I mean: giving false information, or giving valid information in a way that misleads the audience).

We all play poker, and a lot of us have experience in deep-stack games (100bb+), so we know what winrate can be expected over 80k hands (not to mention that the hands were mirrored and equity-chopped). Against a good player it is a wide win (against a weak player 10bb/100 is not a great result, ofc). You can tell the audience your AI played better than a bad player, but if you say Claudico tied the human team, we will know it goes against the truth and against the interest of all poker players (so this forging will probably become publicly well known).
05-08-2015 , 02:52 PM
Quote:
Originally Posted by bip!
If the devs could - would you please put the histogram up of results at 100 hand intervals?

Or even better, the bb result of each hand.
05-08-2015 , 02:53 PM
Quote:
I suspect your algorithm for finding max exploitability would be useful, using a few fixed bet sizes, but not rigorous enough for academia. All your own arguments about claims without rigor would equally apply to your own metric.
No. That would be a great lower bound on exploitability. Anybody who has done some poker programming knows that, and they also know it's quite a useful bound which is hard to get down to low values.

Quote:
But statistically, says Tuomas Sandholm, the CMU professor of computer science who led the Claudico's development, it was a tie, and the results of the experiment would seem to be inconclusive.
The "pot" won by the human poker pros—$732,713—sounds very impressive. But during the two week "Brains vs. A.I." challenge, $170 million was bet. The winnings amount to less than one-half of 1 percent of the total pot—a statistically tiny amount.
It's impossible to say that the humans clearly beat Claudico, according to Sandholm.
Well it was clear for some time he doesn't know what he is talking about anyway. It is just ridiculous but unlike claims about programming it's easily verifiable as ridiculous even by non-programmers.

Quote:
Prof needs to keep the grant money flowing. I don't think they set all of this up expecting the bot to lose lol.
This is tilting because Holdem is a simple enough game that if you throw resources at it you will get a decent AI even if you are absolutely clueless about poker and poker programming. It's tilting the hell out of me even more so seeing how Sam is apparently a very reasonable guy who is trying to answer questions and gives honest info/assessments while the boss of the project is a major bs'er.

Quote:
Edit: And there's some of the actual data. Good!
Surprisingly high values. Either duplicate variance not being computed correctly (I hope not!), or duplicate wasn't giving 25% reduction, or the raw money variance was all the way up around 20 big blinds / hand. Or some combination.
At this point I wouldn't trust the prof to calculate variance correctly. I think the players have their HHs from the match so maybe they could ask someone to write a simple script and verify it (I understand they don't want to disclose them).
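
A verification script along those lines might look like the sketch below. The input format is hypothetical: it assumes the hand histories have already been reduced to one pair of human results (in big blinds) per mirrored deal, and that the mirrored pairs are independent of each other. The function name `duplicate_ci` and the sample numbers are made up for illustration and are not match data.

```python
from math import sqrt
from statistics import mean, stdev


def duplicate_ci(pair_results_bb, z=1.96):
    """Win rate and 95% half-width (both in bb/100) from mirrored pairs.

    pair_results_bb: list of (result_a, result_b) tuples, the humans'
    results in big blinds on the two sides of the same mirrored deal.
    """
    # Average within each pair so that card luck partly cancels out,
    # then treat each pair average as one independent observation.
    pair_means = [(a + b) / 2 for a, b in pair_results_bb]
    winrate_per_hand = mean(pair_means)
    standard_error = stdev(pair_means) / sqrt(len(pair_means))
    return winrate_per_hand * 100, z * standard_error * 100


# Made-up sample, far too small to mean anything statistically.
sample_pairs = [(10, -8), (-3, 4), (25, -20), (-1, 2)]
winrate_bb100, half_width_bb100 = duplicate_ci(sample_pairs)
print(f"{winrate_bb100:.1f} +/- {half_width_bb100:.1f} bb/100")
```

Computing the variance on pair averages rather than on single hands is what lets the duplicate format show up in the measured interval directly.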
05-08-2015 , 02:54 PM
Quote:
Originally Posted by NoamBrown
Hey all,

I'm one of the Claudico developers and I'm sitting here chatting with the pros and Sam now. I thought I'd clear up some confusion about the statistics. We calculated the 95% confidence interval based on the 80,000 mirrored hands that were played and it was +/- 10.35bb/100. The pros won by 9.16bb/100. That's a pretty strong lead, but the result is not statistically significant at 95%.

We discussed this with the pros before we made the announcement and we all were pretty satisfied with how things were phrased. The title says that the pros finished ahead in chips, but the subheader says it was not a statistically significant result.
I'm surprised that the players agreed to this wording? Maybe one of them can chime in on their reasoning? "We didn't gaf" will suffice
05-08-2015 , 02:56 PM
When I saw it was Polk and Donger, I was confident in the humans.

When I saw Bjorn had been added, I grew even more confident.

When I saw Jason Les had been added, I knew humans were in trouble.