Quote:
These are some pretty wild claims. How are you able to define what optimal play in chess is in order for you to say that "programs are nowhere near optimal play"?? Are you a chess grand master?
You don't need to be one. There are thousands of computer matches every day, and top programs regularly score asymmetrically from various starting positions (say, 1-0 with white and 1/2 with black). That alone is enough to make the claim, but there are other ways as well.
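The asymmetry argument can be made concrete. A fixed chess position has a single game-theoretic value, so under optimal play by both sides the result with colors swapped is forced: one program's score as white plus its score as black from the same position must sum to exactly 1. A minimal sketch (toy check, not a real engine test harness):

```python
# Toy illustration: under optimal play a chess position has one fixed value v,
# so program A scores v as white and 1 - v as black from that position.
# If A's two scores don't sum to 1, at least one side played sub-optimally.

def consistent_with_optimal_play(score_as_white, score_as_black):
    """Scores (1, 0.5 or 0) are from program A's point of view, playing the
    same starting position once with each color against the same opponent."""
    return score_as_white + score_as_black == 1.0

# The example from the post: 1-0 with white, but only a draw with black.
print(consistent_with_optimal_play(1.0, 0.5))  # False: these results rule out optimal play
print(consistent_with_optimal_play(1.0, 0.0))  # True: consistent with a won position for white
```

This is why lopsided color-swapped scores alone demonstrate that play is not optimal, with no grandmaster judgment required.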
You may want to visit the talkchess forum, which is a great resource with many very smart chess programmers discussing this stuff on a regular basis.
Quote:
is imperfect recall the academic way of saying claudico doesn't have a full strategy stored and calculates its strategy on the fly?
More or less, but it doesn't matter for the topic at hand. Poker is a perfect-recall game, and whatever techniques you use internally won't change that. Your result can still be assessed by general techniques.
Quote:
punter never really addressed this and instead asked what's the value in creating a strategy without being able to assess how far from equilibrium it is,
There is value, but then you can't claim anything about being close to equilibrium or about using equilibrium-approximating algorithms. If you make a strategy and say "guys, this is awesome, I think it wins vs regs online!" then claim that, not equilibrium approximation. You either measure it or you don't make claims about it.
It's tilting to me because from my perspective (and from fellow programmers better than me at this) it's just very basic stuff. I thought it was some kind of misunderstanding when I first saw the professor talking about it, but now I see it isn't. Good for poker, I guess, but it's really scary that with that kind of computing power you can get such a strong entity despite all this.
Quote:
for instance, your algorithm generates successive strategies such that strategy n beats all strategies 1 to n-1, then you're still getting closer to equilibrium with each successive iteration without ever knowing how far you are from it, and inevitably you will eventually arrive at equilibrium with enough iterations.
Yes, but the very fact that you are doing it means you can measure it, and measuring is crucial for tweaks, experiments, and comparing algorithms.
If you don't measure, you are betting that your wild guess about what's efficient will work. That's just the wrong way to go about it, even if you have thousands of cores. There are so many variations of those algorithms and so many ways to improve them. Without a measure you are walking blind.
Quote:
that doesn't mean there still isn't immense value in getting close to equilibrium, or at the very least demonstrably improving.
"Demonstrably" = you measure up or you shut up.
Quote:
we are from a complete understanding of the laws of physics, so what value is there in even trying to understand them in the first place?
It's a bad analogy. This is an optimization problem, and in optimization problems you measure your results. Say you are developing algorithms to approximate solutions to the Travelling Salesman Problem or whatever other difficult problem. The way you compare those algorithms is by measuring the quality of the solutions they produce.
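That's exactly how TSP heuristics get evaluated in practice. A minimal sketch: run a nearest-neighbour heuristic on a tiny instance, brute-force the true optimum, and report the quality ratio; that ratio is the measurement (instance and heuristic here are arbitrary toy choices):

```python
# Measuring heuristic quality on a tiny TSP instance: run the heuristic,
# then compare its tour length against the brute-force optimum.
from itertools import permutations
from math import dist

points = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2), (2, 1)]  # arbitrary small instance

def tour_length(order):
    return sum(dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def nearest_neighbour_tour(start=0):
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist(points[tour[-1]], points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

heuristic = tour_length(nearest_neighbour_tour())
optimum = min(tour_length([0] + list(p)) for p in permutations(range(1, len(points))))
print(f"quality ratio: {heuristic / optimum:.3f}")  # 1.0 means the heuristic found the optimum
```

Swap in a different heuristic and the same yardstick tells you which one is better. Nobody in the TSP world would accept "it does well" without that number.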
It's poker. It's a simple, well-defined, relatively small card game. Let's not get carried away with the laws of physics or whatnot.
Quote:
The best algorithms for doing this create abstract games that have imperfect recall. For example, we use no abstraction for preflop and can differentiate between K2o and K3o. But on a KQT flop, we might "bucket" K2o and K3o together, so that we can't differentiate between them and are forced to play identically with them. This seems pretty reasonable, as they're pretty similar hands on this board. But technically this now has imperfect recall, as we can't distinguish between hands we previously could.
How do you know those algorithms are the best if you don't measure them at what they are supposed to be doing?
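To be fair, the abstraction step itself is easy to illustrate. A toy sketch of the bucketing the quote describes (the bucketing rule below is hypothetical; real abstractions cluster hands by things like equity distributions): on a KQT flop, king-high hands that differ only in a weak kicker map to the same bucket, so the abstract game can no longer tell apart hands the preflop game could, which is where the imperfect recall comes from.

```python
# Toy flop bucketing (hypothetical rule, for illustration only): on a KQT
# board, a king-high hand is abstracted to (made top pair?, kicker group),
# collapsing weak kickers like 2 and 3 into one bucket.
RANKS = {r: v for v, r in enumerate("23456789TJQKA", start=2)}

def flop_bucket(hand, board):                 # hand like "K2", board like "KQT"
    high, kicker = hand[0], hand[1]
    top_pair = high in board
    kicker_group = RANKS[kicker] // 4         # crude grouping: 2-3, 4-7, 8-11, Q-A
    return (top_pair, kicker_group)

# Preflop we could distinguish K2o from K3o; in the abstract flop game we can't:
print(flop_bucket("K2", "KQT") == flop_bucket("K3", "KQT"))  # True: same bucket, forced to play identically
print(flop_bucket("K2", "KQT") == flop_bucket("K9", "KQT"))  # False: a stronger kicker stays distinguishable
```

None of which changes the objection: whether this particular way of merging hands is "best" is exactly the kind of claim that needs a measurement behind it.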
Quote:
Nash equilibria are theoretically not guaranteed to exist in games with imperfect recall (even though they exist for full NLHE which has perfect recall), and there are no theoretical guarantees on our equilibrium-finding algorithm for them. But it does well empirically.
How do you know it does well empirically if you don't measure how well it does at what it is supposed to be doing?