Several people pointed out a few particularly problematic hands for Claudico from the competition. In one of them, we apparently called a small river bet with 5 high. I didn't see this hand, but the reason we could do this was described in my prior post,
http://forumserver.twoplustwo.com/sh...postcount=1500. We use a randomized "action translation" algorithm that maps a bet probabilistically to one of the sizes in our abstraction. If the opponent bets a size smaller than the smallest size in our abstraction, we will map it down to a check/call with some probability, and our protocol will force us to check/call for the next action when this occurs.
Here is the example I gave in my prior post for how this approach works. You can find more details and a comparison to other approaches in the paper,
http://www.cs.cmu.edu/~sganzfri/Translation_IJCAI13.pdf.
Basically, it assumes that the opponent will call a bet of size x with probability 1/(1+x), which comes from pot odds. The main mapping we use is
f(x) = [(B-x)(1+A)]/[(B-A)(1+x)],
where A and B are the closest sizes in our abstraction below and above the opponent's bet, and x is the size the opponent actually used (all expressed as fractions of the pot).
For the flop minbet example, he bets 100 into a pot of 500, so x = 0.2. The closest actions we have are A = 0 (for check) and B = 0.25. Plugging these into the formula gives f(x) = 1/6 ≈ 0.167. This is the probability that we map his bet down to 0 and interpret it as a check. So we pick a random number in [0,1]; if it's above 1/6 we interpret the bet as 0.25 pot, and otherwise as a check.
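In case anyone wants to play with the numbers, here is a minimal Python sketch of this randomized mapping (the function names are mine, not from our actual codebase; the formula and the 1/6 figure are exactly the ones above):

```python
import random

def pseudo_harmonic(x, A, B):
    """Probability of mapping an observed bet of x pot down to the smaller
    abstraction size A rather than up to B, where A <= x <= B."""
    return ((B - x) * (1 + A)) / ((B - A) * (1 + x))

def translate_bet(x, A, B, rng=random.random):
    """Randomized action translation: interpret the bet as A with
    probability f(x), otherwise as B."""
    return A if rng() < pseudo_harmonic(x, A, B) else B

# Flop minbet example from above: 100 into a pot of 500, so x = 0.2, with
# closest abstraction sizes A = 0 (check) and B = 0.25 pot.
print(pseudo_harmonic(0.2, 0.0, 0.25))   # ~0.1667, i.e. read as a check 1/6 of the time
print(translate_bet(0.2, 0.0, 0.25))     # 0.0 or 0.25, depending on the random draw
```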
People have also brought up several other problematic hands -- most notably a hand where I think we had A4o and folded preflop after putting in over half our stack against a human's 99. In a different hand we had KT vs. A2 and folded to a shove on the turn after putting in about 3/4 of our stack, despite having a pair and a flush draw (maybe even top pair, I forget the specifics). I looked at the logs in detail for both hands during the competition, and in both cases the problem was due to the translation issue described above. For the A4 hand, we had mapped the opponent's 3-bet or 4-bet down to a smaller size, which caused us to look up a strategy for ourselves that had been computed under the assumption that the pot was much smaller than it actually was (I think we thought it was 7k when it was actually 10.5k). These translation issues can get magnified further as the hand develops if we think we have bet some fraction (say 2/3) of the correct pot size while the strategies we precomputed assumed a different pot size.
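To make the compounding concrete, here is a small illustration. The 7k vs. 10.5k figures are the ones I recall from the A4 hand; the 2/3-pot bet and the rest of the numbers are made up for the example:

```python
# Illustration only: after the opponent's raise gets mapped down, our
# precomputed strategies believe a smaller pot than is actually on the table.
perceived_pot, actual_pot = 7_000, 10_500

# Suppose the next action puts in a bet of 2/3 of the actual pot:
bet = (2 / 3) * actual_pot                  # 7,000 chips

# In the real game that is a 2/3-pot bet, but inside the precomputed tree
# (which believes the pot is 7,000) the same chips look like a full-pot bet,
# so we keep consulting the "wrong" node on later streets.
print(bet / actual_pot)                     # ~0.67
print(bet / perceived_pot)                  # 1.0
print(actual_pot + 2 * bet, perceived_pot + 2 * bet)  # 24,500 vs. 21,000 if called
```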
For the KT vs. A2 hand, the issue was similar. Here, the human 3-bet on the flop for slightly less than the smallest size we had in our abstraction in that situation, and we ended up mapping it down to just a call (I believe the probability of mapping it down was only about 3% in that situation, so we got pretty "unlucky" that it went in the "wrong" direction). This caused us to think we had committed far fewer chips to the pot at that point than we actually had.
I went over the log files for these two specific hands with Doug in person after the competition ended (I think some of the other humans were present too), and he actually agreed that our lines in both the A4 and KT hands were reasonable had the pot size been what our computed strategies perceived it to be at that point. Of course, we both agree that both hands were major mistakes once you include the misperception of the pot size. Even though these were low-probability mistakes caused by the randomization used in the translation mapping, these types of mistakes can really add up over time, particularly against humans who are aware of them and actively trying to exploit them. Doug alluded to this point as well in his interview,
http://www.highstakesdb.com/5793-exc...ider-polk.aspx.
We actually became aware of this problem (partly based on those particular hands) midway through the competition, and decided to switch the translation mapping to a deterministic one for some of the sessions (the "deterministic pseudo-harmonic" mapping described in the paper linked above). This one is likely far more exploitable than the randomized one, but we did not think the humans were focusing much energy on exploiting leaks in how we responded to bet sizes. It turns out I was correct: in subsequent discussions the humans said they weren't putting much effort into this, other than the frequent "minbets" that they thought might be problematic for us by pushing us off our tree somehow.
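Roughly, the deterministic version maps the observed size down whenever it is below the point where the randomized mapping would be exactly 50/50. A quick sketch (simplified; see the paper for the precise definition):

```python
def deterministic_pseudo_harmonic(x, A, B):
    """Simplified sketch: map the observed size x down to A whenever x is at
    or below the point x* where the randomized mapping is 50/50, else to B."""
    x_star = (A + B + 2 * A * B) / (A + B + 2)   # solves f(x*) = 1/2
    return A if x <= x_star else B

# Same flop spot as before: a 0.2-pot bet with abstraction sizes 0 and 0.25.
print(deterministic_pseudo_harmonic(0.2, 0.0, 0.25))   # 0.25 -- always mapped up now
```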
Obviously this is a big leak in our agent that would need to be improved in the future. Based on Doug's interview, it seems he views this as our biggest current leak as well, and it will be very interesting to see what improvements can be found, and whether those can in turn be exploited by good countermeasures.
One benefit of the "endgame solver," which computed the river strategy in real time, is that it solved the "off-tree" problem I just described. I give an example of how it does this in Section 3.4 of the paper,
http://www.cs.cmu.edu/~sganzfri/Endgame_AAMAS15.pdf. When the river is dealt, we replace our perceived pot size with the correct amount before computing our strategies, so any disparity that developed earlier in the hand is corrected. This is one of the key benefits of the endgame solving approach, and I expect real-time computation to play a pivotal role in future research in this area. Other benefits and full details of the algorithm are described in the paper.
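As a very rough sketch of just this correction step (the solver itself is stubbed out, and the names are illustrative rather than our actual code):

```python
def solve_river_endgame(pot, stacks, our_range, opp_range):
    # Stub standing in for the real-time river solver; the solver itself
    # is not the point of this sketch.
    return {"pot_used_by_solver": pot}

def on_river(chips_we_put_in, chips_opp_put_in, stacks, our_range, opp_range):
    # Off-tree correction, simplified: ignore whatever pot size our translated
    # strategies believed in earlier, and rebuild the pot from the chips that
    # were actually committed before solving the river.
    actual_pot = chips_we_put_in + chips_opp_put_in
    return solve_river_endgame(actual_pot, stacks, our_range, opp_range)

# E.g., a situation where we believed the pot was 7,000 but 10,500 had
# actually gone in: the solver works with 10,500.
print(on_river(5_250, 5_250, (14_750, 14_750), None, None))
```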
Some people commented on the running time of the algorithm (it averaged 15-20 seconds per river hand for most of the competition). The algorithm did not necessarily require that much time. We could have modified it to take less time, at the expense of the degree of card bucketing and/or number of bet sizes used. I realize it was somewhat frustrating to the players and the spectators to have to wait so long for each river. If a future competition occurs, perhaps a time limit would be mutually agreed upon in advance.
One limitation of the endgame solving algorithm (one I already knew was problematic, but that became especially apparent early in the competition) is that it does not fully account for card removal when deciding which hands to group together. It looks at the equity of each hand vs. the opponent's perceived range (assuming he had followed our strategies to that point as well), and buckets hands with similar values together. This may result in, for example, grouping the nut low with a flush blocker into the same bucket as the nut low without a flush blocker (since both would have equity 0), so the algorithm cannot distinguish between the two. This is why the agent sometimes made huge overbets without blockers, which some people pointed out was a mistake. That said, I'll clarify that the algorithm does take blockers/card removal into account to an extent -- if two hands have the same ranking, but one has higher equity vs. the opponent's range due to card removal than the other, then the algorithm would account for this and potentially place them into different buckets. The humans seemed divided early on as to whether the algorithm was taking card removal into account, and the answer is that it was, but not as fully as possible.
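Here is a toy sketch of the kind of equity-based bucketing I'm describing (the bucketing scheme and the hand labels are made up, and the real abstraction is more sophisticated, but the failure mode is the same):

```python
from collections import defaultdict

def bucket_by_equity(hand_equities, num_buckets):
    """Group hands by their equity vs. the opponent's perceived range.
    Simple equal-width bucketing: hands with (near-)identical equity
    always end up in the same bucket."""
    buckets = defaultdict(list)
    for hand, eq in hand_equities.items():
        idx = min(int(eq * num_buckets), num_buckets - 1)
        buckets[idx].append(hand)
    return buckets

# Hypothetical river spot:
equities = {
    "nut low, no blocker": 0.00,
    "nut low, flush blocker": 0.00,   # blocker changes nothing here: still 0 equity
    "top pair, no blockers": 0.55,
    "top pair, blocks villain's value": 0.62,   # card removal shifts the equity itself,
}                                               # so this one can land in a different bucket
print(bucket_by_equity(equities, 10))
# The two nut-low hands share bucket 0 (indistinguishable to the solver);
# the two top pairs split into buckets 5 and 6.
```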
Primarily because of this card-removal limitation, we decided to take the large bet sizes for ourselves out of the endgame solver partway through the competition. Interestingly, Dong told me that they had looked into it and we were actually making money with those big sizes during the time we used them. I think everyone agrees that huge overbets are likely part of full optimal strategies, and are likely underutilized even by the best human players. But card removal is particularly important for these sizes, and I think that for an agent to use them successfully an improved algorithm for dealing with blockers/card removal would need to be developed, though I'm still quite curious how well we would've held up if we had continued with the agent as it was, with those sizes in.
A few posters, such as punter11235, claimed that there is better software available on the market for solving endgames given ranges for both players (perhaps software that fully accounts for card removal). I once looked into this and my understanding was that the best tool assumed just one bet size for all situations. While this may work very well for post-mortem analysis of human poker play, it's pretty clear that an agent that assumed just one bet size was available for the opponent would get creamed playing against humans of this caliber. The humans were certainly willing to make very small bets or huge overbets, particularly if they thought our algorithm had a weakness in responding to those. So we opted to use many different bet sizes to protect ourselves from bet-size exploitation (the version we used at the beginning had 8 different sizes for the first river bet, plus fold/check), at the expense of having to use some card abstraction and not fully account for card removal. Some of the humans informed me that there's software available now that uses 2-3 sizes and possibly doesn't use any card abstraction. I still think that using 8 sizes with card abstraction is much better against top humans than just 2-3 without it, though it would be interesting to run comprehensive experiments to test this. So I'm not convinced that there actually exists other software out there that is better for this than the approach we used, though I have not done a thorough investigation, and would be happy to hear from people familiar with the state-of-the-art tools.
I believe punter11235 or someone else also claimed that there exist stronger bots than Claudico. I'm pretty skeptical of this. If the developers of one of these software tools really have a stronger bot, then I don't understand why they wouldn't have submitted it to the computer poker competition. Winning that would certainly boost visibility for their product. I would understand why someone who actually has a bot playing illegally for profit might want to keep a lower profile, and I admit this is a possibility, though I still think it is very unlikely that any bots exist that are better than Claudico.
I think that with the right team working full-time on it, an agent could be created that beats the top humans in one year. But accounting for the limitations imposed within academia, I would guess that a timeline of 3-4 years is more realistic.
Some people have claimed that the fact that the developers modified the agent during the competition constitutes cheating. I completely disagree. Nothing in the rules we agreed upon specified that this was not allowed. The humans were well aware that we were making modifications; they frequently commented on this throughout the competition, and you can also see Doug discuss it in his interview. They caught on to most of the changes surprisingly quickly. I have described some of the major modifications above in this post (these have already been brought up in this thread and in Doug's interviews, and the approaches are described in the publicly-available papers linked above). The humans made significant modifications to their strategies throughout the competition as well. From what I hear, they went over databases and discussed strategy for hours every night, and the players in the same room often consulted each other on key decisions. I personally think it was exciting to include these elements in the competition. But in any case, I think a future event should clearly specify what modifications are allowed on both sides.
Regarding statistical significance, Noam Brown summed up everything extremely well in his post, and I agree with much of what he said,
http://forumserver.twoplustwo.com/sh...postcount=1501. If I had been the one writing the press release, I would have made the title: "Humans beat computers by an amount that is statistically significant at the 90% level but not at the 95% level!" I think it is misleading to flat-out call it a "tie" or even a "statistical tie," but many of the articles gave a pretty fair characterization of the results, in my opinion.
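For anyone who wants to see what that distinction means mechanically, here is a generic sketch of a one-sided z-test on the per-hand win rate. The numbers plugged in at the bottom are hypothetical (the per-hand standard deviation in particular is made up), chosen only so the result lands between the 90% and 95% thresholds; they are not the official figures from the match:

```python
from math import sqrt, erf

def win_rate_significance(total_win, hands, per_hand_std):
    """One-sided z-test on the mean per-hand win: a standard approach,
    not necessarily the exact analysis used for the competition."""
    mean = total_win / hands
    z = mean / (per_hand_std / sqrt(hands))
    p = 1 - 0.5 * (1 + erf(z / sqrt(2)))     # one-sided p-value
    return z, p

# Hypothetical figures only:
z, p = win_rate_significance(total_win=720_000, hands=80_000, per_hand_std=1_700)
print(round(z, 2), round(p, 3))   # ~1.5, ~0.067: below 0.10 but above 0.05
```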
I have recently completed my PhD and am not currently employed by Carnegie Mellon.