Heads Up Hold'em Solved? - Page 8 - Poker News

Imagine for a second that you're a complete and perfect poker machine. If a person makes a single mistake, you know how to exploit it and instantly do. Now, for some crazy reason, I decide to tell you exactly how I'm going to play every single hand and situation I could ever end up in in online poker. You give me an exhaustive list of places where you'd exploit my play. I correct those leaks, potentially introducing more in doing so, and we repeat the process. This continues until we finally reach a point where I tell you exactly how I'm going to play and ask you how you'd exploit me - try as you might, you can no longer find a single exploitable spot in my strategy. This strategy would be considered GTO.

snip

Thanks for that. People who really understand something can explain it simply and you did.

It seems to beg an obvious question: Where is the perfect poker machine against which people are developing the strategy?

thanks to others who posted the links.. I'm reading the Hawrilenko interview that I passed on when that other thread was running. And I see the math guys are back so I'm back to observing... have fun.

Last edited by joeschmoe; 10-14-2013 at 04:56 PM. Reason: and incorrectly using "beg the question" here is the only troll I've done in months

Quote

10-14-2013 , 05:12 PM

#178

franxic

veteran

Join Date: Apr 2010 Posts: 2,059

Quote:

Originally Posted by MouldyOnions

I hate to be "that guy", but the chart you are referring to would do pretty badly in the real world. Even on small stack sizes playing shove or fold is not the optimal line to take. So it isn't the best of examples.

Also this is a little nitpicky, but I will mention it anyway. Even if the game you played specifically only allowed you to play shove or fold, that chart still isn't perfect GTO. This is because depending on your stack sizes and whether you are the SB or BB, your expected value on the hand changes. This means a perfect GTO strategy would need to take future hands into account, and that chart does not.

First, no problem with you being "that guy", you are obviously shmart and edumacated and I respect your opinion.

I kind of agree with your first paragraph (depending on how you define "pretty badly") but don't really get the second (or get it and disagree). Can you elaborate?
I am not sure if/how your point about future hands makes sense, probably have to think that through, but wouldn't that contradict the definition of gto?

Also, if you look at heads up as two separate games (one with yourself in the sb and one with yourself in the bb), what do you think about your ev in the sb where you actually can make it a push/fold game and the bb can only react? Wouldn't using the chart be preferable to any other strategy against a significantly stronger player?

Last edited by franxic; 10-14-2013 at 05:22 PM.

Quote

10-14-2013 , 05:53 PM

#179

punter11235

Carpal \'Tunnel

Join Date: Mar 2005 Posts: 8,210

Quote:

It seems to beg an obvious question: Where is the perfect poker machine against which people are developing the strategy?

It's relatively easy to compute the maximally exploitative strategy vs a given strategy. The idea of algorithms like CFR is to train vs such a perfect opponent.
It should be intuitive: if you tell me how you play exactly with every hand in every spot you would expect me to come up with smart way to play vs you pretty soon. Computers can do that faster and more accurately.
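
To make the idea concrete, here is a minimal sketch on rock-paper-scissors rather than poker. It assumes a symmetric zero-sum matrix game; the function names (`action_values`, `exploitability`, `self_play`) are made up for illustration. It shows the two ingredients mentioned above: computing what a perfect exploiter wins against a fixed strategy, and regret matching (the per-decision update rule that CFR builds on) played against itself, whose average strategy drifts toward the unexploitable one.

```python
# Payoff for the player choosing the row action against the column action.
# Rock-paper-scissors is symmetric, so the same table serves both players.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def action_values(opponent):
    """Expected payoff of each pure action against a fixed mixed strategy."""
    return [sum(PAYOFF[a][b] * opponent[b] for b in range(3)) for a in range(3)]


def exploitability(strategy):
    """Amount per game a best-responding opponent wins against `strategy`."""
    return max(action_values(strategy))


def current_strategy(regrets):
    """Regret matching: play actions in proportion to their positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations):
    """Run regret matching for both players; return their average strategies."""
    # Arbitrary biased starting regrets (an assumption) so play is not trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(3):
                # Regret: how much better action a would have done than we did.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    print("always rock loses per game:", exploitability([1.0, 0.0, 0.0]))
    average = self_play(10000)
    print("average strategy:", [round(p, 3) for p in average[0]])
    print("its exploitability:", round(exploitability(average[0]), 4))
```

Real poker solvers apply the same kind of update at every decision point of a far larger game tree, so this is only a toy illustration of the principle.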

Quote

10-14-2013 , 06:21 PM

#180

joeschmoe

veteran

Join Date: May 2012 Posts: 3,130

so... they probably have computers hammering away at each other 24/7.
On a scale of 1 to 10, ten being HU game solved, where does it now stand?

Quote

10-14-2013 , 06:37 PM

#181

gregorio

Carpal \'Tunnel

Join Date: Jan 2007 Posts: 39,313

That scale doesn't make sense. If you mean HULHE, then they are pretty close

Quote:

Originally Posted by FullyCompletely

Heads-up limit isn't solved yet. A couple of years ago, I posted in the "Math is apparently not important" thread (link) to discuss a technique we had for measuring how close a strategy was to a Nash equilibrium by computing how much it can lose against a perfect adversary. A Nash equilibrium would lose $0 / game, and the lower your exploitability is, the closer you are to Nash.

In that 2011 thread, I mentioned that the strategies we had for the 2008 Man-vs-Machine match were beatable for 235 milli-big-blinds / game, or 11.75 BB/100 (just divide by 20 to convert). The best strategies we had in 2011 were quite a bit better, being exploitable for 104.41 mbb/g, or about 5 BB/100.

We've made a lot of progress since then. Our most recent game solving algorithm, called CFR-BR (link) lets us use abstraction techniques but get as close as possible to a Nash equilibrium within an abstraction. In that paper we had strategies as low as 41.199 mbb/g (2 BB/100), and we've made more progress since then.
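
The "divide by 20" conversion in the quoted post can be spelled out as a small sketch. It assumes, as the post's numbers imply, that BB here means a limit big bet worth two big blinds; the function name is made up for illustration.

```python
def mbb_per_game_to_bb_per_100(mbb_per_game):
    """Convert milli-big-blinds per game to big bets per 100 games.

    mbb -> big blinds: divide by 1000
    per game -> per 100 games: multiply by 100
    big blinds -> big bets: divide by 2 (assumes 1 big bet = 2 big blinds)
    Net effect: divide by 20.
    """
    return mbb_per_game * 100 / (1000 * 2)


if __name__ == "__main__":
    # Exploitability figures quoted in the post, in mbb/g.
    for label, mbb in [("2008", 235), ("2011", 104.41), ("CFR-BR", 41.199)]:
        print(f"{label}: {mbb} mbb/g = {mbb_per_game_to_bb_per_100(mbb):.2f} BB/100")
```

This reproduces the figures in the post: 11.75, about 5, and about 2 BB/100.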

If you mean HUNLHE, then they are not very close

Quote:

Originally Posted by FullyCompletely

One other thing before I start on that, because you also mentioned 100 blind no-limit. Heads-up limit Texas hold'em has 3.17*10^17 game states and 3.19*10^14 decision points, but after you remove card isomorphisms there are only 1.38*10^13 "canonical" decision points that you need for a perfect abstraction. In fact you could probably squeeze that a little bit smaller since some of those situations have the nuts on the board and should be strategically identical, but it's probably still pretty close to as small as you can get without losing any information. Heads-up limit is really interesting for science right now because it's just small enough that we can measure the exploitability of a strategy by computing the value of a best response, but still way too big to solve exactly with algorithms like CFR. That's a sweet spot for science, because you can actually measure progress without the problem being trivial. The last couple of years since we got our fast best response technique in 2011 have seen some really exciting progress on abstraction and game solving techniques, not only by us but also from our colleagues at CMU and competitors in the ACPC like Eric Jackson.

Heads-up no-limit with $1/$2 blinds and 100-big-blind ($200) stacks, where you can only make whole dollar bets, has 8.72*10^38 game states and 8.81*10^35 decision points, and a lossless abstraction only gets you down to 3.81*10^34. That's enormous. Even measuring how close you are to a Nash equilibrium in this no-limit game would take, at best, 10^20 times longer than in heads-up limit using current best response techniques. That's actually very optimistic; the way that we measure exploitability in heads-up limit is efficient because of a fast terminal node evaluation trick for comparing two players' ranges at the same time, and the cards are a much smaller part of no-limit than the betting is. I'm much more interested in no-limit than in limit these days, but you just can't run many experiments in no-limit that are both meaningful and interesting, at least compared to limit.

This no-limit game is so big that if you used just one byte to represent the probability of each action in the preflop alone (assuming you can recompute subgame strategies as needed, like you claimed), it'd take 5.5 petabytes of disk just to write down the first round of a strategy. It'd take nearly 6,000 yottabytes if you wanted to store the flop as well. One yottabyte is if you took a 64GB MicroSDXC card and made a pile as large as the Great Pyramid of Giza (credit to wikipedia for the example). Game sizes are computed from the tool in one of my tech reports (link).
...
So, to sum up: I think you might be a bit optimistic on the no-limit thing.

Quote

10-14-2013 , 06:52 PM

#182

joeschmoe

veteran

Join Date: May 2012 Posts: 3,130

glad i'm not the only person who admits to using wiki as an occasional reference...

I get it.. and admit to grunching a few TLDR posts in this thread.

another dumb question: When a GTO strategy is someday formulated (for limit), will a human be able to play it or will only machines be capable of following it?

Quote

10-14-2013 , 07:08 PM

#183

samooth

veteran

Join Date: May 2009 Posts: 3,350

please stop

Quote

10-14-2013 , 07:37 PM

#184

joeschmoe

veteran

Join Date: May 2012 Posts: 3,130

Quote:

Originally Posted by samooth

please stop

The question upset you... I'll take that as a "No, humans cannot do it."

Quote

10-14-2013 , 08:01 PM

#185

samooth

veteran

Join Date: May 2009 Posts: 3,350

it's not that i would be upset if you asked one dumb question, dumb questions are fine. but you are derailing this thread when it should be more than obvious to you by now that you are a beginner in game theory and should study some instead of asking your entire way through it. it also seems like you haven't read the thread, or haven't tried hard enough to understand its content.

read all the posts regarding what a gto strat in hu limit looks like, you will read stuff about the size of the game, how many decision points it has, what a mixed strat is etc. then ask yourself whether a human can apply this strat correctly and fully, and i have full faith in you that you'll end up with the right answer by independently thinking about it.

Quote

10-14-2013 , 08:21 PM

#186

joeschmoe

veteran

Join Date: May 2012 Posts: 3,130

I don't know who in this thread knows what they are talking about and who doesn't.
It's not like I haven't done any research.. been reading about Nash.. struggling through a PDF about counterfactual regret minimization..

but like I said earlier, humans tend to ignore extraneous pathways when solving a problem. And to get right to the heart of the matter, I want to know if this thing holds some promise for me in a real game, or if it must be boiled down to a few imprecise rules of thumb to have any value at all.

Quote

10-14-2013 , 09:21 PM

#187

Adreno

centurion

Join Date: Nov 2007 Posts: 147

Quote:

Originally Posted by Rei Ayanami

Many of the recurring problems in your analysis stem from the fact that you're unaware of (or at least not acknowledging) the possibility of mixed strategies. Here, for example, the bots can partition their range into multiple sizings and apportion some mix of {nuts, good hands, draws, weak hands, air, etc.} among all, thus avoiding the problems you bring up.

Read the "optimal play" section in The Mathematics of Poker; it should clear up many of the things you don't understand, or at least give you a fundamental grounding from which to make better guesses.

I appreciate the book tip, gonna look it up.

You're suggesting that the bots have a balanced range for each betsize. 2 things:

1. They don't. At least based on the 100 hand sample provided here, they appear to minbet mainly weak hands and potbet mainly nuts.
2. Let's assume for the sake of argument that their 1/4pot bet on a drawy flop has a balanced range. How is that the optimal betsizing? I already pointed out all the things wrong with this sizing so I'm not gonna repeat myself. Your reply covered another problem I presented, but not this one.

Whether GTO bots will eventually crush strong players for >20BB/100 is a matter of opinion, the state of the current NL bots is not. They're horrible. They are so bad I would be happy to play them for real money and I'm just a lowly bumhunter. If you think the bots are better then you can put your money on the bots and I'll play them. Should be easy money for you, don't even need to put the hours in.

Last edited by Adreno; 10-14-2013 at 09:25 PM. Reason: 20BB/100 added

Quote

10-14-2013 , 10:31 PM

#188

Rei Ayanami

Carpal \'Tunnel

Join Date: Aug 2012 Posts: 11,685

You're missing the point. The bots are most likely far from optimal, and I wouldn't find it hard to believe that they're horrible by real-world standards.

The point is your analysis demonstrates a fundamental lack of understanding of the means by which a ridiculous-looking strategy can be good. You also misapplied several GT concepts.

Your overall argument (that the bots aren't good) is reasonable, though.

Quote:

Originally Posted by Adreno

You're suggesting that the bots have a balanced range for each betsize.

That isn't what I suggested. I said it's possible for a player (whether in human, bot, or dog form) to avoid the problems you mentioned by having balanced ranges for each sizing (i.e., they aren't constrained to the flawed either-or proposition you provided). As evidenced by your repeated use of the word 'if', we were dealing with hypothetical strategies at that point--not these two specific bots' actual strategies.

Quote:

Originally Posted by Adreno

... 100 hand sample ...

Hmm.

Quote:

Originally Posted by Adreno

2. Let's assume for the sake of argument that their 1/4pot bet on a drawy flop has a balanced range. How is that the optimal betsizing? I already pointed out all the things wrong with this sizing so I'm not gonna repeat myself. Your reply covered another problem I presented, but not this one.

It's probably not. But the relevant question is "How can it (or something similar) be the optimal sizing?" Or more precisely, "How can it (or something similar) be one of the sizings you'd use in this spot as part of an optimal strategy?"

You'll probably be able to answer that question, or at least make educated guesses, after reading MoP.

Quote

10-14-2013 , 11:53 PM

#189

Adreno

centurion

Join Date: Nov 2007 Posts: 147

Quote:

I said it's possible for a player (whether in human, bot, or dog form) to avoid the problems you mentioned by having balanced ranges for each sizing (i.e., they aren't constrained to the flawed either-or proposition you provided).

Yes, you are correct and I was wrong on that point.

On another point, I find it really hard to believe small bets (like minbets) from position on the flop and turn could ever be part of an optimal solution. We give our opponent the odds to call with almost anything on almost any type of board. We almost never see a fold. If we are called we get 1BB more into the pot. If villain decides to CR, we have to fold like 40-60% of the time, when we could have just checked behind. The only time we gain something is when we have a really strong hand and we induce action that we wouldn't be able to get with a normal sized bet or a check. There's just no way that's enough to offset the huge # of times we
- put 1BB more into the pot as an underdog
- put 1BB more into the pot and then fold
- put 1BB more into the pot when we have a strong hand and our opponent would have called/raised a larger bet
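
The "odds to call with almost anything" part of this argument is plain pot-odds arithmetic, sketched below. The 6BB pot is an assumed figure for illustration only, and the calculation ignores implied odds and later streets, so it says nothing by itself about whether small bets belong in an optimal strategy.

```python
from fractions import Fraction


def required_equity(pot, bet):
    """Break-even equity for calling `bet` into a pot of size `pot`.

    The caller risks `bet` to win `pot + bet`, so the final pot is
    `pot + 2 * bet` and the call breaks even at bet / (pot + 2 * bet).
    """
    pot, bet = Fraction(pot), Fraction(bet)
    return bet / (pot + 2 * bet)


if __name__ == "__main__":
    # Hypothetical 6BB pot facing a 1BB minbet, then a pot-sized bet.
    print("minbet 1BB into 6BB:", required_equity(6, 1))      # 1/8
    print("pot-sized bet:", required_equity(6, 6))            # 1/3
    print("quarter-pot bet:", required_equity(1, Fraction(1, 4)))  # 1/6
```

Against the assumed minbet the caller needs only 12.5% equity, compared with 33.3% against a pot-sized bet.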

Quote

10-15-2013 , 02:17 AM

#190

Signif, I

centurion

Join Date: Apr 2012 Posts: 191

@FullyCompletely

Tyvm for your elucidating replies in this thread. I've learned a lot from what you've had to say.

I was wondering if you might comment on the software "Snowie" which seems to be getting a lot of attention lately. It's my impression that most of this attention stems more from the way the software has been marketed than from its true merit. You're certainly more qualified to judge it however. What concrete achievements has the software made? What would your response be to their following claim? -

"While for the specific variant of fixed limit heads-up the computer is known to be just as good as the best professional Poker players, nobody could create strong artificial intelligence-based software for the most popular (and most complex) variant: no-limit full ring game. This goal has now been achieved with the release of PokerSnowie."

Quote

10-15-2013 , 03:34 AM

#191

franxic

veteran

Join Date: Apr 2010 Posts: 2,059

Quote:

Originally Posted by franxic

(..)

what do you think about your ev in the sb where you actually can make it a push/fold game and the bb can only react? Wouldn't using the chart be preferable to any other strategy against a significantly stronger player?

hmm that's a brainfart, forget about it...

Quote

10-15-2013 , 03:59 AM

#192

PoP T1me

centurion

Join Date: Dec 2006 Posts: 107

Quote:

Originally Posted by gregorio

If you mean HUNLHE, then they are not very close

see:
http://forumserver.twoplustwo.com/sh...&postcount=120

by all practical standards, I would say that what you have inferred is incorrect. Maybe not in terms of 'solved' but being able to beat any human over a meaningful sample is close enough imo.

Quote

10-15-2013 , 04:39 AM

#193

ike

Pooh-Bah

Join Date: Jan 2004 Posts: 5,635

Quote:

Originally Posted by MouldyOnions

It's a matter of how you define the game.

You're definitely right that the push/fold chart gets crushed by a strong player that is not restricted to push/fold (human or computer), down to fairly short stack sizes. That chart is a very close approximation of the solution to the toy game in which you are only allowed to shove or fold. When franxic says you can't lose if you play according to the chart, he means you can't lose to an opponent who is also restricted to push/fold.

If you were analyzing something like a HUSNG, then yeah, a really thorough solution would need to account for the value of future hands as a function of stack size and position. In that case, "the game," is the entire tournament.

However, academics who have worked on computer poker treat a single hand as "the game." This doesn't mean the solutions they produce are imperfect, it means they're solving a slightly easier problem.

Practically speaking, I can't imagine this matters in something like a deep stacked cash game with fluctuating stack sizes. In a fast structure tournament format, I could see it being a significant factor.

Quote

10-15-2013 , 04:49 AM

#194

franxic

veteran

Join Date: Apr 2010 Posts: 2,059

Quote:

Originally Posted by ike

If you were analyzing something like a HUSNG, then yeah, a really thorough solution would need to account for the value of future hands as a function of stack size and position. In that case, "the game," is the entire tournament.

not sure if i get that right, the concept is avoiding small +ev spots in order not to miss future bigger +ev spots?

Quote

10-15-2013 , 05:07 AM

#195

ike

Pooh-Bah

Join Date: Jan 2004 Posts: 5,635

Quote:

Originally Posted by franxic

not sure if i get that right, the concept is avoiding small +ev spots in order not to miss future bigger +ev spots?

Or taking small -EV spots now to avoid larger -EV spots in the future.

Suppose you're playing a push/fold HUSNG with a total of 20bb in play. Further suppose that playing the button with 9bb effective stacks is a sufficiently losing proposition that when you start the hand with 9bb and the button you're actually only 44% to win the tournament (rather than the 45% of the chips in play that you have).

If you're in the big blind with equal stacks, you should call a jam with 44% equity or better, rather than the 45% you would expect from looking at things in terms of maximizing chip-EV on a hand by hand basis.
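
The example can be written out as a short sketch. It assumes 10bb stacks with the big blind having posted 1bb (so folding leaves 9bb of the 20bb in play), and it takes the 44% figure as given from the supposition above; the function names are made up for illustration.

```python
def chip_ev_fold_value(stack_after_fold, total_chips):
    """Value of folding if tournament equity equalled share of chips."""
    return stack_after_fold / total_chips


def should_call(equity, fold_value):
    """With equal stacks, calling wins the tournament with probability
    `equity` and busts otherwise, so the value of calling is `equity`.
    Call when that is at least the value of folding."""
    return equity >= fold_value


if __name__ == "__main__":
    chip_view = chip_ev_fold_value(9, 20)   # 0.45 share of chips
    tournament_view = 0.44                  # supposed true equity after folding

    hand_equity = 0.445  # a hand that sits between the two thresholds
    print("call on chip-EV view:", should_call(hand_equity, chip_view))
    print("call on tournament view:", should_call(hand_equity, tournament_view))
```

A hand with 44.5% equity is a fold when each hand is judged by chip-EV alone but a call once the value of the next hand's position is accounted for.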

Quote

10-15-2013 , 06:46 AM

#196

OptedOUT

stranger

Join Date: Oct 2013 Posts: 10

the advance of AI is unstoppable & insuperable.

Quote

10-15-2013 , 09:14 AM

#197

HeadsUpLoser

adept

Join Date: Mar 2009 Posts: 1,096

thx god poker is played between humans. Sad fact: bots, if done right, won't be detectable so poker pros might need to move back to the casinos in the future. Good fact: There is a high chance micro/low stakes players won't be affected as much.

@FullyCompletely: How much are u guys banking at online limit games

Last edited by HeadsUpLoser; 10-15-2013 at 09:19 AM.

Quote

10-15-2013 , 09:42 AM

#198

skario

enthusiast

Join Date: Jan 2007 Posts: 87

Read http://www.computerpokercompetition....owall=&start=2 before analyzing the hand histories. The bots are not trying to play "GTO", they are trying to win the competition.

Quote

10-15-2013 , 09:48 AM

#199

jcsmitty22

grinder

Join Date: Nov 2005 Posts: 567

Two great things happened in this thread

1) i learned a crapload of stuff from a very smart phd student who could probably be a good poker player if he tried, thanks a lot FullyCompletely!

2) I blocked joeschmoe

Quote

10-15-2013 , 10:12 AM

#200

OptedOUT

stranger

Join Date: Oct 2013 Posts: 10

Quote:

Originally Posted by skario

Read http://www.computerpokercompetition....owall=&start=2 before analyzing the hand histories. The bots are not trying to play "GTO", they are trying to win the competition.

What's clear is this: players are not equal. AI bots may not yet be ready to take on such mature and accomplished pros as kagome, ragen70 or sallywoo, but even giving knight odds they would beat to a pulp, without any further AI development, such overrated players as that Swedish kid with mashed banana puree in lieu of cerebral matter.

Last edited by OptedOUT; 10-15-2013 at 10:25 AM.

Quote

Page 8 of 12
