hand vs. cepheus. let's try to understand this one

12-24-2017 , 12:25 PM
Quote:
Originally Posted by brick
Interesting, where can I find the paper?
Scroll down to read the original
http://science.sciencemag.org/conten...ref&siteid=sci

And here’s the review of its performance
http://poker.cs.ualberta.ca/publicat...ai-cfrplus.pdf

Last edited by UpHillBothWays; 12-24-2017 at 12:34 PM.
12-24-2017 , 12:26 PM
Quote:
Originally Posted by UpHillBothWays
Playing on my cell so no hh

I’m bu/sb w 22 no diamond

I raise, C 3b, I call (I’m capping about 1/4 the time w any pair <7s, 3/4 for 7s and up)

Flop was Qd9d4x. C bets I call.

Turn Ad. C checks. I’ve found this is polarizing. Either he has a killer draw, a super strong hand, or he’s in k/c mode (much of which I beat). Since I have no d and an easy fold to a raise, and some value to get, I bet. C calls

River was an off suit 5. C checks. I??
Link on where to play? Trust the website?
12-24-2017 , 06:17 PM
Quote:
Originally Posted by Clumsy Surgeon
Link on where to play? Trust the website?
http://poker-play.srv.ualberta.ca
12-24-2017 , 07:15 PM
Thank you.
12-26-2017 , 01:10 PM
Up2ng: in the other thread I agreed with you but now I think you are mistaken. What you are describing is a Nash equilibrium given a range - this is what the expensive online solver tools use. Or what the MoP would call the “nemesis” strategy. This is not how Cepheus works. I mean it kinda is, but only in the sense that it is the nemesis for the entire game tree. It is opponent range independent.

To see why this has to be so imagine you are playing heads up vs one guy who has never played poker before. On the end he calls 100% of the time just to see what you have. Another guy walks up and he gets to the river, checks to see if he made a royal flush, and if not he folds. And now a third guy but he plays pretty well, some 2+2 schooled pro. And now you take turns playing all 3 of them but each hand you don’t know which guy you are playing. How are you going to analyze their river calling range?

Cepheus doesn’t care which of those guys he plays. He will beat them all, albeit he might have a lower win rate against the two extreme guys than you or I would because we would quickly adjust our ranges. If we had piosolver with us and put in a 100% calling range for one guy on the river it would also spit out a “Nash equilibrium” with zero bluffs and a value range made up of every hand that is at least a 50.001% favorite vs his entire range. This is just a bastardization of a Nash equilibrium because the guy’s strategy is so silly.
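
To make the calling-station arithmetic concrete, here is a minimal sketch. It assumes a limit river spot where we are checked to, the bet is one big bet, and the opponent never raises. The function names and the simplified model are illustrative assumptions, not output from any solver.

```python
def ev_check(pot, equity):
    """EV (in big bets) of checking behind and going to showdown."""
    return equity * pot


def ev_bet(pot, equity_when_called, call_freq, bet=1.0):
    """EV of betting `bet` into `pot` on the river.

    Assumption: the opponent calls with probability `call_freq`,
    folds otherwise, and never raises.
    """
    ev_if_fold = pot  # we pick up the pot uncontested
    ev_if_call = equity_when_called * (pot + 2 * bet) - bet
    return (1 - call_freq) * ev_if_fold + call_freq * ev_if_call


def best_river_action(pot, equity, bet=1.0):
    """Best response against an opponent who calls 100% of the time."""
    # Against a 100% caller, our equity when called is just our raw equity.
    if ev_bet(pot, equity, call_freq=1.0, bet=bet) > ev_check(pot, equity):
        return "bet"
    return "check"
```

With a 100% calling frequency the bet EV reduces to equity * (pot + 2) - 1, which beats checking only when equity is above 50%. A pure bluff (equity 0) simply loses the bet, which is why the range contains zero bluffs.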

Uphill: counterfactual regret minimization is an algorithm to traverse the game tree quickly and efficiently. It is an approximation of gto when used on the entire game tree. And an approximation of a Nash equilibrium when used on a smaller tree. For that tree.
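
For anyone who wants to tinker, the regret idea can be sketched on a toy game. The code below is plain regret matching in self-play on rock-paper-scissors, not the CFR+ variant described in the linked paper, and it ignores everything that makes poker hard (the game tree, hidden cards, abstraction). The payoff matrix, function names, and the lopsided starting regrets are all assumptions chosen for illustration.

```python
# PAYOFF[a][b]: payoff to a player choosing action a against action b.
# Actions: 0 = rock, 1 = paper, 2 = scissors. The game is symmetric,
# so both players can read their payoffs from the same matrix.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no regret yet: play uniformly


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Self-play; returns each player's average strategy.

    Player 0 starts with a made-up bias toward rock so that the
    dynamics are not stuck at the uniform strategy from step one.
    """
    n = 3
    regrets = [list(initial_regrets), [0.0] * n]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            opp = strategies[1 - player]
            # Expected value of each action against the opponent's current mix.
            action_values = [sum(PAYOFF[a][b] * opp[b] for b in range(n))
                             for a in range(n)]
            current_value = sum(strategies[player][a] * action_values[a]
                                for a in range(n))
            for a in range(n):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - current_value
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

The current strategies keep cycling, but the average strategy returned by `train` drifts toward one third rock, paper, and scissors, which is the equilibrium of this toy game. The average, not the last iterate, is what approximates the equilibrium.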
12-26-2017 , 03:38 PM
I have this loose theory that there is a difference between gto and nash equilibrium:

nash equilibrium is only satisfied when two or more players are playing the nash equilibrium strategies, which goes all the way back to preflop.

gto strategies are only satisfied when two or more players are playing the comaximally exploitive strategies.

This may seem redundant, but here's the difference:

gto may be satisfied by a player individually by such a player that takes the comaximally exploitive line given any set of previous circumstances in a hand of poker; this includes preflop errors.

to show how this is true, think about what solvers do:

they figure out an approximation of a minimax strategy given information that may or may not satisfy the conditions of a nash equilibrium. This includes preflop errors.

For example:

If I knew the nash equilibrium preflop range, but included some hands that were not in that range in my preflop raising range and plugged them into a solver, the solver would produce a minimax strategy that is not strategically symmetrical with that of the nash equilibrium.

Thus while nash equilibrium strategy is the strategy that maximally exploits the opposing nash equilibrium strategy, it is not necessarily the gto strategy, which maximally exploits the maximally exploitive counter strategy, given any set of previous information, including preflop errors.
12-26-2017 , 03:43 PM
Optimal play is to stop feeding the bots and AI scientists information that will inevitably lead towards the destruction of our game.

People who play in the AI matches, like the four huNL players who got crushed by the AI, are poker Uncle Toms.

but nah, it's fun in the short term to talk about math so who cares if we're willingly walking the path towards the destruction of our livelihoods ...
12-26-2017 , 03:50 PM
Whoever shuts down the University of Alberta AI department deserves to be Time's Person of the Year.
12-26-2017 , 03:53 PM
Let's go talk to some checkers players about how their game is thriving after it got solved ...

****ing lemmings
12-26-2017 , 04:47 PM
Lol you can’t stop progress. You can only learn to use it to your advantage.
12-26-2017 , 04:57 PM
Have you ever considered why there's essentially zero online gambling on chess or backgammon?

Engines exist for these games that play better than any human.

What, pray tell, do you think is going to happen to online poker when all the solvers and engines are available to everybody and everybody can play GTOish or whatever?

But no, Uncle Tom, keep on keepin' on. In five years the highest poker games online will be microstakes.

Of course, these AI *******s COULD easily invent a new game with which to test their tech, with the two added bonuses of:

- you won't be aiding the destruction of 1000s of people's livelihoods
- you can make the game exactly as simple or complex as you need

but, you know, that would get less intere$t from random nerds, which indirectly would lead to a lo$$ of funding for the University of Alberta Terminators and other traitors of mankind, so that's obviously impossible.
12-26-2017 , 05:03 PM
I think you’re missing my point. I agree online poker is basically done for in a few years. But there isn’t anything you or I can do about that except try to crush until it happens. If those guys didn’t participate in the AI heads up match the algorithm still would have worked the same. Poker (particularly limit Holdem) is just a fairly simple game and it’s no surprise we are getting to the point where it can be solved by an iPhone.
12-26-2017 , 05:09 PM
Quote:
Originally Posted by DeathDonkey
I think you’re missing my point. I agree online poker is basically done for in a few years. But there isn’t anything you or I can do about that except try to crush until it happens. If those guys didn’t participate in the AI heads up match the algorithm still would have worked the same. Poker (particularly limit Holdem) is just a fairly simple game and it’s no surprise we are getting to the point where it can be solved by an iPhone.
I agree that my tilting in this forum isn't going to STOP these *******s from destroying something that I love. I get that. I'm just trying to stem the tide as much as one guy can. I know I'm tilting at windmills.

It's just, I've seen this story play out in other games with my own two goddamn eyes. These people, flat out, do not know or do not care that they're indirectly ripping countless hundreds of thousands of dollars from the hands of poker players and other game players. They are the enemy.
12-26-2017 , 05:16 PM
Quote:
Originally Posted by DeathDonkey
But there isn’t anything you or I can do about that except try to crush until it happens.
You know, though, I do think there is actually something we, as a poker-playing populace, COULD do about this if we were to team up. I do think that a mass-scale public outcry could at least make these guys HESITATE to continue their mission to destroy strategy games.

Sadly, most poker regs are QUANTS themselves and love this ****. The poker ecosystem badly needs more left-brain nerds such as yours truly. We're able to see the forest.
12-26-2017 , 05:27 PM
I hear you and I don’t think you are wrong. On some level everyone who posts on this site is ok with sharing info and making the games tougher. You are somewhat of an exception tbh.

I guess I look at it like new technology is sometimes invented and sometimes discovered. Figuring out Poker is more of a discovery of the right way to task a computer with the problem. And it was going to get solved regardless. At least academics publish their work so people like me can tinker with CFR algorithms on my laptop. Otherwise a few unscrupulous programmers would / will make millions running bots online and nobody would be the wiser.

Also it’s not all peaches and cream being an academic these days. Their world is crumbling too
12-26-2017 , 09:14 PM
Quote:
Originally Posted by Bob148
I have this loose theory that there is a difference between gto and nash equilibrium:

nash equilibrium is only satisfied when two or more players are playing the nash equilibrium strategies, which goes all the way back to preflop.

gto strategies are only satisfied when two or more players are playing the comaximally exploitive strategies.

This may seem redundant, but here's the difference:

gto may be satisfied by a player individually by such a player that takes the comaximally exploitive line given any set of previous circumstances in a hand of poker; this includes preflop errors.

to show how this is true, think about what solvers do:

they figure out an approximation of a minimax strategy given information that may or may not satisfy the conditions of a nash equilibrium. This includes preflop errors.

For example:

If I knew the nash equilibrium preflop range, but included some hands that were not in that range in my preflop raising range and plugged them into a solver, the solver would produce a minimax strategy that is not strategically symmetrical with that of the nash equilibrium.

Thus while nash equilibrium strategy is the strategy that maximally exploits the opposing nash equilibrium strategy, it is not necessarily the gto strategy, which maximally exploits the maximally exploitive counter strategy, given any set of previous information, including preflop errors.
12-26-2017 , 11:19 PM
Quote:
Originally Posted by SitandSpin
Have you ever considered :

Of course, these AI *******s COULD easily invent a new game with which to test their tech, with the two added bonuses of:

- you won't be aiding the destruction of 1000s of people's livelihoods
- you can make the game exactly as simple or complex as you need.
You do realize that AI, in the next 15-20 years, will be responsible for the disappearance of like 20-30% of the jobs that exist today?

I think they really don’t care how it affects the online poker economy.
They’ve got bigger fish to fry!
12-26-2017 , 11:34 PM
https://forumserver.twoplustwo.com/1...brium-1699669/

I turned my other post in this thread into a theory forum post for anyone interested.
12-27-2017 , 02:06 AM
Hey DeathDonkey, great post above (#30).

I'm not sure what I've said in this thread that contradicts any of that, I think that we are on the same page. I never meant to imply that Cepheus behaves at all like the nemesis, or that it adjusts its play in any way based on opponent tendencies -- I don't think that it works that way either.

The point that I was trying to make is that, particularly when checked to on the river, Cepheus will calculate which hands it can bet "for value" and it will include those in its river betting range. THEN, it will balance this range by adding an appropriate number of bluffs. NOT the other way around. Its algorithm for determining "for value", by definition, will be based on what I've said earlier in this thread -- it will compare its current hand against its estimation of its opponent's range, and if it beats enough of that range it will bet for value. In this particular decision (checked to on the river) it will not factor in at all where it is within its own range, the board texture, how many bluffs it "wants" to have, etc.

Perhaps the confusion was that I then tried to give an example (in this thread or perhaps in the other thread) of what WE might want to do while playing exploitatively which is to actually try to estimate our opponent's range based on their tendencies -- like, if we know with certainty that our opponent only checks the river with 2 pair or better and with 6-hi or worse (and always folds to a bet while holding 6-hi or worse), then it's illogical for us to bet one pair on the river no matter what GTO says.

However, I agree that Cepheus does NOT play this way. It does not try to estimate our range based on OUR tendencies -- instead, (and this is now speculation as I'm not an expert on how this bot actually works) it will estimate our range based on how it thinks we should have played the hand up to this point (using its own GTO solutions) . . . in other words, if Cepheus was sitting in our seat (playing against itself) and using the same line as we've used, any hand that it would still have in its own range at that point becomes included in its estimation of our range for that decision point -- and then it bets for value or as a bluff or checks behind accordingly. But this is still a hand vs opponent's range algorithm. When you think about it, it has to be this way, otherwise it would miss value or it would spew, neither of which are GTO.
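
As a side note on "adding an appropriate number of bluffs": in the standard toy model where the bettor holds only value hands and pure bluffs and the caller holds only bluff-catchers, the balancing ratio is simple arithmetic. The sketch below assumes that toy model and a one-big-bet river bet; the function names are hypothetical and it says nothing about how Cepheus itself arrives at its frequencies.

```python
def balanced_bluff_fraction(pot, bet=1.0):
    """Fraction of the betting range that is bluffs, chosen so that a
    bluff-catcher is indifferent between calling and folding.

    Toy-model assumption: the bettor's range is value hands that always
    win plus bluffs that always lose when called.
    """
    return bet / (pot + 2 * bet)


def caller_ev(pot, bluff_fraction, bet=1.0):
    """EV of calling with a bluff-catcher (folding is worth 0)."""
    win = bluff_fraction * (pot + bet)   # we catch a bluff
    lose = (1 - bluff_fraction) * bet    # we pay off a value hand
    return win - lose
```

For example, with 6 big bets in the pot the balanced fraction is 1/8, so one bluff for every seven value bets, and `caller_ev(6, 1/8)` comes out to zero.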
12-27-2017 , 03:23 AM
Quote:
You do realize that AI, in the next 15-20 years, will be responsible for the disappearance of like 20-30% of the jobs that exist today?
gtfo solution?
12-27-2017 , 09:26 AM
@up2ng

I am not intimately familiar with Cepheus' programming, but from my understanding of the poker AI that uses CFR algorithms there is no range estimation during play.

Cepheus just found the best action against itself by traversing most of the nodes in the game tree and reducing the amount of regret it had along the way (i.e. it would change its action slightly if another action yielded better results). Based on this method it determines what hands to bet, check, call, raise, and fold, and with what frequency to take each action with a particular hand.

I imagine while it's playing humans or in competition it just has a lookup table to determine which action to take. At that point my guess would be it's just taking as input the hand, board, and actions and spitting out bet, check, call, raise, fold etc.
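
If that guess is right, play-time behaviour amounts to something like the sketch below: look up the precomputed action probabilities for the current situation and sample from them. The key format, the table contents, and the function name are all hypothetical; this only illustrates the lookup idea, not the actual Cepheus implementation.

```python
import random


def choose_action(table, hole_cards, board, history, rng=random):
    """Sample an action from a precomputed strategy table.

    `table` maps (hole_cards, board, history) to a dict of
    action -> probability. No range estimation happens here: the same
    input always gives the same probabilities, whoever the opponent is.
    """
    probs = table[(hole_cards, board, history)]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Made-up entry for illustration only; the numbers are not real output.
EXAMPLE_TABLE = {
    ("AsKs", "Qd9d4c", "raise-call/check"): {"bet": 0.7, "check": 0.3},
}
```

Calling `choose_action(EXAMPLE_TABLE, "AsKs", "Qd9d4c", "raise-call/check")` returns "bet" about 70% of the time and "check" otherwise, so mixed frequencies fall out of the sampling rather than any in-hand calculation.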
12-27-2017 , 12:22 PM
Quote:
Originally Posted by DeathDonkey

Uphill: counterfactual regret minimization is an algorithm to traverse the game tree quickly and efficiently. It is an approximation of gto when used on the entire game tree. And an approximation of a Nash equilibrium when used on a smaller tree. For that tree.
ty dd. i got the 20k foot view from the paper and your explanation aligns with what i took away and wrote above as well (and your explanation is more detailed).

sitandspin, the fact that you bring up "gambling on chess and backgammon" in a discussion of gambling on poker wrt computer advancements is a red flag regarding your understanding of the subject matter at hand.
12-27-2017 , 12:42 PM
Quote:
Originally Posted by just_grindin
@up2ng

I am not intimately familiar with Cepheus' programming, but from my understanding of the poker AI that uses CFR algorithms there is no range estimation during play.

Cepheus just found the best action against itself by traversing most of the nodes in the game tree and reducing the amount of regret it had along the way (i.e. it would change its action slightly if another action yielded better results). Based on this method it determines what hands to bet, check, call, raise, and fold, and with what frequency to take each action with a particular hand.

I imagine while it's playing humans or in competition it just has a lookup table to determine which action to take. At that point my guess would be it's just taking as input the hand, board, and actions and spitting out bet, check, call, raise, fold etc.


Right, this is my understanding too: it isn’t doing anything we would call “calculating” during a hand. It’s actually doing something simpler; it’s just never been done before because the problem was too large. It’s sorta just tossing a strategy out there (guess and check), then it calculates its regret at each point and adjusts, as you said. What it ends up with could look to the untrained eye like a sort of haphazard strategy that would suffer from all sorts of imbalances, but it simply isn’t. It’s so close to the gto strategy as to be indistinguishable, and for each board and action it happens to have a well balanced range.
12-27-2017 , 12:44 PM
up-> i think the issue is here:

Quote:
Originally Posted by up2ng
The point that I was trying to make is that, particularly when checked to on the river, Cepheus will calculate which hands it can bet "for value" and it will include those in its river betting range. THEN, it will balance this range by adding an appropriate number of bluffs. NOT the other way around. Its algorithm for determining "for value", by definition, will be based on what I've said earlier in this thread -- it will compare its current hand against its estimation of its opponent's range, and if it beats enough of that range it will bet for value. In this particular decision (checked to on the river) it will not factor in at all where it is within its own range, the board texture, how many bluffs it "wants" to have, etc.
you kinda touch on it later, but to be clear, cepheus determines these ranges EXACTLY as if it were playing itself. not "kinda", and it doesn't estimate its opponents' ranges.
12-27-2017 , 12:45 PM
Quote:
Originally Posted by DeathDonkey
Right, this is my understanding too: it isn’t doing anything we would call “calculating” during a hand. It’s actually doing something simpler; it’s just never been done before because the problem was too large.
yes, the ACTUAL calculation (training of the ML algorithm) of all the nodes of the tree took 68.5 days.

Quote:
It’s sorta just tossing a strategy out there (guess and check), then it calculates its regret at each point and adjusts, as you said. What it ends up with could look to the untrained eye like a sort of haphazard strategy that would suffer from all sorts of imbalances, but it simply isn’t. It’s so close to the gto strategy as to be indistinguishable, and for each board and action it happens to have a well balanced range.
it's statistically indistinguishable from a nash equilibrium over 10^14 games.