Weird Hypothetical HU NLH Question

08-28-2018 , 04:53 PM
Quote:
Originally Posted by browni3141
I believe the original scenario will be a bloodbath. P1 gets destroyed because P2 will effectively have complete information on the river in NL, since P1 will bluff with a sizing that causes P2 to fold 100% of the time assuming P2 is unaware of cheating. Since P2 is aware, he can bluffcatch in situations he’s supposed to fold and fold against sizings he’s supposed to call against. It is also generally in P1’s best interest to make it to the river where P2 has a very clear and large advantage, since in the cheating/unaware version P1 has 100% pot share against most hands on the river due to being able to bluff so effectively.

I’m not sure at which stack sizes P1 has the advantage. Surely just very short stacks, but I’m not sure how to estimate more precisely.



Change the toy game to NL and P2 will clean up.

P1 leaks too much information in a NL format. He is only winning the toy game because it’s a limit format.


Switching the toy games to NL is what is required to solve for EV in terms of bet size. There is only one bet in the toy game. So far there has not been a correct EV calculation in the thread.

Spoiler:
the range boundaries move as bet size changes
08-28-2018 , 05:08 PM
Quote:
Originally Posted by robert_utk
Switching the toy games to NL is what is required to solve for EV in terms of bet size. There is only one bet in the toy game. So far there has not been a correct EV calculation in the thread.

Spoiler:
the range boundaries move as bet size changes
Actually, there is still a big difference between a limit game with a single arbitrary bet sizing and no limit. The EV and optimal strategy when betting pot in the toy limit game is not the same as in the NL version.

Where did I make a mistake in my EV calculations for the toy game?

I’m not interested personally in solving for EV in terms of bet size. I’d rather skip to NL. I am pretty sure this toy game has already been brought up as NL in the past in this forum. It might even be in MoP. IIRC the strategy was to bet a different size for every value combo in P1’s range in the default version, while betting random sizings with bluffs in a distribution which balances every sizing.
08-28-2018 , 05:50 PM
Quote:
Originally Posted by browni3141
Actually, there is still a big difference between a limit game with a single arbitrary bet sizing and no limit. The EV and optimal strategy when betting pot in the toy limit game is not the same as in the NL version.



Where did I make a mistake in my EV calculations for the toy game?



I’m not interested personally in solving for EV in terms of bet size. I’d rather skip to NL. I am pretty sure this toy game has already been brought up as NL in the past in this forum. It might even be in MoP. IIRC the strategy was to bet a different size for every value combo in P1’s range in the default version, while betting random sizings with bluffs in a distribution which balances every sizing.


A model for NL was published in 1959 by Donald Newman. It was reproduced in MOP in a chapter entitled ‘Bet size, you don’t have to guess’ iirc. Ironically, the chapter leaves the reader guessing. In any case, the model does not work for dependent ranges featuring card removal.

Solving toy games as simple as these should be fun, for anyone interested in such models.
08-28-2018 , 08:11 PM
Quote:
Originally Posted by robert_utk
Solving all three versions of the toy game for EV in terms of Bet size will determine if there is any value of Bet which is +EV for P2.
A moment’s thought gives you an obvious yes. If the bet was gigantic, P2 is supposed to call only with his very best. P1 will bet almost every hand, and P2 will win that gigantic bet more often than he loses it if he calls with somewhat above the average hand among those he was expected to fold.
08-31-2018 , 01:19 AM
For practice, here is the solution of the first version of the Toy Poker model by using the principle of indifference, and a graphical solution for the EV (which serves also as a proof of optimality).



The Game:

Two players, Player1 and Player2, are dealt uniform independent (0,1) real number hands. Then each player antes 1 unit into the pot. Player1 may bet an amount B, or check. If Player1 checks, the hands are immediately compared. If Player1 bets, Player2 may call by matching the bet, or fold and forfeit the pot.



The proposed optimal solution for Player 1 is to bet with the best hands for value, bluff with the worst hands, and check the middle hands. He bets the same size for bluff or for value.



This will create a range that looks like this:

0-----------a-------------b-----------1

Player 1 bets for all values of x > b and x < a, and checks otherwise.



The response from Player 2 is to call with good hands and fold with bad hands.

This range looks like:

0-----------------c-------------------1

Player 2 calls when her hand y > c and folds otherwise.

To solve for these variables conveniently using indifference, we assume the pot belongs to neither player (we can subtract the amount of ante from the final EV). We use B for the size of the bet and P for the size of the pot. It is given that 0 < a < c < b < 1 .


Start with Player2, since she acts last and we are going backwards...

Player1 keeps Player2 indifferent between calling and folding at c:

EV fold = EV call

0 = (B+P)a - B(1-b)

0 = Ba + Pa – B + Bb


Player2 calls to keep Player1 indifferent between betting and checking at a:

EV bet = EV check

P(c) - B(1-c) = P(a)

(Pc – B + Bc)/(P) = a



Player2 keeps Player1 indifferent between checking and betting at b:

EV check = EV Bet

Pb = Pc + (B+P)(b-c) - B(1-b)

Pb = Pc + Bb + Pb – Bc - Pc – B + Bb

0 = 2Bb – Bc - B

1 = 2b – c



For the case where B=P=2 these three equations simplify to:

1 = 2a + b

2c – 1 = a

1 = 2b – c

Simple substitution of three equations with three unknowns yields the values for a, b, and c:

a = 1/9
b = 7/9
c = 5/9
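
For anyone who wants to check the substitution, here is a minimal Python sketch that solves the three indifference equations exactly with rational arithmetic. The function name `solve_thresholds` is just an illustrative choice, and the closed form for b comes from substituting c = 2b - 1 into the other two equations.

```python
from fractions import Fraction

def solve_thresholds(B, P=2):
    """Solve the three indifference equations for (a, b, c) exactly.

    Equations (pot P, bet B):
        (B + P) * a = B * (1 - b)        # P2 indifferent at c
        P * a = P * c - B * (1 - c)      # P1 indifferent at a
        2 * b - c = 1                    # P1 indifferent at b
    """
    B, P = Fraction(B), Fraction(P)
    r = B / P
    # Substituting c = 2b - 1 into the second equation gives
    # a = 2(1 + r) b - (1 + 2r); putting that into the first equation
    # leaves a single linear equation in b.
    b = (B + (B + P) * (1 + 2 * r)) / (2 * (B + P) * (1 + r) + B)
    c = 2 * b - 1
    a = c - r * (1 - c)
    return a, b, c

a, b, c = solve_thresholds(2, 2)
print(a, b, c)   # 1/9 7/9 5/9
```

Passing a different B gives the thresholds for other bet sizes with the same pot.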

The solution thus far is pretty straightforward. However, the calculation of the EV and the value of the game is usually the more complex part.


For piecewise linear EV functions, the total EV is the total area under the graphs (figs. 1&2 below) of the function, minus the original ante.


Here is the payoff graph of Player1:




Here is the same graph in terms of a, b, c, P, and B:





As a proof that the chosen strategy for Player1 is optimal, note that for every hand value (x) the action Player1 takes is the highest line on the graph, so no other action with that hand earns more. The strategy is indeed optimal.


The total EV for Player1 is the area under the upper envelope of this graph minus 1 for the ante.


Using the values in fig.1 we can calculate the area as follows:

(1/9)(2/9)(½) + (14/9)(7/9)(½) + (2/9)(14/9) + (12/9)(2/9)(½) - 1 = 180/162 - 1 = 10/9 - 1 = 1/9

The EVP1 of the game is positive and equals 1/9.
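
Since the graphs are not reproduced here, the same area can be recomputed directly. This is a small sketch, assuming the B = P = 2 thresholds above and that Player2 calls exactly when y > c; the helper names are illustrative only.

```python
from fractions import Fraction as F

P, B = F(2), F(2)
a, b, c = F(1, 9), F(7, 9), F(5, 9)

def ev_check(x):
    # Showdown: P1 wins the pot when y < x.
    return P * x

def ev_bet(x):
    # P2 folds with y < c; when she calls (y > c), P1 wins if y < x.
    win_called = max(x - c, F(0))      # P(c < y < x)
    lose_called = 1 - max(x, c)        # P(y > max(x, c))
    return P * c + (B + P) * win_called - B * lose_called

def envelope(x):
    # Best available action for hand x.
    return max(ev_check(x), ev_bet(x))

# Both EV lines are linear between the breakpoints 0, a, c, b, 1 and
# cross only at a and b, so the trapezoid rule is exact on each piece.
knots = [F(0), a, c, b, F(1)]
area = sum((hi - lo) * (envelope(lo) + envelope(hi)) / 2
           for lo, hi in zip(knots, knots[1:]))
ev_p1 = area - 1   # subtract the ante
print(area, ev_p1)   # 10/9 1/9
```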

We can see this area in fig.1 and fig.2 as the tiny triangle in the bluff region below 1/9.
This is illustrative that when two poker players bet perfectly for value and call perfectly, the only profit is by bluffing.

Here is a graph of the payoff to Player1 according to the familiar poker strategy of ValueBets, Checks, and Bluffs:




Now let’s finish with this model and go from Pot Limit where B=P=2 to full-fledged no limit poker for any value of B > 0. Then we can move forward with the versions of the model that involve exposed hole cards of Player2.

As bet size increases, the values of b and c slide to the right (increase); as bet size decreases, they slide to the left (decrease). The value of a rises with B only up to B = 2 and then falls back toward 0 for larger bets.

Player1 will choose a bet size that maximizes the total area of the regions in fig.3. Once we have the value of the game in terms of B, we can solve for this value with basic calculus.

First, we need the total area of each region in terms of a, b, c, and B:

Value Area: (1 – b)(2b) + (1-b)(2 + B – Bc – 2b)(½)

Check Area: (2b)(b)(½) - a^2

Bluff Area: 2a^2

Substitution of the variables a, b, and c using the indifference equations gives the total area in terms of B, and subtracting 1 for the ante gives the value of the game in terms of B:

EVP1 = B/[(1+B)(4+B)]

Finding the first derivative and setting equal to zero gives the optimal bet size:

d(EVP1)/dB = (4 – B^2)/[(1 + B)^2(4+B)^2] = 0

B = 2, -2
B*=2

The maximum EV is obtained by betting 2, which was the size of the pot all along.
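
A quick numerical cross-check of the calculus, as a sketch: it takes the closed form EVP1 = B/[(1+B)(4+B)] from above and scans a grid of bet sizes. The grid range and step are arbitrary choices.

```python
from fractions import Fraction

def ev_p1(B):
    """Closed-form game value from the post: B / ((1 + B)(4 + B))."""
    B = Fraction(B)
    return B / ((1 + B) * (4 + B))

# Scan bet sizes 0.01, 0.02, ..., 10.00 (as exact rationals) for the maximum.
grid = [Fraction(k, 100) for k in range(1, 1001)]
best = max(grid, key=ev_p1)
print(best, ev_p1(best))   # 2 1/9
```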

Here is a graph of the EV function, showing a clear maximum at B = 2 and decreasing for B > 2:





Some important notes on this model:


Player1 has a single unique optimal solution, which is as shown here. Any deviation by Player1 is a mistake and leads to less EV.
Player2 has multiple co-optimal strategies, in addition to the one shown here.
Player2 may in fact call with any hand 1/9 < y < 7/9 provided she does so with a mixed probability of: (b − c)/(b− a)
She does not lose EV by doing so, and holds this strategy in reserve, as this strategy stays in equilibrium with the optimal strategy required by Player1.
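
The mixed calling strategy can be sanity-checked numerically. The sketch below assumes she always calls with y > b, calls with probability (b − c)/(b − a) when a < y < b, and folds y < a; the function and variable names are illustrative. It confirms that the overall calling frequency is still 1 − c, and lets you evaluate Player1's bet EV for any hand against the mix.

```python
from fractions import Fraction as F

P, B = F(2), F(2)
a, b, c = F(1, 9), F(7, 9), F(5, 9)
q = (b - c) / (b - a)          # calling probability for a < y < b

def ev_bet_vs_mixed(x):
    """P1's EV (pot belongs to neither player) for betting hand x
    against the mixed calling strategy described above."""
    def overlap(lo, hi):       # length of the interval (lo, hi), 0 if empty
        return max(hi - lo, F(0))
    # Probability that P2 calls with a worse hand / a better hand than x.
    call_worse = q * overlap(a, min(x, b)) + overlap(b, x)
    call_better = q * overlap(max(x, a), b) + overlap(max(x, b), F(1))
    fold = 1 - call_worse - call_better
    return P * fold + (B + P) * call_worse - B * call_better

total_call = (1 - b) + q * (b - a)
print(q, total_call, 1 - c)    # 1/3 4/9 4/9
```

Under these assumptions, a bluff with x < a still returns 2/9 and a bet with a middle hand returns 2x, the same as checking it.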


Now, to proceed and try to solve the model for various scenarios involving exposed hole cards, we need some Sklansky rules for GTO cheating.

There are a couple of scenarios.

Version 2: Player1 secretly sees the hand of Player2, and acts accordingly to maximize his EV.

Version 3: Player2 secretly notices that Player1 has seen her hand and acts accordingly to minimize his EV.



What do you think the answers will look like?

Last edited by robert_utk; 08-31-2018 at 01:28 AM.
09-12-2018 , 02:09 AM
Picking up with version 2 of the toy poker game:

Player1 sees the hand of Player2, and acts accordingly to maximize his EV.

It is worth noting that Player1 can choose a strategy that will go undetected by Player2 (would never showdown a hand that violated the normally optimal strategy for Player1). However, since the memory of Player1 is eventually stricken with amnesia, we will assume Player1 is going for maximum EV with no attempt at deception.

Also of note is that the case of No Limit will be solved as push-fold, which seems to be in keeping with the original question in the thread, and this avoids some comical bet sizing (Player1 could bet exactly enough to make Player2 call with every loser, or fold with every winner). We assume Player2 would notice such shenanigans and immediately choose to defend with mixed strategy. Recall, from version 1 of the game previously solved, it is required that Player1 bet with B=2 to be optimal.



In this version of the game, Player1 is given additional information. Player1 knows if his hand is a winning hand. This will create dual strategies for Player1, depending on holding a winning or a losing hand.

We hold Player2 to the same strategy from the previous version. Player2 calls with hand (y) when c < y < 1 and folds otherwise.



The proposed optimal strategy for Player1 is as follows:

Bet amount B with every winning hand (x), and also Bet amount B when Player2 holds a hand (y) where 0 < y < c .
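
As a cross-check on any closed-form figure for this version, the stated strategies can be simulated directly. This is only a sketch under the assumptions in the post (uniform independent hands, ante 1 each, Player2 calls exactly when y > c, Player1 bets every winner and every hand where y < c, and checks otherwise); the function names, trial count and seed are illustrative. Note that "Player1 holds a winner" and "Player2 calls" both depend on y, so they are not independent events; the simulation handles that automatically. The `exact_version2` helper is an editorial integration over the unit square for comparison, not a formula from the thread.

```python
import random

def simulate_version2(B=2.0, c=5/9, trials=200_000, seed=0):
    """Monte Carlo estimate of P1's net EV per hand (ante 1 each, pot 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x, y = rng.random(), rng.random()   # P1's and P2's hands
        if y < c:
            total += 1.0        # P1 bets, P2 folds: P1 nets P2's ante
        elif x > y:
            total += 1.0 + B    # P1 bets a winner, P2 calls and loses
        else:
            total -= 1.0        # P1 checks a loser and loses the showdown
    return total / trials

def exact_version2(B=2.0, c=5/9):
    # Editorial comparison value (an assumption of this sketch):
    # P(y < c) = c pays 2, P(y > c and x > y) = (1 - c)**2 / 2 pays 2 + B,
    # everything else pays 0; then subtract the ante.
    return 2 * c + (2 + B) * (1 - c) ** 2 / 2 - 1

print(simulate_version2(), exact_version2())
```

Changing `B` and `c` together (with c taken from the indifference equations) shows how the estimate moves with stack depth.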



Let's compute the EV of this strategy intuitively (we can reserve graphical proof as an exercise for later):

Player1 receives a winner with probability ½.

When Player1 receives a winner, his expected return when the pot belongs to neither player is:

(2+B)(1-c) + 2c

So Player1 always wins the pot, and wins the extra amount B with probability (1-c).

We subtract 1 for the ante:

(2+B)(1-c) + 2c –1

Now all of this happens with probability ½ :

[(2+B)(1-c) + 2c –1]/2



Next Player1 receives a losing hand with probability ½.

When Player1 receives a loser, his expected return when the pot belongs to neither player is:

2c

So Player1 still wins the pot with probability c, and wins zero otherwise.

Subtract 1 for the ante

2c – 1

This happens with probability ½

(2c – 1)/2



Now add the EV of both together:

(B – Bc + 2c)/2



When B = 2, the EVP1 is positive and equal to 1.



The EV of Player1 is nine times larger than before, and is the entire ante of Player2. Quite unfair indeed!



That seems like quite a simple answer, so let’s check it.



This time, since the strategy of P1 depends on the hand of P2, we will use a method called Payoff Squares. Payoff Squares are simple 2-dimensional representations of the interaction of 2 ranges when the game is zero-sum. Start with a square where the bottom left corner is zero (nut low), the bottom side is Player1’s range from 0 to 1, and the left side is Player2’s range from 0 to 1. See Fig4 below.



Fig 4: Player1’s Payoff Square vs Player2’s strategy. (Pot belongs to neither player)



Notice how quick and easy calculating EV can be with a payoff square vs. using algebra (with many more variables, the algebra gets pretty intense).

We see that the EVP1 is 2c^2 + [4(1-c^2)]/2 -1(ante)

EVP1 = 2c^2 +4/2 -2c^2 -1

EVP1 = 2 - 1 = 1

OK, so we were correct with the intuitive solution, check.

But we lost a variable in there, namely c. We are probably going to need that variable in the no-limit push-fold solution. Let’s get it back. Now we redraw the payoff square and use variables, see Fig5 below. NOTE: We will also use this graph later to calculate exactly the bluffing frequency of Player1.



Fig 5: Player1’s Payoff in terms of P, B, and c:







This makes it pretty simple that EVP1 = Pc^2 + [(1-c^2)(P+B)]/2 - ante

Pot = 2, and ante=1, so:

EVP1 = Pc^2 + (P+B)/2 - [(c^2)(P+B)]/2 - 1

EVP1 = c^2 + B(1-c^2)/2

Now we have EV in terms of B and c. Let’s get rid of c, and just have EV in terms of B:

From our handy original set of indifference equations, we can solve for the value of c in terms of B and substitute into the above equation. I will leave that algebra for the reader to compute, and the answer is:

c = (B^2 + 3B)/(B^2 + 5B + 4)
where EVP1 = c^2 + B(1-c^2)/2
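
For readers who would rather not do the algebra for c by hand, this sketch checks the closed form for c against the original three indifference equations at a few bet sizes (pot P = 2). The function names are illustrative; the substitution is the same one used to solve the B = P = 2 case.

```python
from fractions import Fraction

def c_closed_form(B):
    """c = (B^2 + 3B) / (B^2 + 5B + 4), as stated in the post."""
    B = Fraction(B)
    return (B**2 + 3 * B) / (B**2 + 5 * B + 4)

def c_from_indifference(B, P=2):
    """Solve the three indifference equations for c by substitution."""
    B, P = Fraction(B), Fraction(P)
    r = B / P
    # Same linear equation in b as in the B = P = 2 case, then c = 2b - 1.
    b = (B + (B + P) * (1 + 2 * r)) / (2 * (B + P) * (1 + r) + B)
    return 2 * b - 1

for B in (Fraction(1, 2), 1, 2, 5, 20):
    assert c_closed_form(B) == c_from_indifference(B)
print(c_closed_form(2))   # 5/9
```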

This EV function is non-decreasing as B increases, and approaches a value of 3 (the best possible result) as B approaches infinity.

Here is the EV graph as Bet size increases:



So, for the no-limit push-fold solution where B = 2, the EV for Player1 is 1 and this only increases as the stacks get deeper.

Seems like all hope is lost for Player2. But not so, because in the final version of the game, she will notice that Player1 has seen her hand. She will be able to fight back and recover EV.
04-11-2020 , 06:51 PM
It seems to me like generally, P2 should call down wider. Like much lower down but try to keep it equity-based?

Suppose a normal game: a turn like Qs on As Qd Ts, where player 2 has Js8h and is OOP. In a real game he can lead that turn, adding these kinds of bluffs to compensate for the 45ish flush combos and 90ish trips, since IP doesn't necessarily 2bet with something like AK. That leaves us a lot of possibilities OTR. The worse case would be any K: since most of our bluffs got there, we size small, I guess? On blanks we rip it in for 200% PSB.

Now if we knew our cards were exposed but the other player wasn't privy to the fact that we know he can know our cards, then this spot becomes a x/c and an attempted call down even when we brick especially when we block nutted combos by the river.

This is assuming some part of P1 is reset and he does not see how our strategy is developing into a very bluff catchy - peculiar sort of lines etc.

As P2 - I'd try to jam hands like 9Ts and K2s and TTs ( for example both K2s and TTs have similar equity vs AAs) KX also can enjoy very nice equity vs pairs QQs and lower.

P2 should play a very passive - but equity-based game whenever possible. For instance if he does hit that K OOP - he knows NOTHING worse is calling him, and if he still jams whether he hits or not - player 1 has perfect knowledge when to call or fold.

So equity-based (no random no showdown nothing came in bomb the river is in P2's arsenal now) game also seems rather hard to implement yet.

If P1 is flipping into amnesia and back - I'd actually tell him my strategy for various hands just as a ruse to appear very confident about my game - this way at least his too peculiar lines will be easy to implement/justify without letting on.

Also 3betting hands like K7o and then if just called firing a full-clip cbet on boards like 9T4 2tone seems like a good idea for P2.

Basically, P2 needs to get it all in pre way too often - the only way to steal the advantage player 1 has and we know poker is a game of chance. Hands far down the line like 9Ts are not even a bad beat vs AKo

Wonder how the strategies are affected if it is a timed game - 30-45 mins or very low - like 10 x Uber-Hyper turbo HUSNGs... like a blind rise every 2 hands.
08-17-2020 , 06:22 AM
The OP doesn’t really make sense. The criteria for the situation aren’t very well defined. If I understand what you’re trying to say correctly:

P2 has the ultimate EXPLOITATIVE advantage. P1 is the perfect Level Two player as he can read his opponent’s strength perfectly accurately. This is a big advantage; however, P2 is essentially the perfect Level Three player as he understands exactly how his opponent perceives his strength. As always, Level Three trumps Level Two.

The perfect Level Three player would turn the Level Two player’s advantage into a disadvantage of equal proportion. The fact that P1 is perfectly accurate in his understanding of P2’s holding only means that P2 can perfectly accurately gauge P1’s strength in relation to his own: Let’s say P2 has KK. If P1 bets then P2 can be sure that he must be stronger and so he is safe to assume that P1 must have AA. By simply reverse engineering each of P1’s decisions in this way, P2 is able to fold anytime he’s behind or continue anytime he’s ahead.

*This assumes that P1 is completely unaware that P2 is aware of him reading P2’s hand strength. However, if that were actually the case, P1 wouldn’t ever have a reason to exploit with a deceptive play. He’d just play straight value. I don’t think this is what you meant to say would happen, but it would happen.

Anyway...

Unfortunately for P2, he only has an exploitative advantage. If P1 decides to use GTO he has no advantage at all. Instead, he has the ultimate disadvantage.

If P1 decided to balance his range in any way, then P2 is completely screwed. The GTO strategy that P2 can apply is the same as the normal GTO strategy in Poker, but, as P1 has more information than usual, the GTO strategy he can design will be far superior.

With GTO in the bag it seems as though P1 would win this standoff, but that’s not necessarily true because we don’t know whether or not P1 will try to exploit or use GTO. If he thinks that his advantage has not been exposed it is only rational for him to use this advantage to exploit and so he will get obliterated. If he thinks the opponent MIGHT have exposed his advantage to some degree, which is the most realistic scenario, then he MIGHT opt for GTO, or he MIGHT do more advanced exploitative plays, both of which will reverse P2’s advantage.

I’m trying to think what I would do if I woke up from amnesia and found myself playing HU with the ability to see an unknown opponent’s cards... Hmmm... I’m a good and honest man, so I would kindly tell my opponent that I saw her cards. “Sorry love, I somehow saw that you hold KK”. Whilst humbly gazing into her adoring eyes I’d silently thank the lord for this strange and awesome turn of events: “Praise be to God of all that is good and holy. I do thy bidding, safe in the knowledge that every advantage is mine... Amen.”