For practice, here is the solution of the first version of the Toy Poker model using the principle of indifference, followed by a graphical solution for the EV (which also serves as a proof of optimality).
The Game:
Two players, Player1 and Player2, are each dealt a hand: an independent real number drawn uniformly from (0,1). Each player then antes 1 unit into the pot. Player1 may bet an amount B, or check. If Player1 checks, the hands are immediately compared and the higher hand wins the pot. If Player1 bets, Player2 may call by matching the bet, in which case the hands are compared, or fold and forfeit the pot.
The proposed optimal solution for Player1 is to bet the best hands for value, bluff with the worst hands, and check the middle hands. He bets the same size whether bluffing or betting for value.
This will create a range that looks like this:
0-----------a-------------b-----------1
Player1 bets when his hand x > b or x < a, and checks otherwise.
The response from Player2 is to call with good hands and fold with bad hands.
This range looks like:
0-----------------c-------------------1
Player2 calls when her hand y > c and folds otherwise.
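The rules and the two threshold strategies above can be written out as a small function. This is only a sketch: the name `payoff_p1` and the convention of reporting Player1's net result with the 1-unit ante counted as money at risk are assumptions made for illustration, and ties are ignored because they have probability zero.

```python
def payoff_p1(x, y, a, b, c, B, ante=1):
    """Net result for Player1 on one deal (hypothetical helper, not from the post)."""
    p1_bets = x > b or x < a          # value bet above b, bluff below a
    if not p1_bets:
        # Check: immediate showdown for the antes.
        return ante if x > y else -ante
    if y <= c:
        # Player2 folds and forfeits her ante.
        return ante
    # Player2 calls: showdown for the ante plus the bet.
    return ante + B if x > y else -(ante + B)
```

For example, with placeholder thresholds a = 0.1, b = 0.8, c = 0.5 and B = 2, a bluff with x = 0.05 wins 1 unit when Player2 holds y = 0.3 and folds, and loses 3 units when she holds y = 0.6 and calls.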
To solve for these variables conveniently using indifference, we treat the pot as belonging to neither player (we can subtract the amount of the ante from the final EV). We use B for the size of the bet and P for the size of the pot (P = 2 here, the two antes). It is given that 0 < a < c < b < 1.
Start with Player2, since she acts last and we are going backwards...
Player1 keeps Player2 indifferent between calling and folding at c:
EV fold = EV call
0 = (B+P)a - B(1-b)
0 = Ba + Pa - B + Bb
Player2 calls to keep Player1 indifferent between betting and checking at a:
EV bet = EV check
P(c) - B(1-c) = P(a)
(Pc - B + Bc)/P = a
Player2 keeps Player1 indifferent between checking and betting at b:
EV check = EV Bet
Pb = Pc + (B+P)(b-c) - B(1-b)
Pb = Pc + Bb + Pb - Bc - Pc - B + Bb
0 = 2Bb - Bc - B
1 = 2b - c
For the case where B=P=2 these three equations simplify to:
1 = 2a + b
2c – 1 = a
1 = 2b – c
Simple substitution of three equations with three unknowns yields the values for a, b, and c:
a = 1/9
b = 7/9
c = 5/9
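The substitution can be checked with exact fractions. The sketch below assumes the three indifference equations exactly as written above, (B+P)a = B(1-b), a = ((P+B)c - B)/P and 1 = 2b - c, and eliminates a and c to get a single linear equation in b. The function name `solve_thresholds` is made up for this example.

```python
from fractions import Fraction

def solve_thresholds(B, P=2):
    """Solve the three indifference equations for (a, b, c) by substitution."""
    B, P = Fraction(B), Fraction(P)
    S = B + P
    # Substituting c = 2b - 1 and a = (S*c - B)/P into S*a = B*(1 - b)
    # gives b * (2*S^2 + P*B) = P*B + S^2 + B*S.
    b = (P * B + S * S + B * S) / (2 * S * S + P * B)
    c = 2 * b - 1
    a = (S * c - B) / P
    return a, b, c

a, b, c = solve_thresholds(2)
print(a, b, c)   # 1/9 7/9 5/9
```

Because the function takes B as a parameter, the same code gives the thresholds for other bet sizes, e.g. `solve_thresholds(1)`.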
The solution thus far is pretty straightforward. However, the calculation of the EV and the value of the game is usually the more complex part.
For piecewise linear EV functions, the total EV is the total area under the graphs (figs. 1&2 below) of the function, minus the original ante.
Here is the payoff graph of Player1 (fig.1):
Here is the same graph in terms of a, b, c, P, and B (fig.2):
As a proof that the chosen strategy for Player1 is optimal, note that for every hand x the action Player1 takes is the highest line on the graph at that x. No other action does better against Player2's calling strategy, so the strategy is indeed optimal.
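Without the picture, the same check can be done numerically. The sketch below assumes B = P = 2 and the thresholds 1/9, 7/9, 5/9, builds the two EV lines from the indifference setup (pot belonging to neither player), and confirms on a grid of hands that Player1's chosen action is never below the alternative. It only tests Player1's best response to Player2's calling rule; the function names are illustrative.

```python
from fractions import Fraction as F

P, B = 2, 2
a, b, c = F(1, 9), F(7, 9), F(5, 9)

def ev_check(x):
    # Player1 wins the pot whenever Player2's hand is below x.
    return P * x

def ev_bet(x):
    # Player2 folds with probability c and calls with every hand above c.
    if x <= c:
        return P * c - B * (1 - c)                   # every calling hand beats x
    return P * c + (B + P) * (x - c) - B * (1 - x)   # calls from (c, x) pay off

def chosen_ev(x):
    # Bet above b or below a, check in between.
    return ev_bet(x) if (x > b or x < a) else ev_check(x)

grid = [F(i, 900) for i in range(901)]   # includes x = 1/9 and x = 7/9 exactly
is_envelope = all(chosen_ev(x) == max(ev_bet(x), ev_check(x)) for x in grid)
print(is_envelope)   # True
```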
The total EV for Player1 is the area under the upper envelope of this graph minus 1 for the ante.
Using the values in fig.1 we can calculate the area as follows:
(1/9)(2/9)(½) + (14/9)(7/9)(½) + (2/9)(14/9) + (12/9)(2/9)(½) - 1 = 180/162 - 1 = 10/9 - 1 = 1/9
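The same area can be computed without splitting it into triangles by hand. This sketch assumes B = P = 2 and the thresholds found above. Since the upper envelope is linear between the points 0, a, c, b and 1, the trapezoid rule on those points is exact.

```python
from fractions import Fraction as F

P, B = 2, 2
a, b, c = F(1, 9), F(7, 9), F(5, 9)

def envelope(x):
    """Best of checking and betting for hand x (pot belongs to neither player)."""
    check = P * x
    if x <= c:
        bet = P * c - B * (1 - c)
    else:
        bet = P * c + (B + P) * (x - c) - B * (1 - x)
    return max(check, bet)

knots = [F(0), a, c, b, F(1)]
area = sum((x1 - x0) * (envelope(x0) + envelope(x1)) / 2
           for x0, x1 in zip(knots, knots[1:]))
game_value = area - 1            # subtract the ante
print(area, game_value)          # 10/9 1/9
```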
The EVP1 of the game is positive and equals 1/9.
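As an independent sanity check on the 1/9 figure, the game can simply be dealt many times. This is a rough Monte Carlo sketch, assuming B = 2 and the thresholds above; the function name, the seed and the number of deals are arbitrary choices, and the estimate is only expected to land near 0.111, not on it.

```python
import random

def simulate(n_deals, a=1/9, b=7/9, c=5/9, B=2, seed=0):
    """Average net result for Player1 per deal, antes of 1 unit each."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_deals):
        x, y = rng.random(), rng.random()
        if a <= x <= b:
            total += 1 if x > y else -1              # check, showdown for the antes
        elif y <= c:
            total += 1                               # bet, Player2 folds
        else:
            total += (1 + B) if x > y else -(1 + B)  # bet, Player2 calls
    return total / n_deals

estimate = simulate(200_000)
print(round(estimate, 3))
```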
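Measured against checking every hand, which has an EV of exactly 0, this profit shows up in fig.1 and fig.2 as two regions above the check line: the tiny triangle in the bluff region below 1/9 (area 1/81) and the wedge in the value region above 7/9 (area 8/81), and 1/81 + 8/81 = 1/9.
This illustrates that when two poker players bet perfectly for value and call perfectly, the profit hinges on bluffing: Player2 only pays off the value bets with hands between c and b because Player1 also bluffs.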
Here is a graph of the payoff to Player1 according to the familiar poker strategy of ValueBets, Checks, and Bluffs (fig.3):
Now let’s finish with this model and go from Pot Limit where B=P=2 to full-fledged no limit poker for any value of B > 0. Then we can move forward with the versions of the model that involve exposed hole cards of Player2.
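As the bet size increases, b and c slide to the right (increase), and as it decreases they slide to the left (decrease). The bluffing threshold a behaves differently: it rises with B only up to B = 2 and falls back for larger bets (for example a = 1/10 at both B = 1 and B = 4, versus 1/9 at B = 2).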
Player1 will choose a bet size that maximizes the total area of the regions in fig.3. Once we have the value of the game in terms of B, we can solve for this value with basic calculus.
First, we need the total area of each region in terms of a, b, c, and B:
Value Area: (1 - b)(2b) + (1 - b)(2 + B - Bc - 2b)(½)
Check Area: (2b)(b)(½) - a^2
Bluff Area: 2a^2
Substitution of the variables a, b, and c using the indifference equations gives the total area in terms of B, and subtracting 1 for the ante gives the value of the game in terms of B:
EVP1 = B/[(1+B)(4+B)]
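The closed form can be cross-checked against the three areas for several bet sizes. This sketch assumes P = 2, solves the indifference equations by substitution, and measures the value-region triangle from the check line, so its height is (2 + B - Bc - 2b). All function names are illustrative.

```python
from fractions import Fraction as F

def thresholds(B):
    """Solve the three indifference equations for a pot of P = 2."""
    B = F(B)
    S = B + 2
    b = (2 * B + S * S + B * S) / (2 * S * S + 2 * B)
    c = 2 * b - 1
    a = (S * c - B) / 2
    return a, b, c

def value_from_areas(B):
    a, b, c = thresholds(B)
    B = F(B)
    value = (1 - b) * (2 * b) + (1 - b) * (2 + B - B * c - 2 * b) / 2
    check = b * b - a * a
    bluff = 2 * a * a
    return value + check + bluff - 1     # subtract the ante

def closed_form(B):
    B = F(B)
    return B / ((1 + B) * (4 + B))

matches = all(value_from_areas(B) == closed_form(B)
              for B in (F(1, 2), 1, 2, 3, 10))
print(matches)   # True
```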
Finding the first derivative and setting equal to zero gives the optimal bet size:
dEVP1/dB = (4 - B^2)/[(1 + B)^2(4 + B)^2] = 0
B = 2 or B = -2
Only the positive root is a legal bet, so B* = 2
The maximum EV is obtained by betting 2, which was the size of the pot all along.
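A quick grid search agrees with the calculus. The sketch below takes the EVP1 formula above as given and scans bet sizes from 0.01 to 10.00 in steps of 0.01; the grid range and step are arbitrary choices for illustration.

```python
from fractions import Fraction as F

def ev_p1(B):
    return B / ((1 + B) * (4 + B))

def d_numerator(B):
    # Sign of dEVP1/dB; the denominator (1+B)^2 (4+B)^2 is always positive.
    return 4 - B * B

candidates = [F(k, 100) for k in range(1, 1001)]   # B = 0.01 ... 10.00
best = max(candidates, key=ev_p1)
print(best, ev_p1(best))   # 2 1/9
```

The derivative's numerator is positive below B = 2 and negative above it, which is why the critical point is a maximum and not a minimum.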
Here is a graph of the EV function, showing a clear maximum at B = 2 and decreasing for B > 2:
Some important notes on this model:
Player1 has a unique optimal strategy, the one shown here. Any deviation by Player1 is a mistake and leads to less EV.
Player2 has multiple co-optimal strategies, in addition to the one shown here.
Player2 may in fact call with any hand 1/9 < y < 7/9, provided she does so with a mixed probability of (b − c)/(b − a), which is 1/3 for these values.
She does not lose EV by doing so, and holds this strategy in reserve, as it stays in equilibrium with Player1's optimal strategy.
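The mixing probability is chosen so that Player2 folds exactly as often as before. The sketch below checks only that frequency and the resulting EV of a bluff, not the full equilibrium claim. It assumes that in the mixed version she still folds every hand below a and calls every hand above b.

```python
from fractions import Fraction as F

P, B = 2, 2
a, b, c = F(1, 9), F(7, 9), F(5, 9)

q = (b - c) / (b - a)                  # calling probability for a < y < b
fold_pure = c                          # pure strategy: fold everything below c
fold_mixed = a + (1 - q) * (b - a)     # fold below a, fold (1 - q) of the middle

# EV of a bluff with a hand below a: win the pot on a fold, lose the bet on a call.
bluff_ev_pure = P * fold_pure - B * (1 - fold_pure)
bluff_ev_mixed = P * fold_mixed - B * (1 - fold_mixed)
print(q, fold_pure, fold_mixed)        # 1/3 5/9 5/9
```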
Now, to proceed and try to solve the model for various scenarios involving exposed hole cards, we need some Sklansky rules for GTO cheating.
There are a couple of scenarios.
Version 2: Player1 secretly sees the hand of Player2, and acts accordingly to maximize his EV.
Version 3: Player2 secretly notices that Player1 has seen her hand and acts accordingly to minimize his EV.
What do you think the answers will look like?
Last edited by robert_utk; 08-31-2018 at 01:28 AM.