A three-player game with differently-valued equilibria - Poker Theory

Two Plus Two Forums Poker Strategy Poker Theory & GTO

A three-player game with differently-valued equilibria


01-24-2013 , 04:26 PM

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

This came up in another thread which shouldn't be derailed further...

Consider the following game: Three players, A, B, C. Everybody antes $2 and picks a number, 0 or 1. Payouts are as follows: If everybody picks 0, then A gets $1, B gets $2, and C gets $3. If everybody picks 1, then A gets $3, B gets $1, and C gets $2. If not all votes are the same, then the two players in the majority split the pot. So if A picks 0 while B and C pick 1, then B and C get $3 each.


01-25-2013 , 06:27 AM

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

So making it more explicit, we have the following results:
(action of A, action of B, action of C) -> (payout A, payout B, payout C)

(0,0,0) -> (-1,0,1)
(0,0,1) -> (1,1,-2)
(0,1,0) -> (1,-2,1)
(0,1,1) -> (-2,1,1)
(1,0,0) -> (-2,1,1)
(1,0,1) -> (1,-2,1)
(1,1,0) -> (1,1,-2)
(1,1,1) -> (1,-1,0)

We see that there are two strategy triples - (0,0,0) and (1,1,1) - where no player has an incentive to change their choice (assuming the other two don't change), so these are equilibrium solutions. We also see that the payouts in these two cases are different.
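The table and the equilibrium claim above can be checked by brute force. The following is a minimal sketch, assuming the rules exactly as stated in the first post; the names `net_payoffs` and `is_pure_equilibrium` are made up for this illustration.

```python
from itertools import product

ANTE = 2
# Gross payouts to (A, B, C) when all three picks agree.
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the $2 ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        # Two players agree; they split the $6 pot, the third gets nothing.
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

def is_pure_equilibrium(profile):
    """True if no single player gains by flipping only their own pick."""
    base = net_payoffs(profile)
    for i in range(3):
        flipped = list(profile)
        flipped[i] = 1 - flipped[i]
        if net_payoffs(tuple(flipped))[i] > base[i]:
            return False
    return True

for profile in product((0, 1), repeat=3):
    print(profile, "->", net_payoffs(profile))

pure_equilibria = [p for p in product((0, 1), repeat=3) if is_pure_equilibrium(p)]
print("pure equilibria:", pure_equilibria)
```

This only searches pure strategies, so it says nothing about mixed equilibria.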

If we know what our two opponents do, we can exploit them - if they both pick the same number, we pick it as well; if they pick different numbers it doesn't matter what we do. But if we do not know how they play, what should we do? I don't see any answer to this question, so maybe it is the wrong question.

This may be an example that shows the limited value of GTO play in multiplayer games.


01-25-2013 , 11:54 AM

Paul Valente

banned

Join Date: Mar 2009 Posts: 4,925

Just thinking out loud.

So if all players were to just pick 1 or 0 randomly, player A would break-even, player B would lose 1/8th every trial, and player C would win 1/8th every trial. This may give incentive for player C to pick randomly?

If players B and C pick randomly, player A does best by choosing 1. If player B knows player A is going to choose 1, he would also pick 1. If player C knows this he obv will also pick 1.

If players A and C pick randomly, Player B does best choosing 0. If player A knows this, he does best by also taking 0 in which case player C would also take 0.

And, if A and B pick randomly, player C does best with 0. If player A knows this he would also do better by taking 0, in which case player B would have to also take 0.

This seems to suggest to me that 0,0,0 would be the equilibrium, since 2 of the 3 players are incentivised to choose 0 over 1 when the other 2 players are choosing randomly.
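The averages quoted in this post can be reproduced with exact fractions. This is a sketch under the assumption that "randomly" means an independent fair coin flip for each player; `expected_vs_random` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

ANTE = 2
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}  # gross payouts to (A, B, C)

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

def expected_vs_random(player, pick=None):
    """Average net payoff of `player` (0=A, 1=B, 2=C) when every pick that
    is not fixed is a fair coin flip. If `pick` is given, the player's own
    choice is fixed to it and only the other two randomize."""
    profiles = [p for p in product((0, 1), repeat=3)
                if pick is None or p[player] == pick]
    return Fraction(sum(net_payoffs(p)[player] for p in profiles), len(profiles))

# Everyone flips a coin.
all_random = [expected_vs_random(i) for i in range(3)]
print("all random:", all_random)

# One player fixes a pick, the other two flip coins: (EV of 0, EV of 1).
fixed_pick = {name: (expected_vs_random(i, 0), expected_vs_random(i, 1))
              for i, name in enumerate("ABC")}
print(fixed_pick)
```

These numbers describe best responses to coin-flipping opponents only; whether opponents would actually play that way is a separate question.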


01-25-2013 , 02:37 PM

pasita

adept

Join Date: Mar 2007 Posts: 1,101

Quote:

Originally Posted by Cangurino

We see that there are two strategy triples - (0,0,0) and (1,1,1) - where no player has an incentive to change their choice (assuming the other two don't change), so these are equilibrium solutions.
...

This may be an example that shows the limited value of GTO play in multiplayer games.

I'd still like to see your comment on the following, I'll restate so they're in this thread too.

-I believe those are equilibrium states (must say I'm not sure on my terminology), but what is the equilibrium strategy players take to reach those states? "Pick the same as majority did last time" works, but not on the first game, and not if the game has no memory.
And as Paul Valente also said, two players like 0 better. So I'm starting to believe the equilibrium strategy is "pick 0" for all players. The (1,1,1), while an equilibrium state, is never reached.

-Can you find a poker-like example, such as where the absolute hand values aren't decided by voting?

-In the 3-handed 1CP, with no colluding, I'm under the impression that once BTN picks his strategy, which is an equilibrium strategy and can thus be announced, that immediately defines the equilibrium strategies (and e_k, where this all started) for SB and BB also. Do I get this right? (If yes, can colluding without communication, i.e. a shared strategy with shared bankroll, change things in a game where there's no more betting? Colluding with communication certainly can -if SB had K, he can tell BB not to call with Q, as they beat everything but A anyway.)


01-25-2013 , 05:42 PM

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

Pasita,

in my opinion you use a lot of assumptions which hold in two-player games, but not in this game. You should say what you mean by an equilibrium strategy. I'm also not sure what you mean by equilibrium state.

For me, in a two-player game an equilibrium consists of a pair of strategies, one for each player. Since one player can't improve by changing his strategy, this means that the strategies which make up the equilibrium can't be exploited. Here, equilibria are triples of strategies. And while it is still true that one player alone can't improve by deviating from the equilibrium, two players can conspire to cause harm to the third one; this is what I mean by collusion. So, playing an equilibrium strategy does not prevent exploitation when there are more than two players.


01-25-2013 , 07:10 PM

pasita

adept

Join Date: Mar 2007 Posts: 1,101

I agree with your definition of equilibrium strategy and equilibrium. By equilibrium state I mean that the previous game was played (0,0,0) or (1,1,1), and from this state onward, no player should change their vote for the next game. But how do we get to (1,1,1)?

What is the actual strategy (in your game, where 0 or 1 is not inherently better, only depending on the payout structure) for each player that leads to equilibrium? A strategy can't be (0,0,0) or (1,1,1), it has to be (0) or (1) for each player. "Pick the same as others" will not work. "Pick the majority from last time" neither at the first trial, and never in a memoryless game. "Pick random"?

(BTW I still agree with PV... since 2 of the players prefer 0, each player's equilibrium strategy is to pick 0, and while there is not one equilibrium state, there's only one equilibrium strategy for each player, leading to (0,0,0). If everyone for some reason picked 1, that would be another equilibrium state where no-one can deviate... but I don't see a strategy that leads to that state. Except maybe if P, C and p were playing the game, y walked by and tossed a $100 bill in the middle saying "anyone who picks 1 gets to split that"... kind of feeding extra energy in the system, throwing it off its current equilibrium to another one.)

Quote:

So, playing an equilibrium strategy does not prevent exploitation when there are more than two players.

I fully understand. All along I've been talking about a fair game, where the only possible exploitation is by schooling or some other collaboration method that's not against the rules.


01-25-2013 , 10:07 PM

yaqh

Pooh-Bah

Join Date: Aug 2007 Posts: 5,745

Quote:

Originally Posted by Cangurino

As long as we're being clear about vocabulary, what does the bolded mean? When I say "exploiting", I mean "playing so that our expectation is higher than it would be if we played an equilibrium strategy". Of course, that sentence would be nonsense with this definition. (Nor is my definition good anyway in 3+ player games where the equilibria can have different values.)


01-26-2013 , 03:10 AM

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

Quote:

Originally Posted by yaqh

What I meant is the following: In a two-player game, if we play a strategy that is part of an equilibrium (a GTO strategy or equilibrium strategy), then no matter what the opponent does we can't do worse than in the equilibrium. In a multiplayer game this is not the case. While no single opponent can deviate and gain, my result can go down if more than one of them deviates simultaneously.

(It is actually possible that one player's deviation hurts me without helping him, but that's not the case in this specific game.)
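To make the point about simultaneous deviations concrete, the sketch below flips two players' picks at once, starting from each pure equilibrium of the 0/1 game, and reports how every payoff changes. The helper name `joint_deviations` is an assumption of this illustration.

```python
from itertools import combinations

ANTE = 2
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}  # gross payouts to (A, B, C)

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

def joint_deviations(equilibrium):
    """For each pair of players, flip both of their picks and return the
    change in net payoff for (A, B, C) relative to the equilibrium."""
    base = net_payoffs(equilibrium)
    changes = {}
    for pair in combinations(range(3), 2):
        profile = tuple(1 - pick if i in pair else pick
                        for i, pick in enumerate(equilibrium))
        after = net_payoffs(profile)
        changes[pair] = tuple(a - b for a, b in zip(after, base))
    return changes

for eq in [(0, 0, 0), (1, 1, 1)]:
    for pair, delta in joint_deviations(eq).items():
        names = "".join("ABC"[i] for i in pair)
        print(eq, "deviating pair", names, "payoff changes (A, B, C):", delta)
```

In every case the player who stays put ends up worse off than in the equilibrium, even though that player did not change anything.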


01-26-2013 , 06:33 AM

pasita

adept

Join Date: Mar 2007 Posts: 1,101

What is required for

Quote:

In a multiplayer game this is not the case.

to be true? The existence of multiple equilibria with different payouts? If so, is there a property of a game that is required for multiple equilibria with different payouts? (The "guess what others are doing" seems like a good candidate.)

Quote:

(It is actually possible that one player's deviation hurts me without helping him, but that's not the case in this specific game.)

Depends on whether you're A, B or C, imo.


01-26-2013 , 06:40 AM

#10

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

No, even with a single equilibrium you have this problem in multiplayer games.


01-26-2013 , 07:38 AM

#11

pasita

adept

Join Date: Mar 2007 Posts: 1,101

Ok, maybe I'm missing something... but if 2 players deviate from equilibrium, to a strategy that is not an equilibrium strategy, wouldn't either of them have the possibility and incentive to single handedly switch back to equilibrium?

Or: "3 players ante up. Each one secretly and simultaneously votes a number between 2 and 13. The one who votes the biggest number wins. If several players vote the same number, the pot is split between those players".

I don't think just "multiplayer" is enough to cause the situation you describe.


01-26-2013 , 03:01 PM

#12

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

Quote:

Originally Posted by pasita

Let us consider the equilibrium (0,0,0) in our game (so A, B, and C all play 0). If B and C deviate, we get the strategy triple (0,1,1). Here, B and C split the pot. In this new situation, B can go back to playing 0, but this does not restore the equilibrium (C still plays 1), and he has no incentive to do so (in both (0,1,1) and (0,0,1), B's result is the same).

Quote:

Or: "3 players ante up. Each one secretly and simultaneously votes a number between 2 and 13. The one who votes the biggest number wins. If several players vote the same number, the pot is split between those players".

Here, we have a single equilibrium where all players play 13. In this case playing 13 is really unexploitable.
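The claim about the highest-number game can be checked by enumeration. This is a sketch, assuming an ante of 1 unit per player (the post does not give an ante size) and reading "the biggest number wins" as the top vote taking the whole pot, split on ties.

```python
from fractions import Fraction
from itertools import product

NUMBERS = range(2, 14)  # votes 2..13 inclusive
ANTE = 1                # assumed; not specified in the post

def net_payoff(me, others):
    """Net result for a player voting `me` against the two votes in `others`."""
    top = max(me, *others)
    if me < top:
        return Fraction(-ANTE)
    winners = 1 + sum(1 for vote in others if vote == top)
    return Fraction(3 * ANTE, winners) - ANTE

opponent_votes = list(product(NUMBERS, repeat=2))

# 13 is never worse than any other vote, whatever the other two do.
thirteen_is_dominant = all(
    net_payoff(13, others) >= net_payoff(vote, others)
    for others in opponent_votes
    for vote in NUMBERS
)

# Worst case for a player who always votes 13.
worst_case = min(net_payoff(13, others) for others in opponent_votes)

print("13 weakly dominant:", thirteen_is_dominant)
print("worst case for 13:", worst_case)
```

Because voting 13 guarantees at least a break-even result against any pair of opposing votes, no coalition of the other two can push that player below the equilibrium value here, unlike in the 0/1 game.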

Quote:

I don't think just "multiplayer" is enough to cause the situation you describe.

I should have said: In multiplayer games this does not necessarily hold. Of course in specific cases it might.


01-26-2013 , 04:58 PM

#13

pasita

adept

Join Date: Mar 2007 Posts: 1,101

Quote:

B going back to 0 doesn't restore equilibrium, but now makes C hurt... giving C a reason to return to equilibrium. Imo, B's and C's incentive to return to 0 comes from the fact that if they stick to 1, A can now pick 1 too and B/C hurt... A is happy to take the beat for one round if he can get to (1,1,1) for the remainder of the game. Maybe future value discounting should be used. I haven't done the math so you may very well still be right and there's something that eludes my logic.

BTW, I'd still like to know what you think is the best strategy for each player to play this game. I'm saying it's "0" for each on the first round, and if the game continues and has memory, "pick majority vote from last round".

Quote:

Here, we have a single equilibrium where all players play 13. In this case playing 13 is really unexploitable.
...
I should have said: In multiplayer games this does not necessarily hold. Of course in specific cases it might.

OK... what if instead of voting, everyone gets dealt a card from a 13 card deck. All make secret vote of "Play" or "Pass". Those who "play" put p in the pot, those who "pass" forfeit any interest in the pot and put no more money in. Among those who chose "play", the highest card picks up the pot.

What I'm trying to find out is if there's a property of a game that forces it to exhibit the behaviour you described. "Multiplayer" wasn't. I believe "betting" isn't. Sequential betting? Raises allowed? Draws? Multiple streets? Some combination of those?

Or is it the non-fixed nature of nuts that your 000/111 game has... ie. people are betting without knowing whether they hold the absolute nuts or absolute nut low?


01-26-2013 , 05:13 PM

#14

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

You make things more complicated than necessary... Repeating the game should have no influence on "optimal" play; it only makes a difference if you play exploitatively and hence take into account what villains have done before. I would say that there is no such thing as optimal play here.


01-26-2013 , 05:43 PM

#15

pasita

adept

Join Date: Mar 2007 Posts: 1,101

The one-shot prisoner's dilemma has a different solution from one that's played indefinitely (or with a probability p of ending the game after each round). Or if your comment wasn't a general one, but meant your game specifically: if A has played your game for 10 rounds, always picking 0, and the others stick to 1, A should be able to figure out from history that he should switch to the equilibrium (NOT exploiting anything, but observing which equilibrium is reachable) game state the others are so generously offering him.

I still think, in one-shot-Cangurino, 0 is the equilibrium strategy for each player (because 2 of them prefer 0... things might be different if the payouts were 1,2,3 and 3,2,1 for the 000 and 111 equilibria), and 000 is the equilibrium that follows.

Last edited by pasita; 01-26-2013 at 05:44 PM. Reason: typo


01-26-2013 , 05:50 PM

#16

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

I won't repeat it again: In the one-shot, both (0,0,0) and (1,1,1) are equilibria.


01-26-2013 , 06:09 PM

#17

pasita

adept

Join Date: Mar 2007 Posts: 1,101

I've never disagreed with that. I'm saying there's no sane (for each player individually, without explicit colluding) equilibrium strategy that leads to the equilibrium (1,1,1).
(1,1,1) is not a strategy choice anyone can take, it's an equilibrium that was reached from the selected strategies. Are we agreeing on that?

Everyone picks 0 or 1 (or a mix) as their strategy. B and C strictly prefer 0 over 1, and they know the other one prefers it too. A knows all that, so he's forced to 0 also.

If you were A, what would your strategy be?


01-26-2013 , 06:19 PM

#18

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

I do not agree with the whole concept of a strategy leading to an equilibrium. A strategy is a choice one player makes. A strategy set assigns a strategy to each player. An equilibrium is a strategy set satisfying certain constraints. There is no process leading to an equilibrium. When the players choose, either they are at an equilibrium, or they are not.

What you could do is the following: Start with an arbitrary strategy set. Then iteratively let each player try and exploit the others (assuming they repeat their last choice). This may or may not lead to an equilibrium. (You could get a periodic sequence instead of a fixed point.) In our case both equilibria can be reached in this manner.
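The iterative procedure described here can be sketched as follows. The update rule is an assumption of this sketch, since the post leaves the details open: players move one at a time in the order A, B, C, and a player switches only when the switch strictly improves their own result against the others' current picks.

```python
from itertools import product

ANTE = 2
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}  # gross payouts to (A, B, C)

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

def best_response_dynamics(start, max_rounds=20):
    """Let A, B, C take turns switching whenever that strictly helps them.
    Returns the fixed point reached, or None if none is found in time."""
    profile = tuple(start)
    for _ in range(max_rounds):
        changed = False
        for i in range(3):
            flipped = profile[:i] + (1 - profile[i],) + profile[i + 1:]
            if net_payoffs(flipped)[i] > net_payoffs(profile)[i]:
                profile, changed = flipped, True
        if not changed:
            return profile
    return None  # e.g. a periodic sequence instead of a fixed point

endpoints = {start: best_response_dynamics(start)
             for start in product((0, 1), repeat=3)}
for start, end in endpoints.items():
    print(start, "->", end)
```

Under this particular update rule every starting point settles on the unanimous profile matching its initial majority; other orders or tie-breaking rules could behave differently.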

No player has any strict preferences in this game. If B and C play 1, so should A. If they both play 0, so should A.

As player A I wouldn't know what to do.


01-26-2013 , 07:32 PM

#19

pasita

adept

Join Date: Mar 2007 Posts: 1,101

Quote:

A strategy is a choice one player makes.

Agreed.

Quote:

A strategy set assigns a strategy to each player

I'd rather say "once all players have picked a strategy, it defines a strategy set". But as long as you don't imply causality in the other direction, fine.

Quote:

There is no process leading to an equilibrium

No process, per se, but the requirement that all players picked the same equilibrium strategy, and at least usually, without communicating.

Quote:

If B and C play 1, so should A. If they both play 0, so should A.

Agreed, but....

Quote:

No player has any strict preferences in this game.

Why would B or C want to pick anything else than 0? In what situation would they gain, over picking 0? That would be only when everyone else picked 1 too.

Payout matrix for A's strategy choices of 0 or 1 (columns show what the other players, B and C, choose):

others (B,C):  0,0   0,1   1,0   1,1
A picks 0:      1     3     3     0
A picks 1:      0     3     3     3

So A would prefer 1 if the others choose randomly.

For B

others (A,C):  0,0   0,1   1,0   1,1
B picks 0:      2     3     3     0
B picks 1:      0     3     3     1

B prefers 0 when others choose randomly and when other players know how to maximise.

For C

others (A,B):  0,0   0,1   1,0   1,1
C picks 0:      3     3     3     0
C picks 1:      0     3     3     2

C prefers 0 when others choose randomly and when other players know how to maximise.

Since B and C are likely (imo) to pick 0, that forces A to pick 0 also.
In any case, if picking the wrong "alone" vote, the payout is 0. So it doesn't matter which wrong vote you pick, but it matters which right vote (which equilibrium strategy, when the others happen to pick the same) you pick.


01-26-2013 , 11:51 PM

#20

ccpb

newbie

Join Date: Dec 2011 Posts: 42

Interesting game, Cangurino. It looks like I have some reading to do.


01-27-2013 , 02:51 AM

#21

Cangurino

Carpal \'Tunnel

Join Date: Apr 2008 Posts: 13,476

I read "strict preference" as "dominating", meaning that no matter what the others do, 1 is always better than (or at least as good as) 0. I see you mean something else. My point was that there are no dominating strategies for any of the players.

I will compute the outcomes for mixed strategies.


01-27-2013 , 03:51 AM

#22

ccpb

newbie

Join Date: Dec 2011 Posts: 42

Gambit finds 3 equilibria. The 2 already identified and the following mixed strategy:

A: 0 (25%), 1 (75%)
B: 0 (67%), 1 (33%)
C: 0 (60%), 1 (40%)
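The mixed profile reported above can be checked by hand: in a mixed equilibrium each player must be indifferent between 0 and 1 given the other two players' mixes. The sketch below does that with exact fractions, assuming the rounded 67% stands for exactly 2/3; `expected_payoff` is a hypothetical helper, not part of Gambit.

```python
from fractions import Fraction
from itertools import product

ANTE = 2
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}  # gross payouts to (A, B, C)

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

# Probability that A, B, C pick 0. The 2/3 for B is an assumed exact value.
P_ZERO = (Fraction(1, 4), Fraction(2, 3), Fraction(3, 5))

def expected_payoff(player, pick, p_zero=P_ZERO):
    """Expected net payoff of `player` for a fixed pick while the other two
    players mix independently according to `p_zero`."""
    others = [i for i in range(3) if i != player]
    total = Fraction(0)
    for picks in product((0, 1), repeat=2):
        profile = [pick] * 3
        prob = Fraction(1)
        for i, other_pick in zip(others, picks):
            profile[i] = other_pick
            prob *= p_zero[i] if other_pick == 0 else 1 - p_zero[i]
        total += prob * net_payoffs(tuple(profile))[player]
    return total

# (EV of picking 0, EV of picking 1) for A, B, C.
values = [(expected_payoff(i, 0), expected_payoff(i, 1)) for i in range(3)]
for name, (ev0, ev1) in zip("ABC", values):
    print(name, "EV(0) =", ev0, "EV(1) =", ev1)
```

Both picks give each player the same expectation, so nobody gains by deviating alone, and the three equilibrium values differ again from those of (0,0,0) and (1,1,1).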


01-27-2013 , 07:31 AM

#23

pasita

adept

Join Date: Mar 2007 Posts: 1,101

Ahhh, had forgotten all about Gambit.

Cangurino, I meant that of the 2 (pure strategy) equilibria, B&C would rather pick 000. "Strictly prefer" meant that they both gain "more" at 000, not "as much or more". Sorry if my choice of words was confusing.

I wonder if we're arguing over nothing.
In this game (I'm ignoring the mixed equilibrium for a while), A has two equilibrium strategies with different payouts, but he can't choose the one he'd prefer, as the others will choose a strategy pointing to another equilibrium. Is that what you were saying?

I've been saying that with backwards deduction, all players will choose 0, and without explicit collusion, no-one will change that choice if the game is repeated. A would jump in joy to 1 if B or C were to do that. Even if B&C do it together, A will take a one-time hit and profit on the next rounds onward, hurting both B and C. Even if B&C randomly (but together) jump between 0 and 1, A would benefit by randomly picking either choice (compared to the 000 state of things).


04-23-2014 , 02:56 AM

#24

tobakudan

grinder

Join Date: Feb 2014 Posts: 476

Quote:

Originally Posted by pasita

So A would prefer 1 if the others choose randomly.

This is only a valid deduction when it is known that in fact the other players will choose randomly. Since this is not generally the case, it cannot be said that A generally prefers 1 (and similarly for B and C preferring 0). Thus, your solution does not work out mathematically.
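The dependence on beliefs can be illustrated numerically. In this sketch `belief_zero[i]` is the probability a player assigns to player i picking 0 (the player's own entry is ignored); the function names and the example beliefs are assumptions chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

ANTE = 2
UNANIMOUS = {0: (1, 2, 3), 1: (3, 1, 2)}  # gross payouts to (A, B, C)

def net_payoffs(profile):
    """Net result for (A, B, C) after subtracting the ante."""
    if len(set(profile)) == 1:
        gross = UNANIMOUS[profile[0]]
    else:
        majority = 1 if sum(profile) >= 2 else 0
        gross = tuple(3 if pick == majority else 0 for pick in profile)
    return tuple(g - ANTE for g in gross)

def expected_payoff(player, pick, belief_zero):
    """Expected net payoff of `player` for a fixed pick, given beliefs about
    how often each other player independently picks 0."""
    others = [i for i in range(3) if i != player]
    total = Fraction(0)
    for picks in product((0, 1), repeat=2):
        profile = [pick] * 3
        prob = Fraction(1)
        for i, other_pick in zip(others, picks):
            profile[i] = other_pick
            prob *= belief_zero[i] if other_pick == 0 else 1 - belief_zero[i]
        total += prob * net_payoffs(tuple(profile))[player]
    return total

def best_pick(player, belief_zero):
    """Pick with the higher expectation under the beliefs, or None on a tie."""
    ev0 = expected_payoff(player, 0, belief_zero)
    ev1 = expected_payoff(player, 1, belief_zero)
    if ev0 == ev1:
        return None
    return 0 if ev0 > ev1 else 1

coin_flip = (Fraction(1, 2),) * 3
mostly_zero = (Fraction(9, 10),) * 3
mostly_one = (Fraction(1, 10),) * 3

print("A vs coin flips:  ", best_pick(0, coin_flip))
print("A vs mostly zero: ", best_pick(0, mostly_zero))
print("B vs coin flips:  ", best_pick(1, coin_flip))
print("B vs mostly one:  ", best_pick(1, mostly_one))
```

The same player's best pick flips as the assumed beliefs change, which is why a preference computed against coin-flipping opponents does not carry over to opponents in general.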

That said, I agree that if human beings played this game (just once, with strangers), in the vast majority of cases all players would choose 0 (due to reasoning similar to what you describe).
