Quote:
Originally Posted by Clicken
Could you clarify this please?
I run a regret minimization algorithm. In a single run of the algorithm, each player's strategy converges to something after many iterations. I am talking about what the strategy converges to, not what is played on any individual iteration.
In general, it could converge to some mixed or pure strategy for each player. In this case, in each of 1000 runs, the algorithm converged to each player playing a pure strategy.
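To make "a run" and "what it converges to" concrete, here is a minimal sketch of one regret minimization scheme (regret matching). It is not necessarily the exact variant used for the runs described here. The function names, the `payoff(player, profile)` signature, and the two-player coordination game are all hypothetical stand-ins for illustration, and the game from this thread is not reproduced. The sketch reports each player's average strategy over the run, which is one common way to read off what the algorithm settles on.

```python
import random

def regret_matching_strategy(cum_regret):
    """Mix over actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [p / total for p in positive]

def run_once(payoff, n_players, n_actions, iterations, seed):
    """One run; returns each player's average strategy over all iterations.

    payoff(player, profile) is a caller-supplied (hypothetical) function giving
    that player's payoff when the players choose the actions in `profile`.
    """
    rng = random.Random(seed)
    cum_regret = [[0.0] * n_actions for _ in range(n_players)]
    cum_strategy = [[0.0] * n_actions for _ in range(n_players)]
    for _ in range(iterations):
        # Each player's strategy for this iteration comes from its own regrets.
        strategies = [regret_matching_strategy(r) for r in cum_regret]
        profile = [rng.choices(range(n_actions), weights=s)[0] for s in strategies]
        for p in range(n_players):
            actual = payoff(p, profile)
            for a in range(n_actions):
                # What would player p have earned by playing `a` instead,
                # with everyone else's action held fixed?
                alt = list(profile)
                alt[p] = a
                cum_regret[p][a] += payoff(p, alt) - actual
                cum_strategy[p][a] += strategies[p][a]
    return [[x / iterations for x in row] for row in cum_strategy]

def coordination_payoff(player, profile):
    """Toy game (an assumption, not the thread's game): 1 if both match, else 0."""
    return 1.0 if profile[0] == profile[1] else 0.0

if __name__ == "__main__":
    for seed in range(3):
        print(seed, run_once(coordination_payoff, 2, 2, 5000, seed))
```

Running this with several seeds is the analogue of the repeated runs: each seed is one independent run, and different runs can settle on different outcomes.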
Quote:
Are you saying that for any sequence of trials, every 1000+N trial had the same player choosing "3" and getting a -2 payoff?
After some large number of iterations, one of the players is picking "3" and getting a -2 payoff, and that never changes afterward. I tried this 1000 times and got the same outcome each time, except that which player ends up picking "3" varies randomly from run to run.
For example, the results might look like this:
Run 1
Player 1 converges to picking 3 always
Player 2 converges to picking 1 always
Player 3 converges to picking 2 always
Run 2
Player 1 converges to picking 1 always
Player 2 converges to picking 3 always
Player 3 converges to picking 2 always
Run 3
Player 1 converges to picking 3 always
Player 2 converges to picking 2 always
Player 3 converges to picking 1 always
etc.
Quote:
If so, it seems odd to me that the best that player could do is get locked into a negative pay-off for all eternity.
Regret minimization approaches "the best you could do" in hindsight against the strategy your opponents actually played. But you might do better by choosing a strategy that coerces your opponents into changing their strategy.
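As a rough illustration of what "in hindsight" means, the sketch below computes a player's regret over a recorded history: the payoff of the best single fixed action, with the opponents' recorded actions held exactly as they were, minus the payoff actually earned. The function names and the toy coordination game are hypothetical, chosen only for illustration, and are not the game from this thread.

```python
def hindsight_regret(payoff, player, history, n_actions):
    """Best fixed action's total payoff minus the total actually earned.

    `history` is a list of action profiles (tuples, one action per player).
    Opponents' actions are held fixed at what was recorded; nothing here
    models opponents reacting to a different choice.
    """
    actual = sum(payoff(player, profile) for profile in history)
    best_fixed = max(
        sum(
            payoff(player, profile[:player] + (a,) + profile[player + 1:])
            for profile in history
        )
        for a in range(n_actions)
    )
    return best_fixed - actual

def toy_payoff(player, profile):
    """Toy two-player game (an assumption): 1 if both actions match, else 0."""
    return 1.0 if profile[0] == profile[1] else 0.0

if __name__ == "__main__":
    # Player 0 mismatched twice, then matched once; always playing 1 would
    # have matched all three times against the same opponent actions.
    history = [(0, 1), (0, 1), (1, 1)]
    print(hindsight_regret(toy_payoff, 0, history, n_actions=2))
```

A regret of zero only says no fixed action would have done better against that same recorded opponent play. It says nothing about what would have happened had the opponents responded differently, which is the gap described above.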
Last edited by bobf; 10-13-2014 at 10:46 AM.