Quote:
Originally Posted by TeelXp
In the solver you could choose variations of Monte Carlo counterfactual regret minimization algorithm:
That's like saying that both Pluribus and GTO solvers run on PCs, therefore they are essentially the same and Pluribus is nothing special, when in fact they use vastly different methods. Solvers try to find Nash equilibria in two-player situations, where the minimax theorem guarantees that such an equilibrium exists and is unexploitable. In multiway pots a Nash equilibrium technically still exists, but computing one is intractable, and playing your own piece of an equilibrium guarantees nothing unless all the other players "agree" to play toward the same equilibrium and cooperate in doing so. So no, Pluribus is not just a "better" version of a solver.
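To make the two-player point concrete, here is a toy sketch (my own illustration, not anything from the Pluribus paper): in a 2x2 zero-sum game with a fully mixed equilibrium, the equilibrium mix falls straight out of the indifference condition, which is exactly the kind of computation a solver scales up. Matching pennies is used as the example game.

```python
# Matching pennies: the row player wins 1 if the choices match, loses 1 otherwise.
# A[i][j] = payoff to the row player; zero-sum, so the column player gets -A[i][j].
A = [[1, -1],
     [-1, 1]]

def row_equilibrium_mix(A):
    """Probability p of the row player's first action in a 2x2 zero-sum game
    with a fully mixed equilibrium. From the indifference condition: at
    equilibrium the column player must be indifferent between her columns,
    i.e. p*a + (1-p)*c == p*b + (1-p)*d, solved for p."""
    a, b = A[0][0], A[0][1]
    c, d = A[1][0], A[1][1]
    return (d - c) / (a - b - c + d)

p = row_equilibrium_mix(A)
print(p)  # 0.5 -- mix 50/50 and no opponent strategy can exploit you
```

That unexploitability guarantee is what breaks down multiway: your slice of some equilibrium is only safe if everyone else happens to be playing the matching slices of the same equilibrium.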
Quote:
Originally Posted by fishfood69er
No, it has a baseline strategy. It uses self-play, but its real-time play is based off that strategy:
"The core of Pluribus’s strategy was computed via self play, in which the AI plays against copies of itself, without any data of human or prior AI play used as input. The AI starts from scratch by playing randomly, and gradually improves as it determines which actions, and which probability distribution over those actions, lead to better outcomes against earlier versions of its strategy. (...)
Pluribus’s self play produces a strategy for the entire game offline, which we refer to as the blueprint strategy. Then during actual play against opponents, Pluribus improves upon the blueprint strategy by searching for a better strategy in real time for the situations it finds itself in during the game. In subsections below, we discuss both of those phases in detail, but first we discuss abstraction, forms of which are used in both phases to make them scalable."
https://science.sciencemag.org/conte...cience.aay2400
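The self-play loop the paper describes (start random, shift probability toward actions that would have done better against earlier versions of yourself) can be sketched with plain regret matching on rock-paper-scissors. This is a minimal illustration of the idea, not Pluribus's actual algorithm, which uses a Monte Carlo CFR variant over an abstracted game tree.

```python
import random

# payoff[a][b] = utility to the player choosing action a vs. opponent action b
# actions: 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def get_strategy(regret_sum):
    """Normalize positive accumulated regrets into action probabilities."""
    positives = [max(r, 0.0) for r in regret_sum]
    total = sum(positives)
    if total > 0:
        return [x / total for x in positives]
    return [1.0 / len(regret_sum)] * len(regret_sum)  # no regrets yet: play randomly

def train(iterations, seed=0):
    rng = random.Random(seed)
    regret_sum = [[0.0] * 3, [0.0] * 3]    # one regret table per self-play seat
    strategy_sum = [[0.0] * 3, [0.0] * 3]  # running average -> approximate equilibrium
    for _ in range(iterations):
        strategies = [get_strategy(regret_sum[p]) for p in (0, 1)]
        actions = [rng.choices(range(3), weights=strategies[p])[0] for p in (0, 1)]
        for p in (0, 1):
            me, opp = actions[p], actions[1 - p]
            for a in range(3):
                # regret = what action a would have earned minus what I actually earned
                regret_sum[p][a] += PAYOFF[a][opp] - PAYOFF[me][opp]
                strategy_sum[p][a] += strategies[p][a]
    # the *average* strategy over all iterations is what converges
    return [[s / iterations for s in strategy_sum[p]] for p in (0, 1)]

avg = train(100_000)
print(avg[0])  # approaches the uniform equilibrium [1/3, 1/3, 1/3]
```

Note that each iteration's strategy can swing wildly; it's the time-averaged strategy that approaches equilibrium, which is the sense in which the "blueprint" is computed offline before any real-time search happens.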