Quote:
Originally Posted by mrno1324
I’m skeptical of their variance reduction techniques
The variance-reduction techniques are sound.
I created the first variance-reduction system for poker, called DIVAT (aka Darse's Ignorant Value Assessment Tool). [see https://webdocs.cs.ualberta.ca/~dars...ivat-icgaj.pdf, or Chapter 5 of my PhD thesis, https://poker.cs.ualberta.ca/publica...lings.phd.pdf] Morgan Kan and I built implementations of DIVAT, and used the tool in developing programs that won the 2007 AAAI computer poker competition. [see Morgan's MSc thesis for details]
Martin Zinkevich proved that DIVAT is a statistically unbiased estimator of the actual EV. The baseline strategy used for comparison of each decision was hand-crafted by me (based on my understanding of a game-theoretic strategy), but the differentials in expected values are unbiased, so it is a completely fair assessment method. [Loose analogy: Measure distances using a crooked stick. As long as you use the same stick, you can tell that distance A is bigger than distance B (and roughly by how much).]
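To make the "crooked stick" point concrete, here is a minimal Python sketch of the general control-variate idea. It is not DIVAT itself: the toy payoff model, the names simulate_hand, baseline_value and estimate, the slope parameter, and every number in it are assumptions made up for illustration. The baseline deliberately misjudges how much luck is worth (2.5 per unit instead of the true 3.0), yet because the correction term has zero mean, subtracting it leaves the average unchanged and only shrinks the noise.

```python
import random
import statistics

TRUE_EDGE = 0.05   # hypothetical skill edge per hand (assumption)
LUCK_WEIGHT = 3.0  # how much card luck really moves the payoff (assumption)


def simulate_hand(rng):
    """Play one toy 'hand' and return (payoff, luck).

    luck is a chance term with known mean 0; the payoff is the true
    edge plus luck plus a little noise the baseline cannot explain.
    """
    luck = rng.gauss(0.0, 1.0)
    noise = rng.gauss(0.0, 0.2)
    payoff = TRUE_EDGE + LUCK_WEIGHT * luck + noise
    return payoff, luck


def baseline_value(luck, slope):
    """Hand-crafted guess at what the luck was worth.

    The guess may be wrong (slope != LUCK_WEIGHT), but its expected
    value is 0 because E[luck] = 0, so subtracting it adds no bias.
    """
    return slope * luck


def estimate(n, slope, seed):
    """Return (raw payoffs, baseline-adjusted payoffs) for n hands."""
    rng = random.Random(seed)
    raw, adjusted = [], []
    for _ in range(n):
        payoff, luck = simulate_hand(rng)
        raw.append(payoff)
        adjusted.append(payoff - baseline_value(luck, slope))
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = estimate(20000, slope=2.5, seed=1)
    print("raw mean / variance:     ",
          statistics.mean(raw), statistics.pvariance(raw))
    print("adjusted mean / variance:",
          statistics.mean(adjusted), statistics.pvariance(adjusted))
```

Both lists estimate the same edge, but the adjusted one should show a far smaller variance, so far fewer hands are needed to tell a real edge from luck. A better baseline (slope closer to 3.0 in this toy model) only tightens the estimate further; a worse one loosens it without biasing it.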
Michael Bowling wanted a fully automated method for variance-reduction, since no one else on our team had the domain knowledge to hand-craft a baseline strategy for other games. All of the subsequent methods (MIVAT, etc.) were weaker than DIVAT in terms of variance reduction, but they had the significant advantage of being purely algorithmic. AIVAT is the latest iteration in that line.
The Carnegie Mellon result for 6-max is laudable, if it is real, but I have learned to be skeptical of all their claims. For years they have been making overinflated claims, without giving proper academic credit to the University of Alberta Computer Poker Research Group (CPRG), which has been years ahead of them in every area of poker research. [I even remember Tuomas making grandiose claims in a talk at AAAI, about an hour before their program was crushed and humiliated (again) in the computer poker competition. For the record: the first poker program to beat top humans in heads-up Limit Hold'em was from the UofA. Then Mike, Neil, and Michael epsilon-solved the game. The first poker program to beat top humans in heads-up No Limit Hold'em was from the UofA. None of the press coverage bought by C-M to advertise their advances has ever mentioned those prior results.]
If their program is better than all human players in 6-max NLH, then I congratulate them (but sorry, I'll need to see some actual proof first). To be honest, I expected that result to come much sooner (like maybe 5 years after I left the research in 2007). [Perhaps the advances would come sooner if the researchers took a few weeks to actually learn the basics of the game... it can really help in gaining key insights. ;-)]