Google's AlphaZero just smashed the strongest chess engine. Is poker next?

12-06-2017 , 08:47 PM
https://www.chess.com/news/view/goog...100-game-match

AlphaZero taught itself to crush the strongest chess engine in just 4 hours. I understand the fundamental difference between chess and poker: one is a game of complete information and the other is not. But AlphaZero's approach to learning and playing games is like nothing we've seen before.

It appears to me that AlphaZero would be able to teach itself not only to approximate the Nash equilibrium in different forms of poker, but also to assess the strength of the opponent and start to employ maximally exploitative strategies that the opponent wouldn't be capable of adjusting to.

I'd sure like to see an AlphaZero vs Libratus match.
12-06-2017 , 08:54 PM
https://www.youtube.com/watch?v=92tn67YDXg0

Last edited by Videopro; 12-07-2017 at 12:27 PM. Reason: removed deleted content
12-06-2017 , 09:10 PM
Quote:
Originally Posted by HITTHEPANDA
That bot has been beaten by thousands of players. I am one of them. It performs some mechanical aspects of the game extremely well, but falls on its face strategically.
12-06-2017 , 10:00 PM
From the article in the OP, it also seems that Google handicapped Stockfish, giving their own machine a much greater advantage.

Why do AI researchers always seem to press their thumb on the scale?
12-06-2017 , 11:02 PM
sss
Quote:
Originally Posted by L1lyR0semary
https://www.chess.com/news/view/goog...100-game-match

It appears to me that AlphaZero would be able to teach itself not only to approximate the Nash equilibrium in different forms of poker, but also to assess the strength of the opponent and start to employ maximally exploitative strategies that the opponent wouldn't be capable of adjusting to.
.
If the opponent knows that the computer is trying to do this, he can double-cross it and, in the case of great players, not only lose less than they would against computers not trying to exploit, but perhaps even be favored. Of course the computer could perhaps be trained to recognize double-crossers, but then what of double double-crossers?
12-06-2017 , 11:13 PM
Call me when an AI can beat 6 max games for a decent winrate. Especially when other players know it is an AI and collude against it. Oh wait, by then we will have AI rights and that will be a bannable offence.
12-07-2017 , 12:03 AM
Play live and it won't matter
12-07-2017 , 12:28 AM
Now that Google's AI built its own AI that outperforms any human-built AI, it won't be long. But by then there will be bigger things to worry about, like the robocalypse.
12-07-2017 , 12:38 AM
Is poker next

lol
12-07-2017 , 02:26 AM
Its 4 hours were more like decades on a personal computer. That was enough time for it to get a superior evaluation of positions (chess in this case). I also read that some Go engines added some things from it to improve themselves, which figures to be the more selective search, though that is not a new idea in any way.

The heads-up poker bot did much the same. It just played and played until its head was not spinning too much anymore, and it had some limited combo evaluations. It ended up at some sort of GTO, perhaps; whether the chess engine can be called "GTO," I don't know.
12-07-2017 , 03:26 AM
Quote:
Originally Posted by David Sklansky
sss

If the opponent knows that the computer is trying to do this, he can double-cross it and, in the case of great players, not only lose less than they would against computers not trying to exploit, but perhaps even be favored. Of course the computer could perhaps be trained to recognize double-crossers, but then what of double double-crossers?
You're probably going to lose more money leaving the door open than you win back by being able to pounce on the way the AI exploits you. And it probably adjusts back instantly. You will lose the leveling and theoretical game vs a superior AI every time as well.
12-07-2017 , 03:51 AM
I think that if DeepMind does decide to focus on poker, the challenge will be an exploitative one. Approximating equilibria is likely a trivial task for DeepMind.

An exploitative challenge is open-ended and involves poorly defined elements like human psychology. It would be extremely interesting to see how a sophisticated neural network would perform at that compared to humans. I have a feeling that it would be somewhat easy to get it better than most humans, but quite difficult to reach a superhuman level.
12-07-2017 , 06:01 AM
Quote:
I think that if DeepMind does decide to focus on poker, the challenge will be an exploitative one. Approximating equilibria is likely a trivial task for DeepMind.
+1 to this, although I think exploitative is not a huge problem when HU.

What's more interesting is balancing between equilibrium and exploitation in a multiplayer setting. Say you have a 6-max 100bb NLHE table with two weak players, three strong players, and the AI. If the AI can outperform the three strong players overall, that would be a very impressive achievement.

It would be difficult to benchmark appropriately, as you would need those 5 humans to play several hundreds of thousands of hands (maybe millions?) for the results to be statistically significant.
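To put a rough number on "several hundreds of thousands of hands", here is a back-of-the-envelope sketch. It assumes results per 100 hands are roughly normal with a standard deviation of about 100 bb/100 (a ballpark figure I am assuming, not something from this thread), and it only asks how many hands it takes for a given edge to stand out from zero. The function name and defaults are made up for illustration.

```python
import math

def hands_needed(edge_bb_per_100, sd_bb_per_100=100.0, z=1.96):
    """Rough number of hands before an edge is distinguishable from zero.

    edge_bb_per_100: true winrate we hope to detect (bb per 100 hands).
    sd_bb_per_100:   assumed standard deviation per 100 hands (ballpark guess).
    z:               z-score for the desired confidence (1.96 ~ 95% two-sided).
    """
    # Standard error after N blocks of 100 hands is sd / sqrt(N).
    # We need edge >= z * sd / sqrt(N), so N >= (z * sd / edge) ** 2.
    blocks = (z * sd_bb_per_100 / edge_bb_per_100) ** 2
    return math.ceil(blocks * 100)

if __name__ == "__main__":
    for edge in (10, 5, 2):
        print(f"{edge} bb/100 edge: about {hands_needed(edge):,} hands")
```

Under these assumed numbers, a 5 bb/100 edge needs on the order of 150k hands and a 2 bb/100 edge needs close to a million, which is in line with the estimate above. Comparing the AI against specific strong players rather than against zero would need more hands still.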
12-07-2017 , 11:08 AM
Quote:
Originally Posted by AltruisticRaven
I think that if DeepMind does decide to focus on poker, the challenge will be an exploitative one. Approximating equilibria is likely a trivial task for DeepMind.

An exploitative challenge is open-ended and involves poorly defined elements like human psychology. It would be extremely interesting to see how a sophisticated neural network would perform at that compared to humans. I have a feeling that it would be somewhat easy to get it better than most humans, but quite difficult to reach a superhuman level.
Humans and robots learn in the same way: Trial and error + evolution.

You make an interaction with the environment. If the interaction is helpful in reaching your goal, you save it in the strategy and pick that move more often; if it fails, you throw it away.
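As a toy illustration of that loop, here is a minimal sketch: each move has a weight, moves that work get picked more often, and moves that fail get pushed down. The function names, the step size, and the small floor (so a move is never removed completely) are all assumptions made for the example, not how any real poker or chess engine does it.

```python
import random

def update(weights, move, succeeded, step=0.5, floor=0.01):
    """Reinforce a move that helped; weaken one that failed."""
    if succeeded:
        weights[move] += step
    else:
        # Keep a tiny floor so the move can still be retried later.
        weights[move] = max(floor, weights[move] - step)

def pick(weights, rng=random):
    """Choose a move with probability proportional to its weight."""
    moves = list(weights)
    return rng.choices(moves, weights=[weights[m] for m in moves], k=1)[0]

if __name__ == "__main__":
    strategy = {"bet": 1.0, "check": 1.0}
    update(strategy, "bet", succeeded=True)     # betting worked, pick it more
    update(strategy, "check", succeeded=False)  # checking failed, pick it less
    print(strategy, pick(strategy))
```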

Humans learn with 7 billion CPUs at the same time. Every time one CPU (a human brain) learns something, it tells the other CPUs through our system of information sharing (language + internet). Each human will use an algorithm to update his mind with the new info. This way the knowledge spreads like a virus.

The way humans learnt to exploit other humans at poker works like this as well. Maybe 10M human brains played poker online. Each time someone learns an exploitative move or population tendency while playing or studying the data of his sessions, he updates his own mind and shares it with some of his friends or students, or posts it on an internet forum.

If a robot wants to get good at exploiting humans, it will also need to play vs humans (or at least use a database of human play). There is no other way. It will need to see the way humans play, then use trial and error to find the best way to exploit, then trial and error to detect the adjustments.

The problem is: (This is the problem for all real-life robot learning).

This learning takes time. The robot can play 100K hands vs itself in one minute, because it can simulate the task on a PC. But it cannot play 100K hands vs humans in one minute. It will take way more time. There is no easy way to speed up this process. This is why robots are crushing Go and chess, and will get superhuman at all PC games and at doing stuff on the internet, long before they get superhuman at tasks that require slow, difficult-to-simulate interaction with humans.

However, if you fed a large DB of hands to the Google AI, or PokerStars were OK with letting loose, say, a couple of hundred Google bots in the games, I think it would get way superhuman at exploiting. Humans aren't that good at complex number processing.

Speaking mostly about HU; I don't know enough about 6-max.

Last edited by icoon; 12-07-2017 at 11:14 AM.
12-07-2017 , 11:17 AM
Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.
12-07-2017 , 12:18 PM
Quote:
Originally Posted by uradoodooface
How about this: chess is nowhere near real warfare or even real-time simulated warfare. Let's see if this thing can beat a real player playing an old-school Warcraft game (not the lame ass MMO), Starcraft, Lord of the Realm in combat mode, etc.


Turn-based games do not mean **** when it comes to warfare or some bot's ability to crush someone in any game. I'll **** a turn-based bot in the mouth cuz it's worthless. Real time or go **** your mother. Turn-based games are for peasants from a country that lives like it's 1017 AD. So this bot is the equivalent of a peasant from the 1000s lol. :O
Didn't an AI crush the top DOTA 2 players?
12-07-2017 , 01:19 PM
Quote:
Originally Posted by Neil S
Didn't an AI crush the top DOTA 2 players?
Not yet, but getting close. However, it's a little redundant, as bots can make a LOT more actions per minute than a human can. So even if the bot's strategy is worse, it can still win with more units.
edit: Depends on how you define top players.
12-07-2017 , 01:32 PM
I wonder what AlphaZero would gain from playing a billion hands vs. the classic HU LHE unexploitable.
12-07-2017 , 04:24 PM
Poker is a joke compared to Go lol. no chance

Maybe BroodWar could give it a run for its money
12-08-2017 , 04:54 AM
Quote:
Originally Posted by the orange crush
Is poker next

lol
Quote:
Originally Posted by omnishakira
Poker is a joke compared to Go lol. no chance
yup

Why'd all the internet wizards go to live games a long time ago?

Why are people getting banned online all the time with regards to 'software'?
12-08-2017 , 07:04 AM
Quote:
Originally Posted by icoon
Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.
Can that be real? Maybe there is a bug? Or too small sample size?
12-08-2017 , 07:24 AM
Quote:
Originally Posted by vesku
Can that be real? Maybe there is a bug? Or too small sample size?
It seems it "adapted," as it states; together with the other comments, that seems to indicate it exploited human psychology/tendencies to get a 60% win rate (whereas random answers would give 50%, though I am not too sure about that).

I do think it is good in poker, so I am not too surprised. The sample size was 1.2M if I saw it right. If all humans played it seriously, I don't think it would hold up as well, but I guess adapting would still work there too. Ask any magic man, and he will say that it likely will work.
12-08-2017 , 08:54 AM
The way it works is pretty simple. Say your last 3 turns you played {rock, paper, rock}. Then they will look in their database for all the times people played {rock,paper,rock} and see what option people pick next most of the time.

When you play more games, they will start using your individual patterns as well.

It's a simple exploitative strategy, and once you know how it works you can trick it and win, but initially almost everyone loses against it.
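The lookup described above can be sketched in a few lines. This is only my guess at the general idea (I do not know how the site actually implements it): remember what came after each run of three moves, predict the most common follow-up, and play whatever beats it. The class name and the random fallback are made up for the example.

```python
import random
from collections import Counter, defaultdict

# For each move, the move that beats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class PatternPredictor:
    def __init__(self, n=3):
        self.n = n                           # how many past moves form a pattern
        self.counts = defaultdict(Counter)   # pattern -> counts of the next move
        self.history = []                    # opponent moves seen so far

    def next_move(self):
        """Play the counter to the opponent's most likely next move."""
        key = tuple(self.history[-self.n:])
        seen = self.counts.get(key)
        if len(self.history) < self.n or not seen:
            return random.choice(list(BEATS))  # no data yet: play randomly
        predicted = seen.most_common(1)[0][0]
        return BEATS[predicted]

    def observe(self, opponent_move):
        """Record what the opponent played after the previous n moves."""
        if len(self.history) >= self.n:
            key = tuple(self.history[-self.n:])
            self.counts[key][opponent_move] += 1
        self.history.append(opponent_move)

if __name__ == "__main__":
    bot = PatternPredictor()
    for move in ["rock", "paper", "scissors"] * 4:  # a very predictable human
        bot.observe(move)
    print(bot.next_move())  # expects "rock" next, so it answers with "paper"
```

A real version would pool counts across all players and then mix in the individual's own history, but the core is just this frequency table. It also shows why the trick works: once you know the table exists, you can feed it a pattern and then break it.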

It's similar in poker. If people can decide to bluff or check in a close spot, they will fail at randomizing it unless they use a tool. If they bluffed the last 3 times (or got a lot of good hands), they will bluff less often the 4th time. If they didn't bluff for a long time, they will do it now with higher probability.

If you are running away from a wild lion and you ran to the left side three times in a row, the animal will pick up on the pattern and follow you quicker to the left. So the optimal strategy is to run to the right the 4th time. (The lion cannot counter-adjust to this because there might be a real reason why you always went left.) This is why humans who randomize always try to "obstruct the previous pattern" rather than be truly random, a result of evolution.

Last edited by icoon; 12-08-2017 at 09:00 AM.
12-08-2017 , 10:28 AM
Quote:
Originally Posted by icoon
The way it works is pretty simple. Say your last 3 turns you played {rock, paper, rock}. Then they will look in their database for all the times people played {rock,paper,rock} and see what option people pick next most of the time.

When you play more games, they will start using your individual patterns as well.

It's a simple exploitative strategy, and once you know how it works you can trick it and win, but initially almost everyone loses against it.

It's similar in poker. If people can decide to bluff or check in a close spot, they will fail at randomizing it unless they use a tool. If they bluffed the last 3 times (or got a lot of good hands), they will bluff less often the 4th time. If they didn't bluff for a long time, they will do it now with higher probability.

If you are running away from a wild lion and you ran to the left side three times in a row, the animal will pick up on the pattern and follow you quicker to the left. So the optimal strategy is to run to the right the 4th time. (The lion cannot counter-adjust to this because there might be a real reason why you always went left.) This is why humans who randomize always try to "obstruct the previous pattern" rather than be truly random, a result of evolution.
well said
12-08-2017 , 01:59 PM
Quote:
Originally Posted by icoon
Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.
I would be curious to know if it still gets this result against a smart player who knows the AI gets those results. I doubt it.

I still think the AI would get >50% though