Google's AlphaZero just smashed the strongest chess engine. Is poker next? - Poker News


12-06-2017 , 08:47 PM

L1lyR0semary

newbie

Join Date: Oct 2015 Posts: 39

https://www.chess.com/news/view/goog...100-game-match

AlphaZero taught itself to crush the strongest chess engine in just 4 hours. I understand the fundamental differences between chess and poker, one being a game of perfect information and the other not, but AlphaZero's approach to learning and playing games is like nothing we've seen before.

It appears to me that AlphaZero would be able to teach itself not only to approximate the Nash equilibrium in different forms of poker, but to assess the strength of the opponent and start to employ maximally exploitative strategies that the opponent wouldn't be capable of adjusting to.

I'd sure like to see an AlphaZero vs Libratus match.


12-06-2017 , 08:54 PM

HITTHEPANDA

veteran

Join Date: Jan 2008 Posts: 2,180

https://www.youtube.com/watch?v=92tn67YDXg0

Last edited by Videopro; 12-07-2017 at 12:27 PM. Reason: removed deleted content


12-06-2017 , 09:10 PM

SrslySirius

Animated Character

Join Date: Jun 2010 Posts: 4,357

Quote:

Originally Posted by HITTHEPANDA

https://www.youtube.com/watch?v=92tn67YDXg0

That bot has been beaten by thousands of players. I am one of them. It performs some mechanical aspects of the game extremely well, but falls on its face strategically.


12-06-2017 , 10:00 PM

SrslySirius

Animated Character

Join Date: Jun 2010 Posts: 4,357

From the article in the OP, it also seems that Google handicapped Stockfish, giving their own machine a much greater advantage.

Why do AI researchers always seem to press their thumb on the scale?


12-06-2017 , 11:02 PM

David Sklansky

Administrator

Join Date: Aug 2002 Posts: 17,087

sss

Quote:

Originally Posted by L1lyR0semary

https://www.chess.com/news/view/goog...100-game-match

It appears to me that AlphaZero would be able to teach itself not only to approximate the Nash equilibrium in different forms of poker, but to assess the strength of the opponent and start to employ maximally exploitative strategies that the opponent wouldn't be capable of adjusting to.

If the opponent knows that the computer is trying to do this, he can double-cross it, and great players might not only lose less than they would against computers not trying to exploit, but perhaps even be favored. Of course, the computer could perhaps be trained to recognize double-crossers, but then what of double double-crossers?


12-06-2017 , 11:13 PM

TRUSTtheDRAWCESS

veteran

Join Date: Nov 2016 Posts: 2,775

Call me when an AI can beat 6 max games for a decent winrate. Especially when other players know it is an AI and collude against it. Oh wait, by then we will have AI rights and that will be a bannable offence.


12-07-2017 , 12:03 AM

H0RUS

Carpal \'Tunnel

Join Date: Apr 2015 Posts: 6,192

Play live and it won't matter.


12-07-2017 , 12:28 AM

bustobrah

enthusiast

Join Date: Mar 2017 Posts: 93

Now that Google's AI has built its own AI that outperforms any human-built AI, it won't be long. But by then there will be bigger things to worry about, like the robocalypse.


12-07-2017 , 12:38 AM

the orange crush

adept

Join Date: Nov 2011 Posts: 997

Is poker next

lol


12-07-2017 , 02:26 AM

#10

pucmo

adept

Join Date: Mar 2016 Posts: 1,187

Its 4 hours were more like decades on a personal computer. That was enough time for it to get a superior evaluation of positions (chess in this case), although I read that some Go engines borrowed some things from it to improve themselves, which figures to be the more selective search, though that is not a new idea in any way.

The heads-up poker bot did much the same. It just played and played until its head was no longer spinning too much, with some limited combo evaluations. It perhaps ended up at some sort of GTO; whether the chess engine can be called "GTO," I don't know.


12-07-2017 , 03:26 AM

#11

Soepgroente

Pooh-Bah

Join Date: Sep 2008 Posts: 4,083

Quote:

Originally Posted by David Sklansky

sss

If the opponent knows that the computer is trying to do this, he can double-cross it, and great players might not only lose less than they would against computers not trying to exploit, but perhaps even be favored. Of course, the computer could perhaps be trained to recognize double-crossers, but then what of double double-crossers?

You're probably going to lose more money leaving the door open than you win back by being able to pounce on the way the AI exploits you. And it probably adjusts back instantly. You will lose the leveling and theoretical game vs a superior AI every time as well.


12-07-2017 , 03:51 AM

#12

AltruisticRaven

newbie

Join Date: Jul 2016 Posts: 48

I think that if DeepMind does decide to focus on poker, the challenge will be an exploitative one. Approximating equilibria is likely a trivial task for DeepMind.

An exploitative challenge is open-ended and involves poorly defined elements like human psychology. It would be extremely interesting to see how a sophisticated neural network would perform at that compared to humans. I have a feeling that it would be somewhat easy to get it better than most humans, but quite difficult to reach a superhuman level.


12-07-2017 , 06:01 AM

#13

ShallowMind

newbie

Join Date: Nov 2016 Posts: 45

Quote:

I think that if DeepMind does decide to focus on poker, the challenge will be an exploitative one. Approximating equilibria is likely a trivial task for DeepMind.

+1 to this, although I think exploitation is not a huge problem when HU.

What's more interesting is balancing between equilibrium and exploitation in a multiplayer setting. Say you have a 6-max 100bb NLHE table with two weak players and three strong players, plus the AI. If the AI can outperform the three strong players overall, that would be a very impressive achievement.

It would be difficult to benchmark appropriately, as you would need those 5 humans to play several hundred thousand hands (maybe millions?) for the results to be statistically significant.
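A rough back-of-the-envelope sketch of that sample-size point, in Python. The standard deviation of 100 bb per 100 hands and the example edges are illustrative assumptions, not figures from this thread, and `hands_needed` is a hypothetical helper. It treats each 100-hand block as independent and looks at a single player's winrate against zero, which is simpler than comparing the AI with three strong players.

```python
import math

def hands_needed(winrate_bb100, stdev_bb100, z=1.96):
    """Hands required before a true winrate sits z standard errors above zero.

    Both inputs are in big blinds per 100 hands. Assumes independent
    100-hand blocks with the given standard deviation.
    """
    # Standard error after n blocks is stdev / sqrt(n); solve winrate = z * SE for n.
    blocks = (z * stdev_bb100 / winrate_bb100) ** 2
    return math.ceil(blocks) * 100

# Assumed, illustrative numbers: stdev of 100 bb/100, edges of 5 and 2 bb/100.
print(hands_needed(5, 100))
print(hands_needed(2, 100))
```

Under these assumptions a 5 bb/100 edge needs roughly 154,000 hands and a 2 bb/100 edge needs close to a million, which is in line with the "several hundred thousand, maybe millions" estimate.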


12-07-2017 , 11:08 AM

#14

icoon

adept

Join Date: Sep 2010 Posts: 744

Quote:

Originally Posted by AltruisticRaven

Humans and robots learn in the same way: trial and error + evolution.

You interact with the environment. If the interaction helps you reach your goal, you save it in your strategy and pick that move more often; if it fails, you throw it away.
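The loop described above can be sketched as a toy weighted-choice learner. This is only an illustration under stated assumptions: the function name, the three actions, the fixed success/failure payoffs, and the "add 1 on success, halve on failure" update are all made up for the example, and it is not how AlphaZero or any poker bot is actually trained.

```python
import random

def learn_by_trial_and_error(payoffs, trials=1000, seed=0):
    """Reinforce actions that pay off and shrink the weight of those that fail.

    `payoffs` maps each action to True (helps reach the goal) or False.
    """
    rng = random.Random(seed)
    weights = {action: 1.0 for action in payoffs}
    for _ in range(trials):
        actions = list(weights)
        # Pick an action with probability proportional to its current weight.
        action = rng.choices(actions, weights=[weights[a] for a in actions])[0]
        if payoffs[action]:
            weights[action] += 1.0   # helpful: pick it more often
        else:
            weights[action] *= 0.5   # failed: mostly throw it away
    return weights

# Hypothetical toy environment: only "raise" helps reach the goal.
weights = learn_by_trial_and_error({"fold": False, "call": False, "raise": True})
print(max(weights, key=weights.get))
```

Against real opponents the payoffs are noisy and shift as the opponent adjusts, which is where the slow part of the learning comes in.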

Humans learn with 7 billion CPUs at the same time. Every time one CPU (a human brain) learns something, it tells the other CPUs through our system of information sharing (language + internet). Each human uses an algorithm to update his mind with the new info. This way the knowledge spreads like a virus.

The way humans learned to exploit other humans at poker works like this as well. Maybe 10M human brains have played poker online. Each time someone learns an exploitative move or a population tendency while playing or studying the data from his sessions, he updates his own mind and shares it with some of his friends or students, or posts it on an internet forum.

If a robot wants to get good at exploiting humans, it will also need to play vs. humans (or at least use a database of human play). There is no other way. It will need to see the way humans play, then use trial and error to find the best way to exploit them, then trial and error to detect their adjustments.

The problem is this (and it is the problem for all real-life robot learning):

This learning takes time. The robot can play 100K hands vs. itself in one minute, because it can simulate the task on a PC. But it cannot play 100K hands vs. humans in one minute. That will take far more time, and there is no easy way to speed up the process. This is why robots are crushing Go and chess, and will get superhuman at all PC games and at doing stuff on the internet, long before they get superhuman at tasks that require slow, difficult-to-simulate interaction with humans.

However, if you fed a large DB of hands to the Google AI, or PokerStars were OK with letting loose, say, a couple of hundred Google bots in the games, I think it would get way superhuman at exploiting. Humans aren't that good at complex number processing.

I'm speaking mostly about HU; I don't know enough about 6-max.

Last edited by icoon; 12-07-2017 at 11:14 AM.


12-07-2017 , 11:17 AM

#15

icoon

adept

Join Date: Sep 2010 Posts: 744

Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.


12-07-2017 , 12:18 PM

#16

Neil S

King of the sidebar

Join Date: Sep 2004 Posts: 19,677

Quote:

Originally Posted by uradoodooface

How about this: chess is nowhere near real warfare or even real-time simulated warfare. Let's see if this thing can beat a real player playing an old-school Warcraft game (not the lame-ass MMO), Starcraft, Lord of the Realm in combat mode, etc.

Turn-based games do not mean **** when it comes to warfare or some bot's ability to crush someone in any game. I'll **** a turn-based bot in the mouth cuz it's worthless. Real time or go **** your mother. Turn-based games are for peasants from a country that lives like it's 1017 AD. So this bot is the equivalent of a peasant from the 1000s lol. :O

Didn't an AI crush the top DOTA 2 players?


12-07-2017 , 01:19 PM

#17

the4bettingmonk

old hand

Join Date: May 2012 Posts: 1,596

Quote:

Originally Posted by Neil S

Didn't an AI crush the top DOTA 2 players?

Not yet, but getting close. However, it's a little redundant, as bots can make a LOT more actions per minute than a human can. So even if the bot's strategy is worse, it can still win with more units.
Edit: Depends on how you define top players.


12-07-2017 , 01:32 PM

#18

bipolarbearclaw

journeyman

Join Date: Jun 2012 Posts: 340

I wonder what Alpha Zero would gain from playing a billion hands vs. the classic HU LHE unexploitable.


12-07-2017 , 04:24 PM

#19

omnishakira

High Troller

Join Date: Sep 2007 Posts: 9,732

Poker is a joke compared to Go lol. no chance

Maybe BroodWar could give it a run for its money


12-08-2017 , 04:54 AM

#20

the orange crush

adept

Join Date: Nov 2011 Posts: 997

Quote:

Originally Posted by the orange crush

Is poker next

lol

Quote:

Originally Posted by omnishakira

Poker is a joke compared to Go lol. no chance

yup

Why'd all the internet wizards go to live games a long time ago?

Why are people getting banned online all the time with regards to 'software'?


12-08-2017 , 07:04 AM

#21

vesku

centurion

Join Date: Feb 2012 Posts: 176

Quote:

Originally Posted by icoon

Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.

Can that be real? Maybe there is a bug? Or too small sample size?


12-08-2017 , 07:24 AM

#22

pucmo

adept

Join Date: Mar 2016 Posts: 1,187

Quote:

Originally Posted by vesku

Can that be real? Maybe there is a bug? Or too small sample size?

It seems it "adapted," as it states; that, together with the other comments, seems to indicate it exploited human psychology/tendencies to get a 60% winrate (even though random answers give 50%; I am not too sure about that).

I do think this works well in poker, so I am not too surprised. The sample size was 1.2M, if I saw it right. If all humans played it seriously, I don't think it would do as well, but I guess adapting would still work here too. Ask any magician, and they will say that it likely will work.
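On the 50% baseline for random answers: a short exact check, assuming the opponent picks each move with probability 1/3. The helper name is hypothetical, and whether the site's 61% figure counts ties is not stated in this thread.

```python
from fractions import Fraction

# Each key beats its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def outcome_odds_vs_uniform(my_move):
    """Exact (win, tie, loss) odds for `my_move` against a uniformly random opponent."""
    third = Fraction(1, 3)
    win = sum(third for opp in BEATS if BEATS[my_move] == opp)
    loss = sum(third for opp in BEATS if BEATS[opp] == my_move)
    tie = 1 - win - loss
    return win, tie, loss

for move in BEATS:
    win, tie, loss = outcome_odds_vs_uniform(move)
    # Last column: share of decisive (non-tied) games that are wins.
    print(move, win, tie, loss, win / (win + loss))
```

Every move wins 1/3, ties 1/3, and loses 1/3, so exactly half of the decisive games are wins. Against a truly random player no strategy can do better than that, so any edge above 50% has to come from non-random human patterns.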


12-08-2017 , 08:54 AM

#23

icoon

adept

Join Date: Sep 2010 Posts: 744

The way it works is pretty simple. Say you played {rock, paper, rock} on your last 3 turns. It then looks in its database for all the times people played {rock, paper, rock} and sees which option people pick next most of the time.

When you play more games, it will start using your individual patterns as well.

It's a simple exploitative strategy, and once you know how it works you can trick it and win, but initially almost everyone loses against it.
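A minimal sketch of that lookup idea, assuming a three-move window and a single shared table. The class name `PatternBot`, the fallback move, and the demo opponent are made up for illustration; the actual site's code is not shown in this thread and may differ (for example, by keeping a separate table per player).

```python
from collections import Counter, defaultdict

# Each key beats its value; COUNTER_TO maps a move to the move that beats it.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
COUNTER_TO = {loser: winner for winner, loser in BEATS.items()}

class PatternBot:
    """Predicts the next human move from what followed the same last-k moves before."""

    def __init__(self, k=3):
        self.k = k
        self.followups = defaultdict(Counter)  # last-k pattern -> next-move counts
        self.history = []

    def choose(self):
        pattern = tuple(self.history[-self.k:])
        seen = self.followups.get(pattern)
        if len(pattern) < self.k or not seen:
            return "rock"  # arbitrary fallback when the pattern is new
        predicted = seen.most_common(1)[0][0]
        return COUNTER_TO[predicted]  # play whatever beats the predicted move

    def observe(self, human_move):
        # Record what the human played after the previous k moves.
        if len(self.history) >= self.k:
            self.followups[tuple(self.history[-self.k:])][human_move] += 1
        self.history.append(human_move)

# Demo against a perfectly predictable (hypothetical) opponent.
bot = PatternBot()
wins = 0
for human_move in ["rock", "paper", "scissors"] * 10:
    if BEATS[bot.choose()] == human_move:
        wins += 1
    bot.observe(human_move)
print(wins, "wins out of 30")
```

Once each three-move pattern has been seen once, the bot wins every round against this cycling opponent. A player who knows the rule can feed it a pattern and then break it, which is the trick described above.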

It's similar in poker. If people can decide to bluff or check in a close spot, they will fail at randomizing it unless they use a tool. If they bluffed the last 3 times (or got a lot of good hands), they will bluff less often the 4th time. If they haven't bluffed for a long time, they will do it now with higher probability.

If you are running away from a wild lion and you ran to the left three times in a row, the animal will pick up on the pattern and follow you more quickly to the left. So the optimal strategy is to run to the right the 4th time. (The lion cannot counter-adjust to this because there might be a real reason why you always went left.) This is why humans who randomize always try to "obstruct the previous pattern" rather than be truly random, a result of evolution.

Last edited by icoon; 12-08-2017 at 09:00 AM.


12-08-2017 , 10:28 AM

#24

Roman

veteran

Join Date: May 2004 Posts: 2,987

Quote:

Originally Posted by icoon

well said


12-08-2017 , 01:59 PM

#25

omnishakira

High Troller

Join Date: Sep 2007 Posts: 9,732

Quote:

Originally Posted by icoon

Interesting in this light:

http://www.essentially.net/rsp/

Rock paper scissors AI that beats humans 61% of the time.

I would be curious to know if it still gets this result against a smart player who knows the AI gets those results. I doubt it.

I still think the AI would get >50% though.
