Common Rollout Settings - GNU and Snowie - Backgammon Forum - Discuss Backgammon Strategy, Clubs and Books

Two Plus Two Forums Sports and Games Chess and Other Board Games Backgammon Forum hosted by Bill Robertie.

Common Rollout Settings - GNU and Snowie


03-01-2014 , 02:07 PM

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

First time poster here, so excuse any ignorance. I have a question about general rollout settings. I use GNU and Snowie, though haven't used the rollout functions that much. Starting to do more of that. I know there is a TON of discussion about the different strengths of the separate programs - not really interested in that. More curious, from a practical perspective, of what other players have found to be useful for reasonably-sized rollouts, that are good enough in practice. My current settings for rollouts are:

Snowie - playing level at 2-ply, 648 trials (I believe that number was from a Robertie article) non-truncated, cubeful.

GNU - playing level at 2-ply (Supremo), 648 trials (set to end at 324 if the standard deviation is small enough (0.0100)), variance reduction, cubeful.

If others have found more useful settings to get consistent results, I'm all ears. Because of the variance in the results I get at those settings, I'm thinking these may not be quite enough for statistical significance.

Thanks,

Pete

03-01-2014 , 07:16 PM

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Pay attention to the confidence intervals reported in a rollout. If the intervals for the top plays overlap, that is an indication that you need more trials.
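A rough sketch of that overlap check in Python. The equities and interval half-widths below are made-up numbers, not output from any bot, and the function name is just for illustration.

```python
def intervals_overlap(equity_a, half_width_a, equity_b, half_width_b):
    """Return True if the two confidence intervals share any point."""
    low_a, high_a = equity_a - half_width_a, equity_a + half_width_a
    low_b, high_b = equity_b - half_width_b, equity_b + half_width_b
    return low_a <= high_b and low_b <= high_a


# Hypothetical top two plays after a short rollout: the intervals overlap,
# so the ranking is not yet settled.
print(intervals_overlap(0.152, 0.012, 0.139, 0.011))  # True

# Same equities after a much longer rollout with tighter intervals.
print(intervals_overlap(0.152, 0.004, 0.139, 0.004))  # False
```

If the first call describes your rollout, keep rolling. If the second does, you can stop.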

For anything that is close, 648 trials or 1k = 1296 trials is not enough. I usually begin at 5k = 5184 trials and go from there. I cap things at 62k (62208 trials), even if the results are not statistically significant. My secret is to let the long ones run overnight or when I am not using the computer. For rollouts in GnuBg, you may not need Supremo. World Class will generally yield similar results in much less time. For Snowie, I recommend switching to 3-ply (which is the same as GnuBg’s 2-ply).
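In case the odd-looking trial counts are puzzling: they are all multiples (or half) of 1296 = 36 x 36, the number of ways the first two rolls can come up. A quick Python check of the arithmetic, nothing bot-specific about it:

```python
BASE = 36 * 36  # 1296 combinations of the first two dice rolls


def as_multiple_of_base(trial_count):
    """Express a trial count as a multiple of 1296."""
    return trial_count / BASE


for trial_count in (648, 1296, 5184, 62208):
    print(trial_count, as_multiple_of_base(trial_count))
# 648 0.5
# 1296 1.0
# 5184 4.0
# 62208 48.0
```

So "1k", "5k" and "62k" are rounded labels for 1, 4 and 48 blocks of 1296 trials.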

After trying to scrimp at the beginning, I have come around to the point of view of Nack Ballard and David Rockwell. These players are experts in the opening game. They will often set up two positions that differ by only a pip or two, and then make long rollouts to see whether the change has caused a shift in the margins between the top two plays. In the following post, Nack describes his experience making three different 1k rollouts of the same position and getting three different top plays:

http://www.bgonline.org/forums/webbb...es;read=155306

If all you care about is trying to determine the top play, then you can get by with shorter rollouts. When you are interested to know the precise margins between plays, you need many more trials in order to wring out the variance.
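To put a number on "many more trials": a back-of-envelope sketch, assuming the trials are independent so that the confidence interval shrinks with the square root of the trial count. Variance reduction changes how wide the interval is to begin with, but I am assuming it does not change that scaling. The inputs are hypothetical.

```python
import math


def trials_needed(current_trials, current_half_width, target_half_width):
    """Estimate the trials required to shrink a confidence interval.

    Assumes the half-width is proportional to 1 / sqrt(trials).
    """
    factor = (current_half_width / target_half_width) ** 2
    return math.ceil(current_trials * factor)


# Halving the interval of a 1296-trial rollout takes four times the trials.
print(trials_needed(1296, 0.020, 0.010))  # 5184
```

That is the reason precise margins get expensive so quickly: every halving of the interval costs four times the runtime.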

Xavier Dufaure de Citres, the designer of eXtreme Gammon, reports that a 1k, 3-ply rollout is equivalent to an XGR++ evaluation. See the second-to-last paragraph of this post:

http://www.bgonline.org/forums/webbb...es;read=115836

My own experience confirms this. So why waste time with 1k rollouts?

I am a big believer in truncated rollouts. In general, a 1k trunc (or 648-trial trunc) is much more reliable than a 1k full rollout. That said, for anything serious, I do the longer, full rollouts described above. Users of GNU Backgammon may want to set up truncated rollouts that mimic XGR++. I explain how in this post:

http://www.bgonline.org/forums/webbb...=134413

Many will read advice such as that of Nack and David, and then continue to make 1k rollouts. Often they don’t have the time for anything longer. If that describes you, I suggest you switch to truncated rollouts, making sure that variance reduction is enabled.

Hope this helps.

Mike

Last edited by Taper_Mike; 03-01-2014 at 07:25 PM.

03-01-2014 , 07:36 PM

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Two more things:

1. Bill's recommended Snowie settings were appropriate when he made them. Now that computers are running 10 to 20 times faster, however, you should use stronger settings.

2. In GnuBg, make sure you have set up multithreading if you have it.

Settings > Options > Other > Eval Threads

Mike

03-01-2014 , 07:42 PM

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Thanks Mike,

Very helpful stuff. I did see your short article on setting GNU to mimic XGR. I plan to revise the GNU settings as you suggest. Will also try some Snowie rollouts at 3-ply when I have time. Like most, I don't have a lot of time to run these, but when I do, I'd like to get the most out of those programs - at least to the point where, if I'm not getting the absolute best move, at worst the suggested plays (if mistakes) will be very close, so definitely not blunders. Haven't purchased XG (any version), though at some point I probably will. Until then, it's Snowie and GNU.

Regards,

Pete

03-01-2014 , 07:44 PM

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Thanks again.

Pete

03-01-2014 , 07:48 PM

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Hi Mike,

I just checked - multithreading is on, and is set to 1. Should that be set differently?

Thanks,

Pete

03-01-2014 , 07:59 PM

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by PeteMorgan

I just checked - multithreading is on, and is set to 1. Should that be set differently?

Probably. It depends on whether you have a CPU that supports multithreading. For instance, if you have a dual-core chip where each core runs two threads, then you should set the number of threads to 4.

Under Windows, you can use the Task Manager to make sure that the CPU is maxed out during rollouts. If utilization is only 50% or less, that generally indicates that one or more of your cores is idle!
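If you are not sure how many threads your machine offers, and you happen to have Python installed, this small snippet asks the operating system. It reports logical processors (cores times threads per core), which is the number I would try first in Eval Threads. It does not touch GnuBg in any way.

```python
import os

# os.cpu_count() returns the number of logical processors, or None if the
# operating system cannot tell us; fall back to 1 in that case.
logical_cpus = os.cpu_count() or 1
print(f"Logical processors reported by the OS: {logical_cpus}")
```

On the dual-core, two-threads-per-core chip in my example, this should print 4.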

Mike

03-01-2014 , 08:19 PM

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Great - thanks again for all the help.

Pete

03-02-2014 , 02:06 AM

bleep69

grinder

Join Date: Sep 2008 Posts: 483

When playing, both XG and Gnu immediately calculate the best move. What exactly does a roll-out do? Makes sure?

03-02-2014 , 03:37 AM

#10

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by bleep69

What exactly does a roll-out do? Makes sure?

Right!

For a given play, the neural net produces estimates of wins, gammon wins, backgammon wins, losses, gammon losses, and backgammon losses. In GnuBg, a 0-ply "analysis" uses those estimates to choose a play. These estimates are not perfect.

To improve the result, a bot can use more than one ply, where each ply is a turn. In GnuBg, for instance, a 1-ply analysis calculates the win rates listed above by making a candidate play, and then letting the opponent take a turn. For the opponent, GnuBg checks all 21 possible rolls, feeding each one into the neural net to determine the best play. GnuBg then averages the values that are returned by the net. The average is weighted, so that non-doublets count twice as much as doublets.
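The weighting over the 21 rolls can be sketched in Python as follows. This is only an illustration of the averaging step, not GnuBg's code. The `evaluate_after_roll` argument is a hypothetical stand-in for "let the neural net find the opponent's best play for this roll and return the resulting estimate."

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def distinct_rolls():
    """Yield each of the 21 distinct rolls with its probability."""
    for die1, die2 in combinations_with_replacement(range(1, 7), 2):
        # A doublet such as 6-6 can occur one way out of 36; a non-doublet
        # such as 6-5 can occur two ways (6-5 and 5-6).
        weight = Fraction(1, 36) if die1 == die2 else Fraction(2, 36)
        yield (die1, die2), weight


def one_ply_average(evaluate_after_roll):
    """Weighted average of an estimate over all 21 opponent rolls."""
    return sum(weight * evaluate_after_roll(roll)
               for roll, weight in distinct_rolls())


rolls = list(distinct_rolls())
print(len(rolls))                          # 21
print(sum(weight for _, weight in rolls))  # 1
```

The weights add up to 1, so the result is a proper average over all 36 dice outcomes.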

Further plies repeat the process (in a fashion), allowing you to look ahead further and further.

A "rollout" plays a game to its conclusion many times. The same statistics for wins, gammon wins, and so on, are easy to determine if you play all the way to the end. By doing this a large number of times, eventually the numbers will stabilize, giving you fairly accurate estimates. When the number of trials is large, rollouts are considered to be the most accurate way to assess a position.
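Here is a toy Python illustration of the "numbers will stabilize" idea. It does not play backgammon. Each "game" is just a weighted coin flip with a made-up 55% win chance, standing in for a full game played by the bot, and the seed is fixed only so the run is repeatable.

```python
import random


def play_toy_game(rng, win_probability=0.55):
    """Stand-in for playing one game to the end; True means a win."""
    return rng.random() < win_probability


def rollout_win_rate(trials, seed=0):
    """Play the toy game `trials` times and return the observed win rate."""
    rng = random.Random(seed)
    wins = sum(play_toy_game(rng) for _ in range(trials))
    return wins / trials


# The estimate wanders with few trials and settles near 0.55 with many.
for trials in (36, 1296, 46656):
    print(trials, round(rollout_win_rate(trials), 4))
```

A real rollout tracks gammons, backgammons and cube equity as well, but the averaging principle is the same.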

A 1-ply rollout chooses the moves in a game using a 1-ply analysis. Similarly, a 2-ply rollout uses 2-ply analysis. Here you can get confused because of the way plies are counted. GnuBg is the odd man out. Its ply counts are always one less than those of XG or Snowie. Thus, 2-ply in GnuBg is the same as 3-ply in XG and Snowie.
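If it helps to see the conversion written down, here it is as two tiny Python helpers. The function names are my own invention for this post.

```python
def gnubg_ply_from_xg(xg_ply):
    """Convert an XG/Snowie ply count to the GnuBg count (one less)."""
    if xg_ply < 1:
        raise ValueError("XG and Snowie ply counts start at 1")
    return xg_ply - 1


def xg_ply_from_gnubg(gnubg_ply):
    """Convert a GnuBg ply count to the XG/Snowie count (one more)."""
    if gnubg_ply < 0:
        raise ValueError("GnuBg ply counts start at 0")
    return gnubg_ply + 1


print(gnubg_ply_from_xg(3))  # 2
print(xg_ply_from_gnubg(0))  # 1
```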

There is a general consensus that the 3-ply rollouts of Snowie and XG are the minimum you should use for serious work. Neil Kazaross, a backgammon Giant, frequently uses 4-ply in XG. For GnuBg users the 3-ply minimum of XG translates into a 2-ply minimum for GnuBg, the setting that is used when you select "World Class" or "Supremo."

Each game in a "truncated rollout" is stopped after a certain number of moves. At that point, the neural net is called to estimate the various win percentages. Again, there is some confusion in terminology, because the word ply is used to describe the "depth" of a truncated rollout. If you stop after five turns, for instance, the rollout is said to be truncated at five plies.

The most powerful analysis level in XG is a truncated rollout called XG Roller++ or XGR++, for short. These are the settings it uses:

XGRoller++ for Checker Plays
360 games with VR
Truncated after 5 moves using 3-ply analysis
First 2 decisions in 3-ply
Then 2-ply checker and 3-ply cube

XGRoller++ for Cube Decisions
360 games with VR
Truncated after 7 moves using 3-ply analysis
Stop when reaching 0.010 Confidence (minimum 180 games)
First 2 decisions in 3-ply
Then 2-ply checker and 3-ply cube
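For anyone who likes to keep such settings in a script or notes file, the two lists above could be recorded like this in Python. The class and field names are hypothetical, chosen only for readability. They are not configuration keys from XG or GnuBg, and the ply values use XG's counting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TruncatedRolloutSettings:
    games: int
    variance_reduction: bool
    truncate_after_moves: int
    truncation_eval_ply: int
    initial_decisions: int       # how many early decisions get the deeper ply
    initial_ply: int
    later_checker_ply: int
    later_cube_ply: int
    stop_at_confidence: Optional[float] = None
    minimum_games: Optional[int] = None


XGR_PP_CHECKER = TruncatedRolloutSettings(
    games=360, variance_reduction=True,
    truncate_after_moves=5, truncation_eval_ply=3,
    initial_decisions=2, initial_ply=3,
    later_checker_ply=2, later_cube_ply=3,
)

XGR_PP_CUBE = TruncatedRolloutSettings(
    games=360, variance_reduction=True,
    truncate_after_moves=7, truncation_eval_ply=3,
    initial_decisions=2, initial_ply=3,
    later_checker_ply=2, later_cube_ply=3,
    stop_at_confidence=0.010, minimum_games=180,
)

print(XGR_PP_CHECKER.truncate_after_moves, XGR_PP_CUBE.truncate_after_moves)  # 5 7
```

Written this way, the only differences between the two modes are the truncation depth and the early-stop rule for cube decisions.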

Hope this helps.

Mike

03-03-2014 , 06:17 AM

#11

bleep69

grinder

Join Date: Sep 2008 Posts: 483

Thanks for that detailed reply Mike, very helpful.

Do you know how often the move that is immediately suggested turns out to be incorrect? When would you do a rollout, and not trust what the program immediately spits out?

03-04-2014 , 10:57 AM

#12

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by bleep69

Do you know how often the move that is immediately suggested turns out to be incorrect?

The Depreli Study attempts to answer that very question. You can see the results here:

http://extremegammon.com/studies.aspx

Quote:

Originally Posted by bleep69

When would you do a rollout, and not trust what the program immediately spits out?

When I play against GnuBg, I use "Grandmaster" mode. Against XG, I use "3-ply/XGR+". Both of these modes can be wrong by 0.02 or more on tough plays. I get an overturn like that perhaps as often as 1 game out of 4. When I am suspicious (you have to develop an intuition for this!), I use XGR++ or the GnuBg rollouts that mimic XGR++ (see above) to check. I also use XGR++ to check almost all the plays I get wrong. Making a rollout is almost never necessary.

With online play, my usual habit is to download my matches, and then analyze (Analyze > Session...) them using XGR++ rather than XG's usual 3-ply World Class. XGR++ catches most of the errors that would be made in a 3-ply analysis. Very seldom do I have any need to make rollouts.

I am part of a team that has been making rollouts for the XG Opening Book. For that project, I have made many lengthy rollouts of 1st and 2nd roll positions. The latest release of the OB was version 3.70. It came out last Fall.

I have a particular interest in early-game play. I usually roll out any error I have made in the first 3-7 moves of a game. Those ones go into my early-game flashcard collection. Another set of flashcards covers all the other interesting positions I have saved. Those ones get rolled out, too.

The bottom line, however, is that you don't need very many rollouts. XGR++ is fine. That said, for some time I have been meaning to create a 1k truncated rollout that is a bit stronger. I just have not gotten around to it. If I had one, I think it would run very fast. If I were to simply copy the XGR++ settings, for instance, and extend the rollout to 1k, runtime should be less than four times that of XGR++.
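The "less than four times" figure is just the ratio of trial counts, assuming runtime grows in proportion to the number of trials when every other setting stays the same. In Python:

```python
XGR_PP_TRIALS = 360   # trials used by XGR++
ONE_K_TRIALS = 1296   # a "1k" rollout

# Assumption: runtime is proportional to the number of trials.
ratio = ONE_K_TRIALS / XGR_PP_TRIALS
print(ratio)  # 3.6
```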

XGR++ uses only 360 trials, but matches the accuracy of a 1k full rollout. A well-designed 1k trunc might be as good as a 5k rollout (but I don't know!). You should be careful, however, because truncs should not be used in all circumstances. A backgame, for instance, will often be analyzed better by a full rollout than any trunc (including XGR++).

Mike

Last edited by Taper_Mike; 03-04-2014 at 11:03 AM.

03-04-2014 , 04:37 PM

#13

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Hi Mike,

You mentioned you use GnuBG at Grandmaster level for analysis. I read somewhere that GnuBG is less than reliable for CUBE decisions at 3-ply (Grandmaster) and is actually more reliable running 2-ply (Supremo) (or 4-ply, if you have lots of time) for cube decisions.

Is that how you have things set for GNU (checker play - grandmaster, cube decisions - supremo)?

Thanks,

Pete

03-04-2014 , 10:13 PM

#14

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by PeteMorgan

I read somewhere that GnuBG is less than reliable for CUBE decisions at 3-ply (Grandmaster) and is actually more reliable running 2-ply (Supremo) (or 4-ply, if you have lots of time) for cube decisions.

Is that how you have things set for GNU (checker play - grandmaster, cube decisions - supremo)?

In a word, "No."

You are right to be concerned, however, because this was a real problem in previous versions of GnuBg. With the release of version 1 and the new neural nets it uses, the problem appears to have been alleviated. Michael Petch, one of the programmers on the GnuBg team, says as much in this post:

http://forumserver.twoplustwo.com/13...lease-1361031/

My experience has been the same. I am using Grandmaster (3-ply) for analysis, World Class (2-ply) for rollouts, and the hybrid settings of my XGR++ clone for truncated rollouts.

Mike

03-04-2014 , 10:53 PM

#15

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

There is one more thing that Snowie users need to consider.

Snowie offers three types of rollouts: Cubeless, Cubeful, and Live Cube. Given enough trials, Live Cube should be the most accurate of the three. On a fast computer, that is the setting I recommend.

When computers were slower, Cubeful was the setting used by most Snowie users. Those rollouts apply the Janowski formula to simulate the effect a live cube would have in a game. They are faster, but less accurate, than Live Cube rollouts.
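For the curious, the general shape of Janowski's idea, as I understand it, is an interpolation between the equity with a dead cube and the equity with a fully live cube, controlled by a "cube efficiency" between 0 and 1. The Python sketch below shows only that interpolation. How Snowie computes the two input equities and which efficiency it uses are not covered here, and the numbers are invented for illustration.

```python
def janowski_cubeful_equity(dead_cube_equity, live_cube_equity, cube_efficiency):
    """Interpolate between dead-cube and live-cube equity.

    cube_efficiency = 0 gives the dead-cube value; 1 gives the live-cube value.
    """
    if not 0.0 <= cube_efficiency <= 1.0:
        raise ValueError("cube_efficiency must be between 0 and 1")
    return ((1.0 - cube_efficiency) * dead_cube_equity
            + cube_efficiency * live_cube_equity)


# Hypothetical inputs: the result always lies between the two equities.
estimate = janowski_cubeful_equity(0.30, 0.42, 0.68)
print(round(estimate, 4))
```

A Live Cube rollout skips this approximation and actually lets the cube be turned during the simulated games.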

Unfortunately, Snowie buries the results of a Live Cube rollout in the middle of its data. You have to look for the lines that say “Live Cube.” The equities that are displayed more prominently are the Cubeful, Janowski-adjusted ones.

Of course, any rollout made on any bot should use “variance reduction.” There may be an odd exception, but I don’t know of one.

Mike

Last edited by Taper_Mike; 03-04-2014 at 11:03 PM.

03-05-2014 , 04:38 AM

#16

bleep69

grinder

Join Date: Sep 2008 Posts: 483

Thanks again for the detailed explanation Mike.

03-05-2014 , 09:11 AM

#17

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by bleep69

Thanks again for the detailed explanation Mike.

You're welcome. Glad I could help.

Mike

03-06-2014 , 01:10 PM

#18

PeteMorgan

stranger

Join Date: Mar 2014 Posts: 14

Thanks Mike,

I checked my settings again and made a slight change based on your recommendations. For the rollouts I had it set to Supremo. I've since switched to World Class for the rollouts, Grandmaster for analysis. My computer is starting to age so things move a little slow, but I'm not in a hurry.

Pete

03-06-2014 , 02:22 PM

#19

Taper_Mike

adept

Join Date: Sep 2009 Posts: 1,030

Quote:

Originally Posted by PeteMorgan

For the rollouts I had it set to Supremo. I've since switched to World Class for the rollouts

Using World Class for rollouts (or analysis) instead of Supremo should knock about 25% off runtimes.

Quote:

Originally Posted by PeteMorgan

[I switched to] Grandmaster for analysis. My computer is starting to age so things move a little slow, but I'm not in a hurry.

In that case, I would probably switch to Supremo for analysis. If that turns out to be too slow, then World Class. When you are playing against GnuBg, you don't want a sluggish machine to interrupt the flow of a game. Besides, GnuBg beats me up on any of those modes!

Mike
