Error rating discrepancy between GNU Backgammon and Backgammon Galaxy

11-25-2019 , 05:24 PM
Just played a 4-game match to 7 on Backgammon Galaxy. Thinking I played a decent game, I imported the match to GNUbg and had it analyzed there, too. GNUbg analysis is set to "supremo," which I believe is 2-ply and certainly strong enough.

Backgammon Galaxy gave me a rating of 6.27 from...
  • 8 checker errors
  • 4 checker blunders
  • 0 cube errors
  • 1 cube blunder
However, GNUbg gave me a rating of 4.1 from...
  • 3 checker "doubtfuls"
  • 3 checker "bads"
  • 0 checker "very bads"
  • 0 cube errors
  • 1 cube blunder (a substantial one)


Seems to me that the difference between 6.27 and 4.1 is substantial. Interestingly enough, my opponent went from a 4.27 per Backgammon Galaxy to a 4.7 per GNUbg. I'm not sure how to make sense of it. Any insight would be appreciated. Thanks.

edit: Ran 3-ply analysis of the game and it gave me 4.5, which is still a marked difference.

Last edited by fullerene; 11-25-2019 at 05:52 PM.
11-25-2019 , 05:58 PM
Gnu counts forced and meaningless moves as decisions that you are rewarded for. XG doesn't, though it does make each real decision count for more so the error rate is comparable to Gnu's (on average).

So the upshot is, if you have a greater than average number of forced moves, Gnu's error rate will be lower, if you have fewer than average forced moves, XG's error rate will be lower.

(I assume Backgammon Galaxy uses XG for analysis)

Last edited by _Z_; 11-25-2019 at 06:06 PM.
11-25-2019 , 06:17 PM
Quote:
Originally Posted by _Z_
Gnu counts forced and meaningless moves as decisions that you are rewarded for. XG doesn't, though it does make each real decision count for more so the error rate is comparable to Gnu's (on average).

So the upshot is, if you have a greater than average number of forced moves, Gnu's error rate will be lower, if you have fewer than average forced moves, XG's error rate will be lower.

(I assume Backgammon Galaxy uses XG for analysis)
Ah, I see. I wasn't aware of rating differences regarding forced/meaningless moves. Certainly seems to be a knock on GNU.
11-26-2019 , 05:20 PM
GNUbg shows two error rates, which it calls "Error rate mEMG" and "Snowie error rate". It looks like you used the latter (and _Z_ did in their comment as well).

Snowie error rate is what was used by the earlier bot of that name. It is the sum of errors by the player divided by the number of decisions by *both* players.

GNUbg later tried to improve this by using the decisions of the graded player only and not counting forced moves or "not close" non-double decisions. This obviously gives numbers that are on average a bit more than twice the Snowie error rate.

XG later used for its PR a definition similar but not identical to GNUbg's. In addition to forced moves, it discards non-forced but meaningless moves (mostly in 100% decided races), and its definition of a close cube decision is slightly different from GNUbg's. On average, its denominator is a little smaller than GNUbg's. Moreover, it divides the result by 2 to obtain something in the same ballpark as Snowie's...

In your example, assuming the Backgammon Galaxy number is indeed an XG PR, GNUbg's own error rate is probably somewhere between 10 and 14. Definitely not 4.1 or 4.5.
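
To make the three definitions above concrete, here is a rough Python sketch. It only mirrors the descriptions in this post; it is not the actual code of GNUbg, XG or Snowie. The function names and all the numbers are made up for illustration (they are not taken from the match in question), and the sketch assumes the same total error for every formula, which ignores the fact that the bots evaluate positions differently.

```python
def snowie_error_rate(total_error, decisions_both_players):
    # Player's summed errors divided by the decisions of BOTH players.
    return total_error / decisions_both_players


def gnubg_error_rate(total_error, own_unforced_decisions):
    # Graded player's decisions only, with forced moves and
    # "not close" non-double decisions left out of the count.
    return total_error / own_unforced_decisions


def xg_style_pr(total_error, own_meaningful_decisions):
    # Similar to GNUbg's, but meaningless moves are also dropped from
    # the denominator, and the result is then divided by 2.
    return total_error / own_meaningful_decisions / 2


# Hypothetical figures, assumed purely for illustration.
total_error = 1000.0            # summed errors, in thousandths of a point
decisions_both_players = 200    # every decision by either player
own_unforced_decisions = 80     # graded player, GNUbg-style count
own_meaningful_decisions = 75   # graded player, XG-style count (a bit smaller)

print(snowie_error_rate(total_error, decisions_both_players))
print(gnubg_error_rate(total_error, own_unforced_decisions))
print(xg_style_pr(total_error, own_meaningful_decisions))
```

With these invented counts the same play scores very differently under each formula, and the GNUbg-style number comes out a bit more than twice the Snowie-style one, as described above.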

Besides the rating formulas, GNUbg and XG evaluation functions differ but they still play in a very similar way. On average XG thinks that GNUbg plays with a 0.5 PR and GNUbg that XG does with an error rate of 1 or thereabouts. On a small sample of more haphazard human play the discrepancies may be larger.