Academic Werewolf Discussions Go Here

02-19-2011 , 06:14 AM
Quote:
Originally Posted by kokiri
edit: ignore all of what i wrote, my main point is that:

the winrate of someone with 60 games played is emphatically not the same as their chance of winning the 61st game

so the line you draw is essentially saying after 10 games, your chance of winning the next game stays the same, but that it is higher than it would be for the first 10 games
I know that. That's what I was trying to get at by proxying win rates within certain ranges. There could easily be an effect (albeit a small one, or we'd see it in the aggregate numbers to a lesser extent), but in this data set it's not there. Any effect there may be is masked by the noise.

The point you just erased is actually a good one — you can't really run linear regression on this even with perfect data, as the win rates of more experienced players include some games played as less experienced players. I don't know the best way to take that out formally (or whether it's possible without ordered data for each player). But my intuitive understanding of it at this point is captured above: there may be a weak effect in the 15 to 30 range, and thereafter it looks like players win at slightly more than the aggregate rate... but by slightly, I do mean slightly.
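With ordered per-player data, the overlap problem described above would go away: bucket each game by how many games the player had already played at the time, so no game is counted under more than one experience level. A minimal sketch on a made-up log (the `(player, won)` record format is hypothetical, not the thread's actual dataset):

```python
from collections import defaultdict

# Hypothetical chronological game log: (player, won) pairs.
log = [("alice", True), ("bob", False), ("alice", False),
       ("bob", True), ("alice", True), ("bob", True)]

experience = defaultdict(int)           # games already played, per player
buckets = defaultdict(lambda: [0, 0])   # experience level -> [wins, games]

for player, won in log:
    n = experience[player]              # games this player had played before this one
    buckets[n][0] += int(won)
    buckets[n][1] += 1
    experience[player] += 1

for n in sorted(buckets):
    wins, games = buckets[n]
    print(f"game #{n + 1}: {wins}/{games} won")
```

Each bucket then gives a genuine incremental win rate for "your Nth game", rather than a career rate that mixes experience levels.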
02-19-2011 , 06:15 AM
Quote:
Originally Posted by atakdog
Note: if it weren't for the 498 games played by people with just one or two games, I think a simple pattern might emerge (with some noise, obviously). Eyeballing it, it does look like the win rate goes up after being low for the initial X games (where X is somewhere between 15 and 30), but it's hard to explain why people playing their first game, and their second, should tend to do so much better than those who've played for a little while.

One thing I was planning on doing is seeing whether different trends might be discernible for village and wolf games, but I worry that the data aren't robust enough for that — too few wolf games in the sample for all but the most experienced players.
Possible explanations:

1. They are exempt from the lynch the first few days, which is beneficial whether they are a wolf or a villager. New players get passes and certainly don't get lynched on day 1 or day 2.

2. As villagers, they are an asset. Most of the time, they are really obvious. They are mislynched far less than average, I would imagine.

3. Their wolf range is estimated to be very narrow; they are seen as not capable of very much. If they are somewhat smart they can win with this. See Kukra's first wolf game, or Gabe's first wolf game.

Last edited by BombayBadboy; 02-19-2011 at 06:27 AM. Reason: Added 3.
02-19-2011 , 06:21 AM
Yes, BBB, I was wondering about that. Another possibility is that after the first couple games, players start experimenting... and their experiments tend to suck.
02-19-2011 , 06:23 AM
Quote:
Originally Posted by atakdog
Range       Computed win rate
1           0.493
2           0.492
3 – 5       0.445
6 – 10      0.469
11 – 15     0.414
16 – 20     0.694
21 – 30     0.435
31 – 40     0.513
41 – 60     0.493
61 – 80     0.511
Above 80    0.561
so subbing in a middle figure for the ranges, 4, 8, 13, etc. and 100 for the final entry, and running an LSR generates an equation of winning chance = 48% + 0.001%*(#games), so is essentially flat

i'm sick, so not really on top form but a couple of points:
1) it shouldn't be a linear model, obviously since it's bounded by 100%
2) i have no idea without giving it some thought as to how to run this more properly - it probably needs more weighting
3) i dunno how to get the R^2 out of my spreadsheet atm
4) but it will be quite low, but also, that's not too much of a problem - if things are fitting too well, i would be more worried
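That fit can be reproduced from the binned table with an ordinary least-squares sketch. The midpoints below are my guesses at the substitutions ("4, 8, 13, etc. and 100 for the final entry"), so the fitted slope won't match the quoted 0.001% exactly:

```python
import numpy as np

# Guessed bin midpoints standing in for "games played"
# (1 and 2 are exact; 100 is an arbitrary stand-in for "above 80")
games = np.array([1, 2, 4, 8, 13, 18, 25.5, 35.5, 50.5, 70.5, 100])
winrate = np.array([0.493, 0.492, 0.445, 0.469, 0.414, 0.694,
                    0.435, 0.513, 0.493, 0.511, 0.561])

# Ordinary least squares: winrate ≈ intercept + slope * games
slope, intercept = np.polyfit(games, winrate, 1)
print(f"win chance ≈ {intercept:.1%} + {slope:.3%} per game")
```

Whatever the exact midpoints, the slope comes out as a small fraction of a percentage point per game — essentially flat, as the post says.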

Last edited by kokiri; 02-19-2011 at 06:27 AM. Reason: edit: 0.001% not 0.01% oops.
02-19-2011 , 06:27 AM
Quote:
Originally Posted by atakdog
but it's hard to explain why people playing their first game, and their second, should tend to do so much better than those who've played for a little while.
Gotta milk the enthusiasm before they join the jaded has-beens.
02-19-2011 , 06:29 AM
I agree it's positive. It's just incredibly weak. I don't know what the confidence interval on that 1% is, but it can't be all that narrow and the effect is so small — a 1% increase in the win chances at game 110 versus game 10? Even if that's real, I think the more important point is that while experience probably does matter, it matters much less than most of us would like to think. Or so the data seem to tell us.

If we could eliminate the first and second games, maybe we'd get a stronger effect, probably on the order of 2%, but unless we have a conceptual reason for doing that I don't like it.


As an aside, playing with data like this is something I spent about three years doing when working for a professional expert witness. I promise you, I can make these numbers prove anything you want them to, and can coach you for testimony that will survive any cross-examination and convince any jury. But that won't make it correct.
02-19-2011 , 06:48 AM
Running regression on those proxied incremental rates while excluding one- and two-game players does reveal a trend: 47.8% + (0.072% x games) (R^2 = 0.076). That would mean that the player with 110 games would be expected to win at a 7% greater rate than the one with ten games. The problems, though, are:
  • jiggering the data until we find a pattern, without having a conceptual basis for it, is a poor methodology that will always yield a stronger pattern than actually exists; and
  • the top end of the data are determined by very few players.
02-19-2011 , 06:53 AM
Quote:
Originally Posted by kokiri
so subbing in a middle figure for the ranges, 4, 8, 13, etc. and 100 for the final entry, and running an LSR generates an equation of winning chance = 48% + 0.001%*(#games), so is essentially flat

i'm sick, so not really on top form but a couple of points:
1) it shouldn't be a linear model, obviously since it's bounded by 100%
2) i have no idea without giving it some thought as to how to run this more properly - it probably needs more weighting
3) i dunno how to get the R^2 out of my spreadsheet atm
4) but it will be quite low, but also, that's not too much of a problem - if things are fitting too well, i would be more worried
It's been too long, so I've forgotten... but I'll bet one of the young whippersnapper math geeks here can remind us what the conceptually preferred way to model correlation of a probability against a single independent variable is.

If I were creating testimony, I'd start with that and then play around with games played, log of games played, square root of games played... But ultimately we'd present something really simple (and totally specious) to the jury.
02-19-2011 , 07:02 AM
OK, still to do (I need to go to bed): looking at villager win percentage as a function of games played. Eyeballing the data, it looks like there may be a much clearer pattern there: my sense of it is that the most experienced players are winning noticeably more than 49% (the average) of their village games, but possibly less than the 46% average of their wolf games.
02-19-2011 , 07:12 AM
Quote:
Originally Posted by atakdog
  • the top end of the data are determined by very few players.
i don't think anyone thinks luckayluck is the best wwer around.

edit:
http://en.wikipedia.org/wiki/Logistic_regression
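For completeness, that logistic model fits the log-odds of winning as a linear function of games played, which respects the 0–100% bounds kokiri raised. A minimal pure-NumPy sketch on synthetic data (synthetic because the thread only has binned aggregates; statsmodels or sklearn would do this fit too):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-game records with a deliberately weak true experience effect,
# purely to illustrate the fitting procedure
games = rng.integers(1, 101, size=2000)
p_true = 1.0 / (1.0 + np.exp(-(-0.1 + 0.005 * games)))
won = rng.random(2000) < p_true

# Fit P(win) = 1 / (1 + exp(-(a + b*x))) by gradient ascent on the
# log-likelihood; x is games played, standardized for stable steps
x = (games - games.mean()) / games.std()
a, b = 0.0, 0.0
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(a + b * x)))
    a += 0.001 * np.sum(won - p)        # gradient w.r.t. intercept
    b += 0.001 * np.sum((won - p) * x)  # gradient w.r.t. slope

print(f"intercept = {a:.3f}, slope (per SD of games played) = {b:.3f}")
```

With an effect this small, the fitted slope comes out close to zero with a wide error bar — which is roughly the situation the real data are in.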
02-19-2011 , 07:17 AM
Does the data include the dates that the games were played? I'd be interested in seeing what percentage of players with 40+ games have a better winrate in their second 20 than their first 20. I'd imagine it's more than 50% but by a small amount.

As for the weird effect with players doing better in their first 2 games I do think there is some merit in the fact that they get a lot of leeway in their first two games, and some wolves can be flat-out bizarre and live a long time just because no one knows what to make of it. After a couple games, perhaps they are given more respect, which actually makes players less likely to clear them?

I don't think that this stuff proves that there is no skill in ww. I think it shows that players do most of their improvement quite rapidly, and beyond that point their improvements mostly just track the overall improvement of the player pool.
02-19-2011 , 07:32 AM
The data unfortunately doesn't include dates, no.

Maybe I can get working on that plus data of when people were lynched / killed next time I have free time on my hands :P
02-19-2011 , 07:33 AM
Well as long as you have the game names in there you can grab the dates pretty easily from the archive thread.
02-19-2011 , 07:35 AM
After watching an excellent video about the math behind figuring out whether you are a winning poker player, I have to say:

"LOL sample size" to all of this.

I wonder how many vanilla 13ers we would have to play to produce a sample that is w/i +/- 5% of our actual skill.

I think it might be as many as 1k but I have no idea how to figure these things out.

Someone who is great at stats should be able to tell us.

Edit: Mets, Wahoo wanna try?
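A rough answer to the sample-size question: treating each game as an independent coin flip with win probability near 50% (which ignores role and lineup effects), the number of games needed for a 95% confidence interval of half-width E on the observed rate follows from the normal approximation to the binomial, n ≈ (z/E)² · p(1−p). The ±5% target is from the post above; the rest is standard:

```python
import math

def games_needed(half_width, p=0.5, z=1.96):
    """Games needed so a 95% CI on the observed win rate has the given
    half-width, via the normal approximation to the binomial."""
    return math.ceil((z / half_width) ** 2 * p * (1 - p))

print(games_needed(0.05))   # within +/- 5 percentage points -> 385
print(games_needed(0.02))   # within +/- 2 percentage points -> 2401
```

So ~385 games pins the measured rate to within ±5 points of the true rate, putting the 1k guess in the right order of magnitude — and resolving a 2-point skill edge takes several times more.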
02-19-2011 , 07:37 AM
just because you can't definitively prove which players are winners or losers in the long run doesn't mean that there aren't winners and losers in the long run
02-19-2011 , 07:43 AM
Quote:
Originally Posted by soah
just because you can't definitively prove which players are winners or losers in the long run doesn't mean that there aren't winners and losers in the long run
Agreed 100% and I never said that.

I just don't think we have anywhere near the amount of games to prove which are which.

Especially considering these stats include a large portion of mish-mashes.

I mean, does anyone think 100 hands would be a large enough sample to decide if someone is a winner or a loser? And that doesn't even add in the complexity of 8-50 other players.
02-19-2011 , 07:52 AM
hands of poker don't have the same variance as a game of werewolf

my results are about 1 standard deviation above average
02-19-2011 , 07:56 AM
Quote:
Originally Posted by soah
hands of poker don't have the same variance as a game of werewolf

my results are about 1 standard deviation above average
OTI doesnt want to hear this. DUCY?
02-19-2011 , 07:58 AM
Some "shocking" numbers though. I mean, look at SH, Nez for example.
02-19-2011 , 08:18 AM
Ok so is 1k hands a big enough sample size?

10k hands?

100k hands?

We need someone with stats knowledge to tell us how many games is meaningful.
02-19-2011 , 08:34 AM
it's much less like looking at one player's stats and much more like looking at the database of a range of players, all with varying samples - so like the graphs they produced when they were proving the ap/ub cheats were cheating, say.

Also, if you think about it, a game is so dependent on the whole lineup, that any one player's impact is going to be small, so even if you could model it, it would look like:
P(win) = 50% +/- skill +/- teammates
where skill includes any experience factor as well.
I suspect the impact of skill would turn out to be only a few percentage points at best, and that means that presumably you need a big sample to discern the difference.
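That "big sample" intuition can be quantified with the standard normal-approximation power formula for comparing a proportion against a baseline. To distinguish a player whose true win rate is a few points above 50% (the 53% and 55% figures below are illustrative assumptions, matching "only a few percentage points at best") with a two-sided 5% test and 80% power:

```python
import math

def games_to_detect(p1, p0=0.5, alpha_z=1.96, power_z=0.8416):
    """Games needed to distinguish a player with true win rate p1 from
    baseline p0 (two-sided 5% test, 80% power), normal approximation."""
    num = alpha_z * math.sqrt(p0 * (1 - p0)) + power_z * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

print(games_to_detect(0.53))  # a 3-point edge: on the order of 2,000+ games
print(games_to_detect(0.55))  # a 5-point edge: high hundreds of games
```

Either way, it's far more games than anyone in the database has played.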
02-19-2011 , 08:39 AM
Quote:
Originally Posted by kokiri
it's much less like looking at one player's stats and much more like looking at the database of a range of players, all with varying samples - so like the graphs they produced when they were proving the ap/ub cheats were cheating, say.

Also, if you think about it, a game is so dependent on the whole lineup, that any one player's impact is going to be small, so even if you could model it, it would look like:
P(win) = 50% +/- skill +/- teammates
where skill includes any experience factor as well.
I suspect the impact of skill would turn out to be only a few percentage points at best, and that means that presumably you need a big sample to discern the difference.
Agreed.

Which is why we need a stats expert to help us even come close to finding a suitable sample size.
02-19-2011 , 09:17 AM
I do think some players are far superior to others, but with so many players responsible for the result of any given game and all mishmashes being balanced differently, it could take a million games to statistically determine an individual player's skill level.
02-19-2011 , 12:42 PM
Quote:
Originally Posted by OnThInIcE911
We need someone with stats knowledge to tell us how many games is meaningful.
A ton more.
02-19-2011 , 12:44 PM
assuming there is player skill, it seems like the best way to demonstrate it would be to have a large sample of rather small games, since it seems intuitive that the smaller the game size, the more "impact" a good player is going to have