Old 01-04-2009, 04:53 AM   #1
stranger
 
Join Date: Jan 2009
Posts: 9
How Many Trials to be Statistically Significant?

I'm asking this because I play on a major online site as you may suspect. I felt that I was losing way too many races but didn't know if maybe I was just remembering only those that I lost, so I thought the only way to decide was to do some real number crunching.

I'm not going to mention the site and this is not a bad beat whining post.

I'm trying to test the hypothesis: am I losing more races than I should? i.e., is a fair coin being tossed?

For this purpose, a trial is a race, either pre-flop all in or all in after the flop, where there are cards yet to come, i.e. you are not drawing dead when the chips are all in.

I'm going through hand histories, most recent to oldest, looking for hands that were a race situation as mentioned.

Next I'm using pokerstove to calculate the probability of winning the hand. This gives me the expected outcome.

Next, I'm assigning a value of 0 to a lost race, 1 to a won race, and .5 to a chop.

Then, I calculate Actual minus Expected.

I sum these over all trials (races) to get my result. Overall, given a fair game, the sum of all of these differences should be zero, i.e. races will even out over the long run within expected deviations.

Then, I calculate the standard deviation of these differences over all trials (races).
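
For anyone who wants to replicate the bookkeeping, here is a rough Python sketch of the tally I'm doing (the equities and outcomes below are made-up placeholders, not my actual hands):

Code:
import statistics

# Each trial: (expected win probability from pokerstove, actual result)
# actual result: 1 = won the race, 0 = lost, 0.5 = chop.  Placeholder hands only.
trials = [
    (0.55, 0),
    (0.48, 1),
    (0.52, 0.5),
]

diffs = [actual - expected for expected, actual in trials]

total = sum(diffs)                # should hover around 0 in a fair game
stdev = statistics.stdev(diffs)   # standard deviation of the differences

print(f"sum(actual - expected) = {total:.2f}, std dev = {stdev:.2f}")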

This is where things get interesting.

Results

Trials 38 (this covers less than a week's worth of hand histories)
Sum of Actual minus expected over 38 trials = -6.26
Std Deviation over 38 trials = .5

Result is that I am 12.52 std deviations off of the expected value of 0. Anyone who knows statistics knows that this is off the charts.

I have a better chance of winning a drawing where everyone in the world gets one ticket and I get the winning one.

I was a math major at one time, but am many years removed from my statistics course. But given the low number of trials, can I conclude that the game is not fair?

I created a new i.d. for this post so that it could not be tied to my online handle.
ElvisPresley
Old 01-04-2009, 11:57 AM   #2
Carpal 'Tunnel
 
 
Join Date: Jun 2005
Location: Psychology Department
Posts: 7,762
Re: How Many Trials to be Statistically Significant?

You already have a good start for a chi-square test.

Chi-square is calculated as follows:

sum ( ( (observed - expected)^2 ) / expected )

So all you need to do is square the differences you have already calculated and divide those squared differences by the expected value. Then sum these up. The resulting value is distributed as chi-square with degrees of freedom equal to K - 1, where K is the number of hands in your sample.

You can look up the probability of obtaining this value in a chi-square table, or you can compute it using excel: =chitest(observed range, expected range).

The result of the excel formula is the probability of getting the observed result if in fact there is no bias.
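
If you would rather do it outside of Excel, a rough Python sketch of the same calculation might look like this (the observed/expected numbers are made-up placeholders; scipy's chi2 distribution does the table lookup):

Code:
from scipy.stats import chi2

observed = [1, 0, 1, 0.5, 0]               # race outcomes (placeholders)
expected = [0.55, 0.48, 0.52, 0.50, 0.61]  # PokerStove equities (placeholders)

# Same formula as above: sum of (observed - expected)^2 / expected
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1            # K - 1 degrees of freedom
p = chi2.sf(stat, df)             # right-tail probability, like Excel's CHITEST

print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.3f}")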

As a forewarning though, doing a chi-square test isn't really a fair hypothesis test...you already suspected you were running bad over this small sample of hands which is what motivated you to do the test. A more appropriate test would include all hands you have played on said site.

Sherman
Old 01-04-2009, 03:08 PM   #3
stranger
 
Join Date: Jan 2009
Posts: 9
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Sherman View Post
The result of the excel formula is the probability of getting the observed result if in fact there is no bias.
Sherman
Thanks for feedback and I've Googled to get back up to speed on chi square test.

I'm getting a value of .998, which would say that the hypothesis that the game is fair can be rejected with a good amount of confidence, if I understand this correctly.

I'm going to continue populating the table (I'm up to 42 trials for the .998 value).

Updated results
Trials 42
Result -6.4
Std Dev .49
Chi Dist .998

I'll continue to update as I populate the table more.

Comments?
ElvisPresley
Old 01-04-2009, 04:36 PM   #4
journeyman
 
Join Date: Aug 2007
Posts: 246
Re: How Many Trials to be Statistically Significant?

I'm not really sure if your methodology is right. Why is the expected value of the game equal to 0? If you roughly win as much as you lose (EV = 0), then in your reasoning the EV must be 0.5, since losing = 0 and winning = 1. So there's something wrong in your analysis in my opinion.

I assume a fair game (like tossing a coin) is a game with an expected value of 0. Like for example when you win 49% of the time, lose 49% of the time and a tie occurs 2% of the time. Then the EV is (0.49)(+1) + (0.49)(-1) + (0.02)(0) = 0.

If I were you, I would do this:

winning = 1
losing = -1
tie = 0

This is logical, because when you win, you win value (+1). When you lose, you lose value (-1) and when it's a tie, both parties gain/lose the same or stay in the same condition (0).

To get the mean: take the sum of all outcomes and divide by the number of trials. To get the standard deviation: subtract the mean you just calculated from every outcome, then take the square of this difference. Sum up all these squared differences, and divide them by the number of trials minus 1. Take the square root of the number you get and this is the standard deviation. You can just fill in all the outcomes (-1, 1 and 0) in Excel and use AVERAGE() and STDEV(), which will be way faster...

This mean or average should be close to 0 if the game is fair and the number of trials is not ridiculously small. Also, you need more than 30 trials to apply the confidence intervals I state later on.

Since I don't have your original data, I will just simulate 40 trials where you lose a bit more than you should expect to lose: you lost 23 times, won 15 times and tied 2 times.

The mean of these trials is -0.2 and the standard deviation is 0.96609.

Now you are testing the hypothesis that this game is fair, or formally, that the mean of this game is equal to 0. If we can reject H0, we can say that this game is not fair.

H0: µ = 0 (the game is fair)
H1: µ ≠ 0 (the game is unfair)

So you need a two-tailed test here... The simplest thing you can do is construct a 95% or 99% confidence interval... These intervals will allow you to say: "I'm 95/99% sure that the population mean lies in this interval".

CI(95%) = mean ± 1.96 * stdev/sqrt(n)
CI(99%) = mean ± 2.57 * stdev/sqrt(n)

If the value 0 is not in that confidence interval, you can reject H0 and say that the game is unfair at the given confidence level (and for example take appropriate legal actions).

So let’s construct those intervals:

CI(95%) = -0.2 ± 1.96 * (0.96609/sqrt(40))
CI(99%) = -0.2 ± 2.57 * (0.96609/sqrt(40))

CI(95%) = [ -0.4994, 0.0994 ]
CI(99%) = [ -0.5926, 0.1926 ]

Since the value 0 is in both intervals, we can't reject H0, neither with a 95% nor with a 99% confidence level. So based on my simulated data (where you lost a bit more than expected), I still can't say it's an unfair game.
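
If you want to check this without Excel, here is a quick Python sketch of the same interval calculation on my simulated 40 trials (the 1.96 and 2.57 are the z-values I used above):

Code:
import math
import statistics

# Simulated 40 trials: 23 losses, 15 wins, 2 ties, coded as -1 / +1 / 0.
outcomes = [-1] * 23 + [1] * 15 + [0] * 2

mean = statistics.mean(outcomes)        # -0.2
stdev = statistics.stdev(outcomes)      # ~0.96609 (sample std dev, n - 1)
se = stdev / math.sqrt(len(outcomes))   # standard error

for z, level in [(1.96, 95), (2.57, 99)]:
    lo, hi = mean - z * se, mean + z * se
    print(f"{level}% CI: [{lo:.4f}, {hi:.4f}]")  # 0 inside -> can't reject H0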

The number of trials doesn't really matter, but bigger is obviously better. Just make sure you have more than 30 trials, because otherwise you have to use a Student t-distribution instead of the normal distribution and that just complicates matters too much...

You can always draw a conclusion at the 95% or 99% confidence level (as long as trials > 30) if you use the method I described. But the conclusion could of course be that the game is actually fair. Maybe if you are really, really sure this game is unfair, you can do more trials to reduce the standard error (which is stdev / sqrt(n)). The more trials you have, the smaller the intervals become, so the more accurate your estimate becomes. Maybe then you can reduce the interval to one that does not contain the value zero. It is pretty unlikely, though, that you will find an interval that does not contain 0 if the game is actually fair.

So redo this with your own data, and see what you get...
Let us know!
Riverdale27
Old 01-04-2009, 05:32 PM   #5
stranger
 
Join Date: Jan 2009
Posts: 9
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Riverdale27 View Post
I'm not really sure if your methodology is right. Why is the expected value of the game equal to 0? If you rougly win as much as you lose (EV=0), then in your reasoning the EV must be 0.5, since losing = 0 and winning = 1. So there's something wrong in your analysis in my opinion.
Let me clarify and you'll see why I expect 0. We'll use the coin toss as an example:

Trial   Expected   Actual   Actual - Expected
  1        .5         1            .5
  2        .5         0           -.5
  3        .5         1            .5
  4        .5         0           -.5

So, in this coin example with four trials, the actual minus expected totals to zero, showing a fair coin is being used. I'm using an expected value of zero for the sum, over all trials, of actual minus expected. The expected win probability for each hand is calculated by pokerstove after the money goes all in, with cards yet to come on the board and no one drawing dead.

The mean of actual minus expected should approach zero.

I'll modify my spreadsheet to add the CI formulas you show and report back.

Given the clarification do you agree with my methodology?
ElvisPresley
Old 01-04-2009, 05:40 PM   #6
journeyman
 
Join Date: Aug 2007
Posts: 246
Re: How Many Trials to be Statistically Significant?

Aha yes, now I can see what you mean. Then indeed the expected difference should be 0.

But why the detour of subtracting the EV from the outcome? You can also just average the outcomes themselves (0.5, 0 and 1 for tie/loss/win), which should have an average of 0.5 in a fair game. Then you just test:

H0: µ = 0.5
H1: µ ≠ 0.5

The detour of first subtracting the EV from every outcome and then taking the average of the (outcome - EV) numbers is not really efficient. Not that it really matters that much with small calculations like these, of course...
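
A rough sketch of that more direct test (the outcomes are placeholders coded 0 / 0.5 / 1; I'm using scipy's one-sample t-test here, which with 30+ trials is practically the same as the z-test):

Code:
from scipy.stats import ttest_1samp

# Raw outcomes coded 0 = loss, 0.5 = tie, 1 = win (placeholders, not real hands).
outcomes = [0, 1, 0, 0, 1, 0.5, 1, 0, 0, 1]

stat, p = ttest_1samp(outcomes, popmean=0.5)     # test H0: mu = 0.5
print(f"t = {stat:.3f}, two-sided p = {p:.3f}")  # small p would argue against fairness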

But yeah, I'm interested in the conclusions, so keep us posted
Riverdale27
Old 01-04-2009, 05:49 PM   #7
stranger
 
Join Date: Jan 2009
Posts: 9
Re: How Many Trials to be Statistically Significant?

So, after 68 trials tabulated and using the confidence interval approach, I get:

mean = -.09
std dev = 0.46
n = 68

99% ci [-.24,.05]
95% ci [-.2,.02]

Which would indicate that I cannot say at this point the game is unfair.
ElvisPresley
Old 01-04-2009, 09:40 PM   #8
Carpal 'Tunnel
 
 
Join Date: Jun 2005
Location: Psychology Department
Posts: 7,762
Re: How Many Trials to be Statistically Significant?

I'm not sure what your .998 value is exactly, but if that is your chi-square value there is no way that is statistically significant on any number of degrees of freedom, let alone 47. Further, if .998 is the value given using =chitest in excel, the .998 represents the probability of the data if the null hypothesis were true...in that case we have a very high probability and cannot conclude that the game is unfair.

Sherman
Old 01-04-2009, 11:18 PM   #9
stranger
 
Join Date: Jan 2009
Posts: 9
Re: How Many Trials to be Statistically Significant?

I'm taking a break from tabulating tonight. So far I have

n = 93
mean = -0.127
std dev = 0.461

95% ci [-0.221, -0.033]
99% ci [-0.250, -0.004]

This is telling me that what I've been suspecting has something to it. It is not reasonable to expect to lose so many races.


Sherman - I need to study the Chi square test as I obviously do not understand what it means at this point. But, the result I'm now getting from

Chitest(observed range, expected range) is .9999982

and as expected, I got the same value by doing the calculation manually and looking it up with the chidist(value, df) function.

What is this chi square test telling me? If I understand you that says we cannot conclude the game is unfair even though the confidence intervals would indicate otherwise?
ElvisPresley
Old 01-05-2009, 12:15 AM   #10
Carpal 'Tunnel
 
 
Join Date: Jun 2005
Location: Psychology Department
Posts: 7,762
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by ElvisPresley View Post
Chitest(observed range, expected range) is .9999982

...

What is this chi square test telling me? If I understand you that says we cannot conclude the game is unfair even though the confidence intervals would indicate otherwise?
Then the value you are computing is in fact the p-value. The p-value is the probability of getting the data you got (or more extreme) if the sample came from a population that is unbiased (null population). In my opinion, it is pretty overwhelming evidence that you are in fact not running "unprobabilistically" worse than expectation.

Sherman
Old 01-05-2009, 10:54 AM   #11
grinder
 
 
Join Date: Apr 2005
Location: Dallas, TX
Posts: 477
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Riverdale27 View Post
But why the detour of subtracting the EV from the outcome? You can also just average the outcomes themselves (0.5, 0 and 1 for tie/loss/win), which should have an average of 0.5 in a fair game. Then you just test:
Because he's not testing if the game is "fair" in the sense that you mean (each player having a 50% chance of winning).

He's testing if the chance of winning a hand in reality is the same as the advertised chance of winning (based on the visible cards). He wants to know if the game is rigged.

OP's methodology of computing (Actual - Expected) is great for this purpose. I like the confidence interval methodology.

Chi-Square is a minefield of chances to get the methodology wrong in obscure ways. Also, any time you get a p-value of larger than .99, it's very likely there's been a methodological error because that p-value shows that the data fit the expected values TOO well to be believed. A p-value of .9999982 is like rolling a 6-sided die 6 thousand times and finding that you have rolled each value exactly one thousand times. It's too perfect.
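
To see that concretely, here is a quick Python sketch of how the chi-square test reacts to that "too perfect" die:

Code:
from scipy.stats import chisquare

observed = [1000] * 6           # 6,000 rolls, each face exactly 1,000 times
stat, p = chisquare(observed)   # expected defaults to uniform (1,000 per face)
print(f"chi-square = {stat}, p = {p}")   # chi-square = 0.0, p = 1.0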

ElvisPresley: if you PM me your table (or email me a spreadsheet at daniel@pokersleuth.com), I'd be happy to take a look at it.
Agthorr
Old 01-05-2009, 11:46 AM   #12
Carpal 'Tunnel
 
 
Join Date: Jun 2005
Location: Psychology Department
Posts: 7,762
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Agthorr View Post
Chi-Square is a minefield of chances to get the methodology wrong in obscure ways. Also, any time you get a p-value of larger than .99, it's very likely there's been a methodological error because that p-value shows that the data fit the expected values TOO well to be believed. A p-value of .9999982 is like rolling a 6-sided die 6 thousand times and finding that you have rolled each value exactly one thousand times. It's too perfect.
I agree with you regarding the size of the p-value, but calling chi-square a minefield for methodological error...? It is one of the most straightforward inferential statistics there is...but I suppose we could consider all inferential statistics a minefield for methodological error.

Sherman
Old 01-05-2009, 12:04 PM   #13
grinder
 
 
Join Date: Apr 2005
Location: Dallas, TX
Posts: 477
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Sherman View Post
It is one of the most straightforward inferential statistics there is...but I suppose we could consider all inferential statistics a minefield for methodological error.
Well, for starters, the chi-square test is unreliable if any of the expected frequencies are less than 5. In the methodology you outlined (as I understand it), all of the expected outcomes are always less than or equal to 1.

Ergo, the gibberish p-value.

There are certain problems where Chi-square is great and it has a long history in certain disciplines (such as comparing a new drug with a placebo). For other problems, it's terrible.
Agthorr
Old 01-05-2009, 08:51 PM   #14
stranger
 
Join Date: Jan 2009
Posts: 9
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by Agthorr View Post
ElvisPresley: if you PM me your table (or email me a spreadsheet at daniel@pokersleuth.com), I'd be happy to take a look at it.
Thanks for the offer. I'll email it to you.
ElvisPresley
Old 01-06-2009, 10:14 AM   #15
grinder
 
 
Join Date: Apr 2005
Location: Dallas, TX
Posts: 477
Re: How Many Trials to be Statistically Significant?

Quote:
Originally Posted by ElvisPresley View Post
Thanks for the offer. I'll email it to you.
Thanks. After playing around with various approaches for the confidence intervals, I decided to use the sledgehammer approach. I took your expected probability of winning each of the 93 hands and ran a simulation that generated random numbers to determine a simulated win or loss for each. I then ran 10,000 such simulations. Only 0.47% of the simulations ran worse than you; 99.53% did better.
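
For the curious, the simulation was along these lines (sketched here in Python; the equities and the observed sum are placeholders, not your actual 93 hands):

Code:
import random

# Win probability of each all-in, from Poker Stove (placeholders, not the real 93 values).
equities = [0.55, 0.48, 0.52, 0.61, 0.45]
# Observed sum of (actual - expected) over those same hands (also a placeholder).
observed_sum = -1.8
num_sims = 10_000

worse = 0
for _ in range(num_sims):
    # Simulate each hand as a win (1) or loss (0) with its advertised equity,
    # then tally the simulated sum of (actual - expected).
    sim_sum = sum((1 if random.random() < p else 0) - p for p in equities)
    if sim_sum <= observed_sum:
        worse += 1

print(f"fraction of fair-game simulations running at least as bad: {worse / num_sims:.4f}")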

Before you jump to any conclusions, I'd check the following:
  • Did you miss any all-in hands from your spreadsheet?
  • Did you type all of the relevant players and their hands into the spreadsheet correctly?
  • Did you enter the information correctly into Poker Stove (or similar tool)?
  • Did you copy the results correctly from Poker Stove (or similar tool)?

Maybe someday I'll write a tool to compute all of this stuff automatically from hand histories, to eliminate the possibility of typographical errors. Have to finish my HUD first, though.
Agthorr
