08-11-2010 , 08:16 AM
Quote:
Originally Posted by :::grimReaper:::
I asked my professor once, "If you create 100 95% confidence intervals and select one at random, then there is a 95% chance of getting a confidence interval w/ the true mean, right?" She hesitantly said yes.

But after I passed the class, I realized this is very iffy. If I create, say, 203 CI's, then still there's 95% chance of getting CI w/ the true mean when I select one at random. But then, if I create 1 CI and select one at "random," then saying there's a 95% chance of getting the true mean is a little weird and almost seems a contradiction for the reason stated in the last post. I suppose the statement still holds if you don't know the bounds of the CI.

Do you want me to email some professors?
Yeah, that sounds like a good idea, because saying that 95% of CI's will hold the true mean sounds like you agree with this: "this CI may be one of the 95% of CI's that contain the true mean, so it 'basically' tells me there's a 95% chance my win rate falls in this interval." Of all the CI's out there, this one has a 95% chance of containing the true mean, and yet any given interval either DOES or DOES NOT contain it, so there are no chances within chances. What I mean is, 95% of CI's 100% include the true mean, and 5% of CI's 100% do not include the true mean.

Does this make sense?

I believe that when my professor taught me CI's, the important thing he said before the chapter started was something like:

Confidence Intervals technically can't be interpreted to mean what they indicate.
But CI's can be interpreted to mean what you think they mean based on what they technically indicate.
08-11-2010 , 08:25 AM
Quote:
Originally Posted by sbarnhouse
OK...I went to a special school back in the 6th-7th grade for being mathematically gifted, and I have no damn idea what is going on here. Granted I am now 27, and I smoked a LOT of weed in between then and now... anyways, I just now got back into school and I am taking a math class which will be the same **** I learned when I was 12 years old, at best, but I seriously don't remember any of it. I gave up on school a LONG time ago. My question is this. Can someone please direct me to where I can learn 1. equity 2. EV 3. ICM, and whatever other kinds of math are used during a given session? I know odds and probability and all that basic **** like the back of my hand, but I want to start learning the advanced strategies. I also realize I can just go Google all of this, but that is so random and iffy.

On another note, why can't you simply keep a record of time played each session, your +/- for that session, no. of hands per hour, and so forth. Is this not in depth enough? I know if I looked back and saw that I was ahead \$14,500 over 500 hours and had played x number of hands, I could come up with \$ won per hr. and per session, etc. I would think (personally, no hating) that all the time spent on this would be better spent studying my game and improving my leaks...no? Again, this is all just questions, me trying to get an idea of where you guys are coming from. Thanks.
That would be nice, but situations like this have precisely nothing to do with this thread.
08-11-2010 , 08:29 AM
I don't really understand the confusion. A confidence interval around a sample mean is calculated to be the interval that has X% chance of containing the true population mean. It isn't a question of what it tells us, because it is specifically calculated to be exactly that by definition. If we calculate a 95% confidence interval around the sample mean, then we are specifically calculating the interval that has 95% probability of containing the population mean.

For a normal distribution we know that ~95% of the area under the curve is encompassed by +/- 2SD from the mean. That is just a fact of normal distributions. And using that fact with a sample requires us to assume we are dealing with a normal distribution.
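That ~95% within +/- 2SD fact is easy to check numerically. Here's a quick sketch (my own illustration, Python stdlib only) using the more precise 1.96 cutoff that CI formulas actually use:

```python
import math

def normal_cdf(z):
    # Standard normal CDF written with the stdlib error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probability mass within +/- 1.96 SD of the mean of a normal distribution
coverage = normal_cdf(1.96) - normal_cdf(-1.96)
print(round(coverage, 3))  # 0.95
```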

What am I missing?
08-11-2010 , 11:10 AM
Quote:
A confidence interval around a sample mean is calculated to be the interval that has X% chance of containing the true population mean.
I don't think this is really true. I'm not a statistician, but I think it's something much more like... if we had 100 players whose true win rate was the observed mean, and they all played this number of hands, 95% of them would have outcomes within the 95% CI.
08-11-2010 , 11:25 AM
Quote:
Originally Posted by RustyBrooks
I don't think this is really true. I'm not a statistician, but I think it's something much more like... if we had 100 players whose true win rate was the observed mean, and they all played this number of hands, 95% of them would have outcomes within the 95% CI.
You're now calculating the CI on the population, not the sample. And in that case it says the same thing I said. Both are true. Which of these gets used depends on whether the population mean is actually known. For winrates it usually isn't, so you are doing a CI on the sample, and from that estimating the population. You aren't predicting sample distribution from knowing the population mean (which is what your statement says).
08-11-2010 , 12:41 PM
Quote:
Yeah, that sounds like a good idea, because saying that 95% of CI's will hold the true mean sounds like you agree with this: "this CI may be one of the 95% of CI's that contain the true mean, so it 'basically' tells me there's a 95% chance my win rate falls in this interval..of all the CI's out there, this one has a 95% chance of containing the true mean"
I think it's clear now: true, 95% of identically constructed CI's will contain the true mean, but once you realize the bounds of one particular interval, you can't say there's a 95% chance of that particular CI containing the true mean, b/c like I said, the probability is either 0% or 100%. The bounds of the interval are no longer random after data is obtained, and the true mean is a constant, and is never random. Therefore speaking of probability doesn't make sense.
08-11-2010 , 01:59 PM
Quote:
Originally Posted by :::grimReaper:::
I think it's clear now: true, 95% of identically constructed CI's will contain the true mean, but once you realize the bounds of one particular interval, you can't say there's a 95% chance of that particular CI containing the true mean, b/c like I said, the probability is either 0% or 100%. The bounds of the interval are no longer random after data is obtained, and the true mean is a constant, and is never random. Therefore speaking of probability doesn't make sense.
But the data is random. This is one 1250 hand session in a population. I'm saying that samples set up like this (including this sample) either do or do not contain the true mean. 95/100 of CI's like this one contain the true mean and 5/100 of the samples will not contain the true mean. This makes the probability either 0% or 100% for all of the samples, and just tells you how many of the samples are accurate or not.

How else would you construct more CI's like this one without obtaining data? The bounds are definitely not random because they are estimates of what the low and high probably are. They are there because nothing in probability is certain (1). The true mean is a constant, and the bounds give a reasonable indication of the high and low the constant falls between, given the data and a standard deviation. I never said the true mean is random, and I don't think that the bounds are random given my data set either..I think they are pretty accurate.

In fact, since I buy-in for \$100 at a time, if you tack on 20 more hours played to my sessions, and say that my net profit/loss for those 20 hours is (\$1000) then I would have an average total BB/100 of -3 BB/100. I'd say there's probably something like a 5% chance that I'm going to net loss 10 BI's in a 20 hour period vs the players I'm playing against.
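A rough way to sanity-check a claim like "about a 5% chance of dropping 10 buy-ins over a stretch" is a normal approximation on the total result. Every input below is a made-up placeholder, not the poster's actual figures, so treat it as a template rather than an answer:

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the stdlib error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# All of these are assumed placeholder inputs, not data from this thread:
winrate = 5.0     # hypothetical true winrate, BB/100
sd100 = 80.0      # hypothetical SD per 100 hands, in BB
hands = 1500      # rough guess at hands in ~20 hours of live play
loss_bb = -500.0  # the losing threshold in BB (e.g. 10 buy-ins deep)

blocks = hands / 100.0
mean_total = winrate * blocks         # expected total result in BB
sd_total = sd100 * math.sqrt(blocks)  # SD of the total result in BB

# Normal-approximation probability of finishing at or below the threshold
p_loss = normal_cdf((loss_bb - mean_total) / sd_total)
print(round(p_loss, 3))  # a few percent under these particular assumptions
```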
08-11-2010 , 08:51 PM
Quote:
But the data is random.
....
How else would you construct more CI's like this one without obtaining data?
You misunderstood what I said. Take this analogy: let X be a discrete uniform variable over {1,2,3,4} and let c be fixed to exactly one of those #s. We can say that 25% of the time X = c, BUT when we see X = 2 on one particular trial, we can't say there's a 25% chance that X = 2 = c, as that would imply c is random, which it isn't.

It's the exact same story here. Before we realize the bounds of a particular 95% CI, the interval is random and will contain the true mean w/ 95% probability. I can say that over the next 200 CI's, about 190 will contain the true mean. But once we know the bounds of the very next CI, probabilistic statements go out the window.

I'm "99%" sure this is right, let me confirm by emailing a professor.
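The {1,2,3,4} analogy is simple to simulate. This sketch (my own illustration, not from the thread) shows the before/after distinction: across many draws the frequency of X = c is 25%, but any single realized draw simply equals c or doesn't:

```python
import random

random.seed(1)
c = 3           # a fixed constant, chosen once; it is never random
trials = 100_000

# Before each draw, X is random, so P(X = c) = 0.25 holds as a long-run frequency
hits = sum(1 for _ in range(trials) if random.randint(1, 4) == c)
freq = hits / trials
print(round(freq, 2))

# After one particular draw, "does x equal c?" is a yes/no fact, not a 25% event
x = random.randint(1, 4)
print(x == c)
```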
08-11-2010 , 09:50 PM
I don't think your argument really makes sense, otherwise probability doesn't work at all. In your {1,2,3,4} example let's say I pick one of the numbers and you also pick one. In a probability sense, there is a 25% chance you picked my number but the way you're presenting it, either we picked the same one and therefore the chance is 100%, or we didn't and the chance is 0%.

Like I said before, I'm a statistics caveman, I mostly figured out what I could and make it work for me, as soon as people start talking about confidence intervals I usually shut up because I barely understand it, and what I understand is probably wrong.
08-12-2010 , 04:04 AM
A professor agrees w/ what I said. I guess I'll ask one more to be completely sure?

Quote:
Originally Posted by RustyBrooks
I don't think your argument really makes sense, otherwise probability doesn't work at all. In your {1,2,3,4} example let's say I pick one of the numbers and you also pick one. In a probability sense, there is a 25% chance you picked my number but the way you're presenting it, either we picked the same one and therefore the chance is 100%, or we didn't and the chance is 0%.
Not exactly. I presented the analogy to emphasize the distinction between before and after realizing a random variable. c is a constant; it never changes, it's never random, it's fixed. But P(X=c) = .25 b/c X is random, i.e. before we see a realization of X. If we see X come up as 2 one time, we can't say c = 2 w/ 1/4 (or any) probability; either c = 2 or c ≠ 2. Again, c is not random.

As for picking #s like you mentioned, that's a different model. That would be X and Y being independent discrete uniform RVs over {1,2,3,4}. Then yes, P(X=Y) = P(X=1) = P(Y=3) = 0.25, etc., like you said.
08-12-2010 , 05:19 PM
Quote:
Originally Posted by :::grimReaper:::
I asked my professor once, "If you create 100 95% confidence intervals and select one at random, then there is a 95% chance of getting a confidence interval w/ the true mean, right?" She hesitantly said yes.

But after I passed the class, I realized this is very iffy. If I create, say, 203 CI's, then still there's 95% chance of getting CI w/ the true mean when I select one at random. But then, if I create 1 CI and select one at "random," then saying there's a 95% chance of getting the true mean is a little weird and almost seems a contradiction for the reason stated in the last post. I suppose the statement still holds if you don't know the bounds of the CI.

Do you want me to email some professors?
I didn't know where to go back and quote, so I just chose this one to represent what you are saying. I know that you're saying "There is a 95% chance that CI's created in the future will contain the true mean."
My question is, though, what does that mean? If other CIs have to be created from data, then what would you create another CI out of? This confidence interval was created from my data, and it means that other CIs created this way would have a 95% chance of containing the true mean..sure, that's the textbook definition..but where do those other CIs come from? What's an example of a future CI that this CI indicates?
08-12-2010 , 06:44 PM
Quote:
I didn't know where to go back and quote, so I just chose this one to represent what you are saying. I know that you're saying "There is a 95% chance that CI's created in the future will contain the true mean."
My question is, though, what does that mean? If other CIs have to be created from data, then what would you create another CI out of? This confidence interval was created from my data, and it means that other CIs created this way would have a 95% chance of containing the true mean..sure, that's the textbook definition..but where do those other CIs come from? What's an example of a future CI that this CI indicates?
If you read my last posts to Rusty, what I'm saying is, *before* you gather data to construct the 95% confidence interval, the bounds of the 95% CI are random, and in fact, yes, there's a 95% chance that interval will contain the true mean. But after data is gathered and you construct your CI, the bounds are no longer random (the bounds are #s), and of course the true mean is not random, so you can't say there's a 95% chance the true mean is in (-2,38); the mean is either in there or it's not.
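That before/after distinction can be checked by simulation: build many 95% CIs from fresh samples of a known population and count how often they cover the true mean. All the parameters here are invented for illustration, and the known SD is used for the bounds to keep the sketch short (a real analysis would have to estimate it):

```python
import math
import random

random.seed(7)
# Invented population parameters, purely for illustration
true_mean, sd, n = 10.0, 30.0, 200
num_intervals = 2000
covered = 0

for _ in range(num_intervals):
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    xbar = sum(sample) / n
    # Known sd used for simplicity; in practice you'd plug in an estimate
    half = 1.96 * sd / math.sqrt(n)
    if xbar - half <= true_mean <= xbar + half:
        covered += 1

coverage = covered / num_intervals
# Close to 0.95 across the whole batch, even though each individual
# realized interval simply hit or missed the fixed true_mean
print(round(coverage, 3))
```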
08-12-2010 , 06:57 PM
Quote:
Originally Posted by :::grimReaper:::
If you read my last posts to Rusty, what I'm saying is, *before* you gather data to construct the 95% confidence interval, the bounds of the 95% CI are random, and in fact, yes, there's a 95% chance that interval will contain the true mean. But after data is gathered and you construct your CI, the bounds are no longer random (the bounds are #s), and of course the true mean is not random, so you can't say there's a 95% chance the true mean is in (-2,38); the mean is either in there or it's not.
Let me repeat a question I think I asked way back, perhaps in a different form. OP's 95% CI was (-3, 36), I believe. We all agree now on just what that means technically. The question I have is how would you interpret or use that information? For example, suppose someone offered to bet you that OP was not a winning player. Would the CI result have any bearing on your decision to take the bet, knowing that the true win rate is or not in that interval according to frequentist theory?

If another player, B, with the same number of hands as OP, had a CI of, say, (-10, +10), would you think OP had a better, equal or less likely chance to be a winning player than B?
08-12-2010 , 07:04 PM
Yeah basically at this point I no longer feel like I know what a CI actually means, and what you would normally do with it in a case like this. What does it tell you? What is it that it's confident of?
08-12-2010 , 07:18 PM
A confidence interval is not really a confidence interval when susceptible to Gaussian noise due to inherent uncertainty?
08-12-2010 , 08:24 PM
Quote:
Originally Posted by statmanhal
Let me repeat a question I think I asked way back, perhaps in a different form. OP's 95% CI was (-3, 36), I believe. We all agree now on just what that means technically. The question I have is how would you interpret or use that information? For example, suppose someone offered to bet you that OP was not a winning player. Would the CI result have any bearing on your decision to take the bet, knowing that the true win rate is or not in that interval according to frequentist theory?

If another player, B, with the same number of hands as OP, had a CI of, say, (-10, +10), would you think OP had a better, equal or less likely chance to be a winning player than B?
I think if you overlay these two probability distribution graphs then the one to bet on will be obvious.
08-12-2010 , 09:11 PM
Quote:
I think if you overlay these two probability distribution graphs then the one to bet on will be obvious.
Yeah, I would hope the answer is pretty obvious. Even if there was no statistical significance, you might as well choose that with the more winning range, given all else equal. What I'm trying to do is to get those who seem to focus on the issue that the true parameter is or is not in the interval, to tell me what they would do with a confidence interval estimate. Heck, it's an estimate. If it's not some kind of estimate of your win rate then just what is it? Well, to keep them happy, instead of saying that the probability is 95% that my win rate is between X and Y, I would say I am 95% confident it is between X and Y. A semantic difference with no distinction, or something like that.

Those who advocate Bayes have to start with a prior distribution, which could very well be quite subjective. Well, one could make a prior distribution out of a confidence interval estimate and then have the same theoretical basis for ascribing a probability to a population parameter.
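One way to make that overlay concrete: under the normal assumption, back out the point estimate and standard error implied by each 95% CI and compute the area above zero. Note that reading this area as "the chance the player is a winner" is exactly the Bayesian-flavored interpretation being debated in this thread, so treat it as a heuristic, not settled theory. The intervals are the (-3, 36) and (-10, 10) figures quoted earlier:

```python
import math

def normal_cdf(z):
    # Standard normal CDF via the stdlib error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_positive(ci_low, ci_high):
    # Back out the point estimate and standard error implied by a 95% CI,
    # then take the area of that normal curve lying above zero
    mean = (ci_low + ci_high) / 2.0
    se = (ci_high - ci_low) / (2.0 * 1.96)
    return 1.0 - normal_cdf((0.0 - mean) / se)

p_op = prob_positive(-3, 36)   # OP's interval from the thread
p_b = prob_positive(-10, 10)   # player B's interval
print(round(p_op, 3), round(p_b, 3))
```

Player B's interval is symmetric around zero, so his area above zero is exactly one half, while OP's sits mostly above zero: the overlay makes the bet obvious, as suggested.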
08-12-2010 , 09:21 PM
deleted

Last edited by spadebidder; 08-12-2010 at 09:30 PM. Reason: deleted poor idea
08-12-2010 , 09:31 PM
I'm trying to think through how to choose which to bet on when it is non-obvious. It seems like you want the greatest area under the curve to the right of zero, which is not necessarily the one with the highest mean, nor necessarily the one with the highest 95% CI boundary, because of different kurtosis. Am I thinking right?

Edit: which raises an interesting question. What is the distribution curve with the greatest area under it? There should be a formula for this, or an obvious answer that escapes me.

Duh, I guess if they are normal the area is equal.

Last edited by spadebidder; 08-12-2010 at 09:48 PM. Reason: think more post less
08-12-2010 , 09:59 PM
Quote:
I'm trying to think through how to choose which to bet on when it is non-obvious. It seems like you want the greatest area under the curve to the right of zero, which is not necessarily the one with the highest mean, nor necessarily the one with the highest 95% CI boundary, because of different kurtosis. Am I thinking right?

Edit: which raises an interesting question. What is the distribution curve with the greatest area under it? There should be a formula for this, or an obvious answer that escapes me.

Duh, I guess if they are normal the area is equal.
The distribution with the greatest area to the right of zero is equivalent to finding what percentile zero is on the CDF.

In general, a standard hypothesis test can be performed by seeing how much overlap there is in the sampling distributions. This is basically comparing CI overlap, although more is involved if you want to get into such things as Type I and Type II errors: the former is rejecting the null hypothesis that there is no difference in means when it is in fact true (reject true), and the latter is failing to reject the null hypothesis when it's not true (accept false). But if the data do not indicate a significant difference, wouldn't it make sense to choose as 'better' the set of data that appears to be better, be it because of the means or how the data are distributed, as you suggest? The choice of the best metric or test statistic would depend a lot on what you intend to do with the result.

As someone suggested much earlier in this thread – why all the fancy schmancy analysis? He suggested something like the following:

I mark down how much I have in my poker bank account before I start playing. I then look at the account balance after playing X months. If it’s greater, I’m a winner; if it’s less, I’m a loser.

Homespun wisdom.
08-13-2010 , 12:36 AM
Quote:
Originally Posted by statmanhal
I mark down how much I have in my poker bank account before I start playing. I then look at the account balance after playing X months. If it’s greater, I’m a winner; if it’s less, I’m a loser.

Homespun wisdom.
Yeah, you could do that. But it might be better to say..hmm, after 2 days played time, I have found a way to be "pretty sure" that I'm a winning/losing player at the game I'm playing at. Maybe I should/shouldn't keep playing at the location I'm playing at based on my results because I think game selection is important.

Or..you could just wait and play for months and months when 2 days might have already given you a sample that could tell you if you're winning. If it were possible to find out after 50 hours..I know which one I'd rather do.

Everybody already knows if you play for months and months you can be pretty close to how you run when luck is taken out..but that's not the point of this thread..
08-13-2010 , 12:50 AM
Quote:
Originally Posted by :::grimReaper:::
If you read my last posts to Rusty, what I'm saying is, *before* you gather data to construct the 95% confidence interval, the bounds of the 95% are random, and in fact yes, there's a 95% chance that interval will contain the true mean. But after data is gathered and you construct your CI, the bounds are no longer random (the bounds are #s), and of course the true mean is not random, so you can't say that there's a 95% chance the true mean is in (-2,38), as the mean is either in there or it's not.
But you can't construct a confidence interval without data..because you have to have bounds in a CI..and to get those bounds, you have to have data. Before data is gathered there is no CI so there is no chance that a CI has a 95% chance of containing the true mean because there is no CI in existence because CI's are based on data. There are no CI's that will be produced in the future unless more data is obtained for the reason above (CI's are based on data)...so there isn't a 95% chance that future CI's will contain the true mean because no future CI's will be formed..without data.

If that is wrong then please tell me how I am supposed to form a future CI based on this data that will have a 95% chance of containing the true mean.

Also, your "it's either 0% or 100% so there is no 95%" argument doesn't make sense. I am 80% confident that it's between 0.XX and 33.XX BB/100. I'm 95% confident that it's between -3.XX and 36.XX BB/100. It's set up this way because, since you are grabbing a wider range of BB/100 values, you are more likely to have "picked" the right number. Just like someone who buys a lottery ticket is twice as confident that he will win if he buys 2 tickets instead of 1.

It's not quite the same thing in this situation because not all numbers have the chance of being someone's BB/100 average, but how does it not make sense that something that either is or isn't something can have a certain chance of being one or the other?

In Poker..you either Are or Are Not going to win the hand at the end of it. Yet we still say that different hands have different percentage chances of winning. The point of a CI is to give a degree of confidence that the true mean falls between a low bound and a high bound. I understand the "technical speak" of that being that all future CI's have a 95% chance of containing the true mean, but can you apply that to real life? If you could, then that would mean that the first relevant future CI you come up with would be my range of BB/100 averages, and we would be 95% confident that that range included my average. So make the "technical" definition apply to real life. Thanks
08-13-2010 , 05:18 AM
Quote:
But you can't construct a confidence interval without data..because you have to have bounds in a CI..and to get those bounds, you have to have data. Before data is gathered there is no CI so there is no chance that a CI has a 95% chance of containing the true mean because there is no CI in existence because CI's are based on data. There are no CI's that will be produced in the future unless more data is obtained for the reason above (CI's are based on data)...so there isn't a 95% chance that future CI's will contain the true mean because no future CI's will be formed..without data.
You're not understanding what I'm saying. The bounds of the CI are random before data is gathered; before data is gathered you can think of the CI as (L,U), where L and U are RVs. Just like you don't know the number on a die before it's rolled, you don't know the bounds of the CI. Similarly, if you measure heights of a population, the sample mean is *random* before data is collected.

Quote:
Also, your "it's either 0% or 100% so there is no 95%" argument doesn't make sense. I am 80% confident that it's between 0.XX and 33.XX BB/100. I'm 95% confidence that it's between -3.XX and 36.XX BB/100. This is set up this way because since you are grabbing more BB/100 chances..you are more likely to have "picked" the right number. Just like someone who buys a lottery ticket is twice as confident that he will win if he buys 2 tickets instead of 1.
No, you're just using a different confidence interval.

Quote:
It's not quite the same thing in this situation because not all numbers have the chance of being someone's BB/100 average, but how does it not make sense that something that either is or isn't something can have a certain chance of being one or the other?
Please read my posts, I feel like I'm repeating myself to everyone. The true parameter is not random, you can't say the true parameter is between -10 and 50 w/ some probability between 0 and 1. Only random variables can be described in terms of probability. Yes, it's that simple.

Quote:
If you could then that would mean that the first relevant future CI you come up with will be my range of BB/100 averages, and we would be 95% confident that that range was a range that included my average.
To clarify, there's no time element (skimming back, I don't think I ever mentioned "future").

(I answer the last part of your post in the very next post)
08-13-2010 , 05:20 AM
Quote:
So make the "technical" definition apply to real life. Thanks
Quote:
Originally Posted by statmanhal
The question I have is how would you interpret or use that information?
One particular CI will just give you an "idea" of where the true parameter lies.

Maybe it'll help if I explain the idea behind CI. In probability theory (not statistics), if you take the average of n independent identically distributed (iid) random variables w/ mean u and standard deviation a, the average is a random variable, and it converges to the normal distribution as n tends to infinity by Central Limit Theorem, and its mean is u and standard deviation is a/sqrt(n) by simple calculation.

So let X = (1/n)(Z1 + ... + Zn), the average of all Z's, where each Z is iid w/ mean u and sd a. X converges to Normal (u, a/sqrt(n)) as n goes to infinity. So we can say:

P(u - 1.96a/sqrt(n) < X < u + 1.96a/sqrt(n)) = 0.95 as n tends to infinity,

which using math, is saying the exact same statement as:

P(X - 1.96a/sqrt(n) < u < X + 1.96a/sqrt(n)) = 0.95

So now back to statistics. Notice that the above equation looks very similar to a confidence interval; in fact, it is the idea behind confidence intervals. Also notice that the above equation states that the probability u is in the interval (X - 1.96a/sqrt(n), X + 1.96a/sqrt(n)) is 95%, which is a true statement, and it's what I've been saying (again, remember, X is random before we gather data).

Now we gather data and find that X = x (small x, our sample mean, is a #; it's not random). We don't know a, so we plug in the sample standard deviation for it. Now we plug x into the above interval and call it our confidence interval, which is now:

(x - 1.96a/sqrt(n), x + 1.96a/sqrt(n))

Does this mean u is in the above interval w/ 95% probability? No! There is nothing probabilistic in the above interval once we replace X w/ x. Everything in the above statement is a number, not a random variable. All we can say is that 95% of CI's (over an infinite number of CI's) created in this fashion will contain the true mean. Unfortunately, you don't know w/ absolute certainty whether yours is one of them.

Last edited by :::grimReaper:::; 08-13-2010 at 05:26 AM.
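The recipe in the post above, turned into a short sketch: simulate some made-up per-100-hands results (the winrate and SD below are invented, not anyone's real data), compute the sample mean, plug in the sample SD for the unknown a, and report x +/- 1.96a/sqrt(n):

```python
import math
import random

random.seed(42)
# Made-up per-100-hands results (BB/100), standing in for real session data
data = [random.gauss(8.0, 60.0) for _ in range(250)]

n = len(data)
xbar = sum(data) / n  # the realized sample mean: a plain number, not random
# Sample standard deviation, plugged in for the unknown a
s = math.sqrt(sum((d - xbar) ** 2 for d in data) / (n - 1))
half = 1.96 * s / math.sqrt(n)  # the 1.96a/sqrt(n) half-width

ci = (round(xbar - half, 2), round(xbar + half, 2))
print(ci)  # a pair of plain numbers; nothing random remains once x is realized
```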
08-23-2010 , 03:17 PM
Maybe you should use this instead (HEM gives you SD):
http://www.castrovalva.com/~la/win.htm

Btw 16 BB/100 or any small range around that looks very optimistic, esp at 200NL.

m