Standard deviation of a sample

12-23-2017 , 07:02 AM
I am trying to understand the basics of probability as it relates to poker and am using the book The Mathematics of Poker (Chen, Ankenman).

I keep track of my winrate. I now understand it is merely a sample taken from an underlying distribution of all possible true win rates. I imagine that the standard deviation is the most useful piece of data about what my actual win rate is, especially in terms of confidence intervals.

How is the standard deviation calculated from observing my play? We know the monetary result of each hand and the mean (my winrate) but how do we know the probability of any particular outcome? Do we use z-scores based on a sigma value somehow derived from the universe of players who have played a similar number of hands as I have?

Thanks in advance.
12-24-2017 , 10:02 PM
If you are playing online, poker tracking programs provide the s.d. If you play live or don't have a tracking program, see BuceZ's response in this thread.

https://forumserver.twoplustwo.com/2...dard+deviation
12-24-2017 , 11:10 PM
I don't think standard deviation is that important of a number - it tends to have a fairly narrow range of values compared to win rate.

Standard deviation is a general statistical term and it's easy to compute. First, you have to have a bunch of "samples". In poker it's typical for each sample to be a set number of hands, or a set unit of time, such as 1 hour.

So you have a bunch of $ won per sample interval, like, you have many samples of 100 hands, each with $ won or lost. You compute the average. Call the average u.

For each sample in the record, subtract it from u, and square that. Add all these together and divide by the number of samples. That's your variance. Standard deviation is the square root of variance. There's more info on Wikipedia or Khan Academy or elsewhere on the internet.
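
The recipe above can be sketched in a few lines of Python. The per-block results are made-up numbers, not real data, and the function name is only for illustration. One addition that is not in the post: dividing by the number of samples gives the population variance, while estimates from a sample of results usually divide by n - 1 instead. With many blocks the difference is small.

```python
import math

def variance_and_sd(results, sample=False):
    """Return (mean, variance, sd) for a list of per-block results.

    sample=False divides by n, as described in the post.
    sample=True divides by n - 1, the usual sample estimate (an addition).
    """
    n = len(results)
    if n < 2:
        raise ValueError("need at least two blocks of results")
    u = sum(results) / n
    # Squared distance of each block from the average.
    squared_deviations = [(u - x) ** 2 for x in results]
    divisor = n - 1 if sample else n
    variance = sum(squared_deviations) / divisor
    return u, variance, math.sqrt(variance)

# Hypothetical $ won or lost in each block of 100 hands (made-up numbers).
blocks = [12.0, -30.0, 45.0, -8.0, 21.0, -15.0, 60.0, -25.0]

u, var, sd = variance_and_sd(blocks)
print(f"mean per 100 hands: {u:.2f}")
print(f"variance: {var:.2f}, standard deviation: {sd:.2f}")
```

Passing `sample=True` gives the n - 1 version, which should be slightly larger than the figure printed here.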
12-25-2017 , 05:01 AM
Thanks for your replies and the link. After looking further into the matter I agree SD is not that important in terms of inferring my true win rate. Only a large enough sample size can do that.
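
To make the point about sample size concrete, here is a rough sketch of a confidence interval for a win rate. It assumes that blocks of 100 hands are independent and that the sample mean is approximately normal. The win rate and standard deviation used are hypothetical placeholders, not numbers from this thread.

```python
import math

def winrate_confidence_interval(mean_per_100, sd_per_100, hands, z=1.96):
    """Approximate confidence interval for the true win rate per 100 hands.

    Assumes independent 100-hand blocks and a roughly normal sample mean.
    z=1.96 corresponds to about 95% confidence.
    """
    blocks = hands / 100
    standard_error = sd_per_100 / math.sqrt(blocks)
    return mean_per_100 - z * standard_error, mean_per_100 + z * standard_error

# Hypothetical inputs: 5 bb/100 observed, SD of 80 bb per 100 hands.
for hands in (10_000, 100_000, 1_000_000):
    low, high = winrate_confidence_interval(5.0, 80.0, hands)
    print(f"{hands:>9} hands: {low:6.2f} to {high:6.2f} bb/100")
```

The standard deviation sets the scale of the interval, but its width shrinks only with the square root of the number of hands, so 100 times as many hands narrows it by a factor of 10.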

The book only treats statistics in the very beginning. I found the subject interesting but not enough to really dive in. However, it struck me that in several examples the authors simply assume an SD. In the formula they give for variance, each summed square of the distance to the mean is weighted by the probability of the result. So I was curious how that can be calculated.
12-25-2017 , 02:42 PM
When you're dealing with a purely theoretical model, you can calculate variance using the probability of each outcome and the EV of each outcome - I guess this is what you're referring to above.

But when finding the standard deviation of a game that you have actual results for, it's just easier to do it from the results. This is especially true for poker, where it's not really practical to evaluate the EV and probability of *every possible* outcome. Variance converges fairly quickly so you don't need a huge sample size, and also, for most games, the range of variances among "normal" play is usually not that large.

I don't have numbers on hand, but for example when I was looking at 7 card stud hilo games, I found that variance might be in the 15-20bb/100 range (this is sort of a half remembered estimate, the actual numbers I don't remember). Extremely bad players might have much higher variance OR much lower variance. Imagine a particular kind of player, for example, who simply only calls, and never folds. He will lose a lot of money, but his EV line will go straight down, with very little variance. A player who shoves a lot might see a lot of small wins and a few big losses, which can lead to high variance.

Most decent players fall within a very narrow variance range. Some extremely tight players might have somewhat lower variance.

By the way, it might seem like I'm describing 2 ways to calculate variance, but they're actually the same. When you're measuring the variance of a population, you calculate the mean, subtract each sample from the mean, square that, add them together and divide by the number of samples.

When using a probabilistic model, as you mention, you use the squared distance from the mean multiplied by the probability. But this is really the same thing. As a really simple example, let's consider a case where we have
25% EV = 1
25% EV = 2
50% EV = -1
just a totally made up example. Clearly the mean is
.25 + .25*2 - .5 = .25

variance would be
.25(.25-1)^2 + .25(.25-2)^2 + .5(.25+1)^2 = 1.6875

But, you could also "imagine" a sample population using the individual percentages. Like, a truly representative population of 20 samples would have 5, 5 and 10 of each outcome. So variance would be
(
(.25-1)^2 + (.25-1)^2 +... (5 times)
+ (.25-2)^2 + (.25-2)^2 +... (5 times)
+ (.25+1)^2 + (.25+1)^2 +... (10 times)
) / 20

Which I think you can see is the same thing, because adding something 5 times is the same as multiplying by 5, and when you divide 5/20 you get .25 and so the component of variance from the first outcome could be expressed either as

((.25-1)^2 + (.25-1)^2 + (.25-1)^2 + (.25-1)^2 + (.25-1)^2) / 20
or
.25*(.25-1)^2

That was really long winded but I wanted to be as clear as possible.
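
For anyone who wants to check the arithmetic, the two calculations above can be run side by side in Python. This is only a sketch of the made-up example from this post, with variable names chosen for illustration.

```python
# Outcomes and probabilities from the made-up example in the post.
outcomes = [1, 2, -1]
probabilities = [0.25, 0.25, 0.5]

# Method 1: probability-weighted variance.
mean = sum(p * x for p, x in zip(probabilities, outcomes))
weighted_variance = sum(
    p * (mean - x) ** 2 for p, x in zip(probabilities, outcomes)
)

# Method 2: build the representative population of 20 samples (5, 5, 10)
# and average the squared deviations over all 20.
population = [1] * 5 + [2] * 5 + [-1] * 10
pop_mean = sum(population) / len(population)
population_variance = sum((pop_mean - x) ** 2 for x in population) / len(population)

print(mean, weighted_variance)        # 0.25 1.6875
print(pop_mean, population_variance)  # 0.25 1.6875
```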
12-25-2017 , 03:46 PM
Yes very clear. Thank you for all the time you took to make it that way.

As an aside, it seems the authors imply it is not statistically possible to infer the true distribution of starting hand ranges only by observing them because of bias (position and hand strength) and small sample size. So does this mean we assume standard starting ranges and only take meaning from observations that strongly deviate from these norms?

Thanks.
12-25-2017 , 08:13 PM
Quote:
Originally Posted by solarglow
As an aside, it seems the authors imply it is not statistically possible to infer the true distribution of starting hand ranges only by observing them because of bias (position and hand strength) and small sample size. So does this mean we assume standard starting ranges and only take meaning from observations that strongly deviate from these norms?
This is not exactly an area of my specialty, and I've had a lot of questions about it myself over the years. But that does seem right to me. A *huge* problem is that you only see hands that go to showdown, which obviously have a huge bias in them. Pairs and big cards should go to showdown more because they make showdown-worthy hands more often.

Something I always wondered is, do "most" people order starting hands in the same order? I think the answer is "no." If they did it would be easy because it doesn't take long to see what *percent* of hands a player plays. If they play 15% and you had an ordered chart that should correspond to a certain range. There are still issues of circumstances (position, have there been limpers/betters/raisers/etc) but it's way easier than reality.

I observed that some types of players might place a higher value on suited cards, while others might place a higher value on high cards, or connected cards, etc.

It seems like there's a lot of guesswork involved. I think over time you can start to classify players by "types" and that players with the same type might have similar ranges given a percentage of the time they pay to see a flop.

I think sometimes a flawed metric based on a large sample might be more useful than a perfect metric based on a small one. When I played a lot of stud games I would have my HUD show PFR based on door card, which is useful but takes 13x as long as getting a generic PFR. So I tended to show both numbers. Global PFR is less useful but converges faster so I didn't have to wait 1000 hands to know how to play a guy.
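
The trade-off described above can be roughed out with the standard error of an observed frequency, sqrt(p(1-p)/n). This sketch assumes each hand is an independent trial and that the 13 door-card ranks come up equally often, and it uses a hypothetical PFR of 20%. None of these numbers come from the post.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of an observed frequency p over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

DOOR_CARD_RANKS = 13  # assumption: each rank shows up equally often
ASSUMED_PFR = 0.20    # hypothetical raise frequency

for hands in (130, 1_300, 13_000):
    global_se = proportion_standard_error(ASSUMED_PFR, hands)
    # Each door-card bucket only sees about 1/13 of the hands.
    per_rank_se = proportion_standard_error(ASSUMED_PFR, hands / DOOR_CARD_RANKS)
    print(f"{hands:>6} hands: global +/-{global_se:.3f}, "
          f"per door card +/-{per_rank_se:.3f}")
```

Under these assumptions the per-door-card number is about sqrt(13), or roughly 3.6 times, noisier at the same hand count, and it needs 13 times as many hands to reach the same precision as the global number.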
12-26-2017 , 02:16 PM
I guess the "types" would correspond to a few standard behavior profiles (including starting hand ranges) from which we are looking for strong evidence of deviation.

I play micro stakes 10NL HE online and 1/2 or 2/4 NL HE live. So the types that I benefit from are well known (rock, maniac, etc). At higher levels there are probably TAGs and LAGs and other classifications.

I don't use a HUD because at this point I am not sure of the value. I want to train my observation skills. Even a metric like VPIP or 3bet % is suspect because as you point out different players would compose that interval differently. Also I believe the metric is an aggregate and could contain information from different types of games (6 handed, 10-handed, etc). Lastly, players sometimes vary their styles from session to session. Maybe they learned something new or just don't have the discipline.

I imagine a HUD could be useful at higher online stakes, once I know how to use it properly against much better players.