An in-depth discussion of the relation of playing style to variance: warning, math inside - Page 3 - Live Low-stakes No Limit Poker Forum

I'll try to clear up my point of view one more time.

Winrate is a mean. Being that our statistic is a mean, we should look at the standard error to see how much our sample mean (observed winrate) may differ from our population mean (true winrate). Note that this has nothing to do with the observed variance of the sample.

Of course it has something to do with the observed variance of the sample! How do you calculate standard error?

Quote:

So, swings, the sample's variance value, etc. have nothing to do with my point. My point is that loose players need fewer hours for their observed winrate to track to their true winrate. The interval estimates of their winrates will be narrower because the standard error associated with them is lower. This makes intuitive sense since there is less luck involved for loose players than tight players over an equal sample size.

It might make intuitive sense to you, but I don't understand why. If you VPIP more hands your standard deviation rises, in turn raising your standard error.

I do not understand why you think the standard error should be lower. Your argument that loose players make more decisions is not correct because decisions are not data points, hands are.

Quote

02-12-2019 , 02:59 AM

#52

Angrist

Pooh-Bah

Join Date: Sep 2011 Posts: 3,887

Yea, the argument that loose players need fewer hours needs justification IMO. Also not sure that there's less luck involved playing more marginal thin spots than there is playing fewer high edge spots.

Quote

02-12-2019 , 11:37 AM

#53

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by browni3141

Of course it has something to do with the observed variance of the sample! How do you calculate standard error?

It might make intuitive sense to you, but I don't understand why. If you VPIP more hands your standard deviation rises, in turn raising your standard error.

I do not understand why you think the standard error should be lower. Your argument that loose players make more decisions is not correct because decisions are not data points, hands are.

Respectfully, I think you don't understand what standard errors are when we are dealing with means. Winrate is a mean statistic.

"The standard error of the mean is a measure of the dispersion of sample means around the population mean." In other words, standard error in this case asks "If we take repeated samples from a population, how spread out around the population mean will these sample means be?"

And so, it has zero to do with the observations in a particular sample and how swingy they may or may not be. We are only looking at the sample means (our observed winrates).

As for the 2nd part, I think my answer to the first part addresses this too wrt the standard error stuff. As for why I believe it makes intuitive sense for loose players who vpip more to achieve their true winrate quicker? Well, that is because their skill (or lack thereof) has more impact over short periods compared to tight players, who are more dependent on the deck (luck). By playing more hands, they are essentially increasing their sample size (I mean, they absolutely are increasing their sample size and this would be clear if we measured in hands played rather than hours played). And as I'm sure you know, as the sample size increases, sample means cluster more closely around the population mean. In our parlance, observed winrates cluster more closely around the true winrate.

Ultimately, it boils down to the fact that loose players have larger samples over an equal number of hours because they play more hands. And larger samples mean more certainty.

Quote

02-12-2019 , 11:45 AM

#54

AlanBostick

Carpal 'Tunnel

Join Date: Sep 2002 Posts: 11,482

Quote:

Originally Posted by cannabusto

Winrate is a mean. Being that our statistic is a mean, we should look at the standard error to see how much our sample mean (observed winrate) may differ from our population mean (true winrate). Note that this has nothing to do with the observed variance of the sample.

Standard error has everything to do with the observed variance of the sample. If the observed variance of N samples is s^2, then the standard error is s/sqrt(N).

Standard error goes down with the square root of the sample size. It goes up with the square root of the sample variance.
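For anyone who wants to check the formula numerically, here is a minimal Python sketch. The function name `standard_error` and the two lists of hourly results are made up for illustration; they are not data from anyone in this thread. Both lists have the same mean and the same N, so any difference in standard error comes from the sample variance alone.

```python
import math
import statistics


def standard_error(results):
    """Estimate the standard error of the mean from one sample: s / sqrt(N)."""
    n = len(results)
    s = statistics.stdev(results)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical hourly results in big blinds; both lists have the same mean (5).
calm = [3, 7, 4, 6, 5, 5, 2, 8]
swingy = [-35, 45, -20, 30, 5, 5, -50, 60]

print(statistics.mean(calm), standard_error(calm))
print(statistics.mean(swingy), standard_error(swingy))
```

The second list should print a much larger standard error than the first, even though the observed winrate is identical.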

Quote

02-12-2019 , 11:55 AM

#55

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by AlanBostick

Again, respectfully, you are not understanding that we are dealing with the standard error of a mean. Thus, the sample we are examining is actually a bunch of sample means, not observations that make up a single sample. The wiki article on standard error explains this well.

I mean, I'm not the smartest person on earth, but I do analyze data for a living. I'm very confident that I'm correct. To be clear, you and browni are correct given your current point of reference--it's just that we are not discussing what you two think we are.

Quote

02-12-2019 , 02:06 PM

#56

Angrist

Pooh-Bah

Join Date: Sep 2011 Posts: 3,887

Quote:

Originally Posted by cannabusto

Ultimately, it boils down to the fact that loose players have larger samples over an equal number of hours because they play more hands. And larger samples mean more certainty.

This isn't right. Folding is still playing a hand and a datapoint. So if anything a tighter player will play slightly *more* hands than a looser player as they spend less *time* in each hand and the table will see more hands dealt per hour.

Quote

02-12-2019 , 02:18 PM

#57

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

You're right. Good point. I should have been discussing "decisions" or perhaps "betting rounds" rather than "hands."

Quote

02-12-2019 , 03:01 PM

#58

AlanBostick

Carpal 'Tunnel

Join Date: Sep 2002 Posts: 11,482

Quote:

Originally Posted by cannabusto

The true probability distribution is unknown. The true mean is unknown. The true variance is unknown.

The only way we have to estimate the true values are statistics we derive from our observations: the mean of our sample data, the variance of our sample data, and so on.

The standard error of a statistic derived from the sample data is simply another statistic derived from that same sample data.

In the case of win rate (which is a mean of the sample data) the standard error is the square root of the variance of the sample data divided by the square root of the number of data points in the sample.

The standard error has everything to do with the variance of the sample data.

It doesn't matter if the sample data consists of hands (e.g. in Poker Tracker) or ordered pairs of (session_net_result[k], session_duration[k])

It is patently obvious that the standard error of the mean of the sample data is going to depend on the swinginess, i.e., the variance, of the sample data.

Quote

02-12-2019 , 03:09 PM

#59

AlanBostick

Carpal 'Tunnel

Join Date: Sep 2002 Posts: 11,482

I have shown that folding every hand has very low, but nonzero, variance, and suggested very strongly that adding hands to play increases variance. In other words, in the limit of VPIP -> 0, variance increases with number of hands played.

If it is true that for sufficiently large VPIP, variance decreases with each additional hand played, then there must be a VPIP that is less than 100% for which variance is a maximum.

Quote

02-12-2019 , 03:21 PM

#60

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Idk what to tell you at this point. You need to google more. Standard error and standard error of the mean are different things.

Imagine several 500 hour samples of someone who vpips every hand and shoves every flop. Now, imagine several 500 hour samples of someone who folds pre every hand except when dealt AA/KK. Which player is more likely to converge to their true winrate sooner? Whose results will be more consistent sample to sample?

Quote

02-12-2019 , 03:26 PM

#61

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Here is a link that explains the difference more in depth, but I think the previously mentioned wiki article on SE is fine enough. https://stats.stackexchange.com/ques...deviation?rq=1

Quote

02-12-2019 , 04:58 PM

#62

browni3141

Pooh-Bah

Join Date: Aug 2015 Posts: 5,678

Quote:

Originally Posted by cannabusto

I mean, I'm not the smartest person on earth, but I do analyze data for a living. I'm very confident that I'm correct. To be clear, you and browni are correct given your current point of reference--it's just that we are not discussing what you two think we are.

Are they taking applications? I think I'm doing pretty well in this thread.

Quote:

Originally Posted by cannabusto

Idk what to tell you at this point. You need to google more. Standard error and standard error of the mean are different things.

Do you think I don't google?

Quote:

Originally Posted by cannabusto

Here is a link that explains the difference more in depth, but I think the previously mentioned wiki article on SE is fine enough. https://stats.stackexchange.com/ques...deviation?rq=1

Okay, I read your link, affirming everything Alan and I have been saying all along. Did you read it?

Can you answer the question quoted below? Rather than just declaring what I've already posted in the thread to be incorrect, can you post the correct answer?

Quote:

Originally Posted by browni3141

Of course it has something to do with the observed variance of the sample! How do you calculate standard error?

Quote

02-12-2019 , 05:31 PM

#63

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by browni3141

Are they taking applications? I think I'm doing pretty well in this thread.

Do you think I don't google?

Okay, I read your link, affirming everything Alan and I have been saying all along. Did you read it?

Can you answer the question quoted below? Rather than just declaring what I've already posted in the thread to be incorrect, can you post the correct answer?

I didn't say you weren't doing well. I don't know why you're being combative. I'd be delighted to be proven wrong. I'd rather learn than be right or have pride or whatever.

I didn't say you didn't Google. I said you two need to Google more. Perhaps I was incorrect about that, but still I don't believe you understand what I'm talking about. I say this with all due respect.

Yes, I can answer the question. Calculating the standard error of a parameter estimate involves taking sigma and dividing by the sqrt of n, as you well know.

Still, this doesn't explain why we are differing in our views. The standard error of the mean involves a sample of sample means. Not a sample of observations.

You and Alan are looking at me sideways because you know damn well that a LAG has more variance in a given sample of observations. As do I. But I am not talking about sample variance.

"The sampling distribution of a population mean is generated by repeated sampling and recording of the means obtained. This forms a distribution of different means, and this distribution has its own mean and variance."

This is what I'm talking about. The distribution has its own mean and variance--it does not share these traits with its samples. If one were to take repeated samples, record the means, and analyze the resulting distribution, one could ascertain how much sample means vary across samples. This is explained very clearly in the link and I'm not sure where you're having a disconnect.

Point being that, despite greater variance within samples, sample means may well vary less for LAGs as I believe they do.

Quote

02-13-2019 , 09:44 AM

#64

Garick

Oberbiergenießer

Join Date: Dec 2007 Posts: 26,516

Quote:

Originally Posted by cannabusto

"The sampling distribution of a population mean is generated by repeated sampling and recording of the means obtained. This forms a distribution of different means, and this distribution has its own mean and variance."

This is what I'm talking about. The distribution has its own mean and variance--it does not share these traits with its samples.

Point being that, despite greater variance within samples, sample means may well vary less for LAGs as I believe they do.

Thanks for posting this, especially the bolded. It helps make sense of the observed results that mpethy posted back in the day and I quoted ITT.

Regardless of "spot variance," LAGs have been observed to have smoother "big picture" graphs, and nits, especially short-stacking ones, to have much more jagged ones.

Quote

02-13-2019 , 10:08 AM

#65

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by cannabusto

Yes, I can answer the question. Calculating the standard error of a parameter estimate involves taking sigma and dividing by the sqrt of n, as you well know.

I accidentally wrote the standard error of the mean formula rather than what you asked, browni. For calculating the standard error of a statistic/parameter, you would subtract the mean from each observation, square the differences, and take the sqrt.

Quote

02-13-2019 , 03:00 PM

#66

rainbow57

grinder

Join Date: Jul 2008 Posts: 668

Quote:

Originally Posted by cannabusto

Point being that, despite greater variance within samples, sample means may well vary less for LAGs as I believe they do.

I am curious why you believe this.

Quote

02-13-2019 , 03:10 PM

#67

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Empirically, I believe it because of mpethy's data analysis cited above. Logically, I believe it because they make more decisions/play more betting rounds than TAGs do in any given sample. This enriches each observation so that it effectively inflates the sample size, which leads to faster convergence to the mean.

Quote

02-14-2019 , 01:26 AM

#68

rainbow57

grinder

Join Date: Jul 2008 Posts: 668

Quote:

Originally Posted by cannabusto

I get that, intuitively, it can seem like a tighter style would lead to less variance and thus, a less consistent winrate over shorter time periods. But looser styles lead to more decision points and hands and opportunities to apply our edge. In statistical terms, it leads to more data and more data = less variance, all else being equal.

Looser players will see greater variations in session to session results. But their winrate is easier to estimate than a tighter player's rate over an equal amount of time.

AGs of any sort, tight or loose or somewhere in between, tend to look for spots to raise turn, bluff raise river, or take any other aggressive, +EV actions. Fit or fold, passive players will see more variance as their style is more dependent on connecting with the board, which is pure luck and will vary greatly over small to medium sized samples.

This is what I was looking for. The theoretical rationale, or why it makes sense to you.

I still don't quite buy it. I get the idea and it does make some sense, but it really does not match my experience at all. I haven't found the data you referenced; can you quote that here?

From countless sharkscope giraffes I have seen, the LAG players always have giraffes that are all over the place... Sometimes winning over 5000 games, and then losing the next 10000, etc. The more TAG players always show up with very consistent graphs over much shorter samples. I consider myself TAG (sometimes slightly LAG, sometimes very TAG) and my results have always been very reliable. I also am at the upper end of profitability in any variation of NLHE I have played (not trying to brag here, I am just explaining my experience and beliefs). I think playing LAG is intrinsically more dependent on emotional stability and sharp focus... Constantly putting yourself in tricky spots for big pots makes it hard to stay stable. A TAG player has easier decisions and stronger ranges.

Question... if a LAG has greater SD but quicker convergence to the true mean win rate, then how do we make sense out of a confidence interval which increases in 'width' when the SD increases?
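To put rough numbers on that question, here is a small Python sketch of the usual normal-approximation interval, winrate ± 1.96 × SD / sqrt(hours). The helper names `ci_half_width` and `hours_needed` and the two hourly SD values are assumptions picked for illustration, not measured figures, and the sketch treats each hour as an independent observation.

```python
import math

Z_95 = 1.96  # normal approximation for a 95% interval


def ci_half_width(sd_per_hour, hours):
    """Half-width of an approximate 95% CI for the hourly winrate."""
    return Z_95 * sd_per_hour / math.sqrt(hours)


def hours_needed(sd_per_hour, half_width):
    """Hours needed for the 95% CI half-width to shrink to `half_width`."""
    return math.ceil((Z_95 * sd_per_hour / half_width) ** 2)


# Hypothetical hourly standard deviations in big blinds, not measured values.
for label, sd in [("lower SD", 60.0), ("higher SD", 90.0)]:
    print(label, round(ci_half_width(sd, 500), 2), hours_needed(sd, 5.0))
```

Under these assumptions the higher-SD player gets a wider interval over the same 500 hours and needs more hours to reach the same precision, which is the tension the question points at.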

Quote

02-14-2019 , 02:07 AM

#69

poke4fun

veteran

Join Date: Mar 2011 Posts: 3,190

Something else to keep in mind is that the perception of LAG actually varies quite a bit, and it’s important to understand that when used in the context that LAG has lower variance than a tight player, said LAG is a winning player, not just a loose player that plays a lot more hands.

A "losing" loose player is going to have a lot higher variance than a winning LAG. I am also certain that a losing LAG player could actually lower his variance and improve his WR at the same time by simply playing fewer hands.

The reason why a winning LAG has lower variance can be explained simply with a few examples:

30% equity hands - TAG/tighter players on average rely less on FE than a winning LAG, and therefore their EV is more likely to be directly correlated to their losing equity than a LAG to his. A winning LAG's EV in this range of hands is actually higher.
70% equity hands - winning LAG players on average also win more money, because their perceived range is wider. LAGs get called more often and for more money. Again, LAG has higher EV in these hands.

In other words, when comparing similar scenarios of 70/30 and on both sides, LAG loses less and wins more than TAG/tighter players.

Only caveat is that it is FAR more difficult to be a winning LAG and achieve lower variance than simply play TAG or fewer hands.

Quote

02-14-2019 , 03:42 AM

#70

browni3141

Pooh-Bah

Join Date: Aug 2015 Posts: 5,678

Quote:

Originally Posted by Garick

Quote:

Originally Posted by cannabusto

I didn't say you weren't doing well. I don't know why you're being combative. I'd be delighted to be proven wrong. I'd rather learn than be right or have pride or whatever.

Sorry for being combative. Frankly you don't seem like you'd be delighted to be proven wrong. You've already eliminated the possibility that you might be wrong when you said you're "100% certain" that you're correct.

Quote:

I didn't say you didn't Google. I said you two need to Google more. Perhaps I was incorrect about that, but still I don't believe you understand what I'm talking about. I say this with all due respect.

I admit I do not like to be wrong, and for this reason I research the topics I am posting about when I'm unsure about something to the point where I feel prepared to make strong arguments, and it's a bit irritating that you and Garick seem to think I am just confused and a quick Google search will clear things up.

Quote:

Yes, I can answer the question. Calculating the standard error of a parameter estimate involves taking sigma and dividing by the sqrt of n, as you well know.

Quote:

Originally Posted by cannabusto

No disagreement here. Since we are in fact talking about the standard error of the mean (right?), the two formulae are equivalent, and we can use the first one.

Quote:

Still, this doesn't explain why we are differing in our views. The standard error of the mean involves a sample of sample means. Not a sample of observations.

You and Alan are looking at me sideways because you know damn well that a LAG has more variance in a given sample of observations. As do I. But I am not talking about sample variance.

I know that standard error of the mean involves a sample of the sample means. I know you are not talking about sample variance. However the SEM can be estimated knowing the sample variance using the formula you, Alan and I have all posted. I don't understand why you acknowledge the formula but deny the relationship between variance and SEM.

Quote:

I see this is quoted directly from the Wikipedia article on standard error. Read the next few sentences:

"Mathematically, the variance of the sampling distribution obtained is equal to the variance of the population divided by the sample size. This is because as the sample size increases, sample means cluster more closely around the population mean.
Therefore, the relationship between the standard error and the standard deviation is such that, for a given sample size, the standard error equals the standard deviation divided by the square root of the sample size. In other words, the standard error of the mean is a measure of the dispersion of sample means around the population mean. "

https://en.wikipedia.org/wiki/Standard_error
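The quoted relationship can be checked with a quick simulation. Below is a rough Python sketch that draws many repeated samples, records each sample mean, and compares the spread of those means with sigma / sqrt(n). The function name `sd_of_sample_means`, the normal distribution, and the sigma values are all assumptions chosen for illustration; real poker results are not normally distributed, so treat this as a demonstration of the formula rather than a model of any player.

```python
import math
import random
import statistics


def sd_of_sample_means(sigma, n, repeats, rng):
    """Draw `repeats` samples of size `n` and return the SD of their means."""
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


rng = random.Random(2019)  # fixed seed so the run is repeatable
n, repeats = 100, 2000
for sigma in (10.0, 30.0):  # hypothetical per-observation standard deviations
    observed = sd_of_sample_means(sigma, n, repeats, rng)
    predicted = sigma / math.sqrt(n)
    print(f"sigma={sigma}: observed {observed:.2f}, predicted {predicted:.2f}")
```

With the sample size held fixed, the spread of the sample means should scale with the per-observation standard deviation, which is what the formula says.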

Quote

02-14-2019 , 08:43 AM

#71

MikeStarr

Carpal 'Tunnel

Join Date: Jan 2016 Posts: 7,962

Quote:

Originally Posted by poke4fun

30% equity hands - TAG/tighter players on average rely less on FE than a winning LAG, and therefore their EV is more likely to be directly correlated to their losing equity than a LAG to his. A winning LAG's EV in this range of hands is actually higher.
70% equity hands - winning LAG players on average also win more money, because their perceived range is wider. LAGs get called more often and for more money. Again, LAG has higher EV in these hands.

Very nice post and very true. I'm sure I'm not the only one tired of this argument going on about the semantics of what variance is or what it means in mathematical terms.

There is no doubt that a good LAG player has a graph that looks smoother than a good TAG's graph. That's what most of us are talking about when we talk about variance.

When we talk about "long term" results, a good LAG will reach the long term much quicker than a good TAG.
If you break a LAG's graph into 1000 hour increments, the results will be more similar to each other than a good TAG's 1000 hour increments.

Now a bad LAG (which most of them are)....is a different story.

Quote

02-14-2019 , 09:23 AM

#72

Avaritia

Confirmed 2500 hour haver

Join Date: Feb 2013 Posts: 12,248

When an internet argument has devolved into the semantics of definition as opposed to the context of the subject matter, both sides have lost.

-Abraham Lincoln

Quote

02-14-2019 , 11:30 AM

#73

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by browni3141

Like I said before, what mpethy posted had nothing to do with variance in the statistical sense. Cannabusto and mpethy are not talking about the same thing. The post you quoted is correct, but it doesn't support any of your claims in the thread.

Sorry for being combative. Frankly you don't seem like you'd be delighted to be proven wrong. You've already eliminated the possibility that you might be wrong when you said you're "100% certain" that you're correct.

I admit I do not like to be wrong, and for this reason I research the topics I am posting about when I'm unsure about something to the point where I feel prepared to make strong arguments, and it's a bit irritating that you and Garick seem to think I am just confused and a quick Google search will clear things up.

No disagreement here. Since we are in fact talking about the standard error of the mean (right?), the two formulae are equivalent, and we can use the first one.

I know that standard error of the mean involves a sample of the sample means. I know you are not talking about sample variance. However the SEM can be estimated knowing the sample variance using the formula you, Alan and I have all posted. I don't understand why you acknowledge the formula but deny the relationship between variance and SEM.

I see this is quoted directly from the Wikipedia article on standard error. Read the next few sentences:

"Mathematically, the variance of the sampling distribution obtained is equal to the variance of the population divided by the sample size. This is because as the sample size increases, sample means cluster more closely around the population mean.
Therefore, the relationship between the standard error and the standard deviation is such that, for a given sample size, the standard error equals the standard deviation divided by the square root of the sample size. In other words, the standard error of the mean is a measure of the dispersion of sample means around the population mean. "

https://en.wikipedia.org/wiki/Standard_error

I appreciate the cordiality of the post. There's no need to be irritated regardless of who is right. These things can be tricky sometimes. And egos are deadly.

SEM cannot be estimated using the variance of a given sample. It can only be estimated using the variance of the sampling distribution--the variance of the distribution of means, not one of the distributions involving observations from a single sample. Likewise, the sqrt of n is not the sqrt of n observations, but the sqrt of n sample means.

In practical terms, you can calculate SEM from a single sample, but you're using the theoretical sampling distribution to do so.

Imagine an infinite amount of maniac samples. Now, imagine an infinite amount of OMC samples. The standard deviations and variances within samples will be wildly different between these two player types. But their SEMs will both be 0 because with infinite samples, the mean of the sampling distribution will equal the mean of the population.

And so, I contend that LAGs will asymptotically approach an SEM of 0 faster because their samples are richer. SEM is a function of sample size, not of within sample variance.

Quote

02-14-2019 , 11:33 AM

#74

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by rainbow57

Look at the above post. The CI of a parameter estimate will widen with SD as you would expect. That's a different thing altogether because we are only looking at one sample.

Quote

02-14-2019 , 11:35 AM

#75

cannabusto

Pooh-Bah

Join Date: Jan 2013 Posts: 4,876

Quote:

Originally Posted by Avaritia

When an internet argument has devolved into the semantics of definition as opposed to the context of the subject matter, both sides have lost.

-Abraham Lincoln

I get what you're saying, but I don't think that's what's happening here. It's just a confusing topic imo.

Quote
