09-28-2007 , 11:25 AM
If you want to read this stuff in German, go here

Well, there have been some recent discussions about variance in poker: how big the swings can get, how long you can break even, and so on.

So far there has never been a statistically correct approach to this question.

First, everybody calculated confidence intervals based on the PT SD and the winrate per 100 hands. This was then shown to be wrong in this thread, mainly by Pokey.

The problem was that the samples are not normally distributed, so the whole confidence-interval machinery, which is based on the normal distribution, does not apply.

There was another approach by a guy who tried to use his own distribution. It is not very well documented, and I didn't calculate it with 6.5 PTBB because I figured it would be best to start the thing unbiased. The approach definitely went in the right direction, though I don't know whether it was calculated correctly...

What I did:

I have a big database of observed NL400 hands. From it I queried one million hand results, restricted to players with more than a 350 dollar stack.

I plugged them into MATLAB and used this data to simulate per-hand money results, so the simulation represents an average full-stack NL400 player.

To do the confidence-interval math we need a normally distributed form of poker results. For this we simply use the central limit theorem (CLT):

First I simulated PokerTracker: I took a couple of sets of 10k random samples of 100 hands each and checked whether their means are somewhat normally distributed. Result:

not really normally distributed.

500-hand samples:

still not...

1k-hand samples:

we have a match.
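A minimal sketch of this resampling check, using a synthetic stand-in distribution since the NL400 database isn't available (the fold/pot mixture below is invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)

def hand_results(n):
    """Synthetic per-hand results in BB: mostly zeros and small blind
    losses from folding, plus a skewed, fat-tailed minority of played
    pots. An invented stand-in, NOT the real NL400 data."""
    folded = rng.random(n) < 0.75
    fold_loss = rng.choice([0.0, -0.5, -1.5], size=n)   # fold / SB / BB lost
    pots = rng.exponential(scale=20.0, size=n) - 18.0   # skewed pot results
    return np.where(folded, fold_loss, pots)

def skewness(x):
    """Sample skewness; roughly 0 for a normal distribution."""
    x = np.asarray(x, dtype=float)
    return float(((x - x.mean()) ** 3).mean() / x.std() ** 3)

# Per-hand results are heavily skewed, i.e. clearly non-normal...
print("per-hand skew:", skewness(hand_results(100_000)))

# ...but by the CLT, means of bigger and bigger chunks approach normal:
for n in (100, 500, 1000):
    means = [hand_results(n).mean() for _ in range(2000)]
    print(f"skew of {n}-hand sample means:", skewness(means))
```

With bigger chunks, the skew of the sample means shrinks toward zero; that is the pattern behind the normality checks above (in the real data the post finds roughly n = 1000 is needed).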

If we want confidence intervals, we need to group our poker hands into 1k samples, calculate the SD and winrate per 1k hands, and plug those into the winrate simulations. You will see that the 100-hand-sample winrate simulation underestimates the spread of the confidence interval. Not hugely, but still noticeably:

If you calculate a 95% confidence interval over 100k hands for a 10 BB/100 winner with a standard deviation of 70 BB per 100 hands, you land between 5.6 and 14 BB/100.

If you do it with 1k samples and an SD of 294 (the SD of my 1k-sample set), it's between 4.2 and 15.7 BB/100.
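The two intervals quoted above can be reproduced with a few lines (1.96 standard errors for 95%, plugging in the numbers from the post):

```python
import math

def ci95(winrate_per_100, sd_per_chunk, chunk_size, total_hands):
    """95% confidence interval for the winrate (in BB/100), given the
    SD measured over chunks of `chunk_size` hands, assuming the chunk
    means are normally distributed."""
    n_chunks = total_hands / chunk_size
    se_per_chunk = sd_per_chunk / math.sqrt(n_chunks)
    se_per_100 = se_per_chunk * 100 / chunk_size
    return (winrate_per_100 - 1.96 * se_per_100,
            winrate_per_100 + 1.96 * se_per_100)

# SD of 70 BB per 100 hands, over 100k hands:
print(ci95(10, 70, 100, 100_000))    # -> about (5.66, 14.34)
# SD of 294 BB per 1k hands -- a noticeably wider interval:
print(ci95(10, 294, 1000, 100_000))  # -> about (4.24, 15.76)
```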

The problem is: we cannot simulate swings with 1k samples. We don't know what happens within those 1k hands, so we can't say how big the swings really are...

For that I used the distribution of my one-million-hand sample...

We cannot fit a closed-form function to this the way we could with a normal distribution. Simply shifting the whole distribution upward would also be wrong, because then a winning player would gain some amount even by folding. So we need to simulate a winning player by changing the distribution of the average player, whose winrate is slightly below zero due to rake...

To do this I looked at all positive outcomes in my sample and added the amount needed to generate a distribution with a winrate of 5 PTBB/100. (I know that's a very good winrate, but I used it because it shows a kind of minimum variance that every player has to endure at least.)

It's not 100% realistic, because a good player will not only win more but also lose less. I don't think it changes much, so for starters I went with only adding an amount to every positive outcome (= every won hand).
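The construction described above, adding a fixed amount to every positive outcome until the distribution's mean reaches the target winrate, can be sketched like this (1 PTBB = 2 BB, so 5 PTBB/100 is 0.1 BB per hand; the sample array is made up for illustration):

```python
import numpy as np

def boost_winners(results_bb, target_ptbb_per_100):
    """Shift only the positive per-hand outcomes (won hands) upward by
    a constant until the overall mean equals the target winrate.
    Folds and losses are left untouched, as described in the post."""
    results = np.asarray(results_bb, dtype=float)
    wins = results > 0
    target_mean = target_ptbb_per_100 * 2 / 100        # PTBB/100 -> BB/hand
    shortfall = target_mean * results.size - results.sum()
    boosted = results.copy()
    boosted[wins] += shortfall / wins.sum()
    return boosted

# Toy sample: an average player slightly below zero due to rake
sample = np.array([0.0, -1.0, 40.0, -0.5, -20.0, 15.0, 0.0, -35.0])
winner = boost_winners(sample, 5.0)
print(winner.mean())   # mean is now 0.1 BB/hand (= 5 PTBB/100)
```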

I then simulated 100 million hands with this sample, meaning I drew a random number from this distribution 100 million times.

I then wrote an algorithm to count the downswings this player would experience:

There is a 10+ downswing every 47k hands, a 15+ every 150k, and a 20+ every 500k, measured in buy-ins...
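A downswing-counting algorithm along these lines (one plausible definition; the post doesn't spell out the exact rule) tracks the running equity peak and counts each peak-to-recovery episode once if its drawdown reaches the threshold:

```python
import numpy as np

def count_downswings(per_hand_results, threshold):
    """Count downswing episodes of at least `threshold` (in the same
    money unit as the results, e.g. buy-ins). An episode starts when
    equity falls below its running peak and ends when the peak is
    regained; each episode is counted at most once."""
    peak = 0.0
    trough = 0.0
    count = 0
    already_counted = False
    for equity in np.cumsum(per_hand_results):
        if equity >= peak:            # previous peak regained or exceeded
            peak = trough = equity
            already_counted = False
        else:
            trough = min(trough, equity)
            if not already_counted and peak - trough >= threshold:
                count += 1
                already_counted = True
    return count

# Toy check: equity path 1, -2, -1, 2, 1 has one drawdown of 3 buy-ins
print(count_downswings([1, -3, 1, 3, -1], threshold=3))   # -> 1
```

Dividing the number of simulated hands by the episode count gives the "one every 47k hands" style of figure quoted above.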

So the probability of having a 20+ downswing within 100k hands is 20 percent, or 1 in 5... so 20+ downswings don't happen often to very good players in a weak player field, but they definitely will happen!

I did the same thing for a 2.5 PTBB/100 player.

There is a 10+ downswing every 42k hands, a 15+ every 87k, a 20+ every 164k, and a 30+ every 574k.

If you are a winning player in a tough player field, you will suffer 20+ downswings quite often!

I think this is the most accurate analysis of variance in poker done so far, and I think I ruled out the mistakes made in previous analyses...

Still, input is highly appreciated (especially from Pokey and the other statistics experts here on 2+2).
09-28-2007 , 03:17 PM
I can't see the image at "for that i used the distribution of my one million sample..."

Can you repost that image or link?
09-28-2007 , 04:00 PM
Sure...

It's nothing special; the graph doesn't really show anything except that there are lots of zeros and blind-sized amounts because of folding...

http://img260.imageshack.us/img260/1...0averague5.jpg
09-28-2007 , 06:55 PM
The normal distribution is an estimate of the real distribution. It's a better estimate in fixed-limit poker. In NL there's also the variance of the variance; in FL, the variance differs much less from session to session.
09-28-2007 , 07:19 PM
I haven't done any analysis of fixed-limit poker, but it seems logical that the distribution converges to normal with smaller samples than in NL, because the spread of losses and wins is much smaller.

I strongly disagree with the statement that the normal distribution is an estimate of the real distribution. As you can see, in no-limit it is not AT ALL normal. In fixed-limit it's closer, but still far, far away, because you will have a lot of zeros due to folding.

The output is either normally distributed or it is not; it is wrong to use the normal as an estimate when you know it is incorrect. You will have errors in your calculations, and it's impossible to tell how big the error is. As you can see in my calculations, there is an error!

The only way to turn a non-normal distribution into a normal one is to apply the CLT by drawing random samples of size n from the original distribution.

In full-stack no-limit hold'em this n is about 1k. Smaller samples are not normally distributed and therefore useless for confidence intervals.

It might be smaller for limit hold'em.
09-29-2007 , 05:59 AM
Quote:

First, everybody calculated confidence intervals based on the PT SD and the winrate per 100 hands. This was then shown to be wrong in this thread, mainly by Pokey.

The problem was that the samples are not normally distributed, so the whole confidence-interval machinery, which is based on the normal distribution, does not apply.

I don't agree with that summary of what Pokey said. Many people have misrepresented the logical arguments and conclusions from that thread.

PokerTracker does not compute the SD correctly. If you played a ridiculously long session, the algorithm would fail spectacularly, but no one does that. How much is the SD estimate off by ignoring the variance within sessions? In many cases, PokerTracker will only slightly underestimate the SD. This is far from Pokey's statement that the PokerTracker SD is "meaningless." It means we have a slightly biased estimate, which we might be able to correct, and which is much better than nothing even uncorrected.

For example, polls asking people to give their PokerTracker stats say that the SD is about 10% higher when people play shorthanded limit as opposed to full ring limit. Do we throw this data out? We shouldn't if PokerTracker is underestimating all SDs by 10%. If the underestimates depend on session length, and shorthanded players play longer sessions, then we might need to increase the shorthanded SDs by a few percent more, on average, which would increase the size of the average gap.

The distribution per hand is not normal. So, what? Normal approximations are still applicable in many circumstances -- the whole point of a normal approximation is that it works on distributions that are not normal. Pokey correctly pointed out that it may take a different number of hands to be able to use a normal/Brownian approximation for the purpose of looking at downswings than for the ending distribution. But he didn't logically argue against ever using the normal/Brownian approximation.

Those aren't the only points Pokey made, but those are the main ones that are getting repeated in a garbled fashion.
09-29-2007 , 06:54 AM
I know that everybody misunderstands everything when we get into advanced statistics; that's why I decided to do it myself.

I am not saying PokerTracker is completely wrong, and Pokey did exaggerate with the word "meaningless" in the sense that word carries in everyday language...

However: we do not know whether PokerTracker's SD samples are normally distributed, and we do not know how big the error is. We do know that we make some kind of error, and statistically, if you make an error whose size you cannot bound, the result is statistically "meaningless", just not meaningless in the everyday sense of the word!

As I showed above, poker-hand samples converge to a normal distribution after approximately 1,000 hands.

If sessions are shorter or longer than the samples we actually calculate with, the SD comes out wrong (typically underestimated).

There are a couple of errors within the PokerTracker algorithm that skew the result, so it is not correct.

I know that the results are still somewhat reasonable; my example shows this too, since the confidence interval is only slightly bigger!

However, you cannot use big samples for downswing calculations, and this is the MAIN point Pokey made!

There is a huge difference between computing a confidence interval for your winrate and actually simulating downswings and the like.

This is why I used the original distribution and not a sample of samples of it... and this is really, really important!
09-30-2007 , 12:36 PM
The distribution of individual hand returns is not very important unless you are projecting over a small number of hands (you're not) or there are extreme outliers (like hands with 1,000 BB swings; which there are not). What is very important is even tiny dependencies among hands, such as a win-rate that goes up and down or a tendency to tilt.

If I understand things, all your simulations assume independent hand results. The main difference is not the shape of the hand result distribution, but your assumed mean and standard deviation of hand results. Mean and standard deviation do affect downswing projections, shape does not except in the extremes.

I claim if you do independent simulations for any shape distribution that is remotely reasonable, and fix the mean and standard deviation, you will get similar values for expected downswings.

On the other hand, if you implement even small dependencies in the simulation, you will get big differences. For example, if you assume win rate is not fixed at 5 PTBB but is a random walk starting at 5 PTBB and moving up or down 0.01 PTBB every hand; or if you assume the mean result on a hand is 5 PTBB + 0.001*(profit over last 100 hands - 5); you will find great differences in downswings.
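As a toy illustration of that point (made-up numbers; an independent-noise model with a drifting mean, not a calibrated poker model), the random-walk win-rate example can be sketched like this:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
SD_HAND = 3.5    # per-hand SD in PTBB (illustrative number only)

def max_drawdown(per_hand_results):
    """Largest peak-to-trough drop of the cumulative equity curve."""
    equity = np.cumsum(per_hand_results)
    return float(np.max(np.maximum.accumulate(equity) - equity))

# Case 1: independent hands, winrate fixed at 5 PTBB/100
fixed = rng.normal(5.0 / 100, SD_HAND, N)

# Case 2: same noise, but the winrate itself random-walks
# up or down 0.01 PTBB/100 every hand, starting at 5
walk = 5.0 + np.cumsum(rng.choice([-0.01, 0.01], size=N))
drifting = rng.normal(walk / 100, SD_HAND, N)

print("max drawdown, fixed winrate:   ", max_drawdown(fixed))
print("max drawdown, drifting winrate:", max_drawdown(drifting))
```

Comparing the two drawdowns shows how even a tiny dependency in the mean can reshape downswing behavior relative to the independent case.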

I'm not suggesting these are reasonable models of poker, I'm just saying if you want to get useful inferences about downswings, study downswings and dependencies, not hand-by-hand analyses assuming independence.

As a practical example, you will read a lot of nonsense about how Joe DiMaggio's 56-game hitting streak is a fantastically improbable statistical event, based on assuming at-bats are independent. The really silly thing about this argument is it also shows virtually all long hitting streaks are highly improbable. If your model shows something is improbable, but it happens frequently, change your model. If you study hitting streaks instead of modeling them as independent runs of at-bats, you find once someone gets a streak going, they are much less likely to be walked before getting a hit in a game. That skews the probabilities of long streaks enormously.

In poker, it's silly to assume people play the same way, and are played the same way, in the midst of a downswing versus winning periods. Bad players create their own downswings (they're called tilts). Good players can use the downswing to their advantage, at least in the kinds of games where the players are known.
09-30-2007 , 01:15 PM
Thank you, I really liked that input.

The issue you pointed out is totally correct.

But I wanted to rule out things like tilt and dependencies of that kind. It's just that a lot of people don't know how swingy poker can be from the variance alone, before the extra variance players bring in because their winrate fluctuates (which it does, I agree completely).

It's also kind of hard to simulate. I just wanted to show that even if two robots play against each other, you have to expect AT LEAST those downswing frequencies. If there are dependencies, it's even worse...

Of course nobody plays exactly the same, and some games are more aggressive than others, but it should give a framework for how big swings can and will be at a minimum. It is calculated on a correct data basis, even though it's a model, and a model of course never equals reality: it's a simplified model of an average player with no dependencies.

I might do something like you suggested, but the algorithms would probably be pretty complicated, and winrate dependencies would greatly increase the time and CPU power needed for the calculations...

One thing though: you will not get similar results if you don't use the correct distribution. There will be quite big errors if the skew is wrong and events the distribution treats as almost impossible are in fact "only" unlikely.
09-30-2007 , 02:18 PM
I don't know if you have read Nassim Taleb's The Black Swan. He calls the Normal distribution "GIF" for "great intellectual fraud," and argues that small probabilities of extreme events invalidate Normal calculations in many important cases. This is a popular work (but quite deep), there are more formal academic versions of the argument. Mason Malmuth makes a related point about situations that are "self-weighting" and "non-self-weighting."

The key is whether a single observation can significantly affect a total. If you measure the average wealth of individuals, and sample 100,000 US adults, adding Bill Gates to your sample will more than double your average. But if you measure the average height of even 1,000 people, adding an NBA center will not change your result much.

In this context the question is how much extreme hands add to the size of downswings. It can't be very much under most reasonable assumptions. Events that are reasonably common (like losing all-in) happen often enough that there are a reasonably predictable number over the number of hands that form a downswing. For example, if 1 hand in 100 is an extreme bad event, you expect to get 100 of them over a 10,000 hand downswing; you might get 80 or 120, but you're not going to get 50 or 150 if things are independent. Even in no-limit poker, this should not dominate the downswing compared to ordinary bad events adjusted for standard deviation. The only exception would be if you kept every penny of your winnings on the table and the other players could match your stack. Assuming you cash out reasonably often and buy-in for a fixed (or reasonably stable) amount; I don't see how you can have extreme events that will make a significant difference to downswing probabilities (adjusted for mean and standard deviation).
09-30-2007 , 06:20 PM
I don't know if I get your point (English is not my mother tongue).

Is the effect you are describing shown in my results, where the 5 PTBB winning player has a 15+ downswing quite frequently but a 20+ downswing very infrequently, compared with the ratio of 10+ to 15+ downswings?

I'm just saying that if you try to simulate downswings with samples of 100 hands and assume a normal distribution, you will get quite different results than with the original distribution (for the same player), because an all-in happens quite frequently compared to smaller losses...

And that is even ignoring the problem that you won't capture what happens within those 100 hands.

There was such a simulation in an earlier thread, and there was criticism of its results. I'd have to do my own calculations to show the exact difference.

Of course that simulation will not produce a completely silly result, and my result doesn't fully resemble poker reality either.

But all the calculations before were made with samples of 100 hands assuming a normal distribution. This affects confidence intervals and also Kelly bankrolls (which would be the next step).

I'm using a Kelly bankroll (I'm relatively tilt-free and have no problem stepping down and so on) and I calculated the limit-change points with the standard deviation from PT. I assume there are some miscalculations there because of my results here, but I have to look into that further anyway.
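For context, the Kelly-style bankroll mentioned here is commonly approximated (under a normal approximation and assuming fixed stakes) as variance divided by winrate; the numbers below are illustrative, not the poster's actual figures:

```python
def full_kelly_bankroll(winrate_per_100, sd_per_100):
    """Bankroll at which playing the current stakes is exactly full
    Kelly: variance / winrate (both per 100 hands, same money unit)."""
    return sd_per_100 ** 2 / winrate_per_100

# e.g. a 10 BB/100 winner with an SD of 70 BB/100
print(full_kelly_bankroll(10, 70))   # -> 490.0 BB
```

If the PT standard deviation is underestimated, this bankroll figure comes out too small, which is the miscalculation concern raised above.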

Again, thank you very much for the input; it's good and very interesting to have a discussion like this.
07-11-2019 , 03:50 AM
Quote:
Originally Posted by AaronBrown
I don't know if you have read Nassim Taleb's The Black Swan. He calls the Normal distribution "GIF" for "great intellectual fraud," and argues that small probabilities of extreme events invalidate Normal calculations in many important cases. This is a popular work (but quite deep), there are more formal academic versions of the argument. Mason Malmuth makes a related point about situations that are "self-weighting" and "non-self-weighting."
Hi Aaron:

I just noticed this thread and want to make a small comment. But first, let me state that I haven't done this type of statistical work for many years and don't plan to do so now.

For those who don't know, my "point about situations that are "self-weighting" and "non-self-weighting"" comes from my gambling theory book, where I was talking about bankroll requirements. For those not familiar with this book, since it is now 32 years old, mine was the first work in this area of how much money you need to survive (that I know of) which made any sense. Anyway, what I was pointing out here was that my estimates assumed the expectation and standard deviation for each playing session were the same. Without this assumption it would not have been possible for me to compute anything.

However, in reality, this assumption is never true. So what does this mean? Take the simple case where the expectation is always the same for each playing session but the standard deviation is not: some sessions will have a larger-than-average standard deviation and some a smaller one. That should be obvious. But what it also does, from a statistical perspective, is effectively reduce the sample size, which means your confidence intervals should be larger than what a normal distribution would give us. That in turn meant to me that the bankroll estimates I developed were, for this reason (as well as some others that other people pointed out), too small.

Exactly how much too small is hard to say. But in my book I recommended that my bankroll numbers at three standard deviations be increased by 10 to 20 percent, and over the years, based on my own experience, I have found this to be reasonable.

Quote:
The key is whether a single observation can significantly affect a total. If you measure the average wealth of individuals, and sample 100,000 US adults, adding Bill Gates to your sample will more than double your average. But if you measure the average height of even 1,000 people, adding an NBA center will not change your result much.
I actually have some direct experience here from years ago. It occurred back in the late 1970s while I was working for the Census Bureau on the Annual Housing Survey, which today is called the American Housing Survey. What happened, and I remember this being in the Houston Standard Metropolitan Statistical Area, was that land samples would be taken where good address lists did not exist, and in one of these land samples it was discovered that a very large mobile home park had been built. Based on the sampling probability, our sample estimate would now show way more mobile homes in the rural areas of Houston than there were in all of Houston (and we knew this from independent sources). So an adjustment had to be made as to exactly how much weight to give this "outlier" sample.

Quote:
In this context the question is how much extreme hands add to the size of downswings. It can't be very much under most reasonable assumptions. Events that are reasonably common (like losing all-in) happen often enough that there are a reasonably predictable number over the number of hands that form a downswing. For example, if 1 hand in 100 is an extreme bad event, you expect to get 100 of them over a 10,000 hand downswing; you might get 80 or 120, but you're not going to get 50 or 150 if things are independent. Even in no-limit poker, this should not dominate the downswing compared to ordinary bad events adjusted for standard deviation. The only exception would be if you kept every penny of your winnings on the table and the other players could match your stack. Assuming you cash out reasonably often and buy-in for a fixed (or reasonably stable) amount; I don't see how you can have extreme events that will make a significant difference to downswing probabilities (adjusted for mean and standard deviation).
Best wishes,
Mason
