If you want to read this in German, go here.
Well, there were some recent discussions about the variance of poker: how big the swings can get, how long you can break even, and so on.
So far there has never been a correct approach to this.
At first everybody calculated confidence intervals based on the PT SD and the winrate per 100 hands... this was then proven wrong in this thread, mainly by pokey.
The problem was that the samples are not normally distributed, so the whole confidence interval approach, which is based on the normal distribution, is not correct.
There was another approach by a guy who tried to use his own distribution. It's not very well documented, and I didn't calculate it with 6.5 PTBB, but I figured it would be best to start the thing unbiased. The approach definitely went in the right direction. I don't know if it is calculated correctly, though...
What I did:
I have a big database of observed NL400 hands. I queried one million hand results from it, all from players who had a stack of more than 350 dollars.
I plugged them into MATLAB and used them to simulate poker hand results money-wise... so the simulation represents an average full-stack NL400 player.
To do the confidence interval stuff we need poker results in a normally distributed form. For this we simply use the central limit theorem:
First I simulated PokerTracker. I took a couple of 10k random 100-hand samples and checked whether they are somewhat normally distributed. Result:
not really normally distributed.
500-hand samples:
still not...
1k-hand samples:
we got a match.
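A rough sketch of how such a check could look in Python (this is not the MATLAB code from the post). The `hand_results` list is a made-up skewed stand-in for the real database sample, and using skewness and excess kurtosis as a crude normality check is an assumption; both should move toward 0 as the block size grows.

```python
import random
import statistics

def block_sums(hand_results, block_size, n_blocks, rng):
    """Draw n_blocks random blocks of block_size hands and return each block's total."""
    return [sum(rng.choices(hand_results, k=block_size)) for _ in range(n_blocks)]

def skew_and_excess_kurtosis(values):
    """Skewness and excess kurtosis; both are near 0 for a normal distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    skew = sum(((v - mean) / sd) ** 3 for v in values) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in values) / n - 3.0
    return skew, kurt

rng = random.Random(1)
# Hypothetical stand-in sample: mostly folds and small losses, a few big pots.
hand_results = [0.0] * 700 + [-2.0] * 200 + [6.0] * 80 + [-100.0] * 10 + [100.0] * 10

for block_size in (100, 500, 1000):
    sums = block_sums(hand_results, block_size, 1000, rng)
    skew, kurt = skew_and_excess_kurtosis(sums)
    print(block_size, round(skew, 2), round(kurt, 2))
```

With a real sample you would swap `hand_results` for the queried results; a histogram or a Q-Q plot of `sums` would be the visual version of the same check.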
If we want to do confidence intervals we need to split our poker hands into 1k-hand samples, calculate the SD and winrate per 1k, and plug those into the "winrate simulations". You will see that the 100-hand winrate simulation underestimates the spread of the confidence interval. Not hugely, but still noticeably:
If you calculate a 95% confidence interval for 100k hands on a 10 BB/100 winner with a standard deviation of 70 BB/100, you'll be between 5.6 and 14 BB/100.
If you do it with 1k-hand samples and an SD of 294 (the SD of my 1k-hand samples), it's between 4.2 and 15.7 BB/100.
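The arithmetic behind these two intervals can be sketched like this. It assumes the 70 BB figure is the standard deviation per 100 hands, the 294 figure is the standard deviation per 1k hands, and a plain normal-approximation interval with z = 1.96; the function name is made up for illustration.

```python
import math

def winrate_ci(winrate_per_100, sd_per_block, block_size, total_hands, z=1.96):
    """Normal-approximation confidence interval for the winrate, in BB/100."""
    n_blocks = total_hands / block_size
    # Standard error of the mean block result, rescaled to a per-100-hands figure.
    se_per_block = sd_per_block / math.sqrt(n_blocks)
    se_per_100 = se_per_block * 100 / block_size
    half_width = z * se_per_100
    return winrate_per_100 - half_width, winrate_per_100 + half_width

print(winrate_ci(10, 70, 100, 100_000))    # 100-hand blocks, SD 70 BB per 100 (assumed)
print(winrate_ci(10, 294, 1000, 100_000))  # 1k-hand blocks, SD 294 BB per 1k (assumed)
```

The second interval comes out wider because 294 per 1k hands is more than the 70 × √10 ≈ 221 you would expect if 100-hand blocks simply added up like independent normal samples.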
The problem is: we cannot simulate swings with 1k-hand samples. We don't know what happens within those 1k hands, so we can't say how big the swings really are...
For that I used the distribution of my one-million-hand sample...
We cannot just turn this into a function and shift it, as we could with a normal distribution. The reason is that a winning player would then gain some amount on every fold, which is not correct. So we need to simulate a winning player by changing the distribution of the average player, whose winrate is slightly below zero due to rake...
For this I looked at all positive outcomes in my sample and added the amount needed to generate a distribution with a winrate of 5 PTBB/100 (I know it's a very good winrate, but I used it because it shows a kind of "minimum" of variance that every player has to take.)
It's not 100% realistic because a good player will not only win more but also lose less. I don't think it changes much, so for starters I went with only adding an amount to every positive outcome (= winning a hand).
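A minimal sketch of this adjustment, assuming the same constant is added to every winning hand (the post doesn't say whether the added amount was constant or proportional). The toy sample and the dollar conversion of 5 PTBB/100 are assumptions for illustration only.

```python
def shift_winners(hand_results, target_mean_per_hand):
    """Add a constant to every positive result so the sample mean hits the target."""
    n_positive = sum(1 for r in hand_results if r > 0)
    if n_positive == 0:
        raise ValueError("sample has no winning hands to shift")
    current_mean = sum(hand_results) / len(hand_results)
    # The total gap has to be spread over the winning hands only.
    bonus = (target_mean_per_hand - current_mean) * len(hand_results) / n_positive
    return [r + bonus if r > 0 else r for r in hand_results]

# Hypothetical toy sample in dollars per hand, slightly negative on average (rake).
sample = [0.0, 0.0, -4.0, -2.0, 5.0, 0.0, -6.0, 6.0, 0.0, 0.0]
# Assumption: 1 PTBB = $8 at NL400, so 5 PTBB/100 is $0.40 per hand.
shifted = shift_winners(sample, 0.40)
print(sum(shifted) / len(shifted))
```

Folds and losing hands stay untouched, so the winning player doesn't magically earn money by folding.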
I then simulated 100 million hands with this sample, meaning I drew a random value out of this sample 100 million times.
I then wrote an algorithm to count the downswings this player would take:
There is a 10+ buy-in downswing every 47k hands, a 15+ buy-in downswing every 150k hands and a 20+ buy-in downswing every 500k hands...
So the probability of having a 20+ buy-in downswing in 100k hands is about 20 percent, or 1 out of 5... so 20+ buy-in downswings for very good players in a weak player field don't happen often, but they definitely will happen!
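The counting algorithm itself isn't shown in the post, so here is one way it could work, as a sketch only. It assumes a downswing is a drop from a bankroll peak that lasts until that peak is beaten, and that each such episode counts once for every threshold it reaches. The toy sample (in buy-ins per hand) and the small thresholds are made up so the example runs quickly.

```python
import random

def count_downswings(results, thresholds):
    """Count drawdown episodes whose depth reaches each threshold.

    An episode starts at a bankroll peak and ends when that peak is beaten;
    each episode counts once per threshold it reaches.
    """
    counts = {t: 0 for t in thresholds}
    bankroll = peak = 0.0
    deepest = 0.0  # deepest drop below the current peak in this episode
    for r in results:
        bankroll += r
        if bankroll > peak:
            for t in thresholds:
                if deepest >= t:
                    counts[t] += 1
            peak, deepest = bankroll, 0.0
        else:
            deepest = max(deepest, peak - bankroll)
    # Close the episode that is still open at the end of the run.
    for t in thresholds:
        if deepest >= t:
            counts[t] += 1
    return counts

rng = random.Random(7)
# Hypothetical per-hand results in buy-ins, standing in for the adjusted sample.
sample = [0.0] * 70 + [-0.01] * 20 + [0.03] * 8 + [-1.0, 1.0]
hands = rng.choices(sample, k=200_000)  # draw with replacement, one value per hand
for t, c in count_downswings(hands, (1, 2, 5)).items():
    print(f"{t}+ buy-in downswings: {c}")
```

Dividing the number of simulated hands by each count gives the "one every N hands" figures; with the real sample you would use thresholds of 10, 15 and 20 buy-ins and far more hands.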
I did the same thing for a 2.5 PTBB player.
There is a 10+ buy-in downswing every 42k hands, a 15+ every 87k, a 20+ every 164k and a 30+ every 574k.
If you are a winning player in a good player field you will suffer 20+ buy-in downswings quite often!
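To turn a "one downswing every N hands" figure into the chance of hitting at least one within a given stretch, dividing the stretch by N works as a quick estimate. A slightly more careful sketch treats downswings as a Poisson process, which is an assumption (real downswings overlap and are not independent), and gives somewhat lower numbers.

```python
import math

def prob_at_least_one(hands_played, hands_per_downswing):
    """P(at least one downswing), treating downswings as a Poisson process."""
    return 1.0 - math.exp(-hands_played / hands_per_downswing)

# Frequencies quoted in the post for 20+ buy-in downswings.
for label, every in (("5 PTBB", 500_000), ("2.5 PTBB", 164_000)):
    print(label, round(prob_at_least_one(100_000, every), 3))
```

For the 5 PTBB player over 100k hands this gives about 18% instead of the simple 100k / 500k = 20%.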
I think this is the most accurate analysis done on variance in poker so far, and I think I ruled out all the mistakes made in previous analyses...
Still, input is highly appreciated... (especially from pokey and the other statistics experts here on 2+2)