Estimating the accuracy of a small hands-sample - Poker Theory


Estimating the accuracy of a small hands-sample


09-11-2017 , 09:55 AM

wtbhhz

stranger

Join Date: Sep 2017 Posts: 4

I've been sitting on this mathematical problem for a while, so I decided to ask the community around here (I'm usually more of a "reader").

In this situation I'm trying to predict Villain's VPIP. This is an example and I will leave all other factors that can help to get an idea of Villain's VPIP (such as stack-size, position, stakes, etc.) out of the question. This is a purely theoretical situation.

Now, it could be that I'm missing something but I consider the following two facts to be true:

- If I have 0 hands on Villain I will look entirely to the population to estimate Villain's VPIP. If I know that most people have a VPIP around, let's say, 25%, I consider it most likely that Villain's VPIP is also around 25% (or around the population's average) and will play accordingly.
- If I have 1000 hands on Villain I will look entirely to the data I have on this particular Villain. So, if this statistic says Villain's VPIP is 45%, I don't care about the population's average of 25% and will play as if Villain is playing 45% of his hands.

Now, since there are situations between these two extremes, I want to know the math.

For example: I have four hands on Villain, and Villain's VPIP shows up as 100%. I know that it's more likely that Villain just got four playable hands in a row by chance than that Villain really plays 100% of his hands. But I think (and I could be totally wrong) that this small sample of only four hands, combined with my knowledge about the population, tells me at least a bit more about Villain. Intuitively, I would now put a higher chance on Villain being a 40-VPIP kind of player than I would have before those four hands.

But how do I do the math? How do I weigh a big, accurate sample of the population against a small (but growing) sample on Villain to form my most accurate prediction about Villain?

Visualizing what I mean in a table (numbers are completely made up, but should give an idea of what I mean):

To not regard a Villain's VPIP as 100 after opening just two hands is for sure a good idea. But what would be a good approach? Or is there none and do I -statistically- have to regard the chance Villain plays the next hand as 100%?

One approach I took was to calculate the probability of a particular event (let's say, opening two hands in a row) under every possible true VPIP. For example, if Villain actually has a VPIP of 25% but plays the first two hands I play with him/her, the chance of this occurring is ~6%. If Villain were to open two hands out of three, the chance of this occurring would be ~14%. After calculating this for every possible VPIP (rounded to 0 decimals), I could make a weighted sum and estimate Villain's VPIP more accurately. However, this doesn't take into account the fact that there are simply more people with a VPIP of 25%, which makes that value (at least intuitively) extra likely.

The following image explains my approach a bit more:

The image consists of five tables with each two columns. The situation is that Villain opened 40% of the hands but each table represents a different sample size. From left to right: 10-25-50-100-250.

The table should be read as follows: if Villain's real VPIP is 39%, then there is a 25.03% chance this player plays exactly 4 of the next 10 hands.
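As a rough check of the arithmetic in this post, the per-VPIP probabilities can be sketched in Python with the binomial formula. The function name `binom_pmf` is illustrative and not something from the original tables; it assumes each hand is an independent trial with a fixed VPIP.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k voluntarily played hands out of n,
    assuming a constant true VPIP of p and independent hands."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The three cases mentioned in the post.
print(f"2 of 2 at 25% VPIP:  {binom_pmf(2, 2, 0.25):.4f}")
print(f"2 of 3 at 25% VPIP:  {binom_pmf(2, 3, 0.25):.4f}")
print(f"4 of 10 at 39% VPIP: {binom_pmf(4, 10, 0.39):.4f}")
```

Repeating the last call for every candidate VPIP from 0% to 100% gives one column of the kind of table described above.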

Thanks a lot!

Last edited by wtbhhz; 09-11-2017 at 10:11 AM.


09-11-2017 , 10:14 AM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

This has been solved, although it is lost somewhere in the vast 2p2 landscape. The TL;DR is that you get a surprisingly reliable number for VPIP after only a few dozen hands. IIRC, it was statistically significant information after 20 hands.

The downside is that even if you know a player's VPIP EXACTLY, this is still only secondary info; how they play their hands in different scenarios is much more important. Those stats come up less often and are situation-dependent, so it takes a longer history to get usable info.


09-11-2017 , 10:18 AM

wtbhhz

stranger

Join Date: Sep 2017 Posts: 4

Quote:

Originally Posted by robert_utk

This has been solved, although it is lost somewhere in the vast 2p2 landscape. The TL

I totally agree with you! But it is a problem that just bothered my mind. More of a theoretical problem I just wanted an answer on.

Do you, by any chance, remember any phrases used in the OP or comments, or words in the title or anything like that? So I can find this topic by myself? I would be eager to read it.

Last edited by wtbhhz; 09-11-2017 at 10:28 AM.


09-11-2017 , 10:40 AM

Bob148

Carpal \'Tunnel

Join Date: May 2012 Posts: 11,972

With each aggressive action, your opponent further pigeonholes himself as an overaggressive player. With each passive action, your opponent further pigeonholes himself as a passive player. With each fold, your opponent further pigeonholes himself as a tight player. With each call, your opponent further pigeonholes himself as a loose player.

Paging Sir Bayes.


09-11-2017 , 10:59 AM

wtbhhz

stranger

Join Date: Sep 2017 Posts: 4

I know enough. Thanks Bob and Robert!


09-11-2017 , 11:07 AM

wtbhhz

stranger

Join Date: Sep 2017 Posts: 4

This topic can be useful for people looking for the same answer: http://archives1.twoplustwo.com/show...fpart=all&vc=1

The calculations in the topic are not completely correct, but the idea is. I feel like I had a wall before my face, using Bayes' theorem for a long time in other applications but not seeing the fact I could use it for this matter.


09-11-2017 , 12:15 PM

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

Quote:

Originally Posted by wtbhhz

That's the one! Glad you found it, I forgot where I had read it.


09-11-2017 , 01:46 PM

statmanhal

Pooh-Bah

Join Date: Jan 2009 Posts: 4,987

Quote:

Originally Posted by wtbhhz

In this situation I'm trying to predict Villain's VPIP. This is an example and I will leave all other factors that can help to get an idea of Villain's VPIP (such as stack-size, position, stakes, etc.) out of the question. This is a purely theoretical situation.

In addition to using Bayes' theorem, which the referenced link does, you can use standard confidence interval theory or a significance test.
VPIP and PFR are probabilities; therefore the binomial distribution applies, provided the probabilities are constant over the observation period and will remain so.

Confidence Interval: Assume a VPIP of 40% was observed over a sample of 20 hands. You can say the following: I am 90% confident that the true VPIP is at least 0.40 - 1.28*SQRT(0.4*0.6/20) = 26%. (Using 1.64 instead gives the 95% figure, 22%.) This approximates the binomial with the normal distribution, so the sample size should be at least 20. A more accurate calculation shows the 90% lower limit to be 24.9%.

The general formula if the sample VPIP is V based on N samples is

The C% lower confidence bound is approximately V - Zc*SQRT(V*(1-V)/N)

where Zc = 0.84 for C% = 80%, 1.28 for 90%, and 1.64 for 95% (these are one-sided values).
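The lower-bound formula can be sketched in Python alongside an exact binomial version for comparison. This is only an illustration under assumptions: the function names are made up here, and the "exact" bound is taken to be the VPIP at which seeing at least the observed number of played hands has probability 1 - C (a one-sided Clopper-Pearson bound), found by bisection.

```python
from math import comb, sqrt


def normal_lower_bound(v: float, n: int, z: float) -> float:
    """Normal-approximation lower bound: V - Zc*SQRT(V*(1-V)/N)."""
    return v - z * sqrt(v * (1 - v) / n)


def upper_tail(k: int, n: int, p: float) -> float:
    """P(at least k played hands out of n) for a true VPIP of p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def exact_lower_bound(k: int, n: int, conf: float) -> float:
    """One-sided exact lower bound: the p where upper_tail(k, n, p) = 1 - conf."""
    if k == 0:
        return 0.0
    alpha = 1.0 - conf
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; upper_tail increases with p
        mid = (lo + hi) / 2
        if upper_tail(k, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# 8 played hands out of 20 (40% VPIP), 90% one-sided confidence (z = 1.28).
print(f"normal approx: {normal_lower_bound(0.40, 20, 1.28):.3f}")
print(f"exact:         {exact_lower_bound(8, 20, 0.90):.3f}")
```

For small samples the two numbers can differ by a percentage point or more, which is the gap the post refers to when it mentions a more accurate calculation.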

Significance Test: Given a sample VPIP value, say Vs=80% based on a sample of 10 hands (8 bets) , a binomial calculator can be used to determine the likelihood of that result if the true VPIP is Vt, say Vt=50%. If you have Excel, the formula is:

Probability (sample VPIP >=80% for 10 hands if true VPIP = 50%) = 1 -BINOMDIST(7, 10, 0.50,1)

The binomial term is the probability of 7 or fewer bets in 10 hands, so 1 minus this probability is the chance of 8 or more bets in 10 hands.

For this example, the probability is 5.5% that this result will occur if villain’s true VPIP was 50%. Thus the result does not contradict the 50% hypothesis at the 5% significance level, though it’s close. However if you hypothesize that villain’s VPIP was 30%, then,

Pr = 1-BINOMDIST(7,10, 0.30, 1) = 0.16%, making such a sample result highly unlikely for Vt= 30%.

In general:

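Pr(sample VPIP >= Vs over N samples if true VPIP = Vt) = 1 - BINOMDIST(B - 1, N, Vt, 1)

where B is the number of voluntary bet hands = Vs*N (an integer). BINOMDIST(B - 1, ...) is the probability of B - 1 or fewer bets, which is why the example with 8 bets uses 7.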
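For readers without Excel, the same tail probability can be sketched in plain Python. The function name `prob_at_least` is illustrative; it sums the binomial probabilities from B up to N directly, which matches the worked example where 8 bets correspond to 1 - BINOMDIST(7, ...).

```python
from math import comb


def prob_at_least(b: int, n: int, vt: float) -> float:
    """P(at least b voluntary bet hands in n hands) if the true VPIP is vt."""
    return sum(comb(n, i) * vt**i * (1 - vt) ** (n - i) for i in range(b, n + 1))


# The two hypotheses from the example: 8 or more bets in 10 hands.
for vt in (0.50, 0.30):
    print(f"true VPIP {vt:.0%}: {prob_at_least(8, 10, vt):.4%}")
```

A small probability means the observed sample would be unusual under that hypothesized VPIP, so the hypothesis becomes harder to believe.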


09-14-2017 , 03:13 PM

statoverflow

newbie

Join Date: Sep 2017 Posts: 23

There's a statistical distribution for this type of thing: it's the Beta distribution. You can elicit a prior Beta distribution from the population and update that as data comes in.

https://en.wikipedia.org/wiki/Beta_distribution

For a course about this, check out Week 1 of the Coursera course

https://www.coursera.org/learn/bayesian


09-14-2017 , 03:41 PM

#10

whosnext

Carpal \'Tunnel

Join Date: Mar 2009 Posts: 6,732

The beta-prior approach has been discussed in several threads.

Here is a thread which describes how to utilize your "prior" beliefs of the distribution from which villain emanates. And then how to "update" your beliefs with new observation of villain to derive your new "posterior" distribution for villain (mean and standard deviation).

https://forumserver.twoplustwo.com/2.../#post48790891

The crux of the matter is given in this post.

Quote:

Originally Posted by whosnext

Analytically a beta distribution has two parameters (call them A and B). The mean of a beta is A/(A+B) and the variance is (AB)/(((A+B)^2)*(A+B+1)).

Using this as your Bayesian prior, if you know (or are willing to estimate) your mean and variance (square of standard deviation), you can solve for A and B using the above formulas.

Then the Bayesian posterior distribution is given by its two parameters A' = A + K and B' = B + T - K where there are K "successes" out of T new observations. Mean and variance of the posterior distribution are easily derived using the formulas above.
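A minimal Python sketch of this beta update follows. The function names are illustrative, and the population standard deviation of 10 percentage points is purely an assumption for the example; only the 25% population mean and the four-for-four sample come from earlier in the thread.

```python
from math import sqrt


def beta_from_moments(mean: float, sd: float) -> tuple[float, float]:
    """Solve mean = A/(A+B) and var = AB/((A+B)^2 (A+B+1)) for A and B."""
    var = sd**2
    if not 0 < var < mean * (1 - mean):
        raise ValueError("variance must be between 0 and mean*(1-mean)")
    total = mean * (1 - mean) / var - 1  # this is A + B
    return mean * total, (1 - mean) * total


def update(a: float, b: float, k: int, t: int) -> tuple[float, float]:
    """Posterior parameters after K successes in T new observations."""
    return a + k, b + t - k


def beta_mean(a: float, b: float) -> float:
    return a / (a + b)


def beta_sd(a: float, b: float) -> float:
    return sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


# Prior: population mean 25%, assumed sd 10%. Data: Villain played 4 of 4 hands.
a, b = beta_from_moments(0.25, 0.10)
a_post, b_post = update(a, b, k=4, t=4)
print(f"prior     A={a:.4f} B={b:.4f}")
print(f"posterior mean={beta_mean(a_post, b_post):.3f} sd={beta_sd(a_post, b_post):.3f}")
```

A tighter prior (smaller sd) makes A + B larger, so the same four hands move the estimate less; a wider prior lets the sample dominate sooner.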


09-15-2017 , 10:29 AM

#11

robert_utk

Not From the UK

Join Date: Jan 2005 Posts: 4,822

You guys are amazing at maths. I feel like a dweeb, but at least I can follow along.
