Hand history analysis - Gambling and Probability

10-12-2020 , 05:34 PM

d2_e4

certifiable moron

Join Date: Sep 2014 Posts: 12,107

I have approximately 35,000 NLHE and PLO tournament hand histories in raw format from a popular poker site. I would like to analyse these to see if there is a bias in the way the cards are dealt. Specifically, for all hands that went to showdown, I would like to see if the river card distribution is biased towards the hand which was behind on the turn. I am not concerned with the players or the action in the hand.

Is this sample size large enough to draw any reasonable conclusions, and if so, would anyone here be willing to perform this analysis for a fee? Looking for posters with some history of posting stats content in this forum please.

Last edited by d2_e4; 10-12-2020 at 05:41 PM.

10-12-2020 , 07:48 PM

heehaww

Pooh-Bah

Join Date: Aug 2011 Posts: 5,081

I don't need a fee if the monkey work is already done, i.e. converting the HHs to a CSV with two columns: turn equity and river outcome. I already have code that can do the math from there.

Quote:

Is this sample size large enough

It depends on how different the results are from expectation. With a Bayesian approach it also depends on our priors. First I would simply calculate how rare it is to observe these results with a fair RNG, then I'd do some Bayesian analyses with priors reflecting an initial belief that it's fair.
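To make the first step concrete, here is a minimal sketch of one way to measure how rare a result is under a fair RNG, assuming the two-column data described above (turn equity and river outcome). The function name and the use of a normal approximation are illustrative assumptions, not the code referred to in this post, and ties are ignored.

```python
import math

def fair_rng_p_value(equities, outcomes):
    """One-sided test: did underdogs win more often than a fair deal predicts?

    equities: turn equity of the underdog in each hand (values in 0..1)
    outcomes: 1 if the underdog won on the river, else 0
    Assumes independent hands and uses a normal approximation to the
    sum of Bernoulli trials with different success probabilities.
    """
    expected = sum(equities)
    variance = sum(p * (1 - p) for p in equities)
    observed = sum(outcomes)
    z = (observed - expected) / math.sqrt(variance)
    # Upper-tail probability of a standard normal at z
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value
```

For example, 100 hands with a 20% underdog and 30 underdog wins gives z = 2.5, which a fair deal produces less than 1% of the time. With small samples or very small equities, an exact calculation would be safer than the normal approximation.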

10-12-2020 , 11:52 PM

d2_e4

certifiable moron

Join Date: Sep 2014 Posts: 12,107

Hi heehaww, thank you for your response. Unfortunately the HHs are in raw text format and I don't have any software that would do that conversion. Also, I may look to do additional analysis on the data once this initial phase is complete, so it would make sense to first convert to whatever format is most conducive to such. Shall I drop you a PM?

Last edited by d2_e4; 10-13-2020 at 12:00 AM.

10-17-2020 , 02:56 PM

browni3141

Pooh-Bah

Join Date: Aug 2015 Posts: 5,678

It seems like more of a programming task. I'm not volunteering, but there's no reason a separate person couldn't do the conversion.

Edit to say, you really only need to give the original HH format (an example for testing would be best), and output specifications. You could share that info publicly and make an offer for a working converter. I think there are sites that make work requests like this pretty easy. You can also post it here, although the task shouldn't require knowledge specific to poker.

Last edited by browni3141; 10-17-2020 at 03:02 PM.

10-18-2020 , 05:01 AM

d2_e4

certifiable moron

Join Date: Sep 2014 Posts: 12,107

Heehaww and I are discussing it over email. One of us will update the thread if there's anything interesting to report. For now, I'm operating on the assumption that I'm seeing monsters under the bed.

Heehaww is away for the next week, so there won't be any updates for a little while regardless.

10-29-2020 , 12:39 AM

heehaww

Pooh-Bah

Join Date: Aug 2011 Posts: 5,081

I think I can automate the hand filtering/converting process. However, I just noticed a big problem with my plan: showdown bias. Missed draws will sometimes fold on the river, whereas completed draws won't. The outdrawn player will sometimes fold too, but that doesn't cancel out. A proper analysis would require knowledge of the hole cards in non-shown-down hands, or short of that, detailed population stats on how often missed draws successfully bluff, how often the other player goes for value again, etc. Absent both of those, I could focus on the times Hero chased and the times people were all-in, but I doubt that would be an adequate subsample from 35k hands.
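A rough sketch of the size of this effect, under assumed numbers: if a missed draw folds the river some fraction of the time and the outdrawn player folds some other fraction of the time, the share of completed draws among shown-down hands differs from the true completion rate. Both fold rates below are hypothetical parameters, not measured population stats.

```python
def showdown_completion_rate(p_hit, missed_fold_rate, outdrawn_fold_rate=0.0):
    """Fraction of shown-down hands in which the draw completed.

    p_hit: true chance the draw completes on the river
    missed_fold_rate: assumed chance a missed draw folds before showdown
    outdrawn_fold_rate: assumed chance the outdrawn player folds instead
    """
    hit_shown = p_hit * (1 - outdrawn_fold_rate)
    miss_shown = (1 - p_hit) * (1 - missed_fold_rate)
    return hit_shown / (hit_shown + miss_shown)

# 9 outs from 44 unseen cards, missed draws folding 30% of the time
print(showdown_completion_rate(9 / 44, 0.30))
```

With these assumed inputs a fair 9/44 (about 20.5%) draw shows up as roughly 26.9% in showdown-only data, so a fair deal would look rigged unless the fold rates are known.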

This seems like a dead end. Is there a good way to correct for the showdown bias?

11-07-2020 , 09:31 AM

heehaww

Pooh-Bah

Join Date: Aug 2011 Posts: 5,081

We've decided to go with pre-river all-in hands for the above reason. Whether the sample will be large enough remains to be seen.

Should the HH results call for a Bayesian analysis (which they might not), I've given the priors some thought and will share my ideas here.

Since I won't be confident in any particular set of priors I choose, I'll try a range of priors and see how different the conclusions are, so as to avoid having my final conclusion be too dependent on the priors being just right.

For starters, there's the question of how a dishonest RNG might treat different underdog hands. I might try two possibilities:
(A) If there's an equity boost, it's scaled proportionally, so that if 20% becomes 25% then 4% becomes 5%.
(B) All underdogs receive the same absolute equity boost, if any, so that if 20% becomes 25% then 4% becomes 9%.
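The two possibilities can be written as a pair of small functions. This is only an illustration of the definitions; the function names and the cap at 100% are assumptions.

```python
def proportional_boost(equity, factor):
    """Model (A): every underdog's equity is multiplied by the same factor."""
    return min(1.0, equity * factor)

def additive_boost(equity, delta):
    """Model (B): every underdog's equity is raised by the same amount."""
    return min(1.0, equity + delta)

for equity in (0.20, 0.04):
    print(equity, proportional_boost(equity, 1.25), additive_boost(equity, 0.05))
```

With a factor of 1.25, model (A) turns 20% into 25% and 4% into 5%. With a boost of 5 percentage points, model (B) turns 20% into 25% and 4% into 9%.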

I think (A) is more likely because boosting a 4% to 9% would be easier to detect statistically than boosting a 20% to 25%, and perhaps too easy. However, I wonder if that's mitigated by the fact that smaller draws won't see the river as often, so maybe a similar number of dealt hands is required to detect the rig.
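A back-of-the-envelope way to compare detectability, assuming a normal approximation and the variance of a fair deal: the number of river hands of one draw type needed for the excess wins to reach a given z-score. The function name and the z = 3 threshold are illustrative choices.

```python
def hands_needed(fair_equity, boost, z=3.0):
    """Approximate river hands needed for a boost to show up at `z` sigmas.

    fair_equity: underdog's true equity under a fair deal
    boost: absolute increase in equity caused by the rig
    Uses the fair-deal variance p * (1 - p) per hand (an approximation).
    """
    return z ** 2 * fair_equity * (1 - fair_equity) / boost ** 2

print(hands_needed(0.20, 0.05))  # 20% boosted to 25%
print(hands_needed(0.04, 0.05))  # 4% boosted to 9%
print(hands_needed(0.04, 0.01))  # 4% boosted to 5%
```

Under these assumptions, 20% to 25% needs about 576 river hands, 4% to 9% only about 138, and 4% to 5% about 3,456. These counts are hands that reach the river, so they do not account for small draws seeing the river less often.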

Both options make for a discrete 2D distribution of prior weights. Before observing the HH data, we start with beliefs about the likelihood of fairness and that of various rig sizes. If you think there's a 99% chance that the site is fair, then that means in a HU pot where someone is drawing to 9 river outs, you think there's a 99% chance that the draw's chance is 9/44. Writing x for the draw's true chance of hitting, I'll assume P(x<9/44)=0 because a rig against draws would be against the site's interest. Therefore, we only have to consider the shape for x>9/44. I'll assume that smaller rigs are more likely than larger rigs, so the weights will strictly decrease until reaching 0 (i.e. I'll put a cap on how large the rig can be). I'll try a few different shapes from the 2nd point to the last, all strictly decreasing. I'll also try different weights for P(x=9/44), which I expect to have a bigger impact on the conclusion.
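As a minimal sketch of this kind of prior, restricted to a single draw type (9 outs, fair chance 9/44): one point mass on the fair value and linearly decreasing weights on a grid of larger values up to a cap. The linear shape, the 10-point cap, the grid size and the function name are all assumptions chosen for illustration.

```python
import math

def posterior_fair(wins, hands, p_fair=9 / 44, prior_fair=0.99,
                   max_boost=0.10, steps=20):
    """Posterior probability that the deal is fair, given wins out of hands.

    The prior puts `prior_fair` on x = p_fair and spreads the rest over
    `steps` rigged values up to p_fair + max_boost, with strictly
    decreasing weights (larger rigs are assumed less likely).
    """
    probs = [p_fair] + [p_fair + max_boost * i / steps
                        for i in range(1, steps + 1)]
    raw = [steps + 1 - i for i in range(1, steps + 1)]  # steps, ..., 2, 1
    priors = [prior_fair] + [(1 - prior_fair) * r / sum(raw) for r in raw]
    # Work in logs so large samples do not underflow
    log_post = [math.log(w) + wins * math.log(p)
                + (hands - wins) * math.log(1 - p)
                for w, p in zip(priors, probs)]
    top = max(log_post)
    unnorm = [math.exp(lp - top) for lp in log_post]
    return unnorm[0] / sum(unnorm)

# Sensitivity check: same data, different prior weights on fairness
for prior in (0.5, 0.9, 0.99, 0.999):
    print(prior, posterior_fair(wins=230, hands=1000, prior_fair=prior))
```

Looping over several values of `prior_fair`, and swapping in other decreasing shapes for `raw`, shows how much the conclusion depends on the prior rather than on the data.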
