Validity of the Poisson Distribution

02-18-2024 , 06:24 PM
A question regarding the Poisson distribution.

According to Feustel in Conquering Risk, the Poisson distribution can be used in a "good enough" sense if the number of trials divided by the probability of success is at least 200. This number, I assume, comes from Feustel's note that the Poisson can be used when the number of trials is at least 20 and the probability of success is no more than 10%, since 20 divided by 10% is 200.

In analyzing MLB hits per game, I have found that a player will see about 11 pitches and get a hit off 4% of those pitches. 11 divided by 4% yields 275, which is over that 200 line.

Based on these numbers and Feustel's rule of thumb, I believe the Poisson distribution can be used to estimate MLB player hits per game.
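One way to sanity-check this is to put the exact binomial next to the Poisson for the numbers above. The sketch below is only an illustration: it assumes every pitch is an independent trial with the same 4% chance of a hit, and the function and variable names are my own.

```python
from math import comb, exp, factorial

n = 11        # pitches seen (figure from the post)
p = 0.04      # chance of a hit on any one pitch (figure from the post)
lam = n * p   # Poisson mean chosen to match the binomial mean

def binom_pmf(k, n, p):
    """Exact probability of k hits in n independent pitches."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k hits with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

# Compare the two distributions for 0 through 4 hits.
rows = [(k, binom_pmf(k, n, p), poisson_pmf(k, lam)) for k in range(5)]
max_gap = max(abs(b - q) for _, b, q in rows)

for k, b, q in rows:
    print(f"{k} hits: binomial {b:.4f}  Poisson {q:.4f}")
print(f"largest gap for 0-4 hits: {max_gap:.4f}")
```

If the largest gap is small compared with the precision you need, the rule of thumb is doing its job for these inputs. It says nothing about whether the independence assumption itself holds.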

Criticism and correction of this reasoning very much appreciated.

Thanks, all.
02-18-2024 , 10:58 PM
As the Poisson approximates a binomial distribution, it only works if the events are independent.

I don't think pitches are independent, but for your model maybe they are independent enough. For example, if a player whiffs on the first x pitches, does that make it more likely the next pitch is a whiff?
02-19-2024 , 04:26 PM
Quote:
Originally Posted by PokerHero77
As the Poisson approximates a binomial distribution, it only works if the events are independent.

I don't think pitches are independent, but for your model maybe they are independent enough. For example, if a player whiffs on the first x pitches, does that make it more likely the next pitch is a whiff?
I would agree with this. In reality, yes, a very large number of variables make each pitch dependent on the last. However, as the saying goes, all models are wrong, but some are useful. Modeling how many hits a player would get to a "near perfect" standard, given the number of variables involved, would be computationally intractable. But I would argue that, given the assumptions outlined, Poisson is "good enough," as long as one is aware of the limitations.
02-20-2024 , 12:43 AM
I suggest you take the projected distribution and see how it compares empirically with data from the past few years, where the hit rate per pitch is reasonably constant. I suspect the model will undershoot in games with more pitches thrown and overshoot in games with fewer.
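A rough sketch of what that comparison could look like is below. The observed counts are placeholders I made up so the code runs, not real MLB data, and the mean of 0.44 is just 11 pitches times 4% from the first post. The helper names are hypothetical.

```python
from math import exp, factorial

def poisson_expected_counts(lam, games, max_hits):
    """Expected games with 0..max_hits-1 hits, plus a final 'max_hits or more' bin."""
    probs = [exp(-lam) * lam**k / factorial(k) for k in range(max_hits)]
    probs.append(1.0 - sum(probs))  # tail bin so the probabilities sum to 1
    return [games * q for q in probs]

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# PLACEHOLDER counts, not real data: games with 0, 1, 2, and 3+ hits.
observed = [640, 290, 60, 10]

expected = poisson_expected_counts(0.44, sum(observed), 3)
stat = chi_square_stat(observed, expected)

print("expected:", [round(e, 1) for e in expected])
print("chi-square statistic:", round(stat, 2))
```

With real game logs you would swap in actual counts, and you could run it separately on high-pitch and low-pitch games to see whether the model misses in the direction suspected above. A statistic that is large relative to a chi-square reference value would point to a poor fit.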

Another problem with your model is taking a non-binomial stat (a pitch) and applying a binomial property to it (hit or no hit). In fact, a pitch can result in a range of outcomes, some resembling a hit and others not resembling a hit at all. Perhaps it would be worthwhile to construct a Poisson model for each of the possible hit results and use them accordingly.
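As a rough illustration of the one-Poisson-per-outcome idea, here is a sketch that splits the 4% per-pitch hit rate into hit types. The split is a placeholder chosen only so the pieces add up to 4%; it is not taken from real data, and the names are my own. It also assumes the outcome counts are independent of each other.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

pitches = 11  # figure from the first post

# PLACEHOLDER per-pitch probabilities for each hit type (sum to 0.04).
per_pitch = {"single": 0.026, "double": 0.008, "triple": 0.001, "home_run": 0.005}

# One Poisson mean per outcome type.
rates = {name: pitches * prob for name, prob in per_pitch.items()}
total_rate = sum(rates.values())

# Independent Poisson counts add up to a Poisson with the summed mean,
# so the chance of no hit of any type can be computed either way.
p_no_hit = poisson_pmf(0, total_rate)
p_no_hit_product = 1.0
for lam in rates.values():
    p_no_hit_product *= poisson_pmf(0, lam)

for name, lam in rates.items():
    print(f"P(at least one {name}) = {1 - poisson_pmf(0, lam):.4f}")
print(f"P(no hit of any type) = {p_no_hit:.4f}")
```

The per-type models let you price things like "at least one home run" directly, while the total still behaves like the single Poisson from the original question.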

Last edited by PokerHero77; 02-20-2024 at 01:02 AM. Reason: This might be a good exercise for wolly
02-20-2024 , 03:07 AM
I am assuming you would be using an interval closely approximating the number of pitches expected for each side in a game.