Quote:
Originally Posted by Laughing Assassin
Doing some hand analysis and I have come across an issue. The problem I have come across is best explained by way of an example:
1) Consider this example of a 'perfect' binomial distribution:
A number of trains (n) travel a given route every day and they all have the same probability of being late (p).
The expected number of trains that are late = pn
The standard deviation from this expected number = SQRT(pn(1-p))
2) Now consider this example, it is not really a binomial distribution but is similar to one:
A number of trains (n) travel different routes every day and they all have different probabilities of being late. Let the mean of these probabilities be m.
The expected number of trains that are late = mn
Can the standard deviation be estimated using SQRT(mn(1-m)) ????
If the trains are independent, the variance of the number late is the sum of the variances for each train. The variance for a train with probability p of being late is the variance of a random variable that is 1 with probability p and 0 with probability 1-p, which gives a variance of p(1-p). Sum these for each train and take the square root to get the standard deviation. That's also how the formula you gave for the binomial distribution is derived, with all probabilities and variances the same.
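Written out, the argument looks roughly like this. The indicator X_i and the per-train probability p_i are notation I'm introducing here (X_i = 1 if train i is late, 0 otherwise); independence is assumed throughout.

```latex
\begin{align*}
\operatorname{Var}(X_i) &= E[X_i^2] - (E[X_i])^2 = p_i - p_i^2 = p_i(1 - p_i) \\
\operatorname{Var}\Big(\sum_{i=1}^{n} X_i\Big) &= \sum_{i=1}^{n} p_i(1 - p_i)
  \quad \text{(independent trains)} \\
\text{SD} &= \sqrt{\sum_{i=1}^{n} p_i(1 - p_i)} \\
\text{If every } p_i = p: \quad \text{SD} &= \sqrt{np(1 - p)}
\end{align*}
```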
You can do some simple examples to see how close it is to sqrt(mn(1-m)). It isn't the same because of the p^2 terms, but it can be close if the probabilities are small or if they are almost the same.
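Here is a quick sketch for trying such examples. The function names and the lists of lateness probabilities are made up for illustration, and independence between trains is assumed.

```python
import math

def exact_sd(probs):
    """SD of the number of late trains: sqrt of the sum of p(1-p) over trains."""
    return math.sqrt(sum(p * (1 - p) for p in probs))

def approx_sd(probs):
    """Binomial-style estimate sqrt(mn(1-m)), with m the mean probability."""
    n = len(probs)
    m = sum(probs) / n
    return math.sqrt(m * n * (1 - m))

if __name__ == "__main__":
    # Hypothetical probabilities, chosen only to show the three cases.
    examples = {
        "small": [0.01, 0.02, 0.03, 0.04],
        "almost the same": [0.28, 0.30, 0.32, 0.30],
        "widely spread": [0.05, 0.50, 0.95, 0.50],
    }
    for label, probs in examples.items():
        print(label, round(exact_sd(probs), 4), round(approx_sd(probs), 4))
```

Since the sum of p^2 equals n*m^2 plus the sum of (p - m)^2, the exact variance is mn(1-m) minus the sum of (p - m)^2. So the sqrt(mn(1-m)) estimate is never smaller than the exact value, and the gap grows with the spread of the probabilities.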
In real life they might not be independent if one train being late can make others late. Then you would need the covariances or correlations.
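For the dependent case, the general variance formula picks up a covariance term for every pair of trains. As before, X_i and p_i are my own notation for the lateness indicator and lateness probability of train i.

```latex
\begin{align*}
\operatorname{Var}\Big(\sum_{i=1}^{n} X_i\Big)
  &= \sum_{i=1}^{n} p_i(1 - p_i) + 2 \sum_{i < j} \operatorname{Cov}(X_i, X_j) \\
\operatorname{Cov}(X_i, X_j) &= P(X_i = 1,\, X_j = 1) - p_i p_j
\end{align*}
```

If late trains tend to make other trains late, the covariances are positive and the standard deviation comes out larger than the independent formula gives.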
Last edited by BruceZ; 05-27-2011 at 07:55 AM.