How to work out frequency of three events with 15 possible outcomes each

04-07-2014 , 06:06 PM
There are three independent events: A, B, and C.
Each event will result in an outcome of an integer from 1 to 13, inclusive.

My goal is to calculate the likelihood of the outcome of the totality of the events to add to 3, 4, 5, 6, .... 39. (for example, using data below, likelihood of adding to 3 is p1 x p1 x p1 = .0360 x .0423 x .0423 = .0064). Clearly this gets more involved with the more middling outcomes.

I can manually calculate this in Excel, though it would take 40 minutes to program the formula. I could also do it by hand, but it would take a lot longer.

Is there a faster way? Specifically, I would be interested in a way to do this in Excel or by hand, but I am open to computational approaches as well (though I likely would be unable to mimic them as I have no background in programming).

This is the data I am working with:
Quote:
Event A:
Outcome | Probability

1 .0360
2 .0327
3 .1428
4 .1035
5 .2414
6 .1057
7 .1700
8 .0550
9 .0714
10 .0131
11 .0240
12 .0033
13 .0011

Event B:
Outcome | Probability

1 .0423
2 .0329
3 .1403
4 .1050
5 .2288
6 .0925
7 .1865
8 .0619
9 .0533
10 .0172
11 .0235
12 .0125
13 .0031

Event C:
Outcome | Probability

1 .0423
2 .0329
3 .1403
4 .1050
5 .2288
6 .0925
7 .1865
8 .0619
9 .0533
10 .0172
11 .0235
12 .0125
13 .0031
(Yes, Event C and B have identical probabilities in this example, though that wouldn't always be the case).
Thank you in advance for any direction.

04-07-2014 , 09:38 PM
Here's a simple R script to do it. R is a free and easy download. You can do the equivalent in Excel's Visual Basic or any programming language.

Code:
A = c(.0360, .0327, .1428, .1035, .2414, .1057, .1700, .0550, .0714, .0131, 
.0240, .0033, .0011)

B = c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172, 
.0235, .0125, .0031)

C = c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172, 
.0235, .0125, .0031)

results = rep(0,39)
for (a in 1:13)
  for (b in 1:13)
    for (c in 1:13)
      results[a+b+c] = results[a+b+c] + A[a]*B[b]*C[c]
results
Output:

Code:
> results
 [1] 0.000000e+00 0.000000e+00 6.441444e-05 1.587100e-04 8.127903e-04
 [6] 1.658307e-03 4.816062e-03 8.175240e-03 1.736831e-02 2.501321e-02
[11] 4.254093e-02 5.270001e-02 7.509582e-02 8.089312e-02 9.913265e-02
[16] 9.383060e-02 1.007541e-01 8.479537e-02 8.096549e-02 6.162306e-02
[21] 5.291740e-02 3.697923e-02 2.879234e-02 1.863715e-02 1.309842e-02
[26] 7.876627e-03 4.985034e-03 2.781979e-03 1.584212e-03 8.174344e-04
[31] 4.110790e-04 1.872908e-04 8.244036e-05 3.378605e-05 1.268285e-05
[36] 3.745880e-06 8.185350e-07 1.169630e-07 1.057100e-08
Note that your B and C probabilities only sum to 0.9998 instead of 1. Consequently, the resulting probabilities only sum to 0.9996. If you fix that, they will sum to 1. Also note that for 3, the probability is about 0.000064, not 0.0064.
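One way to fix the rounding shortfall mentioned above is to rescale each vector so it sums to 1. This is only a sketch: it assumes the missing 0.0002 should be spread proportionally over all outcomes, which may not match how the original figures were rounded. `B_fixed` is a name made up for this example.

```r
# Event B probabilities as posted; they sum to 0.9998 because of rounding.
B <- c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172,
       .0235, .0125, .0031)

sum(B)                  # a little under 1

# Assumption: spread the shortfall proportionally over all outcomes.
B_fixed <- B / sum(B)
sum(B_fixed)            # 1, up to floating-point error
```

The same line works for any of the vectors, so it can be applied to A, B and C before running the loops.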
04-08-2014 , 10:19 AM
Thank you so much for this. This is cool. I downloaded R, and can easily replace the variables with different probabilities and do this myself now. Thank you.

I redid this for a fictitious 4-event scenario with Event D being the same distribution as B,C (including the rounding "error".) I coded it as such, and I believe I did this correctly, following your pattern.

code:
Quote:
A = c(.0360, .0327, .1428, .1035, .2414, .1057, .1700, .0550, .0714, .0131,
.0240, .0033, .0011)

B = c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172,
.0235, .0125, .0031)

C = c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172,
.0235, .0125, .0031)

D = c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172,
.0235, .0125, .0031)

results = rep(0,52)
for (a in 1:13)
  for (b in 1:13)
    for (c in 1:13)
      for (d in 1:13)
        results[a+b+c+d] = results[a+b+c+d] + A[a]*B[b]*C[c]*D[d]
results
results:
Quote:
[1] 0.000000e+00 0.000000e+00 0.000000e+00 2.724731e-06 8.832669e-06 4.863993e-05
[7] 1.259177e-04 4.037148e-04 8.645357e-04 2.066121e-03 3.770339e-03 7.337742e-03
[13] 1.164697e-02 1.928375e-02 2.701233e-02 3.902297e-02 4.876592e-02 6.249266e-02
[19] 7.028096e-02 8.085723e-02 8.249048e-02 8.603144e-02 8.025996e-02 7.652701e-02
[25] 6.582912e-02 5.780380e-02 4.620979e-02 3.757830e-02 2.808190e-02 2.122001e-02
[31] 1.487654e-02 1.045983e-02 6.889990e-03 4.511836e-03 2.792193e-03 1.701421e-03
[37] 9.859915e-04 5.574368e-04 3.005453e-04 1.569356e-04 7.822674e-05 3.740906e-05
[43] 1.687990e-05 7.207708e-06 2.896337e-06 1.093206e-06 3.722678e-07 1.079511e-07
[49] 2.477437e-08 4.247915e-09 4.947228e-10 3.277010e-11
Again, thank you.

Last edited by DMMx69; 04-08-2014 at 10:20 AM. Reason: Although my forum formatting isn't as elegant....
04-08-2014 , 03:42 PM
Now attempting this with 10 events. It does not seem to be working - but I'm not sure if it is crashing due to poor coding? or if it is actually taking a long time to work through it? Been "running" for 15 minutes.

I know nothing about programming, but tried to mimic your inputs as best I could. (And it did work when I had 4 events.)

Code:
Quote:
A = c(.0405, .0374, .1468, .1063, .2368, .1265, .1630, .0445, .0628, .0121, .0192, .0020, .0000, .0020)

B = c(.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000)

C = c(.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000)

D = c(.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000)

E = c(.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000)

F = c(.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000)

G = c(.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000)

H = c(.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000)

I = c(.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000)

J = c(.0284, .0262, .1223, .0306, .2314, .1092, .2162, .0742, .1092, .0197, .0240, .0087, .0000, .0000)


results = rep(0,140)
for (a in 1:14)
  for (b in 1:14)
    for (c in 1:14)
      for (d in 1:14)
        for (e in 1:14)
          for (f in 1:14)
            for (g in 1:14)
              for (h in 1:14)
                for (i in 1:14)
                  for (j in 1:14)
                    results[a+b+c+d+e+f+g+h+i+j] = results[a+b+c+d+e+f+g+h+i+j] + A[a]*B[b]*C[c]*D[d]*E[e]*F[f]*G[g]*H[h]*I[i]*J[j]
results
04-08-2014 , 06:09 PM
Quote:
Originally Posted by DMMx69
Now attempting this with 10 events. It does not seem to be working - but I'm not sure if it is crashing due to poor coding? or if it is actually taking a long time to work through it? Been "running" for 15 minutes.

I know nothing about programming, but tried to mimic your inputs as best I could. (And it did work when I had 4 events.)

Your code requires 14^10 = 289254654976 iterations. It will take forever.
Break your "sum" into steps. First add events A, B, C, D, E. Then add the others. Finally combine the results you obtained from the two previous steps. You will save a lot of time.
04-09-2014 , 02:42 AM
Quote:
Originally Posted by DMMx69
Now attempting this with 10 events. It does not seem to be working - but I'm not sure if it is crashing due to poor coding? or if it is actually taking a long time to work through it? Been "running" for 15 minutes.
It would probably take months to run that many that way. The script below breaks it up as nick suggests, and it just takes a couple of seconds. It does 2 groups of 5 events with 14 outcomes each, giving sums ranging from 5 to 70, and then combines those 2 groups, which now have 66 possible sums each. So instead of 14^10 = 289254654976 iterations, there are only 2*14^5 + 66^2 = 1080004 iterations, which is 267827.4 times fewer, and it's even faster than that because the new loops execute faster. On top of that, the 2 lines I added at the top of the code enable a byte compiler, which speeds it up by another factor of about 7. All together it's about 3 million times faster. For loops are slower in R than in other languages, so the byte compiler makes a big difference.

Your A only sums to 0.9999, and your H-J sums to 1.0001, so the results sum to 1.0002.

Code:
require(compiler)
enableJIT(3)

A = c(.0405, .0374, .1468, .1063, .2368, .1265, .1630, .0445, .0628, .0121, .0192, .0020, .0000, .0020)

B = c(.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000)

C = c(.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000)

D = c(.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000)

E = c(.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000)

F = c(.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000)

G = c(.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000)

H = c(.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000)

I = c(.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000)

J = c(.0284, .0262, .1223, .0306, .2314, .1092, .2162, .0742, .1092, .0197, .0240, .0087, .0000, .0000)

probs1 = rep(0,70)
probs2 = rep(0,70)
results = rep(0,140)
for (a in 1:14)
 for (b in 1:14)
   for (c in 1:14)
     for (d in 1:14)
       for (e in 1:14)
         probs1[a+b+c+d+e] = probs1[a+b+c+d+e] + A[a]*B[b]*C[c]*D[d]*E[e]
for (f in 1:14)
 for (g in 1:14)
   for (h in 1:14)
     for (i in 1:14)
       for (j in 1:14)
         probs2[f+g+h+i+j] = probs2[f+g+h+i+j] + F[f]*G[g]*H[h]*I[i]*J[j]
for (i in 5:70)
  for (j in 5:70)
    results[i+j] = results[i+j] + probs1[i]*probs2[j]
results
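The splitting idea can be pushed further by folding in one event at a time, so that no loop is ever nested more than two deep. The sketch below is not from the thread: `add_event` is a hypothetical helper name, and it assumes outcomes are numbered from 1 as in the scripts above. It is checked here on two fair six-sided dice rather than on the thread's data.

```r
# Hypothetical helper: distribution of X + Y, where f[i] = P(X = i) and
# g[j] = P(Y = j), with outcomes numbered from 1.
add_event <- function(f, g) {
  out <- rep(0, length(f) + length(g))
  for (i in seq_along(f))
    for (j in seq_along(g))
      out[i + j] <- out[i + j] + f[i] * g[j]
  out
}

# Small check with two fair six-sided dice.
die <- rep(1/6, 6)
two_dice <- add_event(die, die)
two_dice[7]     # P(sum = 7) = 6/36
sum(two_dice)   # 1

# With the ten vectors from the script above, the events could be folded in one by one:
# total <- A
# for (v in list(B, C, D, E, F, G, H, I, J)) total <- add_event(total, v)
```

For ten events with 14 outcomes each, this needs 14*14*(1+2+...+9) = 8820 inner iterations in total, and it avoids the roundoff that an FFT-based approach can introduce.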

Last edited by BruceZ; 04-09-2014 at 02:58 AM.
04-09-2014 , 11:32 AM
OK - thank you Nick and Bruce. I realized my code was requiring a lot of computations, but I have no idea what "a lot" really is to a computer as I had no frame of reference. I appreciate your inputs and will work with the code like this now for future iterations.

Thanks again.
04-09-2014 , 02:01 PM
Computationally, all we're doing is convolution. We are computing A*B*C*...*J where * here means convolution not multiply. We can do this with the R convolution routine, and it's much faster because convolution is performed internally with compiled algorithms that do fast Fourier transforms. FFTs have complexity of n*log(n) instead of n^2. R defines convolution a little screwy, so we need the first line to define it the way we want it. So we want to basically do this:

Code:
conv = function(f,g) convolve(f, rev(c(0, g)), type = "o")
a = conv(A,B)
b = conv(a,C)
c = conv(b,D)
d = conv(c,E)
e = conv(d,F)
f = conv(e,G)
g = conv(f,H)
h = conv(g,I)
i = conv(h,J)
i
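As a quick sanity check on the `conv` wrapper, here is a sketch using two fair six-sided dice instead of the thread's data (the names `die` and `two` are made up for the example). The extra 0 in `c(0, g)` is what keeps position k of the result lined up with a sum of k.

```r
conv <- function(f, g) convolve(f, rev(c(0, g)), type = "o")

die <- rep(1/6, 6)      # P(1), ..., P(6) for one fair die
two <- conv(die, die)   # position k holds P(sum = k), for k = 1..12

round(two * 36, 6)      # 0 1 2 3 4 5 6 5 4 3 2 1
```

Position 1 is 0 because two dice cannot sum to 1, and position 7 holds 6/36, the familiar most likely total.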
I made this better by putting all of your probabilities in a single matrix and looping through it instead of having separate letters. Then you can do different numbers of integers by just changing the matrix instead of the code itself. You can edit that in the code, or separately if you type edit(M) which brings up a spreadsheet type thing. So now the code is just a few lines aside from the matrix, and it responds instantaneously.

Code:
M = matrix(c(
.0405, .0374, .1468, .1063, .2368, .1265, .1630, .0445, .0628, .0121, .0192, .0020, .0000, .0020,
.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000,
.0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000,
.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000,
.0351, .0229, .1340, .1024, .2378, .1003, .1827, .0673, .0566, .0172, .0265, .0129, .0043, .0000,
.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000,
.0297, .0222, .1342, .0941, .2350, .1097, .1853, .0675, .0586, .0148, .0319, .0126, .0044, .0000,
.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000,
.0253, .0214, .1404, .0673, .2495, .0897, .1979, .0750, .0780, .0205, .0283, .0049, .0019, .0000,
.0284, .0262, .1223, .0306, .2314, .1092, .2162, .0742, .1092, .0197, .0240, .0087, .0000, .0000),
ncol=14, byrow=TRUE)

conv = function(f,g) convolve(f, rev(c(0, g)), type = "o")
c = conv(M[1,],M[2,])
if (nrow(M) > 2) for (i in 3:nrow(M)) c = conv(c,M[i,])
c[1:(nrow(M)-1)] = 0
plot(c,xlab='Sum',ylab='Probability')
c
One thing about this is that numbers that are 0 or very very tiny won't be exact due to roundoff error, and some will be very small and negative. For example, for 140 you get -7.137148e-18. But that only affects numbers that are around 1e-17, so you probably don't care. We could just round those to 0 too. I added a line to force 1-9 to be 0 in this case.
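Rounding the tiny values to 0 could look something like the sketch below. The cutoff of 1e-12 is an assumption picked for illustration, and the input is a toy case (four events that each give 1 or 2 with equal chance), not the matrix above.

```r
conv <- function(f, g) convolve(f, rev(c(0, g)), type = "o")

# Toy input: four events, each giving 1 or 2 with probability 0.5.
coin <- c(0.5, 0.5)
p <- conv(conv(conv(coin, coin), coin), coin)

# Assumed tolerance: anything below 1e-12 in magnitude is treated as roundoff.
p[abs(p) < 1e-12] <- 0
p <- p / sum(p)   # optional: renormalise after the clean-up
p
```

After the clean-up the impossible sums 1 to 3 are exactly 0 and the rest are the binomial weights 1/16, 4/16, 6/16, 4/16, 1/16 for sums 4 to 8.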

I added a plot to the code. Here it is:



BTW, if you want to check that the probabilities add to 1, you can do for example, sum(A) with the old code, or with the new code sum(M[1,]), sum(M[2,]), etc. sum(M) should sum the whole matrix and give 10 right now, except it gives 10.0002.
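To check every event at once, `rowSums` gives one total per row. The sketch below uses only the first two rows of the matrix to stay short, and the tolerance of 1e-8 is an arbitrary choice for the example.

```r
# First two rows of the matrix from the script above (events A and B).
M <- matrix(c(
  .0405, .0374, .1468, .1063, .2368, .1265, .1630, .0445, .0628, .0121, .0192, .0020, .0000, .0020,
  .0427, .0305, .1442, .0998, .2329, .1109, .1803, .0555, .0577, .0144, .0200, .0089, .0022, .0000),
  ncol = 14, byrow = TRUE)

rowSums(M)                            # one total per event
which(abs(rowSums(M) - 1) > 1e-8)     # rows that do not sum to 1
```

Here only row 1 is flagged, matching the 0.9999 total noted earlier for event A.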

Last edited by BruceZ; 04-10-2014 at 03:26 AM. Reason: Added plot to code
04-09-2014 , 03:18 PM
Quote:
Originally Posted by DMMx69
I realized my code was requiring a lot of computations, but I have no idea what "a lot" really is to a computer as I had no frame of reference.
OT, but a tip for you: your first try contained 4 loops of 1 to 13, so it did 13^4 = 28561 iterations. If you observe the time it took to execute, you can estimate how long 14^10 = 289254654976 iterations would take: roughly 289254654976/28561 = 10127609.5 times as long. So if your first program executed in, e.g., 1 second, your second attempt would take about 10127610 seconds = ~117 days. Of course that is not accurate, but it gives a pretty good feeling for whether something is feasible or not.

In many cases one may want to solve a problem by simulation, e.g. making a lot of tries and observing the results. Often one doesn't really know how many simulations are needed to get a reliable result, but one can first run e.g. 10000 simulations, again measure the time it took, and then estimate the feasible maximum number of simulations that can be run.
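The estimate can be automated with `system.time`. The sketch below times a 4-event loop with placeholder probabilities (only the iteration count matters here) and scales the result up. It assumes run time grows in proportion to the number of iterations, which is only roughly true.

```r
# Time the 4-event, 13-outcome loop, then scale up to 14^10 iterations.
p <- rep(1/13, 13)        # placeholder probabilities, not the thread's data
results <- rep(0, 52)
elapsed <- system.time(
  for (a in 1:13) for (b in 1:13) for (c in 1:13) for (d in 1:13)
    results[a+b+c+d] <- results[a+b+c+d] + p[a]*p[b]*p[c]*p[d]
)["elapsed"]

ratio <- 14^10 / 13^4     # about 10.1 million times as many iterations
est_days <- elapsed * ratio / 86400
est_days                  # rough estimate, in days, for the 10-event loop
```

The small loop may finish too quickly for the timer to resolve well, so repeating it a few times and averaging would give a steadier figure.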
04-09-2014 , 10:49 PM
Thank god. I was going to comment earlier, but all the for loops made me vomit before I could speak.
04-09-2014 , 10:57 PM
The 2 groups of 5 nested for loops followed by the 2 nested for loops is plenty fast when byte compiled. It runs in a couple seconds with no roundoff error as you get from the FFT. The FFT is significantly faster though, and for bigger jobs that would be the way to go. For loops are appropriate when new results depend on previous results as here. The 10 nested for loops would take all day even in a compiled language. This would also be parallelizable on a multi-core machine. R handles that via aapply.

BTW, for some reason Ra with the JIT compiler has been deprecated and is not compatible with current versions of R, though you can still get the older version. I wonder why since it was faster than byte compilation, and it was especially good for speeding up for loops. Maybe they integrated it with standard R now, I don't know.

Last edited by BruceZ; 04-10-2014 at 03:11 AM.
04-10-2014 , 02:17 AM
Also remember that if you have many many events, the central limit theorem provides a very useful approximation. The resulting distribution will be closer and closer to a normal distribution:

Code:
#putting all the probabilities together
probs<-list(A,B,C,D,E,F,G,H,I,J)

#calculating the max score
max_score<-sum(vapply(probs,length,1L))

#calculating the EV for each event
means<-vapply(probs,function(x) sum(x*seq_along(x)),1)

#calculating the variance for each event
variances<-mapply(function(x,y) (seq_along(x)-y)^2*x,probs,means)

#approximating the result with a normal distribution. The mean and the variance of the distribution are the sum of means and variances of the single events  
res<-diff(pnorm(seq(from=0.5,to=max_score+0.5,by=1),sum(means),sqrt(sum(variances))))
If you have performed the exact calculation (the c object in BruceZ code), you can check how good this approximation is by plotting:

Code:
plot(c)
points(res,col="blue")
04-10-2014 , 03:29 AM
For BruceZ: the R function `Reduce' applies a function which takes two arguments to the first two elements of a list, then applies it to the result and the third element and so on. So you can eliminate a loop from your code:

Code:
probs<-list(A,B,C,D,E,F,G,H,I,J)
resultWithReduce<-Reduce(conv,probs)
04-10-2014 , 03:47 AM
Not working.

Code:
> means<-vapply(probs,function(x) sum(x*seq_along(x)),1)
Note: no visible global function definition for 'sum' 
Error in x * seq_along(x) : non-numeric argument to binary operator
> 
> #calculating the variance for each event
> variances<-mapply(function(x,y) (seq_along(x)-y)^2*x,probs,means)
Error in mapply(function(x, y) (seq_along(x) - y)^2 * x, probs, means) : 
  object 'means' not found
> 
> #approximating the result with a normal distribution. The mean and the variance of the distribution are the sum of means and variances of the single events  
> res<-diff(pnorm(seq(from=0.5,to=max_score+0.5,by=1),sum(means),sqrt(sum(variances))))
Error in pnorm(seq(from = 0.5, to = max_score + 0.5, by = 1), sum(means),  : 
  object 'means' not found
Also, what is the reason for diffing the pnorm instead of using dnorm?
04-10-2014 , 03:50 AM
Quote:
Originally Posted by nickthegeek
For BruceZ: the R function `Reduce' applies a function which takes two arguments to the first two elements of a list, then applies it to the result and the third element and so on. So you can eliminate a loop from your code:

Code:
probs<-list(A,B,C,D,E,F,G,H,I,J)
resultWithReduce<-Reduce(conv,probs)
Thanks, but it doesn't really eliminate a loop, it just hides it in Reduce which works with for loops. It's like people who think apply is somehow faster than a for loop when it's really just a wrapper for a for loop. Syntactic sugar is nice in that it makes code more concise, but on here it tends to make it unreadable for people who aren't R experts. Plus it's often at the expense of additional data structures, so if I had a matrix, I now need to break that into a list.


"I’m just going to say it. I like for loops in #Rstats, makes my code readable. All you [a-z]*ply snobs can shove it!"

— Ted Hart (@DistribEcology) March 12, 2013

http://www.noamross.net/blog/2013/4/25/faster-talk.html

Last edited by BruceZ; 04-10-2014 at 04:30 AM.
04-10-2014 , 04:30 AM
Also, it doesn't work.

> probs<-list(A,B,C,D,E,F,G,H,I,J)
> resultWithReduce<-Reduce(conv,probs)
Error in convolve(f, rev(c(0, g)), type = "o") : non-numeric argument

It looks like Reduce needs a vector, not a list.
04-10-2014 , 04:38 AM
Quote:
Originally Posted by BruceZ
Not working.

Code:
> means<-vapply(probs,function(x) sum(x*seq_along(x)),1)
Note: no visible global function definition for 'sum' 
Error in x * seq_along(x) : non-numeric argument to binary operator
> 
> #calculating the variance for each event
> variances<-mapply(function(x,y) (seq_along(x)-y)^2*x,probs,means)
Error in mapply(function(x, y) (seq_along(x) - y)^2 * x, probs, means) : 
  object 'means' not found
> 
> #approximating the result with a normal distribution. The mean and the variance of the distribution are the sum of means and variances of the single events  
> res<-diff(pnorm(seq(from=0.5,to=max_score+0.5,by=1),sum(means),sqrt(sum(variances))))
Error in pnorm(seq(from = 0.5, to = max_score + 0.5, by = 1), sum(means),  : 
  object 'means' not found
Also, what is the reason for diffing the pnorm instead of using dnorm?
Very strange. Did you execute probs<-list(A,B,C,D,E,F,G,H,I,J) at the top of it? If so, what R version do you have? What is the output of search()? My guess is that you didn't define probs and you have some package attached that defines a function `probs', so the code tries to multiply a function by a vector in x*seq_along(x).

It is certainly true that Reduce hides a loop. However, functions like lapply, vapply and others handle the needed loop internally, so they are faster than an R loop.
04-10-2014 , 04:50 AM
Quote:
Originally Posted by BruceZ
Also, it doesn't work.

> probs<-list(A,B,C,D,E,F,G,H,I,J)
> resultWithReduce<-Reduce(conv,probs)
Error in convolve(f, rev(c(0, g)), type = "o") : non-numeric argument

It looks like Reduce needs a vector, not a list.
No, Reduce takes either vectors or lists. Can you run the first example from the Reduce man page?

Code:
## A general-purpose adder:
     add <- function(x) Reduce("+", x)
     add(list(1, 2, 3))
As you can see, Reduce can take lists as well (just try Reduce("+",list(1,2,3))). I think it is an R version issue, or some package issue. Try running the same code I provided, giving the initial list a name other than probs.
04-10-2014 , 05:02 AM
Quote:
Originally Posted by nickthegeek
Very strange. Did you execute probs<-list(A,B,C,D,E,F,G,H,I,J) at the top of it? If so, what R version do you have? What is the output of search()? My guess is that you didn't define probs and you have some package attached that defines a function `probs', so the code tries to multiply a function by a vector in x*seq_along(x).
Yes, I ran the whole thing that set probs to a list. I also renamed it with the same result. I have an old version of R 2.15.0.

> search()
[1] ".GlobalEnv" "package:compiler" "package:partitions"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"


Quote:
It is certainly true that Reduce hides a loop. However, functions like lapply, vapply and others handle the needed loop internally, so they are faster than an R loop.
Yeah, the ones based on lapply are supposed to be slightly faster than a for loop since lapply is compiled. Regular apply isn't compiled.


Quote:
As you can see, Reduce can take lists as well (just try Reduce("+",list(1,2,3))).
That runs OK. I get 6. That's just a list of numbers though instead of a list of vectors. The man page just says vector for some reason.
04-10-2014 , 05:16 AM
Quote:
Originally Posted by BruceZ
Yes, I ran the whole thing that set probs to a list. I also renamed it with the same result. I have an old version of R 2.15.0.

> search()
[1] ".GlobalEnv" "package:compiler" "package:partitions"
[4] "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods"
[10] "Autoloads" "package:base"




Yeah, the ones based on lapply are supposed to be slightly faster than a for loop since lapply is compiled. Regular apply isn't compiled.




That runs OK. I get 6. That's just a list of numbers though instead of a list of vectors. The man page just says vector for some reason.
Can't see any `dangerous' package. I would be very surprised if this were an R version issue, since the code I gave is very basic and should run on all R versions. I will try later to install a 2.15 version locally and run the code, to see if I can reproduce your problem. Just out of curiosity, what do you get if you run Reduce("+",list(c(1,2),3,4))? And what if you run sum(probs[[1]]*seq_along(probs[[1]])) (of course replace probs with the new name)? What is the output of str(probs)?
04-10-2014 , 06:20 AM
LOL, I found the problem. I had C defined as a function (choose). So it was reading crap for that one into the list. I fixed that, and everything works. Your curve is right on top of my curve, though each value appears slightly shifted. Yours is red now.
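A mix-up like this can be caught early by checking that every entry in the list is numeric before doing any arithmetic. The sketch below recreates the accident with toy vectors; the values are made up, and only the check itself is the point.

```r
A <- c(0.5, 0.5)
B <- c(0.25, 0.75)
C <- choose           # the accident described above: C is a function, not a vector

probs <- list(A, B, C)
ok <- vapply(probs, is.numeric, logical(1))
ok                    # TRUE TRUE FALSE
# stopifnot(all(ok)) would halt here and point at the third entry
```

Running `str(probs)` shows the same thing in more detail, since the function entry is listed as a function rather than as `num`.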



BTW, here's a handy function for copying a single graph that's up to your directory as a jpeg:

Code:
jpg = function(filename) {
  dev.copy(jpeg,paste("My Pictures/Misc/",filename,sep=""))
  dev.off()
}
Change the directory of course for your situation. Call as:

jpg("whatever_name.jpg")
04-10-2014 , 06:31 AM
Quote:
Originally Posted by BruceZ
LOL, I found the problem. I had C defined as a function (choose). So it was reading crap for that one into the list. I fixed that, and everything works. Your curve is right on top of my curve, though each value appears slightly shifted. Yours is red now.



BTW, here's a handy function for copying a single graph that's up to your directory as a jpeg:

Code:
jpg = function(filename) {
  dev.copy(jpeg,paste("My Pictures/Misc/",filename,sep=""))
  dev.off()
}
Change the directory of course for your situation. Call as:

jpg("whatever_name.jpg")
Great, I was getting crazy trying to figure out the issue!

BTW, I used pnorm instead of dnorm because we are passing from a discrete variable to a continuous one. So I thought that the probability of having, for instance, 33 is the integral of the resulting normal distribution between 32.5 and 33.5. Of course, if you just use dnorm, you'll obtain very close values.
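The two approaches can be compared directly. In the sketch below the mean and standard deviation are illustrative values chosen for the example, not numbers fitted to the thread's data.

```r
mu <- 50; sigma <- 7      # illustrative values only
x <- 33

# Area under the normal curve from 32.5 to 33.5 (continuity correction)
interval <- pnorm(x + 0.5, mu, sigma) - pnorm(x - 0.5, mu, sigma)

# Height of the density at 33, which approximates an interval of width 1
density <- dnorm(x, mu, sigma)

c(interval = interval, density = density)
```

The two numbers agree closely as long as the standard deviation is large compared with the interval width of 1. The gap grows when the distribution is narrow.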
04-10-2014 , 01:57 PM
Quote:
Originally Posted by hauturi
OT, but a tip for you....So, if your first program executed in e.g 1 second, your second attempt would take about 10127610 seconds = ~117 days.
Makes sense, and understood. Thank you.

Thanks to all for the various help. Following along with the rest of the thread as well, though the coding part is beyond me.
04-11-2014 , 06:22 AM
Quote:
Originally Posted by DMMx69
Makes sense, and understood. Thank you.

Thanks to all for the various help. Following along with the rest of the thread as well, though the coding part is beyond me.
You are right, the coding part is actually really difficult if you don't know much about R and programming in general. So, I'll try to clarify my previous post.

If you have many variables and you are just interested in their sum, a fundamental theorem (the central limit theorem) can speed up your calculations considerably.

This theorem states that the sum of many independent variables behaves approximately like a normal distribution, whose mean (variance) is the sum of the means (variances) of each variable. The bigger the number of variables, the more accurate this approximation becomes.

So, you can proceed as follows:

1) compute the mean for each variable (by summing each outcome weighted with the relative probability);
2) compute the variance for each variable (by summing the square of the difference between each outcome and the mean, weighted with the relative probability);
3) sum all means;
4) sum all variances.

You are done! The sum of all variables will behave approximately like a normal distribution with the mean and variance you have evaluated in steps 3 and 4. This means, for example, that the probability of having a value between mean - sqrt(variance) and mean + sqrt(variance) will be about 68%. If you need the probability of a single value, you can use functions that are implemented in most programming languages (if you are familiar with Excel (I'm not) you should find the right function pretty easily).
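The four steps can be sketched in R using the three-event data from the first post. The helper names `ev_mean` and `ev_var` are made up for this example. With only three events the normal approximation is rough, and the B and C vectors sum to 0.9998 rather than 1, so the result should be read as an estimate.

```r
A <- c(.0360, .0327, .1428, .1035, .2414, .1057, .1700, .0550, .0714, .0131,
       .0240, .0033, .0011)
B <- c(.0423, .0329, .1403, .1050, .2288, .0925, .1865, .0619, .0533, .0172,
       .0235, .0125, .0031)
events <- list(A, B, B)   # Event C has the same probabilities as B

ev_mean <- function(p) sum(seq_along(p) * p)                    # step 1
ev_var  <- function(p) sum((seq_along(p) - ev_mean(p))^2 * p)   # step 2

total_mean <- sum(sapply(events, ev_mean))   # step 3
total_var  <- sum(sapply(events, ev_var))    # step 4
total_sd   <- sqrt(total_var)

# Approximate probability that the total equals 17: normal area from 16.5 to 17.5
p17 <- pnorm(17.5, total_mean, total_sd) - pnorm(16.5, total_mean, total_sd)
p17
```

This can be compared with the exact figure for 17 from the first script in the thread (1.007541e-01) to see how close the approximation gets with so few events.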

Hope this clarifies a little.