Quote:
Originally Posted by masque de Z
It's not orders of magnitude, it's at worst a factor of 3, pending better analysis of the ridiculous 3-point-level statistics (but you'd better believe that even 3 points tell you something).
Either they use a CLT-friendly distribution and it works or they don't. All the major distributions seen in books pass it if they meet the rho/sd^3/n^(1/2) < 0.15 or so criterion. I can list a ton that pass it.
No, three data points don't tell you anything about the underlying distribution.
Again, show me one of these books where it states that summations of 20-27 are sufficient given an unknown underlying distribution. Just one.
Again, "well, every example I have seen in a textbook is x" is an argument from ignorance.
https://en.wikipedia.org/wiki/Argument_from_ignorance

Now that you have seen distributions where it doesn't, you have no excuse to keep saying it. You learned something new. Pretending that it matters that you didn't know something before is just a weird thing to do. No one is mad at you for not knowing something at the outset of a conversation.
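To make that concrete, here is a quick sketch (my own illustration, not an example from the thread): a perfectly textbook-available heavy-tailed lognormal whose 25-term sums are still visibly non-normal. A normal curve has skewness near 0; these sums don't come close.

```python
import random
import math

random.seed(0)

def skewness(xs):
    """Sample skewness: third central moment over SD cubed."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    s3 = sum((x - m) ** 3 for x in xs) / n
    return s3 / s2 ** 1.5

# Sums of 25 draws from a heavy-tailed lognormal (mu=0, sigma=2).
sums = [sum(random.lognormvariate(0, 2) for _ in range(25))
        for _ in range(20000)]

print(skewness(sums))  # still large and positive, i.e. visibly non-normal
```

The sigma here is arbitrary; the point is only that "sums of 25" is nowhere near enough for a sufficiently skewed parent distribution.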
Quote:
If you had to choose, what would be the best choice if a distribution was creating it? It either finds it or it doesn't. But it's your only chance to get it right.
The correct answer is still, "no, we can't know/calculate the variance because we don't have the right data to do so."
Quote:
I can think of many applications where all it takes is to be correct within an order of magnitude in some desperate situation where you can't measure more due to practical limits. If you find something in concentrations that are lethal and must be alarmed to do more testing, it's OK to be alarmed in that neighborhood. If you needed to go out of some habitat on Mars and all you knew was how high the solar radiation event was at that moment at the surface, you wouldn't mind whether it's x or 10x, since even 0.1x is lethal, for example. If the properties of the solar event satisfied distributions that were CLT-friendly, it would be a good guess not to go out then, for example. If the signal could be caused by a ton of things (even ones that are not at all fast-CLT-convergence friendly) but something lethal among them had these properties, such an estimation might mean something of value indeed, why not. You cannot quantify your risk, but you know it could be there.
I can think of many distributions that don't converge to a normal curve with summations under 30. Is that an indication that there aren't distributions that do? Would that be me making an argument from ignorance?
Yes. It would be an argument from ignorance. Same as your paragraph.
Quote:
Why is it so difficult to imagine that in a case of uncertainty you cannot know what you have, but if what you have shares a given property, the result means something? In many cases in life you can think something is there or not, and if it's not there your failure in predicting something doesn't matter, but if it is, you save the day.
You can state, "well, if we presume that the three data points we have come from a population distribution that converges to normal via CLT at n<=20 AND we presume that they were a randomly selected unbiased sample of said population without replacement AND we presume that our sample mean is exactly equal to the population mean AND we presume that minimizing the squared difference of 3 data points from said sample/population mean will give us a reasonable estimation of the variance of the underlying sample AND we presume that we don't have to apply a correction such as Bessel's correction to say that said sample variance is equal to the population variance, then..."
Should we presume all of those things? **** no!
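And even granting every one of those presumptions, a three-point SD estimate is extremely noisy. A quick sketch of my own, using draws from a standard normal, the friendliest possible case:

```python
import random
import statistics

random.seed(1)

# Bessel-corrected SD computed from just 3 draws of N(0, 1),
# repeated many times to see how widely the estimate scatters.
estimates = sorted(
    statistics.stdev([random.gauss(0.0, 1.0) for _ in range(3)])
    for _ in range(20000)
)

low, high = estimates[1000], estimates[19000]  # central ~90% of estimates
print(low, high)  # roughly 0.23 .. 1.73 against a true SD of 1.0
```

The central 90% band spans nearly a factor of 8, with the true SD exactly known to be 1.0. That is the best case; an unknown, possibly skewed parent distribution only makes it worse.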
Quote:
What if they asked you to predict the SD within a factor of 3 in this problem, given that the mechanism produces a distribution? If you get it right within a factor of 3 you get $10,000. If you fail you get nothing. Would you choose to predict anything then? How would you approach a prediction? What if you had 10 measurements instead of 3?
"20."
"You lose because SD is an appropriate measure of dispersion only for normal curves. You should have stated the lognormal SD or at the very least have given the average absolute deviation from the mean."
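That point about the SD being a misleading dispersion summary off the normal curve is easy to demonstrate. A sketch of my own (sigma chosen arbitrarily for illustration): for a heavy-tailed lognormal, far more than the normal curve's ~68% of the mass sits within one SD of the mean, because the SD is inflated by the tail.

```python
import random
import statistics

random.seed(2)

# 100k draws from a heavy-tailed lognormal (sigma = 1.5, arbitrary choice)
xs = [random.lognormvariate(0.0, 1.5) for _ in range(100_000)]
mean = statistics.fmean(xs)
sd = statistics.pstdev(xs)

frac_within_1sd = sum(abs(x - mean) <= sd for x in xs) / len(xs)
print(frac_within_1sd)  # well above the ~0.68 a normal curve would give
```

In other words, "mean plus/minus one SD" stops meaning what it means for a normal curve, which is exactly why stating a bare SD for a skewed distribution misleads.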
Quote:
So Brian take your girl analogies and shove them because any sensible girl would be insulted that this is how you create analogies missing entirely the point.
See my "well, if..." paragraph above. The analogy was apt.
Quote:
You can't hide for long behind your unwillingness to play the game with a distribution of your choice that passes the criterion. But the ultimate test is this: $10k if you guess it within a factor of 2, say, or nothing if you miss or if you fail to guess. What is your guess? What if the data was 132, 270, 365, 330, 410, 515, 240, 267, 335, 221? (all 25-point sums, say)
"Can you know/estimate the number of fingers I am holding behind my back?"
"No."
"If I offer a million dollars for a correct answer and you venture a guess, that means you think you can know/estimate the number of fingers I am holding behind my back."
"Ummm. No. It doesn't mean that at all."
For those 10 summations, given that the distribution doesn't appear to be normal and definitely doesn't have a strong peak,* I calculate the maximum sample SD and divide by 2, which gives a wild guess of 32.
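For reference, here is the plain Bessel-corrected sample SD of those ten quoted sums (just a baseline computation, not the "maximum sample SD" procedure, whose details aren't spelled out above):

```python
import statistics

# The ten quoted values, each said to be a sum of 25 points
sums = [132, 270, 365, 330, 410, 515, 240, 267, 335, 221]

sd_of_sums = statistics.stdev(sums)            # Bessel-corrected SD of the sums
implied_per_point_sd = sd_of_sums / 25 ** 0.5  # if the 25 addends were iid
print(round(sd_of_sums, 1), round(implied_per_point_sd, 1))  # 107.5 21.5
```

The per-point figure assumes the 25 addends are independent and identically distributed, which is itself one more unverifiable presumption.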
[Histogram of the 10 sums omitted; x-axis is values.]
[Histogram of the 10 sums omitted; x-axis is standard deviations.]
*there aren't any tests of non-normality that are sensitive with so few data points.