Variance question

12-04-2015 , 12:39 PM
Quote:
Originally Posted by masque de Z
It's not orders of magnitude; it's at worst a factor of 3, pending better analysis of the ridiculous 3-point statistics (but you'd better believe that even 3 points tell you something).

Either they use a CLT-friendly distribution and it works, or they don't. All the major distributions seen in books pass it if they meet the rho/sd^3/n^(1/2) < 0.15 or so criterion. I can list a ton that pass it.
No, three data points don't tell you anything about the underlying distribution.

Again, show me one of these books where it states that summations of 20-27 are sufficient given an unknown underlying distribution. Just one.

Again, "well, every example I have seen in a textbook is x" is an argument from ignorance. https://en.wikipedia.org/wiki/Argument_from_ignorance Now that you have seen distributions where it doesn't, you have no excuse to keep saying it. You learned something new. Pretending that it matters that you didn't know something before is just a weird thing to do. No one is mad at you for not knowing something at the outset of a conversation.
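
For anyone who wants to poke at the criterion quoted above, here is a minimal sketch. It assumes rho means the third absolute central moment E|X - mu|^3 and sd means the standard deviation of a single draw; the quote doesn't define either, so that reading is an assumption, as is leaving out any Berry-Esseen-style constant. The helper name `clt_ratio` and the three example distributions are hypothetical, chosen only for illustration.

```python
import math

def clt_ratio(values, probs, n):
    """Return rho / sd^3 / sqrt(n) for a discrete distribution.

    Assumption: rho is the third absolute central moment E|X - mu|^3
    and sd is the standard deviation of one draw.
    """
    mu = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    rho = sum(p * abs(v - mu) ** 3 for v, p in zip(values, probs))
    return rho / var ** 1.5 / math.sqrt(n)

# Hypothetical example distributions, each summed n = 25 at a time.
fair_coin = clt_ratio([0, 1], [0.5, 0.5], 25)
rare_event = clt_ratio([0, 1], [0.99, 0.01], 25)
six_sided_die = clt_ratio([1, 2, 3, 4, 5, 6], [1 / 6] * 6, 25)

print(f"fair coin:  {fair_coin:.3f}")
print(f"rare event: {rare_event:.3f}")
print(f"die:        {six_sided_die:.3f}")
```

Note that rho/sd^3 is at least 1 for any distribution, so on this reading the ratio at n = 25 is never below 0.2; the fair coin sits exactly at that floor. How the 0.15 threshold is meant to be applied isn't spelled out in the quote.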

Quote:
If you had to choose something, what would be the best choice if a distribution was creating it? It either finds it or it doesn't. But it's your only chance to get it right.
The correct answer is still, "no, we can't know/calculate the variance because we don't have the right data to do so."

Quote:
I can think of many applications where all it takes is to be correct within an order of magnitude, in some desperate situation where you can't measure more due to practical limits. If you find something in concentrations that are lethal and you must be alarmed enough to do more testing, it's OK to be alarmed in that neighborhood. If you needed to go out of some habitat on Mars and all you knew was how high the solar radiation event was at that moment at the surface, you wouldn't mind whether it's x or 10x, since even 0.1x is lethal, for example. If the properties of the solar event satisfied distributions that were CLT-friendly, it would be a good guess not to go out. If the signal could be caused by a ton of things (even those that are not at all friendly to fast CLT convergence) but some lethal ones among them had these properties, such an estimate might mean something of value indeed, why not. You cannot quantify your risk, but you know it could be there.
I can think of many distributions that don't converge to a normal curve with summations under 30. Is that an indication that there aren't distributions that do? Would that be me making an argument from ignorance?

Yes. It would be an argument from ignorance. Same as your paragraph.

Quote:
Why is it so difficult to imagine that, in a case of uncertainty, you cannot know what you have, but you can know that if what you have shares a given property, the result means something? In many cases in life you can think something is there or not; if it's not there, your failure in predicting something doesn't matter, but if it is, you save the day.
You can state, "well, if we presume that the three data points we have come from a population distribution that converges to normal via CLT at n<=20 AND we presume that they were a randomly selected unbiased sample of said population without removal AND we presume that our sample mean is exactly equal to the population mean AND we presume that minimizing the squared difference of 3 data points from said sample/population mean will give us a reasonable estimation of the variance of the underlying sample AND we presume that we don't have to apply a correction such as Bessel's correction to say that said sample variance is equal to the population variance, then..."

Should we presume all of those things? **** no!
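
To make the Bessel's correction item in that list concrete, here is a minimal sketch using Python's standard `statistics` module. The three values are hypothetical and are not the data from this thread; the point is only how much the divisor matters when n = 3.

```python
import statistics

# Hypothetical three measurements; not the data from the thread.
sample = [10.0, 20.0, 30.0]

# Divide by n: treats the sample mean as if it were the population mean.
biased = statistics.pvariance(sample)

# Divide by n - 1: Bessel's correction.
unbiased = statistics.variance(sample)

print(biased, unbiased, biased / unbiased)
```

With three points the uncorrected figure is a third smaller than the corrected one, and neither number says anything about the shape of the underlying distribution.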

Quote:
What if they asked you to predict the SD within a factor of 3 in this problem, given that the mechanism produces a distribution? If you get it right within a factor of 3, you get $10000. If you fail, you get nothing. Would you choose to predict anything then? How would you approach a prediction? What if you had 10 measurements instead of 3?
"20."

"You lose because SD is an appropriate measure of dispersion only for normal curves. You should have stated the lognormal SD or at the very least have given the average absolute deviation from the mean."

Quote:
So Brian take your girl analogies and shove them because any sensible girl would be insulted that this is how you create analogies missing entirely the point.
See my "well, if..." paragraph above. The analogy was apt.

Quote:
You can't hide for long behind your unwillingness to play the game with a distribution of your choice that passes the criterion. But the ultimate test is this: 10k if you guess it within a factor of 2, say, or nothing if you miss or fail to guess. What is your guess? What if the data was 132, 270, 365, 330, 410, 515, 240, 267, 335, 221 (all 25-point sums, say)?
"Can you know/estimate the number of fingers I am holding behind my back?"

"No."

"If I offer a million dollars and you get the answer correct, then if you guess that means that you think that you can know/estimate the number of fingers I am holding behind my back."

"Ummm. No. It doesn't mean that at all."

For those 10 summations, given that the distribution doesn't appear to be normal and definitely doesn't have a strong peak,* I calculate the maximum sample SD and divide by 2, which gives a wild guess of 32.

[Plot not preserved: x-axis is values]

[Plot not preserved: x-axis is standard deviations]

*there aren't any tests of non-normality that are sensitive with so few data points.
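
For comparison, the plain arithmetic behind a "treat each value as a sum of 25 independent draws" estimate can be sketched as follows. This is not the calculation behind the guess of 32 above. It assumes the 25 draws in each sum are independent and identically distributed with finite variance, which is exactly the kind of presumption being argued over in this thread.

```python
import math
import statistics

# The ten values from the quoted post, each said to be a sum of 25 draws.
sums = [132, 270, 365, 330, 410, 515, 240, 267, 335, 221]
draws_per_sum = 25

sd_of_sums = statistics.stdev(sums)  # sample SD, Bessel-corrected

# Assumption: the 25 draws in each sum are independent and identically
# distributed with finite variance, so Var(sum) = 25 * Var(single draw).
sd_per_draw = sd_of_sums / math.sqrt(draws_per_sum)

print(f"SD of the sums:      {sd_of_sums:.1f}")
print(f"Implied SD per draw: {sd_per_draw:.1f}")
```

The division by sqrt(25) relies only on independence and finite variance, not on normality; how far a ten-point sample SD can be trusted is a separate question.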
12-18-2015 , 09:23 AM
If we can rule out magic or tomfoolery and establish the tree is actually dropping apples naturally containing gold, then I would assume a fairly uniform distribution among the fruit (as well as much of the rest of the tree, branches, leaves, etc.) and these four measurements would be fine to estimate the variance. That said, you can keep your poisonous apple tree; if there's enough gold in the orebody to leach into the groundwater, I'm gonna dig it all up!
12-18-2015 , 09:56 AM
What's the basis for that assumption? We know nothing about the tree or how it magically produces gold. You can assume a uniform distribution or a normal distribution just to make the math easy, but that's not a meaningful answer to the question.

12-18-2015 , 10:15 AM
^ Ya.

Boötes void

And there's humanity itself. All things being equal, a normal distribution in the physical universe is based on clumping. Sure you can posit an idea that there is a counter-point to Boötes and that it's still "normal distro" but here's the thing in a baryonic set this vast...

The bound is != 1, usually {...-1, 2...} for teritrary sets.
12-18-2015 , 10:51 AM
Well, I tend to always rule out magic, because magic isn't real. So we're left with some natural process or hoax. If I hear of a tree producing golden apples, tbh the first thing I'll assume is it is some form of hoax until proven otherwise, and we can predict nothing much about the next apple dropped. So if I am going to estimate the variance, I'm going to first qualify it with the assumption the tree is actually producing gold-laced apples by some natural process.
12-18-2015 , 11:12 AM
Why wouldn't it?

Just as lubrication for locomotion needn't use water, there is absolutely no reason that as you go deeper into a galaxy, closer to the cores, that botanical life wouldn't use heavy elements in increasing ratios.

There are bodies that stay in steady multi-g for really long periods.

And sure it's possible to biohack existing botany to induce golden apples. It'll take a while though.

nm done. You know there are trace amounts of gold in Pall Mall Golds, right?

****'s ****ing delicious.
12-18-2015 , 11:13 AM
There are natural processes that produce non-uniform distributions. Someone earlier said that real apples come in a bimodal distribution. I'd also point out that producing golden apples is quite magical. There's no reason to assume a uniform distribution based on the info we have at hand.
12-18-2015 , 11:22 AM
Local uniform distros, no.

It's more complex than that, Trolly, but if someone else comes up with an equation, they'll let me know. Or publish it themselves.

Either way, Clarke's 3rd indeed.
12-18-2015 , 11:58 AM
Keep in mind though that bimodal is not a problem in itself. It's the very sharp, isolated nature of the peaks (if they're like a fork with thin tops) that makes convergence of the sums to normal in only 25 a problem (e.g., see the parabolic one in a link I gave earlier with examples of starting distributions, with tops at the left and right, which is a similar problem to bimodal). So most bimodals people know (tops with some spread, relatively close together, with separation comparable to their widths) would typically still pass the summation-to-normal approximation nicely.

If you have no reason to anticipate a reasonable distribution that converges fast to normal at n=25, you have an undefined problem. But if you had to make a guess, you would typically be OK for most distributions of natural phenomena that don't have very sharp multiple tops or well-separated population neighbourhoods. There is still, of course, significant SD uncertainty because of the limited number of sums/data (only 3+1).

Last edited by masque de Z; 12-18-2015 at 12:03 PM.
12-18-2015 , 12:07 PM
Quote:
Originally Posted by Trolly McTrollson
There are natural processes that produce non-uniform distributions. Someone earlier said that real apples come in a bimodal distribution.
Sure, but we can use what we know about the biology of the apple tree to make reasonable assumptions about the most likely route of gold deposition within the fruit. For example, it may turn out that, due to the chemical properties of the fruit, most of the gold salts will most likely end up in the seeds. We may know from biology that apples from this species of tree have a bimodal distribution of seeds within the fruit (or even an uneven distribution of high acidity between the seeds, which would affect gold salt absorption), and we take that into account in our estimation. Further sampling can reinforce or weaken this theory.

Other non-hoax theories include air deposition (smelter nearby, airplane from Fort Knox exploded overhead), gold-laced insect infestation (which leaves us wondering where the worms ate the gold), and I can't think of much else offhand, but feel free to add your own. The point is we don't just have these four samples to go by, but a wealth of information about nature to use in forming our testable theories.

Quote:
I'd also point out that producing golden apples is quite magical. There's no reason to assume a uniform distribution based on the info we have at hand.
It's only "magic" until it's explained. But we should always rule out supernatural forces and look for natural causes, or we may as well all regress backwards to superstitious times. You're right, we can't assume uniform distribution; the most likely explanation is obviously a hoax, not a more predictable natural process.
12-19-2015 , 01:56 PM
http://www.hindawi.com/journals/ijce/2011/939161/

Paper outlining the process of phytoextraction.

Quote:
4.1. Phytoextraction

Phytoextraction is the uptake/absorption and translocation of contaminants by plant roots into the above ground portions of the plants (shoots) that can be harvested and burned gaining energy and recycling the metal from the ash [28, 39–42].


---------------------


Interesting article showing that a possible explanation for our magic tree could be microbial:

http://www.nature.com/nature/journal...ATURE-20130314

Quote:
Take a solution of gold chloride, a compound toxic to most forms of life. Add a colony of Cupriavidus metallidurans, one of the few bacteria able to survive amid compounds of heavy metals in mines across the world. As the bacteria accumulate the gold salt from the solution, biochemical processes within the organisms reduce it to the pure metal, which the bacteria excrete in the form of tiny gold nuggets — nanoparticles of pure gold. The bacteria produce the gold as protection from the toxic gold complexes that would otherwise destroy their cells.
12-19-2015 , 02:11 PM
You miss the point.

There are natural processes that obey uniform distributions. There are natural processes that obey non-uniform distributions.

There are also magical processes (I guess?) that obey either uniform or non-uniform statistics. Stipulating that the gold tree is non-magical doesn't tell us anything about the distribution. Telling you I'll give you a million dollars if you guess the distribution doesn't give us any usable information either.

And yes, I do agree that if someone is desperate, he/she can take a wild-ass guess at the answer. I'm not sure why that's a point that needs to be invoked here, since it's trivially true of any question ever.
12-19-2015 , 02:55 PM
Because the guess in that case is anything but desperate. It is very much on target for a vast number of distributions, including many multimodal ones that are not very spiky with supertight widths relative to their population spreads. So if, say, 50% of them are like that, your guess will be correct 50% of the time using CLT to get there, and the rest of the time it won't be totally horrible either, unless this is not a distribution at all, of course. Of course you have no way of knowing whether it's 50% or 80% or 10% of the distributions that could describe such a phenomenon, and the real problem each time may offer more insight. But if you can't do anything else, this is the last thing left. It will work for many distributions that are not at all normal initially.

It will probably be the case for the distributions of the effect that the papers FoldnDark linked describe, although those may also vary with many things, like the mass of the fruit, height from the ground, number of apples that are ripe together at a point in time, etc. But I do not imagine that, if there is a distribution in these cases, it will be some spectacularly segmented one that fails CLT summation at n=25 (for that type of rare-substance concentration in the apple, which should mostly depend on the mass distribution of the apples).

So if your life depended on a guess, this is the guess you would be making, the only one possible if this is a distribution.

Sure, you can find a ton of distributions that won't sum up well at n=25, but the majority of the textbook ones will do fine, because most of them converge well at n=25, as the links I provided show.

It's not bad math; it's the only math possible here, if math can offer any help at all.

Last edited by masque de Z; 12-19-2015 at 03:03 PM.
12-19-2015 , 03:18 PM
Quote:
Originally Posted by Trolly McTrollson
You miss the point.

There are natural processes that obey uniform distributions. There are natural processes that obey non-uniform distributions.

There are also magical processes (I guess?) that obey either uniform or non-uniform statistics. Stipulating that the gold tree is non-magical doesn't tell us anything about the distribution. Telling you I'll give you a million dollars if you guess the distribution doesn't give us any usable information either.

And yes, I do agree that if someone is desperate, he/she can take a wild-ass guess at the answer. I'm not sure why that's a point that needs to be invoked here, since it's trivially true of any question ever.
Maybe I am missing your point... but sure it does. Just because there may be natural processes that are not uniform doesn't mean any of them are likely to get gold into our fruit, or that one process is as likely as the next. Sun spots may occur in non-uniform fashion, but I can form no model around them that gets gold into our fruit. Currently, I can think of plenty of relevant natural processes that would be fairly uniform and no relevant natural process that would not be. There could be something I'm not imagining, and of course, some joker could just be messing with us. Regardless, each estimate would be qualified by listing the theory behind it. For example:

Theory A: Phytoextraction, estimate = a
Theory B: Microorganism, estimate = b
Theory C: Air deposition, estimate = c
Theory D: Some unimagined, highly erratic natural process, estimate = d (low confidence)
Theory E: Hoax, estimate = e (low confidence)

Fwiw, I'm still banking on our tree being a hoax, and I'm beginning to think it was you!
12-19-2015 , 04:33 PM
I thought of a natural theory that would be non-uniform. Bird poop! If a bird nesting in the tree had been eating from a treasure chest nearby that contained gold in various sizes from dust to 1-2 ounce beads it could then be crapping gold onto our apples in a wide range of amounts, also probably in a relatively small part of the tree.
12-19-2015 , 06:07 PM
Could be an apple tree with different varieties grafted on (I have a 4 in 1 tree) and each type behaves very differently.
12-19-2015 , 06:28 PM
Or it could be a normal apple tree other than the whole gold-in-them-there-apples thing, which would make it a decidedly non-normal distribution of apple sizes AND with the added problem of apples falling in clusters reducing your actual sample sizes by 80%.
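
The cluster point can be put in numbers with Kish's design-effect approximation for equal-sized clusters, n_eff = n / (1 + (m - 1) * icc). This is a rough sketch only: the cluster size of 5, the 25 apples, and the intra-cluster correlation values are hypothetical, picked so that the fully correlated case matches the 80% reduction mentioned above.

```python
def effective_sample_size(n_apples, cluster_size, icc):
    """Kish's approximation: n_eff = n / (1 + (m - 1) * icc).

    icc is the intra-cluster correlation (0 = apples in a cluster are
    unrelated, 1 = apples in a cluster are effectively copies).
    """
    design_effect = 1 + (cluster_size - 1) * icc
    return n_apples / design_effect

# Hypothetical numbers: 25 apples that fell in clusters of 5.
for icc in (0.0, 0.5, 1.0):
    print(icc, effective_sample_size(25, 5, icc))
```

With clusters of 5 and an intra-cluster correlation of 1, the 25 apples carry the information of only 5 independent ones; with no correlation, nothing is lost.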
12-19-2015 , 08:40 PM
Yeah, shaking the tree may not be the most scientifically rigorous of sampling methods.
12-20-2015 , 01:13 PM
Quote:
Originally Posted by FoldnDark
Yeah, shaking the tree may not be the most scientifically rigorous of sampling methods.
Of course, that apples form and fall off in clusters rather than as individual fruits, and that the size of each individual apple in a cluster is governed by certain properties of the cluster, is esoteric knowledge only accessible to those who have had the extremely rare mystical experience of having actually encountered an actual apple tree in the wild.
12-20-2015 , 01:32 PM
Since it's a "special" tree, what is wrong with assuming that the amount of gold in each apple is a random number between 0 and n? Is it that there is no reasonable choice of n?
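
If one did go with that assumption, a minimal sketch of what it would imply looks like this. The four values are hypothetical, not the measurements from this thread, and the whole calculation stands or falls with the Uniform(0, n) assumption itself.

```python
# Hypothetical four measurements; not the data from the thread.
measurements = [3.0, 7.0, 4.0, 8.0]
k = len(measurements)

# For Uniform(0, n), (k + 1) / k * max(sample) is an unbiased estimate of n.
n_hat = (k + 1) / k * max(measurements)

# The variance of Uniform(0, n) is n^2 / 12; this is a plug-in estimate.
variance_hat = n_hat ** 2 / 12

print(n_hat, variance_hat)
```

So a choice of n can be backed out of the data, but only after the uniform shape has been assumed; nothing in the four values tests that assumption.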
12-20-2015 , 02:25 PM
Quote:
Originally Posted by BrianTheMick2
Of course, that apples form and fall off in clusters rather than as individual fruits, and that the size of each individual apple in a cluster is governed by certain properties of the cluster, is esoteric knowledge only accessible to those who have had the extremely rare mystical experience of having actually encountered an actual apple tree in the wild.

I don't know what kind of variance you're looking for, but they're apples. All pretty similar when you shake off the bird ****. Anyway it's not like we're comparing them to oranges.