Quote:
Originally Posted by 46:1
I recently started toying around with Benford's law on sets of naturally occurring numbers.
Benford's law is not actually a part of probability theory. That is, it is not a mathematical theorem that can be stated and proved mathematically. (In contrast, the law of large numbers and the central limit theorem are mathematical theorems.) Instead, Benford's law is simply an observation about the statistical behavior of a wide variety of empirical data sets. It does not hold universally, nor does it hold exactly. Many data sets do not follow Benford's law, and those that do will follow it only approximately, with varying degrees of "closeness."
In my opinion, Benford's law is not very surprising. Let's take some arbitrary data set:
1691397
14925359
25902260
1358541
1153750
956000523
3603519
619955
4989127
383790212
...
Now take the base-10 logarithm of each number on the list:
6.228245556
7.173924786
7.413337658
6.13307275
6.062111714
8.98045813
6.556726816
5.792360167
6.698024559
8.584093895
...
Now drop the integer part of these numbers:
0.228245556
0.173924786
0.413337658
0.13307275
0.062111714
0.98045813
0.556726816
0.792360167
0.698024559
0.584093895
...
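If you want to reproduce the two steps above, here is a minimal Python sketch. It uses only the ten sample values listed in this post; the helper name log_fraction is just something I made up for illustration.

```python
import math

# The ten sample values listed above.
data = [1691397, 14925359, 25902260, 1358541, 1153750,
        956000523, 3603519, 619955, 4989127, 383790212]

def log_fraction(n):
    """Return the fractional part of log10(n) for a positive number n."""
    x = math.log10(n)
    return x - math.floor(x)

fractions = [log_fraction(n) for n in data]

# Print each number next to the fractional part of its base-10 logarithm.
for n, x in zip(data, fractions):
    print(f"{n:>10d}  {x:.9f}")
```

Ten values are far too few to judge uniformity, of course; the sketch only shows the mechanics.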
Would you be surprised to discover that this new list of numbers is approximately uniformly distributed on [0,1]? Well, this is just Benford's law! Here is why.
Take some positive integer N. Let X be the fractional part of its base-10 logarithm. Then the leading digit of N is just [10^X], where "[a]" denotes "the greatest integer less than or equal to a." Can you figure out why this is true?
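A quick numerical check of this claim, as a sketch rather than a proof: the function names below are my own, and the sample values are the ones from the list above. One caveat I am assuming matters in practice: with floating-point arithmetic, a number such as 2000000, whose 10^X is exactly an integer, could in principle come out one digit too low because of rounding.

```python
import math

def leading_digit_via_log(n):
    """Leading digit of a positive integer n, computed as floor(10**X)."""
    x = math.log10(n) % 1.0      # X: fractional part of log10(n)
    return math.floor(10 ** x)

def leading_digit_via_string(n):
    """Leading digit read directly from the decimal representation."""
    return int(str(n)[0])

samples = [1691397, 14925359, 25902260, 1358541, 1153750,
           956000523, 3603519, 619955, 4989127, 383790212]

# Compare the two methods on each sample value.
for n in samples:
    print(n, leading_digit_via_log(n), leading_digit_via_string(n))
```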
Now, what is the probability that the leading digit of N, which is [10^X], is d? It is just the probability that
d <= 10^X < d + 1,
or, taking base-10 logarithms of all three parts,
log(d) <= X < log(d + 1).
But if you are sufficiently convinced that X is uniformly distributed on [0,1], then this probability is just
log(d + 1) - log(d),
which is Benford's law.
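To see the numbers, here is a small Python sketch that tabulates log(d + 1) - log(d) for d = 1, ..., 9 and compares it with a simulation. The simulation assumes X is drawn exactly uniformly from [0,1), which is precisely the assumption made above and not something real data is guaranteed to satisfy; the seed and trial count are arbitrary choices of mine.

```python
import math
import random

# Benford probabilities P(d) = log10(d + 1) - log10(d) for d = 1..9.
benford = {d: math.log10(d + 1) - math.log10(d) for d in range(1, 10)}

# Assumption for illustration: X is exactly uniform on [0, 1).
rng = random.Random(0)           # fixed seed so runs are repeatable
trials = 100_000
counts = {d: 0 for d in range(1, 10)}
for _ in range(trials):
    x = rng.random()
    counts[math.floor(10 ** x)] += 1   # leading digit is floor(10**X)

for d in range(1, 10):
    print(f"{d}: predicted {benford[d]:.4f}, simulated {counts[d] / trials:.4f}")
```

The nine predicted values sum to 1, since the sum telescopes to log(10) - log(1).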
Quote:
Originally Posted by 46:1
Can Benford's law also be used to assign a probability to a set of numbers that doesn't follow this distribution? That is, in fraud cases, where the numbers didn't occur naturally but were modified or created. Can you say: these numbers are off by more than 3 standard deviations, so the chance that this happens naturally is z?
So not: something fishy is going on, but: there is only a 0.05% chance of these numbers being distributed like this in a natural way?
I have never seen anyone attempt to do something so quantitative, and I would be extremely skeptical of any such analysis.
Quote:
Originally Posted by 46:1
Also, does (and why would) nature follow a logarithmic scale?
The claim that "nature follows a logarithmic scale" is, in my opinion, way too broad and sweeping, and just not true. Logarithmic relationships are certainly very common in nature, and can result from physical principles such as the exponential growth of a physical quantity or the memoryless nature of a physical phenomenon. But in this case, the appearance of the logarithm seems to be related more to the connection between the leading digit of a number in base b and the base-b logarithm. This, of course, is a mathematical relationship and not a physical one.