Quote:
Originally Posted by TomCowley
What gives you the right to claim all the symmetries to use in indifference? They really seem plucked right out of thin air in this case. If there's a synopsis somewhere that justifies this while excluding legal configuration indifference, please link it- it'll save both of us some time.
The question in the OP is whether the 2-ball problem is well-posed, in the context of objective Bayesianism. I have tried to explain that context and I think you understand it quite well. You simply do not agree with it. That is fine, of course, but it is quite irrelevant to the question in the OP.
If you really want the full details, you will not get them from some linked synopsis. You will have to read quite a bit. To point you in the right direction: what I am describing here is the philosophy of objective Bayesianism, as developed on the basis of the Cox-Pólya-Jaynes desiderata. Read at least the first two chapters of
this book. The indifference principle is derived in Chapter 2, and problems similar to the ones in this thread are considered in Chapter 6. The derivation of the indifference principle is based on Desideratum (IIIc), which says,
"Equivalent states of knowledge must be always represented by equivalent probability assignments. That is, if in two problems our state of knowledge is the same (except perhaps for the labeling of the propositions), then we must assign the same probabilities in both."
Quote:
Originally Posted by TomCowley
Ok, so let's just define the problem as "J = We have an urn with 3 balls numbered 1-3. The urn either contains WBB, WWB, BWB, BWW, or BBW."
All right. We begin the proof of well-posedness by first observing that J is logically equivalent to J1 & J, where J1 = "We have an urn with 3 balls numbered 1-3. The urn either contains WBW, WBB, WWB, BWB, BWW, or BBW." The set of configurations in J1 is invariant under permutations of the ball labels, so, as before, we may apply the indifference principle to J1, and then combine the two information sets using the laws of probability. QED
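For concreteness, the two steps can be worked out numerically (a minimal sketch; the configuration sets are the ones quoted above, and the variable names are mine):

```python
from fractions import Fraction

# J1: all 3-ball configurations with at least one white and one black
# ball.  This set is invariant under permutations of the ball labels
# 1-3, so the indifference principle assigns each configuration 1/6.
J1 = ["WBW", "WBB", "WWB", "BWB", "BWW", "BBW"]
prior = {c: Fraction(1, 6) for c in J1}

# J removes WBW.  Conditioning on J is an ordinary application of the
# laws of probability: restrict to J and renormalize.
J = ["WBB", "WWB", "BWB", "BWW", "BBW"]
Z = sum(prior[c] for c in J)
posterior = {c: prior[c] / Z for c in J}

for c in J:
    print(c, posterior[c])  # each of the 5 configurations gets 1/5
```

The point of the sketch is that the second step is not an extra appeal to indifference; it is just renormalization.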
This proof is 100% logically correct. You seem to want a proof that uses only the indifference principle, as though the laws of probability were out of bounds, or somehow invalid. That is, of course, ridiculous from an objective Bayesian standpoint. The indifference principle and the laws of probability are both derived from the same desiderata, and no objective Bayesian believes that all probabilities can be calculated from the indifference principle alone.
Quote:
Originally Posted by TomCowley
This looks like a straw man
This example is actually very germane to the topic. One of the applications of this subject is in machine learning. For example, imagine there is an urn with 2500 balls, some white and some black. A machine is going to sample from the urn with replacement, and use those samples to try to learn about the actual contents of the urn. The machine uses Bayes' theorem to update its estimates after each sample. To get started, the machine needs a prior. The prior should be an
uninformative prior, since the machine initially has no information about the contents of the urn.
The independence assumption is virtually useless in Bayesian machine learning because it produces a prior on the proportion of white balls that is very nearly a delta function. In informal language, the machine would start off so strongly convinced that half the balls are white that it would take an enormous number of samples to convince it otherwise. Even if its first 200 samples were all white, its estimate of the proportion of white balls would only increase to about 54%.
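The 54% figure is easy to check numerically (a sketch under the stated assumptions: 2500 balls, each independently white with probability 1/2, and 200 white draws with replacement; the computation is done in log space to avoid underflow):

```python
from math import lgamma, log, exp

N = 2500   # balls in the urn
k = 200    # consecutive white samples observed (with replacement)

def log_weight(n):
    # log of prior(n) * likelihood(n): the independence assumption puts
    # a Binomial(N, 1/2) prior on the number of white balls n, and k
    # white draws with replacement have likelihood (n/N)^k.
    log_prior = lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1) + N * log(0.5)
    return log_prior + k * log(n / N)

ns = range(1, N + 1)           # n = 0 has zero likelihood
lw = [log_weight(n) for n in ns]
m = max(lw)
w = [exp(x - m) for x in lw]   # shift by the max before exponentiating
Z = sum(w)
mean_frac = sum((n / N) * wi for n, wi in zip(ns, w)) / Z
print(f"posterior mean fraction white: {mean_frac:.3f}")  # about 0.54
```

The prior has standard deviation of only about 25 balls around 1250, which is why 200 straight white samples barely move it.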
If it samples without replacement, the situation is even worse. In that case, it cannot learn anything at all about the unsampled balls. In the above Jaynes book, this is called the binomial monkey prior, and is discussed in Section 6.7. It is nearly the exact opposite of what an uninformative prior is supposed to be. No serious Bayesian would ever use it in an application.
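The without-replacement pathology can be verified in a toy case (a sketch of my own, not from the book: 3 balls with the independent fair-coin prior, one ball sampled and observed white):

```python
from itertools import product
from fractions import Fraction

# Independence assumption: each of 3 balls is white with probability
# 1/2, so all 8 color configurations are equally likely a priori.
configs = list(product("WB", repeat=3))
prior = {c: Fraction(1, 8) for c in configs}

# Sample ball 0 without replacement and observe that it is white.
consistent = [c for c in configs if c[0] == "W"]
Z = sum(prior[c] for c in consistent)
post = {c: prior[c] / Z for c in consistent}

# Probability that the unsampled ball 1 is white: unchanged at 1/2,
# because under the independent prior the observation carries no
# information about the other balls.
p_ball1_white = sum(p for c, p in post.items() if c[1] == "W")
print(p_ball1_white)  # 1/2
```

However many balls are drawn, the posterior on the undrawn balls stays exactly at the prior, which is what "cannot learn anything at all" means here.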
In practice, the most commonly used prior in this situation is the one that is uniform on the number of white balls. In the 2-ball example, that would give P(BB) = P(WW) = 1/3 and P(BW) = P(WB) = 1/6. My question is whether this, or any other distribution, can be derived objectively (i.e. from the desiderata, or if you cannot be bothered to read Jaynes, from the principles and methods I have described in this thread).
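For concreteness, here is the uniform-on-count construction written out for the 2-ball case (a minimal sketch; the resulting distribution is the one stated above):

```python
from fractions import Fraction
from math import comb

# Prior uniform on the number of white balls k in {0, 1, 2}: each k
# gets mass 1/3, split equally among the C(2, k) configurations with
# that many white balls.
n = 2
configs = ["BB", "BW", "WB", "WW"]
prior = {c: Fraction(1, n + 1) / comb(n, c.count("W")) for c in configs}

for c in configs:
    print(c, prior[c])  # BB 1/3, BW 1/6, WB 1/6, WW 1/3
```

Note that this is exactly the distribution one gets by first applying indifference to the count and then splitting each count's mass by indifference over configurations; the open question in the post is whether such a derivation is licensed by the desiderata.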
Incidentally, if we were permitted to permute things other than labels, then we could simply permute the number of white balls to derive P(BB) = 1/3. In fact, I believe this illustrates one reason why we should not be permitted to permute things other than labels. Labels do not carry information. They are simply bookkeeping devices that index our hypotheses. By restricting ourselves to label permutations, I believe we entirely avoid situations where one kind of permutation leads to one distribution, while another permutation leads to something different.
Last edited by jason1990; 01-29-2009 at 11:10 AM.