I think I am a statistical frequentist

09-30-2014 , 05:48 AM
Quote:
Originally Posted by nickthegeek
A Bayesian asks himself: do I have any reason to believe that tails are more likely than heads (or the reverse) on the next toss? The answer is no. So P(Heads) = 1/2.
But why is that not circular? If "P(Heads) = 1/2" means "I have no reason to believe that Heads or Tails are not equally likely", then what does "likely" mean?
09-30-2014 , 06:19 AM
Quote:
Originally Posted by lastcardcharlie
But why is that not circular? If "P(Heads) = 1/2" means "I have no reason to believe that Heads or Tails are not equally likely", then what does "likely" mean?
It's not circular. The definition of probability is one thing. As I said, for a Bayesian it reflects the concept of "fair odds" and quantifies how much one believes in an event, given his state of information. Assigning a probability to a given event is another thing. In the coin-toss example, lacking any other information, fair odds means even odds, since there is no reason to prefer one side over the other. So the Bayesian assesses the probability of this event at one half.

This is what every one of us does every time. When you say that the probability of drawing the Ace of Spades from a standard deck is 1/52, it's not because some weirdo keeps drawing cards from a deck to measure the long-run frequency, but because you know that the card you are about to draw could be any of them, and you have no reason to believe that any particular card is more likely than another. So you assess the probability at 1/52.
09-30-2014 , 06:26 AM
Quote:
Originally Posted by nickthegeek
As I said, for a bayesian it reflects the concept of "fair odds"...
Okay, so "P(Heads) = 1/2" meaning "if you are offered better than even money on Heads, take the bet" is not circular.

Quote:
...and you don't have any reason to believe that a particular card is more likely than another.
The question is plain enough: define "likely".
09-30-2014 , 06:33 AM
Quote:
Originally Posted by lastcardcharlie
The question is plain enough: define "likely".
I guess likely = probable. When I say "A is more likely than B", I mean that I'd prefer to bet on A rather than on B at the same odds. I wouldn't prefer either side between heads and tails at the same odds, so they are equally likely.
09-30-2014 , 06:51 AM
Quote:
Originally Posted by nickthegeek
When I say "A is more likely than B" I mean that I'd prefer to bet on A rather than on B at the same odds.
Why would you prefer to? I assume it's because you like money and make bets that you expect to profit from in the long run. But now we're back somewhere close to frequentism. The meaning of "P(Heads) = 1/2" has become "I have no reason to believe that I will profit from betting on either Heads or Tails", which seems in effect to be saying that "I have no reason to believe that, when the coin is flipped, Heads will come up more often than Tails or vice versa".
09-30-2014 , 07:03 AM
Quote:
Originally Posted by lastcardcharlie
Why would you prefer to? I assume it's because you like money and make bets that you expect to profit from in the long run. But now we're back somewhere close to frequentism.
No, frequentism is not about this. Bayesians don't deny the long-run concept. Of course, if I make many bets, each of which I assess at 70%, I expect to win 70% of the time in the long run. But this has nothing to do with frequentism. For a frequentist, you can assign a probability to an event only if the event is infinitely repeatable. You cannot say that the probability of the Patriots winning the next Super Bowl is x, since that event is not repeatable. Sure, if I make a good bet I'd expect to profit in the long run. But a frequentist cannot even define a good bet, since only repeatable events have probabilities.
09-30-2014 , 07:11 AM
Just to be clear: a Bayesian expects that the long-run frequency approaches the assessed probability, while the frequentist defines the probability as the long-run frequency.
09-30-2014 , 09:58 AM
Perhaps I am being obtuse, but it is still not clear to me how a Bayesian defines "probable" or "likely". Specifically, can any part of the definition refer to what will happen when the coin is flipped, or must it all be expressed in terms of the current internal mass distribution of the coin, the laws of physics, etc.?

You have said that testing the coin may be considered as evidence:

Quote:
Originally Posted by nickthegeek
This [the Bayesian probability] can change if I amass some evidence. If the coin keeps landing on tails, the probability may change toss after toss (depending on how likely I believe it is that the coin is unfair to start with).
and then go on to say that a problem with frequentism is that experiments cannot be repeated:

Quote:
Originally Posted by nickthegeek
For a frequentist, you can assign a probability to an event only if the event is infinitely repeatable. You cannot say that the probability of the Patriots winning the next Super Bowl is x, since that is not repeatable.
This seems to me like the Bayesian wants to have it both ways. Each past experiment (coin flip) is evidence for the Bayesian, but future experiments are different enough to preclude the frequentist from assessing a probability.
09-30-2014 , 11:34 AM
Coincidental timing? What would the frequentist/Bayesian say?

http://www.nytimes.com/2014/09/30/sc...ated.html?_r=0
09-30-2014 , 11:57 AM
Quote:
Originally Posted by lastcardcharlie
Perhaps I am being obtuse, but it is still not clear to me how a Bayesian defines "probable", or "likely". Specifically, can any part of the definition refer to what will happen when the coin is flipped, or must must it all be expressed in terms of the current internal mass distribution of the coin and the laws of physics, etc?
It seems to me that you are still confusing the definition of probability with its assessment. How anyone estimates the probability of an event is based on the information he has. In the case of the coin, you know that the coin will land; that there are just two possible ways it can land; and that the degree of belief you have in either outcome is the same.

Of course, past similar events modify your information status and may modify your probability evaluation. Think about this scenario. You hear on the news that a small fraction (~5%) of the new coins are irregular. Half of the irregular coins land heads 70% of the time, while the other half land tails 70% of the time. You have a new coin in your pocket.

1) What's the probability of tossing tails for a Bayesian? And for a frequentist? On which side would you bet at even odds?

2) Suppose you toss tails. What's the probability of tossing tails on the next toss for a Bayesian? And for a frequentist? On which side would you bet at even odds?

If you can answer those questions, you've got the difference between Bayesian and frequentist. And, by the way, it is not the Bayesians who preclude frequentists from assessing probability. Frequentists don't assign probabilities to non-repeatable events because that's how they define probability.
09-30-2014 , 01:11 PM
Quote:
Originally Posted by nickthegeek
It seems to me that you are still confusing the definition of probability with its assessment.
Probably. What confuses me is when Bayesians talk about something happening a certain percentage of the time, because that seems to me like pure frequentist talk.

Quote:
You hear on the news that a small fraction (~5%) of the new coins are irregular. Half of the irregular coins land heads 70% of the time, while the other half land tails 70% of the time. You have a new coin in your pocket.

1) What's the probability of tossing tails for a Bayesian? And for a frequentist? On which side would you bet at even odds?
The new system does not appear to favour heads or tails, so I would say it's still 0.5. Is that Bayesian thinking? I don't know what the difference between the two is here.

Quote:
2) Suppose you toss tails. What's the probability of tossing tails on the next toss for a Bayesian? And for a frequentist? On which side would you bet at even odds?
If the coin lands tails, then using conditional probability I get that:

P(coin is irregular favouring tails) = 0.035
P(coin is irregular favouring heads) = 0.015
P(coin is regular) = 0.95

The probability of getting tails next is then:

(0.035 * 0.7) + (0.015 * 0.3) + (0.95 * 0.5) = 0.504

I suspect this is Bayesian because of conditional probability, but I don't know, and it would be good if you could elaborate on the difference between the two approaches to this scenario.
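The calculation above can be checked mechanically. A short Python sketch of the same update, using only the numbers from the scenario in this thread (the dictionary names are mine):

```python
# The scenario from the thread: ~5% of coins are irregular, split
# evenly between 70%-tails and 70%-heads coins; the rest are fair.
priors = {"tails_biased": 0.025, "heads_biased": 0.025, "regular": 0.95}
p_tail = {"tails_biased": 0.7, "heads_biased": 0.3, "regular": 0.5}

# P(first toss is tails), marginalised over coin types
p_t1 = sum(priors[c] * p_tail[c] for c in priors)

# Posterior over coin types after seeing one tail (Bayes' theorem)
post = {c: priors[c] * p_tail[c] / p_t1 for c in priors}

# Predictive probability that the next toss is also tails
p_next_tail = sum(post[c] * p_tail[c] for c in post)

print(round(post["tails_biased"], 3), round(post["heads_biased"], 3))  # 0.035 0.015
print(round(p_next_tail, 3))  # 0.504
```

Which reproduces the three posterior weights and the 0.504 predictive probability worked out above.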
09-30-2014 , 02:18 PM
Quote:
Originally Posted by lastcardcharlie
The new system does not appear to favour heads or tails, so I would say it's still 0.5. Is that Bayesian thinking? I don't know what the difference between the two is here.
This is entirely correct (as is the rest of your reasoning). The thing is, for the frequentist it isn't. For him, the probability belongs to the event, and what you know about it is irrelevant. So a pure frequentist would say that the probability of tails is one of 30%, 50%, or 70%. He can't say which one.

Quote:
If the coin lands tails, then using conditional probability I get that:

P(coin is irregular favouring tails) = 0.035
This is the key. As natural as what you wrote might appear, it is nonsense for a frequentist. You cannot repeat the event "coin is irregular". The coin is either irregular or it isn't. That event cannot have a probability. It doesn't matter what happens next: the probability of the next toss will be one of 30%, 50%, or 70% for the frequentist.

You reasoned perfectly and, without even knowing it, in a Bayesian way.
09-30-2014 , 02:45 PM
Quote:
Originally Posted by nickthegeek
So, a pure frequentist would say that the probability of tails is one of 30%, 50%, or 70%. He can't say which one.
I understand: if you keep on flipping the same randomly chosen coin, the Bayesian probability will start at 50%, but then at some point move towards one of those three.

But what is it about the wording of your scenario:

Quote:
You hear on the news that a small fraction (~5%) of the new coins are irregular. Half of the irregular coins land heads 70% of the time, while the other half land tails 70% of the time. You have a new coin in your pocket.
that prevents the frequentist from interpreting that you flip a different randomly chosen coin each time, in which case he also gets 50%?
09-30-2014 , 03:23 PM
Quote:
Originally Posted by lastcardcharlie
that prevents the frequentist from interpreting that you flip a different randomly chosen coin each time, in which case he also gets 50%?
Exactly. The frequentist can say how likely the effects are (if repeatable) when the "state of nature" is known, but can't say how likely a possible state of nature is given the observed effects.
09-30-2014 , 05:40 PM
Quote:
Originally Posted by BrianTheMick2
You could try reading up on the subject. I often find that quite useful.


Quote:
Originally Posted by plaaynde
Or the relativity project.
This is true. I am infinitely curious.

Quote:
Originally Posted by masque de Z
Or both in the lightest sense of the term error given where we start from as humans and how our thought process was constrained by our macroscopic realities. Which is for sure the case in my opinion.

The next revolution will essentially retrace a 3k year science trajectory from the ground up.
This is something I'd be very interested in reading more about. I wonder sometimes whether we haven't just scratched the surface of understanding these phenomena.

Quote:
Originally Posted by lastcardcharlie
A question for the Bayesians...
Have you switched sides? Perhaps I've been misunderstanding our exchange.
09-30-2014 , 05:44 PM
Quote:
Originally Posted by lastcardcharlie
Let me first try to understand your claim in precise terms. Are you claiming that "P(Heads) = 1/2" should be interpreted as a prediction that, if the coin is flipped indefinitely, the ratio of the number of Heads to the number of coin flips will tend to 1 : 2? (I don't know what the standard frequentist interpretation is, but I'm not sure what else it could be.) Are you claiming, moreover, that this prediction is necessarily imperfect, because the ratio tended to will never be exactly 1/2?
And incidentally, this sounds right to me.
09-30-2014 , 05:49 PM
Quote:
Originally Posted by DrModern
Have you switched sides?
I'm just a good Wittgensteinian, like yourself.
09-30-2014 , 07:12 PM
Quote:
Originally Posted by nickthegeek
Exactly. The frequentist can say how likely the effects are (if repeatable) when the "state of nature" is known, but can't say how likely a possible state of nature is given the observed effects.
I don't understand this. Would you mind rephrasing it?

I am also having trouble with the following scenario. Suppose we have a coin but have no idea how biased it might be. Suppose we are to flip it twice. Before anything happens, it seems that the Bayesian assigns P(Flip 1 = H) = P(Flip 2 = H) = 0.5. Might just as well bet on heads as on tails, right? By a similar argument, P(Flip 1 = H and Flip 2 = H) = 0.25. Then, by conditional probability, P(Flip 2 = H given that Flip 1 = H) = 0.25/0.5 = 0.5.

So we go ahead and do Flip 1 and it comes up heads. Does not the Bayesian interpret this as some evidence that the coin is biased towards heads? If so, that appears to contradict that P(Flip 2 = H given that Flip 1 = H) = 0.5.
10-01-2014 , 01:19 AM
Quote:
Originally Posted by lastcardcharlie
I don't understand this. Would you mind rephrasing it?
The key is Bayes' theorem. It says (of course you know) that P(B|A) = P(A|B) P_0(B) / P(A). If B is a "cause" or a "state of nature" behind some effect A, then the theorem says how we can update our beliefs about B after A happens. The problem is that this theorem (while still valid) is pretty much useless in a frequentist framework, since most often A is repeatable while B is not. Let's make an example. Suppose we want to run an experiment to measure the Higgs boson mass. By calculating cross sections, physicists know how likely the production of particles with given energies is if the Higgs boson has a given mass. So let's say that B represents the event "the Higgs has a mass of x". That is a "state of nature". As you have guessed, it's not a repeatable event, so it doesn't make sense to assign it a probability. Let's say that A is the event "some particles with some properties have been observed". This is repeatable, since I can run (at least in principle) many different collisions. Frequentists can calculate P(A|B), but are not allowed to use Bayes' theorem, since neither P_0(B) nor P(B|A) makes sense.
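The formula applies mechanically once the pieces are written down. A toy sketch with two hypothetical "states of nature" (the labels B1/B2 and all the numbers here are invented purely for illustration, not from any real measurement):

```python
# Assumed priors P_0(B) over two hypothetical states of nature,
# and assumed likelihoods P(A|B) for one observed, repeatable effect A.
prior = {"B1": 0.5, "B2": 0.5}
likelihood = {"B1": 0.8, "B2": 0.2}

# P(A): marginal probability of the effect
p_a = sum(prior[b] * likelihood[b] for b in prior)

# Bayes' theorem: P(B|A) = P(A|B) P_0(B) / P(A)
posterior = {b: prior[b] * likelihood[b] / p_a for b in prior}

print(posterior["B1"], posterior["B2"])  # 0.8 0.2
```

The frequentist can compute every `likelihood[b]` here, but the `prior` and `posterior` dictionaries are exactly the objects he refuses to define when B is not repeatable.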

Quote:
I am also having trouble with the following scenario. Suppose we have a coin but have no idea how biased it might be. Suppose we are to flip it twice. Before anything happens, it seems that the Bayesian assigns P(Flip 1 = H) = P(Flip 2 = H) = 0.5. Might just as well bet on heads as on tails, right?
This is correct.

Quote:
By a similar argument, P(Flip 1 = H and Flip 2 = H) = 0.25.
This is not correct. The events are not independent. This can be mind-boggling: in a Bayesian framework, the independence between two events is conditional on what we know. So, if we knew how much (if at all) the coin is biased, the probability of flip 2 would be the same regardless of the result of flip 1. But we don't know how the coin is biased in your example. Knowing the result of the first toss gives us an update on the probability of the coin's bias, which in turn reflects on the probability of flip 2. You always evaluate conditional probabilities to get the joint probability. So, in this example, you have:

P(1=h && 2=h) = P(2=h | 1=h) P(1=h)

You need the first term on the right-hand side to get the right number. To calculate it, you need a prior on how biased the coin is to start with.

You might want to read something about Bayesian networks. You can learn how knowledge propagates and how events can be either dependent or independent as a function of the evidence we have.
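A minimal numerical sketch of the dependence, under one assumed prior (p = P(heads) uniform on [0,1]; nothing in the thread forces that particular choice):

```python
# With p = P(heads) uniform on [0, 1]:
#   P(flip1 = H)            = integral of p   over [0, 1] = 1/2
#   P(flip1 = H, flip2 = H) = integral of p^2 over [0, 1] = 1/3
# Midpoint-rule approximation of both integrals:
N = 10_000
grid = [(i + 0.5) / N for i in range(N)]

p_h1 = sum(grid) / N                   # ~ 1/2
p_h1h2 = sum(p * p for p in grid) / N  # ~ 1/3

# P(flip2 = H | flip1 = H) = (1/3) / (1/2) = 2/3, not 1/2:
print(round(p_h1h2 / p_h1, 3))  # 0.667
```

So with this prior, seeing one head already shifts the odds on the second flip: each flip is 50/50 on its own, but the two flips are not independent.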

Last edited by nickthegeek; 10-01-2014 at 01:41 AM.
10-01-2014 , 01:54 AM
Cheers nickthegeek and lastcardcharlie, I've enjoyed this.
10-01-2014 , 09:38 AM
Quote:
Originally Posted by nickthegeek
This is not correct.
Yes, I thought that there was no prior reason to prefer HH over HT, say, but that of course is not the case, either for the Bayesian or the frequentist. Before anything has happened, for the Bayesian it's P(HH) = P(TT) >= P(HT) = P(TH), while for the frequentist it's either P(HH) >= P(HT) = P(TH) >= P(TT), or P(TT) >= P(HT) = P(TH) >= P(HH).

Quote:
So, in this example, you have that:

P(1=h && 2=h) = P(2=h | 1=h) P(1=h)

You need the first term on the right side to carry out the right number. To calculate it, you need a prior on how much is biased the coin to start with.
Does that mean that if the first flip is H and no one has any further information, you would not bet on the next flip being T if you were offered odds of 100 to 1?
10-01-2014 , 10:33 AM
Quote:
...for the Bayesian it's P(HH) = P(TT) >= P(HT) = P(TH)...
Should be: P(HH) = P(TT) > P(HT) = P(TH).
10-01-2014 , 11:24 AM
Quote:
Originally Posted by lastcardcharlie
Does that mean that if the first flip is H and no one has any further information, you would not bet on the next flip being T if you were offered odds of 100 to 1?
Describe again what you know about that coin.

If I start with a coin for which I have no idea how the heads probability is distributed, the information of the one trial seen so far being e.g. heads tells us nothing other than that it's not a 100%-tails-only coin.

But imagine you were given a distribution for the coin to begin with. For example, the coin is drawn from a population of coins whose heads probability is uniform in [0,1]. Then you can calculate the things you asked and find out whether a "lucrative"-looking bet is indeed that. Some other, non-continuous distribution might be: 10% of coins are 50-50, another 40% are 99.95% to give heads, and 40% are 99.9% to give tails. Or even the original example you were working with.

With no starting distribution to work with, the first event you have is all you have (heads), and it can't be used to decide whether betting heads next is good or not.

But if you have a distribution to begin with, the trials you perform simply update it.


What is interesting now is, assuming you know nothing else (no initial distribution), how you play if you have another 10 trials and they have come 7-3 so far. How about 10-0? At some point, when you have no other information, you need to start working with what you have, and it becomes an estimation-theory problem (and then the frequency approach is all you have, if each trial is clean enough to be seen as identical each time and not correlated with other things).
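The 7-3 case can be made concrete under one illustrative assumption — a uniform prior on the heads probability (a choice, not something the thread fixes):

```python
from fractions import Fraction

heads, flips = 7, 10

# Frequentist point estimate: the observed frequency so far.
freq_estimate = Fraction(heads, flips)             # 7/10

# Bayesian predictive with a uniform prior: the posterior over the
# heads probability is Beta(heads + 1, tails + 1), whose mean is
# Laplace's rule of succession, (heads + 1) / (flips + 2).
bayes_predictive = Fraction(heads + 1, flips + 2)  # 8/12

print(freq_estimate, bayes_predictive)  # 7/10 2/3
```

The two numbers converge as the trials pile up, which is exactly the "with enough data, the data dominates" point: at 7-3 they still disagree (0.7 vs ~0.667), and at 10-0 the uniform-prior Bayesian would say 11/12 rather than the frequentist's 1.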


Maybe of additional interest (for people inspired by the thread);
http://en.wikipedia.org/wiki/Statist...thesis_testing
http://en.wikipedia.org/wiki/P-value
http://en.wikipedia.org/wiki/Binomial_test
http://en.wikipedia.org/wiki/Binomial_distribution (any details found there regarding estimations)
http://en.wikipedia.org/wiki/Bayesian_inference
http://en.wikipedia.org/wiki/Binomia...dence_interval
http://en.wikipedia.org/wiki/Estimation_theory

----------------
Non free will signature. 2+2 community, BruceZ, 2+2 leaders etc, all with your choices give back BruceZ and others you "chase" away to this discussion and the ones that will follow. We are all in this interactions learning game together.

Last edited by masque de Z; 10-01-2014 at 11:46 AM.
10-01-2014 , 11:26 AM
Quote:
Originally Posted by masque de Z
Describe again what you know about that coin.
Fair question. I suppose the point is that once you really start describing it then you are invoking the kind of prior knowledge that nickthegeek referred to?

Quote:
Some other, non-continuous distribution might be: 90% of coins are 50-50, the other 5% are 99% to give heads, and 5% are 99% to give tails. Or even the original example you were working with.
As I remarked earlier, I find this kind of language confusing here because the meaning of statements like "99% to give heads" is precisely what is under discussion. What do you mean by it?
10-01-2014 , 01:54 PM
By the way, I changed the numbers in my post to make the 100-to-1 bet example less instantly trivial.

Keep in mind I am the guy that doesn't choose camps here, lol. I choose logic, science, and math education instead, so that each time you know what you can best use depending on what information is available or suspected.

That something has a 90% chance to come out a particular way (e.g. heads for a coin) means, to most people, that if you run the event a large number of times, and you accept these are identical experiments (not changing the conditions so that each trial becomes a unique event of its own rather than a repetition), the frequency of occurrence will converge to 90% of the total. If that cannot be the case, i.e. you cannot repeat a complex event, then it's a number some theory you hold proposes in order to assign a preference/prediction bias (e.g. some champion top team A will win the first-round elimination double-match combo in some cup against a very weak team with 90% chance, or a star will explode as a supernova with 90% chance in the next 10 million years). In those cases, all you can do to test how good your model is at assigning such probabilities is to apply it many times in different situations where the same theory is used and see what the results are.

You can see probability as the frequency with which something happens, in cases clean enough to be seen as identical, or as a local preference value.

I can also see probability in systems complex enough that the trials are never identical, but where the mixing, and moreover the chaos, is so interesting and universal in its properties that the frequency interpretation still proves out the same over time.

Predicting that a supernova will explode with 90% chance in the next 10 million years is a one-time-event proposition, and it may be the result of simulations, for example, that when repeated in different sets still converge to the same result even though each trial differs in the details of how it got there. I.e. I run 10 million simulations, then later run another 10 million with different random-number feeds, and I still get identical (within statistical error) distributions of the explosion time. Each time the star is left to evolve, the trajectory is not identical, but all those different paths eventually result in the same distribution, with a common average (e.g. 10 million years), say at the 90% confidence level.
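That repeated-simulation argument can be sketched directly; the 20%-per-step explosion chance, the 10-step horizon, and the run counts below are all made up purely for illustration:

```python
import random

# Two independent batches of runs with different random seeds, each
# estimating the chance a "star" explodes within 10 time steps when
# the per-step explosion chance is an assumed 20%.
def explosion_fraction(seed, runs=100_000):
    rng = random.Random(seed)
    exploded = sum(
        any(rng.random() < 0.2 for _ in range(10)) for _ in range(runs)
    )
    return exploded / runs

a = explosion_fraction(seed=1)
b = explosion_fraction(seed=2)

# Different random feeds, (statistically) identical answers:
print(abs(a - b) < 0.01)  # True
```

Both batches land near the analytic value 1 - 0.8^10 ≈ 0.893 even though no two simulated trajectories are the same, which is the sense in which the frequency interpretation survives for non-identical trials.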

Also see

http://en.wikipedia.org/wiki/Probability

"Probability is the measure of the likeliness that an event will occur.[1]

Probability is used to quantify an attitude of mind towards some proposition of whose truth we are not certain.[2] The proposition of interest is usually of the form "Will a specific event occur?" The attitude of mind is of the form "How certain are we that the event will occur?" The certainty we adopt can be described in terms of a numerical measure and this number, between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty), we call probability.[3] Thus the higher the probability of an event, the more certain we are that the event will occur. A simple example would be the toss of a fair coin. Since the 2 outcomes are deemed equiprobable, the probability of "heads" equals the probability of "tails" and each probability is 1/2 or equivalently a 50% chance of either "heads" or "tails".

These concepts have been given an axiomatic mathematical formalization in probability theory (see probability axioms), which is used widely in such areas of study as mathematics, statistics, finance, gambling, science (in particular physics), artificial intelligence/machine learning, computer science, and philosophy to, for example, draw inferences about the expected frequency of events. Probability theory is also used to describe the underlying mechanics and regularities of complex systems."


You cannot have probability without repetition (either real repetition, or repetition implied in some parallel-worlds sense, as in the star example) in the simplest cases people apply probability theory to, but a more elaborate formal approach in terms of axioms is also another way to see things.


At some point, when you have no prior knowledge or hypothesis to test, all you have is the trials, and hopefully they are clean enough to be seen as identical. With them you develop the concept of probability. Later you can use various methods to test whether what you now propose based on these observations is reasonable or has started to have problems.

To give you an idea: I have no knowledge of whether the quantum experiments that try to see how random QM is have been, all these decades, clean enough to eliminate pseudorandom-number-generator-style issues. By that I mean a fully deterministic, non-local theory of hidden variables could be deciding the results of experiments each time, making them look random without being exactly random. For example (and I am not suggesting this is the case), what if the decay behavior of radioactive nuclei depends on solar activity (recent claims)? What if the spin of some measured electron is a function of some faraway system so chaotic that it makes the spin look random in measurements? We need methods to test properly whether QM is indeed truly random, and to find whether any correlations actually exist that nobody noticed. In that sense a purely frequency-based approach will fail to see the structure. However, if correlations exist, we can potentially uncover them with experiments and proper hypothesis testing.

Notice also how closely the resulting theorems track the frequency understanding most people have of probability, e.g.

http://en.wikipedia.org/wiki/Law_of_large_numbers

http://en.wikipedia.org/wiki/Central_limit_theorem

See also

http://en.wikipedia.org/wiki/Probability_axioms
http://en.wikipedia.org/wiki/Cox%27s_theorem
http://en.wikipedia.org/wiki/Bayesian_probability



Last edited by masque de Z; 10-01-2014 at 02:09 PM.

      