Quote:
Originally Posted by ToothSayer
I simply don't accept that a first-trial hit on a long odds model without further information tells me absolutely nothing about the probability of the model being better than a 50/50 model; that view is wrong regardless of what the literature claims.
Probability theory and statistics are very tricky and require a lot of thought. Statistical inference especially.
The first thing to accept is that there are no shortcuts. You need to formulate your assumptions, theories, and models precisely. You have to think about every possible explanation. You need to assign coherent prior beliefs to the various options. That's not easy.
You are absolutely right to think that a single data point tells you something. Unfortunately, understanding what it tells you can be quite complex, and it depends a lot on the specific problem you are dealing with.
Let's try this with election models. Predictions will depend on the information considered and how that information is modeled. Let's build some models.
1) Model 1: I just know that there is an election, that one candidate will win, and that there are N candidates. If that is all my information, my model will conclude that each candidate is 1/N to be elected.
2) Model 2: I know what model 1 knows, and also that the last M elections were all won by candidates of just two parties. My model will conclude that with probability (M+1)/(M+2) one of the two major-party candidates will win, (M+1)/(2*(M+2)) each; this is Laplace's rule of succession (see the first sketch after this list for why). The remaining 1/(M+2) is divided among the other candidates.
3) Model 3: I also have knowledge of some polls. I have no record of how well past polls predicted the actual result, but I do know how these polls are conducted and I am proficient in statistics. So I can use my expertise to build a prior for P(poll result | true result = x), process the polls, and have my model spit out a probability for each candidate (a toy version is sketched after this list).
4) Model 4: Now I also have knowledge of past polls, so I can model the relation between polls and the actual result more precisely.
5) Model 5: I start to incorporate local polls and the features of the electoral system. I also try to model the correlations between results in different regions.
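Since model 2's (M+1)/(M+2) isn't obvious, here is where it comes from: it's Laplace's rule of succession. Put a prior on p = P(a major-party candidate wins a given election), observe M wins in M elections, and ask for the predictive probability of one more win. A minimal sketch in Python, assuming a uniform prior (the post doesn't fix one, so that's my choice):

Code:
# Laplace's rule of succession. With a uniform Beta(1,1) prior on
# p = P(a major-party candidate wins), M wins out of M elections give
# a Beta(M+1, 1) posterior, and the predictive probability of one
# more win is (M+1)/(M+2).

def integrate(f, n=100_000):
    # Midpoint Riemann sum on [0, 1]; enough for a sanity check.
    return sum(f((i + 0.5) / n) for i in range(n)) / n

M = 10
# P(next win | M wins) = integral(p^(M+1)) / integral(p^M) on [0, 1].
numeric = integrate(lambda p: p ** (M + 1)) / integrate(lambda p: p ** M)
print(numeric, (M + 1) / (M + 2))  # both ~0.9167 for M = 10

The closed form drops out because the integral of p^k over [0, 1] is 1/(k+1), so the ratio is (M+1)/(M+2).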
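And a toy version of model 3's poll processing. Everything here is an assumption for illustration: a two-candidate race, respondents drawn i.i.d. from the electorate (so the poll likelihood is binomial), and a uniform prior on the true vote share x; nothing is claimed about how any real forecaster does it:

Code:
import random

# Toy "model 3": one poll of n respondents, k of whom favor candidate A.
# Assumed likelihood: P(poll | true share x) = Binomial(n, x).
# With a uniform prior on x, the posterior is Beta(k + 1, n - k + 1);
# candidate A wins the election when x > 0.5.

def p_a_wins(k, n, draws=200_000):
    # Monte Carlo estimate of P(x > 0.5 | poll data).
    hits = sum(random.betavariate(k + 1, n - k + 1) > 0.5
               for _ in range(draws))
    return hits / draws

print(p_a_wins(520, 1000))  # A at 52% in a poll of 1000 -> ~0.90
print(p_a_wins(52, 100))    # same 52%, but only n = 100 -> ~0.66

Note that the same 52% topline gives very different win probabilities depending on the sample size; that is the kind of thing processing the polls, rather than just reading them, buys you.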
You can go on. Beyond the very simple models, there are a lot of assumptions to be made. Some models might be off because the relations between variables are treated unrealistically, others because they miss important and available information, others because the statistical procedures are not sound.
If you really want to bash the Princeton guys, learn how they built their model. Question their assumptions. Find data that disprove their choices. Prove that they treated their data poorly. Show that they chose to ignore something important. Compare their treatment of the data with that of other models and show that there is a better way. In a word, build your own model and prove it is better (or at least contribute to building a better one).
You don't want shortcuts. It's futile to try to disprove them from general assumptions that, unfortunately, don't hold. A lot of examples were given ITT to explain why; think about them at length. The simple fact that a 1% event happened doesn't, by itself, establish that they are probably doing a bad job. It might be a reason to investigate, but not to conclude (the sketch below shows how much, or how little, that one event can move you).
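To make that last point concrete, here is a deliberately oversimplified two-hypothesis computation; the numbers are mine, purely for illustration, not anything from the thread. H1: the model's stated 1% was right. H2: the true chance was 50%. Given that the 1% event happened, the likelihood ratio is 0.50/0.01 = 50 in favor of H2, yet the conclusion still hinges entirely on the prior:

Code:
# One observed "1% event", treated as a single Bernoulli data point.
# H1: the model's 1% was calibrated.  H2: the true chance was 50%.
# Illustrative numbers only; a real hypothesis space is much richer.

def posterior_h2(prior_h2, p_h1=0.01, p_h2=0.50):
    # P(H2 | event happened), by Bayes' rule.
    prior_h1 = 1 - prior_h2
    return (prior_h2 * p_h2) / (prior_h2 * p_h2 + prior_h1 * p_h1)

for prior in (0.01, 0.05, 0.50):
    print(f"prior P(H2) = {prior:.2f} -> posterior {posterior_h2(prior):.2f}")
# prior 0.01 -> 0.34, prior 0.05 -> 0.72, prior 0.50 -> 0.98

So the single hit does carry information (a factor of 50 is not nothing), but whether "no better than a coin flip" becomes likely depends on the prior and on which hypotheses you allow in the first place. That is exactly why there are no shortcuts.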
Sorry for the long post (it might easily be my longest on 2+2). I hope I've given you something to think about. Otherwise, sorry for the wasted time!
Last edited by nickthegeek; 10-09-2017 at 06:04 PM.