
10-27-2020 , 07:03 PM
Quote:
Originally Posted by David Sklansky
It is entirely possible that a publicized poll near election time will have less accuracy than an unpublicized poll, especially considering that people may or may not vote. The direction of the inaccuracy is unclear though. Common sense might say that being behind might make more people vote. But there is also the opposite theory that people are more likely to vote when the poll says their preference is ahead because it feels good to vote for a winner.
Hi David:

I think the answer would have to do with the specific questions themselves. But it's probably more difficult to design questions that will produce accurate results right before an election, especially one like this.

Best wishes,
Mason
10-27-2020 , 08:04 PM
Quote:
Originally Posted by Mason Malmuth
Hi DD:

My understanding is that 538 looks at polls done by others and then comes to a conclusion. They don't do their own polling.

Best wishes,
Mason

The folks at 538 also caution their readers not to base things on the results from any one pollster, but rather to look at the combined results of a multitude of polls.

Polling by its very nature has an unavoidable degree of error. If you take a sample of 1,000 people and try to predict how 10 million people will vote, you can get pretty close if you know what you're doing, but if you nail it exactly, you simply got lucky.

538 also rates the various polling organizations based on their historical accuracy. If you're really into diving into the details of polling, you're probably quite familiar with their site already; if not, it's definitely worth checking out.
10-27-2020 , 08:46 PM
Quote:
Originally Posted by Mason Malmuth
I don't agree. If the pollster sees that there is something wrong with his sample, he will most likely adjust for it. Otherwise, he'll be putting out a result that he knows is not accurate.

By the way, while it was many years ago, I have some experience doing this.

Best wishes,
Mason
Party identification is not something intrinsic or immutable or a necessarily known quantity in the sample such that it can be used as a gauge for whether or not a given sample is representative. It can and does change relatively frequently for people, certainly compared to other demographic information. Race does not. Gender only very rarely. Age always changes obviously, but the true distribution of ages is pretty well known. Likewise with incomes. That cannot be said for party ID.
10-27-2020 , 08:50 PM
Put more simply, surely you can't adjust for sampling error in a variable which is by far the most accurate predictor of the variable the poll is designed to gauge?
10-27-2020 , 09:45 PM
Quote:
Originally Posted by Mason Malmuth
I don't agree. If the pollster sees that there is something wrong with his sample, he will most likely adjust for it. Otherwise, he'll be putting out a result that he knows is not accurate.

By the way, while it was many years ago, I have some experience doing this.

Best wishes,
Mason

But that’s the point. A sample being D+7 or whatever doesn’t tell you there’s anything wrong.

I’m unimpressed with appeal to (your own) authority since current pollsters don’t typically weight by party ID for the reasons I noted above.
10-27-2020 , 09:50 PM
Quote:
Originally Posted by 20dragons
The folks at 538 also caution their readers not to base things on the results from any one pollster, but rather to look at the combined results of a multitude of polls.

Polling by its very nature has an unavoidable degree of error. If you take a sample of 1,000 people and try to predict how 10 million people will vote, you can get pretty close if you know what you're doing, but if you nail it exactly, you simply got lucky.

538 also rates the various polling organizations based on their historical accuracy. If you're really into diving into the details of polling, you're probably quite familiar with their site already; if not, it's definitely worth checking out.
Hi dragons:

I agree that 538 is a good site with lots of good information.

Best wishes,
Mason
10-27-2020 , 09:53 PM
Quote:
Originally Posted by MrWookie
Party identification is not something intrinsic or immutable or a necessarily known quantity in the sample such that it can be used as a gauge for whether or not a given sample is representative. It can and does change relatively frequently for people, certainly compared to other demographic information. Race does not. Gender only very rarely. Age always changes obviously, but the true distribution of ages is pretty well known. Likewise with incomes. That cannot be said for party ID.
Hi Wookie:

I think you miss the point. Everything you say may be correct. However, I have heard several times that certain polls are overcounting Democrats in their sample, and I was just addressing that.

Best wishes,
Mason
10-27-2020 , 10:16 PM
Quote:
Originally Posted by Mason Malmuth
Hi Wookie:

I think you miss the point. Everything you say may be correct. However, I have heard several times that certain polls are overcounting Democrats in their sample, and I was just addressing that.

Best wishes,
Mason
How does one know in advance that, e.g. too many Republicans have been counted?
10-27-2020 , 10:20 PM
Quote:
Originally Posted by MrWookie
How does one know in advance that, e.g. too many Republicans have been counted?
Easy. Say I surveyed 100 people and found that 55 of them have ever worn a bra. I know a priori that men don't ever wear bras, so I assume that 55% of the people I surveyed were women, and adjust my results for the sample bias to extrapolate that 50% of the population have ever worn a bra.

Sounds like this adjustment is using similar logic.
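
As a quick Python sketch of that logic, assuming (as in the example) that only women ever wear bras and that the population is half women:

Code:
# Post-stratification sketch for the bra example above.
sample_size = 100
bra_wearers = 55                                  # observed in the sample
implied_share_women = bra_wearers / sample_size   # 0.55, since only women wear bras here
known_share_women = 0.50                          # assumed true population share

# Reweight the (all-female) bra-wearing group by true share / sample share.
weight = known_share_women / implied_share_women
adjusted = bra_wearers * weight / sample_size
print(f"Adjusted share who have ever worn a bra: {adjusted:.0%}")  # 50%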
10-27-2020 , 11:23 PM
Quote:
Originally Posted by MrWookie
How does one know in advance that, e.g. too many Republicans have been counted?
Well, I don't know for sure what a particular pollster might be doing. But I believe the percentage of Republicans among registered voters in a county or state is public information.

So, for instance, if it's known that, let's say, in a certain area X percent of the voters are members of Party A, but the sample has Y percent of Party A voters, then the Party A responses can be adjusted by first multiplying by X and then dividing by Y (i.e., weighting that group by X/Y).
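
As a quick Python sketch with invented numbers (not from any real poll), the arithmetic might look like this:

Code:
# Hypothetical numbers: registration data says 48% of voters are Party A (X),
# but the sample came back 55% Party A (Y).
sample_shares = {"A": 0.55, "other": 0.45}   # Y, observed in the sample
true_shares   = {"A": 0.48, "other": 0.52}   # X, from registration data
support       = {"A": 0.90, "other": 0.10}   # candidate's support within each group

# Unadjusted estimate uses the sample shares as they came in.
raw = sum(support[g] * sample_shares[g] for g in support)

# Adjusted estimate weights each group's responses by X/Y,
# i.e., multiply by X and divide by Y.
adjusted = sum(support[g] * sample_shares[g] * true_shares[g] / sample_shares[g]
               for g in support)

print(f"raw {raw:.1%} -> adjusted {adjusted:.1%}")  # raw 54.0% -> adjusted 48.4%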

Now, whether a particular poll does this in coming up with its estimates, there's no way I would know.

Best wishes,
Mason
10-27-2020 , 11:34 PM
Quote:
Originally Posted by d2_e4
Easy. Say I surveyed 100 people and found that 55 of them have ever worn a bra. I know a priori that men don't ever wear bras, so I assume that 55% of the people I surveyed were women, and adjust my results for the sample bias to extrapolate that 50% of the population have ever worn a bra.

Sounds like this adjustment is using similar logic.
Hi d2_e4:

I think this is correct. If you're a pollster and you see that there is something wrong in your raw data, you'll try to make an adjustment.

Best wishes,
Mason
10-27-2020 , 11:48 PM
Hi, thank you for doing this.

How many people would need to be polled to have a reasonably accurate idea of who is going to win? Let's say a 99% confidence interval.
10-28-2020 , 12:39 AM
Quote:
Originally Posted by NewAcctIsBest
Hi, thank you for doing this.

How many people would need to be polled to have a reasonably accurate idea of who is going to win? Let's say a 99% confidence interval.
Hi NewAcc:

First, you're welcome. However, please remember that my answers may not all be correct.

Second, you're making a mistake that I see quite often. The number of people in the sample doesn't determine whether you can form a 99 percent confidence interval (or one at any other confidence level). What it determines is the width of the confidence interval.

For instance, with a certain sample size your 99 percent confidence interval of a 40 percent estimate might go from 37 percent to 43 percent. But with a much larger sample it might go from 39 to 41 percent.
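
A short Python sketch of this point, using the usual normal approximation (the 40 percent estimate is just an example):

Code:
from math import sqrt

Z99 = 2.576  # z-score for a 99 percent confidence level

def ci99(p, n):
    half = Z99 * sqrt(p * (1 - p) / n)
    return p - half, p + half

for n in (1000, 10000):
    lo, hi = ci99(0.40, n)
    print(f"n={n}: 99% CI roughly {lo:.1%} to {hi:.1%}")
# n=1000:  about 36% to 44%
# n=10000: about 38.7% to 41.3%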

Best wishes,
Mason
10-28-2020 , 12:58 AM
Hi Everyone:

Earlier tonight on Hannity, which I'm sure most of you watch regularly, he had two representatives of pollsters Insider Advantage and The Trafalgar Group. These two polling companies both accurately predicted the 2016 election and both have Trump doing much better than most polls show.

So, why is this? Well, the Insider Advantage representative said that, in their questions, they use a "more blended system" that avoids being completely direct, which draws more information from the respondent than direct questions would. He also stated that this approach produces better information and that the respondent doesn't feel there might be retribution if he were to answer directly that he was voting for Trump. It was also indicated that The Trafalgar Group does something similar.

Notice that, consistent with what I wrote above, much of the accuracy of a poll has to do with the specific questions being asked.

Best wishes,
Mason
10-28-2020 , 06:52 AM
Great thread.

Quote:
Originally Posted by d2_e4
Put more simply, surely you can't adjust for sampling error in a variable which is by far the most accurate predictor of the variable the poll is designed to gauge?
Yes you can, provided you know what the "correct" results for that variable are supposed to be.

In this election, the number-one predictor of how people are going to vote is how they voted in 2016. The "correct" numbers for the 2016 vote are known from the election results (and from a few other things like demographic data for new voters). If you read the full PDFs for the polls that actually give this data, more than 90% of the people who voted for the two leading parties are going to vote the same way again this time. Any survey that doesn't weight based on this is IMHO trash. It's also a bit of insurance against the "shy voters": someone may be shy about saying they're a Trump or Biden voter, but they're also pretty likely to be shy about admitting their 2016 vote, so they just dilute the "don't knows" rather than removing votes from the camp they're going to vote in.
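
A minimal Python sketch of that weighting. The "true" 2016 shares are roughly the actual national result; the sample shares and loyalty rates are invented for illustration:

Code:
true_2016   = {"clinton": 0.48, "trump": 0.46, "other": 0.06}  # actual result
sample_2016 = {"clinton": 0.53, "trump": 0.41, "other": 0.06}  # recalled vote in sample

# Hypothetical 2020 support for Biden within each 2016-vote group
# (>90% loyalty for the two major parties, per the above).
biden = {"clinton": 0.93, "trump": 0.07, "other": 0.45}

raw      = sum(biden[g] * sample_2016[g] for g in biden)  # uses the skewed sample
weighted = sum(biden[g] * true_2016[g] for g in biden)    # anchored to known results
print(f"raw {raw:.1%} -> weighted {weighted:.1%}")        # raw 54.9% -> weighted 50.6%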

I've been posting a bit in the sports betting thread about polls. Here are two of my posts I'm vain enough to think are worth a cross-post:

Answering the idea that the 2016 polls favoured Dems:

Quote:
Originally Posted by LektorAJ
Maybe. The last 2016 poll which weighted by 2012 vote (as this thread knows, I think any poll that doesn't weight by previous vote is utter trash) was this one:

https://d25d2506sfb94s.cloudfront.ne...bReport_lv.pdf
(see table 12)

It has (actual results in parentheses)
Clinton 45 (48)
Trump 41 (46)
Johnson 5 (3)
Stein 2 (1)
Other 3 (1)
Not sure 4

So you could say the polls favoured third-party candidates, "favoured" not sure, and underestimated both Clinton and Trump.

My view is that the polls were accurate but there was a late third-party squeeze and the voters just broke more towards Trump - which was predictable given two of the main 3rd party candidates were Johnson and McMullin. Effectively Clinton gained half of Stein's votes and half the not sure. Trump gained about half of the Libertarian+Other, and half of the not sure.

Admittedly this is hindsight, but it's fairly obvious where his votes were coming from.

....
... so in other words it correctly counted the left and right wings of society, but the more fragmented right (at time of polling) was able to get behind Trump on polling day to get the result they wanted.

On why polling "margin of error" calculated on a statistical basis is actually less of a factor than many think, with a warning about other types of polling error:

Quote:
Originally Posted by LektorAJ
Margin of error as calculated and displayed is nonsense. With proper weighting there is much less margin of error. ...

Consider the following world: population is split 50-50 between Dems and Reps. That was also the result last time in this world (a dead heat) although each party has lost 5% of its voters to the other party since then.

We conduct the following polls:

1) We sample 1024 voters randomly. Mean Dem score = 512 (50%), variance = 1024*(0.5*(1-0.5)) = 256, SD = 16, margin of error (2 SD) = 32, or 3.125%.

2) We sample 512 voters who voted D last time and 512 previous Rep voters. Again the mean is 512, but in this case the variance is 512*(0.95*0.05) + 512*(0.05*0.95) = 48.64, SD ≈ 7, margin of error = 14 (1.4%).

The number of voters changing parties is slightly higher than the above but there are other weighting factors helping too.

If that sounds counterintuitive, consider the following:

3) We conduct a "gender identification" poll weighted on what binary gender people identified as 4 years ago (last time it was 50-50, but 0.5% of people have changed their mind in either direction since then). It's clearly nonsense that if you sample 512 each of 2016 men and 2016 women, you would have a 3% margin of error in terms of what they identify as now. The actual margin of error over that sample is about 0.4%.

Obviously plenty of polls don't weight by the most relevant factors, because they exist to generate exciting narratives for the people who commission them to write stories about. You have to ignore those ones (and also ignore the ones that are directly biased for political reasons, or because the company gets better odds for being the outlier that got it right versus being part of the herd).

The actual doubt in polling isn't mainly to do with the statistical margin of error - and therefore it's not something that can be cured by simply taking bigger statistical samples or averaging more polls together. The actual known unknowns are to do with:
1) late swings
2) postal votes invalidated
3) other shenanigans
4) reluctance to state true voting intention to pollsters.

plus any unknown unknowns that may come along.

That stuff isn't limited to the margin of error at all.
(btw the above isn't exactly how you calculate statistical margin of error when you don't know the underlying distribution but it shows the relevant principle)
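
For anyone who wants to check that arithmetic, a quick Python script (same assumed 50-50 world with 5% defection each way):

Code:
from math import sqrt

n = 1024

# 1) Simple random sample: each respondent is a Dem with p = 0.5.
var_srs = n * 0.5 * 0.5          # 256
moe_srs = 2 * sqrt(var_srs) / n  # ~3.1%

# 2) Stratified by previous vote: 512 previous-D voters (stay D with p = 0.95)
#    plus 512 previous-R voters (flip to D with p = 0.05).
var_strat = 512 * 0.95 * 0.05 + 512 * 0.05 * 0.95  # 48.64
moe_strat = 2 * sqrt(var_strat) / n                # ~1.4%

print(f"random sample MoE:     {moe_srs:.2%}")   # 3.12%
print(f"stratified sample MoE: {moe_strat:.2%}") # 1.36%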

Last edited by LektorAJ; 10-28-2020 at 07:02 AM.
10-28-2020 , 09:01 AM
Malmuth, did Hannity or Hannity's guests cover voter suppression at all when discussing the differences between polling and vote results? Do you have any expertise in this particular issue with respect to political polling?
10-28-2020 , 09:30 AM
Quote:
Originally Posted by Mason Malmuth
Hi Everyone:

Earlier tonight on Hannity, which I'm sure most of you watch regularly, he had two representatives of pollsters Insider Advantage and The Trafalgar Group. These two polling companies both accurately predicted the 2016 election and both have Trump doing much better than most polls show.

So, why is this? Well, the Insider Advantage representative said that, in their questions, they use a "more blended system" that avoids being completely direct, which draws more information from the respondent than direct questions would. He also stated that this approach produces better information and that the respondent doesn't feel there might be retribution if he were to answer directly that he was voting for Trump. It was also indicated that The Trafalgar Group does something similar.

Notice that, consistent with what I wrote above, much of the accuracy of a poll has to do with the specific questions being asked.

Best wishes,
Mason
Mason, what are your thoughts on the below?

Quote:
Originally Posted by Willd
Trafalgar was less accurate on nationwide polls than most outlets in 2016. They are a firm that always publishes results hugely biased towards Republicans so it's not surprising that the one time a long shot came in they shout about it from the rooftops, despite the fact they were further away from the actual nationwide numbers than mainstream polling companies, just in the other direction. Their record in the 2018 midterms was even worse - iirc they had R gaining seats when it was the biggest blue wave in a generation.

It's really not remotely surprising that polls that constantly have an R bias appear to have been more accurate when looking specifically at 2016 but it is essentially useless as a predictor for future accuracy.
10-28-2020 , 09:31 AM
There was a decent segment on polling on npr early this morning if anyone's interested.
10-28-2020 , 09:32 AM
Quote:
Originally Posted by LektorAJ
Yes you can, provided you know what the "correct" results for that variable are supposed to be.
If you are sure enough of the "correct" distribution of the causative variable to adjust the actual results, then why conduct the poll in the first place? Sounds like you are essentially discarding the results of the poll.

Last edited by d2_e4; 10-28-2020 at 09:38 AM.
10-28-2020 , 09:35 AM
Quote:
Originally Posted by Mason Malmuth
Hi d2_e4:

I think this is correct. If you're a pollster and you see that there is something wrong in your raw data, you'll try to make an adjustment.

Best wishes,
Mason
That's the opposite of the point I was trying to make, although I concede I probably gave a particularly poor example/analogy to demonstrate it. The point of that example was that if the results disagree with my prior assumptions, I should either discard my prior assumptions or I shouldn't have bothered conducting the survey in the first place, since adjusting for prior assumptions in this case is equivalent to essentially discarding the results of the survey.
10-28-2020 , 09:52 AM
Quote:
Originally Posted by Mason Malmuth
This may be true. If Trump wins, it will again show how off these polls were. On the other hand, groups like Trafalgar and Rasmussen, who have consistently shown Trump doing better, may breathe new life into polling as other groups begin to copy their methods.

Best wishes,
Mason
No. These groups are obviously bad. Trafalgar released a poll yesterday that showed Michigan was close (maybe that happens) but had Trump winning 44% of the Michigan black vote, which is obviously wrong. While it's good to be occasionally surprised by polls, and especially by crosstabs given the small sample size, these particular polls simply tell Republicans what they want to hear. Rasmussen had Republicans winning the House popular vote in 2018; Dems actually won it by the largest margin in history.

Last edited by ecriture d'adulte; 10-28-2020 at 09:58 AM.
10-28-2020 , 09:57 AM
Quote:
Originally Posted by goofball
But that’s the point. A sample being D+7 or whatever doesn’t tell you there’s anything wrong.

I’m unimpressed with appeal to (your own) authority since current pollsters don’t typically weight by party ID for the reasons I noted above.
Especially when Dems lead the generic ballot by... +7.
10-28-2020 , 10:14 AM
Quote:
Originally Posted by d2_e4
The point being, if you are sure enough in what the "correct" results are to adjust the actual results, then why conduct the poll in the first place? Sounds like you are essentially discarding the results of the poll.
Did you read the next paragraph?

You don't look at the final results till you have your representative sample. Your sample should be representative in terms of the things that matter, and you have to know exactly what numbers you're trying to be representative of, using known data, e.g. from censuses and past election results.

I probably agree with you on the problems with "party identification": it's a bit too fluid, ill-remembered, and subject to conscious or subconscious suppression if the voter is no longer behind that party (whether since they last voted or since they registered to vote in a party's primaries). So I wouldn't weight by "party identification", but I would weight by "previous vote", because previous vote is known.
10-28-2020 , 12:40 PM
Quote:
Originally Posted by NewAcctIsBest
Hi, thank you for doing this.

How many people would need to be polled to have a reasonably accurate idea of who is going to win? Let's say a 99% confidence interval.
It's probably not possible (within reason). There are broadly two types of error when it comes to polling: random and systematic. Random error is super well behaved; it's a known quantity and can be reduced simply by increasing the sample size, according to math easily understood by high school students. Systematic error, on the other hand, is virtually impossible to treat from a pure math standpoint (AFAIK) and doesn't shrink in a predictable way as the sample size grows. The possibility of systematic error likely prevents any practical poll or model from saying Biden (or Trump) is 99% to win, even with a relatively huge sample size.
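
A tiny Python simulation of the distinction; the 2-point bias is an invented number:

Code:
import random

TRUE_SUPPORT = 0.50
BIAS = 0.02  # hypothetical systematic error: one side is over-sampled

for n in (1_000, 100_000):
    est = sum(random.random() < TRUE_SUPPORT + BIAS for _ in range(n)) / n
    print(f"n={n}: estimate {est:.3f} vs truth {TRUE_SUPPORT}")
# The random scatter around 0.52 shrinks as n grows,
# but the 0.02 gap to the truth never does.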

Edit: I just read Lektor's post where he quoted himself. He explained the above concept better than I did and in more detail.

Last edited by ecriture d'adulte; 10-28-2020 at 12:52 PM.
10-28-2020 , 01:37 PM
Forecasting the US elections
The Economist is analysing polling, economic and demographic data to predict America’s elections in 2020
