With the longest British election campaign in full swing, the pollsters are getting more attention than usual. It is disappointing, then, to see a major polling company falling into bad statistical habits.
YouGov has conducted a poll of 1800 people that reverses the Conservative lead of a few days ago, which itself reversed Labour’s lead from the previous poll. You might think that random-seeming changes in results separated by only a few days could, indeed, be random sampling errors, but YouGov is happy to imply otherwise with its headline “Labour Lead at 4%”. That headline has been repeated, without any caveats, by The Sunday Times with a banner headline of “Labour races into 4-point lead after Miliband’s TV success”. But it isn’t true.
Looking through last week’s papers (our paper delivery boy made it through all the snow disruption last month, but seems to have forgotten today now the weather’s fine and the sky is blue) I found a story I had missed.
The small front page story started with
“Gordon Brown has insisted Labour could still win the general election outright as another poll showed the Tory lead narrowing.
Research by ICM for the Sunday Telegraph put David Cameron’s party down one point since last month on 39%.”
I know it might indicate something about my personality type, but the story irritated me. Down one point, with a sample size of a measly thousand?
Now, results of course vary from sample to sample in a predictably random way, which puts a limit on the reliability of any judgements made from the data from just one sample. But how much? Trust the data to within 0.1%? Or 10%?
Here’s the maths bit — skip to the next paragraph if it’s not your thing.
If the poll results over time can be approximated by a Poisson distribution (a reasonable assumption), then the variance of the number of people preferring one party is equal to the mean number of people choosing that party. For opinion polls, we don’t know this mean, but it is close to the reported figure, i.e. 39% of the 1002 people in the sample, or about 391. With this variance, we can be about 95% sure that the true mean number of people choosing Tory in repeated samples lies within 391 (39%) plus or minus twice the square root of the variance (about 40 votes, or 4%).
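For anyone who would rather check the arithmetic than take it on trust, here is a minimal Python sketch of the calculation above. It assumes the same Poisson approximation; the function name `poisson_margin` and the multiplier `z = 2.0` (standing in for "about 95%") are my own illustrative choices, not anything ICM or the newspapers publish.

```python
import math

def poisson_margin(share, sample_size, z=2.0):
    """Rough margin of error for a poll share under a Poisson approximation.

    share       -- reported share for one party, e.g. 0.39
    sample_size -- number of people polled, e.g. 1002
    z           -- number of standard deviations (2 is roughly 95%)
    """
    count = share * sample_size        # estimated mean number of respondents
    std_dev = math.sqrt(count)         # Poisson: variance equals the mean
    margin_votes = z * std_dev         # margin in number of respondents
    margin_share = margin_votes / sample_size
    return count, margin_votes, margin_share

# Figures from the ICM poll quoted above: 39% of a sample of 1002.
count, margin_votes, margin_share = poisson_margin(0.39, 1002)
low, high = 0.39 - margin_share, 0.39 + margin_share

print(f"{count:.0f} respondents, +/- {margin_votes:.0f} ({margin_share:.1%})")
print(f"Plausible range: {low:.0%} to {high:.0%}")
```

With the figures from the story, this gives a margin of roughly 40 respondents, or about four percentage points either side of 39%.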
And the result is:
So the real result is “The Conservatives polled between 35% and 43%, which is consistent with no change at all.” OK, not a good headline, but even newspapers have an obligation to at least try to be right. I’m sure papers used to put this sort of information at the foot of the article (where hardly anyone would see it), even if the article writer ignored such a basic check. But to have every outlet from newspapers to the TV news run a similar story is laziness.
Every newsroom must have someone with enough maths skill to do this right, mustn’t it?