Sampling, probability and non-probability
January 20, 2014
Evaluating the significance of a survey used to be fairly straightforward – for any truly random sample, there is a well-defined margin of error which allows a critical reader to judge how valid any inferences are.
So for a survey of 1,000 respondents, the margin of error is around 3% at the 95% confidence level (this last figure means that if the same survey were repeated many times, about 95% of the resulting intervals would contain the true value). If a news story says Ed Miliband is leading David Cameron by 42% to 37% in the polls, we know the lead falls within the stated margin of error, since Ed Miliband’s true position may be as low as 39% (42% – 3%) and David Cameron’s as high as 40% (37% + 3%).
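The arithmetic behind that 3% figure can be sketched in a few lines of Python. This is a rough illustration rather than any pollster's actual method: it assumes a simple random sample, the usual normal approximation (z = 1.96 for 95% confidence) and the worst-case share of 50%. The function and variable names are made up for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p from a simple
    random sample of size n. p=0.5 gives the widest (worst-case) margin;
    z=1.96 corresponds to the 95% confidence level."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample size and poll shares taken from the example in the text.
sample_size = 1000
miliband, cameron = 0.42, 0.37

moe = margin_of_error(sample_size)

# Range of plausible values for each leader's true share.
miliband_low, miliband_high = miliband - moe, miliband + moe
cameron_low, cameron_high = cameron - moe, cameron + moe

# The lead is within the margin of error if the two ranges overlap.
ranges_overlap = miliband_low <= cameron_high

print(f"Margin of error: {moe:.1%}")
print(f"Miliband: {miliband_low:.1%} to {miliband_high:.1%}")
print(f"Cameron:  {cameron_low:.1%} to {cameron_high:.1%}")
print(f"Ranges overlap: {ranges_overlap}")
```

With 1,000 respondents the margin comes out at roughly 3.1%, which is the "around 3%" quoted above, and the two ranges overlap. Because the margin shrinks with the square root of the sample size, halving it takes about four times as many respondents.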
The difficulty from the pollster’s point of view is that creating a truly random sample is costly and time-consuming. So recently, non-probability sampling has enjoyed favour, particularly in the US and now increasingly in the UK.
Non-probability sampling covers a range of methods, from opt-in panels and clinical trials, to case-control studies and intercept surveys, as a recent report from the American Association for Public Opinion Research (AAPOR) makes clear.
The AAPOR accepts that non-probability samples have performed well in electoral polling but cautions that “evidence of their accuracy is less clear in other domains”.
Because non-probability sampling is a collection of methods rather than a single all-encompassing framework, it is a complex matter to explain to readers how reliable the results are. The theoretical considerations and real-world performance of the various methods vary considerably.
Hence transparency, notes the AAPOR, is ‘essential’ – “there is a higher burden … to describe the methods used to draw the sample, collect the data and make inference.” This doesn’t happen often enough, it says, and the reader is confronted with a ‘black box methodology’.
The problem is that unless readers know what modelling assumptions the pollsters have made, and whether those assumptions are appropriate, they are not in a position to evaluate the poll results. There is no simple analogue of the ‘margin of error’ figure given for probabilistic samples.
In its conclusions, the AAPOR calls for more theoretical work to be done, followed by empirical evaluation.
Until then, readers will be justified in reserving judgement on news stories based on non-probabilistic sampling unless they come with fuller explanations of the methods and assumptions used.