Monday, August 26, 2013

Lies, damned lies and statistics

Many people are familiar with this quote, usually attributed to Disraeli (although it was Mark Twain who popularised it, and the true originator is uncertain). In some ways it is unfairly dismissive of the important information a statistic can encapsulate, but recent events remind us just how far a statistic can be bent to give a totally misleading impression.

There are two main types of statistics: descriptive and inferential. Descriptive statistics are simply statements about how common particular characteristics are in the sample, so the raw death rate at a particular hospital is a descriptive statistic. There are a number of difficulties in drawing inferences from such statistics, but we can reliably determine the death rate at a hospital (although there may be issues with definitions, of post-operative mortality for example).

Often we wish to draw conclusions about a larger population from a limited sample, or to adjust for differences between patient populations in order to compare hospitals. Here we enter the realm of inferential statistics. We may look at the data from a limited sample and try to draw conclusions about the prevalence of a particular condition in the wider population. If we design the study correctly, so that the sample reflects the wider population, we can calculate a confidence interval: a range within which we can say, with 95% confidence (if that is the level we have chosen), that the true value of the parameter lies.
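To make the idea of a confidence interval concrete, here is a minimal sketch using the simple normal (Wald) approximation for a proportion. The function name and the figures (50 cases in a sample of 1,000) are invented purely for illustration, and the approximation is only reasonable for fairly large samples that genuinely represent the population.

```python
import math


def proportion_confidence_interval(cases, sample_size, z=1.96):
    """Approximate (Wald) confidence interval for a proportion.

    z=1.96 corresponds to a 95% confidence level under the normal
    approximation. Hypothetical helper, for illustration only.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= cases <= sample_size:
        raise ValueError("cases must lie between 0 and sample_size")

    p = cases / sample_size                           # sample prevalence
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    margin = z * standard_error
    # Clamp to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p - margin), min(1.0, p + margin)


# Made-up example: 50 people with the condition in a sample of 1,000.
low, high = proportion_confidence_interval(50, 1000)
print(f"Estimated prevalence 5.0%, 95% CI {low:.1%} to {high:.1%}")
```

With these invented numbers the interval runs from roughly 3.6% to 6.4%, which is far more honest than quoting 5% as though it were exact.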

All the mathematical tricks in the world are irrelevant if the design of the research is flawed, so that the sample doesn't represent the wider population we're trying to study. Statistical models that attempt to adjust the patient population for all the factors that affect the variable we're looking at must be underpinned by robust research. So for cardiac surgery, studies were done to establish with confidence what effect certain conditions had on mortality from cardiac surgery. When these methods are reliable, as well as stating that a certain unit has X mortality for cardiac operations, we can also say with reasonable reliability that this unit has higher mortality than other units. We cannot say what the reasons are for that raised mortality, but we can say that these results are unlikely to have occurred by chance.
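As a rough sketch of what "unlikely to have occurred by chance" means, the snippet below asks how probable it would be to see at least a given number of deaths if a unit's true risk matched the rate expected for its case mix. It assumes every operation carries the same independent risk (a simple binomial model), which real risk-adjustment models do not. The function name and all the figures are hypothetical.

```python
import math


def probability_of_at_least(deaths, operations, expected_rate):
    """Chance of seeing `deaths` or more in `operations` procedures if the
    true mortality rate were `expected_rate` (simple binomial model).

    Hypothetical helper: real comparisons use case-mix-adjusted risks
    that differ from patient to patient.
    """
    return sum(
        math.comb(operations, k)
        * expected_rate ** k
        * (1 - expected_rate) ** (operations - k)
        for k in range(deaths, operations + 1)
    )


# Made-up example: 25 deaths in 500 operations where 3% (15) were expected.
p_value = probability_of_at_least(25, 500, 0.03)
print(f"Probability of 25 or more deaths by chance alone: {p_value:.4f}")
```

A small probability tells us only that chance is an implausible explanation for the excess. It says nothing about why the excess arose, and it does not identify which deaths, if any, were avoidable.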

All these issues are specific to each statistical calculation. Of course there are also the issues of probability, and the inevitable fact that half of any population will be below the median, but these are relatively mundane. However, without a more sophisticated understanding than this, the public will continue to be misled. The bald statement that there were 1,200 unnecessary deaths at Mid Staffs and 13,000 at 14 other NHS hospitals is a blatant misrepresentation of the strength of the statistical method used. The originator of the hospital standardised mortality ratio (HSMR) method should be taking greater pains to ensure that the data are not misrepresented.
