The Sally Clark Case: When Data Analysis Goes Badly Wrong
A tragic tale of the misapplication of probability.
In today's interconnected world, we are surrounded by data. It is growing at an extraordinary rate, and ever more is being collected to improve decisions in every facet of human life. In sports, retail, start-ups, governments and businesses, the reliance on data has never been more pronounced.
Given how reliant we are on data, what happens when we fail to interpret accurately what has been presented to us?
In November 1999, Sally Clark became the victim of one of the greatest miscarriages of justice in modern British history when she was convicted of murdering her two infant children. Her first child died within a few weeks of birth in December 1996, and her second died in similar circumstances in January 1998.
The prosecution relied on evidence from a professor of pediatrics, who said that the chance of "two cot deaths" in an affluent family was 1 in 73,000,000. The event was presented as so rare that the prosecution concluded Sally Clark must have been responsible for the deaths of her babies.
However, to any data scientist, the questions that the Crown Prosecution Service (the prosecutors in England) should have asked are as follows:
· How was the probability of 1 in 73 million calculated?
· What assumptions were made? Are those assumptions valid?
· Even if the probability is correct, how might it be misinterpreted?
Misapplication of probability
The 1 in 73 million figure for deaths from sudden infant death syndrome came from a body called the Confidential Enquiry into Stillbirths and Deaths in Infancy (CESDI). According to CESDI, the chance of a randomly chosen child dying from cot death is 1 in 1,303, and this falls to 1 in 8,543 if the child is from an affluent, non-smoking family.
The probability of 1 in 73 million, which was used in convicting Sally Clark in 1999, came from squaring 1/8,543. Squaring is only valid if the two deaths are independent events, that is, if a first cot death in a family tells us nothing about the chance of a second.
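The arithmetic behind the headline figure can be reproduced in a few lines. This is only a minimal sketch: the variable names are my own, and the step flagged in the comments (treating the two deaths as independent events) is the assumption that squaring implies, not something the data supports.

```python
from fractions import Fraction

# Figure quoted at trial for one cot death in an affluent,
# non-smoking family.
p_single = Fraction(1, 8543)

# Assumption built into the trial figure: the two deaths are independent,
# so the joint probability is the single-death probability squared.
p_double_if_independent = p_single ** 2

one_in = p_double_if_independent.denominator
print(f"1 in {one_in:,}")  # 1 in 72,982,849, i.e. roughly 1 in 73 million
```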
Professor Ray Hill's critique of the flawed statistics has been summarised as follows:
He estimates that siblings of children who die of cot death are between 10 and 22 times more likely than average to die the same way. Using the figure of 1 in 1,303 for the chance of a first cot death, we see that the chances of a second cot death in the same family are somewhere between 1 in 60 and 1 in 130. There isn't enough data to be more precise, or to take familial factors into account, but it seems reasonable to use a ballpark figure of 1 in 100.
Multiplying 1/1,303 by 1/100 gives an estimate for the incidence of double cot death of around 1 in 130,000. Since 650,000 children are born every year in England and Wales, we can expect around 5 families to suffer a second tragic loss - and this is backed up by the FSID, who say that they hear of one or two such cases every year. Compare this to the figure quoted in Sally Clark's trial, which implied that double cot deaths were so rare that we could expect them to happen once in a century!
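The numbers in the passage above can be checked with a short calculation. The sketch below uses only figures quoted in this article; the variable and function names are illustrative, and, like the passage itself, it treats the 650,000 annual births as if each were a separate opportunity for a double cot death.

```python
from fractions import Fraction

P_FIRST = Fraction(1, 1303)   # chance of a first cot death (quoted above)
BIRTHS_PER_YEAR = 650_000     # England and Wales (quoted above)

def one_in(p):
    """Express a probability p as the N in '1 in N'."""
    return float(1 / p)

# Siblings are said to be 10 to 22 times more likely to die the same way.
second_low_risk = one_in(10 * P_FIRST)    # about 1 in 130
second_high_risk = one_in(22 * P_FIRST)   # about 1 in 59

# Ballpark figure of 1 in 100 for the second death.
p_double = P_FIRST * Fraction(1, 100)     # 1 in 130,300
expected_per_year = float(BIRTHS_PER_YEAR * p_double)

# The trial figure, by contrast, squares 1/8,543.
p_double_trial = Fraction(1, 8543) ** 2
years_between_cases = float(1 / (BIRTHS_PER_YEAR * p_double_trial))

print(f"second death: 1 in {second_high_risk:.0f} to 1 in {second_low_risk:.0f}")
print(f"expected double cot deaths per year: {expected_per_year:.1f}")
print(f"trial figure implies one case every {years_between_cases:.0f} years")
```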
[Video: Bayesian network analysis, courtesy of YouTube]
The lawyers prosecuting the case presented to the jury an interpretation of the data handed to them by CESDI. Their big mistake was not to use Bayes' theorem, which would have put that data in its proper context from the very start.
In non-mathematical language, Bayes' theorem allows you to separate how likely the alternative explanations of an event are from how likely it was that the event should have happened in the first place.
Interpreting the data with Bayes' theorem might have prevented everything that subsequently happened to Sally Clark.
This is carefully explained in the video above.
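The Bayesian comparison described in this section can be sketched in code. This is a simplified illustration, not the analysis presented in court or in the video: it assumes double cot death and double murder are the only two explanations, and the double-murder probability is a HYPOTHETICAL placeholder chosen purely to show the mechanics, not a real statistic. The function and variable names are my own.

```python
from fractions import Fraction

def posterior_natural_causes(p_double_sids, p_double_murder):
    """P(double cot death | two infant deaths in one family).

    Assumes the two hypotheses are the only possible explanations,
    so the posterior is each prior's share of their sum.
    """
    return p_double_sids / (p_double_sids + p_double_murder)

# HYPOTHETICAL placeholder, not a sourced figure: prior probability
# that a mother murders two of her infants.
P_DOUBLE_MURDER = Fraction(1, 1_000_000)

# Double cot death priors quoted in this article.
hill_estimate = Fraction(1, 1303) * Fraction(1, 100)   # 1 in 130,300
trial_figure = Fraction(1, 8543) ** 2                  # about 1 in 73 million

with_hill = posterior_natural_causes(hill_estimate, P_DOUBLE_MURDER)
with_trial = posterior_natural_causes(trial_figure, P_DOUBLE_MURDER)

print(f"P(natural causes), Hill estimate: {float(with_hill):.3f}")
print(f"P(natural causes), trial figure:  {float(with_trial):.3f}")
```

Whatever value replaces the placeholder, the structure of the calculation is the point: the rarity of double cot death means little on its own and has to be weighed against the rarity of the competing explanation.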
Sally Clark And The Miscarriage of Justice
Because of the flaws in the evidence given by the experts, Sally Clark was convicted in 1999, and her first appeal, in 2000, was dismissed.
At her second appeal, in January 2003, her conviction was quashed, owing to a much sounder interpretation of the data surrounding cot deaths and sudden infant death syndrome.
By the time she was released from jail, the damage was already done. As the daughter of a policeman and a solicitor herself, Sally Clark endured great difficulty in jail.
Immense damage was done to her mental health, and she died of alcohol poisoning in March 2007.
The Sally Clark episode goes to show what happens when we misinterpret data to fulfill some political objective, and the tragic damage that it can do to people's lives.
Thanks very much for reading.
Helen Joyce, Plus Magazine.
Professor Ray Hill, Salford University.
John Haigh, Taking Chances.