By: Kevin Lattery
I love chocolate, especially a good dark chocolate. I don’t even feel guilty because dark chocolate is supposed to have health benefits. Researchers tell us that it’s supposed to lower bad cholesterol and increase good cholesterol. It is also full of antioxidants and contains phenylethylamine (PEA), the same chemical your brain creates when you feel like you’re fresh in romantic love.
But now there is a new study, published in the prestigious New England Journal of Medicine, that suggests even more. It shows a high correlation (.791) between a country’s per capita chocolate consumption and the number of Nobel Prize winners.
The article suggests there might be a real causal relationship here, with chocolate consumption helping Switzerland earn those Nobel Laureates. I still think the article may be more of a joke. But whatever the article’s intentions, it’s fun to pick on because it fails in so many ways, and that failure can be instructive.
Confronted with any correlation between two factors, the first thing one asks is whether there is a third factor that influences both. For instance, there is a high correlation between ice cream sales and drownings. But we reject the idea of ice cream causing drowning because of a more likely third variable: the outside temperature. Warmer weather brings with it more ice cream sales and more people swimming.
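To make the ice cream example concrete, here is a minimal simulation sketch. All of the numbers in it (temperatures, slopes, noise levels) are invented for illustration and are not real sales or drowning data. Temperature drives both series, and neither one affects the other. The raw correlation still comes out strong, while the partial correlation that controls for temperature should land near zero.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def simulate(days=5000, seed=0):
    """Return (raw, temperature-controlled) correlation for made-up data."""
    rng = random.Random(seed)
    # Hypothetical daily temperatures in degrees Celsius.
    temp = [rng.gauss(20, 8) for _ in range(days)]
    # Both series depend on temperature plus independent noise.
    # Ice cream never appears in the drowning formula, or vice versa.
    ice_cream = [2.0 * t + rng.gauss(0, 8) for t in temp]
    drownings = [0.5 * t + rng.gauss(0, 4) for t in temp]
    return pearson(ice_cream, drownings), partial_corr(ice_cream, drownings, temp)


raw, controlled = simulate()
print(f"raw correlation:             {raw:.2f}")
print(f"controlling for temperature: {controlled:.2f}")
```

The catch is that this only works because we knew to measure temperature. Controlling for a factor requires having thought of it first.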
As statisticians, part of our job is to look for these other factors (like the temperature) that may explain the correlation. In the case of Nobel Prizes, we would want to control for factors like income, access to education, healthcare, and research-oriented jobs and funding. We might even want to consider the Nobel Prize panel and their perception of each country.
We would also want to look within each country to see if these relationships exist. Are the Swedish Nobel Laureates even eating chocolate? What if they are eating significantly less chocolate than others in their country? Answering questions like these calls for a hierarchical, or multi-level, mixed model.
Sophisticated statistics can help explain spurious correlations, but in the end we will never be certain that we have controlled for the right factors with the right model. There may always be some unknown factor we missed. This kind of uncertainty was leveraged by the tobacco companies for years. Yes, they would admit there is a correlation between smoking and lung cancer, but they argued there are many other potential factors that could be causing the cancer. Smokers tend to drink more alcohol and coffee, and exercise less. They may even have different genetic or psychological tendencies.
The best way to test for causation is with a randomized experimental design. You see these in medicine all the time. Take a group of people and randomly divide them into two subgroups. Give one group a drug (test), and the other a placebo (control). Then see if the drug brings about a change in the test group relative to the control group. The random assignment of people into two groups takes care of controlling for other factors. At least, it probably does so. There is always a chance that we just happened to split the two groups along some relevant factor. So we can repeat the study multiple times and see if the results are replicated.
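Here is a rough sketch of why random assignment helps, again with made-up numbers. The simulated treatment has a true effect of 1.0, and a hidden factor that we never measure also pushes the outcome up. The function name `run_study` and every parameter in it are hypothetical. When people choose their own group and the hidden factor influences that choice, the naive comparison overstates the effect. When a coin flip decides, the hidden factor should balance out across the groups and the estimate should land close to the truth.

```python
import random


def mean(values):
    return sum(values) / len(values)


def run_study(randomized, n=20000, true_effect=1.0, seed=1):
    """Return (estimated effect, hidden-factor imbalance between groups)."""
    rng = random.Random(seed)
    outcomes = {True: [], False: []}
    hidden_by_group = {True: [], False: []}
    for _ in range(n):
        hidden = rng.gauss(0, 1)  # unmeasured factor, e.g. general health
        if randomized:
            # A coin flip decides, so the hidden factor cannot influence it.
            treated = rng.random() < 0.5
        else:
            # Self-selection: people high on the hidden factor opt in more.
            treated = hidden + rng.gauss(0, 1) > 0
        effect = true_effect if treated else 0.0
        outcome = effect + 3.0 * hidden + rng.gauss(0, 1)
        outcomes[treated].append(outcome)
        hidden_by_group[treated].append(hidden)
    estimate = mean(outcomes[True]) - mean(outcomes[False])
    imbalance = mean(hidden_by_group[True]) - mean(hidden_by_group[False])
    return estimate, imbalance


for label, flag in [("self-selected", False), ("randomized", True)]:
    estimate, imbalance = run_study(randomized=flag)
    print(f"{label:>13}: estimated effect {estimate:.2f}, "
          f"hidden-factor imbalance {imbalance:.2f}")
```

Changing `seed` plays the role of repeating the study. The randomized estimate moves around a little from run to run but stays near the true effect.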
In the case of chocolate, we would assign each person to one of two groups at birth. One group we would force to eat chocolate, while the other would be prohibited. Then we would see how many from each group become Nobel Laureates. Unfortunately, there are not that many Nobel Laureates. So we would be comparing some very small percentages. I’m afraid at the end of this costly and ethically problematic study, we would still question the results. The small sample of Nobel Prizes makes it a bad target variable. The chart above shows that we are talking about 30 Nobel Laureates (at the high end) per 10 million, so .0003%. When we start comparing numbers this small, I think most statisticians get squeamish. A better test might be to look at something like IQ level rather than Nobel Laureates.
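To put a number on that squeamishness, here is a back-of-the-envelope sketch. It rests on two assumptions of mine, not the study's. First, it treats 30 per 10 million as each person's chance of becoming a laureate. Second, it imagines that chocolate doubles that chance, which is a purely hypothetical effect size. The sample size comes from the textbook normal-approximation formula for comparing two proportions, with the conventional 5% significance level and 80% power.

```python
import math
from statistics import NormalDist

# High-end rate from the article: 30 laureates per 10 million people.
base_rate = 30 / 10_000_000


def expected_laureates(group_size, rate=base_rate):
    """Expected number of laureates in a group of the given size."""
    return group_size * rate


def per_arm_sample_size(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate people needed per group to detect p_control vs p_test
    with a two-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    n = (z_alpha + z_power) ** 2 * variance / (p_test - p_control) ** 2
    return math.ceil(n)


print(f"base rate: {base_rate:.4%}")
print(f"expected laureates among 1 million people: "
      f"{expected_laureates(1_000_000):.1f}")
# Hypothetical effect: chocolate doubles the chance of a Nobel Prize.
needed = per_arm_sample_size(base_rate, 2 * base_rate)
print(f"people needed per group to detect a doubling: {needed:,}")
```

Even under that generous doubling assumption, the formula asks for millions of people in each group, and a group of one million would be expected to produce only about three laureates.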
Ironically, another statistician did look at the relationship between chocolate consumption and IQ, using the same 23 countries (except Japan, whose data could not be obtained), and this produced a much lower correlation.
This still does not control for third factors like education and research money. Nor does it look within country to see if Swedes with higher IQs are eating more chocolate. I’m guessing that by the time we include those control variables in a multi-level model, my chances of boosting my IQ or winning a Nobel Prize aren’t much better as a result of my chocolate fetish. I’m OK with that. Valentine’s Day is just a day away, and I’ll gladly eat dark chocolate, even if I’m not any smarter because of it.