This article by David Tuller in the New York Times (February 27, 2015) reports on the discovery of a biomarker for chronic fatigue syndrome, which apparently is no longer called chronic fatigue syndrome. The story reports that the Institute of Medicine has changed the name of this condition to “systemic exertion intolerance disease”, which I interpret as an attempt to make it sound more like a real disease that could be treated pharmaceutically, and less like a label for lazy people. Anyway, the scientists speculate that this finding could form the basis for the first clinical diagnostic test for the illness. The research article was published in the new open-access journal Science Advances, which critics have assailed not only for its unprecedentedly high publication fees (more than $5,000 per article) but also because it turns out to not really be open access after all.
But the real point of this post is to note the shockingly frank description of p-hacking in the news report. These authors seem to have no shame about this, which I suspect means that they don't really know that they did anything wrong. Neither, apparently, do the reviewers of Science Advances. Here is how the New York Times describes the analysis in the paper:
"For the study, the research team — which included scientists from Columbia, Stanford and Harvard — tested the blood of 298 patients with the syndrome, and 348 healthy people who served as a control group, for 51 cytokines, substances that function as messengers for the immune system. When the team compared all the patients with all the healthy controls, they found no significant differences between the two groups. But after dividing the patients into two cohorts — those who had been sick for less than three years and those who had been sick longer — they found sharp differences. And both sets of patients were different from healthy controls."
They don't mention how many cut-points they tried before they came up with 3 years. Any bets on whether this one holds up or not?
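To see why the unreported search over cut-points matters, here is a quick simulation sketch. This is not the paper's analysis; the illness durations, the cytokine values, and the list of candidate cut-points are all made up for illustration. The point is only that in a null world, where the biomarker has no relationship to illness duration at all, picking the best of several cut-points inflates the false-positive rate well above the nominal 5%.

```python
import random, math

random.seed(1)

def two_sample_p(a, b):
    # Welch t-statistic with a normal approximation to the two-sided
    # p-value (adequate for the group sizes simulated here)
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    t = (ma - mb) / math.sqrt(va / na + vb / nb)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

n_sims, alpha = 1000, 0.05
cutpoints = [1, 2, 3, 4, 5]   # hypothetical "years sick" splits to try
hits_single = hits_best = 0
for _ in range(n_sims):
    # null world: cytokine level is unrelated to illness duration
    duration = [random.uniform(0, 10) for _ in range(298)]
    cytokine = [random.gauss(0, 1) for _ in range(298)]
    # honest analysis: one pre-specified cut-point at 3 years
    short = [c for c, d in zip(cytokine, duration) if d < 3]
    long_ = [c for c, d in zip(cytokine, duration) if d >= 3]
    if two_sample_p(short, long_) < alpha:
        hits_single += 1
    # p-hacked analysis: keep whichever cut-point gives the smallest p
    best = min(two_sample_p([c for c, d in zip(cytokine, duration) if d < cp],
                            [c for c, d in zip(cytokine, duration) if d >= cp])
               for cp in cutpoints)
    if best < alpha:
        hits_best += 1

print(f"false-positive rate, fixed cut-point: {hits_single / n_sims:.3f}")
print(f"false-positive rate, best of {len(cutpoints)} cut-points: {hits_best / n_sims:.3f}")
```

The cut-points are correlated with one another, so the inflation is less than five-fold, but it is still enough to turn a nominal 5% test into something noticeably more generous.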
While we're looking at this one, here's a line from the paper that impressed me on page 3:
"Two proinflammatory cytokines had a prominent association with short-duration ME/CFS (Table 2). This association was markedly elevated for interferon-γ (IFNγ), with an OR of 104.77 (95% CI, 6.975 to 1574.021; P = 0.001)."
That's an impressively wide interval.
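Just how wide becomes clearer on the log scale, where a 95% confidence interval for an odds ratio is symmetric. From the reported endpoints we can back out the implied standard error of the log odds ratio (a standard back-of-the-envelope calculation, using only the numbers quoted above):

```python
import math

# reported values from the quoted IFN-gamma result
or_hat, lo, hi = 104.77, 6.975, 1574.021

# the 95% CI is symmetric on the log scale, so the half-width
# divided by 1.96 recovers the standard error of the log OR
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
print(f"log OR = {math.log(or_hat):.3f}")
print(f"SE(log OR) = {se:.3f}")
print(f"upper/lower endpoint ratio = {hi / lo:.0f}x")
```

The upper endpoint is more than 200 times the lower endpoint, and a standard error around 1.4 on the log-odds scale is what you get when the estimate rests on very few events.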
And I confess that I was a bit baffled by this bit of text on page 8:
"GLM analyses were applied to examine both the main effect of diagnosis and the main and interaction effects of different fixed factors including diagnosis (two-level case versus control comparisons and three-level short duration versus long duration versus control analyses) and sex, with age adjusted as a continuous covariate. Because GLM uses the family-wise error rate, no additional adjustments for multiple comparisons were applied."
So you can't be accused of p-hacking if you fit a GLM because it uses a "family-wise error rate"? I have no idea what they're talking about there, but maybe that's my problem, not theirs.
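For what it's worth, fitting a separate model to each of 51 cytokines and declaring anything with p < 0.05 significant does not control the family-wise error rate; it is exactly the situation multiple-comparison corrections exist for. A minimal sketch, assuming (purely for illustration) that the 51 test statistics are independent standard normals under the null:

```python
import random, math

random.seed(2)

def normal_p(z):
    # two-sided p-value from a standard normal test statistic
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

n_sims, n_cytokines, alpha = 2000, 51, 0.05
any_hit = 0
for _ in range(n_sims):
    # null world: none of the 51 cytokines differ between groups
    ps = [normal_p(random.gauss(0, 1)) for _ in range(n_cytokines)]
    if min(ps) < alpha:
        any_hit += 1

fwer = any_hit / n_sims
print(f"chance of >=1 'significant' cytokine out of 51: {fwer:.2f}")
print(f"theory: 1 - 0.95**51 = {1 - 0.95**51:.2f}")
```

Under these assumptions you would expect at least one "significant" cytokine over 90% of the time even when nothing is going on; correlation among the cytokines would shrink that number somewhat, but not to 5%.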