Andrew Wakefield and Brian Hooker have been making claims that the CDC are involved in misconduct in autism research. In case you haven’t followed the story, it basically goes like this:
1) the CDC planned on a study of MMR and autism using the MADDSP data.
2) That the CDC created a research plan.
3) That the CDC found results they didn’t want to report: a calculated odds ratio for African American boys. So the CDC team allegedly deviated from that plan and didn’t report that result.
4) That the CDC introduced a new analysis after the plan: that they would include birth certificate data. While the CDC rationale for this new analysis was to provide more data (confounding variables) for the analysis, the alleged real reason was to dilute the sample set and make statistically significant results disappear.
Here’s a paragraph from one of the press releases about the Hooker study:
According to Dr. Thompson’s statement, “Decisions were made regarding which findings to report after the data was collected.” Thompson’s conversations with Hooker confirmed that it was only after the CDC study coauthors observed results indicating a statistical association between MMR timing and autism among African-Americans boys, that they introduced the Georgia birth certificate criterion as a requirement for participation in the study. This had the effect of reducing the sample size by 41% and eliminating the statistical significance of the finding, which Hooker calls “a direct deviation from the agreed upon final study protocol – a serious violation.”
Or so goes the story. But as is often the case with Andrew Wakefield and Brian Hooker, the facts don’t match the claims.
In a recent video, Mr. Wakefield shows us the research plan the CDC had drafted. One red flag with Mr. Wakefield’s approach so far has been how he tries to tightly manage the flow of information. He has not shared the analysis plan in full, and only now has he provided us with a couple of screenshots. Which raises the question: what are they hiding?
Here’s one screenshot from that video. This one is where he gets the idea that the plan was to report race for the entire sample.
Here’s the full text, in case that’s difficult to read:
We will use conditional logistic regression stratified by matched sets to estimate the odds ratios for association between age at MMR vaccination and autism. In the main analyses, we will include all autism cases.
Potential confounding variables will be evaluated individually for their association with the autism case definition. Those with an odds ratio p-value < 0.20 will be included as covariates in a conditional logistic regression model to estimate adjusted odds ratios for the association between age at vaccination and autism. The only variable available to be assessed as a potential confounder using the entire sample is child’s race. For the children born in Georgia for whom we have birth certificate data, several sub-analyses will be carried out similar to the main analyses to assess the effect of several other potential confounding variables. A recent case control study (CDC, 2001) carried out with a subset of the autism cases from this study found that age matched cases and controls differed on several important background factors including maternal age, maternal education, birth type, and parity. The variables that will be assessed as potential confounders in this study will be birth weight, APGAR scores, gestational age, birth type, parity, maternal age, maternal race/ethnicity, and maternal education. (See Table 2 for how variables will be categorized.)
There are two interesting points in the above. First, the sentence Mr. Wakefield highlights doesn’t say what he claims. The sentence reads: “The only variable available to be assessed as a potential confounder using the entire sample is child’s race.” The plan doesn’t say that they will test and report race. Consider the context: this is a section of the plan called “statistical analysis”. Read in context with the entire paragraph, this sentence is clear: the full dataset is limited because it only has one variable available.
The CDC didn’t deviate from the plan when they didn’t report on race for the total sample because that was never in the plan. If you want more evidence of this, the end of the paragraph says “See Table 2 for how variables will be categorized”. Table 2 is titled “Descriptive Statistics for Children Born in Georgia with Birth Certificate Records”. The variables will be categorized in the birth certificate sample.
The second interesting point from the paragraph Mr. Wakefield has shown us is this: the CDC plan included a birth certificate sample.
Here’s a screenshot of the analysis plan from that new video, showing the front page of the analysis plan:
Shown with this voice over by Mr. Wakefield (while the screenshot above is shown going up in flames…very dramatic)
“Over the ensuing months, after the data had been collected and analyzed, and strictly forbidden in the proper conduct of science, the group abandoned the approved analysis plan, introducing a revised analysis plan to help them deal with their problem.”
So, in case you were thinking, “that’s an analysis plan, how do we know it’s the analysis plan”, well, you have Mr. Wakefield’s word on it. This is the “approved analysis plan” that the CDC allegedly had to revise.
What interests me about this is that it’s the same plan that I have and was preparing to write about. It’s nice now to be able to say that this is, indeed, the same document that Mr. Wakefield and Mr. Hooker are working with.
We’ve already seen two big mistakes by the Wakefield/Hooker team. First, the analysis plan doesn’t include a call to report on race separately in the total sample (the group without the birth certificates). Second, the CDC “approved analysis plan” did include analysis of a subset with birth certificate data.
So, what were the objectives of the study as in the plan?
We did not have information regarding onset of symptoms for most cases in this study and this limited our ability to do certain types of analyses such as case series analyses. In addition, a totally unexposed group (i.e., never received the MMR vaccine or other measles containing vaccine) was not available since measles, mumps, and rubella vaccination are required for school attendance in Georgia. The following objectives are considered the primary objectives for this study.
1) To determine if case children were more likely than their matched controls to have been vaccinated with MMR before 36 months of age. DSM-IV criteria for autism require that onset of symptoms occur before 36 months of age. Therefore, the 36-month cut-off is one that by definition can be used to classify a definitely “unexposed” group.
2) To determine whether there was a difference between cases and controls in the proportion of children exposed to their first dose of MMR vaccine before 18 months of age. This objective is based on the research that suggests the timing of first parental concern for the development of autism appears around 18 months of age (Taylor et al, 1999). In addition, Cathy Lord has reported that the range of first parental concern for regression was between 12 and 23 months of age with a mode of 19-21 months.
3) To determine whether the age distribution for receipt of the MMR vaccine differs between cases and controls.
Analysis of Autism subgroups
The IOM (2001) specifically recommended additional research regarding autism subgroups and MMR. We will examine several subtypes of autism in this study. Data from the Metropolitan Atlanta Congenital Defects Program will be included in the sub-analyses to identify particular sub-groups. The following sub-group analyses will be conducted:
1) Analyses excluding cases with an established cause for autism or a co-occurring condition suggesting an early prenatal etiology (e.g., tuberous sclerosis, fragile X, or other congenital/chromosomal anomalies.)
We propose to conduct a case-control sub-analysis looking at cases without an established or presumptive cause for autism, such as tuberous sclerosis, fragile X, and other congenital/chromosomal anomalies. The purpose of doing this analysis is to create a more homogeneous case group that may be more likely to be impacted by the timing of the MMR vaccine. The objectives from the primary analyses will be replicated in this sub-analysis.
2) Analyses of Isolated versus Non-isolated Autism.
Isolated autism cases are cases with no other co-morbid developmental disability while non-isolated cases do have a co-morbid developmental disability. Previous research suggests that the majority of non-isolated cases have a co-existing developmental disability of mental retardation (CDC, 2001). Both isolated and non-isolated cases will be compared separately to controls. The objectives from the primary analyses will be replicated in this sub-analysis.
3) Analyses examining Gender Effects
Males are at substantially higher risk for autism and may be more vulnerable to the exposure associated with the MMR vaccine. We will analyze males and females separately and replicate the main objectives of the primary analyses as well as examine the potential confounders available from Georgia birth certificates.
4) Analyses excluding autism cases with known onset prior to 1 year of age.
For a subset of autism cases, we were able to identify the timing of parental concern. This sub-analysis will exclude all cases excluded with an established or presumptive cause for autism (e.g., tuberous sclerosis, fragile X, and other congenital/chromosomal anomalies.) and children for whom we have been able to identify first parental concern prior to 12 months of age.
Just in case anyone reading this is one of the few that has been following Mr. Wakefield’s video releases: in a new video Mr. Wakefield is trying to claim that the isolated autism subanalysis was not done. Except that it was. They made a minor change to autism without MR, which gave essentially the same result that Mr. Wakefield claims was hidden.
Autism without MR has an odds ratio of 2.45 with a 95% confidence interval of 1.20 to 5.00. I’ll write about this new video soon as there’s much sleight of hand going on, but Mr. Wakefield is claiming that a result of odds ratio = 2.48 with a confidence interval of 1.16 to 5.31 was not reported. Besides ignoring the fact that the data were reported by the CDC, Mr. Wakefield ignores the fact that these are raw-data results. Total sample, unadjusted analysis. In the adjusted analysis the result does not suggest an association.
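For anyone who wants to check how close those two results really are, here is a rough sketch in Python. It uses only the four interval endpoints and two odds ratios quoted above. The function name is my own, and the calculation assumes the intervals are ordinary Wald-type intervals (symmetric on the log scale), which is my assumption and not something stated in the paper or the plan.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Return (log odds ratio, approximate standard error).

    Assumes a Wald-type 95% interval, i.e. log(OR) +/- 1.96 * SE.
    This helper is illustrative only; it is not taken from the CDC analysis.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_or, se

# Figures quoted in the text
reported = log_or_and_se(2.45, 1.20, 5.00)  # autism without MR, as published
claimed = log_or_and_se(2.48, 1.16, 5.31)   # the result said to be unreported

# How far apart are the two estimates, measured in standard errors?
gap = abs(claimed[0] - reported[0])
gap_in_se = gap / reported[1]
print(f"log OR gap = {gap:.3f}, or {gap_in_se:.2f} standard errors")
```

This is not a formal test (the two analyses draw on largely the same children, so they are not independent), but it shows the scale: the two estimates differ by a small fraction of one standard error, which is what “essentially the same result” means in practice.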
But, getting back to the main point: the claims of fraud are just not founded on fact. The two main claims of “fraud” are just wrong. The analysis plan did not state that they would do a subanalysis by race for the total sample. The addition of the birth certificate data is in the plan, not in some sort of revision. And Mr. Wakefield and Mr. Hooker knew this.
I am reminded of a quote from a recent ABC News article:
“There are always going to be those people at the edges of science who want to shout because they don’t want to believe what the data are showing,” said Dr. Margaret Moon, a pediatrician and bioethicist at Johns Hopkins Berman Institute of Bioethics. She said she thought the study author “manipulated the data and manipulated the media in a very savvy and sophisticated way.”
“It’s not good. It’s not fair. It’s not honest. But it’s savvy,” Moon said.
By Matt Carey