[Correction 5/4/2008: Please see this comment. The trends and conclusions don't change. The scatter of the graphs is not affected in a way that is noticeable, but the Y ranges do change. The adjustment formula also changes. See the corrected spreadsheet for details.]
This is a critique of Palmer et al. (2008), a recent study claiming to associate the administrative prevalence of autism in Texas school districts with proximity to coal-fired power plants, as well as with mercury emissions. Normally I would just point out the likely problems with the paper, but this time I will go further and test a key hypothesis of my critique using California data, in a way that is straightforward enough for readers to verify.
Palmer et al. (2008) is not the first study of its kind. Palmer et al. (2006) claimed to document that “for each 1000 lb of environmentally released mercury, there was a 43% increase in the rate of special education services and a 61% increase in the rate of autism.” The more recent paper by Palmer et al. does not result in such remarkable estimates, considering its finding that “for every 1,000 pounds of release in 1998, there is a corresponding 2.6 percent increase in 2002 autism rates.”
Windham et al. (2006) is a case-control study done in the San Francisco Bay Area which claims to associate autism with emissions of Hazardous Air Pollutants (HAPs).
Then we also have Waldman et al. (2007), which I consider a study of the same type, except it associates autism with precipitation (as a proxy of television exposure) instead of environmental pollution.
My primary criticism of these types of studies is that they are attempting to find a cause for an epidemiological phenomenon that could very well not require an environmental explanation. That is, administrative data (special education data in particular) is not equipped to tell us if there are real differences in the prevalence of autism from one region to the next. No screening has ever demonstrated that substantial differences in administrative prevalence between regions are not simply diagnostic differences.
That said, the studies have been done, and they have found statistical associations. This usually means they have either found a real effect or failed to properly control for some confound.
As I have noted repeatedly over the last couple of years, the glaring confound that most likely mediates these types of associations is urbanicity. The association between urbanicity and autism was documented even before these studies were carried out. It is plausibly explained by a greater availability of autism specialists in urban areas and by greater awareness on the part of parents who live in cities.
Palmer et al. (2008) does control for urbanicity, which might be one of several reasons why its findings are underwhelming compared to those of Palmer et al. (2006).
Is the control for urbanicity in Palmer et al. (2008) adequate?
There are two main problems with the control for urbanicity, described in the paper as follows.
Urbanicity. Eight separate demographically defined school district regions were used in the analysis as defined by the TEA: (1) Major urban districts and other central cities (2) Major suburban districts and other central city suburbs (5) Non-metropolitan and rural school districts. In the current analysis, dummy variables were included in the analysis coding Urban (dummy variable 1) and Suburban (dummy variable 2), contrasted with non-metro and rural districts which were the referent group. Details and specific definitions of urbanicity categories can be obtained at the TEA website http://www.tea.state.tx.us/data.html
1. It is too discrete. Within the set of urban districts, some districts will be more urban than others. The same is true of rural districts. Palmer et al. (2008) is effectively using a stratification method to control for urbanicity, but this method is limited, especially considering the paper looks at 1,040 school districts. A better methodology would be to use population density as a variable.
2. Modeling for distance. The paper models autism rates based on distance to coal-fired power plants. It follows that a control variable should model distance to urban areas rather than the urbanicity of each district. Granted, this would not be easy because, as noted, urbanicity is not a discrete measure. But it needs to be noted as a significant limitation of the analysis. Consider school districts in areas designated as “rural” that are close to areas designated as “urban.” Such proximity would presumably provide greater access to autism specialists than would otherwise be the case.
This time around I thought it would be a good idea to run some actual numbers in order to test this population density confound hypothesis that up to this point has been simply theoretical. I will use county-level data from the state of California, which was fairly easy to obtain on short notice. The data used is the following.
- Special education autism caseload data at the county level for 2005 was obtained from a California resident who had requested it from the California Department of Education.
- County population and density data for 2006 was obtained from counties.org.
- Atmospheric mercury concentration data was obtained from the EPA’s 1996 National Air Toxics Assessment Exposure and Risk Data.
- All of the raw data, intermediate data, formulas, and resulting charts can be found in this spreadsheet which I am making available for readers to verify and tweak as needed.
Population Density vs. Autism
Autism prevalence was calculated by dividing the special education autism caseload of each county by its total population (Column G). This is not a precise determination, of course, since the caseload is drawn from the school-age population only, but it should not affect the analysis. In any given California county, the population under 18 is roughly a fifth of the total population of the county, so using total population rescales every county by about the same factor.
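The prevalence calculation above can be sketched in a few lines. This is only an illustration: the county figures are invented, the per-10,000 scaling is my assumption rather than something taken from the spreadsheet, and `prevalence_per_10k` is a hypothetical helper name.

```python
def prevalence_per_10k(caseload, population):
    """Administrative prevalence: autism caseload per 10,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 10_000 * caseload / population

# Hypothetical county (NOT real data): 450 students on the autism
# caseload, 600,000 residents in total.
total_pop_rate = prevalence_per_10k(450, 600_000)

# If the under-18 share is about the same in every county, switching the
# denominator multiplies every county's rate by the same factor, so the
# shape of any trend across counties is unchanged.
under_18_share = 0.2  # the rough figure quoted above, treated as constant
under_18_rate = prevalence_per_10k(450, 600_000 * under_18_share)

print(total_pop_rate, under_18_rate)
```

The second rate is simply the first divided by the under-18 share, which is why the choice of denominator matters for the Y range but not for the correlation.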
A first attempt at modeling population density vs. autism prevalence (Chart A) suggested the relationship was logarithmic. So I modeled log(population density) vs. autism prevalence, which resulted in the clear correlation you see in Figure 1 (Chart B).
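For readers who would rather check the fit outside of Excel, here is a minimal sketch of a least-squares line of prevalence against log(population density). The four data points are invented for illustration, and the base-10 logarithm is an assumption on my part (it is the Excel default for LOG); `fit_line` is a hypothetical helper, not something from the spreadsheet.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up (density per square mile, prevalence) pairs, NOT the real data.
counties = [(10, 3.0), (100, 5.0), (1_000, 7.0), (10_000, 9.0)]
log_density = [math.log10(d) for d, _ in counties]
prevalence = [p for _, p in counties]

slope, intercept = fit_line(log_density, prevalence)
print(slope, intercept)
```

With real county data the points will scatter around the line, but the slope and intercept are read off the same way.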
This is as expected. You will note, however, there is one significant outlier in the lower-right quadrant. That is San Francisco County. Presumably because of its peculiar geographic characteristics, its population density is the highest in the state. Nevertheless, San Francisco is an important data point since it is a significant urban area which happens to have a relatively low special education prevalence of autism. Let’s leave it in and see how it affects things.
I will use a simple standardization method of adjustment for population density. Basically, I will standardize autism prevalence in each county, such that population density is no longer a factor. Think of it this way. If the population density of each county grew such that its log were now about 3.5, how would we expect autism prevalence to be affected? The following formula is what I came up with, where Y is the autism prevalence of a county and X is the log of its population density.
Adjusted(Y) = Y + 7 – 1.93 * X
The fact that the adjusted prevalence (Column H) is not dependent on population density can be verified graphically (Chart C). Readers can click back and forth between Chart B and Chart C to better understand the effect of the adjustment. I will come back to this adjusted prevalence.
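The adjustment can be read as "subtract the fitted trend, then add back its value at a reference density." The sketch below shows that reading under stated assumptions: it uses the coefficients as printed in the text (the correction note at the top says the spreadsheet formula has since changed), and the function names and the example county are hypothetical.

```python
def standardize(y, x, slope, x_ref):
    """Shift y to the value expected at x_ref, assuming a linear trend in x."""
    return y - slope * (x - x_ref)

def adjusted_as_printed(y, x):
    """The adjustment exactly as printed in the text: Y + 7 - 1.93 * X."""
    return y + 7 - 1.93 * x

SLOPE = 1.93        # coefficient printed in the text
X_REF = 7 / SLOPE   # reference log density implied by the constant 7

# Hypothetical county: prevalence 5.0 at log density 2.0.
general = standardize(5.0, 2.0, SLOPE, X_REF)
printed = adjusted_as_printed(5.0, 2.0)
print(round(general, 2), round(printed, 2))

# Points lying exactly on a line with that slope all map to one value,
# which is why the adjusted prevalence shows no trend against density.
on_trend = [(x, 1.0 + SLOPE * x) for x in (1.0, 2.0, 3.0, 4.0)]
flattened = [adjusted_as_printed(y, x) for x, y in on_trend]
print([round(v, 2) for v in flattened])
```

Note that the constant 7 corresponds to a reference log density of about 3.6, in line with the "about 3.5" mentioned above.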
Mercury Exposure Concentration vs. Autism
I obtained atmospheric mercury exposure concentrations for each county from 1996 EPA data (Column I). More recent data would’ve been better since our population density data is from 2006, but it is not clear if newer data is available. I learned of the 1996 data because that is what Windham et al. (2006) uses. I’m working under the assumption that changes in population density in the last decade have been roughly uniform across the state.
Let’s first look at Figure 2 (Chart E), a graph of log(mercury exposure) vs. autism prevalence, without adjustment for population density.
There is a graphically noticeable trend in Figure 2, which is not surprising. The question is, does the trend remain after adjustment for population density?
Figure 3 (Chart D) is a graph of log(mercury exposure) vs. standardized autism prevalence; that is, autism prevalence adjusted for population density as previously calculated. In this figure we see there’s no longer a graphically discernible correlation between environmental mercury and autism. In fact, Excel produces a linear fit that indicates a slight inverse correlation between environmental emissions and autism prevalence.
Granted, if we were to remove San Francisco as an outlier, the trend would be pushed upwards. But then in this graph there appear to be two additional outliers in the middle upper part of the graph, Orange County and Los Angeles County. Keep in mind we have not adjusted for wealth. Regardless of how we might adjust the analysis, I fail to see how the graph would support a statistically meaningful association between mercury exposure and autism.
So far I have provided evidence that, in California, an association between environmental mercury exposure and autism disappears once we control for population density. This is clear to my satisfaction, but I thought it would be a good idea to attempt an inverse exercise as an illustration of the adjustment method. That is, let us try adjusting prevalence for mercury exposure, and see if the correlation with population density remains.
This is similar to what I did previously. A linear model is discerned from the correlation between log(mercury exposure) and autism (Chart E). This is used to derive an adjustment formula (Column K) whose validity can be verified graphically (Chart F). The new adjusted prevalence (Column K) is used in a new graph of log(population density) vs. autism: Figure 4 (Chart G).
What Figure 4 (Chart G) tells us is that even after we control for mercury exposure, there is still a clear correlation between autism and population density. In other words, population density wins big time. I believe that is the epidemiological term.
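The two-way comparison above can be illustrated on toy numbers. In this sketch the data is invented so that prevalence depends, by construction, only on log density, while log mercury is merely correlated with log density. It is not the California data, and `fit_line` and `adjust` are hypothetical helpers that mimic the adjust-then-refit steps rather than reproduce the spreadsheet.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def adjust(ys, xs):
    """Remove the fitted linear trend of ys on xs, standardizing to mean(xs)."""
    slope, _ = fit_line(xs, ys)
    ref = sum(xs) / len(xs)
    return [y - slope * (x - ref) for x, y in zip(xs, ys)]

# Invented values: mercury tracks density loosely, prevalence tracks
# density exactly (an assumption built into this toy example).
log_density = [1.0, 2.0, 3.0, 4.0]
log_mercury = [0.0, 1.0, 0.5, 2.0]
prevalence = [1.0 + 2.0 * x for x in log_density]

raw_mercury_slope, _ = fit_line(log_mercury, prevalence)
mercury_slope_after, _ = fit_line(log_mercury, adjust(prevalence, log_density))
density_slope_after, _ = fit_line(log_density, adjust(prevalence, log_mercury))

print(round(raw_mercury_slope, 3))
print(round(mercury_slope_after, 3))
print(round(density_slope_after, 3))
```

In this toy setup the unadjusted mercury trend is strong, it vanishes after adjusting for density, and a positive density trend survives adjusting for mercury. That is the pattern to look for when comparing Figure 3 with Figure 4.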
An analysis of California data suggests that correlations between the administrative prevalence of autism and environmental mercury emissions are fully mediated by population density. Palmer et al. (2008) suggests there is a real effect in Texas, but its results are not convincing primarily because its control for urbanicity is limited and inconsistent with the hypothesis the paper tests.