Map 2. The map above shows which counties are "hotter" than others in terms of death rates. However, it does not do a very good job of showing larger-scale clustering. This next map divides the continental US into a 1-degree grid, on which larger-scale clustering becomes more apparent. This is interesting because it shows very few bright red areas in the middle of blue areas; a red area almost always has at least one purple area adjacent to it.
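The gridding step can be illustrated with a short sketch. This is not necessarily how the map was produced: it assumes each county is represented by a single point (for example a centroid) and that a cell's value is the unweighted mean of the county rates falling inside it. The function names and the sample coordinates are hypothetical.

```python
import math
from collections import defaultdict

def grid_cell(lat, lon, cell_size=1.0):
    """Return the south-west corner of the grid cell containing (lat, lon)."""
    return (math.floor(lat / cell_size) * cell_size,
            math.floor(lon / cell_size) * cell_size)

def aggregate_by_cell(points, cell_size=1.0):
    """Average values per grid cell.

    points: iterable of (lat, lon, value) tuples, e.g. one per county.
    Returns a dict mapping each cell's south-west corner to the mean value.
    The unweighted mean is an assumption made for illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in points:
        cell = grid_cell(lat, lon, cell_size)
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical sample: two points share a cell, one falls in the cell to the north.
sample = [(40.7, -74.2, 30.0), (40.1, -74.9, 20.0), (41.5, -74.2, 10.0)]
cells = aggregate_by_cell(sample)
```

Using `math.floor` rather than `int()` matters for the negative longitudes of the continental US, since truncation toward zero would place a point in the wrong cell.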
Map 3. The next step was to obtain data from the Toxics Release Inventory (TRI) in shapefile format (http://www.mapcruzin.com). This data came with coordinates for each source. Studies have demonstrated that the latitudes and longitudes submitted with this data are relatively inaccurate. However, these facilities are at least much more accurate in reporting their county. This, along with the fact that the breast cancer data was available only at the county level, led me to limit the analysis and combine all TRI data at the scale of the county. Map 3 shows the number of TRI facilities per county. A higher number in a county may simply reflect a large county. However, a large portion of the counties that are dense with TRI facilities are relatively small. As we will see in Table 1, there is a high correlation between the number of facilities and population.
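Combining facility records at the county scale amounts to grouping by county code. A minimal sketch follows, assuming one record per facility and hypothetical field names `fips` and `release_lbs`; the actual TRI attribute names may differ.

```python
from collections import defaultdict

def summarize_by_county(records):
    """Count facilities and sum releases per county.

    records: iterable of dicts with the hypothetical keys
      'fips'        -> 5-digit county FIPS code (string)
      'release_lbs' -> pounds released by that facility
    Returns {fips: {'facility_count': int, 'total_release': float}}.
    """
    summary = defaultdict(lambda: {"facility_count": 0, "total_release": 0.0})
    for rec in records:
        county = summary[rec["fips"]]
        county["facility_count"] += 1
        county["total_release"] += rec["release_lbs"]
    return dict(summary)

# Hypothetical sample records, not real TRI data.
records = [
    {"fips": "06037", "release_lbs": 100.0},
    {"fips": "06037", "release_lbs": 50.0},
    {"fips": "32001", "release_lbs": 900.0},
]
by_county = summarize_by_county(records)
```

Keeping the FIPS code as a string preserves leading zeros, which matters when joining to other county-level tables.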
Map 4. This map is based on the sheer output of chemicals in the TRI per county, regardless of the level of toxicity. This is indicative of a community's toxic load. Many of the counties in Nevada and New Mexico appear so hot because of relatively high output volumes from mining, but these are areas of rather low population density. What is interesting is the difference between the number of facilities and sheer output volume. Just because many facilities are located in a particular county does not mean that county will also have a high output volume.
Table 1. This table was produced to investigate where correlations might exist between TRI data and breast cancer death rates, counts, and projected counts in the future. Because the projected count (Ecountwf) correlates very highly with the current count (.9967), I can use data from 1970-1994 as it relates to current (1999) TRI data. I noticed a relatively high (.7319) correlation between facility count and death count (not rate). However, I figured this was a result of population per county more than anything else. So I found the 1999 estimated population data by county (http://www.census.gov) and found fairly high correlations between population and facility count (.8734) and between population and death count (.8439). Therefore I have to assume that the three are related primarily through population. The correlation between TRI output and death rate was negligible (-.0041); that is, a high TRI output in a county did not correspond to a high breast cancer death rate. I also ran regressions on this data and produced scatter plots. They showed a flat slope, i.e. no strong relationship at all.
| Correlation | FACILITY_COUNT | TOTAL_RELEASE | TOT_REL_TRN | Ratewf | Countwf | Lbwf | Ubwf | Lwf_uus | Lus_uwf | Ecountwf | Population_7_1_1999_Estimate | Population_7_1_1998_Estimate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FACILITY_COUNT | 1 | | | | | | | | | | | |
| TOTAL_RELEASE | 0.077201474 | 1 | | | | | | | | | | |
| TOT_REL_TRN | 0.228437132 | 0.975168948 | 1 | | | | | | | | | |
| Ratewf | 0.232983466 | -0.004135073 | 0.031376927 | 1 | | | | | | | | |
| Countwf | 0.731941588 | 0.036919316 | 0.137138669 | 0.254850399 | 1 | | | | | | | |
| Lbwf | 0.369845084 | 0.007586189 | 0.0630855 | 0.854768951 | 0.361110453 | 1 | | | | | | |
| Ubwf | -0.011314659 | -0.006318832 | -0.007422909 | 0.784448698 | 0.038077537 | 0.358545464 | 1 | | | | | |
| Lwf_uus | 0.369840209 | 0.007585355 | 0.063083612 | 0.854769501 | 0.361124341 | 0.999999675 | 0.358544359 | 1 | | | | |
| Lus_uwf | 0.011325158 | 0.006324469 | 0.007429779 | -0.784448901 | -0.03806916 | -0.35854533 | -0.999999872 | -0.358544227 | 1 | | | |
| Ecountwf | 0.752805933 | 0.043892289 | 0.147689779 | 0.242206411 | 0.996652452 | 0.361748538 | 0.014533547 | 0.361761675 | -0.014525136 | 1 | | |
| Population_7_1_1999_Estimate | 0.873416689 | 0.068485827 | 0.194233922 | 0.219336819 | 0.843851271 | 0.351799093 | -0.015024587 | 0.351805765 | 0.015034279 | 0.871002682 | 1 | |
| Population_7_1_1998_Estimate | 0.875398936 | 0.068717901 | 0.1948283 | 0.220454913 | 0.84600156 | 0.352815195 | -0.014193687 | 0.352821591 | 0.014203314 | 0.872830654 | 0.999950388 | 1 |
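Each entry in Table 1 is a pairwise correlation between two county-level columns. The sketch below assumes these are Pearson correlation coefficients (the text does not name the statistic) and adds an ordinary least-squares slope of the kind a simple regression would report. The function names and sample numbers are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / math.sqrt(var_x * var_y)

def ols_slope(xs, ys):
    """Slope of the least-squares line predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("slope is undefined when xs is constant")
    return cov / var_x

# Made-up numbers: a perfectly linear pair and a perfectly flat one.
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
slope_linear = ols_slope([1, 2, 3, 4], [2, 4, 6, 8])
slope_flat = ols_slope([1, 2, 3, 4], [5, 5, 5, 5])
```

A slope near zero together with a correlation near zero, as reported for TOTAL_RELEASE against Ratewf, is what a flat scatter plot looks like numerically.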
DATA FIELD DEFINITIONS
FACILITY_COUNT: Number of TRI facilities in the county
TOTAL_RELEASE: Total release per county in pounds of chemicals
TOT_REL_TRN: Total release on and off site per county in pounds of chemicals
COUNTY: 5-digit FIPS code
RATEWF: mortality rate per 100,000, age-adjusted to the 1970 US population
COUNTWF: number of deaths
LBWF: lower bound of the 95% confidence interval on the mortality rate
UBWF: upper bound of the 95% confidence interval on the mortality rate
LWF_UUS: LBWF - upper bound of the 95% confidence interval on the US mortality rate
LUS_UWF: lower bound of the 95% confidence interval on the US mortality rate - UBWF
ECOUNTWF: expected number of deaths based on US rates
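The two difference fields, LWF_UUS and LUS_UWF, can be read as significance flags: by their definitions, a positive LWF_UUS means the county's whole confidence interval lies above the US interval, and a positive LUS_UWF means it lies entirely below. The sketch below illustrates that reading; the function name and the US bounds used in the example are hypothetical, not values from the data set.

```python
def compare_to_us(lbwf, ubwf, us_lower, us_upper):
    """Classify a county rate against the US rate using 95% interval bounds.

    lbwf, ubwf         -> county lower and upper bounds (LBWF, UBWF)
    us_lower, us_upper -> US lower and upper bounds (hypothetical inputs)
    """
    lwf_uus = lbwf - us_upper   # LWF_UUS as defined in the field list
    lus_uwf = us_lower - ubwf   # LUS_UWF as defined in the field list
    if lwf_uus > 0:
        return "above"          # county interval entirely above the US interval
    if lus_uwf > 0:
        return "below"          # county interval entirely below the US interval
    return "not distinguishable"

# Hypothetical US interval of 26.0 to 27.0 deaths per 100,000.
high = compare_to_us(30.0, 35.0, 26.0, 27.0)
low = compare_to_us(20.0, 24.0, 26.0, 27.0)
overlap = compare_to_us(25.0, 28.0, 26.0, 27.0)
```

Because the US bounds are the same for every county, LWF_UUS is just LBWF shifted by a constant and LUS_UWF is UBWF negated and shifted, which is consistent with the correlations of nearly 1 and nearly -1 between those pairs in Table 1.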
Map 5a and 5b. Even though there is no strong correlation between output volume and breast cancer death rates at the county scale, I thought it would still be useful to see the counties that show high values in both categories. It is also useful to see quickly where the two diverge, e.g. where there are high death rates with low output. The top map (5a) is too dense to demonstrate how this map can be useful. The enlarged portion (5b) shows a scale at which the map is more readable.
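A two-variable map of this kind comes down to placing each county in a combined class. The following is a minimal sketch, assuming a simple high/low split on each variable; the cutoffs actually used for Maps 5a and 5b are not stated, so the cutoff values and names here are hypothetical.

```python
def classify_county(rate, release, rate_cutoff, release_cutoff):
    """Return a (rate_class, release_class) pair of 'high' or 'low' labels."""
    rate_class = "high" if rate >= rate_cutoff else "low"
    release_class = "high" if release >= release_cutoff else "low"
    return (rate_class, release_class)

# Hypothetical cutoffs: 30 deaths per 100,000 and 1,000,000 pounds released.
both_high = classify_county(34.2, 2_500_000, 30.0, 1_000_000)
rate_only = classify_county(34.2, 12_000, 30.0, 1_000_000)
```

Each of the four resulting classes would then get its own color, so that counties high in both variables stand apart from those high in only one.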
Conclusion. This project found no strong correlation, at the county scale, between a county's white female breast cancer death rate and the sheer output volume of toxics. This does not mean, however, that knowing where the two coincide (particularly on the high end) is not useful. The final map (Map 5a and 5b) is helpful for seeing the counties where both are high.