Predicting Lung Cancer in the United States: A Multiple Model Examination of Public Health Factors
Abstract
1. Introduction
- How well does smoking predict lung cancer? How well does state predict it? What other factors should be included?
- How does a macro-level model (environmental quality) compare to a micro-level model (ambient air pollutants)?
- What is the best model we can obtain in terms of explanatory power and predictive accuracy?
2. Literature Review
3. Materials and Methods
- CDC United States Cancer Statistics, 2013–2017
- County Health Ranking Organization, 2011–2013 (University of Wisconsin Population Health Institute)
- EPA Outdoor Air Quality Data, 2006–2010
- Air Quality-Lung Cancer Data, 2000–2005 (National Cancer Institute and Environmental Protection Agency)
3.1. Data Cleaning
3.2. Model Results and Interpretation
3.3. Foundation + Ambient Emissions
3.4. Linear Model of All Layers
3.5. Model Comparison
4. Discussion and Contributions
5. Limitations and Directions for Future Research
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Emissions tracked by the National Air Toxics Assessment
1,1,1-Trichloroethane | Acetaldehyde | Chlorobenzilate | Hexachlorobutadiene | O-Toluidine |
1,1,2,2-Tetrachloroethane | Acetamide | Chloroform | Hexachlorocyclopentadiene | PAH/POM |
1,1,2-Trichloroethane | Acetonitrile | Chloromethyl Methyl Ether | Hexachloroethane | Parathion |
1,1-Dimethylhydrazine | Acetophenone | Chloroprene | Hexamethylene Diisocyanate | P-Dioxane |
1,2,3,4,5,6-Hexachlorocyclohexane | Acrolein | Cobalt | Hexamethylphosphoramide | Pentachloronitrobenzene |
1,2,4-Trichlorobenzene | Acrylamide | Coke Oven Emissions | Hexane | Pentachlorophenol |
1,2-Dibromo-3-Chloropropane | Acrylic Acid | Cresols | Hexavalent Chromium | Phenol |
1,2-Diphenylhydrazine | Acrylonitrile | Cumene | Hydrazine | Phosgene |
1,2-Epoxybutane | Allyl Chloride | Cyanide | Hydrochloric Acid | Phosphine |
1,2-Propylenimine | Aniline | Dibenzofuran | Hydrogen Fluoride | Phosphorus |
1,3-Butadiene | Antimony | Dibutyl Phthalate | Hydroquinone | Phthalic Anhydride |
1,3-Dichloropropene | Arsenic | Dichloroethyl Ether | Iodomethane | Polychlorinated Biphenyl |
1,3-Propane Sultone | Benzene | Dichlorvos | Isophorone | P-Phenylenediamine |
1,4-Dichlorobenzene | Benzidine | Diesel PM10 | Lead | Propionaldehyde |
2,2,4-Trimethylpentane | Benzotrichloride | Diethanolamine | Maleic Anhydride | Propoxur |
2,4,5-Trichlorophenol | Benzyl Chloride | Diethyl Sulfate | Manganese | Propylene Dichloride |
2,4,6-Trichlorophenol | Beryllium | Dimethyl Phthalate | Mercury | Propylene Oxide |
2,4-Dichlorophenoxyacetic Acid | Biphenyl | Dimethyl Sulfate | Methanol | Quinoline |
2,4-Dinitrophenol | Bis(2-Ethylhexyl) Phthalate | Dimethylcarbamoyl Chloride | Methoxychlor | Quinone |
2,4-Dinitrotoluene | Bis(Chloromethyl) Ether | Epichlorohydrin | Methyl Chloride | Selenium |
2,4-Toluenediisocyanate | Bromoform | Ethyl Acrylate | Methyl Isobutyl Ketone | Styrene |
2-Acetylaminofluorene | Bromomethane | Ethyl Carbamate | Methyl Isocyanate | Styrene Oxide |
2-Chloroacetophenone | Cadmium | Ethyl Chloride | Methyl Methacrylate | Tetrachloroethylene |
2-Nitropropane | Calcium Cyanamide | Ethylbenzene | Methyl Tert-Butyl Ether | Titanium Tetrachloride |
3,3’-Dichlorobenzidine | Captan | Ethylene Dibromide | Methylene Chloride | Toluene |
3,3’-Dimethoxybenzidine | Carbaryl | Ethylene Dichloride | Methylhydrazine | Toluene-2,4-Diamine |
3,3’-Dimethylbenzidine | Carbon Disulfide | Ethylene Glycol | N,N-Dimethylaniline | Toxaphene |
4,4’-Diphenylmethane Diisocyanate | Carbon Tetrachloride | Ethylene Oxide | N,N-Dimethylformamide | Trichloroethylene |
4,4’-Methylenebis(2-Chloroaniline) | Carbonyl Sulfide | Ethylene Thiourea | Naphthalene | Triethylamine |
4,4’-Methylenedianiline | Catechol | Ethylenimine | Nickel | Trifluralin |
4,6-Dinitro-O-Cresol | Chloramben | Ethylidene Chloride | Nitrobenzene | Vinyl Acetate |
4-Aminobiphenyl | Chlordane | Formaldehyde | N-Nitrosodimethylamine | Vinyl Bromide |
4-Dimethylaminoazobenzene | Chlorine | Glycol Ethers | N-Nitrosomorpholine | Vinyl Chloride |
4-Nitrobiphenyl | Chloroacetic Acid | Heptachlor | N-Nitroso-N-Methylurea | Vinylidene Chloride |
4-Nitrophenol | Chlorobenzene | Hexachlorobenzene | O-Anisidine | Xylene |
References
- Leading Causes of Death. 2021. Available online: https://www.cdc.gov/nchs/fastats/leading-causes-of-death.htm (accessed on 3 June 2021).
- Klebe, S.; Leigh, J.; Henderson, D.W.; Nurminen, M. Asbestos, Smoking and Lung Cancer: An Update. Int. J. Environ. Res. Public Health 2020, 17, 258.
- Rubin, H. Synergistic mechanisms in carcinogenesis by polycyclic aromatic hydrocarbons and by tobacco smoke: A bio-historical perspective with updates. Carcinogenesis 2001, 22, 1903–1930.
- Wogan, G.N.; Hecht, S.S.; Felton, J.S.; Conney, A.H.; Loeb, L.A. Environmental and chemical carcinogenesis. Semin. Cancer Biol. 2004, 14, 473–486.
- Dockery, D.W.; Pope, C.A.; Xu, X.; Spengler, J.D.; Ware, J.H.; Fay, M.E.; Benjamin, G.; Ferris, J.; Speizer, F.E. An Association between Air Pollution and Mortality in Six U.S. Cities. N. Engl. J. Med. 1993, 329, 1753–1759.
- Greenwald, H.P.; Polissar, N.L.; Borgatta, E.F.; McCorkle, R.; Goodman, G. Social Factors, Treatment, and Survival in Early-Stage Non-Small Cell Lung Cancer. Am. J. Public Health 1998, 88, 1681–1684.
- Abbey, D.E.; Nishino, N.; Mcdonnell, W.F.; Burchette, R.J.; Knutsen, S.F.; Beeson, W.L.; Yang, J.X. Long-Term Inhalable Particles and Other Air Pollutants Related to Mortality in Nonsmokers. Am. J. Respir. Crit. Care Med. 1999, 159, 373–382.
- Pope, C.A.; Burnett, R.T.; Thun, M.J.; Calle, E.E.; Krewski, D.; Ito, K.; Thurston, G.D. Lung Cancer, Cardiopulmonary Mortality, and Long-term Exposure to Fine Particulate Air Pollution. JAMA 2002, 287, 1132–1141.
- Alberg, A.J.; Brock, M.V.; Samet, J.M. Epidemiology of Lung Cancer: Looking to the Future. J. Clin. Oncol. 2005, 23, 3175–3185.
- Jacobson, M.Z. On the causal link between carbon dioxide and air pollution mortality. Geophys. Res. Lett. 2008, 35.
- Valavanidis, A.; Vlachogianni, T.; Fiotakis, K. Tobacco Smoke: Involvement of Reactive Oxygen Species and Stable Free Radicals in Mechanisms of Oxidative Damage, Carcinogenesis and Synergistic Effects with Other Respirable Particles. Int. J. Environ. Res. Public Health 2009, 6, 445–462.
- Anenberg, S.C.; Horowitz, L.W.; Tong, D.Q.; West, J.J. An Estimate of the Global Burden of Anthropogenic Ozone and Fine Particulate Matter on Premature Human Mortality Using Atmospheric Modeling. Environ. Health Perspect. 2010, 118, 1189–1195.
- Singh, G.K.; Williams, S.D.; Siahpush, M.; Mulhollen, A. Socioeconomic, Rural-Urban, and Racial Inequalities in US Cancer Mortality: Part I—All Cancers and Lung Cancer and Part II—Colorectal, Prostate, Breast, and Cervical Cancers. J. Cancer Epidemiol. 2011, 2011, 107497.
- Williams, D.R.; Kontos, E.Z.; Viswanath, K.; Haas, J.S.; Lathan, C.S.; MacConaill, L.E.; Chen, J.; Ayanian, J.Z. Integrating Multiple Social Statuses in Health Disparities Research: The Case of Lung Cancer. Health Serv. Res. 2012, 47, 1255–1277.
- Gharibvand, L.; Shavlik, D.; Ghamsary, M.; Beeson, W.L.; Soret, S.; Knutsen, R.; Knutsen, S.F. The Association between Ambient Fine Particulate Air Pollution and Lung Cancer Incidence: Results from the AHSMOG-2 Study. Environ. Health Perspect. 2017, 125, 378–384.
- Yi, H.; Kreuter, U.P.; Han, D.; Güneralp, B. Social segregation of ecosystem services delivery in the San Antonio region, Texas, through 2050. Sci. Total Environ. 2019, 667, 234–247.
- National Air Toxics Assessment. 2021. Available online: https://www.epa.gov/national-air-toxics-assessment/nata-frequent-questions (accessed on 3 June 2021).
- Lubin, J.H.; John, D.; Boice, J. Lung Cancer Risk From Residential Radon: Meta-analysis of Eight Epidemiologic Studies. J. Natl. Cancer Inst. 1997, 89, 49–57.
- Steenland, K.; Loomis, D.; Shy, C.; Simonsen, N. Review of occupational lung carcinogens. Am. J. Ind. Med. 1996, 29, 474–490.
- Loomis, D.; Grosse, Y.; Lauby-Secretan, B.; Ghissassi, F.E.; Bouvard, V.; Benbrahim-Tallaa, L.; Guha, N.; Baan, R.; Mattock, H.; Straif, K. The Carcinogenicity of Outdoor Air Pollution. Lancet Oncol. 2013, 14, 1262–1263.
- Alberg, A.J.; Samet, J.M. Epidemiology of lung cancer. Chest 2003, 123, 21S–49S.
- Govindan, R.; Page, N.; Morgensztern, D.; Read, W.; Tierney, R.; Vlahiotis, A.; Spitznagel, E.L.; Piccirillo, J. Changing Epidemiology of Small-Cell Lung Cancer in the United States Over the Last 30 Years: Analysis of the Surveillance, Epidemiologic, and End Results Database. J. Clin. Oncol. 2006, 24, 4539–4544.
- Lobdell, D.T.; Jagai, J.S.; Rappazzo, K.; Messer, L.C. Data Sources for an Environmental Quality Index: Availability, Quality, and Utility. Am. J. Public Health 2011, 101, S277–S285.
- Messer, L.C.; Jagai, J.S.; Rappazzo, K.M.; Lobdell, D.T. Construction of an environmental quality index for public health research. Environ. Health 2014, 13, 1–22.
- Kaufman, S.; Rosset, S.; Perlich, C.; Stitelman, O. Leakage in data mining: Formulation, detection, and avoidance. ACM Trans. Knowl. Discov. Data 2011, 15, 556–563.
- Witschi, H. Ozone, nitrogen dioxide and lung cancer: A review of some recent issues and problems. Toxicology 1988, 48, 1–20.
- Last, J.A.; Sun, W.-M.; Witschi, H. Ozone, NO, and NO2: Oxidant Air Pollutants and More. Environ. Health Perspect. 1994, 102 (Suppl. 10), 179–184.
- Eckel, S.P.; Cockburn, M.; Shu, Y.-H.; Deng, H.; Lurmann, F.W.; Liu, L.; Gilliland, F.D. Air pollution affects lung cancer survival. Thorax 2016, 71, 891–898.
- Geman, S.; Bienenstock, E.; Doursat, R. Neural networks and the bias/variance dilemma. Neural Comput. 1992, 4, 1–58.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: New York, NY, USA, 2009; Volume 1.
- Integrated Science Assessment for Particulate Matter (Final Report); U.S. Environmental Protection Agency: Washington, DC, USA, 2009.
- Integrated Science Assessment for Particulate Matter; U.S. Environmental Protection Agency: Washington, DC, USA, 2019.
ID | Variables | Methods/Data Source(s) | Findings |
---|---|---|---|
[5] | Fine particulates, including sulfates | Regression, 14–16 year mortality follow-up of 8111 adults in 6 US cities/Prospective cohort study | After adjusting for smoking, mortality strongly associated with air pollution with fine particulates |
[6] | Race, socioeconomic, and gender predictors of survival in early-stage non-small cell lung cancer | Regressions/SEER | Higher socioeconomic status helps survival, as does being Caucasian or female. |
[7] | PM10, SO4, SO2, O3, and NO2 checked for lung cancer | 6,338 nonsmoking, non-Hispanic white SDA residents of California were enrolled in 1977/Adventist Health Study (AHS) | Levels of PM10, SO4, SO2, O3, NO2 far higher for those with lung cancer, especially in males. |
[8] | PM2.5 and SO2, lung cancer, lung cancer mortality | Cox Proportional hazards model/American cancer society, part of cancer prevention study (CPS-II), ongoing prospective mortality study of 1.2 M adults | PM2.5 and SO2 associated with lung cancer; each 10 microgram/m3 increase associated with 8% increase in lung cancer mortality |
[9] | Race, gender, socioeconomic class, chemicals, not just smoking | Datasets from SEER and NPCR/National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program and the Centers for Disease Control and Prevention’s National Program of Cancer Registries (NPCR) | Epidemiological and biological studies show strong causation between smoking and cellular mutations; racial disparities: Black worst, then white, then other races; lower socioeconomic class is strongly associated with lung cancer, whereas gender is not; race seems to be a proxy for socioeconomic class |
[10] | Carbon dioxide, ozone, cancer, ozone mortality, ozone hospitalization, ozone emergency room visits, and particulate matter mortality | Mathematical model/NASA, EPA, and California air resources | A climate-air pollution model showed by cause-and-effect analysis that fossil-fuel CO2 increases U.S. surface ozone, carcinogens, and coarse particulate matter, increasing cancer rates |
[11] | asbestos fibers and ambient Coarse Particulate Matter PM10, PM2.5 and diesel exhaust particles | Chemicals purchased and combined with smoke, passed through filters/experiments | Synergistic effects in the generation of hydroxyl radicals in smoke with environmental asbestos fibers and ambient PM10, PM2.5 and diesel exhaust particles (DEP). The highest synergistic effects were observed with the asbestos fibers, PM2.5 and DEP, producing redox recycling and oxidative action. |
[12] | Ozone and PM2.5 to predict premature (excess) mortality | Simulations of preindustrial and present-day (2000) concentrations included rural areas/epidemiology literature | Tropospheric O3 and PM2.5 contribute substantially to global premature mortality from lung cancer, which is 14% higher than baseline. |
[13] | Socioeconomic, rural-urban, and racial inequalities in US cancer mortality | Regression/three national data sources: the national mortality database, the decennial census, and the 2009–2010 Area Resource File | Black individuals experienced higher mortality from each cancer than white individuals within each deprivation group. Socioeconomic gradients in mortality were steeper in non-metropolitan than in metropolitan areas. Mortality disparities may reflect inequalities in smoking and other cancer-risk factors, screening, and treatment. |
[14] | All of them | Statistics | Intersectionality of all the variables |
[15] | PM2.5 and O3 | 80,285 AHSMOG-2 participants were followed for an average of 7.5 years; Logistic regression/Adventist Health and Smog Study-2 (AHSMOG-2), a cohort of health-conscious nonsmokers, where 81% have never smoked. | Lung cancer is associated with PM2.5 in never smokers, and the association is slightly stronger for those spending 1+ hours per day outdoors or 5+ years at the same residence. |
[16] | Cancer risk index (CRI), incidence of cancer risk from air toxics | Statistical modelling of San Antonio, Texas; racial disparities found/Data for CRI from National Air Toxics Assessment [17] | Cancer risk index is positively correlated with ambient diesel coarse particulate matter. Institutional transformations are essential to mitigate the social-ecological divide. |
[18] | Radon, Lung cancer | Meta-analysis of 8 case-control studies of indoor radon, where n = 200+/Finland (2), USA (2), Sweden (2), China, Canada | Relative risk is 14% greater for those exposed to indoor radon versus the controls |
[19] | Occupational lung cancer, asbestos, arsenic, chromium, radon, silica, beryllium, nickel, cadmium, diesel exhaust | Review of many studies of workers in the U.S. | Conservative estimates are that the relative risk of occupational lung cancer is 1.31 for diesel fumes, 2.0 for asbestos, and 3.69 for arsenic; several million workers were exposed in the early 1980s |
[20] | 24 experts in a working group | Review of many studies: human, occupational, outdoor, indoor, animal. | From many sources, respirable PM10, PM2.5, NO2, SO2, and O3 are frequently and substantially above safe levels. Consistency in studies shows cellular damage, as well as genetic and epigenetic effects. |
[21] | Demographics, cancer types, cigarette features all lead to mutations and other changes in the genes | Review of smoking: all epidemiological and biological studies show strong causation, and lung cancer parallels the rise and fall of cigarette smoking/Many sources | Smoking is the most important cause of lung cancer. Prevention and cessation are important because smoking causes cancer in all demographics. |
[22] | Incidence and survival of Small-Cell Lung Cancer among all lung cancers by Gender and Smoking and Stage of cancer | Analysis of the Surveillance, Epidemiologic, and End Results (SEER) database | Proportion of SCLC has diminished, and survival has increased slightly, attributed to decreasing smoking and increased proportion of low-tar cigarettes |
Var. | Description | Time | Data Source |
---|---|---|---|
New Case of Lung Cancer | Cancer of the Lung or Bronchus, All Ages, All Races/Ethnicities, Male and Female. Rate per 100,000 people | 2013–2017 (mean) | CDC United States Cancer Statistics |
Adult Smoking | Percentage of adults who are current smokers (county level) | 2011–2013 (mean) | County Health Ranking Organization |
Land EQI | Environmental Quality Index–Land Domain | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
SocioD EQI | Environmental Quality Index–Socio-Demographic Domain | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
Built EQI | Environmental Quality Index–Built Environment Domain | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
Air EQI | Environmental Quality Index–Air Domain | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
Water EQI | Environmental Quality Index–Water Domain | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
PM2.5_T1 | Fine Particulate Matter (2.5 micrometers or smaller) Mean of 24 h period | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
PM10_T1 | Coarse Particulate Matter (10 micrometers or smaller) based on Mean of 24 h period | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
SO2_T1 | Sulfur Dioxide | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
NO2_T1 | Nitrogen Dioxide | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
CO_T1 | Carbon Monoxide | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
O3_T1 | Tropospheric (ground level) Ozone | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
CN_T1 | Cyanide compounds | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
Diesel | Gaseous exhaust produced by a diesel type of internal combustion engine | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
CS2 | Carbon Disulfide | 2000–2005 (mean) | Air Quality-Lung Cancer Data |
PM2.5_T2 | Fine Particulate Matter (2.5 micrometers or smaller), weighted annual mean (mean weighted by calendar quarter), based on weighted mean 24 h | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
PM10_T2 | Coarse Particulate Matter (10 micrometers or smaller), weighted annual mean (mean weighted by calendar quarter), based on weighted mean 24 h | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
SO2_T2 | Sulfur Dioxide Mean 1 h (the annual mean of all the 1-h measurements in the year) | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
NO2_T2 | Nitrogen Dioxide Mean 1 h (the annual mean of all the 1-h measurements in the year) | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
CO_T2 | Carbon Monoxide 2nd Max 8 h (the 2nd highest non-overlapping 8-h avg in the year) | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
O3_T2 | Tropospheric Ozone 4th Max 8 h, the 4th highest daily max 8-h average in the year | 2006–2010 (mean) | EPA Outdoor Air Quality Data |
Variable | Description | Imputation | Transformation |
---|---|---|---|
New Cases of Lung Cancer | Cancer of the Lung/Bronchus, Rate per 100,000 people | none | none |
Adult Smoking | Percentage of adults who are current smokers | none | none |
PM2.5_T1 | Particulate Matter 2.5 in Time 1 | none | none |
PM10_T1_log | Particulate Matter 10 in Time 1 | none | Logarithm |
SO2_T1_log | Sulfur Dioxide in Time 1 | none | Logarithm |
NO2_T1_log | Nitrogen Dioxide in Time 1 | none | Logarithm |
CO_T1_log | Carbon Monoxide in Time 1 | median | Logarithm |
EQI_Land | Environmental Quality Index, Land Domain | median | none |
EQI_SocioD | Environmental Quality Index, Socio-Demographic Domain | none | none |
EQI_Built | Environmental Quality Index, Built Domain | none | none |
O3_T1_log | Tropospheric Ozone in Time 1 | none | Logarithm |
CN_log | Cyanide compounds | none | Logarithm |
Diesel_log | Diesel Exhaust | none | Logarithm |
CS2_log | Carbon Disulfide | none | Logarithm |
EQI_Air | Environmental Quality Index, Air Domain | none | none |
EQI_Water | Environmental Quality Index, Water Domain | none | none |
PM2.5_T2 | Particulate Matter 2.5 in Time 2 | none | none |
PM10_T2 | Particulate Matter 10 in Time 2 | median | none |
SO2_T2_log | Sulfur Dioxide in Time 2 | none | Logarithm |
CO_T2 | Carbon Monoxide in Time 2 | none | none |
O3_T2 | Tropospheric Ozone in Time 2 | median | none |
NO2_T2 | Nitrogen Dioxide in Time 2 | --------- | --------- |
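The two preprocessing steps listed in the table above, median imputation and a logarithmic transformation, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the authors' pipeline: the source does not state the base of the logarithm (natural log is assumed here), the helper names are hypothetical, and the sample readings are made up.

```python
import math
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def log_transform(values):
    """Natural log of each value; assumes every value is strictly positive."""
    return [math.log(v) for v in values]


# Hypothetical county-level carbon monoxide readings with one missing value
co_t1 = [12.0, None, 9.0, 20.0, 15.0]

co_t1_imputed = impute_median(co_t1)      # the gap is filled with the median
co_t1_log = log_transform(co_t1_imputed)  # analogous to a "CO_T1_log" column
```

Imputing before transforming keeps the filled value on the original measurement scale; the order actually used in the study is not stated in the table.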
Var. Type | Variable | Description | Min. | 1 Q | Median | Mean | 3 Q | Max. |
---|---|---|---|---|---|---|---|---|
Target | Lung Cancer | Lung/Bronchus Cancer Rate | 14.600 | 56.800 | 65.360 | 66.220 | 75.700 | 132.400 |
Baseline | Adult Smoking | Current Adult Smokers (%) | 0.000 | 0.173 | 0.207 | 0.210 | 0.243 | 0.425 |
Macro Variables | EQI_Air | Environmental Quality Index, Air Domain | −2.532 | −0.349 | 0.177 | 0.147 | 0.692 | 2.508 |
Macro Variables | EQI_Built | Environmental Quality Index, Built Domain | −3.993 | −0.408 | 0.177 | 0.119 | 0.672 | 7.283 |
Macro Variables | EQI_Land | Environmental Quality Index, Land Domain | −3.149 | −0.395 | 0.207 | 0.078 | 0.672 | 2.095 |
Macro Variables | EQI_SocioD | Environmental Quality Index, Socio-Demographic Domain | −3.331 | −0.584 | 0.022 | 0.027 | 0.570 | 3.979 |
Macro Variables | EQI_Water | Environmental Quality Index, Water Domain | −1.701 | −0.614 | 0.359 | 0.063 | 0.889 | 1.478 |
Micro Variables | CN_log | Cyanide compounds | −3.743 | −2.118 | −1.812 | −1.842 | −1.523 | −0.022 |
Micro Variables | CO_T1_log | Carbon Monoxide | 0.650 | 2.248 | 2.555 | 2.503 | 2.944 | 3.800 |
Micro Variables | CO_T2 | Carbon Monoxide | 0.267 | 1.191 | 1.558 | 1.691 | 1.900 | 7.020 |
Micro Variables | CS2_log | Carbon Disulfide | −6.900 | −3.875 | −3.436 | −3.429 | −2.975 | 0.361 |
Micro Variables | Diesel_log | Diesel Exhaust | −1.773 | −0.711 | −0.526 | −0.539 | −0.356 | 0.495 |
Micro Variables | NO2_T1_log | Nitrogen Dioxide | 1.306 | 2.383 | 2.657 | 2.632 | 2.905 | 3.818 |
Micro Variables | NO2_T2 | Nitrogen Dioxide | 1.000 | 7.811 | 8.700 | 9.231 | 11.125 | 24.400 |
Micro Variables | O3_T1_log | Tropospheric Ozone | 2.341 | 3.456 | 3.641 | 3.598 | 3.810 | 4.876 |
Micro Variables | O3_T2 | Tropospheric Ozone | 0.053 | 0.069 | 0.072 | 0.071 | 0.075 | 0.090 |
Micro Variables | PM10_T1_log | Particulate Matter 10 | 1.030 | 2.129 | 2.452 | 2.406 | 2.692 | 3.678 |
Micro Variables | PM10_T2 | Particulate Matter 10 | 10.000 | 19.420 | 21.990 | 22.210 | 23.700 | 40.200 |
Micro Variables | PM2.5_T1 | Particulate Matter 2.5 | 2.167 | 7.940 | 10.417 | 9.941 | 11.782 | 16.912 |
Micro Variables | PM2.5_T2 | Particulate Matter 2.5 | 4.500 | 9.743 | 11.171 | 10.855 | 12.419 | 17.150 |
Micro Variables | SO2_T1_log | Sulfur Dioxide | 0.251 | 1.679 | 2.154 | 2.035 | 2.478 | 3.569 |
Micro Variables | SO2_T2_log | Sulfur Dioxide | 1.000 | 22.000 | 33.000 | 36.980 | 49.000 | 98.000 |
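Each row of the table above is a six-number summary (minimum, first quartile, median, mean, third quartile, maximum). A minimal sketch of how one such row could be computed with the Python standard library follows. The sample values are invented, the function name is hypothetical, and the quartile convention (linear interpolation via `method="inclusive"`) is an assumption, since the source does not say which convention was used.

```python
from statistics import mean, median, quantiles


def six_number_summary(values):
    """Return (min, Q1, median, mean, Q3, max) for a list of numbers."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median(values), mean(values), q3, max(values))


# Hypothetical adult smoking proportions for five counties (not from the data set)
sample = [0.15, 0.18, 0.21, 0.24, 0.30]
summary = six_number_summary(sample)
```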
ID | Meth. | Variable Groups | Train (80%) adj. R2 | Train (80%) RMSE | Train (80%) MAE | Train (80%) MAPE | Test (20%) RMSE | Test (20%) MAE | Test (20%) MAPE |
---|---|---|---|---|---|---|---|---|---|
1 | LR | smoking + state + EQI + emissions | 0.617 | 10.067 | 7.507 | 12.133 | 11.155 | 8.281 | 13.901 |
2 | LR | smoking + state + EQI | 0.612 | 10.168 | 7.556 | 12.241 | 11.167 | 8.259 | 13.858 |
3 | LR | smoking + state + emissions | 0.602 | 10.273 | 7.697 | 12.478 | 11.416 | 8.579 | 14.332 |
4 | LR | state | 0.527 | 11.239 | 8.198 | 13.324 | 11.792 | 8.664 | 14.414 |
5 | LR | emissions | 0.322 | 13.543 | 10.494 | 17.089 | 13.996 | 10.818 | 18.316 |
6 | LR | smoking | 0.308 | 13.724 | 10.367 | 17.401 | 14.777 | 11.083 | 19.289 |
7 | LR | EQI | 0.211 | 14.639 | 11.098 | 18.633 | 15.297 | 11.429 | 20.338 |
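The models in the table above are fit on 80% of the observations and evaluated on the remaining 20%. A minimal sketch of such a hold-out split follows. It assumes a simple random shuffle with a fixed seed; the source does not say whether the split was random, stratified by state, or seeded, and the function and county names are hypothetical.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = (len(shuffled) * 4) // 5         # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical county identifiers
counties = [f"county_{i}" for i in range(10)]
train, test = split_80_20(counties)
```

Reporting metrics on both partitions, as the table does, makes overfitting visible as a gap between the training and test error.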
ID | Meth. | Variable Groups | Train (80%) adj. R2 | Train (80%) RMSE | Train (80%) MAE | Train (80%) MAPE | Test (20%) RMSE | Test (20%) MAE | Test (20%) MAPE |
---|---|---|---|---|---|---|---|---|---|
8 | GBT | smoking + state + EQI | 0.611 | 10.340 | 7.611 | 12.334 | 9.976 | 7.377 | 12.054 |
9 | SVM | smoking + state + EQI + emissions | 0.634 | 10.026 | 7.335 | 11.833 | 10.027 | 7.401 | 12.063 |
10 | RF | smoking + state + EQI + emissions | 0.639 | 9.960 | 7.268 | 11.926 | 10.068 | 7.314 | 11.977 |
11 | GBT | smoking + state + EQI + emissions | 0.625 | 10.151 | 7.445 | 12.132 | 10.239 | 7.535 | 12.252 |
12 | RR | smoking + state + EQI + emissions | 0.600 | 10.486 | 7.741 | 12.667 | 10.314 | 7.822 | 12.881 |
13 | RR | smoking + state + EQI | 0.598 | 10.507 | 7.758 | 12.688 | 10.322 | 7.793 | 12.784 |
14 | RF | smoking + state + EQI | 0.584 | 10.684 | 7.814 | 12.932 | 10.383 | 7.627 | 12.570 |
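The four metrics reported in the tables above (adjusted R2, RMSE, MAE, MAPE) have standard textbook definitions, sketched below in plain Python. Two points are assumptions: MAPE is taken to be expressed in percent, and `n_predictors` is a hypothetical argument for the number of model coefficients excluding the intercept. The example values are invented and do not reproduce any row of the tables.

```python
import math


def rmse(y, yhat):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))


def mae(y, yhat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)


def mape(y, yhat):
    """Mean absolute percentage error, in percent; assumes no true value is 0."""
    return 100.0 * sum(abs(a - b) / abs(a) for a, b in zip(y, yhat)) / len(y)


def adjusted_r2(y, yhat, n_predictors):
    """R2 penalized for the number of predictors."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical observed and predicted lung cancer rates per 100,000 people
observed = [60.0, 70.0, 80.0, 50.0]
predicted = [63.0, 66.0, 80.0, 52.0]

scores = {
    "adj_r2": adjusted_r2(observed, predicted, n_predictors=1),
    "rmse": rmse(observed, predicted),
    "mae": mae(observed, predicted),
    "mape": mape(observed, predicted),
}
```

RMSE penalizes large errors more heavily than MAE, so the two can rank models differently, which the test columns of the second table illustrate.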
Predictor Variables | Linear Model (adj. R2, RMSE) | Non-Linear Model (adj. R2, RMSE) |
---|---|---|
smoking + state + EQI | Linear Regression (0.612, 11.167) | Gradient Boosted Tree: (0.611, 9.976) |
smoking + state + EQI + Emissions | Linear Regression (0.617, 11.155) | Random Forest (0.639, 10.068) |
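To put the comparison above on a common scale, the RMSE values can be converted into a relative reduction of the non-linear model over linear regression. The short sketch below uses the RMSE figures from the table; the helper name is hypothetical, and the percentage is an illustrative derived quantity rather than a result reported in the source.

```python
def pct_reduction(baseline, candidate):
    """Percentage by which candidate is lower than baseline."""
    return 100.0 * (baseline - candidate) / baseline


# RMSE values copied from the comparison table
lr_eqi, gbt_eqi = 11.167, 9.976   # smoking + state + EQI
lr_all, rf_all = 11.155, 10.068   # smoking + state + EQI + emissions

reduction_eqi = pct_reduction(lr_eqi, gbt_eqi)  # roughly 10.7
reduction_all = pct_reduction(lr_all, rf_all)   # roughly 9.7
```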
Ambient Emission | Anthropogenic Sources |
---|---|
Particulate Matter | Combustion of carbon-based fuels. Smokestacks; power plants, automobiles. Diesel- and gasoline-powered motor vehicles and equipment; burning wood in residential fireplaces, wood stoves, wildfires, agricultural and other fires. Cement dust, fly ash, oil smoke, and smog from construction sites, unpaved roads and fields [31]. |
Sulfur Dioxide | Fuel combustion in mobile sources, e.g., automobiles, locomotives, ships, and other equipment; burning of fossil fuels (coal, oil, and diesel) or other materials that contain sulfur at power plants and other industrial facilities. Smelting of mineral ores (aluminum, copper, zinc, lead, and iron) that contain sulfur. Eastern states have more sulfate particles than the West, mostly because of sulfur dioxide emitted by large, coal-fired power plants [32]. |
Carbon Monoxide and Tropospheric Ozone | Burning of fossil fuels (gasoline, natural gas, oil, coal, and wood) in vehicles or machinery. Poorly vented gas appliances (furnaces, ranges, ovens, water heaters, clothes dryers, etc.), many in the home. |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kamis, A.; Cao, R.; He, Y.; Tian, Y.; Wu, C. Predicting Lung Cancer in the United States: A Multiple Model Examination of Public Health Factors. Int. J. Environ. Res. Public Health 2021, 18, 6127. https://doi.org/10.3390/ijerph18116127