Prior Knowledge-Based Causal Inference Algorithms and Their Applications for China COVID-19 Analysis
Abstract
1. Introduction
1.1. Related Work
1.1.1. Time Series Analysis with Causality
1.1.2. Causal Model Based on Bayesian Networks
1.1.3. Mutual Causality Model
Causal Learning Algorithms
Intermediary Feedback Mechanism and Loop
Non-Intermediary Feedback Mechanism and Mutual Causality
1.2. Motivations
1.3. Our Contributions
- To address the random exhaustive search that traditional Bayesian causal identification algorithms rely on, we designed PC+, an iterative algorithm with incrementally growing prior knowledge. PC+ uses causal-tendency relationships between variables as prior knowledge to produce a deterministic causal graph, and identifies mutual causality through iterative calculation, continuous data pruning, and gradual fixing of causal edges.
- To address the problem that time-dependent causality among multiple variables cannot be identified, we proposed DCM, an algorithm based on d-separation that can handle causal variables with time characteristics. The DCM algorithm uses d-separation in conjunction with conditional independence to remove indirect causal effects from the estimate of direct causality, yielding more accurate causal identification.
- Our proposed algorithms reduce runtime cost and improve accuracy, which makes them easier to apply to causality discovery in medicine, finance, and other important fields. In this paper, we apply them to discovering causality in COVID-19.
- We collected COVID-19 data for China and built a causal model of the impact of social response policies on COVID-19. We evaluated the proposed PC+ and DCM algorithms using the daily number of new confirmed cases as the outcome variable, with policy measures and their social impact, existing environmental conditions, and demographic structure as the independent variables.
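The idea behind the first two contributions, pruning edges by conditional independence while keeping edges fixed by prior knowledge, can be illustrated with a minimal sketch. This is not the PC+ or DCM implementation from the paper: it is a generic PC-style skeleton search, and the function names (`partial_corr`, `is_independent`, `skeleton_with_prior`), the `fixed_edges` parameter, and the Fisher z-test (which assumes roughly Gaussian data) are all assumptions made for illustration.

```python
import itertools
import math

import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(corr[np.ix_(idx, idx)])  # precision matrix of the subset
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def is_independent(corr, n, i, j, cond, alpha):
    """Fisher z-test for zero partial correlation (Gaussian assumption)."""
    r = max(min(partial_corr(corr, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return p_value > alpha


def skeleton_with_prior(data, fixed_edges=(), alpha=0.05):
    """PC-style skeleton search; edges in fixed_edges (hypothetical prior
    knowledge) are never tested and therefore never removed."""
    n, d = data.shape
    corr = np.corrcoef(data, rowvar=False)
    fixed = {frozenset(e) for e in fixed_edges}
    adj = {i: set(range(d)) - {i} for i in range(d)}
    level = 0  # size of the conditioning set
    while any(len(adj[i]) - 1 >= level for i in adj):
        for i in range(d):
            for j in sorted(adj[i]):
                if frozenset((i, j)) in fixed:
                    continue  # prior knowledge: keep this edge
                others = sorted(adj[i] - {j})
                for cond in itertools.combinations(others, level):
                    if is_independent(corr, n, i, j, cond, alpha):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
        level += 1
    return {frozenset((i, j)) for i in adj for j in adj[i]}
```

On data generated from a chain X → Y → Z, such a search would typically drop the X–Z edge once it conditions on Y, whereas passing `fixed_edges=[(0, 2)]` keeps that edge regardless of the test result. Fixing edges also shrinks the set of tests to run, which is one plausible reading of why prior knowledge reduces runtime.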
1.4. Paper Organization
2. Improved Algorithms for Causal Inference
2.1. PC+ Algorithm
2.1.1. Introduction of PC Algorithm
2.1.2. Establishment of Prior Knowledge
2.1.3. Calculation of Causal Effect
2.1.4. Iterative Calculation Process
2.2. DCM Algorithm
2.2.1. Introduction of CCM Algorithm
2.2.2. DCM Iterations and Improvements
3. Experimental Results
3.1. Datasets
3.2. Feasibility of PC+ Algorithm
3.3. Efficiency of PC+ Algorithm
3.4. Accuracy
4. Conclusions and Future Works
Author Contributions
Funding
Conflicts of Interest
References
- Stuart, E.A. Matching methods for causal inference: A review and a look forward. Stat. Sci. A Rev. J. Inst. Math. Stat. 2010, 25, 1.
- Athey, S.; Imbens, G.W. Machine learning methods for estimating heterogeneous causal effects. Stat 2015, 1050, 1–26.
- Kaddour, J.; Lynch, A.; Liu, Q.; Kusner, M.J.; Silva, R. Causal Machine Learning: A Survey and Open Problems. arXiv 2022, arXiv:2206.15475.
- Yao, L.; Chu, Z.; Li, S.; Li, Y.; Gao, J.; Zhang, A. A survey on causal inference. ACM Trans. Knowl. Discov. Data (TKDD) 2021, 15, 1–46.
- Bonner, S.; Vasile, F. Causal embeddings for recommendation. In Proceedings of the 12th ACM Conference on Recommender Systems, Vancouver, BC, Canada, 2–7 October 2018; pp. 104–112.
- Uri, S. Can we learn individual-level treatment policies from clinical data? Biostatistics 2019, 21, 359–362.
- Zhao, S.; Heffernan, N. Estimating Individual Treatment Effect from Educational Studies with Residual Counterfactual Networks. In Proceedings of the 10th International Conference on Educational Data Mining (EDM), Wuhan, China, 25–28 June 2017.
- McDuff, D.; Song, Y.; Lee, J.; Vineet, V.; Vemprala, S.; Gyde, N.A.; Salman, H.; Ma, S.; Sohn, K.; Kapoor, A. Causalcity: Complex simulations with agency for causal discovery and reasoning. In Proceedings of the Conference on Causal Learning and Reasoning, PMLR, Eureka, CA, USA, 11–13 April 2022; pp. 559–575.
- Zhao, T.; Liu, G.; Wang, D.; Yu, W.; Jiang, M. Learning from counterfactual links for link prediction. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 26911–26926.
- Khan, N.; Haq, I.U.; Ullah, F.U.M.; Khan, S.U.; Lee, M.Y. CL-Net: ConvLSTM-Based Hybrid Architecture for Batteries’ State of Health and Power Consumption Forecasting. Mathematics 2021, 9, 3326.
- Granger, C.W. Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc. 1969, 37, 424–438.
- Pearl, J. Causal inference in statistics: An overview. Stat. Surv. 2009, 3, 96–146.
- Mooij, J.M.; Claassen, T. Constraint-Based Causal Discovery In The Presence Of Cycles. arXiv 2020, arXiv:2005.00610.
- Geiger, D.; Verma, T.; Pearl, J. d-separation: From theorems to algorithms. In Machine Intelligence and Pattern Recognition; Elsevier: Amsterdam, The Netherlands, 1990; pp. 139–148.
- Spirtes, P.; Glymour, C.N. Causation, Prediction, and Search; MIT Press: Cambridge, MA, USA, 2000.
- Jensen, F.V. An Introduction to Bayesian Networks; UCL Press: London, UK, 1996; Volume 210.
- Xuan, N.; Julien, V.; Wales, S.; Bailey, J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 2010, 18, 2837.
- Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting causality in complex ecosystems. Science 2012, 338, 496–500.
- Steiger, E.; Mußgnug, T.; Kroll, L.E. Causal analysis of COVID-19 observational data in German districts reveals effects of mobility, awareness, and temperature. medRxiv 2020.
- Chang, M.C.; Kahn, R.; Li, Y.A.; Lee, C.S.; Buckee, C.O.; Chang, H.H. Modeling the impact of human mobility and travel restrictions on the potential spread of SARS-CoV-2 in Taiwan. medRxiv 2020.
- Mazzoli, M.; Mateo, D.; Hernando, A.; Meloni, S.; Ramasco, J.J. Effects of mobility and multi-seeding on the propagation of the COVID-19 in Spain. medRxiv 2020.
- Chinazzi, M.; Davis, J.T.; Ajelli, M.; Gioannini, C.; Litvinova, M.; Merler, S.; Pastore y Piontti, A.; Mu, K.; Rossi, L.; Sun, K.; et al. The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science 2020, 368, 395–400.
- Kraemer, M.U.; Yang, C.H.; Gutierrez, B.; Wu, C.H.; Klein, B.; Pigott, D.M. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 2020, 368, 493–497.
- Ayyoubzadeh, S.M.; Ayyoubzadeh, S.M.; Zahedi, H.; Ahmadi, M.; Kalhori, S.R.N. Predicting COVID-19 incidence through analysis of google trends data in Iran: Data mining and deep learning pilot study. JMIR Public Health Surveill. 2020, 6, e18828.
- Effenberger, M.; Kronbichler, A.; Shin, J.I.; Mayer, G.; Tilg, H.; Perco, P. Association of the COVID-19 pandemic with internet search volumes: A Google Trends™ analysis. Int. J. Infect. Dis. 2020, 95, 192–197.
- Yuan, X.; Xu, J.; Hussain, S.; Wang, H.; Gao, N.; Zhang, L. Trends and prediction in daily new cases and deaths of COVID-19 in the United States: An internet search-interest based model. Explor. Res. Hypothesis Med. 2020, 5, 1.
- Li, C.; Chen, L.J.; Chen, X.; Zhang, M.; Pang, C.P.; Chen, H. Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020. Eurosurveillance 2020, 25, 2000199.
- Zhou, W.; Wang, A.; Xia, F.; Xiao, Y.; Tang, S. Effects of media reporting on mitigating spread of COVID-19 in the early phase of the outbreak. Math. Biosci. Eng. 2020, 17, 2693–2707.
- Bannister-Tyrrell, M.; Meyer, A.; Faverjon, C.; Cameron, A. Preliminary evidence that higher temperatures are associated with lower incidence of COVID-19, for cases reported globally up to 29th February 2020. medRxiv 2020.
- Auler, A.C.; Cássaro, F.A.M.; Da Silva, V.O.; Pires, L.F. Evidence that high temperatures and intermediate relative humidity might favor the spread of COVID-19 in tropical climate: A case study for the most affected Brazilian cities. Sci. Total Environ. 2020, 729, 139090.
- Wu, Y.; Jing, W.; Liu, J.; Ma, Q.; Yuan, J.; Wang, Y.; Du, M.; Liu, M. Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries. Sci. Total Environ. 2020, 729, 139051.
Edge Number | PC_All | PC+ |
---|---|---|
8 | 0.23 | 0.1 |
9 | 0.45 | 0.2 |
10 | 52,301 | 2358 |
11 | 230,127 | 16,282 |
12 | 6,589,924 | 256,329 |
Factors | Literature Evidence |
---|---|
Mobility | Modeling the impact of human mobility and travel restrictions on the potential spread of SARS-CoV-2 in Taiwan [20]. |
 | Effects of mobility and multi-seeding on the propagation of the COVID-19 in Spain [21]. |
Emergency response | The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak [22]. |
 | The effect of human mobility and control measures on the COVID-19 epidemic in China [23]. |
Search index | Predicting COVID-19 incidence through analysis of Google trends data in Iran: Data mining and deep learning pilot study [24]. |
 | Association of the COVID-19 pandemic with Internet search volumes: A Google Trends™ analysis [25]. |
 | Trends and prediction in daily new cases and deaths of COVID-19 in the United States: An Internet search-interest based model [26]. |
Media index | Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020 [27]. |
 | Effects of media reporting on mitigating spread of COVID-19 in the early phase of the outbreak [28]. |
Temperature | Preliminary evidence that higher temperatures are associated with lower incidence of COVID-19, for cases reported globally up to 29 February 2020 [29]. |
 | Evidence that high temperatures and intermediate relative humidity might favor the spread of COVID-19 in tropical climate: A case study for the most affected Brazilian cities [30]. |
 | Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries [31]. |
Model | Guide Edge | Directed Edge | Discrimination |
---|---|---|---|
PC_Random | 12 | 4 | 33% |
CCM | 15 | 9 | 60% |
DCM | 13 | 10 | 76% |
PC+ | 12 | 11 | 91% |
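
The Discrimination column appears consistent with the ratio of directed edges to guide edges, truncated to a whole percent. This reading is inferred from the table values and is not stated explicitly here; the short sketch below, with the hypothetical helper `discrimination`, only checks that arithmetic.

```python
# (guide edges, directed edges) per model, copied from the table above
rows = {
    "PC_Random": (12, 4),
    "CCM": (15, 9),
    "DCM": (13, 10),
    "PC+": (12, 11),
}


def discrimination(guide_edges, directed_edges):
    """Directed edges as a share of guide edges, truncated to a whole percent
    (assumed definition, inferred from the reported values)."""
    return 100 * directed_edges // guide_edges


for model, (guide, directed) in rows.items():
    print(f"{model}: {discrimination(guide, directed)}%")
```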
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, H.; Hai, M.; Tang, W. Prior Knowledge-Based Causal Inference Algorithms and Their Applications for China COVID-19 Analysis. Mathematics 2022, 10, 3568. https://doi.org/10.3390/math10193568