Approaches to Extracting Patterns of Service Utilization for Patients with Complex Conditions: Graph Community Detection vs. Natural Language Processing Clustering
Abstract
1. Introduction
1.1. Clinical Practice Guidelines, Clinical Pathways, and Services Pathways
1.2. Use of Graph Analytics for Healthcare Data
1.3. Use of NLP in Analyzing Healthcare Data
1.4. Objectives
- To what extent can NLP methods be used to extract PSUs from longitudinal heterogeneous cross-continuum healthcare data? How do the data need to be modeled and what data pre-processing needs to be conducted to generate the base data upon which the NLP methods can be applied?
- Are the results from NLP clustering for Service Classes similar to those obtained using graph community detection? Are they judged to be similar by clinical subject matter experts (SMEs) or clinical/administrative service system operations experts (SSOEs)?
- Does a hybrid NLP–graph community detection approach generate meaningful results, and how do the results compare to (a) community detection results, with simple frequency-based edge-weighted projections of service-service interactions, and (b) results obtained using NLP-based clustering approaches, employing measures of cosine similarity between vectors reflecting patient journeys?
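The two ways of relating Service Classes named in the objectives can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: the toy `journeys` data, the function names, and the choice to count co-occurrence once per patient are all assumptions made for the example. It contrasts a frequency-based edge weight between two Service Classes with the cosine similarity between their per-patient utilization vectors.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy data: patient ID -> sequence of Service Class IDs (a "journey").
journeys = {
    "p1": [33, 22, 23, 24],
    "p2": [33, 22, 22, 165],
    "p3": [23, 24, 33],
    "p4": [78],
}

def cooccurrence_weights(journeys):
    """Frequency-based edge weights: the number of patients whose journey
    contains both Service Classes (assumed weighting scheme)."""
    weights = Counter()
    for services in journeys.values():
        for a, b in combinations(sorted(set(services)), 2):
            weights[(a, b)] += 1
    return weights

def service_vectors(journeys):
    """One vector per Service Class: encounter counts for each patient,
    with patients in sorted order."""
    patients = sorted(journeys)
    classes = sorted({s for seq in journeys.values() for s in seq})
    return {c: [journeys[p].count(c) for p in patients] for c in classes}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

weights = cooccurrence_weights(journeys)
vectors = service_vectors(journeys)
```

In this toy data, Service Classes 23 and 24 are used by exactly the same patients, so their vectors point in the same direction, while 33 and 78 share no patients and have zero similarity. A hybrid approach could, under the same assumptions, use such similarity values as edge weights in place of the raw counts.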
2. Methodological Approach
2.1. Concurrent Validation via Application of Multiple Methods to the Same Body of Data, Modeled in Different Ways
2.2. Source Data
2.3. Data Pre-Processing
2.4. Use of Community Detection in Extracting PSUs
2.5. Use of NLP in Extracting PSUs
2.6. Combining Community Detection with the NLP to Extract PSUs
3. Analysis
3.1. Analysis Using Community Detection
3.2. Analysis Using NLP Methods with K-Means and Hierarchical Clustering
4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Service Class ID | Service Class Label | Internal Weighted Degree | External Weighted Degree |
---|---|---|---|
33 | MHSU-Clinical Intake-Adult | 2724 | 24,011 |
22 | MHSU-Addictions-Clinic-Adult-Ambulatory | 2655 | 18,690 |
23 | MHSU-Addictions-Withdrawal Management (Detox)-Adults | 1934 | 10,699 |
24 | MHSU-Addictions-Post-Withdrawal Stabilization-Residential-Adults | 1439 | 6762 |
275 | COVID-19 MHSU Health Monitoring | 398 | 2607 |
165 | MHSU-Shared Care or Collaborative Care | 325 | 1814 |
40 | MHSU-Personality Disorders Therapy (DBT) | 10 | 40 |
284 | Surgery-Day Care Antimicrobial Therapy | 6 | 18 |
78 | Med/Surg Intensive Acute Care-Neo-Natal | 3 | 9 |
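The two degree columns in the table above can be read as follows, assuming the usual definitions: a node's internal weighted degree sums the weights of its edges to other members of its own community, and its external weighted degree sums the weights of its edges to nodes outside that community. The sketch below illustrates this under those assumed definitions; the edge list, its weights, and the function name are hypothetical and are not taken from the study data.

```python
def weighted_degrees(edges, community):
    """Split each community member's weighted degree into an internal part
    (edges to other members) and an external part (edges to non-members).

    edges: dict mapping an undirected pair (a, b) to a weight; no self-loops assumed.
    community: set of node IDs forming one detected community.
    """
    internal = {node: 0.0 for node in community}
    external = {node: 0.0 for node in community}
    for (a, b), w in edges.items():
        # Visit the edge from both endpoints so each member is credited once.
        for node, other in ((a, b), (b, a)):
            if node in community:
                if other in community:
                    internal[node] += w
                else:
                    external[node] += w
    return internal, external

# Hypothetical toy graph over Service Class IDs.
edges = {
    (33, 22): 5,
    (33, 23): 3,
    (22, 23): 2,
    (33, 78): 4,
    (23, 284): 1,
}
community = {33, 22, 23}
internal, external = weighted_degrees(edges, community)
```

A member with high internal degree is strongly tied to the rest of its community, while a high external degree indicates that the Service Class also connects heavily to services outside it.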
Service Class ID | Service Class Label | Similarity Percentage |
---|---|---|
33 | MHSU-Clinical Intake-Adult | 9.36 |
22 | MHSU-Addictions-Clinic-Adult-Ambulatory | 7.92 |
23 | MHSU-Addictions-Withdrawal Management (Detox)-Adults | 6.72 |
203 | Overdose-Related Services | 6.19 |
34 | MHSU-Addictions-Clinical Intake-Adult | 5.88 |
24 | MHSU-Addictions-Post-Withdrawal Stabilization-Residential-Adults | 5.44 |
Service Class ID | Service Class Label | Internal Weighted Degree | External Weighted Degree |
---|---|---|---|
23 | MHSU-Addictions-Withdrawal Management (Detox)-Adults | 2.3489 | 8.0714 |
34 | MHSU-Addictions-Clinical Intake-Adult | 2.3551 | 6.6388 |
33 | MHSU-Clinical Intake-Adult | 2.141 | 12.9558 |
22 | MHSU-Addictions-Clinic-Adult-Ambulatory | 1.9415 | 10.6506 |
24 | MHSU-Addictions-Post-Withdrawal Stabilization-Residential-Adults | 1.8488 | 5.9565 |
165 | MHSU-Shared Care or Collaborative Care | 0.8368 | 4.3058 |
21 | MHSU-Addictions-Sobering & Assessment Centre | 0.5759 | 1.642 |
67 | MHSU-Perinatal Mental Health | 0.2168 | 0.6528 |
 | Graph Community Detection | NLP + K-Means Clustering | NLP + Hierarchical Clustering | NLP + Community Detection |
---|---|---|---|---|
Common Service Classes | 22 | 22 | 22 | 22 |
 | 23 | 23 | 23 | 23 |
 | 24 | 24 | 24 | 24 |
 | 33 | 33 | 33 | 33 |
 | 34 | 34 | 34 | |
 | 165 | 165 | | |
 | 203 | 203 | | |
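A comparison like the one tabulated above can be made quantitative by treating each method's output as a set of Service Class IDs. The sketch below is one possible way to do that, not the procedure used in the study: the Jaccard index is an assumption introduced here, and the sets in `results` are illustrative placeholders rather than the per-method results.

```python
# Hypothetical result sets: method name -> Service Class IDs in the cluster of interest.
# The IDs are placeholders for illustration only.
results = {
    "graph_cd": {1, 2, 3, 4, 6},
    "nlp_kmeans": {1, 2, 3, 4, 5, 7},
    "nlp_hierarchical": {1, 2, 3, 4, 5, 7},
    "nlp_plus_cd": {1, 2, 3, 4, 5, 6},
}

def common_to_all(results):
    """Service Classes that every method placed in the cluster."""
    return set.intersection(*results.values())

def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

shared = common_to_all(results)
overlap = jaccard(results["graph_cd"], results["nlp_kmeans"])
```

The shared set corresponds to rows where every column is filled, and pairwise Jaccard values summarize how closely any two methods agree.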
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Bambi, J.; Sadri, H.; Moselle, K.; Chang, E.; Santoso, Y.; Howie, J.; Rudnick, A.; Elliott, L.T.; Kuo, A. Approaches to Extracting Patterns of Service Utilization for Patients with Complex Conditions: Graph Community Detection vs. Natural Language Processing Clustering. BioMedInformatics 2024, 4, 1884-1900. https://doi.org/10.3390/biomedinformatics4030103