Similarity Analysis in Understanding Online News in Response to Public Health Crisis
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection
2.2. Data Cleaning and Pre-Processing
2.3. Application of a Clustering Algorithm to Similar Texts
- For each point in the dataset, the algorithm searches for neighbors within the distance ϵ and forms a cluster when at least s elements are found.
- For each cluster formed, the algorithm tries to expand it by adding more data points that do not yet belong to another cluster.
- For each point not in a cluster, if it lies within the distance ϵ of some cluster, that point is added to the cluster; otherwise, the point is labeled as noise.
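The three steps above can be illustrated with a minimal density-based clustering sketch in the style of DBSCAN. It is not the authors' implementation: the function names `region_query` and `density_cluster`, the Euclidean distance on 2-D points, and the toy values of ϵ (`eps`) and s are assumptions made purely for illustration; in a text setting the points would presumably be document vectors compared with a similarity-based distance.

```python
from math import dist  # Euclidean distance (assumed metric, for illustration only)

NOISE = -1  # label for points that end up in no cluster


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def density_cluster(points, eps, s):
    """Assign a cluster label to every point; NOISE marks unclustered points."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < s:
            # Too few neighbors: provisionally noise, may be claimed by a cluster later.
            labels[i] = NOISE
            continue
        # Step 1: enough neighbors within eps, so a new cluster is formed.
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        # Step 2: expand the cluster with points not yet assigned to another cluster.
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                # Step 3: a former noise point within eps of the cluster joins it.
                labels[j] = cluster_id
                continue
            if labels[j] is not None:
                continue  # already belongs to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= s:
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Toy data: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels = density_cluster(points, eps=0.5, s=2)
print(labels)  # [0, 0, 0, 1, 1, -1]
```

With these toy values, the first three points form cluster 0, the next two form cluster 1, and the isolated point is labeled as noise. Smaller values of ϵ or larger values of s would leave more points unclustered.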
2.4. Limitations
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Cluster | N1 | N2 |
---|---|---|
51 | | |
54 | | |
Cluster | Number | Cluster | Number | Cluster | Number |
---|---|---|---|---|---|
0 | 9 | 20 | 3 | 40 | 2 |
1 | 2 | 21 | 2 | 41 | 2 |
2 | 3 | 22 | 2 | 42 | 2 |
3 | 2 | 23 | 2 | 43 | 3 |
4 | 2 | 24 | 2 | 44 | 3 |
5 | 2 | 25 | 2 | 45 | 3 |
6 | 2 | 26 | 2 | 46 | 2 |
7 | 2 | 27 | 6 | 47 | 2 |
8 | 2 | 28 | 2 | 48 | 2 |
9 | 2 | 29 | 2 | 49 | 2 |
10 | 2 | 30 | 5 | 50 | 2 |
11 | 2 | 31 | 2 | 51 | 5 |
12 | 3 | 32 | 2 | 52 | 2 |
13 | 2 | 33 | 2 | 53 | 3 |
14 | 2 | 34 | 2 | 54 | 3 |
15 | 2 | 35 | 3 | 55 | 2 |
16 | 6 | 36 | 2 | 56 | 2 |
17 | 2 | 37 | 2 | 57 | 6 |
18 | 4 | 38 | 2 | 58 | 3 |
19 | 2 | 39 | 2 | 59 | 2 |
Time Period | News Reports (Number) | News Reports (%) | Clusters (Number) | Clusters (%) |
---|---|---|---|---|
1 January 2015–21 November 2018 | 164 | 26.49% | 18 | 11.46% |
22 November 2018–31 March 2019 | 85 | 13.73% | 24 | 15.29% |
1 April 2019–31 December 2019 | 370 | 59.77% | 115 | 73.25% |
Total | 619 | 100% | 157 | 100% |
Source | Amount |
---|---|
National Media | 68 (14.72%) |
Regional Media | 86 (18.61%) |
Ministry of Health, State and Municipal Health Secretaries | 162 (35.06%) |
Health or Societal organizations or Educational Institutions | 146 (31.60%) |
Total | 462 (100%) |
Category | Number |
---|---|
News | 295 (63.85%) |
Opinion Articles | 73 (15.80%) |
News Reports | 71 (15.37%) |
Other | 23 (4.98%) |
Total | 462 (100%) |
Category | Number |
---|---|
Paid Media | 8 (1.73%) |
Spontaneous Media | 282 (61.04%) |
Organic Media | 172 (37.23%) |
Total | 462 (100%) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Cezario, S.; Marques, T.; Pinto, R.; Lacerda, J.; Silva, L.; Santos Lima, T.; Santana, O.; Ribeiro, A.G.; Cruz, A.; Araújo, A.C.; et al. Similarity Analysis in Understanding Online News in Response to Public Health Crisis. Int. J. Environ. Res. Public Health 2022, 19, 17049. https://doi.org/10.3390/ijerph192417049
Cezario S, Marques T, Pinto R, Lacerda J, Silva L, Santos Lima T, Santana O, Ribeiro AG, Cruz A, Araújo AC, et al. Similarity Analysis in Understanding Online News in Response to Public Health Crisis. International Journal of Environmental Research and Public Health. 2022; 19(24):17049. https://doi.org/10.3390/ijerph192417049
Chicago/Turabian Style: Cezario, Sidemar, Thiago Marques, Rafael Pinto, Juciano Lacerda, Lyrene Silva, Thaisa Santos Lima, Orivaldo Santana, Anna Giselle Ribeiro, Agnaldo Cruz, Ana Claudia Araújo, et al. 2022. "Similarity Analysis in Understanding Online News in Response to Public Health Crisis" International Journal of Environmental Research and Public Health 19, no. 24: 17049. https://doi.org/10.3390/ijerph192417049
APA Style: Cezario, S., Marques, T., Pinto, R., Lacerda, J., Silva, L., Santos Lima, T., Santana, O., Ribeiro, A. G., Cruz, A., Araújo, A. C., Miranda, A. E., Cadaxa, A., Teixeira, C., Muñoz, A., & Valentim, R. (2022). Similarity Analysis in Understanding Online News in Response to Public Health Crisis. International Journal of Environmental Research and Public Health, 19(24), 17049. https://doi.org/10.3390/ijerph192417049