1. Introduction
Recognition of a scientific contribution by peers, as recorded by citations, is essential for a scientific author to support career development, increase status in the scientific community, and access financial support for research. International academic journals provide the means to disseminate research findings to the scientific community of specialists in the same topic or domain. However, amid thousands or millions of papers, discoverability becomes crucial in the first place. The title of the journal in which a scientific paper is published, its reputation, and its impact on the relevant research field are all determinants of citation counts [1]. Furthermore, the bibliometrics of a journal, such as the number of articles published and their references and citations measured per year or other chronological period, represent the communication process and ranking among academic journals [2,3]. Of course, the quality of the work, its content, importance, novelty, rationale, methodologies, and findings are the essence of an original scientific research paper. But if it is not searchable, discoverable, retrievable, or accessible to many peers, it will not be easily visible and ultimately citable [4].
Disruptive innovations in current awareness services for scientists, such as Current Contents [5], Chemical Abstracts [6], and Medline [7], and the introduction of new publication system technologies, namely, electronic [8] and open-access [9] ones, have not only fundamentally changed the searchability and discoverability of papers but also increased worldwide scientific output. In biomedicine and health sciences alone, more than seventeen million research papers have been published over the past decade. Together with the thirty million preceding ones, they compete for citations when the yearly average citation output of the decade is more than fifty million [10]. The citation distribution in biomedicine is highly skewed [11], with almost 35% of all biomedical papers remaining uncited [10]. The percentage of documents without citations is higher for recent publications: 42% for the year before last, 60% for last year, and 88% for this year [10]. This indicates increasing competitive pressure for citations among recently published papers.
The increasing enthusiasm of the US public for scientific advances at the turn of the 20th century, together with the insufficient journalistic reporting of research results in that era, led to the formation of the first scientific news media organisation, the publisher of Science News [12]. From this point forward, scientific journalism has expanded worldwide in news media agencies. Scientific sections enrich the content of newspapers, magazines, radio, and television. The Internet era brought news outlets and their stories online, accompanied by blogs and social media reposts [13]. The consumption of biomedical, clinical, biological, and health sciences breakthroughs through news media outlets is to be expected, but it is not free from exaggerations or distortions in the stories or in the follow-up comments and views of the public [14]. Media coverage of biomedical sciences peaked during the COVID-19 pandemic, when the wave of information was described as an infodemic [15]. The haphazard accumulation of contradictory information derived from scholarly peer-reviewed papers and unreviewed preprints confused the public but, at the same time, demonstrated the influential role of news media stories for the people [16]. However, is the impact of news media coverage of the same magnitude for the scientific community as for the public?
Previous reports have applied different methodologies to assess the relationship between the press and scientific publications and its implications for scholarly communication and society. One approach is to focus on news media channels and their content. By analysing their content and its relation to scientific journal reports and topics of interest, conclusions can be drawn about their mutual interactions [17]. Another approach is topic-oriented, in which news stories about a particular subject are analysed to track their source [18,19]. A third approach is to use social media to track their relationships with news and credible informational sources [15] or to check the scientific background and quality of news by text analysis, keywords, or phrases in the scientific literature [20]. In this paper, a scientific literature-driven approach was preferred as the starting point for data collection and analysis.
This work investigates whether the bibliometrics and altmetrics of biomedical papers in the news media outlets’ spotlight significantly differ from those of papers that were bypassed. This work is not focused on specific news media outlets or their types (commercial or non-profit), their scopes (general, political, business, scientific, educational, health, entertainment or lifestyle-oriented, fact-checking), or their biases, but rather on the reference material, namely, the scientific papers reported. A decade-old, middle-to-top impact factor-matched, all open-access bibliographic portfolio of biomedical papers was generated and served as a sample. The portfolio was divided into two subsets according to whether the papers had received news media stories. The identities or types of news media outlets were not considered because their complete, detailed information was unavailable in the database. The bibliometrics and altmetrics of the two groups were compared to show that papers with news stories received more citations, blog reports, X posts, Facebook mentions, Wikipedia references, video references, and Mendeley readers than their matching counterparts without news media attention. The magnitude of the differences was substantially higher for all altmetrics variables than for citations or Mendeley readers. This indicates that news media stories were amplified by the social media responses of the public and that the papers outside the news media spotlight were not so negligible as to warrant being overlooked by the media and left outside of public awareness. The invisibility of a scientific work from the public eye can affect public understanding and opinions. A partially informed society may face the same risks as a misinformed one. In both cases, it would be challenging to make wise decisions for policy formulations in crucial public health issues [21]. Therefore, journalists’ newsworthiness criteria for a scientific report should be carefully considered and evaluated.
3. Results
To investigate the effects of news stories about scientific paper publications on their citations and altmetrics, a bibliographic portfolio of papers with common attributes was generated, as depicted in the flowchart presented in Figure 1. To achieve as much uniformity as possible, all works were required to be original research papers, all open access, published in 2015 in a cohort of biomedical journals with a journal impact factor between 10 and 14 in that year. The information about bibliometrics and altmetrics was manually collected from Digital Science Dimensions® for each report that met the inclusion criteria. This portfolio of 2020 original biomedical research reports was divided into two groups: 882 papers with news stories (44% of the total) and 1138 without (56% of the total). The publications with and without news stories are presented in Figure 2. The percentage of papers receiving news attention per journal varied from 30% (American Journal of Human Genetics) to 52% (Acta Neuropathologica). The documents with news stories received 5121 mentions by news outlets, on average 5.8 ± 9.9 per paper, with a lower quartile (25th percentile) of 1, a median of 3, and an upper quartile (75th percentile) of 7 mentions. The distribution of news stories per journal studied is presented in Figure 3. Except for journals with fewer than five publications with news stories, the journals exhibited similar news story distributions. Notably, outliers received numbers of news stories far above the upper quartile of the journals’ paper distributions. This suggests that these papers’ unique individual characteristics, such as topic, title, authors, and content, strongly contribute to the increased attraction of news outlets, which, for some, is accompanied by increased altmetrics.
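The descriptive statistics reported above (mean ± SD, quartiles, and outliers above the upper quartile) can be reproduced along the following lines. This is a minimal sketch only: the counts are hypothetical and not taken from the study, the function names `describe` and `upper_outliers` are illustrative, and the 1.5 × IQR fence is the common box-plot convention, assumed here because the text does not state which outlier rule was used.

```python
import statistics


def describe(counts):
    """Summarise per-paper counts as mean, sample SD, and quartiles."""
    # method="inclusive" is one of several quartile conventions;
    # other conventions can give slightly different cut points.
    q1, median, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return {
        "mean": statistics.mean(counts),
        "sd": statistics.stdev(counts),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def upper_outliers(counts, k=1.5):
    """Return values above q3 + k * IQR (assumed box-plot convention)."""
    s = describe(counts)
    fence = s["q3"] + k * (s["q3"] - s["q1"])
    return [c for c in counts if c > fence]


# Hypothetical news-story counts for nine papers (NOT data from the study).
news_counts = [1, 1, 2, 3, 3, 5, 7, 9, 40]
summary = describe(news_counts)
outliers = upper_outliers(news_counts)
```

With a strongly right-skewed sample such as this one, the mean and SD are dominated by the outlier, which is why the quartiles are reported alongside them.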
The numbers of papers with citations, FCR, RCR, Altmetric Score, blogs, policy sources, X posts, patent citations, peer review sites, Facebook, Weibo, Wikipedia, mentions in Q & A, Google+, Reddit, videos, faculty opinions, Mendeley, and CiteULike in the groups with and without news stories are listed in Table 1. Nearly all papers in both groups received at least one citation, one X post, or one Mendeley reader, but other altmetrics variables differed markedly: the news stories group contained 3.6-fold more papers with blog mentions, 3.7-fold more with policy sources, twice as many with Google+ mentions, 5-fold more with Reddit mentions, and 2.6-fold more with videos, whilst the rest, namely, patent citations, Facebook, Weibo, Wikipedia, faculty opinions, and CiteULike, were broadly similar.
The totals of these metrics per group are presented in Table 2. Papers with news media coverage exhibited a 60% higher average number of citations per paper, a 70% higher average FCR, and a 69% higher RCR. However, for the altmetrics these differences are much larger, with a 9-fold higher average altmetrics score for papers with news stories than for those without, 7-fold more blogs, 6-fold more policy sources, 2.7-fold more X posts, 50% more patent citations, 3.4-fold more Facebook mentions, 5-fold more mentions in Weibo, twice as many mentions in Wikipedia, 7-fold more mentions in Google+, 6-fold more mentions in Reddit, 6.5-fold more videos, and 60% more Mendeley readers on average. The only variables with similar average values were peer review sites, mentions in Q & A, faculty opinions, and CiteULike. These positive associations between news stories, bibliometrics, and altmetrics are even stronger when the number of news stories per paper is considered. Papers in the upper quartile of news outlets’ attention and above, with at least seven news stories, received on average twice as many citations, blogs, X posts, Facebook mentions, Wikipedia references, videos, and Mendeley readers as papers in the lower quartile.
The distributions of citations per journal with and without news stories are depicted in Figure 4. News stories affect the odds of receiving more citations for publications in the same journal and year, as indicated by the increase in the lower, median, and upper quartiles. The effect of news media stories on altmetrics scores is far more prominent, as shown by the comparison of the distributions of altmetrics scores per journal with and without news stories in Figure 5. Increases of orders of magnitude in the lower, median, and upper quartiles of the altmetrics distributions are evident in all journals for the papers that attracted news outlets’ attention compared with those that did not.
Notably, the odds ratios for increased citations, FCR, altmetrics score, blogs, policy sources, X posts, Facebook mentions, Weibo, Wikipedia, Google+, Reddit, videos, and Mendeley readers of original, all open-access biomedical research reports with news outlets’ attention, compared with age- and attribute-matched reports published in the same cohort of journals with similar medium-to-high impact factors, were found to be statistically significant, as depicted in the agreement statistical analysis in Figure 6. The highest odds ratios were obtained for altmetrics score and Mendeley readers, with OR values of nearly 50 (95% CI, 7–357), followed by blogs with an OR of 9 (95% CI, 7–11) and Reddit with an OR of 6 (95% CI, 3.5–10). This observation suggests that the impact of news stories is more substantial for altmetrics than for bibliometrics.
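For readers unfamiliar with how an odds ratio and its 95% confidence interval are derived from a 2 × 2 table, a minimal sketch follows. It uses the standard Wald interval on the log odds ratio, which may differ from the agreement statistics actually used for Figure 6. The group sizes (882 and 1138) come from the text, but the split of each group by outcome is hypothetical, as is the choice of "at least one blog mention" as the outcome; `odds_ratio_ci` is an illustrative name.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval.

    a: with news stories, outcome present    b: with news stories, outcome absent
    c: without news stories, outcome present d: without news stories, outcome absent
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero in sparse tables.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical split (NOT study data): outcome = "at least one blog mention".
# Row totals match the group sizes in the text: 400 + 482 = 882, 100 + 1038 = 1138.
or_blogs, ci_low, ci_high = odds_ratio_ci(400, 482, 100, 1038)
print(f"OR = {or_blogs:.1f} (95% CI, {ci_low:.1f}-{ci_high:.1f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level. Very wide intervals, such as the 7–357 reported for the altmetrics score, typically arise when one cell of the table contains very few papers.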
Pearson correlation analysis of paper citations, FCR, altmetrics score, news stories, X posts, Facebook, Mendeley, and patent citations for the complete bibliographic portfolio of 2020 reports revealed the correlations between these variables shown in Table 3 and Figure 7a. These correlations reflect linear associations between these parameters, as Figure 7b depicts. Citations correlate well with FCR and Mendeley but not with news stories, whilst news stories correlate with altmetrics score and, to a lesser extent, with X posts and Facebook mentions. The linear regressions on news stories exhibited slopes of 1.146 for citations, 0.653 for FCR, 10.015 for altmetrics score, 3.284 for X posts, 0.236 for Facebook, and 6.180 for Mendeley, but only 0.003 for patent citations. This is an additional indication that news stories have a greater impact on altmetrics than on bibliometrics.
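The Pearson coefficient and the regression slope quoted above are related but distinct quantities: the coefficient measures how tightly the points follow a line, whereas the slope measures how much the dependent variable changes per additional news story. The sketch below computes both by ordinary least squares. The data are hypothetical and `pearson_and_slope` is an illustrative name, not the software used in the study.

```python
import math


def pearson_and_slope(x, y):
    """Return (Pearson r, OLS slope, OLS intercept) for the regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical per-paper values (NOT study data).
news_stories = [0, 1, 2, 3, 5, 8]
altmetric_score = [3, 12, 25, 31, 52, 85]
r, slope, intercept = pearson_and_slope(news_stories, altmetric_score)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Because the slope depends on the units and spread of each variable, slopes for different metrics (for example, 10.015 for altmetrics score versus 0.003 for patent citations) are best read together with the corresponding correlation coefficients.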
Collectively, these data suggest that news outlets’ stories on original research paper publications are an independent factor that correlates with enhanced bibliometric parameters but is more strongly associated with increased altmetrics variables.
4. Discussion
Quantifying multivariable-dependent trends in cross-sectional studies, such as the correlation of news stories with bibliometrics and altmetrics, can be particularly challenging. Therefore, this study focused on generating homogeneous and well-matched groups of scientific publications to compare. A total of 2020 original biomedical research articles were investigated, all open access and all published within the same year in 18 journals with impact factors between 10 and 14. By selecting 2015 as the publication year, the recent effects of the COVID-19 infodemic were also avoided [15]. This bibliographic portfolio was split into two groups of articles: those with and those without news stories. By controlling for the research field, year of publication, journal impact factor, and accessibility to readers, the effects of news stories could be assessed.
The distribution of news stories per journal is similar except for journals with fewer than five papers with news stories. The outliers above the upper quartile of these distributions indicate papers with unique characteristics that explain the increased attention by news outlets and by other altmetrics sources that reflect public opinion, reactions, or discussions on them. As recently postulated, a scientific publication communicated to different audiences, peers or the public, may produce different responses [24]. This study sheds light on the differential dynamics of the impact of biomedical research on specialist and non-specialist audiences. News stories correlate with more citations, blogs, X posts, Facebook reports, Wikipedia references, videos, and Mendeley readers in biomedicine. However, their impact on papers’ altmetrics is several times stronger than on bibliometrics, as indicated by two lines of evidence: the collective totals of citations or altmetrics mentions with their descriptive statistics, and the box-and-whisker plots of paper distributions per journal according to citations or altmetrics score. Whilst the outliers indicate papers with unique characteristics that explain their high citability or altmetrics attraction, the distributions per journal attest that grouping by news stories reflects actual differences in bibliometrics and altmetrics between papers published in the same journal and year. However, the predictive ability of news stories for citations or altmetrics variables differs significantly, as indicated by agreement statistics. According to this analysis, the predictive ability of at least one news story is good for overall altmetrics score, blogs, policy sources, X posts, Facebook, Wikipedia, Google+, Reddit, videos, and Mendeley readers. Therefore, the impact of news stories on social media appears to be significantly more potent than on the scientific audience. Although it has been shown that early Mendeley readers correlate well with later citation counts [25], the odds ratio of papers with news stories versus papers without suggests that the impact on citations is moderate compared with a 25-times more substantial effect on Mendeley readers. Considering that the publications explored are already ten years old, so that there has been enough time to receive citations, it may be concluded that Mendeley readers cite only a few of the reports they access and read; in addition, the Mendeley reader population consists of few expert scientific authors compared with many young investigators who do not yet produce their own peer-reviewed works.
Pearson correlations accompanied by linear regression analyses were performed to detect linear relationships between the variables examined. These analyses were not performed per group but on the whole bibliographic portfolio. They showed that the bibliometric variables correlate with each other and with Mendeley, whereas altmetrics variables correlate well but with lesser linear relationships rather than modifier factors. News stories correlate linearly with the overall altmetrics score, with one news story corresponding to an increase of about 10 in the altmetrics score. This finding suggests an amplification of news stories by altmetrics resources. There are also partial linear correlations of news stories with X posts and Facebook mentions but not with citations. This finding indicates that even though news stories are associated with more citations, this relationship is not linear, and the effect of outliers with individual characteristics that attract citations may be significant.
In the literature, the interplay between scientific papers and news media stories has been considered under different contexts and niches, but, in most cases, it involves the Internet and social media [26]. The contagion-like diffusion of information in social media can be attributed to network users or content derived from external sources like news outlets [27]. However, as superspreaders of information, penetrating and impactful newspapers or news channels may deliver erroneous messages due to citation bias, false consistency, lack of clarity and transparency, overgeneralisation, exaggeration, insulation, or noncredible sources [28]. Recently, it was shown that there is a positive correlation between the external popularity of research outside the scientific community, gained through noteworthy media coverage, and the number of scientific reports published after the media coverage [29]. That investigation aligned the Altmetric® Mainstream-Media-Scores (MMS) with thematically similar articles from the PubMed database and found an increase in scientific publications on the same topic as a research article after it had received widespread news media coverage. That report did not examine citations, bibliometrics, or other altmetrics but showed that news stories may influence scientific authors’ research investigations. However, the trigger of scientific community attention was extensive news media attention on a single paper, exceeding a hundred news stories per paper. The data presented here for the bibliometrics of scientific articles with and without news stories agree with this observation, but at the level of a single news story per paper. The average number of citations received by papers with at least one news story was 60% higher than that of matching papers without a news story. This finding is important because only a tiny fraction of the published scientific literature, 0.1% in this investigation, achieves over a hundred news media stories.
In a recent bibliometric and altmetrics investigation, Digital Science Dimensions® was used to generate a bibliographic portfolio of COVID-19 reports originating from Jordan and published between 2019 and 2022, which were subgrouped according to Altmetric Explorer® or Semantic Scholar® Highly Influential Citations (HICs) and compared [30]. The Semantic Scholar® HICs represent an AI-generated classification of citations based on the context in which they appear in the citing paper. There was a lack of correlation between Semantic Scholar® HICs and Altmetric® Attention Scores. This finding agrees with the Pearson correlation and linear regression analyses presented here, in which the Pearson correlation coefficient between citations and the Altmetric® Attention Score was 0.05. This is explained by the different motivations and criteria that underlie citing behaviour and the specific altmetrics mentioning behaviour.
The data presented here concerning biomedicine suggest that news media are potent influencers of the scientific community and society and may generate trends in research investigations, funding flow, and public opinions and beliefs. This is important for all the stakeholders involved in scientific research (investigators, publishers, funders, policymakers, news media outlets, journalists, and the public) for introspection and careful consideration of these stimuli. News stories may deliver novel and groundbreaking science, but there is a risk of exaggeration, misinterpretation, or misleading presentation. A critical perspective and cross-evaluation are necessary to draw conclusions and decide on actions.
This study has some limitations that should be mentioned. Firstly, obtaining the flow of altmetrics variables as time series was impossible due to platform restrictions. Secondly, selecting a single publication year to achieve uniformity of citation and altmetrics information may mask longitudinal trends. Thirdly, this study did not consider the individual characteristics of each publication examined, such as topic, title, authors, affiliated institutions or countries, abstracts, or keywords. Fourthly, the criteria by which journalists select newsworthy research papers are outside the scope of this work. Fifthly, this report could not obtain the full text of the news stories about the documents examined to comment on the manner in which the news outlets delivered or explained the scientific information. Sixthly, the generalisability of these findings may be limited because only highly ranked biomedical scientific journals were included.
This is the first cross-sectional investigation of news stories’ effects on bibliometrics and altmetrics in biomedicine using groups of papers matched for age, journal impact factor, and access. Future prospective studies would shed light on the timing of bibliometrics or altmetrics responses to a scientific paper upon news story release. Specific altmetrics indicators exhibit individual patterns of interest, converging with or diverging from bibliometrics. Retrospective studies would demonstrate the socioeconomic determinants underlying these trends, expert opinions, or stakeholders’ perspectives on scientific investigations. These tendencies suggest social, economic, and political driving forces over science that may directly or indirectly influence scientific topic trends or research funding. Seeking specific thematic patterns or lemmas in scientific papers’ titles, abstracts, or keywords that attract news media interest would be an important next step. The quantified observations of this study provide a better understanding of the relationships between published research reports and their audiences, both scientific experts and the public.