*Review* **Worldwide Use of RUCAM for Causality Assessment in 81,856 Idiosyncratic DILI and 14,029 HILI Cases Published 1993–Mid 2020: A Comprehensive Analysis**

#### **Rolf Teschke <sup>1,\*</sup> and Gaby Danan <sup>2</sup>**


Received: 19 August 2020; Accepted: 25 September 2020; Published: 29 September 2020

**Abstract: Background:** A large number of idiosyncratic drug induced liver injury (iDILI) and herb induced liver injury (HILI) cases of variable quality have been published, but some are a matter of concern if the cases were not evaluated for causality using a robust causality assessment method (CAM) such as RUCAM (Roussel Uclaf Causality Assessment Method) as diagnostic algorithm. The purpose of this analysis was to evaluate the worldwide use of RUCAM in iDILI and HILI cases. **Methods:** The PubMed database (1993–30 June 2020) was searched for articles by using the following key terms: Roussel Uclaf Causality Assessment Method; RUCAM; Idiosyncratic drug induced liver injury; iDILI; Herb induced liver injury; HILI. **Results:** Considering reports published worldwide since 1993, our analysis showed the use of RUCAM for causality assessment in 95,885 cases of liver injury including 81,856 cases of idiosyncratic DILI and 14,029 cases of HILI. Among the top countries providing RUCAM based DILI cases were, in decreasing order, China, the US, Germany, Korea, and Italy, with China, Korea, Germany, India, and the US as the top countries for HILI. **Conclusions:** Since 1993 RUCAM is certainly the most widely used method to assess causality in iDILI and HILI. This should encourage practitioners, experts, and regulatory agencies to use it in order to reinforce their diagnosis and to make sound decisions.

**Keywords:** RUCAM; Roussel Uclaf Causality Assessment Method; diagnostic algorithm; iDILI; idiosyncratic drug induced liver injury; DILI; HILI; herb induced liver injury

#### **1. Introduction**

Idiosyncratic drug induced liver injury (DILI), in short also termed iDILI, and herb induced liver injury (HILI) are complex diseases and have received much attention in recent years [1–9]. A recent scientometric study comprehensively analyzed the global knowledge base and specific emerging topics of DILI derived from 1995 publications in 79 countries and regions, with an impressive annual growth of reports between 2010 and 2019 and almost 340 studies published in 2020 [1]. In parallel, more and more publications on DILI and HILI cases refer to RUCAM (Roussel Uclaf Causality Assessment Method) for causality assessment [10–13]. The original RUCAM was first published in 1993 [14] and updated in 2016 [15] with additional information on its use and perspectives [16,17]; the updated version is now preferred for future cases of DILI and HILI [15]. It is widely recognized that causality assessment in DILI and HILI is a multifaceted approach [7–9,15], a real medical challenge, for which a diagnostic quantitative algorithm such as RUCAM is an easy tool for case evaluation [10–18] to solve complex conditions [18].

The RUCAM algorithm is a structured, standardized, transparent, liver specific and quantitative diagnostic clinical scale based on key elements of liver injury, which are individually scored and provide a score for five-degree causality grading from unrelated up to highly probable causality levels [15]. Since key elements are specifically described and scored, assessments are objective with little risk of subjectivity [15–17] commonly observed if the approach to assess causality lacks scored key elements [19]. RUCAM can help expand our knowledge by enlarging population analysis with prospective and scored causality assessment, allowing for harmonized interpretation of data across populations [20]. In this context, RUCAM should be viewed as a cornerstone approach assessing causality of liver injury cases [15–17,21], because robust diagnostic biomarkers are rarely available due to misconducted studies as outlined by EMA (European Medicines Agency: Formerly London, UK, now Amsterdam, Netherlands) [21].
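The five-degree grading mentioned above can be sketched in a few lines. The score cut-offs used here are those of the updated RUCAM [15]; they are not restated in this review, so they should be checked against [15] before any practical use. The function name is hypothetical and serves illustration only.

```python
def rucam_causality_grade(score: int) -> str:
    """Map a total RUCAM score to one of five causality gradings.

    Cut-offs are assumed from the updated RUCAM [15]; verify before use.
    """
    if score <= 0:
        return "excluded"
    if score <= 2:
        return "unlikely"
    if score <= 5:
        return "possible"
    if score <= 8:
        return "probable"
    return "highly probable"


# Example: a case whose scored key elements add up to 7 points
print(rucam_causality_grade(7))
```

The sketch only covers the last step of the assessment; the individual scoring of the key elements is described in [15].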

In this review article, current conditions of DILI and HILI cases assessed worldwide using RUCAM were critically analyzed. For the first time, the focus is on reports published from 1993 to mid 2020 and the discussion of their potential use to describe specific features of DILI and HILI cases.

#### **2. Literature Search and Source**

The PubMed database (1993–30 June 2020) was searched for articles by using the following key terms: Roussel Uclaf Causality Assessment Method; RUCAM; Idiosyncratic drug induced liver injury (iDILI); Herb induced liver injury (HILI). Key terms were used alone or in combination. Limited to the English language, publications retrieved for each search term were analyzed for their suitability for this review article. The electronic search was completed on 30 June 2020 and supplemented by a manual literature search, using also the large private archive of the authors when the publication was not yet referenced in PubMed. The final compilation consisted of original papers including individual case reports and case series, consensus reports, and review articles, with the most relevant publications included in the reference list of this review.
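For readers who wish to repeat a comparable search programmatically, the following is a minimal sketch that only builds a request URL for the NCBI E-utilities ESearch service. The exact query syntax used for this review is not reported, so the way the key terms are quoted and combined, the language filter, and the helper names are assumptions made for illustration.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Key terms as listed in the text
KEY_TERMS = [
    "Roussel Uclaf Causality Assessment Method",
    "RUCAM",
    "Idiosyncratic drug induced liver injury",
    "iDILI",
    "Herb induced liver injury",
    "HILI",
]


def build_query(terms, combine="OR"):
    """Quote each term and join them with a Boolean operator (assumed syntax)."""
    return f" {combine} ".join(f'"{term}"' for term in terms)


def build_esearch_url(terms, combine="OR"):
    """Build an ESearch URL restricted to the search period of this review."""
    params = {
        "db": "pubmed",
        "term": f"({build_query(terms, combine)}) AND english[Language]",
        "datetype": "pdat",        # filter on publication date
        "mindate": "1993/01/01",
        "maxdate": "2020/06/30",
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_esearch_url(KEY_TERMS))
```

The sketch performs no network request; the hits returned by such a query would still need the manual screening for suitability described above.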

#### **3. Definitions**

RUCAM is presented as an algorithm that requires a few criteria allowing for a final quantitative evaluation. In particular, the RUCAM based criteria of liver test thresholds and liver injury patterns, established at an international consensus meeting of experts and without the requirement of a liver biopsy, were revolutionary at the time of first publication [14], and the same principles are preserved in the updated RUCAM [15].

#### *3.1. RUCAM Based Liver Injury*

#### 3.1.1. Liver Test Thresholds

A liver injury caused by exogenous compounds such as drugs and herbs is defined by specific threshold values established for the liver tests (LTs) alanine aminotransferase (ALT) and alkaline phosphatase (ALP), with current serum activities considered as relevant for ALT ≥ 5 × ULN (upper limit of normal) and ALP ≥ 2 × ULN [15] provided that ALP is of hepatic origin. The original RUCAM was the first causality assessment method (CAM) ever considering threshold criteria although initially with lower values for ALT [15] as compared to currently used criteria [16]. Of note, serum bilirubin is not part of the diagnostic RUCAM algorithm that uses ALT or ALP as diagnostic liver test. In this context, conjugated bilirubin is a sign of the severity of the liver injury.
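The threshold criteria can be expressed as a short check. This is a minimal sketch that assumes either criterion alone is sufficient (ALT or ALP as the diagnostic liver test) and that the hepatic origin of ALP is established separately; function and parameter names are hypothetical.

```python
def meets_liver_injury_threshold(alt, alt_uln, alp, alp_uln, alp_hepatic_origin=True):
    """Return True if ALT >= 5 x ULN, or ALP >= 2 x ULN with ALP of hepatic origin.

    All activities are serum values in the same unit as their ULN (e.g., U/L).
    """
    alt_multiple = alt / alt_uln
    alp_multiple = alp / alp_uln
    alt_criterion = alt_multiple >= 5
    alp_criterion = alp_hepatic_origin and alp_multiple >= 2
    return alt_criterion or alp_criterion


# ALT 250 U/L with ULN 50 U/L is exactly 5 x ULN, so the ALT criterion is met
print(meets_liver_injury_threshold(alt=250, alt_uln=50, alp=100, alp_uln=120))
```

Serum bilirubin is deliberately absent from the check because it is not part of the diagnostic thresholds.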

#### 3.1.2. Liver Injury Pattern

RUCAM was also the first CAM proposing different patterns of liver injury based on LTs [14], which are also included in the updated RUCAM [15]. To determine the liver injury pattern, the ratio R is to be calculated using the multiple of the ULN of serum ALT divided by the multiple of the ULN of serum ALP, provided the ALP increase is of hepatic origin. For causality assessment purposes, two types of liver injury are defined (independently from histological findings): first, a hepatocellular injury with R > 5, and second, a cholestatic/mixed liver injury with R ≤ 5.
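As a worked illustration of the ratio R, the calculation can be sketched as follows. The laboratory values in the example are invented for demonstration, and the function names are hypothetical.

```python
def r_ratio(alt, alt_uln, alp, alp_uln):
    """R = (ALT as multiple of its ULN) / (ALP as multiple of its ULN)."""
    return (alt / alt_uln) / (alp / alp_uln)


def injury_pattern(r):
    """Classify the liver injury pattern from R as defined for causality assessment."""
    return "hepatocellular" if r > 5 else "cholestatic/mixed"


# Invented example: ALT 400 U/L (ULN 40) and ALP 180 U/L (ULN 120)
r = r_ratio(alt=400, alt_uln=40, alp=180, alp_uln=120)  # 10 / 1.5
print(round(r, 2), injury_pattern(r))
```

A value of exactly R = 5 falls into the cholestatic/mixed group, since the hepatocellular pattern requires R > 5.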

#### *3.2. Idiosyncratic Versus Intrinsic Liver Injury*

Liver injury is either idiosyncratic, due to the interaction between the exogenous synthetic chemical or phytochemical and a susceptible individual with some genetic factor(s), or it is intrinsic due to chemical overdose [11–13]. In the present analysis, idiosyncratic injury is considered, as opposed to intrinsic liver injury most commonly observed with overdosed drugs such as acetaminophen [22].

#### **4. Worldwide Publications of DILI**

The current scientometric report from China on knowledge mapping confirmed the high worldwide interest in DILI publications and identified a total of 1995 DILI studies published between 2010 and 2019, although information on the applied method of causality assessment was not provided and will need further clarification [1]. This Chinese analysis on the top 10 countries involved in DILI research listed the US, China, Japan, Germany, UK, Spain, France, the Netherlands, Sweden, and Canada. In addition, many interesting details on DILI were comprehensively discussed with focus on definition, incidence rate, clinical characteristics, etiology or pathogenesis such as the character of the innate immune system, the regulation of cell-death pathways, susceptible HLA (Human Leukocyte Antigen) identification, or criteria and methods of causality assessment; all of these topics were considered as the knowledge base for DILI research [1].

#### **5. Worldwide Publications of RUCAM Based Idiosyncratic DILI**

The worldwide impact of DILI can best be quantified by using liver injury cases assessed for causality with a robust method that allows for establishing causality gradings for each implicated drug and to exclude alternative causes unrelated to drug administration.

#### *5.1. Countries and Regions*

In the current analysis, authors from 31 countries worldwide reported on cases of idiosyncratic DILI caused by multiple drugs published from 1993 up to mid 2020 and applied in all cases RUCAM to assess causality (Table 1) [23–180]. Such a table with a comprehensive list of publications over a long period has never been reported before and will facilitate the search for RUCAM based DILI cases caused by individual drugs, considering that databases such as LiverTox may have problems providing real DILI cases [10,74].

#### *5.2. Hospital and Other Sources*

RUCAM based DILI cases were mostly published by authors from university hospitals and their affiliated teaching hospitals known for their high reputation (Table 1). Among these were a broad range of departments, which in most cases include departments of Hepatology and Gastroenterology, ensuring careful clinical evaluation of patients with suspected DILI and associated causality assessment for the offending drug(s). To a lesser degree, other departments were contributors, for instance, Pharmacology, or Pharmacy and Pharmaceutical sciences [170].


**Table 1.** Worldwide countries with a selection of published DILI cases assessed for causality using RUCAM.










Abbreviations: CAMs, Causality Assessment Methods; DILI, Drug induced liver injury; NSAID, Non-steroidal anti-inflammatory drug; RUCAM, Roussel Uclaf Causality Assessment Method.

In addition to hospitals, other sources provided RUCAM based DILI cases (Table 1). Among these were National Institutes of Health from Japan [92] and the US [165], consortia from Spain [115,141], the adverse drug reactions advisory committee (ADRAC) from Sweden [126], regulatory pharmacovigilance and pharmacoepidemiology centers from France [58,59] and Italy [86], drug commission of medical association from Germany [64], committee for drug induced liver injury from China [42]; also, drug reaction reporting database from Spain [65], regulatory agency from Spain [114], health insurance from the US [157], and drug safety departments of drug companies from France [57], Sweden [148], and Switzerland [132]. Some of these played an eminent role in promoting the use of RUCAM in prospective studies, particularly those from Spain [115], Sweden [126], and the US with France and Sweden [148].

#### *5.3. Top Ranking Countries*

Among the top 10 countries were in decreasing order China, the US, Germany, Korea, Italy, Sweden, Spain, Japan, Argentina, and Thailand, whereby the top 5 countries provided most of the DILI cases (Table 2). Authors from these 5 countries contributed together 75,133 DILI cases out of a total of 81,856 worldwide DILI cases, corresponding to 91.8%. On the lower part of the list ranked the 6 countries Israel, Malaysia, Mexico, Morocco, Saudi Arabia, and Turkey; authors from each of these low ranking countries provided one single DILI case assessed for causality using RUCAM, corresponding to 6 cases altogether out of a total of 81,856 DILI cases. Authors from the remaining 20 countries with a ranking from 6 down to 25 contributed 6723 DILI cases out of overall 81,856 cases, corresponding to 8.2%. In essence, RUCAM based DILI cases were mostly published in English language journals, raising the question how DILI cases were assessed and published by the other countries in local journals in languages other than English. Currently, overall 81,856 cases of idiosyncratic DILI assessed for causality by RUCAM have been retrieved via PubMed, all published 1993–June 2020 (Table 1) [23–180].
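The percentages quoted above follow directly from the case counts; the short sketch below reproduces the arithmetic. Only the counts stated in the text are used, and the helper name is hypothetical.

```python
def share_percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)


TOTAL_DILI_CASES = 81_856   # all RUCAM based DILI cases, 1993 to mid 2020
TOP5_CASES = 75_133         # China, the US, Germany, Korea, Italy
RANK_6_TO_25_CASES = 6_723  # countries ranked 6 down to 25

print(share_percent(TOP5_CASES, TOTAL_DILI_CASES))          # 91.8
print(share_percent(RANK_6_TO_25_CASES, TOTAL_DILI_CASES))  # 8.2
```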

#### *5.4. Annual Growth Trends of RUCAM Based DILI Case Publications*

Analyses of growth trends provided additional information after identification of a total 1995 DILI studies, published between 2010 and 2019 but not stratified for causality assessment using RUCAM [1]. In the frame of the present analysis, only publications of idiosyncratic DILI cases were included if they had been assessed for causality using RUCAM, providing a more homogenous series with established DILI diagnoses.

#### 5.4.1. Published Annual RUCAM Based DILI Cases

Considering the period from 1993 to 2019, annually published cases of RUCAM based idiosyncratic DILI ranged between 0 and 27,224 in 2019, but data of 2020 were not included because case counting stopped by the end of June in this particular year (Figure 1). Three phases of trends appeared with respect to published RUCAM based DILI cases: (1) phase 1 with clinical field testing from 1996 to 2004, (2) phase 2 with promotion from 2005 to 2013, and (3) phase 3 of worldwide use from 2014 to 2019.

Phase 1 started after the launch of RUCAM in 1993 [16,47] and the analysis of 94 DILI cases [47]. The number of subsequent annually published DILI cases remained small until 2004, reaching 121 cases (Figure 1). This was the period of initial testing of the RUCAM algorithm under clinical field conditions, with interesting early information provided by 3 reports [58,91,114]. The first report came from Spain, was published in 1996, analyzed a major study cohort of DILI due to amoxicillin and clavulanate, and described its typical clinical features, with Rodríguez as first author and Zimmerman as senior author [114], who actually was involved as an expert from the US in the international consensus meetings [14] but did not promote RUCAM in DILI evaluations in his own country. Of interest was also the retrospective design of this analysis, suggesting that this particular study approach is feasible [114], although a prospective approach is recommended [15]. The second report was from Japan with Japanese patients, published in 2003 by Masumoto et al. [91]. This study favored RUCAM over other CAMs, provided evidence that the performance of the lymphocyte transformation test was poor in line with previous reports, and viewed the RUCAM criteria as useful for diagnosing DILI in Japanese patients. The third publication came from France, reported in 2004 on details of a patient with DILI by pioglitazone, and showed the feasibility of a good case report to be assessed by RUCAM, evaluated by Arotcarena et al. [58]. All three reports were hallmarks of the first phase of RUCAM based DILI case series devoted to clinical field evaluation that ended in 2004 (Figure 1).


**Table 2.** Top ranking of countries providing DILI cases assessed for causality by RUCAM.

Abbreviations: DILI, Drug induced liver injury; RUCAM, Roussel Uclaf Causality Assessment Method.

Phase 2 started in 2005 with overall 7695 annually published RUCAM based DILI cases (Figure 1) [115,126,147,148]. Among these were 461 cases provided by Andrade et al. from Spain, retrieved from a prospective study involving various drugs [115]; an additional 784 cases from Sweden were published by Björnsson and Olsson, retrieved from a prospective study of DILI by various drugs [126]; from the US, 2 case reports of DILI by amoxicillin and clavulanate were presented by Fontana et al. [147], and a large cohort of DILI cases caused by ximelagatran in clinical trials was published by Lee et al. [148]. These 4 studies promoted the usefulness of RUCAM for evaluating DILI cases [115,126,147,148] by preferring a prospective study design [115,126], evaluating single DILI case reports [147], and correctly assessing suspected DILI cases in clinical trials [148]. Whereas RUCAM already had a firm place among DILI experts in Europe, it seems that experts in the US became more familiar with the use and practicability of RUCAM.

Phase 3, characterized by the worldwide use of RUCAM for DILI, started in 2014 with 11,525 DILI cases (Figure 1), mostly attributed to one study with 11,109 DILI cases provided by Cheetham et al. [157]. Starting in 2015, there was a continuous rise of published RUCAM based DILI cases (Figure 1), likely driven also by the updated RUCAM available online in 2015 and published in 2016 [15]. With 27,224 published DILI cases, the maximum annual level was reached in 2019 (Figure 1). Until the end of June 2020, an additional 15,153 published DILI cases were counted but not included in Figure 1, corresponding already to more than half of the cases counted in 2019 and representing a good base for 2020 and further years.

**Figure 1.** Annual cases of DILI assessed for causality by RUCAM and published since 1993.

#### 5.4.2. Annual RUCAM Based DILI Publications and Growth Trend

Over the years starting from 1993, when RUCAM was launched [14,57], and until 2019 an upward trend of annual RUCAM based DILI publications can be observed with some dips in between (Figure 2). In 2019, 26 publications were counted, and 15 publications from January 2020 until end of June 2020 that were not included in the listing (Figure 2). Overall 158 publications with RUCAM based DILI cases were counted from 1993 until mid 2020 (Table 1).

**Figure 2.** Annual publications of DILI cases assessed for causality by RUCAM as reported since 1993.

#### *5.5. Specificities of DILI Case Evaluation*

Large study cohorts of RUCAM based DILI cases accumulated many different drugs and provided as expected a global information of the DILI cases due to various drugs without a detailed description of clinical features drug by drug (Table 1). Consequently, typical clinical features of a DILI by a single drug cannot be obtained from large cohorts as opposed to single DILI case reports or case series that included DILI cases due to a single drug (Table 1). In general, studies with a single DILI case or a few cases are more informative because they provide an exhaustive past medical history with clinical details required for a sound case evaluation. In search for typical DILI features by specific drugs, therefore, assistance may be provided by the drug listing (Table 1). In addition, details can be retrieved via the internet, using the search terms drug induced liver injury and the name of the suspected drug, combined with RUCAM or the updated RUCAM.

#### *5.6. Worldwide Top Ranking of Drugs Causing DILI*

There is concern how best to establish a top ranking of drugs most commonly implicated in DILI [70,74]. A recent study presented a list with top ranking drugs out of overall 3312 DILI cases evaluated by RUCAM (Table 3) [70]. The RUCAM based DILI cases were retrieved from 15 reports by six national databases of DILI registries and three large medical centers worldwide, which provided the DILI cases under consideration. Contributing countries and regions were in alphabetical order China, Germany, Latin America, Iceland, India, Singapore, Spain, Sweden, and the US. It was found that the databases of national registries and large medical centers are the best sources of drugs implicated in DILI cases. There is also the note that presently DILI cases of the LiverTox database are less suitable for clinical or regulatory purposes as presented on its website because many suspected DILI cases were derived from published cases of poor quality, lacking a robust CAM such as RUCAM [70,74]. Consequently, the majority of LiverTox based cases of assumed DILI could previously not be classified as real DILI [74]. To overcome these diagnostic shortcomings, LiverTox attempted a top ranking of drugs by counting the published DILI cases for each individual drug [74]. It was assumed that the degree of causality probability increases with the number of published DILI reports: the higher the case number the higher the probability. This special approach explains the variability of the top listing presented by LiverTox [74] as compared to RUCAM based cohorts [70].


**Table 3.** Worldwide top ranking of drugs causing DILI cases with causality assessment by RUCAM.



Substantially modified from a previous report [70], which provides references for each implicated drug.

#### **6. Worldwide Publications of HILI Cases Assessed for Causality Using RUCAM**

Highlights of liver injury cases have been reported not only for DILI but with increasing frequency also for HILI, although many published HILI cases are questionable because a robust CAM was not applied [7–9]. The problems associated with HILI are specifically addressed in the current analysis, which considers for the first time worldwide HILI cases using RUCAM as a robust algorithm for assessing causality.

#### *6.1. Countries and Regions*

Authors from many countries around the world reported on cases of HILI in connection with the consumption of various herbs, all published since 1993 (Table 4) [29,37,38,42,48,100–103,113,115–118, 181–255]. Specifically considered were patients, who experienced HILI with established causality using RUCAM. Such a table with a comprehensive list of publications over a long period of time will help the search for RUCAM based HILI cases caused by specific herbs or herbal products containing a mixture of several herbs. This list is unique as compared to databases that may have problems providing real HILI cases not confounded by alternative causes or lack of a robust causality assessment.


**Table 4.** Worldwide countries with a selection of published HILI cases assessed for causality using RUCAM.





Abbreviations: DILI, Drug induced liver injury; HILI, Herb induced liver injury; HSOS, Hepatic sinusoidal obstruction syndrome; RUCAM, Roussel Uclaf Causality Assessment Method; TCM, Traditional Chinese Medicines; USP, United States Pharmacopeia; WHO, World Health Organization.

#### *6.2. Hospital and Other Sources*

Most RUCAM based HILI cases were provided by authors from university hospitals and their affiliated teaching hospitals with their departments of Hepatology and Gastroenterology, Medicine or Internal Medicine (Table 4). Rare contributors were other departments like those with focus on Emergency Medicine [255], Clinical Pharmacology and Toxicology in Berlin [209], Pharmacology and Toxicology in Hannover [213], Pharmacy in Singapore [238], Physiology and Pharmacology in Rome [219,221], Anatomical, Histological, Forensic and Orthopedic Sciences in Rome [222], and Pediatrics in Seoul [234]; among the contributors were even the Neurology and Headache Center in Essen [212] and the Spine and Joint Research Institute in Seoul [235].

Other sources providing RUCAM based HILI cases include the Chinese Academy of Medical Sciences in Beijing [195], School of Chinese Materia Medica in Beijing [199,202], Competence Centre for Complementary Medicine and Naturopathy in Munich [211], Biomedical Research and Innovation Platform South African Medical Research Council in Tygerberg [239], United States Pharmacopeia in Rockville [254], and Center of Pharmacovigilance of Florence [218].

#### *6.3. Top Ranking Countries*

Among the countries presenting RUCAM based HILI cases were on top in descending order China and Korea, followed by Germany, India and the US, whereby the top 5 countries provided most of the HILI cases (Table 5). Authors from these 5 countries contributed together 13,808 HILI cases out of a total of 14,029 worldwide HILI cases, corresponding to 98.4%. On the lower part of the list ranked the 4 countries Brazil, Colombia, Switzerland, and Turkey; authors from each of these low ranking countries provided one HILI case assessed for causality using RUCAM, corresponding to 4 cases altogether out of a total of 14,029 HILI cases. Authors from the remaining 20 countries with a ranking from 6 down to 14 contributed 217 HILI cases out of overall 14,029 cases, corresponding to almost 1.6%.
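The three groups of countries described above can be checked for internal consistency with a few lines of code. This is a minimal sketch using only the counts given in the text; the dictionary keys are descriptive labels chosen for illustration.

```python
TOTAL_HILI_CASES = 14_029

hili_cases_by_group = {
    "top 5 countries": 13_808,
    "countries ranked 6 down to 14": 217,
    "countries with a single case": 4,
}

# The three groups should add up to the overall number of HILI cases
assert sum(hili_cases_by_group.values()) == TOTAL_HILI_CASES

# Share of each group in percent, rounded to two decimals
hili_shares = {
    group: round(100 * cases / TOTAL_HILI_CASES, 2)
    for group, cases in hili_cases_by_group.items()
}
for group, share in hili_shares.items():
    print(f"{group}: {share}%")
```

With two decimals, the top 5 countries account for 98.42% and the countries ranked 6 down to 14 for 1.55% of all cases, in line with the rounded figures in the text.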


**Table 5.** Top ranking of countries providing HILI cases assessed for causality by RUCAM.

Abbreviations: HILI, Herb induced liver injury; RUCAM, Roussel Uclaf Causality Assessment Method.

#### 6.3.1. Published Annual RUCAM Based HILI Cases

From 1993 to 2019, published annual cases of RUCAM based HILI ranged between 0 and 11,609 in 2019, while 57 HILI cases of 2020 were not included because case counting stopped by the end of June in this particular year (Figure 3). Three phases of trends appeared with respect to published RUCAM based HILI cases: (1) phase 1 with lack of any clinical field testing from 1993 to 2003, (2) phase 2 with slow promotion from 2004 to 2016, and (3) phase 3 of worldwide use from 2017 to 2019.

**Figure 3.** Annual cases of HILI assessed for causality by RUCAM and published since 1993.

Phase 1 started after the launch of RUCAM in 1993 [16,47] but without a single published HILI case until 2003 (Figure 3). The lack of published RUCAM based HILI cases during this period might be due to the fact that the value of RUCAM was not yet sufficiently known or to uncertainties whether herbs have the potential to cause liver injury. In addition, the term of herb induced liver injury or its acronym HILI was unknown at that time and therefore not in common use.

During the subsequent phase 2, the number of annually published HILI cases remained small with cases ranging from 2 to 933, considering the years from 2004 until 2016 (Figure 3). In 2008, there were 108 HILI cases, with 18 Spanish cases published by García-Cortés et al. [117,241] and 90 Korean cases published by Kang et al. [227] and Sohn et al. [228]. During 2016, there was a sharp increase with 933 HILI cases, mostly attributed to 866 cases from China published by Zhu et al. [42]. As a reminder and outlined recently, herb induced liver injury with HILI as its acronym was first introduced and proposed as a specific term in the scientific literature only in 2011 [12]. This may explain the delayed publications on HILI cases (Figure 3).

Phase 3 started with low HILI case numbers in 2017 and 2018 (Figure 3), considering that the updated RUCAM applicable also to HILI cases was published only in 2016 [15]. With 11,609 cases, the largest annual HILI case number was published in 2019 (Figure 3) as a consequence of the ongoing worldwide use of RUCAM for assessing causality in suspected HILI cases (Table 4). In particular, contributing countries were in alphabetical order Australia [29], Brazil [183], China [48,196–201], Germany [213–215], India [217], Italy [222], Korea [103], Spain [115,117,240–242], Switzerland [244], and the US [245–254]. Most of the 11,609 HILI cases published in 2019 were from China [48,196] and Korea [103], with 6971 cases published by Shen et al. [48], 2019 cases reported by Byeon et al. [103], and 1552 cases provided by Chow et al. [196]. However, until mid 2020 only 57 HILI cases were published (Table 4) [202,203,253,254], suggesting for the whole year 2020 at best 100 cases (Figure 3).

#### 6.3.2. Annual RUCAM Based HILI Publications and Growth Trend

Over the years starting from 1993, when RUCAM was launched [14,57], and until 2019, an upward trend of annual RUCAM based HILI publications can be observed with some dips in between (Figure 4). In 2019, 18 publications were counted, and 4 publications until the end of June 2020 that were not included (Figure 4). For the whole year 2020, therefore, at best perhaps 8 publications can be anticipated (Figure 4). These figures show that a total of 85 publications with RUCAM based HILI cases were reported from 1993 until mid 2020 (Table 4).

**Figure 4.** Annual publications of HILI cases assessed for causality by RUCAM as reported since 1993.

#### *6.4. Specificities of HILI Cases*

Large study cohorts of RUCAM based HILI cases accumulate many different herbs and provide as expected a global information of many HILI cases without a detailed description of clinical features for specific herbs (Table 4). Consequently, studies with a single or a few HILI cases have many advantages because they focus on a single herb or herbal product causing the liver injury and usually provide an exhaustive past medical history with clinical details required for a sound case evaluation. For interested physicians, regulators, and manufacturers, this listing provides individual cases with herbs causing HILI.

#### **7. Utility of RUCAM**

The utility of RUCAM has been confirmed in many liver injury cases of DILI (Table 1) and HILI (Table 4) published from countries and regions around the world, as outlined in various reports [5,11,15–18] and briefly summarized (Table 6). In short, the high qualification of RUCAM as an objective diagnostic algorithm to assess causality in liver injury cases of DILI and HILI is the key to its increasing use (Figures 1–4). RUCAM is smoothly applied by clinicians or regulators and obviously without problems (Tables 1 and 4). The worldwide use allows data comparison among different countries, a unique condition for multifaceted diseases such as DILI and HILI. RUCAM is also applied in epidemiology studies. Finally and most importantly, each individual DILI and HILI case report contains important details of liver injury cases that may be helpful for physicians in care of patients with suspected DILI and HILI.

**Table 6.** Characteristics of RUCAM.



Abbreviations: AI: Artificial Intelligence; ALT: Alanine aminotransferase; ALP: Alkaline phosphatase; CAM: Causality assessment method; CMV: Cytomegalovirus; DILI: Drug induced liver injury; EBV: Epstein Barr virus; HAV: Hepatitis A virus; HBV: Hepatitis B virus; HCV: Hepatitis C virus; HEV: Hepatitis E virus; HILI: herb induced liver injury; HSV: Herpes simplex virus; RUCAM: Roussel Uclaf Causality Assessment Method; VZV: Varicella zoster virus.

#### **8. Other CAMs**

Apart from the objective diagnostic RUCAM algorithm, a few non-RUCAM based CAMs are known, critically discussed elsewhere in detail [5,15]. In short, they are less accurate than RUCAM, not quantitative as not based on specific elements to be scored individually, not specific for liver injury cases, not structured, not validated, or based on individual arbitrary subjective opinions. In fact, other CAMs are still caught up in the pre-RUCAM and pre-AI era [18], thereby neglecting the use of diagnostic algorithms such as the original RUCAM [14] or the now preferred updated version [18].

#### **9. Limitation of the Analysis**

The current analysis is based on published data of DILI and HILI reports in English, or at least an abstract in English, rather than on unpublished data contained in the original data sets that were not available to the authors of the analysis for re-analysis. Although most of the published DILI and HILI cases provide excellent data, some authors omitted RUCAM based causality gradings or included cases with a possible causality grading in their final evaluations of cases together with a probable or highly probable causality level. Nevertheless, a broad range of different causality gradings was commonly provided in most published cases, and the respective references allow for detailed information. As being outside the scope of this article, causality gradings for individual reports were not provided (Tables 1 and 4), but some details of 46,266 DILI cases assessed by RUCAM were published earlier [11]. Problematic are study cohorts with inclusion of both DILI and HILI cases, unless both groups were separately evaluated [48]. As expected, not all of the patients were commonly confirmed as being DILI by RUCAM scoring, but the number of published cases remained accurate. For instance, special conditions are evident in the randomized clinical trial of ximelagatran [148]. In this prospective report, hepatic findings were analyzed in all suspected cases with regard to causal relationship to ximelagatran by using RUCAM, considered as the most reliable tool to assess causality [148]. Applying RUCAM based on ALT thresholds only is insufficient since 92% of the ximelagatran group did not meet this criterion and therefore lacked a final robust causality grading, as opposed to 8% of the study group, some of whom received high causality gradings. This study reaffirms the utility of RUCAM to identify real DILI cases in cohorts under real world conditions.

#### **10. Outlook**

The prospects for using the updated RUCAM in future DILI and HILI cases are favorable because many authors, including those from the US, are becoming more familiar with RUCAM and are ready to use this diagnostic algorithm (Tables 1–4), in line with principles of Artificial Intelligence to solve difficult processes [18]. Moreover, as RUCAM has been used successfully in the US and many other countries to assess causality in cases of DILI, there is no need to invent another instrument specifically designed for drug development [255]. The issue of overlooked alternative causes remains a clinical problem; it was described as early as 1999 by Aithal et al. [256] and was subsequently confirmed in studies guided by RUCAM [69,257].

Future DILI and HILI studies should adhere to a prospective study design, as strongly recommended in the RUCAM updated in 2016, because a retrospective approach may raise concern about the validity of the published results due to incomplete information [15]. Neglecting this recommendation and using a retrospective design instead could be problematic [48]. In addition, attempts to lift RUCAM based causality gradings from possible to probable must be resisted [48]. In particular, the use of a non-RUCAM based CAM in addition to RUCAM is discouraged, because such a combination causes uncertainty due to disputable causality gradings. Mixing patients with DILI and patients with HILI in the same cohort is not recommended [48] because it complicates a separate evaluation of DILI and HILI features. However, it is clear that in individual cases RUCAM allows for a distinction between a drug and a medicinal herb when causality gradings are different.

#### **11. Conclusions**

The current analysis showed a favorable course of the RUCAM algorithm, which has been used globally since its launch in 1993, considering the annually published DILI and HILI cases. Overall, 95,885 liver injury cases were published using RUCAM for causality assessment, namely 81,856 iDILI cases and 14,029 HILI cases. The global use of RUCAM for assessing causality in cases of DILI and HILI helps compare study results among various countries and facilitates the description of typical clinical features, best derived from case reports or small case series. RUCAM solves complex conditions as an algorithm in line with principles of Artificial Intelligence. Top ranking countries providing RUCAM based DILI cases were China, the United States, Germany, Korea, and Italy, whereas most RUCAM based HILI cases were published by authors from China, Korea, Germany, India, and the United States. In terms of the number of cases published, no other causality assessment method could outperform RUCAM in evaluating DILI and HILI cases. This should encourage all stakeholders involved in DILI and HILI to use RUCAM systematically in order to reinforce their diagnoses and take the right decisions for the benefit of patients.

**Author Contributions:** Conceptualization and methodology, R.T. and G.D.; formal analysis and draft preparation, R.T.; review and editing, G.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors are grateful to Sabine Veltens for professionally preparing the figures.

**Conflicts of Interest:** The authors declare that they have no conflicts of interest regarding this invited article.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Comorbidities Associated with Granuloma Annulare: A Cross-Sectional, Case-Control Study**

**Erik Almazan 1, Youkyung S. Roh 1, Micah Belzberg 2, Caroline X. Qin 1, Kyle Williams 1, Justin Choi 1, Nishadh Sutaria 1, Benjamin Kaffenberger 3, Yevgeniy R. Semenov 4, Jihad Alhariri 2,\*,**† **and Shawn G. Kwatra 2,\*,**†


Received: 5 August 2020; Accepted: 27 August 2020; Published: 28 August 2020

**Abstract: Background:** Granuloma annulare (GA) is a cutaneous granulomatous disorder of unknown etiology. There are conflicting data on the association between GA and multiple systemic conditions. As a result, we aimed to clarify the reported associations between GA and systemic conditions. **Methods:** A retrospective, cross-sectional, case-control study was performed in which the medical records of biopsy-confirmed GA patients ≥18 years of age, who presented to the Johns Hopkins Hospital System between 1 January 2009 and 1 June 2019, were reviewed. GA patients were compared to controls matched for age, race, and sex. **Results:** After adjusting for confounders, GA patients (*n* = 82) had higher odds of concurrent type II diabetes (odds ratio (OR) = 5.27; 95% confidence interval (CI), 1.73–16.07; *p* < 0.01), non-migraine headache (OR = 8.70; 95% CI, 1.61–46.88; *p* = 0.01), and a positive smoking history (OR = 1.93; 95% CI, 1.10–3.38; *p* = 0.02) compared to controls (*n* = 164). Among GA patients, women were more likely to have ophthalmic conditions (*p* = 0.04), and men were more likely to have cardiovascular disease (*p* < 0.01) and type II diabetes (*p* = 0.05). No differences in systemic condition associations were observed among GA subtypes. **Conclusions:** Our results support the reported association between GA and type II diabetes. Furthermore, our findings indicate that GA may be associated with cigarette smoking and non-migraine headache disorders.

**Keywords:** granuloma annulare; granulomatous disorders of the skin; inflammatory skin conditions; medical dermatology

#### **1. Introduction**

Granuloma annulare (GA) is a granulomatous cutaneous disorder of unknown etiology with an estimated prevalence of 0.1–0.4% [1]. Clinically, GA has various presentations, including localized, generalized, subcutaneous, patch, or perforating subtypes [1–4]. Due to this variation in clinical presentation, characteristic histological findings are critical for diagnosis [2,4]. Given the unknown etiology of GA, studies have focused on uncovering its association with systemic conditions [2].

Systemic diseases proposed to have an association with GA include diabetes mellitus [1,5,6], dyslipidemia [7], hypothyroidism [8], and various malignancies [2]. However, results from the current literature have been conflicting, as several reports have also found no association between GA and diabetes mellitus [9], thyroid function [1], or dyslipidemia [5]. Additionally, reports on GA's association with malignancy have largely been confined to case reports or studies with small sample sizes, which make an association difficult to ascertain [2]. Therefore, we performed a retrospective cross-sectional, case-control study of patients with clinically diagnosed and biopsy-confirmed GA to clarify the conflicting evidence of associations between GA and systemic conditions.

#### **2. Materials and Methods**

#### *2.1. Study Design*

Patients with GA were identified through a review of medical records at the Johns Hopkins Health System (JHHS), which mainly comprises tertiary-care, academic medical centers. A search was performed for patients with an ICD-10 (International Statistical Classification of Diseases and Related Health Problems, Tenth Revision) code L92.0 who received outpatient care between 1 January 2009 and 1 June 2019. The GA patient study cohort included patients 18 years or older with both a documented clinical presentation and biopsy consistent with GA, including histological evidence of lymphohistiocytic inflammation, mucin deposition, and collagen degradation. All clinical and histopathologic evaluations were performed by board-certified dermatologists and dermatopathologists at JHHS, respectively. Patients with incomplete medical records, lack of biopsy-confirmed GA, or foreign body reactions at the time of diagnosis were excluded. Patients with GA were matched to controls in a 1:2 ratio by age (±3 years), race, and sex. Controls presented to JHHS as outpatients for regularly scheduled skin exams or for benign, localized chief complaints. The study was found exempt by the JHHS institutional review board, and patient consent was waived as only de-identified data were used.
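The 1:2 matching rule described above (age within ±3 years, same race, same sex) can be illustrated with a short sketch. This is not the authors' procedure, which is not described in detail; it is a simple greedy matcher under the assumption that each control is used at most once, and the `Patient` class, the `match_controls` function, and all records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    pid: str
    age: int
    race: str
    sex: str


def match_controls(cases, controls, ratio=2, age_window=3):
    """Greedy matching without replacement.

    For each case, pick up to `ratio` unused controls of the same race
    and sex whose age differs by at most `age_window` years, preferring
    the closest ages.
    """
    available = list(controls)
    matches = {}
    for case in cases:
        eligible = [
            c for c in available
            if c.race == case.race
            and c.sex == case.sex
            and abs(c.age - case.age) <= age_window
        ]
        eligible.sort(key=lambda c: abs(c.age - case.age))
        chosen = eligible[:ratio]
        for c in chosen:
            available.remove(c)  # each control is used at most once
        matches[case.pid] = [c.pid for c in chosen]
    return matches


# Hypothetical records for illustration only.
cases = [Patient("case1", 58, "white", "F")]
controls = [
    Patient("c1", 60, "white", "F"),
    Patient("c2", 57, "white", "F"),
    Patient("c3", 70, "white", "F"),  # outside the age window
    Patient("c4", 58, "white", "M"),  # different sex
]
matched = match_controls(cases, controls)
print(matched)
```

A greedy pass is order-dependent, so a real analysis might prefer an optimal matching routine; the sketch only makes the eligibility rule explicit.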

#### *2.2. Data Collection*

Histopathological reports and medical records of patients with GA were reviewed. Patient characteristics and comorbidities present on the date of GA diagnosis by pathology were manually extracted from medical records.

#### *2.3. Definition of Smoking History and Comorbidities*

Smoking history was defined as self-reported cigarette smoking at the time of GA diagnosis or a previous history of smoking, regardless of amount. Comorbidities were any diseases that were ongoing problems at the time of GA diagnosis. Cardiovascular disease included active problems such as atherosclerosis, arrhythmias, cardiomyopathies, and cardiac infections. A history of myocardial infarction, heart failure, and stroke was included even if they were past medical events, insofar as the sequelae of those events were deemed significant enough to be considered active problems in the patient medical record. Liver disease encompassed viral liver infections, non-alcoholic steatohepatitis, hepatic steatosis, autoimmune liver disease, and hereditary liver diseases. Ophthalmic conditions included retinal degeneration, vitreous degeneration, inflammatory conditions of the eye proper or the optic nerve, closed-angle glaucoma, open-angle glaucoma, cataracts, and myopia.

#### *2.4. Statistical Analysis*

Continuous variables were presented as mean ± standard deviation (SD), and categorical variables were analyzed as proportions. Means were compared between cohorts using Student's *t*-test, while proportions were compared using the chi-squared or Fisher's exact test, as appropriate. Logistic regression results were expressed using odds ratios with 95% confidence intervals. Analyses were conducted with Stata/SE, v. 15.1 (StataCorp LLC, College Station, TX, USA). Univariable analyses were performed to compare patient characteristics between the GA cohort and controls. Logistic regression was used to adjust for potential confounding variables in our comparisons. A *p*-value < 0.05 (two-tailed) was considered significant in all analyses.
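For readers less familiar with odds ratios, the following sketch shows how a crude (unadjusted) odds ratio and its Wald-type 95% confidence interval can be computed from a 2×2 table. It is not a reproduction of the Stata analysis described above, it does not adjust for confounders as the logistic regression models do, and the counts are hypothetical. The function name `odds_ratio_ci` is an assumption made for illustration.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Crude odds ratio with a Wald (Woolf) confidence interval.

    a: exposed cases,    b: unexposed cases
    c: exposed controls, d: unexposed controls
    The approximation requires all four cells to be positive.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not taken from the study tables).
or_, lo, hi = odds_ratio_ci(a=20, b=62, c=20, d=144)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Because the interval is built on the log scale, it is asymmetric around the odds ratio on the natural scale, which is also the pattern seen in the intervals reported in this article.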

#### **3. Results**

Patient billing codes and pathology reports identified 471 patients with an ICD-10 L92.0 code who were seen at JHHS from January 2009 to June 2019. A total of 82 patients (17.4%) met the inclusion criteria and were included in the retrospective chart review. Table 1 displays patient demographics and clinical characteristics. On average, both cohorts were aged 58 ± 16 years, predominately female, and of non-Hispanic white race. In regard to smoking, a greater proportion of GA patients had a history of smoking or were active smokers at the time of their diagnosis (*p* = 0.03). Type II diabetes mellitus (*p* < 0.01), liver disease (*p* = 0.04), and non-migraine headache (*p* = 0.02) were present more frequently in patients with GA compared to patients without GA. The prevalence of clinically diagnosed dyslipidemia (*p* = 0.41), hypothyroidism (*p* = 0.63), and solid organ malignancy (*p* = 0.76) did not significantly differ between the study groups.


**Table 1.** Demographic and clinical characteristics of granuloma annulare (GA) patients.

Logistic regression was performed to adjust for potential confounding variables. A smoking history was associated with higher odds of GA (odds ratio (OR) = 1.93; 95% confidence interval (CI), 1.10–3.38; *p* = 0.02) after accounting for age, race, and sex. Patients with type II diabetes (*p* < 0.01), but not liver disease, also had higher odds of concurrent GA after accounting for age, race, sex, and smoking (Table 2). Non-migraine headache remained associated with GA after including age, race, sex, smoking history, cardiovascular disease, and essential hypertension in the logistic regression model (OR = 8.70; 95% CI, 1.61–46.88; *p* = 0.01).
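A simple plausibility check on such results uses the fact that a Wald interval from logistic regression is symmetric on the log scale, so the reported odds ratio should be close to the geometric mean of its interval bounds. The sketch below applies this to the two estimates quoted above, assuming Wald intervals with a normal quantile of 1.96; the helper name `implied_estimates` is hypothetical.

```python
import math

# (OR, lower bound, upper bound) as quoted in the paragraph above.
reported = {
    "smoking history": (1.93, 1.10, 3.38),
    "non-migraine headache": (8.70, 1.61, 46.88),
}

Z_95 = 1.96  # normal quantile assumed for a two-sided 95% interval


def implied_estimates(lower: float, upper: float, z: float = Z_95):
    """Recover the point estimate and SE of log(OR) from a Wald CI."""
    log_or = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.exp(log_or), se


for name, (odds_ratio, lower, upper) in reported.items():
    implied_or, se = implied_estimates(lower, upper)
    print(f"{name}: reported OR {odds_ratio:.2f}, "
          f"implied OR {implied_or:.2f}, SE(log OR) {se:.2f}")
```

Under these assumptions both reported odds ratios agree with their intervals to within rounding, and the larger log-scale standard error implied for non-migraine headache reflects its much wider interval.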


**Table 2.** Logistic regression for comorbidities and granuloma annulare (GA) controlling for age, race, sex, and smoking.

A sub-analysis of patients with GA was performed, examining variation by sex (Table 3). Among GA patients, females were more likely than males to present with ophthalmic conditions (*p* = 0.04), while males were more likely than females to have concurrent cardiovascular disease (*p* < 0.01) and type II diabetes (*p* = 0.05). Further analysis by GA subtype, generalized or localized GA (Table 4), demonstrated that patients with localized GA were younger than patients with generalized GA (*p* < 0.01). Differences were not observed in comorbidities or smoking history by GA subtype.


**Table 3.** Demographic and clinical characteristics of granuloma annulare (GA) patients by sex.



#### **4. Discussion**

The results of our study investigating systemic disease associations with GA support previously published reports of an association between GA and type II diabetes. Additionally, our results align with previous studies which found no associations between GA and dyslipidemia [5], hypothyroidism [1], or malignancy [10,11]. However, we observed a significantly increased prevalence of a smoking history and non-migraine headaches among GA patients. An association between GA and a positive smoking history has not previously been reported, while associations with non-migraine headaches have not been reported outside of case reports.

Type II diabetes mellitus and dyslipidemia are two metabolic disorders that have been reported to be associated with GA. In 2009, a retrospective, multicenter study in Korea found a higher prevalence of diabetes mellitus in generalized GA patients compared to the general Korean population [6]. This observation was corroborated by similar findings in a retrospective analysis in which 44 GA patients in Taiwan were compared to the general Taiwanese population [5], as well as another study that found increased levels of fasting blood sugar in 28 Iranian GA patients compared to healthy controls [1]. However, another study using psoriasis patients as internal controls instead of national data failed to find a statistically significant association between GA and type II diabetes [9]. Studies exploring the relationship between GA and dyslipidemia have similarly observed conflicting results. A case-control study found significant associations between GA and dyslipidemia as well as increased levels of low-density lipoprotein, triglyceride, and total cholesterol [7]. However, these findings were not replicated in a more recent retrospective analysis in 2016 [5]. Our study found an association between type II diabetes and GA but did not observe a significant association between GA and dyslipidemia.

Hypothyroidism and solid organ malignancy are other systemic conditions with inconsistent reports on their associations with GA. A retrospective correlation study of 100 GA patients by Dabski and Winkelmann found 13 patients to have a thyroid disorder (in descending frequency: hypothyroidism, Graves' disease, thyroiditis, thyroid adenoma) [12]. However, whether GA is truly associated with hypothyroidism is unclear. For example, while one small case-control study failed to demonstrate a significant difference in thyroid hormone levels between GA patients and healthy controls [1], another case-control study of similar sample size did, in fact, report an association between localized GA and autoimmune thyroiditis, specifically in adult women [8]. In the latter, the authors suggested a common immunogenetic pathophysiology potentially underlying the two conditions, as well as other systemic, GA-like, granulomatous conditions that lie on a spectrum [8]. Our study did not show that GA is associated with hypothyroidism. Collectively, these findings indicate that the current literature can neither definitively support nor refute an association between GA and hypothyroidism. Additional large-scale, controlled studies are necessary to determine whether a true association between GA and hypothyroidism exists.

The question of whether or not malignancy is associated with GA is also debated. Multiple case reports have demonstrated GA occurring concurrently with various malignancies, which has led to speculation about a potential relationship [2]. While the exact etiology of GA is unknown, several studies have supported the hypothesis that the pathogenesis of GA involves a T-cell mediated response [13]. As a corollary, it has also been hypothesized that GA, in certain settings, may be a cutaneous manifestation of a chronic immune response to an underlying malignancy [14]. However, despite the seemingly high occurrence of GA in cases of malignancy, a meta-analysis of multiple case reports and correlation studies has not supported an association—with the caveat that clinically atypical GA in elderly individuals warrants investigation for an underlying malignancy that can histologically mimic GA [10]. A more recent review article by Hawryluk et al. echoed such conclusions, as well as cautioning against misdiagnoses, since other granulomatous dermatoses could in fact implicate malignancy [15]. Furthermore, a recent case-control study failed to find any association between malignancy and generalized GA [11]. Likewise, our study did not show that GA was associated with solid organ malignancy, thereby corroborating the results of the current literature on the topic.

The results of our study not only help to clarify previously reported associations between GA and certain systemic diseases but also reveal previously unrecognized relationships. For instance, we demonstrate increased prevalence of a smoking history in patients with GA. Even though the mechanism is unclear, smoking has been previously implicated in complications of sarcoidosis, another granulomatous disease [16]. Specifically, cigarette smoking has been shown to predict the development of ocular sarcoidosis in sarcoidosis patients. Smoking is thought to trigger systemic increases in cytokines, such as IL-6, IL-1β, and TNF-α, which are critical in the formation of granulomas [16]. As sarcoidosis and GA are both granulomatous diseases, it is possible that similar mechanisms may underlie the association between smoking and GA.

GA's association with non-migraine headache has not been previously supported by a well-defined study. Our results showed that patients who received a diagnosis for GA were more likely to have non-migraine headache. Due to the lack of specific documentation in the medical record, it was not possible to identify the source of these non-migraine headaches. However, this finding is interesting due to reports of GA occurring simultaneously with giant cell arteritis (GCA), a particular cause of headaches. These conditions have been reported to occur together, and resolve with the same medications [17,18]. The presence of giant cells, granulomas, vascular deposition of IgM and C3, CD4 T-cell involvement [17], and an increased expression of HLA-B15 [18] in both GA and GCA lends credence to the possibility of a shared pathophysiologic mechanism. Despite the limitations of our study, it is possible that our non-migraine headache classification could have served as a proxy for a headache of a vascular origin. These findings encourage further inquiry to elucidate whether a relationship exists between specific headache disorders and GA, as well as their respective mechanisms.

Finally, our study found certain differences in the associations of systemic conditions with GA when stratified by sex. Females were more likely to present with a concurrent ophthalmic condition, while males were more likely to present with cardiovascular disease and type II diabetes. Given our sample size, it is difficult to determine whether these differences are generalizable. However, sex differences may be important to consider in future studies aiming to determine how GA presents in different patient populations.

This study had several limitations. Firstly, this was a retrospective study at an academic, tertiary-care medical center system, which may limit the generalizability of our results. Additionally, there was incomplete information in the medical record, which limited the conclusions that we could draw from our analyses. Important lab values and body mass indexes were difficult to obtain near the time of GA diagnosis. These values may have been confounding variables in our assessment of GA and comorbidities. Furthermore, the lack of detail in the medical record on the source of documented headaches did not allow for the assessment of whether or not GA was associated with a particular type of headache disorder. Lastly, our study was also limited by its sample size and study design. The limited GA cohort size affected the associations that could be made and parts of our cross-sectional study design do not allow us to determine causality.

A range of systemic conditions has been suggested to be associated with GA. Our study contributes to this literature by uncovering meaningful associations between GA and type II diabetes, a history of cigarette smoking, and non-migraine headache. In contrast to some prior reports, no associations were detected between GA and dyslipidemia, hypothyroidism, or malignancy. These results contribute to the growing body of literature on GA and suggest further avenues for investigation.

**Author Contributions:** Conceptualization, J.A. and S.G.K.; Formal analysis, E.A. and C.X.Q.; Investigation, E.A.; Methodology, E.A., M.B., B.K., Y.R.S., J.A. and S.G.K.; Project administration, S.G.K.; Resources, M.B.; Supervision, M.B., J.A. and S.G.K.; Validation, M.B., B.K., Y.R.S., J.A. and S.G.K.; Visualization, C.X.Q. and S.G.K.; Writing—original draft, E.A.; Writing—review and editing, E.A., Y.S.R., M.B., C.X.Q., K.W., J.C., N.S., B.K., Y.R.S., J.A. and S.G.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Effects of Diaphragmatic Breathing on Health: A Narrative Review**

#### **Hidetaka Hamasaki**

Hamasaki Clinic, 2-21-4 Nishida, Kagoshima, Kagoshima 890-0046, Japan; h-hamasaki@umin.ac.jp; Tel.: +81-099-2503535; Fax.: +81-099-250-1470

Received: 30 August 2020; Accepted: 13 October 2020; Published: 15 October 2020

**Abstract: Background:** Breathing is an essential part of life. Diaphragmatic breathing (DB) is slow and deep breathing that affects the brain and the cardiovascular, respiratory, and gastrointestinal systems through the modulation of autonomic nervous functions. However, the effects of DB on human health need to be further investigated. **Methods:** The author conducted a PubMed search regarding the current evidence of the effect of DB on health. **Results:** This review consists of a total of 10 systematic reviews and 15 randomized controlled trials (RCTs). DB appears to be effective for improving the exercise capacity and respiratory function in patients with chronic obstructive pulmonary disease (COPD). Although the effect of DB on the quality of life (QoL) of patients with asthma needs to be investigated, it may also help in reducing stress; treating eating disorders, chronic functional constipation, hypertension, migraine, and anxiety; and improving the QoL of patients with cancer and gastroesophageal reflux disease (GERD) and the cardiorespiratory fitness of patients with heart failure. **Conclusions:** Based on this narrative review, the exact usefulness of DB in clinical practice is unclear due to the poor quality of studies. However, it may be a feasible and practical treatment method for various disorders.

**Keywords:** diaphragmatic breathing; abdominal breathing; breathing exercise; systematic review; randomized controlled trial; respiratory function

#### **1. Introduction**

Breathing is an essential part of life. The diaphragm is one of the major respiratory muscles, and its function is vital for proper respiration. At the end of the 19th century, Sewall and Pollard [1] first investigated the relationship between the movement of the diaphragm and the chest during respiration. In addition to respiration, the diaphragm contributes to vocalization and swallowing. Its dysfunction is associated with various disorders, such as respiratory insufficiency, exercise intolerance, sleep disturbance, and potential mortality [2,3]. The diaphragm has multiple physiological roles. The phrenic nerve, which innervates the diaphragm, has a connection with the vagus nerve, which can affect the whole body system [4]. Diaphragmatic motion in breathing directly and indirectly affects the sympathetic and parasympathetic nervous systems and also influences motor nerve activities and brain mass [5]. The diaphragm also controls postural stability, defecation, micturition, and parturition by modulating intra-abdominal pressure. Furthermore, its function is associated with metabolic balance [6] and with the cardiovascular and intraperitoneal lymphatic systems [3].

As diaphragmatic (abdominal) breathing (DB) is a slow and deep breathing method, it should not be considered as just a breathing control [7]. Since time immemorial, yoga and traditional martial arts such as tai chi have utilized DB in their practice. DB is defined as breathing in slowly and deeply through the nose using the diaphragm with a minimum movement of the chest in a supine position with one hand placed on the chest and the other on the belly [8]. During breathing, practitioners should take care that the chest remains as still as possible and that the stomach moves against the hand, focusing on contracting the diaphragm. Generally, DB practitioners inhale and exhale for approximately six seconds each. DB is a fundamental procedure during meditation practices in individuals who engage in yoga and traditional martial arts such as tai chi. Recently, a systematic review has reported that mind–body exercise (yoga/tai chi) can reduce stress in individuals under high stress or negative emotions by modulating the sympathetic–vagal balance [9]. Martarelli et al. [10] showed that DB increased the antioxidant activity and reduced the oxidative stress after exercise in athletes. DB has the potential to be a non-pharmacological treatment for patients with stress disorder as well as chronic respiratory disease. Although a number of studies have investigated the efficacy of breathing exercises in treating chronic obstructive pulmonary disease (COPD) [11–29], asthma [30–35], postoperative pulmonary function [36–40], and cardiorespiratory performance in post-Fontan patients [41], the effect of DB on other disorders, for example, cancer, heart failure, and anxiety, still needs to be further investigated. As a martial arts practitioner, the author uses DB in daily mind–body exercises (Figure 1) and feels the necessity to assess whether DB has a favorable impact on overall health. This review aims to summarize the current evidence of the impact of DB on the diseases described above as well as on respiratory function and to discuss its future perspective.

**Figure 1.** Breathing in slowly and deeply through the nose with a minimum movement of the chest in a supine position with one hand placed on the chest and the other on the belly. Diaphragmatic breathing has an impact on the brain and cardiovascular, respiratory, and gastrointestinal systems through the modulation of the autonomic nervous function.

#### **2. Methods**

This is a narrative review of the current evidence on the effect of DB on human health. The author searched the literature on DB using PubMed and the Cochrane Library from their inception to May 2020. The search terms (MeSH) were "diaphragmatic," "breathing exercise," "systematic review," and "randomized controlled trial (RCT)." First, the author searched for systematic reviews, which yielded 19 published articles. Second, the author searched for RCTs, which yielded 98 articles. Crossover trials and RCTs already assessed in previous systematic reviews were excluded from this review. The titles and abstracts of the identified articles were reviewed to determine their relevance. Overall, a total of 10 systematic reviews and 15 RCTs were included.

#### **3. Results**

#### *3.1. Systematic Reviews*

COPD is the most well-studied disease on which DB has a significant effect. In 2012, the Cochrane Airway Group reported the efficacy of breathing exercises in treating COPD [42]. In this review, 16 RCTs involving 1233 subjects were included with a mean age of 51–73 years and mean forced expiratory volume in 1 s (FEV1) of 30–51%, which suggested that the study subjects had moderate-to-severe COPD. Of these, 13 studies were included in the meta-analysis. Primary outcomes were dyspnea, quality of life (QoL), and exercise capacity. Breathing exercises, such as yoga with pranayama timed breathing, pursed-lip breathing, and DB, effectively improved the six-minute walk distance. However, no effects on dyspnea and QoL were observed. Although only two studies of DB [25,43] were included in this systematic review, four-week supervised DB training improved the six-minute walk distance (mean difference (MD), 34.7 m; 95% confidence interval (CI), 4.1–65.3) [25]. On the other hand, another study reported that DB had an unfavorable effect on dyspnea [43]. Recently, Ubolnuar et al. [44] also assessed 19 RCTs investigating the efficacy of breathing exercises in patients with any severity stage of COPD. The types of breathing exercise included DB, pursed-lip breathing, just relaxation and slow breathing, ventilatory feedback training, and singing. Overall, the breathing exercises improved respiratory function, such as respiratory rate (RR), tidal volume (VT), and respiratory time, as well as the QoL of COPD patients. In particular, DB significantly improved the RR (MD, −1.09; 95% CI, −2.19 to 0.00), although the quality of evidence is low [14,29]. However, the QoL measured using the St. George's Respiratory Questionnaire and dyspnea did not differ between the DB and control groups. Taken together, these results indicate that, although breathing exercises including DB show promise for improving exercise capacity and respiratory function, their effects on clinical symptoms and QoL are inconsistent owing to differences in the severity stage of COPD.
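To help interpret the mean differences quoted above, the following sketch back-calculates the standard error and an approximate z statistic from each reported 95% CI. It assumes symmetric, normal-approximation intervals with a quantile of 1.96, which the cited reviews may or may not have used; the function name `z_from_ci` is hypothetical.

```python
Z_95 = 1.96  # normal quantile assumed for a two-sided 95% interval


def z_from_ci(md: float, lower: float, upper: float, z: float = Z_95):
    """Approximate z statistic and SE of a mean difference from its CI."""
    se = (upper - lower) / (2 * z)
    return md / se, se


# Six-minute walk distance after DB training: MD 34.7 m (4.1 to 65.3).
walk_z, walk_se = z_from_ci(34.7, 4.1, 65.3)

# Respiratory rate with DB: MD -1.09 (-2.19 to 0.00).
rr_z, rr_se = z_from_ci(-1.09, -2.19, 0.00)

print(f"walk distance: SE {walk_se:.1f} m, z {walk_z:.2f}")
print(f"respiratory rate: SE {rr_se:.2f}, z {rr_z:.2f}")
```

Under these assumptions, the walk-distance estimate lies clearly beyond the conventional threshold, whereas the respiratory-rate interval touches zero, so its z statistic sits almost exactly at the threshold and small rounding differences could move it to either side.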

The Cochrane Airway Group reevaluated the efficacy of breathing exercises in adults with asthma in 2020 [45]. Nine studies were added to the previous systematic review published in 2013, and a total of 22 RCTs were included in this systematic review and meta-analysis. Unfortunately, since only one RCT met the inclusion criteria [46], the effect of DB on QoL and asthma symptoms was inconclusive. However, breathing exercises such as yogic breathing and the Buteyko breathing technique had positive effects on QoL and asthma symptoms. Moreover, breathing exercises improved the QoL measured using the Asthma Quality of Life Questionnaire at three months (MD, 0.42; 95% CI, 0.17–0.68) and at six months (odds ratio, 1.34; 95% CI, 0.17–0.68) compared with no active control. Furthermore, hyperventilation symptoms and FEV1.0% were predicted to be improved by breathing exercises.

Prem et al. [47] investigated the effect of DB on the QoL of patients with asthma. Only three RCTs assessing the effect of DB on asthma were included [30–32]. The intervention used in the study by Thomas et al. [31] was DB plus nasal breathing exercise. The control interventions were asthma education [30,31] and conventional asthma medication [32]. This systematic review did not include a meta-analysis. However, DB improved the QoL measured using the Asthma Quality of Life Questionnaire; specifically, the questionnaire score improved by 0.79 [30] and 1.12 [31]. Moreover, the scores of the Asthma Control Test (from 18 ± 2.5 to 22 ± 3.3) and end-tidal CO2 (by 4 mmHg) were improved in the study by Grammatopoulou et al. [32]. The authors suggested that DB improved the QoL of patients with asthma based on the reduction in hyperventilation, which physiologically improved the respiratory function.

The effect of breathing exercise in children with asthma was systematically reviewed in 2016 [48]. Only three studies [33–35] were eligible for this systematic review. The primary outcomes were QoL, asthma symptoms, and adverse events. None of these studies evaluated the single effect of DB, and the breathing exercise programs consisted of DB, lateral costal breathing [33], pursed-lip breathing [35], and endurance exercise [34]. A heterogeneity in the asthma severity of patients among the studies was observed. The difference in the primary outcomes could not be found in the comparisons between the intervention and control groups. Lima et al. [35] reported that the peak expiratory flow (PEF) was improved after the intervention, but based on the meta-analysis, no clear evidence could confirm that DB improved the respiratory function. Moreover, it was inconclusive that DB had a benefit or risk in children with asthma.

Dysfunctional breathing is associated with poor asthma control in children [49,50]. Barker et al. [51] assessed the effect of breathing exercises in children with dysfunctional/hyperventilation syndrome. However, no eligible studies were found for this systematic review. This lack of evidence is due to the insufficient number of well-designed RCTs conducted in children. On the other hand, Jones et al. [52] evaluated the effect of breathing exercises in adults with dysfunctional/hyperventilation syndrome. Since only a single RCT [53] met the inclusion criteria, this systematic review could not provide a reliable conclusion regarding the effect of DB on dysfunctional breathing. The included study enrolled 45 patients with hyperventilation syndrome and divided them into three groups (relaxation therapy, relaxation therapy and DB, and control) of 15 patients each. Although the frequency and severity of hyperventilation attacks were significantly reduced in the DB group compared with the control group, no detailed data or statistical analysis were presented in this study [53]. Therefore, the effect of DB on dysfunctional breathing is still unclear.

A systematic review with meta-analysis examined (1) the generalizability, consistency, volume, and quality of the evidence for breathing control; and (2) the effect of breathing control on various clinical outcomes [54]. This systematic review included a total of 20 studies: 2 RCTs [55,56], 3 non-RCTs [57–59], and 15 quasi-experimental studies [14,60–73]. The study participants were heterogeneous; 80% of the studies recruited patients with chronic respiratory disease, such as COPD, and 20% of the studies included patients with other conditions (e.g., post-surgery, chronic progressive multiple sclerosis) and asymptomatic individuals. DB was required to be the single intervention used in all studies. DB had beneficial effects on abdominal excursion (MD, 1.36; 95% CI, 0.42–2.31), diaphragm excursion (MD, 1.39; 95% CI, 1.00–1.77), short-term changes in respiratory function, RR (MD, −0.84; 95% CI, −1.09 to −0.60), VT (MD, 0.98; 95% CI, 0.71–1.25), gas exchange, arterial oxygen saturation (MD, 0.63; 95% CI, 0.25–1.02), and percutaneous oxygen (MD, 1.48; 95% CI, 0.85–2.11). On the other hand, DB had a negative impact on the work of breathing (MD, 1.06; 95% CI, 0.52–1.60) and dyspnea (MD, 1.47; 95% CI, 0.88–2.05) in patients with severe respiratory disease. DB had no significant effects on ventilation, long-term change in respiratory function, vital capacity (VC), forced vital capacity (FVC), expiratory flow rate, FEV1, respiratory muscle strength, oxygen consumption, respiratory muscle efficiency, ventilation distribution, and 12-min walk test. Overall, DB was effective in the short-term improvement of respiratory function, but it did not have a beneficial effect on the long-term physiological outcomes and energy cost of breathing. Interestingly, DB could negatively affect the respiratory symptoms of patients with severe respiratory disease and may not be applicable to all kinds of respiratory disease.
However, the generalizability and quality of evidence are not high, as this systematic review included only two RCTs and the study subjects and intervention methods were highly heterogeneous among the studies.
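The pooled estimates above are reported as a mean difference (MD) with a 95% confidence interval (CI). As a reading aid, the following minimal Python sketch shows how such an interval can be interpreted. It assumes a symmetric interval built on a normal approximation (critical value 1.96), which the review does not state explicitly, and the function names are illustrative only.

```python
def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate standard error implied by a symmetric 95% CI (assumption)."""
    return (upper - lower) / (2 * z_crit)


def z_statistic(estimate: float, lower: float, upper: float) -> float:
    """Ratio of the point estimate to its implied standard error."""
    return estimate / se_from_ci(lower, upper)


def excludes_zero(lower: float, upper: float) -> bool:
    """A mean difference is conventionally 'significant' if its CI excludes 0."""
    return lower > 0 or upper < 0


# Abdominal excursion as reported above: MD 1.36 (95% CI, 0.42-2.31)
md, lo, hi = 1.36, 0.42, 2.31
print(f"implied SE = {se_from_ci(lo, hi):.3f}")
print(f"z = {z_statistic(md, lo, hi):.2f}")
print(f"CI excludes zero: {excludes_zero(lo, hi)}")
```

Under these assumptions, an interval that lies entirely above or below zero corresponds to a statistically significant pooled difference, which is how the beneficial and negative effects listed above can be read.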

Grams et al. [74] examined the effects of breathing exercises on the prevention of postoperative pulmonary complications and recovery of pulmonary function in patients who had upper abdominal surgery. A total of six RCTs or quasi-RCTs were included in this systematic review [36–40,75], four of which were conducted in Brazil. The meta-analysis showed that the maximal expiratory pressure and maximal inspiratory pressure increased by 12.8 (95% CI, 7.6–18.1) and 5.6 (95% CI, 0.6–10.5) mmH2O, respectively, on Day 1 postop [38–40]. However, DB was observed to have no significant effects on respiratory function including FVC, FEV, and FEV1. This systematic review indicates that breathing exercises, which mainly consist of DB, improve the respiratory muscle strength of patients after upper abdominal surgery. However, the included studies investigated the effect of DB on Day 1–5 postop, and the respiratory functions of the study subjects at baseline were heterogeneous. Therefore, the findings of this systematic review are limited to a specific circumstance and the generalizability is low.

Recently, Hopper et al. [76] reported that DB might reduce physiological and psychological stress, although a meta-analysis could not be performed due to heterogeneity in the methodology and outcome measures. One RCT [77] and two quasi-experimental studies [78,79] were included in this qualitative analysis. Ma et al. [77] reported that DB reduced the RR and salivary cortisol levels in an RCT, suggesting that DB has a favorable effect on stress. The two quasi-experimental studies also showed that DB was effective for improving blood pressure control [78] and stress measured using the Depression Anxiety Stress Scale-21 [79]. However, more well-designed RCTs with an appropriate sample size are needed to conclude whether DB is beneficial for reducing stress.

Table 1 summarizes the results of these systematic reviews.

**Table 1.** Systematic reviews and meta-analyses assessing the effects of diaphragmatic breathing on various disorders.



COPD, chronic obstructive pulmonary disease; RCT, randomized controlled trial; ↑, increase; →, no change; ↓, decrease.

#### *3.2. Randomized Controlled Trials*

#### 3.2.1. COPD and Asthma

Most previous studies investigating the effects of DB have been conducted in patients with COPD. The author has identified a recent RCT that was not included in previous systematic reviews. Yekefallah et al. [80] compared the effects of a breathing exercise involving DB and pursed-lip breathing and of upper limb exercise on exercise capacity, measured through a six-minute walking test, in patients with COPD. Seventy-five patients with moderate-to-severe COPD were recruited and divided into three groups: upper limb exercise group (*n* = 25), breathing exercise group (*n* = 25), and control group (*n* = 25). Participants in the breathing exercise group performed DB and pursed-lip breathing for one minute each, with a one-minute rest between these exercises. They were asked to do these exercises four times a day for four weeks. On the other hand, participants in the upper limb exercise group performed upper limb strengthening exercises using dumbbells for 20 min per session, thrice a week, for four weeks. All participants completed the study. The mean walking distance significantly increased in the breathing exercise group (from 355.3 ± 47.9 m to 376.9 ± 37 m) and in the upper limb exercise group (from 389.8 ± 5.8 m to 409.5 ± 29.8 m) during the study, whereas the control group did not show any significant change. Although both the upper limb exercise and DB plus pursed-lip breathing were effective in increasing the walking distance, a post hoc analysis revealed that the walking distance of the upper limb exercise group was longer than that of the breathing exercise group. This study indicates that the upper limb strengthening exercise is more effective for improving the exercise capacity of COPD patients than DB training.
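To put the walking distances reported for this trial side by side, the following short Python sketch computes the absolute and relative change in the group means. It uses only the means quoted above; it is a descriptive illustration and not the statistical analysis performed by the trial authors, and the helper name is hypothetical.

```python
def change_in_mean(before: float, after: float) -> tuple[float, float]:
    """Return (absolute change, percent change) between two group means."""
    absolute = after - before
    percent = 100.0 * absolute / before
    return absolute, percent


# Mean six-minute walking distance (m) before and after four weeks [80]
groups = {
    "breathing exercise": (355.3, 376.9),
    "upper limb exercise": (389.8, 409.5),
}

for name, (before, after) in groups.items():
    absolute, percent = change_in_mean(before, after)
    print(f"{name}: +{absolute:.1f} m ({percent:.1f}%)")
```

Because the two groups started from different baseline means, a comparison of the final distances alone does not by itself show which intervention produced the larger gain.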

The changes in respiratory function and in abdominal and thoracic kinematics due to DB training in patients with moderate persistent asthma were evaluated, although the intervention might be respiratory muscle training rather than simple DB training [81]. Eighty-eight inactive patients with asthma aged between 18 and 34 were enrolled in this RCT. The study participants were categorized into aerobic exercise (*n* = 22), DB (*n* = 22), aerobic exercise combined with DB (*n* = 22), and control (*n* = 22) groups. The participants in the intervention groups performed the training program thrice a week for eight weeks. The DB training in this study was unique. The participants in the DB group breathed using a tube to maximize their inspiration and expiration, and a 2.5 kg weight (Week 1–4) or a 5 kg weight (Week 5–8) was placed on their abdomen. They completed three sets of 5–10 repetitions using one second of inspiration and two seconds of expiration, three sets of 10–15 repetitions using two seconds of inspiration and four seconds of expiration, and three sets of 15–20 repetitions using three seconds of inspiration and six seconds of expiration. The participants in the aerobic exercise group walked and/or jogged for 30 min at the intensity of 60% of the age-predicted maximum heart rate. After the eight-week intervention, the DB training improved the FVC (from 3.01 ± 0.58 L to 3.52 ± 0.74 L), FEV1 (from 2.85 ± 0.57 L to 3.22 ± 0.63 L), FEV1/FVC ratio (from 94.86 ± 4.94% to 90.64 ± 6.67%), PEF (from 7.10 ± 1.57 L to 7.68 ± 1.26 L), and inspiratory VC, but the forced expiratory flow (FEF) rate, maximum voluntary ventilation (MVV), and VT did not change. On the other hand, aerobic exercise improved the FVC (from 2.77 ± 0.48 to 3.11 ± 0.71 L), FEV1 (from 2.72 ± 0.53 to 2.97 ± 0.65 L), PEF (from 7.15 ± 1.45 L to 7.57 ± 1.47 L), MVV (from 103.65 ± 27.86 L/min to 128.97 ± 27.56 L/min), and inspiratory VC, but the FEV1/FVC ratio, FEF, and VT did not change.
Aerobic exercise combined with DB more effectively improved the FVC (from 2.87 ± 0.67 L to 3.68 ± 0.82 L) and FEV1 (from 2.70 ± 0.67 L to 3.30 ± 0.70 L) than aerobic exercise alone, but DB and aerobic exercise were equally effective in the improvement of FVC and FEV1. Aerobic exercise, DB, and DB combined with aerobic exercise improved the chest circumferences during inspiration, but no significant improvement was observed during the rest and expiration phases. Interestingly, DB improved the resting, inspiratory, and expiratory abdominal circumferences at the height of the midpoint between the umbilicus and the xiphoid process, but aerobic exercise did not change the resting circumference.

#### 3.2.2. Cancer

Campbell et al. [82] investigated the efficacy of relaxation techniques in treating the eating problems of cancer patients who had a prognosis of at least six months and nutritional problems such as weight loss. The relaxation technique included DB, autosuggestion, relaxing of muscles, and image control. The changes in weight and performance status measured using the Karnofsky Performance Status Scale during the study period were assessed. Twenty-two patients with cancer were randomly assigned to the intervention (*n* = 12) and control (*n* = 10) groups. After performing the relaxation training for six weeks, 75% of the patients gained weight within 10% of their ideal weight. Performance status was improved in 33% of the patients after the eight-week intervention. Thus, relaxation training using DB may support the treatment of eating problems in patients with cancer.

Shahirai et al. [83] evaluated the effect of DB, muscle relaxation, and body image on the QoL of older patients with breast or prostate cancer. Fifty patients were recruited and categorized into the intervention (*n* = 25) and control (*n* = 25) groups. The functional QoL score was improved immediately after the intervention (from 31.6 to 60.5 points) and six weeks after the intervention (from 31.6 to 66 points), whereas no significant changes were observed in the control group. Furthermore, the mean score of the general domain of QoL was also increased immediately after the intervention (from 36.33 to 64.33 points) and six weeks after the intervention (from 36.33 to 52.33 points). On the other hand, it was decreased in the control group during the study period. These studies applied DB together with concurrent techniques rather than DB alone, and the outcome measures were not mortality and survival period. Thus, whether DB is useful for cancer treatment or not is inconclusive. However, relaxation and DB techniques may be a cost-effective and convenient method for improving the general condition of patients with cancer.

#### 3.2.3. Other Diseases

Silva and Motta [84] investigated the effect of DB, abdominal muscle training, and massage on pediatric patients with chronic functional constipation. Seventy-two patients aged 4–18 were categorized into the physiotherapy plus medication (*n* = 36) and the medication using only laxatives (*n* = 36) groups. The physiotherapy consisted of DB, isometric training of the abdominal muscles to increase intra-abdominal pressure, and slow circular clockwise abdominal massage. After the six-week intervention, the defecation frequency was significantly higher in the physiotherapy group than the medication-only group. Furthermore, DB may increase intra-abdominal pressure and stimulate the parasympathetic activity, which increases the colonic motility and improves the defecation frequency.

Wang et al. [85] investigated the effect of DB on blood pressure in prehypertensive patients. Twenty-six postmenopausal women aged 45–60 were enrolled and categorized into the intervention and control groups. Twenty-two participants (intervention group, *n* = 12; control group, *n* = 10) completed the study. The intervention group was treated with DB combined with frontal electromyographic biofeedback-assisted relaxation training, whereas the control group only performed DB techniques. All participants performed 10 sessions of treatment once every 3 days. After the training, in the intervention group, systolic and diastolic blood pressures were decreased by 8.4 and 3.9 mmHg, respectively. DB alone also significantly decreased the systolic blood pressure by 4.3 mmHg, but no changes in the diastolic blood pressure were observed. DB combined with the biofeedback training was more effective in lowering the blood pressure than DB alone. In addition, the RR interval increased during the training in the intervention group, whereas no change was observed in the control group. The standard deviation of the normal–normal intervals significantly increased in both groups. Although DB alone was effective in lowering the blood pressure and improving the heart rate variability, the biofeedback training seemed to strengthen its effect through inhibiting sympathetic activity and improving vagal tone [86].

Seo and colleagues [87] examined the effect of DB on dyspnea and physical activity of patients with heart failure. Thirty-six patients were enrolled in this study and were categorized into the home-based DB retraining (*n* = 18) and control (*n* = 18) groups. A total of 29 patients (intervention group, *n* = 13; control group, *n* = 16) completed the study, and 27 patients (intervention group, *n* = 12; control group, *n* = 15) who continued the home-based DB retraining were followed up for five months. After the eight-week intervention, the DB group showed little improvement in dyspnea. The functional status in the DB and control groups increased by 10.5% and 4.4%, respectively, but declined by 2.2% in the control group in the five-month follow-up. On the other hand, the average daily activity measured by a triaxial accelerometer, ActiGraph, significantly increased by 14% in the DB group and decreased by 6% in the control group. No adverse effects were reported. Thus, DB was a feasible treatment option for patients with heart failure. Daily physical activity can be increased due to the improvement of dyspnea through regular DB exercise, which may lead to maintaining or improving the cardiorespiratory fitness of patients with heart failure. Furthermore, the results of the studies by Wang et al. [85] and Seo et al. [87] indicate that DB has beneficial effects on cardiovascular health.

Sutbeyaz and colleagues [88] conducted an interesting RCT that compared the efficacy of DB and pursed-lip breathing in inspiratory muscle training for improving the cardiopulmonary functions of patients with subacute stroke. Forty-five inpatients with stroke were recruited and categorized into the breathing retraining (*n* = 15), inspiratory muscle training (*n* = 15), and control (*n* = 15) groups. The breathing training program consisted of 15 min of DB combined with pursed-lip breathing, 5 min of air-shifting techniques, and 10 min of voluntary isocapnic hyperpnea. The participants received daily training, six times a week, for six weeks. No significant changes in VC, FVC, FEV1, FEF25–75%, and MVV from baseline in the DB group were observed, but the PEF of the DB intervention group improved as compared with both the inspiratory training and control groups. In contrast, inspiratory muscle training significantly improved VC, FVC, FEV1, FEF25–75%, and MVV as compared with controls. Interestingly, DB increased both the maximum inspiratory and expiratory pressures, but inspiratory muscle training did not increase the maximum expiratory pressure. In contrast to DB, inspiratory muscle training improved the exertional dyspnea and functional status based on the Barthel Index and Functional Ambulation Category scores. The general health, pain, vitality, and emotional role domains of the SF–36 improved in the DB group from baseline as compared with the control group. The short-term inspiratory muscle training effectively improved the respiratory function and exercise capacity of patients with stroke, but DB was also effective in improving the PEF, inspiratory and expiratory pressures, and QoL. Considering that inspiratory muscle training requires the appropriate medical equipment, DB is a more feasible treatment option for improving the cardiopulmonary function of inpatients with stroke.

Eherer and colleagues [89] assessed the effect of four-week DB training on the QoL, pH-metry, and on-demand proton pump inhibitor usage of patients with nonerosive gastroesophageal reflux disease (GERD). Nineteen patients were enrolled in this RCT and were categorized into the training (*n* = 10) and control (*n* = 9) groups. The training group engaged in daily DB practice for at least 30 min. After the four-week DB training, the time with a pH < 4.0 in the training group decreased from 9.1% ± 1.3% to 4.7% ± 0.9%, and the QoL scores measured using the GERD Health-Related Quality of Life Scale also improved from 13.4 ± 1.98 to 10.8 ± 1.86, but no changes in the control group were observed. Furthermore, after the nine-month follow-up, patients who continued the DB techniques showed an improvement in their QoL scores (from 15.2 ± 2.2 to 9.7 ± 1.6) and proton pump inhibitor usage (from 98 ± 34 mg/week to 25 ± 12 mg/week). Thus, DB as a non-pharmacological intervention was observed to reduce the proton pump inhibitor usage and improve the long-term QoL of patients with GERD.

In 2005, an interesting RCT was conducted in India [90]. Migraine is a common but hard-to-treat disease. Kaushik et al. investigated whether biofeedback-assisted DB could treat migraine. This study enrolled 192 patients who were then categorized into biofeedback (*n* = 96) and control (*n* = 96) groups. Of these, 24 (25%) patients in the biofeedback group were excluded. The control group received 80 mg/day of propranolol, whereas the biofeedback group was subjected to DB and relaxation with electromyogram and temperature feedback for six months. Biofeedback-assisted DB was effective in 66.66% of the patients. In both groups, the severity, frequency, number of vomiting episodes, and duration of attacks were decreased. One year after the intervention, the resurgence of migraine was observed in 9.37% of the participants in the biofeedback group, which was significantly lower than that of the propranolol group (38.54%; *p* < 0.001). In the propranolol group, adverse effects such as fatigue and nausea were observed in 13.54% of the patients, whereas side effects were only observed in 5.2% of the patients in the biofeedback group. The authors recommended that biofeedback-assisted DB and relaxation techniques should be used as a treatment for migraine.

A systematic review has shown that DB may be useful for stress management [76]. Chen et al. [91] evaluated the effectiveness of a DB training program on anxiety. Anxiety is associated with respiratory symptoms such as dyspnea, shallow breathing, hyperventilation, and chest tightness [92], as well as cardiovascular symptoms such as tachycardia and palpitations [92,93]. The authors hypothesized that relaxation and DB techniques could reduce anxiety. Forty-six individuals who had anxiety for at least a month were recruited, but only 30 participants (DB group, *n* = 15; control group, *n* = 15) completed the eight-week study. The DB group practiced DB at least twice a day, with 10 exercises per session. The anxiety scores measured using the Beck Anxiety Inventory declined from baseline (19.13 ± 7.52) to week 4 (12.67 ± 7.09) and also from week 4 to week 8 (5.33 ± 4.52). Moreover, after the eight-week DB training, the peripheral temperature increased from 33.26°C ± 1.49°C to 34.77°C ± 1.01°C, heart rate decreased from 85.52 ± 8.0 to 72.45 ± 5.57 beats/min, and breathing rate decreased from 16.24 ± 2.27 to 12.59 ± 2.40 breaths/min, whereas no significant changes in the control group were observed. These findings suggest that DB is effective in reducing anxiety and leads to favorable changes in physiological indicators.

Table 2 summarizes the results of these RCTs.


**Table 2.** Randomized controlled trials assessing the effects of diaphragmatic breathing on


COPD, chronic obstructive pulmonary disease; DB, diaphragmatic breathing; VC, vital capacity; FVC, forced vital capacity; FEV1, forced expiratory volume in 1 s; PEF, peak expiratory flow; FEF, forced expiratory flow; MVV, maximum voluntary ventilation; VT, tidal volume; ↑, increase; →, no change; ↓, decrease.

#### 3.2.4. Healthy Individuals

The effects of DB on healthy individuals have also been investigated. An experimental study was conducted to investigate whether DB had an impact on motion sickness in a virtual reality environment [94]. Healthy individuals were screened for motion sickness susceptibility. A total of 60 motion sickness susceptible subjects were randomly categorized into the DB (*n* = 31) and control (*n* = 29) groups. The participants wore 3D goggles and experienced motion sickness in a virtual reality space (10-min fluctuating view of a stormy sea). During the virtual reality experience, the respiration rate was significantly lower (11.38 ± 3.49 breaths/min vs. 16.21 ± 2.77 breaths/min) and the heart rate variability (respiratory sinus arrhythmia) was significantly higher (7.46% ± 1.05% vs. 6.38% ± 0.86%) in the DB group compared with the control group. In addition, the self-reported motion sickness rating (1.37 ± 0.44 vs. 1.78 ± 0.63) and the motion sickness assessment questionnaire score (2.1 ± 0.91 vs. 2.85 ± 1.72) were significantly lower in the DB group than those in the control group. In the DB group, a positive correlation between the respiration rate and motion sickness rating and negative relationships of the heart rate variability with respiration rate and motion sickness rating were observed. Therefore, these findings suggested that DB increased the parasympathetic nervous system activity, decreased the respiration rate, and improved the motion sickness symptoms.

Gimenez et al. [95] compared the effectiveness of comprehensive directed breathing retraining with DB on male smokers who had exertional dyspnea but normal spirometry. Twenty-four active male smokers aged 33–60 were enrolled and categorized into the experimental (comprehensive directed breathing retraining) and control (DB) groups. Both groups performed 60 min of DB, 30 min of walking, and conditioning exercises for 5 days a week for 4 weeks. The participants were asked to continue DB, walking, and exercises at home twice daily for at most 30 min. The participants in the experimental group were additionally educated about the anatomy and physiology of respiration, made aware of abnormal breathing patterns, shown the ventilatory rhythm on a spirogram, and shown a DB instructional film. The measurement of physiologic parameters was performed at rest and at 40-W exercise for 10 min. In the experimental group, 34 of 44 lung function parameters, such as dyspnea index, ventilation capacity, FEV1, PEF, VO2, VCO2rest, and PaO2, were improved. The single DB intervention did not effectively improve the exertional dyspnea and lung function. Moreover, the authors referred to an unfavorable effect of DB that previous studies on patients with COPD had shown: the possibility that DB worsened the chest wall motion, reduced the efficiency of ventilation, and increased the respiratory workload [64,96,97].

Han and Kim [98] investigated the effect of DB combined with upper extremity exercise on the lung function of young healthy individuals. Forty male adults were recruited and categorized into the experimental (DB with upper extremity exercise; *n* = 20) and control (only DB; *n* = 20) groups. Both groups performed 10 min of warm-up exercise, 5 min of DB, and 10 min of cool-down exercise. Additionally, the experimental group performed breathing exercises with 25 min of dynamic upper extremity exercise using an elastic band at a resistance of 40% of one repetition maximum, whereas the control group performed 25 min of regular breathing exercise. Both groups performed the exercise session thrice a week for four weeks. After the four-week training, FVC significantly increased in both experimental and control groups. FEV1 and PEF did not change significantly in either group. However, FEV1 increased by 0.05 L in the experimental group, whereas it decreased by 0.02 L in the control group. This study indicates that DB is effective in improving FVC, but the upper extremity exercise may have an additional effect on obstructive ventilatory disturbance.

Bahensky et al. [99] investigated how DB based on yoga affects the efficiency of breathing in adolescent endurance runners. This study included 37 runners who performed endurance training at least six times a week. The intervention group (*n* = 21) engaged in DB exercise for at least 10 min per session, at least 5 times a week, for 4 months. The VT and breathing frequency were measured at two and four months after the intervention started. A spiroergometry test was performed using a bicycle ergometer at the point of subjective exhaustion. In the intervention group, VT significantly increased from 2.02 ± 0.43 L at baseline to 2.11 ± 0.43 L after the two-month intervention and to 2.25 ± 0.51 L after the four-month intervention, whereas no changes in the control group were observed. Moreover, the breathing frequency significantly decreased from 59.0 ± 8.6 breaths/min at baseline to 55.6 ± 9.5 breaths/min after the two-month intervention and to 52.2 ± 9.2 breaths/min after the four-month intervention, whereas no changes in the control group were observed. The DB training for four months effectively increased the VT by 10.96% and decreased the breathing frequency by 11.47%. DB may improve the endurance capacity of the respiratory muscles in healthy adolescents.

Previous studies have shown that DB has no significant impact on aerobic capacity in healthy individuals. Respiratory muscle training also appears to have no beneficial effects on VO2max in healthy non-smokers [100] and athletes [101].

#### **4. Discussion**

DB has various physiological effects in humans. The diaphragm is the major respiratory muscle. As the movement of the diaphragm has a positive correlation with the lung volume [102], using the diaphragm consciously during respiration increases the lung capacity. DB facilitates slow respiration, but if RR decreases, hypercapnia and the activation of chemoreceptors would be induced to increase RR to maintain the respiratory homeostasis [103]. DB that controlled RR at six breaths/min reduces the chemoreflex response to hypoxia and hypercapnia compared with normal breathing [104]. Decreased RR increases the VT, which improves the efficiency of ventilation for oxygen [105] through alveolar recruitment and distention, improving the alveolar ventilation due to reduced alveolar dead space and increasing the arterial oxygen saturation [103]. Therefore, DB has a potential to improve the blood oxygen levels.
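The link between a slower RR, a larger VT, and more efficient ventilation can be illustrated with the standard textbook relations: minute ventilation = RR × VT, and alveolar ventilation = RR × (VT − dead space). The Python sketch below applies them to two breathing patterns. All numbers (a dead space of 0.15 L and the two RR/VT pairs) are illustrative assumptions chosen so that minute ventilation is equal in both cases; they are not taken from the cited studies.

```python
def minute_ventilation(rr: float, vt: float) -> float:
    """Total air moved per minute (L/min): breaths/min x tidal volume (L)."""
    return rr * vt


def alveolar_ventilation(rr: float, vt: float, dead_space: float) -> float:
    """Air reaching the alveoli per minute (L/min), after subtracting dead space."""
    return rr * (vt - dead_space)


DEAD_SPACE_L = 0.15  # assumed anatomical dead space, illustrative value

patterns = {
    "fast, shallow (15 breaths/min, VT 0.4 L)": (15, 0.4),
    "slow, deep (6 breaths/min, VT 1.0 L)": (6, 1.0),
}

for label, (rr, vt) in patterns.items():
    total = minute_ventilation(rr, vt)
    alveolar = alveolar_ventilation(rr, vt, DEAD_SPACE_L)
    print(f"{label}: total {total:.2f} L/min, alveolar {alveolar:.2f} L/min "
          f"({100 * alveolar / total:.0f}% of total)")
```

With the same total ventilation, the slow and deep pattern delivers a larger share of each breath to the alveoli because the fixed dead space is ventilated fewer times per minute, which is consistent with the mechanism described above.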

RR also affects the heart rate, systemic blood pressure, and circulating blood volume. Generally, inspiration decreases the intrathoracic pressure and increases the pressure gap between the right heart and the systemic circulation, which increases the venous return to the right heart. On the other hand, the pulmonary venous return decreases and the blood volume in the left heart is reduced. As a result, the cardiac output increases due to the increase of blood volume in the right heart. This physiological action is reversed in expiration [103]. Heart rate increases during inspiration and decreases during expiration while arterial blood pressure is lowered [106]. DB enhances the fluctuations in blood pressure and heart rate [103] via slow breathing [107] and diaphragm excursions, therefore improving the baroreflex sensitivity, heart rate variability, and blood pressure oscillations [103,108].

Breathing has a close relationship with autonomic nervous system function. The phrenic nerve that controls the movement of the diaphragm is connected to the vagus (parasympathetic) nerve [4]. Decreasing the RR by DB activates the parasympathetic nervous activity while suppressing the sympathetic nervous activity [11]. Chang et al. [109] reported that slow breathing with eight breaths/min makes the balance of the parasympathetic nervous activity dominant. Autonomic dysfunction, for example, a reduction in heart rate variability, is associated with an increased risk of cardiovascular mortality and morbidity [110]. Hyperactive sympathetic nervous activity and hypoactive parasympathetic nervous activity can be regulated by DB, which will improve the cardiovascular health. In addition, yoga practice tends to tune the brain toward a parasympathetically driven mode and positive states [111]. Jerath et al. [112] indicated that breathing stimulated the vagal activation of gamma-aminobutyric acid pathways in the brain, and reduced stress and anxiety. Furthermore, DB appears to have a favorable effect on the cardiovascular system and brain through the improvement of the autonomic balance.

Although the current evidence regarding the effects of DB on human health is accumulating, several limitations should be considered before its efficacy in clinical practice can be concluded. Firstly, the DB technique has not been standardized among the studies. The inspiratory and expiratory phase times each ranged from 4 to 8 s, and the practitioners performed DB in various postures such as supine position, semi-recumbent position, or seated position. Moreover, the optimal RR and posture for achieving physiological benefits are still unknown. Secondly, the effect of DB may differ depending on the severity of the diseases. For instance, DB could be harmful for dyspnea in patients with severe COPD. Therefore, future studies should investigate whether the effects of DB differ according to the severity of diseases. Thirdly, considerable heterogeneity among studies was observed, such as the characteristics of the study subjects, intervention frequency and duration, and controls. Furthermore, previous systematic reviews assessed in this study have different criteria of inclusion. For example, several studies that did not investigate the single effect of DB were included in a systematic review (e.g., Thomas et al. [31]). Fourthly, systematic reviews included a wide range of studies from the 1950s to 2010s. Studies that were performed 50 years ago have important information; however, recent studies may be more important because statistical methods and quality of data advance with the times. Such heterogeneity might cause the findings of this review to be inconclusive. Finally, the primary outcomes of systematic reviews are usually clinical symptoms, QoL, respiratory function, and exercise capacity, and no studies have evaluated the effect of DB on hard endpoints, such as the development of respiratory failure, cardiovascular disease, and mortality. Most of the studies have short study periods (e.g., 4–6 weeks). Thus, the long-term effect of DB should be clarified in the future. Despite these limitations, DB has the potential to improve various kinds of disease. Moreover, no serious adverse effects have been reported in the RCTs. Recently, a number of studies have shown that physical rehabilitation improves exercise capacity in transplant recipients and candidates [113,114]. DB could also be safe and feasible in the post-transplant management due to its non-invasive technique.

#### **5. Conclusions**

Previous systematic reviews and meta-analyses have shown that DB is effective for improving the exercise capacity and RR in patients with COPD. On the other hand, DB could deteriorate dyspnea in severe COPD patients. Moreover, the effect of DB on the QoL of patients with asthma still needs to be investigated further. DB may also be beneficial for reducing both physiological and psychological stress and could improve the respiratory function and respiratory muscle strength, but more firm evidence will be needed in the future. In addition, DB may help in treating eating disorders, chronic functional constipation, hypertension, migraine, and anxiety, as well as the QoL of patients with cancer and GERD and the cardiorespiratory fitness of patients with heart failure. Furthermore, DB could be a feasible and practical technique for patients with such disorders. Although further studies are needed to clarify the effects of DB on human health, DB can support clinical practice.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **The Effects of Tai Chi and Qigong on Immune Responses: A Systematic Review and Meta-Analysis**

**Byeongsang Oh 1,2,3,\*, Kyeore Bae 1,4, Gillian Lamoury 1,2,3, Thomas Eade 1,2,3, Frances Boyle 2,3, Brian Corless 1, Stephen Clarke 1,3, Albert Yeung 5, David Rosenthal 5, Lidia Schapira <sup>6</sup> and Michael Back 1,2,3**


Received: 21 June 2020; Accepted: 28 June 2020; Published: 30 June 2020

**Abstract: Background:** Effective preventative health interventions are essential to maintain well-being among healthcare professionals and the public, especially during times of health crises. Several studies have suggested that Tai Chi and Qigong (TQ) have positive impacts on the immune system and its response to inflammation. The aim of this review is to evaluate the current evidence of the effects of TQ on these parameters. **Methods:** Electronic searches were conducted on databases (Medline, PubMed, Embase and ScienceDirect). Searches were performed using the following keywords: "Tai Chi or Qigong" and "immune system, immune function, immunity, Immun\*, inflammation and cytokines". Studies published as full-text randomized controlled trials (RCTs) in English were included. Estimates of change in the levels of immune cells and inflammatory biomarkers were pooled using a random-effects meta-analysis where randomised comparisons were available for TQ versus active controls and TQ versus non-active controls. **Results:** Nineteen RCTs were selected for review with a total of 1686 participants and a range of 32 to 252 participants within the studies. Overall, a random-effects meta-analysis found that, compared with control conditions, TQ has a significant small effect of increasing the levels of immune cells (SMD, 0.28; 95% CI, 0.13 to 0.43, *p* < 0.01, I<sup>2</sup> = 45%), but not a significant effect on reducing the levels of inflammation (SMD, −0.15; 95% CI, −0.39 to 0.09, *p* = 0.21, I<sup>2</sup> = 85%), as measured by the systemic inflammation biomarker C-reactive protein (CRP) and cell mediated biomarker cytokines. This difference in results is due to the bidirectional regulation of cytokines. An overall risk of bias assessment found three RCTs with a low risk of bias, six RCTs with some concerns of bias, and ten RCTs with a high risk of bias.
**Conclusions:** Current evidence indicates that practising TQ has a physiologic impact on immune system functioning and inflammatory responses. Rigorous studies are needed to guide clinical guidelines and harness the power of TQ to promote health and wellbeing.

**Keywords:** Tai Chi; qigong; immune system; immunity; inflammation

#### **1. Introduction**

The effectiveness of the human immune system to prevent disease and aid recovery is critical [1]. Inflammation is an adaptive biological response of the immune system that can be triggered by several factors, such as pathogens, damaged cells and toxic compounds [2,3]. In response to infection, immune cells produce pro-inflammatory cytokines [4] and suppress anti-inflammatory genes [5] as key elements of the pathogenic defense process. In the current COVID-19 pandemic, humanity is facing one of its greatest public health challenges with more physical and psychological demands placed on the immune system and a greater need for evidence-based healthy lifestyle interventions, such as increased physical activity, to support and maintain the integrity of immune functions [6]. Several studies have reported that older people with comorbid conditions are more likely to have more severe symptoms and a higher risk of mortality from COVID-19 infection compared to children or younger adults [7–9]. Older adults with chronic disease and diminished immune responses have been found to have an increased risk of infection [10].

Supporting this view, a recent case study demonstrated that robust immune responses were observed during clinical recovery from COVID-19 in a healthy middle-aged adult [11,12]. Other studies also reported that the total number of NK and CD8<sup>+</sup> T cells had decreased significantly in patients with SARS-CoV-2 infection [13] as a result of T cell infection by SARS-CoV-2 [14,15].

Recently, several studies have demonstrated that physical activity and meditation play a pivotal role in regulating inflammation and supporting immune function [16–19]. Consequently, general recommendations for a healthy lifestyle, including physical activities and meditation, have been made worldwide to help prevent disease, enhance immune function and improve global health and well-being [17,19]. Emerging evidence indicates that there are substantial benefits for practising Tai Chi and Qigong (TQ) for health and well-being. TQ, also known as moving meditation, is a classical mind-body exercise originating in China and has been utilised as a preventative health intervention for many centuries. TQ is the most common form of physical exercise among adults in China [20]. The 2012 US National Health Interview Survey (NHIS) data suggested that more than 7 million adults in the US practised TQ and its popularity is growing globally [21–23]. Reasons given for practicing TQ were to optimize overall health and well-being, prevent disease and prevent the progression of medical conditions [20,21]. Respondents who practice TQ indicated that the positive outcomes of practice were reduced levels of stress (83%) and improved overall health and well-being (74%) [21]. Recently, a number of systematic reviews and meta-analyses have demonstrated the positive impact of TQ on physical conditions such as arthritis [24], cancer [25], diabetes [26], falls prevention [27], fibromyalgia [28], osteoarthritis [29], chronic pain [24,30] and cognitive function [31]. Evidence also supports the psychological benefits of TQ, particularly for symptoms of anxiety and depression [32]. Furthermore, several descriptive reviews have examined the beneficial effects of mind-body interventions, i.e., TQ, yoga and meditation on immune function and immune system-related inflammatory biomarkers [33,34]. 
Another recent study conducted with healthy women demonstrated that TQ can change gene expression associated with inflammation (HSF1, HSPA1A, IL6, IL10, CCL2 and NF-kB mRNA) [35]. Although a number of studies have reviewed the effects of TQ on physical and psychological wellbeing [36,37], few studies have explored the effects of TQ on immunoregulatory responses. The aim of the current review is to assess the effects of TQ on immune function and immune system-related inflammatory biomarkers.

#### **2. Methods**

A systematic review and meta-analysis was conducted following the 2018 Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [38]. Electronic searches were conducted on four English databases (Medline, PubMed, Embase and ScienceDirect) from inception through to April 2020. Searches were performed using the following keywords: "Tai Chi or Qigong" and "immune system, immune function, immunity, Immun\*, inflammation and cytokines". Additional searches were performed in Google Scholar. Eligibility criteria were: full-text studies published in English, RCTs with the primary outcome of immune response, sample size (n ≥ 30), and a TQ intervention period of at least four weeks. Interventions using qigong (QG) or emitting qi therapy by a qi master were excluded. Purely meditational techniques, such as Zen meditation, were excluded. Two reviewers (KB and BO) screened the titles and abstracts and reviewed them for eligibility after reading the full text. Additionally, searches were conducted for other potential studies by screening references in the identified studies.

#### *2.1. Data Analysis*

Outcomes at the initial post-intervention assessment were summarised and compared by TQ intervention arms. Estimates of intervention effects on immune responses (change of immune cells and inflammatory biomarkers) were extracted and compared for randomised arms [intervention, active control (exercise or health education) or non-active control (usual care/daily activities/wait-list)]. A random-effects meta-analysis was used to compute pooled estimates allowing for variation.
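The pooling step can be illustrated with a minimal sketch. The review does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator below is an assumption, and the function name `random_effects_pool` and the input values in the usage note are hypothetical rather than data from the included trials.

```python
import math

def random_effects_pool(effects, variances):
    """Pool per-study effects (e.g. SMDs) with a DerSimonian-Laird
    random-effects model. The estimator choice is an assumption."""
    k = len(effects)
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

Calling `random_effects_pool([0.0, 1.0], [0.04, 0.04])` on two made-up studies that disagree strongly yields a pooled estimate midway between them and a high heterogeneity percentage, which is the situation in which a random-effects model widens the confidence interval relative to a fixed-effect model.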

Effect sizes were estimated as the difference between study group means divided by the standard deviation pooled from both treatment and control groups. Where necessary, known equations were used to calculate the effect sizes from the reported data [39,40]. Standardised mean differences (SMD, Hedges' g) and 95% confidence intervals (CIs) were calculated. I<sup>2</sup> was calculated to assess heterogeneity [40]. The outcomes of immune system and inflammatory-related biomarkers are reported in Table 1 and Figures 2–6. A negative SMD value indicates a greater decrease in biomarkers.
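As an illustration of the effect size described above, the following sketch computes Hedges' g and an approximate 95% CI from group summary statistics. It uses the common large-sample variance approximation, which may differ from the exact equations in [39,40]; the function name `hedges_g` and the example numbers are hypothetical.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, lower, upper): Hedges' g with an approximate 95% CI."""
    df = n_t + n_c - 2
    # Standard deviation pooled across the treatment and control arms.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled        # uncorrected standardised difference
    j = 1 - 3 / (4 * df - 1)                 # small-sample correction factor
    g = j * d
    # Large-sample variance of d; multiplying the SE by j gives the SE of g.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    se_g = j * math.sqrt(var_d)
    return g, g - 1.96 * se_g, g + 1.96 * se_g

# Hypothetical arms: 20 participants each, equal SDs, half an SD apart.
g, lower, upper = hedges_g(5.5, 1.0, 20, 5.0, 1.0, 20)
```

With samples this small the interval spans zero even though the point estimate is close to 0.5, which shows why individual trials with around 30 participants rarely reach significance on their own.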



#### *Medicines* **2020**, *7*, 39






Abbreviations: CRP: C-reactive protein, IL-1ra: interleukin 1 receptor antagonist, RLXN: relaxation training, SPRT: spiritual growth groups, CBT: Cognitive Behavioural Therapy, GLQG: Guolin Qigong, TC: Tai Chi, QG: Qigong, TQ: Tai Chi and Qigong, CG: Control group, NS: Not significant, ↑: increase, ↓: decrease.

#### *2.2. Quality Assessment of Original Papers*

Risk of Bias (RoB) Assessment: To adequately assess the RoB of the included RCTs, two reviewers independently assessed the RoB using version 2 of the Cochrane Collaboration's tool for assessing RoB (RoB 2) [65]. RoB 2 consists of five domains ("randomization process", "deviations from intended interventions", "missing outcome data", "measurement of the outcome", and "selection of the reported result") together with an "overall bias" judgement. Any disagreement between the two reviewers was resolved through discussion.
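The way domain-level judgements roll up into the overall judgement can be sketched as follows. This is a simplified reading of the RoB 2 guidance, not the reviewers' actual procedure: it takes the worst domain rating, and it does not model the reviewer's discretion to rate a trial as high risk when several domains raise some concerns. The names `DOMAINS` and `overall_rob` are hypothetical.

```python
# The five bias domains named in the RoB 2 tool.
DOMAINS = (
    "randomization process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)

def overall_rob(judgements):
    """Derive an overall rating from per-domain ratings.

    judgements maps each domain name to "low", "some concerns" or "high".
    Simplification (assumption): the overall rating is the worst domain rating.
    """
    ratings = [judgements[domain] for domain in DOMAINS]
    if "high" in ratings:
        return "high"
    if "some concerns" in ratings:
        return "some concerns"
    return "low"

# Hypothetical trial: low risk everywhere except missing outcome data.
example = {domain: "low" for domain in DOMAINS}
example["missing outcome data"] = "high"
rating = overall_rob(example)
```

Under this rule a single high-risk domain is enough to classify the whole trial as high risk.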

#### **3. Results**

A total of 969 studies were initially identified and screened in this literature search. After screening titles and abstracts, 53 articles remained for assessment of eligibility for inclusion in the review. Nineteen studies were included in the review (Figure 1), and seventeen of these were included in the meta-analyses.

**Figure 1.** Flow chart.

#### *3.1. Characteristics of Clinical Studies and Quality of Evidence*

In the nineteen RCTs [Tai Chi (n = 14) and Qigong (n = 5)], there were a total of 1686 participants with an age range of 18 to 87 years and a sample size range of 32 to 252 participants within the studies, of which 775 were in the intervention groups and 911 in the control groups (Table 1). Studies were conducted across several countries, viz. USA (n = 8), China (n = 4), Taiwan (n = 2), Australia (n = 1), Spain (n = 2), and one each from Hong Kong and Thailand. Participants in the studies were categorised as cancer survivors (n = 5), older adults with a history of varicella (n = 2), healthy college students (n = 3), healthy older adults (n = 2), and a further seven studies with older adults with chronic neck pain (n = 1), mild cognitive impairment (n = 1), cardiovascular disease (n = 1), diabetes (n = 1), insomnia (n = 1), HIV (n = 1), and depression (n = 1). Seventeen studies were designed with two arms, while one study was conducted with three arms (TC vs. CBT vs. health education) and one with four arms (TC vs. relaxation vs. spiritual growth vs. wait-list). In the control group conditions, physical exercise (n = 3), health education (n = 4) and/or CBT (n = 2), wait-list group (n = 2), and usual care and daily activities (n = 9) were used. The TQ intervention period varied from 4 weeks to 6 months and included periods of 4 weeks (n = 2), 8 weeks (n = 1), 10 weeks (n = 3), 12 weeks (n = 6), 15 weeks (n = 1), 16 weeks (n = 3) and 6 months (n = 3). The number of intervention sessions ranged from one to five per week, with session frequencies of once per week (n = 5), two sessions per week (n = 3), three sessions per week (n = 9), and one study each with four and five sessions per week. The most common intervention time was 60 min (n = 7), whereas other intervention times comprised 30 to 50 min (n = 6), 90 min (n = 2), and 120 min (n = 3); one study did not report an intervention time.
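The category counts in this subsection can be cross-checked against the nineteen included trials with a short tally. The sketch below only re-adds the figures stated in the text; the dictionary keys and the helper name `mismatches` are hypothetical labels. The control-condition counts are left out because multi-arm trials contribute more than one control condition.

```python
TOTAL_STUDIES = 19

# Counts copied from the text, in the order they are reported.
tallies = {
    "intervention type": [14, 5],
    "country": [8, 4, 2, 1, 2, 1, 1],
    "population": [5, 2, 3, 2, 7],
    "number of arms": [17, 1, 1],
    "intervention period": [2, 1, 3, 6, 1, 3, 3],
    "sessions per week": [5, 3, 9, 1, 1],
    "session length": [7, 6, 2, 3, 1],
}

def mismatches(groups, total):
    """Return the categories whose counts do not add up to the total."""
    return {name: sum(counts) for name, counts in groups.items()
            if sum(counts) != total}

problems = mismatches(tallies, TOTAL_STUDIES)
participants_ok = (775 + 911 == 1686)
```

An empty `problems` dictionary means every breakdown accounts for all nineteen trials exactly once.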

#### *3.2. Outcomes on the Immune System and Inflammation Associated Biomarkers*

The effects of TQ interventions on the selected immune system outcomes and inflammatory biomarkers are reported in Figures 2–6. Meta-analysis data are presented as SMD (95% CI) unless otherwise stated.

#### *3.3. Outcomes on the Immune System*

Overall, a random-effects meta-analysis found that TQ had a significant small effect of increasing the levels of immune cells (SMD, 0.28; 95% CI, 0.13 to 0.43, *p* < 0.01, I<sup>2</sup> = 45%) (Figure 2).

**Figure 2.** Forest plot for random-effects meta-analysis of the effects of TQ on the immune system.
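A reported SMD and its 95% CI are enough to recover an approximate standard error, z statistic and two-sided p-value, which is a quick way to check that a reported p-value agrees with its interval. The sketch below assumes a normal approximation and a symmetric interval; the function name `z_and_p_from_ci` is hypothetical.

```python
from statistics import NormalDist

def z_and_p_from_ci(estimate, lower, upper, level=0.95):
    """Back-calculate (se, z, p) from a point estimate and its CI,
    assuming the CI is estimate +/- z_crit * se (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))           # two-sided p-value
    return se, z, p

# Overall immune-cell result reported above: SMD 0.28, 95% CI 0.13 to 0.43.
se_immune, z_immune, p_immune = z_and_p_from_ci(0.28, 0.13, 0.43)
```

For the immune-cell result the back-calculated p-value is well below 0.01, in line with the reported significance. Because the published estimates are rounded to two decimals, the recovered values are approximate.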

#### *3.4. Effects on the Innate Immune System*

Overall, a random-effects meta-analysis found that TQ had a small effect of increasing the levels of innate immune cells compared with controls (SMD, 0.22; 95% CI, −0.00 to 0.45, *p* = 0.05, I<sup>2</sup> = 27%), with no significant heterogeneity across the studies (Figure 3A).

#### 3.4.1. NK Cells

A meta-analysis performed with three studies showed that there were no significant effects for the levels of NK cells (SMD, 0.00; 95% CI, −0.64 to 0.64, *p* = 0.99, I<sup>2</sup> = 68%) (Figure 3B). Despite two studies reporting positive trends on NK cells, one study showed significant decreases in NK cells, which may offset the effect size.

#### 3.4.2. Dendritic Cells (DCs)

There were significant small effects on DCs (SMD, 0.32; 95% CI, 0.02 to 0.62, *p* = 0.04) in favour of TQ compared to control groups (Figure 3C).

**Figure 3.** Forest plot for random-effects meta-analysis of the effects of TQ on the innate immune system. (**A**): The effects on the innate immune system, (**B**): the effects on the NK cells, (**C**): the effects on the DCs.

#### *3.5. Other Innate Immune Cells*

A meta-analysis of one study which examined innate immune cells showed that TQ had a small effect of increasing the levels of eosinophils (SMD, 0.40; 95% CI, −0.20 to 1.00), monocytes (SMD, 0.44; 95% CI, −0.16 to 1.04), and a marginally small effect on neutrophils (SMD, 0.18; 95% CI, −0.42 to 0.77), compared with daily activities.

#### *3.6. Effects on the Adaptive Immune System*

#### 3.6.1. Adaptive Immune Cells

Overall, a random-effects meta-analysis found that TQ had a small effect of increasing the levels of adaptive immune cells compared with controls (SMD, 0.31; 95% CI, 0.11 to 0.51, *p* = 0.01, I<sup>2</sup> = 52%), with low to moderate heterogeneity across studies (Figure 4A–D).


**Figure 4.** Forest plot for random-effects meta-analysis of the effects of TQ on the adaptive immune system. (**A**): The effects on the adaptive immune system, (**B**): the effects on the NKT cells, (**C**): the effects on the CD4+/CD8+ ratio, (**D**): the effects on the VZV-cell-mediated immunity.

#### 3.6.2. T Cell Associated Adaptive Immune Cells

A meta-analysis showed that TQ had a small but non-significant effect of increasing levels of NKT cells (SMD, 0.24; 95% CI, −0.18 to 0.66, *p* = 0.27), a moderate effect on the Th1/Th2 ratio (SMD, 0.52; 95% CI, −0.25 to 1.29), and a significant large effect on the Tc1/Tc2 ratio (SMD, 1.64; 95% CI, 0.75 to 2.53), compared with daily activities. Also, there was a small non-significant effect on the CD4<sup>+</sup>/CD8<sup>+</sup> ratio (SMD, 0.11; 95% CI, −0.21 to 0.44, *p* = 0.49).

Other adaptive immune cell responses associated with the biomarker B lymphocytes showed a significant moderate effect for TQ increasing the proportion of B lymphocytes (SMD, 0.64; 95% CI, 0.45 to 0.83), but there were negligible effects for immunoglobulin antibodies IgA (SMD, −0.03; 95% CI, −0.54 to 0.47), IgG (SMD, 0.10; 95% CI, −0.41 to 0.60), and IgM (SMD, 0.05; 95% CI, −0.46 to 0.55), when compared with a health education control group (Figure 4A–C). For VZV-cell-mediated immunity, two RCTs measured VZV responder cell frequency (VZV-RCF) compared with daily activity and health education controls. A meta-analysis found that TQ had a small effect of elevating VZV-RCF (SMD, 0.20; 95% CI, −0.13 to 0.52, I<sup>2</sup> = 0%) (Figure 4D).

#### *3.7. Effects on the Inflammation Response*

Overall, a random-effects meta-analysis indicated that TQ had no significant effects on responses to inflammation (SMD, −0.15; 95% CI, −0.39 to 0.09, *p* = 0.21, I<sup>2</sup> = 85%), as measured by the systemic inflammation biomarker CRP, the cell mediated biomarker cytokines (IL1β, IL2, IL4, IL6, IL10, IL12, IL18, TNF-α, IFN-γ, GCSF) and NF-κB, compared with controls, due to several bidirectional cytokine responses. However, sub-group analyses showed that TQ had a positive trend on levels of CRP, IL6 and NF-κB, separately (Figure 5A–E).



**Figure 5.** Forest plot for random-effects meta-analysis of the effects of TQ on the inflammation response. (**A**): The effects on the inflammation response, (**B**): the effects on the levels of IL6, (**C**): the effects on the levels of TNF-α, (**D**): the effects on the levels of IFN-γ, (**E**): the effects on the activity of NF-κB.

#### 3.7.1. CRP

A meta-analysis conducted with six studies suggested that TQ had a small effect of reducing CRP compared with control groups (SMD, −0.30; 95% CI, −0.70 to 0.11, *p* = 0.16, I<sup>2</sup> = 80%). Furthermore, a subgroup analysis (Figure 6A,B) of the effect of TQ on CRP, with different control conditions (TQ vs. health education control, TQ vs. exercise control, TQ vs. inactive control, TQ vs. CBT), showed that studies that compared TQ with "health education" resulted in a moderate effect of reducing CRP (SMD, −0.64; 95% CI, −1.19 to −0.08, I<sup>2</sup> = 76%), and with "inactive (usual care)" resulted in a small effect of reducing CRP (SMD, −0.32; 95% CI, −0.63 to −0.01). In contrast, a study that compared TQ with an "exercise" group demonstrated non-significant effects (SMD, 0.16; 95% CI, −0.47 to 0.78), and another study comparing TQ with "Cognitive Behavioural Therapy (CBT)" showed a small effect of increasing CRP (SMD, 0.32; 95% CI, −0.09 to 0.73).

**Figure 6.** Forest plot for random-effects meta-analysis of the effects of TQ on levels of CRP. (**A**): The effects on the levels of CRP, (**B**): subgroup analysis of CRP based on control interventions.

#### 3.7.2. IL-6

A meta-analysis of three studies suggested that, compared to controls, TQ had a small effect of reducing the levels of IL-6 (SMD, −0.38; 95% CI, −0.13 to 0.36, I<sup>2</sup> = 85%). Of these three studies, one study showed that TQ increased the levels of IL-6, while another two studies showed reduced IL-6 levels following a TQ intervention.

#### 3.7.3. TNF-α

A meta-analysis of four RCTs found that TQ had a negligible effect on levels of TNF-α compared with controls (SMD, 0.15; 95% CI, −0.08 to 0.37, I<sup>2</sup> = 0%).

#### 3.7.4. IFN-γ

A meta-analysis of five RCTs showed that TQ had a small effect of increasing the level of IFN-γ (SMD, 0.27; 95% CI, −0.49 to 1.02, I<sup>2</sup> = 93%). This result may have been influenced by considerable heterogeneity.

#### 3.7.5. Effects on Other Pro-Inflammatory Biomarkers

A meta-analysis conducted on one study showed a non-significant effect of TQ on levels of IL-2 (SMD, 0.16; 95% CI, −0.16 to 0.47) when compared with controls, whereas another study showed a small effect of TQ on levels of IL-18 (SMD, −0.21; 95% CI, −0.64 to 0.23), and a third study showed a moderate effect for TQ on IL-12 (SMD, −0.50; 95% CI, −1.02 to 0.01). Another meta-analysis found larger effects in favour of TQ on levels of NF-κB (SMD, −0.96; 95% CI, −1.35 to −0.58).

#### *3.8. Effects on Anti-Inflammatory Biomarkers*

#### IL-4

A meta-analysis conducted on two studies suggests that TQ had no significant effect on levels of IL-4 (SMD, 0.03; 95% CI, −0.33 to 0.39, I<sup>2</sup> = 0%) when compared with controls (data not presented).

#### *3.9. Assessment of Risk of Bias*

A RoB assessment was conducted with the revised tool (RoB 2) [65] to examine the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, selection of the reported result, and overall bias. In an overall assessment of bias, three RCTs were assessed as having a low RoB, six RCTs as having some RoB concerns and ten RCTs as having a high RoB. In the domain of "measurement of the outcome" all of the reviewed RCTs had a low RoB, whereas in the domain of "missing outcome data" the majority of reviewed studies displayed a high RoB. Individual scores for RoB are presented in Figure 7.

**Figure 7.** Risk of bias assessment.

#### **4. Discussion**

In this systematic review and meta-analysis of nineteen RCTs that examined the effects of TQ on the immune system and inflammation, we found that TQ is capable of modulating immune system functioning and inflammatory biomarker responses. An important finding of the current review was that a minimum of 4 weeks practice of TQ enables participants to enhance their immune system functioning by stimulating innate and adaptive immune cell responses and regulating biomarkers associated with inflammation. In addition, we found two studies that showed that practising TQ for more than 12 weeks can alter gene expression, as demonstrated in NF-κB signal pathways.

Our findings are comparable with similar studies that assessed the effect of mind-body therapies using mixed interventions including meditation, yoga and TQ [33,66]. Prior reviews that examined the effect of TQ on immunity and inflammation have reported no strong evidence of a favorable effect of TQ on inflammation [67] or the immune system [68] and insufficient evidence to support the clinical application of TQ to reduce infection [68,69]. These earlier reviews included fewer RCTs, also included non-RCT studies, and used a less comprehensive literature search, which limits their conclusions.

Another important aspect of the present review is that, to the best of our knowledge, this is the first comprehensive systematic review specifically evaluating the effects of TQ on the immune system and inflammatory biomarker responses. In order to assess the effect of TQ on the immune system and inflammation, immune related cell types were categorized into two groups, viz. innate immune cells and adaptive immune cells, and inflammatory biomarkers into three groups, viz. the systemic inflammatory biomarker CRP, cytokines and gene expression associated with pro-inflammatory processes. Of the nineteen reviewed RCTs, four studies measured innate immune cells (eosinophils, monocytes, neutrophils, NK and dendritic cells), six studies measured adaptive immune cells (T cells, NKT, and B cells), and four studies measured both. Overall, a random-effects meta-analysis found that TQ had a significant small effect of increasing the levels of immune cells (SMD, 0.28; 95% CI, 0.13 to 0.43, *p* < 0.01, I<sup>2</sup> = 45%), compared to controls. The effect of TQ on inflammation is commonly measured by levels of the inflammatory biomarkers CRP, IL6 and TNF-α. In general, the levels of CRP decreased following a TQ intervention. Of the eight studies measuring CRP, four studies reported that TQ significantly decreased levels of CRP compared to controls, whereas three studies conducted with older adults with symptoms of chronic ill-health (pain, insomnia and older adults with history of varicella) showed no differences between the intervention and control groups (Table 1). Considering that CRP is a commonly used diagnostic marker of systemic inflammation, non-significant changes in levels of CRP may be associated with the progression of the chronic disease in these older participants. Given these mixed results, further investigation of this aspect, with a homogeneous study population, is warranted.
In addition to CRP, cell mediated inflammatory cytokines (IL6, TNF-α) were also found to demonstrate an overall trend of reduced levels following the TQ intervention. However, for inflammatory cytokines (IL2, IL4, IL6, IL10, IL12, TNF-α, IFN-γ) both downregulation and upregulation of cytokine responses were observed across the studies.

These bidirectional results for inflammatory cytokines are consistent with the results of recent studies that examined cytokine levels following mind-body therapy [33,66] and exercise [70] interventions that also included TQ. Several studies have suggested that increases in inflammatory cytokines occur not only in response to immune system activation or infection, but also when immune cells are stimulated to activate cytotoxicity [71,72]. For example, a review paper on pro-inflammatory and anti-inflammatory processes in patients with multiple myeloma examined the effects of locally produced cytokines, as a primary immune response, and found that efficacious tumour immunosurveillance due to tumour-specific CD4<sup>+</sup> T cells was consistently related to increased local concentrations of both proinflammatory (IL-6, IL-1α, and IL-1β) and Th1-associated cytokines (IL-2, IL-12, and IFN-γ) [71]. It was concluded that the influence of cytokines on the immune system occurs as parallel processes and that changes in one specific cytokine can be balanced by others within the cytokine system, leading to a modulation of the immune response. In light of the current findings of a bidirectional response of inflammatory cytokines, additional clinical trials with a homogeneous study population will help to better understand the directional nature of the relationship between inflammation and immunity. Despite these mixed results in outcomes for inflammatory responses, as measured by levels of CRP, cytokines, and NF-κB, overall, there were trends towards reduced levels of inflammation compared with control conditions.

Several limitations were identified in the current study. Firstly, caution is required in interpreting the overall random effect size on immune cells (SMD, 0.28; 95% CI, 0.13 to 0.43) and inflammation responses (SMD, −0.15; 95% CI, −0.39 to 0.09). Studies included in this review measured changes in a range of immune system and inflammatory biomarkers, rather than changes in a single identical biomarker in each study, thus confounding assumptions for independent variables associated with confidence intervals and heterogeneity in the meta-analysis.

Secondly, the demographic profile of participants in the original studies included in this review was heterogeneous in respect of age, ranging from 18 to 89 years, and health status, viz. from healthy to various symptoms of medical conditions, which limits the generalizability of our findings. Furthermore, the TQ interventions that were examined were heterogeneous in respect of duration (from 4 weeks to 6 months), frequency (one to five times per week) and type of TQ intervention. Considering the heterogeneity of these studies, future investigations into the modulatory effects on immune responses of different types of TQ interventions and their dosage levels will be worthwhile. For example, one study was conducted with a TQ intervention duration of 4 weeks and a frequency of three sessions per week, whereas the duration of other studies varied from 4 weeks to 24 weeks with frequencies ranging from one session to five sessions per week. However, at present there is no standardised protocol to inform healthcare professionals and the general public on the minimum dosage levels of TQ required to modulate immune responses [73]. Finally, the current review did not investigate the physiological mechanisms underlying the effects of TQ on immune responses, despite previous studies attempting to explain these potential mechanisms in mind-body medicine and psychoneuroimmunology models [33]. Given the complexity of TQ as a movement-based mind-body therapy, compared with other mind-body therapies that have fewer movement components, investigating these underlying TQ mechanisms will provide important insights into future clinical applications. Moreover, some recent RCTs comparing TQ with exercise or CBT have suggested that there were no significant differences in outcomes between the intervention groups, indicating that TQ may be equivalent to conventional exercise or CBT interventions. However, more studies with robust study designs and adequate statistical power are required.
Considering that the immune system is vital for protection from external pathogens, including bacterial and viral pathogens often occurring in the natural environment, the evidence supporting TQ for a healthier immune system, can have important implications for promoting TQ programs for health and well-being. Recommending TQ programs for people with low immunity, particularly those receiving treatment that induces immune suppression and older adults with chronic diseases, could have a beneficial effect of strengthening immune function. More clinical outcome studies are required to examine TQ as a stand-alone or adjunctive intervention for these patients and its effects on comparative rates of recovery. For the general public, TQ offers a preventative health measure that can strengthen the immune system and assist overall health and well-being.

In conclusion, despite several limitations, the current review of RCTs indicates that practising TQ can have a positive impact on immune system functioning and inflammatory processes. Given the vital role of the immune system, and in particular the influence of cytokines, in the current viral pandemic, preventative public health strategies to improve immune functioning in the general public, and those with medical conditions, are needed. However, while the promotion of TQ to the general public and healthcare professionals as a preventative health intervention for strengthening the immune system is recommended, further robust studies to develop clinical practice guidelines for using TQ with vulnerable populations are warranted.

**Author Contributions:** All authors cooperated on developing the concept design and preparing the manuscript. B.O. and K.B. had full access to the study data and take responsibility for the integrity and accuracy of data analysis. Statistical analysis and interpretation of data: K.B., B.C., B.O. Drafting of the manuscript: K.B., B.O. and B.C. Critical revision of the manuscript for important intellectual content: B.O., K.B., G.L., T.E., F.B., B.C., S.C., A.Y., D.R., L.S., M.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
