Review

Mass Spectrometry-Based Evaluation of the Bland–Altman Approach: Review, Discussion, and Proposal

Institute of Toxicology, Core Unit Proteomics, Hannover Medical School, 30623 Hannover, Germany
Molecules 2023, 28(13), 4905; https://doi.org/10.3390/molecules28134905
Submission received: 9 May 2023 / Revised: 12 June 2023 / Accepted: 16 June 2023 / Published: 21 June 2023

Abstract

Reliable quantification of endogenous low- and high-molecular-mass substances, drugs and their metabolites in biological systems is of particular importance in diagnosis and therapy, and in basic and clinical research. Analytical approaches differ in many respects, including core features such as accuracy, precision, specificity, and limits of detection (LOD) and quantitation (LOQ). Several different mathematical approaches have been developed and used for the comparison of two analytical methods applied to the same chemical compound in the same biological sample. Generally, two analytical methods yield different quantitative results. Yet, which mathematical approach gives the most reliable results? Which mathematical approach is best suited to demonstrate agreement between the methods, or the superiority of analytical method A over analytical method B? The simplest and most frequently used method of comparison is the linear regression analysis of the data observed by method A (y) and the data observed by method B (x): y = α + βx. In 1986, Bland and Altman indicated that linear regression analysis, notably the use of the correlation coefficient, is inappropriate for method-comparison. Instead, they suggested an alternative approach, which is generally known as the Bland–Altman approach. Originally, this method of comparison was applied in medicine, for instance, to the measurement of blood pressure by two devices. The Bland–Altman approach was rapidly adopted in analytical chemistry and in clinical chemistry. To date, it is one of the most widely used mathematical approaches for method-comparison. With about 37,000 citations, the original paper published in The Lancet in 1986 is among the most frequently cited scientific papers in this area. Nevertheless, the Bland–Altman approach has not really been set on a quantitative basis. No criteria have been proposed thus far by which the Bland–Altman approach can be used to demonstrate analytical agreement or to identify the better analytical method. In this article, the Bland–Altman approach is re-evaluated from a quantitative bioanalytical perspective, and an attempt is made to propose acceptance criteria. For this purpose, different analytical methods were compared with Gold Standard analytical methods based on mass spectrometry (MS) and tandem mass spectrometry (MS/MS), i.e., GC-MS, GC-MS/MS, LC-MS and LC-MS/MS. Other chromatographic and non-chromatographic methods were also considered. The results for several different endogenous substances, including nitrate, anandamide, homoarginine, creatinine and malondialdehyde in human plasma, serum and urine are discussed. In addition to the Bland–Altman approach, the linear regression analysis and Oldham–Eksborg method-comparison approaches were used and compared. Special emphasis was given to the relation between difference and mean in the Bland–Altman approach. Currently available guidelines for method validation were also considered. Acceptance criteria for method agreement are proposed, including the slope and correlation coefficient in linear regression, and the coefficient of variation of the percentage difference in the Bland–Altman and Oldham–Eksborg approaches.

1. Introduction

Most likely, nobody knows how many low- and high-molecular-mass chemical compounds are present in biological samples such as blood and urine. Yet, their number is assumed to be very high and to increase with time due to the discovery of natural compounds and the introduction of new synthetic compounds, including drugs, into the environment. The core mission of Analytical Chemistry is both to identify the structure of these compounds and to determine their concentration as accurately as possible. Over the years, numerous analytical methods have been reported for the quantitative determination of virtually all classes of chemical compounds. Scientific competition, curiosity and striving, often paired with the discovery of novel technologies and improvements in available methodologies, have resulted, and continue to result, in the development of various analytical methods, in part for the same analyte, yet with different analytical performances. The performance of analytical methods can be characterized with a certain degree of objectivity, especially when defined criteria are applied. Generally, an improvement in a current analytical method for a certain analyte is an acceptable justification for the publication of the improved analytical method in a scientific journal, despite its lacking true analytical novelty.

1.1. Method-Comparison Approaches

Method-comparison approaches were proposed, interpreted, discussed, criticized, and improved by several groups [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36] (in part cited in chronological order). Linear regression analysis (see Formula (1)) of results obtained by two different methods for the same measure, e.g., for an analyte in a biological system, is reportedly the oldest approach to method-comparison. In 1986, Bland and Altman published in The Lancet their legendary paper entitled Statistical methods for assessing agreement between two methods of clinical measurement [10]. This paper is one of the most frequently cited articles in the life sciences (Figure S1). Bland and Altman [10] noted that linear regression analysis, notably the use of the correlation coefficient r, is inappropriate for method-comparison, and they suggested an alternative approach. Despite the availability of approaches with stronger analytical power, including the Bland–Altman method, linear regression analysis is to this day, without doubt, the most frequently and routinely used approach in analytical chemistry and in other areas. Interestingly, the Bland–Altman method seems to be more widespread in the field of clinical chemistry. The consequences of using inappropriate or unsatisfactory approaches for method-comparison, such as relying solely on the correlation coefficient r, whatever its value, as long as it is associated with a statistically significant p value, i.e., p ≤ 0.05, may be grave, as is demonstrated in the present study.
$$\tau_{1j} = \alpha + \beta \times \tau_{2j} \tag{1}$$
where τ1j and τ2j are the values measured by method 1 and method 2, respectively (j = 1, 2, …, n−1, n; n = total number of analyzed samples), and α and β are the y-axis intercept and the slope of the straight line, respectively.
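As an illustration of Formula (1), the following minimal sketch fits the straight line by ordinary least squares and reports the intercept α, the slope β and the goodness of fit r². The function name `linear_regression` and the paired values are hypothetical and serve only as an example; they are not data from the studies discussed in this article, and ordinary least squares is assumed as the fitting procedure.

```python
from statistics import mean


def linear_regression(x, y):
    """Ordinary least-squares fit y = alpha + beta * x.

    Returns (alpha, beta, r2), where r2 is the squared correlation coefficient.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    r2 = sxy ** 2 / (sxx * syy)
    return alpha, beta, r2


# Hypothetical paired concentrations (illustration only).
method2 = [1.0, 2.0, 3.0, 4.0, 5.0]  # reference method, x-axis
method1 = [0.9, 1.7, 2.6, 3.3, 4.2]  # tested method, y-axis

alpha, beta, r2 = linear_regression(method2, method1)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, r2 = {r2:.4f}")
```

In this constructed example, r² is close to 1 although the slope is clearly below unity, which is the kind of situation in which r alone would overstate the agreement.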
The Bland–Altman (BA) method [10] is a rather graphical approach which is still widely used but is less frequently and not routinely applied in analytical chemistry. The Bland–Altman approach examines the relationship between the difference (δBA or simply δ) of the values obtained by two methods (see Formula (2)) and the mean (μBA or simply μ) of the methods (see Formula (3)). Usually, in this approach, δBA is plotted versus the μBA of the methods.
$$\delta_{\mathrm{BA}} = \tau_{1j} - \tau_{2j} \tag{2}$$
$$\mu_{\mathrm{BA}} = \tfrac{1}{2} \times (\tau_{1j} + \tau_{2j}) \tag{3}$$
Even if the Bland–Altman approach is steadily used in analytical chemistry, this method-comparison is applied incorrectly, most likely due to the lack of acceptance criteria. Thus, most of the measurements may be within a 95% confidence interval, e.g., the ±1.96 × standard deviation, despite lacking analytically relevant comparability. This is because the 95% confidence interval becomes wider the larger the difference between the methods is (see below). It should be emphasized that Bland and Altman suggested their approach for quite comparable methods [10]. In this respect, both linear regression analysis and the Bland–Altman approach are used arbitrarily and do not provide reliable information about a potentially existing comparability and the extent of agreement.
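The point made above can be sketched numerically. The example below computes the difference and the mean for each pair, the mean difference (bias), and the ±1.96 × SD band for two hypothetical data sets: one in close agreement with the reference and one that reads 40% lower throughout. The function name `bland_altman` and all values are assumptions made for illustration only.

```python
from statistics import mean, stdev


def bland_altman(t1, t2):
    """Differences, means, bias and the +/-1.96 SD band for paired values."""
    diffs = [a - b for a, b in zip(t1, t2)]          # method 1 minus method 2
    means = [0.5 * (a + b) for a, b in zip(t1, t2)]  # average of both methods
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "diffs": diffs,
        "means": means,
        "bias": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Hypothetical values (illustration only).
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
close = [1.02, 1.97, 3.05, 3.96, 5.03, 5.98]  # agrees closely with reference
poor = [0.6 * v for v in reference]           # reads 40% lower throughout

for label, tested in (("close", close), ("poor", poor)):
    ba = bland_altman(tested, reference)
    inside = sum(ba["lower"] <= d <= ba["upper"] for d in ba["diffs"])
    width = ba["upper"] - ba["lower"]
    print(f"{label}: bias = {ba['bias']:.3f}, band width = {width:.3f}, "
          f"{inside}/{len(tested)} points inside the band")
```

In both cases all points fall inside the band, but the band of the poorly agreeing method is far wider. The proportion of points inside the band therefore says little about agreement unless the width of the band is judged against an acceptance criterion.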
Oldham [2], and later Eksborg [8], independently suggested an alternative approach based on the ratio (ΛOE or simply Λ) of the values measured by the two methods. The ratio is formed either against the values τ2 of method 2, chosen as the reference method (Formula (4a)), or against the average of the values of the two methods (Formula (4b)). In the present work, this method is referred to as OE. Interestingly, and in the opinion of this author, surprisingly, this approach has not found appreciable application in method-comparison studies to this day.
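$$\Lambda_{\mathrm{OE}} = \tau_{1j} : \tau_{2j} \tag{4a}$$
$$\Lambda_{\mathrm{OE}} = \tau_{1j} : [0.5 \times (\tau_{1j} + \tau_{2j})] \tag{4b}$$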
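The two variants of the ratio can be sketched as follows. The function name `oldham_eksborg`, its `against_average` switch and the paired values are hypothetical; the summary as mean, SD and CV mirrors how the ratio is reported later in this article.

```python
from statistics import mean, stdev


def oldham_eksborg(t1, t2, against_average=False):
    """Mean, SD and CV (%) of the ratio of method 1 to method 2.

    With against_average=True, each value of method 1 is divided by the
    average of the pair instead of by the value of method 2.
    """
    if against_average:
        ratios = [a / (0.5 * (a + b)) for a, b in zip(t1, t2)]
    else:
        ratios = [a / b for a, b in zip(t1, t2)]
    m = mean(ratios)
    sd = stdev(ratios)
    return m, sd, 100.0 * sd / m


# Hypothetical paired values (illustration only).
method2 = [2.0, 4.0, 6.0, 8.0]  # reference method
method1 = [1.8, 3.8, 5.4, 7.6]  # tested method

ratio, sd, cv = oldham_eksborg(method1, method2)
print(f"ratio = {ratio:.3f} +/- {sd:.3f} (CV {cv:.1f}%)")
```

A ratio close to unity with a small CV indicates agreement over the whole range, whereas a ratio close to unity with a large CV indicates that over- and underestimation cancel each other out on average.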

1.2. Basic Principles of Mass Spectrometry and Tandem Mass Spectrometry

Analytical methods involving the use of mass spectrometers, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS) apparatus, are based on the separation of inorganic and organic ions produced in the ion source of the instruments according to their mass-to-charge (m/z) ratio (Scheme S1). This ability provides mass spectrometry (MS)-based approaches with inherent specificity and distinguishes them from other analytical techniques, which are based on the utilization of far less characteristic physicochemical properties such as light absorption, fluorescence, or conductivity. The separation of substances by their m/z values enables the use of stable-isotope-labelled analogues as internal standards (IS) in MS-based methods. This is a unique feature of MS technology and lends MS-based methods high accuracy in quantitative analysis. A quantum jump in specificity and accuracy is represented by tandem mass spectrometry (MS/MS), for instance, as realized in GC-MS/MS and LC-MS/MS instruments. Not without reason, MS/MS-based methods are regarded as the Reference Methods, the Gold Standard, in analytical chemistry and in basic and clinical research, including clinical chemistry (see for instance Refs. [37,38,39,40]).
In biological samples, such as plasma and urine samples, there are myriads of substances that belong to distinctly different classes. In MS-based methods, sample treatment procedures, such as protein precipitation, proper extraction and/or derivatization prior to analysis, generally lead to a considerable reduction in the number of the analytes finally injected into the MS instrument. The number of analytes that may interfere with the analysis of a certain substance may be further reduced by gas chromatographic or liquid chromatographic separation prior to MS separation. Despite a strong reduction in the number of potentially interfering analytes by such steps, the ionization process of co-eluting substances may generate isobaric ions, i.e., structurally different ions which have, however, the same m/z value. This particular situation is illustrated in Scheme S1 for mass spectrometers based on quadrupole (Q) technology.
Commonly, quantification by GC-MS and LC-MS instruments (and by GC-MS/MS and LC-MS/MS instruments operated in the SSQ configuration) is performed in the selected-ion monitoring (SIM) mode, as shown in Scheme S2A. In general, two ions produced in the ion source are selected: one ion for the target analyte AT and the corresponding ion for the stable-isotope-labelled analogue AIS, which serves as the IS for the analyte AT. Quantification by GC-MS/MS and LC-MS/MS instruments is usually carried out in the selected-reaction monitoring (SRM) mode, as illustrated in Scheme S2B. For example, from the ions produced in the ion source of a GC-MS/MS instrument, the first quadrupole Q1 alternately separates the ion with m/z AT for the target analyte and the corresponding ion with m/z AIS for the externally added IS. These precursor ions are fragmented in the collision chamber (second quadrupole) Q2, and the third quadrupole Q3 alternately selects, in general, one specific product ion (P) for each of them; for instance, it “filters” m/z PT for the target analyte and m/z PIS for the IS (Scheme S2B). In the SRM mode, Q3 can also pass product ions of the same m/z value (i.e., m/z PT = m/z PIS) which are, however, produced from different precursor ions and can therefore be completely discriminated.
A more detailed description of the instrumentation and the principles of operation, including ionization techniques, of SSQ and TSQ mass spectrometers and other types of mass spectrometers can be found in the literature (e.g., Refs. [41,42,43,44,45]). A history of European mass spectrometry is found in Ref. [46]. In the context of quantitative analytical chemistry, it should be emphasized that mass spectrometry is not, per se, a magic bullet, and it does not always guarantee valid data [47]. Yet, it is currently the best available technology in analytical chemistry. It must be used with validated methods, and all findings need to be critically evaluated [48].

1.3. Problem and Aim of the Study

Because of the lack of guide values for the correlation coefficient r, the y-axis intercept α and the slope β of the regression equation, the results of linear regression analysis for method-comparison are used rather arbitrarily [10,49]. In particular, the lack of defined acceptance criteria for the correlation coefficient r tempts us to misuse regression analysis, for instance, by suggesting agreement in doubtful cases or even by overlooking existing analytical agreement. Thus, even if the value of the correlation coefficient r is, for instance, only 0.8, a p value below 0.05 is commonly considered sufficient to claim correlation between the methods tested, irrespective of the y-axis intercept and slope values of the regression equations.
In principle, these considerations apply equally to the Bland–Altman method and the Oldham–Eksborg approach. These two methods also lack defined acceptance criteria for the comparability and the validity of the analytical methods being compared. The Bland–Altman approach is useful for comparisons of methods with comparable performances, as stated by Bland and Altman in their original work [10]. In cases of considerable disagreement between the methods being compared, the Bland–Altman approach would penalize the method with the better analytical performance, e.g., method 2, in favour of the method with the putatively poorer analytical performance, e.g., method 1. For example, this could be the case when comparing GC-MS/MS or LC-MS/MS methods with GC-FID or HPLC-UV methods. Application of the Bland–Altman approach to two less comparable methods would result in an excessively wide confidence interval. Thus far, no additional established quantitative parameters of this approach have been proposed to evaluate and report the extent of agreement between methods. Regrettably, and in analogy to regression analysis, the Bland–Altman approach is interpreted incorrectly by many investigators, presumably because of the lack of acceptance criteria. Commonly, the application of this approach to method-comparison is restricted to showing the graph and the confidence interval. Finally, the Oldham–Eksborg approach generally finds very few applications in method-comparison despite its considerable potential.
As will be shown in this work, the approaches of linear regression analysis, Bland–Altman, and Oldham–Eksborg are linked together. Therefore, one possibility to overcome the flaws of the individual approaches could be the derivation and use of a proper combination of these methods. However, even if this were profitable, it may not allow us to solve the main, common, and principal problem of method-comparison: the renunciation of the superiority of one method over the other method, or the arbitrariness of defining one of the methods being compared as the absolute reference method, the Gold Standard. We could extricate ourselves from this dilemma if we accepted that thoroughly validated and proven analytical methods based on the tandem mass spectrometry methodology, such as GC-MS/MS and LC-MS/MS methods, are best-qualified to represent the reference methods [48]. The superiority of tandem mass spectrometry technology over other putatively less reliable analytical techniques is largely undisputed in the literature. The examples presented below are supportive of the analytical superiority of the MS/MS methodology over other analytical methodologies.
The aim of the present study was to investigate whether or not defining validated and proven analytical MS/MS-based methods as reference methods may help solve problems associated with method-comparison and may even help to define acceptance criteria for linear regression analysis, the Bland–Altman, and the Oldham–Eksborg approaches. Most currently available guidelines proposed by international associations and analytically oriented journals address exogenous drugs as analytes [50,51,52,53,54,55,56] rather than endogenous substances which have special requirements beyond method validation [57]. The present work focuses on the quantitative analysis of endogenous substances in biological samples, which represents a formidable analytical challenge.

2. Methods

2.1. Re-Evaluation of Published Analytical Data

Proceeding

Selected studies published by the author’s group and by other investigator groups were examined by three method-comparison approaches: (1) linear regression analysis; (2) the Bland–Altman (BA) method; and (3) the Oldham–Eksborg (OE) approach. The selected studies reported results which allow for a satisfactory re-evaluation. The data reported in the Figures and Tables of this article were reconstituted and re-evaluated by the author to the best of his ability. For simplicity, values in Tables are reported without their respective units. Statistical data from the author’s group were generated using GraphPad Prism Version 7 for Windows (GraphPad Software, San Diego, CA, USA). Chemical structures were drawn using ChemDraw 15.0 Professional (PerkinElmer, Germany). The structures of some analytes discussed in the present work are illustrated in Scheme 1. Where applicable, data analysis is reported in the following sections in more detail.
Standard analytical parameters included in the present work are: (1) y-axis intercept α, slope β, and goodness of fit (r2) from linear regression analysis; (2) the mean of the difference δBA, the average µBA and the bias values from the BA approach; and (3) the OE ratio ΛOE. In addition, further statistically relevant parameters, notably the relative standard deviation (RSD) or coefficient of variation (CV) of the absolute difference δ and the percentage difference δ(%), and of the ratio ΛOE were included. In the BA approach, linear regression analysis between δ or δ(%) versus the average was performed and the goodness of fit (ρ2) was reported. It is assumed that these measures allow for evaluations of agreement between two methods more effectively and on a quantitative basis as compared to the rather qualitative information provided by the individual approaches. In addition, the receiver operating characteristic (ROC) approach was used and the area under the curve (AUC) values were considered to evaluate agreement/disagreement between two compared methods. The complete set of the results from the meta-analyzed studies is presented in Figures 1–11 and summarized in Table 1.
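The additional measures listed above can be sketched as follows. Two assumptions are made that the text does not spell out: the percentage difference δ(%) is taken relative to the average of each pair, and the ROC AUC is computed as the probability that a randomly chosen value of one method exceeds a randomly chosen value of the other method, with ties counted as one half. The function names and the paired values are hypothetical.

```python
from statistics import mean, stdev


def r_squared(x, y):
    """Squared Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def agreement_summary(t1, t2):
    """Bias (%), CV of the percentage difference, and rho2 versus the average."""
    avg = [0.5 * (a + b) for a, b in zip(t1, t2)]
    diff = [a - b for a, b in zip(t1, t2)]
    # Assumption: percentage difference is relative to the average of the pair.
    pct = [100.0 * d / m for d, m in zip(diff, avg)]
    return {
        "bias_pct": mean(pct),
        "cv_pct_diff": 100.0 * stdev(pct) / abs(mean(pct)),
        "rho2_diff": r_squared(avg, diff),
        "rho2_pct": r_squared(avg, pct),
    }


def roc_auc(group_a, group_b):
    """Probability that a value of group_a exceeds a value of group_b."""
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    return wins / (len(group_a) * len(group_b))


# Hypothetical paired values (illustration only).
method2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]     # reference method
method1 = [0.85, 1.55, 2.5, 3.1, 4.1, 4.7]   # tested method

summary = agreement_summary(method1, method2)
for name, value in summary.items():
    print(f"{name} = {value:.3f}")
print(f"AUC = {roc_auc(method1, method2):.3f}")
```

An AUC close to 0.5 means that the two sets of values cannot be distinguished as groups. Because this measure ignores the pairing of the samples, it is best read together with the paired measures above.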

2.2. Measurement of Nitrate in Human Urine: Comparison of GC-MS with GC-MS/MS

Nitrate (Scheme 1) is the major circulating and urinary metabolite of nitric oxide (NO) [81]. Nitrate in urine is a suitable measure of whole-body NO synthesis. Figure 1 shows the results from the re-evaluation of data previously reported by our group (Table 1 of Ref. [58]) regarding validation by GC-MS/MS of a GC-MS method for the quantitative analysis of nitrate in human urine. In the urine samples analyzed, the nitrate concentration ranged between about 100 µM and 4000 µM. The nitrate concentration was measured to be (mean ± SD) 1048 ± 1024 µM (CV, 98%) by GC-MS (method 1) and 1059 ± 1035 µM (CV, 98%) by GC-MS/MS (method 2). The values differed between the methods (p = 0.014; two-tailed Wilcoxon test).
Linear regression analysis between the data obtained by GC-MS and those by GC-MS/MS resulted in a regression equation with a very low y-axis intercept value α = 1.2, a slope value β = 0.988 close to unity, and a very high correlation coefficient (r2 = 0.9978). These observations suggest a very tight agreement between the GC-MS (method 1) and the GC-MS/MS (reference method 2) (Figure 1A).
The Bland–Altman approach revealed a very low difference between the two methods, δBA = −12 ± 50 µM, corresponding to a percentage difference δ(%) of −1.5 ± 2.7% (mean ± SD), which is only a very small portion of the mean concentration of nitrate measured in the whole concentration range (Figure 1B). Neither the difference (ρ2 = 0.05) nor the percentage difference (ρ2 = 0.02) correlated with the average concentration. Thus, the findings argue for a close agreement between the GC-MS and GC-MS/MS methods for nitrate in human urine.
The approach according to Oldham and Eksborg gave a concentration-independent ratio ΛOE = 0.986 ± 0.027, which is very close to unity and has a low CV value of only 2.8% (Figure 1C). In the Bland–Altman approach, the ratio of the two methods can also be plotted against the average of the two methods. It provided a value of 0.9858 ± 0.027, which is identical to the ratio ΛOE. This third approach used to compare the GC-MS method with the GC-MS/MS method for urinary nitrate strongly indicates that the GC-MS method is as suitable as the GC-MS/MS method for the accurate quantitative determination of nitrate in human urine.
The ROC approach on these data resulted in an AUC value of 0.531 ± 0.093 and a p value of 0.735. These data can be interpreted as indicating a high agreement between the two methods.

2.3. Measurement of Asymmetric Dimethylarginine (ADMA) in Human Plasma and Serum

Asymmetric dimethylarginine (ADMA) is an endogenous inhibitor of NO synthase (NOS), the enzyme that catalyzes the conversion of L-arginine to NO, and is a cardiovascular risk factor [81]. ADMA circulates in blood and is excreted in the urine. Several methods were developed for the measurement of ADMA, mainly in plasma and serum [63,64,65,66,67,82,83,84]. The concentration of ADMA in serum and heparinized plasma of humans was reported by many groups using HPLC, GC-MS, GC-MS/MS and LC-MS/MS methods and was found not to differ significantly, with deviations being in the order of 1% [82,83,84].
We have utilized this feature of ADMA and quantitated ADMA by GC-MS/MS in heparinized plasma and serum samples generated from blood samples of a patient suffering from end-stage kidney disease before, during, and after extended haemodialysis for 8 h [62]. The ADMA concentration (mean ± SD) was measured to be 1351 ± 386 nM (CV, 29%) in serum (method 2) and 1334 ± 383 nM (CV, 29%) in plasma (method 1). The values differed between the methods (p = 0.034; two-tailed Wilcoxon test). The results of the methods-comparison are shown in Figure 2.
Linear regression analysis between the data measured in serum (method 2) and those in plasma (method 1) resulted in a regression equation with a very low y-axis intercept value α = 12.7, a slope value β = 1.003 very close to unity, and a very high correlation coefficient (r2 = 0.9935). These observations suggest a very high agreement between the serum and the plasma ADMA levels (Figure 2A).
The Bland–Altman approach revealed a very low difference between the two methods (17.2 ± 1.2 nM), corresponding to a percentage difference of 1.3 ± 2.1% (bias), which is only a very small portion of the mean concentration of ADMA measured in the whole concentration range (Figure 2B). Neither the difference (ρ2 = 0.007) nor the percentage difference (ρ2 = 0.001) correlated with the average concentration. These findings argue for a close agreement between the plasma and serum “methods” for ADMA.
The approach according to Oldham and Eksborg gave ΛOE = 1.013 ± 0.0219, which is very close to unity and has a CV value of only 2.2% (Figure 2C). Thus, the third approach that was used to compare the serum and plasma levels of ADMA strongly indicates that ADMA can be measured equally accurately in human serum and plasma samples by GC-MS/MS.
The ROC approach on these data resulted in the AUC value of 0.549 ± 0.097 (p = 0.613), suggesting no difference, i.e., a high extent of agreement between the two methods.

2.4. Measurement of Anandamide in Human Plasma by GC-MS/MS and LC-MS/MS

Anandamide (AEA) is an endogenous cannabinoid, an endocannabinoid, and is mainly measured in human plasma and serum at concentrations in the upper-pM to lower-nM range [68,69,85,86]. For the measurement of AEA in human plasma, we developed GC-MS/MS [68] and LC-MS/MS [69] methods, which utilize stable-isotope-labelled AEA as the internal standard. In the GC-MS/MS method, AEA is derivatized, while in the LC-MS/MS method AEA is analyzed without derivatization. We compared these methods by parallel measurements of AEA in the plasma of healthy humans. In this comparison, we considered the GC-MS/MS method as the reference method.
The AEA concentration (mean ± SD) was measured to be 0.844 ± 0.289 nM (CV, 34%) by GC-MS/MS (method 2) and 0.729 ± 0.270 nM (CV, 37%) by LC-MS/MS (method 1). The values differed between the methods (p < 0.0001; two-tailed Wilcoxon test). The results of the methods-comparison are shown in Figure 3.
The linear regression analysis between the AEA concentrations measured by LC-MS/MS and those measured by GC-MS/MS resulted in a regression equation with a y-axis intercept value α = 0.013, a slope value β = 0.848, and a relatively small correlation coefficient (r2 = 0.8207). These observations suggest a weak agreement between LC-MS/MS and GC-MS/MS (Figure 3A).
The Bland–Altman approach revealed a low but considerably varying difference between the two methods (0.116 ± 0.123 nM; CV, 106%), corresponding to a percentage difference of 15.6% (bias) (Figure 3B). Neither the difference (ρ2 = 0.024) nor the percentage difference (ρ2 = 0.024) correlated with the average concentration. These findings argue for a weak agreement between the LC-MS/MS and GC-MS/MS methods regarding AEA measurement in human plasma samples.
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 0.866 ± 0.136, which is not close to unity and has a moderate variation (CV, 16%) (Figure 3C). Thus, the OE approach, as the third approach, confirms the results of the linear regression and Bland–Altman methods.
The ROC approach yielded an AUC value of 0.626 ± 0.024 and a p value < 0.0001, suggesting some extent of agreement between the two methods.
In summary, GC-MS/MS and LC-MS/MS are suitable for the measurement of AEA in human plasma but yield considerably different results.

2.5. Measurement of Homoarginine in Human Plasma by GC-MS and GC-MS/MS: Real Data and a Simulation

L-Homoarginine (hArg) is a non-proteinogenic amino acid. Low plasma and urinary hArg concentrations are considered to be risk markers for cardiovascular and renal diseases [87]. For the quantitative determination of hArg in human plasma, serum and urine samples, we have developed GC-MS and GC-MS/MS methods using L-[2H3]homoarginine (d3-hArg) as the internal standard [86]. In healthy humans, plasma, serum, and urine concentrations of hArg are on the order of 2–3 µM [87].
The hArg concentration (mean ± SD) was measured to be 0.747 ± 0.367 µM (CV, 49%) by GC-MS/MS (method 2) and 0.643 ± 0.302 µM (CV, 47%) by GC-MS (method 1). The values differed between the methods (p < 0.0001; two-tailed Wilcoxon test). The results of the methods-comparison are shown in Figure 4.
Linear regression analysis between the hArg concentrations measured by GC-MS (method 1, DSQ) and those measured by GC-MS/MS (method 2, TSQ) resulted in a regression equation with a y-axis intercept value α = 0.030, a slope value β = 0.821, and a correlation coefficient (r2 = 0.9943). These observations suggest a close correlation between GC-MS and GC-MS/MS, with the GC-MS method providing consistently lower hArg concentrations than the GC-MS/MS method (Figure 4A).
The Bland–Altman approach revealed a relatively low difference between the two methods (0.105 ± 0.070 µM), corresponding to a percentage difference of 14.3 ± 27% (bias) (Figure 4B1). The percentage difference (y) correlated very weakly (ρ2 = 0.1739, p < 0.0001) with the average concentration of hArg (x): y = 10.9 + 4.9 × x. The difference correlated more strongly (ρ2 = 0.8701, p < 0.0001) with the average concentration of hArg: y = −0.03 + 0.195 × x (Figure 4B2). These findings argue for a considerable agreement between the GC-MS and GC-MS/MS methods regarding hArg measurement in human plasma samples, with the GC-MS method resulting in consistently lower hArg concentrations.
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 0.868 ± 0.035 (CV, 4%), which is not close to unity (Figure 4C). Linear regression between the ΛOE ratio (y) and the average (x) resulted in the regression equation y = 0.897 − 0.043 × x (r2 = 0.173, p < 0.0001), indicating weak concentration-dependency.
The ROC approach yielded an AUC value of 0.596 ± 0.021 and a p value < 0.0001, suggesting some extent of agreement between the two methods.
The hArg concentrations measured by DSQ (i.e., by GC-MS) were changed by multiplication to reach higher (DSQ × 1.2) and lower (DSQ × 0.8, DSQ × 0.6) concentrations. All methods of comparison were used to compare the unchanged and changed hArg concentrations with those measured by TSQ (i.e., by GC-MS/MS). The results of these analyses are summarized in Table 2.
The best agreement between TSQ and DSQ was observed between the plasma hArg concentrations measured by TSQ and by DSQ × 1.2: β = 0.985, a very small difference δ and bias (δ%) in the Bland–Altman method with no linearity between the difference and the average (ρ2 = 0.027), a weakly (CV, 4%) varying Oldham–Eksborg ratio of 1.04, and an AUC value very close to 0.5, indicating complete agreement between the methods. These results suggest that r2 alone is not a useful measure of agreement between two methods.
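The reason why r2 is insensitive to such a multiplication can be made explicit with a short sketch. Multiplying all values of one method by a constant factor rescales the slope, the difference and the ratio, but leaves r2 unchanged. The values below are hypothetical and are not the hArg data of Table 2; the helper names are likewise assumptions.

```python
from statistics import mean


def slope_and_r2(x, y):
    """Least-squares slope of y on x and squared correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def compare(reference, tested):
    """Return (slope, r2, mean difference, mean ratio) of tested vs. reference."""
    beta, r2 = slope_and_r2(reference, tested)
    bias = mean([t - r for t, r in zip(tested, reference)])
    ratio = mean([t / r for t, r in zip(tested, reference)])
    return beta, r2, bias, ratio


# Hypothetical concentrations (illustration only).
tsq = [0.40, 0.60, 0.80, 1.00, 1.20, 1.50]  # reference method
dsq = [0.34, 0.50, 0.68, 0.84, 1.00, 1.27]  # tested method

results = {}
for factor in (0.6, 0.8, 1.0, 1.2):
    scaled = [factor * v for v in dsq]
    results[factor] = compare(tsq, scaled)
    beta, r2, bias, ratio = results[factor]
    print(f"DSQ x {factor}: beta = {beta:.3f}, r2 = {r2:.4f}, "
          f"bias = {bias:+.3f}, ratio = {ratio:.3f}")
```

The printed r2 is the same for every factor, whereas the slope, the bias and the ratio move with it, so only the latter measures reveal which scaling brings the two methods into agreement.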

2.6. Measurement of Homoarginine in Mouse Plasma by GC-MS/MS and LC-MS/MS

hArg was measured in the plasma of mice by GC-MS/MS [86] and LC-MS/MS [71]. The hArg concentrations were measured as 245 ± 105 nM (CV, 43%) by the GC-MS/MS and 270 ± 204 nM (CV, 76%) by the LC-MS/MS. These values did not differ from each other (p = 0.190; two-tailed Wilcoxon test). The results of the methods-comparison are shown in Figure 5.
Linear regression analysis between the hArg concentrations measured by the LC-MS/MS (method 1) and those measured by the GC-MS/MS (method 2) resulted in a regression equation with a y-axis intercept value α = −186, a slope value β = 1.86, and a correlation coefficient (r2 = 0.9175). These observations suggest a good correlation between LC-MS/MS and GC-MS/MS, yet with the LC-MS/MS providing higher values for hArg in mouse plasma than the GC-MS/MS (Figure 5A).
The Bland–Altman approach revealed a moderate difference between the two methods (−25 ± 108 nM) according to a percentage difference of 13.6 ± 25% (bias) (Figure 5B1). The percentage difference (y) correlated weakly (ρ2 = 0.5421, p < 0.0001) with the average concentration (x): y = 80 − 0.26 × x. The difference correlated more strongly (ρ2 = 0.8601, p < 0.0001) with the average concentration of hArg: y = 80 − 0.258 × x (Figure 5B2). These findings argue for a moderate agreement between the GC-MS/MS and LC-MS/MS methods regarding hArg measurement in mouse plasma samples.
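The Bland–Altman quantities used in this section can be sketched as follows. The code is an illustration under stated assumptions: the sign convention (a − b), the function names `bland_altman` and `r_squared`, and the data (method a reading exactly 20% higher than method b) are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def bland_altman(a, b):
    """Per-sample average, difference (a - b) and percentage difference."""
    avg = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    diff = [ai - bi for ai, bi in zip(a, b)]
    pct = [100 * d / m for d, m in zip(diff, avg)]
    return avg, diff, pct


# Hypothetical data: method a has a purely proportional error of +20%.
method_b = [100.0, 150.0, 200.0, 250.0, 300.0]
method_a = [1.2 * v for v in method_b]

avg, diff, pct = bland_altman(method_a, method_b)
mean_diff, sd_diff = mean(diff), stdev(diff)    # difference, mean and SD
mean_pct, sd_pct = mean(pct), stdev(pct)        # bias (%), mean and SD
rho2_diff = r_squared(avg, diff)                # difference vs. average

print(f"difference = {mean_diff:.1f} +/- {sd_diff:.1f}")
print(f"bias = {mean_pct:.1f} +/- {sd_pct:.1f} %")
print(f"rho2 (difference vs. average) = {rho2_diff:.3f}")
```

With a purely proportional error the absolute difference rises linearly with the average, whereas the percentage difference stays constant. For this reason both variants of the plot are worth examining.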
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 0.978 ± 0.439 (CV, 45%) which is close to unity, but is considerably variable (Figure 5C).
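The ratio and its variability can be computed as sketched below. The direction of the ratio (a divided by b), the function names and the example concentrations are assumptions made for illustration. The helper `cv_percent` merely recomputes the CV from a reported mean and SD.

```python
from statistics import mean, stdev


def cv_percent(mean_value, sd_value):
    """Coefficient of variation in percent."""
    return 100 * sd_value / mean_value


def oldham_eksborg(a, b):
    """Mean, SD and CV (%) of the per-sample ratio a/b."""
    ratios = [ai / bi for ai, bi in zip(a, b)]
    m, sd = mean(ratios), stdev(ratios)
    return m, sd, cv_percent(m, sd)


# Hypothetical paired concentrations (NOT the study data).
method_a = [90.0, 160.0, 190.0, 260.0, 310.0]
method_b = [100.0, 150.0, 200.0, 250.0, 300.0]

ratio_mean, ratio_sd, ratio_cv = oldham_eksborg(method_a, method_b)
print(f"ratio = {ratio_mean:.3f} +/- {ratio_sd:.3f} (CV, {ratio_cv:.0f}%)")

# Consistency check of the summary reported above: 0.978 +/- 0.439.
print(f"CV from reported mean and SD: {cv_percent(0.978, 0.439):.0f}%")
```

A mean ratio close to unity is therefore not sufficient on its own; the CV shows how much the ratio scatters from sample to sample.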
The ROC approach resulted in an AUC value of 0.508 ± 0.021 and a p value of 0.8688, suggesting a good agreement between the two methods.
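The following sketch shows one way in which such an AUC value can arise. It assumes that the ROC analysis treats the values of the two methods as two unpaired groups, in which case the AUC equals the probability that a value from one group exceeds a value from the other (the Mann–Whitney statistic divided by the number of pairs). This is an assumption about the procedure; the function name `roc_auc` and the data are hypothetical.

```python
def roc_auc(group_a, group_b):
    """AUC as the probability that a value from group_a exceeds a value
    from group_b; ties are counted as one half."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Hypothetical data: the same four values, but paired in reverse order,
# so that three of the four sample pairs disagree strongly.
method_1 = [1.0, 2.0, 3.0, 4.0]
method_2 = [4.0, 3.0, 2.0, 1.0]
auc_same = roc_auc(method_1, method_2)

# Hypothetical data: two completely separated groups.
auc_sep = roc_auc([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])

print(auc_same, auc_sep)
```

Under this assumption the AUC compares only the two distributions and ignores the pairing of samples. An AUC close to 0.5 can thus coexist with considerable disagreement in the individual pairs.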
In summary, the Bland–Altman and the Oldham–Eksborg approaches indicate considerable disagreement between the GC-MS/MS and LC-MS/MS methods for hArg measurement in mouse plasma. Disagreement is especially visible at hArg concentrations lower than 200 nM (Figure 5), presumably because of the lower sensitivity of the LC-MS/MS method in terms of a higher limit of quantitation (LOQ).
The LC-MS/MS method for hArg was compared with an ELISA method for this amino acid in human plasma [71]. The hArg concentrations measured by ELISA (y) correlated with those measured by LC-MS/MS (x): y = 0.04 + 0.76 × x, r2 = 0.78 [71]. Thus, LC-MS/MS yielded consistently higher hArg values than ELISA (p < 0.001). Analysis by the Bland–Altman approach resulted in a considerable difference of 0.50 ± 0.39 µM hArg between ELISA and LC-MS/MS. The data indicate a considerable disagreement between LC-MS/MS and ELISA for hArg measurement in human plasma.

2.7. Measurement of F2-Isoprostanes in Biological Samples

Free-radical-induced peroxidation of arachidonic acid and other polyunsaturated fatty acids esterified to lipids and subsequent hydrolysis generate prostaglandin-like compounds including isoprostanes and neuroprostanes. These compounds are accessible for quantitative determination in tissue, plasma and urine. F2-Isoprostanes have emerged as markers of lipid peroxidation in vivo in humans. Among them, 8-iso-prostaglandin (PG) F, also known as 8-iso-PGF, 8-epi-PGF, 15-F2t-IsoP, iPF-III, and its major urinary metabolites, i.e., 2,3-dinor-4,5-dihydro-8-iso-PGF and 2,3-dinor-8-iso-PGF, were the subjects of extensive investigation [72,73,74,79,88,89,90,91]. Analytical approaches for F2-isoprostanes include chromatographic techniques such as thin-layer chromatography (TLC), high-performance liquid chromatography (HPLC), and GC, particularly in combination with MS. F2-Isoprostanes are extracted from biological samples by solid-phase extraction (SPE) or immunoaffinity column chromatography (IAC). HPLC and TLC are used for the isolation of particular F2-isoprostanes.

2.7.1. Comparison between EIA and GC-MS

Figure 1 of the article by Devaraj et al. [73] shows the relationship for “F2Isoprostanes” in urine as measured by the authors’ group by using a commercially available EIA (i.e., method 1) and as measured by the group of Roberts by using a GC-MS method (i.e., method 2) after Morrow and Roberts [72].
The F2-isoprostanes concentration (mean ± SD) was measured as being 2.183 ± 1.623 ng/mg (CV, 74%) by the GC-MS (method 2) and 2.037 ± 1.135 ng/mg (CV, 56%) by EIA (method 1). The values did not differ between the methods (p = 0.612; two-tailed Wilcoxon test). The results of the methods-comparison are shown in Figure 6.
The linear regression analysis between the F2-isoprostanes concentrations measured in urine by EIA (method 1) and those measured by GC-MS (method 2) resulted in a regression equation with a y-axis intercept value α = 0.814, a slope value β = 0.560, and a correlation coefficient (r2 = 0.6422). These observations suggest a weak correlation between EIA and GC-MS (Figure 6A). On the basis of the data of Figure 6A, and on the assumption that the GC-MS method is the reference method, one may, at first glance, conclude that the EIA method is valid for the intended analyte. However, the calculated correlation coefficient r2 value of 0.64 across the whole concentration range is rather low. Nevertheless, such a value is frequently considered to be high enough throughout the literature.
The y-axis intercept α and the slope β values of the regression equation are often disregarded when methods are compared. The y-axis intercept value α of 0.81 of the regression equation in the whole concentration range of this example (Figure 6A) is clearly far from the value of zero, which would indicate complete agreement. Moreover, this value may suggest that the LOQ value of the EIA method is about 0.8 ng/mg creatinine for the analyte in urine, with the lower values being most likely highly overestimated. The slope value β of 0.56 (for the whole range) suggests that the concentrations measured by this EIA method are statistically half of those measured by the GC-MS method. This finding seems to be supported by data in the literature [91] which suggest that the EIA method detects only one F2-isoprostane out of 64 potential isomers [89], i.e., 15(S)-8-iso-PGF, while the reported physicochemical methods, GC-MS [89] and GC-MS/MS [91], that do not use specific immunoaffinity column chromatography (IAC) extraction for 15(S)-8-iso-PGF, may detect an unknown number of additional F2-isoprostanes. The contribution of those additional F2-isoprostanes is presumably of the same extent (of about 50%) as that of 15(S)-8-iso-PGF alone [91]. However, in the lower and by far more relevant concentration range (both for controls and diabetes patients investigated in the study by Devaraj et al. [73]), there is no correlation between the EIA method and the GC-MS method (Figure 6A), i.e., r2 = 0.16 for the 0–3 range (n = 53) and r2 = 0.006 for the 0–3 range (n = 39). Figure 6A clearly demonstrates that the linearity between methods 1 and 2 is solely the result of very few (about 10% of the total) high concentration points (for a discussion see Ref. [24]). Thus, the value of linear regression analysis is limited, and the sole use of correlation coefficients may be misleading and may even feign correlation, because deviations in the lower concentration range are difficult to detect [8].
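The effect of a few high concentration points on r2 can be reproduced with constructed numbers. The sketch below is illustrative only: the ten data pairs, the cut-off of 3 and the function name `r_squared` are hypothetical and do not reproduce the data of Figure 6A.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical pairs (reference method x, test method y): eight uncorrelated
# points in the low range plus two points at high concentrations.
pairs = [
    (1.0, 2.0), (1.5, 1.5), (2.0, 2.5), (2.5, 1.5),
    (1.0, 1.5), (1.5, 2.5), (2.0, 1.5), (2.5, 2.0),
    (7.0, 5.0), (8.0, 6.0),
]

x_all = [p[0] for p in pairs]
y_all = [p[1] for p in pairs]
r2_all = r_squared(x_all, y_all)

# Restrict the analysis to the lower concentration range (x <= 3).
low = [p for p in pairs if p[0] <= 3.0]
r2_low = r_squared([p[0] for p in low], [p[1] for p in low])

print(f"r2, whole range: {r2_all:.2f}; r2, low range: {r2_low:.2f}")
```

In this constructed case, two of the ten points are sufficient to produce a high r2 over the whole range, although the two methods are uncorrelated in the low range. Repeating the regression for the relevant sub-range is therefore a simple safeguard.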
This comparison suggests that the correlation coefficient r2 = 0.64 is obviously too low and the agreement is analytically insufficient [90].
The Bland–Altman approach revealed a moderate difference between the two methods (−0.147 ± 0.985 ng/mg) according to a percentage difference of −1.07 ± 47% (bias) (Figure 6B). The percentage difference (y) correlated very weakly (ρ2 = 0.079, p = 0.022) with the average concentration (x): y = −22 + 10 × x. The difference (y) correlated very weakly (ρ2 = 0.279, p < 0.0001) with the average concentration (x): y = −0.68 + 0.392 × x. These findings argue for a weak agreement between the EIA and GC-MS methods with respect to F2-isoprostanes measurement in human urine samples.
The Bland–Altman approach (Figure 6B), and a deeper examination, reveal considerable disagreement between the EIA method and the reference GC-MS method [72]. The disagreement applies to the vast majority of those concentration points in the relevant concentration range for these substances [89], i.e., for the lower concentration range (see also Refs. [74,90]). Interestingly, the mean difference between the two methods is 0.15 (see Formula (2)), which is considerably low and close to zero. However, the standard deviation of the mean difference is 0.98, which is high and within the range of measured concentrations.
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 1.159 ± 0.691 (CV, 60%) which is not very far from unity but is very variable (Figure 6C). This indicates a rather poor agreement between the EIA and the GC-MS methods, notably in the lower and more relevant concentration range (Figure 6C).
The ROC approach resulted in an AUC value of 0.510 ± 0.051 and a p value of 0.8413, suggesting agreement between the two methods.

2.7.2. Comparison between ELISA and LC-MS/MS

Figure 5 of the article by Yan et al. [74] shows the linear regression between “iPF-III by ELISA (pg/mL)” and “iPF-III by LC-MS/MS (pg/mL)” in urine as measured by the authors themselves by using a commercially available ELISA (i.e., method 1) and as measured by the same group by using an LC-MS/MS method (i.e., method 2). It should be noted that iPF-III and 8-iso-PGF are abbreviations for the same F2-isoprostane [88]. The largest part of the originally reported data of the study by Yan et al. [74], i.e., 67 out of 86 (78% of total) could be re-evaluated by the author of the present article and they are presented and discussed below.
The F2-isoprostane iPF-III concentration (mean ± SD) was measured to be 934 ± 506 pg/mL (CV, 54%) by ELISA (method 1) and 417 ± 241 pg/mL (CV, 58%) by LC-MS/MS (method 2). The values differed between the methods (p < 0.0001; two-tailed Wilcoxon test). Other results of the methods-comparison are shown in Figure 7.
Linear regression analysis between the iPF-III concentrations measured in urine by ELISA (method 1) and those measured by LC-MS/MS (method 2) resulted in the regression equation with a y-axis intercept value α = 176, a slope value β = 1.82, and a correlation coefficient (r2 = 0.7518). These observations suggest a weak correlation between the ELISA and LC-MS/MS methods (Figure 7A).
Yan et al. reported in their article that ELISA and LC-MS/MS provided very similar results [74]. However, the ELISA method provides on average about two times higher values than the LC-MS/MS method. It is worth mentioning that the theoretical value of the slope of the regression line should be about 0.5 [79]. This is because ELISA most likely detects only one F2-isoprostane (S-form), while LC-MS and LC-MS/MS detect at least two F2-isoprostanes (R- and S-forms). Thus, the ELISA method actually provides values which are on average about 4 times higher than those measured by the LC-MS/MS method.
The Bland–Altman approach revealed a considerable difference between the two methods (−517 ± 320 pg/mL) according to a percentage difference of 75 ± 29% (bias) (Figure 7B). The percentage difference did not correlate with the average concentration. However, the mean difference correlated with the average (y = −9.2 − 0.752 × x, ρ2 = 0.7253, p < 0.0001), indicating a considerable proportional error of the ELISA method. These findings argue for a weak agreement between the ELISA and LC-MS methods with respect to F2-isoprostanes measurement in human urine samples.
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 2.412 ± 0.920 (CV, 38%) which is far from unity (Figure 7C) and supports a disagreement between ELISA and LC-MS/MS, notably in the lower concentration range (Figure 7C). This example supports the critique by Altman and Bland on the use of the correlation coefficient for evaluating method-agreement of clinical measurement [10], even when the correlation coefficient value is fairly high (r2 = 0.75) as in the present example (Figure 7A).
The ROC approach yielded an AUC value of 0.845 ± 0.033 and a p value < 0.0001, suggesting a very low extent of agreement between the two methods.

2.7.3. Comparison between GC-MS and LC-MS

Sircar and Subbaiah [90] compared their LC-MS method, which measures only 15(S)-8-iso-PGF (due to the use of IAC extraction), with the GC-MS method of Morrow and Roberts [72], which measures 15(S)-8-iso-PGF and additional F2-isoprostanes in urine (Figure 8).
The F2-isoprostane 15(S)-8-iso-PGF concentration (mean ± SD) was measured to be 0.825 ± 0.349 ng/mL (CV, 42%) by LC-MS (method 1) and 3.605 ± 1.362 ng/mL (CV, 38%) by GC-MS (method 2). The values differed between the methods (p < 0.0001; two-tailed paired t test). The results of the methods-comparison are shown in Figure 8.
Linear regression analysis between the iPF-III concentrations measured in urine by GC-MS (method 2) and those measured by LC-MS (method 1) resulted in the regression equation with a high y-axis intercept value α = 0.996, a high slope value β = 3.62, and a low correlation coefficient (r2 = 0.6555) (Figure 8A). These observations suggest a weak correlation between GC-MS and LC-MS.
The Bland–Altman approach revealed a moderate difference between the two methods (−0.147 ± 0.985 ng/mg; CV, 335%) according to a percentage difference of 125 ± 15% (bias) (Figure 8B). The percentage difference did not correlate with the average concentration (ρ2 = 0.028, p = 0.643), whereas the difference correlated with the average concentration (ρ2 = 0.906, p < 0.0001). These findings argue for a weak agreement between the GC-MS and LC-MS methods with respect to the iPF-III measurement in human urine samples.
The approach according to Oldham and Eksborg resulted in the very high ratio ΛOE = 4.513 ± 1.143 (CV, 25%) which is far from unity (Figure 8C). However, such a high ratio is to be expected because the GC-MS method measures several F2-isoprostanes in addition to 15(S)-8-iso-PGF [72], while the LC-MS method measures only 15(S)-8-iso-PGF due to the use of a specific IAC extraction. The informative value of this comparison is considered low because of the small number of urine samples (n = 10).
The ROC approach resulted in an AUC value of 0.990 ± 0.016 and a p value of 0.0002, suggesting no agreement between the two methods.

2.8. An Example for Clinical Measurement-Systolic Blood Pressure [75,76]

Figure 9 shows the results from the application of the three method-comparison approaches to an example for a typical clinical measurement, i.e., for measuring systolic blood pressure (SBP) in 25 patients with essential hypertension, which was reported elsewhere [75,76]. In order to be able to evaluate the data in the same manner as in the examples discussed above, one of the methods used to measure blood pressure was chosen arbitrarily as the reference method (i.e., method 2) by the author of the present article.
The SBP values were measured to be 167 ± 25 mmHg (CV, 15%) by method 2 and 178 ± 29 mmHg (CV, 16%) by method 1, and were found to differ significantly (p < 0.0001; two-tailed paired t test), albeit by a relatively low extent of 6%.
Linear regression analysis between the two methods resulted in the regression equation with a y-axis intercept value α = −7, a slope value β = 1.11, and a correlation coefficient (r2 = 0.9113) (Figure 9A). These observations suggest a good correlation between the two methods of blood pressure measurement, with method 1 providing on average 1.1 times higher SBP values.
The Bland–Altman approach revealed a moderate difference between the two methods according to a percentage difference of −6 ± 4.6% (bias) (Figure 9B).
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 0.943 ± 0.044 (CV, 5%) which is close to the unity and little variable (Figure 9C).
All three approaches indicate a considerable agreement between the compared methods of SBP measurement (Figure 9). The ROC approach resulted in an AUC value of 0.609 ± 0.080 and a p value of 0.1870, also suggesting agreement between the two methods.

2.9. A Second Example for Clinical Measurement-Peak Expiratory Flow Rate [10]

Figure 10 shows the results from the application of three method-comparison approaches to a clinical measurement, i.e., for measuring peak expiratory flow rate (PEFR) by two methods in 17 subjects, originally reported by Bland and Altman [10].
The PEFR values were measured to be 450 ± 116 mL/min (CV, 26%) by method 2 and 453 ± 113 mL/min (CV, 25%) by method 1, and differed significantly (p < 0.0001; two-tailed paired t test), albeit by a relatively low extent of 0.7%.
Linear regression analysis between the two methods resulted in the regression equation with a y-axis intercept value α = 39, a slope value β = 0.917, and a correlation coefficient (r2 = 0.8898) (Figure 10A). These observations suggest a good correlation between the two methods of PEFR measuring, with method 2 providing on average 0.92 times lower PEFR values.
The Bland–Altman approach revealed a small difference between the two methods (−2.1 ± 39 mL/min), yet with a considerable variability according to a percentage difference of −1 ± 12% (bias) (Figure 10B). The difference did not correlate with the average PEFR.
The approach according to Oldham and Eksborg resulted in the ratio ΛOE = 0.995 ± 0.114 (CV, 5%) which is very close to the unity and little variable (Figure 10C).
The ROC approach resulted in an AUC value of 0.509 ± 0.1015 and a p value of 0.9314, also suggesting good agreement between the two methods of PEFR measurement.

3. Discussion

In the present work, published data from studies reporting on method-comparison were re-analyzed and re-evaluated by three method-comparison approaches, i.e., linear regression (LR) analysis and the Bland–Altman (BA) method, which are the most frequently used approaches, and the Oldham–Eksborg (OE) method, which is much less frequently used in comparison of analytical methods. The Oldham–Eksborg method is closely comparable with the Bland–Altman variant, in which the ratio of the results provided by two methods is plotted against their average. In the studies considered here, one of the applied methods was based on mass spectrometry, i.e., GC-MS, GC-MS/MS, LC-MS or LC-MS/MS. The analytes measured by the methods are all physiological low-molecular-mass substances, belong to different chemical classes and pathways, and require chemical derivatization for analysis by GC-MS-based methods (Scheme 1). Chosen biological matrices were human biological samples including plasma, serum and urine. The results of the present work are summarized in Table 1 and selected examples are illustrated in Figures 1–10.
GC-MS/MS and LC-MS/MS methods were among the methodologies used in these comparative studies. They are generally considered best suited as reference methods, i.e., the Gold Standards. The tandem mass spectrometry technique, when used in combination with chromatography, e.g., GC or LC (Schemes S1 and S2) in analytical chemistry, allows the generation of accurate quantitative results, i.e., the concentrations of analytes in complex biological samples, even if the analytes are isomeric (Figure S2). This presupposes that the methods are applied properly [48]. Analytical methods based on GC-MS/MS or LC-MS/MS are superior to those based on GC-MS or LC-MS, respectively, because tandem mass spectrometry (MS/MS) is able to exclude (entirely) potentially interfering, mostly unknown analytes (Schemes S1 and S2). Of particular importance is the unique feature of MS to enable use of stable-isotope-labelled analogs of the analytes as internal standards in quantitative analyses. In contrast to the high specificity of MS-based methods, analytical methods that utilize much less specific physicochemical properties of analytes, such as light absorbance or fluorescence detection, even in combination with GC or LC, are generally considered less accurate because of the susceptibility to interferences and a lack of analytical sensitivity, i.e., too-high limits of detection. On the basis of these facts, MS/MS methods may reasonably be considered superior to MS methods on the one hand, and MS methods superior to non-MS methods, such as GC coupled to flame ionization detection (FID) or electrochemical detection (ECD), and LC coupled to UV/vis absorbance or fluorescence detection, on the other hand. Eventually, non-MS methods may be considered superior to batch assays which are free of any chromatographic or immunologic separation.
Well-documented examples for batch assays include the analysis of nitrite based on the Griess reaction [92] and of malondialdehyde (MDA) based on the use of thiobarbituric acid (TBA) [93,94]. In a comparison of analytical methods, those based on GC-MS/MS and LC-MS/MS can serve as reference methods not only for those based on GC-MS and LC-MS, but also for non-MS-based analytical methods including immunological methods. Although this view is widespread, few scientists practise this principle in the field of bioanalysis.
In the present work, the Bland–Altman approach, the Oldham–Eksborg approach and the standard linear regression analysis were applied to compare published analytical methods for the measurement in biological samples of a series of structurally different physiological substances. Examples were taken from the author’s group and other groups who reported data from the use of two different analytical methods. Evaluations of comparability and agreement can be performed by using characteristic, preferably dimensionless, parameters of the abovementioned approaches. They include slope (β) and coefficients of correlation (r2), the Oldham–Eksborg ratio (ΛOE), the percentage difference, i.e., the bias (δ (%)), as well as the coefficient of correlation (ρ2) obtained from linear regression analysis of the difference δ versus the average μ in the Bland–Altman approach. The relation between difference and mean in the Bland–Altman approach was addressed by Bland and Altman, who found that in some cases the difference δ may be proportional to the mean μ [10], yet it was not further considered. This issue was addressed by Ludbrook [25]. Sporadically, such as in ophthalmology and vision science, exact parametric confidence intervals for the Bland–Altman approach were proposed and reviewed [28,95]. These approaches do not consider additional methods such as those investigated in the present work. In the present study, AUC data obtained from ROC analyses were also considered, an approach that is rarely used in comparisons of analytical methods.
Statistically significant differences between the τ1 and τ2 values may not indicate disagreement between the methods. Values of β, r2 and ΛOE of the order of 1.00, δ(%) and ρ2 values close to 0.00, and ROC-AUC (AUC) values close to 0.5 would indicate perfect agreement between the two methods compared. The present study indicates that perfect agreement is an exception rather than a rule. By contrast, β, r2 and ΛOE values different from 1.00, δ (%) and ρ2 values different from 0.00, and AUC values different from 0.5 would also not decisively indicate disagreeing methods. Rather, as in statistical analyses where statistical significance is defined arbitrarily, for instance p < 0.05, assessing the extent of agreement or disagreement of two methods demands definition of ranges rather than discrete values for p, β, r2, ΛOE, δ (%) and possibly for AUC as well. As the definition of values and ranges is, per se, arbitrary, agreement or disagreement between two methods would also be arbitrary and relative. This resembles in many aspects the validation of analytical methods, for which quantitative criteria were achieved by consensus and are widely used in the field of analytical chemistry for various types of analyses. These criteria include the precision in terms of the relative standard deviation (RSD) or the coefficient of variation (CV), the accuracy of the method in terms of recovery (%) for analytes added to biological samples at relevant concentrations, and the limits of detection (LOD) and quantitation (LOQ), usually on the basis of the signal-to-noise (S/N) ratio. Such criteria have not been declared for the agreement or disagreement of methods, irrespective of the method-comparison approach. This is particularly the case with the Bland–Altman approach, which is mostly “degraded” to a simple plot.
On the basis of the results reported in the present work, each of the three currently available method-comparison approaches alone may be useful for comparing analytical methods, and for finding out which of the two compared methods is able to provide better, specifically more accurate, results. Yet, without a definition of reference methods and without a definition of quantitative criteria for the extent of agreement between two methods, no objective assessment is possible. A definition of acceptance criteria for main characteristic parameters for each method-comparison is required.
Yet, a reliable solution to this problem is likely to require the definition of a composite of all single parameters: p, β, r2, δ(%), ρ2, ΛOE, AUC. Such a composite may provide maximum information about agreement or disagreement between the two analytical methods being compared. A way to overcome this dilemma could be to accept fully validated and published MS/MS-based methods as the reference methods. This assumption is reasonable and justified because of the inherent accuracy of the MS/MS-technique, provided it is performed correctly and errors, such as contamination, artificial formation, or the degradation of analytes during sampling, sample storage, derivatization, and analysis, are eliminated [47,48]. Strictly speaking, comparisons of two methods require the use of validated protocols for each method and the performance of comparison studies in parallel under optimum conditions for each method [96,97], and should also include the use of standardized reference compounds for an analyte and its stable-isotope-labelled analog in GC-MS/MS and LC-MS/MS methods (see Figure 3) [68,69].
Special emphasis and consideration should be given when comparing chemical and immunological methods or immunological methods such as immunoaffinity chromatography (IAC) that are used for the isolation of certain analytes from biological samples prior to chemical analysis. Without a consideration of such aspects, considerable disagreement between two methods is expected to be observed, as is shown in the present work for F2-isoprostanes (Figure 6 and Figure 7) [72,73,74,79,88,89,90,91].
Most frequently, the Bland–Altman approach plots the absolute difference of two methods versus the average, as originally proposed by Bland and Altman [10]. Plotting the percentage difference against the average of two methods is also widely used. In the present work, two examples were presented which indicate that only one of the two Bland–Altman plots may reveal additional information about the agreement/disagreement between the methods which has not been reported by Bland and Altman. Observation of a linearity between the difference of two methods and their average is often interpreted as a systematic error and is even used to identify systematic errors [70,97,98]. In contrast, the lack of linearity between the percentage difference of two methods and their average may erroneously exclude the presence of a systematic error. It is therefore advisable to test this kind of linearity.
The analysis of the results observed in the present work (Table 1) revealed that the p value from the Wilcoxon test correlated inversely with the AUC value from the ROC analysis: r = −0.662, p = 0.042. The r2 value from the linear regression analysis correlated inversely with the δ(%) value of the Bland–Altman approach: r = −0.699, p = 0.029, and with the CV value from the Oldham–Eksborg approach: r = −0.780, p = 0.010. The δ(%) value from the Bland–Altman approach correlated directly with the coefficient of correlation ρ2 from the linear regression analysis of the Bland–Altman difference vs. the average concentration: r = 0.729, p = 0.021, as well as with the CV value from the Oldham–Eksborg approach: r = 0.669, p = 0.039. Such correlations may suggest that many parameters rather than a single parameter of the three methods-comparison approaches may be useful in assessing agreement and in determining its extent.
Figure 11 shows the results separately for p (Wilcoxon or paired t test), β, r2, ρ2, Λ, AUC, δ(%) and CVOE taken from 10 examples listed in Table 1.
Figure 12 shows the results for the sum of the statistical parameters for each analyte in these examples. The analytes included (n = 8) were nitrate, ADMA, AEA, hArg, F2-isoprostanes (Iso), SBP and PEFR. The numerically highest values were observed for δ (%) and CVOE with respect to the statistical parameters, and for hArg (case hArg b) and the F2-isoprostanes and CVOE with respect to the analytes.

4. Proposal of Criteria for Method Agreement

The results of the present study suggest that reliable comparison of two analytical methods is best performed by using a combination of different statistical methods. Statistical difference between the concentrations of an analyte measured in a biological sample is not useful in assessing agreement. We can dispense with the use of the paired t test or Wilcoxon test. Linear regression analysis is useful, but we need not only the coefficient of correlation (r or r2), but also the slope β of the regression line. The closer r and β are to a value of 1.0, the higher the extent of agreement. However, linear regression does not reveal potentially important differences between the methods. These can be observed by the Bland–Altman approach, when the difference δAB is proportional to the mean μAB. The coefficient of correlation ρ2 in the Bland–Altman approach is a useful parameter in assessing agreement. This is clearly visible in Figure 13A, notably in the measurement of hArg in plasma by GC-MS and GC-MS/MS. The Oldham–Eksborg approach provides values of the ratio Λ which correlate with β and r2. Thus, the closer Λ is to the value of 1.0, the higher the extent of the agreement. The bias, i.e., the percentage difference between the methods δ(%) in the Bland–Altman approach, correlates with the coefficient of variation CVOE of the Oldham–Eksborg ratio Λ (Figure 13B).
The absolute value of the difference δ is less informative. It can therefore be concluded that β, r2, ρ2, ΛOE, δ (%) and CVOE are useful in evaluating method agreements and in determining the extent of agreement between two analytical methods. Figure 13 suggests that good agreement between two methods exists when β, r2, ΛOE do not differ from 1.00 by more than 7 to 11%, and δ (%) and CVOE are below 12%.
In an analogy to currently available guidelines for chemical analytical methods with respect to method validation [52,53,54,55,56,57,99], acceptance criteria for agreement could be defined as ±15% from 1.00 for β, r2, ΛOE and ±15% for δ (%).
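The proposed acceptance criteria lend themselves to a simple automated check. The following sketch is one possible reading of the proposal, not a validated procedure: the function names are hypothetical, "±15% for δ (%)" is interpreted as an absolute bias of at most 15%, and the input values are the summary parameters reported above for the SBP, PEFR and ELISA versus LC-MS/MS examples.

```python
def within_tolerance(value, target, tol):
    """True if value lies within target +/- tol."""
    return abs(value - target) <= tol


def meets_proposed_criteria(beta, r2, lam, bias_pct, tol_pct=15.0):
    """Check beta, r2 and the Oldham-Eksborg ratio against 1.00 +/- tol_pct
    and the Bland-Altman bias (%) against 0 +/- tol_pct."""
    ratio_like_ok = all(
        within_tolerance(v, 1.0, tol_pct / 100) for v in (beta, r2, lam)
    )
    return ratio_like_ok and abs(bias_pct) <= tol_pct


# Summary parameters as reported in Sections 2.7.2, 2.8 and 2.9.
examples = {
    "SBP": dict(beta=1.11, r2=0.9113, lam=0.943, bias_pct=-6.0),
    "PEFR": dict(beta=0.917, r2=0.8898, lam=0.995, bias_pct=-1.0),
    "iPF-III, ELISA vs LC-MS/MS": dict(beta=1.82, r2=0.7518, lam=2.412, bias_pct=75.0),
}

results = {name: meets_proposed_criteria(**p) for name, p in examples.items()}
for name, ok in results.items():
    print(f"{name}: {'agreement' if ok else 'no agreement'}")
```

Under this reading, the two clinical examples fulfil the ±15% criteria, whereas the ELISA versus LC-MS/MS comparison fails on every parameter. The tolerance is passed as an argument so that other consensus values can be tested.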

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules28134905/s1.

Figure S1. Number of yearly citations of the papers by (A) J.M. Bland and D.G. Altman [10] and (B) M.M. Bradford [100] according to Scopus (Elsevier) from 1976 to 11 January 2023. Bland and Altman reported an approach to method comparison that is widely known as the Bland–Altman plot [10]. Bradford reported a method for the measurement of protein concentration utilizing the principle of protein-dye binding [100]. The paper by M.M. Bradford is thematically unrelated to the present work, but it helps to put the scientific impact of the paper by J.M. Bland and D.G. Altman into perspective.

Scheme S1. Schematic of the principles of mass spectrometry (MS) and tandem mass spectrometry (MS/MS) based on the quadrupole (Q) technology, exemplified for two structurally closely related analytes A and B which co-elute (same retention time, tR) and ionize to form two isobaric ions (same mass-to-charge ratio, m/z). (Upper left) Analytes A and B cannot be discriminated by single-stage quadrupole (SSQ) MS spectrometers. (Upper right, lower left) In the collision chamber (i.e., the second quadrupole Q2) of triple-stage quadrupole (TSQ) MS/MS spectrometers, collision-induced dissociation (CID) of the precursor ions A and B with argon atoms produces several common and two distinctly different product ions (indicated by dotted arrows). (Lower right) The third quadrupole Q3 of TSQ MS/MS spectrometers separates the different product ions (set in dotted circles) formed in Q2. Thus, unlike SSQ MS spectrometers, TSQ MS/MS spectrometers can discriminate between analytes that co-elute and ionize in the ion source to form isobaric ions (same m/z). CID in Q2 and subsequent second mass separation in Q3 in TSQ instruments and related MS/MS instruments guarantee unique specificity. This feature makes MS/MS-based analytical methods the most qualified candidates to serve as reference methods, as the Gold Standard, for numerous analytes. See also Figure S2 and Scheme S2.

Scheme S2. Schematic of the most frequently used modes in quantitative analyses of a target analyte A on quadrupole instruments, using its stable-isotope-labelled analogue as the internal standard. (A) Selected-ion monitoring (SIM) by mass spectrometry (MS) and (B) selected-reaction monitoring (SRM) by tandem mass spectrometry (MS/MS). For more details, see the text.

Figure S2. GC-MS (A,B) and GC-MS/MS (C,D) spectra of the pentafluorobenzyl (PFB) esters of 9-nitro-oleic acid (9-NO2-OA) and 10-nitro-oleic acid (10-NO2-OA). Electron-capture negative-ion chemical ionization (ECNICI) of the PFB esters of 9-NO2-OA (A) and 10-NO2-OA (B) leads to almost identical mass spectra, with the most intense ion being [M–PFB]− with m/z 326. In GC-MS/MS, the isobaric (m/z 326) parent (P) ions ([M–PFB]−, [P]−) are selected by Q1 and subjected in Q2 to collision-induced dissociation (CID), and the product ions formed in Q2 are separated by Q3. The product ion mass spectra of 9-NO2-OA (C) and 10-NO2-OA (D) are different. The product ions m/z 195 and m/z 197 are produced from 9-NO2-OA, but not from 10-NO2-OA. Thus, 9-NO2-OA and 10-NO2-OA can be discriminated by GC-MS/MS even if their PFB esters co-elute. See also Scheme S1 and Ref. [101].

Funding

This research received no external funding.

Institutional Review Board Statement

This study did not use animal or human materials.

Informed Consent Statement

Not applicable.

Data Availability Statement

The study did not report any data.

Acknowledgments

The author is grateful to previous and current members of his group for their contributions over the last four decades. The author sincerely thanks Pedro Araujo (Norwegian Institute of Marine Research, Feed and Nutrition Group, N-5817 Bergen, Norway) and Stefanos A. Tsikas (Hannover Medical School, Academic Controlling, Hannover, Germany) for fruitful discussions.

Conflicts of Interest

The author declares no conflict of interest.

Sample Availability

Not available.

References

  1. Deming, W.E. Statistical Adjustment of Data; John Wiley and Sons: New York, NY, USA, 1943; p. 184.
  2. Oldham, P.D. Measurement in Medicine: The Interpretation of Numerical Data; English Universities Press: London, UK, 1968.
  3. Westgard, J.O.; Hunt, M.R. Use and interpretation of common statistical tests in method-comparison studies. Clin. Chem. 1973, 19, 49–57.
  4. Wakkers, P.J.M.; Hellendoorn, H.B.A.; Op De Weegh, G.J.; Heerspink, W. Applications of statistics in clinical chemistry. A critical evaluation of regression lines. Clin. Chim. Acta 1975, 64, 173–184.
  5. Brace, R.A. Fitting straight lines to experimental data. Am. J. Physiol. 1977, 233, R94–R99.
  6. Cornbleet, P.J.; Gochman, N. Incorrect least-squares regression coefficients in method-comparison analysis. Clin. Chem. 1979, 25, 432–438.
  7. Smith, D.S.; Pourfarzaneh, M.; Kamel, R.S. Linear regression analysis by Deming’s method. Clin. Chem. 1980, 26, 1105–1106.
  8. Eksborg, S. Evaluation of method-comparison data. Clin. Chem. 1981, 27, 1311–1312.
  9. Altman, D.G.; Bland, J.M. Measurement in Medicine: The analysis of method comparison studies. Statistician 1983, 32, 307–317.
  10. Bland, J.M.; Altman, D.G. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986, 1, 307–310.
  11. Linnet, K.; Bruunshuus, I. HPLC with enzymatic detection as a candidate reference method for serum creatinine. Clin. Chem. 1991, 37, 1669–1675.
  12. Pollock, M.A.; Jefferson, S.G.; Kane, J.W.; Lomax, K.; MacKinnon, G.; Winnard, C.B. Method comparison—A different approach. Ann. Clin. Biochem. 1992, 29, 556–560.
  13. Bland, J.M.; Altman, D.G. Comparing methods of measurement: Why plotting difference against standard method is misleading. Lancet 1995, 346, 1085–1087.
  14. National Committee for Clinical Laboratory Standards. Method Comparison and Bias Estimation Using Patient Samples, Approved Guideline; NCCLS publication EP9-A; NCCLS: Villanova, PA, USA, 1995.
  15. Hollis, S. Analysis of method comparison studies. Ann. Clin. Biochem. 1996, 33, 1–4.
  16. Petersen, P.H.; Stöckl, D.; Blaabjerg, O.; Pedersen, B.; Birkemose, E.; Thienpont, L.; Lassen, J.F.; Kjeldsen, J. Graphical interpretation of analytical data from comparison of a field method with reference method by use of difference plots. Clin. Chem. 1997, 43, 2039–2046.
  17. Thienpont, L.M.; Van Nuwenborg, J.E.; Stöckl, D. Intrinsic and routine quality of serum total potassium measurement as investigated by split-sample measurement with an ion chromatography candidate reference method. Clin. Chem. 1998, 44, 849–857.
  18. Bland, J.M.; Altman, D.G. Measuring agreement in method comparison studies. Stat. Methods Med. Res. 1999, 8, 135–160.
  19. Mantha, S.; Roizen, M.F.; Fleisher, L.A.; Thisted, R.; Foss, J. Comparing methods of clinical measurement: Reporting standards for Bland and Altman analysis. Anesth. Analg. 2000, 90, 593–602.
  20. Dewitte, K.; Fierens, C.; Stöckl, D.; Thienpont, L.M. Application of the Bland-Altman plot for interpretation of method-comparison studies: A critical investigation of its practice. Clin. Chem. 2002, 48, 799–801.
  21. Altman, D.G.; Bland, J.M. Commentary on quantifying agreement between two methods of measurement. Clin. Chem. 2002, 48, 801–802.
  22. Lin, L.I.-K. A concordance correlation coefficient to evaluate reproducibility. Biometrics 1989, 45, 255–268.
  23. Eastwood, B.J.; Farmen, M.W.; Iversen, P.W.; Craft, T.J.; Smallwood, J.K.; Garbison, K.E.; Delapp, N.W.; Smith, G.F. The minimum significant ratio: A statistical parameter to characterize the reproducibility of potency estimates from concentration-response assays and estimation by replicate-experiment studies. J. Biomol. Screen. 2006, 11, 253–261.
  24. Dewé, W. Review of statistical methodologies used to compare (bio)assays. J. Chromatogr. B 2009, 877, 2208–2213.
  25. Ludbrook, J. Confidence in Altman-Bland plots: A critical review of the method of differences. Clin. Exp. Pharmacol. Physiol. 2010, 37, 143–149.
  26. Woodman, R.J. Bland-Altman beyond the basics: Creating confidence with badly behaved data. Clin. Exp. Pharmacol. Physiol. 2010, 37, 141–142.
  27. Giavarina, D. Understanding Bland Altman analysis. Biochem. Med. 2015, 25, 141–151.
  28. Carkeet, A. Exact parametric confidence intervals for Bland-Altman limits of agreement. Optom. Vis. Sci. 2015, 92, e71–e80.
  29. Francq, B.G.; Govaerts, B. How to regress and predict in a Bland-Altman plot? Review and contribution based on tolerance intervals and correlated-errors-in-variables models. Stat. Med. 2016, 35, 2328–2358.
  30. Hofman, C.S.; Melis, R.J.; Donders, A.R. Adapted Bland-Altman method was used to compare measurement methods with unequal observations per case. J. Clin. Epidemiol. 2015, 68, 939–943.
  31. Lu, M.J.; Zhong, W.H.; Liu, Y.X.; Miao, H.Z.; Li, Y.C.; Ji, M.H. Sample size for assessing agreement between two methods of measurement by Bland-Altman method. Int. J. Biostat. 2016, 12, 20150039.
  32. Tipton, E.; Shuster, J. A framework for the meta-analysis of Bland-Altman studies based on a limits of agreement approach. Stat. Med. 2017, 36, 3621–3635.
  33. Carkeet, A.; Goh, Y.T. Confidence and coverage for Bland-Altman limits of agreement and their approximate confidence intervals. Stat. Methods Med. Res. 2018, 27, 1559–1574.
  34. Misyura, M.; Sukhai, M.A.; Kulasingam, V.; Zhang, T.; Kamel-Reid, S.; Stockley, T.L. Improving validation methods for molecular diagnostics: Application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing. J. Clin. Pathol. 2018, 71, 117–124.
  35. Sadler, W. ANNALS EXPRESS: Using the variance function to generalise Bland-Altman analysis. Ann. Clin. Biochem. 2018, 56, 198–203.
  36. Jan, S.L.; Shieh, G. The Bland-Altman range of agreement: Exact interval procedure and sample size determination. Comput. Biol. Med. 2018, 100, 247–252.
  37. Hill, R.E.; Whelan, D.T. Mass spectrometry and clinical chemistry. Clin. Chim. Acta 1984, 139, 231–294.
  38. Lawson, A.M.; Gaskell, S.J.; Hjelm, M. International Federation of Clinical Chemistry (IFCC), Office for Reference Methods and Materials (ORMM). Methodological aspects on quantitative mass spectrometry used for accuracy control in clinical chemistry. J. Clin. Chem. Clin. Biochem. 1985, 23, 433–441.
  39. Tsikas, D. Application of gas chromatography-mass spectrometry and gas chromatography-tandem mass spectrometry to assess in vivo synthesis of prostaglandins, thromboxane, leukotrienes, isoprostanes and related compounds in humans. J. Chromatogr. B 1998, 717, 201–245.
  40. Wilcken, B.; Wiley, V.; Hammond, J.; Carpenter, K. Screening newborns for inborn errors of metabolism by tandem mass spectrometry. N. Engl. J. Med. 2003, 348, 2304–2312.
  41. Kuksis, A.; Myher, J.J. Application of tandem mass spectrometry for the analysis of long-chain carboxylic acids. J. Chromatogr. B 1995, 671, 35–70.
  42. Oehme, M. Praktische Einführung in die GC-MS-Analytik mit Quadrupolen; Hüthig: Heidelberg, Germany, 1996.
  43. Busch, K.L.; Glish, G.L.; McLuckey, S.A. Mass Spectrometry/Mass Spectrometry; VCH Publishers: New York, NY, USA; Weinheim, Germany, 1998.
  44. Gross, J.H. Mass Spectrometry; Springer: Berlin/Heidelberg, Germany, 2004.
  45. Watson, J.T.; Sparkman, O.D. Introduction to Mass Spectrometry, 4th ed.; Wiley: Chichester, UK, 2007.
  46. Jennings, K.R. (Ed.) A History of European Mass Spectrometry; IM Publications LLP: Chichester, UK, 2012.
  47. Tsikas, D.; Duncan, M.W. Mass spectrometry and 3-nitrotyrosine: Strategies, controversies, and our current perspective. Mass Spectrom. Rev. 2014, 33, 237–276.
  48. Duncan, M.W. Good mass spectrometry and its place in good science. J. Mass Spectrom. 2012, 47, 795–809.
  49. Araujo, P. Key aspects of analytical method validation and linearity evaluation. J. Chromatogr. B 2009, 877, 2224–2234.
  50. Shah, V.P.; Midha, K.K.; Dighe, S.; McGilveray, I.J.; Skelly, J.P.; Yacobi, A.; Layloff, T.; Viswanathan, C.T.; Cook, C.E.; Mcdowall, R.D.; et al. Analytical methods validation: Bioavailability, bioequivalence, and pharmacokinetic studies. J. Pharm. Sci. 1992, 81, 309.
  51. Bansal, S.; DeStefano, S.A. Key elements of bioanalytical method validation for small molecules. AAPS J. 2007, 9, E109–E114.
  52. Kaza, M.; Karaźniewicz-Łada, M.; Kosicka, K.; Siemiątkowska, A.; Rudzki, P.J. Bioanalytical method validation: New FDA guidance vs. EMA guideline. Better or worse? J. Pharm. Biomed. Anal. 2019, 165, 381–385.
  53. Lindner, W.; Wainer, I.W. Requirements for initial assay validation and publication in J. Chromatography B. J. Chromatogr. B 1998, 707, 1–2.
  54. Bischoff, R.; Hopfgartner, G.; Karnes, H.T.; Lloyd, D.; Phillips, T.M.; Tsikas, D.; Xu, G. Summary of a recent workshop/conference report on validation and implementation of bioanalytical methods: Implications on manuscript review in the Journal of Chromatography B. J. Chromatogr. B 2007, 860, 1–3.
  55. Booth, B.; Stevenson, L.; Pillutla, R.; Buonarati, M.; Beaver, C.; Fraier, D.; Garofolo, F.; Haidar, S.; Islam, R.; James, C.; et al. 2019 White Paper On Recent Issues in Bioanalysis: FDA BMV Guidance, ICH M10 BMV Guideline and Regulatory Inputs (Part 2—Recommendations on 2018 FDA BMV Guidance, 2019 ICH M10 BMV Draft Guideline and Regulatory Agencies’ Input on Bioanalysis, Biomarkers and Immunogenicity). Bioanalysis 2019, 11, 2099–2132.
  56. Tsikas, D. Bioanalytical method validation of endogenous substances according to guidelines by the FDA and other organizations: Basic need to specify concentration ranges. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2018, 1093–1094, 80–81.
  57. Tsikas, D. A proposal for comparing methods of quantitative analysis of endogenous compounds in biological systems by using the relative lower limit of quantification (rLLOQ). J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2009, 877, 2244–2251.
  58. Tsikas, D.; Gutzki, F.M.; Sandmann, J.; Schwedhelm, E.; Frölich, J.C. Gas chromatographic-tandem mass spectrometric quantification of human plasma and urinary nitrate after its reduction to nitrite and derivatization to the pentafluorobenzyl derivative. J. Chromatogr. B Biomed. Sci. Appl. 1999, 731, 285–291.
  59. Tsikas, D. Mass spectrometry-validated HPLC method for urinary nitrate. Clin. Chem. 2004, 50, 1259–1261.
  60. Tsikas, D. Simultaneous derivatization and quantification of the nitric oxide metabolites nitrite and nitrate in biological fluids by gas chromatography/mass spectrometry. Anal. Chem. 2000, 72, 4064–4072.
  61. Becker, A.J.; Ückert, S.; Tsikas, D.; Noack, H.; Stief, C.G.; Frölich, J.C.; Wolf, G.; Jonas, U. Determination of nitric oxide metabolites by means of the Griess assay and gas chromatography-mass spectrometry in the cavernous and systemic blood of healthy males and patients with erectile dysfunction during different functional conditions of the penis. Urol. Res. 2000, 28, 364–369.
  62. Sitar, M.E.; Kayacelebi, A.A.; Beckmann, B.; Kielstein, J.T.; Tsikas, D. Asymmetric dimethylarginine (ADMA) in human blood: Effects of extended haemodialysis in the critically ill patient with acute kidney injury, protein binding to human serum albumin and proteolysis by thermolysin. Amino Acids 2015, 47, 1983–1993.
  63. Tsikas, D.; Schubert, B.; Gutzki, F.M.; Sandmann, J.; Frölich, J.C. Quantitative determination of circulating and urinary asymmetric dimethylarginine (ADMA) in humans by gas chromatography-tandem mass spectrometry as methyl ester tri(N-pentafluoropropionyl) derivative. J. Chromatogr. B 2003, 798, 87–99.
  64. Schulze, F.; Wesemann, R.; Schwedhelm, E.; Sydow, K.; Albsmeier, J.; Cooke, J.P.; Böger, R.H. Determination of asymmetric dimethylarginine (ADMA) using a novel ELISA assay. Clin. Chem. Lab. Med. 2004, 42, 1377–1383.
  65. Martens-Lobenhoffer, J.; Westphal, S.; Awiszus, F.; Bode-Böger, S.M.; Luley, C. Determination of asymmetric dimethylarginine: Liquid chromatography-mass spectrometry or ELISA? Clin. Chem. 2005, 51, 2188–2189.
  66. Valtonen, P.; Karppi, J.; Nyyssönen, K.; Valkonen, V.P.; Halonen, T.; Punnonen, K. Comparison of HPLC method and commercial ELISA assay for asymmetric dimethylarginine (ADMA) determination in human serum. J. Chromatogr. B 2005, 828, 97–102.
  67. Široká, R.; Trefil, L.; Rajdl, D.; Racek, J.; Cibulka, R. Asymmetric dimethylarginine-comparison of HPLC and ELISA methods. J. Chromatogr. B 2007, 850, 586–587.
  68. Zoerner, A.A.; Gutzki, F.M.; Suchy, M.T.; Beckmann, B.; Engeli, S.; Jordan, J.; Tsikas, D. Targeted stable-isotope dilution GC-MS/MS analysis of the endocannabinoid anandamide and other fatty acid ethanol amides in human plasma. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2009, 877, 2909–2923.
  69. Zoerner, A.A.; Batkai, S.; Suchy, M.T.; Gutzki, F.M.; Engeli, S.; Jordan, J.; Tsikas, D. Simultaneous UPLC-MS/MS quantification of the endocannabinoids 2-arachidonoyl glycerol (2AG), 1-arachidonoyl glycerol (1AG), and anandamide in human plasma: Minimization of matrix-effects, 2AG/1AG isomerization and degradation by toluene solvent extraction. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2012, 883–884, 161–171.
  70. Kayacelebi, A.A.; Beckmann, B.; Gutzki, F.M.; Jordan, J.; Tsikas, D. GC-MS and GC-MS/MS measurement of the cardiovascular risk factor homoarginine in biological samples. Amino Acids 2014, 46, 2205–2217.
  71. Cordts, K.; Atzler, D.; Qaderi, V.; Sydow, K.; Böger, R.H.; Choe, C.U.; Schwedhelm, E. Measurement of homoarginine in human and mouse plasma by LC-MS/MS and ELISA: A comparison and a biological application. Amino Acids 2015, 47, 2015–2022.
  72. Morrow, J.D.; Roberts, L.J. Mass spectrometric quantification of F2-isoprostanes in biological fluids and tissues as measure of oxidant stress. Methods Enzymol. 1998, 300, 3–12.
  73. Devaraj, S.; Hirany, S.V.; Burk, R.F.; Jialal, I. Divergence between LDL oxidative susceptibility and urinary F(2)-isoprostanes as measures of oxidative stress in type 2 diabetes. Clin. Chem. 2001, 47, 1974–1979.
  74. Yan, W.; Byrd, G.D.; Ogden, M.W. Quantitation of isoprostane isomers in human urine from smokers and nonsmokers by LC-MS/MS. J. Lipid Res. 2007, 48, 1607–1617.
  75. Ludbrook, J. Comparing methods of measurements. Clin. Exp. Pharmacol. Physiol. 1997, 24, 193–203.
  76. Daniel, W.W. Biostatistics: A Foundation for Analysis in the Health Sciences, 2nd ed.; John Wiley & Sons: New York, NY, USA, 1978.
  77. Erpenbeck, V.; Jorres, R.A.; Discher, M.; Krentel, K.; Tsikas, D.; Luettig, B.; Krug, N.; Hohlfeld, J.M. Local nitric oxide levels reflect the degree of allergic airway inflammation after segmental allergen challenge in asthmatics. Nitric Oxide 2005, 13, 125–133.
  78. Tsikas, D.; Schwedhelm, E.; Stutzer, F.K.; Gutzki, F.M.; Rode, I.; Mehls, C.; Frölich, J.C. Accurate quantification of basal plasma levels of 3-nitrotyrosine and 3-nitrotyrosinoalbumin by gas chromatography-tandem mass spectrometry. J. Chromatogr. B 2003, 784, 77–90.
  79. Tsikas, D.; Schwedhelm, E.; Suchy, M.T.; Niemann, J.; Gutzki, F.M.; Erpenbeck, V.J.; Hohlfeld, J.M.; Surdacki, A.; Frölich, J.C. Divergence in urinary 8-iso-PGF(2alpha) (iPF(2alpha)-III, 15-F(2t)-IsoP) levels from gas chromatography-tandem mass spectrometry quantification after thin-layer chromatography and immunoaffinity column chromatography reveals heterogeneity of 8-iso-PGF(2alpha). Possible methodological, mechanistic and clinical implications. J. Chromatogr. B 2003, 794, 237–255.
  80. Tsikas, D.; Wolf, A.; Frölich, J.C. Simplified HPLC method for urinary and circulating creatinine. Clin. Chem. 2004, 50, 201–203.
  81. Tsikas, D. A critical review and discussion of analytical methods in the L-arginine/nitric oxide area of basic and clinical research. Anal. Biochem. 2008, 379, 139–163.
  82. Teerlink, T. HPLC analysis of ADMA and other methylated L-arginine analogs in biological fluids. J. Chromatogr. B 2007, 851, 21–29.
  83. Horowitz, J.D.; Heresztyn, T. An overview of plasma concentrations of asymmetric dimethylarginine (ADMA) in health and disease and in clinical studies: Methodological considerations. J. Chromatogr. B 2007, 851, 42–50.
  84. Martens-Lobenhoffer, J.; Bode-Böger, S.M. Chromatographic-mass spectrometric methods for the quantification of L-arginine and its methylated metabolites in biological fluids. J. Chromatogr. B 2007, 851, 30–41.
  85. Zoerner, A.A.; Gutzki, F.M.; Batkai, S.; May, M.; Rakers, C.; Engeli, S.; Jordan, J.; Tsikas, D. Quantification of endocannabinoids in biological systems by chromatography and mass spectrometry: A comprehensive review from an analytical and biological perspective. Biochim. Biophys. Acta 2011, 1811, 706–723.
  86. Tsikas, D.; Zoerner, A.A. Analysis of eicosanoids by LC-MS/MS and GC-MS/MS: A historical retrospect and a discussion. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2014, 964, 79–88.
  87. Tsikas, D. Homoarginine in health and disease. Curr. Opin. Clin. Nutr. Metab. Care 2023, 26, 42–49.
  88. Schwedhelm, E.; Benndorf, R.A.; Böger, R.H.; Tsikas, D. Mass spectrometric analysis of F2-isoprostanes: Markers and mediators in human disease. Curr. Pharm. Anal. 2007, 3, 39–51.
  89. Vassalle, C.; Andreassi, M.G. 8-Iso-Prostaglandin F2 as a Risk Marker in Patients With Coronary Heart Disease. Circulation 2004, 109, e49–e50.
  90. Sircar, D.; Subbaiah, P.V. Isoprostane measurement in plasma and urine by liquid chromatography-mass spectrometry with one-step sample preparation. Clin. Chem. 2007, 53, 251–258.
  91. Schwedhelm, E.; Tsikas, D.; Durand, T.; Gutzki, F.M.; Guy, A.; Rossi, J.-C.; Frölich, J.C. Tandem mass spectrometric quantification of 8-iso-prostaglandin F2alpha and its metabolite 2,3-dinor-5,6-dihydro-8-iso-prostaglandin F2alpha in human urine. J. Chromatogr. B 2000, 744, 99–112.
  92. Tsikas, D. Analysis of nitrite and nitrate in biological fluids by assays based on the Griess reaction: Appraisal of the Griess reaction in the L-arginine/nitric oxide area of research. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2007, 851, 51–70.
  93. Tsikas, D. Assessment of lipid peroxidation by measuring malondialdehyde (MDA) and relatives in biological samples: Analytical and biological challenges. Anal. Biochem. 2017, 524, 13–30.
  94. Malaei, R.; Ramezani, A.M.; Absalan, G. Analysis of malondialdehyde in human plasma samples through derivatization with 2,4-dinitrophenylhydrazine by ultrasound-assisted dispersive liquid-liquid microextraction-GC-FID approach. J. Chromatogr. B 2018, 1089, 60–69.
  95. Carkeet, A. A Review of the Use of Confidence Intervals for Bland-Altman Limits of Agreement in Optometry and Vision Science. Optom. Vis. Sci. 2020, 97, 3–8.
  96. Schwedhelm, E. Quantification of ADMA: Analytical approaches. Vasc. Med. 2005, 10 (Suppl. 1), S89–S95.
  97. Tsikas, D.; Wolf, A.; Mitschke, A.; Gutzki, F.M.; Will, W.; Bader, M. GC-MS determination of creatinine in human biological fluids as pentafluorobenzyl derivative in clinical studies and biomonitoring: Inter-laboratory comparison in urine with Jaffé, HPLC and enzymatic assays. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2010, 878, 2582–2592.
  98. Pannekeet, M.M.; Imholz, A.L.; Struijk, D.G.; Koomen, G.C.; Langedijk, M.J.; Schouten, N.; de Waart, R.; Hiralall, J.; Krediet, R.T. The standard peritoneal permeability analysis: A tool for the assessment of peritoneal permeability characteristics in CAPD patients. Kidney Int. 1995, 48, 866–875.
  99. Bioanalytical Method Validation—Scientific Guideline. Available online: https://www.ema.europa.eu/en/bioanalytical-method-validation-scientific-guideline (accessed on 1 March 2023).
  100. Bradford, M.M. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 1976, 72, 248–254.
  101. Tsikas, D.; Zoerner, A.; Mitschke, A.; Homsi, Y.; Gutzki, F.M.; Jordan, J. Specific GC-MS/MS stable-isotope dilution methodology for free 9- and 10-nitro-oleic acid in human plasma challenges previous LC-MS/MS reports. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2009, 877, 2895–2908.
Scheme 1. Chemical structures of the native analytes (left) and their chemical derivatives (right) discussed in the present work. Nitrate, nitrite, malondialdehyde, and creatinine are derivatized with pentafluorobenzyl bromide in aqueous acetone (e.g., 60 min at 50 °C). Asymmetric dimethylarginine (ADMA), homoarginine and homocysteine are first methylated in 2 M HCl in methanol (e.g., 30 min at 80 °C), and then by pentafluoropropionic anhydride in ethyl acetate (e.g., 60 min at 65 °C). Me, methyl; PFB, Pentafluorobenzyl; PFP, pentafluoropropionyl.
Figure 1. Measurement of nitrate in human urine by GC-MS (i.e., MS, method 1) and GC-MS/MS (i.e., MS/MS, method 2) and their comparison by: (A) linear regression; (B) Bland–Altman; and (C) Oldham–Eksborg. This Figure was constructed by using the data of Table 1 of the article [58]. Samples were analyzed on the instrument TSQ 7000 first by GC-MS in the SIM mode and subsequently by GC-MS/MS in the SRM mode. The closed points in (A) indicate the 95% confidence bands. Horizontal solid lines in (B) indicate the 95% limits of agreement (±1.96 × SD).
Figure 2. Comparison of measurements of asymmetric dimethylarginine (ADMA) in human plasma (method 1) and serum (method 2) of one patient with acute renal failure before, during and after extended haemodialysis for 8 h: (A) linear regression analysis; (B) Bland–Altman; and (C) Oldham–Eksborg. This Figure was constructed by using the data of Table 1 of a previous article [62]. Samples were analyzed for ADMA on the instrument TSQ 7000 by GC-MS/MS in the SRM mode as reported elsewhere [63]. Horizontal solid lines in (B) indicate the 95% limits of agreement.
Figure 3. Comparison of measurements of anandamide (AEA) in 277 human plasma samples by LC-MS/MS (method 1) and by GC-MS/MS (method 2): (A) linear regression analysis; (B) Bland–Altman; and (C) Oldham–Eksborg. Samples were analyzed for AEA on the instrument TSQ 7000 by GC-MS/MS [68] and by Xevo LC-MS/MS [69] as reported in these references. Horizontal solid lines in (B) indicate the 95% limits of agreement.
Figure 4. Comparison of measurements of homoarginine (hArg) in plasma by GC-MS (method 1) and by GC-MS/MS (method 2) in 369 plasma samples of pregnant women: (A) linear regression analysis; (B1,B2) Bland–Altman; and (C) Oldham–Eksborg. Within (B1,B2): linear regression analysis between percentage difference and average. This Figure was constructed by using the data published in a previous article [70]. Note that two different instruments were used: samples were analyzed for hArg on the instrument TSQ 7000 by GC-MS/MS in the SRM mode and on the instrument DSQ in the SIM mode as reported elsewhere [70]. Horizontal solid lines in (B1) indicate the 95% limits of agreement. Note that the difference between the methods is expressed as a percentage in (B1) (%) and as an absolute concentration in (B2) (µM).
Figure 5. Comparison of measurements of homoarginine (hArg) in mouse plasma by GC-MS/MS (method A) and by LC-MS/MS (method B) in 79 plasma samples: (A) linear regression analysis; (B1,B2) Bland–Altman; and (C) Oldham–Eksborg. Within (B1,B2): linear regression analysis between percentage difference and average. This Figure was constructed by using the data published in a previous article [71]. Note that two different apparatus were used: Samples were analyzed for hArg on the instrument TSQ 7000 by GC-MS/MS in the SRM mode and on the instrument Varian 1200 L Triple Quadrupole MS in the SRM mode as reported elsewhere [71]. Horizontal solid lines in (B1,B2) indicate the 95% limits of agreement. Shaded insets indicate ranges of maximum disagreement.
Figure 6. This Figure was constructed by using the data of Figure 1 of the article by Devaraj et al. [73]: comparison of a commercially available EIA method with a GC-MS method [72] for F2-isoprostanes in urine; data are reported as ng F2-isoprostanes per mg creatinine in this article: (A) linear regression analysis; (B) Bland–Altman; and (C) Oldham–Eksborg. Horizontal dotted lines indicate ±2 SD range. See Refs. [72,73,74,79,88,89,90,91] regarding analysis of F2-isoprostanes.
Figure 7. This Figure was constructed by using the data of Figure 5 of the article by Yan et al. [74]; because of considerable overlap of data points, not all data points of the original Figure 5 from Ref. [74] could be used in the present work. Comparison of a commercially available ELISA method with an LC-MS/MS method for IPF2α-III (8-iso-PGF2α) in urine by (A) linear regression analysis; (B) Bland–Altman; and (C) Oldham–Eksborg. Horizontal dotted lines indicate the ±2 SD range in (B). See Refs. [72,73,74,79,88,89,90,91] regarding analysis of F2-isoprostanes.
Figure 8. This Figure was constructed by using the data of Figure 4 of the article by Sircar and Subbaiah [90]. Comparison of an LC-MS method with a GC-MS method for IPF2α-III (8-iso-PGF2α) in urine by (A) linear regression analysis; (B) Bland–Altman; and (C) Oldham–Eksborg. Horizontal dotted lines indicate the ±2 SD range in (B). See Refs. [72,73,74,79,88,89,90,91] regarding analysis of F2-isoprostanes.
Figure 9. This Figure was constructed by using the data of Table 1 of the article by Ludbrook [75], which had been taken from Ref. [76]. Comparison of measuring systolic blood pressure (SBP) by method 1 (M1) and method 2 (M2) in 25 patients with essential hypertension; method 2 was chosen arbitrarily as the reference method by the author of the present article. (A) Linear regression analysis; (B) Bland–Altman method; (C) Oldham–Eksborg method.
Figure 10. This Figure was constructed by using the data of the Table of the article by Bland and Altman [10]. Comparison of measuring peak expiratory flow rate (PEFR) by method 1 (M1) and method 2 (M2) in 17 subjects; method 2 was chosen arbitrarily as the reference method by the author of the present article. (A) Linear regression analysis; (B) Bland–Altman method; (C) Oldham–Eksborg method.
Figure 11. Presentation of values for selected statistical parameters of Table 1 obtained from the Wilcoxon or paired t test (p), the linear regression analysis (β and r2), the coefficient of correlation ρ2 from the linear regression analysis of the Bland–Altman difference δ vs. the average concentration of the analyte, the Oldham–Eksborg ratio Λ, the AUC value from the ROC analysis (A); and of the percentage difference of the Bland–Altman test δ (%), and the coefficient of variation of the Oldham–Eksborg ratio CVOE (B). Data from 10 examples were used. Because of the greatly differing size of the values, the data are presented in two panels. Note the decadic logarithmic scale on the y axis in panel B. Data are shown as median with 95% confidence interval. LR, linear regression. See Table 1.
Figure 12. (A) Sum of values for selected statistical parameters taken from Table 1 obtained from the Wilcoxon or paired t test (p), the linear regression analysis (β and r2), the coefficient of correlation ρ2 from the linear regression analysis of the Bland–Altman difference δ vs. the average concentration of the analyte, the Oldham–Eksborg ratio Λ, the AUC value from the ROC analysis, the percentage difference of the Bland–Altman test δ(%), and the coefficient of variation of the Oldham–Eksborg ratio CVOE. (B) As in (A), but without δ(%) and CVOE in order to exclude high values. Data from 10 examples were used. Note the decadic logarithmic scale on the y axis in panel (B). Data are shown as median with 95% confidence interval. Insets indicate the tested parameters and their values. BP, systolic blood pressure; Iso, F2-isoprostanes. See Table 1.
Figure 13. Presentation of values for selected statistical parameters obtained from the linear regression analysis (β and r2), the coefficient of correlation ρ2 from the linear regression analysis of the Bland–Altman difference δ vs. the average concentration of the analyte, and the Oldham–Eksborg ratio in (A); and for δ (%) and the CVOE in (B). Lines connect symbols of the same example. Insets indicate the mean ± standard deviation of the statistical parameters from six examples (i.e., Nitrate, ADMA, AEA, hArg, BP, PEFR). Note the decadic logarithmic scale on the y axis in panel (B). See Figure 11.
Table 1. Summary of the results from the re-evaluation of published studies by using linear regression analysis, the Bland-Altman and the Oldham-Eksborg method, and the ROC approach between two methods.
Mean | n | p Value | Linear Regression | Bland-Altman | Oldham-Eksborg | ROC | Method 1/Method 2 | Refs.
No | τ | τ | n | p | α | β | r2 | δ | CV (%) | δ (%) | ρ2 (δ; δ%) | Λ | CV (%) | AUC ± SE | p
(1) Nitrate in urine (a, b) and plasma (c) (µM)
a10591048200.01361.240.990.998−12 ± 501.1 ± 4.7−1.5 ± 2.70.05; 0.020.986 ± 0.0272.80.531 ± 0.0930.735GC-MS/GC-MS/MS[58]
b8859072400.554−1211.160.97330 ± 1833.4 ± 21n.d. 0.983 ± 0.23123n.d. HPLC/GC-MS[59,60]
c4136400.0002−211.370.896−5.5 ± 8.713 ± 21n.d. 0.822 ± 0.25331n.d. Griess/GC-MS[61]
(2) Asymmetric dimethylarginine (ADMA) in urine (a) and plasma or serum (aa, b–g) (µM or nM)
aa13511334180.03412.71.0030.99417 ± 1.229 ± 1801.30.007; 0.0011.013 ± 0.02192.20.549 ± 0.0970.613Serum/Plasma[62]
a14.913.2100.0080.221.1000.99971.16 ± 1.512 ± 11n.d. 1.170 ± 0.0978.3n.d. GC-MS/GC-MS/MS[63]
b654640100.0722.371.0200.996614 ± 212.2 ± 3.3n.d. 1.02 ± 0.033.4n.d. GC-MS/GC-MS/MS[63]
c 9 −0.90.9990.991 ELISA/GC-MS[64]
d 29 0.010.850.984 ELISA/LC-MS/MS[64]
e0.720.3711 0.400.340.312 ELISA/LC-MS/MS[65]
f 55 ELISA/HPLC[66]
g1.750.83361.3 × 10−18−0.190.860.9440.9 ± 0.3108 ± 36 2.09 ± 0.24 ELISA/HPLC[67]
(3) Anandamide (AEA) in plasma (nM)
0.8440.729305<0.00010.0130.8480.8210.116 ± 0.12310615.6 ± 16.50.024; 0.0240.866 ± 0.136 160.626 ± 0.024<0.0001GC-MS/MS/LC-MS/MS[68,69]
(4) Homoarginine (hArg) in plasma (µM or nM)
a0.7440.643369<0.00010.030.8210.9940.105 ± 0.070 14.3 ± 270.871; 0.1740.868 ± 0.03540.596 ± 0.021<0.0001GC-MS/GC-MS/MS[70]
b245270790.1901861.860.918−25 ± 108 13.6 ± 560.860; 0.5420.978 ± 0.439450.508 ± 0.0210.869LC-MS/MS/GC-MS/MS[70,71]
(5) 15(S)-8-iso-PGF2α and other F2-isoprostanes in urine
a2.8132.037660.6120.8140.5600.642−0.147±0.98542; 38−1.07 ± 470.272; 0.0791.159 ± 0.691600.510 ± 0.0510.841EIA/GC-MS[72,73]
b93441767<0.00011761.820.752−517 ± 320 −75 ± 290.725; 0.005 2.412 ± 0.920380.845 ± 0.033<0.0001ELISA/LC-MS/MS[74]
c3.6050.82510<0.00010.9963.620.656147 ± 985335125 ± 150.906; 0.0284.513 ± 1.143250.990 ± 0.0160.0002GC-MS/LC-MS[74]
(6) Systolic blood pressure (mmHg)
16717825<0.0001−71.110.91110.7 ± 9.084−6 ± 4.60.203; 0.0950.943 ± 0.04450.609 ± 0.0800.187Apparat. 1/Apparat. 2[75,76]
(7) Peak expiratory flow rate (mL/min)
45045317<0.0001390.9170.8902.1 ± 391830−1 ± 120.007; 0.0580.995 ± 0.114 50.509 ± 0.1020.931Instrum. 1/Instrum. 2[10]
Miscellaneous
(8) Nitrite (a) and nitrate (b) in bronchoalveolar liquid (µM)
a0.51.7172<0.00010.070.250.395−1.2 ± 0.670 ± 35 0.28 ± 0.2071 CL/GC-MS[77]
b6.09.572<0.0001−0.350.670.946−3.1 ± 4.033 ± 42 0.64 ± 0.2133 CL/GC-MS[77]
(9) 3-Nitrotyrosine in plasma (nM)
4.81.65180.0023.061.040.3343.1 ± 3.7188 ± 224 3.9 ± 3.795 GC-MS/GC-MS/MS[78]
(10) 2,3-dinor-5,6-dihydro-PGF2α in urine
543502140.434121.060.83240 ± 1868 ± 37 1.12 ± 0.5953 GC-MS/GC-MS/MS[79]
(11) Creatinine in urine (mM)
a5.155.21240.466−0.131.010.998−0.06 ± 0.401.2 ± 7.7 0.97 ± 0.077.2 HPLC/GC-MS[80]
b4.045.2024 0.290.720.996−1.17 ± 1.7723 ± 34 0.82 ± 0.089.8 Jaffé/GC-MS[80]
(12) Total plasma homocysteine (µM)
13.612.731<0.0012.470.880.9900.91 ± 1.317.2 ± 10 1.13 ± 0.1412.4 FPIA/GC-MS[63]
N.A., not available. Abbreviations: CL, chemiluminescence; FPIA, fluorescence polarization immunoassay.
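The linear regression, Bland–Altman and Oldham–Eksborg columns of Table 1 can be computed from any set of paired measurements. The following Python sketch is a minimal illustration and not the author's code. The function name `compare_methods`, the sign convention (difference = x − y and ratio = y/x, with x taken as the reference method) and the use of 1.96 × SD for the 95% limits of agreement are assumptions; the sign convention was chosen because it appears consistent with the signs reported in Table 2. The Wilcoxon or paired t test and the ROC analysis are omitted.

```python
import numpy as np

def compare_methods(x, y):
    """Summarize agreement between paired measurements.

    x: values from the reference method (assumed convention)
    y: values from the test method, same samples, same order
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Linear regression y = alpha + beta * x and r^2
    beta, alpha = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2

    # Bland-Altman: difference vs. average of the two methods
    avg = (x + y) / 2.0
    diff = x - y                      # absolute difference (delta)
    diff_pct = 100.0 * diff / avg     # percentage difference (delta %)
    sd = diff.std(ddof=1)
    loa = (diff.mean() - 1.96 * sd, diff.mean() + 1.96 * sd)
    # Squared correlation of the difference with the average (rho^2)
    rho2 = np.corrcoef(avg, diff)[0, 1] ** 2

    # Oldham-Eksborg: ratio of the two methods and its CV
    ratio = y / x
    ratio_cv = 100.0 * ratio.std(ddof=1) / ratio.mean()

    return {
        "alpha": alpha, "beta": beta, "r2": r2,
        "delta_mean": diff.mean(), "delta_sd": sd,
        "limits_of_agreement": loa,
        "delta_pct_mean": diff_pct.mean(),
        "delta_pct_sd": diff_pct.std(ddof=1),
        "rho2": rho2,
        "ratio_mean": ratio.mean(), "ratio_sd": ratio.std(ddof=1),
        "ratio_cv_pct": ratio_cv,
    }
```

Called as `compare_methods(reference_values, test_values)`, the function returns a dictionary whose keys correspond to α, β, r2, δ, δ (%), ρ2, Λ and the CV of Λ. Swapping the two arguments changes the sign of δ and inverts the ratio, which is why the choice of reference method has to be stated.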
Table 2. Linear regression, Bland–Altman, Oldham–Eksborg and ROC analysis using simulated homoarginine concentrations. The original concentrations measured by DSQ (GC-MS, i.e., DSQ × 1.0) were multiplied by 1.2, 0.8 and 0.6 and compared with those measured by TSQ (GC-MS/MS).
| DSQ-Fold | Linear Regression | Bland–Altman | Oldham–Eksborg | ROC |
|---|---|---|---|---|
| DSQ × 1.2 | y = 0.04 + 0.9846x; r2 = 0.9943 | δ = −0.0234 ± 0.0281 (CV = 117%); ρ2 = 0.0271; δ (%) = −3.94 ± 3.98 (CV = 100%) | 1.041 ± 0.041 (CV = 4%) | 0.5268 (p = 0.2081) |
| DSQ × 1.0 | y = 0.03 + 0.8205x; r2 = 0.9943 | δ = 0.1046 ± 0.0698 (CV = 67%); ρ2 = 0.8701; δ (%) = 14.3 ± 3.9 (CV = 28%) | 0.868 ± 0.034 (CV = 4%) | 0.5961 (p < 0.0001) |
| DSQ × 0.8 | y = 0.02 + 0.6564x; r2 = 0.9943 | δ = 0.2331 ± 0.1276 (CV = 55%); ρ2 = 0.9699; δ (%) = 36.2 ± 3.8 (CV = 11%) | 0.694 ± 0.028 (CV = 4%) | 0.7285 (p < 0.0001) |
| DSQ × 0.6 | y = 0.02 + 0.4923x; r2 = 0.9943 | δ = 0.3617 ± 0.1871 (CV = 51%); ρ2 = 0.9903; δ (%) = 63.2 ± 3.6 (CV = 6%) | 0.520 ± 0.021 (CV = 4%) | 0.8595 (p < 0.0001) |
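The scaling experiment of Table 2 can be sketched in Python as follows. This is an illustration only: the data are randomly generated and hypothetical, not the published hArg measurements, and the names `tsq`, `dsq` and `summarize` as well as the sign convention (difference = x − y, ratio = y/x) are assumptions. The sketch shows the pattern visible in Table 2: multiplying one method by a constant factor leaves r2 and the CV of the Oldham–Eksborg ratio unchanged, whereas the slope, the mean ratio and the percentage difference shift with the factor.

```python
import numpy as np

# Hypothetical paired data (NOT the published measurements):
# tsq plays the role of the GC-MS/MS values, dsq of the GC-MS values.
rng = np.random.default_rng(0)
tsq = rng.uniform(0.3, 1.5, size=200)
dsq = 0.03 + 0.82 * tsq + rng.normal(0.0, 0.02, size=200)

def summarize(x, y):
    """Return slope, r^2, mean percentage difference, mean ratio and ratio CV."""
    slope, _intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    avg = (x + y) / 2.0
    diff_pct = 100.0 * (x - y) / avg      # Bland-Altman percentage difference
    ratio = y / x                          # Oldham-Eksborg ratio
    return {
        "slope": slope,
        "r2": r2,
        "diff_pct": diff_pct.mean(),
        "ratio": ratio.mean(),
        "ratio_cv": 100.0 * ratio.std(ddof=1) / ratio.mean(),
    }

# Multiply the simulated DSQ values by the same factors as in Table 2.
factors = (1.2, 1.0, 0.8, 0.6)
results = {k: summarize(tsq, k * dsq) for k in factors}

for k, s in results.items():
    print(
        f"DSQ x {k}: slope={s['slope']:.4f}  r2={s['r2']:.4f}  "
        f"delta%={s['diff_pct']:.1f}  ratio={s['ratio']:.3f}  "
        f"ratio CV={s['ratio_cv']:.1f}%"
    )
```

Because r2 does not change under such a scaling, it cannot on its own reveal a proportional bias between two methods; the ratio and the percentage difference do.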
Tsikas, D. Mass Spectrometry-Based Evaluation of the Bland–Altman Approach: Review, Discussion, and Proposal. Molecules 2023, 28, 4905. https://doi.org/10.3390/molecules28134905