**4. Discussions**

In this study, we used data from a spontaneous reporting system to evaluate the accuracy of drug–drug interaction signals detected by the newly proposed subset analysis, which addresses two shortcomings of the previous subset analysis.

There were 3924 *drug D*1–*drug D*2–SJS combinations in the spontaneous reporting system, JADER. Several drug combinations are known to cause SJS through drug–drug interactions [20]. On the other hand, there are some combinations that have not yet been reported. Recently, we used the Ω shrinkage measure model to report potential drug combinations for the onset of SJS in concomitant use with antiepileptic drugs [21]. Not all AEs have been identified, and many remain unknown. Unfortunately, data on unknown AEs do not exist anywhere in the world; that is, there are no "real" true data for AEs. Therefore, to verify the accuracy of the subset analysis, we needed to prepare "hypothetical" true data for AEs. A previous comparative study [14] of five algorithms for detecting drug–drug interaction signals revealed that the Ω shrinkage measure model [16] detected the most conservative signals, while the combination risk ratio model [17] did not detect any interaction signal for combinations with fewer than three reports because of its detection criterion. Therefore, of the five algorithms, we used the combination of signals detected by the remaining three (the additive model, the multiplicative model, and the chi-square statistical model) as "hypothetical" true data.

Among the previous subset analysis, the newly proposed subset analysis, and the Ω shrinkage measure model, most signals were detected by the previous subset analysis with 1793 pairs (45.7% of the total combinations, *Accuracy*: 0.584, *Precision* (*PPV*): 0.302, *Recall* (*Sensitivity*): 0.587, *Specificity*: 0.583, *Youden's index*: 0.170, *F*-*measure*: 0.399, and *NPV*: 0.821), followed by the newly proposed subset analysis with 909 pairs (23.2% of the total combinations, *Accuracy*: 0.809, *Precision* (*PPV*): 0.596, *Recall* (*Sensitivity*): 0.587, *Specificity*: 0.878, *Youden's index*: 0.465, *F*-*measure*: 0.592, and *NPV*: 0.874). In contrast, the Ω shrinkage measure model detected the fewest signals with 712 pairs (18.1% of the total combinations, *Accuracy*: 0.858, *Precision* (*PPV*): 0.756, *Recall* (*Sensitivity*): 0.583, *Specificity*: 0.942, *Youden's index*: 0.525, *F*-*measure*: 0.658, and *NPV*: 0.880) (Table 2, Table 4).
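The evaluation measures quoted above all follow from a 2 × 2 confusion matrix of detected signals against the "hypothetical" true data. The following is a minimal sketch of those standard definitions; the function name `detection_metrics` and the toy counts are assumptions made for illustration and are not taken from Table 3.

```python
def detection_metrics(tp, fp, fn, tn):
    """Standard evaluation measures for a 2x2 confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives (hypothetical inputs).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)      # PPV
    recall = tp / (tp + fn)         # sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        # Youden's index = sensitivity + specificity - 1
        "youden": recall + specificity - 1,
        # F-measure = harmonic mean of precision and recall
        "f_measure": 2 * precision * recall / (precision + recall),
        "npv": tn / (tn + fn),
    }


# Toy counts, chosen only to exercise the function (not study data).
toy = detection_metrics(tp=40, fp=10, fn=20, tn=130)

# Consistency check on the rounded values reported in the text for the
# newly proposed subset analysis: 0.587 + 0.878 - 1.
youden_reported = round(0.587 + 0.878 - 1, 3)
```

Applied to the rounded sensitivity and specificity reported for each method, the Youden's index identity reproduces the reported values (0.170, 0.465, and 0.525).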

This result indicates that the accuracy of signal detection has been greatly improved in the newly proposed subset analysis with a simple modification of the previous subset analysis. However, the newly proposed subset analysis exhibited slightly lower power and accuracy for detecting the drug–drug interaction signals compared to the Ω shrinkage measure model.

Verification by the number of reports showed that when the number of reports (*N*11; *n*111) < 2, the accuracy (*Youden's index*, *F*-*measure*) of signal detection was higher in the newly proposed subset analysis than in the Ω shrinkage measure model (*Youden's index*: the newly proposed subset analysis (0.337) vs. the Ω shrinkage measure model (0.174), *F*-*measure*: the newly proposed subset analysis (0.448) vs. the Ω shrinkage measure model (0.298)).

However, as the number of reports increased, the Ω shrinkage measure model became more accurate (*Youden's index*: the newly proposed subset analysis (0.465) vs. the Ω shrinkage measure model (0.525), *F*-*measure*: the newly proposed subset analysis (0.592) vs. the Ω shrinkage measure model (0.658)) (Table 4).

Additionally, the *True positive* values for the previous subset analysis and the newly proposed subset analysis were the same (Table 3). Since all signals obtained by the newly proposed subset analysis were included in the previous subset analysis, this result indicates that the detection criterion of the previous subset analysis was loose and that the data contained false positives.

The similarity between the newly proposed subset analysis and the Ω shrinkage measure model was κ (95% CI): 0.375 (0.355–0.395), *P*positive: 0.502, and *P*negative: 0.870. On the other hand, the similarity between the previous subset analysis and the Ω shrinkage measure model was κ (95% CI): 0.088 (0.071–0.105), *P*positive: 0.325, and *P*negative: 0.684. Thus, the newly proposed subset analysis was more similar to the Ω shrinkage measure model than the previous subset analysis was. However, the similarity between the newly proposed subset analysis and the Ω shrinkage measure model is not very high. Additionally, when the number of reports (*N*11; *n*111) was ≥3, no significant change was observed in the similarity between the Ω shrinkage measure model and the newly proposed subset analysis. Despite not being similar to the Ω shrinkage measure model, the newly proposed subset analysis showed a high degree of accuracy. This result suggests that the newly proposed subset analysis may be detecting signals that the Ω shrinkage measure model has failed to detect.
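For readers who wish to reproduce similarity measures of this kind, the sketch below computes Cohen's κ and the proportions of positive and negative agreement from a 2 × 2 agreement table. It assumes the common definitions *P*positive = 2a/(2a + b + c) and *P*negative = 2d/(2d + b + c); the function name `agreement` is hypothetical. The cell counts are illustrative: they were back-calculated to be consistent with the reported totals (909 and 712 signals among 3924 combinations) and are not taken from the study's tables.

```python
def agreement(a, b, c, d):
    """Cohen's kappa and proportions of specific agreement.

    a: signal in both methods      b: signal in method 1 only
    c: signal in method 2 only     d: signal in neither method
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal totals of each method.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    p_positive = 2 * a / (2 * a + b + c)
    p_negative = 2 * d / (2 * d + b + c)
    return kappa, p_positive, p_negative


# Illustrative counts (an assumption, back-calculated from the marginals):
# method 1 = newly proposed subset analysis (a + b = 909 signals),
# method 2 = Omega shrinkage measure model (a + c = 712 signals),
# total combinations = 3924.
kappa, p_pos, p_neg = agreement(a=407, b=502, c=305, d=2710)
```

With these assumed counts, the three measures round to the values reported above, which suggests that the definitions used here match those of the study; this remains an inference rather than a confirmed detail.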

This study has the following three limitations. First, data on unknown AEs do not exist anywhere in the world [14], so there were no "real" true data for AEs. For the purpose of verification, it was therefore necessary to set "hypothetical" true data for AEs instead. Of the five algorithms for detecting drug–drug interaction signals, we used the combination of signals detected by three (the additive model, the multiplicative model, and the chi-square statistical model) as "hypothetical" true data in this study. In other words, the hypothetical true data consisted of statistically based *drug D*1–*drug D*2–AE combinations, not pharmacologically based combinations.

Second, it is usually important to compare detection trends using all AEs recorded in a validation dataset built from a spontaneous reporting system; however, calculating signal values for all combinations of drug–drug interactions takes an extremely long time, so such a study design is not realistic. Therefore, this study targeted SJS, the same AE used in previous comparative studies [14,15]; if different reference sets were used, the possibility of obtaining different performance characteristics cannot be ruled out. JADER contains fewer enrolled cases than a global dataset because it is limited to cases in Japan. However, signal detection is based on the ratio of the number of reported cases (*N*) to the expected value (*E*). Therefore, differences in the number of cases enrolled in the spontaneous reporting system had only a very small statistical impact in this study. Recently, a validation of the number of cases enrolled in the spontaneous reporting system was also reported by Caster et al. [22]. Moreover, differences in the policies of regulatory authorities may result in different tendencies to register AEs in the spontaneous reporting system. For example, the Food and Drug Administration Adverse Events Reporting System (FAERS) in the United States also registers reports from non-medical professionals, whereas JADER did not register reports from patients until recently. It is unknown how these differences in registration tendencies affect the results of this study.

Finally, neither the general algorithms for detecting drug–drug interaction signals nor the subset analysis proposed in this study can detect antagonistic interactions; only signals of synergistic interactions were detected [10].
