1. Introduction
The skin is the largest organ of the human body [1]: it has the largest surface area and accounts for ca. 15% of adult body weight [2]. It provides a major barrier between the internal and the external environment [3]. It is composed of multiple layers, namely the surface epidermis, the deeper dermis, and the innermost subcutis [4], which, in turn, differ in structure, morphology, and function [5]. The hydrophobic stratum corneum (SC), the uppermost layer of the epidermis, is the predominant barrier to skin permeation and is normally regarded as the “rate-limiting step of permeation” [6].
Topical and transdermal drug delivery accounts for only a small portion of administration routes [7]. Nevertheless, it has become an attractive and preferred route of therapeutic delivery, partly owing to its noninvasive nature and more desirable safety profile [8,9]. For instance, patches have been proposed for delivering insulin [10] and the COVID-19 vaccine [11]. Furthermore, it can provide extra clinical benefits compared with other administration routes. For instance, postoperative nausea and vomiting (PONV) is not uncommon after general or regional anesthesia [12] and is commonly treated with scopolamine (hyoscine) [13], which is associated with various undesirable side effects, such as xerostomia, blurred vision, drowsiness, vertigo, or, in some cases, hallucinations [14]. Those anticholinergic symptoms, nevertheless, can be avoided when the drug is administered by transdermal patch [15]. In fact, scopolamine was the first marketed transdermal patch [16].
A further clinical benefit of topical administration is that it avoids the potential adverse effects associated with the hepatic first-pass effect (FPE) after oral administration, as well as the variations in the gastrointestinal (GI) tract, namely pH differences, food intake, and gastric emptying [17].
Skin permeability is a pivotal factor in the pharmaceutical and cosmetics industries for optimizing the delivery of active substances as well as for hazard and risk evaluation [18]. Various in vitro, in vivo, and ex vivo assay systems have been devised to assess drug retention in skin layers and skin permeability [19]. Among in vitro assay systems, skin from humans, pigs, hairless rodents, and guinea pigs, as well as artificial membranes, are accepted by the European Medicines Agency (EMA) as a means of evaluating skin permeability [20]. Nevertheless, ex vivo excised human skin is still considered the de facto standard for in vitro permeation assessment, despite the ethical issues associated with it [
21]. In vitro skin permeability is normally defined by the permeability coefficient or constant (Kp) as follows,
$$K_p = \frac{J_{ss}}{\Delta C_V}$$
where Jss and ΔCV are the steady-state flux and the chemical concentration difference, respectively [22].
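The definition above can be illustrated with a minimal Python sketch. The function name, the units, and the numerical values are assumptions made for illustration only and are not taken from this study.

```python
import math


def permeability_coefficient(j_ss: float, delta_c: float) -> float:
    """Return Kp = Jss / delta_C.

    j_ss    : steady-state flux, assumed here in ug/(cm^2 h)
    delta_c : concentration difference across the skin, assumed in ug/cm^3
    With these assumed units the result is in cm/h.
    """
    if delta_c <= 0:
        raise ValueError("delta_c must be positive")
    return j_ss / delta_c


# Hypothetical values, chosen only to show the arithmetic.
kp = permeability_coefficient(j_ss=2.0, delta_c=1000.0)
log_kp = math.log10(kp)
print(f"Kp = {kp:.4g} cm/h, log Kp = {log_kp:.2f}")
```

Models of this kind are usually fitted to log Kp rather than Kp, which is why the logarithm is computed as the last step.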
In silico modeling provides an attractive alternative for assessing skin permeability since it is less time-consuming, more economical, and free of the ethical issues of its in vivo and in vitro counterparts [23]. Most importantly, in silico technology can be applied to virtual compounds, viz. compounds that have not yet been synthesized. In fact, numerous in silico models for predicting skin permeability have been published [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38].
Skin permeation can take place through the transcellular route, in which the permeants cross the cells of the SC, the intercellular route, in which the permeants cross the lipid matrix, and the shunt or appendageal route, in which diffusion proceeds through the hair follicles, sebaceous glands, and sweat glands [20], as illustrated in Figure 1 of Benson [39]. Furthermore, compounds with different physicochemical properties can penetrate skin layers via different routes. For instance, very polar, moderately polar, and poorly polar compounds can exhibit different permeation behavior, and a sophisticated theoretical model that accounts for the diverse mechanisms of solutes of different polarities is needed, as has been suggested [24]. Transporters of the ATP-binding cassette (ABC) and solute carrier (SLC) superfamilies can be expressed in human skin [40] and can further enhance and/or reduce permeability, which makes producing a sound in silico model that accounts for all of these complicated factors extremely difficult, if not impossible. Theoretical models based on in vitro assays other than those using human skin can be of limited applicability because of the poor or weak correlation between human skin and the skin of other animal species.
To date, most quantitative in silico models have been constructed using one of two categories of schemes, namely linear regression or machine learning (ML) [41]. The former, including partial least squares (PLS) and multiple linear regression (MLR), can reveal the link between the adopted descriptors and biological activity [41]. Nevertheless, it is hard for linear models to function properly when such links are very complicated, as exemplified by the varied relationships between molecular polarity and skin permeability (vide supra). This difficulty can be addressed by ML-based schemes, since ML-based models generally handle nonlinearity better than their linear counterparts [42]. Their “black box” nature, conversely, makes it difficult to relate the selected descriptors to biological activity [41]. This apparent conflict between interpretability and predictivity can be resolved by a novel two-QSAR approach [43] that combines the ML-based hierarchical support vector regression (HSVR) scheme [44] with the linear PLS scheme. Herein, this study aimed to predict Kp values based on ex vivo human skin permeability data by means of the two-QSAR scheme in order to facilitate drug discovery.
4. Discussion
Skin permeation takes place through a series of processes in which molecules must penetrate various skin layers before they reach the systemic circulation (vide supra). As such, skin permeability is governed by various factors, of which log P and molecular weight (MW) are the two most frequently adopted descriptors, as exemplified by the predictive model developed by Potts and Guy [24], in which only these two descriptors were used.
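As a minimal sketch of a two-descriptor model of this kind, the Python function below evaluates the Potts and Guy relationship in the form in which it is commonly quoted in the literature (log Kp in cm/s). The coefficients are not given in this paper and should be verified against the original publication [24] before use; the example compound is hypothetical.

```python
def potts_guy_log_kp(log_p: float, mw: float) -> float:
    """Estimate log Kp (Kp in cm/s) from log P and molecular weight.

    Coefficients are the commonly quoted literature values for the
    Potts-Guy model; they are an assumption here, not taken from this study.
    """
    return -6.3 + 0.71 * log_p - 0.0061 * mw


# Hypothetical compound: log P = 2.0, MW = 200 g/mol.
example_log_kp = potts_guy_log_kp(log_p=2.0, mw=200.0)
print(f"predicted log Kp = {example_log_kp:.2f}")
```

The positive log P coefficient and negative MW coefficient encode the two trends discussed in this section: permeability rises with lipophilicity and falls with molecular size.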
In fact, the significance of log P in skin permeability is shown by the fact that all of the SVR models in the ensemble adopted this descriptor, and it is further supported by the largest absolute weight (1.27) given by PLS (Equation (22)) among all of the selected descriptors. Furthermore, Potts and Guy observed that log Kp increases linearly with log P. Nevertheless, the correlation coefficient (r) between log P and log Kp was only 0.42 for all of the molecules selected in this study (Table 1), which is similar to the value observed by Chen et al. (r = 0.467) [32]. This inconsistency is plausibly attributable to the fact that Potts and Guy selected only compounds with log P < 4, whereas this study and Chen et al. included some compounds with log P ≥ 4. More importantly, it can be observed from Figure 8, which displays the average log Kp for each histogram bin of log P for all selected molecules, that log Kp initially increased with log P and then decreased once log P ≥ 4, leading to an apparently bilinear relationship between log Kp and log P.
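The bin-averaging behind a plot such as Figure 8 can be sketched as follows. This is a minimal illustration, assuming unit-width log P bins and a small set of hypothetical data points; neither the bin width nor the values come from this study.

```python
import math
from collections import defaultdict


def mean_by_bin(x, y, width=1.0):
    """Average y within fixed-width bins of x.

    Returns a dict mapping each bin's lower edge to the mean y in that bin.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for xi, yi in zip(x, y):
        edge = math.floor(xi / width) * width
        sums[edge][0] += yi
        sums[edge][1] += 1
    return {edge: total / count for edge, (total, count) in sorted(sums.items())}


# Hypothetical data: log Kp rises with log P up to about 4, then falls.
log_p = [0.2, 0.8, 1.5, 2.4, 2.6, 3.5, 4.2, 4.9, 5.5]
log_kp = [-3.6, -3.4, -3.0, -2.6, -2.4, -2.0, -2.3, -2.5, -3.0]

binned = mean_by_bin(log_p, log_kp)
peak_bin = max(binned, key=binned.get)
for edge, mean_value in binned.items():
    print(f"log P in [{edge:.0f}, {edge + 1:.0f}): mean log Kp = {mean_value:.2f}")
```

With data shaped like this, the bin with the highest mean log Kp sits just below log P = 4, which is the turning point described above.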
This intricate dependence can be explained by the fact that it is easier for more hydrophobic permeants to approach the skin lipid bilayer, which is hydrophobic per se [20]. Conversely, it is harder for excessively hydrophobic permeants to escape the skin lipid bilayer, and they may even be retained in the skin layers without significant penetration. As such, the overall r is reduced once the positively and negatively correlated ranges are pooled. It is plausible to expect that such bilinearity cannot be properly addressed by linear models, whereas this nonlinearity can be appropriately handled by ML schemes provided that the other descriptors are properly selected.
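The effect of pooling a rising and a falling branch on the overall correlation coefficient can be shown with a small Python sketch. The data are idealized and hypothetical (perfectly linear on each side of log P = 4) and serve only to illustrate the argument.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Idealized, hypothetical bilinear data with a turning point at log P = 4.
log_p = [0, 1, 2, 3, 4, 5, 6]
log_kp = [-4.0, -3.5, -3.0, -2.5, -2.0, -2.5, -3.0]

low = [(p, k) for p, k in zip(log_p, log_kp) if p <= 4]
high = [(p, k) for p, k in zip(log_p, log_kp) if p >= 4]

r_low = pearson_r(*zip(*low))    # rising branch only
r_high = pearson_r(*zip(*high))  # falling branch only
r_all = pearson_r(log_p, log_kp)  # both branches pooled
print(f"r (log P <= 4) = {r_low:.2f}, r (log P >= 4) = {r_high:.2f}, pooled r = {r_all:.2f}")
```

Each branch is perfectly correlated on its own, yet the pooled coefficient is noticeably weaker, which is the reason a single linear term in log P fits such data poorly.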
Potts and Guy adopted the descriptor MW to represent the impact of size on skin permeability [24]. In fact, most published in silico models have also selected MW as the size-related descriptor. Nevertheless, none of the SVR models in the ensemble included MW, whereas SVR A adopted the descriptor molecular volume (Vm). This divergence can be justified by the fact that MW was highly correlated with Vm, with an r value of 0.96 for all of the molecules compiled in this study, suggesting that it is plausible to replace MW with Vm as a size-related descriptor. This justification is also consistent with the postulation made by Wilschut et al. that molecular size is better represented by Vm when electron-density distributions are taken into account [103]. For instance, stereoisomers have the same MW, whereas their Vm values are different, indicating that MW cannot distinguish between stereoisomers and that Vm is a better way to represent the size factor. Consistently, models built with Vm empirically performed better than those built with MW (data not shown), which can additionally be partially attributed to the fact that Vm was calculated from a geometry that was fully optimized by the more sophisticated DFT with a decent basis set and with consideration of the solvent effect.
PLS assigned a negative weight to Vm (Equation (22)), similar to other published models, which unanimously gave negative coefficients to MW. The inverse relationship between molecular size and skin permeability can be plausibly explained by the fact that molecular size is the most critical factor in determining solute flux through the epidermis, since smaller solute molecules are more likely to enter the SC pores and, consequently, cross the SC pores and lipid lamellar layers faster [104].
It is unusual that SVR B adopted the descriptor ⁰χ, the molecular connectivity index of order zero, since none of the published in silico models has selected this descriptor. Nevertheless, it can be observed from Figure 9, which plots Vm versus ⁰χ, that Vm and ⁰χ were very highly correlated (r = 0.98), suggesting that ⁰χ can be another descriptor of the size factor in skin permeability. Over-training was not an issue in this study since SVR A and SVR B adopted Vm and ⁰χ, respectively, viz. no SVR model in the ensemble selected both descriptors simultaneously. The significance of ⁰χ in skin permeability is shown by the weight given by PLS, which is very similar to the one associated with Vm (−0.554268 vs. −0.55661). More importantly, empirical trials showed that HSVR based on this descriptor combination performed better than the others (data not shown), plausibly as a result of descriptor–descriptor interaction [7]. Other linear or nonlinear ML-based QSAR methods, in contrast, cannot properly accommodate such seemingly paradoxical descriptor selections.
It is of interest to note the selection of partial positive surface area (Jurs_PPSA_1) by SVR C, since it has never been included in any published in silico model. Nevertheless, it has been observed that polar surface area (PSA) plays a significant role in distinguishing between substrates and non-substrates of P-glycoprotein (P-gp) [87], and it has been found that P-gp can be expressed in human skin [105]. In contrast to its role in the intestine and blood–brain barrier (BBB), the efflux transporter P-gp plays an influx role in the skin by transporting substrates from the surface into the dermis [106]. As such, the descriptor Jurs_PPSA_1, which is a modified version of PSA, was adopted in this study and yielded better model performance (data not shown). To investigate the impact of Jurs_PPSA_1 on skin permeability, the compounds selected in this study were further classified as P-gp substrates and non-P-gp substrates using admetSAR (available at http://lmmd.ecust.edu.cn/admetsar2/, accessed on 17 September 2021). The results are shown in Figure 10, which plots log Kp versus Jurs_PPSA_1 for the P-gp substrates and non-P-gp substrates along with their associated regression lines. Jurs_PPSA_1 was strongly correlated with log Kp for P-gp substrates, with an r value of 0.96, whereas there was a negative correlation between log Kp and Jurs_PPSA_1 for non-P-gp substrates, suggesting that PSA can facilitate the influx of P-gp substrates, which, in turn, can enhance skin permeation.
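The group-wise regression lines of a plot such as Figure 10 can be sketched in Python as below. The descriptor and log Kp values are hypothetical placeholders, and the substrate labels stand in for the admetSAR classification; none of these numbers come from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (Jurs_PPSA_1, log Kp) values for each group.
groups = {
    "P-gp substrate": ([100.0, 150.0, 200.0, 250.0], [-3.0, -2.6, -2.3, -1.9]),
    "non-P-gp substrate": ([100.0, 150.0, 200.0, 250.0], [-2.0, -2.2, -2.1, -2.5]),
}

fits = {name: fit_line(ppsa, log_kp) for name, (ppsa, log_kp) in groups.items()}
for name, (slope, intercept) in fits.items():
    print(f"{name}: log Kp = {slope:.4f} * Jurs_PPSA_1 + {intercept:.2f}")
```

Fitting each group separately is what exposes the opposite slope signs; a single line through the pooled data would average them away.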
PSA can also represent molecular polarity [87]. Abraham et al. adopted a molecular polarity-related descriptor to describe its impact on skin permeability [29]. The negative coefficient of Jurs_PPSA_1 given by PLS (−0.076344) as well as the negative weight associated with the polar descriptor in the model developed by Abraham et al. indicate an inverse relationship between PSA/polarity and skin permeability. Additionally, a larger PSA or dipole results in stronger solute–solute and solute–solvent interactions, increasing the desolvation energy required when solutes approach the skin lipid bilayer [7].
The rather small r value (−0.34) between log Kp and Jurs_PPSA_1 for non-P-gp substrates can presumably be attributed to the different permeation routes for molecules with different polarities (vide supra) as well as the nature of solute–solute and solute–solvent interactions. As such, Jurs_PPSA_1 plays a profound role in skin permeation since it can either enhance or reduce skin permeation depending on the nature of the permeant, and such a contradictory feature cannot be properly depicted by any traditional linear model. HSVR, conversely, can correctly capture such a complicated relationship.
It has been observed that neutral compounds are more permeable in the human colon carcinoma cell layer (Caco-2) and parallel artificial membrane permeability assay (PAMPA) systems [7,89]. It is of interest to investigate whether neutral compounds also have higher permeability values in the ex vivo skin permeability model than the other ion classes. All of the molecules included in this study were categorized into four ion classes according to their pKa values. It can be seen from Figure 11, which shows box plots of the log Kp minimum, maximum, mean, median, 25th percentile, and 75th percentile for each ion class, that the log Kp values of neutral compounds are larger than those of the other ion classes, whereas those of basic compounds are statistically lower than the others. This suggests that neutral compounds are more permeable, which is consistent with the observations made in the Caco-2 and PAMPA systems, and that basic compounds are less likely to penetrate the skin.
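The per-class statistics summarized in a box plot such as Figure 11 can be computed as in the following Python sketch. The class names other than neutral and basic, the log Kp values, and the choice of the inclusive quantile method are all assumptions made for illustration.

```python
import statistics


def box_summary(values):
    """Return the statistics shown in a box plot for one group of values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical log Kp values grouped by an assumed four-class scheme.
log_kp_by_class = {
    "neutral": [-2.1, -1.8, -2.4, -1.5, -2.0],
    "acidic": [-2.6, -2.9, -2.3, -2.7],
    "basic": [-3.2, -3.6, -2.9, -3.4],
    "zwitterionic": [-2.8, -3.0, -2.5],
}

summary = {name: box_summary(vals) for name, vals in log_kp_by_class.items()}
for name, stats in summary.items():
    print(name, {key: round(val, 2) for key, val in stats.items()})
```

Comparing the medians and quartiles across classes gives a first impression of the ordering; a formal statistical test would still be needed to support a claim of significance.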