data from supplier. n.a.: not available; experiments were not performed due to batch-to-batch variation of TiO2 NW resulting in suspensions of poor quality.

#### *3.2. Dissolution and Transformation*

The solubility in physiologically relevant fluids was determined using two approaches: a static and a dynamic method. The static approach was applied for artificial alveolar fluid (AAF, pH 7.4) and artificial lysosomal fluid (ALF, pH 4.5), whereas the dynamic approach was conducted with a phagolysosomal simulant fluid (PSF, pH 4.5). The results are summarized as the proportion of dissolved material after seven days of incubation (Table 4). Most materials were not soluble in AAF; only Cu NP and Cu NW showed a low and comparable solubility in AAF of 12%. The solubility in ALF depended strongly on the material examined. High solubilities were observed in the static method for Cu NP (64%), Cu NW (57%), and CuO NP (57%). These materials also showed high solubility in the dynamic process, with CuO NP reaching the highest solubility of 97%. The nickel materials exhibited a moderate (Ni NW) to high (Ni NP) solubility in the static process, while Ni NW showed a high solubility of 94% in the dynamic process. In both the static and the dynamic approach, no or only very low solubility was detected for the Ag NP. The fibrous silver material (Ag NW) was also found to be insoluble by the static approach, while a low solubility of 11% was found with the dynamic system. For particulate and fibrous TiO2 as well as for particulate CeO2, no solubility was observed in any of the media tested.

**Table 4.** Summary of the solubility in biologically relevant model fluids after seven days. AAF: Artificial alveolar fluid (pH 7.4); ALF: Artificial lysosomal fluid (pH 4.5); PSF: Phagolysosomal simulant fluid (pH 4.5).


Next, the transformation of nanomaterial shape and the speciation of NW after dynamic dissolution were investigated (Figure 1). For this purpose, the flow-through cells were flushed with water and opened, and the remaining solids were rinsed into a centrifuge vial with a TEM grid at the bottom. By centrifugation, all solid material > 10 nm was spun onto the TEM grid, and the supernatant containing the buffer salts was discarded. Compared to non-treated Ag NW, particulate morphologies were observed, which represent the thermodynamically stable form. Owing to the high solubility in PSF, a smaller number of Cu NW, decorated by newly formed substructures, was found on the TEM grids. The tendency of increased polymorphism from long fibers towards a higher number of small particle structures coincided with sulfidation (EDXS, data not shown), which may have formed passivating layers. Since Ni NW dissolved almost completely, no NW were detected during TEM measurements. As expected, undissolved TiO2 NW stayed aggregated after dissolution.

*[Image panels: Ag NW, Cu NW, and Ni NW, each before and after treatment]*

**Figure 1.** TEM images of NP and NW before and after treatment in the flow-through cells with phagolysosomal simulant fluid (PSF).

#### *3.3. Abiotic Reactivity*

To assess the abiotic reactivity, the ferric reducing ability of serum (FRAS) assay was applied. This assay quantifies the oxidative potential of nanomaterials as a mass-metric Biological Oxidative Damage (mBOD), measured via the loss of reducing capacity of human blood serum [34]. For each NP and NW, a dose-response series was measured, and one concentration close to ~20% of the maximum NM oxidative potential was selected for the evaluation of the ion contribution (Figure 2).

**Figure 2.** FRAS testing (**A**) Cu NP, Cu NP ions, Cu NW and Cu NW ions (**B**) Ni NP, Ni NP ions, Ni NW and Ni NW ions (**C**) TiO2 NP and TiO2 NW. Error bars indicate one standard deviation from triplicate testing, and are smaller than the size of the symbol in most cases. Statistics were performed using either ANOVA-Dunnett's T3 (\* *p* ≤ 0.05, \*\* *p* ≤ 0.01, \*\*\* *p* ≤ 0.001) or the 2-sided Dunnett's test (‡ *p* ≤ 0.05, ‡‡‡ *p* ≤ 0.001).

After incubation of the Cu NP in the FRAS assay medium for the duration of the assay at a concentration of 0.04 g/L (~20% of the measured Cu NP oxidative potential), the actual Cu ion concentration was determined to be 11 mg/L. The oxidative potential of the ions (18,048 ± 2863 nmol TEU/L) was less than half of, and significantly lower than, the response induced by the total Cu NP (49,783 ± 644 nmol TEU/L); therefore, the reactivity of Cu NP at 0.04 g/L was predominantly assigned to the particles, which showed a steep dose-response curve. Similarly, CuO NP presented high reactivity, which was dominated by the particles themselves (Figure S3A). The ion contribution of Cu NW was tested at a concentration of 0.1 g/L, and the actual Cu ion concentration detected was 6.6 mg/L. The reactivity of Cu NW (0.1 g/L) originated completely from ions (Figure 2A). Moreover, FRAS mBOD values at a concentration of 0.22 g/L were calculated as 842 ± 10 and 922.0 ± 5.4 nmol TEU/mg for Cu NP and Cu NW, respectively, representing very high reactivity for both forms.
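The comparison above rests on two simple ratios: the share of the nominal dose recovered as ions, and the share of the total FRAS response that the ions alone reproduce. The following is a minimal sketch of that arithmetic using the Cu values reported in this paragraph. The function names are illustrative and not taken from the study, and relating ion mass directly to dose mass assumes a purely metallic material (it would not hold as written for CuO).

```python
def dissolved_fraction(ion_mg_per_l: float, dose_g_per_l: float) -> float:
    """Fraction of the nominal dose found as dissolved ions.

    Assumes the dose consists only of the metal (e.g. metallic Cu),
    so ion mass and dose mass are directly comparable.
    """
    dose_mg_per_l = dose_g_per_l * 1000.0  # g/L -> mg/L
    return ion_mg_per_l / dose_mg_per_l


def ion_share(ion_response: float, total_response: float) -> float:
    """Share of the total FRAS response reproduced by the ions alone."""
    return ion_response / total_response


# Values reported in the text (responses in nmol TEU/L)
cu_np_dissolved = dissolved_fraction(ion_mg_per_l=11.0, dose_g_per_l=0.04)
cu_nw_dissolved = dissolved_fraction(ion_mg_per_l=6.6, dose_g_per_l=0.1)
cu_np_ion_share = ion_share(ion_response=18048.0, total_response=49783.0)

print(f"Cu NP dissolved fraction: {cu_np_dissolved:.1%}")
print(f"Cu NW dissolved fraction: {cu_nw_dissolved:.1%}")
print(f"Cu NP ion share of response: {cu_np_ion_share:.1%}")
```

For Cu NP, the ions account for roughly a third of the total response, which is consistent with assigning the reactivity mainly to the particles.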

The ion contribution for Ni NP and NW was examined at a concentration of 15 g/L. Ni ion concentrations were determined to be 105 mg/L and 31 mg/L, respectively. Both Ni ions and NPs contributed significantly to the reactivity of Ni NP at 15 g/L (Figure 2B), with values of 23,897 ± 1470 nmol TEU/L (ions) and 30,063 ± 441 nmol TEU/L (particles). The mBOD values at a concentration of 15 g/L are 2.00 ± 0.03 nmol TEU/mg for Ni NP and 0.50 ± 0.01 nmol TEU/mg for Ni NW. Although the difference is metrologically significant, the values are very similar in comparison to the dynamic range of the assay, as exemplified by the values of the Cu-based materials.
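The Ni NP numbers suggest that mBOD is the FRAS response normalized to the mass concentration of the material: dividing 30,063 ± 441 nmol TEU/L by 15 g/L reproduces the stated 2.00 ± 0.03 nmol TEU/mg. This relation is inferred from the reported values rather than stated explicitly in the text, so the sketch below should be read under that assumption; the function name is hypothetical.

```python
def mbod(response_nmol_teu_per_l: float, sd_nmol_teu_per_l: float,
         dose_g_per_l: float) -> tuple[float, float]:
    """Mass-normalized response (nmol TEU/mg) and its standard deviation.

    Assumption: mBOD = response / mass concentration, with the standard
    deviation scaled by the same factor.
    """
    dose_mg_per_l = dose_g_per_l * 1000.0  # g/L -> mg/L
    return (response_nmol_teu_per_l / dose_mg_per_l,
            sd_nmol_teu_per_l / dose_mg_per_l)


# Ni NP at 15 g/L, values as reported in the text
ni_np_mbod, ni_np_sd = mbod(30063.0, 441.0, 15.0)
print(f"Ni NP mBOD: {ni_np_mbod:.2f} ± {ni_np_sd:.2f} nmol TEU/mg")
# prints: Ni NP mBOD: 2.00 ± 0.03 nmol TEU/mg
```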

Since TiO2 NP and NW were insoluble materials, ion contribution to the reactivity was not considered. Similar mBOD values at a concentration of 15 g/L were calculated for both NP: 1.20 ± 0.09 and NW: 1.5 ± 0.22 nmol TEU/mg (not all replicates were useful for the NW, and the error of the worst triplicate in the concentration series is given). Again, the difference is metrologically significant, but the values are very similar in comparison to the dynamic range of the assay.

Moreover, the reactivity of CeO2 and Ag NP was evaluated (Figure S3B,C). As an insoluble material, CeO2 presented very low reactivity (mBOD: 1.90 ± 0.09 nmol TEU/mg at conc. 5.6 g/L). Ag NP showed intermediate reactivity with a mBOD value of 22.3 ± 0.2 nmol TEU/mg (at conc. 3.0 g/L).

#### *3.4. Cell Viability and Bioavailability*

For all cellular experiments doses are stated as μg/mL to facilitate comparison between NP and NW, since the parameter of the deposited dose was only available for NP and not for NW as described in Section 3.1.

#### 3.4.1. Cell Viability

Cell viability was investigated in four different cell lines, namely three lung epithelial cell lines (A549, Beas-2B, and RLE-6TN (rat)) and one cell line with macrophage-like properties (dTHP-1). For each nanomaterial, five different doses were chosen and their impact on the ATP content was assessed. Furthermore, the relative cell count (RCC) after incubation with three different doses was analyzed (Figure S4). As shown in the overview provided in Figure 3, all materials that were soluble in lysosomal fluid (Cu, CuO, Ni) showed a dose-dependent cytotoxic effect in all investigated cell lines. Additionally, a dose-dependent decrease in cell viability was seen after incubation with the Ag-based materials, even though no ion release in the lysosomal fluid was observed for these materials. The dTHP-1 cells, as a cell culture model with macrophage-like properties, reacted most sensitively to all lysosomal-soluble nanomaterials. Comparing NP and NW of the same material at the same concentrations, the NW caused less pronounced cellular toxicity, with the exception of the Cu-based materials. For the insoluble materials TiO2 NP and CeO2 NP, no decrease in viability was observed, even at the highest applied dose of 100 μg/mL.

**Figure 3.** Impact of nanomaterials on the ATP content of A549, Beas-2B, RLE-6TN, and dTHP-1 cells after incubation with five different doses of the nanomaterials. The ATP content of incubated samples was normalized to an untreated control. Significantly different from negative controls: \* *p* ≤ 0.05, \*\* *p* ≤ 0.01, \*\*\* *p* ≤ 0.001 (ANOVA-Dunnett's *t-*test).

#### 3.4.2. Bioavailability

Bioavailability of all nanomaterials was analyzed after 24 h incubation with three different concentrations (Figure 4) in four different cell lines (A549, Beas-2B, RLE-6TN, and dTHP-1). Doses were chosen in preliminary experiments and normalized to low, mid, and high cytotoxic effects. Cells were incubated with nanomaterials for 24 h in their respective cell culture media. Depicted are the means of three independently performed experiments ± standard deviation.

**Figure 4.** Bioavailability of Cu- (**A**), Ni- (**B**), Ag- (**C**), and Ce- and Ti-based materials (**D**) in A549 (light blue), Beas-2B (purple), RLE-6TN (dark green), and dTHP-1 (light green) cells. Bioavailability is displayed as ion release in ng/10<sup>6</sup> cells. Cells were incubated with nanomaterials for 24 h in their corresponding cell culture media. Depicted are the means of three independently performed experiments ± standard deviation. Statistics were performed using either ANOVA-Dunnett's test (\* *p* ≤ 0.05, \*\* *p* ≤ 0.01, \*\*\* *p* ≤ 0.001) or the unpaired *t* test (• *p* ≤ 0.05, •• *p* ≤ 0.01) to compare differences from basal concentration.

All materials that were soluble in ALF (pH 4.5) also showed a dose-dependent increase of the intracellular ion release. The basal Cu content within the epithelial cell lines (A549, Beas-2B, RLE-6TN) was around 2–3 ng/10<sup>6</sup> cells (15 μM). In the dTHP-1 cells, a basal Cu concentration of 20 ng/10<sup>6</sup> cells (30 μM) was observed. After incubation with the CuO NP, the intracellular content of released copper ions increased up to 250 ng/10<sup>6</sup> cells (1700 μM), being comparable for all cell lines except the RLE-6TN cells. Here, only a small increase of intracellular copper content was observed after incubation with the CuO NP (70 ng/10<sup>6</sup> cells; 600 μM). Comparing Cu NP and NW, the bioavailability of the Cu NW was considerably higher, especially in Beas-2B cells (up to 800 ng/10<sup>6</sup> cells; 5500 μM). This observation could be explained by the higher dissolution rate of the Cu NW in the cell culture medium used for Beas-2B cultivation (shown in Figure S5) and therefore a simultaneous uptake of Cu ions in Beas-2B cells. Besides Beas-2B cells, dTHP-1 cells also showed a strong release of Cu ions after incubation with the Cu NW, with a maximum of 570 ng/10<sup>6</sup> cells (3157 μM).
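The molar concentrations given in parentheses follow from the metal mass per 10<sup>6</sup> cells, the molar mass of the metal, and the volume of the cells. The cell volumes used for each cell line are not stated in this section, so the sketch below treats the volume per cell as a hypothetical parameter; the value of 2.3 pL is purely illustrative and not taken from the study.

```python
CU_MOLAR_MASS_G_PER_MOL = 63.546  # standard atomic weight of copper


def intracellular_micromolar(mass_ng_per_million_cells: float,
                             molar_mass_g_per_mol: float,
                             cell_volume_pl: float) -> float:
    """Convert ng metal per 10^6 cells into an intracellular concentration (µM).

    cell_volume_pl is an assumed mean volume of a single cell in picolitres.
    """
    amount_nmol = mass_ng_per_million_cells / molar_mass_g_per_mol  # ng / (g/mol) = nmol
    total_volume_ul = cell_volume_pl * 1e6 * 1e-6  # 10^6 cells, pL -> µL
    concentration_mm = amount_nmol / total_volume_ul  # nmol/µL = mmol/L
    return concentration_mm * 1000.0  # mmol/L -> µmol/L


# Illustrative only: 250 ng Cu per 10^6 cells with an assumed 2.3 pL per cell
cu_um = intracellular_micromolar(250.0, CU_MOLAR_MASS_G_PER_MOL, cell_volume_pl=2.3)
print(f"{cu_um:.0f} µM")
```

With this assumed volume, 250 ng Cu/10<sup>6</sup> cells corresponds to roughly 1.7 mM, the same order of magnitude as the values reported above. Because the conversion scales inversely with cell volume, the same mass per 10<sup>6</sup> cells can yield different molar concentrations in different cell lines.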

For both Ni-based materials, a strong dose-dependency in intracellular bioavailability was observed. Basal Ni concentrations ranged around 1–3 ng/10<sup>6</sup> cells (3–10 μM) for all epithelial cells and 7 ng/10<sup>6</sup> cells (26 μM) for the dTHP-1 cells. After incubation with 10 μg Ni NP/mL, intracellular Ni content increased up to 250 ng/10<sup>6</sup> cells (3000 μM). In comparison, the bioavailability of Ni NW was much lower. Intracellular Ni contents up to 44 ng/10<sup>6</sup> cells (485 μM) were seen for two epithelial cell lines (A549 and RLE-6TN) after an incubation dose of 10 μg Ni NW/mL. After treatment with 50 μg Ni NW/mL and higher, Ni ion content increased up to 140 ng/10<sup>6</sup> cells (950 μM). Regarding the Ni content in the dTHP-1 cells at an incubation dose of 10 μg Ni NW/mL, intracellularly dissolved Ni was found to be four times higher (160 ng/10<sup>6</sup> cells; 600 μM) compared to that of all epithelial cells. For Beas-2B cells, only a low bioavailability was observed after incubation with Ni NW, leading to a maximum intracellular Ni content of 13 ng/10<sup>6</sup> cells (90 μM). A lower bioavailability of Ni in the Beas-2B cells was also observed after incubation with the Ni NP. Solubility of the Ni-based materials in cell culture media was very low, with a maximum dissolution of 4% in all cell culture media used in this study (Figure S5). Thus, the intracellular bioavailability of these materials can be attributed to the uptake of undissolved materials. Moreover, no differences in solubility between the different cell culture media were seen. Therefore, differences in the intracellular bioavailability of the Ni-based materials appear to depend on the cell lines and their specific properties.

In acellular investigations, no dissolution of the Ag-based materials in ALF was seen. However, bioavailability was observed in the cellular studies. Incubation with the Ag NP resulted in a weak dose-dependent increase of the intracellular Ag content of up to 47 ng/10<sup>6</sup> cells (180 μM) at an incubation dose of 100 μg/mL for all epithelial cell lines. In contrast, Ag content in the dTHP-1 cells was much higher, reaching an intracellular Ag ion release of 94 ng/10<sup>6</sup> cells (175 μM) at 10 μg Ag NP/mL. Incubation with the Ag NW led to a maximum Ag ion release of 34 ng/10<sup>6</sup> cells (102 μM) at the maximum dose of 100 μg/mL for the epithelial cells, whereas the Ag ion content of the dTHP-1 cells was found to be around five times higher (150 ng/10<sup>6</sup> cells; 330 μM) after an incubation dose of 100 μg Ag NW/mL. For the insoluble materials TiO2 NP and CeO2 NP, no bioavailability was seen even after treatment with the highest dose of 100 μg/mL.

#### 3.4.3. Intracellular Distribution

Additionally, the intracellular distribution of all bioavailable materials was investigated by fractionating the cells into the soluble fractions of the cytoplasm and nucleus (Figure 5). Since treatment doses vary between the different materials, they are stated in Table S2. A dose-dependent increase of ion concentrations was seen for all of the materials in the cytoplasm as well as in the nucleus. For the Cu-based materials, a strong accumulation of intracellular dissolved Cu ions was found within the nucleus of all cell lines, reaching concentrations of 1 mM and higher. As already observed for the cellular bioavailability, the compartment-specific Cu concentration was much more pronounced after treatment with Cu NW when compared to the particulate Cu-based materials. Regarding the Ag-based materials, a nuclear accumulation was evident in all cell lines. Here, concentrations up to 4 mM were observed after incubation of Beas-2B cells with Ag NP and after applying Ag NW on dTHP-1 and RLE-6TN cells. For the Ni-based materials, a lower concentration of released Ni ions was found in the nucleus as compared to the cytoplasm of all cell lines.


**Figure 5.** Intracellular distribution in cytoplasm and nucleus of A549, Beas-2B, RLE-6TN and dTHP-1 cells after incubation with three doses of metal-based nanomaterials. Significantly different from basal concentration: \* *p* ≤ 0.05, \*\* *p* ≤ 0.01, \*\*\* *p* ≤ 0.001 (ANOVA-Dunnett's *t-*test).

#### **4. Discussion**

In this study, nine different particulate or fibrous metal-based nanomaterials were investigated with respect to their physicochemical properties and solubility behavior in acellular fluids of different pH values. Furthermore, the cytotoxicity and intracellular bioavailability of the materials were determined in four different cell lines relevant for inhalative exposure. To the best of our knowledge, this is the first study systematically comparing the impact of different particulate and fibrous metal-based materials on all of these parameters in parallel. It revealed some quantitative differences between nanomaterial shapes, but a more distinct impact of the respective metal species under investigation.

Concerning the acellular investigations, solubility in AAF (pH 7.4) was absent or low for all materials. This suggests that the analyzed materials do not dissolve in the extracellular matrix of the respiratory tract, which is in accordance with previous studies [24]. Therefore, a nanomaterial-cell interaction within the lung can be postulated. However, since nanomaterials are taken up via endocytosis and are subsequently transported to lysosomes, dissolution in this acidic environment appears to be relevant. The acidic conditions may result in higher solubility in this cellular compartment, with potential intracellular metal ion release and thus potential metal ion-derived cellular toxicity. Therefore, two different approaches were chosen and compared to determine solubility under acidic conditions, namely static dissolution in artificial lysosomal fluid (ALF) and dynamic dissolution in phagolysosomal simulant fluid (PSF), both at pH 4.5. The dynamic approach was chosen additionally since the lung is not a static system and dissolved ions are transported quickly to other compartments, which makes it the more realistic approach [36,37]. Furthermore, dissolution in the dynamic approach is not limited by saturation conditions, which prevents an underestimation of the dissolution rate [38]. Regarding the different materials under investigation, even after seven days no solubility was observed in either experimental system for TiO2 NP or NW, CeO2 NP, or Ag NP. However, some solubility was observed for Ag NW in the dynamic system, as opposed to no detectable dissolution under static conditions. Far higher solubility of around 50% and above was evident for the Ni- and Cu-based materials. Here, both Ni NP and Ni NW, as well as CuO NP, showed higher dissolution fractions in the dynamic system, while the opposite was observed for Cu NP and Cu NW. Nevertheless, in all cases except for the insoluble TiO2, CeO2, and Ag NP, solubility was strongly enhanced under acidic conditions, as evident from both experimental approaches. These differences in nanomaterial dissolution were also reflected in the structural transformation determined by TEM in the dynamic approach.

Additionally, the oxidative potentials of NP and NW and their free ions were evaluated by utilizing the FRAS assay, which measures biological oxidant damage in serum [15]. Here, NP and NW based on the same metals (Cu NP/NW, Ni NP/NW, TiO2 NP/NW) demonstrated similar reactivity. However, the difference between the metals was more significant. While all Cu-based materials (NP/NW and CuO NP) exerted very high reactivity at low concentrations around and above 0.1 g/L, about 100-fold higher concentrations were required in case of TiO2 NP/NW, Ni NP/NW, and CeO2 NP to exert some but still low reactivity. With regard to the respective NP, the results confirm those obtained previously [20,39]. No such studies have been conducted for the NW analyzed within this study.

Since the respiratory tract is a complex system consisting of different cell types, the cytotoxicity and bioavailability of the nanomaterials in four different cell lines were investigated, all being relevant for the respiratory tract. Thus, three epithelial cell lines of human (A549, Beas-2B) or rat (RLE-6TN) origin, as well as a cell line with macrophage-like properties (differentiated THP-1) were applied. To assess the cytotoxicity, two parameters were chosen, namely RCC and ATP content, which were determined in all four cell lines. Based on the outcome, bioavailability studies were conducted at low, mid, and high cytotoxic doses of the respective materials, as stated in Table S2. To distinguish intracellular bioavailability from particles potentially stuck to the outer cell membrane, and to further discriminate between cytoplasmic and nuclear metal ion concentrations, two different fractionation protocols were applied as published previously [15,17]. Briefly, to assess bioavailability in whole cells, the cell membrane with potential material residues was separated by cell lysis followed by a centrifugation step. Besides the bioavailability in the whole cell, metal-ion concentrations in the cytoplasm and the nucleus were investigated. Here, cells were separated into the soluble fractions of cytoplasm and nucleus. In both approaches, metal-ion concentration was determined by atomic absorption spectrometry or ICP-MS afterwards.

In general, with the notable exception of Ag-based materials, both cytotoxicity and bioavailability reflected the acellular dissolution rates in physiological lysosomal media (pH 4.5), since materials that exhibited acellular dissolution also showed a dose-dependent cytotoxicity and bioavailability within all tested cell lines. Here, highly elevated concentrations were seen in the cytoplasm and the nucleus; particularly high concentrations in the nucleus were found in the case of Cu- and Ag-based materials, reaching millimolar concentrations.

TiO2 and CeO2 NP, which were insoluble in acellular lysosomal fluid, also showed neither cytotoxicity nor intracellular metal ion release. This is in agreement with a previous study showing that resorbed TiO2 NP remained within the phagosomes of the cells without measurable ion release in the cytoplasm and caused no cellular toxicity [40].

A good correlation between solubility in artificial lysosomal fluids, cytotoxicity, and intracellular bioavailability was also evident for Cu- and Ni-based materials, with some cell line-dependent differences. For CuO NP, a pronounced and dose-dependent bioavailability was seen in the Beas-2B cells, followed by A549 and dTHP-1 cells. The correlation between solubility at a low pH and intracellular bioavailability was already described for CuO NP in two previous studies, in which the dissolution in ALF was compared with the intracellular bioavailability in A549 and Beas-2B cells [15,17]. Interestingly, the bioavailability of Cu NP in the Beas-2B cells was much lower than in the A549 cells. Comparing Cu NP and Cu NW at the same doses, a higher bioavailability was seen for Cu NW; however, the toxic effects of Cu NP and Cu NW were comparable. Additionally, high concentrations of ions released from Cu NW were found in the nuclei. The higher Cu ion concentration within nuclei after exposure to Cu NW compared to Cu NP may result in a higher genotoxic potential of the fiber-shaped material due to the redox potential of Cu ions. However, this hypothesis needs to be further evaluated in subsequent studies.

After incubation with Ni NP, a dose-dependent intracellular nickel ion release was evident in all cell lines, with somewhat less pronounced uptake in Beas-2B cells. This is in agreement with results presented previously by Capasso and colleagues, who demonstrated that the uptake of NiO NP in A549 cells is mainly endocytosis-related, while there was no evidence for endocytotic uptake of NiO NP in Beas-2B cells [41]. The same tendency was seen in the case of Ni NW, with lower levels of liberated metal ions in all cell lines, possibly due to the branched structure of the fibers. Recent studies have already shown that Ni NW are taken up by different cell types, such as fibroblasts [42], colon cancer cells [43], and macrophages [44], causing different toxic effects. However, this study offers a quantitative comparison of the bioavailability of Ni NP and NW in different cell lines, which has not been published to this extent. Interestingly, Ni NW exhibited a higher bioavailability in dTHP-1 cells when compared to the epithelial cell lines. This indicates that macrophages rapidly start to take up nanowires via phagocytosis, which has already been observed in vivo [45]. Although high concentrations of Ni ions were also found in the nucleus of all cells after incubation with the Ni materials, the intracellularly released Ni ions mainly remain in the cytoplasm of all cell lines. This observation supports results reported by Schwerdtle and colleagues, who investigated the impact of NiO MP in A549 cells [46].

One very interesting example of a discrepancy between the bioavailability observed in cells and the solubility suggested by acellular studies is the case of the Ag-based materials. While neither Ag NP nor Ag NW showed considerable ion release, even in acidic artificial lysosomal media, both Ag NP and Ag NW revealed an intracellular bioavailability at all applied concentrations. The observed intracellular bioavailability is in agreement with recent studies [24,47]. Intracellular metal ion release was highest in dTHP-1 cells, with even higher metal ion concentrations in the nucleus when compared to the cytoplasm. The discrepancy between the acellular solubility and intracellular bioavailability appears to be unique for Ag-based materials and may be explained by the fact that silver forms insoluble secondary structures due to its affinity to S- and Cl-groups [48]. These secondary structures are not quantifiable by the static solubility approach used in this study, and may not be fully quantifiable with the dynamic approach either, even though some solubility was observed in the latter test system. Thus, it cannot be excluded that Ag ions were also released in the acellular studies but bound rapidly to buffer components, resulting in the formation and precipitation of these insoluble secondary particles. Within the cell, however, Ag ions may be released, leading to a dynamic equilibrium between cellular reactants, and being quantifiable within the soluble fractions of the respective compartments.

#### **5. Conclusions**

While only minor differences were seen for acellular dissolution and abiotic oxidative reactivity detected by the FRAS assay when comparing NP and NW of the same metal, their reactivity and dissolution are mostly driven by the respective metal under investigation. High solubility in acidic fluids, as models for the lysosomal environment, and pronounced reactivity were seen for Cu-based particulate and fibrous materials. Similarly, high solubility but moderate reactivity was seen for Ni NP and NW. Interestingly, in the case of Ag, no dissolution in acellular fluids was observed, probably due to the formation of insoluble secondary structures; however, an intermediate oxidative reactivity was seen for the Ag NP. CeO2- and TiO2-based materials exhibited no acellular dissolution and no oxidative reactivity. The dissolution behavior of the metal-based nanomaterials was strongly reflected in cellular toxicity and intracellular bioavailability. Thus, CeO2- and TiO2-based materials showed neither cytotoxicity nor intracellular bioavailability in any cell line, while the bioavailability seen for the soluble materials also correlated with the cytotoxicity of these materials. Cytotoxic effects appear to be due to intracellularly dissolved metal ions followed by a metal ion overload, and not due to nanomaterial-cell interactions. This is in line with the proposed Trojan-horse type mechanism [13,49]. An interesting exception was seen in the case of Ag-based materials; here, the acellular dissolution was not predictive of their cellular toxicity and bioavailability. This may be due to secondary particles formed after the dissolution of the nanomaterials, which likely precipitate in acellular systems and thus remain undetectable in the soluble fraction, but which may add to the soluble and thus bioavailable fraction in the cellular system. Concerning the different cell lines applied, differences in toxicity and bioavailability were metal-dependent, with no common pattern across the metals. In the case of Ni NW and Ag NW, a comparatively high bioavailability was seen in THP-1 cells with macrophage-like properties, supporting their higher proficiency for phagocytotic uptake.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/nano12010147/s1. Table S1: Parameters used in the DG-model simulations; Table S2: Dose selection for in vitro studies; Table S3: Comparison of freshly prepared and thawed particle dispersions with regard to Z-Average and deposited dose fraction calculated using the DG model; Figure S1: Size distribution of freshly prepared and thawed particle dispersions at concentrations of 100 μg/mL in supplemented RPMI-1640; Figure S2: Impact of freshly prepared and thawed particle dispersions at concentrations of 100 μg/mL in supplemented RPMI-1640 on the deposited dose fraction applying the DG model; Figure S3: FRAS-Testing; Figure S4: Impact of nanomaterials on the relative cell count of A549, Beas-2B, RLE-6TN and dTHP-1 cells after incubation with three different doses; Figure S5: Dissolution in cell culture media.

**Author Contributions:** Conceptualization, J.W., D.A.S., W.W., U.H. and A.H.; methodology, J.W., D.A.S., R.N., M.L., F.S. (Feranika Schworm), P.S., U.H., F.S. (Florian Schulz), M.H., W.W. and A.H.; validation, J.W., D.A.S., R.N., M.L., F.S. (Feranika Schworm), P.S., F.S. (Florian Schulz), M.H., W.W. and A.H.; formal analysis, J.W., D.A.S., R.N., M.L., F.S. (Feranika Schworm), P.S., F.S. (Florian Schulz), M.H., W.W. and A.H.; investigation, J.W., D.A.S., M.L., F.S. (Feranika Schworm), R.N. and M.H.; resources, A.H. and W.W.; data curation, J.W., D.A.S., W.W. and A.H.; writing—original draft preparation, J.W. and D.A.S.; writing—review and editing, J.W., D.A.S., R.N., M.L., F.S. (Feranika Schworm), P.S., U.H., F.S. (Florian Schulz), M.H., W.W. and A.H.; visualization, J.W., D.A.S., R.N., M.L., F.S. (Feranika Schworm), P.S., F.S. (Florian Schulz), M.H., W.W. and A.H.; supervision, A.H., W.W., M.H. and P.S.; project administration, A.H.; funding acquisition, A.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the German Federal Ministry of Education and Research (BMBF), grant number 03XP0211 (MetalSafety).

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the first (J.W., D.A.S.) and corresponding author (A.H.) for researchers of academic institutes who meet the criteria for access to the confidential data.

**Acknowledgments:** We would like to thank Roel Schins (IUF, Düsseldorf, Germany) and Richard Gminski (University of Freiburg, Freiburg, Germany) for providing A549 cells and THP-1 cells. Furthermore, we would like to thank the Laboratories for Electron Microscopy at KIT and BASF, especially Heike Störmer and Volker Zibat, for taking the presented SEM images, and Philipp Müller and Thorsten Wieczorek for TEM images, and the Institute of Applied Geosciences at KIT, particularly Elisabeth Eiche, for performing the ICP-MS measurements. Lastly, we would like to thank Julian Oppler and Timur Okkali for their technical support.

**Conflicts of Interest:** The authors declare no conflict of interest.



## *Article* **Agglomeration State of Titanium-Dioxide (TiO2) Nanomaterials Influences the Dose Deposition and Cytotoxic Responses in Human Bronchial Epithelial Cells at the Air-Liquid Interface**

**Sivakumar Murugadoss 1,\*,†, Sonja Mülhopt 2,\*,†, Silvia Diabaté 3, Manosij Ghosh 1, Hanns-Rudolf Paur 2, Dieter Stapf 2, Carsten Weiss 3,† and Peter H. Hoet 1,†**


**Abstract:** Extensive production and use of nanomaterials (NMs), such as titanium dioxide (TiO2), raises concern regarding their potential adverse effects on humans. While considerable efforts have been made to assess the safety of TiO2 NMs using in vitro and in vivo studies, results obtained to date are unreliable, possibly due to the dynamic agglomeration behavior of TiO2 NMs. Moreover, agglomerates are of prime importance in occupational exposure scenarios, but their toxicological relevance remains poorly understood. Therefore, the aim of this study was to investigate the potential pulmonary effects induced by TiO2 agglomerates of different sizes at the air–liquid interface (ALI), which is more realistic in terms of inhalation exposure, and to compare them to results previously obtained under submerged conditions. A nano-TiO2 (17 nm) and a non-nano TiO2 (117 nm) were selected for this study. Stable stock dispersions of small agglomerates and their respective larger counterparts were prepared for each TiO2 particle type, and human bronchial epithelial (HBE) cells were exposed to different doses of aerosolized TiO2 agglomerates at the ALI. At the end of the 4 h exposure, cytotoxicity, glutathione depletion, and DNA damage were evaluated. Our results indicate that dose deposition and the toxic potential in HBE cells are influenced by agglomeration, and that exposure via the ALI induces different cellular responses than in submerged systems. We conclude that the agglomeration state is crucial in the assessment of pulmonary effects of NMs.

**Keywords:** nanomaterials; titanium dioxide; agglomerates; air-liquid interface; pulmonary toxicity

#### **1. Introduction**

Nanotechnology is ubiquitous, brings novel advancements in all aspects of human life on a daily basis, and has a wide variety of applications, such as in consumer goods, electronics, communication, environmental treatments and remediations, agriculture, nanomedicine, water purification, textiles, aerospace industry, and efficient energy sources, among many others. The field of nanotechnology is one of the fastest expanding markets in the world and its global value is expected to exceed the USD 125 billion mark by 2024 [1].

Nanomaterials (NMs) are generally defined as a material with at least one dimension in the nanoscale (1–100 nm) range [2]. While NMs are abundant in nature and produced by various sources, such as forest fires and volcanic eruptions, they are also intentionally manufactured by nanotechnologies on a global scale for industrial and commercial purposes. EU recommended a definition for NM solely for regulatory purpose, which states that "natural, incidental or manufactured material containing particles, in an unbound state or as an aggregate or as an agglomerate and where, for 50% or more of the particles in the number size distribution, one or more external dimensions is in the size range 1 nm–100 nm" [3].

**Citation:** Murugadoss, S.; Mülhopt, S.; Diabaté, S.; Ghosh, M.; Paur, H.-R.; Stapf, D.; Weiss, C.; Hoet, P.H. Agglomeration State of Titanium-Dioxide (TiO2) Nanomaterials Influences the Dose Deposition and Cytotoxic Responses in Human Bronchial Epithelial Cells at the Air-Liquid Interface. *Nanomaterials* **2021**, *11*, 3226. https://doi.org/10.3390/nano11123226

Academic Editor: David M. Brown

Received: 28 October 2021 Accepted: 25 November 2021 Published: 27 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Among the manufactured NMs, titanium dioxide (TiO2) is one of the most widely used in commercial applications, and approximately four million tons of TiO2 are produced annually worldwide [4–6]. Commercial TiO2 NMs come in different crystalline forms such as anatase and rutile. As TiO2 NMs reflect UV light, they are widely used in cosmetics and in paints as a UV filter [5] as well as in plastics [7]. TiO2 NMs are also extensively used as a food colourant (food additive E171) [8]. Due to their light-dependent properties, TiO2 NMs are being studied for potential medical and biomedical applications such as antibacterial activity, biosensing, drug delivery, and implant applications [9,10]. The production of TiO2 NMs is expected to expand continuously due to their potential in the energy sector and in environmental applications [11]. This clearly indicates a potential for human exposure, particularly by inhalation, which is the major route of exposure to TiO2 NMs in occupational settings, and raises concerns about their safety and adverse pulmonary effects [12].

Toxicological evaluations of TiO2 NMs are often performed using in vivo models such as mice and rats. Short- and long-term exposure to TiO2 NMs via inhalation induced pulmonary inflammation, fibrosis and tumours [6,13–15]. A significant increase in cytotoxicity, inflammation, oxidative stress, and DNA damage was observed in mice exposed to high doses (10 mg/kg [16] and ~4 mg/kg [17]) of TiO2 NMs. The studied endpoints are major key events identified to play essential roles in fibrosis and tumour development [14,18].

In vitro models are often employed as a first screening method, to unveil the mechanisms involved in the induction of adverse effects, and to prioritize NMs for further animal testing. Traditionally, submerged in vitro cell cultures are widely used to assess the adverse effects of NMs with a particular focus on the production of reactive oxygen species, which can be generated especially in the case of TiO2 NMs [19]. Submerged exposure to TiO2 NMs induced cytotoxicity, oxidative stress, pro-inflammatory responses, and genotoxicity in lung-derived immortalized cell lines [6]. In submerged exposure systems, the cells are covered with culture media to which NMs are added. The biomolecules present in the culture media can adsorb to the surface of the NMs to form a protein corona [20,21]. Such changes to the surface can potentially prevent the adverse effects of NMs [22] and affect the physicochemical properties relevant for toxicological assessment (size, surface area, surface composition, surface charge, agglomeration, etc.) [23] as well as the effective density [24], an important parameter that determines the sedimentation of NMs. However, these modifications of NMs in cell culture medium often do not reflect the conditions upon inhalation in real-life situations.

Recently, exposure at the air–liquid interface (ALI) has been evolving as a potential alternative to conventional submerged in vitro exposure systems. At the ALI, cells grown on transwell plates are directly exposed to aerosolized particles and gases, which better reflects the exposure in vivo via inhalation [25–28]. Previously, we have developed, validated, and used a fully integrated ALI exposure system for the assessment of toxicological effects of various NMs and aerosols [29–33].

It is well known that the physicochemical properties of NMs influence their toxicity [34]. Among all the properties, the influence of agglomeration on the toxicity of NMs is less well studied and poorly understood. In our previous study, we assessed the influence of TiO2 NM agglomeration on (cyto)toxicity and biological responses [35] using a human bronchial epithelial (HBE) cell line. However, the entire study was carried out in submerged exposure conditions. While there is only a limited number of toxicological investigations addressing adverse effects of TiO2 NMs using ALI exposure systems [31,36,37], the impact of agglomeration has not been researched. Here, we prepared TiO2 NM agglomerates of different sizes and performed toxicological studies employing ALI exposure. The aim of the present work was to investigate the cytotoxicity and biological responses in HBE cells after ALI exposure to different doses of TiO2 agglomerates of different sizes and compare the results to those previously obtained under submerged conditions [35].

#### **2. Materials and Methods**

#### *2.1. Preparation of Dispersions and Characterization of TiO2 NMs*

Two TiO2 NMs (representative test materials) of different primary sizes were kindly provided by the European Commission's Joint Research Centre (JRC, Ispra, Italy). The mean primary size was determined as 17 nm for TiO2-JRCNM10202a and as 117 nm for TiO2-JRCNM10200a. Therefore, the two NMs are referred to as 17 nm and 117 nm sized TiO2 in the text. Both TiO2 NMs are pristine and anatase in nature. A detailed physicochemical characterization of these NMs is provided in a previously published JRC report [38].

Detailed information on the development of the dispersion protocol to obtain two different agglomeration states (small and large agglomerates) of both TiO2 NMs has been published elsewhere [17,35]. Briefly, to obtain agglomerates of different sizes, particles were dispersed at different pH values (2 and 7), and the dispersions were probe sonicated (7056 J) and stabilized with 1% bovine serum albumin (BSA). After stabilization, the suspensions at pH 2 were readjusted to pH 7–7.5 by slowly adding sodium hydroxide solution (NaOH). The original dispersion protocol was developed to prepare 10 mL of stock dispersion for submerged exposure; for ALI exposure, the quantity was scaled up to 120 mL to provide sufficient dispersion for the aerosolization of TiO2 agglomerates during the 4 h exposure period. Each dispersion was freshly prepared before each exposure. Table 1 shows the nomenclature of the different dispersions.

**Table 1.** Nomenclature of TiO2 agglomerate dispersions.


#### *2.2. Cell Culture Maintenance*

The human bronchial epithelial cell line (16HBE14o- or HBE) was kindly provided by Dr. Gruenert (University of California, San Francisco, CA, USA). HBE cells were cultured in DMEM/F12 supplemented with 5% fetal bovine serum (FBS), 1% penicillin-streptomycin (P-S) (100 U/mL), 1% L-glutamine (2 mM) and 1% fungizone (2.5 μg/mL). All cell culture supplements were purchased from Invitrogen (Merelbeke, Belgium) unless otherwise stated. Cells were cultured in T75 flasks at 37 °C in 100% humidified air containing 5% CO2. Medium was changed every 2 or 3 days and cells were passaged every week (7 days). Cells from passage 4 to 8 were used for the experiments.

#### *2.3. Air–Liquid Interface Exposure*

For ALI exposure, 3.5 × 10<sup>5</sup> HBE cells/mL were seeded on the apical side of a 6 well transwell plate (Corning Costar Transwell insert membranes type 3450, culture area 4.67 cm2, pore size 0.4 μm, cat no 10619141, Fisher Scientific, Schwerte, Germany) with 1.5 mL of cell culture medium on the basolateral side and incubated overnight at 5% CO2 and 37 °C. Before ALI exposure, the apical and the basolateral media were removed. Then, cell culture medium was added into the basolateral compartment and the apical side was left uncovered (no medium). Uncovered cells were exposed for 4 h to "clean air" (humidified synthetic air as negative control) or, in parallel, to airborne TiO2 agglomerates at low dose without electrostatic deposition and at different levels of electrostatic deposition (400 V, 800 V, and 1200 V) to facilitate dose–response evaluation of different biological endpoints. After exposure, the medium at the basal side of the transwell inserts was collected for LDH analysis.

#### *2.4. Aerosol Generation and Characterization*

Figure 1 shows the layout of the ALI exposure system. For TiO2 aerosol generation, a setup according to the VDI guideline 3491 (Technical Division Environmental Measurement Technologies, 2016) was used. The TiO2 dispersions, continuously stirred during the experiment, were sprayed into a drying reactor with a silica gel fill along the walls using a two-substance nozzle (model 970, Düsen-Schlick GmbH, Untersiemau/Coburg, Germany). The dry TiO2 aerosol was regularly characterized for the number size distribution in the drying reactor using a Scanning Mobility Particle Sizer SMPS + C (Grimm Aerosol GmbH, Ainring, Germany) and directed to the ALI exposure system, as described by Mülhopt et al. [30]. In the conditioning reactor of the ALI exposure system, the TiO2 aerosol is tempered to 37 °C and humidified to 85% relative humidity, and then sampling streams are directed to the single exposure chambers containing the cell cultures using an exposure flow rate of 100 mL/min. To describe the aerosol state as exposed to the cell cultures, the number size distribution was also measured by a Scanning Mobility Particle Sizer U-SMPS (Palas GmbH, Karlsruhe, Germany) sampling from the aerosol conditioning reactor. A scan was performed every 5 min; means and standard deviations were calculated in each channel from all scans of an experiment and corrected for sampling losses according to Asbach et al. [39].
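The per-channel averaging of repeated SMPS scans described above can be sketched as follows. This is a minimal illustration, not the instrument software: the function name `channel_statistics` and the scan values are hypothetical, and the sampling-loss correction according to Asbach et al. [39] is deliberately not reproduced here.

```python
from statistics import mean, stdev


def channel_statistics(scans):
    """Return a (mean, SD) pair per size channel across repeated scans.

    scans: list of scans; each scan is a list of number concentrations,
    one value per size channel, in the same channel order.
    """
    if len(scans) < 2:
        raise ValueError("at least two scans are needed for a standard deviation")
    n_channels = len(scans[0])
    if any(len(scan) != n_channels for scan in scans):
        raise ValueError("all scans must cover the same size channels")
    # zip(*scans) regroups the data channel by channel
    return [(mean(channel), stdev(channel)) for channel in zip(*scans)]


# Hypothetical number concentrations for three scans and three size channels
scans = [
    [100.0, 250.0, 80.0],
    [110.0, 240.0, 90.0],
    [90.0, 260.0, 70.0],
]
stats = channel_statistics(scans)
```

The sample standard deviation (n − 1 in the denominator) is used here as an assumption; the text does not state which estimator was applied.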

**Figure 1.** Experimental setup: generation of airborne TiO2 agglomerates in dry air according to VDI guideline 3491 and exposure of human lung cells in the Air–Liquid-Interface Exposure System with accompanying measurement of particle size distribution using Scanning Mobility Particle Sizer (SMPS) at the dry stage in the reactor as well as for the aerosol inside the exposure system.

#### *2.5. Determination of the Deposited Dose*

The deposited cell culture surface dose is reflected by the deposited fraction of the TiO2 agglomerates exposed as aerosol towards the cells and is not easy to determine. For this reason, three different methods were applied: the online monitoring of mass dose using the quartz crystal microbalance QCM (Vitrocell Systems GmbH, Waldkirch, Germany) [29], the image analysis of exposed TEM grids as presented in [40] and the calculation from the SMPS measured number size distribution as shown in [30]. The effective densities of all TiO2 agglomerates were used as determined and reported in [35], and did not differ much between the dispersions (17 nm-SA: 1.55 g/cm3; 17 nm-LA: 1.48 g/cm3; 117 nm-SA: 1.78 g/cm3; and 117 nm-LA: 1.78 g/cm3).
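To illustrate how a mass concentration can be derived from a number size distribution and an effective density, the following sketch treats each agglomerate as a sphere of the measured mobility diameter. This is an assumption made for illustration and not necessarily the exact procedure of [30]. The size bins, the deposited fraction and all function names are hypothetical; only the effective density (1.55 g/cm3 for 17 nm-SA), the flow rate (100 mL/min), the exposure time (4 h) and the culture area (4.67 cm2) are taken from the text.

```python
import math


def particle_mass_g(diameter_nm, effective_density_g_cm3):
    """Mass in g of one sphere-equivalent agglomerate."""
    diameter_cm = diameter_nm * 1e-7  # 1 nm = 1e-7 cm
    volume_cm3 = math.pi / 6.0 * diameter_cm ** 3
    return effective_density_g_cm3 * volume_cm3


def mass_concentration_ug_m3(size_bins, effective_density_g_cm3):
    """Sum the mass over (diameter_nm, number_per_cm3) bins, in ug/m3."""
    grams_per_cm3 = sum(
        number * particle_mass_g(diameter, effective_density_g_cm3)
        for diameter, number in size_bins
    )
    return grams_per_cm3 * 1e12  # g -> ug (1e6) and cm3 -> m3 (1e6)


def deposited_dose_ug_cm2(mass_conc_ug_m3, flow_ml_min, minutes,
                          deposited_fraction, area_cm2):
    """Surface dose if a given fraction of the delivered mass deposits."""
    delivered_ug = mass_conc_ug_m3 * (flow_ml_min * 1e-6) * minutes  # 1 mL = 1e-6 m3
    return delivered_ug * deposited_fraction / area_cm2


# Hypothetical size bins: (diameter in nm, number concentration per cm3)
bins = [(50.0, 2.0e4), (100.0, 1.0e5), (200.0, 1.0e4)]
conc = mass_concentration_ug_m3(bins, effective_density_g_cm3=1.55)
# deposited_fraction=0.02 is a placeholder, not a measured value
dose = deposited_dose_ug_cm2(conc, flow_ml_min=100.0, minutes=240.0,
                             deposited_fraction=0.02, area_cm2=4.67)
```

Because mass scales with the cube of the diameter, the few largest bins dominate the result, which is one reason why agglomerates outside the SMPS size range matter for a mass-based dose.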

#### *2.6. Metabolic Activity*

Metabolic activity was evaluated as a measure of cell viability using the WST-1 assay (Merck, Overijse, Belgium). At the end of ALI exposure, cells were washed with HBSS and incubated with 500 μL of WST-1 reagent (diluted in HBSS at a ratio of 1:10) for 45 min. At the end of incubation, 100 μL was transferred to a 96 well plate and the optical density (OD) was recorded at 450 nm. Blank OD values were subtracted from sample OD values and results were expressed as percentage of negative control cells. Cells exposed to clean air were treated as negative control and Triton X-100 (0.1%) lysed cells were treated as positive control (data not shown).

#### *2.7. Membrane Integrity*

Lactate dehydrogenase (LDH) activity in the cell culture supernatant was measured as an indicator of membrane damage. Briefly, 100 μL of cell culture medium collected at the basal side at the end of ALI exposure was transferred to a 96 well plate and incubated with LDH mixture (prepared as indicated in the manufacturer's protocol, Sigma-Aldrich, Taufkirchen, Germany, cat no 11644793001) and the optical density (OD) was recorded at 490 nm. Blank OD values were subtracted from sample OD values and results were expressed as percentage of Triton X-100 (0.1%) lysed cells. Cells exposed to clean air were treated as negative control.
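The blank correction and normalization used for the WST-1 and LDH readouts can be sketched with one helper. The function name and all OD values below are hypothetical. The reference is the clean air control for metabolic activity and the Triton X-100 lysed cells for LDH, as stated in Sections 2.6 and 2.7.

```python
def percent_of_reference(sample_od, reference_od, blank_od):
    """Blank-correct both readings and express the sample as % of reference."""
    corrected_reference = reference_od - blank_od
    if corrected_reference <= 0:
        raise ValueError("reference OD must exceed the blank OD")
    return 100.0 * (sample_od - blank_od) / corrected_reference


# Hypothetical OD values at 450 nm (WST-1): reference = clean air control
metabolic_activity = percent_of_reference(sample_od=0.90, reference_od=1.10, blank_od=0.10)

# Hypothetical OD values at 490 nm (LDH): reference = Triton X-100 lysed cells
ldh_release = percent_of_reference(sample_od=0.25, reference_od=1.60, blank_od=0.10)
```

With these example numbers, the exposed culture retains about 80% of the control metabolic activity and releases about 10% of the maximal LDH signal.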

#### *2.8. Total Glutathione Measurements*

Reduced glutathione (GSH) depletion was measured as an indicator of oxidative stress induction. Briefly, exposed cells were scraped, transferred into Eppendorf tubes and centrifuged at 150× *g* for 5 min. Then, the supernatants were discarded and cells were resuspended in 1 mL of phosphate-buffered saline (PBS). After centrifugation, PBS was removed and 450 μL of 10 mM hydrochloric acid (HCl) was added to each tube. Cell lysis was performed by freeze-thawing (two cycles of 15 min freezing and 15 min thawing), and protein content analysis (BCA assay) was performed immediately afterwards using 10 μL of the cell lysate. Then, the lysate was resuspended in 6.5% 5-sulfosalicylic acid (SSA), incubated on ice for 10 min and centrifuged at 20,800× *g* (14,000 rpm) for 10 min at 4 °C to precipitate the proteins. The supernatants were stored at −80 °C for later GSH determination. GSH was measured using a glutathione detection kit (Enzo Life Sciences, Brussels, Belgium). Cells exposed to clean air were treated as negative control.

#### *2.9. DNA Damage*

Briefly, at the end of the ALI exposure, the cells were detached with trypsin, centrifuged at 250× *g* for 5 min, suspended in the storage buffer, composed of sucrose 85.5 g/L, dimethyl sulfoxide (DMSO) 50 mL/L prepared in citrate buffer (11.8 g/L), pH 7.6, and immediately frozen at −80 °C. DNA strand breaks were measured using the alkaline comet assay kit (Trevigen, cat no 4250-050-K, Gaithersburg, MD, USA) according to the manufacturer's protocol. Fifty cells per gel were measured. Cells exposed to clean air were treated as negative control and cells treated with methyl methanesulfonate (MMS; Merck, Overijse, Belgium; 100 μM for 1 h) served as positive control (data not shown). Results were expressed as percentage of DNA in the tail. NMs can interfere with the comet assay [41]. To evaluate this, we mixed TiO2 NMs with clean air exposed cells (negative control) and cells treated with MMS (positive control), performed the comet assay, and compared the results with negative and positive controls prepared without TiO2 NMs. The results indicated that the TiO2 NMs at high concentrations (100 μg/mL) did not interfere with the assay.

#### *2.10. Statistical Analysis*

Two or three independent experiments were performed with six replicates each, and data are presented as mean ± standard deviation (SD). Using GraphPad Prism 7.04 for Windows (GraphPad Software, La Jolla, CA, USA; www.graphpad.com, accessed on 24 November 2021), the results were analysed with one-way ANOVA followed by Dunnett's multiple comparison test to determine the significance of differences compared with control.
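For readers who want to see what the one-way ANOVA computes, the sketch below derives the F statistic and the Dunnett-type t statistics (each dose group against the control, using the pooled within-group variance) with the Python standard library only. It is an illustration under stated assumptions, not a reproduction of the GraphPad Prism analysis: the replicate values are hypothetical, and the p-values, which require the F distribution and Dunnett's critical values, would have to come from a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, mean square error, df_between, df_within)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, mse, df_between, df_within


def dunnett_t(control, treated, mse):
    """t statistic of one treated group against control with pooled variance."""
    standard_error = (mse * (1.0 / len(treated) + 1.0 / len(control))) ** 0.5
    return (mean(treated) - mean(control)) / standard_error


# Hypothetical readouts (% of control) for clean air and two deposited doses
control = [100.0, 102.0, 98.0]
low_dose = [99.0, 101.0, 100.0]
high_dose = [80.0, 82.0, 78.0]

f_stat, mse, df_b, df_w = one_way_anova([control, low_dose, high_dose])
t_low = dunnett_t(control, low_dose, mse)
t_high = dunnett_t(control, high_dose, mse)
```

In this toy data set the low dose is indistinguishable from the control (t = 0), whereas the high dose gives a large negative t statistic, which is the pattern a Dunnett comparison against control is designed to detect.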

#### **3. Results**

#### *3.1. Size Characterization in Stock Suspensions*

We obtained four agglomerate dispersions from two TiO2 NMs of different sizes. Detailed information on the physicochemical characterization and the methods used to characterize the agglomerates in stock dispersions has been published elsewhere [17,35]. Using a standardized TEM technique in our previous study [42], we measured the size of several thousand agglomerates in each dispersion. The TEM-based determination of the diameter (median minimum Feret diameter) indicated that the size of 17 nm sized TiO2 NMs in their least agglomerated form (indicated as 17 nm-SA) was 33 nm, while it was 120 nm for the strongly agglomerated condition (17 nm-LA). The sizes of small (117 nm-SA) and large agglomerates (117 nm-LA) of 117 nm sized TiO2 NMs were 148 and 309 nm, respectively (see Figure S1 and Table S1). In summary, agglomeration of the TiO2 NMs was modest at low pH, whereas at neutral pH agglomeration was strong for the small (17 nm) TiO2 NM and less pronounced for the larger (117 nm) TiO2 NM [17,35].

#### *3.2. Aerosol Characterization and Determination of Deposited Dose*

The particle number size distributions (Figure 2) showed nearly the same characteristics for 17 nm-SA and 117 nm-LA with modal values xM of 72 and 71 nm, respectively. The other two aerosols, 17 nm-LA and 117 nm-SA, were also nearly identical, with modal values of xM = 144 and 139 nm, respectively. All particle number size distributions have a typical geometric standard deviation σgeo of about 2. These results show a similar trend to our previously reported TEM sizes (Supplementary Material Figure S1, [35]) for the different agglomerates in the stock solutions except for 117 nm-LA. The difference between the SMPS data and the TEM data from the stock solutions for the 117 nm-LA agglomerates may be caused by the aerosol processing, as the aerosol is characterized with the SMPS under nearly dry conditions, whereas for submerged exposure and subsequent TEM analysis, the water content and media components might increase the size of agglomerates. In addition, the SMPS measurements only cover the range of 10 to 800 nm and neglect possible larger agglomerates.

Table 2 summarizes all measurements of the TiO2 aerosol characteristics. The mass concentration is calculated from the number size distribution of the SMPS measurements. The QCM was operated without electrostatic deposition, delivering the diffusional doses as listed. Image evaluation of exposed TEM grids provides deposited surface doses for the 17 nm-LA and 117 nm-SA, both corresponding very well with the QCM data in the case of diffusional deposition (Supplementary Material, Figure S2). The enhanced doses for electrostatically deposited agglomerates were evaluated for the 17 nm-LA and 117 nm-SA from the exposed TEM grids. For these two types of agglomerates, the dose enhancement factors were determined. In the case of 117 nm-SA, deposition enhancement factors of 5 for 400 V, 12 for 800 V, and 13 for 1200 V were calculated. Similarly, in the case of 17 nm-LA, deposition enhancement factors of 4 for 400 V, 9 for 800 V, and 9 for 1200 V were derived. For both types of agglomerates, the relative increase in deposition diminishes as the electrostatic field strength increases. This saturation behavior has also been shown earlier [40] and occurs when all charged agglomerates are deposited. In the TEM images of 17 nm-SA and 117 nm-LA, individual particles could not be unambiguously identified due to a strong background signal (Supplementary Material, Figure S2). For these two dispersions, the doses listed in Table 2 were therefore calculated on the basis of the QCM data (measured at 0 V) multiplied by the enhancement factors derived at the different voltages for the corresponding particle types 17 nm-LA and 117 nm-SA.
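The dose calculation described above (QCM dose at 0 V multiplied by a voltage-specific enhancement factor) can be sketched as follows. The enhancement factors are those reported in the text. The assignment of factors to the two dispersions without usable TEM images follows our reading of the footnotes to Table 2 (17 nm-LA factors for 17 nm-SA, 117 nm-SA factors for 117 nm-LA), and the QCM dose of 2.0 μg/cm2 is a hypothetical placeholder.

```python
# Enhancement factors per voltage (V), as reported for the two dispersions
# that could be evaluated on TEM grids
ENHANCEMENT = {
    "17 nm-LA": {400: 4, 800: 9, 1200: 9},
    "117 nm-SA": {400: 5, 800: 12, 1200: 13},
}

# Which measured factor set is applied to which dispersion (assumed reading
# of the Table 2 footnotes)
FACTOR_SOURCE = {
    "17 nm-SA": "17 nm-LA",
    "17 nm-LA": "17 nm-LA",
    "117 nm-SA": "117 nm-SA",
    "117 nm-LA": "117 nm-SA",
}


def enhanced_doses(qcm_dose_0v, dispersion):
    """Deposited dose per voltage = diffusional QCM dose x enhancement factor."""
    factors = ENHANCEMENT[FACTOR_SOURCE[dispersion]]
    return {volts: qcm_dose_0v * factor for volts, factor in factors.items()}


def gain_per_step(dispersion):
    """Ratio of successive factors; values close to 1 indicate saturation."""
    factors = ENHANCEMENT[FACTOR_SOURCE[dispersion]]
    volts = sorted(factors)
    return {(lo, hi): factors[hi] / factors[lo] for lo, hi in zip(volts, volts[1:])}


# Hypothetical diffusional dose of 2.0 ug/cm2 measured by QCM at 0 V
doses = enhanced_doses(2.0, "117 nm-LA")
gains = gain_per_step("17 nm-LA")
```

The step ratios make the saturation visible: for 17 nm-LA the factor more than doubles from 400 V to 800 V but does not change from 800 V to 1200 V.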

**Figure 2.** Particle size distributions of TiO2 agglomerates measured by Scanning Mobility Particle Sizer U-SMPS in the range of 8 to 800 nm. Each curve shows the means of number size distributions in dependence of particle type and agglomeration state. Red circles: 17 nm-SA; black squares: 17 nm-LA; blue triangles: 117 nm-SA; and green inverted triangles: 117 nm-LA.

**Table 2.** Characteristics of TiO2 aerosols and measured or calculated deposited surface doses on cell cultures.


<sup>a</sup> calculated from SMPS data, <sup>b</sup> n.a. = not analyzed as particles could not be clearly identified by TEM analysis, <sup>c</sup> calculated on the basis of QCM data (0 V EF) multiplied with the corresponding factor for enhanced deposition at the different voltages as determined for the 17 nm-LA, <sup>d</sup> calculated on the basis of QCM data (μg/cm2, 0 V EF) multiplied with the corresponding factor for enhanced deposition at the different voltages as determined for the 117 nm-SA.

#### *3.3. Cytotoxicity: Effect on Metabolic Activity and LDH Release*

After 4 h exposure to aerosolized TiO2 agglomerates at the ALI, we measured the effect on cell metabolic activity using the WST-1 assay (Figure 3). A significant loss of metabolic activity (~20%) was observed in cell cultures exposed to the smaller agglomerates of 17 nm sized TiO2 (17 nm-SA) at the highest deposited dose of ~30 μg/cm2 (800 and 1200 V), while their larger counterparts (17 nm-LA) induced a significant increase in metabolic activity (~25%) at the lowest (~1.6 μg/cm2) and at the highest (~16 μg/cm2) deposited doses. Although an increasing trend in the metabolic activity can be seen (Figure 3C,D) compared to their clean air controls, small (117 nm-SA) and large agglomerates (117 nm-LA) of 117 nm sized TiO2 did not affect the metabolic activity significantly even at the highest doses deposited (~24 and 38 μg/cm2, respectively). Subsequently, we measured the LDH activity in the supernatant (basal media) of the cells exposed at the ALI (Figure 4). Compared to clean air exposed controls, a trend of dose-dependent increase in LDH activity was noticed for all TiO2 agglomerates.

#### *3.4. Oxidative Stress and DNA Damage*

We measured GSH depletion as an indicator of oxidative stress induction (Figure 5). We detected a significant and similar decrease of GSH for 17 nm-SA at doses of ~12 and 30 μg/cm2, while a slight but statistically significant increase in glutathione was noticed for 17 nm-LA only at the highest dose (~16 μg/cm2) deposited. There was a non-significant increase of glutathione for both agglomerates of 117 nm sized TiO2 (Figure 5C,D). We assessed the DNA strand breaks as a measure of DNA damage using the alkaline comet assay (Figure 6). An increasing trend in DNA damage was noticed for 17 nm-SA and 17 nm-LA which, however, was not significant compared to unexposed controls (Figure 6A,B). Both SA and LA of 117 nm TiO2 NMs induced a significant increase in DNA damage only at mass doses of ~24 and 38 μg/cm2, respectively (Figure 6C,D).

**Figure 3.** Effect on metabolic activity of HBE cells after 4 h exposure to TiO2 agglomerates at the ALI. 17 nm-SA (**A**), 17 nm-LA (**B**), 117 nm-SA (**C**), and 117 nm-LA (**D**). Data are expressed as means ± SD from three independent experiments with six replicates each. *p* < 0.05 (\*) and *p* < 0.01 (\*\*) represent significant difference compared to control (One-way ANOVA followed by Dunnett's multiple comparison test).

**Figure 4.** LDH activity measured in HBE cell supernatants after 4 h exposure to TiO2 agglomerates at the ALI. 17 nm-SA (**A**), 17 nm-LA (**B**), 117 nm-SA (**C**), and 117 nm-LA (**D**). Data are expressed as means ± SD from three independent experiments with six replicates each. *p* < 0.05 (\*) represents a significant difference compared to control (One-way ANOVA followed by Dunnett's multiple comparison test).

**Figure 5.** Glutathione levels measured in HBE cells after 4 h exposure to TiO2 agglomerates at the ALI. 17 nm-SA (**A**), 17 nm-LA (**B**), 117 nm-SA (**C**), and 117 nm-LA (**D**). Data are expressed as means ± SD from two independent experiments with six replicates each. *p* < 0.05 (\*) represents a significant difference compared to control (One-way ANOVA followed by Dunnett's multiple comparison test).

**Figure 6.** DNA damage measured in HBE cells after 4 h exposure to TiO2 agglomerates at the ALI. 17 nm-SA (**A**), 17 nm-LA (**B**), 117 nm-SA (**C**), and 117 nm-LA (**D**). Data are expressed as means ± SD from two independent experiments with six replicates each. *p* < 0.05 (\*) represents a significant difference compared to control (One-way ANOVA followed by Dunnett's multiple comparison test).

#### *3.5. Summary of Biological Responses*

Although the deposited doses of the various TiO2 agglomerates overlap, for a convenient comparison we considered the lowest observed adverse effect concentration at which a significant effect occurred for each endpoint to determine the toxic potency of the agglomerates (Table 3). An increase in LDH activity and a decrease in glutathione were noticed for 17 nm-SA even at low doses (3.4 and 12.2 μg/cm2, respectively), which could possibly lead to a decrease in metabolic activity at the higher dose (30 μg/cm2), but no such effects were noted for the other agglomerates. These results indicate that the smaller agglomerates of nano-sized TiO2 are more potent in terms of cytotoxicity and oxidative stress induction at the ALI. However, when considering DNA damage at the different deposited doses, agglomerates of non-nano sized TiO2, small agglomerates in particular, appear to be more potent compared to agglomerates of nano-sized TiO2.
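The selection rule behind this comparison can be written down in a few lines: for each endpoint, take the lowest deposited dose at which a significant effect was detected, and report no value if none was significant. The function name and the dose/significance pairs below are hypothetical and serve only to illustrate the rule.

```python
def lowest_significant_dose(dose_response):
    """Return the lowest dose flagged as significant, or None if there is none.

    dose_response: iterable of (dose_ug_cm2, is_significant) pairs.
    """
    significant_doses = [dose for dose, is_significant in dose_response if is_significant]
    return min(significant_doses) if significant_doses else None


# Hypothetical outcome of the post-test for one endpoint and two dispersions
endpoint_a = [(3.0, False), (12.0, True), (30.0, True)]
endpoint_b = [(2.0, False), (10.0, False), (24.0, False)]

loaec_a = lowest_significant_dose(endpoint_a)
loaec_b = lowest_significant_dose(endpoint_b)  # None corresponds to "-" in the table
```

Because the deposited dose levels differ between dispersions, a value obtained this way depends on which doses were tested, which is why the comparison is described as a convenience rather than a potency ranking at matched doses.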

**Table 3.** Significant lowest observed adverse effect concentration of different TiO2 agglomerates observed for different biological endpoints at the ALI. "-" indicates no significant effect could be detected.


Table 4 shows the significant lowest observed adverse effect concentrations determined from our previously published study [35] for different endpoints in HBE cells exposed in submerged conditions. None of the agglomerates induced significant cytotoxic effects at the tested doses, but a significant decrease in glutathione was noticed for all the agglomerates only at the dose of 155 μg/cm2. Large agglomerates of 117 nm sized TiO2 (117 nm-LA) induced DNA damage at the dose of 13 μg/cm2 while the other agglomerates induced such effects at doses ≥ 26 μg/cm2, indicating that the large agglomerates of non-nano sized TiO2 are more potent in terms of DNA damage.

**Table 4.** Significant lowest observed adverse effect concentration of different TiO2 agglomerates observed for different biological endpoints in the submerged exposure system (from our previously published study [35]). "-" indicates no significant effect could be detected.


#### **4. Discussion**

Poor correlation of conventional in vitro and in vivo nanotoxicological exposure studies has been urging the development and validation of models that more closely represent the physiological responses of inhalation exposure. Compared to conventional submerged in vitro systems, air–liquid interface (ALI) exposures are shown to better mimic the inhalation exposure as cell cultures grown at the ALI are exposed to aerosolized particles. However, a deeper understanding of the behavior of NMs in relation to their physicochemical characteristics within the ALI system is essential. In this study, we investigated the influence of TiO2 NM agglomeration on their deposition and cytotoxic potency in the ALI system. Our results indicated that dose deposition and their cytotoxic potential are influenced by TiO2 agglomeration, particularly for nano-sized TiO2.

In the current study, we could deposit mass doses of 1.6–3.4 μg/cm2 in the absence of an electrostatic field (EF), which could be further enhanced to 29.8–38.5 μg/cm2 in the presence of an EF. Hence, in contrast to submerged exposure where the deposited dose of nano- and non-nano-sized NMs varies drastically, at the ALI similar doses independent of particle size could be deposited as agglomerates. This allows a more direct comparison of dose–response relationships without the need of additional modelling as required under submerged conditions. In a previous study, using the same ALI system, 0.17 μg/cm2 and nearly 1.14 μg/cm2 were deposited at 0 and 1000 V, respectively, for the same exposure duration using another non-agglomerated nano TiO2 (NM-105) [31]. This indicates that the type of TiO2 NMs and their agglomeration state influence the dose which is deposited.

We noticed that the smaller agglomerates of nano-sized TiO2 NMs induced significant cytotoxicity and oxidative stress at the ALI at low doses (dose < 13 μg/cm2) while agglomerates of 17 or 117 nm sized TiO2 NMs induced oxidative stress, but no cytotoxicity, under submerged exposure conditions only at the highest dose tested (~155 μg/cm2). In the case of DNA damage, small agglomerates of non-nano sized TiO2 NMs appear to be more potent at the ALI while large agglomerates of non-nano sized TiO2 NMs were found to be the most potent in submerged exposure conditions. These results indicate that the degree of agglomeration influences the potency of TiO2 NMs to damage DNA in HBE cells differentially at the ALI and in submerged conditions.

Here, we found that the small agglomerates of nano-sized TiO2 NMs (agglomerate size < 100 nm) are more potent in terms of cytotoxicity and oxidative stress induction at the ALI compared to submerged exposure conditions. Noël et al. exposed rats to 7 mg/m<sup>3</sup> of small (31 nm) and large agglomerates (194 nm) of TiO2 NMs for 6 h and noticed a significant increase in LDH activity and 8-isoprostane concentration in BALF, which are markers for cytotoxic and oxidative stress effects, respectively [43]. In another study of the same group, rats were exposed to 20 mg/m<sup>3</sup> of small (29, 28 and 35 nm) and large (156, 128 and 135 nm, respectively) agglomerates obtained from differently sized TiO2 NMs (primary size of 5, 20, and 50 nm, respectively) for 6 h [44]. The results indicated that only the small agglomerates (size < 100 nm) of 5 nm sized TiO2 NMs caused a significant increase in cytotoxic effects, while the small agglomerates (size < 100 nm) of all TiO2 NMs, irrespective of primary particle size, induced a significant increase in oxidative damage compared to larger agglomerates (size > 100 nm), which showed no significant effects for these endpoints. These in vivo results are in agreement with our present and previous findings [31], indicating that ALI exposure systems are more suitable than submerged exposure assays to recapitulate adverse effects upon inhalation of NMs. Furthermore, it is of utmost importance to deposit doses in the range of ng to maximally a few μg of nanomaterials per cm2 cellular surface area to recapitulate exposure of humans upon inhalation as outlined previously [25].

Recently, the European Food Safety Authority (EFSA) concluded that the use of TiO2 as a food additive is no longer considered safe, which highlights the importance of investigating adverse effects of nano-TiO2, genotoxic effects in particular [45]. In our previous study, we noticed that the small and large agglomerates of both TiO2 NMs used in this study induced DNA damage in HBE cell cultures exposed under submerged conditions at a dose range of <50 μg/cm2 (see Table 4) without inducing a significant increase in cytotoxicity and oxidative stress [35]. In this study, both agglomerates of 117 nm TiO2 NMs induced DNA damage at the ALI within this dose range, also without inducing a significant increase in cytotoxicity and oxidative stress. Our previous study and others have shown that the TiO2 NMs were internalized by bronchial epithelial cells in submerged culture [35,46] and such internalized NMs can induce primary DNA damage by directly interacting with DNA, without the induction of cytotoxicity or oxidative stress. In the case of ALI, post-exposure incubation for longer periods (such as 24 h) is needed to verify whether the induced DNA damage causes a difference in cell viability or oxidative stress. In contrast, small agglomerates of 17 nm sized TiO2 NMs provoked cytotoxicity and oxidative stress but no DNA damage at the same doses. This indicates that genotoxic effects of TiO2 NMs are impacted by their agglomeration state, as non-agglomerated TiO2 NMs of a modal diameter of 47 nm induced DNA damage already at 1.12 μg/cm<sup>2</sup> [31]. Moreover, our results further suggest that the genotoxicity of submicron sized TiO2 particles or their agglomerates should also be considered in the future.

#### **5. Conclusions**

In this study, we investigated the influence of agglomeration on the deposition and cytotoxic potency of TiO2 NMs at the ALI. Our results indicate that dose deposition and the cytotoxic potential are influenced by agglomeration, particularly for nano-sized TiO2 particles. This suggests that the agglomeration state of NMs is crucial in the assessment of pulmonary effects of NMs. Our findings also show that exposure via the ALI induces different cellular responses compared to exposure in submerged systems. More attention should be paid to the methods used to prepare the dispersions of TiO2 NMs, specifically concerning agglomeration, in order to assess the (nano) effects at the air-liquid interface and to better predict the hazardous potential of NMs upon inhalation.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/nano11123226/s1. Figure S1: Representative TEM micrographs of freshly prepared TiO2 stock dispersions, Figure S2: TEM micrographs of aerosolized TiO2 agglomerates, Table S1: Characterization of freshly prepared TiO2 stock dispersions.

**Author Contributions:** Conceptualization, S.M. (Sivakumar Murugadoss), S.M. (Sonja Mülhopt), S.D., M.G., H.-R.P., C.W. and P.H.H.; Data curation, S.M. (Sivakumar Murugadoss); Formal analysis, S.M. (Sivakumar Murugadoss); Funding acquisition, P.H.H.; Investigation, S.M. (Sivakumar Murugadoss); Methodology, S.M. (Sivakumar Murugadoss) and S.M. (Sonja Mülhopt); Project administration, P.H.H.; Software, S.M. (Sivakumar Murugadoss); Supervision, D.S. and P.H.H.; Visualization, S.M. (Sivakumar Murugadoss), S.M. (Sonja Mülhopt), and P.H.H.; Writing—original draft, S.M. (Sivakumar Murugadoss), S.M. (Sonja Mülhopt), C.W. and P.H.H.; and Writing—review and editing, S.M. (Sivakumar Murugadoss), S.M. (Sonja Mülhopt), S.D., M.G., H.-R.P., C.W. and P.H.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by Post-Doctoral Mandates KU Leuven internal funding (PDM/20/162), the Belgian Science Policy (BELSPO) program "Belgian Research Action through Interdisciplinary Network (BRAIN-be)" for the project "Towards a toxicologically relevant definition of nanomaterials (To2DeNano)" and EU H2020 project (H2020-NMBP-13-2018 RIA): RiskGONE (Science-based Risk Governance of NanoTechnology) under grant agreement no 814425.

**Data Availability Statement:** The primary data processed and presented in this study are available on request from the corresponding author.

**Acknowledgments:** We thank Sonja Oberacker for her assistance within this study. The authors thank JRC Nanomaterials Repository, Italy for providing the TiO2 NMs. We acknowledge support by the KIT-Publication Fund of the Karlsruhe Institute of Technology.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Assessing the Toxicological Relevance of Nanomaterial Agglomerates and Aggregates Using Realistic Exposure In Vitro**

**Sivakumar Murugadoss 1, Lode Godderis 2,3, Manosij Ghosh <sup>1</sup> and Peter H. Hoet 1,\***


**Abstract:** Low dose repeated exposures are considered more relevant/realistic in assessing the health risks of nanomaterials (NM), as human exposure, such as in the workplace, occurs in low doses and in a repeated manner. Thus, in a three-week study, we assessed the biological effects (cell viability, cell proliferation, oxidative stress, pro-inflammatory response, and DNA damage) of titanium-di-oxide nanoparticle (TiO2 NP) agglomerates and synthetic amorphous silica (SAS) aggregates of different sizes in human bronchial epithelial (HBE), colon epithelial (Caco2), and human monocytic (THP-1) cell lines repeatedly exposed to a non-cytotoxic dose (0.76 μg/cm2). We noticed that neither of the two TiO2 NPs nor their agglomeration states induced any effects (compared to control) in any of the cell lines tested, while SAS aggregates induced some significant effects only in HBE cell cultures. In a second set of experiments, HBE cell cultures were exposed repeatedly to different SAS suspensions for two weeks (first and second exposure cycle) and allowed to recover (without SAS exposure, recovery period) for a week. We observed that SAS aggregates of larger sizes (size ~2.5 μm) significantly affected the cell proliferation, IL-6, IL-8, and total glutathione at the end of both exposure cycles, while their nanosized counterparts (size less than 100 nm) induced more pronounced effects only at the end of the first exposure cycle. As noticed in our previous short-term (24 h) exposure study, large aggregates of SAS did appear to be similarly potent as nano-sized aggregates. This study also suggests that aggregates of SAS of size greater than 100 nm are toxicologically relevant and should be considered in risk assessment.

**Keywords:** nanotoxicology; titanium dioxide; synthetic amorphous silica; agglomerates and aggregates; realistic exposure in vitro

#### **1. Introduction**

Manufactured nanomaterials (NMs) are, due to their unique physico-chemical properties, used in a large variety of applications. Nowadays, at least 1800 products containing NMs, ranging from personal care products to sporting goods, are in circulation in the global market [1]. Concerns regarding the human health effects of NMs are gradually increasing due to their increased production and use [2–5].

**Citation:** Murugadoss, S.; Godderis, L.; Ghosh, M.; Hoet, P.H. Assessing the Toxicological Relevance of Nanomaterial Agglomerates and Aggregates Using Realistic Exposure In Vitro. *Nanomaterials* **2021**, *11*, 1793. https://doi.org/10.3390/nano11071793

Academic Editors: Andrea Hartwig and Christoph Van Thriel

Received: 18 June 2021 Accepted: 8 July 2021 Published: 9 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In the real world, such as in occupational exposure settings, NMs exist as primary particles, agglomerates, aggregates, or as a mixture thereof [6–8]. In agglomerates, the particles are loosely bound by weak forces such as van der Waals forces in a reversible manner, while in aggregates, particles are irreversibly fused together by chemical bonding such as covalent or ionic bonding [9]. The term agglomerates and aggregates (AA) is included in the definition of NMs recommended by the European Union [10]. It states that "manufactured material containing particles, in an unbound state or as an aggregate or as an agglomerate and where, for 50% or more of the particles in the number size distribution, one or more external dimensions is in the size range 1 nm–100 nm". However, the definition was recommended solely for regulatory applications without any regard for hazard. Moreover, the relevance of AA in terms of toxicological perspectives is still largely unknown.
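The number-based criterion in the quoted definition can be illustrated with a minimal sketch. The function name and the particle dimensions below are hypothetical and serve only to show how the 50% threshold on the number size distribution works; they are not data from this study, and the sketch is not a regulatory classification tool.

```python
def meets_number_criterion(particles, lower_nm=1.0, upper_nm=100.0, threshold=0.5):
    """Check the number-based size criterion on a list of particles.

    Each particle is a tuple of its external dimensions in nm. A particle
    counts if at least one dimension lies within [lower_nm, upper_nm].
    Returns True if the counted fraction (by number) reaches `threshold`.
    """
    if not particles:
        raise ValueError("at least one particle is required")
    counted = sum(
        1 for dims in particles
        if any(lower_nm <= d <= upper_nm for d in dims)
    )
    return counted / len(particles) >= threshold


# Hypothetical particle dimensions in nm (illustrative values only)
example_particles = [(20, 20, 20), (35, 500, 500), (150, 150, 150), (400, 900, 900)]
print(meets_number_criterion(example_particles))
```

Because the criterion is number-based, a material can satisfy it even when most of its mass sits in large agglomerates or aggregates, which is one reason why size-based definitions and hazard do not necessarily coincide.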

Titanium-di-oxide (TiO2) and synthetic amorphous silica (SAS) are among the most widely used NMs. Due to their unique properties, they have found applications in food, cosmetics, paints, etc. [2,11,12]. TiO2 NMs are well known for their tendency to agglomerate [13], while SAS NMs are known to aggregate easily during their production for industrial/commercial applications [14]. Thus, to determine the influence of agglomeration and aggregation on NM toxicity, we investigated and compared in our previous studies the acute (24 h) toxicological effects of TiO2 NMs in different agglomeration states [11] or SAS in different aggregation states [12] in three different cell lines. The results suggested that in most cases, large agglomerates or aggregates were not less potent compared to their smaller counterparts. This indicated that the toxicity of tested NMs was not mitigated by their agglomeration/aggregation state, and therefore AA of NMs of larger size (size greater than 100 nm) appear to be toxicologically relevant.

To date, most studies have evaluated the toxic potential of NMs after short-term exposure [15–17]. Recently, long-term and repeated low dose exposure studies have been set up for the hazard assessment of NMs, better mimicking real-life exposure (e.g., of workers in production), which often occurs at low doses. Biological effects induced by NMs have also been shown to differ between short-term and (relatively) long-term exposure [18–20]. Xi et al. performed a 21 d (three-week) exposure study using vanadium dioxide (VO2) nanoparticles (NPs) [19]. In that study, A549 cells were repeatedly exposed to a low dose (0.2 μg/mL) of VO2 NPs and the authors observed a 50% decrease in proliferation during sub-culturing at the end of every week. Similarly, Chen et al. (2016) performed a 21 d exposure study and noticed that the proliferation of Caco2 cells was reduced by up to 50% when repeatedly exposed to 0.5 μg/mL of silver (Ag) NPs [20]. In both studies, an increase in cytokines and reactive oxygen species (ROS) generation was associated with decreased proliferation.

In this study, we aimed to determine how different AA suspensions influence the biological responses in cell cultures repeatedly exposed to a low dose (three-week study). There is no consensus on how to estimate the dose for long-term exposure. Based on the occupational exposure limits (OELs) for TiO2 and SAS [21,22], we estimated 0.76 μg/cm2, which corresponds to a concentration of 2 μg/mL, as an appropriate dose. This dose was also determined as non-cytotoxic in short-term experiments (data not shown).
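The relation between the mass concentration (2 μg/mL) and the nominal areal dose (0.76 μg/cm2) can be made explicit with a short sketch. It assumes that the whole suspended mass is attributed to the growth area (a nominal dose, not a measured deposited dose). The well growth area of 9.5 cm2 is an assumed, typical value for a six-well plate and is not stated in the text; the function names are hypothetical.

```python
def areal_dose_ug_per_cm2(conc_ug_per_ml, volume_ml, area_cm2):
    """Nominal areal dose: total suspended mass divided by growth area."""
    return conc_ug_per_ml * volume_ml / area_cm2


def implied_volume_per_area(areal_dose_ug_cm2, conc_ug_per_ml):
    """Medium volume per unit growth area (mL/cm2) implied by a dose pair."""
    return areal_dose_ug_cm2 / conc_ug_per_ml


# Dose pair reported in the text
ratio_ml_per_cm2 = implied_volume_per_area(0.76, 2.0)

# ASSUMPTION: growth area of one well of a six-well plate (not from the text)
ASSUMED_WELL_AREA_CM2 = 9.5
implied_volume_ml = ratio_ml_per_cm2 * ASSUMED_WELL_AREA_CM2

print(f"volume-to-area ratio: {ratio_ml_per_cm2:.2f} mL/cm2")
print(f"implied medium volume per well: {implied_volume_ml:.2f} mL")
```

The reported dose pair implies a medium volume-to-area ratio of 0.38 mL/cm2. The absolute volume per well depends on the assumed growth area.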

#### **2. Materials and Methods**

#### *2.1. Preparation of Dispersions and Size Characterization*

Two TiO2 NPs of different sizes (17 nm and 117 nm) in different agglomeration states (small and large agglomerates) were freshly prepared during each exposure as described in [11] (p. 9) and details of methods used for size characterization in stock are provided in (p. 10). Two different suspensions of SAS in different aggregation states (indicated as DE-AGGR and AGGR) were prepared. In addition, we also studied the two identified subfractions in the AGGR suspension (SuperN and PREC) as described in [12]. All suspensions were freshly prepared as described in [12] (pp. 8–9) and details of methods used for size characterization in stock are provided in (p. 9).

#### *2.2. Cell Culture*

The human bronchial epithelial cell line (16HBE14o- or HBE) and the human monocytic cell line (THP-1) were kindly provided by Dr. Gruenert (University of California, San Francisco, CA, USA), and the Caucasian colon adenocarcinoma cell line (Caco2) (P.Nr: 86010202) was purchased from Sigma-Aldrich (Overijse, Belgium). HBE cells were cultured in DMEM/F12 supplemented with 5% FBS, 1% penicillin-streptomycin (P-S) (100 U/mL), 1% L-glutamine (2 mM) and 1% fungizone (2.5 μg/mL), while RPMI 1640 supplemented with 10% FBS, 1% P-S (100 U/mL), 1% L-glutamine (2 mM) and 1% fungizone (2.5 μg/mL) was used for THP-1. DMEM/HG supplemented with 10% FBS, 1% P-S (100 U/mL), 1% L-glutamine (2 mM), 1% fungizone (2.5 μg/mL) and 1% non-essential amino acids (NEAA) was used for Caco2 cells. All cell culture supplements were purchased from Invitrogen (Merelbeke, Belgium) unless otherwise stated. Cells were cultured in T75 flasks (FALCON, Corning, NY, USA) at 37 °C in 100% humidified air containing 5% CO2. The medium was refreshed every 2 or 3 d and cells were passaged every week (7 d). Cells from passages 3–6 were used for experiments.

#### *2.3. In Vitro Exposure Conditions*

The experimental design used in this study was adapted from [19,20]. For the first exposure cycle (seven days), HBE cells, Caco2 cells, and THP-1 cells were seeded at a density of 10,000 cells/cm2, 5000 cells/cm2, and 10,000 cells/mL, respectively, in six-well plates (day 0). Based on cell doubling time, the cell numbers for each cell line were adjusted to attain optimal confluency at the end of the first exposure cycle. After overnight incubation (day 1), the cells were exposed to cell culture media containing 2 μg/mL or 0.76 μg/cm2 of different suspensions of TiO2 and SAS for 48 h (day 2 and 3). On days 3 and 5, the supernatant was removed; cell cultures were rinsed with warm HBSS twice and exposed to fresh cell culture media containing 2 μg/mL or 0.76 μg/cm2 of NMs for 48 h. On day seven, the supernatants were collected and the cell cultures were washed and trypsinized (subculturing). The cell number and viability were determined immediately and the same number of cells (10,000 cells/cm2, 5000 cells/cm2 and 10,000 cells/mL for HBE, Caco2 and THP-1, respectively) were seeded for the second exposure cycle. The remaining cells were processed/stored for further analysis such as glutathione measurements and DNA damage. The steps were repeated for the second (7–14 d) and third exposure cycle (14–21 d).
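The repeated exposure scheme described above can be summarized as a day-by-day schedule. The following sketch is only an illustration: it assumes that the within-cycle timing of the first cycle (seeding, exposure on day 1, re-exposure on days 3 and 5, subculturing on day 7) is repeated unchanged in the later cycles, and all names are hypothetical.

```python
CYCLE_LENGTH_D = 7

# (day offset within a cycle, action), taken from the first-cycle description
STEPS = [
    (0, "seed cells"),
    (1, "expose to NM-containing medium (48 h)"),
    (3, "rinse twice with HBSS, re-expose (48 h)"),
    (5, "rinse twice with HBSS, re-expose (48 h)"),
    (7, "collect supernatant, trypsinize, count cells"),
]


def build_schedule(n_cycles=3):
    """Return a list of (study day, cycle number, action) tuples.

    ASSUMPTION: each later cycle repeats the day offsets of the first cycle.
    """
    schedule = []
    for cycle in range(n_cycles):
        start_day = cycle * CYCLE_LENGTH_D
        for offset, action in STEPS:
            schedule.append((start_day + offset, cycle + 1, action))
    return schedule


for day, cycle, action in build_schedule():
    print(f"day {day:2d} | cycle {cycle} | {action}")
```

Days 7 and 14 appear twice in the output because harvesting one cycle and seeding the next take place on the same day.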

#### *2.4. Cell Viability and Number Determination*

During each subculture step, about 10 μL of cell suspension-trypan blue mix (1:1 ratio) was loaded into the counting chamber slides and cell viability and number were determined with the Countess™ automated cell counter (Invitrogen, Merelbeke, Belgium). The results are expressed relative to control.

#### *2.5. Total Glutathione Measurements*

Reduced glutathione (GSH) was measured using a glutathione detection kit (Enzo Life Sciences, Brussels, Belgium) according to the manufacturer's protocol and the protein content was estimated using a bicinchoninic acid (BCA) protein assay kit (Thermo Scientific Pierce, Merelbeke, Belgium). GSH was normalized to the total protein content and the results were expressed relative to control (untreated cells).
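The two-step normalization (to protein content, then to the untreated control) can be written out as a minimal sketch. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic; they are not measurements from this study.

```python
def relative_gsh_percent(gsh_sample, protein_sample, gsh_control, protein_control):
    """GSH per unit protein, expressed as a percentage of the untreated control."""
    if protein_sample <= 0 or protein_control <= 0:
        raise ValueError("protein content must be positive")
    sample_norm = gsh_sample / protein_sample
    control_norm = gsh_control / protein_control
    return 100.0 * sample_norm / control_norm


# Hypothetical values: GSH in nmol, protein in mg (illustrative only)
print(relative_gsh_percent(gsh_sample=12.0, protein_sample=0.5,
                           gsh_control=6.0, protein_control=0.5))
```

Normalizing to protein first matters in this design because treated cultures can contain fewer cells than controls, so raw GSH amounts would confound cell number with GSH content per cell.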

#### *2.6. Cytokine Quantification*

Interleukin (IL)-8 and IL-6 were quantified using ELISA kits (Sigma Aldrich, Overijse, Belgium). The cytokines were measured in the supernatants (collected during glutathione measurement experiments and stored at −20 °C) according to the manufacturer's protocol and the results were expressed relative to control (untreated cells).

#### *2.7. Comet Assay*

An alkaline comet assay kit (Trevigen, Gaithersburg, MD, USA; C.No. 4250-050-K) was used to quantify DNA strand breaks as a measure of DNA damage according to the manufacturer's protocol. Cells treated with 100 μM methyl methanesulfonate (MMS) (Sigma-Aldrich, Overijse, Belgium) for 1–2 h served as positive control.

#### *2.8. Statistical Analysis*

Two independent experiments were performed in triplicate or duplicate, and data are presented as mean ± standard deviation (SD). The results were analyzed using GraphPad Prism 7.04 for Windows (GraphPad Software, La Jolla, CA, USA; www.graphpad.com) with one-way ANOVA followed by Dunnett's multiple comparison test to determine the significance of differences compared with control.
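As an illustration of the first part of this analysis, the sketch below computes mean ± SD per group and the one-way ANOVA F statistic using only the Python standard library. It is not the GraphPad Prism workflow used in the study: p-values and Dunnett's post hoc comparison against control are not implemented here and would require a dedicated statistics package. All data values are hypothetical.

```python
from statistics import mean, stdev


def mean_sd(values):
    """Return (mean, sample standard deviation) of one group."""
    return mean(values), stdev(values)


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical responses in % of control (illustrative values only)
control = [100, 102, 98, 100]
treated_a = [80, 82, 78, 80]
treated_b = [99, 101, 97, 103]

for name, group in [("control", control), ("A", treated_a), ("B", treated_b)]:
    m, sd = mean_sd(group)
    print(f"{name}: {m:.1f} ± {sd:.1f}")

f_stat, df_b, df_w = one_way_anova_f([control, treated_a, treated_b])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A significant F statistic only indicates that at least one group differs; the comparison of each treatment against the control is what Dunnett's test adds.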

#### **3. Results**

#### *3.1. Dispersion and Size Characterization*

#### 3.1.1. TiO2 Suspensions

The results of size characterization and zeta potential of TiO2 suspensions were already published in [11], and are therefore provided in the Supplementary Materials. Supplementary Figure S1 shows transmission electron microscopy (TEM) micrographs of small (SA) and large agglomerates (LA) of 17 and 117 nm sized TiO2 NPs and Table S1 shows the sizes of different TiO2 suspensions characterized by different techniques. We used a standardized TEM technique in our previous study [23], which enabled us to measure the size of several thousand agglomerates in each suspension. The TEM characterization (median Feret min) indicated that the size of 17 nm sized TiO2 in the least agglomerated condition (indicated as 17 nm-SA) was 33 nm, while it was 120 nm in the strongly agglomerated condition (17 nm-LA). The sizes of small (117 nm-SA) and large agglomerates (117 nm-LA) of 117 nm sized TiO2 were 148 and 309 nm, respectively, indicating that there were clear differences in size between SA and LA of both TiO2 NPs. Differences between SA and LA were also observed in the sizes measured by dynamic light scattering (DLS) and particle tracking analysis (PTA); the technical issues involved in observing larger sizes with these techniques were discussed in [11] (p. 8). To verify the stability of agglomerates, TiO2 stock suspensions were diluted to 100 μg/mL in complete culture medium (CCM) and sizes were measured using DLS at 0 h and 24 h (Supplementary Table S2). The sizes of all agglomerates remained similar at 0 and 24 h, indicating their good stability over time.

#### 3.1.2. SAS Suspensions

The results of size characterization and zeta potential of SAS suspensions were already published in [12], and are therefore provided in the Supplementary Materials. Figure S2 shows the bright field (BF) microscopic image of different SAS suspensions and Table S3 shows the sizes of different SAS suspensions characterized by different techniques. SAS is a material with aggregates of a broad size range (a few hundred nm to a few tens of μm). Thus, we used different techniques (such as sonication and vortexing) to obtain suspensions with different sizes. The TEM characterization of the sonicated suspension (de-aggregated, indicated as DE-AGGR) was quite straightforward and the mean Feret min size was determined as 28 nm. However, using TEM and DLS, we were not able to determine the difference in sizes of other suspensions such as the vortexed suspension (aggregated, AGGR) or the suspensions fractionated from AGGR [non-precipitating fraction (SuperN) and precipitating fraction (PREC)]. Thus, we used bright field microscopy, and the sizes of SuperN and PREC aggregates were roughly determined as 2.5 and 25 μm, respectively. By combining different techniques, we were able to identify the differences in sizes between these SAS suspensions. To verify the stability of aggregates, SAS stock suspensions were diluted to 100 μg/mL in CCM and sizes were measured using DLS at 0 and 24 h. The results are provided in Supplementary Table S4, although the AGGR and PREC sizes do not reflect the realistic size distribution because these aggregates sediment quickly during DLS measurements. Thus, we only consider the sizes of DE-AGGR and SuperN aggregates. The sizes of DE-AGGR remained similar at 0 and 24 h, while the size of SuperN aggregates was slightly reduced after 24 h.

#### *3.2. Comparison of Biological Responses*

#### 3.2.1. TiO2 Suspensions

The proliferation profiles and viability of cell cultures determined at the end of every week in three different cell lines are shown in Figure 1. None of the TiO2 suspensions affected the cell proliferation or viability at the end of any exposure cycle. Compared to control, no significant effects for any of these suspensions were noticed for glutathione depletion, IL-8 and IL-6 increase, or DNA damage (Figure 2), which were evaluated after the third exposure cycle only.

**Figure 1.** Effect of repeated low dose exposure to TiO2 suspensions on cell proliferation and viability. Cell proliferation profiles (**a**,**c**,**e**) and cell viability (**b**,**d**,**f**) were measured in different cell cultures after the first (**a**,**b**), second (**c**,**d**), and third exposure cycle (**e**,**f**). Data are expressed as means ± SD from two independent experiments performed in duplicate. SA: small agglomerates; LA: large agglomerates.

**Figure 2.** Effect of repeated exposure to TiO2 suspensions (0.76 μg/cm2) on biological responses. Total glutathione (GSH) (**a**), IL-6 (**b**), IL-8 (**c**), and DNA damage (**d**) were measured in different cell cultures after the third exposure cycle. Data are expressed as means ± SD from two independent experiments performed in duplicate. *p* < 0.001 (\*\*\*) represents significant differences compared to control (One-way ANOVA followed by Dunnett's multiple comparison test). SA: small agglomerates; LA: large agglomerates.

#### 3.2.2. SAS Suspensions

Figure 3 shows the summary of biological responses evaluated in cell cultures exposed to SAS after the third exposure cycle. DE-AGGR reduced HBE cell number significantly compared to control, but AGGR did not. DE-AGGR and AGGR induced a significant increase in IL-8 and IL-6 only in HBE cell cultures. As observed for TiO2, SAS did not induce significant DNA damage at the tested dose. Importantly, no significant effects were noticed in the Caco2 or THP-1 cell lines in any of the biological endpoints measured. These preliminary results suggest that SAS induces biological responses at the tested dose, and it would be interesting to study and compare all fractions of the AGGR suspensions of SAS. In a set of follow-up experiments, we used only HBE cells to investigate other SAS suspensions for their effect on cell number, viability, GSH, IL-6, and IL-8. We planned two exposure cycles (two weeks); to assess potential recovery after discontinuing exposure, the third observation week was a recovery period without SAS exposure.

**Figure 3.** Effect of repeated exposure to SAS suspensions (0.76 μg/cm2) on biological responses. Cell proliferation (**a**), viability (**b**), IL-6 (**c**), IL-8 (**d**), and DNA damage (**e**) were measured in different cell cultures after the third exposure cycle. Data are expressed as means ± SD from two independent experiments performed in duplicate. *p* < 0.05 (\*), *p* < 0.01 (\*\*) and *p* < 0.001 (\*\*\*) represent significant differences compared to control (One-way ANOVA followed by Dunnett's multiple comparison test). DE-AGGR: de-aggregated suspension; AGGR: aggregated suspension.

Effect on proliferation and viability: To determine the effect on cell proliferation, we measured cell number and cell viability at the end of each exposure cycle and recovery period (Figure 4). DE-AGGR and SuperN fractions strongly affected the cell growth at the end of the first exposure cycle. Compared to untreated cells, the DE-AGGR and SuperN fractions decreased the cell growth to about 65 and 50%, respectively (Figure 4a). AGGR, on the other hand, inhibited cell growth by about 20%. Surprisingly, DE-AGGR and AGGR exposed cell cultures recovered and remained similar compared to controls at the end of the second exposure cycle, but SuperN exposed cell cultures still exhibited decreased cell growth (about 35%). Despite a mild and non-significant decreasing trend observed at the end of the second exposure cycle, PREC fractions did not affect the cell growth significantly after both exposure cycles. After a week of recovery, all cell cultures exhibited similar growth to control. Compared to untreated controls, none of these suspensions affected the cell viability significantly after exposure cycles and recovery cycle (Figure 4b).

**Figure 4.** Effect of repeated exposure to different SAS suspensions (0.76 μg/cm2) on biological responses. Cell proliferation (**a**), cell viability (**b**), total glutathione levels (GSH) (**c**), IL-6 (**d**), and IL-8 (**e**) were measured in HBE cell cultures after different exposure cycles. Recovery denotes a week of exposure to cell culture medium without SAS. Data are expressed as means ± SD from two independent experiments performed in duplicate. *p* < 0.05 (\*), *p* < 0.01 (\*\*) and *p* < 0.001 (\*\*\*) represent significant differences compared to control (One-way ANOVA followed by Dunnett's multiple comparison test). DE-AGGR: de-aggregated suspension; AGGR: aggregated suspension; SuperN: non-precipitating suspension; PREC: precipitating suspension.

Effect on total glutathione: At the end of the first exposure cycle, we observed that the GSH levels had increased to about 200 (±47) and 270 (±71) % in DE-AGGR and SuperN exposed cells, respectively, compared to untreated cells (Figure 4c). Additionally, an upward trend was noticed for AGGR and PREC fractions but was not significant.

The GSH levels in DE-AGGR exposed cell cultures returned to normal after the second exposure cycle while the GSH levels were still high in SuperN exposed cells (about 160 ± 18%). Interestingly, cell cultures exposed to PREC also showed mild but significantly increased GSH levels (about 130 ± 5%). The GSH levels in all the exposed cell cultures returned to normal after seven days of recovery period.

Effect on cytokine secretion: After each cycle, cytokines such as IL-6 and IL-8 were quantified in the supernatant of cell cultures (Figure 4d,e, respectively). SuperN fractions resulted in a nearly 2-fold increase in IL-6 and IL-8 after one week of exposure, and both remained significantly increased at the end of the second week. As for the other endpoints, DE-AGGR fractions induced a significant increase only at the end of the first week of exposure. Compared to controls, AGGR and PREC did not affect the levels of IL-6 and IL-8. After a week of recovery, no differences between suspensions were found.

#### **4. Discussion**

In this study, we aimed to determine how different AA suspensions influence the biological responses in cell cultures repeatedly exposed (three-week study) to a dose of 0.76 μg/cm2. None of the TiO2 dispersions induced significant effects, while SAS suspensions generated by sonication (DE-AGGR) induced some effects compared to control and vortexed suspensions (AGGR), mainly in HBE cells. In an additional study comparing two weeks' exposure of HBE cells with four different SAS suspensions (AGGR, DE-AGGR, SuperN or PREC), SuperN did not appear to be less potent than DE-AGGR, which is in line with the acute effects (24 h) described in our previous study [12].

In our recent study [11], we showed that TiO2 agglomeration influences the toxicity/biological responses in high dose short-term exposure (24 h), while in this repeated low dose study, neither TiO2 exposure nor its agglomeration state influenced the biological responses. In a three-week exposure experiment, no cytotoxic effects were observed in human mesenchymal stem cells although nano-TiO2 was detected in the cytoplasm [24]. Kocbek et al. (2010) did not notice any significant effects in keratinocytes repeatedly exposed to 10 μg/mL of TiO2 NPs for three months, while at the same concentration ZnO NPs induced a decrease in mitochondrial activity, abnormal cell morphology, and disturbances in the cell cycle [25]. Vales et al. (2014) suggested that BEAS2B cells repeatedly exposed to 20 μg/mL for four weeks showed potential for carcinogenicity (soft agar assay) [26]. These results suggest that the TiO2 dose used in our experiments (2 μg/mL) might not be sufficient to induce adverse effects. We based the choice of 2 μg/mL on our earlier 'acute' exposure experiments without cyto/genotoxicity, which now appears to be a relatively safe dose after three weeks of exposure.

In this study, DE-AGGR, the least aggregated and nano-sized SAS, induced a more pronounced effect than AGGR at the same mass concentrations. Our characterization revealed that 75% of the total mass of AGGR was composed of PREC aggregates, which are about 25 μm in size [12]. PREC aggregates, when studied separately, did not induce any effects. Given their larger size, such aggregates are less likely to be taken up by the cells, and therefore induced no effects. This indicates that the overall biological activity of SAS NMs in their manufactured form was reduced due to aggregation.

Similar to acute studies, SuperN fractions of the AGGR suspension exhibited noticeable biological activity in a low dose repeated exposure study. The most quoted nanotoxicity paradigm is "the smaller the size of the NPs the greater the toxicity/biological responses". Likewise, several short-term cytotoxicity studies showed that nano-sized particles are more biologically active than micron-sized particles [27–29]. In a recent study, bronchial cells repeatedly exposed to a low dose of VO2 NPs for three weeks showed a greater adverse response to nano-sized particles than to micron-sized particles [19]. In contrast to these observations, we observed that SuperN aggregates of about 2.5 μm in size showed similar biological activity to nano-sized fractions. This suggests that larger aggregates of NPs may not necessarily be considered biologically less active and highlights the need for a case-by-case analysis.

The size of SuperN aggregates (2.5 μm) is far greater than that of DE-AGGR aggregates (100 nm), yet falls under the category of respirable particles [30]. Therefore, exposure and hazard assessment of such fractions is valuable since commercially available SAS can be composed of small and large aggregates. We also observed that PREC was the least biologically active. Considering that the size of the aggregates plays a key role in determining their toxicological relevance, these findings could also contribute to the "safe-by-design" of SAS, by considering aggregation as a critical factor.

Studies have indicated that the effects induced by NMs were different for short-term and long-term exposure [18–20]. In this study, we noticed an increase in glutathione levels after the first exposure cycle (one week) for both DE-AGGR and SuperN. Therefore, in addition to the three-week exposure study, we also investigated the in vitro effects after short-term high dose exposure to SAS under the same experimental conditions (Supplementary Figure S3). In short-term exposure (24 h), mild cytotoxicity (Figure S3b) and total glutathione depletion (Figure S3c) were observed at high concentrations of DE-AGGR and SuperN. Glutathione is depleted when excessive ROS is produced. Several short-term studies have shown that SAS reduced glutathione levels [31–34], which is in agreement with our findings. This indicates that glutathione depletion is an earlier effect of short-term cytotoxicity, while increased glutathione production is possibly a sign of a protective effect to prevent further damage. Further, decreased cell proliferation in the three-week study is also consistent with an increase in IL-8 and IL-6, while only IL-8 was consistent with short-term cytotoxicity (Figure S3e). These results indicate that cell cultures may respond to NMs differently depending on the mode of exposure (short-term high dose or low dose repeated exposure).

To gain insight into the potential role of cells surviving the first exposure cycle, the cells from the first cycle were passaged and repeatedly exposed in the second cycle. At the end of the second exposure cycle, we noticed that the increase in glutathione, IL-6, and IL-8 was somewhat smaller than after the first exposure cycle in cell cultures exposed to DE-AGGR and SuperN suspensions. It appears that the cells stressed during the first exposure cycle were undergoing recovery, probably due to protective effects induced during the first exposure cycle. Moreover, cell viability at the end of both exposure cycles remained similar to control. Further research is needed to verify whether the decreased cell growth was the result of cell cycle arrest and/or cell death (apoptosis). Nevertheless, the cells recovered similarly to the control one week after exposure was discontinued, indicating that the response observed was due to continuous exposure to SAS. This finding is particularly important as it indicates that continuous human exposure to SAS results in elevated levels of biological responses, which could lead to adverse effects.

Numerous studies have reported that short-term in vivo exposure to SAS elevated the levels of LDH, IL-6, IL-8, and GSH depletion in the lung [15]. However, long-term and repeated exposure in vivo studies for SAS are scarce. In a study [34], rats were exposed to 50 mg/m3 of SAS for 6 h/day, 5 days/week for 13 weeks and effects were characterized after 6.5 weeks and 13 weeks of exposure, and after three and eight months of recovery. An increase in cytotoxicity biomarkers (LDH) and inflammatory cells was noticed after 6.5 and 13 weeks, but the effects were significantly mitigated after both recovery periods. Genotoxicity was not observed at any of these time points. In another study [35], rats were exposed to 50 mg/m3 of SAS for 6 h/day for five days and adverse effects were characterized after last exposure or one or three months later. SAS induced elevated levels of cytotoxicity biomarkers and lung damage after last exposure, but the effects were reversed three months post exposure. In our study, we observed that the effects induced by DE-AGGR and PREC were reversed after a one week recovery period. This suggests that our long-term exposure design may be appropriate to predict the in vivo adverse outcome of repeated exposure to NMs.

#### **5. Conclusions**

In this study, we demonstrated the toxicological relevance of AA in a repeated low dose in vitro exposure study. Neither exposure to TiO2 nor its agglomeration state affected the measured biological endpoints, possibly due to an insufficient applied dose. On the other hand, we noticed that a fraction of SAS aggregates in their manufactured form (2.5 μm) did not appear biologically less active compared to nano-sized SAS produced by sonication. In vitro studies with more biological endpoints and animal studies are required to verify these results. Moreover, further characterization is needed to reveal properties other than size that make SuperN fractions biologically more active. Since the SAS used in this study is representative of SAS approved as a food additive (E551), more attention needs to be paid in the future to the possible adverse effects of SuperN fractions, particularly their long-term effects. The results of this study might also spur toxicologists to perform more long-term studies in the future to reveal the toxicological relevance of other NMs that are agglomerated/aggregated in their manufactured form.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/nano11071793/s1: Figure S1: Representative TEM micrographs of freshly prepared TiO2 stock suspensions, Figure S2: Representative bright field microscopic images of freshly prepared SAS stock suspensions, Figure S3: Influence of SAS aggregation on cytotoxicity and biological responses, Table S1: Size characterization of freshly prepared TiO2 stock suspensions, Table S2: Size characterization of freshly prepared TiO2 stock suspensions, Table S3: Characterization of freshly prepared SAS stock suspensions, Table S4: Z-average sizes (measured by DLS) of SAS suspensions in different cell culture medium (100 μg/mL).

**Author Contributions:** Conceptualization, S.M., L.G., M.G. and P.H.H.; methodology, S.M., M.G. and P.H.H.; investigation, S.M.; writing—original draft preparation, S.M.; review and editing, S.M., L.G., M.G. and P.H.H.; project administration and funding acquisition, P.H.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was funded by the Belgian Science Policy (BELSPO) program "Belgian Research Action through Interdisciplinary Network (BRAIN-be)" for the project "Towards a toxicologically relevant definition of nanomaterials (To2DeNano)" and EU H2020 project (H2020-NMBP-13-2018 RIA): RiskGONE (Science-based Risk Governance of NanoTechnology) under grant agreement No. 814425.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding authors.

**Acknowledgments:** The authors thank JRC Nanomaterials Repository, Italy for providing the TiO2 NPs and SAS.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## **The State of the Art and Challenges of In Vitro Methods for Human Hazard Assessment of Nanomaterials in the Context of Safe-by-Design**

**Nienke Ruijter 1, Lya G. Soeteman-Hernández 1, Marie Carrière 2, Matthew Boyles 3, Polly McLean 3, Julia Catalán 4,5, Alberto Katsumiti 6, Joan Cabellos 7, Camilla Delpivo 7, Araceli Sánchez Jiménez 8, Ana Candalija 7, Isabel Rodríguez-Llopis 6, Socorro Vázquez-Campos 7, Flemming R. Cassee 1,9 and Hedwig Braakhuis 1,\***

	- <sup>3</sup> Institute of Occupational Medicine (IOM), Edinburgh EH14 4AP, UK
	- <sup>4</sup> Finnish Institute of Occupational Health, 00250 Helsinki, Finland
	- <sup>5</sup> Department of Anatomy, Embryology and Genetics, University of Zaragoza, 50013 Zaragoza, Spain
	- <sup>6</sup> GAIKER Technology Centre, Basque Research and Technology Alliance (BRTA), 48170 Zamudio, Spain
	- <sup>7</sup> LEITAT Technological Center, 08225 Barcelona, Spain
	- <sup>8</sup> Instituto Nacional de Seguridad y Salud en el Trabajo (INSST), 48903 Barakaldo, Spain
	- <sup>9</sup> Institute for Risk Assessment Sciences (IRAS), Utrecht University, 3584 CS Utrecht, The Netherlands
	- **\*** Correspondence: hedwig.braakhuis@rivm.nl

**Abstract:** The Safe-by-Design (SbD) concept aims to facilitate the development of safer materials/products, safer production, and safer use and end-of-life by performing timely SbD interventions to reduce hazard, exposure, or both. Early hazard screening is a crucial first step in this process. In this review, for the first time, commonly used in vitro assays are evaluated for their suitability for SbD hazard testing of nanomaterials (NMs). The goal of SbD hazard testing is identifying hazard warnings in the early stages of innovation. For this purpose, assays should be simple, cost-effective, predictive, robust, and compatible. For several toxicological endpoints, there are indications that commonly used in vitro assays are able to predict hazard warnings. In addition to the evaluation of assays, this review provides insights into the effects of the choice of cell type, exposure and dispersion protocol, and the (in)accurate determination of dose delivered to cells on predictivity. Furthermore, compatibility of assays with challenging advanced materials and NMs released from nano-enabled products (NEPs) during the lifecycle is assessed, as these aspects are crucial for SbD hazard testing. To conclude, hazard screening of NMs is complex and joint efforts between innovators, scientists, and regulators are needed to further improve SbD hazard testing.

**Keywords:** nanomaterials; safe-by-design; hazard testing; in vitro methods; SAbyNA; advanced materials

#### **1. Introduction**

The rapid expansion of the field of nanotechnology and its ever-growing number of applications has created a challenge for toxicologists and risk assessors. The continuous uncertainties surrounding nanomaterial (NM) safety, as well as the pace at which new NMs are developed, call for a more prevention-oriented strategy. The Safe-by-Design (SbD) concept is increasingly applied within the field of nanotechnology, as can be seen by the high number of EU funded nano-projects addressing SbD over the past years [1], and by its adoption in the EU Chemical Strategy for Sustainability as a strategy to meet the EU Green Deal ambitions [2,3].

SbD aims to reduce the human and environmental risk of a substance throughout its entire life cycle by minimizing or eliminating the hazard and/or by reducing exposure [4]. The concept of SbD consists of three pillars: safer materials and products, safer production, and safer use and end-of-life. For NMs, these were first described in the NanoReg2 project [5], and later in an internationally accepted working description of the OECD Safe Innovation Approach Report [6]. In practice, SbD is a two-step process: the first step is an early hazard and/or risk screening during the design phase of the innovation process of a new substance, NM, or product [7,8]. The second step is to take actions (SbD interventions) to reduce or minimize hazard, exposure, or both.

**Citation:** Ruijter, N.; Soeteman-Hernández, L.G.; Carrière, M.; Boyles, M.; McLean, P.; Catalán, J.; Katsumiti, A.; Cabellos, J.; Delpivo, C.; Sánchez Jiménez, A.; et al. The State of the Art and Challenges of In Vitro Methods for Human Hazard Assessment of Nanomaterials in the Context of Safe-by-Design. *Nanomaterials* **2023**, *13*, 472. https://doi.org/10.3390/nano13030472

Academic Editors: Andrea Hartwig and Christoph Van Thriel

Received: 23 December 2022; Revised: 16 January 2023; Accepted: 18 January 2023; Published: 24 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

For NMs and nano-enabled products (NEPs), SbD interventions can be achieved in different ways. One option is to modify a NM in order to improve its safety profile. For example, Xia et al. (2011) showed that doping ZnO nanoparticles with iron reduces the shedding of harmful ions and reduces the toxicity of the particles upon pulmonary exposure [9]. Another example of a SbD intervention is applying a surface treatment to minimise NM biological reactivity, as has been successfully achieved for nano-SiO2 by adding silanol groups to the silica surface [10]. Reducing exposure is also a fundamental part of SbD and can be achieved by implementing procedural changes such as working in closed systems or using wet synthesis methods [5]. Reduced release and therefore minimized exposure can also be achieved by altering the design of the NEP, for example by improving the immobilization between the NM and the matrix, as was conducted for silver NMs onto cotton fabrics [11].

The above-mentioned examples can only be achieved after first assessing hazard and risk early in the innovation process, and then using this knowledge to integrate safety into the design of the NM, NEP, or production process. For many NMs, and especially for novel ones, hazards are largely unknown [12], and cannot be predicted based on physicochemical (PC) characterisation alone. Therefore, carrying out suitable hazard testing at the early stages of product development is of utmost importance for SbD applicability. SbD hazard testing aims to identify hazard warnings in the early stages of the innovation process using simple in vitro methods. Once a product is designed and produced, the manufacturer should comply with the regulations and perform hazard and risk assessment accordingly.

Many strategies and frameworks for hazard assessment of NMs in the context of SbD have been proposed in recent years [13–18], some proposing specific in vitro assays, and some based only on a selection of toxicological endpoints to consider. However, no comprehensive investigation of the suitability of currently available in vitro assays for such strategies has been conducted thus far.

From previous studies on the mechanisms of action of NMs, it is known that transformation (e.g., dissolution), reactivity, inflammation, cytotoxicity, and genotoxicity are among the most important parameters and endpoints to evaluate when assessing the hazard of a NM, and therefore these are suggested to be measured in many available strategies and frameworks [13,19–21]. Selecting in vitro tests suitable for SbD hazard testing is not trivial. Only a few OECD test guidelines and ISO standards are available specifically for testing NMs. Due to the interfering behaviour of NMs with their surrounding environment and with the assay readout, routinely used toxicity assays (i.e., those used to test soluble chemicals) may prove unsuitable or may require optimizations and inclusion of extra controls [22]. In contrast with hazard assessment for soluble chemicals, NM testing requires additional steps, such as dispersion protocols and determining the dose delivered to the cells in submerged cell culture experiments [23]. Specifically for the purpose of SbD hazard testing, since it is performed early in the development of a NM/NEP, assays will not only have to be compatible with the NM to be tested, but should preferably also be fast, cost-effective, and able to correctly indicate hazard warnings.

This work provides a practical and critical evaluation of the suitability of the most frequently used in vitro toxicity assays and the challenges for their use in NM SbD hazard testing. For this purpose, criteria for the suitability of methods for application in a SbD hazard testing strategy are established, leading to an evaluation of the methods currently in use for the parameters and endpoints identified as important for the mechanisms of action of NMs. This work is conducted under the umbrella of the Horizon2020 project SAbyNA, which aims to develop a user-friendly platform for industry with optimal workflows to support the development of SbD NMs and NEPs. For this purpose, existing resources, such as in vitro assays, are identified, distilled, and streamlined. This state of the art and evaluation of in vitro assays for SbD applicability can be used as an outlook for innovators, regulators, industry, and scientists of how early hazard testing of NMs and NEPs can be put into practice to eventually contribute to the design of SbD NEPs.

#### **2. Criteria**

A set of performance criteria is proposed to evaluate the suitability of in vitro methods for SbD hazard testing. The criteria were adapted from the widely used Good In Vitro Method Practices (GIVIMP) [24] and tailored to suit SbD hazard testing for NMs specifically. Figure 1 shows several key considerations for assay selection for SbD hazard testing.

**Figure 1.** Considerations for assay selection for SbD hazard testing.

**Predictive:** The first criterion is that an in vitro assay should be sufficiently predictive of the in vivo situation. This comparison is preferably made with human data, or alternatively using animal data. SbD hazard testing is carried out in an early stage of product development and is considered a first screening. The aim of SbD hazard testing is to detect **early hazard warnings** and not to derive a point of departure for risk assessment. Therefore, assays are sufficiently predictive for SbD hazard testing when they are able to indicate hazard warnings. Assays that are able to **accurately rank** NMs/NEPs based on their hazard potency are of extra value for SbD hazard testing, as this will allow comparison of candidate NMs and comparison with benchmark NMs.

Predictivity can be assessed by looking into the prediction accuracy of the assay. An assay's **accuracy** to predict in vivo effects is a combination of its **sensitivity** and **specificity**. Sensitivity is the ability of the assay to detect true positives and specificity is the ability of the assay to detect true negatives.
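As an illustration of these definitions, the following minimal Python sketch computes sensitivity, specificity, and accuracy from the counts of true/false positives and negatives obtained when in vitro outcomes are compared with in vivo reference outcomes. The function name and the example counts are hypothetical and serve only to show the arithmetic; they are not data from any study cited here.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, specificity and accuracy of an in vitro assay.

    tp/fn: in vivo positives that the assay flagged / missed.
    tn/fp: in vivo negatives that the assay cleared / wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one in vivo positive and one negative")
    return {
        "sensitivity": tp / (tp + fn),   # ability to detect true positives
        "specificity": tn / (tn + fp),   # ability to detect true negatives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical panel of 20 benchmark NMs (illustrative counts only).
metrics = screening_metrics(tp=8, fp=3, tn=7, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

For a screening assay intended to raise early hazard warnings, a high sensitivity would typically be weighted more heavily than a high specificity, since a missed hazard is more costly at this stage than a false alarm.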

**Simple and cost effective:** Simplicity and cost effectiveness are key for SbD hazard testing since these assays are to be performed in an early stage of NM/NEP product development. Ideally, an assay should be easy to perform, time-efficient and cost-effective.

**Robust:** An assay should give consistent and repeatable results between experimental repetitions and between different labs.

**Compatible:** An assay should preferably be compatible with a wide range of NMs, or at least its compatibility domain should be identified. Assays with optimized protocols specifically for NMs are preferred.

**Readiness:** Methods that are considered 'ready to use' and already standardized or (pre-)validated for NMs are prioritized.

#### **3. Challenges of Testing NMs In Vitro for SbD Applicability**

NMs are particulate matter, making NM in vitro testing by default more challenging than testing soluble chemicals. Several additional aspects need considering when testing NMs in vitro, including determining the behaviour of the NM in exposure medium, selecting a dispersion protocol to create stable suspensions which preferably mimic human exposure as much as possible, and assessing the potential interference of the NM with the assay components or optical readouts. Furthermore, elaborate characterization of the NM is required [25], but this will not be discussed further in this review. The fact that SbD hazard testing needs to be as simple as possible creates an important predicament that needs addressing. An overview of key aspects that should be taken into account when performing SbD hazard testing of NMs is shown in Figure 2. The most important aspects are discussed below.

**Figure 2.** Overview of aspects that might have to be considered when performing SbD hazard testing, showing that simple testing can be challenging to achieve.

#### *3.1. Choice of Dispersion Protocol*

Classically, in vitro toxicity evaluation is performed in cultured cells maintained in submerged conditions. To ensure reproducible and controlled exposure from one replicate to another, stable suspensions of well-dispersed NMs are prepared, sometimes requiring energy input to disrupt particle agglomerates. For SbD hazard testing, a prerequisite for the suitability of a dispersion procedure is that the SbD properties (e.g., coatings or surface treatments) of the dispersed NM/NEP are preserved, and that the resulting dispersed NM/NEP is relevant for human exposure in terms of size and other physicochemical properties.

The most commonly used dispersion procedure for NMs is via sonication, using either an ultrasonic bath, a probe, or a cup-type sonicator [26,27]. For SbD hazard testing, sonication would not be the preferred option in some specific cases. For instance, sonication can break multi-walled carbon nanotubes (MWCNT), causing a reduction of their length [27], and therefore leading to different toxicity profiles than the MWCNT that humans would be exposed to. Sonication has been used to produce MWCNTs with different lengths from the same initial batch of MWCNTs [28,29], and Hadrup et al. (2021) concluded that the length of the MWCNT is a major determinant of its toxicity [29]. Therefore, when assessing hazard properties of CNTs in vitro, sonication should be limited as much as possible, and in case sonication is used, NM physicochemical properties should be verified to ensure they still maintain exposure-relevant characteristics.

Another example where sonication would have to be carefully considered is when testing specific synthetic amorphous silicas, which are in some cases intentionally produced as agglomerates. Sonication can disrupt most of the agglomerates, reducing the overall hydrodynamic diameter, which constitutes a substantial modification of the initial material [30]. In an inhalation exposure scenario, a person would be exposed to these agglomerates, and therefore sonication would not be the preferred option. However, there are indications that these agglomerates disintegrate in the intestine [31]. In the case of ingestion, the gut epithelium is exposed to nano-sized silica, and sonication would result in an exposure-relevant material.

Dissolution is a major determinant of the toxicity of some NMs (e.g., release of toxic metal ions such as silver or copper ions) and decreasing the NM dissolution potential can be considered a SbD intervention. Sonication has been shown to enhance the dissolution of some metallic NMs, such as Cu, Mn, and Co [32], and the dissolution can be further increased when proteins are present in the solution during sonication [33]. Thus, dispersion procedures that involve sonication of NMs, especially in a medium that contains proteins, such as the procedure optimized within the Nanogenotox project [34], should be carefully considered in view of exposure scenarios when testing NMs that can potentially dissolve.

Some NMs are designed as core-shell structures (e.g., quantum dots (QDs)) where the shell reduces dissolution and leaching of potentially toxic elements from the core. The design of more robust shells, used as a SbD intervention, reduces the QDs dissolution rate and thereby their toxicity [35–37]. However, the core-shell boundary is a region of fragility and sonication could promote shell fragmentation and the release of core contents. Thus, sonication could result in a reduced effect of the SbD intervention, potentially resulting in an overestimation of toxicity in SbD hazard testing. Therefore, for core-shell NMs, sonication should not be recommended, unless humans are exposed to fragmented QDs.

Coating NMs with surface ligands or grafting them on an inert matrix such as cellulose has also been tested as a SbD intervention to produce safe(r) photocatalytic paints containing TiO2 NMs. Coating TiO2 NMs with polyethylene glycol (PEG), poly(acrylic acid) (PAA), or 3,4-dihydroxy-L-phenylalanine (DOPA) increased the stability of the doped paint and its resistance to weathering and abrasion, while their grafting on cellulose fibres enhanced their photocatalytic properties, thereby allowing for the reduction of the amount of NMs necessary to reach efficient photocatalysis [38]. Again, sonicating these surface-coated TiO2 NMs or TiO2-containing composites might reduce the effect of the SbD interventions. In addition, extensive sonication has been shown to alter the zeta potential of TiO2 and CeO2 NMs [39,40] and to cause re-agglomeration of Cu or Mn NMs [33,41]. In each case, it should be investigated what the exposure-relevant form of the NM or NEP is.

Moreover, samples could be contaminated by the release of Al and Ti from the sonication probe upon extensive sonication, potentially leading to toxicity [30,42]. Finally, extensive sonication of NMs in a growth medium containing proteins or in water with bovine serum albumin (BSA) (as in the procedure optimized within the Nanogenotox project [34]) could promote the degradation of proteins, leading to the formation of large aggregates of degraded proteins [43].

To conclude, for SbD hazard testing sonication should be carefully considered, and exposure relevancy should always be kept in mind. If exposure-relevant and stable dispersions in the exposure medium are obtained using simple methods such as vortexing, dispersion via sonication might not be needed. In the case of NMs that quickly agglomerate and form large clumps, more controlled sonication methods might be appropriate. For example, a protocol using minimum material-specific energy to reach a stable dispersion as described by DeLoid et al. (2017) could be used [23]. NMs should subsequently be characterized to ensure that no PC changes were produced that deviate from the exposure-relevant material. The PC properties of the NM tested should reflect the exposure conditions, whether it be the pristine NM with SbD interventions or the agglomerated NM. However, it has to be noted that unstable suspensions could lead to difficulties with reproducibility and/or interferences with the assay readout.

Finally, it might be recommended to also perform in vitro assays after extensive sonication, as this might be required for regulatory risk assessment. By doing this, the first steps towards compiling a dossier for regulatory compliance are made, and this might already indicate if any issues can be foreseen for market entry. Additionally, extensive sonication may provide a worst-case scenario in in vitro assays, which could fit in a precautionary approach. OECD guidance on sample preparation [44] is currently being revised to include considerations for the choice of a specific dispersion protocol rather than applying extensive sonication by default.

#### **Box 1**

For SbD hazard testing sonication should be carefully considered, and exposure relevancy should always be kept in mind. If exposure-relevant and stable dispersions in the exposure medium are obtained using simple methods such as vortexing, dispersion via sonication might not be needed. In the case of NMs that quickly agglomerate and form large clumps, more controlled sonication methods might be appropriate. NMs should subsequently be characterized to ensure that no PC changes were produced that deviate from the exposure-relevant material.

#### *3.2. Influence of Medium Components*

Supplementation of cell-culture medium with serum (i.e., foetal calf serum (FCS) and foetal bovine serum (FBS)) is common practice in cell culture procedures as it is required for cell growth and maintenance. When exposing the NMs in a test medium, constituents of the medium including proteins, amino acids, lipids, and sugars adsorb on the surface of NMs, leading to the formation of the so-called biomolecular corona [45]. This corona is highly dynamic and may change upon changing the composition of the test medium [46,47]. This dense layer of biological molecules can modify NM toxicity in several ways. Firstly, it could do so by masking the surface reactive sites of the NM [48]. Secondly, serum may stabilize the NM dispersion, leading to a lower dose delivered to the cells in in vitro assays, as has been shown for TiO2 NMs for example [49]. Lastly, a biomolecular corona may reduce NM surface energy, and thereby its cellular uptake via adhesion-induced endocytosis, as has been shown for SiO2 NMs [50,51].

These effects are clear when comparing the toxicity of NMs tested with and without serum. For instance, the cytotoxic potency of polystyrene NMs was found to decrease 2-fold when the exposure medium contained serum [52]. Similarly, the cytotoxicity of SiO2 NMs decreased up to 92%, and pro-inflammatory response decreased up to 87% when cells were exposed in medium with serum [53]. In addition, the species of origin of the serum could lead to different responses [54]. Addition of bovine serum albumin (BSA), a protein often used to help stabilize dispersions, has also been reported to reduce cytotoxicity [55,56].

In short, the addition of serum and BSA to the exposure medium may lead to lower toxicity in in vitro assays. Which approach is most suitable for SbD hazard testing should be explored further. Since SbD hazard testing is mostly focused on detecting hazard warnings, it could be argued that testing without serum is more appropriate, as it ensures a higher sensitivity. Additionally, when testing serum-free, a worst-case scenario could be mimicked without the protective effect of serum on NM-cell interaction. On the other hand, testing with serum is the more realistic approach as humans are rarely exposed to NMs without a biomolecular corona. Ultimately, the route of (potential) human exposure should be taken into account when selecting an exposure protocol, as systemically injected NMs will immediately be covered by serum proteins, whereas inhaled NMs will come in contact with epithelial lining fluids, which contain a different set of biomolecules. For SbD hazard testing, exposure relevancy is important, and a biocorona could be applied which corresponds to the route of exposure, such as lung-lining fluid for pulmonary exposure. In the context of exposure relevance, and in the context of the 3Rs (replacement, reduction, and refinement), the use of human serum or serum-free alternatives may be favoured over FBS.

#### *3.3. Determining Dose Delivered to Cells*

A particular challenge when testing NMs in vitro in adherent, submerged cell cultures is determining the delivered dose, i.e., the amount of material that reaches the cells. Settling of NMs depends on their density, size, and the properties of the cell-culture medium, as well as on their agglomeration state [57]. The latter is again influenced by the dispersion method used [58].

Determining the delivered dose in an in vitro experiment is an absolute requirement, even when performing a simple hazard screening for the purpose of SbD, and when performing high throughput screening (HTS) experiments. This is because the administered dose can differ substantially from the delivered dose that reaches the cells. For example, for particles that settle rapidly, the difference between administering 100 μL per well or 200 μL per well of the same concentration will mean a doubling of the amount of material per well, and thus a potential doubling of the delivered dose. Moreover, since sonication influences the agglomeration state, and the agglomeration state influences the settling rate and thus the dose delivered to cells, determining the delivered dose may help the comparison of data among independent experiments using different dispersion protocols. A visual representation of two example NMs with different settling rates is shown in Figure 3. For SbD hazard testing specifically, determination of delivered dose aids in a comparison to benchmark materials with known toxicity, as settling may differ greatly between NMs. The importance of determining the delivered dose was shown in a study by Pal et al. (2015), where a correction for the delivered dose led to a considerable change in the hazard ranking of a panel of NMs, after which the in vitro outcomes matched better with the in vivo results [59].
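The volume example above can be made concrete with a short calculation. The sketch below converts a nominal concentration and a dosing volume into the administered mass per well and per unit of growth area. The well area of 0.32 cm² (a typical 96-well plate value) and the 100 μg/mL concentration are assumptions chosen for illustration. Note that this is the administered dose only; it equals the delivered dose only in the limiting case where all material settles onto the cells within the exposure period.

```python
def administered_dose_per_area(conc_ug_per_ml: float,
                               volume_ul: float,
                               well_area_cm2: float) -> float:
    """Administered mass per unit growth area (ug/cm^2) for one well."""
    mass_ug = conc_ug_per_ml * volume_ul / 1000.0  # uL -> mL
    return mass_ug / well_area_cm2


WELL_AREA_CM2 = 0.32   # assumed growth area of a 96-well plate well
CONC_UG_PER_ML = 100.0  # assumed nominal concentration

dose_100ul = administered_dose_per_area(CONC_UG_PER_ML, 100.0, WELL_AREA_CM2)
dose_200ul = administered_dose_per_area(CONC_UG_PER_ML, 200.0, WELL_AREA_CM2)

print(f"100 uL per well: {dose_100ul:.2f} ug/cm^2")
print(f"200 uL per well: {dose_200ul:.2f} ug/cm^2")
print(f"ratio: {dose_200ul / dose_100ul:.1f}")
```

Two experiments reported at the same nominal concentration can therefore differ by a factor of two in the maximum dose that can reach the cells, which is one reason why reporting only the concentration in μg/mL is insufficient.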

**Figure 3.** Visual representation of two NMs with different properties, resulting in different doses delivered to the cells, when administered doses are equal.

The delivered dose can be modelled using the DG [60] or ISD3 model [61], both of which are currently available in the DosiGUI software generated in the PATROLS project [62]. DosiGUI is user friendly; however, these models also introduce uncertainty, as they do not take into account some critical factors such as particle convection [63] or dispersion-stabilizing surface functionalization. Moreover, cell stickiness, which is often unknown, has to be chosen from an arbitrary scale and has a large effect on the modelled delivered dose [64].

These models require the effective density of the NM as input, which is the density of the NM in a dispersion. In the case of agglomerates, this includes the density of the medium trapped inside the agglomerate. The effective density can be measured using analytical ultracentrifugation (AUC) or using the volumetric centrifugation method (VCM). In order to adhere to the criteria of SbD hazard testing, the VCM is preferred over AUC, as it is easier, less costly, and does not require specialized equipment [65]. It is important to measure model input parameters precisely, as small differences in input as a result of instrument variation may lead to large differences in the modelled deposited dose [64]. Stable dispersions are a requirement for modelling the dose rate and final dose delivered to the cells, as the calculations are based on a single size distribution. The model outcome, and thus the estimated deposited dose (rate), is less accurate if the size distribution of the dispersion changes over time due to agglomeration or aggregation.
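To illustrate why the effective density matters, the sketch below treats an agglomerate as a volume-weighted mixture of solid particle material and trapped medium, and then estimates its sedimentation velocity with Stokes' law. This is a deliberately simplified picture: it is not the calculation performed by the DG or ISD3 models, which also account for diffusion and other processes. All numerical values (particle density, solid volume fraction, diameter, medium density, and viscosity) are assumptions chosen for illustration.

```python
G = 9.81  # gravitational acceleration, m/s^2


def effective_density(rho_particle: float,
                      rho_medium: float,
                      solid_fraction: float) -> float:
    """Volume-weighted density (kg/m^3) of an agglomerate with trapped medium."""
    if not 0.0 < solid_fraction <= 1.0:
        raise ValueError("solid_fraction must be in (0, 1]")
    return solid_fraction * rho_particle + (1.0 - solid_fraction) * rho_medium


def stokes_velocity(diameter_m: float,
                    rho_eff: float,
                    rho_medium: float,
                    viscosity_pa_s: float) -> float:
    """Stokes sedimentation velocity (m/s) of a sphere in a viscous medium."""
    return (rho_eff - rho_medium) * G * diameter_m ** 2 / (18.0 * viscosity_pa_s)


# Assumed illustrative inputs (not measured values).
RHO_PARTICLE = 4000.0  # kg/m^3, dense metal-oxide-like material
RHO_MEDIUM = 1000.0    # kg/m^3, water-like culture medium
VISCOSITY = 1.0e-3     # Pa*s, water-like viscosity
SOLID_FRACTION = 0.3   # 30% of agglomerate volume is solid material
DIAMETER = 300e-9      # m, hydrodynamic diameter of the agglomerate

agglomerate_density = effective_density(RHO_PARTICLE, RHO_MEDIUM, SOLID_FRACTION)
v_agglomerate = stokes_velocity(DIAMETER, agglomerate_density, RHO_MEDIUM, VISCOSITY)
v_if_raw_density = stokes_velocity(DIAMETER, RHO_PARTICLE, RHO_MEDIUM, VISCOSITY)

print(f"effective density: {agglomerate_density:.0f} kg/m^3")
print(f"settling velocity (effective density): {v_agglomerate:.2e} m/s")
print(f"settling velocity (raw material density): {v_if_raw_density:.2e} m/s")
```

Under these assumptions, using the raw material density instead of the effective density overestimates the settling velocity severalfold, which shows how an inaccurate density input can propagate into the modelled deposited dose.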

The need to determine the delivered dose adds an extra step to SbD hazard testing, leading to a reduction of the achievable simplicity. Determining the delivered dose is however a requirement, even for SbD hazard testing, as more precise dosimetry will allow for more informative hazard testing. This, however, only applies to submerged testing, and not to experiments in which, for instance, an air-liquid interface (ALI) exposure protocol is followed. For ALI exposures, a quartz crystal microbalance (QCM) may provide sensitive and accurate deposited-dose measurements [66].

#### *3.4. SbD Hazard Testing of NEPs and NMs Released during the Life Cycle*

One of the most important aspects of SbD is assessing the safety of a product along its entire life cycle (LC) [5]. Usually, only pristine NMs are included in toxicity assays. This could be insufficient, as humans are also exposed to NEPs, aged NM and/or NMs released during the product LC including production, use, and end-of-life. The physicochemical characteristics of the NMs released to the environment along the different stages of the LC can be very different in terms of shape, chemical composition, agglomeration state, and surface modification [67–71]. Moreover, NEPs and NMs released during the LC might pose a different hazard than pristine NMs [72–75]. Thus, gathering information on the characteristics and hazard of NEPs and NMs released during the LC is important for designing relevant SbD interventions.

Processes leading to the release of NMs from a NEP during the entire LC can be simulated under laboratory conditions, after which NMs can be collected (e.g., by using filters) and redispersed in liquid for toxicity testing [76] or collected in liquid suspensions directly [77]. Realistic, released NMs, relevant for consumer exposure during the use of NEPs, can be obtained by using standardized methods (e.g., abrasion and weathering) that are normally used to test the durability and performance of NEPs. In the case of abrasion processes, there are different instruments that can be used to simulate mild or hard abrasion. Experimental parameters (e.g., cycles of abrasion, abrasion materials, normal load at the top of the abrader, etc.) can be tuned to reflect closely the NEP use conditions.

Aging experiments simulate conditions to which a product could be exposed during its use phase and are usually performed in a weathering chamber under accelerated conditions of UV exposure and rain. The weathering conditions (e.g., duration of cycles of light and rain, duration of the experiment, etc.) can be selected to follow international standards or be customized. In order to obtain higher quantities of released material (worst-case scenario), NEPs can be fragmented and sieved [78].

Released aerosols can be size-separated, e.g., by using a cascade impactor to isolate the inhalable or respirable fractions of NMs, after which they are collected on filters [79]. Efforts should be made to ensure high extraction efficiency and minimal compositional alterations when extracting material from filters [80]. Another option, which is less straightforward but better mimics real-life exposure, is the direct exposure of cells to the released material, as performed by Zarcone and colleagues [81] for diesel exhaust. Although this approach is somewhat more labour-intensive for SbD hazard testing, it might be useful for gaining a more fundamental insight into the toxicity of released materials (exposure-relevant material) without losing a fraction to filter extraction.

After obtaining and extracting the material from filters, the same actions should be taken as for pristine NM testing, such as an accurate dose determination, controlling for interference, endotoxin contamination testing, choosing an appropriate dispersion protocol, etc. For NEPs and NMs released during the LC, endotoxin contamination might pose an extra challenge as these materials are generally not produced in sterile environments. Finally, compatibility in submerged settings might pose additional challenges as a NEP matrix is often plastic-based and might float on culture medium.

#### Feasibility and Relevance

Obtaining sufficient amounts of NM that are released at a given stage of the LC can be challenging due to low emission rates, contamination with other substances (e.g., sanding material), as well as laborious and time-consuming procedures. The question is whether performing SbD hazard testing on pristine NMs is sufficiently relevant when assessing the safety of a NM along its entire life cycle. There are some examples in literature in which both pristine and released NMs have been tested to assess and compare their hazards. In most cases, materials released during the LC induced toxicity lower than or equal to that of the pristine NM, as has been shown in in vivo studies [75,82,83] as well as in vitro [84]. This means that testing pristine materials, albeit often far from representing the reality, can still represent a worst-case scenario. In this case, risk screening of NMs released from a NEP can be mainly based on emission rates combined with the hazard information of the pristine NM. However, it should be noted that very little is known about the toxicity of released NMs as compared to pristine NMs, and exceptions to this pattern may exist.

Similarly, testing pristine NMs may represent a worst-case scenario for aged NMs. For example, freshly ground quartz particles have been shown to induce higher levels of pulmonary inflammation and cytotoxicity as compared to aged quartz [85].

For now, it should be considered on a case-by-case basis whether testing forms other than the pristine NM is required. More research is needed to determine whether the use of the pristine NM in SbD hazard testing is sufficient due to its 'worst-case' nature, or whether testing aged NMs, and NMs released during LC is crucial for designing SbD interventions.

#### **Box 2**

It should be considered on a case-by-case basis whether testing forms other than the pristine NM is required. More research is needed to determine whether the use of the pristine NM in SbD hazard testing is sufficient due to its 'worst-case' nature, or whether testing aged NMs, and NMs released during LC is crucial for designing SbD interventions.

#### *3.5. Challenging NMs and Advanced Materials*

Hydrophobic NMs, NMs with low material density, multi-component NMs, and other advanced materials yet to be invented may show poor compatibility with commonly used in vitro assays. Applying a single standardized exposure method to all types of NMs will inherently give biased outcomes. For SbD hazard testing it is therefore important to consider the compatibility of NMs with challenging physicochemical properties and to be prepared for future novel advanced materials.

#### 3.5.1. Hydrophobic Particles

Since cells are always cultured in aqueous culture medium, hydrophobic particles can be particularly difficult to test. Carbon nanotubes (CNTs) and graphene-based particles, for example, are notorious for being difficult to disperse in culture medium. Ethanol prewetting, using different dispersion media [86], and adjusting sonication time and frequency have been shown to improve dispersibility of NMs. However, for some NMs a stable dispersion can never be achieved in cell-culture medium. For example, some CNTs are specifically designed to agglomerate in order to reduce their dustiness and thereby improve their safety. For cases such as these, dry exposures at the air-liquid interface (ALI) should be considered when focussing on potentially respirable NMs. This requires a cell type that can be cultured on membranes in medium on the basal side, while being exposed to the air on the apical side. For the generation of a dry (dust) aerosol for ALI exposure, several methods are available [87]. However, it should be kept in mind that if a dust cannot be generated in a laboratory setting, inhalation of the NM is very unlikely. Thus, in these cases the relevance of an inhalation study should be reconsidered.

#### 3.5.2. Buoyant NMs

NMs with a density lower than cell-culture medium (e.g., certain types of plastic particles or agglomerates with a low effective density) will float rather than settle over time, resulting in no contact with adhering cells in a classical, submerged in vitro setup. This will likely lead to an underestimation of the potency of NMs such as nano-plastics and liposomes [88]. One solution to the problem of buoyant particles is to perform an inverted 'overhead' cell culture, where the cells are not cultured on the bottom of a culture dish, but upside down on top of the exposure medium. With this approach, it was possible to produce a dose response for several floating particles, whereas the traditional approach did not show any results [88,89]. For buoyant NMs, where inhalation is the relevant exposure route, ALI exposures can also be an option.
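The link between particle density and settling direction can be illustrated with Stokes' law for a small sphere in a viscous fluid. The sketch below is only a rough illustration: the medium density (1000 kg/m3), the viscosity (0.001 Pa s) and both example particles are assumed values, not data from the studies cited above, and diffusion and agglomeration, which matter for real NMs, are ignored.

```python
def stokes_velocity(diameter_m, particle_density, medium_density=1000.0,
                    viscosity=0.001, g=9.81):
    """Terminal velocity (m/s) of a sphere according to Stokes' law.

    Positive values mean the particle sinks, negative values mean it floats.
    Densities are in kg/m3, viscosity in Pa s. The default medium properties
    are assumptions that roughly correspond to water.
    """
    density_difference = particle_density - medium_density
    return density_difference * g * diameter_m ** 2 / (18 * viscosity)


# Hypothetical 1 um agglomerates: a low-density polymer and a dense metal oxide
v_polymer = stokes_velocity(1e-6, 950.0)   # negative: rises to the surface
v_oxide = stokes_velocity(1e-6, 4230.0)    # positive: settles onto the cells

print(f"polymer: {v_polymer:.2e} m/s, oxide: {v_oxide:.2e} m/s")
```

A negative velocity indicates that the material moves away from cells at the bottom of the well, which is the situation in which an inverted culture or an ALI exposure would be considered.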

#### 3.5.3. Multicomponent NMs and Other Advanced Materials

In the past years, the more complex multicomponent nanomaterials (MCNM) have gained popularity. These next generation NMs consist of two or more materials or substances, giving rise to properties (e.g., reactivity) that are not equal to the sum of the properties of each component [90]. There are still many knowledge gaps when it comes to the toxicity of these and other novel advanced materials, which is why the concept of SbD is a suitable prevention-oriented approach. Whether these materials are compatible with the available toxicity assays is unknown, and this might pose challenges for future SbD hazard testing. Theoretically, MCNMs could exhibit multiple types of assay interference, attributed to the individual components of the MCNM. It is important to be aware of these challenges and to always assess interference.

#### **4. Evaluation of In Vitro Methods for SbD Hazard Testing**

#### *4.1. Cytotoxicity*

Measuring cell viability or cytotoxicity is a fundamental part of most hazard assessment strategies and integrated approaches to testing and assessment (IATAs) for several reasons. Firstly, cytotoxic potency (for example LC50) gives an indication of the relative hazard of a NM. Secondly, cytotoxicity assays allow for the selection of appropriate sublethal doses for further mechanistic testing (e.g., genotoxicity and inflammation). Lastly, for several mechanistic assays such as genotoxicity assays, cytotoxicity measurements are a requirement for the correct interpretation of the results. Cell viability can be determined by the measurement of various cellular parameters, such as mitochondrial activity, lysosomal integrity, and membrane integrity. Different endpoints should be included to assess cytotoxicity [91–93], as results from different assays do not always correspond [94].
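As an illustration of how a potency value such as the LC50 can be derived from viability data, the following minimal sketch interpolates linearly on a logarithmic dose axis. The function name and the example data are hypothetical, and in practice a fitted dose-response model (e.g., a four-parameter logistic curve) may be preferred.

```python
import math


def lc50_log_interp(doses, viability):
    """Estimate the LC50 by linear interpolation on a log10 dose axis.

    doses: ascending, positive dose values in any consistent unit.
    viability: fraction of viable cells (0-1, relative to control) per dose.
    Returns None when the data do not bracket 50 % viability.
    """
    pairs = list(zip(doses, viability))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(pairs, pairs[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            # Position of the 50 % level between the two bracketing points
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_lc50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_lc50
    return None


# Hypothetical data: doses in ug/cm2, viability as fraction of control
doses = [1, 10, 100]
viability = [0.9, 0.7, 0.3]
lc50 = lc50_log_interp(doses, viability)
print(lc50)
```

Doses below the estimated LC50 could then be selected as sublethal doses for mechanistic assays.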

#### 4.1.1. Most Frequently Used Assays, Strengths and Limitations

The most-used approaches for measuring cytotoxicity or cell viability in vitro include measuring mitochondrial activity (examples are MTT, MTS, XTT, and WST-1 assays), release of cytoplasm components (examples are LDH and AK), lysosomal integrity (Neutral red uptake), apoptosis markers (caspase 3/7), and stains that can specifically enter apoptotic and/or necrotic cells (Trypan blue, Propidium iodide, and Annexin V). Propidium iodide and Annexin V can be combined to determine plasma membrane restructuring which can be representative of either necrosis or apoptosis specifically. Most cytotoxicity assays are relatively simple, can be carried out in a 96-well microplate format, could be used for HTS, and have commercial kits available. Exceptions are assays that require microscopic evaluation of staining, as these are more labour-intensive.

For many cytotoxicity and viability assays, NMs can interfere with assay reagents and/or the optical readout [95–98]. Therefore, potential interference of the NMs with assays should always be assessed [92,99]. The elimination of NMs via high-speed centrifugation may reduce optical interference [92]. Alternatively, since mitochondrial activity is measured intracellularly, NMs can be washed away from cells prior to incubation with the reagent to avoid interaction of the NMs with the reagent [100]. For products measured in the supernatant (e.g., LDH and AK), washing is not feasible, but centrifugation can help remove larger NMs and thereby reduce optical interference. However, some NMs are known to inactivate or adsorb LDH directly [101]. If interference still occurs after taking precautions, it is advised to perform another type of cytotoxicity test, as the subtraction of the average background signal of the NMs will reduce the accuracy of the outcome [97,102,103]. For specific NMs, some assays might prove not to be compatible.

Much effort has been put into the optimization and standardization of in vitro cytotoxicity assays specifically for NMs in the past years. An ISO standard for the MTS assay was published in 2018 [104]. In 2021, an ISO standard was published for impedance measurements for NMs specifically [105]. This assay involves growing cells on an electrode during exposure to the NM. The detachment of cells, indicating cytotoxicity, is measured as a decrease in electrochemical impedance, as demonstrated in the assessment of polylactic acid NM-induced toxicity in A549 epithelial cells [106]. This assay is less prone to interferences as no optical readout and no assay reagents are required. However, it does require specialized equipment not available in many labs. Internationally standardized and harmonized standard operating procedures (SOPs) for other cytotoxicity assays have not been published to date.

In the NanoReg project, an interlaboratory study for the MTS assay was carried out, and acceptable robustness levels were found depending on the cell type. The human alveolar cell-line A549 showed a good agreement in cytotoxicity between labs, whereas the differentiated human monocyte cell-line THP-1 (dTHP-1) showed varying results and a poor robustness [107]. In a large interlaboratory study by Piret et al. (2017), a good robustness was found for the MTS assay and ATP content measurements. These comparisons were carried out using both A549 as well as dTHP-1 cell lines, and two different NMs. The authors stressed the importance of avoiding interference of the NM with the assay in order to obtain more reliable results, and a lower inter-laboratory variability. They also found that the caspase-3/7 assay showed a high inter-laboratory variability [100].

A large interlaboratory study involving eight labs investigated how to improve the robustness of the LDH and MTS assays. After a first round of experiments, adaptations to the protocols were made and robustness increased significantly within and between laboratories. Changes made to the protocols included the optimization of the differentiation of THP-1 cells and centrifugation after incubation with MTS reagent to remove NMs [108]. These findings on the MTS assay were confirmed in another interlaboratory study using the A549 cell line. Additional sources of variability were identified in this study. A549 cells from two different suppliers showed a large difference in cytotoxicity in response to polystyrene NMs. Also, the inclusion of serum caused large differences in cytotoxicity as compared with serum-free experiments. Moreover, differences in pipetting techniques (e.g., harsh aspiration vs. gentle pipetting and completely removing medium vs. partially removing medium before MTS incubation) and dispersion protocols were identified as causing differences in results between laboratories [52]. The importance of more elaborate and detailed SOPs was again stressed in a recent inter-laboratory study, where the inclusion of several acceptance criteria, such as maximum acceptable variation between replicates, minimum cell survival, and maximum interference levels, was found to improve the robustness of the MTS assay [102].
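Acceptance criteria of the kind mentioned above can be encoded as simple checks that are applied to every run. The sketch below is hypothetical: the function name, the interpretation of each criterion, and all thresholds (20 % coefficient of variation, 80 % survival, 10 % interference) are assumptions for illustration and are not taken from the cited study.

```python
from statistics import mean, stdev


def acceptance_report(replicate_signals, control_survival, interference_level,
                      max_cv=0.20, min_survival=0.80, max_interference=0.10):
    """Check one assay run against three illustrative acceptance criteria.

    replicate_signals: raw readouts of technical replicates (at least two).
    control_survival: viability of untreated control cells as a fraction (0-1).
    interference_level: NM-only signal as a fraction of the control signal.
    All thresholds are placeholder values and should come from the SOP in use.
    """
    cv = stdev(replicate_signals) / mean(replicate_signals)
    checks = {
        "replicate_variation": cv <= max_cv,
        "cell_survival": control_survival >= min_survival,
        "interference": interference_level <= max_interference,
    }
    checks["accepted"] = all(checks.values())
    return checks


# Hypothetical run: tight replicates, healthy control cells, low interference
report = acceptance_report([1.00, 1.05, 0.95], control_survival=0.92,
                           interference_level=0.04)
print(report)
```

Recording which individual criterion failed, rather than only the overall verdict, makes it easier to trace the source of variability between laboratories.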

#### 4.1.2. Predictivity and Relevance

Whether in vitro cytotoxicity assays are predictive of in vivo acute toxicity has been studied for years for soluble chemicals. For NMs, however, there are only a few studies that correlate in vitro cytotoxicity with in vivo toxicity. Therefore, in this section we have included not only studies that correlate in vitro cytotoxicity with in vivo markers of cell death (apoptosis, necrosis etc.), but also with any type of in vivo toxicity.

In general, the predictivity of cytotoxicity assays depends on the mechanism of action of the NM, as well as on the cell type used for the in vitro study [109–111]. NMs which exert their effect through the shedding of toxic ions are usually also cytotoxic in vitro [100]. In a comprehensive comparison study, in vitro cytotoxicity was compared to in vivo lung inflammation for several different particles, using comparable doses. LDH release and trypan blue exclusion assays were able to predict the inflammation-inducing effects of ion-shedding NMs, but not of poorly soluble NMs [112]. However, in another study, in vitro LDH release in response to poorly soluble TiO2 NMs correlated well to the in vivo number of polymorphonuclear cells (PMN) in BALF. This correlation was only present when the dose was expressed as surface area, and not when using mass as dose metric [113].
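To make the difference between the two dose metrics concrete, a mass dose can be converted into a particle surface-area dose by multiplying it with the specific surface area of the material. The sketch below assumes that a specific surface area (e.g., from BET measurements) is available; the function name and the example values are hypothetical.

```python
def surface_area_dose(mass_dose_ug_per_cm2, ssa_m2_per_g):
    """Convert a mass dose into a particle surface-area dose.

    mass_dose_ug_per_cm2: deposited mass per cell-layer area (ug/cm2).
    ssa_m2_per_g: specific surface area of the NM (m2/g), assumed to be known.
    Returns cm2 of particle surface per cm2 of cell layer.
    """
    grams = mass_dose_ug_per_cm2 * 1e-6   # ug -> g
    area_m2 = grams * ssa_m2_per_g        # g * m2/g -> m2
    return area_m2 * 1e4                  # m2 -> cm2


# Two hypothetical NMs given at the same mass dose of 10 ug/cm2
fine = surface_area_dose(10, 10)        # coarser material, 10 m2/g
ultrafine = surface_area_dose(10, 50)   # finer material, 50 m2/g
print(fine, ultrafine)
```

At an identical mass dose, the finer material in this example delivers five times more particle surface, which is one way in which a correlation can appear with one metric and not with the other.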

The toxic effects of CNTs in vivo upon inhalation are not easily predicted using in vitro cytotoxicity assays, unless the toxicity is caused by metal impurities [109]. Also for CNTs, the in vitro effects differ between cell types [114]. Other carbon-based materials, such as diesel exhaust also do not show an accurate correlation of cytotoxic response with in vivo effects. LDH release from A549 cells and LDH measured in BALF from rats upon instillation with diesel exhaust did not correspond, and even showed an opposite ranking in toxicity [115]. However, the suspensions used were not purely the particle fraction and contained other substances such as lube oil (which floats on culture medium), possibly causing the contrasting rankings.

The choice of cell type is crucial for performing a predictive in vitro cytotoxicity assay. For example, the WST-1 and NRU assays were able to establish an accurate ranking in toxicity of Ag, Au, SiO2, and MWCNTs, but IC50 values differed between the cell types used [114]. In a study by Sayes and colleagues, in vitro LDH release did not correlate with rat pulmonary LDH release and inflammation (% PMNs) for rat primary pulmonary macrophages and rat pulmonary epithelial L2 cells grown in mono-cultures. However, when grown in co-culture, in vivo LDH release and inflammation were accurately predicted via the in vitro LDH release for crystalline silica and ZnO (but not for amorphous silica) [94]. A similar study also showed a good correlation for this co-culture model for ZnO NM, but only at the highest (particle overload) dose [116].

When choosing a cell line, immune cells are found to give a higher prediction accuracy than fibroblasts [114]. When macrophages are thought to be involved in the toxicity of NM, it is especially important to select an immune cell type for testing cytotoxicity. THP-1 cells which were differentiated to macrophages (dTHP-1) showed a higher sensitivity for cytotoxic effects as compared to A549 cells (alveolar cell line) for a panel of 24 NMs [117]. Cho and colleagues found that differentiated peripheral blood mononuclear cells and isolated lung macrophages performed better compared to cell lines such as dTHP-1, A549, and 16-HBE [112]. It is generally accepted that primary cells are more sensitive than cell lines. However, primary cells are more difficult to work with and more expensive, and will therefore most likely be disfavoured for SbD hazard testing.

#### 4.1.3. Overview of Needs and Knowledge Gaps

Table 1 shows a summary of how the different cytotoxicity assays perform in terms of the criteria for SbD hazard testing. As cytotoxicity measurements are a requirement for several mechanistic assays, they are crucial for SbD hazard testing. Simple cytotoxicity assays, although optimizations for NMs are needed, serve as a good starting point for detecting hazard warnings in SbD hazard testing. It is recommended to include at least two different cytotoxicity assays, as different assays measure different mechanisms [92,118]. A combination of a mitochondrial activity assay and a membrane-integrity assay is recommended.


**Table 1.** Evaluation of suitability of cytotoxicity assays for SbD hazard testing.

Prediction accuracy of cytotoxicity assays should be investigated further. It was shown that predictivity depends on the cell type used and the mode of action (MOA) of the NM. Assay applicability domains should be mapped in more detail to understand which toxic effects can be predicted with in vitro cytotoxicity assays and which cannot. We also found that protocol optimization improves assay robustness. Moreover, interferences are quite common for cytotoxicity assays, and they can be avoided by taking the right precautions and including the right controls, which is crucial even when performing SbD hazard testing. Together, this indicates the need for optimised and standardized protocols for NMs specifically. This will in turn also aid the determination of the prediction accuracy of assays.

#### **Box 3**

As cytotoxicity measurements are a requirement for several mechanistic assays, they are crucial for SbD hazard testing. Simple cytotoxicity assays, although optimizations for NMs are needed, serve as a good starting point for detecting hazard warnings in SbD hazard testing. It is recommended to include at least two different cytotoxicity assays, as different assays measure different mechanisms. A combination of a mitochondrial activity assay and a membrane-integrity assay is recommended.

#### *4.2. Dissolution*

Although in vitro testing of dissolution is a measure of a PC property, and not directly a measure of toxicity, the results obtained can be used to infer potential toxicity, or even potential pathogenicity, by considering a material's biodurability or its transformation to ions or molecules. In this context, biodurability may be accompanied by biopersistence, which historically has been linked to the fibre pathogenicity paradigm such as that relating to asbestos, CNTs, and other respirable fibres [119], but also in relation to poorly soluble particles such as TiO2. Long-term inhalation exposure to poorly soluble particles can induce impaired clearance and chronic inflammation that might even progress to cancer, as has been observed for TiO2 in rats [120]. The human relevance of these results is a topic currently receiving a resurgence in interest within the scientific community [121]. Conversely, rapid dissolution of a substance can indicate exposure to potentially harmful soluble components, such as metal ions, which can be released in body compartments that are otherwise inaccessible.

Information on dissolution in relevant conditions is greatly beneficial for the hazard assessment of NMs and in defining SbD interventions. Information on dissolution is already a requirement of REACH and EFSA [21] and dissolution rates are a valuable criterion within all of the current risk assessment tools available for NM hazard assessment and can also be used for grouping/read-across [14,16,18].

#### 4.2.1. Most Frequently Used Assays, Strengths and Limitations

There are various methods used in in vitro testing of dissolution, acellular and cellular, which have not changed significantly for some time. There are a number of guidance documents, including ones from ISO (ISO/TR 19057:2017) and OECD (OECD GD No. 318, specifically for environmental studies), which provide the start of standardization of these techniques. The output of various EU projects (e.g., GRACIOUS, Gov4Nano and BIORIMA) will also greatly impact the development of this methodology.

For acellular testing, it is possible to test within static systems [122] or flow-through (dynamic) systems [123,124]. The application of these methods is extremely diverse, as the formulation of different simulant fluids may facilitate the simulation of any biological compartment, including extracellular and intracellular compartments, and any exposure route of interest including oral, dermal, or inhalation [125], with recommendations made within an ISO technical report (ISO/TR 19057:2017). There are a number of differences, both subtle and substantial, in the simulant fluids used that determine the accuracy of in vivo prediction. For example, components such as citric acid have a significant effect on the dissolution of certain metals, and inclusion of proteins/serum will also have an effect on dissolution [126]. These considerations have been recently reviewed [127]. Two recently completed projects (nanoGRAVUR and GRACIOUS) identify the abiotic flow-through system (ISO/TR 19057:2017) as the most relevant system [124,128], with a technology readiness level (TRL) identified as high/medium for metals using Inductively Coupled Plasma Mass Spectrometry (ICP-MS) analysis and medium/low for materials such as CNTs that require techniques such as Transmission Electron Microscopy (TEM) or X-Ray Photoelectron Spectroscopy (XPS).

Although there are currently no accepted methods for in vitro cellular dissolution testing, various studies have been conducted, and these may be more reflective of the in vivo response following inhalation of particles. However, concerns have been raised about the cellular methods.

#### Acellular Methods

Acellular dissolution can certainly be considered simple, considering the practical requirements. However, each methodology has different demands and associated limitations. For example, the solutes released during dissolution within a static system, especially those of a basic nature, may cause enhanced nucleation, precipitation, changes in localised pH, and/or saturation effects preventing further dissolution [129–132]; this behaviour is unlikely to reflect the in vivo behaviour, demonstrating clear limitations of static systems. Although dynamic systems, by design, circumvent these issues, they are not without limitations, and there are a number of factors which may affect their reproducibility [132]. NMs may pass through filters used in flow-through systems, leading to misinterpretation of results and potential false-positive results [133], or filters may become blocked and ruptured by components of the more complex fluids such as proteins or lipids [134]. Practically, dynamic systems are cumbersome due to the high volume of liquids required in long-running tests [135]. Acellular methods are not often compared directly, and where comparisons have been made the findings have been inconsistent. For example, the dissolution rate of gold nanoparticles has been found to be similar in static and dynamic systems [136], while the solubility of BaSO4 has been shown to differ between static and dynamic systems [137].

It is often reported that distinction between different material forms is possible, with a level of sensitivity allowing for a distinction between dissolution rates leading to grouping, as has been suggested for fibrous NMs within the GRACIOUS project [138]. This approach has been aligned with previously established methodology for man-made vitreous fibres (MMVF) and asbestos, whereby NM biodurability can be defined by the respective dissolution in alveolar fluid and lysosomal fluid. Good comparability has already been found for in vitro dissolution of MMVF materials and in vivo biopersistence [139], allowing confidence in this approach. Factors that strongly influence sensitivity include the analytical method used to detect released ions, as well as high background measurements caused by the complexity of fluids used, although this can be alleviated through the removal (or reduction) of specific metal components within the fluid, in line with the solutes expected to be released from the test material [140]. The current use of dissolution within RA tools may not be so greatly impacted by sensitivity, as the thresholds used are very broad and/or rather elementary, using a ranking based on dissolution time (ANSES, Swiss Precautionary Matrix) or soluble concentration (GUIDEnano). Advances have been made recently to include threshold decisions based on dissolution rate [21,123]. Nevertheless, unless very significant changes are made to the particle to result in very different dissolution behaviour, it is unlikely that the thresholds used will be sensitive enough to support meaningful SbD decisions on dissolution behaviour.
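To illustrate how a single dissolution measurement relates to a rate or a dissolution time, the sketch below converts a dissolved fraction into a rate constant and a half-time. It assumes simple first-order kinetics, which is an assumption made here for illustration and not a model prescribed by the tools or studies cited above; the function names and example values are hypothetical.

```python
import math


def first_order_rate(fraction_dissolved, time_days):
    """Rate constant k (1/day), assuming first-order dissolution kinetics.

    Derived from fraction_remaining = exp(-k * t). Not defined for complete
    dissolution (fraction_dissolved == 1).
    """
    if not 0 <= fraction_dissolved < 1:
        raise ValueError("fraction_dissolved must be in the range [0, 1)")
    return -math.log(1 - fraction_dissolved) / time_days


def half_time(k):
    """Dissolution half-time (days) for a first-order rate constant k."""
    return math.inf if k == 0 else math.log(2) / k


# Hypothetical result: 20 % of the material dissolved after 7 days
k = first_order_rate(0.20, 7)
print(f"k = {k:.4f} per day, half-time = {half_time(k):.1f} days")
```

Because the half-time depends on the logarithm of the remaining fraction, modest changes in the measured dissolved fraction shift the half-time only moderately, which is consistent with the point that broad thresholds may not distinguish between design variants.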

In terms of compatibility, the potential of acellular tests is broad, and other than particular hydrophobic materials, it is difficult to list examples that could not be tested. In fact, these acellular methods have been used for some time to resolve the time-kinetic release of metals within complex materials, such as man-made fibres, or from occupational dusts such as welding fumes [141]. The biological predictivity of acellular tests, although not always established within the literature, has been demonstrated to a relatively high level, with various promising outcomes. For example, the solubility of BaSO4 in the dynamic system was considered to reproduce what is known for the solubility of BaSO4 in vivo, while the results of the static system underestimated this [137]. With the use of a lysosomal simulant fluid, the dynamic dissolution system was also shown to replicate cellular dissolution of BaSO4 (in conjunction with SrCO3 and ZnO) in rat macrophage models [123], and similarly acellular dissolution of MoO3 in the same lysosomal simulant fluid was found comparable to dissolution within mouse macrophage models [142]. These studies, and others, have demonstrated that by using various simulated biological fluids of intracellular compartments and/or lung lining fluid, a number of correlations with either cellular assays or in vivo exposures can be attained [123,143–146]; however, it should be acknowledged that there is an equal number of studies that have shown no correlation, raising the concern for appropriate fluid selection [127].

#### Cellular Methods

The basic principle of the cellular method is simple and can be performed cheaply, as typical assessments investigate dissolution within cells up to 24 h. The difficulty with this methodology lies in successfully analysing the ions released. There are various options for separating cells and supernatant, such as centrifugal ultrafiltration and cloud point extraction [147]. However, it is not always as straightforward as the acellular assessments, as released ions may form complexes with biomolecules and therefore separation may be hampered [148]. Additional concerns may arise from the complexing of ions to biomolecules. Therefore, studies have often opted to determine both the ion concentration and the NM concentration to aim to avoid false positives or false negatives [149].

Koltermann-Jülly et al. (2018) found that macrophage-assisted dissolution in vitro was only applicable for an exposure period of 1–2 days, which they believe is too short and may be responsible for the low amounts of ions detected for the NMs tested [123]. The authors state that using cellular systems gives no additional benefit to the abiotic flow-through system with regards to predicting the in vivo response. This conclusion, however, is based only on the three materials tested. Moreover, when a cellular method uses uptake as an inference of dissolution, overestimating particle concentrations may occur, when measurements are not only of internalised particles but also of those adhered to the cell membrane. However, this is likely to be resolved by following well-described methodology which includes steps to limit this interference such as ensuring thorough washing and etching of the cells prior to analysis to remove adhered particles [150]. Further promising methodologies for this include the isolation of NMs and ions from cells using Triton X-114-based cloud point extraction as has previously been conducted for intracellular Ag NMs and Ag+ isolation [147,149].

#### 4.2.2. Overview of Needs and Knowledge Gaps

Table 2 shows a summary of how the different dissolution assays perform in terms of the criteria for SbD hazard testing. There is a wealth of studies available for interpretation of in vitro dissolution methods, and although there are promising findings, there are still too many uncertainties to be sure of which model is most appropriate or reliable for specific materials. For SbD hazard testing, the use of a static system would be preferred, due to its simplicity. Although correlations with in vivo outcomes have been shown for static and flow-through systems, this is not always the case, and therefore requires further attention. Going forward, assessing predictivity will be important, especially when assessing novel materials; encouragingly, as shown above, for some substances a strong relationship between acellular, cellular, and in vivo findings has already been observed. It has previously been suggested either that specific in vitro methods (e.g., specific fluid choices) should be selected based on feasible degradation pathways [142], which could depend on specific degradation routes (e.g., complexation or protonation), or that robust and fully accurate biological simulations should be established [127]. Moreover, if these methods are to be applied in grouping and read-across approaches, the development of reliable and robust methods for determining particle dissolution rates has been considered paramount [123].

**Table 2.** Evaluation of suitability of dissolution assays for SbD hazard testing.


#### **Box 4**

There is a wealth of studies available for interpretation of in vitro dissolution methods, and although there are promising findings, there are still too many uncertainties to be sure of which model is most appropriate or reliable for specific materials. For SbD hazard testing, the use of a static system would be preferred, due to its simplicity.

#### *4.3. Oxidative Potential and Oxidative Stress*

The oxidative potential (OP) of a NM is a chemical property that defines the ability of a NM to form potentially toxic species such as hydroxyl (•OH) and superoxide (O2•−) radicals and hydrogen peroxide (H2O2) (collectively called reactive oxygen species (ROS)), or reactive nitrogen species (RNS) through redox reactions. This parameter is part of many NM hazard assessment strategies and grouping approaches due to its potential as a predictor of toxicity [13–15,18]. The pros and cons of OP assays in NM research have been reviewed extensively [156,157].

Oxidative stress (OS) is a cellular state in which the amount of ROS, caused by NM OP or the release of reactive ions, overwhelms the cells' antioxidant capacity, potentially leading to the oxidation of biomolecules, inflammation, and oxidative DNA damage [158,159]. OS is seen as an important key event in the mode of action of many NMs and is therefore important to quantify as an early warning indicator [156,160,161].

#### 4.3.1. Most Frequently Used Assays, Strengths and Limitations

#### Acellular Methods

In an acellular assay, OP is usually measured as the rate of depletion of a reductant. OP assays do not measure reductant depletion by OP only, since the release of reactive ions by dissolution will also lead to depletion. Multiple acellular assays have been proposed to evaluate NM OP. The acellular dichloro-dihydro-fluorescein (DCFH) assay, electron paramagnetic resonance (EPR) spectroscopy (also termed electron spin resonance (ESR) spectroscopy), and the Ferric Reducing Ability of Serum (FRAS) assay are frequently used and have been evaluated extensively in the literature [162–167]. An additional assay which is especially relevant for measuring NM OP is the haemolysis assay. For this assay, red blood cells are isolated from whole blood, and the ability of NMs to disturb their membranes is measured through absorbance. This assay requires whole blood, but no cell culturing, making it an easy and cost-effective method. No interferences have been reported in this assay, but it has not been studied extensively for NMs. There is no information available on robustness, and there is no publicly available standardized protocol.
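As an illustration of what "rate of depletion" means in practice, the following minimal sketch fits a straight line to reductant concentration over time and subtracts the rate observed in a particle-free blank. The function names, time points, and concentrations are hypothetical and are not taken from any of the cited protocols; real assays may use a different kinetic model or a single end-point reading.

```python
def depletion_rate(times_min, conc_uM):
    """Least-squares slope of concentration vs. time, sign-flipped so that
    a falling concentration gives a positive depletion rate (uM per min)."""
    n = len(times_min)
    if n < 2 or n != len(conc_uM):
        raise ValueError("need at least two paired time/concentration points")
    mean_t = sum(times_min) / n
    mean_c = sum(conc_uM) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, conc_uM))
    return -sxy / sxx


def net_depletion_rate(sample_uM, blank_uM, times_min):
    """Depletion rate of the NM sample minus that of the particle-free blank."""
    return depletion_rate(times_min, sample_uM) - depletion_rate(times_min, blank_uM)


# Hypothetical readings (assumed values, for illustration only).
times = [0, 10, 20, 30]                # minutes
blank = [100.0, 99.0, 98.0, 97.0]      # reductant alone, slow auto-oxidation
sample = [100.0, 95.0, 90.0, 85.0]     # reductant plus NM

print(net_depletion_rate(sample, blank, times))
```

Note that the blank-corrected rate still lumps together depletion caused by surface reactivity and depletion caused by released ions, which is the limitation described above.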

Out of the other acellular assays, standardized protocols are only available for EPR/ESR (i.e., ISO 18827:2017). These assays require relatively expensive equipment, not always available in standard laboratories. FRAS and DCFH are considered simple and low-cost assays, requiring equipment that is present in any standard biological lab [165,166,168]. The FRAS method was originally developed to measure the ferric reduction in blood plasma (FRAP) [169]. It has been adapted to be used with serum, optimized for smaller volumes [165] and for multi-dose measurements, while showing good sensitivity and reproducibility for several metal-bearing NMs [163]. Recently, the FRAS protocol was successfully adapted to measure the reactivity of graphene-based materials by adding a filtration step for NMs of very low density [168]. Interlaboratory studies for the FRAS assay have not yet been performed.

The robustness of the acellular DCFH assay using different NMs was evaluated in a recent inter-laboratory study. A good robustness was found for the positive control NMs when normalizing fluorescence values between labs. However, for the other NMs, interlaboratory reproducibility differed per particle type [170]. Several papers reported NM interference with, for example, the fluorescent readout of the DCFH assay [162,166,171]. Zhao and Riediker (2014) identified several other factors that could reduce the reliability of the acellular DCFH assay, such as the use of different dispersing agents, as well as using too-high concentrations of NM [172]. Interference by way of NM flocculation or optical interference has been noted regarding the FRAS assay when testing various NM pigments [173], while no NM interference has been reported for ESR/EPR.
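The effect of normalizing fluorescence values between labs can be sketched as follows. This is only one plausible normalization (scaling each lab's blank-corrected signal to its own positive control) and is an assumption, not the procedure of the cited inter-laboratory study; the lab names and readings are hypothetical.

```python
from statistics import mean, stdev


def normalise_to_positive_control(signal, blank, positive_control):
    """Express a raw fluorescence signal as a fraction of the lab's own
    blank-corrected positive-control signal."""
    span = positive_control - blank
    if span <= 0:
        raise ValueError("positive control must exceed the blank")
    return (signal - blank) / span


def interlab_cv_percent(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical raw fluorescence units from three labs (assumed values).
labs = {
    "lab_A": {"blank": 100.0, "positive": 1100.0, "test_nm": 600.0},
    "lab_B": {"blank": 200.0, "positive": 2200.0, "test_nm": 1200.0},
    "lab_C": {"blank": 50.0, "positive": 550.0, "test_nm": 300.0},
}

raw = [r["test_nm"] for r in labs.values()]
normalised = [
    normalise_to_positive_control(r["test_nm"], r["blank"], r["positive"])
    for r in labs.values()
]

print("CV of raw signals (%):", interlab_cv_percent(raw))
print("CV of normalised signals (%):", interlab_cv_percent(normalised))
```

In this constructed example the raw signals differ widely between labs because of instrument gain, while the normalised values coincide. Real data will not collapse this cleanly, and normalization cannot remove particle-specific interference of the kind reported for the DCFH readout.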

The EPR, DCFH, and FRAS assays have recently been evaluated for NM-grouping purposes. Results showed that the sensitivity of the methods greatly depends on the type of particle studied. For example, CuO, BaSO4, and Mn2O3 were consistent in their reactivity level across the three methods, but ZnO and CeO2 only showed a response in the FRAS assay, and not in EPR and DCFH measurements [173]. This might suggest that reactive species produced by certain NMs are captured better by some assays than by others, or that the FRAS assay is more sensitive in general. The latter has been confirmed in several studies that showed that the FRAS and ESR/EPR perform better than the DCFH assay in terms of sensitivity [163,171,174,175].

The choice of assay should depend on the goal of testing. For example, if one would like to know which types of radicals are formed in order to know what to change in the NM design as a SbD intervention, ESR/EPR measurements with different spin traps will provide the most informative results [176]. The FRAS assay can provide a more general image of ROS generating potential, as a result of the cocktail of antioxidants that is present in serum. The DCFH assay is especially sensitive for one-electron oxidizing species (such as hydroxyl radicals) [177]. It should be taken into account that these assays measure the OP of the NM in the specific environment required by the assay. OP is greatly influenced by the exposure environment, and therefore the OP measured in the assay may not fully reflect the OP in a real-life exposure scenario. A relevant protein corona could be applied to ensure exposure relevancy, as is described in Section 3.2.

#### Cellular Methods

Cell-based assays can directly measure the intracellular ROS, irrespective of their origin (i.e., as a result of the surface chemistry of the NM, as a cell-generated signalling molecule, or as a defence mechanism of the cell within the phagolysosome), for example through use of the cellular DCFH-DA assay. Other options include the assessment of the effect of these radicals on biomolecules such as lipids (e.g., lipid peroxidation) and proteins (e.g., protein carbonylation), cellular antioxidant status (e.g., glutathione (GSH:GSSG ratio)), and antioxidant gene regulation (HO-1 expression and Nrf-2 reporter cell lines), of which the latter two are extensively described in Boyles et al. (2016) [178].

It has been suggested that measuring OS in a cellular system has advantages over measuring the OP in acellular systems. By measuring in a cellular environment, the cell's ability to defend itself against the induced OS is taken into account, ROS-induced genotoxicity can be assessed, and mechanisms other than the OP that lead to OS are captured as well [156]. Other mechanisms leading to OS, such as through mitochondrial perturbation, have been shown for chemicals extracted from diesel exhaust particles [179,180], yet there is no convincing evidence that NMs are capable of inducing OS through mechanisms other than OP or ion release.

#### 4.3.2. Predictivity and Relevance

For SbD hazard testing, it would be desirable to be able to predict human health effects or at least effects that are observed in studies in experimental animals with simple, fast, and cheap acellular OP assays. The ability of acellular assays to predict cellular oxidative stress and in vivo oxidative stress markers is quite good, as shown in a comprehensive review by Moller et al. (2010), but not for all particles and all test systems [160]. It has been shown that NMs can induce different types of ROS [181] and therefore it depends on the type of NM and their MOA whether assays can predict cellular and in vivo effects. For example, data derived with the haemolysis assay correlate very well with in vivo pulmonary inflammation for a panel of 13 metal oxide NMs (92% prediction accuracy), whereas EPR (69% prediction accuracy) and DCFH (77% prediction accuracy) results showed lower correlations [164]. The haemolysis assay was able to predict in vivo pro-inflammatory responses of both NMs that act through soluble ions as well as NMs that act through surface reactivity with a prediction accuracy of 62.5% for a panel of eight NMs [112]. The FRAS assay has even been shown to be able to correctly distinguish OPs between several types of CNTs [175]. In a study comparing pulmonary inflammation (PMN influxes) upon inhalation of a range of NMs with acellular ESR and DCFH results, the correlation was reasonable; however, here it was concluded that ESR measurements in macrophages give a higher prediction accuracy than the acellular assays [182]. For SiO2 NMs, EPR results correlated very well with in vitro cellular cytotoxicity [183]. ESR also correlated well with in vitro protein carbonylation for a large panel of NMs [184]. However, ESR as well as the FRAS assay were able to accurately predict only 50% of the in vivo outcomes for a panel of 35 NMs [162].
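The prediction accuracies quoted above can be read as the share of materials for which the in vitro call (positive or negative) matches the in vivo call. The sketch below shows that calculation; the per-material calls are hypothetical placeholders, and only the panel sizes (13 and 8 NMs) and the resulting percentages come from the text. That 12, 10, and 9 concordant calls out of 13 round to 92%, 77%, and 69% is an arithmetic observation, not a reported breakdown.

```python
def prediction_accuracy(in_vitro_calls, in_vivo_calls):
    """Percentage of materials whose in vitro call matches the in vivo call."""
    if not in_vitro_calls or len(in_vitro_calls) != len(in_vivo_calls):
        raise ValueError("need two non-empty call lists of equal length")
    concordant = sum(a == b for a, b in zip(in_vitro_calls, in_vivo_calls))
    return 100.0 * concordant / len(in_vivo_calls)


# Hypothetical panel of 13 materials: True = pro-inflammatory, False = not.
in_vivo = [True] * 7 + [False] * 6
in_vitro = list(in_vivo)
in_vitro[0] = not in_vitro[0]          # one discordant call (assumed)

print(round(prediction_accuracy(in_vitro, in_vivo)))  # 12 of 13 concordant

# Counts out of 13 that round to the other quoted percentages.
for concordant in (10, 9):
    print(concordant, round(100.0 * concordant / 13))
```

A single accuracy figure hides whether the errors are false positives or false negatives, which matters for the interpretation given in the next paragraph.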

False positives in acellular OP assays (when compared to in vivo outcomes) can be explained by the fact that cells and organisms can resolve ROS to a certain extent. Therefore, effort should go towards establishing thresholds for these assays. False negatives in acellular OP assays can be explained by the fact that mechanisms other than OP can lead to pulmonary inflammation as well, which cannot be detected by these assays. The large variation in prediction accuracy between studies could be explained by the differences between the NM panels tested. Each assay has a specific applicability domain and prediction accuracy will therefore depend on the NM types and the resulting types of ROS.

In general, cellular assays show a higher prediction accuracy than acellular assays, and a combination of both might perform even better [157,162,185]. However, for SbD hazard testing, acellular assays may already give a good indication of toxicity and could serve as a valuable initial screening in the very early stages of NM development.

#### 4.3.3. Overview of Needs and Knowledge Gaps

Table 3 shows a summary of how the different OP assays perform in terms of the criteria for SbD hazard testing. There are clear indications that only measuring acellular reactivity would be sufficient for SbD hazard testing, when cellular testing is already performed for cytotoxicity, genotoxicity, and pro-inflammatory effects. OP assays can be used to categorize the materials and to explore in more depth if the OP can or should be reduced in a SbD intervention. Mapping the prediction accuracy for each assay, as well as an applicability domain will help understand which assays can be used to predict which specific effects.


**Table 3.** Evaluation of suitability of oxidative potential assays for SbD hazard testing.

#### **Box 5**

There are clear indications that only measuring acellular reactivity would be sufficient for SbD hazard testing, when cellular testing is already performed for cytotoxicity, genotoxicity, and pro-inflammatory effects.

#### *4.4. Inflammation*

Many IATAs and testing strategies include the measurement of the inflammatory potential of NMs since this is generally accepted as one of the key mechanisms of NM toxicity [15,19,20]. Pulmonary inflammation in response to NM exposure has been shown to lead to several adverse health effects, such as fibrosis as well as lung cancer in animal studies [120,186,187]. For oral exposure, inflammation is a key parameter in NM toxicity as well [188]. However, this section will focus on assays targeting the pulmonary route of exposure only.

#### 4.4.1. Most Frequently Used Assays, Strengths and Limitations

It is impossible to capture the complexity of an in vivo inflammatory response in an in vitro model, where recruitment of inflammatory cells other than those already present cannot occur. It is however possible to detect the cytokines responsible for this recruitment in an in vitro experiment. The most widely used approach to assess inflammatory responses in in vitro assays is measuring cytokine production or secretion, using, for instance, an enzyme-linked immunosorbent assay (ELISA), RT-qPCR, or multiplex-based immunoassays [189] after exposing cultured cells to NMs. Measuring the levels of pro-inflammatory cytokines and other inflammatory mediators may give insight into the mechanisms of the immunomodulatory effects of NMs in vitro, such as inflammasome activation or dendritic cell maturation. Cytokines of specific interest for NM pulmonary toxicity are, amongst others, IL-8 as a marker for neutrophil recruitment [190], IL-1β as a marker for NLRP3 inflammasome activation [191], and TNF-α as a marker for macrophage activation [192].

Cytokine release can be measured in e.g., epithelial cells, macrophages, and dendritic cells, cultured in mono- and co-cultures. Cells can be exposed in a submerged setup or at the air-liquid interface (ALI), where cells are cultured in contact with the air and exposed to aerosols on the apical side whilst kept in medium on the basal side, better resembling the physiological environment of cells in the respiratory tract (Figure 4).

**Figure 4.** Submerged (**left**) and ALI exposures (**right**) to NMs. Submerged exposures are considered easier, whereas ALI exposures are considered more physiologically relevant for inhalation (and dermal and intestinal) exposures.

A critical factor when assessing pro-inflammatory responses in cell models is that some NMs can interfere with common in vitro assays. SWCNT and MWCNT can nonspecifically adsorb TNF-α and IL-8 to their surface, and TiO2 NMs have been described to be able to adsorb IL-8, thus causing a false-negative result in ELISA assays [99,118]. This effect has also been observed for Ag NM in combination with TNF-α and IL-8 [100]. NMs are also known to interfere with the components of the ELISA. This problem can be overcome by centrifugation to remove the NMs from the supernatant before performing the ELISA. It is also essential to test the NMs for endotoxin contamination, as endotoxins can induce inflammation at very low concentrations, leading to false positive results [193,194], especially since NMs are generally not produced in a sterile environment.

Despite the relevance of pro-inflammatory effects of NMs in human health, there is currently no validated test method available to investigate inflammatory responses in vitro. Submerged assays have been used far longer compared to the relatively new ALI models, and thus more advances in standardization and optimization have been accomplished. Only a few studies have been performed to show the robustness of one of these protocols. At the ALI, Calu-3 cells with and without macrophages (either differentiated THP-1 cells (dTHP-1) or primary cells) showed high reproducibility in seven participating labs based on measurements of membrane integrity and mitochondrial activity. Cytokine release however showed higher variability, although similar trends between the seven labs were observed [195]. The reproducibility between labs after exposure of A549 cells at the ALI to lipopolysaccharide (LPS) as a positive control was found to be quite low. However, after protocol optimizations, special training of personnel for cell handling, and homogenization of disposables and reagents, the reproducibility increased [196]. The reproducibility of results between labs when using dTHP-1 cells is a frequent topic of debate. Not only do they show varying responses to NMs, but also to a positive control such as LPS, as shown in a large inter-laboratory study [100]. In another large inter-laboratory study by Xia et al. (2013), it was shown that good results can be obtained when using very detailed protocols and using the same batch of serum and cells. They also showed that cell-culture conditions and the duration of differentiation greatly affect the variability of dTHP-1 cells between labs [108].

A frequently used alternative to dTHP-1 macrophages is monocyte-derived macrophages (MDMs), obtained from donor blood. Even though they are considered more predictive of the in vivo situation, they are also known for their donor-to-donor variation. The same holds true for the use of commercially available primary epithelial cells, which are considered more relevant, but also show considerable variation [197].

In ALI exposure systems, there are many other factors that may contribute to an increased variability, such as the accuracy of the microbalance in the exposure system, the quality of the nebulizer used, the method of sample preparation, etc. A comprehensive overview of factors that can influence reliability and robustness can be found in Petersen et al. (2021) [198].

In terms of predictivity, commercially available primary cells generally give a good indication of in vivo effects for known inflammation-inducing particles. Studies have shown that the pro-inflammatory effects of quartz [197], Ag NMs [199], SiO2 NMs [200], and Pd and Cu NMs [201] are accurately predicted using primary cell models. Co-cultures of cell lines are also able to predict in vivo responses in many cases, as has been shown for quartz [94,202], CuO [112], and ZnO [94]. In this latter study, it was shown that co-cultures perform better than the two cell types separately, showing the importance of interplay between epithelial cells and immune cells. The addition of macrophages seems to be crucial in order to capture a much wider domain of immunological responses as compared to epithelial cells only, as has been shown in multiple studies [108,203,204]. Epithelial cell lines in mono-cultures were not able to predict the toxic effects of quartz [205], but did accurately detect Ag NMs' pro-inflammatory effects [112].

In short, primary cells are the most sensitive, followed by co-culture systems with macrophages, and then mono-cultures. There are however some studies that suggest otherwise. Cho et al. (2013) showed that cell lines performed similarly to primary alveolar macrophages and differentiated PBMCs in terms of accuracy [112]. The A549 epithelial cell line in tri-culture with inflammatory cells did not pick up the pro-inflammatory effect of Ag NMs [206]. Mono-cultures of the epithelial cell line 16-HBE better predicted in vivo effects of Ag NMs than when in co-culture with dTHP-1 cells [207]. Furthermore, CeO2, Co3O4, and NiO NMs induced an increase in granulocytes in BALF, whereas no pro-inflammatory effects were seen in submerged mono- and co-cultures [112]. Finally, BEAS-2B and dTHP-1 cells were able to predict a ranking in pro-inflammatory effects of several types of CNTs which corresponded to in vivo markers of lung fibrosis in two separate studies [208,209]. This could mean that cell lines could be suitable for SbD hazard testing. Likely, different modes of action of toxicity require different levels of complexity in a cell model. In order to be sure about the predictive capacity of the different cell types, more types of NMs should be tested.

The exposure method chosen will also impact the predictivity of the method. Exposing ALI-cultured cells is generally considered a more sensitive approach, since it is more physiologically relevant, as has been shown in multiple studies [210–212]. However, for SbD hazard testing it is desirable to work with a model that is as simple and cost-effective as possible. This disfavours the use of primary cells and favours simple submerged exposure systems as opposed to the more complex ALI cultures. There are strong indications that simple submerged models could be predictive enough for SbD hazard testing. For example, Loret et al. (2016) concluded that co-cultures were indeed more sensitive than monocultures, and that ALI exposures were more sensitive than submerged cultures. However, the general ranking of the NMs in terms of their toxicity was similar across the various exposure methods [213]. A study by Di Ianni et al. (2021) showed a strong correlation between in vitro submerged co-cultures and in vivo results when testing CNTs [190]. In a study by Herzog et al. (2014), the pro-inflammatory effects of Ag NMs were not detected in ALI culture conditions, but were detected under submerged conditions, suggesting a better performance of the submerged model [214]. Submerged and ALI exposures performed equally well for cytotoxicity in response to TiO2 [215]. In a study by Panas et al. (2014), submerged conditions were more sensitive in detecting the pro-inflammatory effects of SiO2 NMs as compared to ALI conditions [216]. Altogether, the potential for submerged experiments to predict in vivo responses has been shown in multiple studies, and is worth exploring further, especially for SbD hazard testing.
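One way to make "the general ranking of the NMs was similar across exposure methods" quantitative is a rank correlation between the responses obtained under two exposure conditions. The sketch below computes Spearman's rho in plain Python (average ranks for ties, then Pearson correlation of the ranks). It is not the analysis used in the cited studies, and the response values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("ranks are constant; correlation is undefined")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical cytokine fold changes for four NMs (assumed values).
submerged = [1.2, 3.5, 2.0, 8.0]
ali = [2.0, 9.0, 4.5, 30.0]

print(spearman_rho(submerged, ali))
```

In this constructed example the ALI responses are larger in magnitude but order the four materials identically, so the rank correlation is 1 even though the absolute sensitivities differ. For SbD screening, where the aim is often to rank design variants, agreement in ranking may matter more than agreement in effect size.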

The differences in sensitivity between the ALI and submerged exposures can be explained by many (potentially confounding) factors. Firstly, certain cell types such as A549 are suggested to produce surfactant at the ALI, but not under submerged conditions, making them more vulnerable to toxic effects in a submerged experiment [217]. Secondly, the effective dose in submerged experiments is not always (correctly) calculated, and this may lead to a skewed comparison to ALI and in vivo results. Thirdly, the medium used in the submerged experiments may have an impact on NM behaviour in terms of the protein corona and dissolution rate, which does not occur, or occurs differently, in ALI experiments. Lastly, studies finding a good correlation between an in vitro model and in vivo results are more likely to be published, leading to publication bias. In a comprehensive overview of different cell types and exposure methods by McLean et al. (in preparation), it was shown that for quartz hazard prediction, the strength of in vitro prediction of in vivo responses was highly inconsistent, and largely dependent upon the data and study quality, which highlighted a need for robust SOPs which take into account numerous requirements for in vitro/in vivo extrapolation.

There are several advantages of using ALI over submerged exposures. Particle alterations due to interaction with medium (such as the formation of a protein corona, dissolution, agglomeration) are no longer an issue, and calculating the deposited dose is much easier as compared to submerged experiments [212,218]. ALI exposures are compatible with a wider range of NMs, including hydrophobic particles, as exposures can be performed using a powder. Without having to take into consideration their behaviour in medium, abrasion products of NEPs can also directly be applied, making this type of model especially interesting for assessing life-cycle considerations, as is crucial for SbD hazard testing. Using ALI exposures is however more time-consuming and less high-throughput. Additionally, robustness of deposited dose after nebulization in an ALI setup may be low, depending on the NM used [219].

#### 4.4.2. Overview of Needs and Knowledge Gaps

Table 4 shows a summary of how the different inflammation assays perform in terms of the criteria for SbD hazard testing. For SbD hazard testing, it is crucial to include tests that are predictive yet simple and cost-effective. Therefore, based on the current literature, the use of a submerged co-culture model including at least a type of macrophage might be the most suitable. However, more research is needed to confirm that simple methods are predictive enough for early hazard screening by testing data-rich NMs. Moreover, novel and advanced NMs should be tested in the available cell models in order to determine the compatibility of the cell models and readouts with different types of NMs. Where better predictivity is needed, where issues with dosimetry and medium interactions must be avoided, or where hydrophobic particles are tested, a simple ALI experiment can be set up for SbD hazard testing.

A short-lived inflammatory response is beneficial to help clear NMs from the lung, and macrophage recruitment may not necessarily be a hazard warning. There is still a poor understanding of what level of inflammation could be considered an adverse outcome, especially when measured in vitro. Establishing meaningful thresholds for these assays is important. Moreover, since pulmonary inflammation is mostly a chronic adverse effect, more work should focus on predicting chronic effects with in vitro assays, for which a promising start has been made in the PATROLS project [220].


**Table 4.** Evaluation of suitability of inflammation assays for SbD hazard testing.

#### **Box 6**

For SbD hazard testing, it is crucial to include tests that are predictive yet simple and cost-effective. Therefore, based on current literature, the use of a submerged co-culture model including at least a type of macrophage might be the most suitable. However, more research is needed to confirm that simple methods are predictive enough for early hazard screening.

#### *4.5. Genotoxicity*

One of the main safety concerns related to NMs is their possible genotoxicity [221,222]. Genotoxicity describes the capacity of a chemical or physical agent to produce genetic damage that, if left unrepaired, may lead to cancer [223]. Therefore, every mutagen is potentially carcinogenic [222].

Due to the important consequences to human health, mutagenicity is a hazard endpoint required in all product regulations (chemicals, biocides, pharmaceuticals, medical devices, food additives, cosmetics, etc.) [224]. The assessment of genotoxicity is based on validated in vitro assays, which can be followed up by validated in vivo assays, depending on the in vitro outcome and the regulation involved [225]. Therefore, genotoxicity assessment at an early stage of innovation is highly advised. In fact, genotoxicity is a key endpoint in most of the testing strategies developed for NMs [13,19,221,226,227].

#### 4.5.1. Most Frequently Used Assays, Strengths and Limitations

The mutagenicity of chemicals is usually evaluated on the basis of a battery of standard genotoxicity assays, able to detect gene mutations, chromosomal damage, and aneuploidy, as all these different mechanisms need to be considered in the assessment [221,224,227]. A core in vitro battery comprising the Ames test (detecting bacterial gene mutations) plus the in vitro micronucleus test (detecting chromosomal damage and aneuploidy) was already proposed 10 years ago for soluble chemicals [228]. Nearly 100% (958 out of 962) of rodent carcinogens or in vivo genotoxins were correctly detected with these two tests, which makes this battery a particularly sensitive combination [224]. However, the specificity of both assays together was unacceptably low (12.0%), giving rise to a high rate of false-positive results [229]. Hence, most of the EU regulations require a follow-up in vivo study when in vitro positive results are obtained [224]. In the case of NMs, the Ames test does not appear to be a suitable method as some NMs may not be able to penetrate through the bacterial wall, whereas others may kill the bacteria due to their bactericidal effects [230–234]. Based on this evidence, results obtained with this method should be followed up with other gene mutation assays using mammalian cells [21,235], or better yet, the Ames test should be avoided for NMs.

A roadmap for the genotoxicity testing of NMs was suggested some years ago [236], followed by guidance and common considerations [237]. There are two OECD TGs for assessing in vitro mammalian gene mutations: the In vitro Mammalian Cell Gene Mutation Tests using the Hprt and xprt genes (OECD TG 476), and the In vitro Mammalian Cell Gene Mutation Tests Using the Thymidine Kinase Gene (OECD TG 490). The latter, also called the mouse lymphoma assay (MLA), can detect a broader spectrum of genetic damage than the former, including chromosome rearrangements, deletions, and mitotic recombination [236]. Both assays are time-consuming, requiring long culture times (e.g., 10–14 days before counting colony formation), which has probably precluded an extensive use of these assays. For soluble chemicals, the sensitivity and specificity of the MLA were reported to be 73.1 and 39.0%, respectively, resulting in a prediction accuracy of 62.9% [224]. In the case of NMs, given the low number of studies performed with these assays, and the wide variety of NMs included in these few studies, it has not been possible to draw any conclusions concerning the relative sensitivity of the various reporter genes to the potential mutagenicity of NMs [236]. Nevertheless, there are ongoing efforts to adapt the HPRT assay for use with NMs, e.g., within the EU H2020 RiskGone project, where round robin activities are ongoing.

Among the assays detecting chromosome damage, the In vitro Mammalian Cell Micronucleus (MN) Test (OECD TG 487) has been the most extensively used in nanotoxicology [221,230]. The assay detects chromosome mutations induced by either clastogenic or aneugenic agents. In addition, it also detects most mutagenic events as most mechanisms leading to gene mutation also induce chromosome mutations [228]. For soluble chemicals, the sensitivity and specificity of the MN assay were reported to be 78.7 and 30.8%, respectively [229], resulting in a concordance of 67.8%. In the case of NMs, few papers have evaluated similar materials, and there is substantial variation in the methodology applied, which precludes drawing conclusions on the reproducibility and predictivity of this assay [236].
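The sensitivity, specificity, and concordance figures quoted above for soluble chemicals are linked by simple arithmetic, sketched below. Sensitivity is the share of true positives detected, and overall concordance is the average of sensitivity and specificity weighted by the share of true positives in the test set. The back-calculated positive fractions are derived here under the assumption that each concordance value was computed on the same data set as the corresponding sensitivity and specificity; they are not figures reported in the cited sources.

```python
def sensitivity_percent(true_positives, false_negatives):
    """Share of true carcinogens/genotoxins that the test flags, in percent."""
    return 100.0 * true_positives / (true_positives + false_negatives)


def concordance_percent(sens, spec, positive_fraction):
    """Overall share of correct calls for a given fraction of true positives."""
    return sens * positive_fraction + spec * (1.0 - positive_fraction)


def implied_positive_fraction(sens, spec, concordance):
    """Fraction of true positives that reconciles the three quoted figures."""
    return (concordance - spec) / (sens - spec)


# Ames + MN battery: 958 of 962 positives detected (from the text).
print(round(sensitivity_percent(958, 962 - 958), 1))

# MLA and MN assay: quoted sensitivity, specificity, concordance (percent).
for name, sens, spec, conc in (("MLA", 73.1, 39.0, 62.9), ("MN", 78.7, 30.8, 67.8)):
    p = implied_positive_fraction(sens, spec, conc)
    print(name, round(p, 2), round(concordance_percent(sens, spec, p), 1))
```

The calculation illustrates why a concordance near 65% can coexist with a specificity of only 30 to 40%: when most test substances are true positives, overall accuracy is dominated by sensitivity and says little about the false-positive rate.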

The most classically used version of the MN assay is the cytochalasin-blocked MN assay, which includes the use of cytochalasin B, a cytokinesis blocking agent that enables the identification of dividing cells [238]. However, as cytochalasin B may impair NM intracellular internalization, leading to false-negative results in the MN assay [239], it is recommended to successively treat the cells with NMs, and then with this agent [240,241]. Since cells should undergo mitosis for binucleated cells to form, the use of serum in exposure medium is recommended. The need for both proliferating cells and cells accumulating NMs in the MN assay has been illustrated recently while optimizing the MN assay on 3D cell models. The 3D EpiDerm™ skin model accumulates fewer NMs than 2D skin cells, resulting in fewer MNs upon exposure to genotoxic ZnO NMs [242]. Moreover, HepG2 spheroids still hold the capacity to proliferate while HepaRG spheroids do not, and genotoxic ZnO NMs do not show a positive outcome in the MN assay in 3D HepaRG while they do on the 3D HepG2 model [242,243]. An adaptation of this TG for NMs, within the OECD project 4.95 ('Guidance Document on the Adaptation of In vitro Mammalian Cell Based Genotoxicity TGs for Testing of Manufactured Nanomaterials'), is currently ongoing based on previous recommendations [236]. One of the first round robin studies on the in vitro MN assay was performed within the NanoGenotox project [244], involving 12 laboratories, and comparing the genotoxicity of three reference materials. Relatively reproducible results were obtained in some cases, but they were material- and cell line-specific. A similar conclusion was reached by Louro et al. (2016) on four benchmark MWCNTs in two lung epithelial cell lines [245]. One reason could be the low fold increase over control values, as was also pointed out in the genotoxicity assessment performed by Elespuru et al. (2018) [236]. Currently, round robin tests are planned within the OECD project 4.95 and the EU H2020 RiskGone project.

Lately, the MN assay has been applied in co-culture systems involving inflammatory cells (e.g., THP-1 cells) and target cells (e.g., lung epithelial cells) allowing the evaluation of the mechanisms of action (primary vs. secondary) underlying NMs' genotoxicity [246,247]. Genotoxins operating by a secondary mechanism of action, mediated by inflammation, are assumed to have a threshold response [248].

Classically, the MN assay has involved a labour-intensive manual scoring under the microscope. However, the speed of the analyses can nowadays be increased by using automated microscope scoring platforms [249], or flow cytometry [250–252]. The latter has recently been adapted to NMs [253].

The other validated method for assessing chromosome damage is the In vitro Mammalian Chromosome Aberration (CA) Test (OECD TG 473). This assay has been used much less because it is more time-consuming and requires a significant level of expertise to score the aberrations [236]. Furthermore, the CA assay does not detect aneugens, while the MN assay does (OECD TG 473, 2016). Hence, the CA assay would not be recommended for SbD hazard testing.

Furthermore, HTS approaches that could be applied to the testing of NMs are currently in development. These methods are non-OECD-guideline methods but have proved efficient at detecting potential NM genotoxicity. The comet assay is by far the most employed among these assays, and it could complement the recommended in vitro mutagenicity assays [236]. During the past decade, some effort has been dedicated to increasing its throughput, with the highest achieved in the 96-minigel version using gel bond films [254]. It has been optimized and successfully applied to assess the genotoxicity of NMs within the FP7 NanoREG project [255].

Lastly, one commonly used genotoxicity assay for NMs is the immunolabelling of DNA repair protein foci, such as gamma-H2AX, which form during DNA double-strand break repair. The background of DNA double-strand breaks in cells is generally very low (although some exceptions exist, such as in some cancer cell lines), which makes these assays very sensitive. High-throughput versions of the assays exist. Foci can be counted using automated microscopy platforms or flow cytometry, with the advantage of their rapidity and possibility of analysing other cell parameters such as cell viability or apoptosis simultaneously [256]. Such high-throughput methods have rarely been applied on advanced 3D cell models for assessing NM genotoxicity because they necessitate additional steps. For example, a high-throughput comet assay can be performed on 3D cells after enzymatic and mechanical dissociation of the spheroid [257], reducing simplicity. Still, such advanced models could increase the predictivity of the assay. For example, Ag NMs cause significant DNA damage in a 2D liver-cell system, while the outcome of the comet assay is insignificant in 3D HepG2 spheroids, which is similar to most of the in vivo studies published up to now reporting Ag NM genotoxicity via comet assay [258].

#### 4.5.2. Overview of Needs and Knowledge Gaps

Table 5 shows a summary of how the different genotoxicity assays perform in terms of the criteria for SbD hazard testing. One of the main problems for determining the sensitivity and predictability of the genotoxicity assays when assessing NMs is the absence of nano-sized particulate controls. NM-specific controls have rarely been demonstrated [236,259], making comparisons among labs difficult. Furthermore, it is not possible to establish historical positive control ranges that would confirm the sensitivity of the tests [259]. Based on the currently available information, we follow the recommendations by Elespuru et al. (2018) [236] to use the MN assay in combination with a gene-mutation assay (HPRT or MLA). In the meantime, further optimizations of genotoxicity assays for testing NMs are ongoing.

**Table 5.** Evaluation of suitability of genotoxicity assays for SbD hazard testing.


For advanced models, such as 3D models, the 3D skin comet and micronucleus assays are sufficiently validated for conventional chemicals and individual OECD TGs could start being developed [260]. However, the 3D airway and liver models are still lacking assays that could measure micronuclei and gene mutations, respectively [260]. Working with NMs raises additional technical hurdles that need to be overcome. Nevertheless, advanced models will offer advantages over current assays, especially by better mimicking the human body response and by making it possible to evaluate modes of action, e.g., secondary genotoxicity [261].

#### **Box 7**

Based on the currently available information, we follow the recommendations by Elespuru et al. (2018) to use the MN assay in combination with a gene mutation assay (HPRT or MLA). In the meantime, further optimizations of genotoxicity assays for testing NMs are ongoing.

#### **5. Discussion and Outlook**

The development, manufacturing, and use of NMs with novel properties and potentially undesired health risks are growing at a rapid rate. One way to reduce potential adverse effects caused by NMs is to incorporate SbD in NM development processes. Within a SbD approach, the potential hazards of a NM throughout the life cycle are assessed at an early stage of product innovation. Although advancements have been made in terms of nano-specific risk assessment strategies [13,19,138,227], several challenges remain when applying these strategies to SbD:


In this review, in vitro toxicity assays have been critically assessed in terms of their suitability for SbD hazard testing. The main purpose of SbD hazard testing is the identification of early hazard warnings and obtaining a general idea of the potential hazards of a novel NM, NEP, or components released thereof during the LC. It therefore serves as a first screening during the early stages of the development of a new NM or NEP. For SbD hazard testing, a balance needs to be sought between simplicity and comprehensive testing that addresses all concerns (Figure 5). The more elaborate the assessment, the lower the remaining uncertainty, but the more likely the testing becomes too complex for the purpose of SbD.

**Figure 5.** The balance of SbD hazard testing. SbD aims to address safety at an early stage in the product development process. On the one hand, SbD tries to be comprehensive to address all concerns, while on the other hand the approach should be simple.

#### *5.1. Assay Predictivity*

#### 5.1.1. Early Hazard Warnings

Correlating in vitro effects to in vivo potency has not yet been possible [109] and is not a requirement for SbD hazard testing. The identification of hazard warnings and detecting the most potent NMs is more relevant, and in this review it was shown that many assays are capable of doing this. The prediction accuracy of the evaluated assays in many cases depends on the type of particle, sample preparation, as well as the type of cell system used. The effects of NMs that act through ion shedding, such as Ag and ZnO NMs, were most accurately predicted across all toxicity endpoints. Current in vitro assays were less capable of predicting the effects of NMs acting through surface reactivity, and fibre-like NMs. Another important finding across several toxicity outcomes was the better predictivity of primary cells as compared to cell lines, as well as better predictivity of macrophage-like cells as compared to other cell types. Nevertheless, there are clear indications that simple submerged assays might be suitable for prediction of adverse effects in vivo, as was also shown in a recent review by Di Ianni et al. (2022) [262].

#### 5.1.2. Hazard Ranking

The ability of an assay to establish an accurate toxicity ranking is valuable for SbD hazard testing, as it allows the assay to be used for comparisons between candidate NMs and against benchmark materials with known toxicity. There are indications that in vitro cytotoxicity assays are able to predict an adequate ranking in toxicity which corresponds to in vivo pulmonary inflammation [114]. Also, the detection of cytokines at the ALI could potentially detect a ranking that corresponds to in vivo pro-inflammatory mediators [211]. However, neither of these studies could draw definitive conclusions on comparable rankings. Two studies showed that simple submerged cell lines are able to produce accurate rankings in pro-inflammatory effects that corresponded to in vivo markers of fibrosis [208,209]. It is important to note that the accuracy of toxicity rankings depends strongly on the dose delivered to the cells, which should therefore always be calculated [59].
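One way to quantify how well an in vitro ranking agrees with an in vivo ranking is a rank correlation. The sketch below computes Spearman's rank correlation coefficient in plain Python. It is only an illustration: the material labels and all numerical values are hypothetical placeholders, not data from the studies cited above, and the choice of Spearman's coefficient is an assumption rather than the method used in those studies.

```python
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Return ranks (1 = smallest value); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical example: potency per delivered dose (in vitro) versus an
# in vivo inflammation marker for four placeholder materials.
materials = ["NM-A", "NM-B", "NM-C", "NM-D"]
in_vitro_potency = [0.8, 0.1, 0.5, 0.3]
in_vivo_marker = [12.0, 1.5, 4.0, 6.0]
rho = spearman(in_vitro_potency, in_vivo_marker)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient close to 1 indicates that the two systems order the materials similarly. With only a handful of materials the estimate is very uncertain, and it is only meaningful if the in vitro potencies are expressed per delivered dose.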

#### 5.1.3. Applicability Domains

A low prediction accuracy of an assay could be improved by exploring the exact applicability of the assay. Ensuring that the specific MOA that caused the in vivo toxicity can be detected using the assay will reduce the rate of false negative outcomes. The applicability domain of each assay should be well-understood (which assay can predict what kind of in vivo (human) toxicity) to be able to use assays that are fit-for-purpose. When looking at the transition to animal free testing in general, this is one of the issues that needs addressing for soluble chemicals as well [263]. In that respect, in vitro toxicity testing of NMs should be mechanism-based by looking for specific effects or MOAs [264]. If the applicability domain of assays and cell models is established with more certainty, it is possible to combine assays into a strategy to holistically assess NM toxicity in vitro. In such a strategy, assay and cell type selection would be facilitated by a combination of applicability domains and the most relevant exposure route.

#### 5.1.4. Prediction Accuracy

Prediction accuracy in terms of sensitivity and specificity should be determined for more assays, to increase knowledge about assay reliability. Determination of accuracy is a crucial step during the validation process of in vitro assays in general [265]. For example, for regulatory genotoxicity testing, it is known that in vitro assays have a high sensitivity but low specificity. This means that there is a high chance of false positives, and positives should always be confirmed in an in vivo study. A better understanding of the prediction accuracies of in vitro assays used for SbD hazard testing would help enormously with the interpretation of results.
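As a reminder of how these two quantities are obtained, sensitivity is the fraction of true in vivo positives that the in vitro assay also flags, TP / (TP + FN), and specificity is the fraction of true in vivo negatives that the assay correctly leaves unflagged, TN / (TN + FP). The following minimal sketch computes both from paired binary outcomes. The function name and the example outcomes are hypothetical and chosen only to illustrate a high-sensitivity, low-specificity situation.

```python
from typing import Dict, List


def prediction_accuracy(in_vitro: List[bool], in_vivo: List[bool]) -> Dict[str, float]:
    """Compute sensitivity and specificity, taking in vivo outcomes as reference."""
    if len(in_vitro) != len(in_vivo):
        raise ValueError("outcome lists must have the same length")
    pairs = list(zip(in_vitro, in_vivo))
    tp = sum(1 for pred, ref in pairs if pred and ref)
    fn = sum(1 for pred, ref in pairs if not pred and ref)
    tn = sum(1 for pred, ref in pairs if not pred and not ref)
    fp = sum(1 for pred, ref in pairs if pred and not ref)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference positive and one negative")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical outcomes for five materials (True = positive result).
in_vitro_results = [True, True, True, False, True]
in_vivo_results = [True, True, False, False, False]
metrics = prediction_accuracy(in_vitro_results, in_vivo_results)
print(metrics)
```

In this made-up example every in vivo positive is detected (sensitivity 1.0), but two of the three in vivo negatives are flagged in vitro (specificity of about 0.33), which is the pattern described above for regulatory genotoxicity testing.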

#### 5.1.5. Challenges in Assessing Predictivity

Although in vitro assays have been used for some time to test NM toxicity, not all the criteria could be evaluated properly due to the lack of or limited availability of high-quality data. Especially for predictivity, there is a data gap that needs to be filled in order to correctly interpret results. Several factors complicate the assessment of prediction accuracy of in vitro assays. Since there is a lack of human data on NM toxicity, predictivity of assays is at present evaluated compared to in vivo data derived from studies in experimental animals. This means that an in vitro model comprised of human cells is being compared to animal data, to predict a human response. The relevance of this approach is questionable, due to the differences between humans and experimental animals [266,267]. The lack of deposited dose calculations, interference controls, proper characterization, and varying sample preparation protocols (e.g., the use of serum, different media, different dispersion techniques) across the literature add another layer of complexity to assessing the predictivity of assays. Since these factors can have such an impact on assay outcome, assay standardization will aid in determining assay predictivity for adverse human health effects. Additionally, the lack of clear positive and negative controls for NMs hampers the assessment of prediction accuracy.

Finally, it must be noted that in the papers reviewed here, an optimistic perspective is given about the predictivity of in vitro assays for toxicity and adverse health outcomes in vivo. We should however be cautious, as negative results or results with a low correlation with known in vivo or human health responses may not reach publication: a phenomenon known as publication bias.

#### *5.2. Outlook for Innovators, Regulators, and Industry Based on Current Knowledge*

The most important knowns and unknowns with regard to NM SbD hazard testing are summarized in Table 6. Figure 6 shows what we think is the road forward towards successfully putting SbD hazard testing into practice. The successful implementation of SbD hazard testing requires efforts from innovators, regulators, as well as from industry.


**Table 6.** Overview of the most important findings in this review, including knowns and needs for SbD hazard testing.



**Figure 6.** Factors that became evident throughout this review that are crucial for putting SbD hazard testing into practice. Protocol standardization is key for SbD hazard testing, as well as for better understanding structure–activity relationships and prediction accuracies of assays.

#### 5.2.1. A Change in Mindset towards Purpose-Driven Innovations

The current European policy landscape (the European Green Deal, the European Chemical Strategy for Sustainability and the Zero Pollution Action Plan [2,3,268]) demands a new mindset for innovating. SbD provides an approach aiming at developing safer NMs and NEPs by integrating safety into the innovation process and material development in a LC thinking approach, from design to end-of-life. Any innovation that does not have a green or sustainable purpose will not survive.

#### 5.2.2. Starting In Silico: Databases and SARs

SbD hazard assessment should first and foremost be based on material knowledge and material–activity relationships. Before even starting in vitro experiments, an elaborate evaluation of available physicochemical data should be performed [220]. Here, certain hazard warnings could already be noticed. For example, the structure–activity relationship (SAR) of high aspect ratio NMs (HARNs) and mesothelioma risk is widely accepted [138]. It would be unnecessary to perform hazard testing on HARNs, as it would already be clear beforehand that these materials raise a hazard warning. Another potential hazard warning would be respirable crystalline silica particles, due to their structure–activity relationship with silicosis and lung cancer [269,270]. Knowledge on structure–activity relationships is especially important for the identification of potential hazards and application of SbD interventions.

For novel advanced materials, however, limited information on these toxicity-driving properties is available, and the SbD decisions are mostly based on SbD hazard-testing outcomes.

#### 5.2.3. Importance of Experimental Design

The physical aspects of NMs add another dimension to the complexity of toxicity testing. It should always be considered that the way the experiment is carried out (dispersion protocol, medium type, addition of serum) affects the outcomes and that the behaviour of the particle in the culture dish (settling, agglomerating, floating, dissolution, formation of protein corona) should always be analysed [23,49,53,58,59]. Checking and accounting for assay interference is crucial, also for SbD hazard testing and high throughput screening, where it is often overlooked [97]. This makes SbD hazard testing for NMs more challenging than that of soluble chemicals.

#### 5.2.4. Combinations of Assays

An integrated approach to testing and assessment (IATA) that can combine information from multiple sources (available data, in silico tools, in vitro assays) is the way forward towards an effective early hazard identification of NMs and NEPs and for the development of SbD interventions. This review discusses simple assays since it focusses on the initial stages of innovation. However, at more advanced stages, SbD hazard testing may also include approaches that are not as simple and cost-effective [264]. With regards to the transition to animal-free alternatives, the focus on simplicity as is required for SbD hazard testing should not create a barrier for the development of more realistic and innovative cell models with potentially better predictivity, such as induced pluripotent stem cells (iPSCs) and organoids.

For inflammatory potential, chronic inflammation (leading to tissue damage and remodelling as well as loss of functions) is the adverse outcome of concern, which is presently not captured by one or more in vitro tests. An acute pro-inflammatory effect in an in vitro assay as measured by cytokine secretion, in combination with slow dissolution, indicating high bio-persistency [14], might together indicate that the NM induces chronic inflammation. Combining assay outcomes in SbD hazard testing should be further explored.
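The idea of combining an acute pro-inflammatory signal with slow dissolution can be written down as a simple decision rule. The sketch below is a hypothetical illustration only: the field names, the fold-change cutoff, and the dissolved-fraction cutoff are assumptions made for the example and are not validated thresholds from this review or from any framework.

```python
from dataclasses import dataclass


@dataclass
class AssayOutcome:
    """Outcomes of two assays for one NM (hypothetical field names)."""
    cytokine_fold_change: float  # cytokine secretion relative to negative control
    dissolved_fraction: float    # fraction dissolved at end of assay, 0.0 to 1.0


def chronic_inflammation_warning(
    outcome: AssayOutcome,
    fold_change_cutoff: float = 2.0,   # hypothetical cutoff, illustration only
    dissolution_cutoff: float = 0.10,  # hypothetical cutoff, illustration only
) -> bool:
    """Flag a warning only if the NM is both pro-inflammatory and slow to dissolve."""
    pro_inflammatory = outcome.cytokine_fold_change >= fold_change_cutoff
    biopersistent = outcome.dissolved_fraction < dissolution_cutoff
    return pro_inflammatory and biopersistent


# Hypothetical examples: a persistent and a readily dissolving material
# with the same acute cytokine response.
persistent = AssayOutcome(cytokine_fold_change=3.5, dissolved_fraction=0.02)
soluble = AssayOutcome(cytokine_fold_change=3.5, dissolved_fraction=0.90)
print(chronic_inflammation_warning(persistent))
print(chronic_inflammation_warning(soluble))
```

The rule deliberately requires both conditions, so a readily dissolving material with an acute cytokine response is not flagged for chronic inflammation, although it could still raise a warning for other endpoints.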

#### 5.2.5. Thresholds for Toxicity

In order to raise hazard warnings and to interpret results from combinations of assays, thresholds are needed. This is especially challenging for inflammatory potential assays. Macrophages are the major defence mechanism against foreign materials, and their activation is crucial for the clearance of NMs [192]. It is unclear when a beneficial immune response turns into persistent pulmonary inflammation in vivo, and how to predict this in vitro.

Previously established frameworks have made a step towards generating thresholds for toxicity. The Nanoreg2 framework and the Swiss Precautionary Matrix score NMs as low, medium, or high hazard according to their fold change increase as compared to a negative control [271]. The Nanoreg2 framework adds a scoring system that allows for combining outcomes of different assays, and subsequent comparison of different NMs. With both approaches, a significantly positive response in an assay might still lead to a classification as low hazard.
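The principle of scoring by fold change over a negative control can be sketched as follows. The cutoff values used here are hypothetical placeholders chosen for illustration; they are not the values used by the Nanoreg2 framework or the Swiss Precautionary Matrix, and the function names are likewise assumptions.

```python
def fold_change(treated: float, negative_control: float) -> float:
    """Response of the treated sample relative to the negative control."""
    if negative_control <= 0:
        raise ValueError("negative control response must be positive")
    return treated / negative_control


def hazard_band(
    fc: float,
    medium_cutoff: float = 2.0,  # hypothetical cutoff, illustration only
    high_cutoff: float = 5.0,    # hypothetical cutoff, illustration only
) -> str:
    """Map a fold change onto a low / medium / high hazard band."""
    if fc >= high_cutoff:
        return "high"
    if fc >= medium_cutoff:
        return "medium"
    return "low"


# Hypothetical read-out: 150 units in treated cells versus 100 in control.
fc = fold_change(treated=150.0, negative_control=100.0)
print(fc, hazard_band(fc))
```

With these placeholder cutoffs, a 1.5-fold increase falls into the low band even if it were statistically significant, which illustrates the point made above about banded scoring.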

Since SbD hazard testing is performed as an early screening, and its main goal is determining early hazard warnings, a zero-tolerance principle might be more suitable in this case (as is common practice in the pharmaceutical industry). For primary genotoxicity, a zero-tolerance principle is already in place in regulatory risk assessment, as genotoxic carcinogens are regarded as having no threshold and thus an acceptable exposure level cannot be derived [224]. For SbD hazard testing, it could be argued that a worst-case approach would be suitable for the other endpoints as well, meaning that any indication of inflammation, reactivity, or cytotoxicity at relevant doses would raise a hazard warning. Here it is important to consider the possibility of false negatives produced in the assays.

For SbD hazard testing, the inclusion of benchmark NMs with known in vivo toxicity is recommended to compare the new NM to existing information. Thresholds could be set according to the response of the benchmark NM in a specific assay. Alternatively, an appropriate ranking in potency of NMs could be useful for making SbD decisions when comparing several candidate NMs.
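Both options mentioned above, a benchmark-derived threshold and a potency ranking of candidates, can be expressed in a few lines. This is a minimal sketch under stated assumptions: the function names, the `fraction` parameter, and all response values are hypothetical, and responses are assumed to be measured in the same assay at the same delivered dose.

```python
from typing import Dict, List


def exceeds_benchmark(
    candidate_response: float,
    benchmark_response: float,
    fraction: float = 1.0,  # hypothetical: share of the benchmark response used as threshold
) -> bool:
    """Raise a hazard warning if the candidate reaches the benchmark-derived threshold."""
    return candidate_response >= fraction * benchmark_response


def rank_by_potency(responses: Dict[str, float]) -> List[str]:
    """Order candidate NMs from most to least potent in one assay."""
    return sorted(responses, key=lambda name: responses[name], reverse=True)


# Hypothetical responses (same assay, same delivered dose).
benchmark = 4.0
candidates = {"candidate-1": 1.2, "candidate-2": 4.6, "candidate-3": 2.9}

warnings = {name: exceeds_benchmark(resp, benchmark) for name, resp in candidates.items()}
ranking = rank_by_potency(candidates)
print(warnings)
print(ranking)
```

Lowering `fraction` below 1.0 makes the rule more conservative, for example when the benchmark material itself is known to be highly toxic in vivo.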

#### 5.2.6. Assay Standardization

In Figure 6, assay standardization is shown as the element connecting many important aspects. As mentioned throughout this review, assay standardization is a key need for the further development of SbD hazard testing, as well as for putting SbD into practice. Firstly, we showed that in vitro-in vivo comparisons are hampered by the lack of standardized protocols. Moreover, fundamental research into structure–activity relationships will benefit from standardized protocols as well. Assay standardization will result in more high-quality fundamental data on the MOAs of toxicity of NMs, which will in turn aid the refining of SbD hazard testing. Ultimately, standardization will increase the chances of industrial use and acceptance of these assays into existing legal frameworks, which will make incorporating SbD approaches more appealing for manufacturers [272].

On the other hand, the complexity of NM toxicity testing hampers the standardization of assays. It is for example impossible to create one exposure method suitable for all NMs, especially considering NMs of the future which will possess yet unknown properties. A case-by-case or targeted approach will be needed for specific NMs with incompatible PC properties. In some cases, standardization may not be feasible, but guidance will be of great help.

Assay standardization should be followed by assay validation in order to improve our understanding of the robustness, predictivity, and compatibility of the assays. The ongoing work in the OECD's Working Party on Manufactured Nanomaterials [273], the Malta Initiative [274], and work in ongoing European projects such as NanoHarmony [275], Nanomet [276] and Gov4Nano [277] are currently supporting the standardization efforts.

#### 5.2.7. Compatibility (NEPs and Novel Materials)

Safety along the LC as well as keeping pace with the rapid emergence of advanced materials are important hallmarks of SbD. Consumers are most likely exposed to NEPs and not pristine NMs. Therefore, assay optimization is needed to be able to test NEPs in an accurate way. Assay compatibility with NEPs and novel advanced materials needs to be studied further. More data on how NMs can change over the LC and the possible risks they may pose during this process is very much needed. This will help determine whether testing only pristine NMs may be sufficient for SbD hazard testing.

#### 5.2.8. Gathering Experimental Data following FAIR Principles

Since SbD hazard testing will involve the generation of large datasets, it is important to ensure that the data gathered from the different in vitro assays are adequately collected using templates that support FAIR principles, and that the data is findable, accessible, interoperable and reusable. Guidance for finding these templates can be found in the GoFair initiative and guidance on experimental workflows design and implementation can be found within the NanoCommons initiative.
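To make the idea of template-based data collection concrete, the sketch below stores one assay result together with the experimental metadata that this review identifies as influential (dispersion protocol, serum, delivered dose) and serializes it to JSON. The record structure and all field names and values are hypothetical; they do not reproduce any actual template from the GoFair or NanoCommons initiatives.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AssayRecord:
    """One in vitro result with its experimental metadata (hypothetical fields)."""
    material_id: str
    assay: str
    endpoint: str
    cell_model: str
    dispersion_protocol: str
    serum_percent: float
    nominal_dose_ug_per_ml: float
    delivered_dose_ug_per_cm2: float
    response: float
    response_unit: str


def to_json(record: AssayRecord) -> str:
    """Serialize a record to machine-readable JSON with stable key order."""
    return json.dumps(asdict(record), sort_keys=True, indent=2)


# Hypothetical example record.
record = AssayRecord(
    material_id="NM-A",
    assay="cytokine secretion",
    endpoint="IL-8",
    cell_model="submerged lung epithelial cell line",
    dispersion_protocol="probe sonication",
    serum_percent=10.0,
    nominal_dose_ug_per_ml=50.0,
    delivered_dose_ug_per_cm2=12.5,
    response=2.4,
    response_unit="fold change vs negative control",
)
print(to_json(record))
```

Keeping the metadata in the same record as the result means that later comparisons across laboratories can filter on protocol differences instead of having to reconstruct them from the methods sections of publications.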

#### 5.2.9. The Chemical Strategy for Sustainability

Although this review covers cytotoxicity, dissolution, oxidative potential, inflammatory potential, and genotoxicity, the Chemical Strategy for Sustainability has put forth extra endpoints to ensure the ambition towards a toxic-free environment and protection against the most harmful chemicals is fulfilled [3]. One of these endpoints is endocrine disruption. Under REACH, endocrine disruptors are identified as substances of very high concern alongside chemicals known to cause cancer, mutations, and toxicity to reproduction. Work is ongoing by ECHA to develop classification and labelling criteria for endocrine disruption [278]. From an NM perspective, there is increasing evidence showing endocrine disruption and reproductive impairments caused by NMs such as nanoplastics [279,280], and this warrants further attention.

Although this review is only focused on SbD, sustainability impacts should also be considered early in the innovation process. Safe-and-sustainable-by-design is a central element of the European Chemical Strategy for Sustainability and it demands the optimization of safety and sustainability interventions in the design of NMs, NEPs, and all processes in a life-cycle approach.

#### **6. Conclusions**

This review provides the first building blocks towards an early hazard testing strategy for SbD applicability and is the first detailed state of the art analysis of in vitro assays against performance criteria (simplicity and cost effectiveness, predictivity, robustness, compatibility, and readiness) for SbD hazard testing. The most important conclusions are:


the hazard assessment of NMs and advanced materials is complex and that in vitro tests need to be further developed, tested, and evaluated to assess their suitability in identifying potential hazards.

**Author Contributions:** All authors contributed to the conceptualization of the manuscript. The preparation of the different sections of the original draft was done by the following authors: Introduction by N.R. and L.G.S.-H.; Criteria by N.R.; Choice of dispersion protocol by M.C. and J.C. (Joan Cabellos); the use of serum and stabilizers by N.R.; determining dose delivered to cells by N.R.; SbD hazard testing of NEPs and NMs released during the life cycle by M.C., C.D., A.S.J. and N.R.; challenging NMs and advanced materials by N.R.; cytotoxicity by H.B. and N.R.; Dissolution by M.B. and P.M.; oxidative potential and oxidative stress by A.K., J.C. (Joan Cabellos) and N.R.; inflammation by A.K., A.C. and N.R.; genotoxicity by M.C. and J.C. (Julia Catalán); discussion and outlook by N.R., H.B. and L.G.S.-H.; conclusions by N.R., L.G.S.-H., F.R.C. and H.B. Afterwards, in several iterations all authors read, commented, and proposed revisions to the entire manuscript, which were compiled by N.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the SAbyNA project, European Union's Horizon 2020 research and innovation program under grant agreement No 862419, and by the Dutch Ministry of Infrastructure and Water Management.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank Rob Vandebriel and Agnes Oomen for their valuable suggestions and critical review of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI, St. Alban-Anlage 66, 4052 Basel, Switzerland, www.mdpi.com

*Nanomaterials* Editorial Office, www.mdpi.com/journal/nanomaterials

ISBN 978-3-0365-7813-2