*Article* **Proteomics Data Analysis for the Identification of Proteins and Derived Proteotypic Peptides of Potential Use as Putative Drought Tolerance Markers for** *Quercus ilex*

**Bonoso San-Eufrasio 1, Ezequiel Darío Bigatton 1,2, Victor M. Guerrero-Sánchez 1, Palak Chaturvedi 3, Jesús V. Jorrín-Novo 1, María-Dolores Rey <sup>1</sup> and María Ángeles Castillejo 1,\***


**Abstract:** Drought is one of the main causes of mortality in holm oak (*Quercus ilex*) seedlings used in reforestation programs. Although this species shows high adaptability to the extreme climate conditions prevailing in Southern Spain, its intrinsic genetic variability may play a role in the differential response of some populations and individuals. The aim of this work was to identify proteins and derived proteotypic peptides potentially useful as putative markers for drought tolerance in holm oak by using a targeted post-acquisition proteomics approach. For this purpose, we used a set of proteins identified by shotgun (LC-MSMS) analysis in a drought experiment on *Q. ilex* seedlings from four different provenances (viz. the Andalusian provinces Granada, Huelva, Cadiz and Seville). A double strategy involving the quantification of proteins and target peptides by shotgun analysis and post-acquisition data analysis based on proteotypic peptides was used. To this end, an initial list of proteotypic peptides from proteins highly represented under drought conditions was compiled that was used in combination with the raw files from the shotgun experiment to quantify the relative abundance of the fragment's ion peaks with the software Skyline. The most abundant peptides under drought conditions in at least two populations were selected as putative markers of drought tolerance. A total of 30 proteins and 46 derived peptides belonging to the redox, stress-related, synthesis,-folding and degradation, and primary and secondary metabolism functional groups were thus identified. Two proteins (viz., subtilisin and chaperone GrpE protein) were found at increased levels in three populations, which make them especially interesting for validation drought tolerance markers in subsequent experiments.

**Keywords:** peptide markers; *Quercus ilex*; drought tolerance; targeted post-acquisition proteomics

### **1. Introduction**

Holm oak (*Quercus ilex*) is the dominant tree species in natural forest ecosystems over the Western Mediterranean Basin, as well as in the agrosilvopastoral Spanish "*dehesa*", which is environmentally, economically and socially important [1,2]. This species is highly adaptable to drought and to the high temperatures and irradiation typical of Southern Spain. However, the main cause of mortality in holm oak plantations is water deficiency, with drought stress acting as a major factor of decline [3,4]. This situation can be expected to worsen in a scenario of climate change where statistical models have predicted that 40% of the land areas with a high-density of *Q. ilex* will be unsuitable for its survival [5].

E.D.; Guerrero-Sánchez, V.M.; Chaturvedi, P.; Jorrín-Novo, J.V.; Rey, M.-D.; Castillejo, M.Á. Proteomics Data Analysis for the Identification of Proteins and Derived Proteotypic Peptides of Potential Use as Putative Drought Tolerance Markers for *Quercus ilex*. *Int. J. Mol. Sci.* **2021**, *22*, 3191. https://doi.org/10.3390/ ijms22063191

**Citation:** San-Eufrasio, B.; Bigatton,

Academic Editor: Frank M. You

Received: 13 February 2021 Accepted: 18 March 2021 Published: 21 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Studies on the genetic variability associated with both environmental factors and genotypes have shown high population variability and polymorphism in *Quercus* spp [6–9]. In addition, *Q. ilex* has been shown to exhibit high variability in traits associated with drought tolerance, both within and between populations [10]. Because this is a nondomesticated species with a very long-life cycle, it is not amenable to conventional plant breeding. Therefore, for management and conservation practices based on resilient, elite genotypes of holm oak trees to be effective, they must rely on a sound knowledge of their biology and molecular mechanisms of adaptation to adverse climatic conditions. The response of plants to stress-related situations may, in theory, be improved by characterizing their biodiversity and selecting elite genotypes based on specific molecular markers. This approach can be quite challenging with orphan species, such as holm oak, which has a still incompletely sequenced genome and largely unexplored molecular features [11–21]. Fortunately, omics approaches have enabled crucial advances in these directions. Thus, some multiomics studies have addressed *Q. ilex* [19]; also, a reference transcriptome for this species has been generated [16,17], and, more recently, the metabolome of the acorn was determined [22]. In any case, the greatest efforts have focused on the proteomics of *Q. ilex*. A number of proteomic studies have used 1D and 2D gel-based analysis to investigate drought tolerance in this species [11,12,23]. In addition, recent studies have addressed various aspects of its biology by using shotgun (LC–MS/MS) proteomic analysis [19,21,24]. In addition, species-specific improved databases, such as the recently compiled holm oak transcriptome database [16,17], and other sequenced *Quercus* species databases, such as those for *Q. robur* [25] and *Q. suber* [26], have substantially expanded available knowledge of holm oak biology.

Quantitative proteomics is providing increasingly powerful tools for identifying markers of complex traits. Thus, identifying target peptide signals against mass spectrometry libraries is an efficient method for protein identification and quantification [27]. Targeted proteomics, however, does not allow the identification of new proteins as it requires the prior measurement of the targeted proteins by discovery proteomics; rather, it is useful for detecting changes in the protein abundances from previously acquired information [28]. Therefore, this proteomics branch can be useful to characterize coordinated changes in protein abundance with a view to identifying or validating proteins as markers for specific traits. Recently, proteotypic peptides have proved useful for protein quantification [29].

In this work, we used a double strategy for proteins and peptides quantification by targeted post-acquisition data analysis against a species-specific *Q. ilex* database with a view to identifying proteotypic peptides of potential use as putative drought tolerance markers for holm oak. For this purpose, a set of raw data generated in a shotgun experiment performed after 17 and 24 days under drought conditions in *Q. ilex* seedlings from four different provinces in Andalusia was used. Based on previous studies of inter-population variability of this species [12,18,30,31], we selected two populations from the southeast (Cadiz, Granada) and two from the northwest (Huelva, Seville) of Andalusia, with the purpose of identifying changes in proteins and derived peptides persistent over time in response to drought in different populations. In this work, various proteins and peptides are proposed as putative markers of drought tolerance in holm oak that transcend not only the tolerant phenotype but also populations and are examined in biological terms.

### **2. Results**

### *2.1. Qualitative and Quantitative Analysis of Drought Stress Responsive Proteins*

Shotgun analysis allowed a total of 4470 proteins to be identified in the *Q. ilex* leaf proteome (Supplementary Table S1; data are available via ProteomeXchange with identifier PXD023782) of which 2920 fulfilled the following criterion for confident identification: XCorr ≥ 2 and at least two different peptides per protein. An overall 2261 proteins were deemed variable in accordance with the following confidence criteria: (a) consistent presence in all replicates, (b) statistical significance with false discovery rate (FDR) < 0.05, and (c) drought/control ratio ≥ 2 and ≤0.5. In this group, 1692 proteins exhibited qualitative

changes in at least one population and sampling time and 569 quantitative changes, 1683 being more abundant in droughted seedlings. The initial dataset was screened to select those variable proteins most markedly represented under drought conditions at both sampling times in each population. A total of 380 proteins were thus screened, of which 48 were present in at least two populations and deemed markers candidates. A schematic view of the workflow, as well as the details of the experimental design, are shown in Figure 1 and Supplementary Figure S1.

**Figure 1.** Schematic workflow for selection of putative markers of drought tolerance.

Multivariate analysis integrating the entire dataset is shown in Supplementary Figure S2A, where the first two components (25% of the total variability) separated populations. Hierarchical clustering was also performed and represented in a dendrogram (Supplementary Figure S2B), in which two main clusters were observed: one grouped Huelva and Granada populations, and the other grouped Seville and Cadiz, independently of the sampling time. Replicates of the same experimental condition (population, treatment and sampling time) were grouped, demonstrating reproducibility throughout the experiment. To check the effect of stress on the populations, the sampling times were analyzed separately by partial least-squares regression analysis (PLS-DA) (Figure 2). The first two components explained 20% (first sampling time) and 31% (second sampling time) of the total variability. Component 1 resolved Cadiz control plants, and component 2 resolved plants under drought for 17 days (first sampling) from control plants except in the Huelva

population (Figure 2A). For the second sampling time (24 days), component 1 resolved the population pairs Huelva–Granada and Seville–Cadiz, and component 2 resolved treatments (Figure 2B). Based on this analysis, a clearer effect of drought treatment and populations can be appreciated at the second sampling time.

**Figure 2.** Partial least-squares discriminant analysis (PLS) of the entire dataset after 17 (**A**) and 24 days of drought (**B**) is shown in the upper part. The cumulative proportion of variance explained by components is shown below. C, Cadiz; G, Granada; H, Huelva; S, Seville. The letters following C, G, H, or S denote treatment (D, drought; C, control), the numbers before the underscore sampling time (1, 17 days; 2, 24 days) and that after it replicates (1, 2 or 3).

Figure 3A shows the 2261 variable proteins significantly (FDR < 0.05) up- or downaccumulated (twofold change) in the drought group, and Figure 3B a Venn diagram of the 380 variable proteins among them. The Granada and Huelva populations exhibited the greatest number of unique variable proteins (107 each), followed by Seville (71) and Cadiz (47). Huelva and Seville shared the largest number of variable proteins (16), followed by Granada–Huelva (14), Huelva–Cadiz (8), Seville–Cadiz (8), Granada–Seville (4) and Granada–Cadiz (2). Only two proteins changed significantly by the effect of drought in three populations; thus, GrpE protein (qilexprot\_13677) changed in Huelva, Seville and Cadiz, and subtilisin-like protease (qilexprot\_25223) in Granada, Seville and Cadiz.

The previous 380 variable proteins were characterized in functional terms by using Mercator [32] and GO enrichment (http://pantherdb.org/ accessed September 2020) for classification into 16 main groups (Figure 4A), namely: energy, carbohydrate, amino acid, lipid, hormone, coenzyme and secondary metabolism, other metabolic processes, cellular processes,-folding-sorting and degradation, synthesis (transcription/translation), structural, defense and response to stress, redox, signaling and transport. The best represented functional group was synthesis (62), followed by folding-sorting and degradation (42), defense and response to stress (37), carbohydrate metabolism (33) and redox (29), the

previous five groups accounting for more than 50% of all identified proteins. Figure 4B compares the proteins whose abundance was altered by drought in each population as grouped by functional category.

**Figure 3.** Variable proteins significantly (false discovery rate of 5%) uprepresented or downrepresented (twofold change) under drought conditions (**A**); C, Cadiz; G, Granada; H, Huelva; S, Seville. The letter following C, G, H or S denotes treatment (D, drought; C, control) and the number sampling time (1, 17 days; 2, 24 days). Venn diagram showing significantly up-represented proteins under drought conditions in each population (**B**).

### *2.2. Targeted Data Analysis for Selection of Peptides as Putative Markers of Drought Tolerance*

A list of proteotypic peptides derived from the 48 selected proteins was compiled for subsequent targeted analysis. The list, which included 219 proteotypic peptides with a charge state of +2 or higher, was used for quantification with the software Skyline as described in Section 4.5. As confirmed by Supplementary Figure S3, using the above-described library allowed 159 peptides to be successfully integrated with a robust, reproducible, highquality peak shape in all three replicates (Supplementary Table S2). A statistical analysis of variance (ANOVA) on normalized data revealed 71 peptides comprising 32 different proteins were significantly better represented in droughted seedlings. Those peptides and proteins better represented in at least two populations (46 peptides from 30 proteins) were selected as putative markers of drought tolerance (Table 1). Supplementary Figure S4 illustrates protein and peptide quantification graphically. The most representative protein functions in the marker panel (Table 1) were synthesis and mRNA processing with 9 proteins. Seven proteins belonged to the redox and response to functional stress groups; 3 to the folding, sorting and degradation group; and 2 to the transport group. Other metabolic functional groups, such as carbohydrate metabolism (2), secondary metabolism (1), photosynthesis (1) and other processes (5), were also represented. The Huelva population was that exhibiting the greatest number of proteins, most of which belonged to the synthesis and mRNA processing group (7) and the stress-related and secondary metabolism group (6). Other proteins associated with energy and metabolism (viz., photosynthesis, carbohydrate metabolism and other cellular processes) were also represented (5). The second population as regards protein changes was Seville, with proteins of the metabolism (6), synthesis (4),-folding and degradation (3), and stress-related (3) groups. The Granada population exhibited smaller numbers of changing proteins, and only in the synthesis (5) and metabolism (3) groups. Finally, the Cadiz population was that exhibiting the least changes and mainly in stress-related (3), metabolic processes (2) and folding and degradation proteins (2). A protein–protein interaction network among the 48 proteins previously selected was performed using the web-tool STRING10 (http://string-db.org accessed January 2021) (Figure 5). A strong connection between proteins of synthesis and those of folding and degradation was observed.

**Figure 4.** Functional categories of the 380 proteins. Total number of proteins significantly increasing in abundance after drought (**A**) and in each of the populations (**B**): C, Cadiz; G, Granada; H, Huelva; S, Seville.





**Figure 5.** Analysis of the protein interaction network of the 48 proteins selected as putative markers of drought tolerance.

### **3. Discussion**

In this work, we used a double proteomic strategy for protein and peptide quantification in order to identify putative protein markers associated with drought tolerance in *Quercus ilex*. For this purpose, a dataset obtained by shotgun proteomics analysis of four *Q. ilex* populations from different provinces of Andalusia, Southern Spain, under severe drought stress [31] was analyzed by using a double strategy, combining shotgun protein quantification of proteins and target peptides with post-acquisition analysis of data based on proteotypic peptides. The populations were selected based on previous studies of *Q. ilex* variability between eastern populations and western ones [30,31], which demonstrated a relationship between tolerance and provenance. However, intrapopulation variability was also observed as corroborated by the study of morpho-physiological and biochemical parameters [31]. For this reason, the aim of this work was the search for putative drought tolerance markers that transcend not only the tolerant phenotype but also populations.

The typically high complexity of proteomes makes protein identification by mass spectrometry irreproducible as a result of precursor ions being selected stochastically. In addition, forest plants exhibit enormous biological variability that results in even poorer reproducibility among samples. Targeted proteomics analyses can be used to identify, characterize and quantify small sets of proteins previously selected by mass spectrometry analysis [33]. The few targeted proteomic studies conducted so far to identify markers of important traits have focused on crops, such as potato [34], apple [35], grapevine [36]

and tomato [37,38]. Some have examined gluten profile [39,40], but even fewer have dealt with forest species. Although selected reaction monitoring (SRM) and its variants are the current gold standard for quantitative estimation of proteins, the recently dataindependent acquisition (DIA) method is increasingly being used for targeted proteomics. Unlike existing alternatives, DIA extracts specific information from previously acquired data [41]. This method has been used in combination with proteotypic peptides to identify peptide markers of resistance to *Peyronellaea pinodes* in pea [42]. The use of proteotypic peptides for protein quantification recently proved more accurate than methods based on algorithms (usually on the intensity of the strongest ion peaks) [29].

As shown here, the shotgun technique, in combination with proteotypic peptides extracted from previously acquired data, has great analytical potential. While not a targeted proteomic strategy proper, this approach allows one to select peptides and proteins closely associated with specific traits. This requires processing a dataset compiled from a properly designed and conducted experiment (viz., one using a large enough number of replicates or individuals). In this work, we compiled a list of peptides and proteins potentially useful as putative markers of drought tolerance in *Q. ilex* that are briefly discussed in biological terms below. The proteins in the marker panel were differentially represented among populations, with the greatest numbers of changes found in the Huelva population, followed by Seville, Granada and Cadiz. Many of the selected proteins are involved in synthesis or degradation processes or, to a lesser extent, in metabolic processes, such as carbohydrate metabolism, photosynthesis and other metabolic reactions. Stress response, redox and secondary metabolism proteins were also well represented.

Changes in synthesis and degradation of proteins under stress conditions, such as drought, can be interpreted as a mechanism of adaptation through the installation of the translational apparatus and protein synthesis by recycling available amino acids in plants through protein degradation. Thus, plants respond to drought by synthetizing protective proteins and repairing or degrading damaged proteins [43]. Considering the general changes observed in the proteome in response to drought, synthesis (ribosomal and transcription) was the most represented group of proteins showing qualitative and quantitative changes, followed by folding and degradation category. Many of the proteins selected as putative markers here are involved in synthesis processes; such is the case, for instance, with translation initiation factor, zinc finger protein VAR3, RNA-binding protein, and methionyl-tRNA synthetase. The marker panel also included folding and degradation proteins, such as the chaperones calreticulin and GrpE protein, and serine protease subtilisin. 2DE-MSMS proteomic analysis previously revealed a similar response involving some of the previous proteins in *Q. ilex* [12,23] and *Q. robur* [44] under drought and suggested active metabolic adjustment to stress.

By contrast, degradation of starch in response to stress has been often associated with improved tolerance and potentially limited photosynthesis [45,46]. Sugars resulting from starch degradation, and other derivative metabolites, help plants grow under stress and function as osmoprotectants and compatible solutes to mitigate the adverse effects of stress [47], as found in droughted *Q. robur* [44]. Although environmental factors are known to have strong effects on the starch synthesis, their regulatory mechanisms remain unclear [48]. Some studies reported increased starch accumulation under stress, mainly in response to high salinity or cold [49–51]. In this work, two proteins of carbohydrate metabolism of the marker panel (viz., granule-bound starch synthase 1, which is chloroplastic/amyloplastic, and the glycosyl hydrolase family protein with chitinase insertion domain) were found at increased levels in, mainly, the Seville population. However, several starch degradation proteins not selected as putative markers were significantly increased in response to drought in some of the experimental conditions studied, including phosphoglucan water dikinase, α-glucan phosphorylase and α-amylase. Although apparently contradictory, this response may be related to the presence of different types of starch, whether permanent or transitory and that of isoforms involved in their synthesis and mobilization [48,52,53]. On the other hand, the photosynthetic machinery was seemingly

unaffected; in fact, only a few photosynthesis proteins exhibited any changes, and only one (chlorophyll a-b binding protein) was included in the marker panel. This result is consistent with those of physiological studies on *Q. ilex* populations under severe drought in Seville, Granada and Cadiz, which exhibited no significant changes in photosynthetic pigments [31].

The broadest group of proteins and derived peptides selected as putative markers consisted of redox (2-alkenal reductase NADP-dependent, short-chain alcohol dehydrogenase A, disulfide-isomerase) and stress response proteins (endoplasmin, dehydrin, senescence/dehydration-associated protein and aldo-keto reductase). Some were closely associated with drought in several studies on the genus *Quercus* [4,44] or with biotic stress caused by *Phytophthora cinnamomi* [54]. Furthermore, a representative number of redox proteins not included in the marker panel have been identified as being increased to a greater or lesser extent in some of the conditions studied in response to drought, including glutathione S-transferase, glutathione peroxidase, thioredoxin, peroxidase, superoxide dismutase, lipoxygenase, among others. Our marker panel also included two enzymes involved in the shikimate–phenolic biosynthetic pathways, namely: chalcone synthase (CHS) and 3-phosphoshikimate 1-carboxyvinyltransferase. One of the potential roles of phenolic compounds is to scavenge harmful reactive oxygen species [55]. Consistent with our results, *Q. ilex* [31,56] and other *Quercus* spp. [57,58] were previously found to exhibit increased total levels of phenolics.

Finally, transport proteins, such as the water channel protein aquaporins, have been associated with plant tolerance of biotic and abiotic stresses, to which they respond by regulating the movement of water and small molecules through plasma membranes and vacuoles [59]. Based on a proteomics strategy involving the identification of proteotypic peptides, some transport proteins have been proposed as markers of tolerance to drought [60] and resistance to *Ascochyta* blight [42] in pea. The proteins were assumed to induce signaling and transport processes as mechanisms to maintain homeostatic equilibrium and cope with stress. The proposed putative markers included the importin subunit alpha, aquaporin and mitochondrial fission 1 protein A, although other transport and signaling proteins, such as 14-3-3-like protein, lipocalin, outer envelope pore protein (OEP), voltagedependent anion-selective channel (VDAC) and translocase of chloroplast 90 protein, were also more represented under drought in some of the experimental conditions in this study.

Despite it is outside the scope of this work, based on our results, a clear distinction in response to drought among the populations studied cannot be postulated, for, which a greater number of populations covering a wider area must be included. However, attending to the high number of the proteins that showed changes in the Huelva population also observed in Seville, we could speculate on a similar response pattern to drought in both, perhaps due to geographic proximity. The fact that the proteomic profile of Cadiz is different from the rest may also have a geographical explanation as has already been described by Fernandez i Marti et al. [18], suggesting that the Guadalquivir Valley has played an important role in determining population divergence. To the authors' knowledge, this is the first study aimed to identify proteins and derived peptides as putative markers of drought tolerance for a forest species, such as *Q. ilex*. Such markers may be useful with a view to selecting drought-tolerant genotypes or individuals.

### **4. Materials and Methods**

### *4.1. Proteomic Dataset*

The dataset used was compiled from the leaf proteome of *Q. ilex* seedlings in four different Andalusian populations, Southern Spain, namely: Granada (G), Huelva (H), Cadiz (C) and Seville (S). A detailed map of the Andalusian localizations from which samples were collected is shown in Supplementary Figure S5. All populations were under severe drought stress conditions, such as those imposed in summer under Mediterranean climate (Supplementary Table S3). Detailed information about specimen provenances (Supplementary Table S3), plant growth, stress conditions imposed, and physiological

measurements can be found elsewhere [31]. Briefly, acorns were germinated and grown under greenhouse conditions in perlite, according to Simova-Stoilova et al. [23]. Severe drought was imposed by withholding water for 28 days at the 10-leaf stage under the following experimental conditions: 46/22 ◦C, 28 MJm<sup>−</sup>2/day, 41% HR [31]. Two different sampling times based on measured leaf physiological parameters were used for proteomic analysis. Thus, leaves were collected after 17 and 24 days of drought, corresponding to an average drop in chlorophyll fluorescence of 20% and 40%, respectively, in droughted seedlings relative to well-watered seedlings in all populations [31] (Supplementary Figure S6). These sampling times were selected as representative of an early and later stage of the response to drought with photosynthetically active leaves. Supplementary Figure S7 shows visual damage symptoms observed in the seedlings 25 days after a drought. Healthy leaves from different seedlings under different conditions as regards population, treatment and sampling time were collected, frozen in liquid nitrogen and stored at <sup>−</sup><sup>80</sup> ◦C for subsequent proteomic analysis.

### *4.2. Protein Extraction and Mass Spectrometry Analysis*

Five fully expanded (photosynthetically active) leaves per plant from each population (G, H, C, S), treatment (control well-watered, control and drought) and sampling time (17 and 24 days) were crushed with liquid nitrogen and used for protein extraction. Proteins from three independent biological replicates (200 mg of fresh tissue each) were extracted with TCA/acetone-phenol [61], solubilized in a solution containing 7 M urea, 2 M thiourea, 4% (*w/v*) CHAPS (3 [(3-cholamidopropyl) dimethylammonio]-1-propanesulfonate), 0.5% (*w/v*) Triton X-100 and 100 mM DTT, and quantified by the Bradford method [62] using bovine serum albumin (BSA) as standard.

Shotgun analysis was performed by using 90 μg of BSA protein equivalents per sample that were prefractionated in SDS–PAGE according to Valledor and Weckwerth [63]. The resulting unique bands were excised from the gels and digested with proteomics-grade trypsin (Promega) to a final concentration of 12.5 ng/μL according to Romero-Rodríguez et al. [21]. Digested peptides were desalted by passage through C18 cartridges from Scharlau (Barcelona, Spain), and eluted peptides were vacuum dried and dissolved in a mixture of 70:30 (*v/v*) acetonitrile (ACN)/water containing 0.1% trifluoroacetic acid. Mass spectrometry analyses were conducted at the Proteomics Facility for Research Support Central Service (SCAI) of the University of Cordoba (Spain), using a Dionex Ultimate 3000 nano UPLC instrument from Thermo Scientific (San Jose, CA, USA) coupled to a nanoelectrospray ionization source and a trihybrid analyzer Thermo Orbitrap Fusion mass spectrometer, also from Thermo Scientific, operating in the positive ion mode. The specific settings used in the LC–MS/MS analyses are described elsewhere [42].

### *4.3. Protein Identification and Quantification*

The raw data obtained from the MS analysis were processed with the software Proteome Discoverer v. 2.1.0.81 from Thermo Scientific. MS2 spectra were searched with the SEQUEST engine against the FASTA database obtained from the *Q. ilex* transcriptome [16,17]. Precursor mass tolerance was set at 10 ppm and fragment ion mass tolerance fixed at 0.1 Da. Only those ions with a charge state of +2 or greater were used. In silico peptide lists were generated by theoretical tryptic digestion, allowing up to two missed cleavages, carbamidomethylation of cysteines as a fixed modification and oxidation of methionine as a variable modification. Peptides were classified into protein groups according to the law of parsimony and filtered to FDR = 5% and XCorr ≥ 2.

Proteins were quantified in relative terms from the peak areas for precursor ions (specifically, the average of the three strongest peptide ion signals) from the identified peptides were used [64]. Protein values were then normalized by using a method that accounts for variability between runs to normalize relative protein abundance between samples, using the sum of the peak area values for each sample. Only those values consistently present in the three biological replicates were considered for further statistical analysis. Proteins were categorized by function by using the protein FASTA sequences in the software MERCATOR (https://www.plabipd.de/portal/mercator4 accessed August 2020) [30], an online tool for batch classification of proteins or gene sequences into Map-Man functional plant categories. In addition, nonannotated proteins were subjected to GO enrichment by using the Panther tool (http://pantherdb.org/ accessed September 2020).

The raw mass spectrometry data thus obtained were deposited on the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org on 25 January 2021) via the PRIDE partner repository [65] with the dataset identifier PXD023782.

### *4.4. Statistical Analysis of Data and Selection of Target Proteins*

A partial least-squares-discriminant analysis (PLS-DA) of the entire dataset was performed to explain variance and correlation between the different variables, using the software RStudio v. 4.0.3 (Caret package v. 6.0-86 available at https://CRAN.R-project. org/package=caret) [66]. Target proteins were selected, and in silico data were analyzed by using the entire dataset provided by the shotgun analysis. All proteins selected met the following criteria: (a) the entire dataset was filtered by confidence parameters (score ≥ 2, at least 2 peptides per protein); and (b) they were the best represented in qualitative and/or quantitative terms (*p* ≤ 0.05) under drought conditions at both sampling times in at least two populations. Figure 1 illustrates the experimental workflow.

### *4.5. Targeted Post-Acquisition Data Analysis for Selection of Putative Peptides Markers*

A list of target peptides generated from the proteotypic peptides (specific peptide sequences from the selected proteins as far as existing annotation information allowed for) was compiled. Proteotypic peptides were searched by aligning the sequences on the entire *Q. ilex* database, using the bioinformatics tools Bourne-again shell (Bash) (available at http://ftp.gnu.org/gnu/bash/ accessed November 2020) [67] and the BLAST protein (BLASTp) tool for the open source software operating system Ubuntu (available at https://www.exoscale.com/syslog/blast/ accessed November 2020). Proteotypic peptides were quantified by integrating the areas of the chromatographic fragment ion peaks in Skyline software (available at https://skyline.ms). The parameters used for relative quantification of MS1 were 0.055 m/z mass tolerance for the instrument, 0.5 m/z for library peak integration and a resolution of 120,000 at m/z 200. The integration peak and retention time (RT) for each peptide were checked by hand in order to confirm the reproducibility of ions among samples. Peptide values were subsequently assessed for statistical significance by using the external tool MS Stats in Skyline [68]. Data were normalized by equalizing intensity medians and then subjected to log2-transformation, after which ANOVA analysis was used to select the best-represented proteins among those exhibiting significant changes (*p* ≤ 0.05) under drought conditions. Supplementary Figure S4 provides a quantitative depiction of the proteins and precursor ions.

A protein interaction network of the selected proteins was generated by using the web tool STRING10 (http://string-db.org accessed January 2021). The protein homologs in *Arabidopsis* were analyzed by sequence BLASTing of the TAIR database (http://www. arabidopsis. org/Blast/index.jsp accessed January 2021), followed by application of STRING10 to develop a proteome-scale interaction network [69].

### **5. Conclusions**

Methodologically, the proposed targeted strategy is aimed at identifying peptides associated with the response of *Q. ilex* to drought stress. As a supplement to shotgun analysis, using proteotypic peptides in addition to proteins allows putative markers enabling the identification of specific phenotypes, such as those best-resisting drought, to be selected. Our methodological workflow consisted of selecting those consistent and confident proteins whose proteotypic peptides were used for quantification. Of them, 46 peptides showed significant changes in response to drought stress in the same way that the protein they come from, which were proposed as putative markers.

Biologically, the results suggest that plants possess effective protective mechanisms for adaptation to drought through water loss prevention and protein protection and detoxification. However, small differences in response mechanisms may result in plant survival or adaptation to extreme conditions depending on the particular population or individual. As can be inferred from the composition of the marker panel, *Q. ilex* seedlings from four different populations responded differentially to drought, with the greatest number of changes being observed in the Huelva and Seville populations. Only two proteins (viz., the protease subtilisin and chaperone GrpE protein) were increased to a similar extent in three of the four populations. These proteins should be validated as biomarkers of drought tolerance in *Q. ilex* with further testing.

Our study constitutes a step forward in the molecular elucidation of this forest species. Advances in molecular techniques, including omics, in combination with physiological studies, can be expected to allow especially tolerant or resilient genotypes or individuals under stress conditions, such as those propitiated by a climate change scenario, to be selected in reforestation programs.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/1422-006 7/22/6/3191/s1, Figure S1: Schematic workflow for selection of putative tolerance drought markers in *Q. ilex* seedlings by targeted post-acquisition proteomic analysis. C, Cadiz; G, Granada; H, Huelva; S, Seville, Figure S2: multivariate statistical analysis (PCA) of the entire dataset (A). Dendrogram shows hierarchical clustering of experimental conditions (B). C, Cadiz; G, Granada; H, Huelva; S, Seville. The letters following C, G, H, or S denotes treatment (D, drought; C, control), the numbers sampling time (1, 17 days; 2, 24 days), Figure S3: Representative MS1 fragment ion, retention times (RT) and normalized peak area of a target precursor ion given by the software Skyline, Figure S4: Quantitative protein (left) and abundance of precursor ions (right), Figure S5: Localization of all Andalusian Q. ilex provenances used in this study. Andalusia is delimited by the blue outline, Figure S6: Measurements of quantum yield of photosystem II (Fv/Fm) in dark-adapted leaves of Q. ilex of Andalusian provenances (Cadiz, C; Granada, G; Huelva, H; Seville, S) under drought conditions throughout the experiment. Values are means ± standard errors (SE). Arrows indicate the sampling times when the mean of leaf fluorescence had decreased by 20% after 17 days and 40% after 24 days relative to control seedlings, Figure S7: Visual damage symptoms in four Andalusian Q. ilex provenances (C, Cadiz; H, Huelva; S, Seville; G, Granada; the letters following C, H, S or G denotes treatment: D, drought; C, control) 25 days after drought treatment. Control seedlings did not show damage symptoms throughout the experiment. In contrast, some seedlings did show clear damage symptoms related to drought. Only asymptomatic seedlings were used in this study. Table S1: Raw data, normalized data and significant changes (FDR < 0.05) as calculated from the shotgun proteomics experiment, Table S2: List of proteotypic peptides derived from the 48 selected proteins, Table S3: Locations and environmental conditions of the four Q. ilex Andalusian populations.

**Author Contributions:** Conceptualization, J.V.J.-N. and M.Á.C.; methodology, M.Á.C., B.S.-E. and V.M.G.-S.; software, M.Á.C., B.S.-E. and V.M.G.-S.; formal analysis, M.Á.C., B.S.-E., P.C. and V.M.G.-S.; investigation, M.Á.C. and E.D.B.; writing—original draft preparation, M.Á.C., writing—review and editing, M.Á.C., M.-D.R. and J.V.J.-N., supervision, M.Á.C. and J.V.J.-N.; funding acquisition, J.V.J.-N. and M.Á.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Spanish BIO2015-64737-R and PID2019-10908RB-100 projects.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The raw mass spectrometry data were deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org on 25 January 2021) via the PRIDE partner repository with the dataset identifier PXD023782.

**Acknowledgments:** M.Á.C. and M.-D.R. are grateful for awards of a Ramón y Cajal (RYC-2017-23706) and a Juan de la Cierva-Incorporación (IJC2018-035272-I) contract, respectively, by the Spanish Ministry of Science, Innovation and Universities. E.D.B. is grateful for the fellowship of the Argentinian National Research Council (CONICET).

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**

