## *4.3. Genome Sequencing, Assembling, and Annotation*

DNA samples were sequenced using the GA II Illumina sequencer (2 × 75 paired-end reads with an estimated insert size of 400 bp). Quality checks on raw reads were performed using FastQC v.0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Then, the fastq\_quality\_filter utility from the FASTX-toolkit (http://hannonlab.cshl.edu/fastx\_toolkit/) was used to remove sequences with a quality score equal to or lower than 30 in more than 90% of the read length. Illumina technical sequences were removed using Trimmomatic v.0.32 [58]. Reference-based assembly was performed using the Columbus module within the Velvet package [59] with a k-mer size of 65. The chloroplast genome sequence of *S. lycopersicum* cv. IPA-6 (AM087200) was used as the reference. Contigs were ordered and oriented using ABACAS [60] for the final assembly. Finally, high-quality reads were aligned back onto the assemblies using Bowtie2 [61] with default settings to validate the assemblies and manually fix errors. Per-base genome coverage was computed using the genomecov utility of bedtools version 2.20.1 (Figure S6) [62]. The annotation of chloroplast genomes was performed using GeSeq (https://chlorobox.mpimp-golm.mpg.de/geseq.html). Gene annotations were manually curated using *S. lycopersicum* cv. IPA-6 (AM087200) annotations as reference. Chloroplast genome sequences and annotations produced in this study can be found in GenBank under accession numbers MT811790-MT811798.
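To make the notion of per-base coverage concrete, the following minimal Python sketch computes the depth at every position of a genome from a list of read alignment intervals. It is an illustration only and not the bedtools implementation; the function names and the 0-based, half-open interval convention are assumptions introduced here.

```python
from typing import List, Tuple


def per_base_depth(genome_length: int,
                   alignments: List[Tuple[int, int]]) -> List[int]:
    """Return the read depth at each position of the genome.

    `alignments` holds hypothetical 0-based, half-open (start, end)
    intervals, one per aligned read.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, genome_length)
        if start >= end:
            continue  # read lies outside the genome or is empty
        diff[start] += 1
        diff[end] -= 1

    # The running sum of the difference array gives the depth per base.
    depth = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth


def mean_depth(depth: List[int]) -> float:
    """Average per-base depth (0.0 for an empty genome)."""
    return sum(depth) / len(depth) if depth else 0.0


if __name__ == "__main__":
    toy_depth = per_base_depth(10, [(0, 5), (3, 8)])
    print(toy_depth, mean_depth(toy_depth))
```

The difference-array approach keeps the computation linear in genome length plus number of reads, which is convenient for plastome-sized sequences.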

## *4.4. Detection and Analysis of Sequence Variations*

Single nucleotide variants (SNVs) were identified using the snp-sites tool (https://github.com/sanger-pathogens/snp-sites). This tool extracts SNPs from a multiple sequence alignment; the cpDNA of *S. lycopersicum* cv. IPA-6 was used as the reference sequence. SNP annotation was manually curated.

The microsatellite (MISA) identification tool (http://pgrc.ipk-gatersleben.de/misa/) was run to identify microsatellites (SSRs) using the unit\_size/min\_repeats parameters as follows: 1/8, 2/6, 3/5, 4/5, 5/5, 6/5. The Tandem Repeats Finder web tool accessible at https://tandem.bu.edu/trf/trf.basic.submit.html was used to detect perfect tandem repeats with default settings.
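The unit\_size/min\_repeats thresholds can be illustrated with a short Python sketch that scans a sequence for perfect SSRs. This is a simplified regular-expression search written for illustration, not the MISA implementation; the function names are hypothetical, and compound or interrupted repeats are not handled.

```python
import re
from typing import Dict, List, Tuple

# unit_size -> min_repeats, as listed in the text above
THRESHOLDS: Dict[int, int] = {1: 8, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not a repetition of a shorter unit.

    For example 'AT' is primitive, while 'ATAT' and 'AA' are not.
    """
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq: str,
              thresholds: Dict[int, int] = THRESHOLDS
              ) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, repeat count) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in sorted(thresholds.items()):
        # A motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. 'AA' x 6, which is really a mononucleotide run.
            if is_primitive(motif):
                hits.append((match.start(), motif,
                             len(match.group(0)) // unit))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "A" * 9 + "CGC" + "AT" * 6 + "GCC" + "A" * 7
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

In the toy sequence, the run of nine A bases and the six AT units pass the 1/8 and 2/6 thresholds, while the trailing run of seven A bases does not.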

In silico identified SSR loci were experimentally tested for variation in the number of repeat units. To this end, 8 SSR loci were selected from the MISA output by focusing on those with small variation in the number of repeat units to verify the correct estimation of their repeat length. Primers were designed with Primer3 (http://frodo.wi.mit.edu/primer3/). The primer size was set from 18 to 25 bp, the Tm ranged from 51 to 59 °C, and the other parameters were left at their defaults (Table S1). For each microsatellite locus, the forward primers were labeled with one of the fluorescent dyes 6-FAM, ATTO550, ATTO565, or HEX (Sigma Aldrich, USA). Besides the sequenced local accessions, we applied these primers to 19 additional local genotypes, namely seven further local accessions and twelve processed/fresh market tomatoes.

All PCR amplifications were performed on a Perkin Elmer 9700 thermocycler under the PCR conditions reported in [63]. The conditions were kept constant for all loci to maximize standardization. Amplified microsatellite products were then genotyped using an Applied Biosystems 3130 automatic sequencer with LIZ (500) as an internal standard and sized with GENEMAPPER software v. 3.7 (Thermo Fisher Scientific-Applera, USA).

Multiple sequence alignments (MSAs) were generated using MAFFT version 7 [64] with default settings. Single-nucleotide variants were identified with the snp-sites software [65], using the plastome MSA as input and the cpDNA of *S. lycopersicum* cv. IPA-6 (AM087200) as reference. To highlight differences among the nucleotide sequences of the plastomes, MSAs were visualized using the NCBI Multiple Sequence Alignment Viewer available at https://www.ncbi.nlm.nih.gov/projects/msaviewer/.
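The idea of calling SNVs from an MSA against a reference row can be sketched in a few lines of Python. The sketch below is not the snp-sites algorithm; the function name, the toy sequence names, and the choice to skip gaps and ambiguous characters are assumptions made for illustration. Positions are 1-based alignment columns, not reference genome coordinates.

```python
from typing import Dict, List, Tuple

Site = Tuple[int, str, Dict[str, str]]


def variable_sites(alignment: Dict[str, str], reference: str) -> List[Site]:
    """List alignment columns where a sequence differs from the reference.

    Returns (1-based column, reference base, {sequence name: alt base}).
    Columns with a gap or ambiguous base in the reference are skipped, as
    are gap or ambiguous characters in the other sequences (assumption).
    """
    if len({len(s) for s in alignment.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    ref_seq = alignment[reference].upper()
    sites = []
    for col, ref_base in enumerate(ref_seq):
        if ref_base not in "ACGT":
            continue
        alts = {}
        for name, seq in alignment.items():
            if name == reference:
                continue
            base = seq[col].upper()
            if base in "ACGT" and base != ref_base:
                alts[name] = base
        if alts:
            sites.append((col + 1, ref_base, alts))
    return sites


if __name__ == "__main__":
    toy = {"ref": "ACGT-A", "s1": "ACTT-A", "s2": "A-GTCA"}
    print(variable_sites(toy, "ref"))
```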

RAxML [66] was used to build a maximum-likelihood (ML) tree with 10,000 rapid bootstrap inferences, a generalized time reversible (GTR) substitution matrix and Gamma model of rate heterogeneity. The plastome of *S. tuberosum* cv. Désirée (DQ386163) was used as the outgroup. The ML tree was visualized with FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).

In addition to *S. lycopersicum* cv. IPA-6 (AM087200), eleven tomato genotypes available in GenBank were retrieved for comparative analyses: *S. peruvianum* (KP117026), *S. chilense* (KP117021), *S. neorickii* (*S. neorickii* 2, KP117025), *S. pennellii* (HG975452), *S. habrochaites* (KP117023), *S. galapagense* (NC\_026878), *S. cheesmaniae* (NC\_026876), *S. pimpinellifolium* (*S. pimpinellifolium* 2, KP117027), *S. lycopersicum* (cv. M82, HG975525), and *S. lycopersicum* var. *cerasiforme* (cer1, KY887588; and cer2, KY887587). Heatmaps were generated using Morpheus (https://software.broadinstitute.org/morpheus). Single-linkage hierarchical clustering on both rows and columns was based on Euclidean distance.
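For readers unfamiliar with the clustering settings, the following pure-Python sketch shows single-linkage agglomerative clustering with Euclidean distance on a small matrix of profiles. It is a naive illustration of the metric and linkage rule, not the Morpheus implementation; the function names and the toy profiles are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(profiles: Dict[str, Sequence[float]]
                   ) -> List[Tuple[Cluster, Cluster, float]]:
    """Return the merges as (cluster A, cluster B, distance), in order.

    Under single linkage, the distance between two clusters is the
    smallest distance between any pair of their members.
    """
    clusters: List[Cluster] = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (i, j)] + [merged]
    return merges


if __name__ == "__main__":
    toy = {"a": [0, 0], "b": [0, 1], "c": [5, 5]}
    for left, right, dist in single_linkage(toy):
        print(left, right, round(dist, 3))
```

Applying the same procedure to the transposed matrix clusters the columns, which is how a heatmap can be reordered on both axes.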

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2223-7747/9/11/1443/s1, Figure S1: Overview of the nucleotide variability in nine plastomes sequenced in this study and in eleven species available in GenBank. The accession number AM087200 (cv. IPA-6) was used as reference. Red lines represent variable regions, Figure S2: Schematic representation of the nucleotide variability observed in the *ndhH* gene for the plastomes under investigation. Grey bar represents the nucleotide multiple-sequence alignment (MSA) and it is scaled according to the MSA length. Black boxes indicate variable regions in the MSA. Above and below each box, a snapshot of the MSA along with alignment positions is reported, Figure S3: Schematic representation of the amino acid variability observed in the Ycf1 protein for the plastomes under investigation. Grey bar represents the amino acid multiple-sequence alignment (MSA) and it is scaled according to the MSA length. Black boxes indicate variable regions in the MSA. Above and below each box, a snapshot of the MSA along with alignment positions is reported, Figure S4: Simple sequence repeats (SSRs) in nine plastomes sequenced in this study and in eleven species available in GenBank. The plastome of IPA-6 (AM087200) was used as reference. (a) Heatmap representing differences in SSR size; colors range from red (SSR size larger than the reference) through yellow to blue (SSR size smaller than the reference). Black is for missing SSRs. (b) Pie chart describing the percentage of SSRs located in coding sequences of genes, introns, and intergenic regions and in the large single copy (LSC), small single copy (SSC), and inverted repeat b (IR) regions, Figure S5: Multiple-sequence alignment (MSA) of the region harboring the duplicated sequence (ATAA)2 scored only in local landraces, Figure S6: Distribution of per-base sequencing depth for each chloroplast genome sequenced in this work.
The average coverage per-base is also reported. Table S1: Tomato cpSSR primers developed in this study, Table S2: Simple sequence repeats (SSRs) in the twenty-one tomato chloroplast genomes using IPA-6 (AM087200) as the reference genome. SSRs size, location, and distribution among different regions, namely coding, intron, and intergenic are reported. The unit\_size/min\_repeats parameters were as follows: 1/8, 2/6, 3/5, 4/5, 5/5, and 6/5. SSRs located in IRa were not counted. SSRs were identified using MISA—microsatellite identification tool (http://pgrc.ipk-gatersleben.de/misa/), Table S3: Tandem repeats (TRs) in the twenty-one tomato chloroplast genomes using IPA-6 (AM087200) as the reference genome. TRs copy number and location are reported. TRs were identified using the tool available at https://tandem.bu.edu/trf/trf.basic.submit.html.

**Author Contributions:** T.C., S.C., N.D. and N.S. conceived and designed the research. C.C. and N.D. performed bioinformatic analyses. R.T., L.S., D.C. and N.S. carried out wet-lab experiments. L.O. performed Illumina sequencing of plastomes. R.T., D.C., S.C., N.D. and N.S. contributed to data interpretation. R.T. and N.S. wrote the manuscript. T.C., S.C., N.D. and N.S. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially funded by a grant from the Italian Ministry of Economy and Finance (MEF)-National Research Council of Italy (CNR), Project "**C**onoscenze **I**ntegrate per la **S**ostenibilità e l'**I**nnovazione del *made in Italy* **A**groalimentare" (CISIA), Legge n. 191/2009.

**Acknowledgments:** Technical assistance of Mr. G. Guarino and Mr. R. Nocerino (CNR-IBBR, Portici, Italy) with artworks and plant growth is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.
