*4.1. Mining LEA Genes in the C. songorica Genome*

*C. songorica LEA* genes were mined based on protein sequence homology with previously published LEA protein sequences from *A. thaliana* [28], *Oryza sativa* (rice) [36] and *Zea mays* (maize) [37]. The published full-length *A. thaliana*, rice and maize LEA protein or coding sequences were retrieved from Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html) [28,32]. These LEA protein sequences were used as queries in a BLAST search of the whole *C. songorica* genome sequence, which was retrieved from the BMK cloud (http://www.biocloud.net/), using a local BLAST tool [52,53].
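As an illustration of how hits from such a homology search can be reduced to a non-redundant candidate list, the following minimal Python sketch parses BLAST tabular output (the standard 12-column `-outfmt 6` layout of BLAST+) and keeps one entry per subject sequence. It is not the procedure used in this study: the function name `collect_candidates`, the gene identifiers in the usage example, and the E-value cutoff of 1e-5 are all assumptions made for illustration, since no threshold is stated here.

```python
from typing import Dict, Iterable, Tuple


def collect_candidates(
    rows: Iterable[str], max_evalue: float = 1e-5
) -> Dict[str, Tuple[str, float]]:
    """Map each subject ID to its best (lowest E-value) query hit.

    `rows` are lines of BLAST+ tabular output (-outfmt 6): qseqid, sseqid,
    pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue,
    bitscore. The default cutoff is an illustrative assumption.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        if len(fields) < 12:
            continue  # skip malformed lines
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        # Keep a single record per subject, which removes redundancy
        # when several query LEA proteins hit the same sequence.
        if subject_id not in best or evalue < best[subject_id][1]:
            best[subject_id] = (query_id, evalue)
    return best
```

For example, with two hypothetical queries hitting the same subject and a third hit above the cutoff, only one candidate is retained:

```python
rows = [
    "AtLEA1\tCs01g001\t45.0\t120\t60\t2\t1\t120\t5\t124\t1e-30\t110",
    "OsLEA3\tCs01g001\t50.0\t118\t55\t1\t3\t120\t7\t124\t1e-40\t140",
    "ZmLEA2\tCs02g050\t30.0\t80\t50\t3\t1\t80\t10\t89\t0.5\t25",
]
candidates = collect_candidates(rows)
```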

The resulting non-redundant sequences were further examined with the Hidden Markov Model available in the Pfam database (http://pfam.sanger.ac.uk/search) [13] and then submitted to the SMART database (http://smart.embl-heidelberg.de/) [54] and the NCBI Conserved Domain Search database (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) [55] to confirm the Pfam domain families of the CsLEA proteins. The LEA nucleotide and protein sequences obtained were then submitted to GenBank to obtain their accession numbers (Table S1).
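The cross-database confirmation can be pictured as a simple intersection of results. The sketch below is a hedged illustration only: it assumes, which the text does not state, that a candidate is retained only when Pfam, SMART and the NCBI Conserved Domain Search all report an LEA-type domain. The function name `confirm_families` and the gene identifiers are hypothetical.

```python
from typing import Dict, Set


def confirm_families(
    pfam_families: Dict[str, str],
    smart_ids: Set[str],
    cdd_ids: Set[str],
) -> Dict[str, str]:
    """Return gene -> Pfam family for genes supported by all three sources.

    `pfam_families` maps a candidate gene ID to the Pfam family assigned to
    it; `smart_ids` and `cdd_ids` are the gene IDs for which SMART and the
    Conserved Domain Search reported an LEA-type domain. Requiring all
    three is an assumption of this sketch.
    """
    return {
        gene: family
        for gene, family in pfam_families.items()
        if gene in smart_ids and gene in cdd_ids
    }
```

A usage example with hypothetical identifiers, in which one candidate lacks support from the Conserved Domain Search and is therefore dropped:

```python
pfam = {"Cs01g001": "LEA_2", "Cs03g210": "Dehydrin", "Cs05g077": "LEA_4"}
confirmed = confirm_families(
    pfam,
    smart_ids={"Cs01g001", "Cs03g210", "Cs05g077"},
    cdd_ids={"Cs01g001", "Cs03g210"},
)
```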
