## *2.11. Genome Analysis*

The whole-genome sequence of *Rhodovulum* sp. strain MB263, which was isolated previously from a pink-blooming pool [21] in the tidal flat area where we found blooms Y1 and Y3 in this study, was determined. An axenic culture of this strain has been deposited with the Biological Resource Center, National Institute of Technology and Evaluation, Kisarazu, Japan, under accession number NBRC 112775. For comparison, *Rdv. sulfidophilum* strain DSM 1374<sup>T</sup>, obtained from DSMZ-German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany, was also subjected to whole-genome sequencing. Genomic DNA was extracted from phototrophically grown cultures using the CTAB method [47]. Genome sequencing and gap closing of the *Rhodovulum* strains were performed using a previously established pipeline [48]. Briefly, a PCR-free paired-end library was prepared with a KAPA Hyper prep kit (Roche Sequencing and Life Science KAPA Biosystems, Wilmington, MA, USA) after shearing of genomic DNA into ~550-bp fragments using an M-220 focused-ultrasonicator (Covaris, Woburn, MA, USA). A mate-pair library of ~8 kbp insert length was prepared with a Nextera mate-pair sample preparation kit (Illumina, San Diego, CA, USA). Both libraries were sequenced on an Illumina MiSeq system with a MiSeq reagent kit version 3 (600 cycles) for *Rhodovulum* sp. MB263 and a MiSeq reagent kit version 2 (500 cycles) for *Rdv. sulfidophilum* DSM 1374<sup>T</sup>. Removal of junction adapter sequences and conversion of the mate-pair reads from RF to FR orientation were performed with ShortReadManager, an accessory tool of GenoFinisher (http://www.ige.tohoku.ac.jp/joho/genoFinisher/) [49]. The paired-end and mate-pair reads were assembled with Newbler version 2.9 [50]. Sequence gaps between the scaffolds and contigs were determined in silico using GenoFinisher and AceFileViewer [49], followed by PCR and Sanger sequencing. The finished sequence was validated by FinishChecker, an accessory tool of GenoFinisher. Annotation was performed using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP, https://www.ncbi.nlm.nih.gov/genome/annotation\_prok/) [51].
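The RF-to-FR conversion of mate-pair reads mentioned above can be illustrated with a minimal Python sketch. It assumes that the conversion amounts to reverse-complementing both reads of each pair and reversing their quality strings; this is the usual way to turn outward-facing (RF) pairs into inward-facing (FR) pairs, but it is not ShortReadManager's actual implementation, and the function names are hypothetical.

```python
from typing import Tuple

# A read is represented here as (sequence, quality string); this layout is an
# assumption made for illustration only.
Read = Tuple[str, str]

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def rf_to_fr(read1: Read, read2: Read) -> Tuple[Read, Read]:
    """Convert one outward-facing (RF) mate pair to inward-facing (FR).

    Both reads are reverse-complemented, and each quality string is reversed
    so that every quality value stays aligned with its base.
    """
    converted = tuple(
        (reverse_complement(seq), qual[::-1]) for seq, qual in (read1, read2)
    )
    return converted[0], converted[1]


if __name__ == "__main__":
    mate1 = ("AACG", "IIH#")
    mate2 = ("TTGN", "ABCD")
    print(rf_to_fr(mate1, mate2))
```

Junction adapter removal is a separate step and is not covered by this sketch.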

## *2.12. Phylogenomic Analysis*

Average nucleotide identity (ANI) values [52] between the genome sequence of *Rhodovulum* sp. MB263 and those of other *Rhodovulum* strains were estimated using an ANI calculator (http://enve-omics.ce.gatech.edu/ani/index). A phylogenetic tree of 25 strains of *Rhodovulum* species was reconstructed by the maximum likelihood method based on concatenated sequences of 92 up-to-date bacterial core genes (UBCGs), which were prepared using the UBCG pipeline [53]. Tree reconstruction, gene support index calculation, and bootstrap tests (100 replications) were performed using RAxML version 8.2.11 with the -m GTRCAT -f a -# 100 options [54]. The UBCG tree was visualized with the MEGA7 program [45].
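For readers who wish to reproduce the tree reconstruction, the following Python sketch assembles a RAxML command line with the options stated above. The executable name `raxmlHPC`, the alignment file name, the run name, and the random seeds are assumptions for illustration; the `-p` and `-x` seeds are included because RAxML requires them for the rapid bootstrap analysis selected by `-f a`, and the values used in the original analysis are not reported here.

```python
import shlex
from typing import List


def raxml_command(
    alignment: str,
    run_name: str,
    replicates: int = 100,
    seed: int = 12345,  # hypothetical seed; not taken from the study
    executable: str = "raxmlHPC",  # binary name may differ per installation
) -> List[str]:
    """Build a RAxML 8 command for ML search plus rapid bootstrapping."""
    return [
        executable,
        "-f", "a",               # rapid bootstrap + best-scoring ML tree search
        "-m", "GTRCAT",          # substitution model stated in the text
        "-#", str(replicates),   # number of bootstrap replicates
        "-p", str(seed),         # parsimony random seed (assumed value)
        "-x", str(seed),         # rapid bootstrap random seed (assumed value)
        "-s", alignment,         # concatenated UBCG alignment (hypothetical name)
        "-n", run_name,          # suffix for RAxML output files
    ]


if __name__ == "__main__":
    cmd = raxml_command("ubcg_concatenated.fasta", "rhodovulum_ubcg")
    # Print the command for inspection; pass `cmd` to subprocess.run to execute.
    print(shlex.join(cmd))
```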

## *2.13. Statistical and Numerical Analysis*

Correlation analysis between different parameters was performed using Microsoft Excel. Differences in *pufM* gene sequence-based community structure among environmental samples were evaluated using the dissimilarity (*D*) index [22], which is a modification of city-block distance between two samples with *k* dimensions. In this study, *k* corresponded to the number of *pufM* phylotypes (22 phylotypes) as described below. Multi-dimensional scaling (MDS) of the *D* matrix data was performed using the XLSTAT program (Addinsoft, New York, NY, USA).
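As an illustration of a city-block-based dissimilarity between samples, the Python sketch below computes a pairwise matrix from phylotype counts. The exact modification defined in [22] is not reproduced here. The sketch assumes that each sample is first converted to relative abundances and that the city-block distance is halved so that values range from 0 (identical composition) to 1 (no shared phylotypes). The sample names and counts in the usage example are invented, and only three phylotypes are shown instead of the 22 used in this study.

```python
from typing import Dict, List, Sequence


def relative_abundance(counts: Sequence[float]) -> List[float]:
    """Convert raw phylotype counts to fractions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample contains no counts")
    return [c / total for c in counts]


def dissimilarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Halved city-block distance between two relative-abundance profiles.

    This normalization is an assumption, not the definition given in [22].
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same number of phylotypes (k)")
    pa, pb = relative_abundance(a), relative_abundance(b)
    return 0.5 * sum(abs(x - y) for x, y in zip(pa, pb))


def dissimilarity_matrix(samples: Dict[str, Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix of pairwise dissimilarities, in dict insertion order."""
    names = list(samples)
    return [[dissimilarity(samples[i], samples[j]) for j in names] for i in names]


if __name__ == "__main__":
    # Hypothetical clone counts for three phylotypes in three samples.
    example = {"S1": [10, 10, 0], "S2": [10, 0, 10], "S3": [0, 0, 5]}
    for name, row in zip(example, dissimilarity_matrix(example)):
        print(name, [round(v, 3) for v in row])
```

A matrix of this kind is the input expected by MDS routines; the ordination itself was done in XLSTAT and is not reimplemented here.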

## *2.14. Accession Numbers*

The *pufM* gene sequences determined in this study have been deposited under DDBJ accession numbers LC512373–LC512431. The complete genome sequence of *Rhodovulum* sp. MB263 was deposited in GenBank under accession numbers CP020384.1 for the chromosome, CP020385.1 for plasmid pRSMBA, and CP020386.1 for plasmid pRSMBB. The BioSample and BioProject IDs are SAMN06610252 and PRJNA379495, respectively. The complete genome sequence of *Rdv. sulfidophilum* DSM 1374<sup>T</sup> was deposited in GenBank under accession numbers CP015418.1 for the chromosome, CP015419.1 for plasmid unamed1, and CP015420.1 for plasmid unamed2. The BioSample and BioProject IDs are SAMN04903811 and PRJNA319729, respectively.
