*2.1. Study Area*

Gwangyang Bay is part of the Korean National Archipelagos located o ff the south coast of the Korean peninsula (Figure 1). The bay receives an annual mean discharge of 2298 × 10<sup>6</sup> m<sup>3</sup> yr<sup>−</sup><sup>1</sup> from the Seomjin River [26]. A significant amount of nutrients drains into the system from the watershed (~5000 km2). The water depth varies from 10 m at the Seomjin River estuary to 50 m at the outer Gwangyang Bay. The tidal cycle appears to be semi-diurnal. Compared to other Korean river estuaries which have barriers, the Seomjin River estuary remains open, and thus the water mass is exchanged between the river and ocean more actively. The natural condition of Gwangyang Bay is apt to increase primary productivity as well as biological diversity. In this respect, Gwangyang Bay (~450 km<sup>2</sup> from the estuary to the outer bay) is the most economically productive coastal ecosystem in the Korean peninsula [25].

### *2.2. Sampling, Data Collection and Primer Selection*

A survey permitted by Ha-dong local governmen<sup>t</sup> (permission number: 2010-0165) was conducted in June 2018. We sampled the surface water (approximately the top 50 cm) in this study. In total, nineteen sampling sites were selected, which covered an extensive area from the Seomjin River estuary to the outer Gwangyang Bay (Figure 1). According to Kim et al. [25], at Gwangyang Bay, higher water temperatures corresponded to lower salinity, and vice versa. Phosphorus and nitrogen concentrations were spatially similar across the Bay. Based on the divisions indicated in Kim et al. [25], Zone I covered the inner Bay (sites 3–7, and 9), Zone II represented the main channel of the Bay (sites 8, 10, and 11), and Zone III mostly belonged to the outer Bay (sites 12–21) (Figure 1). Water samples for meta-barcoding analysis (more than 1 L per sample) were obtained at the same time and moved with dry ice to laboratory to filter (0.45 μm pore-size membrane; Advantec MFS membrane filter, Dublin, USA) and stored at −80 ◦C before NGS analysis. Negative controls were included for every study site to prevent cross contamination. Water temperature and salinity were measured on-site using portable equipment (Model: YSI Professional Plus, OH, USA), while the nutrient and chlorophyll-*a* concentrations were analyzed in the lab. Phosphorus and nitrogen concentrations were measured using an UV spectrophotometer based on standard analytical methods proposed by the Korean Ministry of Oceans and Fisheries. Chlorophyll-*a* measurements were also based on UV spectrophotometry. In contrast to nutrient measurements, chlorophyll-*a* samples were filtered through a 0.45 μm pore-size membrane (Model: Advantec MFS membrane filter). The filter membrane was then homogenized after acetone extraction prior to spectrophotometry. Organic and inorganic carbon concentrations were measured using a carbon analyzer (Model: vario TOC cub, Langenselbold, Germany) using 850 ◦C combustion catalytic oxidation methods.

**Figure 1.** Map of the study sites with division of zones referred to by Kim et al. (2019) [25].

We selected the V9 region of the 18S rDNA gene (18S V9), primarily because of its broad range among eukaryotes [27,28]. NGS approaches using the 18S V9 region have recently allowed the characterization of marine planktonic biodiversity in the oceans [29] and prompted biomarker establishment initiatives [30].

### *2.3. DNA Extraction and Metagenomic Sequencing*

Genomic DNA was extracted using PowerSoil® DNA Isolation Kit (Cat. No. 12888, Qiagen, Düsseldorf, Germany) according to the manufacturer's protocol. DNA extracted for sequencing was prepared according to Illumina 18S Metagenomic Sequencing Library protocols (San Diego, CA, USA). DNA quantity, quality, and integrity were measured by PicoGreen (Thermo Fisher Scientific, Waltham, MA, USA) and VICTOR Nivo Multimode Microplate Reader (PerkinElmer, Akron, OH, USA). The 18S rRNA gene was amplified using 18S V9 primers. The primer sequences are as follows: 18S V9 primer including adaptor sequence (Forward Primer: 5 TCGTCGGCAGCGTCAGATGTGTATAA GAGACAGCCCTGCCHTTTGTACACAC 3/Reverse Primer: 5 GTCTCGTGGGCTCGGAGATG TGTATAAGAGACAGCCTTCYGCAGGTTCACCTAC 3). To amplify the target region attached with adapters, as a first PCR process, the extracted DNA was amplified by 18S V9 primers with one cycle of 3 min at 95 ◦C, 25 cycles of 30 s at 95 ◦C, 30 s at 55 ◦C, 30 s at 72 ◦C, and a final step of 5 min at 72 ◦C for amplicon PCR product. As a second process, to produce indexing PCR, the first PCR product was

subsequently amplified with one cycle of 3 min at 95 ◦C, eight cycles of 30 s at 95 ◦C, 30 s at 55 ◦C, 30 s at 72 ◦C, and a final step of 5 min at 72 ◦C. The final products were normalized and pooled using PicoGreen (Thermo Fisher Scientific, Waltham, MA, USA), and the size of libraries were verified using the LabChip GX HT DNA High Sensitivity Kit (PerkinElmer, Akron, OH, USA).

The library was sequenced using the MiSeq™ NGS platform (Illumina, San Diego, CA, USA) provided as a commercial service (Macrogen Inc., Seoul, Korea). Raw reads were trimmed with CD-HIT-OTU and chimeras were identified and removed using rDnaTools. For paired-end merging, FLASH (Fast Length Adjustment of SHort reads) version 1.2.11 was used. Merged reads were processed using Qiime version 1.9 [31] and were clustered into operational taxonomic units (OTUs) with UCLUST [32], using a greedy algorithm with OTUs at a 97% OUT cutoff value. Taxonomic classifications were assigned to the obtained representative sequences using BLASTn [33] and UCLUST [32].
