**2. Results and Discussion**

Thirteen strains of marine bacteria were isolated from coastal seawater and sediments of the Sea of Japan, from marine invertebrates (the mussel *Crenomytilus grayanus* from the Sea of Japan and the deep-water sponge *Esperiopsis digitata* from the Sea of Okhotsk), and from the red alga *Ahnfeltia tobuchiensis* from the Sea of Okhotsk. These strains, deposited in the Collection of Marine Microorganisms (KMM, G.B. Elyakov Pacific Institute of Bioorganic Chemistry, Far Eastern Branch, Russian Academy of Sciences, http://www.piboc.dvo.ru/), were assigned to the genus *Cobetia* by physiological, biochemical and molecular genetic parameters, using sequencing and phylogenetic analysis of their 16S rRNA genes (Table 1). However, the strain-specific metabolic versatility and ecophysiological diversity of these *Cobetia* isolates did not allow their species to be distinguished [9,20,21]. Our study showed that 7 of the 13 strains have 99.86–100% 16S rRNA gene identity simultaneously to two type strains, *C. marina* LMG 2217<sup>T</sup> (JCM 21022<sup>T</sup>) and *C. pacifica* KMM 3879<sup>T</sup> (NRIC 0813<sup>T</sup>); one strain has 100% identity to the type strain *C. crustatorum* JCM 15644<sup>T</sup>; and three strains have 100% identity to the type strain *C. amphilecti* KMM 1561<sup>T</sup> (NRIC 0815<sup>T</sup>) (Table 1). The strain *Cobetia* sp. 2AS (KMM 7514), isolated from coastal seawater, showed 99.93% and 99.86% similarity to *C. amphilecti* KMM 1561<sup>T</sup> and *C. litoralis* KMM 3880<sup>T</sup>, respectively (Table 1). Of four independent replicates, half of the 16S rRNA gene samples from clones of the strain *Cobetia* sp. 29-18-1 (KMM 7000), isolated from the sponge, were 100% identical to the type strain *C. amphilecti* KMM 1561<sup>T</sup> (NRIC 0815<sup>T</sup>), and the other half were 100% identical to the 16S rRNA gene of the type strain *C. litoralis* KMM 3880<sup>T</sup> (NRIC 0814<sup>T</sup>) (Table 1).
Accordingly, the EzBioCloud identification, which is based on 16S rRNA gene sequences of type strains [22], reported only one of the identical reference records in its database (Table S1). Furthermore, similarity calculations with the EzBioCloud 16S database showed that the 16S rRNA genes of *C. marina* JCM 21022<sup>T</sup> and *C. pacifica* KMM 3879<sup>T</sup> are completely identical (100% identity), and that the type strains *C. amphilecti* KMM 1561<sup>T</sup> and *C. litoralis* KMM 3880<sup>T</sup> differ by a single nucleotide polymorphism (99.93% identity) (Tables S2 and S3).
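
The relation between a single nucleotide polymorphism and the reported 99.93% identity can be illustrated with a short calculation. The following is a minimal sketch, not the EzBioCloud procedure: it assumes that the two sequences are already aligned, and it uses a hypothetical compared length of 1430 nt, which is not stated in the text and was chosen only because it reproduces the reported percentages. The sequences are synthetic placeholders, not real *Cobetia* 16S rRNA genes.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Assumption: 1430 compared positions (hypothetical, not taken from the paper).
ASSUMED_LENGTH = 1430
# Synthetic placeholder sequence, NOT a real 16S rRNA gene.
reference = ("ACGT" * 358)[:ASSUMED_LENGTH]
one_snp = "G" + reference[1:]     # one substitution at the first position
two_snps = "GA" + reference[2:]   # substitutions at the first two positions

for label, seq in [("identical", reference), ("1 SNP", one_snp), ("2 SNPs", two_snps)]:
    print(f"{label}: {percent_identity(reference, seq):.2f}%")
```

Under this assumed length, one mismatch gives 99.93% and two mismatches give 99.86%, which matches the values reported for the type strains and for strain KMM 7514. Differences of this size fall within one or two nucleotides, which is why 16S rRNA identity alone cannot separate these species.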

Comparative genomics of *Cobetia* isolates is also currently impossible because whole genome sequences are not yet available for the type strains, except for *C. marina* JCM 21022<sup>T</sup> [8]. Moreover, next-generation sequencing (NGS) cannot always resolve this problem quickly without detailed bioinformatics analysis. For example, whole genome shotgun sequencing of the strain *Cobetia* sp. 2AS1 (KMM 7005; GenBank: JADAZN000000000.1) did not recover its complete 16S rRNA gene; consequently, the EzBioCloud identification system "Genome-based ID" [22] indicated the same similarity (100%) of the partial sequence to both *C. amphilecti* KMM 1561<sup>T</sup> and *C. litoralis* KMM 3880<sup>T</sup>.


**Table 1.** The species identification of *Cobetia* isolates from the Pacific Ocean.

**\*** KMM—Collection of Marine Microorganisms, G.B. Elyakov Pacific Institute of Bioorganic Chemistry, Far Eastern Branch, Russian Academy of Sciences, http://www.piboc.dvo.ru/.

To clarify the species identity of the coastal seawater isolate *Cobetia* sp. 2AS1 (KMM 7005), we attempted to obtain PCR products using the sequences of structural genes encoding the key metabolic hydrolases of the non-type, mussel-associated strain *C. amphilecti* KMM 296 (GenBank: JQJA00000000.1), which were selected for functional genomic studies of this species. Putative coding DNA sequences (CDSs) for the following enzymes were selected as identification markers: alkaline phosphatases of the structural families PhoA (DQ435608) and PhoD (WP\_043333989); EEP-like (DNaseI-like) nuclease (WP\_084589364); DNA/RNA non-specific (S1-like) endonuclease (WP\_043334786); ATP-dependent protease Clp (KGA03297); phospholipase A (WP\_084589432); and periplasmic serine peptidase with a trypsin-like peptidase domain of the Do/DeqQ family (KGA03014). The Do/DeqQ family serine peptidase has a chaperone function at low temperatures and proteolytic activity at elevated temperatures, thus protecting bacteria from thermal and other stresses [23]. Clp proteases are involved in a number of cellular processes, such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins, control of cell growth, and targeting of DNA-binding protein from starved cells [23]. The large EEP (exonuclease/endonuclease/phosphatase) domain superfamily (structural family: cl00490) includes a diverse set of proteins, including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), and deoxyribonuclease I (DNaseI), which share a common catalytic mechanism of cleaving phosphodiester bonds, with substrates ranging from nucleic acids to phospholipids and, probably, proteins [23]. The DNA/RNA endonuclease, which belongs to the NUC superfamily (structural family: cl00089), can non-specifically cleave both double- and single-stranded DNA and RNA; its domain may also be present in phosphodiesterases [23]. Thus, all these enzymes are fundamental for bacterial survival [24–29]. If the strains *Cobetia* sp. 2AS1 (KMM 7005) and *C. amphilecti* KMM 296 belong to the same species, they should show a high level of CDS similarity and the same pattern of presence and distribution of PCR products on electrophoresis. In addition, the same PCR primers were applied to all type strains and new isolates belonging to the genus *Cobetia* (Figures 1 and 2).

**Figure 1.** Gel electrophoresis of PCR products of the new molecular markers: (**A**) *C. amphilecti* **KMM 1561<sup>T</sup>**, *C. amphilecti* KMM 296 and KMM 7516; (**B**) *C. marina* **LMG 2217<sup>T</sup>** and KMM 6284; (**C**) *C. litoralis* **KMM 3880<sup>T</sup>**, KMM 7000, KMM 7005 and KMM 7514; (**D**) *C. pacifica* **KMM 3879<sup>T</sup>**, KMM 6731, 7505, 7508, 7515, 6816, and 6818. **Lane numbers: 1**—DNA/RNA non-specific endonuclease precursor; **2**—DNA/RNA non-specific endonuclease; **3**—EEP-like (DNaseI-like) nuclease; **4**—alkaline phosphatase/phosphodiesterase PhoD; **5**—phospholipase A; **6**—ATP-dependent protease Clp; **7**—periplasmic serine peptidase Do/DeqQ; **8**—CmAP-like alkaline phosphatase PhoA; **M**—1 kb DNA ladder marker (Evrogen).

Figure 1 shows the results of gel electrophoresis of the PCR products for the type strains of the *Cobetia* species *C. amphilecti* NRIC 0815<sup>T</sup> (KMM 1561<sup>T</sup>) (A), *C. marina* LMG 2217<sup>T</sup> (B), *C. litoralis* NRIC 0814<sup>T</sup> (KMM 3880<sup>T</sup>) (C), and *C. pacifica* NRIC 0813<sup>T</sup> (KMM 3879<sup>T</sup>) (D); the results for *C. crustatorum* JCM 15644<sup>T</sup> are shown in Figure 2. The type strain of each species is characterized by an individual distribution of PCR products, which could allow the new *Cobetia* isolates to be classified by these patterns (Figures 1 and 2). Based on this PCR-based molecular differentiation, which uses gene-specific primers and the genomic DNA of the strains under study, the strains can be divided into five groups corresponding to five species of the genus *Cobetia*. The type strain *C. amphilecti* NRIC 0815<sup>T</sup> (KMM 1561<sup>T</sup>) and the strains *C. amphilecti* KMM 296 and KMM 7516 had an identical distribution pattern for the PCR products of all eight new molecular markers, which indicates their 100% homology, in accordance with the results of the 16S rRNA analysis (Figure 1A, Table 1 and Table S1).
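
The idea of classifying an isolate by comparing its PCR product pattern with those of the type strains can be sketched as a simple set comparison. This is an illustrative simplification under stated assumptions: each pattern is reduced to the set of lanes (1 to 8) in which a product is visible, whereas the gels also carry band number and position within a lane. The reference names and patterns below are invented placeholders and are not read from Figures 1 and 2.

```python
from typing import Dict, FrozenSet, List

Pattern = FrozenSet[int]  # lanes (1-8) in which a PCR product is visible


def matching_references(isolate: Pattern, references: Dict[str, Pattern]) -> List[str]:
    """Names of the reference strains whose lane pattern equals the isolate's."""
    return sorted(name for name, ref in references.items() if ref == isolate)


def differing_lanes(a: Pattern, b: Pattern) -> List[int]:
    """Lanes in which exactly one of the two patterns shows a product."""
    return sorted(a ^ b)


# Placeholder patterns invented for illustration; NOT the patterns in Figures 1 and 2.
reference_patterns: Dict[str, Pattern] = {
    "reference_A": frozenset({1, 2, 3, 4, 5, 6, 7, 8}),
    "reference_B": frozenset({1, 2, 3, 4, 5, 6, 7}),
    "reference_C": frozenset({2, 3, 6}),
}
isolate_pattern: Pattern = frozenset({1, 2, 3, 4, 5, 6, 7})

print(matching_references(isolate_pattern, reference_patterns))
print(differing_lanes(reference_patterns["reference_A"], reference_patterns["reference_B"]))
```

With these placeholders, the isolate matches only `reference_B`, and the two closest references differ in lane 8 alone. An exact-match rule is deliberately strict; an isolate that matches no reference is left unassigned rather than forced into the nearest group.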

**Figure 2.** Gel electrophoresis of PCR products of the new molecular markers in the strains *C. crustatorum* JCM 15644<sup>T</sup> and *C. crustatorum* KMM 6817. Lane numbers: **1**—DNA/RNA non-specific endonuclease precursor; **2**—DNA/RNA non-specific endonuclease; **3**—EEP-like (DNaseI-like) nuclease; **4**—alkaline phosphatase/phosphodiesterase PhoD; **5**—phospholipase A; **6**—ATP-dependent protease Clp; **7**—periplasmic serine peptidase Do/DeqQ; **8**—CmAP-like alkaline phosphatase PhoA; **M**—1 kb DNA ladder marker (Evrogen).

The next group of strains with an identical PCR pattern includes two *C. marina* strains, LMG 2217<sup>T</sup> and KMM 6284 (Figure 1B), which, as expected, lacked most PCR products because of their more distant relation to *C. amphilecti* in both their 16S rRNA genes and whole genome sequences [8,9].

Remarkably, the strain *Cobetia* sp. KMM 7000 and the two strains KMM 7514 (2AS) and KMM 7005 (2AS1), which have higher 16S rRNA sequence homology with the species *C. amphilecti* (Table 1 and Table S1), showed complete identity of the PCR product distribution with the type strain *C. litoralis* KMM 3880<sup>T</sup> (Figure 1C). However, the type strains of both *C. amphilecti* and *C. litoralis* were found to possess highly homologous CDSs for all the predicted hydrolases used here as molecular markers (Figure 1A,C), except for the well-studied, highly active PhoA-like alkaline phosphatase CmAP (Figure 1A,C, lanes **8**) [30]. This raised the question of how close these strains are to each other, given that their physiological parameters and the results of DNA:DNA hybridization allowed them to be attributed to different biological species [7]. A comparative analysis of the whole genome sequence of *Cobetia* sp. 2AS1 (KMM 7005) (GenBank: JADAZN000000000.1), using the SEED Viewer at the Rapid Annotations using Subsystems Technology (RAST) server [31], confirmed that most CDSs for the enzymes used as marker genes (Table S4; column **B**: 2373, 3075, 2144, 1612, 617, 1748, 322) are 99.32–99.64% similar to those of *C. amphilecti* KMM 296, except for phospholipase A1 (97.14%) and protease Clp (100%). However, an alkaline phosphatase PhoA structurally similar to the alkaline phosphatase CmAP from *C. amphilecti* KMM 296 is absent in *Cobetia* sp. 2AS1 (KMM 7005) (Table S4). A putative orthologue (alkaline phosphatase EC 3.1.3.1), which should carry out a similar function, showed only 38.39% identity with CmAP (Table S4, column B: 1612). Overall, 94.84% of the 3253 CDSs of *Cobetia* sp. KMM 7005 (GenBank: JADAZN000000000.1) showed an average similarity of 83.5% with CDSs of *C. amphilecti* KMM 296 (GenBank: JQJA00000000.1), which corresponds approximately to an 88% average nucleotide identity (ANI) of their genomes (Table S4).
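
The genome-wide summary above combines two quantities: the share of CDSs that have a counterpart in the other genome and the mean similarity over those counterparts. A minimal sketch of such a summary is given below. It assumes a per-CDS table like Table S4, with one similarity value per CDS or no value when no counterpart is found; the toy values are invented, and the exact procedure behind Table S4 is not described here. This is not an ANI calculation, which is performed on nucleotide alignments of genome fragments.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def summarize_cds_hits(identities: Sequence[Optional[float]]) -> Tuple[float, float]:
    """Return (percent of CDSs with a counterpart, mean identity over those CDSs)."""
    hits = [value for value in identities if value is not None]
    if not hits:
        raise ValueError("no CDS has a counterpart")
    return 100.0 * len(hits) / len(identities), mean(hits)


# Toy per-CDS identities invented for illustration; None means "no counterpart found".
toy_identities = [99.5, 97.0, None, 38.5]
share_with_hit, mean_identity = summarize_cds_hits(toy_identities)
print(f"{share_with_hit:.2f}% of CDSs matched, mean identity {mean_identity:.2f}%")

# Figures reported in the text: 94.84% of 3253 CDSs have a counterpart.
matched_cds = round(3253 * 94.84 / 100)
unmatched_cds = 3253 - matched_cds
print(matched_cds, unmatched_cds)
```

Applied to the reported figures, 94.84% of 3253 CDSs corresponds to about 3085 CDSs with a counterpart in *C. amphilecti* KMM 296 and about 168 without one.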

The high identity of the 16S rRNA genes (99.93–100%) and whole genomes (88%) may mean that the strains of *C. amphilecti* and *C. litoralis* belong to the same species but are currently undergoing significant phenotypic and genotypic divergence because of adaptive evolution [32]. Possibly, the highly active alkaline phosphatase PhoA was acquired by the cosmopolitan *Cobetia* strains during their attempts to colonize an invertebrate digestive tract, owing to the putatively significant role of the enzyme in the relationship (symbiotic or pathogenic) between marine inhabitants, such as *C. amphilecti* KMM 296 and the mussel *C. grayanus*, or *C. amphilecti* KMM 1561<sup>T</sup> and the eponymous sponge *Amphilectus digitatus* [7,9,20,24]. Meanwhile, the closely related strains of *C. litoralis*, including the type strain KMM 3880<sup>T</sup>, were isolated predominantly from coastal sediments; therefore, they may not need such an enzymatically active and specific alkaline phosphatase as CmAP [7,30]. The 16S rRNA heterogeneity of *C. litoralis* KMM 7000 may be additional evidence of species divergence due to adaptation to the colonization of marine invertebrates, which are the predominant habitats of the closely related strains of the species *C. amphilecti*, including KMM 296 and KMM 1561<sup>T</sup> [7,20,21]. For example, the squid-vibrio symbiosis is made possible by host alkaline phosphatases that modulate the lipid A signaling of the bacterial symbiont, facilitating its colonization of the juvenile squid's light organ [33]. The urgent need for mineralization and repair of the invertebrate's exoskeleton can also be a key factor in symbiosis with a carrier of a highly efficient nonspecific phosphatase like CmAP [9,30,34]. However, the conclusion should be drawn only after sequencing the whole genomes of the *Cobetia* type strains and elucidating the biological functions of their species- and strain-specific genes and proteins. In addition, such high adaptability and metabolic versatility under various environmental conditions call for investigation of the possible contribution of the bacterium to the toxicity or pathogenicity of shellfish, particularly for humans who consume them raw.

A similar situation may apply to the other closely related species, *C. marina* and *C. pacifica*. The largest group of our isolates, assigned to the species *C. marina/C. pacifica* (KMM 7508, 7515, 6816, 6818, 7505 and 6731), has 100% 16S rRNA identity with both species, but their results from the suggested PCR-based method correspond to the species-specific pattern of the type strain *C. pacifica* KMM 3879<sup>T</sup> (Figure 1D, Table 1 and Table S1). The main differences between these species were in lanes 1, 4, and 5, indicating differences in their PCR-targeted sequences (Figure 1B,D).

Finally, *C. crustatorum* KMM 6817 and the type strain *C. crustatorum* JCM 15644<sup>T</sup> differ significantly from the other groups of *Cobetia* isolates in the number and location of the PCR product bands, which corresponds to the 16S rRNA analysis results (Figure 2, Table 1 and Tables S1–S3).

Thus, the suggested molecular markers used in the PCR-based method made it possible to distinguish the isolates of *C. marina* and *C. amphilecti* from the isolates of *C. pacifica* and *C. litoralis*, respectively. Of the seven isolates with indistinguishable 16S rRNA sequences, only one, from the red algae seeds, was identified as *C. marina* (KMM 6284), and the others were identified as *C. pacifica* (KMM 7508, 7515, 6816, 6818, 7505, and 6731), isolated from algae and coastal seawater (Table 1). Five isolates of *C. amphilecti* and *C. litoralis*, indistinguishable by 16S rRNA analysis, were assigned as two *C. amphilecti* (KMM 296 from the mussel and KMM 7516 from coastal seawater) and three *C. litoralis* (KMM 7005, KMM 7514 from sediments and KMM 7000 from the sponge). The strain *Cobetia* sp. KMM 6817 was easily assigned to the species *C. crustatorum* according to the results of both methods of analysis, which supports the validity of the suggested molecular markers (Table 1).

According to the results of the *Cobetia* species identification at this stage of investigation, there is a tendency for the predominant association of the Pacific Ocean populations of *C. pacifica*, *C. amphilecti* and *C. litoralis* with algae, invertebrates and sediments or coastal water, respectively (Table 1). However, to confirm these observations, a more extensive search for isolates from different habitats should be carried out.
