In this study, the total genomic DNA extracted from both the frozen milt and scale samples of
Tor spp. produced identical SSR loci with the expected sizes (
Supplementary Materials, Figures S1 and S2). Similar findings were also obtained in the study on Whitefish (
Coregonus lavaretus L.) using DNA from adipose fins and cryopreserved milt [
71]. SSR alleles were also proven to be the same from different tissues of the same individual, as reported in a study of Chinese Holstein bulls using blood and semen samples [
72]. Therefore, total genomic DNA obtained from different sample types can be expected to amplify similar and consistent PCR products.
4.1. Genetic Diversity of the Tor spp. Collection
The
Tor collection in this study still retained a reasonable amount of genetic variation, with allelic richness of 1 to 33 alleles per locus, 76.45% polymorphic loci, and an average PIC of 0.4942. Compared to studies on the same species by [
32,
34,
36], the microsatellite loci assessed in the present study showed the highest polymorphism level and allelic richness. The number of alleles detected is influenced by sample size [
73,
74]. Generally, the larger the sample size, the higher the number of alleles detected in the population. Allelic richness, corrected for unequal sample sizes at each locus, is therefore the preferred measure of genetic diversity.
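The sample-size correction mentioned above is commonly done by rarefaction. The following is a minimal sketch, assuming the standard rarefaction formula (the expected number of distinct alleles in a random subsample of n gene copies); the function name and the allele counts are hypothetical and are not taken from this study's data or software.

```python
from math import comb

def rarefied_allelic_richness(allele_counts, n):
    """Expected number of distinct alleles in a random subsample of n gene copies.

    allele_counts: observed count of each allele at one locus in one population.
    n: standardised subsample size in gene copies (2 x individuals for diploids).
    """
    total = sum(allele_counts)
    if n > total:
        raise ValueError("subsample size exceeds the number of gene copies sampled")
    # For each allele: 1 - P(allele is absent from the subsample).
    return sum(1 - comb(total - c, n) / comb(total, n) for c in allele_counts)

# Hypothetical allele counts at one locus (illustration only).
large_sample = [30, 20, 6, 3, 1]  # 60 gene copies, 5 alleles observed
small_sample = [9, 6, 1]          # 16 gene copies, 3 alleles observed

# Rarefy both samples to the size of the smaller one.
n = min(sum(large_sample), sum(small_sample))
print(round(rarefied_allelic_richness(large_sample, n), 3))
print(round(rarefied_allelic_richness(small_sample, n), 3))
```

Rarefying to the smallest sample puts populations of unequal sample size on a common scale, so a raw difference in allele counts is not mistaken for a difference in diversity.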
Allelic richness is an essential element in conservation programmes, as it is indicative of a population’s long-term potential for adaptability and persistence [
75,
76]. The allelic richness of the HLKW population from Indonesia (1–20 alleles per locus), believed to be
T. tambroides, was quite similar to that reported for the same species (5–21 alleles per locus) by [
36]. For
T. tambra in all the other populations that we studied, the allelic richness ranged from 1–2 to 1–17 alleles per locus, with most populations (i.e., FFRC, PPAP, AGHR, PHG, EMS and GPRK) having an allelic richness of around ten alleles per locus. The reported allelic richness for
T. tambra was ten alleles per locus in several studies [
32,
34,
36].
Most SSR loci (90.9%) revealed a significant deviation from HWE after the Bonferroni correction. Departures from HWE in most SSR loci are plausible and have been commonly reported in studies of natural populations in a wide range of fish species [
77,
78,
79,
80,
81]. Either an excess or a deficit of heterozygotes can cause departures from HWE. In this study, heterozygote deficiencies were observed in sixteen loci, whereas an excess of heterozygotes was detected in six loci. Heterozygote deficiency can be attributed to small sample size, inbreeding or genetic patchiness (Wahlund effect), a reduction in effective breeding population size (Ne), and the presence of null alleles, which cause an apparent excess of homozygotes. Small sample size may cause founder effects, bottleneck effects or genetic drift [
35,
82]. Selective breeding, over-exploitation and anthropogenic disturbances can result in inbreeding and a reduction in Ne [
83,
84,
85]. The high mutation rate in microsatellites increases the occurrence of null alleles [
86]. In this study, the small sample sizes of the AGHR and EMS populations (<10 samples each) could result in sampling error, inbreeding and bottleneck effects, all of which could cause heterozygote deficits, leading to deviations from HWE and, in turn, to population differentiation.
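The link between heterozygote deficit and departure from HWE can be illustrated with a short sketch. It assumes the textbook definitions of observed heterozygosity (Ho), expected heterozygosity (He = 1 − Σp²) and the inbreeding coefficient FIS = 1 − Ho/He, together with a plain Bonferroni threshold (α divided by the number of tests). The genotypes are hypothetical, and using 22 tests (one per locus) is an assumption made only for illustration.

```python
from collections import Counter

def heterozygosity_summary(genotypes):
    """Return (Ho, He, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    # Expected heterozygosity under HWE: 1 minus the sum of squared allele frequencies.
    h_exp = 1.0 - sum((count / (2 * n)) ** 2 for count in allele_counts.values())
    # Positive FIS indicates a heterozygote deficit, negative FIS an excess.
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after a plain Bonferroni correction."""
    return alpha / n_tests

# Hypothetical genotypes at one SSR locus (allele sizes in bp, illustration only).
genotypes = [(150, 150)] * 6 + [(150, 154)] * 2 + [(154, 154)] * 2
h_obs, h_exp, f_is = heterozygosity_summary(genotypes)
print(f"Ho={h_obs:.3f} He={h_exp:.3f} FIS={f_is:.3f}")
print(f"Bonferroni threshold for 22 tests: {bonferroni_threshold(0.05, 22):.5f}")
```

In this toy locus, Ho is well below He, so FIS is positive, which is the pattern described above as a heterozygote deficit.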
The small sample size in both EMS and AGHR populations has resulted in low genetic variation. In populations with small sample sizes, rare genotypes are unlikely to be included in the samples. Small population size also increases inbreeding and genetic drift, thus reducing genetic variability over the long term [
87]. When inbreeding occurs, the number of homozygotes will increase because the mating individuals have the same alleles. This excess homozygosity, in turn, causes heterozygosity deficit in the population [
88,
89,
90,
91].
Heterozygosity deficit in the present study could also be the consequence of null alleles or stuttering among the SSR loci. Both null alleles and stuttering were detected in nine populations in this study. Of the nine populations with null alleles, five populations, i.e., HLKW, MSJ, HLS, FFRC and PPAP, showed heterozygote deficits. A highly significant (p < 0.001) heterozygosity loss among and within the populations was also shown by the AMOVA and F-statistics analyses. For populations with null alleles in the present study, the null allele frequencies were generally relatively high (9.1–30.9%, data not shown). These populations showed highly significant genetic differentiation, with low gene flow among the populations, as shown by the AMOVA.
Each locus that deviated from HWE amplified at least one allele in all the samples. Thus, the low frequencies of null alleles were not enough to affect the analysis [
92]. Generally, null alleles with low frequencies between 5% and 8% would have only a minor effect on the classical estimation of population genetic parameters such as genetic diversity, population differentiation, population FST and genetic distances. However, when null alleles are present at frequencies higher than 10%, they can affect the genotyping of individuals at some loci and lead to the under-estimation of genetic diversity and the over-estimation of population differentiation. Genetic distances tend to be underestimated when null alleles occur at high frequencies [
93]. Null alleles at microsatellite loci with frequencies higher than 10%, and their consequences for estimating population structure and differentiation, have been reported in several studies on bivalve species [
94,
95,
96,
97,
98]. Nevertheless, in a study on Wedge Clam (
Donax trunculus), the presence of an unusually high frequency of null alleles (>10%) did not appear to affect the FST estimates significantly [
94].
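To make the frequency thresholds above concrete, the sketch below estimates a null allele frequency from heterozygote deficit using Brookfield's first estimator, r = (He − Ho)/(1 + He). This is an illustrative choice only: the estimator actually used to obtain the frequencies reported in this study is not stated in this section, and the input values and function names are hypothetical.

```python
def brookfield_null_frequency(h_obs, h_exp):
    """Brookfield estimator 1 of null allele frequency from Ho and He.

    Returns 0.0 when there is no heterozygote deficit (Ho >= He).
    """
    return max(0.0, (h_exp - h_obs) / (1.0 + h_exp))

def exceeds_threshold(frequency, threshold=0.10):
    """Flag loci whose estimated null allele frequency exceeds the 10% level
    discussed in the text as likely to bias diversity and differentiation."""
    return frequency > threshold

# Hypothetical locus with a marked heterozygote deficit (illustration only).
r = brookfield_null_frequency(h_obs=0.20, h_exp=0.42)
print(f"estimated null allele frequency: {r:.3f}, above 10%: {exceeds_threshold(r)}")
```

Because this estimator attributes the whole heterozygote deficit to null alleles, it cannot by itself separate null alleles from inbreeding or a Wahlund effect, which is why the causes listed earlier need to be considered together.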
The presence of null alleles in microsatellite data and their consequences for population genetic parameters have been tested using various analytical and simulation tools [
99] and actual population samples [
94]. As shown in the simulations by [
100], SSR loci with null alleles would lead to a slight overestimation of FST but are unlikely to affect genetic differentiation significantly. Therefore, SSR loci with null alleles that do not seem to alter the overall outcome of assignment testing can still be included in such studies. In this study, 163 out of 181 (90%) individuals were correctly assigned to their respective populations. Five out of nine populations with null alleles AGHR, EMS, FFRC, GPRK, KENS, PHG and PPAP exhibited 100% correctly assigned individuals. Therefore, all 22 SSR loci were kept and used in the present study.
4.2. Genetic Differentiation and Genetic Structure Analysis
In the present study, the fixation indices F
ST, F
IS and F
IT indicated a significant reduction in heterozygosity within and among the populations due to non-random mating. F
IS values significantly higher or lower than zero reveal inbreeding or outbreeding, respectively [
101]. As reflected in the AMOVA, genetic differentiation was at a medium level in the overall
Tor spp. collection (F
ST = 0.149), as evidenced by the low gene flow estimate (N
m = 1.548 per generation) and highly significant level (
p < 0.001) of inbreeding among individuals within population (F
IS = 0.106). Generally, it was observed in this study that the inter-population differentiation was low among the
Tor spp. populations in Malaysia. Similar findings were also reported in the previous study by [
33] on the same species. Nevertheless, a mixed level of population differentiation, from low to high, was observed among the
Tor populations, with the pairwise population F
ST values ranging from low (0.016) to very high genetic differentiation (0.237) following the F
ST classification by [
102]. Significant differences (
p < 0.05) were detected in 83.6% of the pairwise comparisons among populations, confirming divergence among the populations.
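The relationship between FST and gene flow can be sketched with Wright's island-model approximation, Nm ≈ (1 − FST)/(4·FST). This is an assumption made for illustration; the section does not state how the reported Nm was computed. Applied to FST = 0.149, the approximation gives about 1.43 migrants per generation, which is of the same order as the reported 1.548, though the two need not match if a different estimator or a differently rounded FST was used.

```python
def nm_from_fst(fst):
    """Approximate number of migrants per generation under Wright's island model."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)

def fst_from_nm(nm):
    """Inverse relationship: equilibrium FST expected for a given Nm."""
    return 1.0 / (4.0 * nm + 1.0)

print(round(nm_from_fst(0.149), 3))  # overall FST quoted in the text
print(round(nm_from_fst(0.016), 3))  # lowest pairwise FST quoted in the text
print(round(nm_from_fst(0.237), 3))  # highest pairwise FST quoted in the text
```

Under this model, Nm greater than one is conventionally read as enough gene flow to counteract divergence by drift alone, which is the reasoning applied to the overall Nm in the following paragraph.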
Genetic differentiation can be attributed to migration, geographical barriers, genetic drift and gene mutation [
103,
104]. Low gene flow resulted in little transfer of genetic variation from one population to another within the
Tor spp. collection. The proportional membership of
Tor spp. individuals with low genetic admixtures (individual
q value > 0.8), as shown in
Figure 4, revealed a low level of gene flow. Since the Nm value across the overall population in this study was greater than one, it was likely that genetic drift was not the main factor accounting for genetic differentiation among the
Tor spp. populations. From the pairwise F
ST generated for the
Tor spp. populations, relatively higher differentiation was found between HLKW and all other populations, indicating that this population is most likely a separate species (
T. tambroides). A similar pattern was seen in the EMS population, which showed significantly higher differentiation from the HLKW, KENS and GPRK populations but no significant differentiation from the remaining populations. For AGHR and EMS, small sample size is likely to be the main cause of population differentiation. For other populations, it seems more likely that the Wahlund effect due to geographic distances or habitat fragmentation may have caused the local genetic differentiation among the
Tor spp. by limiting the gene flow among the populations [
105,
106]. Habitat fragmentation reduces the genetic diversity and viability of small and isolated populations, consequently affecting the population genetic structure [
107].
4.3. Genetic Distance and Population Structure among Sampling Locations
Genetic distance among the
Tor spp. populations ranged from 6.2% to 42.3% in this study. A large genetic distance was observed between the HLKW population and the other populations, with a pairwise genetic coefficient ranging from 31.8% to 42.3%. This finding again supported the idea that the HLKW population was from a different
Tor species than the other populations because there is no interbreeding between different species. The pairwise genetic coefficient of the
T. tambra populations ranged from 6.2% to 29.3%.
T. tambra from EMS of Sarawak also showed a higher genetic distance from other
T. tambra populations in Peninsular Malaysia, ranging from 27.5% to 35.3%. The distinct population clusters were further supported by the results of the population assignment tests, PCoA and Bayesian cluster analysis. A high percentage of correctly assigned individuals indicated substantial genetic divergence among the populations [
108]. The pattern of clustering using Bayesian analysis was similar to the PCoA, with four genetically distinct groups formed according to the geographical origins of the
Tor spp. samples obtained. Moreover, the genetic admixture of all the
Tor stocks was relatively low (individual
q value > 0.8), indicating that individuals within each population were only weakly differentiated from one another. These genetically uncontaminated populations could serve as ideal sources of fresh alleles for future aquaculture and restocking programmes [
27].
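The q > 0.8 criterion used above can be expressed as a simple rule applied to each row of a membership (Q) matrix from a Bayesian clustering run. The sketch below is illustrative only: the Q values are invented, the function name is hypothetical, and the 0.8 cut-off is taken directly from the text.

```python
def classify_by_q(q_matrix, threshold=0.8):
    """Assign each individual to a cluster or flag it as admixed.

    q_matrix: one row per individual, one membership coefficient per cluster.
    Returns the index of the dominant cluster when its coefficient exceeds
    the threshold, otherwise None (treated here as admixed).
    """
    labels = []
    for q_row in q_matrix:
        best = max(range(len(q_row)), key=lambda k: q_row[k])
        labels.append(best if q_row[best] > threshold else None)
    return labels

# Hypothetical membership coefficients for K = 4 clusters (illustration only).
q_matrix = [
    [0.95, 0.03, 0.01, 0.01],  # clearly assigned to cluster 0
    [0.02, 0.90, 0.05, 0.03],  # clearly assigned to cluster 1
    [0.35, 0.65, 0.00, 0.00],  # mixed ancestry, below the cut-off
]
labels = classify_by_q(q_matrix)
admixed_fraction = labels.count(None) / len(labels)
print(labels, round(admixed_fraction, 3))
```

The share of individuals returned as None gives a quick summary of how much admixture a population contains under the chosen cut-off.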
Similar to the genetic structures revealed in both the PCoA plot and the model-based cluster analysis at K = 4, the UPGMA clustering of
Tor populations could also be explained according to their geographical distribution. Populations in cluster D (GPRK, HLS, MSJ, KENS, PPAP, AGHR, FFRC and TGN), as illustrated in the UPGMA dendrogram, had small genetic distances which ranged from 6.2% to 18.3%, indicating that these populations were closely related and had a recent common ancestor. It was apparent that these closely related populations were from the same source and origin. Similar clustering was also observed in the previous study by [
33], i.e., samples from Negeri Sembilan, Pahang and Perak were closely related and grouped in the same cluster.
Samples of the FFRC population were obtained from Kenyir Lake, while samples of the TGN population were collected from the Terengganu River, which originates in Kenyir Lake. Samples of population PPAP were collected from the Pahang River, while the samples of the KENS population were obtained from the Kenaboi River, one of the tributaries of the Pahang River. However, the close relationship between samples of the HLS population from the Hulu Langat River and samples of the MSJ population from Mersing Johor could not be explained by their geographical locations because these two river systems do not share the same origin. The Hulu Langat River flows westwards across Peninsular Malaysia and ends at the Straits of Malacca, while the Mersing River flows to the southeast of Peninsular Malaysia and ends at the South China Sea.
Based on the population structure derived from the STRUCTURE analysis at K = 4, individuals in populations HLS and MSJ had similar genetic compositions. Tor individuals from both the HLS and MSJ populations comprised a mixture of Tor with distinct genetic contributions from the North (GPRK) at ≈35% and East Coast (PPAP, AGHR, FFRC and TGN) of Peninsular Malaysia at ≈65%. Generally, the samples of populations HLS and MSJ were more closely related genetically to population GPRK. This observation was also supported by the population assignment test, in which 25% and 33.3% of the HLS and MSJ individuals, respectively, were misassigned to one another, to the GPRK population or to populations from the East Coast (TGN and AGHR) of Peninsular Malaysia. The admixture percentages were low among individuals in the HLS and MSJ populations, indicating that they did not interbreed. Therefore, it seems more likely that the local fish traders who supplied the HLS and MSJ stocks had obtained their fish stocks from the same source.
On the other hand, the samples of AGHR, which originated from the Keniam River, were found to be more closely related to the stocks from FFRC and TGN. Meanwhile, GPRK and EMS clustered in the same group in both the PCoA plot and the Bayesian cluster analysis, indicating that samples from Perak and Sarawak were closely related. These results again highlighted the possibility that the EMS sample had been mislabelled by the fish trader who supplied the fish. The so-called EMS broodfish was most probably from a local source in north Peninsular Malaysia. Unfortunately, no other samples from the same population were available for verification.
4.4. Genetic Relatedness among Individuals
Genetic relatedness shows the relationship between individuals in a population [
109]. Knowledge of the genetic relatedness of individuals in a population is important in genetic analysis to estimate heritabilities, genetic correlations and breeding values for developing optimized strategies for artificial selection and conservation [
110]. The mean Rxy values of the
Tor samples, obtained with different relatedness estimators in the present study, were not significantly (
p > 0.05) different among the
Tor populations for any estimator. Mean Rxy based on the moment estimators was negative in most populations. A negative Rxy value indicates that the individuals are less related than the average relatedness. It also reflects how much lower the probability of recent coalescence is for the individuals relative to the average probability for all considered individuals from the reference population [
111]. Meanwhile, mean Rxy based on the likelihood estimators was slightly higher with positive values and highly correlated, especially among HLKW, HLS and MSJ populations. A similar observation was also reported in a study by [
31] on seabass (
Lates calcarifer), in which the Rxy estimates for wild and hatchery stocks did not differ significantly (
p > 0.05). However, a significant increase in genetic relatedness with a high correlation coefficient and a decline in Ne estimates were detected within a selectively bred population from the hatchery stocks. In that study, selective breeding therefore caused a significant loss of genetic variation, allelic diversity and overall heterozygosity compared to the parental generation.
The pattern of genetic relatedness has a direct functional relationship with the Ne of the population [
109]. The next generation would have a higher probability of sharing the same parents if interbreeding was performed between individuals from populations with small Ne [
112]. As a result, the mean and variance in pairwise relatedness within the next generation are expected to increase with decreasing Ne [
109]. Therefore, it is also worth noting that extra caution should be taken when selecting broodfish of this
Tor spp. collection for cross breeding in the future to avoid a rapid increase in genetic relatedness and reduction in Ne.
4.5. Population Bottleneck, Effective Population Size (Ne) and Population Assignment
In the present study, a recent population bottleneck was detected in the FFRC, GPRK, HLKW, KENS, MSJ and PHG populations, and a mode shift in allele frequencies was detected in the AGHR population. Sampling error in GPRK and small sample sizes in AGHR and EMS resulted in negative effective population size (Ne) estimates for these populations. Ne measures the rate of inbreeding and genetic drift in the population [
113]. Population bottleneck and Wahlund effect could influence the Ne [
114]. Generally, a large reduction in Ne can lead to a large decrease in SSR variation [
115]. Consequently, the genetic differentiation, gene flow and genetic diversity of the population will be affected [
114].
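The statement that Ne measures the rate of inbreeding and drift can be illustrated with the standard idealised-population result: inbreeding increases by ΔF = 1/(2Ne) per generation, and the expected heterozygosity after t generations is H_t = H_0·(1 − 1/(2Ne))^t. The sketch below uses hypothetical Ne values and is not based on estimates from this study.

```python
def inbreeding_rate(ne):
    """Expected increase in inbreeding per generation in an idealised population."""
    if ne <= 0:
        raise ValueError("Ne must be positive")
    return 1.0 / (2.0 * ne)

def retained_heterozygosity(ne, generations):
    """Fraction of the initial heterozygosity expected to remain after t generations."""
    return (1.0 - inbreeding_rate(ne)) ** generations

# Hypothetical effective sizes for a captive broodstock (illustration only).
for ne in (10, 50, 200):
    print(ne, round(inbreeding_rate(ne), 4), round(retained_heterozygosity(ne, 10), 4))
```

The smaller the Ne, the faster heterozygosity is expected to erode, which is consistent with the decrease in SSR variation described above for populations that have passed through a bottleneck.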
The accuracy of the population assignment test did not seem to be much affected by null alleles in the present study. In the population assignment test, a high percentage (i.e., 90%) of individuals correctly assigned to their respective populations was observed for the
Tor spp. collection. This percentage is more than double that of the previous study (i.e., 42.8%) by [
33]. As reported in many studies, SSR loci with null alleles could lower the power to correctly assign individuals in the population assignment test [
100]. Therefore, loci less prone to null alleles should always be preferred in population genetic studies [
93]. Nevertheless, the outcome of the population assignment test is affected more by the degree of population differentiation and may be improved by using an ample number of loci [
100].
4.6. Genetic Information and Broodstocks Management
An appropriate base population containing selected fish with desirable characteristics that harbour adequate genetic diversity is a prerequisite for successful broodstock development and effective genetic management. Therefore, the genetic information obtained from this study is essential to formulate appropriate strategies for genetic resource protection of Tor spp. and for their utilization in aquaculture development, especially for selective breeding programmes. The high percentage of departures from HWE, a consequence of excess homozygosity among the SSR loci, shows an urgent need for proper management strategies for these Tor stocks. It was evident from the genetic structures obtained that the Tor spp. collected for the establishment of the hatchery population comprised the natural gene pool of four distinct genetic sources. Understanding the connectivity among the populations provides a useful tool to determine appropriate strategies for fisheries conservation, effective management and genetic improvement of the Malaysian mahseer.
Analysis of genetic relationships is an essential component in a genetic improvement programme. It provides information about genetic diversity, and it also offers the platform for the stratified sampling of breeding populations [
116,
117,
118]. For sustainable aquaculture development, strategies to minimise the loss of genetic variation of the captive breeding populations should be undertaken through minimising genetic drift, while maximising the Ne [
112]. Admixture of several different genetic stocks, which can help increase the mean number of alleles and heterozygosity, is the preferred strategy. This management strategy has been applied successfully in some aquaculture species [
30,
119,
120,
121].
Proper knowledge of stock structure is necessary to preserve genetic diversity and ensure sustainable exploitation of the broodstocks. With reasonable variability, ranging from intermediate to high levels, the current Tor spp. collection should serve as a valuable germplasm resource and a suitable base population for future utilization and genetic improvement of this species. Excessive homozygosity caused the departures from HWE we observed, highlighting the need for better management and planned breeding programmes for these Tor stocks.
For better prospects, Tor spp. stocks from east Malaysia (Sabah and Sarawak) and northern Peninsular Malaysia (Kelantan) should be included in future studies to better characterize the Tor spp. in Malaysia and to improve understanding of the current genetic status of Malaysian mahseer across the whole country. Future samples for the Hulu Langat River and Mersing Johor populations should be obtained from more reliable sources. In genetic conservation programmes, milt samples of the Tor stocks from the GPRK, PHG and east Malaysia populations should be prioritized for sperm cryo-banking. In addition, the polymorphic SSR loci with considerable genetic variation used in this study, and those with private alleles, are potentially useful for pedigree and parentage analyses of the new breeds from the stocks, as well as in the development of marker-assisted selection (MAS) technology for Tor spp. in this region. The SSR markers used in the study are expected to be useful for the ongoing inter-population diallel cross-breeding and growth performance assessments of the fingerlings produced from the same pool of candidate broodstocks. These SSR markers are also of potential use in monitoring the genetic impacts of restocking activities on the wild populations of Tor spp. The levels of genetic variation, including measures of allelic diversity, overall heterozygosity, Ne and genetic relatedness, should be monitored continuously for the breeds resulting from these broodstocks.