1. Introduction
Considering the global seamless coverage of 6G networks, high-speed aircraft require stable wireless communication connections to guarantee the effective execution of diverse missions [1,2]. Due to its insensitivity to large Doppler shifts, Frequency Modulation (FM) technology has been widely used in aircraft communications standards [3] such as IRIG-106 [4] and DVB-S2 [5]. To ensure the quality of signal demodulation, the Multiple Symbol Detection (MSD) algorithm [6] is a key signal processing technique used in the FM receiver.
Figure 1 shows the bit error rate (BER) performance of the three-symbol MSD algorithm under different Doppler shifts. It can be seen that, when the symbol rate is 10 Msps and the Doppler shift is 100 kHz, the BER loss is almost negligible; FM technology is thus remarkably robust to Doppler shifts.
The MSD algorithm can enhance the signal detection performance by leveraging the continuity between different symbols based on the maximum likelihood principle. The essence of the MSD algorithm lies in its ability to holistically process multiple consecutively received symbols as a complete unit, transcending the limitation of relying solely on a single acquired symbol [7]. MSD algorithms can overcome the BER loss of traditional single-symbol detection algorithms in low signal-to-noise ratio (SNR) scenarios, effectively reducing the demodulation threshold of the receiver [8,9].
However, due to the need to calculate correlation results between the received signal and all possible local signals, a large number of multiplication and addition operations are required, leading to considerable computational complexity [10]. Therefore, this study starts from fundamental computational units such as multipliers and adders, and leverages stochastic computing (SC) to mitigate the computational complexity of MSD algorithms.
As a novel computational paradigm, SC has been proposed to effectively reduce the complexity of hardware circuits and improve their computational efficiency [11,12,13], and has been applied to a wide variety of applications such as neural networks [14,15,16], image processing [17,18,19], and channel coding [20,21,22].
SC uses random bits to encode binary numbers [23]. Information is contained in a series of random bits, where the probability of a logical 1 represents the target value. In SC, as the computational process is transformed into operations on single-bit data streams, arithmetic circuits exhibit exceptional simplicity [24]. Specifically, for unipolar input data, the multiplier circuit is an AND gate, and for bipolar input data, the multiplier is an XNOR gate. This simple structure presents significant advantages over alternative approximation paradigms, such as fixed-point arithmetic [25,26] or neural-based approximations [27,28,29].
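For concreteness, the following sketch (a minimal Python illustration of these two gates, not a hardware implementation; the sng helper and all parameter values are our own) encodes values as random bit streams and multiplies them:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 4096  # sequence length; longer sequences reduce the variance of the estimate

def sng(value, length, bipolar=False):
    """Stochastic number generator: encode a value as a random bit stream."""
    p = (value + 1) / 2 if bipolar else value  # bipolar mapping P = (x + 1) / 2
    return (rng.random(length) < p).astype(np.uint8)

# Unipolar multiplication: AND gate, P(a AND b) = P(a) * P(b)
a, b = sng(0.6, L), sng(0.5, L)
print((a & b).mean())              # ~0.30

# Bipolar multiplication: XNOR gate decodes to x1 * x2
x1, x2 = sng(0.6, L, True), sng(-0.5, L, True)
y = ~(x1 ^ x2) & 1                 # XNOR of the two bit streams
print(2 * y.mean() - 1)            # ~ -0.30 after the inverse mapping
```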
However, the correlation between two stochastic sequences in the SC context has a pivotal influence on the computational outcomes [30]. This correlation can potentially necessitate a longer sequence to ensure the reliability of the computational results. Consequently, this may compromise the real-time performance of the system, posing challenges in terms of timeliness and responsiveness.
Due to its ability to enhance the computational efficiency of SC, hybrid SC has attracted the attention of researchers [31,32,33,34,35]. SC exhibits low resource consumption yet inferior computational accuracy compared to binary computing; conversely, binary computing achieves higher computational precision at the expense of higher computational complexity. By integrating the two paradigms, hybrid computation leverages the unique advantages of both stochastic and binary computing while addressing their respective limitations.
In this study, we delve into the computational process of the MSD algorithm, which encompasses two distinct stages: correlation calculation and symbol decision. The correlation calculation necessitates substantial complex multiplication and addition operations, constituting the major computational load of the entire MSD algorithm. To alleviate the resource overhead, SC is utilized for the correlation calculation owing to its simple computing circuits. Moreover, the correlation calculation requires summing a series of data. Given the constraint in SC that the result must fall within the interval of 0 to 1, this study introduces a stochastic adder with an adaptable scaling factor to guarantee that the summation results remain within the specified range. Furthermore, considering the need to identify the maximum value among all correlation results for the subsequent symbol decision, selecting an appropriate scaling factor in stochastic addition also ensures discrimination between different computational results, thereby guaranteeing the detection performance of the entire MSD algorithm.
When it comes to symbol decision, the modulus of the correlation results must be calculated. If SC were adopted in this process, a new random sequence would have to be generated based on the corresponding results. However, the complete independence of the regenerated random sequence from the original sequence cannot be guaranteed. This results in decreased accuracy of the computational results and, in turn, a loss in the detection performance of the MSD algorithm. Considering this limitation of SC, binary computing is adopted for the symbol decision in the MSD algorithm; its errors stem only from minor data-truncation deviations during the bit quantization process. Because the SC-based correlation calculation requires a certain computation time to ensure the convergence of its results, there is ample computation time for the binary-computing-based symbol decision. Through a meticulously designed pipeline architecture, the hardware resources needed for binary computing can be significantly reduced, thereby optimizing the overall computational efficiency.
This study makes the following contributions:
A hybrid stochastic computing architecture for the MSD algorithm is proposed. By utilizing SC and binary computing at various computational stages in the MSD algorithm, based on the intrinsic properties of the different computing paradigms, the proposed method significantly reduces hardware resource consumption while maintaining the signal detection performance.
Considering the constraints on the range of data values in SC, a scalable stochastic adder is proposed to achieve the summation of different sequences. This adder can permit different scaling factors, thereby facilitating the selection of optimal parameters tailored to various input conditions. This ensures that the computational results do not exceed the prescribed limits, while simultaneously guaranteeing computational accuracy.
Due to the low update rate of the SC results, a pipeline architecture is proposed to maximize the utilization of computing time for the symbol decision of the MSD algorithm. The fully serial computation approach is employed to minimize the resource overhead in binary computing.
The rest of this paper is organized as follows. Section 2 introduces the related works, including those on the MSD algorithm, the FM signal demodulator, and the fundamental computational methods of SC. Section 3 describes the hardware circuits of the basic SC units and presents an architecture for the proposed hybrid computing scheme that integrates stochastic and binary computing. Section 4 analyzes the computational accuracy and hardware resource consumption of the basic SC units. Section 5 assesses the computational accuracy, BER performance, and hardware resource consumption of the MSD algorithm based on hybrid SC. Section 6 summarizes the key points discussed in this paper.
3. Hybrid Computing Architecture of MSD Algorithm
The MSD algorithm is divided into two integral components: correlation calculation and symbol decision. The correlation calculation primarily comprises two fundamental computational units: complex multiplication and complex addition. Meanwhile, symbol decision involves three essential computational units: computation of the modulus, determination of the maximum value, and the subtraction of maximum values.
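In generic maximum-likelihood form (a standard formulation given here for orientation; the symbols are our notation, with $r_{n,m}$ the $m$-th sample of the $n$-th received symbol, $s_{i,n,m}$ the corresponding sample of the $i$-th local candidate, $N$ symbols per observation window, and $M$ samples per symbol), the two components compute

$$\hat{i} = \arg\max_{i} \left| \sum_{n=1}^{N} \sum_{m=1}^{M} r_{n,m}\, s_{i,n,m}^{*} \right|^{2},$$

where the double sum is the correlation calculation, and the squared modulus together with the maximization constitutes the symbol decision.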
For the correlation calculation, this study employs SC to reduce the complexity of implementation. Stochastic multipliers and adders exhibit lower resource consumption than binary multipliers and adders; however, compared to stochastic adders, stochastic multipliers are susceptible to the correlation of sequences. Therefore, when applying stochastic multipliers, it is necessary to ensure that the input sequences are independent. Given that complex multiplication constitutes the initial step of the MSD algorithm, the random bit sequences are directly generated by the SNGs and do not exhibit excessively high correlation. Therefore, this limitation can be ignored in the SC-based correlation calculation.
However, when utilizing SC for correlation calculation, aside from the challenge of sequence correlation, there arises a problem of data scaling. As the data range in SC is inherently limited to 0 to 1, it is imperative to keep all intermediate values within this range throughout the entire computation process. Stochastic multipliers, due to their inherent computational characteristics, naturally avoid this issue. Conversely, when executing stochastic addition, a suitable scaling factor becomes necessary: an excessively small scaling factor risks overflow, while an excessively large one compresses the outputs and renders different computational outcomes indistinguishable. Therefore, this study proposes a flexible scaling adder circuit that selects an appropriate scaling factor based on the input conditions, ensuring that the computational results do not overflow.
When performing symbol decision in the MSD algorithm, the first step is calculating the modulus of the correlation results. If SC were adopted for this process, it would require two uncorrelated random sequences with identical probability. If a new sequence is generated based on the outcomes of the correlation calculation, it can result in higher correlation between the two sequences and larger computational errors. Therefore, binary computing is adopted to complete the symbol decision process. In summary, across the various computational stages of the MSD algorithm, this study adopts two distinct computational paradigms, SC and binary computing, based on their respective characteristics, and designs a hybrid computing scheme to implement the hardware circuit for the entire MSD algorithm.
3.1. Correlation Calculation Based on SC
In the realm of SC, it is imperative to maintain probability values strictly within the range of 0 to 1 throughout the entire computation process. During multiplication operations, if the input data fall within the range of 0–1, the output will inherently lie within the same range.
For adders employed in SC, it is necessary to scale the summation result to prevent overflow. Excessively small scaling factors may lead to the summation results exceeding the permissible range, while excessively large scaling factors result in small outcomes, thereby reducing the discriminability between different summation results. Therefore, it is necessary to design the adder with an appropriate scaling factor. Additionally, considering that the input number of the adder will vary with the observation window length in the MSD algorithm, the adder must also accommodate different numbers of input ports. Thus, high requirements are imposed on the scalability and flexibility of the SC-based adders.
In summary, the entire correlation calculation based on SC requires three fundamental computational units: an inverter, a multiplier, and an adder with flexible scaling factors. The circuit structures of each basic module based on SC will be introduced individually in the following.
3.1.1. Inverter
The input data are typically represented as signed numbers in the MSD algorithm. For the sake of simplifying the discussion, the input range can be limited between −1 and 1. When these input data are converted into a random bit stream by the SNG circuit, the conversion equation can be expressed as

$$P_X = \frac{X + 1}{2},$$

where $X$ denotes the input data and $P_X$ denotes the probability data. During the mapping process from the data domain to the probability domain, the input data are scaled and a bias is added. In this scenario, to obtain the opposite number of the input data, the following equation can be used:

$$P_{-X} = 1 - P_X,$$

where $P_{-X}$ is the probability of the opposite number of $X$; that is, when a bit of $P_X$ is the logical 1, the corresponding bit of $P_{-X}$ is the logical 0, and vice versa. Therefore, the computational circuit for the inverter in the SC context is a NOT gate.
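As a quick numerical check of the NOT-gate inverter (a Python sketch under the bipolar mapping above; the value 0.4 and the sequence length are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
L = 4096
X = 0.4
bits = (rng.random(L) < (X + 1) / 2).astype(np.uint8)  # SNG: P_X = (X + 1) / 2

neg = 1 - bits                 # NOT gate inverts every bit: P = 1 - P_X
print(2 * neg.mean() - 1)      # ~ -0.4, i.e., -X after the inverse mapping
```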
3.1.2. Multiplier
When signals are mapped from the data domain to the probability domain, the data undergo scaling and bias operations. To ensure uniformity in calculation, the product of two input data points must also undergo the same mapping rule. Assuming the two input data points are $x_1$ and $x_2$, the mapped result of their product can be expressed as

$$P_y = \frac{y + 1}{2},$$

where $y$ is the product of $x_1$ and $x_2$. Considering that the mapping results of the input data in the probabilistic domain can be represented as $P_{x_1} = \frac{x_1 + 1}{2}$ and $P_{x_2} = \frac{x_2 + 1}{2}$, the product can be calculated as follows:

$$P_y = \frac{x_1 x_2 + 1}{2} = P_{x_1} P_{x_2} + (1 - P_{x_1})(1 - P_{x_2}).$$

In probability calculations, the computation circuit structure for the multiplier is an XNOR gate.
Similarly, the complex multiplier can be expressed as

$$\operatorname{Re}\{y\} = \operatorname{Re}\{x_1\}\operatorname{Re}\{x_2\} - \operatorname{Im}\{x_1\}\operatorname{Im}\{x_2\}, \qquad \operatorname{Im}\{y\} = \operatorname{Re}\{x_1\}\operatorname{Im}\{x_2\} + \operatorname{Im}\{x_1\}\operatorname{Re}\{x_2\},$$

where $\operatorname{Re}\{\cdot\}$ denotes the real part of a complex number and $\operatorname{Im}\{\cdot\}$ denotes the imaginary part of a complex number.

In the complex multiplication computation, four multipliers and two adders are required. When the input data are random, adders with a scaling factor of two should be employed to prevent result overflow. The computation circuit of such a two-input scaled adder is a two-input multiplexer in SC. The whole circuit of the complex multiplier is shown in Figure 5.
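The following behavioral sketch (our Python model, not the RTL of Figure 5; all helper names are ours) wires four XNOR multipliers, a NOT gate, and two multiplexer-based scaled adders into the complex multiplier:

```python
import numpy as np

rng = np.random.default_rng(2)
L = 1 << 16

def sng(x):                      # bipolar SNG: P = (x + 1) / 2
    return (rng.random(L) < (x + 1) / 2).astype(np.uint8)

def xnor(a, b):                  # bipolar stochastic multiplier
    return ~(a ^ b) & 1

def mux_add(a, b):               # two-input scaled adder: (a + b) / 2
    sel = rng.random(L) < 0.5
    return np.where(sel, a, b)

x1, x2 = 0.5 - 0.3j, -0.4 + 0.6j
re1, im1, re2, im2 = sng(x1.real), sng(x1.imag), sng(x2.real), sng(x2.imag)

# Re{y}/2 = (re1*re2 + (-(im1*im2))) / 2; the NOT gate supplies the negation
re_y = mux_add(xnor(re1, re2), 1 - xnor(im1, im2))
im_y = mux_add(xnor(re1, im2), xnor(im1, re2))

decode = lambda s: 2 * s.mean() - 1        # inverse of the bipolar mapping
print(2 * decode(re_y), 2 * decode(im_y))  # ~ x1 * x2 = (-0.02, 0.42)
```

Because each multiplexer adder scales its sum by two, the decoded outputs must be doubled to recover the true product.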
3.1.3. Adder with Flexible Scaling Factor
Assuming that the adder has $N$ inputs and a scaling factor of $k$, and taking into account the scaling and bias in the domain conversion, the calculation equation for the adder can be expressed as

$$P_y = \frac{1}{k}\sum_{i=1}^{N} P_{x_i} - \frac{N-k}{2k},$$

where $P_{x_i}$ denotes the probability value of the $i$-th input, $P_y$ corresponds to the scaled sum $y = \frac{1}{k}\sum_{i=1}^{N} x_i$ under the mapping $P_y = \frac{y+1}{2}$, and $\lfloor \cdot \rfloor$ denotes the floor operation used when realizing the correction factor in hardware.
In the stochastic adders, when the scaling factor $k$ is excessively large, the output becomes too small, necessitating longer random sequences to ensure the computational accuracy. For adders with $N$ inputs, a scaling factor of $N$ guarantees that the result will not overflow. Therefore, the scaling factor $k$ in the adder need not exceed $N$.
As observed in Equation (28), during the stochastic addition computations, the probability value of the input data is first subjected to a scaling operation. Subsequently, as the scaled sequence is summed, a correction factor, which depends on both the scaling factor and the number of inputs, is subtracted from the cumulative sum.
The truth table for the adder is shown in Table 2. To simplify the discussion, the scaling process has been omitted from the table. Within the table, $R$ is the random sequence with a probability value of $\frac{N-k}{2k}$, $S$ is the current state, $Y$ is the output of the adder, and $S'$ is the updated state after the output. The state register and $R$ are utilized to perform the required subtraction operation.
Figure 6 illustrates the hardware architecture of the proposed adder. In the comprehensive adder circuit, the $N$ input signals are initially partitioned into groups of $k$, and a $k$-input multiplexer is employed to scale these signals. In the case where the last group of inputs comprises fewer than $k$ elements, zeros are appended to ensure the completeness of the set. After scaling the input data, a subtraction unit is used to subtract the correction factor $\frac{N-k}{2k}$. Subsequently, the output of the adder is determined based on the subtraction result and the current state value. Finally, the state value is updated in accordance with the adder's output.
When $k = N$, the entire adder circuit simplifies to a multiplexer. When $k = 1$, the adder circuit constitutes a full adder. If $N > 3k$, the correction factor $\frac{N-k}{2k}$ is greater than 1. In such a case, the probabilistic value of the random sequence $R$ within the adder is modified from $\frac{N-k}{2k}$ to $\frac{N-k}{2k} - 1$, while an extra constant decrement of 1 is applied during the subtraction computation process. As the scaling factor is generally not too small during the summation of multiple data, this particular circuit adjustment is not graphically depicted in Figure 6.
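To make the grouping, correction, and state-register mechanism concrete, here is a behavioral Python model (our reconstruction under the equation above; the RTL details of Figure 6 and Table 2 may differ):

```python
import numpy as np

rng = np.random.default_rng(3)
L = 1 << 16
N, k = 5, 3                                   # five inputs, scaling factor three
xs = np.array([0.4, -0.2, 0.7, -0.5, 0.3])

bits = (rng.random((N, L)) < (xs[:, None] + 1) / 2).astype(np.int64)

# Group the inputs into ceil(N/k) groups of k; pad the last group with
# data-domain zeros (probability 1/2 under the bipolar mapping).
pad = (-N) % k
if pad:
    bits = np.vstack([bits, (rng.random((pad, L)) < 0.5).astype(np.int64)])
G = bits.shape[0] // k
sel = rng.integers(0, k, size=(G, L))         # k-input MUX select lines
scaled = np.take_along_axis(bits.reshape(G, k, L), sel[:, None, :], axis=1)[:, 0, :]

# Correction factor: with the padded input count N', subtract (N' - k) / (2k).
c = (bits.shape[0] - k) / (2 * k)
const = int(np.floor(c))                      # extra constant decrement when c > 1
R = (rng.random(L) < (c - const)).astype(np.int64)

state, out = 0, np.zeros(L, dtype=np.int64)   # state register absorbs remainders
for t in range(L):
    state += scaled[:, t].sum() - R[t] - const
    if state >= 1:
        out[t], state = 1, state - 1

print((2 * out.mean() - 1) * k, xs.sum())     # both ~0.7
```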
3.1.4. The Correlation Circuit Architecture Based on SC
Building upon the inverters, multipliers, and adders discussed earlier, along with the integration of SNGs and SNCs, this study presents the architecture of the signal correlation circuit based on SC. Considering that the fundamental computational units within SC exhibit low resource consumption but require a long computing time to achieve acceptable computational precision, a parallel computing framework is adopted to improve the computational efficiency.
Figure 7 illustrates the circuit diagram for the correlation calculation between the received signal and a local signal, where $N$ is the number of symbols utilized in the MSD algorithm and $M$ is the number of sampling points per symbol. After executing complex multiplication between the received signal and the local signal, a total of $N \times M$ results are obtained. This study presents a two-stage adder for this summation. The first stage sums the correlation results for each symbol, while the second stage sums the results of the $N$ symbols again. Upon acquiring the outcomes through SC, the results should be converted into binary data using SNCs. Given that the conversion from binary data to a random sequence involves scaling and biasing operations, it is necessary to apply inverse operations on the SC results during the SNC process.
Furthermore, the local signal serves as prior information in the MSD algorithm and, thus, is directly stored as random bit sequences in the correlation circuit based on SC. On one hand, the SNGs would consume a certain amount of computational resources; on the other hand, a certain inaccuracy arises during the data domain conversion of SNGs. Pre-storing the random sequences of the local signal therefore reduces the computational resource consumption while safeguarding the accuracy of the computed results. Nevertheless, in comparison to storing the binary data of the local signal, this strategy entails a higher consumption of storage resources.
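At the value level, the two-stage scaled summation and the inverse mapping applied in the SNC behave as in the sketch below (a numerical Python model using the scaling factors reported in Section 5.3; the bit-stream machinery and the SNG/SNC circuits are abstracted away):

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 3, 4                        # three symbols, four samples per symbol
k1, k2 = 2, 2                      # factors: four-input and three-input adders

r = np.exp(1j * rng.uniform(0, 2 * np.pi, (N, M)))  # received FM samples
s = np.exp(1j * rng.uniform(0, 2 * np.pi, (N, M)))  # one local candidate signal

# The complex multiplier scales by two internally (one MUX adder per component)
prod = r * np.conj(s) / 2
stage1 = prod.sum(axis=1) / k1     # first stage: sum the M samples per symbol
stage2 = stage1.sum() / k2         # second stage: sum across the N symbols

# SNC inverse mapping: undo all scaling to recover the true correlation value
recovered = stage2 * 2 * k1 * k2
print(np.allclose(recovered, (r * np.conj(s)).sum()))  # True
```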
3.2. The Symbol Decision Pipeline Design
For the process of symbol decision, this study adopts a binary computing scheme. In the correlation calculation stage, SC is utilized to reduce the resource consumption of the hardware circuits, primarily because binary computing requires extensive hardware resources for complex multiplication. Nonetheless, a notable concern regarding SC is that its results exhibit certain statistical errors. If the entire MSD algorithm relied solely on SC, the calculation errors would accumulate at each computational stage, thereby necessitating a longer computation time to ensure the convergence of the results.
Furthermore, a significant reason for employing binary computing in the symbol decision process lies in the fact that the first step involves computing the modulus of a complex signal. In binary computing, this solely necessitates the complex number itself. Conversely, the multiplier in the SC demands two uncorrelated random sequences. In the correlation calculation based on SC in the previous stage, only one random sequence can be generated. To sustain the use of SC for computing the modulus, an additional random sequence with the same probability value is necessary. When regenerating a new sequence based on an existing sequence, this may result in a certain correlation between the two sequences, thereby diminishing the computational accuracy.
Considering the challenges associated with symbol decision based on SC, reconsidering the adoption of binary computing is advisable. In comparison to SC, binary computing exhibits distinctive advantages. The errors in binary computing mainly arise from data truncation during bit quantization, and its computational accuracy is significantly higher than that of SC. Moreover, when calculating the magnitude of a signal, binary data only require self-multiplication, thereby avoiding the sequence regeneration issues encountered in SC.
Although the resource overhead of binary computing is higher than that of SC, some strategies can be employed to reduce its hardware resource utilization. It should be noted that the hardware resources in binary computing are intrinsically related to the real-time requirements of data processing. When the demand for the data processing rate is relatively low, binary computing approaches can be designed using a serial pipeline architecture, which does not incur high resource consumption. Recalling the characteristics of SC, it exhibits low hardware overhead but requires a long computation time, resulting in a relatively low update rate for correlation results. When binary computing is employed for symbol decision, its hardware resource consumption can be significantly reduced through reasonable pipeline design.
Figure 8 illustrates the pipeline architecture for symbol decision based on binary computing. The entire pipeline comprises a multiplier, an adder, a comparator, a register, and a subtractor. The multiplier is utilized to compute the moduli of complex numbers. Given the relatively low update rate of the correlation results, this study employs a single multiplier to sequentially calculate the squares of the real and imaginary parts of the complex numbers. These values are then summed using an adder to yield the squared modulus of the complex number. Subsequently, a comparator is used to serially compare each output of the adder. Upon identifying a larger result, it is stored in the register. Once the maximum value among all correlation results corresponding to symbol 0 has been recorded, the register is reset and reused to record the maximum value among all correlation results corresponding to symbol 1. Finally, a subtractor is employed to compute the difference between these two maximum values, thereby producing the definitive symbol decision outcome. Due to the fact that the computation time $t_{\mathrm{mul}}$ of the multiplier is greater than the computation times $t_{\mathrm{add}}$, $t_{\mathrm{cmp}}$, $t_{\mathrm{reg}}$, and $t_{\mathrm{sub}}$ of the other units, the entire pipeline scheme can be successfully implemented.
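A serial, cycle-by-cycle Python model of this decision process (our behavioral sketch; in the actual pipeline of Figure 8 these steps overlap across clock cycles, and the candidate counts below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(5)
corr0 = rng.normal(size=4) + 1j * rng.normal(size=4)  # correlations, symbol 0
corr1 = rng.normal(size=4) + 1j * rng.normal(size=4)  # correlations, symbol 1

def serial_max_sq_modulus(values):
    """One multiplier + adder + comparator + register, reused serially."""
    best = 0.0
    for z in values:
        sq = z.real * z.real          # multiplier pass 1: square the real part
        sq += z.imag * z.imag         # multiplier pass 2 + adder: squared modulus
        if sq > best:                 # comparator
            best = sq                 # register stores the running maximum
    return best

metric = serial_max_sq_modulus(corr0) - serial_max_sq_modulus(corr1)
bit = 0 if metric > 0 else 1          # subtractor sign gives the detected symbol
print(bit)
```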
3.3. Multi-Stream Design for MSD Algorithm
The circuits for correlation calculation based on SC require substantial computation time to ensure the accuracy of the results, which consequently leads to a notably slower update rate for these outcomes. Meanwhile, the symbol decision based on binary computing incorporates a pipeline design to serially search for the maximum complex modulus among all correlation results, yielding a similarly low update rate for its outcomes. While the computation circuit for the entire MSD algorithm can significantly reduce hardware resource consumption, it suffers from severe limitations in terms of its real-time computation performance. Therefore, in the design of the overall MSD algorithm, a balanced consideration must be given to the trade-off between computation time and hardware resource overhead.
Figure 9 illustrates the multi-stream processing framework of the MSD algorithm. SC-based correlation calculations require hundreds of random bits to maintain acceptable error margins. Even in the seven-symbol MSD algorithm, the symbol decision only involves 128 complex numbers, so the required computation time is fully accommodated within the processing time of the SC-based correlation calculation. Therefore, as shown in Figure 9, the symbol decision can be seamlessly embedded within the SC timeline without incurring extra time overhead. The inherently low data update rate of SC enables the binary-based symbol decision to be implemented using a pipelined processing architecture.
As the symbol decision computation time can be overlapped with the correlation computation period, the MSD algorithm enables parallel processing of these two operations. Specifically, as the correlation computation for one received signal block concludes, processing of the subsequent block immediately initiates while the symbol decision module generates output data based on the previous correlation results. However, the inherent computational latency of SC imposes limitations—even with seamless pipelining of consecutive correlation operations, the system may struggle to meet the requirements of real-time processing.
Given the low resource overhead of SC-based correlation calculations and serialized symbol decision, multi-stream parallel data processing becomes feasible. By instantiating multiple data streams, the hybrid SC-based architecture can achieve high throughput, where concurrent processing of independent data streams ensures compliance with real-time requirements.
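As a rough sizing exercise in Python (the 10 Msps symbol rate comes from Section 5.3; the clock frequency and sequence length are assumptions of ours), the required number of parallel streams follows from the SC sequence length and the clock rate:

```python
import math

symbol_rate = 10e6      # symbols per second (Section 5.3)
f_clk = 200e6           # assumed FPGA clock frequency in Hz
seq_len = 1000          # random bits per SC correlation window

t_corr = seq_len / f_clk          # 5.0 us to finish one correlation window
t_symbol = 1 / symbol_rate        # a new decision is needed every 0.1 us
streams = math.ceil(t_corr / t_symbol)
print(streams)                    # 50 parallel streams under these assumptions
```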
5. Performance Analysis of MSD Algorithm Based on Hybrid SC
5.1. Computational Accuracy of MSD Algorithm
In order to assess the performance of the MSD circuit constructed using the hybrid SC approach, this study initially carried out an experiment to evaluate the overall computational accuracy of the MSD algorithm. The sampling factor of the receiver was set to four. Leveraging the previously designed fundamental SC units, the computational accuracy was verified for the three-, five-, and seven-symbol variants of the MSD algorithm.
Additionally, the experiments also involved comparing the computational accuracy of the hybrid SC approach proposed in this study with that of a pure SC approach; specifically, the pure SC method employed a re-randomization technique in the symbol decision stage, as outlined in [41].
Figure 14 illustrates the computational accuracy across diverse computation schemes and different symbol lengths. The results show that the hybrid SC achieved higher computational accuracy than pure SC. This can be attributed to two primary reasons: First, although re-randomization allows a new random sequence to be regenerated based on an existing random sequence, it does not fully eliminate the correlation between the two sequences. Second, re-randomization largely maintains the probability values of the sequences. When computational errors exist in the sequences undergoing re-randomization, these errors are propagated to the newly generated sequences, resulting in increased computational errors. In light of these observations, this study adopts a hybrid SC method to develop the MSD algorithm, aiming to enhance computational accuracy while ensuring computational efficiency.
Figure 14 also illustrates the computational accuracy under different bit quantization schemes. The error in binary computation arises from two primary sources: (1) quantization loss determined by the bit width; (2) the scaling operations that are necessary to prevent data overflow during summation processes. As the observation window of the MSD algorithm expands from three to seven symbols, the summation of input data increases proportionally. This necessitates a larger scaling factor to prevent overflow in adder operations, which introduces greater computational errors. Consequently, the seven-symbol MSD exhibited a greater computational error than the three-symbol MSD.
Figure 14 provides a comparative analysis of computational accuracy between hybrid SC and binary computing paradigms. However, it is critical to recognize that error manifestations differ fundamentally between these architectures: Binary computation introduces errors through floor operations, while the errors from hybrid SC derive from inherent variance in probabilistic bit-stream representations. Consequently, direct numerical comparison of error metrics does not provide an equitable reflection of algorithmic performance across different computational frameworks.
5.2. BER Performance of MSD Algorithm
In the MSD algorithm, the output result is determined based on the maximum correlation value derived from comparisons between the received signal and a set of local signals, distinguished by their symbols being either zero or one. The performance of the MSD cannot be adequately represented solely according to the computational error. Therefore, this study evaluates the BER performance under three-symbol, five-symbol, and seven-symbol MSD algorithms based on hybrid SC.
Figure 15 illustrates the BER performance of the three-symbol MSD algorithm under different computational schemes and sequence lengths. For a sequence length of 500 in hybrid SC, the BER performance slightly surpassed that with 8-bit binary computing. As the sequence length increased to 1000, the BER performance of hybrid SC was comparable to that of 9-bit binary computing. Additionally, when compared to hybrid SC, the MSD algorithm experienced a more pronounced performance decrease under pure SC.
Figure 16 illustrates the BER performance of the five-symbol MSD algorithm under different conditions. Similar to the three-symbol MSD, when the sequence length was set to 500, hybrid SC demonstrated a marginally superior BER performance compared to 8-bit binary computing. However, to attain BER performance equivalent to 9-bit binary computing, the sequence length for hybrid SC must be extended to 1500. This is primarily due to the fact that, as the number of symbols in the MSD algorithm increases, the performance loss of binary computing under the same bit quantization becomes smaller. The loss in binary computing mainly stems from data truncation. As the number of symbols in the MSD algorithm increases, the number of potential outcomes of the symbol decision process expands, thereby mitigating the adverse effects of quantization loss. Conversely, the primary error source in hybrid SC arises from the fluctuation of its computation results, rendering it relatively insensitive to the number of symbols in the MSD algorithm.
Figure 17 illustrates the BER performance of the seven-symbol MSD algorithm under different conditions. Compared to those of the three- or five-symbol MSD algorithms, the hybrid SC in the seven-symbol MSD necessitated a longer sequence length to sustain its BER performance at a comparable level to binary computing. When the sequence length reached 1000, the BER performance of hybrid SC marginally exceeded that of 8-bit binary computing. However, even when the sequence length extended to 1500, the performance of hybrid SC remained slightly inferior to that of 9-bit binary computing.
5.3. Hardware Resource Overhead of MSD Algorithm
To assess the hardware resource overhead of the hybrid SC approach designed in this study, the entire hardware circuit was meticulously implemented on the FPGA platform. Specifically, the Xilinx Virtex UltraScale xcvu190 platform was utilized, and the synthesis and implementation tools were Vivado Synthesis 2018 and Vivado Implementation 2018, respectively. As previously mentioned, a multi-stream approach was adopted to balance hardware resource consumption and computing speed. Due to the variation in BER performance under different sequence lengths in hybrid SC, this study conducted a precise evaluation of hardware resource consumption for various sequence lengths through controlling the number of parallel-processed streams. Considering actual data processing requirements, the symbol rate of the transmitted data was set to 10 Msps, and the up-sampling factor of the receiver was set to four. For comparison, resource requirements for binary computing under 8- and 9-bit quantization were also established.
Due to the up-sampling factor, a four-input adder is required to accumulate the multiplication results within each symbol. For the three-, five-, and seven-symbol MSD algorithms, this architecture necessitates three-, five-, and seven-input adders, respectively, to perform the summation of accumulated results across multiple symbols.
To determine the optimal scaling factors for each adder, we conducted numerical simulations of the MSD algorithm under the constraint that the summation results must not exceed the limiting range while minimizing the scaling factor. Through numerical analysis of the MSD computational process, we established the following scaling factor assignments: two for three-input adders, two for four-input adders, three for five-input adders, and four for seven-input adders. The selection criteria aimed to balance between preventing overflow in addition operations while preserving maximum signal resolution through minimal scaling factors.
Table 6 illustrates the hardware resource overhead of the three-symbol MSD algorithm under different computing paradigms. When the sequence length was 500, the hybrid SC proposed in this study exhibited a lower hardware resource consumption than that demanded by 8-bit binary computing. When the sequence length reached 1000, the hardware resource consumption of hybrid SC surpassed that of 8-bit binary computing, yet remained inferior to that of 9-bit binary computing. As the sequence length exceeded 1500, the hardware resource consumption of hybrid SC surpassed the threshold for 9-bit binary computing.
Table 7 illustrates the hardware resource overhead of the five-symbol MSD algorithm. In comparison to the three-symbol MSD, the hybrid SC demonstrated a hardware resource consumption that remained slightly lower than that of 8-bit binary computing, even when the sequence length reached 1000. When the sequence length increased to 1500, the hardware resource consumption of hybrid SC exceeded that of 8-bit binary computing but remained less than that of 9-bit binary computing. This is primarily due to the non-linear relationship between hardware resource consumption and computational load in SC. Even though an increase in the number of MSD symbols leads to a linear increase in computational load, the resource consumption does not increase significantly.
Table 8 illustrates the hardware resource overhead of the seven-symbol MSD algorithm. When the sequence length reached 1500, the hardware resource of hybrid SC slightly exceeded that of 8-bit binary computing. Conversely, when the sequence length was reduced to 500, the hardware resource of hybrid SC was merely one-third that with 8-bit binary computing. Nonetheless, the utilization of storage resources in hybrid SC exhibited a notably higher consumption compared to binary computing. This is primarily due to the fact that hybrid SC stores random bit sequences of local signals, which mitigates the computing resource overhead for numeric domain conversion at the cost of additional storage resources. Considering the ample storage capacity in modern hardware systems, this increased storage requirement can be considered an acceptable compromise.
5.4. Power Consumption and Computational Latency Analysis
In addition to comparing the hardware resource overheads, this paper also compares the power consumption and computational latency of the hybrid SC and binary computing schemes.
Table 9 illustrates the power consumption of the different computing paradigms. When the sequence length is 500, the energy consumption of hybrid SC is lower than that of binary computing; however, when the sequence length exceeds 1000, it is greater. Compared with its advantage in resource overhead, the energy efficiency of hybrid SC is slightly lower. Nevertheless, when the sequence length is short, hybrid SC can reduce energy consumption while also reducing resource overhead.
Table 9 also illustrates the computational latency of the different computing paradigms. Compared with binary computation, the computational latency of hybrid SC is significantly higher and increases linearly with the sequence length. However, even when the sequence length is 1500, the computational latency remains below 10 microseconds, which is negligible in most communication systems.
6. Conclusions
This study proposed a novel hardware framework for the MSD algorithm based on a hybrid SC approach. For the correlation calculation in the MSD algorithm, SC was employed to reduce the hardware resource overhead; within it, a flexible and scalable stochastic adder was developed and integrated to perform summation with different scaling factors. For the binary-computing-based symbol decision, a pipeline structure was proposed to execute the entire process serially, reducing the resource overhead by leveraging the low update rate of the SC-based correlation results. The hardware architecture based on hybrid SC for the three-, five-, and seven-symbol MSD algorithm variants was constructed on an FPGA platform. Experimental evaluations revealed that the BER performance of hybrid SC is comparable to that of traditional binary computing methods, while simultaneously achieving a substantial reduction in hardware resource utilization.
However, the hardware architecture based on hybrid SC also has some drawbacks: its energy cost is slightly higher, and its computational latency is relatively high. Despite these drawbacks, the introduction of SC still offers significant advantages in reducing resource overhead. In future work, it is advisable to consider applying hybrid SC to other algorithms with high computational complexity.