1. Introduction
Six decades ago, the pioneering work of Gaines [
1] introduced a novel computation technique known as Stochastic Computing (SC) [
2,
3]. This technique relies on a data domain transformation from a weighted binary system (WBS) to a unary system, where the resulting number is a string of 0’s and 1’s in which the fraction of 1’s represents the value of the number as a probability; hence, the string is known as a probabilistic sequence (PS), and the input transforming circuit is known as a Stochastic Number Generator (SNG). Each bit is assumed to be random, i.e., the bits at different positions are uncorrelated [
4]. Because the interaction between classic probabilistic sets can be expressed with logical operators, working with probabilistic sequences makes it possible to implement arithmetic operations with single logic gates, resulting in smaller digital circuits than their WBS-based counterparts. Once results are obtained, data are transformed back to a WBS by means of a counter, i.e., by counting the 1’s in the PS. It is worth mentioning that the computational resources available when Gaines proposed SC were limited, which made SC impractical at the time.
Figure 1 shows a block diagram of a stochastic computing system where domain transformation elements are evident.
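The encode, compute and decode flow described above can be illustrated with a minimal Python sketch. It assumes software pseudo-random bits in place of a hardware SNG, and the helper names (`encode`, `decode`, `sc_multiply`) are illustrative rather than taken from the cited works. Multiplication relies on the standard unipolar SC property that an AND gate applied to two uncorrelated streams yields a stream whose probability of 1 is the product of the input probabilities.

```python
import random

def encode(p, length, rng):
    """Return a probabilistic sequence whose fraction of 1's approximates p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode(bits):
    """Counter stage: recover the value as the fraction of 1's."""
    return sum(bits) / len(bits)

def sc_multiply(a_bits, b_bits):
    """Bitwise AND of two uncorrelated streams multiplies their probabilities."""
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
a = encode(0.5, 10_000, rng)
b = encode(0.4, 10_000, rng)
product = decode(sc_multiply(a, b))
print(f"estimated 0.5 * 0.4 = {product:.3f}")
```

The estimate is only approximate; its accuracy depends on the stream length and on how uncorrelated the two streams are.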
One of the most attractive features of stochastic computing is the large reduction in logic gates required for specific applications: because numbers are represented as random bit streams, the probabilistic operations between them can be implemented with simple logic gates. Fault tolerance is another advantage of stochastic computing [
5], since a bit flip or soft error (so called because it is a transient error that is not related to physical damage to the device; it only changes an arbitrary bit [
6]) has less impact because the bits in a PS are not weighted. In terms of applications, SC is useful where small errors are tolerated, such as image processing [
7] and neural networks [
8], although there are also more specific and unconventional applications, such as the design of automatic controllers [
9].
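The fault-tolerance argument can be made concrete with a small numerical sketch. The word width (8 bits), the encoded value and the helper names below are assumptions chosen for illustration only; the sketch compares the error caused by a single bit flip in a weighted binary word with the error caused by a single bit flip in a PS of equivalent resolution.

```python
def binary_error_after_flip(value, bit, n):
    """Error, as a fraction of full scale, after flipping one bit of an n-bit word."""
    flipped = value ^ (1 << bit)
    return abs(flipped - value) / 2**n

def stream_error_after_flip(bits, position):
    """Error in the represented probability after flipping one stream bit."""
    flipped = list(bits)
    flipped[position] ^= 1
    return abs(sum(flipped) - sum(bits)) / len(bits)

n = 8
value = 100                                    # encodes 100/256 in both domains
stream = [1] * value + [0] * (2**n - value)    # 256-bit PS with 100 ones

worst_binary = binary_error_after_flip(value, n - 1, n)   # flip the MSB
any_stream = stream_error_after_flip(stream, 0)           # flip a single stream bit
print(worst_binary, any_stream)
```

In the binary word the error depends on which bit is hit, whereas in the PS every position contributes the same small amount.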
Advances in digital device technology have enabled cheaper, simpler and more flexible implementations of SC-based algorithms, as in [
10], where a standalone in-memory SC architecture based on 2D memtransistors was designed to perform accurate arithmetic operations such as addition, subtraction, multiplication and sorting. In general, however, SC has a drawback: an
n-bit binary number can represent up to
$2^n$ levels, but achieving the same resolution in SC requires a string of
$2^n$ bits, which considerably increases latency and makes SC suitable mainly for low-speed applications such as artificial neural networks, decoding of low-density parity check codes and polar codes, and digital filters, among others [
4].
Nowadays, owing to the computational power recently available in digital devices such as FPGAs [
11] and to other attractive features of SC such as error resilience, low power consumption and low-resource arithmetic units, researchers in different areas are again exploring this computation technique [
5,
12]. Nevertheless, drawbacks are still present in SC, such as the loss of accuracy in relatively large circuits, which restricts SC to small algorithms. Another important drawback is the resource consumption of the SNG. Each signal involved must be transformed by an SNG; hence, the SNG circuitry as a whole may be larger than the stochastic digital circuit itself, which forfeits the advantage of smaller circuit designs in the stochastic domain. The reason is that an SNG consists of two parts: a Linear Feedback Shift Register (LFSR) that generates a Pseudo-Random Binary Signal (PRBS) [
13] and a comparator.
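The two-part structure of a conventional SNG can be sketched in a few lines of Python. The 4-bit width, the feedback taps (bits 3 and 2), the seed and the `r <= x` comparator convention are all assumptions made for this illustration; they are not taken from the cited designs.

```python
def lfsr4_states(seed=0b0001):
    """Yield one period of a 4-bit Fibonacci LFSR with feedback from bits 3 and 2."""
    state = seed
    for _ in range(15):          # a maximal-length 4-bit LFSR visits 15 nonzero states
        yield state
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0b1111

def sng_stream(x, seed=0b0001):
    """Comparator stage: emit 1 whenever the LFSR state does not exceed x."""
    return [1 if r <= x else 0 for r in lfsr4_states(seed)]

states = list(lfsr4_states())
stream = sng_stream(6)
print(states)
print(stream, sum(stream), "ones out of", len(stream))
```

Every signal entering the stochastic domain needs its own comparison against a pseudo-random word, which is why the SNG hardware can dominate the overall design.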
Recently, some researchers have focused their efforts on novel applications of SC, e.g., control algorithms [
9], deep learning [
14], machine learning [
15] and estimation algorithms [
16], just to mention a few. Other researchers are concerned with the disadvantages of SC, hence aiming their research at alleviating SC drawbacks; e.g., in [
17], an area-efficient SNG is proposed that shares the permuted output of one LFSR among several SNGs. Although using permuted readings to obtain more stochastic sequences is novel, the approach only avoids implementing additional LFSRs; this is attractive, but an SNG comprises a comparator as well as an LFSR. Precision degradation in subsequent stages, after the method for selecting permuted pairs is applied, remains a risk.
SNGs can be divided into two types: True SNG (TSNG) and Pseudo-Random Number Generator (PRNG). A TSNG uses a natural source of randomness to feed the comparator that generates the outputs. Over the years, one of the most notable developments of this type has been spintronic devices such as the one presented in [
18], where a scalable SNG based on the Spin–Hall effect is proposed, which is capable of generating multiple independent stochastic streams simultaneously. The design takes advantage of the efficient charge-to-spin conversion of the Spin–Hall material and the intrinsic stochasticity of nano-magnets; although the proposal shows interesting results, it is not of practical use because it requires custom, non-commercial semiconductor devices. Previous works [
5] have concluded that SNGs based on true randomness, such as the one presented in [
18], can produce worse results than PRNG-based ones. One of the reasons could be that the maximum length of the PS is not guaranteed with such semiconductor devices. An SNG that uses a spintronic device as an
n-bit random number generator still relies on the main idea of the conventional SNG: comparison against a number to obtain a PS. Therefore, a large system implemented with spintronic devices is more likely to lose accuracy early than one built with conventional LFSR-based SNGs. On the other hand, PRNGs are not based on a natural random phenomenon, but, as mentioned before, PRNGs are more accurate than TSNGs. Most PRNGs, or at least the most popular ones, are based on LFSR designs; others are based on quasi-random number generators known as low-discrepancy (LD) sequences or Sobol sequence generators [
19], based on Least Significant Zero detection from a counter. Generating this kind of LD sequence has been proposed to improve the accuracy and computation speed of stochastic computing circuits. Nevertheless, the large amount of hardware required by Sobol sequence generators makes them expensive in terms of area and energy consumption. In [
20], two minimum probability-conversion circuits (MPCC) are proposed for reducing the hardware cost of SNGs, the MPCC of two-level logic (MPCC-2L) design and the MPCC of multilevel logic (MPCC-ML) design. The authors claim the MPCC-2L design slightly reduces SNG hardware cost, but the MPCC-ML design significantly reduces hardware cost as much as
for a 10-bit word. True results for the MPCC-ML are not presented since the authors state that additional flip-flops are required in between multiple levels of logic.
It is clear that there are opportunities to address the drawbacks of SC, in particular the relatively large circuitry of the SNG, which makes an alternative digital circuit with less hardware attractive to investigate. To the best of the authors’ knowledge, there are more PRNG works than TSNG works in the literature, owing to some of the arguments given above.
Hence, this work contributes a novel SNG of the pseudo-random type that requires fewer logic gates. This is possible because patterns were detected in the comparator structure inside the SNG during the transformation from WBS to PS, which leads to a simpler digital circuit for representing such patterns than the one in a classic SNG [
21] currently in use by engineers (see Remark 1).
Remark 1. Although the generator is somewhat old, it is still used by engineers, as can be seen in references [22,23,24]. Its popularity is due to the fact that it has the unique property of generating low-correlation sequences that perform accurate multiplications; hence, new proposals are still taking advantage of this property, as in [17,25]. It is worth mentioning that to the best of the authors’ knowledge, this generator has never been reduced, with the exception of the work [20]. Therefore, it is of interest to be able to make improvements to this generator to allow future works to have an economical and updated version of it.
3. EWBSNG Proposal
Analyzing the dynamic behavior of the original WBSNG design reveals a key pattern. In
Table 1, each time the LFSR has a new state, the defined functions in (
5) filter the binary input by letting through the most significant bit with value one (MSBO) in the LFSR. Notice that
sequences shown in
Table 1 do not overlap with each other because just one bit is set to 1 each time, i.e., the MSBO of each LFSR state. Hence, the same MSBO is used as a reference to delimit subsequent runs of 0’s, as can be seen in (
5).
Therefore, for a given weight position inside the LFSR, the total number of MSBOs for such a position in a period is equal to the total number of runs of 0’s of length
r given before the MSBO. This particular number is
.
Table 2 shows the above-mentioned facts in an ordered table of states (for
), where it can be noted that the only run of three 0’s (
) matches with only one MSBO of weight one, satisfying
. After the MSBO, the remaining part of the LFSR state consists of do-not-care terms. Hence, the WBSNG shown in
Figure 10 searches for all the individual patterns presented in (
5), requiring several AND gates to implement them.
A new weight-generating circuit with less hardware can be designed for finding the MSBO in each LFSR state, with the remaining bits as do-not-care terms. To that end, the state of the LFSR is complemented, so that the MSBO becomes zero, and this zero is propagated through a chain of cascaded AND gates across the do-not-care bits. The MSB weight is connected directly to the MSB of the LFSR, i.e.,
, as in the WBSNG. For the remaining weights, XOR gates are used: an XOR gate issues a logic 1 where a change from 1 to 0 is detected, and the remaining weights issue zeros because of the zeros propagated through the AND chain. The corresponding equations that describe the above-mentioned procedures are as follows:
Finally, the proposed weight-generating circuit for an EWBSNG is shown in
Figure 11 for the case of
. The rest of the EWBSNG circuit is similar to that shown in
Figure 10.
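The weight-generating procedure described above can be checked in software against the per-pattern detection of the classic design. The following Python sketch is one reading of that description, not the authors' circuit: the function names, the LSB-first bit ordering and the exact form of the per-pattern AND terms are assumptions made for illustration.

```python
def wbsng_weights(state, n):
    """Per-pattern detection: weight i is 1 when bit i is 1 and all higher bits are 0."""
    bits = [(state >> i) & 1 for i in range(n)]      # bits[i] is LFSR bit i, LSB first
    weights = []
    for i in range(n):
        w = bits[i]
        for j in range(i + 1, n):
            w &= 1 - bits[j]                         # AND with each complemented higher bit
        weights.append(w)
    return weights

def ewbsng_weights(state, n):
    """Complement the state, propagate through an AND chain, then detect the 1-to-0 change."""
    bits = [(state >> i) & 1 for i in range(n)]
    comp = [1 - b for b in bits]                     # complemented LFSR state
    chain = [0] * n
    chain[n - 1] = comp[n - 1]
    for i in range(n - 2, -1, -1):
        chain[i] = chain[i + 1] & comp[i]            # stays 1 until the MSBO is reached
    weights = [0] * n
    weights[n - 1] = bits[n - 1]                     # MSB weight taken directly from the LFSR
    for i in range(n - 2, -1, -1):
        weights[i] = chain[i + 1] ^ chain[i]         # XOR flags the 1-to-0 transition
    return weights

n = 4
for state in range(2**n):
    print(f"{state:04b}", wbsng_weights(state, n), ewbsng_weights(state, n))
```

Under these assumptions both functions return the same one-hot vector for every state, while the chained form reuses each partial AND result instead of rebuilding it for every weight.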
With respect to
Figure 11, the number of occurrences or clock cycles (denoted c. c.) of detected patterns (MSBO in every state of the LFSR) are shown in
Table 3,
Table 4,
Table 5 and
Table 6. It can be seen in
Table 3 that there are eight occurrences of pattern 1 ([1
]), where
represents do-not-care terms.
In
Table 4, the MSBO in each pattern was turned to logic 0. Logic 0 can easily be issued through the do-not-care terms by means of the concatenated AND gates.
The propagated zeros can now be found in
Table 5.
In
Table 6, it can be noted that a change from logic 1 to 0 is detected from left to right in the state represented by [
], which isolates the MSBO. This bit represents the weight of its position. This is carried out with XOR gates.
Instead of designing a digital circuit for detecting each pattern, as in the original WBSNG, a single digital circuit that adjusts itself to detect all patterns with fewer resources was designed here. Finally, as can be seen in
Figure 10, the last step is the comparison of the weight state [
] with the binary input data
x, which selects the weights that represent the binary input data in the PS.
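The final selection step can be illustrated with a short Python sketch. It assumes that each weight line is combined with the corresponding bit of the binary input and that the results are merged into a single output bit, as in a weighted binary SNG; the function names are hypothetical, and the states are enumerated in counting order with the zero state included, rather than in LFSR order.

```python
def msbo_weights(state, n):
    """One-hot weight vector marking the most significant 1 of an n-bit state."""
    weights = [0] * n
    for i in range(n - 1, -1, -1):
        if (state >> i) & 1:
            weights[i] = 1
            break
    return weights

def sng_bit(state, x, n):
    """Output 1 when the active weight lines up with a 1 in the binary input x."""
    weights = msbo_weights(state, n)
    return int(any(weights[i] and (x >> i) & 1 for i in range(n)))

def ones_per_period(x, n):
    """Count output 1's over all 2**n states, with the zero state included."""
    return sum(sng_bit(state, x, n) for state in range(2**n))

n = 4
for x in (0, 5, 10, 15):
    print(x, ones_per_period(x, n))
```

Because weight i is active in 2^i of the 2^n states and at most one weight is active per state, the number of 1's per period in this sketch equals the binary input x.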
Remark 4. It is clear that the LFSR cannot produce the zero state, but in Table 1 and Table 2, the zero state is added [5].
Remark 5. Since we are not modifying the pseudo-random source (LFSR) and only the comparator inside the SNG is modified without changing its functionality, the effects of the initial seed and the effects of different feedback configurations in the LFSR [28,29] and the correlation study of permuted LFSR outputs as well [17,30] remain the same with respect to the original SNG.