*4.2. Comparison*

Table 3 compares the proposed design with previous designs that can perform the same FFT operations (i.e., a maximum FFT length of 128). An SDF 128-point FFT processor was developed based on the radix-2<sup>4</sup> algorithm [30], and its performance was extended to four streams for evaluation. Because the compared designs were implemented in different technologies, the area was normalized using Equation (8), following the scheme in [27,33]. Although the GI duration is relatively short in terms of timing (e.g., 1/4 of the symbol duration), the proposed FFT kernel can support a throughput gain of 6/4 in the number of streams, with the main additional cost being the memory elements needed for the two-stream data storage. The comparison results demonstrate the per-stream area efficiency of the proposed design, which is obtained by using the area-efficient Modules 2 and 3 (sorting buffer and CMU), resource sharing, and hardware use during the GI duration. Table 3 shows the improved area efficiency (i.e., throughput per area) of the proposed design, which achieved the highest throughput while using modest hardware resources.

$$\text{Norm. Core Area} = \frac{\text{Core Area}}{\left(\text{Tech.}/90\,\text{nm}\right)^2} \tag{8}$$
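
As a minimal sketch of how Equation (8) can be applied, the Python snippet below scales a core area to the 90 nm reference node and then forms a throughput-per-area figure. The function names, the units (mm² for area, an arbitrary throughput unit), and the numeric inputs are illustrative assumptions; they are not values taken from Table 3.

```python
def normalized_core_area(core_area: float, tech_nm: float, ref_nm: float = 90.0) -> float:
    """Scale a core area to the reference node as in Equation (8).

    core_area: reported core area (unit assumed here to be mm^2).
    tech_nm:   feature size of the technology the design was built in, in nm.
    ref_nm:    reference node; Equation (8) uses 90 nm.
    """
    if core_area <= 0 or tech_nm <= 0 or ref_nm <= 0:
        raise ValueError("area and technology nodes must be positive")
    return core_area / (tech_nm / ref_nm) ** 2


def throughput_per_area(throughput: float, norm_area: float) -> float:
    """Area efficiency: throughput divided by the normalized core area."""
    if norm_area <= 0:
        raise ValueError("normalized area must be positive")
    return throughput / norm_area


# Hypothetical design: 4.0 mm^2 in a 180 nm process.
# (180 / 90)^2 = 4, so the normalized area is 4.0 / 4 = 1.0 mm^2.
area_180 = normalized_core_area(4.0, 180.0)

# A design already at 90 nm is left unchanged by the normalization.
area_90 = normalized_core_area(1.0, 90.0)
```

Dividing by the squared ratio reflects that area scales with the square of the feature size, so a design in an older (larger) node is credited with a proportionally smaller normalized area before the throughput-per-area comparison is made.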


**Table 3.** Performance evaluation and comparison based on 128-point FFT specification.

<sup>1</sup> A test module was included. <sup>2</sup> An output sorting buffer was included. <sup>3</sup> All area values were normalized using Equation (8).
