Efficient Sigma–Delta Sensor Array Beamforming

Carbajal Ipenza, Sammy Johnatan; Masiero, Bruno Sanches

doi:10.3390/s23177577


by Sammy Johnatan Carbajal Ipenza ¹ and Bruno Sanches Masiero ²,*

¹ NXP Semiconductors N.V., 5656 Eindhoven, The Netherlands

² School of Electrical and Computer Engineering, Universidade Estadual de Campinas, Campinas 13083-852, Brazil

* Author to whom correspondence should be addressed.

Sensors 2023, 23(17), 7577; https://doi.org/10.3390/s23177577

Submission received: 30 June 2023 / Revised: 23 August 2023 / Accepted: 25 August 2023 / Published: 31 August 2023

(This article belongs to the Special Issue Energy-Efficient Communication Networks and Systems)


Abstract

Nowadays, sensors with built-in sigma–delta modulators (ΣΔMs) are widely used in consumer, industrial, automotive, and medical applications, as they have become a cost-effective and convenient way to deliver data to digital processors. This is the case for micro-electro-mechanical system (MEMS) digital microphones, which convert analog audio to a pulse-density modulated (PDM) bitstream. However, as ΣΔMs output a PDM signal, sensors require either built-in or external high-order decimation filters to demodulate the PDM signal to a baseband multi-bit pulse-code modulated (PCM) signal. Because of this extra circuit requirement, implementing sensor array algorithms such as beamforming in embedded systems (where processing resources are critical) or in very large-scale integration (VLSI) circuits (where power and area are crucial) becomes especially expensive, as a large number of parallel decimation filters is required. This article proposes a novel architecture for beamforming algorithm implementation that fuses delay and decimation operations based on maximally flat (MAXFLAT) filters to make array processing more affordable. As a proof of concept, we present an implementation example of a delay-and-sum (DAS) beamformer for given spatial and frequency requirements using this novel approach. Under these specifications, the proposed architecture requires 52% lower storage resources and 19% lower computational resources than the most efficient state-of-the-art architecture.

Keywords: MAXFLAT; PDM; sigma–delta; microphone; sensor array; decimation; beamforming

1. Introduction

In the last decades, sensor array processing has emerged as an active area of research in estimating space-time parameters. Array processing is applied to many real-world problems. In telecommunications, for example, antenna arrays are steered toward one user's direction to reduce interference between users. Radar and sonar use arrays of antennas and hydrophones, respectively, to estimate parameters like direction of arrival (DoA), velocity, and range. In medicine, sensor arrays are used for medical imaging, and planar biomagnetic sensor arrays are used in electrocardiograms to localize brain activity. In industry, sensor arrays are used in automatic monitoring and fault detection [1].

More recently, microphone array processing has emerged to increase the audio quality in consumer devices like mobile phones, speakerphones, and smart speakers, which are broadly used in conference rooms, desktop devices, and intelligent virtual assistants (IVA), in both consumer and industrial devices. Most frequently, the signals from several microphones are combined via a beamforming algorithm to enhance the sound coming from a desired direction while attenuating ambient noise and interference [1].

However, microphone array implementations are still expensive due to the complex characteristics of speech signals (non-static source, intermittent, and broadband) and the usual environmental conditions (reverberation and non-stationary additive noise). Adding an extra microphone in the design requires new routing, new placement conditions, and more processing resources, increasing the system cost and power consumption, a critical factor for internet of things (IoT) and mobile applications.

Digital MEMS microphones (introduced in 2006 [2]) have emerged as an alternative to overcome the size and cost limitations. As these microphones have an analog-to-digital converter (ADC) incorporated as a pre-amplifier [3], they have a single-line PDM output; because of that, they are also known as PDM microphones (PDM-mics). A decimation filter (also known as a PDM-to-PCM converter) demodulates this PDM bitstream output to a PCM signal. Unfortunately, implementing this decimation filter is still not cheap, as its cost (measured in die area and power) increases with the quality of the desired audio signal. Take, for example, the case of a microphone array using these PDM-mics. This architecture requires a decimation filter for each microphone input, so the implementation cost and power consumption increase proportionally with the number of microphones, which makes it even more expensive for practical applications.

This paper proposes a novel and economical method to implement beamforming algorithms with arrays of digital MEMS microphones. We apply the new architecture to a DAS beamformer as a proof of concept, but it can also be used with other beamforming strategies. This method merges a conventional beamformer’s filtering and delay operations into a single structure dubbed the delayed decimation filter. We propose a J-stage decimation filter whose penultimate stage ($J - 1$) is a Samadi filter and whose last stage (J) is an equiripple filter. The Samadi filter controls the overall filter delay by adjusting a single parameter, and the last equiripple stage compensates for the magnitude and phase distortion caused by the Samadi filter, keeping it under a specified limit.

In the end, the proposed delayed decimation filter is an “all-in-one” filter that performs the same filtering and downsampling operations as any state-of-the-art decimation filter, has the capability of altering its group delay without any change in its structure or additional delay chain, and provides storage and computational resources savings in comparison to state-of-the-art architectures.

To explain the working principle of the proposal, we first recapitulate the implementation of a DAS beamformer in Section 2. We then present a novel beamformer based on delayed decimation filters in Section 3, where we introduce multirate and decimation filters, as well as how a Samadi filter can be used with these structures. To conclude, as a proof of concept, we present in Section 4 an implementation of this novel architecture, and in Section 5, we compare it to state-of-the-art DAS beamformer architectures.

2. DAS Beamformer

The DAS beamformer is the oldest and simplest array signal processing algorithm [1]. The underlying idea is to delay each microphone input by an appropriate time delay and then add all delayed microphone signals together. In this sense, the audio signal arriving from a particular direction at the array is reinforced in relation to signals coming from different directions and incoherent noise.

In the literature, the traditional DAS does not include the weights $w_{m}$ in its time-domain representation, because these weights only appear in a “weighted DAS” or in a frequency-domain representation; in this work, however, the “weighted DAS” is referred to as the traditional DAS, as $w_{m}$ can implement the averaging process. With this convention, the traditional or discrete-time DAS beamformer is the result of

z [k] = \sum_{m = 0}^{M - 1} w_{m} y_{m} [k - k_{m}],

(1)

where $y_{m}$ is the mth microphone’s output in PCM representation and $k_{m}$ is the integer delay associated with the mth microphone, such that

k_{m} = [Δ_{m} / T] = [Δ_{m} f_{o}],

(2)

where $Δ_{m}$ is the required delay in the mth microphone, [x] means the nearest integer to x, and $f_{o}$ and T are the sampling rate and period of $y_{m}$, respectively.
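Equations (1) and (2) can be illustrated with a minimal NumPy sketch. It is not the implementation evaluated in this article: the function name `das_beamformer`, the default uniform weights $w_{m} = 1 / M$, the restriction to non-negative delays, and the 16 kHz rate used in the example are all assumptions made for illustration.

```python
import numpy as np

def das_beamformer(y, delays_s, f_o, weights=None):
    """Integer-delay DAS of Eq. (1); y has shape (M, num_samples)."""
    y = np.asarray(y, dtype=float)
    M, n = y.shape
    # Assumption: uniform averaging weights when none are given.
    w = np.full(M, 1.0 / M) if weights is None else np.asarray(weights, dtype=float)
    # Eq. (2): nearest-integer delay in samples at the PCM rate f_o.
    k = np.rint(np.asarray(delays_s, dtype=float) * f_o).astype(int)
    if np.any(k < 0):
        raise ValueError("this sketch only handles non-negative (causal) delays")
    z = np.zeros(n)
    for m in range(M):
        if k[m] < n:
            # y_m[k - k_m]: shift channel m right by k_m samples (zero-filled).
            z[k[m]:] += w[m] * y[m, : n - k[m]]
    return z, k

# Two channels carrying the same pulse; channel 1 receives it 2 samples later,
# so channel 0 is delayed by 2 samples to align both before summing.
f_o = 16000.0                       # assumed output sampling rate
y = np.zeros((2, 8))
y[0, 3] = 1.0
y[1, 5] = 1.0
z, k = das_beamformer(y, [2 / f_o, 0.0], f_o)
```

After alignment, both pulses add coherently at the same output sample, which is the reinforcement effect described above.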

In the case of PDM-mics, Equation (1) can be represented as shown in Figure 1, such that $y_{m}$ is the decimation filter’s output and $x_{m}$ is the PDM bitstream coming from the respective mth PDM-mic.

Due to the integer nature of $k_{m}$, the DAS beamformer cannot form sums that involve noninteger multiples of T. Consequently, beams cannot be steered in arbitrary directions, and the directivity pattern has a stepped response that limits the beamformer resolution (as exemplified in Figure 2).

Also, if one assumes uncorrelated noise at the locations of the sensors and that the beamformer’s delays are appropriately matched to the wave’s DoA, it can be proven [4] that the beamformer gain (G) depends only on the weights $w_{m}$ and the number of microphones:

G = \frac{{(\sum_{m = 0}^{M - 1} w_{m})}^{2}}{\sum_{m = 0}^{M - 1} w_{m}^{2}},

(3)

so that, for the beamformer in Figure 2, with $M = 40$ and $w_{m} = 1$, the white noise gain will be $G = 40$ or 32 dB. Furthermore, the dynamic range depends only on the number of elements in the array. The array used for the current example provides a dynamic range of 13 dB.
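As a quick numerical check of Equation (3), the sketch below evaluates G for uniform and non-uniform weights. The function name `array_gain` and the tapered weight vector are hypothetical and serve only to show that uniform weights give $G = M$.

```python
import numpy as np

def array_gain(weights):
    """Beamformer gain of Eq. (3): (sum of w_m)^2 / (sum of w_m^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

g_uniform = array_gain(np.ones(40))       # uniform weights, M = 40
g_tapered = array_gain([1.0, 2.0, 3.0])   # hypothetical non-uniform weights, M = 3
```

With unit weights the expression reduces to $M^{2} / M = M$, while any non-uniform weighting yields a gain below M.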

3. Beamformer Based on Delayed Decimation Filter

Figure 1 describes a typical architecture for implementing DAS beamformers with PDM-mics. For each PDM-mic, there is an associated decimation filter to convert the PDM bitstream into a PCM bitstream and a delay line to steer the beamformer. To devise a more economical implementation of this architecture, we propose to merge the decimation filtering and the delaying operations into a single structure. To explain how a Samadi filter can be used for this purpose, we first review the concept of multirate and decimation filters, present the Samadi filter structure, show how it can be used as a multirate filter, and finally propose a new beamforming architecture based on this multirate filter (delayed decimation filter).

3.1. Multirate and Decimation Filters

Multirate filters are digital filters whose different parts operate at different rates. The most obvious application of such a filter is when the input and output sample rates must differ (decimation or interpolation). A decimation filter is a type of multirate filter [5] that decreases a signal’s sampling rate by an integer or fractional factor. Figure 3 shows a generic decimation filter structure, where the input signal at the $f_{i}$ sampling rate passes through a low-pass filter (LPF) with transfer function $H (z)$ and is then downsampled by a factor R to an output sampling rate $f_{o} = f_{i} / R$. In the case of a PDM-mic, $x [n]$ usually has a one-bit width, while $y [k]$ is a multi-bit output.
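The structure of Figure 3 can be sketched in a few lines of NumPy. The moving-average LPF used here is only a stand-in chosen for readability, not a filter from this article, and the names `decimate`, `x`, `h`, and `R` as used in the example are illustrative assumptions.

```python
import numpy as np

def decimate(x, h, R):
    """Filter x with FIR taps h at the input rate, then keep every R-th sample."""
    y_full = np.convolve(x, h)[: len(x)]   # low-pass filtering at f_i
    return y_full[::R]                     # downsampling to f_o = f_i / R

# Hypothetical one-bit PDM stream with a pulse density of 3/4.
x = np.tile([1, 1, 1, 0], 8)
R = 4
h = np.ones(R) / R                         # assumed stand-in LPF: length-R moving average
y = decimate(x, h, R)
```

Once the filter memory is full, every output sample equals the pulse density of the input bitstream, which illustrates how a one-bit PDM input becomes a multi-bit PCM output.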

For a given application, many design parameters must be taken into account for the LPF design, such as the filter passband frequency $F_{p}$, stopband frequency $F_{s}$, passband ripple $δ_{p}$, and stopband ripple $δ_{s}$, as exemplified in Figure 4. Those LPF design parameters are related as follows:

U_{p} = \{ f : f \in [0, F_{p}] \},

(4a)

U_{s} = \{ f : f \in [F_{s}, f_{i}] \},

(4b)

δ_{p} = \max ( | | H (e^{2 π i f / f_{i}}) | - 1 | ) \; \forall f \in U_{p},

(4c)

δ_{s} = \max ( | H (e^{2 π i f / f_{i}}) | ) \; \forall f \in U_{s},

(4d)

where $U_{p}$ and $U_{s}$ are the passband and stopband frequency ranges, respectively. Also, the angular passband and stopband frequencies can be expressed as

w_{p} = \frac{2 π F_{p}}{f_{i}},

(5a)

w_{s} = \frac{2 π F_{s}}{f_{i}},

(5b)

and the $U_{p}$ and $U_{s}$ intervals can be scaled to the angular frequency domain as

V_{p} = \frac{2 π U_{p}}{f_{i}},

(6a)

V_{s} = \frac{2 π U_{s}}{f_{i}} .

(6b)
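The ripple definitions in (4c) and (4d) can be evaluated numerically on a frequency grid, as in the following sketch. The helper names `freq_response` and `ripples` are hypothetical. The sketch also assumes that the stopband is sampled only up to $f_{i} / 2$, because the response of a real-coefficient filter above that frequency mirrors the response below it; this departs from the upper limit written in (4b).

```python
import numpy as np

def freq_response(h, f, f_i):
    """H(e^{2 pi i f / f_i}) of an FIR filter with taps h at frequencies f."""
    n = np.arange(len(h))
    return np.exp(-2j * np.pi * np.outer(f, n) / f_i) @ np.asarray(h, dtype=float)

def ripples(h, F_p, F_s, f_i, num=512):
    """Passband and stopband ripples following Eqs. (4c) and (4d)."""
    U_p = np.linspace(0.0, F_p, num)
    U_s = np.linspace(F_s, f_i / 2, num)   # assumption: stop at f_i / 2
    delta_p = np.max(np.abs(np.abs(freq_response(h, U_p, f_i)) - 1.0))
    delta_s = np.max(np.abs(freq_response(h, U_s, f_i)))
    return delta_p, delta_s

# Two-tap averager with normalized rate f_i = 1; |H| = |cos(pi f / f_i)|.
delta_p, delta_s = ripples([0.5, 0.5], F_p=0.05, F_s=0.45, f_i=1.0)
```

For this toy filter the ripples follow directly from the cosine magnitude response, which makes the sketch easy to verify by hand.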

In the case of audio sensors such as MEMS microphones, a decimation filter is required to convert the oversampled output from the internal ADC to a standard audio PCM output. Baseband signal quality parameters such as linearity, signal-to-noise ratio (SNR), total harmonic distortion (THD), and total harmonic distortion plus noise (THD+N) can be worsened at the filter output if the LPF is not properly designed [6]. Also, the LPF structure should be carefully chosen to obtain a proper phase response. A Finite Impulse Response (FIR) structure, for example, can be used if a linear phase is required; otherwise, Infinite Impulse Response (IIR) filters are preferred, as, usually, IIR filters are smaller than their equivalent FIR implementations. Moreover, some applications tolerate some degree of non-linearity in phase; in this case, quasi-linear filters, a mixture of FIR and IIR filters, can be used.

3.2. Universal Maximally Flat Samadi Filter

As derived in [7], the transfer function of a Samadi filter is defined by

H_{N, K, d} (z) = \sum_{j = 0}^{N - K} c_{j} {(\frac{1 - z^{- 1}}{2})}^{j} {(\frac{1 + z^{- 1}}{2})}^{N - j},

(7)

where

c_{j} = \sum_{i = 0}^{j} {(- 1)}^{j - i} \binom{\frac{N}{2} - d}{i} \binom{\frac{N}{2} + d}{j - i},

(8)

K is the number of zeros at $z = - 1$, N is the filter order, and the delay parameter d is a real number defined as

d = α - \frac{N}{2} .

(9)

For a given group delay $α$, such that $0 \leq α \leq N$, from (9), one can verify that

- \frac{N}{2} \leq d \leq \frac{N}{2}

(10)

or

| d | \leq d_{max},

(11)

where $d_{max} = N / 2$ is the maximum allowed delay parameter and the binomial coefficients in (8) are defined as

\binom{r}{s} = \begin{cases} \prod_{q = 0}^{s - 1} \frac{r - q}{q + 1}, & s \geq 1 \\ 1, & s = 0 \\ 0, & s < 0 . \end{cases}

(12)
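Equations (7), (8), and (12) translate directly into code. The following sketch expands (7) into an impulse response by repeated convolution; the function names `gen_binom`, `samadi_coeffs`, and `samadi_impulse_response` are hypothetical, and the small parameter values are chosen only so that the results can be checked by hand.

```python
import numpy as np

def gen_binom(r, s):
    """Generalized binomial coefficient of Eq. (12); r may be any real number."""
    if s < 0:
        return 0.0
    out = 1.0
    for q in range(s):
        out *= (r - q) / (q + 1)
    return out

def samadi_coeffs(N, K, d):
    """Coefficients c_0 ... c_{N-K} of Eq. (8)."""
    return np.array([
        sum((-1) ** (j - i) * gen_binom(N / 2 - d, i) * gen_binom(N / 2 + d, j - i)
            for i in range(j + 1))
        for j in range(N - K + 1)
    ])

def samadi_impulse_response(N, K, d):
    """Expand Eq. (7) into N + 1 FIR taps (powers of z^-1)."""
    h = np.zeros(N + 1)
    for j, c_j in enumerate(samadi_coeffs(N, K, d)):
        term = np.array([1.0])
        for _ in range(j):                 # ((1 - z^-1) / 2)^j
            term = np.convolve(term, [0.5, -0.5])
        for _ in range(N - j):             # ((1 + z^-1) / 2)^(N - j)
            term = np.convolve(term, [0.5, 0.5])
        h += c_j * term
    return h

h_delay = samadi_impulse_response(4, 0, 1)   # K = 0, integer group delay 3
h_frac = samadi_impulse_response(2, 1, 1)
dc_group_delay = np.dot(np.arange(len(h_frac)), h_frac) / h_frac.sum()
```

With $K = 0$ and an integer $α = N / 2 + d$, the taps collapse to a pure delay of $α$ samples. For $N - K \geq 1$, the group delay at zero frequency, computed here as the first moment of the taps, should equal $N / 2 + d$, in agreement with (9).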

This filter becomes a maximally flat (MAXFLAT) linear-phase FIR filter when $d = 0$. As shown in [8,9], the angular passband frequency ($w_{p}$) of these linear-phase filters is related to N as

L ≃ ⌈N w_{p} / π + 0.5⌉,

(13)

where L is defined for convenience as

L = N - K .

(14)

The cutoff frequency of these linear-phase filters increases almost linearly with L, as shown in Figure 5 for different values of N. Also, as demonstrated in [7] and shown in Figure 5a, for linear-phase filters ($d = 0$), the coefficients of (7) satisfy

c_{j} |_{d = 0} = 0, \quad j \text{ odd} .

(15)

Then, the magnitude frequency spectra of $L = 2 j$ and $L = 2 j + 1$ are the same for $j \in \{0, \dots, ⌊ N / 2 - 1 ⌋\}$. Figure 5 also shows that the filter has a linear phase and that the group delay for $d = 0$ is $α = N / 2$, as expected from (9).

On the other hand, when $d \neq 0$, the Samadi filter becomes a MAXFLAT nonlinear-phase filter. The most interesting characteristic of this filter class is the ability to modify its group delay with the filter delay parameter (d), as given by (9). Figure 6 shows how the flatness of the magnitude and phase of the filter’s frequency response is affected when d increases: the passband ripple $δ_{p}$ worsens as d increases. However, it also shows that the phase is still linear inside the passband region for $ω < 0.15 π$ and that the decimation filter continues to meet the same specification for all values $| d | \leq 5$. This suggests that this filter can be used as an intermediary stage in a multirate filter chain to adjust the overall filter delay (Δ) and perform low-pass filtering at the same time, as discussed in the following sections.

Finally, we propose Algorithm 1 to calculate the minimum K and N Samadi filter values for a given d, matching a given filter specification with the following parameters: $V_{p}$, $V_{s}$, $δ_{p}$, and $δ_{s}$. In lines 2–4, the algorithm initializes the L, N, and $w_{p}$ values to the minimum possible ones. Then, in line 5, it starts to iterate to calculate the minimum K and N values. In line 6, K is updated. In lines 7–8, $δ_{p}$ and $δ_{s}$ are calculated from the filter frequency response for the $V_{p}$ and $V_{s}$ ranges, respectively, and for the current K and N values. If $δ_{p}$ and $δ_{s}$ meet the specification, it returns the parameter values in line 10. Otherwise, in lines 12–26, it increases the N or L value, depending on the value of d and on whether the filter parameters are inside the ranges defined in (13)–(15).

Algorithm 1 Samadi filter minimum N and K calculation algorithm
1: procedure SamadiMinN( $d, δ_{p}, δ_{s}, V_{p}, V_{s}$ )
2:     $L \leftarrow 0$
3:     $N \leftarrow 2 ⌈ d ⌉$
4:     $w_{p} \leftarrow \max (V_{p})$
5:     loop
6:         $K \leftarrow N - L$
7:         $δ_{p}^{'} \leftarrow \max (| | H_{N, K, d} (e^{i ω}) | - 1 |) \; \forall ω \in V_{p}$
8:         $δ_{s}^{'} \leftarrow \max (| H_{N, K, d} (e^{i ω}) |) \; \forall ω \in V_{s}$
9:         if $δ_{p}^{'} \leq δ_{p}$ and $δ_{s}^{'} \leq δ_{s}$ then
10:             return $N, K$
11:         else
12:             if d = 0 then	▹ Linear-phase filter
13:                 if $L > ⌈ N w_{p} / π + 0.5 ⌉$ then
14:                     $L \leftarrow 0$
15:                     $N \leftarrow N + 1$
16:                 else
17:                     $L \leftarrow L + 2$
18:                 end if
19:             else	▹ Nonlinear-phase filter
20:                 if $δ_{s}^{'} \geq 1$ or $L \geq N$ then
21:                     $L \leftarrow 0$
22:                     $N \leftarrow N + 1$
23:                 else
24:                     $L \leftarrow L + 1$
25:                 end if
26:             end if
27:         end if
28:     end loop
29: end procedure
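A possible Python transcription of Algorithm 1 is sketched below. It is an illustration rather than the code used to produce Figure 7, and it makes several labeled assumptions: the function names are hypothetical, the passband ripple follows the definition in (4c), $| d |$ is used in the initialization of N, an extra guard keeps $K = N - L$ non-negative in the linear-phase branch, and a `max_order` limit stops the search if no filter is found.

```python
import math
import numpy as np

def gen_binom(r, s):
    """Generalized binomial coefficient of Eq. (12), for s >= 0."""
    out = 1.0
    for q in range(s):
        out *= (r - q) / (q + 1)
    return out

def samadi_response(N, K, d, omega):
    """H_{N,K,d}(e^{i omega}) evaluated directly from Eqs. (7) and (8)."""
    z1 = np.exp(-1j * np.asarray(omega, dtype=float))
    H = np.zeros(z1.shape, dtype=complex)
    for j in range(N - K + 1):
        c_j = sum((-1) ** (j - i) * gen_binom(N / 2 - d, i) * gen_binom(N / 2 + d, j - i)
                  for i in range(j + 1))
        H += c_j * ((1 - z1) / 2) ** j * ((1 + z1) / 2) ** (N - j)
    return H

def samadi_min_n(d, delta_p, delta_s, V_p, V_s, max_order=200):
    L = 0
    N = 2 * math.ceil(abs(d))            # assumption: |d| so that N >= 0
    w_p = float(np.max(V_p))
    while N <= max_order:                # assumption: bounded search
        K = N - L
        dp = np.max(np.abs(np.abs(samadi_response(N, K, d, V_p)) - 1.0))
        ds = np.max(np.abs(samadi_response(N, K, d, V_s)))
        if dp <= delta_p and ds <= delta_s:
            return N, K
        if d == 0:                       # linear-phase filter
            # Assumption: "L + 2 > N" is added so that K never becomes negative.
            grow = L > math.ceil(N * w_p / math.pi + 0.5) or L + 2 > N
            step = 2
        else:                            # nonlinear-phase filter
            grow = ds >= 1.0 or L >= N
            step = 1
        if grow:
            L, N = 0, N + 1
        else:
            L += step
    raise ValueError("no filter found up to max_order")

# Loose, hypothetical specification used only to exercise the search.
V_p = np.linspace(0.0, 0.1 * np.pi, 64)
V_s = np.linspace(0.8 * np.pi, np.pi, 64)
result_nonlinear = samadi_min_n(1, 0.1, 0.1, V_p, V_s)
result_linear = samadi_min_n(0, 0.1, 0.1, V_p, V_s)
```

For this loose specification, a second-order filter with both zeros at $z = - 1$ already meets the ripple limits in both branches; tighter specifications make the loop explore larger N and L values.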

Figure 7 shows the minimum N and K values, calculated using Algorithm 1 for $d \in \{0, \dots, 26\}$ and different values of $w_{p}$. It shows that the minimum N required for any d decreases as $w_{p}$ increases, and it is almost three times d when $w_{p} / π = 0.28$.

Also, it is essential to remark that, if the Samadi filter is designed for $d_{max}$, the decimation filter continues to meet the same specification for values $| d | \leq d_{max}$. This effect can be observed in Figure 6a, where $δ_{p}$ decreases for lower values of d, and in Figure 7, where, for $d \geq 3$, if N is kept constant and d is decreased, $w_{p}$ tends to increase so that the flatness is improved.

3.3. Delayed Decimation Filter

Because of its configurable group delay property, a single Samadi filter could be used as the LPF of a multirate filter with adjustable overall filter delay, as shown in Figure 8a; this structure is dubbed the delayed decimation filter in this paper. However, as a Samadi filter does not have the flexibility to be designed for specific $F_{p}$ and $F_{s}$ values without changing other filtering parameters, its frequency response needs to be compensated to keep the overall decimation filter’s parameters under specification for different delay values (d). For this reason, we propose a J-stage decimation filter architecture whose penultimate stage ($J - 1$) is a Samadi filter and whose last stage (J) is an equiripple filter, as shown in Figure 8b. The Samadi filter can then be decomposed into its binomial components, as shown in Figure 8c.

The Samadi filter controls the overall filter delay (Δ) by setting its respective d parameter, and the last equiripple stage compensates for the magnitude and phase distortion caused by the Samadi filter under a specified limit. Also, as this is a multi-stage filter, other filtering stages (1 to $J - 2$) can be optionally added to help with decimation and filtering.

The overall filter delay Δ depends on the d, $R_{J - 1}$, and $R_{J}$ parameters as follows:

Δ = \frac{d}{R_{J} R_{J - 1} f_{o}} .

(16)

Substituting (16) into (11) shows that the maximum required delay ($Δ_{max}$) is limited by the $d_{max}$ parameter as follows:

| Δ | \leq \frac{d_{max}}{R_{J} R_{J - 1} f_{o}} .

(17)

Therefore, since $d_{max} = Δ_{max} R_{J} R_{J - 1} f_{o}$, the minimum K and N parameters can be calculated using Algorithm 1 for $d = d_{max}$ and the desired filter specification parameters: $δ_{p} = δ_{p}^{j}$, $δ_{s} = δ_{s}^{j}$, $V_{p} = V_{p}^{j}$, and $V_{s} = V_{s}^{j}$ for $j = J - 1$.
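The mapping between a required delay Δ and the Samadi delay parameter d in (16), together with the limit in (17), can be written as a short helper. The function name `delay_parameter` and the numerical values in the example (a 16 kHz output rate and a 100 µs delay) are assumptions for illustration and are not taken from Table 1 or Table 2.

```python
def delay_parameter(delta, R_J, R_Jm1, f_o, d_max=None):
    """Solve Eq. (16) for d; delta is in seconds and f_o in Hz.

    R_Jm1 stands for R_{J-1}. If d_max is given, the bound of
    Eq. (17) is enforced.
    """
    d = delta * R_J * R_Jm1 * f_o
    if d_max is not None and abs(d) > d_max:
        raise ValueError("required delay exceeds the range allowed by d_max")
    return d

# Hypothetical example: 100 microseconds of delay, R_J = R_{J-1} = 2, f_o = 16 kHz.
d_example = delay_parameter(100e-6, 2, 2, 16000.0, d_max=10.0)
```

Because the Samadi stage runs at $R_{J} R_{J - 1}$ times the output rate, d is simply the required delay expressed in samples at that intermediate rate, which is why it can take fractional values.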

3.4. Optimized Beamformer Structure

Since the Samadi filter is a binomial filter sequence (as first proposed by Haddad in [10]), (7) can be rearranged to allow the filter to be expressed as

H_{N, K, d} (z) = {(\frac{1 + z^{- 1}}{2})}^{N} \sum_{j = 0}^{N - K} c_{j} {(\frac{1 - z^{- 1}}{1 + z^{- 1}})}^{j} .

(18)

The binomial filter in Equation (18) can be realized as a cascade of two filters:

H_{N, K, d} (z) = A_{N} (z) B_{N, K, d} (z),

(19)

where

A_{N} (z) = {(\frac{1 + z^{- 1}}{2})}^{N}, B_{N, K, d} (z) = \sum_{j = 0}^{N - K} c_{j} {(\frac{1 - z^{- 1}}{1 + z^{- 1}})}^{j} .

(20)

The Samadi filter stage in a delayed decimation filter in Figure 8c can be expressed in its binomial representation in such a way that the latter part of the filter chain does not depend on Δ, as d is used only for the calculation of $c_{j}$. Therefore, if M delayed decimation filters are placed in parallel, the weightings by $w_{m}$ are placed just before the $A_{N} (z)$ filter, and their outputs are added to form a beamformer. Note that the part of the chain after $B_{N, K, d} (z)$ can be shared between all microphone channels, as shown in Figure 9.
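The factorization in (18)–(20) and the resulting sharing of $A_{N} (z)$ can be checked numerically on a frequency grid, as in the sketch below. The coefficient sets, weights, and channel spectra are arbitrary placeholders (the identity holds for any $c_{j}$), and the grid avoids $z = - 1$, where the ratio in $B_{N, K, d} (z)$ is undefined.

```python
import numpy as np

omega = np.linspace(0.01 * np.pi, 0.9 * np.pi, 200)   # stay away from z = -1
z1 = np.exp(-1j * omega)                               # z^-1 on the unit circle

def direct_form(c, N):
    """Eq. (7) evaluated term by term."""
    return sum(c_j * ((1 - z1) / 2) ** j * ((1 + z1) / 2) ** (N - j)
               for j, c_j in enumerate(c))

def a_part(N):
    """A_N(z) of Eq. (20), common to all channels."""
    return ((1 + z1) / 2) ** N

def b_part(c):
    """B_{N,K,d}(z) of Eq. (20), the only part that depends on d."""
    return sum(c_j * ((1 - z1) / (1 + z1)) ** j for j, c_j in enumerate(c))

N = 6
rng = np.random.default_rng(0)
coeff_sets = [[1.0, -2.0, 0.5], [1.0, 1.5, -0.25], [1.0, 0.0, -3.0]]  # placeholder c_j per channel
w = np.array([0.2, 0.3, 0.5])                                         # placeholder weights w_m
X = rng.standard_normal((3, omega.size)) + 1j * rng.standard_normal((3, omega.size))

# One full Samadi filter per channel, then the weighted sum.
per_channel = sum(w[m] * direct_form(coeff_sets[m], N) * X[m] for m in range(3))
# Channel-specific B parts first, weighted sum, then a single shared A_N.
shared = a_part(N) * sum(w[m] * b_part(coeff_sets[m]) * X[m] for m in range(3))
```

Both orderings give the same output spectrum, which follows from the linearity of the filters and is the property that allows a single $A_{N} (z)$ to serve all microphone channels.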

4. Proof of Concept

We now evaluate the proposed architecture. We determine the delayed decimation filter parameters for a given specification and compare the proposed architecture to state-of-the-art DAS beamformer architectures.

4.1. Decimation Filter Specifications

Filter specifications and array geometries change depending on the beamformer application. Therefore, to compare the efficiency between the proposed method and the straightforward DAS beamformer implementation, we use the specification shown in Table 1 as the basis of all our decimation filter designs, as it is considered enough for most PDM-mic types and speech-processing applications.

4.2. Beamformer Specification

The delay from the array center to the mth microphone ($Δ_{m}$) in an array is constrained to

| Δ_{m} | \leq Δ_{max} for m = 0, 1, \dots, M - 1

(21)

such that

Δ_{max} = \frac{| {\bar{x}}_{max} - {\bar{x}}_{c} |}{c},

(22)

where ${\bar{x}}_{max}$ is the furthest sensor location in relation to ${\bar{x}}_{c}$ (which is the array’s center reference), M is the number of microphones, and c is the sound speed (typically 343.0 m/s).

Assume that we require a microphone array for hands-free applications that, when placed 80 cm from the voice source, would attain the same SNR as a single microphone placed 2 cm from the same source [11]. Then, by (3), the desired microphone array requires $M = 40$ microphones.

Also, as the minimum distance between microphones should be $D_{\min} \leq c / (2 F_{p})$ to avoid spatial aliasing, if the frequency range is limited to $F_{p} = 7.5$ kHz, then the desired microphone array will require $D_{\min} \leq 2$ cm. Finally, as $M = 40$, if a $5 \times 8$ microphone array is assumed, then $Δ_{max}$ can be calculated using (22), with the resulting value shown in Table 2.
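Equation (22) can be evaluated for any array geometry once the sensor coordinates are known. The sketch below assumes a planar rectangular grid with a 2 cm pitch and takes the centroid of the sensor positions as the center reference; both choices, as well as the function names, are illustrative assumptions, so the result is not necessarily the value listed in Table 2.

```python
import numpy as np

def grid_positions(rows, cols, pitch):
    """Sensor coordinates (in meters) of a planar rows x cols grid."""
    r, q = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    return np.stack([r.ravel() * pitch, q.ravel() * pitch], axis=1)

def max_delay(positions, c=343.0):
    """Eq. (22): largest distance to the center reference divided by c."""
    p = np.asarray(positions, dtype=float)
    center = p.mean(axis=0)            # assumption: centroid as center reference
    return np.max(np.linalg.norm(p - center, axis=1)) / c

positions = grid_positions(5, 8, 0.02)   # assumed 2 cm pitch
delta_max = max_delay(positions)          # in seconds
```

The furthest sensors are the four corners of the grid, so $Δ_{max}$ is set by half of the array diagonal measured between corner sensors.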

4.3. Filter Design

A delayed decimation filter was designed according to the specifications listed in Table 1. The filter has a three-stage architecture ([lthband, maxflat, equir]) with respective decimation rates [48, 2, 2]. The lthband stage is an LPF whose cutoff frequency is $π / L$ and whose impulse response is zero at every L-th sample [5]. The second stage is a maxflat Samadi filter, and the last is an equiripple filter [12]. As $R_{J} = R_{J - 1} = 2$, by (17), $d_{max} = 20.13$; the parameters N and K of the maxflat stage are calculated using Algorithm 1 so that the overall filter specification is kept for all $| d | \leq 20.13$.

Figure 10a shows the individual frequency spectrum of each internal stage for $d_{max} = 20.13$, and Figure 10b zooms in on the passband frequency region. Note that even though the maxflat stage has a bumpy frequency spectrum above the passband frequency ($F_{p}$), this is compensated by the last-stage equiripple filter (equir). Figure 11a also shows that the magnitude of the overall frequency spectrum of the delayed decimation filter is inside the required passband and stopband filter specifications, while Figure 11b,c show that the filter phase and magnitude responses are almost linear in the passband range.

The advantage of using a Samadi filter is that it allows one to change its group delay by changing some coefficients, i.e., without changing the whole filter structure. Figure 12 shows the group delay of this multi-stage filter for many values of its d parameter. It is easy to see how the group delay is directly proportional to the d parameter.

Table 3 shows the resources required to implement a DAS beamformer based on this three-stage delayed decimation filter designed for array specifications listed in Table 2, and Table 4 shows the breakdown of resources required per filter stage.

5. Results

Results from Table 3 are compared to other state-of-the-art DAS beamformer architectures (more details in [13]) in Table 5.

The pcm_multi architecture is the same as shown in Figure 1 but uses a multi-stage decimation filter structure for each channel. It has a higher beamformer storage requirement and more additions per second because of the parallel architecture for delaying and filtering.

The pcm_single_memsav architecture is also the same as shown in Figure 1 but uses a single-stage decimation filter with a memory-saving polyphase implementation [14] for each channel. This architecture has the lowest beamformer storage requirement because of the polyphase implementation. Conversely, it also has the most additions per second because more operations are performed at higher sampling rates before downsampling.

The pdm_multi architecture is the same as shown in Figure 13 but uses a multi-stage decimation filter structure at the output. It is the most efficient state-of-the-art architecture because only a single decimation filter is required, and the delaying operations require only a few bits per channel.

The pdm_single_memsav architecture is also the same as shown in Figure 13 but uses a single-stage decimation filter with a memory-saving polyphase implementation [14] at the output. It has a lower beamformer storage requirement because of the polyphase implementation, but, conversely, it also requires more additions per second because more operations are performed at higher sampling rates before downsampling.

Table 5 shows that, for the given specification and because of the shared resources for delaying and filtering, the proposed architecture (delayed_bf) requires about 19% lower computational resources (additions per second) and 52% lower storage (beamformer’s storage requirement) than the most efficient state-of-the-art architecture (pdm_multi).

It is also observed that the proposed architecture’s storage efficiency is ranked just after the pcm_single_memsav architecture. However, as the pcm_single_memsav architecture also requires a prohibitive quantity of computational resources (about 697% more), it can be concluded that the proposed beamformer based on delayed decimation filters is the most resource-efficient beamformer architecture for the given specification.

Finally, we see that, because it has the lowest computational resource requirement, in practical cases such as implementing the beamformer in a single-core/single-adder CPU, in a Field-Programmable Gate Array (FPGA) running at 64 MHz, or in an integrated circuit (VLSI) running at 10 MHz, the proposed architecture will be, in all cases, about 19% more efficient.

6. Conclusions

In this study, we proposed combining the decimation filters found in PDM-mics with the delay line required in the traditional DAS beamformer. This was achieved by designing a decimation filter that includes a stage realized with the Samadi filter structure, which allows its group delay to be altered easily by varying a single parameter.

We evaluated the proposed architecture by comparing it to other state-of-the-art DAS beamformer architectures. To facilitate the comparison, we established a set of filter specifications as a baseline for all decimation filter designs. These specifications were sufficient for various PDM-mics and speech-processing applications.

The designed filter demonstrated satisfactory performance, as exemplified in the frequency response and group delay plots. Furthermore, using a Samadi filter provided flexibility in adjusting the group delay without altering the overall filter structure.

Overall, the proposed architecture showed promising results in terms of filter design and resource requirements, providing the best trade-off between storage and computational resources. For the presented specification, it requires 52% lower storage resources and 19% lower computational resources than the most efficient state-of-the-art architecture. The findings support the feasibility and effectiveness of the proposed approach for beamforming applications, including, but not limited to, DAS beamformers.

Author Contributions

Conceptualization, S.J.C.I. and B.S.M.; methodology, S.J.C.I. and B.S.M.; software, S.J.C.I.; validation, S.J.C.I.; formal analysis, S.J.C.I. and B.S.M.; investigation, S.J.C.I.; resources, B.S.M.; data curation, S.J.C.I. and B.S.M.; writing – original draft preparation, S.J.C.I.; writing – review and editing, S.J.C.I. and B.S.M.; visualization, S.J.C.I.; supervision, B.S.M.; project administration, B.S.M.; funding acquisition, B.S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by São Paulo Research Foundation (FAPESP), grant #2017/08120-6.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

Symbols

$B_{N, K, d} (z)$	Samadi filter binomial component.
c	sound speed.
$D_{\min}$	minimum distance between microphones.
$δ_{p}^{j}$	jth-stage passband ripple.
$Δ_{m}$	delay from the array center to the mth microphone.
$d_{\max}$	maximum allowed delay parameter.
$Δ_{\max}$	maximum required delay.
$Δ$	overall filter delay.
$δ_{p}$	passband ripple.
$δ_{s}$	stopband ripple.
d	Samadi filter delay parameter.
$δ_{s}^{j}$	jth-stage stopband ripple.
$f_{cpu}$	estimated minimum frequency in a single-core/single-adder processor.
$f_{i}$	input sampling rate.
$f_{o}$	output sampling rate.
$F_{p}$	passband frequency.
$F_{s}$	stopband frequency.
G	beamformer gain.
$α$	group delay.
$H (e^{2 π i f / f_{i}})$	overall low-pass filter frequency response.
$H_{N, K, d} (z)$	Samadi filter transfer function.
$H (z)$	low-pass filter transfer function.
$S_{bf}^{z}$	beamformer’s storage requirement.
K	number of zeros at $z = - 1$ in a Samadi filter.
$L_{frame}$	frame length (for frequency domain implementations).
$L_{in}$	filter input length.
$L_{out}$	filter output length.
M	number of microphones.
N	Samadi filter order.
R	decimation factor.
$T_{FPGA}^{+}$	estimated number of adders in an FPGA running at 64 MHz.
$T_{lp}^{+}$	estimated number of adders in a VLSI circuit running at 10 MHz.
$S_{bf}^{+}$	beamformer’s number of additions per second.
$S_{bf}^{*}$	beamformer’s number of multiplications per second.
$S_{bf}^{o}$	beamformer’s total number of additions per second.
$U_{p}$	passband frequency range.
$U_{s}$	stopband frequency range.
$V_{p}$	passband angular frequency range.
$V_{p}^{j}$	jth-stage passband angular frequency range.
$V_{s}$	stopband angular frequency range.
$V_{s}^{j}$	jth-stage stopband angular frequency range.
$w_{m}$	mth-filter channel gain.
$w_{p}$	angular passband frequency.
$w_{s}$	angular stopband frequency.

Abbreviations

ΣΔM	sigma–delta modulator.
ADC	analog-to-digital converter.
APS	additions per second.
DAS	delay-and-sum.
DoA	direction of arrival.
FIR	Finite Impulse Response.
FPGA	Field-Programmable Gate Array.
IIR	Infinite Impulse Response.
IoT	internet of things.
IVA	intelligent virtual assistants.
LPF	low-pass filter.
MAXFLAT	maximally flat.
MEMS	micro-electro-mechanical system.
PCM	pulse-code modulated.
PDM	pulse-density modulated.
PDM-mic	PDM microphone.
SNR	signal-to-noise ratio.
THD	total harmonic distortion.
THD+N	total harmonic distortion plus noise.
VLSI	very large-scale integration.

References

1. Krim, H.; Viberg, M. Two decades of array signal processing research: The parametric approach. IEEE Signal Process. Mag. 1996, 13, 67–94.
2. Lawes, R. MEMS Cost Analysis: From Laboratory to Industry; Pan Stanford: Stanford, CA, USA, 2014.
3. Vardhini, P.H.; Makkena, M.L. Design and comparative analysis of on-chip sigma delta ADC for signal processing applications. Int. J. Speech Technol. 2021, 24, 401–407.
4. Johnson, D.H.; Dudgeon, D.E. Array Signal Processing: Concepts and Techniques; Simon & Schuster, Inc.: New York, NY, USA, 1992.
5. Milic, L. Multirate Filtering for Digital Signal Processing: MATLAB Applications; Premier Reference Source, Information Science Reference; IGI Global: Hershey, PA, USA, 2009.
6. Metzler, B. Audio Measurement Handbook; Audio Precision: Raleigh, NC, USA, 1993.
7. Samadi, S.; Nishihara, A.; Iwakura, H. Universal maximally flat lowpass FIR systems. IEEE Trans. Signal Process. 2000, 48, 1956–1964.
8. Herrmann, O. On the approximation problem in nonrecursive digital filter design. IEEE Trans. Circuit Theory 1971, 18, 411–413.
9. Rajagopal, L.; Roy, S.D. Design of maximally-flat FIR filters using the Bernstein polynomial. IEEE Trans. Circuits Syst. 1987, 34, 1587–1590.
10. Haddad, R. A class of orthogonal nonrecursive binomial filters. IEEE Trans. Audio Electroacoust. 1971, 19, 296–304.
11. Van Compernolle, D. Future Directions in Microphone Array Processing. In Microphone Arrays: Signal Processing Techniques and Applications; Springer: Berlin/Heidelberg, Germany, 2001; pp. 389–394.
12. McClellan, J.; Parks, T. A unified approach to the design of optimum FIR linear-phase digital filters. IEEE Trans. Circuit Theory 1973, 20, 697–701.
13. Carbajal Ipenza, S.J. Efficient Pulse-Density Modulated Microphone Array Processing. Master’s Thesis, Universidade Estadual de Campinas, Campinas, Brazil, 2020.
14. Fliege, N. Multirate Digital Signal Processing: Multirate Systems, Filter Banks, Wavelets; Wiley: New York, NY, USA, 1994.

Figure 1. PDM microphones’ DAS beamformers. Each PDM-mic requires a decimation filter with $H (z)$ frequency response and R downsampling. Then, each filter output $y_{m} [k]$ is delayed by a $Δ_{m}$ factor. Finally, all delayed signals are weighted (factor $w_{m}$) and summed together.
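
The signal flow in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the channels have already been decimated to PCM, restricts the delays to whole samples, and uses a hypothetical function name, `das_beamform`.

```python
def das_beamform(channels, delays, weights):
    """Delay-and-sum over already-decimated channels (illustrative only).

    channels: list of M equal-length sample lists, y_m[k]
    delays:   list of M non-negative integer delays in samples, Delta_m
    weights:  list of M channel gains, w_m
    """
    length = len(channels[0])
    out = [0.0] * length
    for y, delay, w in zip(channels, delays, weights):
        for k in range(delay, length):
            # Accumulate the weighted, delayed sample w_m * y_m[k - Delta_m]
            out[k] += w * y[k - delay]
    return out


# Two channels carrying the same impulse; it reaches ch1 one sample earlier,
# so ch1 is delayed by one sample to align it with ch0.
ch0 = [0.0, 0.0, 1.0, 0.0, 0.0]
ch1 = [0.0, 1.0, 0.0, 0.0, 0.0]
aligned = das_beamform([ch0, ch1], delays=[0, 1], weights=[1.0, 1.0])
```

With the delays matched to the arrival offsets, the two impulses add coherently at the same output index.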

Figure 2. Normalized power (polar plot) of a DAS beamformer for a uniform linear array of $M = 40$ microphones. Three audio sources of equal strength at 1 kHz, 3 kHz, and 5 kHz are located at 20, 60, and 110 degrees, respectively. The beamformer is placed on the X-axis; therefore, its directivity pattern is symmetric about this axis.

Figure 3. Generic decimation filter structure. In order to avoid aliasing, the input data $x [n]$ at $f_{i}$ sampling rate is low-pass filtered and then downsampled by R. If correctly filtered, the output data $y [n]$ at $f_{o}$ sampling rate contain the same information as $x [n]$ decimated by R.
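
As a rough illustration of Figure 3, the sketch below low-pass filters a sequence with an FIR filter and keeps every R-th output sample. The moving-average taps are a placeholder assumption, chosen only to keep the example short; they come nowhere near the specifications used in this article.

```python
def decimate(x, taps, R):
    """FIR low-pass filter x with `taps`, then keep every R-th sample."""
    y = []
    for n in range(0, len(x), R):      # evaluate only the outputs that are kept
        acc = 0.0
        for i, h in enumerate(taps):
            if n - i >= 0:
                acc += h * x[n - i]    # y[n] = sum_i h[i] * x[n - i]
        y.append(acc)
    return y


R = 4
taps = [1.0 / R] * R                   # placeholder moving-average low-pass filter
x = [1.0] * 16                         # constant (DC) input
y = decimate(x, taps, R)
```

Evaluating the filter only at the retained indices gives the same result as filtering every sample and then discarding all but every R-th one, while avoiding the discarded computations.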

Figure 4. Low-pass filter design parameters. The passband and stopband regions are defined by $F_{p}$ and $F_{s}$, respectively, and their respective ripples are defined by $δ_{p}$ and $δ_{s}$. The whole filter frequency response is constrained to the input sampling rate ($f_{i}$).

Figure 5. Normalized frequency spectra of linear-phase Samadi filters ($d = 0$) with $N = 9$ and $N = 12$: (a) magnitude, (b) phase, and (c) group delay. It is observed that, in the $d = 0$ case, $w_{p}$ changes linearly with L, that the phase is linear for both N values, and that the group delay is proportional to N.

Figure 6. Normalized frequency spectra of Samadi filters with $N = 10$ and $d \in \{- 5, \dots, 5\}$: (a) magnitude, (b) phase, and (c) group delay. It is observed that, for approximately $ω / π < 0.15$, the magnitude is flat, the phase is linear, and the group delay is proportional to d. For $ω / π \geq 0.15$, the frequency response is nonlinear in magnitude, phase, and group delay.

Figure 7. Minimum (a) N and (b) K values calculated using Algorithm 1 for $δ_{s} = - 80$ dB and different values of d and $w_{p}$. It is observed that $w_{p}$ and d have a negative correlation for a given N value, i.e., when $w_{p}$ increases, d decreases.

Figure 8. (a) Delayed decimation filter, (b) its version as a multi-stage decimation filter with stage $J - 1$ being a Samadi filter, and (c) its version with the Samadi filter decomposed into its binomial components. The Samadi filter stage is meant to control the overall filter delay (Δ), and the equiripple filter is meant to compensate for the non-linear response of the Samadi filter in its non-flat band. The optional Stages 1 to $J - 2$ are meant to compensate and downsample the overall frequency response.

Figure 9. PDM-mic array DAS beamformer using delayed decimation filters.

Figure 10. (a) Magnitude frequency spectrum of internal stages of the delayed decimation filter in the whole input range, and (b) the same frequency spectrum in the 0 kHz to 50 kHz range.

Figure 11. (a) Magnitude and (b) phase frequency spectrum of the delayed decimation filter. (c) Passband ripple frequency spectrum.

Figure 12. Delayed decimation filter group delay.

Figure 13. PDM microphones’ DAS beamformer in the PDM domain. Each PDM-mic output $x_{m} [n]$ is delayed by a $Δ_{m}$ factor, then all delayed signals are weighted (factor $w_{m}$) and summed together. Finally, the resulting sum is filtered and downsampled.

Table 1. Decimation filter specifications.

Parameter	Value
input sampling rate ( $f_{i}$ )	3072.0 kHz
output sampling rate ( $f_{o}$ )	16.0 kHz
passband frequency ( $F_{p}$ )	7.5 kHz
stopband frequency ( $F_{s}$ )	8.0 kHz
passband ripple ( $δ_{p}$ )	≤0.0116 (≤0.1 dB)
stopband ripple ( $δ_{s}$ )	≤0.0001 (≤−80.0 dB)
decimation factor (R)	192
filter input length ( $L_{in}$ )	1
filter output length ( $L_{out}$ )	24
phase response	linear or almost linear
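
The derived entries in Table 1 follow from the basic ones. The short check below, with variable names chosen here for illustration, recomputes the decimation factor as $f_{i} / f_{o}$ and converts the linear ripple bounds to decibels. It assumes the passband ripple is expressed as $20 \log_{10} (1 + δ_{p})$ and the stopband ripple as $20 \log_{10} (δ_{s})$, a convention that reproduces the tabulated dB values.

```python
import math

f_i = 3_072_000.0   # input sampling rate in Hz
f_o = 16_000.0      # output sampling rate in Hz
delta_p = 0.0116    # linear passband ripple bound
delta_s = 0.0001    # linear stopband ripple bound

R = f_i / f_o                                 # decimation factor
ripple_p_db = 20 * math.log10(1 + delta_p)    # passband ripple in dB
ripple_s_db = 20 * math.log10(delta_s)        # stopband level in dB

print(R, round(ripple_p_db, 2), round(ripple_s_db, 1))
```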

Table 2. Microphone array specifications.

Parameter	Value
number of microphones (M)	40 ( $5 \times 8$ )
minimum distance between microphones ( $D_{min}$ )	22.0 mm
array dimensions	110.0 mm × 176.0 mm
maximum required delay ( $Δ_{max}$ )	314.47 μs
mth-filter channel gain ( $w_{m}$ )	1
frame length (for frequency domain implementations) ( $L_{frame}$ )	4.0 ms

Table 3. Required resources to implement a beamformer using 40 shared delayed decimation filters.

Parameter	Value	Unit
beamformer’s storage requirement ( $S_{bf}^{z}$ )	39,478	bit
beamformer’s number of multiplications per second ( $S_{bf}^{*}$ )	$6.9624 \times 10^{8}$	MPS
beamformer’s number of additions per second ( $S_{bf}^{+}$ )	$8.81696 \times 10^{8}$	APS
beamformer’s total number of additions per second ( $S_{bf}^{o}$ )	$2.45858 \times 10^{9}$	APS
estimated minimum frequency in a single-core/single-adder processor ( $f_{cpu}$ )	2458.58	MHz
estimated number of adders in an FPGA running at 64 MHz ( $T_{FPGA}^{+}$ )	39	-
estimated number of adders in a VLSI circuit running at 10 MHz ( $T_{lp}^{+}$ )	246	-
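
The last three rows of Table 3 appear to follow directly from the total number of additions per second. The sketch below is an inference from the tabulated values rather than a formula quoted from the text: it assumes a single adder completes one addition per clock cycle, so the minimum single-adder clock equals $S_{bf}^{o}$ in Hz and the adder count at a given clock is the ceiling of the ratio. The function name is hypothetical.

```python
import math


def adders_needed(total_aps, clock_hz):
    """Adders required if each adder performs one addition per clock cycle."""
    return math.ceil(total_aps / clock_hz)


S_o = 2.45858e9                      # total additions per second (Table 3)
f_cpu_mhz = S_o / 1e6                # single-adder clock in MHz
t_fpga = adders_needed(S_o, 64e6)    # FPGA clocked at 64 MHz
t_lp = adders_needed(S_o, 10e6)      # VLSI circuit clocked at 10 MHz
```

Under this assumption the computed values agree with the tabulated 2458.58 MHz, 39 adders, and 246 adders.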

Table 4. Delayed decimation filter resource requirements breakdown. The first row corresponds to the Lth-band filter stage, the second and third rows to the $B_{N, K, d} (z)$ and $A_{n} (z)$ parts of the Samadi filter, respectively, and the last row to the equiripple filter.

Stage	$S_{bf}^{z}$ (bit)	$S_{bf}^{*}$ (MPS)	$S_{bf}^{+}$ (APS)	$S_{bf}^{o}$ (APS)	$f_{cpu}$ (MHz)	$T_{FPGA}^{+}$	$T_{lp}^{+}$
lthband	138	15,680,000	15,616,000	15,616,000	15.62	1	2
maxflat — $B_{N, K, d} (z)$	552	1,536,000	6,144,000	39,936,000	39.94	1	4
maxflat — $A_{n} (z)$	2714	0	3,776,000	3,776,000	3.78	1	1
equir	9164	5,040,000	5,024,000	171,344,000	171.34	3	18

Table 5. Comparison of the proposed beamformer architecture based on the delayed decimation filter (delayed_bf) with other state-of-the-art architectures implementing a DAS beamformer as specified in Table 1 and Table 2. All percentages are relative to the corresponding value for the pdm_multi architecture, the most efficient state-of-the-art architecture found for the given specification [13].

DAS Beamformer Architecture	Beamformer’s Storage Requirement ( $S_{bf}^{z}$ ) in Bit $\times 10^{3}$	Beamformer’s Total Number of Additions per Second ( $S_{bf}^{o}$ ) in APS $\times 10^{8}$	Estimated Minimum Frequency in a Single-Core/Single-Adder Processor ( $f_{cpu}$ ) in MHz	Estimated Number of Adders in an FPGA Running at 64 MHz ( $T_{FPGA}^{+}$ )	Estimated Number of Adders in a VLSI Circuit Running at 10 MHz ( $T_{lp}^{+}$ )
Using delayed decimation filter (delayed_bf)	39.5 (−52%)	24.6 (−19%)	2458.58 (−19%)	39 (−19%)	246 (−20%)
Using a multi-stage decimation filter (pcm_multi)	210.0 (+155%)	41.8 (+37%)	4184.94 (+37%)	66 (+37%)	419 (+37%)
Using a single-stage memory saving decimation filter (pcm_single_memsav)	27.4 (−67%)	243.0 (+697%)	24,303.98 (+697%)	380 (+692%)	2431 (+694%)
Using a multi-stage decimation filter at PDM domain (pdm_multi)	82.3 (0%)	30.5 (0%)	3050.29 (0%)	48 (0%)	306 (0%)
Using a single-stage memory saving decimation filter at PDM domain (pdm_single_memsav)	77.8 (−6%)	35.5 (+16%)	3553.26 (+16%)	56 (+17%)	356 (+16%)
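
The percentages in Table 5 can be reproduced as relative changes with respect to the pdm_multi baseline. The sketch below does this for the $f_{cpu}$ column; the dictionary layout and function name are illustrative choices, and small differences may arise in other columns because the table shows rounded values.

```python
def relative_change_percent(value, baseline):
    """Signed percentage difference of `value` with respect to `baseline`."""
    return 100.0 * (value - baseline) / baseline


# f_cpu in MHz for each architecture, copied from Table 5
f_cpu = {
    "delayed_bf": 2458.58,
    "pcm_multi": 4184.94,
    "pcm_single_memsav": 24303.98,
    "pdm_multi": 3050.29,
    "pdm_single_memsav": 3553.26,
}

baseline = f_cpu["pdm_multi"]
changes = {name: round(relative_change_percent(value, baseline))
           for name, value in f_cpu.items()}
```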

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Carbajal Ipenza, S.J.; Masiero, B.S. Efficient Sigma–Delta Sensor Array Beamforming. Sensors 2023, 23, 7577. https://doi.org/10.3390/s23177577

AMA Style

Carbajal Ipenza SJ, Masiero BS. Efficient Sigma–Delta Sensor Array Beamforming. Sensors. 2023; 23(17):7577. https://doi.org/10.3390/s23177577

Chicago/Turabian Style

Carbajal Ipenza, Sammy Johnatan, and Bruno Sanches Masiero. 2023. "Efficient Sigma–Delta Sensor Array Beamforming" Sensors 23, no. 17: 7577. https://doi.org/10.3390/s23177577
