Underdetermined Blind Source Separation Method Based on a Two-Stage Single-Source Point Screening

Zhu, Zhanyu; Chen, Xingjie; Lv, Zhaomin

doi:10.3390/electronics12102185

by Zhanyu Zhu, Xingjie Chen * and Zhaomin Lv

School of Urban Rail Transportation, Shanghai University of Engineering Science, Shanghai 201620, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(10), 2185; https://doi.org/10.3390/electronics12102185

Submission received: 8 March 2023 / Revised: 30 April 2023 / Accepted: 8 May 2023 / Published: 10 May 2023

Abstract

Underdetermined blind source separation is a signal processing technique that is more suitable for practical applications and aims to separate the source signals from the mixed signals. The mixing matrix estimation is a major step in the underdetermined blind source separation. Since the current methods for estimating the mixing matrix have the disadvantages of insufficient accuracy or weak noise immunity, a two-stage single-source point screening that combines the cosine angle algorithm and the L1-norm optimization algorithm is proposed. During the first stage, the first-stage single-source points are extracted from the original mixed signals using the cosine angle algorithm. During the second stage, based on the L1-norm optimization algorithm, the reference single-source points are extracted from the original mixed signals. The reference single-source points are then clustered to obtain the clustering center, which is defined as the reference center. In combination with the reference center, the deviation and interference points in the first-stage single-source points are eliminated by the cosine distance. The remaining signal points are considered as the second-stage single-source points, which are clustered to obtain the mixing matrix estimation. Experiments on simulated and speech signals show that the proposed method can obtain more accurate and robust mixing matrix estimation, leading to better separation of the source signals.

Keywords:

underdetermined blind source separation; single-source point screening; cosine angle; L1-norm optimization

1. Introduction

Blind source separation (BSS) is a signal processing technique that separates the source signals from the mixed signals based on the statistical properties of the signals [1]. It does not require a priori knowledge of the source signals or their transmission paths, providing a powerful tool for recovering source signal information from complex mixed signals. This technology is now widely used in fields such as audio signal recognition [2], biomedical signal processing [3], and wireless communication systems [4,5,6]. Independent component analysis (ICA) and natural gradient (NG) are classical algorithms that were applied to BSS early on [7,8,9,10]. These algorithms are designed under the condition that the number of sensors is greater than or equal to the number of source signals. However, in practical applications, the number of sensors is often less than the number of source signals. For example, the installation space may be too small to mount enough acquisition sensors; sensors may be damaged so that data cannot be acquired; or the acquired data may be subject to such strong interference that the samples cannot be used. Under such conditions, algorithms such as ICA and NG are no longer applicable.

The case in which the number of sensors is less than the number of source signals is called underdetermined blind source separation (UBSS). Non-negative matrix factorization (NMF) [11,12], mode decomposition [13], and sparse component analysis (SCA) [14,15,16] are the three main methods for UBSS. Both NMF and mode decomposition have limited application conditions, while SCA is widely used for UBSS problems due to its flexibility. To process UBSS with SCA, the signals need to be sufficiently sparse, which means that the source signals take non-zero values at only a few moments in the mixed signals [17]. Most signals are not sufficiently sparse in the time domain but are sparser in the frequency domain. Therefore, the signals are usually transformed from the time domain to the frequency domain to increase sparsity [18]. Typical time–frequency domain transformations include the Wigner–Ville distribution (WVD) [19], the wavelet packet transform (WPT) [20], and the short-time Fourier transform (STFT) [21,22]. Owing to its high computational efficiency, the STFT is more widely used.

For signals with sufficient sparsity, a widely used procedure consists of two main steps. The first step is the estimation of the mixing matrix, and the second step is the separation of the source signals [23]. The accuracy of the mixing matrix estimation directly impacts the quality of source signal separation. Therefore, it is crucial to achieve a highly accurate estimation of the mixing matrix [24,25,26,27]. The potential function analysis method and the single-source point screening method are the two most common methods used to estimate the mixing matrix. Bofill and Zibulevsky [23] proposed the potential function analysis method to estimate the mixing matrix based on geometric information. Linh et al. [28] improved the vector classification to estimate the mixing matrix on this basis. The potential function analysis method is complex and has limited applicability. In contrast, the single-source point screening method has received attention for its wide applicability. The single-source point screening method was proposed by Aissa et al. [29] on the basis of the degenerate unmixing estimation technique (DUET) [30] and subspace theory. If one and only one source signal dominates at a moment in time while the other source signals take values close to zero, the signal point obtained at this moment is considered a single-source point (SSP) [31]. Abrard and Deville [32] extended the idea of estimating the mixing matrix based on SSPs with the time–frequency ratio of mixtures (TIFROM) method, which screens the SSPs using the property that the time–frequency ratio of the signals is constant. Reju et al. [33] proposed screening SSPs with the cosine angle algorithm, which can quickly estimate the mixing matrix. The above methods are efficient and fast in screening, but they are sensitive to noise. Under noise interference, the effectiveness of these single-source point screening methods degrades severely, which leads to a decrease in the accuracy of the mixing matrix estimation. To improve the resistance of single-source point screening to noise interference, Guo et al. [34] proposed a method based on signal-independent distribution to decrease the influence of signal cross terms, which reduced the interference of noise. Sun et al. [35] and Li et al. [36] used the Hough transform and the probability density distribution to correct the screened SSPs, which effectively removed some of the noise values. Zhen et al. [37] proposed a single-source point screening method based on the L1-norm optimization algorithm. Based on the one-dimensional subspace of vectors, the objective function was optimized to improve the noise resistance of the method. Although the above methods have better resistance to noise interference, they also tend to remove more signal points. As a result, necessary SSPs may be lost, which affects the accuracy of the mixing matrix estimation.

Accurate extraction of SSPs is crucial for improving the accuracy of the mixing matrix estimation. However, most current single-source point screening methods screen only once, and each of them has certain disadvantages. Some methods have better resistance to noise interference but easily lose necessary SSPs. Other methods screen more accurately but have a weak ability to resist noise interference.

To address the above problems, this paper proposes a two-stage single-source point screening method that combines the cosine angle algorithm and the L1-norm optimization algorithm. Compared with the one-time single-source point screening methods, the proposed method not only guarantees the accuracy of the extracted SSPs but also enhances the ability to resist noise interference. As a result, it yields a more accurate and robust mixing matrix estimation. In the first stage, the original mixed signals are screened for SSPs based on the cosine angle algorithm. The extracted SSPs are considered the first-stage single-source points. In the second stage, based on the L1-norm optimization algorithm, SSPs are extracted from the original mixed signals as reference single-source points. The reference single-source points are then clustered, and the resulting cluster center is taken as the reference center. In combination with the reference center, the deviation and interference points in the first-stage single-source points are eliminated by the cosine distance. The remaining signal points are considered the second-stage single-source points, which are clustered to obtain the mixing matrix estimation. Experiments on simulated and speech signals show that the proposed method can obtain a more accurate and robust mixing matrix estimation, leading to better separation of the source signals.

The remainder of this paper is organized as follows. Section 2 introduces the fundamental theory of UBSS. In Section 3, the two-stage single-source point screening method for estimating the mixing matrix is presented. In Section 4, the effectiveness of the proposed method is verified through a comparative analysis on simulated signals and application experiments on real speech signals. In Section 5, the conclusions of this paper are provided.

2. Theory of Underdetermined Blind Source Separation

BSS is the process of separating the source signals from the observed mixed signals. For the linear instantaneous mixing case, the model can be written as Equation (1).

$$x(t) = A s(t) + n(t) \tag{1}$$

where $x(t) = [x_{1}(t), x_{2}(t), \dots, x_{m}(t)]^{T}$ represents the $m$-dimensional mixed signals collected by the sensors, $s(t) = [s_{1}(t), s_{2}(t), \dots, s_{n}(t)]^{T}$ represents the $n$-dimensional source signals, $n(t) = [n_{1}(t), n_{2}(t), \dots, n_{m}(t)]^{T}$ represents the noise, $A = [a_{1}, a_{2}, \dots, a_{n}] \in \mathbb{R}^{m \times n}$ represents the mixing matrix, and $a_{i}$ is the $i$-th column vector of the mixing matrix. UBSS is the case in which there are fewer mixed signals than source signals ($m < n$).

SCA is a widely used method for dealing with UBSS. The six main steps for solving UBSS with SCA are as follows, and Figure 1 shows the corresponding flow chart. The two major steps are the estimation of the mixing matrix and the separation of the source signals.

Step 1: The STFT transforms the time-domain data of the mixed signals $x(t)$ into frequency-domain data $x(t, f)$.

Step 2: The data $x(t, f)$ are preprocessed and simplified to obtain $\tilde{x}(t, f)$.

Step 3: Single-source point screening is performed on $\tilde{x}(t, f)$ to extract the exact SSPs.

Step 4: The extracted SSPs are normalized and clustered to obtain the estimated mixing matrix $\tilde{A}$.

Step 5: The frequency-domain data of the source signals $\tilde{s}(t, f)$ are recovered with the signal separation and reconstruction algorithm and the estimated mixing matrix.

Step 6: The time-domain source signals $\tilde{s}(t)$ are obtained by the inverse STFT.

2.1. Frequency-Domain Transformation

To better represent the sparsity of the mixed signals, the time-domain data of the mixed signals are transformed into frequency-domain data [21]. This paper uses the STFT to perform this transformation, and the frequency-domain model of the mixed signals is shown in Equations (2) and (3). The noise is omitted from the formulas to simplify the description of the theory.

$$x(t, f) = \begin{bmatrix} x_{1}(t, f) \\ x_{2}(t, f) \\ \vdots \\ x_{m}(t, f) \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} s_{1}(t, f) \\ s_{2}(t, f) \\ \vdots \\ s_{n}(t, f) \end{bmatrix} = A s(t, f) \tag{2}$$

$$x(t, f) = a_{1} s_{1}(t, f) + a_{2} s_{2}(t, f) + \dots + a_{n} s_{n}(t, f) = \sum_{i = 1}^{n} a_{i} s_{i}(t, f) \tag{3}$$

where $x(t, f)$ represents the complex vector of the mixed signals in the frequency domain and $s(t, f)$ represents the complex vector of the source signals in the frequency domain.

2.2. Estimation of Mixing Matrix

2.2.1. Data Preprocessing

In the frequency domain, mixed signals may contain some noisy data with insignificant characteristics. These useless noise data can reduce computational efficiency. To eliminate these useless data, an energy value detection is required, as described in Equation (4). Firstly, the energy value of each data point is computed and the results are sorted in descending order. Then, based on a predetermined threshold, only data points with higher energy values are retained, while the remaining ones are discarded [38].

$$\| x(t, f) \| > \varepsilon \max \left( \| x(t_{1}, f_{1}) \|, \| x(t_{2}, f_{2}) \|, \dots, \| x(t_{T}, f_{K}) \| \right) \tag{4}$$

where $\| \cdot \|$ denotes the Euclidean norm, $x(t_{T}, f_{K})$ denotes the signal point at the $T$-th time frame and the $K$-th frequency bin, and $\varepsilon$ is the judgment threshold, which is chosen as a very small value, generally 0.01.

By removing the useless data, the frequency-domain data of the mixed signals are improved, enabling a more focused analysis of the retained data points. This process not only enhances the clarity of the data, but also significantly reduces the computational complexity of the analysis.
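The energy detection of Equation (4) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `energy_mask` and the layout of the data (one signal point per column) are assumptions made here, not part of the original method description.

```python
import numpy as np

def energy_mask(X, eps=0.01):
    """Return a boolean mask of the signal points kept by Equation (4).

    X   : array of shape (m, P), one time-frequency point per column
          (assumed layout; may be real or complex).
    eps : judgment threshold, 0.01 as suggested in the text.
    """
    norms = np.linalg.norm(X, axis=0)   # Euclidean norm of every point
    return norms > eps * norms.max()    # keep only high-energy points

# Three points: strong, negligible, moderate energy.
X = np.array([[1.0, 0.001, 0.5],
              [0.0, 0.0,   0.5]])
kept = X[:, energy_mask(X)]             # drops the negligible second point
```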

2.2.2. Single-Source Point Screening

If one source signal dominates at a moment of the mixed signals and the other source signals are zero or close to zero, the point of the mixed signals at that moment is considered an SSP. Assuming that only one source signal $s_{i}(t, f)$ is dominant at some time, Equation (3) can be expressed as Equation (5) according to the frequency-domain model of the mixed signals.

$$x(t, f) = a_{1} \times 0 + \dots + a_{i} s_{i}(t, f) + \dots + a_{n} \times 0 = a_{i} s_{i}(t, f) \tag{5}$$

From Equation (5), the SSPs aggregate along certain linear directions in space, which are the directions of the column vectors $a_{i}$ of the mixing matrix $A$. These SSPs can then be clustered to obtain the estimation of the mixing matrix. Therefore, extracting valid SSPs is crucial for estimating the mixing matrix. The single-source point screening algorithms are described as follows.

If a signal point $x(t_{i}, f_{i})$ is an SSP, the real and imaginary parts of its frequency-domain vector must have the same direction [33]. However, in practical applications, requiring exactly the same direction is often too restrictive, so the constraint is relaxed: a signal point is considered an SSP if the angle between the real and imaginary parts of its frequency-domain vector is very small. Single-source point screening can therefore be accomplished by calculating the cosine of the angle between the real and imaginary parts of the frequency-domain vector of each signal point. If the absolute value of this cosine is greater than a given threshold $\cos(\theta)$, that is, if the angle is smaller than $\theta$, the signal point is considered an SSP, as shown in Equation (6).

$$\left| \frac{\mathrm{Re}(x(t_{i}, f_{i}))^{T} \, \mathrm{Im}(x(t_{i}, f_{i}))}{\| \mathrm{Re}(x(t_{i}, f_{i})) \| \, \| \mathrm{Im}(x(t_{i}, f_{i})) \|} \right| > \cos(\theta) \tag{6}$$

where $\mathrm{Re}(\cdot)$ represents the real part of the frequency-domain vector, $\mathrm{Im}(\cdot)$ represents the imaginary part of the frequency-domain vector, $| \cdot |$ represents the absolute value, and $\theta$ is a very small angle that generally approaches 0.

In this paper, the method of single-source point screening with the cosine angle algorithm is referred to as the cosine-SSP method.
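A minimal sketch of cosine-angle screening is given below. It implements the small-angle reading of the criterion: a point is kept when the angle between the real and imaginary parts of its vector is below $\theta$, which is equivalent to the absolute cosine being above $\cos(\theta)$. The function name `cosine_ssp_mask`, the column-wise data layout, and the default of 1 degree are assumptions chosen for illustration.

```python
import numpy as np

def cosine_ssp_mask(X, theta_deg=1.0):
    """Boolean mask of columns of X whose real and imaginary parts are
    nearly parallel (angle below theta_deg).

    X : complex array of shape (m, P), one time-frequency point per column.
    """
    R, I = X.real, X.imag
    num = np.abs(np.sum(R * I, axis=0))                       # |Re^T Im|
    den = np.linalg.norm(R, axis=0) * np.linalg.norm(I, axis=0)
    # Points with a zero real or imaginary part get cosine 0 and are rejected,
    # because their angle is undefined (a choice made for this sketch).
    cos_val = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return cos_val > np.cos(np.deg2rad(theta_deg))

a = np.array([0.8, -0.6])                   # hypothetical mixing direction
X = np.stack([a * (1 + 2j),                 # single source: Re and Im parallel
              np.array([1 + 0j, 0 + 1j]),   # Re and Im orthogonal
              np.array([1 + 0j, 1 + 0j])],  # purely real point
             axis=1)
mask = cosine_ssp_mask(X)
```

Only the first column is flagged as an SSP in this toy example, since only there do the real and imaginary parts share a direction.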

If the signal points $x(t_{1}, f_{1})$ and $x(t_{2}, f_{2})$ are both SSPs of the same source signal $s_{i}(t, f)$, Equation (7) is satisfied at both signal points.

$$\begin{cases} x(t_{1}, f_{1}) = a_{i} s_{i}(t_{1}, f_{1}) \\ x(t_{2}, f_{2}) = a_{i} s_{i}(t_{2}, f_{2}) \end{cases} \tag{7}$$

Then the SSPs $x(t_{1}, f_{1})$ and $x(t_{2}, f_{2})$ must be linearly representable by each other [37], as in Equation (8).

$$x(t_{1}, f_{1}) = \alpha \, x(t_{2}, f_{2}) \tag{8}$$

where $\alpha$ represents a scalar factor.

The single-source point screening problem is thus transformed into determining whether each signal point has a linear relationship with other signal points. Equation (8) can be extended to a general conclusion, which is expressed by Equation (9).

$$x(t_{I}, f_{J}) = b \, x(t, f) \tag{9}$$

where $x(t_{I}, f_{J})$ represents the signal point at the $I$-th time frame and the $J$-th frequency bin, $x(t, f)$ represents the collection of all signal points, and $b = [b_{1}, b_{2}, \dots, b_{m}]$ represents the linear coefficient vector.

If exactly one element of $b$ is non-zero and all other elements are zero, the signal point $x(t_{I}, f_{J})$ is considered an SSP. The vector $b$ can be calculated by L1-Homotopy [39,40], which is based on the L1-norm optimization algorithm.

In this paper, the method of single-source point screening with the L1-norm optimization algorithm is referred to as the L1-SSP method.
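To make the sparsity test on $b$ concrete, the sketch below solves the L1-norm minimization exactly for a tiny problem by enumerating candidate supports. This is not the L1-Homotopy solver cited in the text; it is a brute-force stand-in that is only practical for a handful of points. It also assumes that Equation (9) is read as $y = D b$, where the columns of a hypothetical dictionary `D` are the other signal points. All function names are illustrative.

```python
import itertools
import numpy as np

def l1_min_bruteforce(D, y, tol=1e-9):
    """Minimum-L1 solution of D @ b = y for a small dictionary D (m, p).

    A linear program attains its optimum at a basic solution with at most
    m non-zero entries, so every support of size <= m is tried.
    """
    m, p = D.shape
    best_b, best_l1 = None, np.inf
    for k in range(1, m + 1):
        for support in itertools.combinations(range(p), k):
            cols = D[:, list(support)]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            if np.linalg.norm(cols @ coef - y) > tol:
                continue                      # this support cannot represent y
            l1 = np.abs(coef).sum()
            if l1 < best_l1 - 1e-12:          # strictly better candidates only
                best_l1 = l1
                best_b = np.zeros(p)
                best_b[list(support)] = coef
    return best_b

def is_ssp_by_l1(D, y, zero_tol=1e-6):
    """True if the minimum-L1 coefficient vector has exactly one non-zero entry."""
    b = l1_min_bruteforce(D, y)
    return b is not None and np.count_nonzero(np.abs(b) > zero_tol) == 1

# Unit-norm dictionary columns; the third one points along (0.6, 0.8).
D = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])
on_line = np.array([1.2, 1.6])    # a multiple of the third column
off_line = np.array([1.0, 1.0])   # needs two columns to be represented
```

With unit-norm columns, a point that is collinear with one dictionary column is represented most cheaply by that single column, which is why the one-non-zero test singles it out.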

2.3. Separation of Source Signal

In terms of mathematical model, the UBSS model shares similarities with the compressed sensing model. The problem of source signal separation can be solved by applying the theory of compressed sensing to separate and reconstruct the source signal. The theoretical model of compressed sensing is expressed in Equation (10).

$$x(t) = \Phi \Psi h(t), \qquad \begin{cases} x(t) = \Phi s(t) \\ s(t) = \Psi h(t) \end{cases} \tag{10}$$

where $x(t) = [x_{1}(t), x_{2}(t), \dots, x_{m}(t)]^{T}$ represents the mixed signals, $s(t) = [s_{1}(t), s_{2}(t), \dots, s_{n}(t)]^{T}$ represents the source signals, $h(t) = [h_{1}(t), h_{2}(t), \dots, h_{n}(t)]^{T}$ represents the sparse coefficients of the source signals, $\Phi$ represents the observation matrix, and $\Psi$ represents the sparse basis matrix.

The estimated mixing matrix $\tilde{A}$ serves as the observation matrix $\Phi$, and the sparse basis matrix $\Psi$ is a fixed matrix. With the compressed sensing model, the sparse coefficients of the source signals $h(t)$ can be obtained with the orthogonal matching pursuit algorithm. The source signals are then separated using the sparse coefficients $h(t)$ and the sparse basis matrix $\Psi$.
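The recovery step can be illustrated with a generic orthogonal matching pursuit (OMP) routine. The sketch below is a textbook-style OMP on real-valued data and is not claimed to match the authors' implementation. It assumes an identity sparse basis ($\Psi = I$, so that $s = h$), and the function name `omp` and its stopping rule are choices made here.

```python
import numpy as np

def omp(Phi, x, n_nonzero, tol=1e-12):
    """Greedy sparse solution h of Phi @ h = x with at most n_nonzero entries.

    Phi : observation matrix of shape (m, n), ideally with unit-norm columns.
    x   : observation vector of shape (m,).
    """
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(Phi.T @ residual)       # correlation with each atom
        if support:
            corr[support] = 0.0               # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(Phi[:, support], x, rcond=None)
        residual = x - Phi[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    h = np.zeros(Phi.shape[1])
    h[support] = coef
    return h

# Observation matrix: the mixing matrix of Equation (16) used as Phi.
Phi = np.array([[ 0.8294, 0.6330,  0.7370, 0.7944],
                [-0.5586, 0.7741, -0.6759, 0.6074]])
h_true = np.array([0.0, 0.0, 1.5, 0.0])       # only the third source is active
h_est = omp(Phi, Phi @ h_true, n_nonzero=1)
```

Because only one source is active in this toy observation, a single OMP iteration already selects the correct column and recovers its coefficient.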

3. The Proposed Two-Stage Single-Source Point Screening Method

The estimation accuracy of the mixing matrix is crucial to the results of source signal separation. Extracting the exact SSPs is a prerequisite for improving the accuracy of the mixing matrix estimation. To improve the precision and robustness of the single-source point screening method, this paper proposes a two-stage single-source point screening method that combines the cosine angle algorithm and the L1-norm optimization algorithm.

3.1. The First Stage

The single-source point screening in the first stage extracts the major SSPs. Based on the cosine angle algorithm in Equation (6), the major SSPs are extracted from the frequency-domain signal points of the mixed signals. The extracted points are defined as the first-stage single-source points.

To improve the clustering characteristics of the SSPs and minimize differences in the data distribution, the data are normalized and mapped onto the unit hemisphere, as in Equation (11).

$$\bar{x}(t, f) = \begin{cases} \dfrac{\tilde{x}(t, f)}{\| \tilde{x}(t, f) \|}, & \tilde{x}_{1}(t, f) \geq 0 \\[2ex] \dfrac{- \tilde{x}(t, f)}{\| \tilde{x}(t, f) \|}, & \tilde{x}_{1}(t, f) < 0 \end{cases} \tag{11}$$

where $\tilde{x}(t, f) = [\tilde{x}_{1}(t, f), \tilde{x}_{2}(t, f), \dots, \tilde{x}_{m}(t, f)]^{T}$ represents the data of an SSP.
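The normalization of Equation (11) can be sketched as follows. The sketch assumes real-valued points with non-zero norm stored one per column, since the sign test on the first component is only meaningful for real data. The function name `normalize_to_hemisphere` is illustrative.

```python
import numpy as np

def normalize_to_hemisphere(X):
    """Scale every column of X to unit norm and flip those whose first
    component is negative, as in Equation (11).

    X : real array of shape (m, P) with non-zero columns (assumption).
    """
    unit = X / np.linalg.norm(X, axis=0)
    sign = np.where(unit[0] >= 0, 1.0, -1.0)   # +1 keeps, -1 mirrors the point
    return unit * sign

X = np.array([[3.0, -3.0],
              [4.0,  4.0]])
X_bar = normalize_to_hemisphere(X)   # columns (0.6, 0.8) and (0.6, -0.8)
```

After this mapping, points that lie on the same line through the origin but on opposite sides coincide, which makes the direction clusters easier to separate.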

Due to the limitation of the method based on the cosine angle algorithm, the first-stage single-source points have lots of deviation points and interference points, especially when disturbed by strong noise. If these points are not removed, it significantly deteriorates the accuracy of the mixing matrix estimation. For this reason, this paper proposes the second-stage single-source point screening to solve the problem.

3.2. The Second Stage

The single-source point screening in the second stage removes deviation points and interference points. Initially, based on the L1-norm optimization algorithm, a few SSPs are extracted from the original mixed signals and defined as reference single-source points. Next, the reference single-source points are clustered and the clustering center is taken as the reference center. Finally, the first-stage single-source points are screened again in combination with the reference center to eliminate deviation points and interference points.

From Equation (8), the SSPs of the same source signal can be expressed linearly by each other. A frequency-domain vector can be divided into real and imaginary parts, and both parts must satisfy Equation (8) at the same time. Therefore, the complex vectors $x(t_{1}, f_{1})$ and $x(t_{2}, f_{2})$ can be linearly represented by each other as in Equation (12).

$$\begin{cases} \mathrm{Re}(x(t_{1}, f_{1})) = \alpha \, \mathrm{Re}(x(t_{2}, f_{2})) \\ \mathrm{Im}(x(t_{1}, f_{1})) = \alpha \, \mathrm{Im}(x(t_{2}, f_{2})) \end{cases} \tag{12}$$

Extending Equation (12) to a general conclusion gives Equation (13).

$$\begin{cases} \mathrm{Re}(x(t_{I}, f_{J})) = b \, \mathrm{Re}(x(t, f)) \\ \mathrm{Im}(x(t_{I}, f_{J})) = b \, \mathrm{Im}(x(t, f)) \end{cases} \tag{13}$$

The results are not accurate if SSPs are extracted based on the real part data only, and the same holds for the imaginary part. Therefore, based on the L1-norm optimization algorithm, the real and imaginary parts of the frequency-domain vector data are screened for SSPs separately. To minimize the impact of noise and calculation errors, the signal points whose real and imaginary data satisfy the judgment condition at the same time are extracted as reference single-source points $\hat{x}(t, f)$. As shown in Figure 2, the intersection of the subset of real-part signal points and the subset of imaginary-part signal points is extracted as the reference single-source points, and the signal points outside the intersection are removed.
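The intersection step can be written as a logical AND of two screening results. In this sketch, `real_mask` and `imag_mask` are hypothetical boolean arrays marking which signal points passed the L1-based test on the real part and on the imaginary part, respectively; how those masks are produced is outside the sketch.

```python
import numpy as np

def reference_ssp_indices(real_mask, imag_mask):
    """Indices of points accepted by both the real-part and the
    imaginary-part screening (the intersection shown in Figure 2)."""
    real_mask = np.asarray(real_mask, dtype=bool)
    imag_mask = np.asarray(imag_mask, dtype=bool)
    return np.flatnonzero(real_mask & imag_mask)

real_mask = [True, True, False, True, False]   # hypothetical screening results
imag_mask = [True, False, False, True, True]
ref_idx = reference_ssp_indices(real_mask, imag_mask)
```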

Then, the extracted reference single-source points are normalized and clustered to determine the clustering center, which is taken as the reference center $C_{1}$. The screening process for reference single-source points is highly rigorous, so the number of extracted signal points is often limited. As a result, there is a higher risk of losing some SSPs, which can reduce the screening accuracy. The precision therefore cannot be guaranteed if the reference center $C_{1}$ is used directly as the estimation of the mixing matrix.

After the reference center is obtained, the deviation points and interference points in the first-stage single-source points are eliminated according to the cosine distance. The cosine distance between each first-stage single-source point and the reference center $C_{1}$ is calculated. The signal points that do not satisfy the condition in Equation (14) are removed as deviation points and interference points, and the remaining signal points are extracted as the second-stage single-source points.

$$\mathrm{Distance} = 1 - \frac{d_{i} \cdot d_{j}}{\| d_{i} \| \, \| d_{j} \|} < \sigma \tag{14}$$

where $d_{i}$ represents the vector of the reference center and $d_{j}$ represents the vector of a first-stage single-source point. The threshold value $\sigma$ is generally close to 0.001. The smaller the threshold is, the more data points are removed.
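A minimal sketch of the cosine-distance screening in Equation (14) follows. It assumes that the reference center holds one direction per source (one column each) and that a first-stage point is kept when its distance to the nearest reference direction is below $\sigma$. That nearest-center rule, the function name `second_stage_mask`, and the use of real-valued, hemisphere-mapped data are assumptions of this sketch.

```python
import numpy as np

def second_stage_mask(points, centers, sigma=1e-3):
    """Keep first-stage points whose cosine distance to the closest
    reference direction is below sigma.

    points  : real array (m, P), first-stage single-source points.
    centers : real array (m, n), reference center, one column per source.
    """
    P = points / np.linalg.norm(points, axis=0)
    C = centers / np.linalg.norm(centers, axis=0)
    dist = 1.0 - C.T @ P                 # (n, P) matrix of cosine distances
    return dist.min(axis=0) < sigma

# Reference directions: here the columns of the mixing matrix in Equation (16).
centers = np.array([[ 0.8294, 0.6330,  0.7370, 0.7944],
                    [-0.5586, 0.7741, -0.6759, 0.6074]])
points = np.array([[ 0.8294, 1.0],
                   [-0.5586, 0.0]])      # an on-direction point and an outlier
keep = second_stage_mask(points, centers)
```

The first point coincides with a reference direction and is kept, while the second lies well away from all four directions and is discarded.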

Then, the second-stage single-source points are clustered to obtain the estimated mixing matrix. After the mixing matrix is estimated, the source signals can be separated based on compressed sensing.

Figure 3 visualizes the flow chart of the two-stage single-source point screening method for mixing matrix estimation.

First, the time-domain data of the mixed signals are transformed into frequency-domain data, and the useless data with small energy values are removed. Second, the two-stage single-source point screening is performed. During the first stage, the original mixed signals are screened for SSPs based on the cosine angle algorithm. The extracted SSPs are considered the first-stage single-source points and are then normalized. During the second stage, based on the L1-norm optimization algorithm, the real and imaginary parts of the original mixed signals are screened for SSPs separately. The intersection of the two results is taken as the reference single-source points. The reference single-source points are normalized and clustered to obtain the clustering center, which is taken as the reference center $C_{1}$. Combining the first-stage single-source points and the reference center, the deviation points and interference points in the first-stage single-source points are removed based on the cosine distance. The remaining points are the accurate second-stage single-source points. Finally, the mixing matrix estimation is obtained by clustering the second-stage single-source points.

4. Results

4.1. Simulated Signals

In this paper, four different simulated source signals $S = [s_{1}, s_{2}, s_{3}, s_{4}]$ are designed to verify the effectiveness of the method [41]. Specifically, $s_{1}$ is a sine signal with frequency $f_{1}$, $s_{2}$ is a frequency-modulated signal with carrier frequency $f_{2}$ and modulation frequency $f_{m2}$, $s_{3}$ is an amplitude-modulated signal with carrier frequency $f_{3}$ and modulation frequency $f_{m3}$, and $s_{4}$ is a pulse signal based on $ss_{4}$. The sampling frequency is $F_{s} = 5000$ Hz, the data length is $T = 1$ s, and the simulated source signals are given in Equation (15).

$$\begin{cases} s_{1} = 2 \sin(2 \pi \times f_{1} \times t) \\ s_{2} = 2 \cos(2 \pi \times f_{2} \times t + rand + 4 \cos(2 \pi \times f_{m2} \times t)) \\ s_{3} = \sin(2 \pi \times f_{3} \times t) \times (1 + \sin(2 \pi \times f_{m3} \times t)) \\ s_{4} : ss_{4} = W_{1} \times \dfrac{\exp\left(- (\log(tt) - \log(TL / 2))^{2} / 2 \sigma_{0}^{2}\right)}{\sqrt{2 \pi} \, \sigma_{0}} \end{cases} \tag{15}$$

where $f_{1} = 30$ Hz, $f_{2} = 150$ Hz, $f_{m2} = 25$ Hz, $f_{3} = 90$ Hz, $f_{m3} = 7$ Hz, $f_{4} = 20$ Hz, $TL = 1 / f_{4}$, $tt = 0 : 1 / F_{s} : TL$, $\sigma_{0} = 0.05$, and $W_{1} = 0.2 \times (1 + rand(1, f_{4} \times TL) / 4)$.

A random matrix is generated by the random function and then normalized to obtain the mixing matrix $A$, as in Equation (16).

$$A = \begin{bmatrix} 0.8294 & 0.6330 & 0.7370 & 0.7944 \\ -0.5586 & 0.7741 & -0.6759 & 0.6074 \end{bmatrix} \tag{16}$$

To simulate the effect of environmental noise interference, Gaussian white noise is added to the mixed signals at a signal-to-noise ratio (SNR) of 30 dB. The waveforms of the four source signals and the two mixed signals are shown in Figure 4, with the x-axis as time and the y-axis as amplitude. It can be seen from Figure 4 that, in the mixed signals, the waveform of each source signal is completely drowned out by the other signals and cannot be observed clearly. The spectra of the four source signals and the two mixed signals are shown in Figure 5, with the x-axis as frequency and the y-axis as amplitude. It can be seen from Figure 5 that the main features in the spectra have disappeared due to the added noise, making it impossible to extract the frequency-domain information of the source signals.
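The mixing and noise-injection step can be sketched as below. The sketch scales one realization of white Gaussian noise so that the overall SNR across all channels equals the requested value. Whether the original experiment sets the SNR per channel or overall is not stated, so that choice is an assumption. The function name `add_awgn` and the random placeholder sources are also illustrative and are not the signals of Equation (15).

```python
import numpy as np

def add_awgn(x, snr_db, rng):
    """Add white Gaussian noise to x so that the measured overall SNR
    (signal power over noise power, in dB) equals snr_db."""
    noise = rng.standard_normal(x.shape)
    signal_power = np.mean(x ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return x + scale * noise

rng = np.random.default_rng(0)
A = np.array([[ 0.8294, 0.6330,  0.7370, 0.7944],
              [-0.5586, 0.7741, -0.6759, 0.6074]])   # Equation (16)
S = rng.standard_normal((4, 5000))   # placeholder sources, 1 s at 5000 Hz
X_clean = A @ S                      # two mixtures of four sources
X_noisy = add_awgn(X_clean, snr_db=30.0, rng=rng)

measured_snr = 10.0 * np.log10(np.mean(X_clean ** 2)
                               / np.mean((X_noisy - X_clean) ** 2))
```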

The STFT is used to transform the time-domain data of the mixed signals into frequency-domain data with a window length of 2048, a frame shift of 512, and a Hanning window as the weighting function. The scatter plot of the mixed signals in the frequency domain is shown in Figure 6, with the x-axis as the frequency-domain data of the mixed signal $\tilde{x}_{1}(t, f)$ and the y-axis as the frequency-domain data of the mixed signal $\tilde{x}_{2}(t, f)$. As shown in Figure 6, the signal points are mainly concentrated along specific linear directions. However, some signal points are not clustered along a specific direction due to noise effects.
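A bare-bones STFT with the stated parameters (window length 2048, frame shift 512, Hanning window) can be sketched with NumPy alone. The framing convention used here (no padding, incomplete trailing frames dropped) and the function name `stft` are assumptions, so frame counts may differ from those of a library routine.

```python
import numpy as np

def stft(x, win_len=2048, hop=512):
    """Short-time Fourier transform of a real 1-D signal.

    Returns a complex array of shape (win_len // 2 + 1, n_frames):
    rows are frequency bins, columns are time frames.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop      # only complete frames
    frames = np.stack([x[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

fs = 5000                                  # sampling frequency from the text
t = np.arange(fs) / fs                     # 1 s of samples
X = stft(np.sin(2 * np.pi * 150 * t))      # 150 Hz test tone
peak_bin = int(np.argmax(np.abs(X[:, 0])))
peak_hz = peak_bin * fs / 2048             # close to 150 Hz
```

Applying the same transform to each mixed signal and stacking the results gives one complex vector $x(t, f)$ per time-frequency point.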

After single-source point screening in the first stage, most noise points were removed, and the first-stage single-source points were extracted. The scatter plot of the first-stage single-source points in the frequency domain is shown in Figure 7. It can be seen from Figure 7 that the first-stage single-source points are well clustered in a specific direction (blue lines in Figure 7 indicate the direction). Despite the removal of most noise points, some deviation points and interference points still remain.

To reduce the differences in the distribution of signal points and improve their clustering characteristics, a normalization process is necessary. The normalization process allows for a more intuitive understanding of signal points and contributes to easier clustering analysis. The scatter plot of the first-stage single-source points after normalization is shown in Figure 8. From Figure 8, it can be clearly seen that there are deviation points and interference points, which will greatly affect the clustering results. Therefore, it is necessary to remove these deviation points and interference points.

After single-source point screening in the second stage, deviation points and interference points were eliminated. The retained signal points are considered as the second-stage single-source points. The scatter plot of the second-stage single-source points after normalization is shown in Figure 9. It can be seen from Figure 9 that the scatter plot of the second-stage single-source points is more compact and clearer.

The second-stage single-source points are clustered by the clustering algorithm, and the obtained clustering center is the mixing matrix estimation, as in Equation (17).

$$\tilde{A} = \begin{bmatrix} 0.8311 & 0.6335 & 0.7354 & 0.7959 \\ -0.5561 & 0.7737 & -0.6776 & 0.6054 \end{bmatrix} \tag{17}$$

4.1.1. Accuracy Comparison of Mixing Matrix Estimation

To compare the accuracy of the mixing matrix estimation quantitatively, the one-time L1-SSP method [37], the L1-SSP method combined with particle swarm optimization (PSO) [42], and the one-time cosine-SSP method [33] are used as baselines. The mixing matrix remains the same, and the SNR is 30 dB. The mixing matrices estimated by L1-SSP, L1-SSP+PSO, and cosine-SSP are shown in Equations (18)–(20).

$$\tilde{A}_{L1} = \begin{bmatrix} 0.8299 & 0.6325 & 0.7368 & 0.7877 \\ -0.5579 & 0.7746 & -0.6761 & 0.6161 \end{bmatrix} \tag{18}$$

$$\tilde{A}_{L1+PSO} = \begin{bmatrix} 0.8299 & 0.6327 & 0.7368 & 0.7891 \\ -0.5579 & 0.7744 & -0.6761 & 0.6143 \end{bmatrix} \tag{19}$$

$$\tilde{A}_{Cosine} = \begin{bmatrix} 0.8312 & 0.6340 & 0.7352 & 0.7966 \\ -0.5560 & 0.7733 & -0.6779 & 0.6045 \end{bmatrix} \tag{20}$$

The normalized mean squared error (NMSE) and the deviation angle [35] are two important criteria for evaluating the accuracy of the mixing matrix estimation. They are defined in Equations (21) and (22), respectively.

$$\mathrm{NMSE} = -10 \log_{10} \left( \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^{2}}{\sum_{i=1}^{m} \sum_{j=1}^{n} (\hat{a}_{ij} - a_{ij})^{2}} \right) \tag{21}$$

where $m$ and $n$ are the numbers of rows and columns of the mixing matrix, $\hat{a}_{ij}$ is the element in the $i$-th row and $j$-th column of the estimated mixing matrix $\tilde{A}$, and $a_{ij}$ is the element in the $i$-th row and $j$-th column of the original mixing matrix $A$. The smaller (more negative) the NMSE value, the more accurate the mixing matrix estimation.

$$\mathrm{ang}(a_{i}, \hat{a}_{i}) = \frac{180}{\pi} \cos^{-1} \left( \frac{a_{i}^{T} \hat{a}_{i}}{\| a_{i} \| \, \| \hat{a}_{i} \|} \right) \tag{22}$$

where $a_{i}$ is the $i$-th column vector of the original mixing matrix $A$ and $\hat{a}_{i}$ is the $i$-th column vector of the estimated mixing matrix $\tilde{A}$. The smaller the deviation angle, the more accurate the estimate of the corresponding column vector.
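The two criteria of Equations (21) and (22) can be evaluated as in the following sketch. The function names `nmse_db` and `deviation_angles_deg` are illustrative, and the code assumes that the columns of the estimate are already in the same order as those of the original matrix. The example matrices are the values of Equations (24) and (25) from the speech experiment in Section 4.2, for which the paper reports NMSE = −49.1891.

```python
import numpy as np

def nmse_db(A, A_est):
    """NMSE of Equation (21) in dB; more negative means a closer estimate."""
    A, A_est = np.asarray(A, float), np.asarray(A_est, float)
    return -10.0 * np.log10(np.sum(A ** 2) / np.sum((A_est - A) ** 2))

def deviation_angles_deg(A, A_est):
    """Column-wise deviation angle of Equation (22), in degrees."""
    A, A_est = np.asarray(A, float), np.asarray(A_est, float)
    cos = np.sum(A * A_est, axis=0) / (
        np.linalg.norm(A, axis=0) * np.linalg.norm(A_est, axis=0))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Values of Equations (24) and (25).
A = np.array([[0.3914, 0.7368, 0.4821, 0.5679],
              [0.9202, -0.6761, -0.8761, 0.8231]])
A_est = np.array([[0.3883, 0.7395, 0.4851, 0.5653],
                  [0.9215, -0.6732, -0.8745, 0.8248]])
print(nmse_db(A, A_est))               # should be close to the reported -49.1891
print(deviation_angles_deg(A, A_est))  # one angle per column, in degrees
```

Clipping the cosine to [−1, 1] guards against rounding errors that would otherwise make `arccos` return NaN for nearly identical columns.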

Each method estimates the mixing matrix 30 times under the same parameters and conditions. The evaluation results of all methods are averaged over these runs and listed in Table 1.

According to Table 1, the two-stage single-source point screening method has the smallest NMSE (−52.7469 dB) among the compared methods, which indicates that the proposed method yields the most accurate mixing matrix estimate overall. It is worth noting that the deviation angle of the fourth column is much larger for the L1-SSP method (0.6292°) than for its other columns. The reason is that the L1-SSP method tends to lose some SSPs, which seriously degrades the accuracy of the mixing matrix estimation. Moreover, the deviation angle of every column is smaller for the proposed method than for the cosine-SSP method, because the deviation points and interference points among the SSPs are removed.

4.1.2. Robustness Comparison of Mixing Matrix Estimation

To quantitatively compare the robustness of the different mixing matrix estimation methods, Gaussian white noise is added to the mixed signals at SNR = 10, 15, 20, 25, 30, and 40 dB. Under the same parameters and conditions, each method is run 20 times at each noise level and the average NMSE is recorded. Figure 10 shows the average NMSE of each method at the different SNRs; the x-axis represents the SNR from 10 to 40 dB and the y-axis represents the average NMSE.
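Adding noise at a prescribed SNR can be sketched as follows. The helper `add_awgn`, the 1000 Hz sampling rate, and the two sinusoidal stand-in mixtures are assumptions for illustration; only the list of SNR levels is taken from the text.

```python
import numpy as np

def add_awgn(x, snr_db, rng):
    """Add white Gaussian noise to each row of x at the requested SNR (dB)."""
    x = np.asarray(x, dtype=float)
    signal_power = np.mean(x ** 2, axis=-1, keepdims=True)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(x.shape) * np.sqrt(noise_power)
    return x + noise

rng = np.random.default_rng(0)
t = np.arange(20000) / 1000.0                  # hypothetical 1 kHz sampling
mixed = np.vstack([np.sin(2 * np.pi * 5 * t),  # stand-ins for two mixed signals
                   np.cos(2 * np.pi * 11 * t)])
noisy = {snr: add_awgn(mixed, snr, rng) for snr in [10, 15, 20, 25, 30, 40]}
```

Each entry of `noisy` can then be passed to the estimation method under test, and the NMSE averaged over repeated noise realizations at that SNR.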

As can be seen from Figure 10, at every SNR the two-stage single-source point screening method has the smallest NMSE among the compared methods, which indicates that the proposed method is both more robust and more accurate. Under strong noise, the performance of the proposed method degrades slightly, but its mixing matrix estimate remains more accurate than those of the other methods. It is worth noting that the NMSE of the cosine-SSP method fluctuates greatly as the SNR changes, because the cosine-SSP method has weak noise immunity. Although the NMSE of the L1-SSP method varies less, its results are worse than those of the proposed method.

In addition, the average time cost is used to evaluate the computational complexity of different methods. The comparison of the average time cost for different methods is listed in Table 2.

According to Table 2, the proposed method has the highest time cost (4.2954 s, compared with 1.2343 s for cosine-SSP, 3.3456 s for L1-SSP, and 4.0921 s for L1-SSP + PSO). The reason is that the proposed method performs two screening stages and therefore has a higher computational complexity. In return, it eliminates the deviation points and interference points, which improves the accuracy and robustness of the mixing matrix estimation. Moreover, the cleaner SSPs make the clustering step more efficient, which offsets part of the additional time cost.

4.1.3. Comparison of Source Signal Separation

After the mixing matrix estimate is obtained, the source signals are separated using the compressed sensing model. All parameters are the same as in Section 4.1.1, and the mixing matrix estimate is that of Equation (17). The waveforms of the source signals and separated signals are shown in Figure 11, with the x-axis as time and the y-axis as amplitude. Figure 11 shows that the main waveforms of the source signals and the separated signals are very similar. The spectra of the source signals and separated signals are shown in Figure 12, with the x-axis as frequency and the y-axis as amplitude. Figure 12 shows that the main frequencies of the source signals appear in the spectra of the separated signals. These comparisons show that the mixing matrix estimate obtained by the proposed method supports accurate separation of the source signals.
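The idea behind sparse recovery with a known mixing matrix can be illustrated with a small stand-in: for one observation vector, find the minimum L1-norm source vector that reproduces it. Because such a linear program has an optimal solution with at most as many non-zero entries as there are mixtures, the sketch simply tries every subset of that many columns. This is not the paper's compressed sensing recovery, whose details are given in its theory section; the function name `l1_recover` and the test vector are assumptions.

```python
import itertools
import numpy as np

def l1_recover(A, x):
    """Minimum L1-norm s with A @ s = x, found by trying every m-column subset."""
    m, n = A.shape
    best, best_norm = None, np.inf
    for cols in itertools.combinations(range(n), m):
        sub = A[:, cols]
        if abs(np.linalg.det(sub)) < 1e-10:   # skip (nearly) singular subsets
            continue
        coef = np.linalg.solve(sub, x)
        norm1 = np.abs(coef).sum()
        if norm1 < best_norm:
            best_norm = norm1
            best = np.zeros(n)
            best[list(cols)] = coef
    return best

# Mixing matrix estimate of Equation (17).
A17 = np.array([[0.8311, 0.6335, 0.7354, 0.7959],
                [-0.5561, 0.7737, -0.6776, 0.6054]])
s_true = np.array([1.0, 0.0, 0.5, 0.0])       # only two sources active
s_hat = l1_recover(A17, A17 @ s_true)
```

Exhaustive enumeration is practical only for small problems such as two mixtures and a handful of sources; larger cases call for a proper linear programming or greedy solver.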

The correlation coefficient is used to measure the similarity between the source signals and the separated signals. The results are shown in Equation (23) and verify that the separated signal $\tilde{s}(t)$ is close to the source signal $s(t)$.

$$\mathrm{Corr}(\tilde{s}, s) = \begin{bmatrix} 0.9790 & 0.9802 & 0.9424 & 0.8922 \end{bmatrix} \tag{23}$$
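A per-source correlation of this kind can be computed as in the sketch below. It assumes that each separated signal has already been paired with its source, and it takes the absolute value so that a sign flip does not count as a mismatch; both choices, the function name `source_correlations`, and the synthetic signals are assumptions rather than details taken from the paper.

```python
import numpy as np

def source_correlations(S, S_est):
    """Absolute Pearson correlation between matching rows of S and S_est."""
    S, S_est = np.asarray(S, float), np.asarray(S_est, float)
    return np.array([abs(np.corrcoef(s, e)[0, 1]) for s, e in zip(S, S_est)])

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 4000)
S = np.vstack([np.sin(2 * np.pi * 7 * t),
               np.sign(np.sin(2 * np.pi * 3 * t))])
S_est = S + 0.2 * rng.standard_normal(S.shape)   # stand-in for separated signals
corr = source_correlations(S, S_est)
```

Values close to 1 indicate that the separated waveform follows the source closely, which is how the entries of Equation (23) are read.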

To evaluate the separation performance of the different methods, a comparative analysis is performed under the same conditions. The mixing matrix estimates of the different methods are given in Section 4.1.1. The correlation coefficients between the source signals and the separated signals for the different methods are shown in Figure 13, with the x-axis as the four simulated signals $[s_{1}, s_{2}, s_{3}, s_{4}]$ and the y-axis as the correlation coefficient. As can be seen from Figure 13, the four signals separated by the proposed method have higher correlation coefficients than those of the other three methods, which indicates that its separated signals are more similar to the source signals.

In conclusion, the simulated signal results show that the proposed method improves the accuracy and robustness of the mixing matrix estimation and effectively improves the separation of the source signals.

4.2. Real Speech Signals

To further verify the feasibility of the proposed method for separating real signals, speech signals from the NOIZEUS speech database [43] are used in this section. The data consist mainly of multiple sets of speech signals recorded by three male and three female speakers and corrupted by different noises at specified SNRs.

Application experiment (1): The speech signals of four speakers are randomly selected. The sampling frequency is $F_s = 8000$ Hz and the data length is $T = 2$ s. A random matrix is generated by a random function and then normalized to obtain the mixing matrix $A$, as in Equation (24). Real environmental noise with SNR = 25 dB is also added. The waveforms of the four source speech signals and the two mixed speech signals are shown in Figure 14, with the x-axis as time and the y-axis as amplitude. Figure 14 shows that, in the mixtures, the major waveforms of the source speech signals are drowned out by the other signals and noise.

$$A = \begin{bmatrix} 0.3914 & 0.7368 & 0.4821 & 0.5679 \\ 0.9202 & -0.6761 & -0.8761 & 0.8231 \end{bmatrix} \tag{24}$$
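Generating a random mixing matrix and normalizing it can be sketched as follows. The text does not say which random function or normalization is used, so drawing Gaussian entries and scaling each column to unit Euclidean length are assumptions, as are the function name `random_mixing_matrix` and the seed; unit-length columns are at least consistent with the values in Equation (24), whose column norms are all very close to 1.

```python
import numpy as np

def random_mixing_matrix(m, n, rng):
    """Draw an m x n random matrix and scale every column to unit length."""
    A = rng.standard_normal((m, n))
    return A / np.linalg.norm(A, axis=0)

rng = np.random.default_rng(2023)       # hypothetical seed
A = random_mixing_matrix(2, 4, rng)     # 2 mixtures, 4 sources
```

Normalizing the columns removes the scale ambiguity, so an estimated matrix can be compared with the original column by column.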

The STFT is used to transform the time-domain mixed speech signals into the frequency domain, with a window length of 2048, a frame shift of 512, and a Hanning window as the weighting function. The two-stage single-source point screening method is then used to extract the SSPs and remove the deviation points and interference points. The scatter plot of the first-stage single-source points after normalization is shown in Figure 15.
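The STFT settings quoted above (window length 2048, frame shift 512, Hanning window, 8000 Hz sampling, 2 s of data) can be tried out with the minimal framing-based sketch below. The function name `stft`, the choice not to pad the signal edges, and the 1000 Hz test tone are assumptions; a library routine may frame and scale the data differently.

```python
import numpy as np

def stft(x, win_len=2048, hop=512):
    """Hann-windowed STFT; returns an array of shape (win_len // 2 + 1, frames)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop      # no padding at the edges
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)], axis=1)
    return np.fft.rfft(frames, axis=0)

fs, duration = 8000, 2.0                          # values from the experiment
t = np.arange(int(fs * duration)) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)             # hypothetical test signal
X = stft(tone)
peak_bins = np.abs(X).argmax(axis=0)              # strongest bin in each frame
```

With these settings the frequency resolution is 8000 / 2048 ≈ 3.9 Hz per bin, so the 1000 Hz tone falls on bin 256 in every frame. Applying `stft` to each mixture channel gives the time-frequency points from which the SSPs are screened.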

The deviation points and interference points are then removed. The scatter plot of the second-stage single-source points after normalization is shown in Figure 16.

The final mixing matrix estimate $\tilde{A}$ is shown in Equation (25). Compared with Equation (24), the estimated mixing matrix is approximately the same as the original mixing matrix, with NMSE = −49.1891 dB.

$$\tilde{A} = \begin{bmatrix} 0.3883 & 0.7395 & 0.4851 & 0.5653 \\ 0.9215 & -0.6732 & -0.8745 & 0.8248 \end{bmatrix} \tag{25}$$

After the mixing matrix estimate is obtained, the source speech signals are separated. The waveforms of the source speech signals and the separated speech signals are shown in Figure 17, with the x-axis as time and the y-axis as amplitude. Figure 17 shows that the main waveforms of the source speech signals and the separated speech signals are very similar, indicating that the proposed method is applicable to the separation of speech source signals.

The similarity between the source and separated speech signals is measured by the correlation coefficients, as shown in Equation (26). The correlation coefficients of all four speech signals are greater than 0.81, which indicates a strong correlation and shows that the source speech signals are separated well.

$$\mathrm{Corr}(\tilde{s}, s) = \begin{bmatrix} 0.8352 & 0.8791 & 0.8170 & 0.8590 \end{bmatrix} \tag{26}$$

The separation performance of the proposed method on real speech signals is compared with that of the other three methods: L1-SSP, L1-SSP + PSO, and cosine-SSP. The parameters and conditions for the mixing matrix estimation and source signal separation are the same.

The correlation coefficients between the source speech signals and the separated speech signals for the different methods are shown in Figure 18, with the x-axis as the four speech signals $[s_{1}, s_{2}, s_{3}, s_{4}]$ and the y-axis as the correlation coefficient. As can be seen from Figure 18, the four speech signals separated by the proposed method are closer to the source speech signals than those obtained by the other three methods.

Application experiment (2): To verify the feasibility of the proposed method for separating real signals in strong noise, stronger non-Gaussian white noise is added to the mixed speech signals. Three source speech signals are randomly selected. The sampling frequency is $F_s = 8000$ Hz and the data length is $T = 2$ s. The random mixing matrix $A$ is given in Equation (27), and non-Gaussian white noise with SNR = 15 dB is added. The waveforms of the three source speech signals and the two mixed speech signals are shown in Figure 19, with the x-axis as time and the y-axis as amplitude.

$$A = \begin{bmatrix} 0.6542 & 0.9228 & 0.3542 \\ -0.7563 & 0.3852 & 0.9352 \end{bmatrix} \tag{27}$$

The STFT is used to transform the time-domain mixed speech signals into the frequency domain, with a window length of 2048, a frame shift of 512, and a Hanning window as the weighting function. The mixing matrix is estimated by the two-stage single-source point screening method. The scatter plot of the first-stage single-source points after normalization is shown in Figure 20. The scatter plot of the second-stage single-source points after normalization is shown in Figure 21. As can be seen from Figure 21, more deviation points and interference points are eliminated.

The final mixing matrix estimate $\tilde{A}$ is shown in Equation (28). Compared with Equation (27), the two mixing matrices are very similar, with NMSE = −43.5657 dB.

$$\tilde{A} = \begin{bmatrix} 0.6468 & 0.9201 & 0.3532 \\ -0.7626 & 0.3906 & 0.9355 \end{bmatrix} \tag{28}$$

After the mixing matrix estimate is obtained, the source speech signals are separated. The waveforms of the source speech signals and the separated speech signals are shown in Figure 22, with the x-axis as time and the y-axis as amplitude. As can be seen from Figure 22, even under strong noise, the waveforms of the speech signals separated by the proposed method remain close to those of the source speech signals.

The correlation coefficients between the source speech signals and the separated speech signals are shown in Equation (29). The correlation coefficients of all three speech signals are greater than 0.86, which indicates a strong correlation.

$$\mathrm{Corr}(\tilde{s}, s) = \begin{bmatrix} 0.8866 & 0.8755 & 0.8655 \end{bmatrix} \tag{29}$$

In the same way, the separation performance of the proposed method is compared with that of the other three methods under the same parameters and conditions. The correlation coefficients between the source speech signals and the separated speech signals for the different methods are shown in Figure 23, with the x-axis as the three speech signals $[s_{1}, s_{2}, s_{3}]$ and the y-axis as the correlation coefficient. As can be seen from Figure 23, the speech signals separated by the cosine-SSP method are of poor quality under strong noise interference, whereas the three speech signals separated by the proposed method maintain a good similarity with the source speech signals.

In summary, the application experiments of real speech signals verify the effectiveness of the proposed method. Under different noise disturbances, the proposed method can maintain a good performance in solving the UBSS problem of speech signals.

5. Conclusions

This paper addresses the estimation of the mixing matrix in underdetermined blind source separation. A two-stage single-source point screening method that combines the cosine angle algorithm and the L1-norm optimization algorithm is proposed. In the first stage, the main single-source points are extracted; in the second stage, the deviation points and interference points among the single-source points are removed. The two-stage single-source point screening method improves both the accuracy and the robustness of the mixing matrix estimation. The simulation results demonstrate that the proposed method obtains a better mixing matrix estimate and more accurate separated signals than the L1-SSP, L1-SSP + PSO, and cosine-SSP methods under various conditions. The experimental results on real speech signals verify that the proposed method can effectively solve the underdetermined blind source separation problem.

Further research on the proposed method is warranted in several respects. First, this paper concentrates on the linear instantaneous mixing model, and the method needs refinement to handle the convolutive mixing model. Second, it is worthwhile to improve the algorithm to reduce its time cost.

Author Contributions

Conceptualization, Z.Z. and Z.L.; methodology, Z.Z.; validation, Z.Z. and Z.L.; formal analysis, X.C.; resources, X.C. and Z.L.; software, Z.Z.; writing—original draft, Z.Z.; writing—review and editing, X.C. and Z.L.; visualization, Z.Z. and X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Cheng, W.; Lee, S.C.; Zhang, Z.S.; He, Z. Independent component analysis based source number estimation and its comparisons for mechanical systems. J. Sound Vib. 2012, 23, 5153–5167.
2. Peng, T.; Chen, Y.; Liu, Z. A time-frequency domain blind source separation method for underdetermined instantaneous mixtures. Circuits Syst. Signal Process. 2015, 34, 3883–3895.
3. Metsomaa, J.; Sarvas, J.; Ilmoniemi, R.J. Blind source separation of event-related EEG/MEG. IEEE Trans. Biomed. Eng. 2017, 64, 2054–2064.
4. Cui, Z.T.; Zhang, Y.C.; Yi, N. Optimization of Kurtosis in the Extend-Infomax Blind Signal Separation Algorithm. Mob. Inf. Syst. 2021, 21, 1–8.
5. Zou, L.; Chen, X.; Ji, X.; Wang, Z.J. Underdetermined joint blind source separation of multiple datasets. IEEE Access 2017, 5, 7474–7487.
6. Li, Y.B.; Nie, W.; Ye, F. A complex mixing matrix estimation algorithm based on single source points. Circuits Syst. Signal Process. 2015, 34, 3709–3723.
7. Hyvarinen, A.; Oja, E. Independent component analysis: Algorithms and applications. Neural Netw. 2000, 13, 411–430.
8. Pham, D.T.; Cardoso, J.F. Blind separation of instantaneous mixtures of non-stationary sources. IEEE Trans. Signal Process. 2001, 49, 1837–1848.
9. Yao, X.J.; Yi, T.H.; Qu, C.; Li, H. Blind modal identification in frequency domain using independent component analysis for high damping structures with classical damping. Comput. Aided Civ. Infrastruct. Eng. 2018, 33, 35–50.
10. Amari, S. Natural gradient learning for over- and under-complete bases in ICA. Neural Comput. 1999, 11, 1875–1883.
11. Lee, D.D.; Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature 1999, 401, 788–791.
12. Xie, Y.; Xie, K.; Xie, S.L. Underdetermined convolutive blind separation of sources integrating tensor factorization and expectation maximization. Digit. Signal Process. 2019, 87, 145–154.
13. Hong, H.; Liang, M. Separation of fault features from a single-channel mechanical signal mixture using wavelet decomposition. Mech. Syst. Signal Process. 2007, 21, 2025–2040.
14. Karvanen, J.; Cichocki, A. Measuring sparseness of noisy signals. In Proceedings of the 4th International Symposium on Independent Component Analysis and Blind Signal Separation, Nara, Japan, 1–4 April 2003; pp. 125–130.
15. Gannot, S.; Vincent, E.; Markovich-Golan, S.; Ozerov, A. A consolidated perspective on multimicrophone speech enhancement and source separation. IEEE-ACM Trans. Audio Speech Lang. Process. 2017, 25, 692–730.
16. Georgiev, P.; Theis, F.; Cichocki, A. Sparse component analysis and blind source separation of underdetermined mixtures. IEEE Trans. Neural Netw. 2005, 16, 992–996.
17. Zhou, G.; Yang, Z.; Xie, S.; Yang, J.-M. Mixing matrix estimation from sparse mixtures with unknown number of sources. IEEE Trans. Neural Netw. 2011, 22, 211–221.
18. Belouchrani, A.; Amin, M.G. Blind source separation based on time frequency signal representations. IEEE Trans. Signal Process. 1998, 46, 2888–2897.
19. Xie, S.; Yang, L.; Yang, J.M.; Zhou, G.; Xiang, Y. Time-frequency approach to underdetermined blind source separation. IEEE Trans. Neural Netw. Learn. Syst. 2012, 23, 306–316.
20. Sadhu, A.; Narasimhan, S. A decentralized blind source separation algorithm for ambient modal identification in the presence of narrowband disturbances. Struct. Control. Health Monit. 2014, 21, 282–302.
21. Amini, F.; Hedayati, Y. Underdetermined blind modal identification of structures by earthquake and ambient vibration measurements via sparse component analysis. J. Sound Vib. 2016, 366, 117–132.
22. Cheng, W.; Jia, Z.; Chen, X.; Gao, L. Convolutive blind source separation in frequency domain with kurtosis maximization by modified conjugate gradient. Mech. Syst. Signal Process. 2019, 134, 106331.
23. Bofill, P.; Zibulevsky, M. Underdetermined blind source separation using sparse representations. Signal Process. 2001, 81, 2353–2362.
24. Yi, T.H.; Yao, X.J.; Qu, C.X.; Li, H.-N. Clustering number determination for sparse component analysis during output-only modal identification. J. Eng. Mech. 2019, 145, 04018122.
25. Cheng, W.; Zhang, Z.S.; Cao, H.R.; He, Z.; Zhu, G. A comparative study of information-based source number estimation methods and experimental validations on mechanical systems. Sensors 2014, 14, 7625–7646.
26. Eqlimi, E.; Makkiabadi, B.; Samadzadehaghdam, N.; Khajehpour, H.; Mohagheghian, F.; Sanei, S. A novel underdetermined source recovery algorithm based on k-Sparse component analysis. Circuits Syst. Signal Process. 2019, 38, 1264–1286.
27. Ma, B.Z.; Zhang, T.Q. Underdetermined Blind Source Separation Based on Source Number Estimation and Improved Sparse Component Analysis. Circuits Syst. Signal Process. 2021, 40, 3417–3436.
28. Linh-Trung, N.; Belouchrani, A.; Abed-Meraim, K.; Boashash, B. Separating more sources than sensors using time-frequency distributions. EURASIP J. Appl. Signal Process. 2005, 17, 2828–2847.
29. Aissa-El-Bey, A.; Linh-Trung, N.; Abed-Meraim, K.; Belouchrani, A.; Grenier, Y. Underdetermined blind separation of nondisjoint sources in the time-frequency domain. IEEE Trans. Signal Process. 2007, 55, 897–907.
30. Jourjine, A.; Rickard, S.; Yilmaz, O. Blind separation of disjoint orthogonal signals: Demixing N Sources from 2 mixtures. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Istanbul, Turkey, 5–9 June 2000; pp. 2985–2988.
31. Magron, P.; Badeau, R.; David, B. Model-based STFT phase recovery for audio source separation. IEEE/ACM Trans. Audio Speech Lang. Process. 2018, 26, 1095–1105.
32. Abrard, F.; Deville, Y. A time–frequency blind signal separation method applicable to underdetermined mixtures of dependent sources. Signal Process. 2005, 85, 1389–1403.
33. Reju, V.G.; Koh, S.N.; Soon, I.Y. An algorithm for mixing matrix estimation in instantaneous blind source separation. Signal Process. 2009, 89, 1762–1773.
34. Guo, J.; Zeng, X.; She, Z. Blind source separation based on high-resolution time-frequency distributions. Comput. Electr. Eng. 2012, 38, 175–184.
35. Sun, J.D.; Li, Y.X.; Wen, J.T.; Yan, S. Novel mixing matrix estimation approach in underdetermined blind source separation. Neurocomputing 2015, 173, 623–632.
36. Li, Y.B.; Nie, W.; Ye, F.; Lin, Y. A mixing matrix estimation algorithm for underdetermined blind source separation. Circuits Syst. Signal Process. 2016, 35, 3367–3379.
37. Zhen, L.; Peng, D.; Yi, Z.; Xiang, Y.; Chen, P. Underdetermined blind source separation using sparse coding. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 3102–3108.
38. Lu, J.; Cheng, W.; He, D.; Zi, Y. A novel underdetermined blind source separation method with noise and unknown source number. J. Sound Vib. 2019, 457, 67–91.
39. Donoho, D.L. For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 2006, 59, 797–829.
40. Asif, M.S.; Romberg, J. Sparse recovery of streaming signals using l1-homotopy. IEEE Trans. Signal Process. 2014, 62, 4209–4223.
41. Cheng, W.; Jia, Z.; Chen, X.; Han, L.; Zhou, G.; Gao, L. Underdetermined convolutive blind source separation in time-frequency domain based on single source points and experimental validation. Meas. Sci. Technol. 2020, 31, 095001.
42. Wang, L.Y.; Hou, G.Y.; Xiang, J.H. Mixing Matrix Estimation of Underdetermined Blind Source Separation based on Improved Density Clustering Algorithm. In Proceedings of the 2019 8th Asia-Pacific Conference on Antennas and Propagation (APCAP_2019), Incheon, Republic of Korea, 4–7 August 2019.
43. Hu, Y.; Loizou, P.C. Subjective evaluation and comparison of speech enhancement algorithms. Speech Commun. 2007, 49, 588–601.

Figure 1. The flow chart for solving UBSS.

Figure 2. Screening process of reference single-source points in the second stage.

Figure 3. The flow chart of the two-stage single-source point screening method.

Figure 4. Waveforms of signals. (a) Source signals. (b) Mixed signals.

Figure 5. Spectrums of signals. (a) Source signals. (b) Mixed signals.

Figure 6. Scatter plot of mixed signals.

Figure 7. Scatter plot of the first-stage single-source points.

Figure 8. Scatter plot of the first-stage single-source points after normalization.

Figure 9. Scatter plot of the second-stage single-source points after normalization.

Figure 10. The NMSE average for various methods at different SNRs.

Figure 11. Waveforms of signals. (a) Source signals. (b) Separated signals.

Figure 12. Spectrums of signals. (a) Source signals. (b) Separated signals.

Figure 13. Correlation coefficients between the source signals and the separated signals for different methods.

Figure 14. Waveforms of speech signals. (a) Source speech signals. (b) Mixed speech signals.

Figure 15. Scatter plot of the first-stage single-source points after normalization.

Figure 16. Scatter plot of the second-stage single-source points after normalization.

Figure 17. Waveforms of speech signals. (a) Source speech signals. (b) Separated speech signals.

Figure 18. Correlation coefficients between the source speech signals and the separated speech signals for different methods.

Figure 19. Waveforms of speech signals. (a) Source speech signals. (b) Mixed speech signals.

Figure 20. Scatter plot of the first-stage single-source points after normalization.

Figure 21. Scatter plot of the second-stage single-source points after normalization.

Figure 22. Waveforms of speech signals. (a) Source speech signals. (b) Separated speech signals.

Figure 23. Correlation coefficients between the source speech signals and the separated speech signals for different methods.

Table 1. Comparison of evaluation results for estimated mixing matrix under different methods.

Methods	$\mathrm{ang}(a_{1}, \hat{a}_{1})$ (°)	$\mathrm{ang}(a_{2}, \hat{a}_{2})$ (°)	$\mathrm{ang}(a_{3}, \hat{a}_{3})$ (°)	$\mathrm{ang}(a_{4}, \hat{a}_{4})$ (°)	NMSE (dB)
L1-SSP	0.0493	0.0403	0.0162	0.6292	−45.1604
L1-SSP + PSO	0.0493	0.0242	0.0162	0.4985	−47.1725
Cosine-SSP	0.1812	0.0734	0.1542	0.2086	−50.9514
Proposed method	0.1732	0.0367	0.1338	0.1432	−52.7469

Table 2. Comparison of average time cost for different methods.

Methods	Cosine-SSP	L1-SSP	L1-SSP + PSO	Proposed Method
Time (s)	1.2343	3.3456	4.0921	4.2954

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

