### **1. Introduction**

Modulation classification (MC) [1] has important application value in wireless communication: it plays a role in communication investigation, electronic countermeasures, signal authentication, interference identification and spectrum management. Wireless communication networks continue to develop steadily and rapidly, network construction is increasingly integrated, and network applications are ubiquitous. At the same time, the inherent contradiction between a centralized, static network and a dynamically changing environment causes serious problems such as the low utilization of spectrum resources. Cognitive radio (CR) technology [2–5] has therefore been proposed and is considered a promising way to solve these problems. MC, which builds on spectrum sensing and feature analysis, plays an important role in CR.

Cognitive radio has been widely accepted as a new technology in wireless communication. In a cognitive radio network (CRN), avoiding interference with the transmissions of the primary users requires accurately sensing the presence of any concurrent primary-user transmission in the observed spectrum [6]. Erroneous detection of a primary user signal causes the secondary user to waste a spectrum opportunity. In conventional wireless communication scenarios, noise, shadowing and multipath fading seriously degrade the signal characteristics, which makes signal detection very difficult in a low signal-to-noise ratio (SNR) environment [7,8]. In addition, because the primary user (authorized user) and the cognitive user (unauthorized user) cannot communicate with each other, accurate MC not only avoids mutual interference between them but also provides multi-dimensional spectrum information about the surrounding wireless environment, which helps to improve the inefficient use of spectrum resources in the CRN. As wide-band communication signals use increasingly varied modulation parameters and methods, MC has been studied in depth and has become one of the main methods of signal recognition and classification.

MC has long played an important role in wireless communication, especially in dynamic spectrum management and interference recognition. A variety of methods and classifiers have been proposed in the literature, but most of them identify only a few modulation formats, such as low-order formats, or require prior knowledge of some signal parameters. MC methods for a CR system can be roughly divided into four categories: (a) methods that classify multiple quadrature amplitude modulation (MQAM) and multiple phase shift keying (MPSK) signals based on the signal envelope variance and the wavelet transform, whose recognition rate is low at low SNR [9,10]; (b) artificial neural networks (ANN) based on machine-learning algorithms for automatic signal type recognition, which require selecting the most appropriate ANN and increase the calculation time and the risk of over-fitting [11,12]; (c) identification based on higher-order cumulants (HOC) using the fourth-order cumulant, which cannot distinguish signals that share the same fourth-order cumulant [13,14]; (d) methods that extract feature parameters from the time domain, frequency domain and power spectrum of the signal, where some extraction processes are complex and easily disturbed by noise [15–17]. These existing methods mainly rely on a single feature parameter, and most of the classifiers they adopt increase the complexity of the system.

In this paper, we propose a new modulation classification method that combines higher-order cumulant and cyclic spectrum feature extraction with a decision tree–support vector machine (DT–SVM) classifier. In the feature extraction phase, the compressed sensing (CS) method is used to obtain compressed samples of the feature parameters, and the influence of key factors on the classification accuracy is analyzed. CS is a signal processing technique that combines sampling and compression. It maps a sparse signal from a high-dimensional space to a low-dimensional space through a small number of observations (non-adaptive linear projections) while maintaining the original structure of the signal [18]. Sparse signal reconstruction recovers the original signal from these observations with high probability by solving a non-linear optimization problem. This breaks through the limitations of the traditional Shannon–Nyquist sampling theorem, relaxes the performance requirements on the sampling system when processing cognitive radio signals, and relieves the pressure of storing, transmitting and processing large amounts of conventionally sampled data. Combining HOC with the cyclic spectrum makes it possible to distinguish different signals that share the same cumulant values, and MC is then achieved through the DT–SVM classifier. The algorithm obtains the compressed values of the feature parameters directly through CS theory, and we analyze the influence of symbol length and compression ratio on the recognition accuracy. Simulation results show that the algorithm achieves better classification performance at low SNR, which verifies the validity of the method.

The rest of this paper is structured as follows. Section 2 introduces the feature extraction methods and their characteristics in detail. Section 3 describes how the compressed sample values of the feature parameters are obtained using compressed sensing theory. Section 4 describes the structure of the decision tree–support vector machine classifier. Section 5 presents simulation results. Finally, Section 6 summarizes the conclusions.

### **2. Feature Extraction**

### *2.1. Feature Extraction Based on Higher-Order Cumulant (HOC)*

For the wireless channel model, we exploit the properties of HOC, in particular the fact that cumulants of order higher than two are insensitive to Gaussian noise. The *k*th-order cumulant $\mathbf{C}\_{k,n}(m\_1, m\_2, \dots, m\_{k-1})$ of a complex-valued stationary random process *x*(*t*) can be defined as:

$$\mathbf{C}\_{k,n}(m\_1, m\_2, \dots, m\_{k-1}) = \text{cum}(\mathbf{x}(t), \mathbf{x}(t+m\_1), \dots, \mathbf{x}(t+m\_{k-1})) \tag{1}$$

where $m\_1, \dots, m\_{k-1}$ are the time delays, *cum*(•) denotes the cumulant operator and, because *x*(*t*) is stationary, the cumulant does not depend on *t*. For a zero-mean process, the fourth-order cumulant is therefore:

$$\begin{aligned} \mathbf{C}\_{4,n}(m\_1, m\_2, m\_3) = {} & \mathbb{E}[\mathbf{x}(t)\mathbf{x}(t+m\_1)\mathbf{x}(t+m\_2)\mathbf{x}(t+m\_3)] - \mathbf{C}\_{2,n}(m\_1)\mathbf{C}\_{2,n}(m\_2 - m\_3) \\ & - \mathbf{C}\_{2,n}(m\_2)\mathbf{C}\_{2,n}(m\_3 - m\_1) - \mathbf{C}\_{2,n}(m\_3)\mathbf{C}\_{2,n}(m\_1 - m\_2) \end{aligned} \tag{2}$$

Based on the above theory, the fourth-order, sixth-order and eighth-order cumulants of the zero-mean *x*(*t*) are given by:

$$\begin{aligned} \mathbf{C}\_{4,0} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}) = \mathbf{M}\_{4,0} - 3 \mathbf{M}\_{2,0}^2 \\ \mathbf{C}\_{4,1} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}^\*) = \mathbf{M}\_{4,1} - 3 \mathbf{M}\_{2,1} \mathbf{M}\_{2,0} \\ \mathbf{C}\_{4,2} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}^\*, \mathbf{x}^\*) = \mathbf{M}\_{4,2} - \left| \mathbf{M}\_{2,0} \right|^2 - 2 \mathbf{M}\_{2,1}^2 \\ \mathbf{C}\_{6,0} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}) = \mathbf{M}\_{6,0} - 15 \mathbf{M}\_{4,0} \mathbf{M}\_{2,0} + 30 \mathbf{M}\_{2,0}^3 \\ \mathbf{C}\_{6,3} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}^\*, \mathbf{x}^\*, \mathbf{x}^\*) = \mathbf{M}\_{6,3} - 9 \mathbf{C}\_{4,2} \mathbf{C}\_{2,1} - 6 \mathbf{C}\_{2,1}^3 \\ \mathbf{C}\_{8,0} &= \text{cum}(\mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}, \mathbf{x}) = \mathbf{M}\_{8,0} - 28 \mathbf{M}\_{6,0} \mathbf{C}\_{2,0} - 35 \mathbf{M}\_{4,0}^2 + 420 \mathbf{M}\_{4,0} \mathbf{M}\_{2,0}^2 - 630 \mathbf{M}\_{2,0}^4 \end{aligned} \tag{3}$$

where $\mathbf{M}\_{p,q} = E[\mathbf{x}(t)^{p-q}(\mathbf{x}^\*(t))^{q}]$ denotes the *p*th-order mixed moment [19].

In practical MC applications, the HOC values of the signal must be estimated from the received symbol sequence in the shortest possible time. Sample estimates of the cumulants are given by:

$$\begin{aligned} \mathbf{C}\_{4,0} &= \frac{1}{N} \sum\_{t=1}^{N} \left( \mathbf{x}(t) \right)^{4} - 3 \mathbf{C}\_{2,0}^{2} \\ &\ \ \vdots \\ \mathbf{C}\_{8,0} &= \frac{1}{N} \sum\_{t=1}^{N} \left( \mathbf{x}(t) \right)^{8} - 28 \mathbf{C}\_{2,0} \frac{1}{N} \sum\_{t=1}^{N} \left( \mathbf{x}(t) \right)^{6} - 35 \mathbf{M}\_{4,0}^{2} + 420 \mathbf{M}\_{4,0} \mathbf{M}\_{2,0}^{2} - 630 \mathbf{M}\_{2,0}^{4} \end{aligned} \tag{4}$$

Substituting the estimated values into Equation (4), we can obtain all of the features for the six wireless signal types considered. Table 1 lists some of these features for several of the signals. The values are computed under the constraint of unit variance in noise-free conditions. By computing these values, we can classify the wireless signal types.
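The sample estimates of Equations (3) and (4) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in the paper: the helper names `moment` and `hoc_features` are hypothetical, and the test sequences are ideal, noise-free BPSK and QPSK constellations chosen here because their cumulants are easy to check by hand.

```python
import numpy as np

def moment(x, p, q):
    """Sample mixed moment M_{p,q} = mean(x^(p-q) * conj(x)^q)."""
    return np.mean(x ** (p - q) * np.conj(x) ** q)

def hoc_features(x):
    """Sample estimates of selected cumulants from Equation (3)."""
    x = np.asarray(x, dtype=complex)
    x = x - x.mean()                      # enforce the zero-mean assumption
    m20, m21 = moment(x, 2, 0), moment(x, 2, 1)
    m40, m42 = moment(x, 4, 0), moment(x, 4, 2)
    m60, m80 = moment(x, 6, 0), moment(x, 8, 0)
    return {
        "C40": m40 - 3 * m20 ** 2,
        "C42": m42 - abs(m20) ** 2 - 2 * m21 ** 2,
        "C60": m60 - 15 * m40 * m20 + 30 * m20 ** 3,
        "C80": (m80 - 28 * m60 * m20 - 35 * m40 ** 2
                + 420 * m40 * m20 ** 2 - 630 * m20 ** 4),
    }

# Illustrative inputs (assumption): ideal unit-power constellations, no noise.
bpsk = np.tile([-1.0, 1.0], 512)
qpsk = np.tile([1, 1j, -1, -1j], 256)

feats_bpsk = hoc_features(bpsk)
feats_qpsk = hoc_features(qpsk)
print({k: np.round(v, 6) for k, v in feats_bpsk.items()})
print({k: np.round(v, 6) for k, v in feats_qpsk.items()})
```

For these two ideal constellations the moments can be evaluated by hand (for BPSK every even moment equals 1, so for example C40 = 1 − 3 = −2), which gives a quick sanity check on the formulas before they are applied to noisy received data.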

**Table 1.** Theoretical values of higher-order cumulant (HOC) for six wireless signal modulations.


Table 1 shows that OOK (on-off keying) and DPSK (differential phase shift keying) have the same theoretical HOC values, as do QPSK (quadrature phase shift keying) and OQPSK (offset quadrature phase shift keying). In addition, 16QAM (16 quadrature amplitude modulation) and 64QAM (64 quadrature amplitude modulation) have similar HOC values. Therefore, we define a feature parameter $\mathbf{T}\_1 = |\mathbf{C}\_{8,0}| / |\mathbf{C}\_{4,0}|^2$, whose values are listed in Table 2, which divides the signals into three categories: (OOK, DPSK), (QPSK, OQPSK) and (16QAM, 64QAM). It is worth noting that the absolute value and the ratio form are used to eliminate the effects of phase jitter and amplitude [20].

**Table 2.** Theoretical values of T1 for six wireless signal modulations.


Because QPSK and OQPSK follow different phase jump rules, a differential operation can be applied to the sampled sequences of both signals, i.e.,

$$
\Delta \mathbf{x}(t) = \mathbf{x}(t+1) - \mathbf{x}(t) = (a\_{t+1} - a\_t) \exp[j(2\pi f\_c + \Delta \theta\_c)] \tag{5}
$$

where *x*(*t*) denotes the QPSK or OQPSK signal, $a\_t$ is the transmitted symbol sequence, $f\_c$ denotes the carrier frequency and $\theta\_c$ denotes the phase jitter. For the sake of discussion, we assume that synchronization of $f\_c$ and $\theta\_c$ has been completed. The HOC values after the differential operation are listed in Table 3. We then define another feature parameter, $\mathbf{T}\_2 = |\mathbf{C}\_{d8,0}| / |\mathbf{C}\_{d4,0}|^2$, whose values are listed in Table 4, to classify QPSK and OQPSK, where $\mathbf{C}\_{d8,0}$ and $\mathbf{C}\_{d4,0}$ represent the cumulants after the differential operation.
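The effect of the differential operation can be illustrated numerically. The sketch below is not the paper's simulation setup: it assumes noise-free baseband signals with rectangular pulses, two samples per symbol, and an OQPSK quadrature branch delayed by one sample (half a symbol). The helper name `ratio_feature` is hypothetical, and the feature is taken as |C80| / |C40|² computed from sample moments.

```python
import numpy as np

def ratio_feature(x):
    """|C80| / |C40|^2 from sample moments of a zero-mean complex sequence."""
    x = np.asarray(x, dtype=complex)
    m20, m40, m60, m80 = (np.mean(x ** p) for p in (2, 4, 6, 8))
    c40 = m40 - 3 * m20 ** 2
    c80 = (m80 - 28 * m60 * m20 - 35 * m40 ** 2
           + 420 * m40 * m20 ** 2 - 630 * m20 ** 4)
    return abs(c80) / abs(c40) ** 2

rng = np.random.default_rng(1)
n_sym = 100_000
i_wave = np.repeat(rng.choice([-1.0, 1.0], size=n_sym), 2)  # in-phase branch
q_wave = np.repeat(rng.choice([-1.0, 1.0], size=n_sym), 2)  # quadrature branch

qpsk = i_wave + 1j * q_wave
# Assumption: OQPSK is modelled by delaying Q by one sample (half a symbol),
# so I and Q never change at the same sampling instant.
oqpsk = i_wave[1:] + 1j * q_wave[:-1]

# Feature on the raw sequences: both share the same constellation.
t1_qpsk, t1_oqpsk = ratio_feature(qpsk), ratio_feature(oqpsk)

# Feature after the differential operation x(t+1) - x(t).
t2_qpsk = ratio_feature(np.diff(qpsk))
t2_oqpsk = ratio_feature(np.diff(oqpsk))

print(f"raw:          QPSK {t1_qpsk:.2f}, OQPSK {t1_oqpsk:.2f}")
print(f"differenced:  QPSK {t2_qpsk:.2f}, OQPSK {t2_oqpsk:.2f}")
```

Under these assumptions the raw sequences give nearly identical feature values, because both signals occupy the same four constellation points. After differencing, the values separate clearly, since OQPSK differences are confined to the real and imaginary axes while QPSK differences can also be diagonal.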

**Table 3.** Theoretical HOC values of QPSK and OQPSK after the differential operation.


**Table 4.** Theoretical values of T2 for QPSK and OQPSK.


### *2.2. Feature Extraction Based on Cyclic Spectrum*

Since the $\mathbf{T}\_1$ values within (OOK, DPSK) and within (16QAM, 64QAM) are the same or similar, the cyclic spectral density function, which suppresses noise, is used to identify these signals. Assume that *x*(*t*) is a cyclostationary signal whose mean and autocorrelation function are periodic with period $T\_0$:

$$m\_{\mathbf{x}}(t+T\_0) = m\_{\mathbf{x}}(t) \tag{6}$$

$$\mathbf{R}\_{\mathbf{x}}(t+T\_0+\frac{\tau}{2}, t+T\_0-\frac{\tau}{2}) = \mathbf{R}\_{\mathbf{x}}(t+\frac{\tau}{2}, t-\frac{\tau}{2})\tag{7}$$

where τ is the delay variable. Because the autocorrelation function has periodicity, its Fourier series can be written as:

$$\mathbf{R}\_{\mathbf{x}}(t + \frac{\tau}{2}, t - \frac{\tau}{2}) = \sum\_{\alpha} \mathbf{R}\_{x}^{\alpha}(\tau) e^{j2\pi \alpha t} \tag{8}$$

where α stands for the frequency corresponding to the instantaneous autocorrelation and is often called the cyclic frequency. In addition, $\mathbf{R}\_{x}^{\alpha}(\tau)$ is the coefficient of the Fourier series, known as the cyclic autocorrelation function, which is given by:

$$\mathbf{R}\_{x}^{\alpha}(\tau) = \frac{1}{T\_0} \int\_{-\frac{T\_0}{2}}^{\frac{T\_0}{2}} \mathbf{R}\_{\mathbf{x}}(t + \frac{\tau}{2}, t - \frac{\tau}{2}) e^{-j2\pi \alpha t} dt \tag{9}$$
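As a minimal sketch of how the Fourier coefficient in Equation (9) can be estimated from data, the code below replaces the statistical autocorrelation with a long time average of the lag product, which assumes the signal is cycloergodic. The function name `cyclic_autocorr`, the rectangular-pulse BPSK test signal and the choice of eight samples per symbol are illustrative assumptions, not part of the paper.

```python
import numpy as np

def cyclic_autocorr(x, alpha, tau):
    """Time-average estimate of the cyclic autocorrelation at integer lag tau >= 0.

    alpha is the cyclic frequency in cycles per sample. The asymmetric lag
    product x[n + tau] * conj(x[n]) is used; evaluating the exponential at
    n + tau/2 maps it onto the symmetric definition.
    """
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x) - tau)
    lag_product = x[n + tau] * np.conj(x[n])
    phase = np.exp(-2j * np.pi * alpha * (n + tau / 2))
    return np.mean(lag_product * phase)

# Illustrative signal (assumption): baseband BPSK, rectangular pulses,
# 8 samples per symbol, so the symbol rate is 1/8 cycles per sample.
rng = np.random.default_rng(0)
sps = 8
x = np.repeat(rng.choice([-1.0, 1.0], size=4000), sps)

r_on = abs(cyclic_autocorr(x, alpha=1 / sps, tau=sps // 2))   # at the symbol rate
r_off = abs(cyclic_autocorr(x, alpha=0.05, tau=sps // 2))     # not a cyclic frequency
print(f"|R| at alpha = 1/8: {r_on:.3f}, at alpha = 0.05: {r_off:.3f}")
```

The estimate is clearly non-zero at the symbol rate and close to zero at a frequency that is not a cyclic frequency of the signal, which is the property that the cyclic spectrum features rely on.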

The Fourier transform of the cyclic autocorrelation function can be written as:

$$\mathbf{S}\_{x}^{\alpha}(f) \triangleq \int\_{-\infty}^{\infty} \mathbf{R}\_{x}^{\alpha}(\tau) e^{-j2\pi f \tau} d\tau \tag{10}$$

where $\mathbf{S}\_{x}^{\alpha}(f)$ is called the cyclic spectral density, which reduces to the power spectral density for α = 0, and *f* is the spectral frequency.

$\mathbf{R}\_{x}^{\alpha}(\tau)$ can be regarded as the cross-correlation of two complex frequency-shifted components *u*(*t*) and *v*(*t*) of *x*(*t*), i.e.,

$$\mathbf{R}\_{x}^{\alpha}(\tau) = \mathbf{R}\_{uv}(\tau) = \frac{1}{T\_0} \int\_{-\frac{T\_0}{2}}^{\frac{T\_0}{2}} u(t + \frac{\tau}{2}) v^\*(t - \frac{\tau}{2}) dt \tag{11}$$

where $u(t) = \mathbf{x}(t)e^{-j\pi\alpha t}$ and $v(t) = \mathbf{x}(t)e^{j\pi\alpha t}$, so that the product $u(t + \frac{\tau}{2}) v^\*(t - \frac{\tau}{2})$ carries the factor $e^{-j2\pi\alpha t}$ of Equation (9).

Combining Equations (10) and (11), we obtain $\mathbf{S}\_{x}^{\alpha}(f) = \mathbf{S}\_{uv}(f)$. Through cross-spectrum analysis, we can obtain:

$$\mathbf{S}\_{x}^{\alpha}(f) \triangleq \lim\_{T\_0 \to \infty} \lim\_{\Delta t \to \infty} \mathbf{S}\_{uvT\_0}(f)\_{\Delta t} = \lim\_{T\_0 \to \infty} \lim\_{\Delta t \to \infty} \frac{1}{\Delta t} \int\_{-\frac{\Delta t}{2}}^{\frac{\Delta t}{2}} \mathbf{S}\_{X\_{T\_0}}^{\alpha}(t, f)dt \tag{12}$$

$$\mathbf{S}\_{X\_{T\_0}}^{\alpha}(t,f) = \frac{1}{T\_0}\mathbf{X}\_{T\_0}(t,f+\frac{\alpha}{2})\mathbf{X}\_{T\_0}^\*(t,f-\frac{\alpha}{2})\tag{13}$$

$$\mathbf{X}\_{T\_0}(t, f) = \int\_{t - \frac{T\_0}{2}}^{t + \frac{T\_0}{2}} \mathbf{x}(u) e^{-j2\pi fu} du \tag{14}$$

where Equation (12) is used to estimate the cyclic spectral density, Equation (13) is the cyclic periodogram, Equation (14) is the short-time Fourier transform (STFT), Δ*t* is the length of the received data, $T\_0$ is the window length of the STFT, and (•)$^\*$ denotes the complex conjugate.

Given the insensitivity of the cyclic spectrum to noise and the theory above, the characteristic parameter $\mathbf{T}\_3 = \max(\mathbf{S}\_{x}^{\alpha})$ is defined to distinguish the signals within the sets (OOK, DPSK) and (16QAM, 64QAM).
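The estimator in Equations (12) to (14) can be sketched as a time-smoothed cyclic periodogram. The code below is an illustrative sketch under stated assumptions rather than the paper's implementation: frames are non-overlapping with a rectangular window, the cyclic frequency is restricted to the grid α = 2m/T0 (so no frame-to-frame phase correction is needed), frequency shifts are circular in the FFT bins, and the feature is taken as the maximum magnitude over *f*. The names `cyclic_spectrum` and `win_len` are hypothetical.

```python
import numpy as np

def cyclic_spectrum(x, win_len, m):
    """Time-smoothed cyclic periodogram at alpha = 2*m/win_len cycles/sample."""
    x = np.asarray(x, dtype=complex)
    n_frames = len(x) // win_len
    frames = x[: n_frames * win_len].reshape(n_frames, win_len)
    spec = np.fft.fft(frames, axis=1)        # Eq. (14): STFT of each frame
    upper = np.roll(spec, -m, axis=1)        # X(t, f + alpha/2), circular shift
    lower = np.roll(spec, m, axis=1)         # X(t, f - alpha/2), circular shift
    periodogram = upper * np.conj(lower) / win_len   # Eq. (13)
    return periodogram.mean(axis=0)          # Eq. (12): average over frames

# Illustrative signal (assumption): baseband BPSK, rectangular pulses,
# 8 samples per symbol, analysed with a 64-sample window.
rng = np.random.default_rng(2)
sps, win_len = 8, 64
x = np.repeat(rng.choice([-1.0, 1.0], size=32_000), sps)

# alpha = 2*m/win_len equals the symbol rate 1/sps for m = win_len // (2 * sps).
m_symbol_rate = win_len // (2 * sps)
peak_on = np.max(np.abs(cyclic_spectrum(x, win_len, m_symbol_rate)))
peak_off = np.max(np.abs(cyclic_spectrum(x, win_len, m_symbol_rate - 1)))
print(f"max |S| at the symbol rate: {peak_on:.2f}, off the symbol rate: {peak_off:.2f}")
```

In practice the maximum would be searched over a range of cyclic frequencies. The single-α comparison here only shows that the spectral correlation peak appears at a cyclic frequency of the signal and averages out elsewhere.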

### **3. Compressed Values of Feature Parameters Based on Compressed Sensing**
