*Article* **Time-Dependent Prediction of Microblog Propagation Trends Based on Group Features**

**Qin Zhao 1,2,3, Zheyu Zhou 1, Jingjing Li 1, Shilin Jia 1 and Jianguo Pan 1,\***


**Abstract:** Conventional machine learning-based methods for predicting a microblog's reposting number mainly focus on extracting and representing static features of the source microblog, such as user attributes and content attributes, without taking into account that the microblog propagation network is dynamic. Moreover, they neglect dynamic features, such as the changing spatial and temporal background in the process of microblog propagation, leading to inaccurate descriptions of microblog features and thus reduced prediction performance. In this paper, we contribute to the study of microblog propagation trends and propose a new microblog feature representation and time-dependent prediction method based on group features, using the reposting number, which reflects the scale of microblog reposting, to quantitatively describe the spreading effect and trends of a microblog. We extract dynamic features created in the process of microblog propagation and development, and incorporate them with traditional static features as group features, yielding a more accurate representation of microblog features than traditional machine learning-based research. Based on the group features, we then construct a time-dependent model with an LSTM network to further learn hidden and temporal features, and finally predict microblog propagation trends. Experimental results show that our approach outperforms state-of-the-art methods.

**Keywords:** propagation trends; social networks; group features; dilated CNN; machine learning

### **1. Introduction**

With the rapid development of the Internet in China, Sina Microblog has become an indispensable way for people to obtain and publish information. On the microblog platform, users can freely express their own opinions, which are spread through other users' browsing and reposting. As microblogs are continuously reposted by other users, some will eventually spread explosively and become hot topics at the center of public discussion, while others never will. Therefore, in order to make microblogs better serve the public in fields such as public opinion supervision, advertising, information push, and corporate marketing [1], predicting potential hot microblogs, namely predicting microblog propagation trends, has become a key research topic.

The conventional machine learning-based methods for predicting a microblog's reposting number mainly extract and represent static features of the source microblog, such as user attributes and content attributes, to construct a machine learning prediction model, but neglect the dynamic features generated in the process of microblog propagation.

**Citation:** Zhao, Q.; Zhou, Z.; Li, J.; Jia, S.; Pan, J. Time-Dependent Prediction of Microblog Propagation Trends Based on Group Features. *Electronics* **2022**, *11*, 2585. https://doi.org/10.3390/electronics11162585

Academic Editor: Ahmad Taher Azar

Received: 28 July 2022; Accepted: 16 August 2022; Published: 18 August 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In this paper, we contribute to the study of microblog propagation trends and, inspired by our previous works [2–4], propose a new microblog feature description and time-dependent prediction method based on group features. When predicting the scale of reposting, we innovatively take into account dynamic features generated in the process of microblog propagation and development, together with traditional static features such as user features and microblog features, as group features, in order to solve the problem of inaccurate microblog feature descriptions in traditional machine learning-based research and to improve the accuracy of reposting number prediction. We first describe microblog group features from three aspects, namely commonly used individual features, reposting comment features, and group influence features, each extracted through its corresponding method: commonly used individual features are extracted through manually constructed feature engineering; reposting comment features are extracted with feature extraction models based on clustering and Dilated CNN; and group influence features are extracted with the PageRank algorithm. Finally, with the extracted microblog group features, we construct a time-dependent prediction model for the reposting number and predict microblog propagation trends.

The remainder of the paper is organized as follows: Section 2 briefly introduces domestic and foreign research progress on microblog propagation trend prediction and the related technologies involved in this paper. On the basis of traditional static feature representation, Section 3 proposes a microblog feature representation and a time-dependent prediction method for microblog propagation trends based on group features. In Section 4, we conduct experiments and evaluations of the proposed method on real Sina microblog datasets. Finally, in Section 5, we outline our contributions, conclude the paper, and discuss future work.

### **2. Related Work**

### *2.1. The Domestic and Foreign Research Progress*

The traditional prediction methods for the microblog reposting number mainly fall into three categories: methods based on topological structure [5], methods based on machine learning [6], and point-based methods [7].

The traditional topological structure-based models originate from Information Diffusion Theory, which is widely applied, mainly in fields such as recommendation and monitoring. For the prediction of Sina microblog propagation trends, the most typical models are infectious disease (epidemic) models and information cascade models. These prediction methods fully consider the role of forwarders in the social network formed by microblog users; however, due to the large number of nodes in the network, these methods have high complexity.

The traditional machine learning-based models mainly use machine learning to learn the hidden features that affect the microblog reposting number and thereby make predictions [8]. These methods analyze the relevant factors that affect the reposting number, extract the corresponding features of the source microblog, and then convert the prediction problem into a classification or regression problem. Through learning from historical data and the extracted features, the machine learning models are constructed and trained, and finally make predictions to obtain the corresponding target value.

The point-based prediction methods mainly introduce the idea of time decay. Researchers observe that whether a microblog becomes popular is related to time: the propagation of a microblog usually changes from slow to fast, then from fast to slow, and eventually ceases. The basic idea is to regard the variation of microblog propagation trends as the life cycle of the microblog, predict the probability that reposting stops and the microblog dies, and finally obtain the propagation scale through the maximum likelihood method.

### *2.2. Related Techniques*

### 2.2.1. Key Information Extraction Technology Based on TF-IDF

TF-IDF [9] is a simple but effective key information extraction technology used to evaluate the importance of a term to one of the documents in a document set. TF-IDF actually calculates the product of the value of TF and the value of IDF. TF is term frequency, which means the frequency of the appearance of terms in a document, while IDF is inverse document frequency, which measures how many documents in the document set contain the term, and is a measure of the general importance of the term in the document set. The fundamental principle of TF-IDF is that, if a term appears frequently in a document but not frequently in other documents in the document set, then we can consider that, for this document, compared to other terms in it, this term is more important and has better distinguishing ability. The calculation of TF-IDF seen in Equations (1) and (2) is as follows:

$$TF = \frac{t}{s} \tag{1}$$

$$IDF = \log\left(\frac{D}{d} + 0.1\right) \tag{2}$$

where *t* represents the number of times the term appears in the document, and *s* represents the total number of term occurrences in the document, while *D* and *d* respectively represent the total number of documents in the document set and the number of documents that contain the term.
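To make Equations (1) and (2) concrete, the following minimal sketch computes a TF-IDF score, assuming documents are given as lists of tokens (real microblog text would first need word segmentation) and that the term appears in at least one document:

```python
import math

def tf_idf(term, doc, docs):
    # TF (Eq. 1): occurrences of the term in the document / total terms in it
    tf = doc.count(term) / len(doc)
    # IDF (Eq. 2): log of (total documents / documents containing the term + 0.1)
    containing = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / containing + 0.1)
    return tf * idf

docs = [["hot", "topic", "repost"],
        ["hot", "news"],
        ["repost", "comment", "repost"]]
score = tf_idf("repost", docs[2], docs)  # tf = 2/3, idf = log(3/2 + 0.1)
```

A term that is frequent in one document but appears in few others, such as "repost" above, receives a higher score than a term absent from the document.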

### 2.2.2. Dilated Convolutional Neural Network (DCNN)

DCNN is a special convolutional network. Compared to a conventional CNN, DCNN contains an additional parameter called the dilation rate [10], which indicates the size of the dilation. A DCNN has the same convolution kernel size as an ordinary CNN, so no additional kernel parameters are needed during its convolution calculation. However, due to the dilation rate, DCNN covers a larger range with the same parameters and obtains a larger receptive field without reducing the resolution or coverage [11].

As shown in Figure 1, for a DCNN with a convolution kernel size of 3 × 3 and a dilation rate of 2, its receptive field is the same size as that of an ordinary CNN with a convolution kernel size of 5 × 5. However, the DCNN uses only nine parameters, 36% of the parameters of the ordinary 5 × 5 CNN, which is equivalent to providing a wider receptive field at the same convolution cost.
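The receptive-field arithmetic behind this comparison can be checked with a short sketch (assuming stride 1 throughout): each stacked layer adds (kernel size − 1) × dilation to the receptive field of an output element.

```python
def receptive_field(kernel_size, dilation_rates):
    # With stride 1, each conv layer adds (kernel_size - 1) * dilation
    # to the receptive field of a single output element.
    rf = 1
    for d in dilation_rates:
        rf += (kernel_size - 1) * d
    return rf

rf_dilated = receptive_field(3, [2])   # one 3x3 kernel, dilation 2 -> 5
rf_plain5 = receptive_field(5, [1])    # one ordinary 5x5 kernel   -> 5
```

The same function also reproduces the 15 × 15 receptive field reported in Section 3 for stacked dilation rates 1, 2, 4 (`receptive_field(3, [1, 2, 4])` gives 15, versus 7 for three ordinary 3 × 3 layers).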

**Figure 1.** The diagram of the dilated convolution.

DCNN has applications in many fields such as image segmentation, speech synthesis, structural condition inspection, and target detection [12–15]. Since high-level convolutional feature maps have a larger receptive field and more abstract features, while low-level ones have a smaller receptive field and more detailed features, combining multi-scale feature maps allows a task to use more information. Therefore, the DCNN used in this paper contains dilated convolutions of multiple sizes [16].

### 2.2.3. Neural Network RNN and LSTM

In machine learning applications, some tasks require the ability to process information sequences, for which a Recurrent Neural Network (RNN) is needed. RNN has a strong ability to process time series data [17], and since the microblog propagation process studied in this paper is a time series process, RNN is well suited as the prediction model. Although RNN can learn sequential dependencies in data, due to the vanishing gradient problem it struggles to store and learn long-term dependencies. For this reason, we choose an improved kind of RNN, the Long Short-Term Memory network (LSTM), to construct the prediction model. LSTM makes up for the shortcomings of the original RNN and is currently one of the most successful and popular RNN architectures; it has been applied to various time series tasks such as natural language processing and sound data processing.

To overcome the difficulty RNN has in storing and learning long-term dependencies, LSTM adds a cell memory controller *c* to learn long-term features, as shown in Figure 2. At time *t*, LSTM has three inputs: the current input value *xt*, the previous output value *ht*−1, and the previous cell state *ct*−1; and two outputs: the current output value *ht* and the current cell state *ct*. Through three gate structures, namely the input gate, forget gate, and output gate, LSTM maintains and updates the cell state [18]. In LSTM, temporal information is added to or deleted from the cell state by the gate structures, which selectively allow information to pass through. Neurons can feed data to the upper layer or the same layer [19].

**Figure 2.** The structure of RNN and LSTM. (**a**) the structure of RNN unit; (**b**) the structure of LSTM unit.

With the current input and the previous cell state, LSTM gradually updates its cell state. The output of the merged layer is then trained through the ReLU layer, and finally the output layer produces the predicted values [20].

### **3. Methods**

In this paper, we use the reposting number, which reflects the scale of microblog reposting, to quantitatively describe the propagation effect of microblogs when studying microblog propagation trends. To make up for the inaccurate feature description of traditional machine learning-based prediction methods for the reposting number, we mainly model with microblog group features, including user and content features, group influence features, and reposting comment features [21]. We further learn hidden features and time-dependent features with an LSTM network, and finally predict the microblog reposting number.

### *3.1. Microblog Group Feature Representation*

3.1.1. Bloggers and Microblog Content Features

Commonly used microblog features for prediction in current research are mainly divided into two categories [22–24]. The first is features of the users themselves, namely blogger features, and the second is features of the blogs themselves, namely microblog content features. For Sina Microblog, blogger features include the number of the blogger's fans, blogger influence, and the blogger's recent microblog heat [22,25,26]; for microblog content features, current research usually focuses on aspects such as whether the original microblog contains links, and its hashtags.

The blogger features and microblog content features specifically used in this paper are shown in Table 1, including feature tags, specific meanings, and value ranges.

**Table 1.** Common microblog features.


For some important features, their brief descriptions are as follows:

(1) The blogger's recent microblog heat: There is a certain logical relationship between the heat of one microblog and the heat of other microblogs recently issued by the same blogger. Therefore, we use the heat of the 10 most recent other microblogs issued by the blogger as the basis for calculating the blogger's recent microblog heat. The calculation is shown in Equation (3):

$$h = \frac{1}{10} \sum\_{m=1}^{10} \left( r\_m + c\_m + l\_m \right) \tag{3}$$

where *h* represents the blogger's recent microblog heat. For the *m*-th recent microblog issued by the blogger, *rm* represents its reposting number, *cm* its number of comments, and *lm* its number of likes (praise).
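A direct reading of Equation (3) as code, assuming the blogger's 10 most recent other microblogs are given as (reposts, comments, likes) triples:

```python
def recent_heat(recent_posts):
    # Eq. (3): mean of (reposts + comments + likes) over the 10 recent posts
    assert len(recent_posts) == 10
    return sum(r + c + l for r, c, l in recent_posts) / 10

# toy example: each recent post has 12 reposts, 3 comments, and 5 likes
h = recent_heat([(12, 3, 5)] * 10)
```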

(2) Blogger influence: Microblog propagation trends are directly affected by the influence of the blogger; the larger the blogger's influence, the more easily the microblog spreads. Since the following relationships between microblog users are similar to the links between web pages on the Internet, the idea of the PageRank algorithm can be used to evaluate user influence. The basic idea is that a user's influence is greater when the user is followed by more influential users, has more fans, and follows fewer users. Research on the topological structure and information propagation of Sina Microblog [23] has found that it exhibits an obvious small-world phenomenon and that its degree distribution obeys a power law. Based on the idea that messages can reach other people on the network in a few hops, user influence is calculated as shown in Equation (4):

$$I(u\_i) = (1 - d) + d \sum\_{u\_j \in F(u\_i)} \frac{I(u\_j)}{out(u\_j)} \tag{4}$$

where *I*(*ui*) represents the influence of user *ui*. *d* is the damping factor, which represents the probability of transferring from one given user to another random user; its value ranges between 0 and 1 and is usually 0.85. *F*(*ui*) represents the set of all user nodes that have an outbound link to the blogger node, namely the user's fan group, and *out*(*uj*) represents the out-degree of fan node *uj*, namely the number of users that *uj* follows.
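A minimal iterative sketch of this influence computation, following the standard PageRank formulation (each fan's contribution is divided by the fan's out-degree, i.e., how many users that fan follows); the graph encoding is an assumption for illustration:

```python
def influence(fans, d=0.85, iters=50):
    # fans[u] = list of users who follow u; a user's out-degree is the
    # number of users that user follows, recovered from the same fan lists.
    users = set(fans)
    for fs in fans.values():
        users.update(fs)
    out_deg = {u: 0 for u in users}
    for fs in fans.values():
        for f in fs:
            out_deg[f] += 1
    I = {u: 1.0 for u in users}
    for _ in range(iters):
        I = {u: (1 - d) + d * sum(I[f] / out_deg[f] for f in fans.get(u, []))
             for u in users}
    return I

# 'c' follows both 'a' and 'b'; 'b' follows 'a'; 'a' follows nobody
scores = influence({"a": ["b", "c"], "b": ["c"]})
```

In this toy graph, "a" has the most and the most influential fans, so it ends up with the highest influence score.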

(3) Microblog length: Microblogs issued by most users are short, fragmented records of daily life or emotional catharsis, which rarely lead to widespread resonance and reposting. In contrast, microblogs with more complete expression are more likely to gain the understanding and resonance of other users and spread more easily. Therefore, we consider the microblog length as one of the microblog content features, and set the classification criterion as whether the length of the microblog exceeds 15 words.

(4) Whether usernames or hashtags are included: As microblog content features, we consider whether usernames or hashtags are included. In a microblog, usernames are used to directly quote other users, or to address or talk about a certain user, and hashtags are used to mark specific topics.

(5) Special marks: We consider whether there is an exclamation mark "!" or a question mark "?" at the end of the microblog as part of the microblog content features. The exclamation mark marks emotional statements in the text, and the question mark indicates a question. Both marks make it more likely that the blogger passes emotions to other users or arouses responses from them, which contributes to the spread of the microblog.
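Features (3)–(5) are simple binary indicators; the sketch below shows how they could be extracted from raw microblog text. The threshold of 15 and the marker conventions follow the description above, while treating "words" as characters is a simplifying assumption for illustration:

```python
import re

def content_features(text):
    text = text.rstrip()
    return {
        "is_long": int(len(text) > 15),                         # feature (3)
        "has_username": int("@" in text),                       # feature (4)
        "has_hashtag": int(bool(re.search(r"#[^#]+#", text))),  # Sina-style #topic#
        "special_mark": int(text.endswith(("!", "?", "！", "？"))),  # feature (5)
    }

feats = content_features("@friend this #hot topic# is unbelievable!")
```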

### 3.1.2. Key Comment Features Based on Cluster and DCNN

A microblog often expresses different meanings in different temporal and spatial contexts, and may even contain irony, metaphors, and other implicit information. In this regard, forwarder comments are often needed as a supplement to the source microblog, providing the temporal and spatial background information the original microblog lacks. At the same time, users are usually susceptible to comments from other users. Therefore, in this paper, we extract the comment features of the forwarder group to improve the accuracy of machine learning-based prediction methods.

Since a microblog receives a large number of forwarder comments, it is necessary to first extract the important information of the comments, and then encode and vectorize them into group comment features of the microblog. The feature extraction process, shown in Figure 3, has three steps: (1) use the cluster-based key information extraction model to extract key information from the group comments of microblog forwarders; (2) encode and vectorize the group comment information into sentence embeddings; and (3) input the sentence embeddings to the DCNN convolutional layer for feature extraction and compression. Finally, feature embeddings of the forwarder group comments are obtained, which contain temporal and spatial background information and supplement the source microblog.

**Figure 3.** The learning of group comment features.

The specific work of each step is as follows:

Step 1: Key information extraction based on clusters

Forwarder comments are composed of sentences. In the process of microblog propagation, the forwarder group comments contain a large number of sentences, so the key information of the comments needs to be extracted first. In this paper, we use cluster-based key information extraction to extract the corresponding feature sentences [27]. A "cluster" in this paper refers to an aggregation of keywords, namely a sentence fragment containing multiple keywords.

As seen in Figure 4, the framed part represents a cluster, where keywords are obtained by calculating the TF-IDF scores of the terms in the comment sentences. If the distance between two keywords is less than a threshold, the two keywords are classified into the same cluster. We set the threshold to 4 in this paper; in other words, if there are more than four other terms between two keywords, they are divided into two clusters. We then calculate the importance score of each cluster as shown in Equation (5):

$$C\_{Imp} = \frac{(NKeys)^2}{Len} \tag{5}$$

where *C*\_*Imp* represents the importance score of the cluster, *NKeys* represents the number of keywords in the cluster, and *Len* represents the number of terms in the cluster. Taking Figure 4 as an example, the framed cluster has a total of four terms, two of which are keywords, so the importance score of this cluster is 2²/4 = 1. After that, we extract the 10 sentences with the highest cluster scores and combine them as the finally extracted comments containing key information for further processing.
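Equation (5) and the Figure 4 example can be reproduced with a small sketch (clusters are represented simply as token lists; keyword detection via TF-IDF is assumed to have already happened):

```python
def cluster_importance(cluster_terms, keywords):
    # Eq. (5): (number of keywords in the cluster)^2 / cluster length in terms
    n_keys = sum(1 for t in cluster_terms if t in keywords)
    return n_keys ** 2 / len(cluster_terms)

# The Figure 4 example: four terms, two of which are keywords -> 2^2 / 4 = 1.0
score = cluster_importance(["really", "hot", "new", "topic"], {"hot", "topic"})
```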

**Figure 4.** Key sentence extraction based on cluster.

Step 2: Feature encoding and vectorization

Since a computer cannot directly understand the meaning of text, the extracted comments containing key information must be encoded and vectorized into multi-dimensional embeddings for further processing. The basic idea of word embeddings originates from the Neural Network Language Model (NNLM) [24] proposed by Bengio. In this paper, we use Word2vec, an open-source tool released by Google in 2013, to solve the word embedding representation problem of microblog comments. Word2vec can quickly and effectively convert text sentences into multi-dimensional embeddings based on a given corpus.

Word2vec has two models, the Continuous Bag-of-Words (CBOW) model and the Skip-Gram (SG) model, whose structures are shown in Figure 5. For a sentence containing *L* words, ..., *wi*−1, *wi*, ..., *wL* respectively represent the word embeddings of the words in the sentence. The CBOW model uses a total of *n* words before and after the current word *wi* (here *n* = 2) to predict *wi*; in contrast, the Skip-Gram model uses the word *wi* to predict the *n* words before and after it. Both models consist of an input layer [24], a hidden layer, and an output layer.

**Figure 5.** Two models for vectorization.

After preprocessing the reposting comments containing key information, Word2vec is used to encode them into multi-dimensional embeddings. We train the Word2vec model, updating the weights through the backpropagation algorithm and reducing the loss value with stochastic gradient descent, and finally obtain the word embeddings as a byproduct of the model. Based on the word embeddings trained by Word2vec, we convert the words in the microblog forwarder comments into word embeddings, and finally convert the key sentences of the comments into sentence embeddings.
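The paper does not spell out how key sentences become sentence embeddings; a common minimal choice, sketched here under that assumption, is to average the Word2vec vectors of the words in the sentence (in practice the vectors would come from a trained model such as gensim's Word2vec; a toy dictionary stands in here):

```python
def sentence_embedding(tokens, word_vectors, dim):
    # Average the word embeddings of known tokens; unknown words are skipped.
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

toy_vectors = {"hot": [1.0, 0.0], "topic": [0.0, 1.0]}
emb = sentence_embedding(["hot", "topic", "unknown"], toy_vectors, dim=2)
```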

Step 3: Feature extraction and compression of DCNN convolutional layer

Finally, we conduct feature extraction and compression on the forwarder comment embeddings. Due to the complexity of microblog language, ordinary convolutional networks have limited effect for feature extraction and compression and require too many model parameters. Therefore, we use a Dilated Convolutional Neural Network (DCNN) and input the sentence embedding representation of the reposting comments into the DCNN convolutional layer for feature extraction and compression. The three dilated convolutional layers we use are shown in Figure 6.

**Figure 6.** The dilated convolutional layer. (**a**–**c**) are, respectively, the convolution process with dilation rate k = 1, k = 2, and k = 4.

In the figure, for the three dilated convolutional layers C1, C2, and C3, their convolution kernels Map1, Map2, and Map3 are of the same size, which are all 3 × 3 matrices, but the dilation rates of the three convolution kernels are different, with values of 1, 2, and 4. In subgraph (a), a convolution kernel with dilation rate of 1 is used to convolve the input embeddings, and we input the result feature map as the output of C1 to the convolutional layer C2. In subgraph (b), a convolution kernel with dilation rate of 2 is used to convolve the feature map output by the C1 layer, and we input the result feature map as the output of C2 to the convolutional layer C3. In subgraph (c), a convolution kernel with dilation rate of 4 is used to convolve the feature map output by the C2 layer. At this time, the receptive field of the elements in the output y of the convolutional layer C3 has reached 15 × 15, while, with the ordinary convolution operation, the receptive field will only be 7 × 7.

The calculation process of the DCNN convolutional layer is shown in Equation (6).

$$c(t) = f(W^T [X\_t^T, X\_{t+1}^T, \dots, X\_{t+h-1}^T]^T + b) \tag{6}$$

For the forwarder comment embeddings, the convolution kernel *W* of the dilated convolutional layer is applied to a window of terms of length *h*, and local features are generated after dilated convolution. In Equation (6), *c*(*t*) is the feature value calculated at position *t*, *b* is the bias of the current filter, and *f*(∗) is the nonlinear activation function (ReLU). We use zero padding to ensure that the size of the matrix after convolution meets the requirements of the calculation. Then, the pooling operation is performed on each feature map through

the maximum pooling layer to perform feature compression on the feature embeddings, and output embedding *p*(*j*) with a fixed length. The calculation is shown in Equation (7):

$$p(j) = \max\_{t} \{ c\_j(t) \} \tag{7}$$

As is shown in Figure 6, our model uses multiple filters (with different window sizes) to obtain multiple features, and then outputs a multi-dimensional embedding at the maximum pooling layer network stage. The calculation is as shown in Equation (8):

$$CV = f(\mathcal{W}^T [p(j)\_1^T, p(j)\_2^T \dots p(j)\_{10}^T] + b) \tag{8}$$

where *f*(∗) represents the convolution and pooling operations. As a result, the feature embedding representation *CV* of the forwarder key comments is finally obtained, which contains spatial and temporal background information and supplements the source microblog information.
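The core of Equations (6) and (7), a dilated convolution over a window of length *h* followed by max pooling, can be sketched in one dimension (scalar features rather than embedding matrices, an illustrative simplification):

```python
def dilated_conv1d(x, w, b=0.0, dilation=1):
    # Eq. (6): slide the kernel w over x with gaps of size `dilation`,
    # add the bias b, and apply the ReLU activation f(*).
    span = (len(w) - 1) * dilation
    return [max(0.0, sum(w[i] * x[t + i * dilation] for i in range(len(w))) + b)
            for t in range(len(x) - span)]

def max_pool(feature_map):
    # Eq. (7): global max pooling compresses a feature map to one value
    return max(feature_map)

fmap = dilated_conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [-1.0, 0.0, 1.0], dilation=2)
p = max_pool(fmap)
```

With dilation 2, the same 3-tap kernel spans five input positions instead of three, which is the widening effect Figure 6 illustrates.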

### 3.1.3. Group Influence Features

User influence refers to the ability of a user's opinions, comments, or behaviors to change the behaviors or opinions of other users. In microblog social networks, user influence has a direct impact on microblog propagation trends. Traditional machine learning-based prediction methods usually consider only the personal influence of the blogger, without considering the influence of the group of reposting users in the propagation process. For example, if a celebrity user with huge influence reposts a microblog, the propagation scale of this microblog is likely to increase greatly [24]. In this paper, we use the PageRank algorithm [28] to calculate group influence to make up for this defect of traditional prediction methods.

Some scholars regard the microblog social network as a directed graph based on graph theory, where each node corresponds to a user and the directed edges represent the "follow" and "followed" relationships in the microblog network. Since the following relationships between users, represented by directed edges, are similar to the links between web pages on the Internet, we use the idea of the PageRank algorithm to evaluate and calculate user influence. The main idea is that a user's influence is greater when the user is followed by more influential users, has more fans, and follows fewer users. The algorithm comprehensively considers the structure of the microblog social network, and the calculated influence value can reflect the user's influence objectively. The PageRank value of user influence is calculated as in Equation (9):

$$I(u\_i) = (1 - d) + d \sum\_{u\_j \in F(u\_i)} \frac{I(u\_j)}{out(u\_j)} \tag{9}$$

where *I*(*ui*) represents the influence of user *ui*. *d* is the damping factor, representing the probability of transferring from one given user to another random user; its value ranges between 0 and 1 and is usually 0.85. *F*(*ui*) represents the set of all user nodes that have an outbound link to the blogger node, namely the user's fan group, and *out*(*uj*) represents the out-degree of fan node *uj*, namely the number of users that *uj* follows.

After calculating the personal influence of the reposting users with the PageRank algorithm, we accumulate the individual PageRank values of the users in the reposting group to obtain the group influence of the reposting users, and take the combination of the group influence features and the blogger's personal influence features as the final influence features. The full influence, which accumulates the influence of all reposting users before time *tm*, is calculated as in Equation (10):

$$FI(u\_i) = \sum\_{t=t\_1}^{t\_m} \sum\_{j=F(u\_i)}^{N-1} I(u\_j) \tag{10}$$
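Equation (10) is a straightforward accumulation; a sketch assuming the repost log is a list of (timestamp, user) pairs and per-user PageRank influence has already been computed:

```python
def full_influence(repost_log, user_influence, t_m):
    # Eq. (10): sum the influence of every user who reposted no later than t_m
    return sum(user_influence[u] for t, u in repost_log if t <= t_m)

log = [(1, "a"), (2, "b"), (5, "c")]
inf = {"a": 0.4, "b": 0.2, "c": 0.3}
fi = full_influence(log, inf, t_m=2)  # only "a" and "b" reposted by t = 2
```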

### *3.2. The Construction of the Prediction Model for Reposting Number*

Taking into account the time dependence of microblog propagation trends, we combine the extracted microblog group features with a Long Short-Term Memory (LSTM) network to construct the prediction model. The overall framework of our LSTM prediction model is shown in Figure 7, which contains four functional modules: the input layer, the LSTM hidden layer, the output layer, and network training. The input layer preprocesses the microblog feature data set to meet the requirements of the network input; the LSTM hidden layer is a multi-layer recurrent neural network constructed from LSTM units; the output layer provides the final prediction of the reposting number; and we train the prediction network with the Adam optimization algorithm, updating the model weights iteratively.

Adam is an effective gradient-based stochastic optimization method which combines the advantages of the AdaGrad and RMSProp optimization algorithms and performs excellently in network training. Compared to other stochastic optimization methods, Adam is faster and computationally cheaper, occupies fewer computing and storage resources, and has better overall performance in practical applications.
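For reference, one Adam update step as it would apply to a single weight (the standard Adam recurrences with the usual default hyperparameters; this is a generic sketch, not the paper's training code):

```python
def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Update biased first/second moment estimates, bias-correct them,
    # then take a step scaled by the corrected moments.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (v_hat ** 0.5 + eps), m, v

w, m, v = adam_step(1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

The moment estimates *m* and *v* are carried across steps, which is what gives Adam its per-parameter adaptive learning rate.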

**Figure 7.** The framework diagram of the LSTM prediction model.

In Figure 7, the upper right corner is the detailed structure of the LSTM unit. LSTM maintains and updates the cell state of the cell memory controller c through three gate structures, including input gate, forget gate, and output gate, and learns long-term features. The internal formulas of LSTM we used are shown in Equations (11)–(15):

$$i_t = \sigma(W_{xi}x_t + W_{hi}h_{t-1} + b_i) \tag{11}$$

$$o_t = \sigma(W_{xo}x_t + W_{ho}h_{t-1} + b_o) \tag{12}$$

$$f_t = \sigma(W_{xf}x_t + W_{hf}h_{t-1} + b_f) \tag{13}$$

$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc}x_t + W_{hc}h_{t-1} + b_c) \tag{14}$$

$$h_t = o_t \tanh(c_t) \tag{15}$$

where *σ* is the activation function, and *it*, *ot*, *ft*, *ct*, and *ht* represent, respectively, the input gate, output gate, forget gate, cell state, and the final output of LSTM.
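The gate computations described above can be traced in code. The following minimal NumPy sketch runs one LSTM step per time slice over a short toy sequence; the weight shapes and random initialization are purely illustrative, not the paper's trained model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following Equations (11)-(15). Each gate's input and
    recurrent weights are concatenated into one matrix, so W[g] @ z
    computes W_xg @ x_t + W_hg @ h_prev in a single product."""
    z = np.concatenate([x_t, h_prev])
    i_t = sigmoid(W["i"] @ z + b["i"])                       # input gate,   Eq. (11)
    o_t = sigmoid(W["o"] @ z + b["o"])                       # output gate,  Eq. (12)
    f_t = sigmoid(W["f"] @ z + b["f"])                       # forget gate,  Eq. (13)
    c_t = f_t * c_prev + i_t * np.tanh(W["c"] @ z + b["c"])  # cell state,   Eq. (14)
    h_t = o_t * np.tanh(c_t)                                 # final output, Eq. (15)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3                       # illustrative feature/hidden sizes
W = {g: 0.1 * rng.normal(size=(n_hid, n_in + n_hid)) for g in "iofc"}
b = {g: np.zeros(n_hid) for g in "iofc"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):                       # unroll over a short toy sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
```

Because the output gate and tanh both lie in bounded ranges, every component of the hidden state stays strictly inside (−1, 1).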

First, the multi-dimensional group features of the microblog are preprocessed in the input layer. The original microblog feature sequence, ordered by timestamp, is defined as *Fo* = {*f*1, *f*2, ..., *fn*}. The features are preprocessed, divided into time slices, and split into a training set and a test set. Supposing that the input length is *L*, the processed data set is denoted by the sample features *X*, the actual reposting numbers *Y*, and the corresponding predicted reposting numbers *Yp* output by the output layer, as given in Equations (16)–(18):

$$X = \{x_1, x_2, \dots, x_L\} \tag{16}$$

$$Y = \{y_1, y_2, \dots, y_k\} \tag{17}$$

$$Y_p = \{y_{p,1}, y_{p,2}, \dots, y_{p,k}\} \tag{18}$$

where the value of *k* is 3. The root mean square error, which serves as the loss function, is calculated as shown in Equation (19):

$$loss = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_i - y_{p,i})^2} \tag{19}$$

The calculation and training of the prediction network are mainly carried out through the backpropagation through time (BPTT) algorithm [25], as shown in Figure 8.

**Figure 8.** The diagram of BPTT algorithm.

The training flow chart of the model is shown in Figure 9. The training process is generally divided into four steps, the specific process of which is as follows:

(1) First, we calculate the output value of the LSTM unit structure according to the forward propagation.

(2) Secondly, we calculate the error terms of all LSTM units through backpropagation, where the errors propagate in two directions: backward through time and upward through the network structure.

(3) Then, the network automatically calculates the gradient of corresponding weight according to the calculated error value.

(4) Finally, after setting parameters such as the learning rate, we train the network, and through the gradient-based Adam optimization algorithm, we update the network weights iteratively until the network converges.
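Step (4) relies on the Adam update rule. Below is a minimal NumPy sketch of a single Adam step (first- and second-moment estimates with bias correction), demonstrated on a toy quadratic objective rather than the actual LSTM weights; the learning rate and iteration count are illustrative:

```python
import numpy as np

def adam_update(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step: exponential moving averages of the gradient (m)
    and squared gradient (v), with bias correction, drive the update."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy check: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
m = v = np.zeros_like(w)
for t in range(1, 2001):               # t starts at 1 for bias correction
    w, m, v = adam_update(w, 2 * w, m, v, t, lr=0.05)
```

In network training, `grad` would be the BPTT gradient of the loss with respect to each weight matrix rather than the toy gradient used here.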

**Figure 9.** The training flow chart.

To sum up, the overall structure of our prediction method for microblog reposting number based on group features in this paper is shown in Figure 10.

**Figure 10.** Time-dependent prediction process based on group information.

### **4. Experiments and Results**

*4.1. Experiment Preparation and Data Preprocessing*

4.1.1. Experimental Environment

The environment and configuration of the hardware and software used in the experiment are as follows:

(1) Hardware Configuration:

1. CPU: Intel(R) Core(TM) i5-8265U CPU @ 1.60 GHz (1.80 GHz); RAM: 8 GB; Storage: 256 GB SSD + 1 TB portable hard disk; System: Windows 10.

2. GPU: NVIDIA GeForce GTX 1080 (CUDA); Hard disk: 700 GB; System: Ubuntu 15.6.

(2) Software Configuration:

Compiler: Python 3.7; Developing tools: Anaconda, Jupyter Notebook, PyCharm Community.

### 4.1.2. Dataset and Data Preprocessing

In this paper, we use real data from the Sina microblog collected and issued by the team of Tang Jie of Tsinghua University (http://arnetminer.org/Influencelocality, accessed on 15 August 2022). The overview of the original data set is shown in Table 2.

**Table 2.** Original data set.


We select some data from the original data set for our experiments, containing relevant information such as microblog content and creation time. For the microblog data set, we establish a reposting chain ordered by reposting time and content. When sampling the data set, to ensure the integrity of the reposting chain, the reposting process of each microblog event must have already ended. Table 3 shows some relevant attributes obtained through statistical analysis of the data set.

**Table 3.** Some attributes of the data set.


Figure 11 shows the distribution of the reposting number in the microblog data set. As can be seen from the figure, the distribution exhibits a clear power-law trend when plotted on a logarithmic scale.

To make the original data set meet the input requirements of the prediction network, the data first need to be preprocessed. Preprocessing mainly includes storing the microblog data, handling missing values, removing stop words, and word segmentation. Non-numerical data need to be converted into numerical form first (for example, the genders male and female are replaced with 0 and 1, respectively), and numerical data require normalization and other processing operations.
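A minimal sketch of this kind of preprocessing, with hypothetical field names (`gender`, `followers`); the mean imputation used for missing values is one illustrative choice, not necessarily the paper's:

```python
import numpy as np

def preprocess(records):
    """Illustrative preprocessing: map gender to {0, 1}, fill missing
    numeric values with the column mean, then min-max normalize."""
    gender = np.array([0.0 if r["gender"] == "male" else 1.0 for r in records])
    followers = np.array([r.get("followers") for r in records], dtype=float)
    mean = np.nanmean(followers)                       # ignore missing values
    followers = np.where(np.isnan(followers), mean, followers)
    lo, hi = followers.min(), followers.max()
    followers = (followers - lo) / (hi - lo)           # min-max normalization
    return np.column_stack([gender, followers])

data = [
    {"gender": "male", "followers": 120},
    {"gender": "female", "followers": None},           # missing value
    {"gender": "female", "followers": 980},
]
X = preprocess(data)
```

The output is a purely numeric matrix with one row per microblog record, ready to be sliced into time windows for the network input.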

**Figure 11.** The distribution of a reposting number of the microblog data set.

Since the proposed model is a time-dependent prediction model and the LSTM network requires 3D input in the format [samples, timesteps, features], after preprocessing, the data need to be reshaped into 3D form and divided into time slices. The specific time slice division process is shown in Figure 12. We select 10, 20, 30, ..., 120 min as time slices, respectively. Finally, we divide the data set into a training set, a test set, and a validation set.
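The reshaping into [samples, timesteps, features] can be sketched with a sliding window over the chronological feature sequence; the window length, toy sequence, and the choice of the first feature column as the prediction target are illustrative:

```python
import numpy as np

def make_windows(series, timesteps):
    """Reshape a chronological feature sequence into the 3D LSTM input
    format [samples, timesteps, features] with a sliding window."""
    X, y = [], []
    for i in range(len(series) - timesteps):
        X.append(series[i:i + timesteps])    # one window of time slices
        y.append(series[i + timesteps, 0])   # next slice's reposting count
    return np.array(X), np.array(y)

# Toy data: 12 time slices (e.g., 10-min bins) with 4 features each.
seq = np.arange(48, dtype=float).reshape(12, 4)
X, y = make_windows(seq, timesteps=6)        # X: [6 samples, 6 steps, 4 feats]
```

Each sample is a run of consecutive time slices, and its target is the reposting count of the slice immediately following the window.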

**Figure 12.** The time slice division of dataset.

### *4.2. Prediction Model for Reposting Number*

4.2.1. Analysis of Some Group Features

In Section 3, we provided a detailed explanation of the group features, where the group influence feature is the sum of the users' influence values, calculated by PageRank, over the group of users reposting a microblog. The top 10 users by personal influence are shown in Table 4, and Table 5 shows the influence data of some users, where \* is used to protect user privacy.


**Table 4.** The ranking of user influence.

**Table 5.** Some data of microblog influence.


Figure 13 shows the relationship between the amount of reposting and user influence, where the horizontal and vertical axes indicate the magnitude of influence and the average amount of microblog reposting, respectively. It can be seen that, as user influence decreases, the reposting amount also decreases, indicating a positive correlation between user influence and the reposting amount of microblogs.

**Figure 13.** The relationship between the amount of reposting and user influence.

In addition, some other extracted features of bloggers and blogs are also very important. Taking the publishing time period as an example, as shown in Figure 14, the time of day at which a microblog is published also has an impact on its reposting amount.

**Figure 14.** The influence of the publishing time of microblogs.

### 4.2.2. Training Process and Parameter Selection

In the experiment, we input the extracted features into the model for training and prediction. The hyperparameters of our model, including the number of epochs and the learning rate, need to be selected and adjusted through experiments; otherwise, the performance of the model will decrease. Here, we take the epoch number and the learning rate as examples.

(1) Generally, the generalization ability of the model increases with the number of epochs. However, an excessively large number of epochs may lead to over-fitting, which in turn decreases the generalization ability of the model. Figure 15 shows the performance curve of our model under different numbers of epochs. As can be seen from the figure, when the number of epochs reaches 150, the loss of the model no longer decreases.

(2) The learning rate is another hyperparameter of our model. If the learning rate is too small, training takes too long, while a learning rate that is too large may overshoot the optimum, making the model unstable and reducing its performance. Figure 16 shows the relationship between the learning rate and the RMSE. From the figure, a learning rate of 0.1 is the most appropriate.

**Figure 15.** The relationship between Epochs and Loss.

**Figure 16.** The relationship between Learning rate and RMSE.

Finally, Table 6 shows the final values of all the hyperparameters, determined through comparative experiments, including the epoch number and the learning rate. The final learning rate is 0.1, and the size of the reposting-user comment embedding is 300. The size of the model input is 120. There are two convolutional layers, with kernel sizes of 3 × 3 and 5 × 5 and 128 and 64 kernels, respectively. In addition, the number of LSTM prediction units is 30.

**Table 6.** The parameter setting of the model.


### 4.2.3. Results

(1) Evaluation Metrics and Benchmark Methods

We use three evaluation metrics, *MAE*, *MAPE*, and *RMSE*, to measure the performance of our model, where *MAE* is used to measure the mean absolute error between the predicted value and the actual value on the data set. For a test set containing n microblog messages, the definition of *MAE* is in Equation (20):

$$MAE = \frac{1}{n} \sum_{t=1}^{n} |actual(t) - forecast(t)| \tag{20}$$

*MAPE* is used to measure the mean absolute percentage error between the predicted value and the actual value on the data set. The definition of *MAPE* is in Equation (21):

$$MAPE = \frac{1}{n} \sum_{t=1}^{n} \left|\frac{actual(t) - forecast(t)}{actual(t)}\right| \times 100\% \tag{21}$$

*RMSE* is used to measure the root mean square error between the predicted value and the actual value on the data set. The definition of *RMSE* is in Equation (22):

$$RMSE = \sqrt{\frac{\sum_{t=1}^{n} |actual(t) - forecast(t)|^2}{n}} \tag{22}$$
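The three metrics translate directly into NumPy; the test values in the sketch below are arbitrary:

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error, Eq. (20)."""
    return np.mean(np.abs(actual - forecast))

def mape(actual, forecast):
    """Mean absolute percentage error, Eq. (21)."""
    return np.mean(np.abs((actual - forecast) / actual)) * 100

def rmse(actual, forecast):
    """Root mean square error, Eq. (22)."""
    return np.sqrt(np.mean((actual - forecast) ** 2))

actual = np.array([10.0, 20.0, 40.0])
forecast = np.array([12.0, 18.0, 44.0])
```

Note that MAPE is undefined when an actual reposting count is zero, so in practice such samples must be filtered out or smoothed before evaluation.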

We compare our proposed method with several benchmark models. The benchmark models are briefly introduced as follows:

RPP is a model based on a reinforced Poisson process, which integrates three factors: the intrinsic strength of the message, a time-relaxation function describing how the message's attractiveness decays over time, and a reinforcement function modeling the preferential-attachment phenomenon in message propagation.

The model LR is a simple but efficient classification model in machine learning, which is widely used in practice.

The model S-H is a prediction model based on single-variable log-linear regression, proposed by Szabo et al.

The fundamental principle of the BP network is to modify the weights and thresholds along the direction that most rapidly reduces the objective function.

The traditional LSTM network uses static features to predict the reposting number, without considering the dynamic features generated in the process of microblog propagation.

The model MP5 combines the characteristics of decision trees and multiple linear regression; each of its leaf nodes is a linear regression model. Therefore, MP5 can be used for regression problems over continuous values.

The model T-P divides the prediction of the reposting number into two stages: first, it classifies microblogs based on their potential reposting number; then, it performs regression within each subcategory separately.

The model BCI considers two factors, namely historical behavior and content relevance, to predict the reposting number.

(2) Experiment on Real Data Set

The experiment is carried out on the real microblog data set in two parts. In the first part, 80% of the data set is used as the training set and 20% as the test set; in the second part, the split is 70%/30%. Table 7 compares the proposed model with benchmark models such as LR, S-H, and RPP on the evaluation metrics RMSE, MAPE, and MAE. With the 70% training split, the RMSE of our proposed model is 7.335, the MAPE is 23.21, and the MAE is 18.77, outperforming all benchmark models. With the 80% training split, the RMSE is 7.233, the MAPE is 22.89, and the MAE is 17.99; not only does our model outperform the benchmark models on every metric, but the results with the 80% split are also clearly better than those with the 70% split.

We also select some benchmark models at random for additional tests against our proposed method on the data set, and plot the prediction curves of the reposting number, as shown in Figure 17, where (a), (b), and (c) correspond to different reposting scales. Under all three reposting scales, our method performs better than the other benchmark models, which verifies both that our extracted group feature representations contain more comprehensive and accurate information and that our time-dependent prediction method based on group features is more effective.


**Table 7.** The results of experiment on propagation trends.

**Figure 17.** The comparison of model performance. (**a**–**c**), respectively, are different reposting scales.

### **5. Conclusions**

In this paper, we study the propagation trends of microblog events. Aiming at the inaccurate feature descriptions of traditional machine learning-based prediction methods, in Section 3 we propose a new microblog feature description and a time-dependent prediction method for propagation trends based on group features. The proposed method is evaluated through experiments on a real Sina microblog dataset; the results show that the extracted group feature representation contains more comprehensive and accurate information, and that the proposed time-dependent prediction method based on group features achieves better performance, higher accuracy, faster speed, and better robustness than traditional methods.

The method proposed in this paper still has room for improvement. In future work, we will conduct a further correlation analysis of the main factors and characteristics that affect microblog propagation trends, in order to build the prediction model from fewer group features while maintaining or improving performance. In addition, our prediction model is built on the basic LSTM model without much further improvement to the model itself, which will be a main direction of our follow-up work. Furthermore, when evaluating the final prediction of microblog propagation trends, we use traditional evaluation metrics and do not consider metrics that characterize other aspects of propagation effects, such as the depth and breadth of propagation. Therefore, we will conduct further research on establishing a more comprehensive evaluation metric system for microblog propagation trends.

**Author Contributions:** Conceptualization, Q.Z. and J.P.; methodology, Q.Z.; software, Z.Z.; validation, Z.Z. and S.J.; formal analysis, Z.Z.; investigation, S.J.; resources, S.J.; data curation, J.L.; writing—original draft preparation, Z.Z.; writing—review and editing, Q.Z.; visualization, Z.Z.; supervision, Q.Z.; project administration, J.P.; funding acquisition, Q.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported in part by the National Natural Science Foundation of China under Grant No.61702333, in part by the Opening Topic of the Key Laboratory of Embedded Systems and Service Computing of Ministry of Education under Grant ESSCKF 2019-03, and in part by the Natural Science Foundation of Shanghai under Grant No. 20ZR1455600.

**Institutional Review Board Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** This work was supported by the Key Innovation Group of Digital Humanities Resource and Research of Shanghai Normal University, and by the Research Base of Online Education for Shanghai Middle and Primary Schools, Shanghai Normal University, both funded by Shanghai Municipal Education Commission, and also by Shanghai Engineering Research Center of Intelligent Education and Bigdata, Shanghai Normal University, funded by Shanghai Municipal Science and Technology Commission.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Article* **A Study on Secret Key Rate in Wideband Rice Channel**

**Simone Del Prete \*, Franco Fuschini and Marina Barbiroli**

Department of Electrical, Electronic and Information Engineering "G. Marconi", University of Bologna, 40126 Bologna, Italy

**\*** Correspondence: simone.delprete4@unibo.it

**Abstract:** Standard cryptography is expected to poorly fit IoT applications and services, as IoT devices can hardly cope with the computational complexity often required to run encryption algorithms. In this framework, physical layer security is often claimed as an effective solution to enforce secrecy in IoT systems. It relies on wireless channel characteristics to provide a mechanism for secure communications, with or even without cryptography. Among the different possibilities, an interesting solution aims at exploiting the random-like nature of the wireless channel to let the legitimate users agree on a secret key, simultaneously limiting the eavesdropping threat thanks to the spatial decorrelation properties of the wireless channel. The actual reliability of the channel-based key generation process depends on several parameters, as the actual correlation between the channel samples gathered by the users and the noise always affecting the wireless communications. The sensitivity of the key generation process can be expressed by the secrecy key rate, which represents the maximum number of secret bits that can be achieved from each channel observation. In this work, the secrecy key rate value is computed by means of simulations carried out under different working conditions in order to investigate the impact of major channel parameters on the SKR values. In contrast to previous works, the secrecy key rate is computed under a line-of-sight wireless channel and considering different correlation levels between the legitimate users and the eavesdropper.

**Keywords:** physical layer security; Rice channels; wireless communications; 6G security

### **1. Introduction**

Modern cryptography is usually based on mathematical algorithms and can be divided into symmetric and asymmetric encryption systems. Symmetric encryption employs the same key to both encrypt and decrypt messages, while asymmetric cryptography relies on two keys: a public key to turn a plaintext into a ciphertext and a private key to retrieve the plain message. In this framework, the advent of quantum computers might be a threat to modern cryptography systems, which are usually termed only computationally secure [1]. As an example, RSA, the most popular asymmetric cryptography system, can easily be broken by Shor's algorithm if run on a quantum computer [2]. In contrast, the current symmetric encryption standard AES, in its AES-256 version, is proven to be quantum resistant [3,4]. Moreover, starting from 5G, there has been a pervasive spread of low-power devices such as IoT devices, which are usually battery powered and have limited computational capacity: the modern RSA system is too demanding to be used on such devices. Therefore, there is a need not only for a quantum-resistant set of security techniques, but also for methods that can be supported by IoT devices. In this framework, in August 2018, NIST published a call for lightweight cryptography algorithms (https://csrc.nist.gov/projects/lightweight-cryptography, accessed on 12 August 2022), showing the interest of standardization bodies in research into new lightweight cryptographic methods.

Physical layer security (PLS) is an umbrella of techniques which is hopefully able to achieve perfect secrecy by exploiting the unpredictable fading characteristics of the wireless channel [1,5]. In addition, PLS has recently been proposed as a key enabler for

**Citation:** Del Prete, S.; Fuschini, F.; Barbiroli, M. A Study on Secret Key Rate in Wideband Rice Channel. *Electronics* **2022**, *11*, 2772. https://doi.org/10.3390/ electronics11172772

Academic Editors: Tao Huang, Shihao Yan, Guanglin Zhang, Li Sun, Tsz Hon Yuen, YoHan Park and Changhoon Lee

Received: 26 July 2022 Accepted: 1 September 2022 Published: 2 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

the security of future 6G communications systems [6,7]. Among the different techniques fostered under the aegis of PLS, physical layer key generation (PLKG) seems to be a mature and promising solution to protect the confidentiality of communications [8,9], in particular when low-power devices are employed in the system (e.g., IoT devices [10]). PLKG allows two users (here referred to as Alice and Bob) to generate a symmetric encryption key simply by mutual observation of the wireless channel, which should ideally be reciprocal and random. In summary, there is considerable interest in employing PLS techniques in the future security paradigm.

Several different metrics have been considered to assess the performance of the PLKG protocol, including key randomness, key disagreement and key generation rate. In particular, the secrecy key rate (SKR) may be of special interest, as it represents the maximum number of bits that can be achieved from each channel observation without the possibility of an eavesdropper (referred to as Eve) catching them [11]. Previous studies on the PLKG have mainly focused on the feasibility of the key generation process under the following conditions:


These assumptions might not hold in a real scenario: higher frequencies (e.g., mmWave, terahertz) will be used in the future, starting from the 5G standards [12]. Therefore, wireless propagation will likely occur mostly under LoS conditions, in order to cope with the high attenuation of the high frequency bands. Moreover, due to the inherent broadcast nature of the wireless channel, it is always possible to eavesdrop on the communication and, in this specific case, try to steal some bits of the key.

In the literature, the SKR is usually reduced to the mutual information between Alice and Bob [13], i.e., neglecting the presence of Eve in the channel, who nonetheless decreases the number of bits that can be securely extracted. In [14], the authors considered the presence of the eavesdropper but assumed Gaussian channel samples, which might not hold in reality. In addition, it is often assumed that key generation occurs in a non-LoS scenario, i.e., under Rayleigh-like fading conditions [15], which is the ideal case for PLKG thanks to the high entropy of the channel. Few works in the literature have evaluated PLKG under LoS conditions; e.g., Ref. [16] computed an upper bound on the key generation capability of two users communicating under LoS conditions, but considered an eavesdropper capable of estimating the LoS component and assumed perfect channel reciprocity.

The goal of this work is to assess the performance of the PLKG through the computation of the SKR. Monte Carlo simulations have been performed under realistic, general conditions with the aim of estimating the SKR in a LoS wireless link. In addition, an eavesdropper (Eve) is assumed to be present and to see the Alice–Bob channel with a low, but non-zero, correlation: the correlation matrix of the Alice–Bob, Alice–Eve, and Bob–Eve channels is an input parameter of the simulation. Moreover, instead of the mutual information alone, the entire SKR, with its upper and lower bounds, is computed. Additionally, the channels are generated according to a realistic 3GPP channel model (as described in Section 3.2). Furthermore, reciprocity is not assumed to be perfect, and the impact of non-ideal reciprocity is taken into account by generating channels that are highly correlated between the legitimate users, but not identical. The simulations are repeated for different channel conditions: different Rice factors, signal-to-noise ratios, and delay spreads (DS).

The rest of the paper is organized as follows: the PLKG protocol is shortly introduced in Section 2. Section 3 explains the assessment simulation procedure, whereas Section 4 reports a validation of the simulation procedure under a reference Gaussian case. The results of the assessment are reported in Section 5 and finally some conclusions are drawn in Section 6.

### **2. Physical-Layer Key Generation Protocol**

The aim of the PLKG protocol is to let Alice and Bob autonomously generate a symmetric encryption key, without the possibility for Eve to steal the key. It fundamentally relies on the following general properties of the propagation channel [1]:


PLKG usually consists of four well-known stages [1]:


An important metric for the PLKG is the SKR, which was introduced by Maurer in [11]. Suppose that Alice, Bob, and Eve, respectively, acquire the channel observations *XA* = [*xA*(1), *xA*(2), ..., *xA*(*n*)], *XB* = [*xB*(1), *xB*(2), ..., *xB*(*n*)], and *XE* = [*xE*(1), *xE*(2), ..., *xE*(*n*)]; then, the SKR has lower and upper bounds expressed by [11]:

$$R(X^A, X^B \parallel X^E) \ge \max[I(X^A; X^B) - I(X^A; X^E),\ I(X^A; X^B) - I(X^B; X^E)], \tag{1}$$

$$R(X^A, X^B \parallel X^E) \le \min[I(X^A; X^B),\ I(X^A; X^B \mid X^E)], \tag{2}$$

These bounds indicate the maximum number of bits per channel observation that can be extracted without the possibility of Eve guessing them [1]. The presence of Eve is taken into consideration in this work, as information leakage to a possible eavesdropper can further limit the SKR. Furthermore, the channel is not assumed to be perfectly symmetric: the channel observations of Alice and Bob are still highly correlated, but not exactly the same. In order to compute the SKR under these general working conditions, a Monte Carlo simulation was carried out, as explained in Section 3.
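For intuition, in the special case of jointly Gaussian observations, the mutual-information terms in (1) have the closed form I = -0.5 log2(1 - rho^2) bits per observation, so the bounds can be evaluated directly from target correlations. This closed form does not hold for the gamma-distributed PSD samples considered later (hence the Monte Carlo approach), so the sketch below is only an illustrative back-of-the-envelope check; the correlation values are the simulation targets mentioned in Section 3:

```python
import math

def mi_gauss(rho):
    """Mutual information (bits/observation) between two jointly
    Gaussian variables with correlation coefficient rho."""
    return -0.5 * math.log2(1.0 - rho ** 2)

def skr_lower_bound(rho_ab, rho_ae, rho_be):
    # Bound (1): subtract each leakage term and keep the larger result.
    i_ab = mi_gauss(rho_ab)
    return max(i_ab - mi_gauss(rho_ae), i_ab - mi_gauss(rho_be))

# Strongly correlated legitimate channel, weakly correlated eavesdropper.
lower = skr_lower_bound(0.99, 0.1, 0.1)
upper = mi_gauss(0.99)   # loose upper bound from (2), ignoring conditioning
```

With these values, the leakage terms are tiny, so the lower bound sits very close to the Alice-Bob mutual information, matching the intuition that a weakly correlated Eve steals few bits.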

### **3. Materials and Methods**

The main goal of this work is to assess the value of the SKR under different channel conditions in a system where the encryption keys are generated according to the previously explained PLKG protocol. Figure 1 outlines the presence of the users in the channel, with particular emphasis on their mutual correlation, whereas a summary of the simulation parameters is reported in Table 1. The target observation is the frequency response of the channel, processed through the filterbank method [20]. Therefore, the vectors of channel observations *XA*, *XB*, *X<sup>E</sup>* consist of the outputs of the *Nf* filters applied to the power spectral density (PSD). The filters are assumed to be ideal pass-band filters, and the PSD is obtained as the squared magnitude of the FFT of the channel impulse response (CIR), which is generated according to a wideband tapped delay model [21] in which the delay spread (DS) and the Rice factor K can be tuned. Furthermore, the PSDs observed by Alice, Bob, and Eve are generated according to a mutual correlation target. This is accomplished through the Cholesky decomposition, even though it is only theoretically supported in the case of Gaussian samples. The channels are generated so as to achieve a bandwidth of 160 MHz.

**Figure 1.** General scheme of Alice, Bob and Eve on the channel: each pair of users sees the channel realizations with a different non-zero correlation.

**Table 1.** Main simulation parameters.


The SKR is computed through a Monte Carlo simulation: 5 × 10<sup>5</sup> channel realizations are generated for the same input values (DS and Rice factor K), and the SKR is computed case by case according to (1) and (2). The different simulation steps are described in the following sections, and a scheme of the procedure is sketched in Figure 2.

**Figure 2.** Diagram of the simulation.

### *3.1. Parameters*

The first block in the simulation flow chart outlined in Figure 2 refers to a parameter file listing the parameters required by each simulation snapshot. The main parameters are reported in Table 1.

### *3.2. Tapped Delay Line Model*

The wireless channel is generated according to the Tapped Delay Line "TDL-D" model described in [21]. It is a statistical channel model and consists of a set of paths with a normalized delay and power, which can be tuned to account for different propagation conditions. In particular, the channel model accounts for multipath Rice fading, i.e., the Rice factor and the DS are the tuning parameters of the model.

To generate the channel realizations, the following procedure was applied, as also described in [21]:


### *3.3. Resample*

The TDL model is then resampled in order to obtain a CIR with a continuous time axis. To this aim, a sample time is selected as the inverse of the channel bandwidth written in Table 1. Each delay of the TDL is transformed into the corresponding time sample, and the complex amplitudes of the taps falling within the same sample are coherently summed up.

### *3.4. FFT*

To obtain the channel frequency response, a simple FFT is performed on the CIR, which is zero-padded to reach "Nfft" samples (see Table 1). For the purpose of this work, the squared amplitude of the channel transfer function (CTF), often referred to as the power spectral density (PSD), is considered. Therefore, the filtering applies to the PSD.
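The FFT and filterbank steps can be sketched as follows. The two-tap CIR, FFT size, and number of ideal filters are illustrative, and averaging the PSD within each contiguous band is one simple realization of ideal pass-band filtering:

```python
import numpy as np

def psd_filterbank(cir, nfft, n_filters):
    """Zero-pad the CIR, take the FFT to get the CTF, square it to get
    the PSD, then average the PSD within n_filters ideal pass bands."""
    ctf = np.fft.fft(cir, n=nfft)           # zero-padded transfer function
    psd = np.abs(ctf) ** 2
    bands = np.array_split(psd, n_filters)  # contiguous ideal filters
    return np.array([band.mean() for band in bands])

# Toy two-tap channel impulse response.
cir = np.array([1.0 + 0.0j, 0.5 - 0.2j])
obs = psd_filterbank(cir, nfft=64, n_filters=8)  # one observation vector
```

The resulting vector plays the role of a single channel observation (one entry of *XA*, *XB*, or *XE*) in the simulation.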

### *3.5. Cholesky Decomposition*

Cholesky decomposition is a matrix decomposition procedure often employed to generate correlated Gaussian samples. Let *X* = (*x*1, *x*2, ..., *xn*) be an *n*-dimensional standard Gaussian random vector (*xi* ∼ N(0, 1)) made of uncorrelated samples: its covariance matrix is the identity matrix. A set of correlated Gaussian random variables can be obtained through the Cholesky decomposition, which factorizes a Hermitian positive-definite matrix *C* into the product of a lower triangular matrix *L* and its transpose *L<sup>T</sup>*:

$$
\overline{C} = \overline{L} \times \overline{L}^T. \tag{3}
$$

The vector *Y* = *L* × *X* will then be a Gaussian random vector with a covariance matrix equal to *C*. The proof of this is simple and follows from the computation of the covariance matrix of *Y*:

$$\begin{split} E[\overline{Y} \times \overline{Y}^T] &= E[\overline{L} \times \overline{X} \times (\overline{L} \times \overline{X})^T] = E[\overline{L} \times \overline{X} \times \overline{X}^T \times \overline{L}^T] = \\ &= \overline{L} \times E[\overline{X}\,\overline{X}^T] \times \overline{L}^T = \overline{L} \times \overline{I}_n \times \overline{L}^T = \overline{L} \times \overline{L}^T = \overline{C}. \end{split} \tag{4}$$

This method is theoretically grounded for Gaussian variables and, according to [22], it remains reliable when the variables are Gamma distributed. The Rice distribution is approximated by the Nakagami-m distribution, and Gamma variables can be obtained as the square of Nakagami-m variables. By means of the Fitter (https://pypi.org/project/fitter/, accessed on 12 August 2022) class, the PSD samples were fitted in order to empirically determine their distribution. As shown in Figure 3 and Table 2, where the Sumsquare error and the parameters (following the scipy.stats (https://docs.scipy.org/doc/scipy/tutorial/stats.html, accessed on 12 August 2022) notation) are reported for different distributions, the distribution of the PSD samples fairly complies with a Gamma distribution. Therefore, it is reasonable to assume that the PSD samples are Gamma distributed and that the Cholesky decomposition method is still reliable in this case. For instance, by setting the target correlation between Alice and Bob to 0.99 and the correlation between Alice/Bob and Eve to 0.1, the actual correlation levels computed on the channel samples after the Cholesky decomposition turned out to be equal to 0.99 and 0.09.


**Table 2.** Sumsquare error and parameters of different distributions.

**Figure 3.** Fitting of different probability density functions to the histogram of the PSD samples.

If the matrix *C* = *L* × *L T* is the desired correlation matrix and *A' i* , *B' i* , *E' <sup>i</sup>* are, respectively, Alice's, Bob's and Eve's independent *i*-th realizations of the PSD, the correlated channels (*Ai*, *Bi*, *Ei*) are obtained through a simple matrix multiplication:

$$
\begin{bmatrix} a\_{i;0} & \dots & a\_{i;M} \\ b\_{i;0} & \dots & b\_{i;M} \\ \varepsilon\_{i;0} & \dots & \varepsilon\_{i;M} \end{bmatrix} = \overline{\mathbf{L}} \times \begin{bmatrix} a'\_{i;0} & \dots & a'\_{i;M} \\ b'\_{i;0} & \dots & b'\_{i;M} \\ \varepsilon'\_{i;0} & \dots & \varepsilon'\_{i;M} \end{bmatrix} \tag{5}
$$
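The whole correlation step can be sketched as follows, using the target correlations quoted above (0.99 between Alice and Bob, 0.1 towards Eve); the rows of `X` play the role of the independent realizations on the right-hand side of (5), and all names are our own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target correlation matrix (rows/columns: Alice, Bob, Eve).
C = np.array([[1.0,  0.99, 0.1],
              [0.99, 1.0,  0.1],
              [0.1,  0.1,  1.0]])
L = np.linalg.cholesky(C)               # C = L @ L.T

X = rng.standard_normal((3, 500_000))   # independent realizations
Y = L @ X                               # correlated realizations, cov(Y) ~ C

rho_ab = np.corrcoef(Y[0], Y[1])[0, 1]  # achieved Alice-Bob correlation
rho_ae = np.corrcoef(Y[0], Y[2])[0, 1]  # achieved Alice-Eve correlation
```

With 500,000 realizations, the achieved correlations match the targets to within sampling error, mirroring the 0.99/0.09 figures quoted in the text.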

As an example, Figure 4 depicts a channel realization: for a high Alice–Bob correlation, the channels in frequency are quite similar, whereas Eve observes an essentially uncorrelated channel.

**Figure 4.** A channel realization obtained with K = 10 dB, delay spread = 30 ns, Alice–Bob correlation of 0.99, Alice–Eve and Bob–Eve correlation of 0.2.

### *3.6. AWGN*

After the channels have been correlated, white noise is added to the PSD according to the signal-to-noise ratio reported in Table 1.

### *3.7. Filtering*

The SKR is computed on the PSD after the filterbank [20] method is applied. For the purposes of this project, the filters are assumed to be ideal pass-band filters, and either one or four filters are employed. Each filter acts as a mean operator on its sub-band of the PSD (or on the entire PSD in the one-filter case), hence the output of a filter is a single number. In practice, if *P*(*f*) is the PSD, *fi* the central frequency of the *i*-th filter and Δ*f* its pass band, the output of the filter is computed as follows:

$$X\_i = \frac{1}{\Delta f} \int\_{f\_i - \Delta f/2}^{f\_i + \Delta f/2} P(f) \, df, \quad i = 1, 2, \dots, N\_f. \tag{6}$$

The filtering is also useful to reduce the dimensionality of the CTF, which benefits the mutual information estimators, as will become clear in the next paragraph. When one filter is employed, the entire 160 MHz bandwidth is used; when four filters are used, each filter has a non-overlapping bandwidth of 40 MHz.
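Under the ideal pass-band assumption, Equation (6) reduces to a per-sub-band mean on the discrete PSD; a minimal sketch (names are our own):

```python
import numpy as np

def filterbank(psd, n_filters):
    """Ideal, non-overlapping pass-band filters: each one outputs the mean of
    its sub-band of the PSD (a discrete version of Equation (6))."""
    bands = np.array_split(np.asarray(psd, dtype=float), n_filters)
    return np.array([band.mean() for band in bands])

out = filterbank([1.0, 3.0, 2.0, 4.0], n_filters=2)   # -> [2.0, 3.0]
```

With `n_filters=1`, the output is the average of the whole PSD, matching the narrow-band case described above.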

### *3.8. Estimators*

Mutual information estimators were employed to obtain the mutual information required for the computation of the SKR. In particular, the Non-Parametric Entropy Estimator Toolbox (https://github.com/gregversteeg/NPEET, accessed on 15 May 2022), a Python open-source estimator of the mutual information based on the channel sample vectors, was exploited, which also allows the mutual information of multidimensional samples to be estimated. However, this kind of estimator requires an exponentially growing number of samples as the dimensionality increases, due to the problem known as the curse of dimensionality [23]; therefore, the number of dimensions (i.e., the number of filters of the filterbank) must be kept low. For the purposes of this work, it was observed that the estimators already converge with 500,000 channel realizations.
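As an illustration of the kind of estimator involved, the following is a simple plug-in (histogram) estimator rather than the k-NN estimator used by NPEET; names are our own. It suffers from the same curse of dimensionality, which is one reason the filterbank keeps the number of dimensions low:

```python
import numpy as np

def mi_hist(x, y, bins=64):
    """Plug-in (histogram) estimate of the mutual information I(X;Y) in bits.

    A simple stand-in for k-NN estimators such as NPEET's, limited here to
    one-dimensional samples.
    """
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                       # joint probability mass
    px = pxy.sum(axis=1, keepdims=True)         # marginal of X
    py = pxy.sum(axis=0, keepdims=True)         # marginal of Y
    nz = pxy > 0                                # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```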

### **4. Gaussian Case and Validation**

A preliminary assessment was carried out in the Gaussian case, as the mutual information between Gaussian vectors can be expressed through analytical, closed-form formulas. The goal of this section is to evaluate the effectiveness of the simulator in a case where the mutual information can be expressed by an analytical closed formula. In particular, we derived the expression of the mutual information between correlated Gaussian variables and verified the correctness of the method implemented, particularly of the estimators.

Consider two Gaussian signals affected by AWGN:

$$A = s\_a + n\_a, \tag{7}$$

$$B = s\_b + n\_b, \tag{8}$$

where *sa*,*sb* ∼ N (0, 1), *na* ∼ N (0, *σa*), *nb* ∼ N (0, *σb*) and corr(*sa*,*sb*) = *η*. Since *A* and *B* are sums of zero-mean Gaussian random variables, they are both Gaussian with standard deviations *σ<sup>A</sup>* and *σB*, respectively. The mutual information between *A* and *B* can therefore be expressed as:

$$\mathbb{I}(A;B) = h(A) + h(B) - h(A,B) = \frac{1}{2} \log\_2 \left( \frac{\sigma\_A^2 \sigma\_B^2}{\sigma\_A^2 \sigma\_B^2 - \eta^2} \right),\tag{9}$$

See Appendix A for the demonstration.

### *Estimation Procedure*

In order to test the estimators, the following procedure is employed. First, independent Gaussian signals are generated, then a correlation is applied according to what has been explained in Section 3.5. After the generation, AWGN is added to the signals:

$$\overline{X\_1} \sim \mathcal{N}(0, 1), \tag{10}$$

$$\overline{X\_2} \sim \mathcal{N}(0, 1), \tag{11}$$

$$\overline{n\_a} \sim \mathcal{N}(0, \sigma\_a), \tag{12}$$

$$\overline{n\_b} \sim \mathcal{N}(0, \sigma\_b), \tag{13}$$

$$\overline{s\_a} = \overline{X\_1}, \tag{14}$$

$$\overline{s\_b} = \eta \overline{X\_1} + \sqrt{1 - \eta^2}\, \overline{X\_2}, \tag{15}$$

$$A = \overline{s\_a} + \overline{n\_a}, \tag{16}$$

$$B = \overline{s\_b} + \overline{n\_b}. \tag{17}$$

Equation (15) comes from (3) and (5) when two random vectors are considered. The evaluation is repeated for different values of the correlation *η*: after the generation, the random vectors are given to the estimators to obtain mutual information. Furthermore, *X*<sup>1</sup> and *X*<sup>2</sup> contain 500,000 samples.
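The procedure of Equations (10)–(17) and the comparison against Equation (9) can be sketched as follows; as a stand-in for the generic estimators, the sample correlation is converted to mutual information through the Gaussian closed form (variable names are our own):

```python
import numpy as np

rng = np.random.default_rng(1)
n, eta, snr = 500_000, 0.9, 10.0               # linear SNR
sigma_n = np.sqrt(1.0 / snr)                   # noise standard deviation

x1, x2 = rng.standard_normal((2, n))           # Equations (10)-(11)
s_a = x1                                       # Equation (14)
s_b = eta * x1 + np.sqrt(1 - eta**2) * x2      # Equation (15): corr(s_a, s_b) = eta
A = s_a + sigma_n * rng.standard_normal(n)     # Equation (16)
B = s_b + sigma_n * rng.standard_normal(n)     # Equation (17)

var = 1.0 + 1.0 / snr                          # sigma_A^2 = sigma_B^2 = 1 + 1/SNR
mi_theory = 0.5 * np.log2(var**2 / (var**2 - eta**2))   # Equation (9)

# Gaussian closed-form estimate from the sample correlation of A and B.
rho = np.corrcoef(A, B)[0, 1]
mi_est = -0.5 * np.log2(1.0 - rho**2)
```

With 500,000 samples, the estimated and theoretical values agree to within a few hundredths of a bit, in line with the agreement shown in Figure 5.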

Figure 5 shows the results of the comparison. In particular, the mutual information significantly drops when the correlation is different from 1. Moreover, the estimated curves correspond to the theoretical case, confirming the correct behavior of the estimators. Since the SKR is a combination of mutual information, the same agreement between the theory and the simulation is expected regarding the SKR. This also proves the correctness of the simulation procedure employed.

**Figure 5.** Comparison between the theoretical and estimated mutual information in the correlated Gaussian case, with different SNR conditions.

### **5. Results and Discussion**

Simulations aimed at evaluating the SKR under different channel conditions, i.e., for different values of the Rice factor, of the DS and of the SNR. For the sake of simplicity, the legitimate and the eavesdropped channels are assumed to share the same Rice factor and DS, and Eve is supposed to have the same correlation towards Alice and Bob indifferently.

### *5.1. SKR and the K Factor*

Simulations were run for different values of the Rice factor and of the correlation between the wireless channels, always with the same SNR of 10 dB and a DS of 30 ns. In addition, the estimation was performed both for the one-filter (narrow-band, Figure 6) and the four-filter (wide-band, Figure 7) cases. When Alice and Bob share highly correlated channel observations (0.99 in Figures 6 and 7), the SKR lower and upper bounds basically coincide: this is not surprising, as the bounds set on the SKR by (1) and (2) coincide as soon as Alice and Bob share highly correlated channel observations; further details can be found in Appendix B. Instead, when the correlation is reduced, the two curves become distinguishable. Moreover, the SKR exhibits a decreasing trend with the Rice factor: for a larger *K*, the channels are more stable and the multipath effects are reduced, thus the channel fluctuations are weaker, the overall randomness inside the channel is lower and, hence, the SKR is reduced. The reasons for this decreasing evolution of the SKR can be seen in Figure 8, which reports some PSDs for different values of the Rice factor: as K increases, the channels become flatter, resulting in a weaker entropy and, hence, a lower SKR.

Reducing the Alice–Bob correlation also impairs the SKR, as it means that the disagreements in the bit sequences harvested from the channel become more probable because of the lower reciprocity level. A further reduction in the SKR is triggered when Eve improves her correlation with respect to Alice/Bob, as she can then better infer some information about the key, thus reducing its overall secrecy. Since the SKR represents the total number of bits that can be extracted after the filterbank method, it is normal to observe higher values when four filters are employed (Figure 7).

**Figure 6.** Secrecy key rate as a function of the Rice factor K, for different values of the correlation and with 1 filter. In the legend, "ab" and "be" stand for Alice–Bob correlation and Bob–Eve correlation.

**Figure 7.** Secrecy key rate as a function of the Rice factor K, for different values of the correlation and with 4 filters. In the legend, "ab" and "be" stand for the Alice–Bob correlation and Bob–Eve correlation.

**Figure 8.** Power spectral densities for a different value of the Rice factor K.

### *5.2. SKR and SNR*

The simulations were then performed with respect to the SNR experienced by Alice and Bob, whereas the SNR of Eve was always kept at 10 dB, the DS at 30 ns and the Rice factor at 10 dB. Once again, the simulations were repeated for different values of the correlation.

Figure 9 depicts the SKR as a function of the SNR with one filter, while Figure 10 shows the situation with four filters. In line with the Gaussian case described in Section 4, the SKR increases with the SNR, as a stronger noise between Alice and Bob degrades the effective channel reciprocity, thus increasing the probability of disagreement between the keys they finally extract from the channel observations. The sensitivity to the channels' correlation highlighted in Figures 9 and 10 is of course the same as that already discussed with reference to Figures 6 and 7.

**Figure 9.** Secrecy key rate as a function of the SNR of Alice and Bob, for different values of the correlation and with 1 filter. In the legend, "ab" and "be" stand for Alice–Bob correlation and Bob–Eve correlation.

**Figure 10.** Secrecy key rate as a function of the SNR of Alice and Bob, for different values of the correlation and with 4 filters. In the legend, "ab" and "be" stand for Alice–Bob correlation and Bob–Eve correlation.

### *5.3. SKR and Delay Spread*

As a last case, the simulations were performed with both the SNR and the Rice factor fixed at 10 dB, while varying the DS of the channel. As in the previous cases, the simulations were repeated for different values of the correlation.

In the one-filter case, depicted in Figure 11, the DS does not seem to have a significant impact on the SKR. Conversely, when multiple filters are employed (Figure 12), the SKR tends to decrease with increasing DS. This trend is also in line with what has been reported in [13].

**Figure 11.** Secrecy key rate as a function of the delay spread of the channel of Alice and Bob, for different values of the correlation and with one filter. In the legend, "ab" and "be" stand for the Alice–Bob correlation and Bob–Eve correlation.

**Figure 12.** Secrecy key rate as a function of the delay spread of the channel of Alice and Bob, for different values of the correlation and with 4 filters. In the legend, "ab" and "be" stand for Alice–Bob correlation and Bob–Eve correlation.

The reason for this behavior can be understood by looking at Figure 13 and bearing in mind that the number of paths in the TDL is fixed: when the DS is low, there is a higher probability that the different paths cannot be individually resolved; therefore, they might severely interfere and create a deep null in the PSD. In contrast, when the DS is larger, the different paths are spread over a wider delay range and thus less frequently add up coherently in the PSD, which corresponds to a more oscillating PSD, but without deep fades.

**Figure 13.** A realization of power spectral density with a different delay spread.

In terms of the entropy of the channel, and hence of the mutual information between Alice and Bob, deep fades increase the randomness of the channel, translating into a higher SKR. The effect of the deep fades is somewhat mitigated in the single-filter case, since averaging the PSD over the whole signal bandwidth blunts them. Instead, when four filters are employed, the deep fades occurring at low DS create more variability in the filter outputs, introducing more entropy.

### **6. Conclusions**

In this work, a simulation framework for PLKG in the Rice channel was presented, in order to compute the SKR under different channel conditions. Moreover, the simulator is able to generate correlated wide-band channels in order to take into account the presence of an eavesdropper and the possible imperfections that lead to non-ideal channel reciprocity. The SKR was computed, showing a decreasing trend with respect to the Rice factor of the channel. Moreover, it was shown that a high correlation between the Alice and Bob channel samples is required in order to achieve a reasonable SKR. Finally, given the considered channel model, the DS has a detrimental effect on the SKR, since higher-DS situations lead to a lower SKR.

**Author Contributions:** Conceptualization, S.D.P.; methodology, S.D.P.; software, S.D.P.; validation, S.D.P., F.F. and M.B.; investigation, S.D.P.; writing—original draft preparation, S.D.P. and F.F.; writing review and editing, M.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data and software available under request.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Appendix A**

Suppose we have two signals with AWGN:

$$A = s\_a + n\_a \tag{A1}$$

$$B = s\_b + n\_b, \tag{A2}$$

where *sa*,*sb* ∼ N (0, 1), *na* ∼ N (0, *σa*) and *nb* ∼ N (0, *σb*).

If *SNR* is the signal-to-noise ratio of the two users (assumed to be the same for simplicity):

$$\sigma\_{n\_a}^2 = \frac{1}{SNR}, \tag{A3}$$

$$\sigma\_{n\_b}^2 = \frac{1}{SNR}, \tag{A4}$$

$$\sigma\_A^2 = 1 + \frac{1}{SNR}, \tag{A5}$$

$$\sigma\_B^2 = 1 + \frac{1}{SNR}. \tag{A6}$$

Now, suppose that *sa* and *sb* have a covariance cov(*sa*,*sb*) = *η*: since the variance of both *sa* and *sb* is equal to 1, then the covariance and the correlation are the same. The correlation between *A* and *B* can be computed as:

$$\text{corr}(A, B) \equiv \rho = \frac{E[(A - \mu\_A)(B - \mu\_B)]}{\sigma\_A \sigma\_B},\tag{A7}$$

but *μ<sup>A</sup>* = *μ<sup>B</sup>* = 0, since they are sums of zero-mean Gaussian random variables; therefore,

$$\begin{split} \text{corr}(A, B) \equiv \rho &= \frac{E[(A - \mu\_A)(B - \mu\_B)]}{\sigma\_A \sigma\_B} = \frac{E[AB]}{\sigma\_A \sigma\_B} = \frac{E[(s\_a + n\_a)(s\_b + n\_b)]}{\sigma\_A \sigma\_B} = \\ &= \frac{E[s\_a s\_b] + E[s\_a n\_b] + E[s\_b n\_a] + E[n\_a n\_b]}{\sigma\_A \sigma\_B} = \frac{E[s\_a s\_b]}{\sigma\_A \sigma\_B} = \\ &= \frac{\text{cov}(s\_a, s\_b)}{\sigma\_A \sigma\_B} = \frac{\eta}{\sigma\_A \sigma\_B} .\end{split} \tag{A8}$$

The mutual information of the two random variables can be rewritten as

$$\mathbb{I}(A;B) = h(A) + h(B) - h(A,B),\tag{A9}$$

where *h*(*A*), *h*(*B*) are the differential entropies of the two signals and *h*(*A*, *B*) is the joint entropy, which in the Gaussian case, can be expressed as:

$$h(A) = \frac{1}{2} \log\_2(2\pi e \sigma\_A^2),\tag{A10}$$

$$h(B) = \frac{1}{2} \log\_2(2\pi e \sigma\_B^2),\tag{A11}$$

$$\begin{split} h(A,B) &= \frac{1}{2} \log\_2 \left( (2\pi e)^2 (\sigma\_A^2 \sigma\_B^2 - \text{cov}^2(A,B)) \right) = \\ &= \frac{1}{2} \log\_2 \left( (2\pi e)^2 (\sigma\_A^2 \sigma\_B^2 - \sigma\_A^2 \sigma\_B^2 \rho^2) \right). \end{split} \tag{A12}$$

Hence, the mutual information can be written as:

$$\begin{split} \mathbb{I}(A;B) &= h(A) + h(B) - h(A,B) = \\ &= \frac{1}{2} \log\_2(2\pi e \sigma\_A^2) + \frac{1}{2} \log\_2(2\pi e \sigma\_B^2) - \frac{1}{2} \log\_2\left((2\pi e)^2 (\sigma\_A^2 \sigma\_B^2 - \sigma\_A^2 \sigma\_B^2 \rho^2)\right) = \\ &= \frac{1}{2} \log\_2\left(\frac{(2\pi e)^2 \sigma\_A^2 \sigma\_B^2}{(2\pi e)^2 (\sigma\_A^2 \sigma\_B^2 - \sigma\_A^2 \sigma\_B^2 \rho^2)}\right) = \frac{1}{2} \log\_2\left(\frac{\sigma\_A^2 \sigma\_B^2}{\sigma\_A^2 \sigma\_B^2 - \sigma\_A^2 \sigma\_B^2 \rho^2}\right) = \\ &= \frac{1}{2} \log\_2\left(\frac{1}{1 - \rho^2}\right) = \frac{1}{2} \log\_2\left(\frac{\sigma\_A^2 \sigma\_B^2}{\sigma\_A^2 \sigma\_B^2 - \eta^2}\right). \end{split} \tag{A13}$$

### **Appendix B**

Let us start from the lower bound, assuming that Alice and Bob share highly correlated channel samples and that the correlation between Alice's and Eve's channel samples is the same as that between Bob's and Eve's. Moreover, all the links share the same channel conditions in terms of SNR and Rice factor K. The lower bound of the SKR then reduces to:

$$\begin{split} S(X^{A}, X^{B} \parallel X^{E}) &\geq \max[\mathbb{I}(X^{A}; X^{B}) - \mathbb{I}(X^{A}; X^{E}), \mathbb{I}(X^{A}; X^{B}) - \mathbb{I}(X^{B}; X^{E})] \\ &= \mathbb{I}(X^{A}; X^{B}) - \mathbb{I}(X^{A}; X^{E}) \\ &= h(X^{A}) - h(X^{A}|X^{B}) - h(X^{A}) + h(X^{A}|X^{E}) \simeq h(X^{A}|X^{E}). \end{split} \tag{A14}$$

The conditional entropy *h*(*X<sup>A</sup>*|*X<sup>B</sup>*) is almost zero, since the Alice and Bob channel observations are highly correlated and, therefore, the residual uncertainty on *X<sup>A</sup>* when knowing *X<sup>B</sup>* is almost null.

In the same way as before, the upper bound can be reduced to:

$$\begin{split} S(X^{A}, X^{B} \parallel X^{E}) &\leq \min[\mathbb{I}(X^{A}; X^{B}), \mathbb{I}(X^{A}; X^{B} \mid X^{E})] = \\ &= \min[h(X^{A}) - h(X^{A} | X^{B}), h(X^{A} | X^{E}) - h(X^{A} | X^{B}, X^{E})] \\ &\simeq \min[h(X^{A}), h(X^{A} | X^{E})] = h(X^{A} | X^{E}). \end{split} \tag{A15}$$

The term *h*(*X<sup>A</sup>*|*X<sup>B</sup>*) is zero for the reasons explained before, and the term *h*(*X<sup>A</sup>*|*X<sup>B</sup>*, *X<sup>E</sup>*) is almost zero since the conditioning happens on both *X<sup>B</sup>* and *X<sup>E</sup>*, but *X<sup>B</sup>* is highly correlated with *X<sup>A</sup>* and, hence, the residual uncertainty is almost zero.

When Alice and Bob share highly correlated samples, the upper and the lower bound are thus equal and the SKR reduces to *h*(*X<sup>A</sup>*|*X<sup>E</sup>*); therefore, for a high correlation, similar values for the upper and lower bounds are expected.

### **References**


### *Article* **A Cube Attack on a Reduced-Round Sycon**

**Minjeong Cho, Hyejin Eom, Erzhena Tcydenova and Changhoon Lee \***

Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea

**\*** Correspondence: chlee@seoultech.ac.kr

**Abstract:** The cube attack was proposed at Eurocrypt 2009. The attack derives linear polynomials for specific output bits of a black-box cipher. Cube attacks target key recovery or secret-state recovery. In this paper, we present a cube attack on a 5-round and a 6-round Sycon permutation with a 320-bit state, whose rate occupies 96 bits and whose capacity is 224 bits. We found cube variables related to a superpoly involving the secret state, and within the cube variables, we recovered 32 bits of the secret state. For the 5-round Sycon, we found a cube variable and recovered a state with a total of 2<sup>192</sup> Sycon computations and 2<sup>37</sup> bits of memory. For the 6-round Sycon, we found cube variables and recovered a state with a total of 2<sup>192</sup> Sycon computations and 2<sup>70</sup> bits of memory. When using brute force, a 5-round attack requires 2<sup>224</sup> operations, whereas the cube attack proposed in this paper requires 2<sup>48</sup> offline operations and 2<sup>32</sup> online operations. A 6-round brute-force attack likewise requires 2<sup>224</sup> operations, whereas the proposed cube attack requires 2<sup>95</sup> offline operations and 2<sup>63</sup> online operations. For both attacks, the offline phase needs to be performed only once and can then be reused. To the best of our knowledge, this is the first cube attack on Sycon.

**Keywords:** sycon; cube attack; state recovery

### **1. Introduction**

Currently, wireless communication technology supports the high-speed communication of various devices, such as cellular phones, lightweight devices, and industrial sensors. In addition, wireless communication makes smart factories, smart cities, and self-driving cars possible and provides many conveniences for human beings. However, the importance of information security must be emphasized, because there are risks of exposure to cyber security threats, manipulation or leakage of data, and invasion of privacy during data transmission over wireless communication [1–3]. In wireless communication, information security may be provided through cryptographic algorithms [4,5]. However, since the security strength of cryptographic algorithms is not immutable, it must be continuously re-evaluated and reviewed for proper use.

A cube attack is the first type of attack to simultaneously exploit ideas from existing linear, algebraic, and correlation attacks [6]. A cube attack can be applied to block ciphers, stream ciphers, and MACs, and makes key recovery possible. A cube attack creates a polynomial over GF(2) using a set of variables defined as a cube, where a cube is the set of all assignments of the given variables. The polynomial is associated with the output of a cryptographic algorithm treated as a black box and is expressed with a quotient and a remainder. The goal is to find the coefficient of the quotient; the secret data are then recovered using the obtained coefficient.

At Eurocrypt 2009, a cube attack against Trivium, a stream cipher suitable for wireless communication environments, was announced [6]. Trivium is a stream cipher using an 80-bit key and 1152 initialization rounds [7]. Since then, several papers have shown that cube attacks are possible against cryptographic algorithms such as SIMON-64/96, Ascon, ACORN, MORUS, Gimli, and Keyak [8–13].

**Citation:** Cho, M.; Eom, H.; Tcydenova, E.; Lee, C. A Cube Attack on a Reduced-Round Sycon. *Electronics* **2022**, *11*, 3605. https:// doi.org/10.3390/electronics11213605

Academic Editor: Rameez Asif

Received: 24 August 2022 Accepted: 1 November 2022 Published: 4 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

This paper proposes a cube attack on Sycon. Sycon is an AEAD cipher sponge construction with a 128-bit key. Sycon was submitted to the NIST Lightweight Cryptography Competition [14] and was selected among the ciphers for the first round of this project.

In this paper, Sycon using a rate of 96 was reduced to five rounds and six rounds. Based on the low algebraic degree of Sycon, we were able to construct a cube attack with a complexity of 2<sup>192</sup> for the 5-round Sycon permutation, described in Section 4. A total of 2<sup>48</sup> operations and 2<sup>37</sup> bits of memory were required to recover the 32-bit state in the 5-round Sycon. In Section 6, we describe the use of similar algebraic properties to construct a cube attack to obtain a state recovery attack for the 6-round Sycon permutation. We recovered the same 32-bit state on the 6-round Sycon with 2<sup>95</sup> operations and 2<sup>70</sup> bits of memory. To the best of our knowledge, this paper describes the first cube attack on Sycon. The main contributions of this paper are summarized as follows:


This paper proceeds as follows. Section 3 introduces cube attacks and the Sycon cipher. Section 4 describes a cube attack on Sycon. The Results and Discussion section describes the complexity of the attacks. Conclusions are provided in Section 6 with a summary of the proposed attack and the results.

### **2. Related Work**

A cube attack was proposed on Trivium with 672 initialization rounds with 2<sup>19</sup> bit operations, 735 initialization rounds with 2<sup>30</sup> bit operations, and 767 initialization rounds with 2<sup>45</sup> bit operations. Since then, an improved attack with an MILP model on Trivium with 675/735/840/841/842 initialization rounds was proposed in 2021, and an attack on Trivium with 843 initialization rounds was proposed in 2022 [15,16].

Ascon was selected as a finalist in the NIST Lightweight AEAD Cryptography Contest in 2017, and the strongest attack at that time was a cube attack, which had a time complexity of 2<sup>97</sup> under the nonce-misuse condition in the 7-round Ascon initialization step [17]. However, for Ascon in 2022, an improved cube attack was also conducted under the condition of nonce misuse in the initialization phase [18]. The complexity of the key recovery for a full-round Ascon was 2<sup>130</sup>.

In [19], Dinur et al. presented a cube-like attack against Keccak hash function-based message authentication codes, authenticated encryption, and stream cipher. The key recovery attack was performed for up to seven rounds. Key recovery and forgery attacks were proposed for AE based on the Keccak hash function. In the case of the key recovery, the attacks were performed up to six rounds under the nonce respected condition, and the attacks were performed up to seven rounds under the nonce reused condition. A key recovery attack and keystream prediction attack were proposed for the Keccak hash function-based stream cipher, and it was shown that six rounds of key recovery and key stream prediction could perform attacks for up to nine rounds.

In [20], Salam et al. proposed a cube attack against the authenticated encryption stream cipher ACORN. This attack recovered a 128-bit key with a complexity of 2<sup>35</sup> in 477 initialization rounds of ACORN, a candidate proposed for the CAESAR competition (Competition for Authenticated Encryption: Security, Applicability, and Robustness). In addition, it was shown that a state recovery attack on full-round ACORN could be performed with a complexity of 2<sup>72.8</sup> using a linear equation associated with the initial state. In [12], Yang et al. proposed a method of measuring the algebraic degree with numerical mapping in NFSR-based ciphers and a method of finding a cube based on a greedy algorithm. It was shown that a key could be recovered with a complexity of 2<sup>127.46</sup> with cube variables using 123 variables for the 772 reduced-round ACORN.

In [21], Huang et al. introduced an efficient key recovery attack using a conditional cube attack on Keyak, a Keccak hash function-based MAC and AE algorithm. In [19], an attack on a MAC based on the 7-round Keccak hash function was proposed. With 2<sup>8</sup> times more data, the time complexity could be decreased to 2<sup>72</sup>. An attack against Keyak was feasible with a time complexity of 2<sup>74</sup> and a data complexity of 2<sup>74</sup> for eight rounds.

### **3. Preliminaries**

In this section, we briefly introduce the necessary background for this paper. Firstly, we provide the notations used in this paper. Then, we provide a brief description of Sycon and the concept of a cube attack.

### *3.1. Abbreviations and Notations*

The abbreviations used in this paper are listed in Table 1.

**Table 1.** Abbreviations.


The symbol notations used in this paper are listed in Table 2.



### *3.2. Sycon Authenticated Encryption with the Associated Data Algorithm Specification*

Sycon is an authenticated encryption with associated data (AEAD) cipher with a sponge construction and a 128-bit key [22]. AEAD is an encryption algorithm with a built-in integrity process using a secret key [23]. AEAD usually performs better than using two separate cryptographic processes with two different secret keys. Sycon provides two authenticated encryption algorithms with associated data and one hash algorithm in a sponge structure. In this section, we specify the Sycon variant whose rate is 96 bits.

Sycon consists of initialization, associated data processing, encryption/decryption, and finalization. The initialization phase loads the 128-bit key, a 128-bit nonce, and a 64-bit initialization vector into the 320-bit state variable. Then, it conducts two permutation calls, truncating the key by 64 bits and XORing it. Associated data processing is applied after the initialization phase if the associated data are not empty: it performs the permutation with the associated data (AD) and the current state as input, and it is skipped if the associated data are empty. In encryption/decryption, the encryption algorithm generates a ciphertext with the same length as the input plaintext. In this case, the size of the plaintext is a multiple of 96, and padding is performed if it is less than 96 bits. Then, the permutation is conducted to update the state. This process is repeated until all 96-bit blocks of plaintext are processed. The finalization absorbs the key back into the state via two permutation calls, and a 128-bit tag is output. The tag is the value that concatenates the contents of S2 and S3 among the state variables.

The state is XORed with a key or plaintext after permutation, as shown in Figure 1. The LSB 224 bits of the state are XORed with a domain separator. The domain separators of Sycon are as follows: 0<sup>224</sup> for initialization, 100‖0<sup>221</sup> for AD processing, 010‖0<sup>221</sup> for message processing, and 001‖0<sup>221</sup> for tag generation. If the additional data are empty, 001‖0<sup>221</sup> is replaced by 010‖0<sup>221</sup>.

**Figure 1.** Sycon -AEAD-96 (when the length of the associated data is 0).

The Sycon permutation iterates a round function on a 320-bit state, in which the first 64 or 96 bits (according to the rate) are user message bits. The round function (R) of the Sycon permutation consists of a sequence of three distinct transformations: SBox (SB), SubBlockDiffusion (SD), and AddRoundConstant (RC), i.e., *R* = *RC* ◦ *SD* ◦ *SB*. The *ρ*-round permutation, denoted by Π*ρ*, is constructed as Π*<sup>ρ</sup>* = *R* ◦ ... ◦ *R*.

The first layer is a nonlinear computation. Sycon's round function is SPN. Thus, for nonlinear computation, Sycon uses 64 S-boxes. The process of the S-boxes in the equation is as follows:


The second layer is a diffusion layer that performs linear transformation on five 64-bit sub-blocks. The diffusion layer uses the following linear transformation:

$$\begin{aligned} S0 &\leftarrow (S0 \oplus (S0 \lll 59) \oplus (S0 \lll 54)) \lll 40\\ S1 &\leftarrow (S1 \oplus (S1 \lll 55) \oplus (S1 \lll 46)) \lll 32\\ S2 &\leftarrow (S2 \oplus (S2 \lll 33) \oplus (S2 \lll 02)) \lll 16\\ S3 &\leftarrow (S3 \oplus (S3 \lll 21) \oplus (S3 \lll 42)) \lll 56\\ S4 &\leftarrow (S4 \oplus (S4 \lll 13) \oplus (S4 \lll 26)) \end{aligned} \tag{2}$$
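Equation (2) is fully specified, so the diffusion layer can be transcribed directly. Below is a sketch in Python, assuming 64-bit sub-blocks and left rotation (the function names are ours, not from the Sycon specification):

```python
MASK64 = (1 << 64) - 1

def rotl64(x: int, r: int) -> int:
    """Rotate a 64-bit word left by r bits."""
    r %= 64
    if r == 0:
        return x & MASK64
    return ((x << r) | (x >> (64 - r))) & MASK64

def sub_block_diffusion(S):
    """SubBlockDiffusion (SD): the linear layer of Equation (2),
    applied to the five 64-bit sub-blocks S0..S4."""
    S0, S1, S2, S3, S4 = S
    S0 = rotl64(S0 ^ rotl64(S0, 59) ^ rotl64(S0, 54), 40)
    S1 = rotl64(S1 ^ rotl64(S1, 55) ^ rotl64(S1, 46), 32)
    S2 = rotl64(S2 ^ rotl64(S2, 33) ^ rotl64(S2, 2), 16)
    S3 = rotl64(S3 ^ rotl64(S3, 21) ^ rotl64(S3, 42), 56)
    S4 = S4 ^ rotl64(S4, 13) ^ rotl64(S4, 26)
    return [S0, S1, S2, S3, S4]
```

Because every operation here is an XOR or a rotation, the layer is linear over F<sub>2</sub>: SD(a ⊕ b) = SD(a) ⊕ SD(b). Only the S-box layer contributes to the algebraic degree that the cube attack later exploits.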

The third layer adds the round constant. The round constants are generated by a four-bit LFSR defined by the polynomial *x*<sup>4</sup> + *x* + 1 over F<sub>2</sub>. The LFSR state is expressed as *rc* = (*rc*<sub>*i*+3</sub>, *rc*<sub>*i*+2</sub>, *rc*<sub>*i*+1</sub>, *rc*<sub>*i*</sub>), where *rc*<sub>*i*+4</sub> = *rc*<sub>*i*</sub> ⊕ *rc*<sub>*i*+1</sub>. Starting from the initial state *rc* = (0, 1, 0, 1), we generate *ρ* = 12 round constants, each state of the LFSR giving a unique constant. The four-bit LFSR state (*rc*<sub>3</sub>, *rc*<sub>2</sub>, *rc*<sub>1</sub>, *rc*<sub>0</sub>) is converted to the byte (0, 0, 0, 0, *rc*<sub>*i*+3</sub>, *rc*<sub>*i*+2</sub>, *rc*<sub>*i*+1</sub>, *rc*<sub>*i*</sub>). The round constants are given in Table 3.
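The constant generator above can be sketched in a few lines. This is our transcription of the recurrence *rc*<sub>*i*+4</sub> = *rc*<sub>*i*</sub> ⊕ *rc*<sub>*i*+1</sub> from the initial state (0, 1, 0, 1); the emitted values should be checked against Table 3:

```python
def sycon_round_constants(rounds: int = 12):
    """Round constants from the 4-bit LFSR of x^4 + x + 1 over F2.
    State (rc3, rc2, rc1, rc0) starts at (0, 1, 0, 1);
    feedback is rc_{i+4} = rc_i XOR rc_{i+1}."""
    rc = [1, 0, 1, 0]  # [rc0, rc1, rc2, rc3], i.e. (rc3, rc2, rc1, rc0) = (0, 1, 0, 1)
    consts = []
    for _ in range(rounds):
        # Each LFSR state is emitted as the byte (0, 0, 0, 0, rc3, rc2, rc1, rc0).
        consts.append(rc[3] << 3 | rc[2] << 2 | rc[1] << 1 | rc[0])
        feedback = rc[0] ^ rc[1]
        rc = rc[1:] + [feedback]  # slide the window forward
    return consts
```

Since *x*<sup>4</sup> + *x* + 1 is primitive, the LFSR has period 15, so the 12 consecutive constants are pairwise distinct.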


### *3.3. Cube Attack*

Let a cryptographic algorithm be expressed as a polynomial *f*. The inputs of the cryptographic algorithm (e.g., plaintext, initial vector, nonce, associated data) are *f*'s input parameters, and the output of the algorithm (e.g., ciphertext, tag) is the value *f* computes. If a block cipher takes as input plaintext *P*, initial vector *IV*, and key *K*, and outputs ciphertext *C*, we can express it as *f*(*P*, *IV*, *K*) = *C*, with *P* = *p*<sub>0</sub>*p*<sub>1</sub>*p*<sub>2</sub>*p*<sub>3</sub> . . . *p*<sub>*n*</sub>, *IV* = *iv*<sub>0</sub>*iv*<sub>1</sub>*iv*<sub>2</sub>*iv*<sub>3</sub> . . . *iv*<sub>*n*</sub>, *K* = *k*<sub>0</sub>*k*<sub>1</sub>*k*<sub>2</sub>*k*<sub>3</sub> . . . *k*<sub>*m*</sub>, and *C* = *c*<sub>0</sub>*c*<sub>1</sub>*c*<sub>2</sub>*c*<sub>3</sub> . . . *c*<sub>*n*</sub>, where each *p*<sub>*i*</sub>, *iv*<sub>*i*</sub>, *k*<sub>*i*</sub>, and *c*<sub>*i*</sub> is a single bit.


In a cube attack, finding good cube variables is important: as the degree of the polynomial grows, the attacker needs more linear equations for the Gaussian elimination, and the larger the cube, the more rounds the attacker can break. A cube attack first formulates the polynomial *f*. If *f* is not a dense polynomial, the degree of the superpoly *Q* depends on the chosen cube variables. For example, defining the cube term as *t* = *p*<sub>0</sub>*p*<sub>1</sub> . . . *p*<sub>*n*−1</sub>, and letting *A* be the bits multiplied with *t*, *f* can be written as *f*(*P*, *IV*, *K*) = *tQ*(*p*<sub>*n*</sub>, *A*) + *R*(*P*, *IV*, *K*), where no monomial of *R* is divisible by *t*. If *f* has degree *l* and the cube term has degree *m*, the superpoly *Q* has degree at most *l* − *m*; choosing *m* = *l* − 1 cube variables therefore makes *Q* linear.

To obtain *m* independent polynomials, the attacker analyzes where there is no multiplication between the target bits and the input bits under the attacker's control, and selects control bits that are not multiplied with the target bits. The attacker can determine where multiplications occur by analyzing the steps of the cryptographic algorithm; a typical source of multiplications is the S-box. Once proper cube variables are found from the polynomial, the number of rounds the attack can cover is determined. The cube attack then proceeds as follows:


*l* − *m* − 1 cube sum results. The attacker then performs Gaussian elimination on these to recover the target bits.

3. **Brute Force Phase** If the cube variables are not sufficient to cover all rounds or all target bits, the attacker performs a brute force attack on the remaining bits. For example, if *l* bits have been recovered and *m* bits are targeted, the attacker performs a 2<sup>*m*−*l*</sup> exhaustive search.
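The cancellation that the offline and online phases rely on can be seen in a toy example (the polynomial below is invented for illustration): XOR-summing *f* over all assignments of the cube variables kills every monomial not divisible by the cube term *t* = *x*<sub>0</sub>*x*<sub>1</sub> and leaves exactly the superpoly *Q*.

```python
def f(x0, x1, x2, k0, k1):
    """Toy polynomial over GF(2): f = x0*x1*(k0 ^ x2) ^ k1*x2 ^ x0.
    For the cube term t = x0*x1 the superpoly is Q = k0 ^ x2."""
    return (x0 & x1 & (k0 ^ x2)) ^ (k1 & x2) ^ x0

def cube_sum(k0, k1, x2):
    """XOR f over all 2^2 assignments of the cube variables (x0, x1).
    Monomials without the full factor x0*x1 appear an even number of
    times over the cube, so they cancel; only Q = k0 ^ x2 survives."""
    s = 0
    for x0 in (0, 1):
        for x1 in (0, 1):
            s ^= f(x0, x1, x2, k0, k1)
    return s
```

An attacker who can evaluate `cube_sum` with a chosen public bit `x2` obtains a linear equation in the secret bit `k0`; Gaussian elimination over several such equations recovers the targeted bits.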

### **4. State Recovery Attack on a Reduced-Round Sycon**

In this section, we focus on state recovery attacks. First, we analyze the round-reduced Sycon. Then, in the later part of the section, the cube attacks on five-round Sycon and six-round Sycon are described.

### *4.1. Idea and Scenario*

We propose an attack idea and scenario to recover the secret state of Sycon. Sycon has a 320-bit state variable, as described in Section 3. In this paper, the state variable S is expressed as a concatenation of 32-bit words as follows:

$$\mathbb{S} = S0\_0 \| S0\_1 \| S1\_0 \| S1\_1 \| S2\_0 \| S2\_1 \| S3\_0 \| S3\_1 \| S4\_0 \| S4\_1 \tag{3}$$

The polynomial of the output from the S-box has degree 2. After five rounds of the permutation are performed, the result S is a polynomial of degree 2<sup>4</sup>. Therefore, we require 2<sup>4</sup> − 1 cube variables for an attack against the 5-round Sycon. In the same way, after six rounds, the degree of S is 2<sup>5</sup>, so we need 2<sup>5</sup> − 1 cube variables. We choose bits from the part of the state that we control, and select *S*00, *S*01, and *S*10 as the cube variables. The cube attack is feasible if there is a variable that is uniquely multiplied by *S*00, *S*01, and *S*10; the variable with this characteristic in the S-box is *S*30. *S*10 is only multiplied by *S*00 and *S*30, so choosing *S*10 as the cube variable allows us to recover *S*30. With the cube variables, we obtain the linear equations as follows:

$$L\_i(\mathcal{S}) = a\_i^0 \mathcal{S} \mathbf{1}\_0^0 + a\_i^1 \mathcal{S} \mathbf{1}\_0^1 + a\_i^2 \mathcal{S} \mathbf{1}\_0^2 + \dots + a\_i^{31} \mathcal{S} \mathbf{1}\_0^{31} + c\_i, i = 0, 1, 2, \dots, 31\tag{4}$$

To construct *Li*, we obtain the result of the permutation by *M* ⊕ *C*, and we choose the message bits that construct the cube variables. Figure 2 shows the attack scenario.

**Figure 2.** State recovery attack scenario on the reduced-round Sycon.

### *4.2. Attack on the Five-Round Sycon*

### 4.2.1. Offline Phase

Sycon encrypts plaintext in 96-bit blocks, and we chose 96 bits of plaintext. The state variable S was the concatenation of the ten 32-bit words *S*00, *S*01, *S*10, *S*11, *S*20, *S*21, *S*30, *S*31, *S*40, and *S*41.

$$\mathbb{S} = S0\_0 \| S0\_1 \| S1\_0 \| S1\_1 \| S2\_0 \| S2\_1 \| S3\_0 \| S3\_1 \| S4\_0 \| S4\_1 \tag{5}$$

*P*0, *P*1, and *P*2 were the chosen plaintext values. The ciphertext words *C*0, *C*1, and *C*2 were computed by XORing the MSB 96 bits of the current state, *S*00, *S*01, and *S*10, with the plaintext *P*0, *P*1, and *P*2.

$$\begin{aligned} \mathbf{C}\_{0} &= \mathbf{S} \mathbf{0}\_{0} \oplus P\_{0} \\ \mathbf{C}\_{1} &= \mathbf{S} \mathbf{0}\_{1} \oplus P\_{1} \\ \mathbf{C}\_{2} &= \mathbf{S} \mathbf{1}\_{0} \oplus P\_{2} \end{aligned} \tag{6}$$

Since the ciphertext and the plaintext were known, we computed the XORing of the ciphertext and the plaintext to obtain *S*00, *S*01, *S*10.

$$\begin{aligned} S0\_0 &= \mathbb{C}\_0 \oplus P\_0\\ S0\_1 &= \mathbb{C}\_1 \oplus P\_1\\ S1\_0 &= \mathbb{C}\_2 \oplus P\_2 \end{aligned} \tag{7}$$
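Equations (6) and (7) amount to a one-time-pad relation on the rate, which can be sketched as follows (the word values are arbitrary examples):

```python
# The top 96 bits of the state act as a keystream on the rate: C = S xor P,
# so a known plaintext/ciphertext pair immediately yields S0_0, S0_1, S1_0.
state = [0xdeadbeef, 0x01234567, 0x89abcdef]   # S0_0, S0_1, S1_0 (example values)
pt    = [0x00000000, 0xffffffff, 0x0f0f0f0f]   # P_0, P_1, P_2

ct = [s ^ p for s, p in zip(state, pt)]        # Equation (6)
recovered = [c ^ p for c, p in zip(ct, pt)]    # Equation (7)
assert recovered == state                      # the attacker learns the MSB 96 bits
```

This is why the attack targets the hidden words: the rate words cost the attacker nothing.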

We set the cube variables in *S*10. Since the degree of the S-box was 2, the result after five rounds was a polynomial of degree 16. The attacker chose a 15-bit cube in the 5-round Sycon state variable to obtain a polynomial of degree 1, with the remaining values as constants. We recovered the 16 MSBs and the 16 LSBs of *S*30, respectively. Therefore, two cube variable sets were required, each with 2<sup>15</sup> elements. The cube variable sets were *CS*<sup>10</sup> := {0*x*0000, 0*x*0001, ..., 0*x*ffff} and *CS*<sup>10</sup> := {0*x*00000000, 0*x*00010000, ..., 0*x*ffff0000}. We assigned 0 to all variables other than the cube variable *S*10 and the variable *S*00 used in the attack. *S*10 was chosen as the cube variable because it is multiplied only by *S*00 and *S*30, as described above.

Since *S*30 interacted only with *S*00, after five rounds the output could be expressed as the cube term times a quotient polynomial in *S*00 and *S*30, plus a remainder:

$$\begin{aligned} \mathbb{S} &= S1\_0\, Q(S0\_0, S3\_0) + R(S0, S1, S2, S3, S4) \\ Q(S) &= \sum\_{S1\_0 \in C\_{S1\_0}} f(\mathbb{S}) \\ &= a^0 S1\_0^0 + a^1 S1\_0^1 + \dots + a^{31} S1\_0^{31} + c, \text{ where } a^i \in \{0, 1\} \end{aligned} \tag{8}$$

We used the process shown in Table 4 to find the coefficients of the polynomial *Q*(*S*), obtaining the following values. The results were stored in memory and used in the online phase.

$$\sum\_{(v\_1,\ldots,v\_{15})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{9}$$

*with S*00 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*0000ffff, *S*30 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*0000ffff

$$\sum\_{(v\_{16},\ldots,v\_{31})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{10}$$

*with S*00 = 0*x*00000000, 0*x*00010000, 0*x*00020000, . . . , 0*x*ffff0000, *S*30 = 0*x*00000000, 0*x*00010000, 0*x*00020000, . . . , 0*x*ffff0000

**Table 4.** Attack on the five-round Sycon: Offline phase process.

```
Offline Pseudo Code
Input : State S = S00‖S01‖S10‖S11‖S20‖S21‖S30‖S31‖S40‖S41, Cube Set CS10
Output : CubeSum CS[2^16] = {0^32, 0^32, ..., 0^32}
Algorithm :
    For i in 0 to 0x0000ffff:                  // candidate LSB half of S30
        CS[i] := 0^32
        For j in 0 to 0x0000ffff:              // candidate LSB half of S00
            For cube_value in CS10 :
                S := 0^320                     // all non-cube words fixed to 0
                S30 := i ; S00 := j ; S10 := cube_value
                S := Sycon(S)                  // five-round permutation
                CS[i] := CS[i] ⊕ S10           // accumulate the cube sum
    For i in {0x00000000, 0x00010000, ..., 0xffff0000}:   // candidate MSB half of S30
        CS[i] := 0^32
        For j in {0x00000000, 0x00010000, ..., 0xffff0000}:
            For cube_value in CS10 :
                S := 0^320
                S30 := i ; S00 := j ; S10 := cube_value
                S := Sycon(S)
                CS[i] := CS[i] ⊕ S10
    return CS
```
### 4.2.2. Online Phase

In the online phase, we implemented an oracle. The oracle allowed the choice of arbitrary plaintext words *S*01 and *S*10, while the cube variables *CS*00 were defined internally as a fixed set. The oracle computed *Q*(*S*00, *S*30) for the chosen plaintext. We computed it twice, once for each 16-bit half of *S*30. The oracle calculated the superpolys as follows:

$$\sum\_{(v\_1,\ldots,v\_{15})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{11}$$

*with S*00 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*0000ffff

$$\sum\_{(v\_{16},\ldots,v\_{31})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{12}$$

*with S*00 = 0*x*00000000, 0*x*00010000, 0*x*00020000, . . . , 0*x*ffff0000.

We explored the superpoly *Q*(*S*00, *S*30) obtained by querying the oracle in the memory space stored during the offline phase. If the same value as *Q*(*S*00, *S*30) was in the memory, we determined *S*30, because the memory stored *Q*(*S*00, *S*30) for every *S*30. The *Q*(*S*00, *S*30) that we explored in the memory was as follows:

$$Q(0x00000000, S3\_0), Q(0x00000001, S3\_0), Q(0x00000002, S3\_0), \dots, Q(0x0000ffff, S3\_0) \tag{13}$$

*Q*(0*x*00000000, *S*30), *Q*(0*x*00010000, *S*30), *Q*(0*x*00020000, *S*30), . . . , *Q*(0*x*ffff0000, *S*30) (14)
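The memory search in Equations (13) and (14) is a plain table lookup. A toy sketch follows; the stand-in superpoly values below are invented, whereas in the real attack the table holds the offline cube sums:

```python
# Offline table: superpoly value Q(S0_0, S3_0) for every candidate S3_0.
# The stand-in formula (odd multiplier mod 2^16) is injective in s3,
# so each lookup here has a unique match.
offline_table = {s3: (s3 * 0x9e37 + 1) & 0xffff for s3 in range(2**8)}

def recover_s30(observed_q):
    """Return the candidates S3_0 whose stored superpoly equals the online cube sum."""
    return [s3 for s3, q in offline_table.items() if q == observed_q]
```

In the actual attack the lookup is performed once per 16-bit half of *S*30, and a match pins down the corresponding half.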

4.2.3. Brute Force Phase

We recovered the remaining 192 bits of the state by brute force. No memory was required, but we needed 2<sup>192</sup> computations.

*4.3. Attack on the Six-Round Sycon*

### 4.3.1. Offline Phase

We chose the 32-bit plaintext word *S*01. With the chosen plaintext, we assigned the LSB 31 bits of *S*10 as the cube variables, with cube set *CS*<sup>10</sup> := {0<sup>32</sup>, . . . , 0*x*ffffffff}. The words *S*01, *S*20, *S*21, *S*31, *S*40, and *S*41 were fixed to 0<sup>32</sup>, because *S*10 is multiplied only with *S*00 and *S*30. We used the process shown in Table 5 to find the variable coefficients of the polynomial *Q*(S). The output could be rephrased as *Sycon*(S) = *S*10*Q*(*S*00, *S*30) + *R*(*S*0, *S*1, *S*2, *S*3, *S*4), and *Q*(S) was obtained as <sup>∑</sup><sub>*S*10∈*CS*10</sub> *Sycon*(S). For each *j* ∈ {0, 1}<sup>64</sup>, we computed the following:

$$\sum\_{(v\_1,\ldots,v\_{31})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{15}$$

*with S*00 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*ffffffff, *S*30 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*ffffffff.

**Table 5.** Attack on the six-round Sycon: Offline phase process.

#### **Offline Pseudo Code**

```
Input : State S = S00‖S01‖S10‖S11‖S20‖S21‖S30‖S31‖S40‖S41, Cube Set CS10
Output : CubeSum CS[2^32] = {0^32, 0^32, ..., 0^32}
Algorithm :
    For i in 0 to 0xffffffff:                  // candidate values of S30
        CS[i] := 0^32
        For j in 0 to 0xffffffff:              // candidate values of S00
            For cube_value in CS10 :
                S := 0^320                     // all non-cube words fixed to 0
                S30 := i ; S00 := j ; S10 := cube_value
                S := Sycon(S)                  // six-round permutation
                CS[i] := CS[i] ⊕ S10           // accumulate the cube sum
    return CS
```
### 4.3.2. Online Phase

We implemented an oracle in which the cube variables *CS*00 were defined internally as a fixed set. The oracle computed *Q*(*S*00, *S*30) for the chosen plaintext. The online cube sum calculated by the oracle was as follows:

$$\sum\_{(v\_1,\ldots,v\_{31})\in C\_{S1\_0}} f(S0\_0, S0\_1, S1\_0, S1\_1, S2\_0, S2\_1, S3\_0, S3\_1, S4\_0, S4\_1) = Q(S0\_0, S3\_0), \tag{16}$$

*with S*00 = 0*x*00000000, 0*x*00000001, 0*x*00000002, . . . , 0*x*ffffffff.

We explored the value *Q*(*S*00, *S*30) obtained through the oracle in the memory space stored during the offline phase. If the same value as *Q*(*S*00, *S*30) was in the memory, we determined *S*30, because the memory stored *Q*(*S*00, *S*30) for every *S*30. The *Q*(*S*00, *S*30) that we explored in the memory was as follows:

*Q*(0*x*00000000, *S*30), *Q*(0*x*00000001, *S*30), *Q*(0*x*00000002, *S*30),..., *Q*(0*x*ffffffff, *S*30) (17)

4.3.3. Brute Force Phase

We recovered the remaining 192 state bits by brute force.

### **5. Results and Discussion**

First, we analyze the complexity of the offline phase. Computing the equations for just one entry of the memory required 2<sup>15</sup> five-round Sycon computations and 16 bits of memory. We repeated the same process 2<sup>32</sup> times, once for each pair of *S*30 and *S*00. Therefore, the offline phase required 2<sup>48</sup> Sycon computations and 2<sup>37</sup> bits of memory. In the online phase, we required 2<sup>32</sup> computations to recover the target bits. The remaining 192 bits of state were recovered by brute force, requiring 2<sup>192</sup> computations. In total, the attack needed a computational complexity of 2<sup>48</sup> + 2<sup>32</sup> + 2<sup>192</sup> ≈ 2<sup>192</sup>.
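The bookkeeping for the five-round attack can be checked mechanically; the counts below mirror the text's 2<sup>15</sup>-element cubes, summed for every 16-bit candidate pair (*S*30, *S*00), once per half:

```python
from math import log2

cube_size = 2**15                 # elements per cube set
pairs = 2**16 * 2**16             # candidate (S3_0, S0_0) pairs per half
halves = 2                        # MSB half and LSB half of S3_0

offline = halves * cube_size * pairs
assert offline == 2**48           # offline five-round Sycon computations

online = 2**32                    # oracle queries to recover the target bits
brute_force = 2**192              # remaining 192 state bits

total = offline + online + brute_force
assert round(log2(total)) == 192  # the brute-force phase dominates
```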

To recover the six-round Sycon permutation state, we needed an offline phase, an online phase, and a brute-force phase. In the offline phase, we precalculated the cube sum, which required a computational complexity of 2<sup>31</sup> and 64 bits of memory per case. We repeated this process 2<sup>64</sup> times for all cases of *S*30 and *S*00. In total, the offline phase required 2<sup>95</sup> six-round Sycon computations and 2<sup>70</sup> bits of memory. In the online phase, we only needed to perform 2<sup>64</sup> six-round Sycon permutations. The remaining 192 bits of state were recovered by brute force; this phase required no memory but needed 2<sup>192</sup> computations. Thus, in total, the attack required a computational complexity of 2<sup>95</sup> + 2<sup>70</sup> + 2<sup>192</sup> ≈ 2<sup>192</sup> (Table 6).


**Table 6.** Complexity of the 224-bit secret state recovery attack.

\* This is the first state recovery attack against Sycon, so the brute-force-dominated attack complexity is the best result to date.

The AEAD cipher encrypts the data to be transmitted and creates a tag for data integrity [24]. To provide these functions, Sycon has four phases: initialization, associated data processing, encryption/decryption, and finalization. Each phase should be analyzed in different ways [25]. In future work, we can extend the cube attack to the initialization or finalization phase of Sycon to recover the secret key.

### **6. Conclusions**

In this paper, we proposed state recovery attacks against the five-round and six-round Sycon. Sycon keeps 224 bits secret and exposes 96 bits as plaintext/ciphertext, so with no further information, recovering the state would require 2<sup>224</sup> computations. With the proposed attack, 32 bits of the state word *S*30 could be recovered against the 5-round Sycon, with a time complexity of 2<sup>192</sup>. We also proposed an attack against the six-round Sycon, which recovered the same 32 bits with a time complexity of 2<sup>95</sup> and 2<sup>70</sup> bits of memory in the offline phase; the overall time complexity was 2<sup>192</sup>. This is faster than brute force over the 2<sup>224</sup> possible states by a factor of about 2<sup>32</sup>.

**Author Contributions:** Methodology, M.C.; Writing—original draft, M.C.; Formal analysis, H.E. and E.T.; Writing—review and editing, H.E. and E.T.; Supervision, C.L.; All authors read and agreed to the published version of the manuscript.

**Funding:** This work was supported as part of the Military Crypto Research Center (UD210027XD) funded by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD).

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**

