1. Introduction
The Internet of Things (IoT) has transformed modern communication systems by enabling seamless interconnectivity among billions of devices in diverse applications, including smart cities, healthcare, industrial automation, and environmental monitoring [1,2,3]. IoT enhances automation and decision-making by leveraging real-time data exchange, improving operational efficiency and user convenience [4,5]. However, despite its growing adoption, several challenges hinder its large-scale deployment and practicality. Energy consumption remains a critical issue, as most IoT devices are battery-powered and require efficient power management to extend their operational lifespan; battery replacement is often impractical in remote or massive deployments, increasing maintenance costs [6,7,8,9]. Moreover, IoT devices are typically low-resource systems with limited computational and memory capabilities, making it difficult to process and transmit large amounts of data efficiently [10,11,12]. Connectivity is another challenge, as IoT networks must support seamless communication across varying network conditions and device densities [13]. Additionally, coverage limitations in conventional wireless architectures restrict reliable data transmission, particularly in highly distributed IoT environments [14,15,16,17]. Finally, security vulnerabilities pose significant threats, as IoT systems are prone to cyberattacks, unauthorized access, and data breaches due to their decentralized nature [18,19,20,21].
To address the aforementioned challenges, cell-free massive multiple-input multiple-output (CF m-MIMO) has been proposed as a promising solution for IoT communication networks [22,23,24]. CF m-MIMO leverages a distributed network of access points (APs) connected to a centralized processing unit, collaboratively serving IoT devices without traditional cell boundaries. By eliminating inter-cell interference and distributing network resources efficiently, CF m-MIMO enhances the quality of service (QoS) by improving spectral efficiency, coverage, and user fairness [25,26,27]. The decentralized nature of CF m-MIMO significantly enhances connectivity, ensuring robust communication even in ultra-dense IoT environments. Unlike conventional cellular architectures, CF m-MIMO dynamically adapts network resources based on device locations and communication demands, providing uniform signal quality across a wide geographical area. By leveraging coordinated beamforming and joint signal processing, CF m-MIMO optimizes energy efficiency and mitigates interference, making it an ideal candidate for scalable, energy-efficient, and secure IoT communications [28,29,30].
1.1. Related Works
The rapid growth of IoT has led to significant research in CF m-MIMO to enhance network scalability, energy efficiency (EE), and security. One of the primary challenges in CF m-MIMO IoT systems is ensuring efficient power usage while maintaining high QoS. The authors in [28] analyzed a wirelessly powered IoT system using CF m-MIMO, where IoT sensors harvest energy from distributed APs during the downlink phase and use the harvested power for uplink transmission. Their optimization strategy minimized the total transmit energy while meeting signal-to-interference-plus-noise ratio (SINR) constraints, demonstrating that CF IoT significantly outperforms collocated m-MIMO and small-cell IoT in energy efficiency. Similarly, Lee et al. [29] investigated CF m-MIMO in low-power IoT networks and proposed an energy-efficient power control scheme that significantly reduces the transmission power of IoT devices while maintaining connectivity. Their work demonstrated power savings of 90% but highlighted challenges in balancing spectral efficiency (SE) and power consumption. Yan et al. [30] extended this work by developing a scalable CF m-MIMO IoT system that utilizes optimal power control strategies and neural network-based solutions to improve EE. Their results indicated multifold EE improvements, but security considerations were not explicitly addressed.
Despite advances in EE, security remains a critical challenge in CF m-MIMO IoT due to the broadcast nature of wireless transmissions. Unauthorized eavesdropping poses a severe threat to confidential communications. Zhang et al. [31] explored a non-orthogonal multiple access (NOMA)-based CF m-MIMO IoT system and derived closed-form SE and EE expressions under pilot contamination and interference conditions. However, while optimizing power control for NOMA users, their work did not incorporate physical layer security (PLS) techniques to enhance the secrecy rate. Similarly, Rao et al. [32] examined pilot contamination in CF m-MIMO IoT, an issue exacerbated by the inability to assign orthogonal pilots to a massive number of IoT devices. They proposed an optimal linear minimum mean-square-error (LMMSE) channel estimation method to mitigate interference and improve uplink throughput, indirectly improving security but without explicitly considering PLS.
Beyond EE and security, recent studies have focused on improving network reliability and localization in CF m-MIMO IoT. Wei et al. [33] proposed a fingerprint-based channel estimation and localization framework in which location awareness is leveraged to enhance channel estimation accuracy. Their framework introduced a two-phase localization approach and a pilot reassignment scheme to improve positioning accuracy and channel quality. Similarly, Lan et al. [34] introduced a reconfigurable intelligent surface (RIS)-assisted CF m-MIMO framework for IoT networks, optimizing power control, precoding, and RIS phase shifts to improve the sum rate and EE. Their work demonstrated that RIS could significantly enhance CF m-MIMO performance, although security issues were not addressed. Li et al. [35] extended this line of work by focusing on user-centric CF m-MIMO in highly dynamic IoT environments. They studied the effects of imperfect channel state information (CSI), channel aging, and non-line-of-sight (NLoS) conditions, proposing a soft handover scheme to enhance mobility support. Their findings emphasized the importance of preconfigured pilot overhead and CSI estimation in mitigating performance degradation.
Mahmoud et al. [36] investigated CF m-MIMO for indoor factory environments in industrial IoT, analyzing the effects of centralized and distributed AP cooperation on spectral efficiency. Their proposed AP selection and pilot assignment schemes reduced pilot contamination while maintaining connectivity in highly dense deployments. Ke et al. [37] studied massive access in CF m-MIMO IoT, addressing challenges in active user detection and channel estimation by proposing a structured sparsity-based generalized approximate message passing (SS-GAMP) algorithm. Their work compared cloud and edge computing paradigms, demonstrating that edge computing could reduce processing latency while maintaining similar performance levels. Yan et al. [38] investigated the optimization of CF m-MIMO for IoT by developing power control algorithms that enhance both SE and EE. Their work focused on the uplink transmission scenario, where IoT devices require optimal power allocation to maintain reliable connectivity under stringent energy constraints. The authors proposed max-min power control algorithms leveraging random matrix (RM) theory, which provided accurate SINR approximations based on large-scale fading coefficients. They also introduced a neural network (NN)-based power control algorithm for the downlink, which significantly reduced computational complexity while achieving near-optimal power allocation. Their results demonstrated that machine learning-based power control in CF m-MIMO networks enhances scalability by reducing computational overhead at the APs without degrading system performance.
Li et al. [39] explored the integration of RIS and backscatter devices (BDs) in CF m-MIMO symbiotic radio (CF-m-MIMO-SR) to improve spectral efficiency and energy efficiency in IoT systems. Their study analyzed the performance trade-offs between RIS-aided and BD-aided CF-m-MIMO-SR systems, focusing on how RIS can mitigate the double fading effect inherent in BD communications. The authors derived closed-form SE expressions for different levels of cooperation among the APs and investigated various signal cancellation schemes based on the available CSI. Their simulation results demonstrated that RIS significantly improves the SE of the backscatter link due to its ability to control reflection elements, whereas BD-aided systems require additional signal processing for enhanced direct-link performance. Their findings highlight the potential of RIS in CF-m-MIMO-SR networks but emphasize the need for optimized signal processing techniques to maximize performance across both backscatter and direct links.
1.2. Research Gaps and Motivations
While CF m-MIMO has demonstrated significant improvements in QoS and connectivity in IoT networks, existing schemes often lack sufficient EE approaches. Most CF m-MIMO frameworks enhance performance by increasing the number of antennas in IoT APs. However, this comes at the cost of higher power consumption, which is particularly concerning for battery-operated IoT devices. As the number of antennas grows, the overall energy expenditure of the network also increases, creating a trade-off between EE and communication performance. Without proper EE optimization, the large-scale adoption of CF m-MIMO for IoT remains impractical due to its growing power demands.
In addition to energy concerns, security vulnerabilities in CF m-MIMO-based IoT networks present another major challenge. The broadcast nature of wireless communications makes legitimate transmissions susceptible to eavesdropping attacks, in which malicious entities attempt to intercept confidential data. Security has, therefore, become an unavoidable issue in wireless communication systems, requiring advanced protection mechanisms. Over the past decade, PLS has gained significant attention as a complement to traditional cryptographic encryption techniques. PLS leverages signal processing and transmission design to secure data at the physical layer, reducing the dependency on computationally expensive encryption methods. Interestingly, increasing the number of antennas in CF m-MIMO networks brings more spatial diversity gain, which has the potential to enhance PLS performance by improving secure beamforming and jamming techniques. However, despite these advantages, the existing literature has not yet fully explored secure CF m-MIMO IoT architectures, leaving security concerns largely unaddressed. To comprehensively evaluate both PLS and EE in CF m-MIMO IoT networks, this paper adopts secrecy energy efficiency (SEE) [40,41] as the key performance metric. SEE is defined as the ratio of the sum secrecy rate (SR) to the total power consumption, providing a unified measure of both secure communication and energy efficiency [42,43,44,45]. Taking advantage of SEE, our objective is to develop an optimization framework that jointly improves security and EE, ensuring the feasibility of CF m-MIMO IoT deployments in various practical settings. To clearly highlight the positioning of our work within the existing literature, Table 1 presents a comparative analysis of recent studies based on different criteria for designing secure and energy-efficient CF m-MIMO IoT systems, where EE denotes optimization for energy efficiency, SE denotes enhancement of spectral efficiency through CF m-MIMO strategies or optimization, AB denotes explicit consideration of antenna scaling and its impact on energy consumption, PLS denotes the incorporation of physical layer security techniques for secure transmission, SEE denotes the use of secrecy energy efficiency as a unified performance metric balancing power and security, and TO denotes the use of adaptive, multi-objective optimization methods to balance competing goals (e.g., security vs. efficiency).
1.3. Paper Contributions and Organization
Based on the aforementioned research gaps and the importance of EE and secure communication in CF m-MIMO-enabled IoT networks, the main contributions of this paper are summarized as follows:
This work employs SEE as a key performance metric to jointly optimize both EE and SR in CF m-MIMO-based IoT networks. Unlike traditional approaches that focus on either EE, QoS, or security, our framework ensures a balanced trade-off by quantifying the impact of power consumption on secure communication.
We propose a novel hybrid deep learning (DL) model based on a convolutional neural network (CNN) and long short-term memory (LSTM) to improve SEE optimization in CF m-MIMO. The CNN is responsible for extracting spatial features from the IoT network environment, while the LSTM network captures temporal dependencies, allowing the system to adapt dynamically to changing network conditions and security threats.
To further improve the efficiency and effectiveness of the deep learning framework, we incorporate a multi-objective improved biogeography-based optimization (MOIBBO) algorithm for hyperparameter tuning. MOIBBO improves training performance by optimizing key model parameters, accelerating convergence, and enhancing both classification accuracy and model robustness for secure IoT communications.
We conduct an extensive simulation and performance evaluation comparing our proposed model with existing CF m-MIMO security and energy efficiency frameworks. The simulation results demonstrate that our approach achieves higher SEE, lower power consumption, and better security performance compared to conventional methods.
The remainder of this paper is organized as follows: Section 2 describes the system model and problem formulation for SEE maximization in CF m-MIMO-enabled IoT networks. Section 3 presents the materials and methodology of the proposed hybrid DL framework. Section 4 provides the simulation results, assessing the performance of the proposed solution. Section 5 presents the discussion and comparison of the results with benchmark approaches, and finally, Section 6 concludes the paper.
2. System Model and Problem Formulation
We consider a CF m-MIMO-based IoT network consisting of K distributed APs, each equipped with N antennas, serving I single-antenna IoT devices in the presence of J active eavesdroppers, as shown in Figure 1. All APs are linked to a centralized processing unit (CPU) and operate over the same time-frequency resources to simultaneously serve the IoT devices while mitigating interference from the eavesdroppers. The system follows a time-division duplex (TDD) protocol, where each coherence interval is divided into two phases: uplink training and downlink data transmission [46,47].
During the uplink training phase, each device and each eavesdropper transmit a pilot sequence to the APs to facilitate channel estimation. The received pilot signals at the APs are used to estimate the CSI, which is then utilized for precoding during downlink transmission. The channel fading coefficient between an AP and a device (or an eavesdropper) is modeled as the product of a large-scale fading coefficient, which incorporates both path loss and shadowing, and a small-scale Rayleigh fading term whose elements are independent and identically distributed (i.i.d.) circularly symmetric complex Gaussian random variables with zero mean and unit variance. The small-scale fading remains constant over a single coherence interval but varies independently across different coherence blocks. In contrast, large-scale fading evolves at a much slower rate and remains unchanged for multiple coherence intervals [48].
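To make the adopted channel model concrete, the short Python sketch below draws one coherence block of channel realizations under the assumptions above; the dimensions K, N, and I are illustrative values, and the variable names beta (large-scale fading) and h (small-scale Rayleigh fading) are our own rather than the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N, I = 16, 4, 8          # illustrative numbers of APs, antennas per AP, and IoT devices

# Large-scale fading (path loss + shadowing), fixed over many coherence blocks.
beta = rng.uniform(1e-3, 1e-1, size=(K, I))

def small_scale(K, N, I, rng):
    """i.i.d. CN(0, 1) small-scale Rayleigh fading, redrawn every coherence block."""
    return (rng.standard_normal((K, N, I)) + 1j * rng.standard_normal((K, N, I))) / np.sqrt(2)

# Composite channel from AP k (all N antennas) to device i for one coherence block.
h = small_scale(K, N, I, rng)
g = np.sqrt(beta)[:, None, :] * h   # shape (K, N, I)
```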
For the uplink training, we assume that all I IoT devices simultaneously transmit mutually orthogonal pilot sequences to the K APs; as noted in [49], the pilot sequences must be long enough for mutual orthogonality to hold, i.e., the pilot length (in symbols) must be at least the number of devices. Each device is assigned a unit-norm pilot sequence, and the sequences of different devices are mutually orthogonal. Since pilot sequences are publicly known, eavesdroppers may exploit them to launch pilot contamination attacks by transmitting sequences identical to those of legitimate devices. As a result, the received pilot signal at the kth AP is the superposition of the pilot transmissions of the legitimate devices and of the contaminating eavesdroppers, corrupted by noise.
In this model, the pilot transmissions of the legitimate devices and of the eavesdroppers are weighted by their respective normalized signal-to-noise ratios (SNRs), and the additive white Gaussian noise (AWGN) at each AP is modeled as a zero-mean complex Gaussian process. For channel estimation at AP k, the received pilot signal is projected onto the pilot sequence of the device of interest, yielding the projected received signal.
Applying the linear minimum mean-square-error (LMMSE) estimator [50], AP k obtains an estimate of its channel coefficient to device i from the projected pilot signal. For notational simplicity, we denote the mean-square value of this estimated channel coefficient by a separate parameter. By the properties of the LMMSE estimator, the estimation error is independent of the estimate and follows a zero-mean Gaussian distribution. The channel from AP k to eavesdropper j is estimated analogously, with its own mean-square value, and the corresponding estimation error likewise follows an independent zero-mean Gaussian distribution.
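As a rough illustration of the estimation step, the following sketch applies a scalar LMMSE filter to the pilot observation projected onto a device's pilot sequence, assuming the commonly used form in which the eavesdropper's pilot-contamination term enters the denominator; the coefficient structure is a generic placeholder and may differ from the paper's exact expression.

```python
import numpy as np

def lmmse_estimate(y_proj, beta_dev, beta_eve, rho_p, rho_e, tau):
    """Scalar LMMSE estimate of the AP-to-device channel from the projected pilot signal.

    Assumes y_proj = sqrt(tau*rho_p)*g_dev + sqrt(tau*rho_e)*g_eve + noise (unit variance),
    where the eavesdropper reuses the device's pilot (pilot contamination attack).
    """
    num = np.sqrt(tau * rho_p) * beta_dev
    den = tau * rho_p * beta_dev + tau * rho_e * beta_eve + 1.0
    g_hat = (num / den) * y_proj          # channel estimate
    gamma = num**2 / den                  # mean-square value of the estimate
    return g_hat, gamma
```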
For downlink data transmission, each AP transmits a superposition of precoded symbols intended for the IoT devices, scaled by the maximum normalized transmit signal-to-noise ratio (SNR) at each AP, i.e., the ratio of the maximum allowable transmit power per AP to the noise power. The symbol transmitted to each device has unit average energy, the power allocated by AP k to device i is governed by a power control factor, and the associated precoding (beamforming) vector is normalized based on the corresponding channel estimate. The total power used by AP k can then be expressed in terms of the transmit SNR, the power control factors, and the mean-square values of the channel estimates.
Each AP transmits confidential messages to the I IoT devices, while the J eavesdroppers attempt to intercept the legitimate downlink signals. The signal received at device i is the superposition of the transmissions of all APs plus AWGN at the device, and the signal received by eavesdropper j while attempting to decode the transmission intended for device i has an analogous form, corrupted by AWGN at the eavesdropper. Next, we formulate the SEE maximization problem under constraints on the available transmission power and the QoS requirements. SEE serves here as a unified performance metric that captures the trade-off between secure data transmission and energy consumption, which is especially critical in power-constrained IoT systems. In the context of CF m-MIMO IoT networks, SEE is defined as the ratio of the sum secrecy rate to the total power consumption across all APs and devices. A higher SEE value indicates that more confidential information is successfully transmitted per unit of energy consumed. Unlike traditional metrics that consider only energy efficiency or spectral efficiency in isolation, SEE integrates security into the resource allocation framework by incorporating the difference between the achievable rate at the legitimate receiver and the maximum rate achievable by potential eavesdroppers. This allows the system not only to optimize power usage but also to actively protect against information leakage. Therefore, optimizing SEE ensures that the network uses energy in a way that maximizes confidentiality while maintaining efficiency, a key requirement for the secure deployment of large-scale IoT environments [42,43,44,45].
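The SEE metric described above can be evaluated numerically as sketched below, where the per-device secrecy rate is taken as the positive part of the gap between the legitimate rate and the strongest eavesdropper rate; the SINR values, bandwidth, and total power are assumed inputs rather than quantities derived in this snippet.

```python
import numpy as np

def secrecy_energy_efficiency(sinr_dev, sinr_eve, p_total, bandwidth=20e6):
    """SEE = sum secrecy rate / total power consumption.

    sinr_dev : (I,) SINR of each legitimate IoT device
    sinr_eve : (J, I) SINR of eavesdropper j wiretapping device i
    p_total  : total power consumed by all APs (transmit + circuit + backhaul) [W]
    """
    rate_dev = np.log2(1.0 + np.asarray(sinr_dev))               # bits/s/Hz per device
    rate_eve = np.log2(1.0 + np.asarray(sinr_eve)).max(axis=0)   # strongest eavesdropper per device
    secrecy_rate = np.maximum(rate_dev - rate_eve, 0.0)          # per-device secrecy rate
    sum_secrecy_rate = bandwidth * secrecy_rate.sum()            # bits/s
    return sum_secrecy_rate / p_total                            # bits per Joule

# Example: three devices, two eavesdroppers, 5 W total consumption.
see = secrecy_energy_efficiency([10.0, 6.0, 8.0],
                                [[1.0, 0.5, 0.8], [0.7, 0.9, 0.4]],
                                p_total=5.0)
```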
Thus, to efficiently solve this optimization problem, we develop a hybrid DL framework that integrates CNN and LSTM networks for joint EE and security optimization. CNN is utilized for spatial feature extraction, while LSTM captures temporal dependencies, enabling more effective modeling of dynamic IoT communication patterns. Additionally, to improve the training efficiency and performance of the CNN’s fully connected layers, we employ the MOIBBO algorithm for hyperparameter optimization, given its effectiveness in handling multi-objective problems with improved convergence speed. Therefore, the primary objective of this work is to enhance the SEE of the CF m-MIMO-based IoT system through optimized power allocation while satisfying the following constraints:
First, the transmission power of each AP is limited by its maximum normalized transmit SNR, so the power control factors at each AP must satisfy the per-AP power budget. Second, to ensure the QoS requirements of each IoT device, the achievable secrecy rate must exceed a predefined threshold. Third, to limit the wiretapping capability of any eavesdropper, its achievable rate when wiretapping any device is restricted to remain below a specified bound. With these constraints, the SEE maximization problem is formulated as the maximization of SEE over the power control factors, subject to the per-AP power budget, the minimum secrecy rate requirement, and the eavesdropper rate limit.
In this work, QoS-aware communication signifies the system’s capability to sustain a satisfactory SR. Consequently, our primary goal in optimizing SEE is to enhance SR while adhering to a predefined energy constraint.
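Because the displayed formulation could not be reproduced here, the following schematic restates the SEE maximization problem in generic notation consistent with the constraints listed above; the symbols (power control factors η_{ki}, mean-square estimate values γ_{ki}, secrecy-rate threshold R_min, and eavesdropper rate cap) are illustrative placeholders rather than the paper's exact expressions.

```latex
\begin{aligned}
\max_{\{\eta_{ki}\ge 0\}} \quad & \mathrm{SEE}(\boldsymbol{\eta})
    = \frac{\sum_{i=1}^{I} R_i^{\mathrm{sec}}(\boldsymbol{\eta})}{P_{\mathrm{total}}(\boldsymbol{\eta})} \\
\text{s.t.} \quad
 & \text{(C1) per-AP power budget: } \sum_{i=1}^{I} \eta_{ki}\,\gamma_{ki} \le 1, \qquad k = 1,\dots,K, \\
 & \text{(C2) QoS: } R_i^{\mathrm{sec}}(\boldsymbol{\eta}) \ge R_{\min}, \qquad i = 1,\dots,I, \\
 & \text{(C3) wiretap cap: } R_{ji}^{\mathrm{eve}}(\boldsymbol{\eta}) \le R_{\mathrm{eve}}^{\max}, \qquad \forall j,\, i.
\end{aligned}
```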
3. Materials and Methods
In this section, we present the methodology and framework used to develop and evaluate the proposed model. The section is structured into four key subsections, each addressing a critical component of the study. Section 3.1 provides an overview of CNNs, explaining their feature extraction capabilities and role in processing spatial dependencies. Section 3.2 describes the LSTM architecture and its ability to model sequential dependencies, making it essential for handling temporal variations in IoT data. Section 3.3 introduces the novel MOIBBO algorithm, explaining its migration, mutation, and Pareto-based selection mechanisms for optimizing DL architectures. Finally, Section 3.4 presents the integration of CNN, LSTM, and MOIBBO into a unified framework, detailing the architecture, optimization strategy, and advantages of the hybrid model for solving the SEE problem in CF m-MIMO-based IoT networks.
3.1. Standard CNN
A CNN is a specialized class of DL models designed primarily for processing structured grid data, such as images. First introduced by Yann LeCun in 1989, CNNs were inspired by the organization of the animal visual cortex, where neurons in the brain respond to overlapping regions of the visual field. The most well-known early CNN architecture, LeNet-5, was developed in 1998 and demonstrated its effectiveness in handwritten digit recognition. Since then, CNNs have become the foundation of modern computer vision, driving applications such as image classification, object detection, medical imaging, and facial recognition. CNNs leverage spatial hierarchies of features, extracting relevant patterns from input images through the application of convolutional filters. Instead of relying solely on fully connected layers, CNNs utilize local connectivity and weight sharing to drastically reduce the number of parameters, making them more efficient in handling large-scale image datasets [51,52,53,54].
Each convolutional filter scans over an input matrix, capturing small localized patterns such as edges or textures, which are then combined across layers to build more complex representations. As shown in Figure 2, a standard CNN consists of multiple essential layers that progressively transform the input data into high-level feature representations. The model consists of convolutional layers, pooling layers, a flattening stage, and fully connected layers, which ultimately produce the final output. The first stage is the convolutional layer, where each filter (kernel) slides over the input image and performs element-wise multiplications followed by summation. Mathematically, this operation can be expressed as Y(i, j) = Σ_m Σ_n X(i + m, j + n) · W(m, n), where X represents the input image, W denotes the kernel weights, and Y is the resulting feature map. The extracted feature maps are then passed through an activation function, commonly the Rectified Linear Unit (ReLU), which introduces nonlinearity into the model, allowing it to learn complex patterns.
Following the convolutional operation, CNNs employ pooling layers to reduce the spatial dimensions of feature maps while preserving key information. Max pooling is the most commonly used method, selecting the maximum value from each local region, thereby achieving translational invariance and improving computational efficiency. The first pooling operation in Figure 2 demonstrates this process, in which the spatial size of the feature maps is reduced while important details are retained. After multiple convolutional and pooling layers, the flattening layer converts the extracted high-dimensional features into a one-dimensional vector. This flattened representation is then fed into fully connected (dense) layers, which perform the final decision-making. These layers resemble traditional neural networks, where each neuron is connected to every neuron in the previous layer. The final output layer typically applies a softmax function for multi-class classification tasks or a sigmoid function for binary classification. CNNs have revolutionized deep learning due to their ability to automatically learn feature hierarchies from raw data, reducing the need for manual feature engineering. Their success extends beyond computer vision to areas such as natural language processing (NLP) and speech recognition, where CNNs extract spatial representations from text and audio signals. Using convolutional operations, pooling, and deep feature extraction, CNNs have become indispensable tools for modern AI applications.
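A minimal Keras sketch of the layer sequence just described (convolution with ReLU, max pooling, flattening, and dense layers ending in a softmax output); the input shape and layer sizes are purely illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative CNN following the convolution -> pooling -> flatten -> dense pipeline.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),                       # example input grid
    layers.Conv2D(16, kernel_size=3, activation="relu"),  # local feature extraction
    layers.MaxPooling2D(pool_size=2),                     # spatial down-sampling
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),                                     # high-level features -> vector
    layers.Dense(64, activation="relu"),                  # fully connected decision layers
    layers.Dense(10, activation="softmax"),               # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```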
3.2. Standard LSTM
LSTM networks, first introduced by Hochreiter and Schmidhuber in 1997, are a specialized type of recurrent neural network (RNN) designed to address the vanishing gradient problem. Unlike traditional RNNs, which struggle to capture long-range dependencies in sequential data, LSTMs incorporate a memory cell mechanism that selectively retains and discards information over extended time steps. This key advantage makes LSTMs particularly effective for applications such as time-series forecasting, natural language processing, speech recognition, and anomaly detection in IoT systems. The LSTM architecture, as illustrated in Figure 3, consists of multiple gates and memory cells that regulate the flow of information. In Figure 3, the architecture visually represents how information propagates through the LSTM unit. The blue circles indicate element-wise operations such as multiplication and addition, while the yellow sigmoid and tanh activations control information processing. The previous hidden state and the current input are processed through weight matrices, and the updated values propagate through the memory cell, ultimately generating the new hidden state and cell state [55,56,57,58].
Unlike conventional RNNs, which rely solely on hidden states, LSTMs maintain a cell state that enables long-term storage of relevant information. This memory cell is updated through three primary gates: the forget gate, the input gate, and the output gate. Each of these components is governed by specific activation functions and weight matrices. The forget gate, defined in Equation (28), determines how much of the past cell state should be retained or discarded. It takes as input the previous hidden state and the current input, applies a sigmoid activation function, and produces an output between 0 and 1: if the value is close to 0, the past memory is mostly forgotten; if it is close to 1, the memory is preserved. The forget gate is thus computed from the previous hidden state and the current input through its weight matrices and bias terms, followed by the sigmoid activation function. The input gate and the candidate memory update, given in Equations (29) and (30), work together to decide how much new information should be stored in the memory cell. The input gate applies a sigmoid activation function, while the candidate update uses a hyperbolic tangent function to generate new candidate values; both are computed from the current input and the previous hidden state through their own weight matrices and bias terms, and their outputs are multiplied element-wise to regulate how much information is added to the cell state. The cell state update, expressed in Equations (31) and (32), combines the previous cell state with the newly computed candidate state, which is obtained by multiplying the candidate values with the input gate; the forget gate determines how much of the past information is retained, the input gate modulates the newly added information, and ⊙ denotes the element-wise multiplication operator. Finally, the output gate and the final hidden state determine the output at the current time step. The output gate, computed using Equations (33) and (34), applies a sigmoid activation function to the current input and the previous hidden state (again through dedicated weight matrices and bias terms) and controls how much of the updated cell state, passed through the tanh activation function, is emitted as the new hidden state.
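The gate computations referenced in Equations (28)–(34) follow the standard LSTM formulation; the NumPy sketch below implements one time step using the usual single-bias convention, which may differ slightly from the paper's notation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b are dicts keyed by gate: 'f', 'i', 'c', 'o'."""
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])        # forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])        # input gate
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])    # candidate memory
    c_t = f_t * c_prev + i_t * c_tilde                            # cell state update
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])        # output gate
    h_t = o_t * np.tanh(c_t)                                      # new hidden state
    return h_t, c_t
```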
3.3. MOIBBO
The biogeography-based optimization (BBO) algorithm was first introduced by Dan Simon in 2008 as an evolutionary optimization technique inspired by the natural distribution of species across different habitats [59]. The fundamental idea of BBO is derived from biogeography, which studies the migration, mutation, and selection processes that govern species distribution in various ecosystems. In an optimization context, each habitat represents a potential solution to a given problem, and the quality of a solution is measured by its habitat suitability index (HSI). The key principle of BBO is that solutions with higher HSI tend to share features with lower-HSI solutions through a migration mechanism, allowing knowledge transfer and convergence toward optimal solutions. In BBO, migration plays a crucial role in the exchange of information among candidate solutions. Each habitat has an immigration rate and an emigration rate, which determine how solutions share information. Migration rates are typically modeled as linear functions of HSI, meaning that high-HSI habitats (better solutions) have a higher emigration rate, while low-HSI habitats (worse solutions) have a higher immigration rate; the rates depend on i, the rank of the habitat in terms of suitability, and N, the total number of habitats. The migration process involves selecting high-HSI solutions as donors and low-HSI solutions as recipients, enabling the exploration of new promising regions of the search space. In addition to migration, mutation is another critical operator in BBO, ensuring diversity and preventing premature convergence. Mutation is inspired by sudden environmental changes or random genetic variations in species. The mutation probability is often inversely proportional to the HSI of a habitat, meaning that worse solutions have a higher probability of undergoing mutation. The mutation rate of a habitat is determined by the maximum mutation rate and by the ratio of the habitat's species-count probability to the highest species-count probability in the population. This mechanism introduces randomness to the optimization process, enhancing global search capability and allowing exploration beyond the solutions obtained through migration. The interaction between host and guest habitats in BBO is based on the principle that high-HSI habitats act as knowledge sources for lower-HSI habitats. When migration occurs, characteristics of a host (high-HSI habitat) replace some characteristics of a guest (low-HSI habitat), allowing gradual improvement of weaker solutions. This biogeographical interaction mechanism allows the algorithm to balance exploration (through mutation) and exploitation (through migration), leading to efficient optimization in complex search spaces.
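For reference, the sketch below computes the classical linear BBO migration rates and the rank-based mutation rate described above, assuming the maximum immigration and emigration rates are normalized to one and approximating the species-count probabilities by the normalized emigration rates; the exact scaling used in the paper may differ.

```python
import numpy as np

def bbo_rates(num_habitats, m_max=0.05):
    """Linear BBO migration rates and mutation rates for habitats ranked 1 (worst) .. N (best)."""
    rank = np.arange(1, num_habitats + 1)
    mu = rank / num_habitats              # emigration: better habitats share more
    lam = 1.0 - mu                        # immigration: worse habitats accept more
    # Species-count probability approximated as proportional to habitat quality;
    # the mutation rate is inversely related to it, so worse solutions mutate more.
    p = mu / mu.sum()
    m = m_max * (1.0 - p / p.max())
    return lam, mu, m
```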
The standard linear migration model in BBO suffers from several limitations when dealing with complex optimization landscapes. Linear migration assumes a fixed and simple relationship between immigration and emigration rates, which does not adequately capture the dynamic and nonlinear nature of real-world species migration. This rigid structure limits the adaptability of the algorithm, leading to premature convergence in multimodal search spaces and reducing its ability to explore and exploit solutions effectively. Moreover, in highly non-convex problems, a static migration model fails to maintain diversity across generations, which is crucial for avoiding stagnation in local optima. To overcome these challenges, we propose a six-stage adaptive migration model, which dynamically adjusts migration rates across different optimization phases. Instead of relying on a single static rule, the six-rule migration strategy divides the population into distinct categories based on their fitness levels and assigns a separate nonlinear migration function to each subset. This ensures that solutions at different evolutionary stages experience customized migration behavior, improving both exploration and exploitation. The new migration model is defined in Equations (38) and (39). In this nonlinear migration model, the first rule applies a polynomial-based migration rate for the best solutions (high HSI), ensuring that they contribute strongly to the population. The second rule introduces logarithmic and exponential functions, balancing the trade-off between exploration and exploitation for middle-range solutions. Finally, the third rule employs a hyperbolic tangent function, which dynamically adjusts the migration behavior of weaker solutions, ensuring gradual and stable convergence. By implementing this multi-stage migration model, the algorithm benefits from adaptive migration rates across different generations and iterations, leading to smarter and context-aware migration. This approach allows the BBO algorithm to self-regulate its migration strategy based on the evolutionary state of the population, resulting in improved diversity maintenance, convergence speed, and robustness in solving complex optimization problems [60,61,62].
In this paper, we have implemented a MOIBBO algorithm, which is an extension of the standard BBO designed to handle multi-objective optimization problems. Unlike the single-objective version, MOIBBO aims to optimize multiple conflicting objectives simultaneously using the nondominated sorting approach. This strategy organizes the population into different nondominated frontiers, allowing the algorithm to maintain a diverse set of solutions that represent various trade-offs between the objectives. Nondominated sorting ensures that solutions in the population are ranked based on Pareto dominance, where no solution in a given frontier is worse than another in all objectives. The algorithm then evolves solutions across these fronts, ensuring an efficient balance between exploration and exploitation in the search space.
The MOIBBO algorithm works by incorporating the concepts of migration and mutation, similar to standard BBO, but with modifications to account for the multi-objective nature. Each solution is represented by a vector of objective values, and the fitness is evaluated using the Pareto dominance concept, where solutions are compared based on their ability to dominate others in all objectives. The algorithm uses nondominated sorting to categorize solutions into different Pareto fronts. After sorting, the migration process happens between habitats within the same Pareto front, and solutions are updated by a nondominated migration mechanism, ensuring that the population continuously moves towards a set of diverse solutions that best approximate the Pareto front. The mutation process is also adapted to maintain diversity within and across the Pareto fronts, preventing premature convergence and ensuring that all regions of the search space are explored.
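A compact sketch of the Pareto-dominance test and the extraction of the first nondominated front that underlies this sorting step (all objectives are assumed to be minimized); the full MOIBBO sorting additionally ranks the remaining fronts iteratively.

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_front(population):
    """Return indices of nondominated solutions (the first Pareto front)."""
    front = []
    for i, sol in enumerate(population):
        if not any(dominates(other, sol) for j, other in enumerate(population) if j != i):
            front.append(i)
    return front

# Example with objective tuples (RMSE, number of layers, number of neurons):
pop = [(0.08, 10, 520), (0.12, 6, 300), (0.08, 12, 700), (0.20, 4, 150)]
print(first_front(pop))   # indices of the nondominated trade-offs
```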
3.4. Proposed MOIBBO-CNN–LSTM
The proposed MOIBBO-CNN–LSTM model, as illustrated in Figure 4, represents a hybrid DL framework optimized using the MOIBBO algorithm. The figure is divided into two main sections: the upper part depicts the optimization process carried out by MOIBBO, while the lower part shows the CNN–LSTM architecture used for processing input data. The optimization pipeline begins with parameter initialization, followed by the computation of a fitness function, after which nondominated sorting is applied to categorize solutions into different Pareto fronts. If the stopping condition is not met, migration and mutation operators refine the population, generating new candidate solutions in each iteration. Once convergence is reached, the optimal solutions are finalized.
The CNN–LSTM component in the lower section of the figure consists of convolutional layers for feature extraction, LSTM layers for sequence modeling, and fully connected layers for final prediction. The MOIBBO optimizer enhances the overall framework by simultaneously optimizing three objective functions (Equations (42)–(44)): the root mean squared error (RMSE), which ensures that predictions are accurate; the total number of CNN layers, which controls the network complexity; and the number of neurons in hidden layers, which balances computational cost and expressiveness. The nondominated sorting approach in MOIBBO ensures that the optimization process explores multiple trade-offs between these objectives, enabling an adaptive and efficient search for the best model configuration.
where N is the number of observations, the calculated (predicted) and observed values of the target parameter enter the RMSE term, L represents the total number of hidden layers in the CNN, and a binary indicator denotes the presence of a neuron in layer l.
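To illustrate how the three objectives in Equations (42)–(44) can be evaluated for a candidate architecture, the sketch below returns an (RMSE, layer count, neuron count) tuple of the kind a multi-objective optimizer such as MOIBBO, or DEAP's evolutionary tools, would minimize; the candidate encoding shown is hypothetical.

```python
import numpy as np

def fitness(candidate, y_true, y_pred):
    """Three-objective fitness for one candidate CNN-LSTM configuration.

    candidate : dict with a hypothetical encoding, e.g.
                {"layers": 10, "neurons_per_layer": [64, 64, 32]}
    Returns (RMSE, total hidden layers, total active neurons) -- all to be minimized.
    """
    rmse = float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))
    n_layers = candidate["layers"]
    n_neurons = int(sum(candidate["neurons_per_layer"]))
    return rmse, n_layers, n_neurons
```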
In the CNN–LSTM framework, input data undergo a hierarchical transformation, where CNN first extracts spatial dependencies. The convolutional layers apply filters to detect localized patterns, while the pooling layers reduce dimensionality and retain the most important features. After multiple layers of convolution and pooling, the extracted feature maps are flattened and passed into the LSTM component, where long-term dependencies in the data are learned. Unlike CNN, which only captures spatial features, LSTM processes sequential information by maintaining memory over past observations. This allows the model to handle time-dependent variations, making it particularly effective for applications where historical dependencies play a critical role, such as predictive modeling in IoT networks. The integration of MOIBBO with CNN–LSTM offers multiple advantages. Instead of relying on conventional backpropagation-based training, which often leads to overfitting or suboptimal convergence, MOIBBO optimizes the hyperparameters in a global search manner, ensuring that the CNN–LSTM model is not only accurate but also computationally efficient. The migration and mutation mechanisms in MOIBBO actively refine the network architecture, allowing it to self-adjust according to the complexity of the problem. This optimization process ensures that the resulting model does not suffer from unnecessary complexity while still maintaining high predictive accuracy.
For our SEE optimization in CF-mMIMO-based IoT networks, the proposed model is particularly well-suited. IoT networks require a balance between EE and security, which is often challenging due to power constraints and dynamic communication environments. The CNN–LSTM framework provides a data-driven approach to optimize SEE, where CNN identifies spatial characteristics of network signals, and LSTM captures temporal variations, making the model highly effective in understanding and predicting network behavior. MOIBBO further enhances this by optimizing the structure of the DL model, ensuring that the trade-off between computational efficiency, accuracy, and model complexity is properly managed. By leveraging multi-objective Pareto optimization, the proposed model does not settle for a single solution but instead provides a set of optimal solutions, allowing decision-makers to select the best configuration based on real-world constraints. This adaptability is crucial in wireless networks, where environmental factors constantly change, and a fixed model may not be optimal for different scenarios. The ability of MOIBBO to dynamically tune the model structure ensures that it remains robust and efficient across different IoT deployment conditions.
The flow of data through the model further highlights its effectiveness. Raw sensor data from IoT devices enter the CNN, where initial transformations occur, removing noise and irrelevant features while preserving essential spatial structures. These refined features then pass through the LSTM layers, where short-term and long-term temporal correlations are modeled. The final layers of the network map the learned features to specific outputs, predicting optimal network configurations that maximize SEE. By consistently optimizing this process using MOIBBO, the model can adapt to different network conditions, ensuring that both energy efficiency and security requirements are met. Overall, the proposed MOIBBO-CNN–LSTM model presents a highly adaptive and efficient solution for optimizing SEE in CF-mMIMO-based IoT networks. It combines the CNN's capability to extract spatial patterns, the LSTM's ability to learn sequential dependencies, and MOIBBO's strength in global optimization to provide a powerful framework that balances precision, efficiency, and robustness. The integration of multi-objective optimization ensures that no single performance metric is prioritized at the expense of others, making the model flexible for real-world deployment, where trade-offs between energy consumption, security, and computational constraints must be carefully managed.
4. Results
This section presents the simulation results of the proposed SEE maximization (SEEM) framework, detailing the system setup, implementation framework, evaluation metrics, and comparative analysis with other state-of-the-art models, and evaluating its SEE performance against different optimization strategies within a multi-device, multi-eavesdropper CF m-MIMO IoT network. The simulation environment assumes a distributed deployment of all nodes within a fixed square area, where each node is randomly positioned within this region. To visually represent this deployment, we illustrate the topology of the simulated system, in which the APs, IoT devices, and eavesdroppers are randomly placed, as depicted in Figure 5. For consistency, our simulations adopt the same large-scale fading coefficients and noise power values as those outlined in [48]. The pilot transmission powers of the IoT devices and the eavesdroppers are fixed, with the respective normalized SNRs computed as the ratio of pilot power to noise power. Furthermore, the internal energy consumption of each antenna is assigned a fixed value, the backhaul system incurs a fixed power consumption per AP, and a traffic-dependent backhaul energy term (in W/(Gbit/s)) accounts for dynamic data transmission rates [63]. The system operates with a fixed bandwidth and utilizes orthogonal pilot sequences for CSI estimation, and the power amplifier efficiency per AP is fixed. Finally, for iterative optimization-based evaluations, a maximum number of iterations is configured to ensure algorithmic convergence.
The implementation was carried out in Python 3.8, leveraging TensorFlow and Keras for the DL components, ensuring efficient training and inference of the CNN–LSTM architecture. The multi-objective optimization process was conducted using the Distributed Evolutionary Algorithms in Python (DEAP) library, which provides robust evolutionary computation techniques tailored for multi-objective problems. To ensure computational consistency and scalability, all experiments were executed on a high-performance computing cluster equipped with an Intel Xeon processor and 128 GB of RAM, providing an optimized environment for deep learning and optimization-based simulations. To assess the performance of the proposed MOIBBO-CNN–LSTM model, a comparative analysis was conducted against five widely recognized benchmark models in the field of DL and optimization-driven frameworks. The first comparative model is NSGA-II-CNN–LSTM, which integrates the nondominated sorting genetic algorithm-II (NSGA-II) with a CNN–LSTM architecture. This model was chosen because NSGA-II is one of the most well-established multi-objective evolutionary algorithms, enabling an alternative approach to optimizing deep neural network (DNN) architectures. Another model used in the comparison is the vision transformer (ViT), a DL architecture based on the self-attention mechanism, which has demonstrated remarkable success in capturing complex spatial and sequential dependencies. Given its ability to extract spatial and temporal features, ViT serves as a strong competitor to hybrid CNN–LSTM architectures.
Additionally, a deep reinforcement learning (DRL) model was included as a reference, since reinforcement learning techniques are commonly used for energy efficiency and security optimization in IoT networks. DRL-based models dynamically learn optimal policies over time, making them valuable in real-time IoT environments. Furthermore, to evaluate the contribution of the hybrid CNN–LSTM integration, two individual deep learning models, CNN and LSTM, were used in the comparison. The standalone CNN model processes spatial features extracted from raw data without considering temporal dependencies, while the LSTM model captures sequential correlations in data without leveraging convolutional feature extraction. By including both CNN and LSTM individually, the comparative analysis demonstrates the advantages of their combined usage, particularly in scenarios where both spatial and temporal dependencies influence decision-making. This selection of benchmark models provides a comprehensive evaluation, ensuring that the proposed approach is assessed not only against evolutionary optimization methods but also against standalone and alternative deep learning frameworks.
To evaluate the effectiveness of the proposed MOIBBO-CNN–LSTM framework, a synthetic dataset was constructed by simulating a wide range of CF-mMIMO-based IoT network scenarios. Each scenario was generated by randomly initializing the key system parameters, including the number of APs, the number of IoT devices and eavesdroppers, the channel fading coefficients, the SNRs, and the transmission power levels. Based on these randomly generated parameters, the theoretical values of SEE were calculated using Equation (22), which encapsulates both the physical layer security and the energy efficiency of the system. The resulting dataset consisted of input–output pairs where the input captures the network configuration and the output corresponds to the calculated SEE. The generated dataset was used to train and validate the hybrid CNN–LSTM model, allowing it to learn the nonlinear mapping between network parameters and the corresponding SEE values.
To quantify the effectiveness of the models, five key evaluation metrics were employed, ensuring a rigorous and well-rounded assessment. The first metric is the root mean squared error (RMSE), which measures the square root of the average squared difference between predicted and observed values. RMSE is particularly relevant for assessing prediction accuracy, as it penalizes larger errors more heavily, making it ideal for evaluating the ability of the CNN–LSTM model to accurately predict SEE in CF-mMIMO-based IoT networks. A lower RMSE value indicates that the model provides more precise estimations, reducing deviations from actual values. The second metric is the mean absolute percentage error (MAPE), which quantifies the relative prediction error in percentage form, as expressed in Equation (44). This metric is particularly useful for assessing generalization across different data scales, as it provides an interpretable measure of accuracy in real-world scenarios. Given the variability in IoT network parameters, MAPE ensures that the model remains effective across diverse conditions.
In addition to RMSE and MAPE, the coefficient of determination (R²) was used as a statistical measure to evaluate the strength of the correlation between predicted and actual values. This metric, computed using Equation (45), determines how well the predictions align with real data trends. A higher R² score, ideally close to 1, suggests that the model captures the variance in the dataset effectively, indicating strong predictive capabilities. This is particularly important in IoT applications, where understanding the underlying relationships between security, energy efficiency, and network parameters is critical for optimizing performance.
where N is the number of observations, the calculated and observed values of the parameter are compared against their respective average values, and the standard deviations of the predictions and of the observations normalize the correlation term.
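The three accuracy metrics can be computed as follows; this is a direct NumPy restatement of the definitions above, using the squared-Pearson-correlation form of R² suggested by the listed quantities (means and standard deviations of the calculated and observed values).

```python
import numpy as np

def evaluate(y_obs, y_calc):
    """RMSE, MAPE (%), and coefficient of determination R^2 (squared Pearson correlation)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_calc = np.asarray(y_calc, dtype=float)
    rmse = np.sqrt(np.mean((y_calc - y_obs) ** 2))
    mape = 100.0 * np.mean(np.abs((y_obs - y_calc) / y_obs))   # assumes no zero observations
    r = np.mean((y_obs - y_obs.mean()) * (y_calc - y_calc.mean())) / (y_obs.std() * y_calc.std())
    return rmse, mape, r ** 2
```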
Beyond accuracy-based metrics, the evaluation also considers the convergence trend of the models. The rate at which the model reaches its optimal performance is a crucial factor in determining its efficiency. A model that converges quickly requires fewer computational resources and can adapt to dynamic environments in real time, which is essential for IoT applications where rapid decision-making is required. The convergence behavior of each model is analyzed across multiple generations of optimization, ensuring that the proposed method is not only accurate but also computationally feasible for real-world deployment. Another critical factor in the evaluation process is execution time, which reflects the total computational cost of training and inference. Since IoT networks operate under strict latency and power constraints, it is necessary to ensure that the proposed model remains computationally efficient while delivering high performance. A model that requires excessive training time may not be practical for real-time network management, particularly in large-scale CF-mMIMO systems where quick adaptation is essential. Therefore, execution time is carefully monitored to assess the trade-off between performance and computational feasibility. These evaluation metrics were carefully selected to align with the core objective of the study, which is to enhance SEE in CF-mMIMO-based IoT networks.
Hyperparameter calibration is a critical aspect of DL and optimization-based models, as it directly influences the performance, convergence speed, and generalization ability of an algorithm. In complex architectures such as CNN–LSTM and transformer-based models, selecting the appropriate hyperparameters ensures that the model can efficiently extract spatial and temporal patterns while maintaining computational efficiency. Poorly tuned hyperparameters may lead to underfitting or overfitting, where the model either fails to learn meaningful representations or memorizes training data without generalizing well to new inputs. In optimization-driven frameworks such as MOIBBO and NSGA-II, hyperparameter calibration affects the balance between exploration and exploitation, determining how efficiently the search space is navigated for optimal solutions. Several approaches exist for hyperparameter tuning, with Grid Search, Random Search, and Bayesian optimization being the most commonly used techniques. Grid Search is a systematic method that exhaustively tests all possible combinations of hyperparameters within a predefined range, ensuring that the optimal set of parameters is identified. This method is highly effective for structured search spaces but becomes computationally expensive as the number of hyperparameters increases.
Table 2 provides the optimized hyperparameter configurations for the proposed MOIBBO-CNN–LSTM model and the baseline comparison methods, including ViT, DRL, CNN, LSTM, and NSGA-II, fine-tuned using the grid search approach to ensure optimal performance. For MOIBBO-CNN–LSTM, key hyperparameters such as the learning rate (0.003), dropout rate (0.2), batch size (64), number of convolutional layers (10), and kernel size (5 × 5) were fine-tuned to maximize performance. Additionally, the mutation rate (0.06), population size (100), and iteration limit (300) of the MOIBBO optimizer were calibrated to balance search efficiency and convergence speed. The ViT model was configured with six attention heads, ten transformer layers, and a GELU activation function, which are suitable settings for handling complex spatial and temporal dependencies. For DRL, hyperparameters such as the ε-greedy exploration rate (ranging from 0.19 to 0.89), discount factor (0.91), and batch size (64) were tuned to ensure an optimal balance between learning and exploration. The NSGA-II-based model was configured with a combination (crossover) probability of 0.94 and a mutation probability of 0.08, with a population size of 100 and 300 iterations, ensuring robust performance in multi-objective optimization. The results of this calibration process demonstrate the effectiveness of grid search in identifying optimal configurations, as each model was fine-tuned to achieve maximum predictive accuracy, computational efficiency, and stability. The hyperparameter tuning process significantly improved the convergence behavior and final performance of the models, ensuring that they operate at peak efficiency when applied to real-world IoT network optimization.
Figure 6 illustrates the convergence behavior of all evaluated models by plotting the RMSE values against training epochs, providing insight into their learning efficiency and optimization stability. The curves represent how each algorithm improves over time, with a steeper decline indicating faster convergence and better learning performance. The MOIBBO-CNN–LSTM model (blue line) demonstrates the most rapid and stable convergence, reaching near-optimal RMSE values within the first 100 epochs and continuing to refine its accuracy until it achieves the lowest RMSE by epoch 300. In contrast, the other models exhibit significantly slower convergence rates and higher final error values, emphasizing the advantage of the proposed multi-objective optimization strategy. The NSGA-II-CNN–LSTM model (green line) shows a moderate convergence rate, steadily decreasing its RMSE over training iterations but ultimately stabilizing at a higher error level compared to MOIBBO-CNN–LSTM. This suggests that while evolutionary optimization aids the CNN–LSTM structure, it lacks the adaptability and efficiency of MOIBBO’s dynamic search mechanisms. The ViT and DRL models demonstrate a slower learning process, requiring significantly more epochs to reduce their error rates. Their final RMSE values remain considerably higher, indicating that while these architectures capture useful patterns, they struggle to generalize SEE trade-offs as effectively as the hybrid CNN–LSTM framework. The standalone CNN and LSTM models exhibit the weakest convergence performance, with high initial RMSE values and slow reduction over epochs. Even after 300 epochs, their RMSE values remain significantly higher than other approaches, confirming that neither CNN nor LSTM alone is sufficient for handling the complexities of SEE optimization in CF-mMIMO-based IoT networks. These results reinforce that the MOIBBO-CNN–LSTM model achieves superior learning efficiency and accuracy through its integrated optimization approach, ensuring rapid convergence and minimal prediction error compared to alternative state-of-the-art methods.
Figure 7 illustrates the evolution of SEE over training epochs for the different algorithms, showcasing their optimization efficiency. The performance trends provide insight into the trade-off between convergence speed and final SEE. MOIBBO-CNN–LSTM achieves the highest SEE, improving steadily from its initial value to its final value at epoch 200. In particular, it converges quickly, reaching near-optimal SEE (12 Mb/J) by epoch 80. This rapid improvement highlights its efficient learning capacity and superior optimization strategy. NSGA-II-CNN–LSTM exhibits a similar trend, although with slightly lower performance, stabilizing around 12 Mb/J at epoch 90. This suggests that NSGA-II-CNN–LSTM provides competitive energy efficiency, making it a viable alternative when trade-offs between accuracy and computational complexity are considered. ViT and DRL demonstrate moderate SEE improvements, stabilizing at approximately 10 Mb/J and 9 Mb/J, respectively. These methods exhibit a more gradual learning curve and require more training epochs to achieve satisfactory performance. CNN and LSTM, on the other hand, show the slowest progression. LSTM starts with the lowest SEE and stabilizes later than all other models, at around epoch 140, before reaching its final value at epoch 200. CNN follows a similar trajectory but with slightly better SEE convergence. The delayed improvement of these models highlights their limited optimization efficiency for security-aware energy management.
Figure 8 illustrates the impact of the maximum AP transmit power on the SEE performance of the different algorithms. The results demonstrate a general increasing trend in SEE as the transmit power rises, indicating that higher transmission power enhances overall energy efficiency. However, the rate of improvement varies across the algorithms. For lower power levels (i.e., 50–200 mW), the SEE of all models exhibits logarithmic-like growth, increasing rapidly at first before stabilizing. In this range, the proposed MOIBBO-CNN–LSTM and NSGA-II-CNN–LSTM achieve the highest SEE, reaching approximately 8.0 Mb/J and 7.5 Mb/J at 200 mW, respectively. The ViT and DRL models also show improvements, though with relatively lower SEE values. CNN and LSTM demonstrate the worst performance, with LSTM only reaching 5.5 Mb/J at 200 mW. As the maximum transmit power increases beyond 200 mW (up to 500 mW), the growth rate of SEE slows down across all models, following a diminishing-return effect. This indicates that, beyond a certain power threshold, increasing the transmit power does not significantly improve energy efficiency. The proposed MOIBBO-CNN–LSTM consistently maintains the highest SEE, reaching 9.5 Mb/J at the highest power level considered (500 mW), demonstrating its superior energy efficiency. The NSGA-II-CNN–LSTM model remains competitive with an SEE of approximately 9.0 Mb/J, while ViT and DRL follow closely behind. The LSTM model, however, struggles to exceed 7.0 Mb/J even at higher transmit powers.
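The diminishing-return behavior described above can be illustrated with a toy SEE model in which the achievable sum rate grows logarithmically with transmit power while the total consumed power grows linearly. The bandwidth, channel-gain, and fixed-power values below are illustrative assumptions, not the system parameters of this paper.

```python
import numpy as np

def see_vs_power(p_max_mw, bandwidth_hz=5e6, gain_per_mw=0.05, p_fixed_w=2.0):
    """Illustrative SEE (Mb/J) versus maximum AP transmit power (mW).

    Sum rate ~ B*log2(1 + g*p) saturates with power, while consumed power
    (transmit plus fixed circuit/backhaul) grows linearly, so SEE rises
    quickly at low power and then flattens. All parameters are placeholders.
    """
    sum_rate_bps = bandwidth_hz * np.log2(1.0 + gain_per_mw * p_max_mw)
    total_power_w = p_max_mw / 1e3 + p_fixed_w
    return sum_rate_bps / total_power_w / 1e6       # (bit/s)/W -> Mb/J

for p in (50, 100, 200, 350, 500):
    print(f"p_max = {p:3d} mW -> SEE ~ {see_vs_power(p):.2f} Mb/J")
```

Running this toy model shows SEE climbing steeply between 50 mW and 200 mW and then gaining little beyond 350 mW, mirroring the qualitative trend in Figure 8.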
The results depicted in Figure 9 illustrate a downward trend in SEE across all evaluated methods as the number of APs increases. This behavior suggests that while adding more APs enhances SR, the associated power consumption grows at a relatively higher rate, ultimately leading to a net reduction in SEE. Essentially, the benefits of improving SR through additional APs come at the cost of increased energy expenditure, particularly due to the fixed power consumption of the backhaul infrastructure. A crucial implication of these findings is the need for intelligent AP management strategies to mitigate power inefficiencies. One potential solution is the dynamic selection of active APs, where redundant APs are deactivated during periods of lower traffic demand. By strategically putting underutilized APs into sleep mode during off-peak hours, the fixed circuit power consumption can be significantly reduced, leading to improved energy efficiency without compromising system performance. This adaptive AP selection will be explored in our future research. Among the tested models, the proposed MOIBBO-CNN–LSTM consistently outperforms the baseline methods, achieving the highest SEE across all AP configurations. This further highlights the effectiveness of our approach in balancing SR improvements with energy efficiency, making it a compelling solution for optimizing large-scale CF-mMIMO networks.
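A minimal sketch of this trade-off is given below, assuming a sum rate that grows roughly logarithmically with the number of active APs and a power budget that grows linearly with it. All parameter values and the simple sleep-mode rule are illustrative assumptions, not the system model used in this work.

```python
import numpy as np

def see_vs_aps(n_aps, active_fraction=1.0, bandwidth_hz=5e6,
               p_tx_w=0.2, p_backhaul_w=1.0):
    """Illustrative SEE (Mb/J) versus the number of deployed APs.

    Sum rate is assumed to grow roughly logarithmically with the number of
    *active* APs, while consumed power (per-AP transmit plus fixed backhaul
    power) grows linearly with it, so SEE declines as APs are added.
    Sleeping APs contribute neither rate nor power in this toy model.
    """
    m_active = max(1, round(n_aps * active_fraction))
    sum_rate_bps = bandwidth_hz * np.log2(1.0 + 2.0 * m_active)
    total_power_w = m_active * (p_tx_w + p_backhaul_w)
    return sum_rate_bps / total_power_w / 1e6

for m in (10, 20, 40, 80):
    print(f"{m:3d} APs: SEE ~ {see_vs_aps(m):.2f} Mb/J; "
          f"half asleep: ~ {see_vs_aps(m, active_fraction=0.5):.2f} Mb/J")
```

In this toy model, switching half of the APs to sleep mode raises SEE because the fixed backhaul power dominates; the accompanying loss in SR is not captured here and is only indicative.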
5. Discussion
Table 3 presents a comparative analysis of different models in optimizing SEE in CF-mMIMO-based IoT networks. The table evaluates each algorithm based on RMSE, R², MAPE, and the average execution time (in seconds). The results indicate that the proposed MOIBBO-CNN–LSTM model consistently outperforms the baseline approaches, achieving the lowest RMSE (0.08), the highest R² score (0.97), and the lowest MAPE (1.03%), demonstrating its superior predictive performance in accurately modeling SEE trade-offs in CF-mMIMO networks. The comparative analysis highlights the limitations of standalone DL models such as CNN and LSTM, which exhibit significantly higher RMSE values (11.27 and 13.91, respectively) and lower R² scores (0.82 and 0.80), confirming their inability to fully capture the complex relationships between energy efficiency and security constraints. The NSGA-II-CNN–LSTM model, while performing better than conventional DL models, still struggles with an RMSE of 3.27 and a MAPE of 6.96%, indicating that the multi-objective optimization strategy of MOIBBO plays a critical role in improving model accuracy. The ViT and DRL models, which use transformer-based and reinforcement learning techniques, also fail to achieve the same level of accuracy, showing higher error rates and suboptimal SEE predictions compared to the proposed model.
From a computational efficiency perspective, the results demonstrate that MOIBBO-CNN–LSTM not only achieves superior accuracy but also maintains a reasonable execution time of 962 s, outperforming NSGA-II-CNN–LSTM (1241 s) and DRL (1317 s) in terms of computational cost. While the CNN and LSTM models execute faster (652 s and 743 s, respectively), their inferior accuracy makes them impractical for real-world deployment. The ViT model, though positioned between these extremes, still incurs higher computational demands (1012 s) while underperforming in accuracy (RMSE = 6.12, MAPE = 8.32%). The superior performance of MOIBBO-CNN–LSTM can be attributed to its integration of CNN for spatial feature extraction, LSTM for sequential modeling, and MOIBBO for multi-objective optimization, which collectively ensure a favorable trade-off between accuracy, efficiency, and model complexity. Unlike traditional optimization techniques, MOIBBO dynamically adjusts the network's architecture, hyperparameters, and feature selection, leading to faster convergence, improved generalization, and lower prediction errors. These results confirm that MOIBBO-CNN–LSTM is the most effective framework for optimizing SEE in CF-mMIMO-based IoT networks: it provides both high predictive accuracy and efficient computation, making it well-suited for real-time IoT network optimization and secure energy management applications.
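For clarity, the three accuracy metrics reported in Table 3 follow their standard definitions and can be computed as in the short sketch below; the function name is ours and the snippet is not the evaluation code used in this work.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, R^2, and MAPE (%) as used to compare the SEE predictors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    mape = float(100.0 * np.mean(np.abs(err / y_true)))   # assumes no zero targets
    return {"RMSE": rmse, "R2": float(r2), "MAPE_%": mape}
```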
Table 4 presents a comparative analysis of algorithm execution times based on RMSE termination conditions, demonstrating how efficiently each model converges to different levels of prediction accuracy. The table records the run time (in seconds) required for each model to reach four predefined RMSE thresholds, from the loosest (RMSE below 20) to the tightest. These values indicate the rate of convergence of each approach, with lower execution times reflecting higher efficiency in achieving a given accuracy level. Notably, MOIBBO-CNN–LSTM is the only method that successfully reaches the tightest RMSE threshold, requiring just 386 s, while all other models fail to converge to this level within a reasonable time frame. The results indicate that MOIBBO-CNN–LSTM consistently outperforms all baseline models in both convergence speed and final accuracy. For the loosest threshold (RMSE below 20), the proposed model achieves convergence in just 75 s, significantly faster than NSGA-II-CNN–LSTM (201 s), ViT (312 s), and DRL (365 s). As the RMSE threshold becomes more restrictive, MOIBBO-CNN–LSTM maintains its advantage, requiring only 124 s and 263 s for the two intermediate thresholds, while the other models experience a substantial increase in execution time. The NSGA-II-CNN–LSTM model, for example, takes 381 s and 694 s for the same two thresholds, more than double the time required by the proposed model, highlighting the superior optimization strategy of MOIBBO. The inability of ViT, DRL, CNN, and LSTM to reach the tightest threshold further underscores their limitations in predictive accuracy and computational efficiency. CNN and LSTM, despite their relatively fast initial convergence, fail to reach any threshold tighter than the loosest one, indicating their restricted capacity to model the complex SEE relationships in CF-mMIMO-based IoT networks. DRL and ViT, while capable of improving accuracy over time, exhibit significantly longer execution times, with DRL requiring more than 1000 s to reach an intermediate threshold, demonstrating inefficiency in real-time applications. In contrast, the MOIBBO-CNN–LSTM model efficiently balances accuracy and computational cost, making it the most practical and effective solution for optimizing security-aware energy efficiency in CF-mMIMO-based IoT networks. The RMSE values from Table 4 are also presented visually in Figure 10 to facilitate a clearer comparison between the models. As shown in the bar chart, the proposed MOIBBO-CNN–LSTM consistently achieves the lowest error across all training stages, while the baseline models exhibit notably higher RMSE values.
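A minimal sketch of how such per-threshold run times can be collected during training is shown below. The training hooks (`model`, `train_one_epoch`) and the specific threshold values are illustrative assumptions, not the exact termination conditions of Table 4.

```python
import time
import numpy as np

def time_to_rmse_thresholds(model, x_val, y_val, train_one_epoch,
                            thresholds=(20.0, 10.0, 5.0, 1.0), max_epochs=300):
    """Record the elapsed wall-clock seconds at which validation RMSE first
    drops below each threshold; thresholds here are illustrative cut-offs."""
    reached = {}
    start = time.perf_counter()
    for _ in range(max_epochs):
        train_one_epoch(model)
        err = np.asarray(y_val, float) - np.asarray(model.predict(x_val), float)
        rmse = float(np.sqrt(np.mean(err ** 2)))
        for th in thresholds:
            if th not in reached and rmse < th:
                reached[th] = time.perf_counter() - start
        if len(reached) == len(thresholds):   # every threshold met: stop early
            break
    return reached                            # thresholds never met are simply absent
```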
Table 5 presents the RMSE values recorded in different training epochs (50, 100, 200, and 300) for all evaluated models, illustrating their convergence behavior and overall learning efficiency. The results highlight how each model’s prediction accuracy improves as training progresses, with lower RMSE values indicating better performance. The MOIBBO-CNN–LSTM model achieves the lowest RMSE across all epochs, demonstrating its ability to efficiently learn complex relationships in SEE optimization. By epoch 300, the proposed model attains an RMSE of just 0.08, significantly outperforming all other methods. The NSGA-II-CNN–LSTM model, although benefiting from evolutionary optimization, converges more slowly and stabilizes at a higher RMSE of 3.27 after 300 epochs. Although this approach improves over standalone DL methods, its performance remains inferior to MOIBBO-CNN–LSTM, reinforcing the advantage of MOIBBO’s multi-objective optimization strategy. The ViT and DRL models, despite incorporating advanced architectures, struggle to reach competitive error rates, with RMSE values of 6.12 and 9.39 at epoch 300, respectively. These results suggest that while transformer-based and reinforcement learning approaches capture certain patterns, they are less effective for SEE prediction in CF-mMIMO-based IoT networks compared to the proposed hybrid model. The CNN and LSTM models show the weakest performance, with RMSE values of 11.27 and 13.91 at epoch 300, confirming their limitations in handling SEE trade-offs when used independently. Their slow convergence and high final error rates indicate that neither spatial feature extraction (CNN) nor sequential modeling (LSTM) alone is sufficient for this optimization task. The results reinforce that the MOIBBO-CNN–LSTM model achieves superior predictive accuracy through its integration of CNN for spatial features, LSTM for sequential dependencies, and MOIBBO for hyperparameter optimization, leading to a faster and more effective learning process compared to other state-of-the-art methods.