Adaptive BBU Migration Based on Deep Q-Learning for Cloud Radio Access Network

Ismail, Sura F.; Kadhim, Dheyaa Jasim

doi:10.3390/app15073494

by Sura F. Ismail ¹,²,* and Dheyaa Jasim Kadhim ¹,*

¹ Department of Electrical Engineering, College of Engineering, University of Baghdad, Baghdad 10011, Iraq

² Department of Informatics Management System, College of Informatics Business, University of Information Technology and Communications, Baghdad 00964, Iraq

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(7), 3494; https://doi.org/10.3390/app15073494

Submission received: 14 February 2025 / Revised: 17 March 2025 / Accepted: 20 March 2025 / Published: 22 March 2025

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The efficiency of current cellular networks is limited by the imbalance between resource availability and traffic demand. To overcome this limitation, baseband units (BBUs) are deployed on virtual machines (VMs) to form a virtual BBU pool. This setup pools hardware resources, reduces the cost of building base stations (BSs), and simplifies management and control. However, extremely high or low server resource usage within the pool can increase physical maintenance costs and degrade virtual BBU performance. This study introduces an adaptive, threshold-based dynamic migration strategy for virtual BBUs within the iCanCloud framework that sets upper and lower limits on the resource usage of the servers in the pool. The proposed method decides whether to initiate a migration by evaluating resource usage on each compute node and, if migration is required, identifies the target node. The aims are to balance server load, cut energy consumption, avoid unnecessary migrations caused by momentarily high or low server load, and choose the time to trigger migration without relying only on an instantaneous peak in server resource utilization. A deep Q-network (DQN) learning method is used to predict resource utilization and make accurate migration decisions based on a historical dataset. Experimental results show that, compared with Kalman filter prediction and other traditional methods, the model lowers the cost of VM migration by reducing both the migration time and the number of migrations, improving overall performance while reducing energy consumption.

Keywords:

CRAN; BBU cloud migration; load balancing; deep Q-network

1. Introduction

As mobile technologies have advanced from 1G to 6G, the objectives of users and network operators have changed. Communities are becoming increasingly data-centric and reliant on data, and they are automating more and more of their activities. The widespread application of automation in industrial production processes will greatly boost productivity. Millions of sensors will be deployed across homes, factories, and cities, with AI-powered systems running in local cloud and fog environments to support innovative applications [1,2,3,4,5]. To accommodate a wide range of devices and address future data demands, 6G new radio (NR) networks are anticipated to meet these needs through expertly managed spectrum resources [6,7,8]. Recent research has sought to improve the power efficiency of cellular networks. Numerous strategies have been proposed to reduce the energy consumption of BSs, which account for 60–80% of the total energy usage [9,10]. One of these strategies is switching from the conventional distributed base station (D-BS) design [11] to the cloud radio access network (C-RAN), a centralized, cooperative, cloud-based, and environmentally friendly radio base station architecture that China Mobile proposed in 2010 [12,13]. The C-RAN architecture consists of two parts: the remote radio head (RRH) and the baseband unit (BBU). The BBU performs digital processing of baseband signals together with all higher-layer operations. The RRH transmits and receives wireless signals through its antennas [14,15].

The use of the distributed base station architecture as a next-generation base station platform is constrained by a number of problems. The following highlights a few of these D-BS architecture drawbacks and how C-RAN addresses them [16].

1. Problem: In a standard RAN or distributed BS architecture, the number of BSs increases and the corresponding cell size shrinks as the number of cellular users rises. This results in higher CAPEX and OPEX as well as increased power consumption [17,18].
Solution: In the architecture that China Mobile introduced, baseband processing is performed at a centralized site, whereas the RF unit is kept at a distributed position. Centralization therefore lowers power consumption and the associated costs even as the number of cells grows.

2. Problem: The conventional RAN relies on a costly radio frequency (RF) component and a specialized signal processing unit, which results in a large unit that requires additional room to operate [19].
Solution: Because the RF unit is small and light, it may be placed in any common location, such as a light pole, tree trunk, or rooftop. In addition, the C-BBU's use of load balancing and virtualization technology results in a smaller BS architecture by reducing the size of the BBU pool.

3. Problem: The mobile switching center (MSC) is the only component responsible for network connectivity in the conventional RAN architecture, which lengthens network latency.
Solution: C-RAN integrates certain MSC control plane functions into the C-BBUs. The overall end-to-end delay and backhaul congestion are further reduced, as control and processing tasks are handled at the C-BBU, which is positioned closer to the user equipment (UE).

Additionally, the use of virtualization and cloud computing technology at the radio access network (RAN) platform increases the flexibility and scalability of C-RAN [20] and decreases the number of base stations. Using these concepts, a single platform can provide services to various cellular operators according to their service level agreements (SLAs) [21]. The load distribution among the BBUs in the BBU pool fluctuates significantly because communication services are continually initiated and terminated. Low-load or idle BBUs must remain active in the same operational state as high-load BBUs to ensure uninterrupted service for users within the coverage area. On the one hand, overutilization of the physical servers hosting virtualized BBUs can degrade service quality. On the other hand, underutilization keeps power consumption under control but wastes physical assets. With virtualization and cloud computing, Cloud-RAN promotes base station agility and scalability [22]. Physical resources are shared and allocated dynamically according to current demand. In such a deployment, virtualization allows the BBUs to be consolidated in a server cluster as a BBU pool, so that resource use in the cluster is optimized, the overall dependability of the pool is maximized, and power consumption is reduced.

In this work, an adaptive dynamic migration scheme with a dual-threshold model is proposed, specifying both upper and lower bounds on resource utilization for the physical nodes in a BBU pool. The key contributions of this work are as follows:

Create a BBU pool and virtualize it using the iCanCloud platform, a tool designed for modeling cloud environments.
Implement a dynamic two-threshold migration mechanism. First, an adaptive migration trigger scheme based on a deep Q-network (DQN), a deep reinforcement learning algorithm, detects when migration should occur and skips unnecessary migrations caused by transient spikes or dips in server resource consumption. Next, the virtual machine to migrate is selected by considering both migration count and migration duration.
Evaluate the proposed migration scheme through simulation on the iCanCloud platform, showing improved energy efficiency and considerable savings in migration count and migration time. Unlike a conventional manual scheme, the proposed scheme uses artificial intelligence to improve accuracy, avoid invalid migrations, and achieve load balancing with adaptive dynamic migration.

One of the traditional RL algorithms is Q-learning, which is a model-free learning technique [23]. In Q-learning, the state of the environment is observed at the start. At each step of an episode, the action a_t is chosen according to the policy and the current state s_t. The next state, s_t+1, and the associated reward, r_t, are then obtained. The Bellman equation is then used to update the action value Q(t) = Q(s_t, a_t):

Q (s_{t}, a_{t}) \leftarrow Q (s_{t}, a_{t}) + β \left( r_{t} + γ \max_{a_{t + 1}} Q (s_{t + 1}, a_{t + 1}) - Q (s_{t}, a_{t}) \right)

(1)

where β is the learning rate and γ is the discount factor.

In our case, the environment is very complex, so the Q-table would be too large to be practical. Consequently, a neural network (NN)-based Q-learning technique, deep Q-learning (DQL), is employed in its place. DQL's learning process uses two neural networks: the training network and the target network. These networks share the same architecture but have different weights. The parameters of the training DQN are updated by backpropagation using the Adam optimizer [24], while the parameters of the target DQN are updated from the training DQN at a specific rate. It is crucial to stress that the target network's parameters are not trained; they are periodically synchronized with the main Q-network's parameters. Training the main Q-network against the target network's Q-values improves training stability.
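The periodic synchronization of the target network can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the dictionary-of-weights representation, the function name `sync_target`, and the interval `SYNC_EVERY` are assumptions introduced here, since the paper does not state the synchronization rate.

```python
import copy

SYNC_EVERY = 100  # assumed synchronization interval (not given in the paper)


def sync_target(train_weights, target_weights, step, every=SYNC_EVERY):
    """Return the target-network weights to use after training step `step`.

    The target network is never trained directly: every `every` steps it
    receives a copy of the training network's weights; otherwise it is kept.
    """
    if step % every == 0:
        return copy.deepcopy(train_weights)
    return target_weights


# Hypothetical weights, for illustration only.
train = {"w1": [0.1, 0.2], "b1": [0.0]}
target = {"w1": [0.0, 0.0], "b1": [0.0]}

target = sync_target(train, target, step=50)    # not a sync step: unchanged
frozen = copy.deepcopy(target)
target = sync_target(train, target, step=100)   # sync step: weights copied
```

Copying rather than aliasing the weights keeps the target fixed while the training network continues to change between synchronization steps.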

When a state is supplied as input to the NN, the network outputs one Q-value for each possible action. The action that maximizes the Q-value is chosen from these outputs, and the target value is defined as follows:

y_{i} = r_{i} + γ \max_{a_{i + 1}} Q (s_{i + 1}, a_{i + 1})

(2)

The network is trained by minimizing the mean squared error between the predicted Q-values Q_i and the targets y_i over a batch of N samples:

L = \frac{1}{N} \sum_{i = 0}^{N - 1} {(Q_{i} - y_{i})}^{2}

(3)
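Equations (1) to (3) can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions: the function names and the numeric values of β, γ, the rewards, and the Q-values are chosen here for illustration and do not come from the paper.

```python
def td_target(reward, next_q_values, gamma):
    """Target value y = r + gamma * max over next-state Q-values (Equation (2))."""
    return reward + gamma * max(next_q_values)


def q_update(q_sa, reward, next_q_values, beta, gamma):
    """One tabular Q-learning step (Equation (1)) with learning rate beta."""
    return q_sa + beta * (td_target(reward, next_q_values, gamma) - q_sa)


def mse_loss(q_values, targets):
    """Mean squared error between predicted Q-values and targets (Equation (3))."""
    n = len(q_values)
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / n


# Illustrative numbers only.
y = td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.9)
new_q = q_update(q_sa=1.0, reward=1.0, next_q_values=[0.5, 2.0], beta=0.5, gamma=0.9)
loss = mse_loss([1.0, 2.0], [2.0, 4.0])
```

In a DQN, `next_q_values` would come from the target network and `q_values` from the training network; here both are plain lists so the arithmetic can be followed by hand.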

The proposed DQN-based migration strategy outperforms the state-of-the-art models: Kalman filter, IQR, MAD, LR, SES, and ARIMA in resource utilization, energy efficiency, migration time, and adaptability. It dynamically learns from past migrations, avoids unnecessary movement, and optimizes VM placement in real-time, making it the optimal strategy for Cloud-RAN dynamic migration strategies.

The remainder of the paper is organized as follows: Section 2 presents literature reviews of works on migration. Section 3 provides a detailed explanation of the system model with parameter definitions. Section 4 illustrates the proposed migration model. The evaluation results using iCanCloud are shown in Section 5, followed by a conclusion in Section 6.

2. Literature Review

When virtual BBUs run in virtual machines (VMs), cloud VM migration can be used to improve resource efficiency and balance load. In cloud environments, load balancing reduces wasted resources by redistributing workloads between overused and underused nodes, and VM selection and migration strategies determine the best migration options between overused and underused hosts. The authors in [25] proposed a load-balancing strategy with resource migration across several BBUs in a super base station architecture. Their architecture uses a radio network controller (RNC) to facilitate resource sharing among BBUs. A C-BBU's resources are distributed across several radio units located in various geographic regions.

The authors in [26] proposed an online migration scheme for a virtual BS that focuses on reducing data loss during virtual machine migration and minimizing the duration of service interruptions. A threshold-based load migration method for virtual base stations, built on virtualization and cloud technologies, was introduced in [27]; the authors estimated both the total service interruption duration and the migration time. In [28], an energy-efficient workload scheme was proposed to distribute the processing load across the BBU pool. To decrease migration time and reduce energy consumption in C-RAN, the authors used workload scheduling with queuing-based techniques. The research in [29] proposed the live migration of containerized baseband units (BBUs) in two wireless network settings: long-term evolution (LTE) networks and long-range wide-area networks (LoRaWAN). A VM live migration method for the data center environment was put forth by the authors in [30], who examined parallel and serial virtual machine migration approaches in terms of downtime and migration duration. According to their comparison, parallel migration reduces downtime more effectively. The authors of [31] suggested a VM placement and migration method for the data center based on crow search, a meta-heuristic load-balancing strategy intended to minimize resource waste and lower power consumption. A QoS-aware VM migration method was presented in [32]. That work concentrated on lowering a server's power consumption and optimizing the use of physical resources: it employed random virtual machine migration to lower a server's power consumption and the interquartile range (IQR) approach to boost resource usage. The minimum migration time (MMT) algorithm is used by the authors of references [33,34] to choose a virtual machine (VM). Reference [35] uses the minimum power high available capacity approach to find new locations for VMs and selects the VM with maximum usage for migration. To select the destination host with the smallest total utilization (TU), reference [36] computes TU from CPU and RAM utilization. In reference [37], Razali et al. create a prediction model, use historical data as a training set, and forecast CPU usage for the next time instant. Future resource use is predicted using fuzzy logic in reference [38]. The threshold and time-sequence prediction technique employed in reference [39] does not migrate immediately when server resource consumption is found to exceed the threshold; instead, it observes multiple cycles before deciding whether to start the migration. A dual threshold-based baseband resource migration technique for a C-RAN architecture was also suggested by the authors in [40], who highlighted an adaptive, threshold-based process for selecting and migrating VMs within the C-RAN architecture. In this work, deep reinforcement learning (a deep Q-learning network) is adopted as an adaptive migration trigger mechanism because of its ability to learn and update its control parameters.

3. Proposed System Model

In this paper, the proposed system model uses the architecture of the cloud radio access network that enables the functional separation of a traditional base station (BS) into two parts: the remote radio head (RRH) and the baseband unit (BBU) as shown in Figure 1.

3.1. System Scenario

As illustrated in Figure 1, this paper uses the centralized BBU architecture. BBUs are grouped together as a BBU pool in a single location and managed by a cloud manager, which oversees the performance of all virtual resources. The virtualized BBU pool makes cooperative radio resource allocation and large-scale collaborative processing possible. The iCanCloud framework [41], a platform for modeling and simulating cloud environments, is used to simulate the system. iCanCloud is built on the OMNeT++ platform, and its network systems use the INET simulation framework. The iCanCloud simulator kernel includes a range of modules designed to replicate the behavior of specific components, organized according to their functions. The modular architecture of the iCanCloud platform consists of three main sections: the user model, the cloud manager, and the cloud infrastructure, as shown in Figure 2.

3.2. Power Consumption Model

The study referenced in [42] shows that the power consumption of a physical machine can be represented by a linear function of CPU usage. It also highlights that a fully idle physical machine typically consumes around 70% of its maximum power. Equation (4) defines power consumption as a function of CPU utilization.

P (v) = n \times P_{max} + (1 - n) \times P_{max} \times v

(4)

where v is the CPU usage, n is the proportion of power used by an idle physical server, and P_max is the maximum power of a server while it is operating. Due to workload fluctuation, CPU usage varies over time, and v(t) is a function of time. Equation (5) can be used to define the overall energy usage. This model states that CPU use determines energy consumption.

E = \int_{t} P (v (t)) d t

(5)
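Equations (4) and (5) can be evaluated numerically as in the sketch below. It assumes that utilization v is expressed as a fraction in [0, 1], that the idle fraction n is 0.7 (the figure quoted from [42]), and that the integral is approximated by a rectangle-rule sum over equally spaced samples; the value of `P_MAX` and the sample trace are illustrative assumptions, not values from the paper.

```python
def power(v, p_max, n=0.7):
    """Equation (4): P(v) = n*P_max + (1 - n)*P_max*v, with v in [0, 1]."""
    return n * p_max + (1.0 - n) * p_max * v


def energy(utilization, dt, p_max, n=0.7):
    """Approximate Equation (5) by summing P(v(t)) * dt over sampled utilization."""
    return sum(power(v, p_max, n) * dt for v in utilization)


P_MAX = 250.0  # watts; assumed value for illustration

idle = power(0.0, P_MAX)   # idle server draws n * P_max
half = power(0.5, P_MAX)
full = power(1.0, P_MAX)   # fully loaded server draws P_max

# Three one-minute samples of CPU utilization (hypothetical trace).
e = energy([0.2, 0.5, 0.9], dt=60.0, p_max=P_MAX)
```

With these assumptions an idle server already draws 175 W of the 250 W maximum, which is why powering down underloaded nodes saves more energy than merely lowering their load.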

3.3. Resource Utilization

Dynamic migration allows VMs to be transferred across physical nodes without downtime. However, this process can degrade the performance of applications running within a VM during migration. Studies referenced in [43] show that application behavior affects both downtime and performance, and that the total memory usage of the VM and the available network bandwidth influence migration duration. In this paper, resource utilization comprises CPU utilization, which is the average CPU usage of the hosts by the virtual machines (VMs) in a given set of hosts, and memory utilization, which is the average host memory used by the VMs. According to [44], a single VM migration can cause a performance loss equivalent to an additional 10% CPU consumption, which may cause SLA violations.

4. Proposed Dynamic Migration System

To decide whether to activate dynamic migration, the operating state of each physical compute node must be monitored, and the resource-state conditions that require migration must be defined. The proposed system uses a dual-threshold technique for the migration process. The upper threshold mitigates the increase in a physical server's energy consumption caused by high load. The lower threshold saves energy: when server utilization falls below it, the virtual machines (VMs) on the compute node are moved elsewhere and the physical server is shut down or put to sleep. Because the resource use of physical servers is unstable, the CPU or memory utilization at a single moment may be momentarily too high or too low and trigger a needless migration, wasting system energy. The parameter definitions used in the paper are listed in Table 1.

Because deep reinforcement learning (DRL), here in the form of a deep Q-network (DQN), can learn and optimize control policies in non-linear situations, it was selected for resource consumption prediction in this research. The characteristics of DRL that make it appropriate for this system are as follows:

Non-linear decision making: DQN can handle complex, dynamic resource utilization patterns.
Policy optimization: Through training, DQN learns optimal migration strategies based on past experience and rewards.
Continuous improvement: Unlike static methods like the Kalman filter, DQN improves as it receives feedback.

The chosen architecture was based on deep Q-networks (DQNs). This model uses a neural network to approximate the value of taking certain actions (i.e., migrating resources) in different states of resource utilization. The system continuously adapts by learning how actions influence future states and rewards. Since server resource utilization fluctuates from moment to moment, prediction can effectively avert unnecessary virtual resource migrations. A predictive migration trigger strategy, driven by a DQN algorithm, is used to evaluate whether server resource usage exceeds the upper threshold or falls below the lower threshold.

The iCanCloud simulation tool was employed to generate data on CPU usage, memory consumption, energy usage, migration time, and migration count in a cloud setting. This dataset was subsequently used to train a deep reinforcement learning model designed to enhance the resource migration strategy. These data points served as input features for training the DQN model, derived from the iCanCloud simulation, and included the following:

State variables: CPU and memory utilization, current load on BBUs, and system energy consumption.

s_t = [CPU Utilization_i, Memory Utilization_i, Energy Consumption_i, VM_i] ∀i ∈ hosts

(6)

Action space: Decisions on whether to migrate VMs or not, and the selection of target physical machines for migration.

a_t ∈ {Migrate VM→h_j, Keep VM on h_i}

(7)

Reward function: The reward was based on optimizing resource utilization while minimizing energy consumption and the number of migrations. The reward is positive for resource behaviors that reduce energy consumption.

r_{t} = λ_{1} \times \left(1 - \frac{\mathrm{CPULoad}}{\mathrm{Threshold}}\right) - λ_{2} \times \mathrm{MigrationCost} - λ_{3} \times \mathrm{EnergyConsumption}

(8)

where λ1, λ2, and λ3 are weighting factors that determine the relative importance of each term. In our simulations, the weights were chosen mainly to optimize energy consumption.

Example of Reward Calculation

Case 1: Balanced load without migration

○ CPU usage: 50% (below the 70% threshold)
○ Migration cost: 0
○ Energy consumption: 0

R_{t} = λ_{1} \times (1 - \frac{50}{70}) ≈ 0.29 λ_{1}

Case 2: High load and migration needed

○ CPU usage: 90% (above the 70% threshold)
○ Migration cost: 0.1
○ Energy consumption: 0.05

R_{t} = λ_{1} \times (1 - \frac{90}{70}) - 0.1 λ_{2} - 0.05 λ_{3}

R_{t} ≈ - 0.29 λ_{1} - 0.1 λ_{2} - 0.05 λ_{3}

A negative reward pushes the agent to take corrective actions (e.g., migrate a VM).
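The two cases can be checked numerically with a direct transcription of Equation (8). In this sketch the function name and argument names are assumptions, and all three weights are set to 1 only so that the printed magnitudes are easy to compare; the paper does not report the weights it used.

```python
def reward(cpu_load, threshold, migration_cost, energy, lam1=1.0, lam2=1.0, lam3=1.0):
    """Equation (8): utilization term minus weighted migration and energy costs."""
    return (lam1 * (1.0 - cpu_load / threshold)
            - lam2 * migration_cost
            - lam3 * energy)


# Case 1: 50% CPU against a 70% threshold, no migration, no energy penalty.
case1 = reward(cpu_load=50, threshold=70, migration_cost=0.0, energy=0.0)

# Case 2: 90% CPU against the same threshold, with migration and energy costs.
case2 = reward(cpu_load=90, threshold=70, migration_cost=0.1, energy=0.05)

print(round(case1, 2), round(case2, 2))
```

The utilization term changes sign exactly at the threshold, so a host above it can only earn a positive reward again once its load drops back below the threshold.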

The deep Q-network (DQN) model, a widely used DRL architecture, was chosen. The model's layers were structured as follows:

Input layer: takes the current state of CPU and memory utilization, energy consumption, and migration metrics.
Hidden layers: fully connected layers with ReLU activation functions to handle the non-linearity in the data.
Output layer: produces the Q-values corresponding to the possible migration actions, where higher Q-values indicate better decisions.
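The layer structure above can be sketched as a forward pass in plain Python. This is an illustration only: the paper does not give layer sizes, so the two hidden layers of eight units, the three-element state, the two-action encoding, and the random initial weights are all assumptions, and no training is performed.

```python
import random


def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]


def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def q_values(state, layers):
    """Forward pass: ReLU on hidden layers, linear output layer of Q-values."""
    x = state
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return dense(x, weights, biases)


def greedy_action(q):
    """Index of the action with the highest Q-value."""
    return max(range(len(q)), key=lambda a: q[a])


rng = random.Random(0)
ACTIONS = ["KEEP", "MIGRATE"]   # assumed two-action encoding
layers = [init_layer(3, 8, rng), init_layer(8, 8, rng), init_layer(8, len(ACTIONS), rng)]
state = [0.85, 0.60, 0.40]      # CPU, memory, normalized energy (hypothetical)
q = q_values(state, layers)
action = ACTIONS[greedy_action(q)]
```

A practical implementation would use a deep learning library and train the weights with the loss in Equation (3); the sketch only shows how a state maps to one Q-value per action.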

This research proposes a migration technique that combines prediction with resource utilization monitoring to evaluate the relationship between load and the thresholds. Figure 3 depicts the detailed procedure, and Algorithm 1 gives the trigger flow. The proposed trigger method begins by polling each compute node's load utilization, using CPU and memory usage as the criteria for initiating migration. It then compares the utilization levels with the thresholds. If any parameter falls outside the threshold range, the prediction model is used to determine whether the load will remain outside that range. A compute node is identified as overloaded if the predicted value surpasses the upper threshold, in which case the method calculates D_CPU/D_Mem, the difference between the overloaded resource utilization and the upper threshold. Otherwise, the method uses the prediction model for the underload decision, recognizing the node as underloaded if the predicted value is below the lower threshold.

To avoid performance degradation, the VM to migrate is chosen from an overloaded or underloaded compute node according to the VM selection strategy. In the proposed method, overload may occur in three scenarios. In case 0, both CPU and memory utilization exceed the upper threshold. In cases 1 and 2, only one of the resources, CPU or memory, respectively, surpasses the upper threshold. For energy efficiency, dynamic migration is triggered and the underloaded node is powered down if any utilization parameter falls below the lower threshold. If all parameters remain within the threshold range, the system is considered to be in a normal state, and the next phase of the decision-making process begins.
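The dual-threshold classification can be sketched as a small function. The sketch assumes a single pair of thresholds shared by CPU and memory, utilization expressed as fractions, and the case numbering used in Algorithm 2 (case 1: CPU only, case 2: memory only); the function name and the threshold values are illustrative assumptions.

```python
def classify_host(cpu, mem, upper, lower):
    """Classify a host from its predicted CPU and memory utilization.

    Returns (status, case, d_cpu, d_mem), where d_cpu and d_mem are the
    amounts by which each resource exceeds the upper threshold.
    """
    d_cpu = max(0.0, cpu - upper)
    d_mem = max(0.0, mem - upper)
    if cpu > upper and mem > upper:
        return "overloaded", 0, d_cpu, d_mem
    if cpu > upper:
        return "overloaded", 1, d_cpu, d_mem
    if mem > upper:
        return "overloaded", 2, d_cpu, d_mem
    if cpu < lower or mem < lower:
        # Any resource below the lower threshold marks the node as underloaded.
        return "underloaded", None, 0.0, 0.0
    return "normal", None, 0.0, 0.0


UPPER, LOWER = 0.8, 0.2  # assumed thresholds for illustration

both = classify_host(0.9, 0.9, UPPER, LOWER)
cpu_only = classify_host(0.9, 0.5, UPPER, LOWER)
mem_only = classify_host(0.5, 0.9, UPPER, LOWER)
idle = classify_host(0.1, 0.5, UPPER, LOWER)
normal = classify_host(0.5, 0.5, UPPER, LOWER)
```

Overload is tested before underload, so a host with one saturated resource and one nearly idle resource is treated as overloaded rather than shut down.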

The strategy then selects the VM with the shortest migration time from the identified subset of VMs, prioritizing VMs with the fewest previous migrations. Algorithm 2 illustrates the proposed VM selection process.

VM_k-DCPU and VM_k-DMem denote the differences between the CPU and memory use of VM k and the excesses D_CPU and D_Mem, respectively. If the difference is greater than or equal to 0, only this one virtual machine (VM) needs to be moved to bring the server's resource usage below the upper threshold, and the VM is added to the VM_list subset. Within this subset, VMs that meet the minimum migration count are ordered by memory usage, and the VM with the smallest memory usage is selected to be migrated. If VM_list is empty, the strategy calculates the weighted value f defined in Algorithm 2 for each VM on the overloaded host and then selects the VM that will most effectively reduce server load.

Algorithm 1: DQN migration method with dual threshold pseudo-code

Initialize DQN_Model with trained weights
Set thresholds (upper and lower) for resource utilization (e.g., CPU_threshold, Memory_threshold)
While the simulation is running:
  # Step 1: Collect environment state
  For each host in the iCanCloud environment:
    current_CPU_utilization = host.getCPUUtilization()
    current_Memory_utilization = host.getMemoryUtilization()
    current_Energy_consumption = host.getEnergyConsumption()
    # Step 2: Define the state (combine CPU, Memory, and Energy into a state vector)
    state = [current_CPU_utilization, current_Memory_utilization, current_Energy_consumption]
    # Step 3: Feed state to DQN model to obtain action
    action = DQN_Model.predict(state)
    # Step 4: Action decision
    If action == 'MIGRATE_VM':
      # Perform migration
      target_host = selectBestHostForMigration()
      vm_to_migrate = selectVMToMigrate(host)
      migrate(vm_to_migrate, target_host)
      log("Migration performed from", host, "to", target_host)
    Else if action == 'DO_NOTHING':
      # No migration needed, continue monitoring
      log("No migration required for", host)
    # Step 5: Monitor environment and obtain feedback for reward
    post_migration_CPU_utilization = host.getCPUUtilization()
    post_migration_Memory_utilization = host.getMemoryUtilization()
    post_migration_Energy_consumption = host.getEnergyConsumption()
    reward = computeReward(post_migration_CPU_utilization, post_migration_Memory_utilization, post_migration_Energy_consumption)
    # Step 6: Train DQN with the reward (optional for continued training)
    DQN_Model.update(state, action, reward)
    # Continue to the next host or iteration
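The monitoring loop of Algorithm 1 can be exercised with a small runnable sketch. Several things here are assumptions made for illustration: the `Host` class, the tuple layout of a VM, the choice of the least-loaded other host as target, and above all `threshold_policy`, a fixed-threshold rule that stands in for the trained DQN. Steps 5 and 6 (reward and model update) are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu: float                                 # utilization as a fraction
    mem: float
    vms: list = field(default_factory=list)    # (vm_name, cpu_share, mem_share)


def threshold_policy(state, upper=0.8):
    """Stand-in for DQN_Model.predict: migrate if CPU or memory exceeds `upper`."""
    cpu, mem = state
    return "MIGRATE_VM" if cpu > upper or mem > upper else "DO_NOTHING"


def monitoring_pass(hosts, policy):
    """One pass over all hosts (Steps 1 to 4); returns the migrations performed."""
    log = []
    for host in hosts:
        action = policy((host.cpu, host.mem))
        if action != "MIGRATE_VM" or not host.vms:
            continue
        # Assumed helpers: least-loaded other host, smallest-memory VM.
        target = min((h for h in hosts if h is not host), key=lambda h: h.cpu + h.mem)
        vm = min(host.vms, key=lambda v: v[2])
        host.vms.remove(vm)
        target.vms.append(vm)
        host.cpu -= vm[1]
        host.mem -= vm[2]
        target.cpu += vm[1]
        target.mem += vm[2]
        log.append((vm[0], host.name, target.name))
    return log


hosts = [
    Host("h1", 0.9, 0.7, [("vm1", 0.3, 0.2), ("vm2", 0.2, 0.1)]),
    Host("h2", 0.3, 0.2),
]
first = monitoring_pass(hosts, threshold_policy)
second = monitoring_pass(hosts, threshold_policy)
```

After the first pass the overloaded host is back under the assumed threshold, so the second pass performs no migration.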

Algorithm 2: VM Migration selection strategy pseudo-code

Input: Over_load_Host, D_CPU and D_Mem
Output: VM_to_migrate
Step 1 # Calculate the utilization difference
for each VM_k in Over_load_Host do
  case 0: VM_k-DCPU = VM_k_CPU - D_CPU, VM_k-DMem = VM_k_Mem - D_Mem
  case 1: VM_k-DCPU = VM_k_CPU - D_CPU
  case 2: VM_k-DMem = VM_k_Mem - D_Mem
Step 2 # Check VM_k-DCPU and VM_k-DMem for each VM to be migrated
  if (case 0: VM_k-DCPU >= 0 and VM_k-DMem >= 0;
      case 1: VM_k-DCPU >= 0;
      case 2: VM_k-DMem >= 0)
Step 3 # If the difference is greater than or equal to zero, keep the VM in VM_list, order the VMs with the minimum migration count by memory usage, and select the one with the smallest memory usage
    VM_list ← VM_k
    sort VM_list by memory
    VM_to_migrate = VM_list[0]
Step 4 # Or else, calculate the utilization weights for the three cases as follows:
  case 0: e1 = D_CPU / (D_CPU + D_Mem), e2 = D_Mem / (D_CPU + D_Mem)
  case 1: e1 = 1, e2 = 0
  case 2: e1 = 0, e2 = 1
  f_k = e1 ⋅ VM_k-DCPU + e2 ⋅ VM_k-DMem
  VMlist_f ← VM_k, f_k
Step 5 # Sort the list of VMs according to f
sort VMlist_f by f_k
VM_to_migrate = VMlist_f[0]
return VM_to_migrate

The VM selection is not required when the compute node initiates migration due to an excessively low load because all of the virtual machines within the node are immediately relocated, and the node will be shut down to conserve energy.
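The selection procedure of Algorithm 2 for an overloaded host can be sketched in Python as follows. The `VM` class and the function name are assumptions introduced here. One further assumption is labeled in the code: the paper does not state the sort direction for f, so the sketch picks the VM with the largest f, that is, the one that comes closest to covering the excess.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu: float           # share of host CPU used by this VM
    mem: float           # share of host memory used by this VM
    migrations: int = 0  # number of previous migrations


def select_vm(vms, d_cpu, d_mem, case):
    """Pick the VM to migrate from an overloaded host.

    case 0: CPU and memory overloaded; case 1: CPU only; case 2: memory only.
    """
    use_cpu = case in (0, 1)
    use_mem = case in (0, 2)

    def covers(vm):
        # Steps 1-2: one VM suffices if its usage is at least the excess.
        cpu_ok = not use_cpu or vm.cpu - d_cpu >= 0
        mem_ok = not use_mem or vm.mem - d_mem >= 0
        return cpu_ok and mem_ok

    candidates = [vm for vm in vms if covers(vm)]
    if candidates:
        # Step 3: fewest previous migrations first, then smallest memory.
        return min(candidates, key=lambda vm: (vm.migrations, vm.mem))

    # Step 4: no single VM is enough, so weight the two differences.
    if case == 0:
        e1 = d_cpu / (d_cpu + d_mem)
        e2 = d_mem / (d_cpu + d_mem)
    else:
        e1, e2 = (1.0, 0.0) if case == 1 else (0.0, 1.0)
    # Step 5 (assumption): largest f = VM closest to covering the excess.
    return max(vms, key=lambda vm: e1 * (vm.cpu - d_cpu) + e2 * (vm.mem - d_mem))


vms = [VM("a", 0.30, 0.20, 2), VM("b", 0.15, 0.10, 0), VM("c", 0.20, 0.25, 0)]
pick_small = select_vm(vms, d_cpu=0.10, d_mem=0.05, case=0)
pick_cpu = select_vm(vms, d_cpu=0.50, d_mem=0.0, case=1)
pick_both = select_vm(vms, d_cpu=0.40, d_mem=0.40, case=0)
```

Ranking by memory in Step 3 reflects the paper's goal of a short migration time, since the text identifies VM memory usage as a driver of migration duration.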

5. Simulation Results

This section presents the simulation results and performance analysis of the proposed deep Q-network (DQN)-based migration scheme for virtual BBUs in a cloud radio access network (C-RAN). The section aims to demonstrate the advantages of the proposed solution in terms of resource utilization, energy efficiency, and migration performance compared to traditional methods.

5.1. The Proposed Simulation Environment

The dynamic migration strategy proposed in this paper is simulated on the iCanCloud platform, which can be used to implement various resource allocation techniques and assess strategy performance. The hardware used in the iCanCloud simulation experiment is a laptop running Ubuntu 18 (64-bit) with an Intel Core i5-6200U CPU and 4 GB of RAM. iCanCloud is built on the OMNeT++ platform; in this paper, iCanCloud 1.0 is used together with OMNeT++ 4.6.1 and INET 2.5.0.

The iCanCloud simulation tool was used to generate synthetic data on CPU utilization, memory usage, energy consumption, migration time, and the number of migrations in a cloud environment. These data were then used to train a deep reinforcement learning (DRL) model aimed at improving the resource migration strategy. iCanCloud provides data on CPU utilization, memory consumption, and energy usage at various times during the simulation.

A number of virtual machines (VMs) are hosted on compute nodes, which are physical servers. A cloud manager monitors and controls the virtual BBU pool hosted on these nodes. Figure 4 illustrates the proposed system scenario, which consists of a BBU pool (a centralized baseband processing unit) hosting multiple virtualized BBUs (vBBUs). Each BBU in the pool runs as a virtual machine (VM) on a compute node. These VMs handle baseband processing tasks for multiple remote radio heads (RRHs). The system architecture is designed to enable dynamic migration of VMs to distribute the load and minimize power consumption. Each compute node hosts multiple BBUs on VMs, which allows efficient resource utilization. The DQN-based migration policy determines whether migration is required for load balancing. The cloud manager monitors and controls the entire virtualized BBU pool. It is in charge of monitoring server resource utilization (CPU, memory, power), deciding when to trigger migrations using a dual-threshold model (upper and lower thresholds), and selecting the optimal target compute node for VM migration.

The simulation platform shown in Figure 4 comprises two compute nodes, each containing 150 BBU nodes, and every BBU node contains three virtual machines. Each virtual machine is allocated 1000, 2000, or 3000 MIPS of processing power, whereas each BBU node has 2000 or 4000 MIPS. The number of users that have submitted tasks is 1500, and the tasks are sequentially assigned to the virtual machines. Virtual machines are first assigned to odd-numbered hosts and then to even-numbered hosts. Table 2 lists the specific experimental parameters.

To assess the effectiveness of the proposed strategy, a comparison and analysis with existing migration and VM selection strategies are required. In this experiment, four VM selection strategies are compared against the DRL selection strategy: the Kalman filter prediction (KAL), random selection (RS), minimum utilization (MU), and maximum correlation (MC) [40].

5.2. The Proposed Evaluation Results and Analysis

Three indicators are used to compare the performance of the proposed migration method: the energy consumption of the BBU pool, the migration time, and the total number of migrations. The four VM selection strategies (KAL, RS, MU, and MC) are contrasted with DRL-based VM migration selection. The statistical techniques used to analyze the data are:

MAD (median absolute deviation). It could be used to identify nodes with unusually high or low resource usage compared to the median usage across the pool.
IQR (interquartile range). IQR can be used to understand the range within which the central 50% of resource usage data lies. It helps in understanding the spread of the middle portion of the data and can be useful for setting thresholds.
LR (load ratio). LR can help determine how heavily a particular resource (such as CPU, memory, or network bandwidth) is being utilized relative to its total capacity. It is a key metric for deciding when to trigger migrations to balance the load.
Thr (threshold). A threshold is a predefined value used to trigger certain actions. In the context of resource management, thresholds are typically set for metrics like CPU usage, memory usage, etc.

Figure 5, Figure 6 and Figure 7 display the simulation results. Compared with the existing approaches, the proposed migration selection method lowers energy usage, migration duration, and the number of migrations. Figure 5 shows the energy consumption of the proposed DQN strategy, which is much lower than that of the conventional methods. DQN minimizes unnecessary migrations and decides when and where to migrate VMs, reducing power consumption in the BBU pool. Figure 6 shows the number of migrations for the proposed DQN strategy. The DQN model triggers fewer VM migrations than the Kalman filter and the other baseline methods. The model learns from past migration events and predicts future resource utilization, which prevents unnecessary migrations caused by short-term spikes in server load. Figure 7 shows the migration time for the proposed DQN strategy. The migration time is significantly lower with the DQN-based method than with the conventional methods, because DQN selects the most suitable VM and target node, leading to faster migration execution.

Table 3 compares the performance of the DQN migration model with that of the Kalman filter-based approach using the following metrics: CPU utilization, memory utilization, energy consumption, and number of migrations.

From Table 3, the following can be concluded:

Resource utilization: With the DQN model, CPU utilization improved by 6.2 percentage points (from 82.5% to 88.7%) and memory utilization improved by 6.4 percentage points (from 79.2% to 85.6%) compared to the Kalman filter. This was because DQN was better at identifying and reacting to resource usage patterns over time.
Energy consumption: DQN also resulted in a 16% reduction in energy consumption, as it optimized the system’s load balancing by triggering migrations only when necessary and consolidating resources more effectively.
Migration time and number of migrations: The DQN model reduced migration time by 31% and the number of migrations by 28%, compared to the Kalman filter. By learning from previous migration events, DQN minimized unnecessary migrations and improved the selection of target machines, resulting in fewer, but more efficient, migration events.
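
The percentages quoted above can be checked against Table 3 with a few lines of arithmetic. The following is a minimal sketch that only uses the values printed in Table 3; the dictionary keys and function names are illustrative and are not taken from the original experiments. Migration time is not listed in Table 3, so the 31% figure cannot be recomputed here.

```python
# Values copied from Table 3: (Kalman filter model, DQN model).
# The key names are illustrative labels, not identifiers from the paper.
table3 = {
    "cpu_util_pct": (82.5, 88.7),
    "mem_util_pct": (79.2, 85.6),
    "energy_kwh": (250.0, 210.0),
    "migrations": (180.0, 130.0),
}


def absolute_gain(kalman, dqn):
    """Difference in percentage points (used for the utilization rows)."""
    return dqn - kalman


def relative_change_pct(kalman, dqn):
    """Relative change with respect to the Kalman filter baseline, in percent."""
    return 100.0 * (dqn - kalman) / kalman


cpu_gain = round(absolute_gain(*table3["cpu_util_pct"]), 1)
mem_gain = round(absolute_gain(*table3["mem_util_pct"]), 1)
energy_change = round(relative_change_pct(*table3["energy_kwh"]), 1)
migration_change = round(relative_change_pct(*table3["migrations"]), 1)

print(f"CPU utilization gain: {cpu_gain} percentage points")
print(f"Memory utilization gain: {mem_gain} percentage points")
print(f"Energy consumption change: {energy_change}%")
print(f"Number of migrations change: {migration_change}%")
```

The utilization figures are absolute differences between two percentages, whereas the energy and migration figures are relative reductions; the migration reduction of about 27.8% rounds to the 28% stated above.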

The DQN-based migration scheme outperformed the traditional Kalman filter scheme in the main performance aspects, including resource utilization, energy efficiency, and migration time. With deep reinforcement learning, the system could adapt dynamically to the resource usage behavior of the cloud and maintain a better balance between performance and energy consumption.

5.3. Comparative Study of the Proposed System with Some Statistical Methods

Cloud-based BBU pools require smart approaches for the effective use of resources. With dynamically changing workloads and system states, selecting an ideal node for VM migration and consolidation is a challenge. Techniques such as IQR, LR, and MAD use past information to decide thresholds for resource utilization, whereas DQN techniques use real-time feedback and learning for decision making. The statistical methods compared with the proposed DQN are described below.

A.: Median Absolute Deviation (MAD)

The median absolute deviation (MAD) is a robust measure of statistical dispersion. It is computed as the median of the absolute deviations of the data from their median [45]:

MAD = median(∣Xi − median(X)∣)

(9)

where X is a collection of observations (e.g., the number of VMs attached to candidate nodes). MAD is less sensitive to outliers than the standard deviation. In the cloud BBU pool, MAD is used to choose the node whose number of attached VMs deviates least from the central tendency. The procedure is as follows:

Data collection: Gather the number of linked VMs from eligible nodes.
Median calculation: Sort the data and compute the median number of linked VMs.
Deviation computation: Calculate the absolute deviations for each node from the median.
Candidate selection: Choose the node with the smallest deviation, indicating a balanced or typical resource usage. If no node has linked VMs, the algorithm defaults to a first-come-first-serve (FCFS) approach.
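
The four steps above can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments: the function names, the dictionary layout (node identifier mapped to its number of linked VMs), and the use of insertion order as the FCFS order are assumptions.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a sequence of numbers, as in Equation (9)."""
    values = list(values)
    centre = median(values)
    return median(abs(v - centre) for v in values)


def select_node_mad(linked_vms):
    """Pick the node whose linked-VM count is closest to the median.

    linked_vms maps a (hypothetical) node identifier to its number of linked VMs.
    Returns None when there are no candidate nodes.
    """
    if not linked_vms:
        return None
    if all(count == 0 for count in linked_vms.values()):
        # Assumed FCFS fallback: take the first node in insertion order.
        return next(iter(linked_vms))
    centre = median(linked_vms.values())
    # Ties are resolved in favour of the node that appears first.
    return min(linked_vms, key=lambda node: abs(linked_vms[node] - centre))


candidates = {"n1": 4, "n2": 9, "n3": 5, "n4": 12, "n5": 6}
print(mad(candidates.values()))     # spread of the VM counts
print(select_node_mad(candidates))  # node closest to the median count
```

For the example counts, the median is 6 and the absolute deviations are 2, 3, 1, 6, and 0, so the MAD is 2 and node n5 is selected.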

B.: Interquartile Range (IQR)

The interquartile range (IQR) is defined as the difference between the third quartile (Q3) and the first quartile (Q1) of the data [46]:

IQR = Q3 − Q1

(10)

IQR is widely used to measure the spread of the central 50% of a dataset and is effective in detecting outliers when combined with multiplier thresholds (e.g., 1.5 times the IQR).

For node selection, IQR is applied to the number of linked VMs:

Data collection: Obtain the count of linked VMs across candidate nodes.
Quartile calculation: Sort the data, compute Q1 and Q3, and then derive the IQR.
Threshold determination: Establish a lower threshold Q1 − 1.5 × IQR and an upper threshold Q3 + 1.5 × IQR.
Candidate filtering: Nodes with VM counts below the lower threshold are classified as underutilized, while those within the threshold are deemed acceptable. Nodes exceeding the upper threshold are considered overutilized and are excluded from selection.
Selection logic: Preferentially select the underutilized node with the fewest VMs; if none exist, choose the node with an acceptable load.
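
A minimal sketch of this IQR-based selection is given below. The quartile convention is an assumption (the text does not state which one is used); here Python's `statistics.quantiles` with the inclusive method is chosen, and the function names and the data layout are illustrative only.

```python
from statistics import quantiles


def iqr_thresholds(counts, k=1.5):
    """Return (lower, upper) = (Q1 - k * IQR, Q3 + k * IQR), with IQR as in Equation (10)."""
    q1, _, q3 = quantiles(list(counts), n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def select_node_iqr(linked_vms, k=1.5):
    """Prefer the underutilized node with the fewest VMs, else the lightest acceptable node.

    linked_vms maps a (hypothetical) node identifier to its number of linked VMs.
    """
    if len(linked_vms) < 2:
        # Quartiles need at least two observations; return the only node, if any.
        return next(iter(linked_vms), None)
    lower, upper = iqr_thresholds(linked_vms.values(), k)
    underutilized = {n: c for n, c in linked_vms.items() if c < lower}
    acceptable = {n: c for n, c in linked_vms.items() if lower <= c <= upper}
    # Nodes above the upper threshold are in neither pool, so they are never selected.
    # The smallest count is never above the upper threshold, so the pool is not empty.
    pool = underutilized or acceptable
    return min(pool, key=pool.get)


candidates = {"n1": 4, "n2": 5, "n3": 6, "n4": 7, "n5": 30}
print(iqr_thresholds(candidates.values()))
print(select_node_iqr(candidates))
```

With the example counts, Q1 = 5 and Q3 = 7, so the thresholds are 2 and 10. Node n5 (30 VMs) is classified as overutilized and excluded, no node is underutilized, and n1 is chosen as the lightest acceptable node.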

C.: Load Ratio (LR)

The load ratio is defined as a normalized metric that compares a node’s forecasted load to its capacity or a baseline load expectation. Mathematically, it can be expressed as follows [47]:

\text{Load Ratio} = \frac{\text{Forecasted Load}}{\text{Capacity}}

(11)

where the forecasted load is a prediction of the workload (e.g., the number of connected VMs) obtained with a forecasting technique such as ARIMA, and the capacity (or baseline load) is a predefined maximum allowable value that reflects the capacity of the node. The load ratio is a normalized quantity for comparing and estimating the relative utilization of nodes regardless of differences in their actual capacities. A low load ratio indicates that a node is underused in relation to its capacity, whereas a high load ratio can mean that a node is at, or in danger of exceeding, its ideal level of loading.

The implementation of this model for the node selection is applied as follows:

Data collection:
Gather the forecasted load values (e.g., predicted number of linked VMs) and the capacity or baseline load for each candidate node.
Ratio calculation:
Compute the load ratio for each node by dividing its forecasted load by its capacity, as in Equation (11).
Threshold determination:
Define thresholds or criteria to classify nodes based on their load ratios. For example, nodes may be categorized as:
○ Underutilized: Nodes with a low load ratio, indicating significant available capacity.
○ Balanced: Nodes with a load ratio within an acceptable range.
○ Overutilized: Nodes with a high load ratio, suggesting that they are approaching or exceeding their resource limits.
Candidate filtering:
Filter nodes based on their classification:
○ Underutilized nodes: These are preferred since they have the greatest capacity headroom.
○ Balanced (acceptable) nodes: Considered if no underutilized nodes are available.
○ Overutilized nodes: Excluded from selection to prevent potential overloads.
Selection logic:
Preferentially select the node with the lowest load ratio to ensure efficient and balanced resource utilization. If multiple nodes exhibit similar ratios, additional criteria (such as current load or historical performance) can be used as tie-breakers.

This load ratio approach allows the system to make dynamic, capacity-aware decisions during node selection, promoting proactive resource management and preventing any single node from becoming a bottleneck.
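
The load ratio procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the classification thresholds (0.3 and 0.8), the function names, and the data layout are hypothetical and are not given in the text, and the forecasted loads are supplied directly instead of being produced by a forecasting model.

```python
def load_ratio(forecasted_load, capacity):
    """Forecasted load divided by capacity, as in Equation (11)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return forecasted_load / capacity


def classify(ratio, low=0.3, high=0.8):
    """Classify a node by its load ratio; the default thresholds are assumed values."""
    if ratio < low:
        return "underutilized"
    if ratio <= high:
        return "balanced"
    return "overutilized"


def select_node_lr(nodes, low=0.3, high=0.8):
    """Select the non-overutilized node with the lowest load ratio.

    nodes maps a (hypothetical) node identifier to (forecasted_load, capacity).
    Returns None when every node is overutilized or there are no nodes.
    """
    ratios = {n: load_ratio(f, c) for n, (f, c) in nodes.items()}
    eligible = {n: r for n, r in ratios.items()
                if classify(r, low, high) != "overutilized"}
    if not eligible:
        return None
    # The lowest ratio wins, so underutilized nodes are preferred automatically.
    return min(eligible, key=eligible.get)


candidates = {"n1": (12, 20), "n2": (5, 25), "n3": (19, 20)}
print({n: classify(load_ratio(f, c)) for n, (f, c) in candidates.items()})
print(select_node_lr(candidates))
```

In the example, the load ratios are 0.60, 0.20, and 0.95, so n3 is excluded as overutilized and n2 is selected. Because the ratio is normalized by capacity, n2 is preferred even though the nodes have different absolute capacities.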

The proposed DQN and these statistical methods were each simulated for about two hours under the scenario described above to obtain more robust results. Figure 8, Figure 9 and Figure 10 show the energy consumption, number of migrations, and migration time of the proposed DQN for VM migration in comparison with MAD, LR, and IQR. Figure 8 depicts the comparative energy consumption, where the DQN-based migration policy has the lowest energy consumption among the compared methods (LR, IQR, MAD). Figure 9 shows the comparative number of migrations. The DQN policy triggers fewer migrations than MAD and IQR, while LR is comparable to DQN, because LR considers forecasted utilization patterns, which helps it select nodes more efficiently than IQR and MAD. Both LR and DQN use threshold-based triggering, which prevents unnecessary migrations due to minor oscillations. LR, however, lacks learning, whereas DQN continues to improve its decisions over time and becomes more adaptive. Figure 10 shows the comparative migration time: the DQN model incurs the lowest migration time, while IQR and MAD incur higher migration times.

The proposed DQN is expected to yield further improvements for the following reasons:

Dynamic adaptation:
Unlike static statistical thresholds derived from historical data, the DQN continually learns and adapts to current system dynamics, making it more resilient to changes in workload patterns.
Multivariate optimization:
DQN considers multiple performance indicators simultaneously (e.g., CPU, memory, energy, and VM count) rather than relying on a single metric. This holistic approach can optimize trade-offs between competing objectives.
Proactive decision making:
The DQN’s ability to predict future states allows for proactive adjustments in node selection, potentially preventing overload before it occurs.
Scalability:
As the system grows in complexity, the DQN model can scale to incorporate additional features and parameters, whereas the statistical methods might require significant re-tuning.

Table 4 presents a comparative discussion of MAD, LR, IQR, and DQN.

The statistical techniques (IQR, LR, and MAD) offer straightforward, reliable baseline methodologies that are simple to apply and understand. They function well in settings with consistent workloads or with constrained computational power. They are not, however, flexible enough to change with the times or take into account a wider variety of performance indicators. On the other hand, the DQN approach performs better in a dynamic, complex setting like BBU cloud systems. It is a better option for resource migration and node selection because of its capacity to learn from current system behavior, optimize across several dimensions, and predict future system states.

5.4. A Comparative Analysis with Some Time Series Methods

This section reports comparative results between the proposed DQN and two time series forecasting methods: simple exponential smoothing (SES) and auto-regressive integrated moving average (ARIMA).

A.: Simple exponential smoothing (SES)

Simple exponential smoothing (SES) is used to predict the near-term load (measured as the number of linked virtual machines) of candidate nodes in a BBU cloud environment. The objective is to proactively choose a node that is expected to be underutilized given its anticipated resource demand [48].

The key steps for the implementation of this method are as follows:
- Candidate collection:
  The algorithm iterates through a set of eligible nodes and collects each node along with its current number of linked VMs. Each node’s current value serves as an initial forecast since historical data or a previous forecast may not be available.
- Forecast update:
  For each candidate node, the forecast is updated using the SES formula:
  
  forecast = α × current Value + (1 − α) × previous Forecast
  
  (12)
  
  In this simplified implementation, the forecast is initialized with the current observation.
- Thresholding:
  A threshold is defined (e.g., 10 linked VMs) to distinguish underutilized nodes from acceptable ones. Nodes with a forecast below this threshold are considered underutilized.
- Node selection:
  ○ If there are underutilized nodes, the algorithm selects the node with the smallest forecasted value.
  ○ If no underutilized node exists, it selects from the acceptable nodes based on the minimum forecasted load.
  ○ If no candidate nodes exist or if no node has any linked VMs, a fallback selection strategy (e.g., FCFS) is used.

This approach allows for dynamic and proactive resource management by forecasting the load on each node and selecting the node that is most likely to be underutilized in the near future.
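
The SES-based selection can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments: the smoothing factor α = 0.5, the per-node history lists, and the function names are assumptions, while the threshold of 10 linked VMs is the example value mentioned in the steps above.

```python
def ses_update(current_value, previous_forecast, alpha=0.5):
    """One SES step, as in Equation (12)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * current_value + (1.0 - alpha) * previous_forecast


def ses_forecast(history, alpha=0.5):
    """Smooth a non-empty list of observed VM counts (oldest first).

    The forecast is initialized with the first observation, so a node with a
    single observation is forecast at its current value.
    """
    forecast = history[0]
    for value in history[1:]:
        forecast = ses_update(value, forecast, alpha)
    return forecast


def select_node_ses(histories, alpha=0.5, threshold=10):
    """Select the node with the smallest forecast, preferring underutilized nodes.

    histories maps a (hypothetical) node identifier to its observed VM counts.
    """
    candidates = {n: h for n, h in histories.items() if h}
    if not candidates or all(h[-1] == 0 for h in candidates.values()):
        # Assumed FCFS fallback: first node in insertion order, or None.
        return next(iter(histories), None)
    forecasts = {n: ses_forecast(h, alpha) for n, h in candidates.items()}
    underutilized = {n: f for n, f in forecasts.items() if f < threshold}
    pool = underutilized or forecasts
    return min(pool, key=pool.get)


histories = {"n1": [8, 12, 16], "n2": [14, 10, 6], "n3": [20, 20, 20]}
print({n: ses_forecast(h) for n, h in histories.items()})
print(select_node_ses(histories))
```

With α = 0.5, the forecasts are 13 for n1, 9 for n2, and 20 for n3. Only n2 falls below the threshold of 10, so it is selected even though n1 started with fewer linked VMs, which illustrates how the forecast follows the recent trend.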

B.: Auto-regressive integrated moving average (ARIMA)

This method converts the IQR-based node selection method into one that uses a time series forecasting model (ARIMA) to predict future resource usage (the number of linked VMs) for each candidate node. The idea is to use the historical data of each node to forecast its next time-step value and then apply a similar thresholding approach to the forecasted values in order to identify underutilized or acceptable nodes [49].

The key steps for the implementation of this method are as follows:

○: Data collection and forecasting: For each candidate node, the algorithm retrieves its historical time series of linked VMs. An ARIMA model is applied to predict the next value for the number of linked VMs. If sufficient historical data are unavailable, it falls back to using the current observed value.
○: Threshold computation: The forecasted values for all candidate nodes are sorted, and the first quartile (Q1) and third quartile (Q3) are computed. The interquartile range (IQR) is used to derive a lower threshold and an upper threshold (using the conventional 1.5 × IQR rule).
○: Candidate filtering: Nodes with forecasted values below the lower threshold are classified as underutilized. Nodes with forecasted values within the threshold are acceptable. Nodes with forecasted values above the upper threshold are considered overutilized and are not selected.
○: Selection: Preference is given to underutilized nodes (lowest forecasted load), and if none are available, then acceptable nodes are considered. If no node meets the criteria, a fallback strategy (such as FCFS) is used.

This ARIMA-based approach introduces a dynamic, time series forecasting element to the node selection process, allowing decisions to be made not just on the current snapshot but on an informed prediction of future load, which can lead to more proactive resource management in BBU cloud environments.
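
The data collection and forecasting step, including the fallback to the current observed value, can be sketched as follows. The forecaster is passed in as a function so that an ARIMA model from a statistics library could be plugged in. The default `drift_forecast` below is only a simple placeholder (last value plus mean step) and is not ARIMA; it, the minimum history length of 5, and all names are assumptions made for illustration. The resulting forecasts would then be filtered with the 1.5 × IQR thresholds described in the steps above.

```python
def drift_forecast(history):
    """Placeholder forecaster (NOT ARIMA): last value plus the mean step."""
    if len(history) < 2:
        return history[-1]
    steps = [b - a for a, b in zip(history, history[1:])]
    return history[-1] + sum(steps) / len(steps)


def forecast_next(history, forecaster=drift_forecast, min_history=5):
    """Forecast the next number of linked VMs for one node.

    Falls back to the current observed value when fewer than min_history
    observations are available (min_history is an assumed setting).
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if len(history) < min_history:
        return history[-1]
    return forecaster(history)


# Hypothetical per-node histories of linked-VM counts, oldest first.
histories = {"n1": [2, 4, 6, 8, 10], "n2": [7, 7], "n3": [9, 8, 8, 7, 5]}
forecasts = {node: forecast_next(h) for node, h in histories.items()}
print(forecasts)
```

In the example, n2 has too little history and keeps its current value of 7, while n1 and n3 are extrapolated to 12 and 4. Keeping the forecaster as a parameter separates the fallback and thresholding logic from the choice of time series model.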

The proposed DQN and these time series methods were also simulated for about two hours under the scenario described above. Figure 11, Figure 12 and Figure 13 show the energy consumption, number of migrations, and migration time.

As observed from the results, DQN performs better than SES and ARIMA in terms of energy consumption, number of migrations, and migration time, at the cost of higher computational overhead and implementation complexity. Table 5 gives a comparative analysis of the proposed DQN against ARIMA and SES.

6. Conclusions

To use resources efficiently and reduce energy consumption, a C-RAN must manage the processing workload across the BBUs in a BBU pool. In this work, a deep reinforcement learning (DRL) scheme was modeled and analyzed that evaluates server utilization in the BBU pool and migrates load between vBBU nodes when predefined thresholds are crossed. To prevent the excessive energy consumption caused by excessive VM migration in the BBU pool, the dynamic migration strategy selects VMs on overloaded hosts so as to minimize the number of migrations and the migration duration, and this adaptive scheme reduces the energy consumption of the BBU pool. The proposed deep Q-network (DQN)-based migration mechanism improves cloud radio access network (C-RAN) management by optimizing VM migrations, minimizing energy consumption, and improving load balancing. Unlike traditional Kalman filtering, statistical methods (LR, IQR, MAD), and time-series forecasting (SES, ARIMA), DQN learns dynamically from real-time data and adapts migration decisions proactively without static thresholds. This adaptive approach significantly reduces unnecessary migrations, enhances system efficiency, and supports better resource allocation across cloud infrastructures. In addition, DQN demonstrates improved scalability, security, and robustness, which makes it well suited to modern AI-driven cloud environments. The model adapts to high-dimensional data and dynamically changing workloads, which traditional methods handle poorly. By applying deep reinforcement learning, this study provides an efficient mechanism for optimizing resource utilization in future wireless networks.

Author Contributions

Conceptualization, S.F.I. and D.J.K.; methodology, S.F.I. and D.J.K.; software, S.F.I.; validation, S.F.I. and D.J.K.; formal analysis, S.F.I. and D.J.K.; investigation, S.F.I. and D.J.K.; resources, S.F.I. and D.J.K.; data curation, S.F.I. and D.J.K.; writing—original draft preparation, S.F.I.; writing—review and editing, S.F.I. and D.J.K.; visualization, S.F.I. and D.J.K.; supervision, D.J.K.; project administration, D.J.K.; funding acquisition, S.F.I. and D.J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Qadir, Z.; Le, K.N.; Saeed, N.; Munawar, H.S. Towards 6G Internet of Things: Recent advances, use cases, and open challenges. ICT Express 2022, 1, 1–17.
Guo, F.; Yu, F.R.; Zhang, H.; Li, X.; Ji, H.; Leung, V.C. Enabling massive IoT toward 6G: A comprehensive survey. IEEE Internet Things J. 2021, 8, 11891–11915.
Faizan, Q. Enhancing QOS Performance of the 5G Network by Characterizing Mm-Wave Channel and Optimizing Interference Cancellation Scheme/Faizan Qamar; University of Malaya: Kuala Lumpur, Malaysia, 2019.
Bani-Bakr, A.; Dimyati, K.; Hindia, M.N.; Wong, W.R.; Izam, T.F.T.M.N. Joint successful transmission probability, delay, and energy efficiency caching optimization in fog radio access network. Electronics 2021, 10, 1847.
Abdulsaheb, J.A.; Kadhim, D.J. Robot Path Planning in Unknown Environments with Multi-Objectives Using an Improved COOT Optimization Algorithm. Int. J. Intell. Eng. Syst. 2022, 15, 548–565.
Chen, S.; Liang, Y.-C.; Sun, S.; Kang, S.; Cheng, W.; Peng, M. Vision, requirements, and technology trend of 6G: How to tackle the challenges of system coverage, capacity, user data-rate and movement speed. IEEE Wirel. Commun. 2020, 27, 218–228.
Jabbar, S.Q.; Kadhim, D.J.; Li, Y. Developing a Video Buffer Framework for Video Streaming in Cellular Networks. Wirel. Commun. Mob. Comput. 2018, 2018, 6584845.
Jaber, Z.H.; Kadhim, D.J.; Al-Araji, A.S. Medium access control protocol design for wireless communications and networks review. Int. J. Electr. Comput. Eng. (IJECE) 2022, 12, 1711–1723.
Hindia, M.; Qamar, F.; Majed, M.B.; Abd Rahman, T.; Amiri, I.S. Enabling remote-control for the power sub stations over LTE-A networks. Telecommun. Syst. 2019, 70, 37–53.
Hasan, M.Y.; Kadhim, D.J. A new smart approach of an efficient energy consumption management by using a machine-learning technique. Indones. J. Electr. Eng. Comput. Sci. 2022, 25, 68–78.
Chih-Lin, I.; Rowell, C.; Han, S.H.; Xu, Z.; Li, G.; Pan, Z. Toward green and soft: A 5G perspective. IEEE Commun. Mag. 2014, 52, 66–73.
Lin, Y.; Shao, L.; Zhu, Z.; Wang, Q.; Sabhikhi, R.K. Wireless network cloud: Architecture and system requirements. IBM J. Res. Dev. 2010, 54, 4:1–4:12.
Gao, Z.; Zhang, J.; Yan, S.; Xiao, Y.; Simeonidou, D.; Ji, Y. Deep Reinforcement Learning for BBU Placement and Routing in C-RAN. In Proceedings of the 2019 Optical Fiber Communications Conference and Exhibition (OFC), San Diego, CA, USA, 3–7 March 2019; p. 18618440.
Li, Y.; Bhopalwala, M.; Das, S.; Yu, J.; Mo, W.; Ruffini, M.; Kilper, D.C. Joint Optimization of BBU Pool Allocation and Selection for C-RAN Networks. In Proceedings of the 2018 Optical Fiber Communications Conference and Exposition (OFC), San Diego, CA, USA, 11–15 March 2018; p. 17855949.
Pizzinat, A.; Chanclou, P.; Saliou, F.; Diallo, T. Things You Should Know About Fronthaul. J. Light. Technol. 2015, 33, 1077–1083.
Ismail, S.F.; Kadhim, D.J. Towards 6G Technology: Insights into Resource Management for Cloud RAN Deployment. IoT 2024, 5, 409–448.
Zhang, Y.; Budzisz, L.; Meo, M.; Conte, A.; Haratcherev, I.; Koutitas, G.; Tassiulas, L.; Marsan, M.A.; Lambert, S. An overview of energy-efficient base station management techniques. In Proceedings of the 2013 24th Tyrrhenian International Workshop on Digital Communications-Green ICT (TIWDC), Genoa, Italy, 23–25 September 2013; pp. 1–6.
Ran, C.; Wang, S.H.; Wang, C. Optimal load balancing in cloud radio access networks. In Proceedings of the Wireless Communications and Networking Conference, WCNC, New Orleans, LA, USA, 9–12 March 2015; pp. 1006–1011.
Debaillie, B.; Desset, C.; Louagie, F. A flexible and future-proof power model for cellular base stations. In Proceedings of the 2015 IEEE 81st Vehicular Technology Conference, VTC Spring, Scotland, UK, 11–14 May 2015; pp. 1–7.
Checko, A.; Christiansen, H.L.; Yan, Y.; Scolari, L.; Kardaras, G.; Berger, M.S.; Dittmann, L. Cloud ran for mobile networks—A technology overview. IEEE Commun. Surv. Tutor. 2015, 17, 405–426.
Vaezi, M.; Zhang, Y. Cloud Mobile Networks; Springer: Cham, Switzerland, 2017.
Mahapatra, B.; Kumar, R.; Kumar, S.; Turuk, A.K. A Heterogeneous Load Balancing Approach in Centralized BBU-Pool of C-RAN Architecture. In Proceedings of the 2018 3rd International Conference for Convergence in Technology (I2CT), Pune, India, 6–8 April 2018; pp. 1–5.
Yıldız, O.; Sokullu, R.I. Deep Q-Learning based resource allocation and load balancing in a mobile edge system serving different types of user requests. J. Electr. Eng. 2023, 74, 48–56.
Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Qian, M.; Wang, Y.; Zhou, Y.; Tian, L.; Shi, J. A super base station based centralized network architecture for 5g mobile communication systems. Digit. Commun. Netw. 2015, 1, 152–159.
Wang, C.; Wang, Y.; Gong, C.; Wan, Y.; Cai, L.; Luo, Q. A study on virtual bs live migration—A seamless and lossless mechanism for virtual bs migration. In Proceedings of the Annual International Symposium on Personal, Indoor, and Mobile Radio Communications, PIMRC, London, UK, 8–11 September 2013; pp. 2803–2807.
Beloglazov, A.; Buyya, R. Adaptive threshold-based approach for energy-efficient consolidation of virtual machines in cloud data centers. In Proceedings of the 8th International Workshop on Middleware for Grids, Clouds and e-Science, MCG 2010, Bangalore, India, 29 November–3 December 2010; pp. 4–10.
Ferdouse, L.; Ejaz, W.; Anpalagan, A.; Khattak, A.M. Joint workload scheduling and bbu allocation in cloud-ran for 5g networks. In Proceedings of the Symposium on Applied Computing, Marrakech, Morocco, 3–7 April 2017; pp. 621–627.
Schiller, E.; Ajayi, J.; Weber, S.; Braun, T.; Stiller, B. Toward a Live BBU Container Migration in Wireless Networks. IEEE Open J. Commun. Soc. 2021, 3, 301–321.
Satpathy, A.; Addya, S.K.; Turuk, A.K.; Majhi, B.; Sahoo, G. Crow search based virtual machine placement strategy in cloud data centers with live migration. Comput. Electr. Eng. 2018, 69, 334–350.
Sabella, D.; Rost, P.; Sheng, Y.; Pateromichelakis, E.; Salim, U.; Guitton-Ouhamou, P.; Girolamo, D.; Giuliani, G. Ran as a service challenges of designing a flexible ran architecture in a cloud-based heterogeneous mobile network. In Proceedings of the 2013 Future Network & Mobile Summit, Lisbon, Portugal, 3–5 July 2013; pp. 1–8.
Sharma, N.K.; Sharma, P.; Guddeti, R.M. Energy efficient quality of service aware virtual machine migration in cloud computing. In Proceedings of the International Conference on Recent Advances in Information Technology, RAIT, Dhanbad, India, 15–17 March 2018; pp. 1–6.
Melhem, S.B.; Agarwal, A.; Goel, N.; Zaman, M. Minimizing Biased VM Selection in Live VM Migration. In Proceedings of the 2017 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech), Rabat, Morocco, 24–26 October 2017.
Beloglazov, A. Energy-Efficient Management of Virtual Machines in Data Centers for Cloud Computing. Ph.D. Thesis, University of Melbourne, Melbourne, Australia, February 2013.
He, K.; Li, Z.; Deng, D.; Chen, Y. Energy-efficient framework for virtual machine consolidation in cloud data centers. Chin. Commun. 2017, 14, 192–201.
Li, Z.; Wu, G. Optimizing VM Live Migration Strategy Based on Migration Time Cost Modeling. In Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, Santa Clara, CA, USA, 17–18 March 2016; pp. 99–109.
Razali, R.A.M.; Ab Rahman, R.; Zaini, N.; Samad, M. Virtual machine migration implementation in load balancing for Cloud computing. In Proceedings of the 2014 5th International Conference on Intelligent and Advanced Systems (ICIAS), Kuala Lumpur, Malaysia, 3–5 June 2014; pp. 1–4.
Raghunath, B.R.; Annappa, B. Dynamic Resource Allocation Using Fuzzy Prediction System. In Proceedings of the 2018 3rd International Conference for Convergence in Technology (I2CT), Pune, India, 6–8 April 2018; pp. 1–6.
Yang, G.; Zhang, W.J. Research of optimized resource allocation strategy based on Openstack. In Proceedings of the 2015 12th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 18–20 December 2015.
Wang, C.; Cao, Y.; Zhang, Z.; Wang, W. Dual threshold adaptive dynamic migration strategy of virtual resources based on bbu pool. Electronics 2020, 9, 314.
Castane, G.G.; Nunez, A.; Carretero, J. iCanCloud: A brief architecture overview. In Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, Madrid, Spain, 10–13 July 2012.
Beloglazov, A.; Abawajy, J.; Buyya, R. Energy-aware resource allocation heuristics for efficient management of data centers for Cloud computing. Future Gener. Comput. Syst. 2012, 28, 755–768.
Fu, X.; Zhou, C. Virtual machine selection and placement for dynamic consolidation in Cloud computing environment. Front. Comput. Sci. 2015, 9, 322–330.
Song, Y.; Wang, H.; Li, Y.; Feng, B.; Sun, Y. Multi-Tiered On-Demand Resource Scheduling for VM-Based Data Center. In Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, Shanghai, China, 18–21 May 2009; pp. 148–155.
Li, Y.; Hu, X.; Wang, F. A robust statistical framework for anomaly detection in sensor networks using median absolute deviation. Sensors 2020, 20, 1739.
Feng, W.; Qiao, M.; Li, X. An IoT-based anomaly detection method using interquartile range and K-means clustering for smart manufacturing. Sensors 2020, 20, 4356.
Wang, X.; Zhang, Y.; Li, Q. A Load Ratio-Based Dynamic Scheduling Algorithm for Cloud Computing Environments. IEEE Access 2021, 9, 15012–15025.
Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. Statistical and Machine Learning Forecasting Methods: Concerns and Ways Forward. PLoS ONE 2018, 13, e0194889.
Benvenuto, D.; Giovanetti, M.; Vassallo, L.; Angeletti, S.; Ciccozzi, M. Application of the ARIMA Model on the COVID-2019 Epidemic Dataset. Data Brief 2020, 29, 105340.

Figure 1. Architecture of cloud radio access network.

Figure 2. Modular architecture of iCanCloud.

Figure 3. The flowchart of the proposed migration procedure.

Figure 4. Proposed virtual cloud BBU architecture.

Figure 5. Energy consumption of the proposed DQN strategy.

Figure 6. Number of migrations for the proposed DQN strategy.

Figure 7. Time of migration for the proposed DQN strategy.

Figure 8. Comparative energy consumption of DQN and the statistical methods.

Figure 9. Comparative number of migrations of DQN and the statistical methods.

Figure 10. Comparative migration time of DQN and the statistical methods.

Figure 11. Energy consumption for the comparative study with time series methods.

Figure 12. Number of migrations for the comparative study with time series methods.

Figure 13. Migration time for the comparative study with time series methods.

Table 1. Important terminology and their meaning.

Terminology and Parameters	Meaning
Over_load_Host	Overload compute node
C_N_i[CPU/Mem] util	The utilization of the resources; CPU or memory for a compute node
D_CPU/D_Mem	The difference between the upper threshold and the resource utilization of the overloaded node
C_N_i_pre[CPU/Mem]util	The prediction of resource utilization; CPU or memory for a compute node
VM _{k_[CPU/Mem]}	CPU or memory utilization of the VMk
VM _{k-[DCPU/DMem]}	Difference between VMk_[CPU/Mem] and DCPU/DMem
VM_to_migrate	Selected VM to be migrated from overload host

Table 2. Proposed simulation parameters.

Parameter	Value	Unit
Number of BBU nodes in each compute node	150	-
Number of compute nodes	2	-
CPU capacity of the host	2000 and 4000	MIPS
Memory size of the host	4096 and 6144	MB
Number of VMs in each compute node	450	-
CPU capacity of the virtual machine	1000, 2000 and 3000	MIPS
Memory size of the virtual machine	256, 512 and 1024	MB
Number of users	1500	-

Table 3. Comparison between DQN and Kalman filter prediction methods.

Metric	Kalman Filter Model	DQN Model
CPU Utilization (%)	82.5	88.7
Memory Utilization (%)	79.2	85.6
Energy Consumption (kWh)	250	210
Number of Migrations	180	130

Table 4. A comparative discussion.

Aspect	Statistical Methods (MAD, LR, and IQR)	DQN
Data Basis	Historical snapshot data	Real-time and historical combined data
Adaptability	Static thresholds; require periodic updates	Continuously adapts through learning
Multivariate Analysis	Typically univariate (e.g., VM count)	Incorporates multiple performance metrics
Decision Proactivity	Reactive; based on past distributions	Proactive; forecasts future trends
Complexity	Low computational complexity	Higher complexity; requires training time
Scalability	May need reconfiguration with new parameters	Naturally scales with high-dimensional data

Table 5. A comparative analysis between DQN vs. SES and ARIMA.

Aspect	DQN	Time Series Methods (ARIMA and SES)
Energy Consumption	Learns optimal policies to reduce energy usage and balance loads.	ARIMA works well in stable settings; SES adapts to short-term changes but struggles with sudden spikes.
Migration Time	Optimizes timing for faster and more efficient migrations.	Forecast-based migration can help, but errors may delay migrations.
Load Variance	Dynamically balances workload, reducing resource hotspots.	Predicts future loads but lacks real-time adaptation.
Number of Migrations	Minimizes unnecessary migrations, improving system stability.	Forecasting helps, but inaccurate predictions may cause excessive migrations.
Computational Overhead and Implementation Complexity	Requires more processing power for training but offers long-term benefits.	Less resource-intensive but lacks adaptability.
Decision Stability and Robustness	Learns from experience and adapts, ensuring long-term stability.	Works well in stable environments but struggles with sudden demand changes.
Security	Can be enhanced with reinforcement learning-based anomaly detection for secure migrations.	Lacks built-in security features; relies on predefined patterns, making it vulnerable to unpredictable threats.
Scalability	Adapts to high-dimensional data and large cloud infrastructures efficiently.	Requires re-tuning when scaling to larger environments.
Adaptability to Dynamic Environments	Continuously learns and updates migration strategies in real-time.	Static thresholds or pre-defined patterns may not adjust well to sudden demand shifts.
Implementation Complexity	More complex due to deep learning models but highly effective.	Easier to implement but may require frequent adjustments.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

