Article

Popularity-Aware Service Provisioning Framework in Cloud Environment

1 Department of Electronic Engineering, Kyung Hee University, Yongin-si 17104, Gyeonggi-do, Republic of Korea
2 Division of Information & Communication Engineering, Kongju National University, Cheonan-si 31080, Chungcheongnam-do, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8201; https://doi.org/10.3390/app14188201
Submission received: 12 July 2024 / Revised: 30 July 2024 / Accepted: 10 September 2024 / Published: 12 September 2024
(This article belongs to the Special Issue Cloud Computing: Challenges, Application and Prospects)

Abstract

To balance the tradeoff between quality of service (QoS) and operating expenditure (OPEX), the service provider should request the appropriate amount of resources from the cloud operator based on the estimated variation of service requests. This paper proposes a popularity-aware service provisioning framework (PASPF), which leverages the network data analytics function (NWDAF) to obtain analytics on service popularity variations. These analytics estimate the congestion level and the list of top services contributing most to the traffic change. Based on the analytics, the PASPF enables the service provider to request the appropriate amount of resources for each service for the next time period from the cloud operator. To minimize the OPEX of the service provider while keeping the average response time of the services below their requirements, we formulate a constrained Markov decision process (CMDP) problem. The optimal stochastic policy can be obtained by converting the CMDP model into a linear programming (LP) model. Evaluation results demonstrate that the PASPF can reduce the OPEX of the service provider to less than 50% of that of a popularity-non-aware scheme while keeping the average response time of the services below the requirement.

1. Introduction

In the flourishing cloud industry [1], there are two kinds of players: cloud operators (i.e., cloud providers) and service providers (i.e., cloud consumers).
Cloud operators offer three primary service models: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS) [2,3]. Specifically, IaaS provides on-demand infrastructure resources to consumers via the cloud, such as computing, storage, networking, and virtualization. Consumers do not have to manage, maintain, or update their own data center infrastructure, but they are responsible for the operating system, middleware, virtual machines, and any applications or data. IaaS gives consumers the flexibility and scalability to purchase only the components they need and scale them up or down as required. There is low overhead and no maintenance cost, making IaaS a very affordable option. However, the main drawbacks of IaaS are possible security and service reliability issues due to multi-tenant systems with multiple clients, which can be mitigated by choosing a reliable and trustworthy provider. Meanwhile, PaaS provides and manages all the hardware and software resources needed to develop applications through the cloud. Consumers can use PaaS to develop, run, and manage applications without having to build and maintain the infrastructure or platform on their own. PaaS enhances developer productivity but comes with limited customization and low flexibility due to the dependency on cloud providers. Finally, SaaS provides an entire application stack that consumers can access and use. SaaS products are completely managed by the cloud provider and come ready to use, including all updates and overall maintenance. Consumers can easily access, set up, and start using SaaS products over any internet connection on any device. However, the main drawbacks of SaaS are very limited customization and vendor lock-in issues due to the tight dependency on the cloud provider. In practice, service providers frequently use IaaS to achieve a high level of control and flexible management [4,5]. Thus, we focus on the IaaS paradigm from the perspective of service providers.
For IaaS, there are two kinds of pricing policy: pay-as-you-go and pay-per-use [6]. Under the pay-as-you-go pricing policy, the service provider pays a fixed cost to borrow a fixed amount of resources (e.g., CPU, memory, storage, etc.) for a specified period of time (e.g., 1 h) [6]. The pay-per-use pricing policy, on the other hand, charges based on actual usage; that is, a service provider who chooses pay-per-use only pays for what he/she uses. However, the unit price of pay-per-use is generally higher than that of pay-as-you-go. Since one of the objectives of this paper is to minimize the cost of the service provider, we consider the pay-as-you-go pricing policy. Under pay-as-you-go, however, if the requests for a specific service surge and the borrowed resources for that service are not sufficient to support the surge, the response time of that service cannot meet its requirement. This phenomenon can be avoided by borrowing a sufficient amount of resources, but excessive resource borrowing significantly increases the service deployment cost (i.e., operating expenditure (OPEX)) of the service provider. This indicates that the service provider should borrow an appropriate amount of resources based on the estimated service popularity variation (i.e., the variation of service requests).
To address this issue for service providers, this paper proposes a popularity-aware service provisioning framework (PASPF). In the PASPF, the service provider assesses the popularity variation of the services by requesting analytics from a network data analytics function (NWDAF) in 5G. The NWDAF provides analytics that estimate the congestion level and the list of top services contributing most to the traffic change for the next time period. After receiving these analytics, the service provider communicates with the cloud operator to request an appropriate allocation of resources for each service in the forthcoming period. To minimize the OPEX of the service provider while ensuring that the average response time of the services remains within the specified requirements, a constrained Markov decision process (CMDP) problem was formulated. We derived the optimal stochastic policy by transforming the CMDP model into a linear programming (LP) model. The evaluation results indicate that the PASPF can reduce the service provider's OPEX by more than 50% compared to a popularity-non-aware scheme while keeping the average response time of the services below the requirement. Moreover, it was found that the PASPF can dynamically adjust its operational policy based on the environment, which can serve as a guideline for service providers when designing service provisioning frameworks.
The contributions of this paper are as follows: (1) We construct the popularity estimation model, which can be used to generate the analytics in the NWDAF. (2) The PASPF derives the optimal policy on how many resources to request for each service. The optimal policy can be derived from a low-complexity and practical LP model. (3) The evaluation results under various conditions offer valuable insights for cost-efficient service provisioning.
The rest of this paper is organized as follows: Section 2 provides a summary of related works. Section 3 introduces the PASPF. Section 4 describes the CMDP model. Section 5 presents the evaluation results. Section 6 concludes this paper.

2. Related Work

Many works have studied how to deploy services in cloud environments while considering various performance metrics (e.g., deployment cost, reliability, response time, operational cost, etc.) [7,8,9,10,11,12,13].
Wen et al. [7] formulated a multi-objective optimization problem to maximize the reliability of the services and minimize their deployment cost. In addition, they proposed a heuristic algorithm that minimizes the selected objective while bounding the other objective. Shi et al. [8] formulated a mixed-integer linear programming model to minimize the deployment cost while keeping the average response time of the services below the requirement. Then, they divided the problem into linearized sub-problems to solve it with existing solvers. Menzel et al. [9] developed a genetic algorithm-based approach to balance the tradeoff between the response time and the deployment cost of the services. Wu et al. [10] formulated an optimization problem to minimize the overall operational cost during a certain time interval. Deng et al. [11] introduced a model for deploying sub-component-based services to minimize the deployment cost while satisfying response time requirements. Chen et al. [12] developed a heuristic approach that deploys sub-components in a heterogeneous cloud environment to reduce transmission latency. Chen et al. [13] formulated an edge service deployment optimization problem and proved that it is NP-hard. Then, they proposed an approximation algorithm to find a near-optimal solution with low complexity. Chen et al. [14] introduced a cooperative edge caching strategy that retrieves content from several servers in collaboration while taking into account the popularity of the contents. Deng et al. [15] developed a heuristic algorithm aimed at reducing the average response time of services by taking content popularity into account. Wu et al. [16] formulated a joint optimization problem for service provisioning, server activation, and scheduling, taking into account the fluctuations in service popularity over time.
However, no existing work has jointly considered the dynamics of service popularity and the pay-as-you-go pricing policy to minimize the deployment cost (i.e., OPEX) of the service provider.

3. Popularity-Aware Service Provisioning Framework

3.1. Overall Framework

Figure 1 shows the system model of this paper. The service provider operates $N_S$ services for mobile devices (MDs). For the service provision, the service provider exploits the IaaS of the cloud operator based on the pay-as-you-go pricing policy. That is, if the service provider requests $A_i$ resources (e.g., CPU, memory, storage, etc.) for the provision of service $i$ during a contract time $T_D$ (e.g., 1 h), it should pay $\rho A_i$ to the cloud operator, where $\rho$ is the unit cost for using one resource unit during the contract time. After receiving the resource request message, the cloud operator allocates $A_i$ resources to the service instance.
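Under this pricing model, the OPEX for one contract period is simply the unit cost times the total amount of requested resources. The following minimal Python sketch illustrates the computation; the values of $\rho$ and $A_i$ are illustrative and not taken from the paper.

def contract_opex(rho: float, requested_resources: list[int]) -> float:
    """OPEX for one contract period T_D under pay-as-you-go: rho * sum_i A_i."""
    return rho * sum(requested_resources)

# Example: three services requesting A = [2, 5, 1] resource units with rho = 1.0
print(contract_opex(1.0, [2, 5, 1]))  # -> 8.0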
As shown in Figure 1, when an MD requests the service $i$, the service request is enqueued in the service queue and handled in a first-in-first-out (FIFO) manner. In addition, the processing rate of each service is proportional to the requested resources $A_i$. This means that the response time of a request is determined by the number of requests in the service queue and the requested resources. Therefore, when many MDs request the service $i$ simultaneously and insufficient resources are requested for that service, the response time of that service cannot meet its requirement for some MDs. This phenomenon can be avoided by requesting enough resources. However, since the cost of the IaaS is proportional to the requested resources $A_i$, excessive resource requesting significantly increases the OPEX of the service provider. That is, there is a tradeoff between the service response time and the OPEX. To optimize this tradeoff, in the PASPF, the service provider requests analytics on the popularity variation of each service from the NWDAF in 5G. As specified in the standard [17], the NWDAF can provide analytics (i.e., service popularity dynamics) that estimate the congestion level and the list of top services contributing most to the traffic change for the next time period. After receiving the analytics, the service provider can decide the optimal amount of resources for each service for the next time period. In short, the PASPF operates as follows:
  • The PASPF leverages the NWDAF, where the popularity estimation model runs, to obtain analytics on the popularity variation of each service.
  • Based on the analytics, the PASPF decides the optimal amount of resources for the next time period and requests the corresponding resources from the cloud operator (a sketch of this loop is shown after this list).
Note that, for the optimal decision, we formulated the CMDP model, which will be described in the next section.
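For concreteness, the following Python sketch outlines this two-step loop. The NWDAF and cloud-operator objects are stubs with hypothetical method names (query_analytics, request_resources), and the decision rule is only a placeholder for the CMDP policy derived in Section 4.

from typing import Dict, List

class StubNWDAF:
    """Illustrative stub; a real deployment would invoke the NWDAF analytics service (3GPP TS 23.288 [17])."""
    def query_analytics(self, services: List[str]) -> Dict[str, int]:
        return {s: 10 for s in services}  # estimated popularity (requests per epoch) per service

class StubCloudOperator:
    """Illustrative stub for the IaaS resource-request interface of the cloud operator."""
    def request_resources(self, requests: Dict[str, int], duration_s: int) -> None:
        print(f"Requesting {requests} for {duration_s} s")

def decide_resources(popularity: int, r_max: int = 8) -> int:
    # Placeholder rule; the PASPF instead samples from the optimal stochastic policy (Section 4).
    return min(r_max, max(1, popularity // 4))

def paspf_loop(services: List[str], nwdaf: StubNWDAF, cloud: StubCloudOperator,
               contract_time_s: int, periods: int) -> None:
    for _ in range(periods):
        analytics = nwdaf.query_analytics(services)                         # step 1: obtain analytics
        requests = {s: decide_resources(p) for s, p in analytics.items()}   # step 2: decide resources
        cloud.request_resources(requests, duration_s=contract_time_s)
        # ...wait until the contract period T_D expires before the next iteration...

paspf_loop(["video", "map"], StubNWDAF(), StubCloudOperator(), contract_time_s=3600, periods=1)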

3.2. Popularity Estimation Model

To estimate the popularity variation of each service, we constructed a general long short-term memory (LSTM) model based on TensorFlow and trained it with the Kaggle public dataset [18]. We split the dataset into 80% for the training set and 20% for the test set. Note that the training set was utilized for the model design, and the test set was used for validation and testing. The hyper-parameters are as follows: the number of layers and the hidden size were set to 5 and 16, respectively; the input and window sizes were set to 1 and 24, respectively; and the learning rate was set to 0.01.
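A minimal Keras sketch of such an LSTM estimator with the stated hyper-parameters is shown below. The synthetic series only stands in for the Kaggle traffic dataset [18], and the sliding-window preprocessing and 80/20 split reflect our reading of the text rather than the authors' exact pipeline.

import numpy as np
import tensorflow as tf

WINDOW, HIDDEN, LAYERS = 24, 16, 5  # window size 24, hidden size 16, 5 LSTM layers

def make_windows(series: np.ndarray, window: int = WINDOW):
    # Build (samples, window, 1) inputs and next-step targets from a 1-D series.
    x = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return x[..., None], y

model = tf.keras.Sequential(
    [tf.keras.layers.LSTM(HIDDEN, return_sequences=(i < LAYERS - 1)) for i in range(LAYERS)]
    + [tf.keras.layers.Dense(1)]
)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss="mse")

series = np.sin(np.linspace(0, 50, 1000)).astype("float32")  # synthetic stand-in for the request counts
x, y = make_windows(series)
split = int(0.8 * len(x))                                    # 80% training / 20% test split
model.fit(x[:split], y[:split], epochs=5, verbose=0)
print("test MSE:", model.evaluate(x[split:], y[split:], verbose=0))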
Figure 2 shows the real and estimated numbers of Amazon service requests as a function of time. As shown in Figure 2, the constructed model captured the popularity variation well. Specifically, the accuracy of the constructed model was over approximately 95%, which means that the model generalizes well for the estimation. This indicates that the NWDAF can provide accurate analytics estimating the top services contributing most to the traffic change for the next time period. Thus, based on the analytics, the service provider can request the appropriate amount of resources for each service for the next time period from the cloud operator.

4. Constrained MDP

As shown in Figure 3, in the CMDP model, the agent executes a sequence of specific actions to minimize (or maximize) the cost (or reward) under given constraints [19]. In this paper, the service provider decides the amount of resources for each service, $R_i$, for the next time period at the time epochs $t \in \{1, 2, 3, \ldots\}$ to minimize the OPEX of the service provider while keeping the average response time of the services below their requirements. Table A1 summarizes the important notations.

4.1. State Space

The overall state space $\mathcal{S}$ is defined as
$\mathcal{S} = \mathcal{T} \times \prod_i \left( \mathcal{R}_i \times \mathcal{P}_i \times \mathcal{Q}_i \right)$
where $\mathcal{T}$ denotes the state space for the remaining contract time, $\mathcal{R}_i$ and $\mathcal{P}_i$ represent the state spaces for the allocated resources and the popularity of the service $i$, respectively, and $\mathcal{Q}_i$ is the state space for the queue length of the service $i$.
Since $T_D$ denotes the contract time, $\mathcal{T}$ can be defined as
$\mathcal{T} = \{ 0, 1, 2, 3, \ldots, T_D \}.$
Let $R_{\max}$ denote the maximum resources that can be allocated to one instance for the service $i$. Then, $\mathcal{R}_i$ can be described as
$\mathcal{R}_i = \{ 0, 1, 2, \ldots, R_{\max} \}.$
Meanwhile, the popularity of the service $i$ can be represented as the number of service requests during the time epoch. Then, $\mathcal{P}_i$ can be described as
$\mathcal{P}_i = \{ 0, 1, 2, \ldots, P_{\max} \}$
where $P_{\max}$ denotes the maximum number of service requests during the time epoch.
When $Q_{\max}$ represents the maximum queue length, $\mathcal{Q}_i$ can be represented as
$\mathcal{Q}_i = \{ 0, 1, 2, \ldots, Q_{\max} \}.$
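For a single service and deliberately small bounds, the discrete state space can be enumerated as a Cartesian product, as in the following sketch; the numerical bounds are illustrative only.

from itertools import product

T_D, R_MAX, P_MAX, Q_MAX = 2, 2, 2, 2     # illustrative bounds

T_space = range(T_D + 1)                  # remaining contract time: 0..T_D
R_space = range(R_MAX + 1)                # allocated resources:     0..R_max
P_space = range(P_MAX + 1)                # popularity (requests):   0..P_max
Q_space = range(Q_MAX + 1)                # queue length:            0..Q_max

states = list(product(T_space, R_space, P_space, Q_space))
print(len(states))  # (T_D+1)(R_max+1)(P_max+1)(Q_max+1) = 81 states for one service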

4.2. Action Space

The total action space $\mathcal{A}$ can be defined as
$\mathcal{A} = \prod_i \mathcal{A}_i$
where $\mathcal{A}_i$ is the action space for the service $i$.
Since $R_{\max}$ denotes the maximum resources that can be allocated for the service $i$, $\mathcal{A}_i$ can be represented as
$\mathcal{A}_i = \{ 0, 1, 2, \ldots, R_{\max} \}.$

4.3. Transition Probability

Since $T$ denotes the remaining contract time, the allocated resources for the service $i$ can change according to the chosen action $A_i$ only when $T = 0$. That is, the transition of $R_i$ is affected by $T$ and $A_i$. Meanwhile, the popularity of the service $i$, $P_i$, influences the incoming rate of the queue for the service $i$. In addition, the allocated resources for the service $i$, $R_i$, affect the processing rate of the queue for that service. To sum up, the transition of $Q_i$ is affected by $P_i$ and $R_i$. The other states transition independently of each other. Thus, the transition probability from the current state $S$ to the next state $S'$ can be defined as
$P[S' \mid S, A] = P[T' \mid T] \times \prod_i P[R_i' \mid R_i, T, A_i] \times P[P_i' \mid P_i] \times P[Q_i' \mid Q_i, P_i, R_i]$
where $T$ and $T'$ are the current and next states of the remaining contract time. In addition, $R_i$, $P_i$, and $Q_i$ (or $R_i'$, $P_i'$, and $Q_i'$) are the current (or next) states of the allocated resources, the popularity, and the queue length for the service $i$, respectively.
When the current contract time is over (i.e., $T = 0$), the service provider requests the resources for the next contract period. Therefore, when $T = 0$, the next state of the remaining contract time, $T'$, is always $T_D$:
$P[T' \mid T = 0] = \begin{cases} 1, & \text{if } T' = T_D \\ 0, & \text{otherwise}. \end{cases}$
Meanwhile, when the current contract time is not over (i.e., $T \neq 0$), the remaining contract time decreases one by one. Thus, the corresponding transition probability can be represented as
$P[T' \mid T \neq 0] = \begin{cases} 1, & \text{if } T' = T - 1 \\ 0, & \text{otherwise}. \end{cases}$
The service provider periodically requests $A_i$ resources for the provision of the service $i$, and the request period is the contract time $T_D$. That is, only when the current contract time is over (i.e., $T = 0$) do the allocated resources for the service $i$, $R_i$, change according to the requested resources $A_i$. Therefore, the corresponding transition probabilities can be described as
$P[R_i' \mid R_i, T = 0, A_i] = \begin{cases} 1, & \text{if } R_i' = A_i \\ 0, & \text{otherwise} \end{cases}$
and
$P[R_i' \mid R_i, T \neq 0, A_i] = \begin{cases} 1, & \text{if } R_i' = R_i \\ 0, & \text{otherwise}. \end{cases}$
It is assumed that the service processing time for the service $i$ with one resource unit follows an exponential distribution with the mean $1/\mu_i$. Then, the probability that the service $i$ is processed with $R_i$ resources within the time epoch $\tau$ can be calculated as $R_i \mu_i \tau$ [20]. Since the processing rate is proportional to the allocated resources $R_i$, the probability that the service $i$ is processed within the time epoch $\tau$ is also proportional to $R_i$. Meanwhile, the unit incoming rate $\lambda$ is defined as the arrival rate when the popularity of the service $i$ is one. Then, the probability that a request for the service $i$ with popularity $P_i$ arrives at the corresponding queue within the time epoch $\tau$ can be calculated as $P_i \lambda \tau$ [20]. Based on these probabilities, $R_i \mu_i \tau$ and $P_i \lambda \tau$, the corresponding transition probability can be defined as
$P[Q_i' \mid Q_i, P_i, R_i] = \begin{cases} (1 - P_i \lambda \tau) R_i \mu_i \tau, & \text{if } Q_i' = Q_i - 1 \\ P_i \lambda \tau R_i \mu_i \tau + (1 - P_i \lambda \tau)(1 - R_i \mu_i \tau), & \text{if } Q_i' = Q_i \\ P_i \lambda \tau (1 - R_i \mu_i \tau), & \text{if } Q_i' = Q_i + 1. \end{cases}$
Note that the transition probability of the popularity of the service $i$ (i.e., $P[P_i' \mid P_i]$) can be defined by the popularity estimation model in Section 3.2.
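The per-service transition probabilities above can be implemented directly, as in the following sketch. The parameter values are illustrative, and the boundary handling at $Q_i = 0$ and $Q_i = Q_{\max}$ is an assumption not spelled out in the text.

def q_transition(q_next: int, q: int, popularity: int, resources: int,
                 lam: float, mu: float, tau: float) -> float:
    """P[Q_i' | Q_i, P_i, R_i] for the single-arrival/single-departure model above."""
    arrive = popularity * lam * tau          # probability of an arrival in the epoch
    depart = resources * mu * tau            # probability of a departure in the epoch
    if q_next == q - 1:
        return (1 - arrive) * depart
    if q_next == q:
        return arrive * depart + (1 - arrive) * (1 - depart)
    if q_next == q + 1:
        return arrive * (1 - depart)
    return 0.0

def t_transition(t_next: int, t: int, t_d: int) -> float:
    """P[T' | T]: reset to T_D when the contract ends, otherwise count down."""
    if t == 0:
        return 1.0 if t_next == t_d else 0.0
    return 1.0 if t_next == t - 1 else 0.0

def r_transition(r_next: int, r: int, t: int, action: int) -> float:
    """P[R_i' | R_i, T, A_i]: resources follow the action only when T = 0."""
    if t == 0:
        return 1.0 if r_next == action else 0.0
    return 1.0 if r_next == r else 0.0

# Example: P[Q_i'=3 | Q_i=2, P_i=4, R_i=1] with lambda=0.1, mu=0.05, tau=1
print(q_transition(3, 2, 4, 1, lam=0.1, mu=0.05, tau=1.0))  # 0.4 * 0.95 = 0.38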

4.4. Cost and Constraint Functions

We define here the cost function to minimize the OPEX of the service provider. Since $\rho$ is the unit cost for using one resource unit during the contract time, the total cost for the services can be calculated as $\rho \sum_i A_i$, which is used as the cost function $r(S, A)$. That is, $r(S, A) = \rho \sum_i A_i$.
We define the constraint function $c_i(S, A)$ to keep the average response time of the services below their requirements. Basically, the average response time of the service $i$ is proportional to the queue length of the service $i$; that is, the queue length of the service $i$ can represent its response time. Therefore, the constraint function $c_i(S, A)$ for the service $i$ is defined as $c_i(S, A) = Q_i$.
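Both functions are straightforward to express in code, as in the brief sketch below; $\rho$ and the example values are illustrative.

def cost(actions: list[int], rho: float) -> float:
    """r(S, A) = rho * sum_i A_i: the OPEX incurred by the requested resources."""
    return rho * sum(actions)

def constraint(queue_lengths: list[int], i: int) -> int:
    """c_i(S, A) = Q_i: the queue length as a proxy for the response time of service i."""
    return queue_lengths[i]

print(cost([2, 3], rho=1.0), constraint([4, 7], i=1))  # -> 5.0 7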

4.5. Optimization Formulation

The average OPEX of the service provider, $\zeta_O$, is defined as
$\zeta_O = \limsup_{t \to \infty} \frac{1}{t} \sum_{t' = 1}^{t} \mathbb{E}\left[ r(S_{t'}, A_{t'}) \right]$
where $S_{t'}$ and $A_{t'}$ are the state and the determined action at time epoch $t'$, respectively.
The average response time of the service $i$, $\psi_{R,i}$, can be represented as
$\psi_{R,i} = \limsup_{t \to \infty} \frac{1}{t} \sum_{t' = 1}^{t} \mathbb{E}\left[ c_i(S_{t'}, A_{t'}) \right].$
We can express the CMDP model as follows:
$\min_{\pi} \ \zeta_O$
$\text{s.t.} \quad \psi_{R,i} \leq \theta_{R,i}, \quad \forall i$
where $\pi$ is a policy that represents the probabilities of deciding a specific action at each state. Moreover, $\theta_{R,i}$ indicates the response time requirement of the service $i$.
To transform the CMDP model into an equivalent LP model, the stationary probabilities of state $S$ and action $A$, $\varphi(S, A)$, can be defined as the decision variables for the LP model. Consequently, the LP model can be obtained by
$\min_{\varphi(S, A)} \ \sum_{S} \sum_{A} \varphi(S, A) \, r(S, A)$
subject to
$\sum_{S} \sum_{A} \varphi(S, A) \, c_i(S, A) \leq \theta_{R,i}, \quad \forall i$
$\sum_{A} \varphi(S', A) = \sum_{S} \sum_{A} \varphi(S, A) \, P[S' \mid S, A]$
$\sum_{S} \sum_{A} \varphi(S, A) = 1$
and
$\varphi(S, A) \geq 0.$
The objective function in Equation (18) minimizes the average OPEX of the service provider. The constraint in Equation (19) corresponds to the constraint of the CMDP model in Equation (17). Equation (20) represents the Chapman–Kolmogorov equation. The constraints in Equations (21) and (22) ensure that the probability properties are satisfied.
The above LP model can be solved by a conventional algorithm such as Vaidya's algorithm [21], so the optimal policy can be obtained with low complexity, which can be considered an advantage of the transformation. Specifically, when using Vaidya's algorithm, the computational complexity is $O(n^{2.5})$, where $n$ denotes the number of decision variables in the model. From the solution $\varphi^*(S, A)$ of the LP model, we can obtain the optimal stochastic policy $\pi^*(S, A)$ for the CMDP model as
$\pi^*(S, A) = \frac{\varphi^*(S, A)}{\sum_{A} \varphi^*(S, A)} \quad \text{for } S \in \mathcal{S}, \ \sum_{A} \varphi^*(S, A) > 0.$
By means of the optimal stochastic policy (i.e., the optimal probability distribution), the service provider requests the appropriate resources from the cloud operator at a given state $S$. Note that, instead of the conventional LP solving algorithms, various deep reinforcement learning approaches [22,23] can also be used to solve the CMDP model.
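As an illustration of the CMDP-to-LP transformation and the policy extraction above, the following sketch solves a toy two-state, two-action instance with SciPy's generic LP solver rather than Vaidya's algorithm; the transition, cost, and constraint values are random placeholders, not the model of Section 4.

import numpy as np
from scipy.optimize import linprog

n_s, n_a = 2, 2                                   # toy state and action space sizes
rng = np.random.default_rng(0)
P = rng.random((n_s, n_a, n_s))
P /= P.sum(axis=2, keepdims=True)                 # P[S' | S, A]
r = rng.random((n_s, n_a))                        # cost r(S, A)
c = rng.random((n_s, n_a))                        # constraint function c(S, A)
theta = 1.0                                       # loose illustrative bound (always feasible here)

# Decision variables: phi(S, A), flattened row-major over (S, A).
obj = r.flatten()
A_ub, b_ub = [c.flatten()], [theta]               # sum_{S,A} phi * c <= theta
A_eq, b_eq = [], []
for s_next in range(n_s):                         # Chapman-Kolmogorov balance equations
    row = np.zeros(n_s * n_a)
    for s in range(n_s):
        for a in range(n_a):
            row[s * n_a + a] += P[s, a, s_next]
    row[s_next * n_a:(s_next + 1) * n_a] -= 1.0
    A_eq.append(row)
    b_eq.append(0.0)
A_eq.append(np.ones(n_s * n_a))                   # probabilities sum to one
b_eq.append(1.0)

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
phi = res.x.reshape(n_s, n_a)
policy = phi / np.maximum(phi.sum(axis=1, keepdims=True), 1e-12)  # pi*(S,A) = phi*/sum_A phi*
print(policy)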

5. Evaluation Results

To show the effectiveness of the PASPF on the average OPEX and the average response time, we designed the following comparison schemes: (1) NON-POP, where the service provider requests the same amount of resources for each service regardless of its popularity (note that the total requested resource amount in NON-POP is the same as that in the PASPF); (2) MINIMUM, where the service provider requests the minimum amount of resources for each service; and (3) MAXIMUM, where the service provider requests the maximum amount of resources for each service.
The default values for the parameters are described as follows. The service processing rate and the unit incoming rate were set to 0.05 and 0.1, respectively. The unit cost for using one resource unit during the contract time was 1.

5.1. Effect of $\theta_R$

Figure 4a,b show the effect of the response time requirement of the service, $\theta_R$, on the average OPEX of the service provider, $\zeta_O$, and the average response time of the service, $\psi_R$, respectively. From Figure 4a, it can be observed that the average OPEX, $\zeta_O$, of the PASPF decreased as $\theta_R$ increased. This can be explained in the following way. A large $\theta_R$ implies that the service does not require a tight response time. The PASPF recognizes this fact and then requests fewer resources for the service. Thus, as shown in Figure 4b, a longer average response time of the service is expected. Note that the average response time of the service, $\psi_R$, of the PASPF did not exceed the response time requirement $\theta_R$. In contrast, the other comparison schemes maintained their policy even when the service required a more relaxed response time. Thus, their $\zeta_O$ and $\psi_R$ were constant regardless of $\theta_R$ (see Figure 4a,b).

5.2. Effect of $\lambda$

Figure 5a,b show the effect of the unit incoming rate $\lambda$ on the average OPEX of the service provider, $\zeta_O$, and the average response time of the service, $\psi_R$, respectively. A higher unit incoming rate indicates that more resources should be requested to maintain a sufficiently low response time. However, since the comparison schemes did not consider the incoming rate, their OPEXs remained constant regardless of $\lambda$ (see Figure 5a). On the other hand, the PASPF requested more resources when a higher unit incoming rate was given. Thus, its OPEX slightly increased with the increase in the unit incoming rate. However, the average response time of the PASPF was maintained below the response time requirement $\theta_R$ (see Figure 5b). Meanwhile, since MINIMUM requests insufficient resources, it had the lowest OPEX (see Figure 5a), but its average response time significantly increased as the unit incoming rate increased (see Figure 5b). Even though MAXIMUM achieved the lowest average response time of the service (see Figure 5b), its OPEX was the highest (see Figure 5a). In addition, the average response time of NON-POP was maintained below the response time requirement (see Figure 5b), but its OPEX was higher than those of the PASPF and MINIMUM (see Figure 5a). This means that NON-POP always requests an excessive amount of resources regardless of popularity.

5.3. Effect of $\mu$

Figure 6 shows the effect of the service processing rate with one resource unit, $\mu$, on the average response time of the service. From Figure 6, it can be seen that the average response time of the comparison schemes decreased as $\mu$ increased. This can be explained in the following way. Basically, the comparison schemes request a fixed amount of resources for each service. Thus, the increment of $\mu$ leads to a shorter average queue length for each service, which reduces the average response time. Specifically, since MAXIMUM requests a sufficient amount of resources, the increment of $\mu$ only slightly reduced its average response time. On the other hand, due to the insufficient amount of resources in MINIMUM, the increment of $\mu$ had a significant impact on its average response time.
On the other hand, as $\mu$ increases, the PASPF requests a smaller amount of resources for each service to reduce the OPEX. Therefore, its average response time did not decrease with the increase of $\mu$.

5.4. Effect of $Q_{\max}$

Figure 7 shows the effect of the maximum queue length $Q_{\max}$ on the average service response time $\psi_R$. Basically, as the maximum queue length increases, the average queue length naturally increases when the same service processing rate is given (i.e., when the same resources are requested). This is because the probability that a service request is dropped due to a fully occupied queue decreases. Therefore, from Figure 7, it can be observed that the average service response time of the comparison schemes increased with the increase in $Q_{\max}$. Meanwhile, since the PASPF adjusts the amount of requested resources by considering the maximum queue length, its average service response time was maintained at the same level.

6. Conclusions

This paper designed a popularity-aware service provisioning framework (PASPF), in which a service provider obtains the popularity variation of its services by sending an analytics request to the NWDAF. The analytics were then used to decide the appropriate amount of resources to request from the cloud operator for the following billing cycle. To minimize the service provider's OPEX while keeping the average response time of the services below their requirements, a CMDP problem was formulated. By transforming the CMDP model into an LP model, the optimal stochastic policy was obtained. The evaluation results show that, compared to a popularity-non-aware scheme, the PASPF can reduce the service provider's OPEX by more than 50% while guaranteeing sufficiently low average response times. In our future work, the proposed framework will be extended to consider the dynamicity of the network status, such as bandwidth and congestion, in order to analyze its impact on performance.

Author Contributions

Conceptualization, H.K.; methodology, H.K.; software, Y.K. (Yumi Kim) and B.K.; validation, H.K. and Y.K. (Yeunwoong Kyung); formal analysis, H.K.; investigation, Y.K. (Yeunwoong Kyung); writing—original draft preparation, H.K. and Y.K. (Yeunwoong Kyung); supervision, Y.K. (Yeunwoong Kyung). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Research Foundation (NRF) of Korea Grant funded in part by the Korean Government (MSIP) (No. RS-2024-00340698) and in part by a grant from Kyung Hee University in 2023 (KHU-20233236).

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Important notations.

Notation        Description
$\mathcal{S}$        Overall state space
$\mathcal{T}$        State space for the remaining contract time
$\mathcal{R}_i$        State space for the allocated resources of the service $i$
$\mathcal{P}_i$        State space for the popularity of the service $i$
$\mathcal{Q}_i$        State space for the queue length of the service $i$
$T_D$        Contract time
$R_{\max}$        Maximum resources that can be allocated to one instance for the service $i$
$P_{\max}$        Maximum number of service requests during the time epoch
$Q_{\max}$        Maximum queue length
$\mathcal{A}$        Overall action space
$\mathcal{A}_i$        Action space of the service $i$
$1/\mu_i$        Average service processing time of the service $i$ with one resource unit
$\lambda$        Unit incoming rate of the service
$\tau$        Time epoch
$\rho$        Unit cost for using one resource unit during the contract time
$\zeta_O$        Average OPEX of the service provider
$\psi_{R,i}$        Average response time of the service $i$
$\theta_{R,i}$        Response time requirement of the service $i$

References

  1. Gartner Forecasts Worldwide Public Cloud End-User Spending to Surpass $675 Billion in 2024. Available online: https://www.gartner.com/en/newsroom/press-releases/2024-05-20-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-surpass-675-billion-in-2024 (accessed on 10 July 2024).
  2. IaaS vs. PaaS vs. SaaS. Available online: https://www.redhat.com/en/topics/cloud-computing/iaas-vs-paas-vs-saas (accessed on 10 July 2024).
  3. PaaS vs. IaaS vs. SaaS vs. CaaS: How are They Different. Available online: https://cloud.google.com/learn/paas-vs-iaas-vs-saas?hl=en (accessed on 24 July 2024).
  4. Dimitri, N. Pricing Cloud IaaS Computing Services. J. Cloud Comput. 2020, 9, 14. [Google Scholar] [CrossRef]
  5. Li, S.; Huang, J.; Cheng, B. Resource Pricing and Demand Allocation for Revenue Maximization in IaaS Clouds: A Market-Oriented Approach. IEEE Trans. Netw. Serv. Manag. 2021, 18, 3460–3475. [Google Scholar] [CrossRef]
  6. Amazon EC2 Spot Instances Pricing. Available online: https://aws.amazon.com/ec2/spot/pricing/?nc1=h_ls (accessed on 10 July 2024).
  7. Wen, Z.; Cala, J.; Watson, P.; Romanovsky, A. Cost Effective, Reliable and Secure Workflow Deployment over Federated Clouds. IEEE Trans. Serv. Comput. 2017, 10, 929–941. [Google Scholar] [CrossRef]
  8. Shi, T.; Ma, H.; Chen, G.; Hartmann, S. Cost-Effective Web Application Replication and Deployment in Multi-Cloud Environment. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 1982–1995. [Google Scholar] [CrossRef]
  9. Menzel, M.; Ranjan, R.; Wang, L.; Khan, S.U.; Chen, J. CloudGenius: A Hybrid Decision Support Method for Automating the Migration of Web Application Clusters to Public Clouds. IEEE Trans. Comput. 2015, 64, 1336–1348. [Google Scholar] [CrossRef]
  10. Wu, Y.; Wu, C.; Li, B.; Zhang, L.; Li, Z.; Lau, F.C.M. Scaling Social Media Applications Into Geo-Distributed Clouds. IEEE/ACM Trans. Netw. 2015, 23, 689–702. [Google Scholar] [CrossRef]
  11. Deng, S.; Xiang, Z.; Taheri, J.; Khoshkholghi, M.A.; Yin, J.; Zomaya, A.Y.; Dustdar, S. Optimal Application Deployment in Resource Constrained Distributed Edges. IEEE Trans. Mob. Comput. 2021, 20, 1907–1923. [Google Scholar] [CrossRef]
  12. Chen, X.; Tang, S.; Lu, Z.; Wu, J.; Duan, Y.; Huang, S.C.; Tang, Q. iDiSC: A New Approach to IoT-Data-Intensive Service Components Deployment in Edge-Cloud-Hybrid System. IEEE Access 2019, 7, 59172–59184. [Google Scholar] [CrossRef]
  13. Chen, F.; Zhou, J.; Xia, X.; Jin, H.; He, Q. Optimal Application Deployment in Mobile Edge Computing Environment. In Proceedings of the IEEE 13th International Conference on Cloud Computing (CLOUD), Beijing, China, 19–23 October 2020. [Google Scholar]
  14. Chen, J.; Wu, H.; Yang, P.; Lyu, F.; Shen, X. Cooperative Edge Caching with Location-Based and Popular Contents for Vehicular Networks. IEEE Trans. Veh. Technol. 2020, 69, 10291–10305. [Google Scholar] [CrossRef]
  15. Deng, S.; Xiang, Z.; Yin, J.; Taheri, J.; Zomaya, A. Composition-Driven IoT Service Provisioning in Distributed Edges. IEEE Access 2018, 6, 54258–54269. [Google Scholar] [CrossRef]
  16. Wu, T.; Fan, X.; Qu, Y.; Yang, P. MobiEdge: Mobile Service Provisioning for Edge Clouds with Time-varying Service Demands. In Proceedings of the IEEE 27th International Conference on Parallel and Distributed Systems (ICPADS), Beijing, China, 14–16 December 2021. [Google Scholar]
  17. 3GPP TS 23.288; Architecture Enhancements for 5G System (5GS) to Support Network Data Analytics Services (Release 18). 3GPP: Paris, France, 2024.
  18. IP Network Traffic Flows Labeled with 75 Apps. Available online: https://www.kaggle.com/code/kerneler/starter-ip-network-traffic-flows-49778383-1/notebook (accessed on 10 July 2024).
  19. Ko, H.; Pack, S.; Leung, V. An Optimal Battery Charging Algorithm in Electric Vehicle-Assisted Battery Swapping Environments. IEEE Trans. Intell. Transp. Syst. 2022, 23, 3985–3994. [Google Scholar] [CrossRef]
  20. Ko, H.; Pack, S. Function-Aware Resource Management Framework for Serverless Edge Computing. IEEE Internet Things J. 2023, 10, 1310–1319. [Google Scholar] [CrossRef]
  21. Wikipedia. Linear Programming. Available online: https://en.wikipedia.org/wiki/Linear_programming (accessed on 12 July 2024).
  22. Zhang, L.; Ma, X.; Zhuang, Z.; Xu, H.; Sharma, V.; Han, Z. Q-Learning Aided Intelligent Routing with Maximum Utility in Cognitive UAV Swarm for Emergency Communications. IEEE Trans. Veh. Technol. 2023, 72, 3707–3723. [Google Scholar] [CrossRef]
  23. Zhang, L.; Jia, X.; Tian, N.; Hong, C.; Han, Z. When Visible Light Communication Meets RIS: A Soft Actor–Critic Approach. IEEE Wirel. Commun. Lett. 2024, 13, 1208–1212. [Google Scholar] [CrossRef]
Figure 1. System model.
Figure 2. The estimation result on the popularity variation.
Figure 3. CMDP process.
Figure 4. Effect of the response time requirement of the service. (a) Average OPEX of the service provider. (b) Average response time of the service.
Figure 5. Effect of the unit incoming rate. (a) Average OPEX of the service provider. (b) Average response time of the service.
Figure 6. Effect of the service processing rate with one resource unit on the average service response time.
Figure 7. Effect of the maximum queue length on the average service response time.
