1. Introduction
A recent study predicted that the total amount of Internet traffic would increase threefold over the five years from 2017 to 2022, and that video traffic would grow more sharply than other types of traffic [
1]. More specifically, video traffic, which accounted for about 75% of total Internet traffic in 2017, was predicted to increase to about 82% in 2022. Another study revealed that the 50 most popular videos accounted for almost 80% of the total number of views at a major over-the-top (OTT) service provider, YouTube [
2]. Video service providers deploy and manage huge content delivery systems of their own to meet the increasing demand, but this approach is very costly for the companies [
3]. An alternative is to use caching [
4,
5]. If a requested content exists in the cache of a nearby proxy server, clients can receive the cached copy, which typically reduces delivery time. Recently, the concept of caching has been extended to wireless multimedia streaming systems to cope with the sharply increasing data traffic by offloading increasing video traffic or improving the spectral efficiency of wireless networks [
6,
7,
8,
9,
10,
11,
12,
13,
14,
15]. The problem of caching placement was investigated, and some state-of-the-art solutions were introduced in [
6,
7,
8,
9]. A dynamic deep reinforcement learning-based management framework for virtual cache slicing was proposed to manage the limited cache resources when service providers share a common physical infrastructure [
6]. A probabilistic caching placement was investigated to control cache-based channel selection diversity and network interference in a stochastic wireless caching helper network [
7]. A tradeoff between the content diversity gain and the cooperative gain according to content placements was investigated and a probabilistic content placement to optimally balance the tradeoff was proposed [
8]. Device-to-device (D2D) caching networks, where each node caches some contents and other nodes can receive these contents directly from nearby nodes instead of a remote service provider, were investigated in [
10,
11].
In contrast to the studies above, which considered transmitters with caches and how to use those caches efficiently, other studies focused on clients with caches [
12,
13,
14,
15,
16,
17,
18,
19]. If a client wants to play a content which has been stored in its cache, the content can be played directly from the cache with no outbound connection. A joint problem of content pushing and recommendation for cache-enabled mobile users was formulated and a reinforcement learning-based framework for maximizing the profit of mobile network operators was proposed in [
12]. Novel transmission techniques based on state-of-the-art interference management schemes such as interference cancellation, zero-forcing, and interference alignment were proposed for cache-aided radio access networks where edge nodes and user equipment are both equipped with cache [
13]. An erasure broadcast network was investigated considering two disjoint sets of receivers: a set of weak receivers with an equal size of cache and a set of strong receivers with no cache [
14]. It was proposed to exploit limited cached packets as side information to cancel incoming interference at the receiver side in a stochastic network where the random locations of base stations and users are modeled by Poisson point process [
15].
Moreover, more advanced transmission schemes that use the contents stored in clients’ cache more efficiently were proposed to improve the efficiency of limited network resource [
16,
17,
18,
19]. If a transmitter has perfect knowledge of the contents stored in clients’ caches, it can transmit multiple contents requested by different clients at once by using XOR index coding. Each client then recovers its content by XORing the coded data with the contents stored in its cache, which improves the efficiency of network resources by reducing the number of transmissions [
16]. The exact broadcast rate which had remained unknown was analyzed using linear programs [
17]. A simple inner bound was established on the general index coding problem and it was demonstrated that the inner bound is tight for all index coding problems of up to five messages [
18]. A pruning algorithm that finds the optimal clique cover index code and transmission time allocation for minimizing the wireless outage probability was proposed to reduce the computational complexity of a brute-force search [
19]. Most of the aforementioned studies were devoted to theoretical research that analyzes optimal performance or develops optimal coding schemes, and they are thus not sufficient to be applied to real large-scale video streaming systems. In addition, the practical advantages of index coding-based video streaming and its compatibility with existing streaming schemes have not been verified.
In this paper, we thus propose a new video streaming system that supports XOR coding-based streaming (XC). We also propose a new grouping algorithm for creating XC groups while guaranteeing complete backward compatibility of XC with existing streaming schemes such as unicast (UC), multicast (MC), and broadcast (BC). The performance of the proposed system is analyzed and compared with that of a conventional streaming system to verify its gain. We first model the behavior of contents in clients’ caches as a Markov chain and derive the steady-state probabilities and caching probabilities for each video, which characterizes the caches statistically. Based on these statistical characteristics, the performance of the proposed system is analyzed in terms of the average number of connections that each client requires in order to receive one video content. The proposed scheme can be applied to various types of networks with video streaming, including mobile networks and sensor networks.
The rest of this paper is organized as follows. In
Section 2, we describe a video streaming system where clients are equipped with cache, and we derive the steady-state probabilities and caching probabilities for each video based on a Markov chain. A new video streaming system for supporting XC is proposed and a grouping algorithm for the proposed streaming system is also described in
Section 3. Numerical results are shown in
Section 4, and the conclusions of this paper are drawn in
Section 5.
2. A Streaming System Using Clients’ Cache
We investigate a video streaming system with a single streaming server and
N clients, which is illustrated in
Figure 1. The streaming server and each client are all equipped with cache that can store up to
V and
C videos
, respectively. It is assumed that all videos have the same size.
denotes the cache of client
n and
If each client plays video contents independently of others, the statistical characteristics of cache at each client will be identical and hence, we omit the index
n for notational simplicity from here on.
denotes the video that is stored in the
s-th memory of a cache
, where
. Thus,
and
indicate the most recently used video and the least recently used video, respectively.
denotes a set of videos used later than
and is given by
denotes a set of videos used earlier than
and is given by
and can be also obtained by
.
The normalized popularity of a video that ranks
v-th can be predicted by Zipf distribution [
20,
21], which is given by
where
satisfies
and
is the exponent characterizing Zipf distribution [
22]. Zipf distribution with
turns into a uniform distribution, by which all
V videos have the same normalized popularity,
. We use the least recently used (LRU) algorithm as a cache replacement strategy [
23], by which clients’ caches are autonomously updated without consuming extra network resource after streaming a video content. If a client wants to play a video content
v that is not stored in the cache, which is denoted by
, the content
v will be streamed by one of MC, XC, or UC. After playing the content
v, LRU discards
, which is the least recently used one, and the content
v will be stored in the first place of the cache after shifting all contents in the cache to the right. Thus,
will be updated as
. On the other hand, if a client wants to play a content
v stored in the
s-th place of the cache, which is denoted by
, the content
v will be directly played by local playing (LC) from the cache without using any network resource for streaming. Then, the content
v will be re-located to the first place of the cache to ensure that the most recently used content will stay in the cache the longest. The contents in
will be only shifted to the right, while there will be no change on contents in
. Thus,
will be updated as
. The case of
means that the same content is played twice in a row, so the cache does not change.
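As an illustration of the two ingredients described above, the following minimal sketch computes Zipf popularities and applies the LRU update rule to a single cache. The function names `zipf_popularity` and `lru_play`, the parameter name `alpha` for the Zipf exponent, and the list representation of the cache (index 0 is the most recently used place) are assumptions made for this example and do not come from the paper.

```python
def zipf_popularity(num_videos, alpha):
    """Normalized popularity of the videos ranked 1..num_videos (Zipf law)."""
    weights = [rank ** (-alpha) for rank in range(1, num_videos + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def lru_play(cache, video):
    """Return the cache after playing `video`; index 0 is the most recently used place."""
    if video in cache:
        # Cache hit (LC): the video moves to the first place. Videos used more
        # recently shift one place to the right; videos used earlier stay put.
        return [video] + [v for v in cache if v != video]
    # Cache miss: the video is streamed, the least recently used video is
    # discarded, and the new video takes the first place.
    return [video] + cache[:-1]


uniform = zipf_popularity(4, 0.0)   # an exponent of 0 gives a uniform distribution
cache = [3, 1, 2]                   # most recently used first, C = 3
after_miss = lru_play(cache, 7)     # video 7 is not cached
after_hit = lru_play(cache, 2)      # video 2 is cached in the third place
```

In this sketch a miss on video 7 turns `[3, 1, 2]` into `[7, 3, 1]`, while a hit on video 2 turns it into `[2, 3, 1]`, which matches the two update cases described in the text.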
A certain content
might be stored in the client’s cache or might not be stored. In the case that content
k is stored in the cache, the content
k can have
C states based on its stored location. Thus, each content
k can have total
states in
. The
-th state denotes that the content is not stored in the cache. The behavior of content
k in the cache can be modeled by a Markov chain, as depicted in
Figure 2. Transition probability
denotes a probability that a content
k staying at state
s moves to state
j after a client plays the content
v.
For the case of
, a content
k staying in the first place has two possibilities of transition. If
, which means that a client plays the content
k by using LC, the video
k will remain in the first place with no transition. Otherwise, it will move to the second place. The two transition probabilities are thus given as
For the case of
, a content
k has three possibilities of transition. If
, then the content
k will be played by LC and move to the first place after being played. If
, the content
k will have no transition. Finally, if
, the content
k will be shifted to the right and thus move to the
-th place. The corresponding transition probabilities are given as
where
denoting the index set of contents except for
k can be obtained by
and
denotes a set that consists of all possible permutations of choosing
n objects from a set
. Thus,
denoting a set of all possible
permutations of
contents that can be chosen from the set
represents all possible combinations of
. For the case of
, the content
k has two cases of transition. If
, the content
k moves to the first place after being played. Otherwise, the content
k will be discarded from the cache. The corresponding transition probabilities are given as
A square matrix consisting of state transition probabilities for the content
k is defined as
the size of which is
. Let
be a steady-state probability that the content
k is staying in state
i at an arbitrary moment. Then,
denoting the vector consisting of steady-state probabilities for all states for the content
k can be obtained by solving the following equation [
24]:
The solution of (
14) can be obtained by several approaches; in this paper, we use a pseudoinverse-based approach [
24]. The equation in (
14) can be rewritten as
where
and
denote an identity matrix with the same size as
and a row vector with
zeros, respectively. The constraint in (
15) can be incorporated into the equation as
by appending
, which is a column vector with
ones, to the right of the matrix
and 1 to the right of the vector
, respectively. Let
and
, then (
16) can be rewritten as
and
can be obtained as
When a client wants to play the video content
k at an arbitrary moment, the probability that the content is stored in the cache is calculated as
while the probability that the content
k is not stored in the cache at the arbitrary moment is given as
that can be calculated by
.
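The pseudoinverse-based solution described above can be sketched as follows. This is only an illustration under stated assumptions: the name `steady_state` and the toy matrix `P_toy` are hypothetical, the toy chain is not the LRU chain of the paper, and the augmented matrix is assumed to have full row rank (which holds for an irreducible chain), so that its pseudoinverse can be written in closed form and evaluated with plain Gaussian elimination instead of a numerical library.

```python
def steady_state(P):
    """Return pi with pi P = pi and sum(pi) = 1 for a row-stochastic matrix P."""
    n = len(P)
    # A = [P - I | 1] has n rows and n + 1 columns; b = [0, ..., 0, 1].
    A = [[P[i][j] - (1.0 if i == j else 0.0) for j in range(n)] + [1.0]
         for i in range(n)]
    b = [0.0] * n + [1.0]
    # Assumption: A has full row rank, so its pseudoinverse is A^T (A A^T)^-1
    # and pi = b A^+ solves the symmetric system (A A^T) pi^T = A b^T.
    gram = [[sum(A[i][k] * A[j][k] for k in range(n + 1)) for j in range(n)]
            for i in range(n)]
    rhs = [sum(A[i][k] * b[k] for k in range(n + 1)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting on [gram | rhs].
    M = [gram[i] + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical 3-state chain (not taken from the paper); the last state plays
# the role of "content not stored in the cache".
P_toy = [[0.5, 0.5, 0.0],
         [0.2, 0.3, 0.5],
         [0.1, 0.0, 0.9]]
pi = steady_state(P_toy)
caching_probability = 1.0 - pi[-1]   # sum of the probabilities of the cached states
```

With the transition matrix of a content k in place of `P_toy`, the last line corresponds to the probability that the content is stored in the cache, i.e., one minus the steady-state probability of the not-cached state.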
Figure 3 shows probabilities
’s for various
k values,
,
, and
. For
,
’s are the same as normalized popularity values given by Zipf distribution because the current content stored in a cache is the content that a client requested according to Zipf distribution. As
C increases, all
’s increase regardless of the ranks of contents,
k. As explained in (
3),
makes all contents have the same popularity and
’s are thus all the same for all
k’s. As
increases,
’s of higher-ranked contents increase while
’s of lower-ranked contents decrease.
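The trends discussed for Figure 3 can also be checked empirically. The sketch below is a simple Monte-Carlo estimate of the caching probabilities for a single client. It is a swapped-in technique for illustration, not the Markov chain analysis of this section, and the function name `estimate_caching_probability`, its parameters, and the choice to start from a full cache are assumptions.

```python
import random


def estimate_caching_probability(num_videos, cache_size, alpha, steps, seed=0):
    """Estimate, for each video rank, the fraction of time it sits in an LRU cache."""
    rng = random.Random(seed)
    videos = list(range(1, num_videos + 1))
    weights = [rank ** (-alpha) for rank in videos]   # unnormalized Zipf weights
    cache = videos[:cache_size]                       # assumption: start with a full cache
    cached_count = [0] * (num_videos + 1)
    for _ in range(steps):
        # Record the cache contents at this moment, before the next play.
        for v in cache:
            cached_count[v] += 1
        v = rng.choices(videos, weights=weights)[0]   # next request follows Zipf
        if v in cache:
            cache.remove(v)                           # hit: re-locate to the first place
        else:
            cache.pop()                               # miss: discard the least recently used
        cache.insert(0, v)
    return [cached_count[v] / steps for v in videos]


probs = estimate_caching_probability(num_videos=5, cache_size=2, alpha=1.5, steps=20000)
```

Because the cache always holds exactly `cache_size` distinct videos in this sketch, the estimated probabilities sum to `cache_size`, and for a positive exponent the higher-ranked videos are cached more often than the lower-ranked ones.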
3. Proposed Streaming Scheme Using XOR-Based Coding
The XOR operation has the zero-identity, by which
is satisfied for a given bit stream
, where
is the bit stream consisting of 0 and it has the same length as
. It also has the property of self-inverse satisfying
. In addition, XOR is not only commutative, but also associative.
Figure 4 illustrates a simplified concept of XC which is a video streaming scheme using XOR coding based on the four aforementioned properties. Client 1 wants to stream
with
cached in
, and client 2 wants to stream
with
cached in
. Contrary to a conventional scheme where a streaming server transmits
and
by UC using two separate connections, XC generates the coded bit stream
and transmits it by using a single connection. Then, client 1 and 2 reconstruct
and
from
by using their cached data as
respectively. In this section, we propose a new transmission algorithm for supporting XC while guaranteeing a backward compatibility with conventional streaming schemes such as LC, UC, MC, and BC, that have been widely used to transmit video contents.
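The two-client example of Figure 4 can be reproduced with a few lines of code. The sketch below is illustrative only: the helper `xor_bytes` and the byte strings standing in for the two videos are hypothetical, and both videos are assumed to have the same size, as in the system model.

```python
def xor_bytes(a, b):
    """Bitwise XOR of two equally sized byte strings."""
    if len(a) != len(b):
        raise ValueError("both contents must have the same size")
    return bytes(x ^ y for x, y in zip(a, b))


video_a = b"AAAA-video"   # requested by client 1, cached at client 2
video_b = b"BBBB-video"   # requested by client 2, cached at client 1

# The server sends one coded stream over a single connection.
coded = xor_bytes(video_a, video_b)

# Each client removes its cached video from the coded stream (self-inverse property).
recovered_by_client_1 = xor_bytes(coded, video_b)
recovered_by_client_2 = xor_bytes(coded, video_a)
```

Each client recovers its requested video because XORing the coded stream with the cached video cancels that video and leaves the other one.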
3.1. A Case Study with Two Clients (N = 2)
To clearly explain the proposed transmission scheme, we first consider a special case with two clients, i.e.,
, and the proposed transmission scheme will be then extended to the general case to support multiple clients, i.e.,
, in the following subsection. Suppose that clients 1 and 2 want to play
and
, respectively. Each client first checks if the content to play is stored in its cache before requesting the content to a streaming server. If the content is stored in its cache, the client plays the content directly from the cache by LC without an outbound connection. Otherwise, the client transmits a request message including the information of the content to request and the contents of its own cache to a streaming server. Let
be the probability that the number of connections required to transmit
i and
j is equal to
n for given
and
, where
and
. For the case of
and
, no connection is required to transmit
and
. Thus,
can be defined as
and the average total probability that the number of required connections is 0 can be calculated as
by averaging
over all combinations of
i and
j.
For given
i and
j, the probability
is defined as
where Equation (
22a) denotes the probability that only one client is supported by UC because only one of
and
is in their own cache. Equation (
22b) is the probability that clients are supported by MC because they request the same content, i.e.,
, which is not in the clients’ caches. Equation (
22c) is the probability that clients are supported by XC because each client does not have its own requested content, i.e.,
and
, but has the content requested by the other client in cache,
and
. In this case, clients 1 and 2 can reconstruct
and
from
, respectively. Then, the average total probability that the number of required connections is 1 can be calculated as
by averaging
over all combinations of
i and
j.
Finally, for given
i and
j,
is defined as
where
denotes the complement of a set
A. Then, the average total probability that the number of required connections is 2 can be calculated as
by averaging
over all combinations of
i and
j. In this case, clients are supported by UC through two independent connections, respectively.
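The case analysis for two clients can be summarized as a small decision function. This is a sketch of the cases described above rather than the probability expressions themselves; the function name `connections_for_two_clients` and the use of Python sets for the caches are assumptions made for the example.

```python
def connections_for_two_clients(request_1, cache_1, request_2, cache_2):
    """Return (number of connections, scheme used for streaming) for two clients."""
    hit_1 = request_1 in cache_1
    hit_2 = request_2 in cache_2
    if hit_1 and hit_2:
        return 0, "LC"    # both clients play from their own caches
    if hit_1 or hit_2:
        return 1, "UC"    # one client plays locally, the other needs one unicast
    if request_1 == request_2:
        return 1, "MC"    # same uncached content: one multicast connection
    if request_1 in cache_2 and request_2 in cache_1:
        return 1, "XC"    # each client caches the other's request: one XOR-coded stream
    return 2, "UC"        # otherwise two independent unicast connections


example = connections_for_two_clients(1, {2, 5}, 2, {1, 7})
```

In the last line, neither client caches its own request but each caches the other's, so the sketch returns one connection served by XC.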
3.2. A Transmission Algorithm Extended for Multiple Clients ()
In this subsection, we propose a transmission algorithm by extending the transmission for two clients described in the previous subsection in order to support XC for clients, while guaranteeing a backward compatibility with conventional streaming schemes such as LC, UC, and MC. Our algorithm consists of two phases. In the first phase, a client is categorized into LC or MC based on the requesting content, , and cache status, , or proceeds to the second phase. The clients satisfying will be served by LC and become the elements of , which is a set of clients served by LC. The remaining clients with , given by , transmit a request message to a streaming server including a content to play and cache information. If two or more clients request the same video content, they are all categorized into an MC group. The MC group becomes an element of , which is a set consisting of MC groups, and the cardinality of is because an MC group consists of at least two clients. is composed of all clients included in .
In the second phase, the clients who are not included in either
or
, given by
, are categorized into XC or UC groups. A client
n in
is first checked to see if it can be served by XC. For an existing XC group
S, the client
n can join the existing XC group
S only if the following conditions are satisfied:
If there is no existing XC group or the client
n fails to join any existing XC group, the client
n attempts to create a new XC group with another client
m who is included in
and satisfies
This process is repeated for all clients in .
If the client n succeeds in joining an existing XC group or in creating a new XC group, the client n becomes an element of , which is a set consisting of the clients in XC groups. Finally, the clients who are not included in any of , , or are included in and will be served by UC through separate connections. The proposed video streaming scheme using XC is explained in detail in Algorithms 1 and 2.
Algorithm 1 Proposed Video Streaming: |
Phase 1. LC and MC |
1: Initialization: |
2: ▹ Sets of clients |
3: ▹ Set of MC groups |
4: Obtain a set of candidate clients for LC: |
5: |
6: for do |
7: if then |
8: |
9: end if |
10: end for |
11: Obtain a set of candidate clients for MC: |
12: |
13: while do |
14: |
15: |
16: for do |
17: if then |
18: |
19: end if |
20: end for |
21: if then |
22: |
23: |
24: |
25: end if |
26: end while |
Algorithm 2 Proposed Video Streaming: |
Phase 2. XC and UC |
27: Initialization: |
28: ▹ Set of clients |
29: ▹ Set of XC groups |
30: Obtain a set of candidate clients for XC: |
31: |
32: |
33: while and do |
34: |
35: ▹ A client to check for XC. |
36: if then ▹ XOR group already exists. |
37: for do ▹S is a set. |
38: |
39: for do |
40: j *= |
41: end for |
42: if then |
43: ▹ Client u is added into S. |
44: |
45: |
46: break |
47: end if |
48: end for |
49: end if |
50: if ( or then |
51: for do |
52: if then |
53: ▹ A new group of XC |
54: |
55: |
56: break |
57: end if |
58: end for |
59: end if |
60: end while |
61: |
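The two phases of Algorithms 1 and 2 can be sketched in runnable form as follows. Because the joining conditions are not reproduced in the text above, this sketch assumes a pairwise (clique) condition: a client may join an XC group only if it caches the request of every member and every member caches its request. The function name `group_clients` and the list and set representations are also assumptions for illustration.

```python
def group_clients(requests, caches):
    """Split clients into LC clients, MC groups, XC groups, and UC clients.

    requests[n] is the video requested by client n; caches[n] is the set of
    videos cached at client n.
    """
    clients = range(len(requests))

    # Phase 1: local playing (LC) and multicast (MC).
    lc = [n for n in clients if requests[n] in caches[n]]
    pending = [n for n in clients if requests[n] not in caches[n]]
    by_video = {}
    for n in pending:
        by_video.setdefault(requests[n], []).append(n)
    mc_groups = [group for group in by_video.values() if len(group) >= 2]
    in_mc = {n for group in mc_groups for n in group}

    # Phase 2: XOR coding (XC), then unicast (UC) for whoever is left.
    def compatible(n, m):
        # Assumed condition: each client caches the other's requested video.
        return requests[n] in caches[m] and requests[m] in caches[n]

    remaining = [n for n in pending if n not in in_mc]
    xc_groups, uc = [], []
    while remaining:
        n = remaining.pop(0)
        for group in xc_groups:
            if all(compatible(n, m) for m in group):
                group.append(n)            # join an existing XC group
                break
        else:
            partner = next((m for m in remaining if compatible(n, m)), None)
            if partner is None:
                uc.append(n)               # no XC opportunity: unicast
            else:
                remaining.remove(partner)
                xc_groups.append([n, partner])   # create a new XC group
    return lc, mc_groups, xc_groups, uc


requests = [1, 2, 3, 5, 5, 6]
caches = [{1, 2}, {3}, {2}, {1}, {2}, {7}]
lc, mc_groups, xc_groups, uc = group_clients(requests, caches)
```

In this hypothetical example, client 0 plays locally, clients 3 and 4 form an MC group, clients 1 and 2 form an XC group, and client 5 is served by UC.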
4. Numerical Results
In this section, the performance of the video streaming scheme using XC is analyzed in terms of the average number of connections required to serve each client’s video play, and it is compared with that of a conventional scheme without XC. Let
denote the number of the required connections of the video streaming schemes. For the proposed scheme using XC,
can be calculated as
For the conventional scheme without XC, we need to allocate a dedicated connection to every client in
. Thus,
for the conventional scheme can be calculated as
Please note that
is always satisfied because
and
is only valid when
. We define the average normalized load as the average number of required connections per each client’s video view, which is denoted by
and can be calculated for the proposed and conventional schemes, respectively, as
All numerical results were obtained through mathematical analysis given in
Section 3.1 and verified by Monte-Carlo simulations using a Python-based simulator. In the figures, the lines and markers denote numerical results from mathematical analysis and Monte-Carlo simulations, respectively. In simulations, the number of iterations for averaging the effect of many random variables is 100,000.
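The connection counts and the normalized load defined above can be computed from a grouping result as in the following sketch. It assumes, consistently with the text, that the proposed scheme uses one connection per MC group, per XC group, and per UC client, while the conventional scheme uses one connection per MC group and a dedicated connection for every client in an XC group or served by UC. The function name `normalized_loads` and the example grouping are hypothetical.

```python
def normalized_loads(num_clients, mc_groups, xc_groups, uc_clients):
    """Return (proposed load, conventional load) in connections per client."""
    # Proposed scheme: every MC group and every XC group shares one connection.
    proposed = len(mc_groups) + len(xc_groups) + len(uc_clients)
    # Conventional scheme: clients in XC groups each need a dedicated connection.
    clients_in_xc = sum(len(group) for group in xc_groups)
    conventional = len(mc_groups) + clients_in_xc + len(uc_clients)
    return proposed / num_clients, conventional / num_clients


# Hypothetical grouping of six clients: one MC group, one XC pair, one UC client
# (the sixth client plays locally and needs no connection).
load_proposed, load_conventional = normalized_loads(6, [[3, 4]], [[1, 2]], [5])
```

Since every XC group contains at least two clients, the proposed load in this sketch never exceeds the conventional load, and the two coincide only when no XC group exists.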
Figure 5 shows the average normalized load
of the proposed streaming scheme using XC and the conventional scheme without XC for various values of cache length. The numbers of clients and video contents are given by
and
, respectively, and the parameter
of Zipf distribution for contents’ normalized popularity is 0, 0.5, 1.0, or 1.5. The figure shows that our numerical results obtained by mathematical analysis are consistent with those from simulations. The proposed and conventional schemes can both reduce the average load as the parameter
of Zipf distribution increases. As
increases, the deviation of popularity among contents is expanded and thus every client is more likely to play the popular contents, which indicates that the probabilities that the popular contents are cached increase while the probabilities that less popular contents are cached decrease. As a result, an increasing
can reduce the average normalized load because the hit probability, which is the probability that the contents to play are cached in their own caches, increases. As the length of cache,
C, increases, the probabilities that contents are cached increase regardless of popularity. Thus, for given
, an increasing
C also reduces the average normalized load. To be more specific, when
and
C increases from 1 to 5, the average normalized load decreases from 0.7 to 0.3 for both proposed and conventional schemes. In this figure, although the proposed scheme using XC outperforms the conventional scheme without XC, the gap of the performance is marginal, because when the number of clients is small
, the frequency of using XC decreases and hence, the gain achieved by XC is limited.
In
Figure 6, we compare the performance of the proposed scheme to the conventional scheme according to the number of clients,
N.
Figure 6a shows the average normalized load of the proposed and conventional schemes by varying
N from 2 to 10 when
,
and
. Here,
means that the Zipf distribution becomes a uniform distribution, so all video contents have the same popularity. As
N increases, the frequency of using MC for the conventional scheme and the frequency of using MC or XC for the proposed scheme both increase. Thus, the average normalized load decreases with an increasing
N for both the proposed and conventional schemes, as predicted in
Figure 5. In addition, the average load also decreases with an increasing
C because the frequency of using LC for the conventional scheme and the frequency of using LC or XC for the proposed scheme both increase as
C increases.
Figure 6b shows how much the proposed scheme can reduce the average normalized load by using XC, compared to the conventional scheme. The percentage of average load reduction by the proposed scheme is calculated as
. In this figure, we can observe that for all
N and
C values, the proposed scheme outperforms the conventional scheme due to the gain obtained from XC, and thus
. The frequency of using XC in the proposed scheme increases as
N or
C increases and thus,
increases with increasing
N or
C. As a specific example with
and
, the proposed scheme can reduce the average load by about
, compared to the conventional scheme.
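For reference, the percentage of average load reduction can be computed as in the sketch below. The expression used here, the relative difference between the conventional and proposed loads, is an assumption about the definition, and the function name and the numerical inputs are hypothetical values that are not results from the paper.

```python
def load_reduction_percent(load_conventional, load_proposed):
    """Relative load reduction of the proposed scheme, in percent."""
    if load_conventional == 0:
        return 0.0   # no connections are needed in either scheme
    return 100.0 * (load_conventional - load_proposed) / load_conventional


# Hypothetical loads chosen only to illustrate the formula.
reduction = load_reduction_percent(0.8, 0.6)   # about 25 percent
```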
In
Figure 7, according to
N, we plot the average normalized load in
Figure 7a and the percentage of average load reduction,
, in
Figure 7b, respectively, when
and
. In this figure,
is set to
to let all videos have different popularity from each other. For the case of
, the frequency of viewing top three out of 20 videos accounts for about
. The average normalized load is reduced more for both proposed and conventional schemes as
increases from 0 to
, as shown in
Figure 6a and
Figure 7a. As
increases, the deviation in popularity among video contents increases, which can yield the same effect as increasing
N or
C. By comparing
Figure 6a and
Figure 7a, we can observe that as
increases from 0 to
, the average load of the proposed scheme decreases from
to
when
and
. In terms of the percentage of average load reduction, by comparing
Figure 6b and
Figure 7b, we can observe that
is determined by
C and
N as well as
. For the case of
, when
C is relatively large, e.g.,
,
decreases from
to
as
increases from 0 to
. In this case, however, when
or 3,
with
is greater than
with
for most
N values. For example, when
and
,
increases from
to
as
increases from 0 to
. In the proposed scheme, LC and MC are given a higher priority than XC. Therefore, as
C,
N, and
all increase, the probability of using LC or MC increases and the gain achieved from XC decreases slightly. Even though we used small values for
N,
V, and
C to simplify computer simulations, the proposed scheme can be applied to real large-scale video streaming systems.