1. Introduction
In the field of combinatorial optimization, there are numerous problems, such as the vertex cover problem [1,2], the set cover problem [3,4], the machine scheduling problem [5,6], the allocation problem [7], the mobile crowdsensing services problem [8], the cloud-edge collaborative computation offloading problem [9], and the facility location problem. The facility location problem has been reviewed from various perspectives in recent years, including applications of location models [10], applications and methods for facility location models [11], the synthesis and survey of location analysis [12], facility location models in the context of supply chain management [13], service facility location problems [14], and facility location problems for drone (uncrewed vehicle) delivery [15].
The facility location problem, as a classical problem in the field of combinatorial optimization, is widely applied in areas such as communication technology, economic management, traffic governance, and public services. Typical location problems encountered in everyday life include location decision making for development zones and industrial parks, the positioning of public service sites such as logistics centers and gas stations, and the selection of locations for facilities such as communication base stations and self-service facilities. In real-world scenarios, common facilities include factories, base stations, schools, hospitals, supermarkets, post offices, warehouses, proxy servers, express delivery stations, sensors, and other entities [16,17,18,19].
The facility location problem has a long history. In ancient times, facility siting decisions predominantly relied on institutional frameworks and empirical knowledge rather than rigorous scientific methodologies. A pivotal transition occurred in 1909 with the first scientific treatise on the facility location problem by German scholars [20], marking its emergence as a rigorous research domain. As a typical NP-hard problem in the field of combinatorial optimization, the facility location problem has since profoundly influenced management science, operations research, and computational intelligence, attracting sustained scholarly attention globally. A classic application scenario is the facility location problem in a logistics network. The decision maker needs to determine the optimal warehouse construction plan from multiple candidate locations to achieve service coverage for decentralized demand points. The problem has a dual cost structure: on the one hand, each candidate warehouse location involves a corresponding fixed construction cost; on the other hand, the transportation of goods between a facility and a demand node incurs a variable transportation cost, the value of which is closely related to the spatial distribution of customer points. The goal is to select locations for building warehouses and to allocate client demands such that the total construction and transportation cost is minimized.
Formally, the classical facility location problem is defined over a set $\mathcal{F}$ of facilities and a set $\mathcal{D}$ of clients. Each client $j \in \mathcal{D}$ has a demand $d_j$, and each facility $i \in \mathcal{F}$ has an open cost $f_i$. The unit connection cost between client $j$ and facility $i$ is $c_{ij}$. The problem is to open a subset of facilities and connect each client to an open facility such that the total cost, comprising the open cost and the connection cost, is minimized. This problem is also known as the uncapacitated facility location (UFL) problem.
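To make the objective concrete, the following is a minimal brute-force sketch for very small instances. The names `open_cost`, `demand`, and `conn`, as well as the numbers in the example instance, are hypothetical and chosen only for illustration; the enumeration is exponential in the number of facilities and is not one of the approximation algorithms surveyed here.

```python
from itertools import combinations


def ufl_cost(open_set, open_cost, demand, conn):
    """Total cost when exactly the facilities in open_set are opened and
    every client is connected to its cheapest open facility."""
    opening = sum(open_cost[i] for i in open_set)
    connection = sum(
        demand[j] * min(conn[i][j] for i in open_set)
        for j in range(len(demand))
    )
    return opening + connection


def brute_force_ufl(open_cost, demand, conn):
    """Enumerate all non-empty facility subsets (tiny instances only)."""
    facilities = range(len(open_cost))
    best = None
    for r in range(1, len(open_cost) + 1):
        for subset in combinations(facilities, r):
            cost = ufl_cost(subset, open_cost, demand, conn)
            if best is None or cost < best[0]:
                best = (cost, subset)
    return best


# Hypothetical instance: 3 candidate facilities, 4 clients.
open_cost = [4, 3, 6]
demand = [1, 2, 1, 1]
conn = [  # conn[i][j] = unit connection cost between facility i and client j
    [1, 2, 5, 6],
    [4, 1, 2, 5],
    [6, 5, 2, 1],
]
print(brute_force_ufl(open_cost, demand, conn))
```

Because there are no capacities, each client can simply take its cheapest open facility once the open set is fixed, so the only real decision is which facilities to open.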
Approximation algorithms constitute the principal methodology for addressing NP-hard problems such as the facility location problem. The precise notion of approximation was initially proposed in the context of multiprocessor scheduling and bin packing [21].
Definition 1 ([22]). Suppose Π is a minimization problem, which consists of instances and feasible solutions to these instances, and $\mathcal{A}$ is a polynomial-time algorithm for solving Π. The algorithm $\mathcal{A}$ is a ρ-factor approximation for Π, where ρ is referred to as the approximation ratio, if, for every instance $I$, the solution returned by $\mathcal{A}$ satisfies
$$\mathrm{cost}(\mathcal{A}(I)) \le \rho \cdot \mathrm{OPT}(I),$$
where $\mathrm{cost}(\cdot)$ denotes the cost function, $\mathrm{OPT}(I)$ is the cost of an optimal solution for $I$, and $\rho \ge 1$.
The approximation ratio objectively evaluates the quality of the solutions provided by an approximation algorithm, that is, how close the approximate solution is to the optimal solution. The closer the approximation ratio is to 1, the better.
Recall that, in the common facility location problem, the objective function value of a feasible solution is usually composed of two parts: open cost and connection cost. Based on this, the concept of the bifactor is introduced, which enables the precise quantification of the approximation algorithm performance.
Definition 2 ([23]). Suppose Π is a facility location problem and $\mathcal{A}$ is a polynomial-time algorithm for solving Π. The algorithm $\mathcal{A}$ is a $(\lambda_f, \lambda_c)$-factor approximation algorithm, where $(\lambda_f, \lambda_c)$ is referred to as the bifactor, if, for every instance $I$, the open cost $F$ and the connection cost $C$ of the solution returned by $\mathcal{A}$ satisfy
$$F + C \le \lambda_f F^* + \lambda_c C^*,$$
where $F^*$ and $C^*$ are the open cost and connection cost of the optimal solution for instance $I$, respectively.
The concepts of approximation schemes, namely the polynomial-time approximation scheme (PTAS) and the fully polynomial-time approximation scheme (FPTAS), are introduced below and will be referenced in subsequent sections.
Definition 3 ([24]). Suppose Π is a minimization problem. An approximation scheme for problem Π is a family of $(1+\varepsilon)$-approximation algorithms $\mathcal{A}_\varepsilon$ for problem Π over all $\varepsilon > 0$.
A polynomial-time approximation scheme (PTAS) for problem Π is an approximation scheme whose time complexity is polynomial in the input size for every fixed $\varepsilon$.
A fully polynomial-time approximation scheme (FPTAS) for problem Π is an approximation scheme whose time complexity is polynomial in the input size and also polynomial in $1/\varepsilon$.
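The notations $O$, $\Omega$, and $\Theta$ are introduced below. $O$ provides an upper bound on the order of growth of a function, $\Omega$ provides a lower bound on the order of growth of a function, and $\Theta$ provides both an upper and a lower bound on the order of growth of a function. The formal definitions are as follows.
Definition 4 ([25]). Suppose $f(n)$ and $g(n)$ are two positive-valued functions. Then
$f(n) = O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n \ge n_0$;
$f(n) = \Omega(g(n))$ if there exist positive constants $c$ and $n_0$ such that $f(n) \ge c\,g(n)$ for all $n \ge n_0$;
$f(n) = \Theta(g(n))$ if there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$.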
Hochbaum [26] reduced the UFL problem to the set cover problem and proposed a greedy algorithm achieving an approximation ratio of $O(\log n)$, where $n$ denotes the number of clients. Based on Feige’s results [4], it has been established that, under the assumption that P ≠ NP, the lower bound for the approximation ratio of this problem is $\Omega(\log n)$.
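The set-cover view behind this greedy approach can be sketched as follows: a "star" consists of one facility together with a set of still-unconnected clients, and the algorithm repeatedly picks the star with the smallest cost per newly connected client. The code below is an illustrative sketch in that spirit, assuming unit demands; it is not claimed to reproduce the exact procedure of [26], and the function name and instance data are hypothetical.

```python
def greedy_ufl(open_cost, conn):
    """Star-greedy heuristic for UFL with unit demands.

    Repeatedly picks the facility i and the set Y of unconnected clients
    minimising (remaining opening cost of i + sum of conn[i][j] over Y) / |Y|.
    For a fixed facility, the best Y is a prefix of the unconnected clients
    sorted by connection cost, so only prefixes are examined.
    """
    n_fac, n_cli = len(open_cost), len(conn[0])
    remaining = list(open_cost)   # opening cost still to be paid per facility
    unconnected = set(range(n_cli))
    assignment = {}               # client -> facility
    while unconnected:
        best = None               # (ratio, facility, clients of the star)
        for i in range(n_fac):
            ordered = sorted(unconnected, key=lambda j: (conn[i][j], j))
            total = remaining[i]
            for t, j in enumerate(ordered, start=1):
                total += conn[i][j]
                ratio = total / t
                if best is None or ratio < best[0]:
                    best = (ratio, i, ordered[:t])
        _, i, star = best
        remaining[i] = 0          # facility i is open from now on
        for j in star:
            assignment[j] = i
            unconnected.remove(j)
    opened = sorted(set(assignment.values()))
    cost = sum(open_cost[i] for i in opened)
    cost += sum(conn[i][j] for j, i in assignment.items())
    return opened, assignment, cost


# Hypothetical instance: 3 facilities, 4 unit-demand clients.
open_cost = [4, 3, 6]
conn = [
    [1, 2, 5, 6],
    [4, 1, 2, 5],
    [6, 5, 2, 1],
]
opened, assignment, cost = greedy_ufl(open_cost, conn)
print(opened, cost)
```

Setting the remaining opening cost of a chosen facility to zero lets later stars reuse an already open facility for free, which mirrors how a set, once bought, can keep covering elements in the set cover greedy.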
To achieve enhanced approximation guarantees, the facility location problem is studied in metric space, where the connection cost $c_{ij}$ between facility i and client j is metric.
Definition 5. The connection cost $c$ is metric if it satisfies the following properties:
- (1) $c_{ij}$ is non-negative for any $i, j \in \mathcal{F} \cup \mathcal{D}$, i.e., $c_{ij} \ge 0$;
- (2) $c_{ij}$ is symmetric for any $i, j \in \mathcal{F} \cup \mathcal{D}$, i.e., $c_{ij} = c_{ji}$;
- (3) $c$ satisfies the triangle inequality in $\mathcal{F} \cup \mathcal{D}$, i.e., for any $h, i, j \in \mathcal{F} \cup \mathcal{D}$, it holds that $c_{ij} \le c_{ih} + c_{hj}$.
In fact, in place of property (3), the following quadrilateral inequality is used in most algorithm analyses (see Figure 1), i.e., for any facilities $i, i' \in \mathcal{F}$ and clients $j, j' \in \mathcal{D}$, it holds that
$$c_{ij} \le c_{ij'} + c_{i'j'} + c_{i'j}.$$
In the following, unless explicitly stated otherwise, all discussions are confined to the metric space.
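The quadrilateral inequality follows from two applications of the triangle inequality together with symmetry. The short derivation below uses assumed notation: $c$ for the connection cost, $i, i'$ for facilities, and $j, j'$ for clients.

```latex
\begin{aligned}
c_{ij} &\le c_{ij'} + c_{j'j}
        && \text{(triangle inequality through } j'\text{)} \\
       &\le c_{ij'} + c_{j'i'} + c_{i'j}
        && \text{(triangle inequality through } i'\text{)} \\
       &=   c_{ij'} + c_{i'j'} + c_{i'j}
        && \text{(symmetry)}.
\end{aligned}
```

The intermediate term $c_{j'j}$ is a distance between two clients, which is why the triangle inequality is required on the whole point set of facilities and clients rather than only on facility-client pairs.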
For the metric UFL problem, Shmoys et al. [27] established the first constant-factor approximation algorithm, achieving a 3.16-approximation ratio using the techniques of Lin and Vitter [28]. Since then, the ratio has been continually improved by the LP-rounding technique, local search technique, dual-fitting technique, and primal–dual technique [23,29,30,31,32,33,34,35,36,37,38]. Currently, the best approximation ratio is 1.488 [39], achieved by combining the LP-rounding and dual-fitting techniques. For the lower bound of the approximation ratio for the metric UFL problem, Guha and Khuller [33] demonstrated inapproximability below a factor of 1.463, unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log \log n)})$. Subsequently, Sviridenko [40] strengthened this conclusion so that it holds unless $\mathrm{P} = \mathrm{NP}$.
Furthermore, a more general notion of metric, the p-th power metric space, is introduced in [41]. Specifically, the mathematical definition is presented as follows.
Definition 6 ([41]). For any positive integer p, the distances form a p-th power metric if they are non-negative, symmetric, and obey the following p-th power relaxed triangle inequality. That is, for any facilities $i, i'$ and clients $j, j'$, it holds that
$$\sqrt[p]{c_{ij}} \le \sqrt[p]{c_{ij'}} + \sqrt[p]{c_{i'j'}} + \sqrt[p]{c_{i'j}}.$$
It is easy to see that the metric is the particular case $p = 1$, and the squared metric is the particular case $p = 2$.
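As a small numerical illustration, squared Euclidean distances on a line can violate the ordinary ($p = 1$) quadrilateral inequality while satisfying the relaxed inequality for $p = 2$. The sketch below assumes the relaxed inequality takes the form "the p-th roots of the costs satisfy the quadrilateral inequality"; the function name and the point positions are hypothetical.

```python
from itertools import product


def satisfies_pth_power_inequality(cost, p, tol=1e-12):
    """cost[i][j] is the connection cost between facility i and client j.

    Checks root(c_ij) <= root(c_ij') + root(c_i'j') + root(c_i'j) for all
    facilities i, i' and clients j, j', where root is the p-th root.
    """
    n_fac, n_cli = len(cost), len(cost[0])
    root = [[c ** (1.0 / p) for c in row] for row in cost]
    for i, i2 in product(range(n_fac), repeat=2):
        for j, j2 in product(range(n_cli), repeat=2):
            if root[i][j] > root[i][j2] + root[i2][j2] + root[i2][j] + tol:
                return False
    return True


# Hypothetical points on a line: facilities at 0 and 2, clients at 1 and 3.
facilities = [0, 2]
clients = [1, 3]
squared = [[(a - b) ** 2 for b in clients] for a in facilities]

print(satisfies_pth_power_inequality(squared, p=1))  # ordinary inequality
print(satisfies_pth_power_inequality(squared, p=2))  # squared-metric inequality
```

In this instance the cost between the facility at 0 and the client at 3 is 9, while the three-hop detour costs only 1 + 1 + 1, so the ordinary inequality fails; after taking square roots the two sides are 3 and 3.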
In recent years, a multitude of variants on the facility location problem have emerged, and several important ones are introduced below.
The first important variant of the facility location problem is the prize-collecting UFL problem, in which a penalty function $\pi(\cdot)$ is defined over the set of clients. The term “penalty” originates from the consideration of the impact of distant clients on the solution structure, whereby certain clients may remain unserved at the cost of incurring a specific penalty. Formally, we are given a set $\mathcal{F}$ of facilities and a set $\mathcal{D}$ of clients. Each client $j \in \mathcal{D}$ has a demand $d_j$, and each facility $i \in \mathcal{F}$ has an open cost $f_i$. The unit connection cost for connecting client $j$ to facility $i$ is $c_{ij}$, which is metric. For any client subset $S \subseteq \mathcal{D}$, there exists a penalty function $\pi(S)$ satisfying the following conditions: (1) $\pi(\emptyset) = 0$; (2) $\pi(\cdot)$ is nondecreasing. The function $\pi(\cdot)$ is termed submodular if it satisfies the following inequality:
$$\pi(S \cup T) + \pi(S \cap T) \le \pi(S) + \pi(T), \quad \forall S, T \subseteq \mathcal{D}. \tag{1}$$
The prize-collecting UFL problem is to open a subset of facilities such that each client is either served by an open facility or penalized, with the objective of minimizing the total cost, which includes the open cost, connection cost, and penalty cost. When the penalty function $\pi(\cdot)$ is submodular, the prize-collecting UFL problem is referred to as the UFL problem with submodular penalties (UFLPSP). This problem was first proposed by Hayrapetyan et al. [42], who demonstrated that, if there exists an LP-based $\rho$-approximation algorithm for the UFL problem, then a $(1+\rho)$-approximation algorithm can be obtained for UFLPSP. Consequently, according to [39], a 2.488-approximation algorithm can be obtained. However, the algorithm proposed in [42] suffers from high computational complexity due to its reliance on the convexity of the problem and the use of the ellipsoid algorithm. Subsequently, Chudak and Nagano [
43] developed an efficient
-approximation algorithm for UFLPSP by employing convex relaxation instead of linear program relaxation, where
is the approximation factor of an LP-based approximation for the UFL problem. Consequently, a faster
-approximation algorithm can be obtained. Additionally, Du et al. [
44] proposed a 3-approximation algorithm based on the primal–dual technique. Li et al. [
45] further improved the approximation ratio to 2.375 by integrating the primal–dual technique with a greedy augmentation approach. The best approximation ratio for UFLPSP is 2, achieved by Li et al. [
46] using the LP-rounding technique. They provided a general framework for addressing the submodular penalty covering problem. Moreover, Li et al. [
47] investigated the
k-level UFL problem with submodular penalties (
kL-UFLPSP) and proposed a 6-approximation algorithm based on the primal–dual technique. Later, Li et al. [
48] developed an LP-based approximation algorithm with the ratio of
for
kL-UFLPSP. Zhang et al. [
49] proposed a primal–dual greedy augmentation approximation algorithm with a ratio of 2.9444, which is the best result for
kL-UFLPSP. Xu et al. [
50] extended the problem to the stochastic UFL problem with submodular penalties (SUFLPSP) and gave a 3-approximation algorithm based on the primal–dual technique.
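The conditions on the penalty function can be checked by brute force on a small ground set. In the sketch below, a set function `penalty` is treated as submodular when penalty(S ∪ T) + penalty(S ∩ T) ≤ penalty(S) + penalty(T) for all subsets S and T, and as modular when equality always holds. The helper names and the three example functions are hypothetical, and the check is exponential in the number of clients.

```python
import math
from itertools import combinations


def all_subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(penalty, ground, tol=1e-12):
    sets = list(all_subsets(ground))
    return all(
        penalty(s | t) + penalty(s & t) <= penalty(s) + penalty(t) + tol
        for s in sets for t in sets
    )


def is_modular(penalty, ground, tol=1e-12):
    sets = list(all_subsets(ground))
    return all(
        abs(penalty(s | t) + penalty(s & t) - penalty(s) - penalty(t)) <= tol
        for s in sets for t in sets
    )


clients = {0, 1, 2}
weight = {0: 2.0, 1: 1.0, 2: 3.0}      # hypothetical per-client penalties


def linear(s):                          # linear (modular) penalty
    return sum(weight[j] for j in s)


def concave(s):                         # concave in |S|: submodular, not modular
    return math.sqrt(len(s))


def convex(s):                          # convex in |S|: not submodular
    return float(len(s) ** 2)


print(is_submodular(linear, clients), is_modular(linear, clients))
print(is_submodular(concave, clients), is_modular(concave, clients))
print(is_submodular(convex, clients))
```

The linear penalty illustrates the case in which each client carries its own fixed penalty, while the concave example models diminishing marginal penalties as more clients are rejected.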
The penalty function $\pi(\cdot)$ is modular when the inequality “≤” in (1) is replaced with the equality “=”. It is evident that a modular function is a special case of a submodular function. When the penalty function $\pi(\cdot)$ is modular, the prize-collecting UFL problem is referred to as the UFL problem with linear penalties (UFLPLP). This problem was first proposed by Charikar et al. [
51], who developed a 3-approximation algorithm by the primal–dual technique. Xu and Xu [
52] proposed an algorithm achieving an approximation ratio of
through the LP-rounding technique. Geunes et al. [
53] established a theoretical connection between the UFL problem and UFLPLP, demonstrating that, if an
-approximation algorithm exists for UFL problem, then a
-approximation algorithm can be derived for UFLPLP. They further obtained a 2.056-approximation algorithm for UFLPLP. Jain et al. [
23] integrated dual-fitting with factor-revealing techniques to obtain a 2-approximation algorithm. Xu and Xu [
54] combined the primal–dual and local search techniques to propose a 1.8526-approximation algorithm. The best approximation ratio currently is 1.5148, achieved by Li et al. [
46] through an LP-rounding technique based on non-uniform distribution parameters. Moreover, Bumb [
55] investigated the
k-level UFL problem with linear penalties (
kL-UFLPLP) and devised a 6-approximation algorithm based on the primal–dual technique. Later, Asadi et al. [
56] proposed an LP-based 4-approximation algorithm for
kL-UFLPLP. Li et al. [
48] further improved the approximation ratio to 3 using the LP-rounding technique. Wang et al. [
57] studied the
k-UFL problem with linear penalties (
k-UFLPLP) and developed a local search-based approximation algorithm with a ratio of
. San et al. [
58] introduced the online prize-collecting UFL (OPC-UFL) problem and proposed a primal–dual
-competitive algorithm. Wu et al. [
59] investigated the two-stage stochastic UFL problem with linear penalties and presented an LP-rounding algorithm with a per-scenario constant approximation bound 3.0294. Wang et al. [
60] studied the
k-level UFL game with penalties, formulated a cost-sharing scheme for this problem, and demonstrated the approximate cost recovery was 6.
For ease of understanding, Table 1 summarizes the approximation ratios and the methods of the algorithms for the prize-collecting UFL problem and its variants.
The second important variant of the facility location problem is the robust UFL (R-UFL) problem, also referred to as the UFL problem with outliers. In the classical facility location problem, serving a small number of distant clients may disproportionately affect the solution cost. By contrast, excluding these clients from service can substantially reduce the total cost, which motivates the formulation of the robust UFL problem. Unlike the prize-collecting UFL problem, the robust UFL problem imposes a strict constraint that at most $L$ clients may remain unserved (without incurring penalty costs), and these clients are termed outliers. Formally, in the robust UFL problem, we are given a set $\mathcal{F}$ of facilities and a set $\mathcal{D}$ of clients. Each client $j \in \mathcal{D}$ has a demand $d_j$, and each facility $i \in \mathcal{F}$ has an open cost $f_i$. The unit connection cost for connecting client $j$ to facility $i$ is $c_{ij}$, which is metric. Given a non-negative integer $L$, the robust UFL problem is to open a subset $S \subseteq \mathcal{F}$ of facilities and find an outlier subset $O \subseteq \mathcal{D}$ with $|O| \le L$, such that each client is either served by an open facility or assigned to the outlier set, with the objective of minimizing the sum of the open cost and connection cost. The robust UFL problem was first proposed by Charikar et al. [51], who showed that the natural linear program has an unbounded integrality gap and devised a 3-approximation algorithm based on the primal–dual technique. Jain et al. [23] advanced this line of research by integrating dual-fitting and factor-revealing techniques to achieve a 2-approximation algorithm. For the uniform costs case, Friggstad et al. [61] established that a simple multiswap local search heuristic yields a PTAS for doubling metrics (including fixed-dimensional Euclidean metrics) and for the shortest path metrics of graphs from a minor-closed family. Moreover, Luo et al. [62] introduced the priority UFL problem with outliers (PUFLPO), which generalizes both the robust UFL problem and the priority facility location problem (see, e.g., [63]). They proposed a 3-approximation algorithm using the primal–dual technique. Han et al. [64] extended the framework to the k-level UFL problem with outliers (kL-UFLO) and developed a 6-approximation algorithm by the primal–dual technique. Inspired by [64], Zhang [65] investigated the k-level squared metric UFL problem with outliers (kL-SMUFLPO) and derived a 32-approximation algorithm based on the primal–dual technique.
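One structural observation about the robust UFL problem is easy to illustrate: once the set of open facilities is fixed, every served client uses its cheapest open facility, so an optimal outlier set simply discards the (at most) L clients with the largest service cost. The sketch below evaluates a fixed open set in this way; the function name and the instance data are hypothetical, and choosing which facilities to open remains the hard part of the problem.

```python
def robust_cost_for_open_set(open_set, open_cost, demand, conn, L):
    """Cost of the best solution that opens exactly open_set and may leave
    up to L clients unserved (outliers) at no charge."""
    service = {
        j: demand[j] * min(conn[i][j] for i in open_set)
        for j in range(len(demand))
    }
    # Discard the L most expensive clients (ties broken by client index).
    ranked = sorted(service, key=lambda j: (-service[j], j))
    outliers = set(ranked[:L])
    connection = sum(c for j, c in service.items() if j not in outliers)
    opening = sum(open_cost[i] for i in open_set)
    return opening + connection, outliers


# Hypothetical instance: 3 facilities, 4 clients.
open_cost = [4, 3, 6]
demand = [1, 2, 1, 1]
conn = [
    [1, 2, 5, 6],
    [4, 1, 2, 5],
    [6, 5, 2, 1],
]
print(robust_cost_for_open_set((1,), open_cost, demand, conn, L=0))
print(robust_cost_for_open_set((1,), open_cost, demand, conn, L=1))
```

With only facility 1 open, allowing a single outlier removes the most distant client and lowers the cost without opening any additional facility, which is exactly the effect that motivates the outlier model.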
For ease of understanding, Table 2 summarizes the approximation ratios and the methods of the algorithms for the robust UFL problem and its variants.
The third important variant of the facility location problem is the k-UFL problem, also referred to as the uncapacitated k-facility location problem, where the number of facilities that can be opened is constrained to at most $k$. The primary distinction between the UFL problem and the k-UFL problem is the limit on the number of open facilities. Intuitively, when $k = |\mathcal{F}|$, the k-UFL problem is exactly the UFL problem. Formally, in the k-UFL problem, we are given a set $\mathcal{F}$ of facilities and a set $\mathcal{D}$ of clients. Each client $j \in \mathcal{D}$ has a demand $d_j$, and each facility $i \in \mathcal{F}$ has an open cost $f_i$. The unit connection cost between client $j$ and facility $i$ is denoted by $c_{ij}$, which is metric. Given an integer $k$, the k-UFL problem is to open at most k facilities and connect each client to an open facility such that the sum of the open cost and connection cost is minimized. For the
k-UFL problem, Jain and Vazirani [
34] employed the primal–dual technique and Lagrangian relaxation to derive a 6-approximation algorithm. They also implicitly demonstrated that any Lagrangian multiplier preserving (LMP)
-approximation algorithm for the UFL problem can be extended to yield a
-approximation algorithm for the
k-UFL problem. Jain et al. [
23] later improved this approximation ratio to 4 by using a scaling method and the conclusion of [
66]. Zhang [
67] proposed an approximation algorithm with a ratio of
based on the local search technique. Charikar and Li [
68] developed a 3.25-approximation algorithm using the LP-rounding technique. Kong and Zhang [
69] proposed a
-approximation algorithm for the
k-UFL problem by employing a new sampling-based technique to estimate the locations of the facilities opened in optimal solutions. Moreover, Jain and Vazirani [
34] proposed a more general model, the squared metric
k-UFL (SM
k-UFL) problem, where the connection cost
is defined as the squared metric distance between
i and
j. They obtained a
-approximation algorithm by combining Lagrangian relaxation with primal–dual techniques. Zhang et al. [
70] further proposed an approximation algorithm with a ratio of
for the SM
k-UFL problem based on the local search technique. Zhang et al. [
71] proposed a
-approximation algorithm for the SM
k-UFL problem within the framework of Lagrangian relaxation. Note that there is a big difference in the ratio between k-UFL and SMk-UFL; this is because the connection costs in the SMk-UFL problem do not satisfy the triangle inequality. Wang et al. [
72] extended the problem to the squared metric
k-UFL problem with linear penalties (SM
k-UFLPLP) and derived an approximation algorithm with a ratio of
using the local search technique.
For ease of understanding, Table 3 summarizes the approximation ratios and the methods of the algorithms for the k-UFL problem and its variants.
The fourth important variant of the facility location problem is the k-level UFL (kL-UFL) problem, also referred to as the multi-level UFL problem, where k represents the number of facility levels. Notably, when $k = 1$, the k-level UFL problem is exactly the UFL problem. Formally, in the
k-level UFL problem, we are given a set of facilities organized into
k hierarchical levels, denoted as
, and a set
of clients. Each client
has a demand
, and each facility
has an open cost
. Let
. A path
is termed open if and only if each facility
is open. The unit connection cost between client
and the first-level facility
is denoted by
, and the unit connection between a facility at
level and a facility at
l level is denoted by
, both of which are metric. The unit connection cost of client
to any path
is denoted as
, which is metric in
. The
k-level UFL problem is to open a subset of facilities and connect each client to a path along open facilities, such that the total cost, including the open cost and connection cost, is minimized. For the
k-level UFL problem, Meyerson et al. [
74] proposed the first approximation algorithm with a ratio of
, where
. Bumb and Kern [
75] developed a 6-approximation algorithm based on the primal–dual technique. Ageev et al. [
76] introduced a path reduction technique that reduces the
k-level UFL problem to the UFL problem, achieving a 3.27-approximation algorithm. Aardal and Chudak [
77] formulated a linear program with multiple index variables for the
k-level UFL problem and derived a 3-approximation algorithm based on the LP-rounding technique. Subsequently, Gabor and van Ommeren [
78] introduced a new linear program with polynomial variables for the
k-level UFL problem and achieved the same approximation ratio of 3 based on the LP-rounding technique. Byrka et al. [
79] proposed a new integer program based on the forest structure characteristics of the optimal solution, yielding an
-approximation algorithm, where
is monotonically increasing with respect to
k and satisfies
. They further applied the randomization to improve the approximation ratio for all
, obtaining 1.97, 2.09, and 2.19 for
k = 3, 4, and 5, respectively. The best ratio for
currently is 1.77, achieved by Zhang [
80] through a quasi-greedy technique. For the lower bound of the
k-level UFL problem, Krishnaswamy and Sviridenko [
81] established that no polynomial-time approximation algorithm can achieve performance guarantees better than 1.539 for the 2-level UFL problem and 1.61 for the
k-level UFL problem unless
. Moreover, Wang et al. [
82] investigated the dynamic
k-level UFL (D
kL-UFL) problem, developing a combinatorial primal–dual approximation algorithm with a ratio of 6. They also designed a 6-approximation algorithm for the D
kL-UFL problem with submodular penalties (D
kL-UFLPSP) and another 6-approximation algorithm for the D
kL-UFL problem with outliers (D
kL-UFLPO).
For ease of understanding, Table 4 summarizes the approximation ratios and the methods of the algorithms for the k-level UFL problem and its variants.
The UFL problem serves as a foundational model for optimizing facility placement and client assignment to minimize costs. However, real-world applications often involve resource-constrained facilities, such as storage limits in warehouses, bandwidth thresholds in communication networks, and service capacity ceilings in public infrastructure. These constraints render the traditional UFL model inadequate for practical implementation, necessitating more generalized frameworks. Capacity constraints, a hallmark of resource-limited scenarios, have been extensively studied across combinatorial optimization problems, including the capacitated vertex cover problem [83,84,85,86], the capacitated set cover problem [87,88,89], and the capacitated allocation problem [90]. To enhance the practical relevance of facility location models, capacity constraints have been integrated into the framework, leading to the development of the universal facility location (Uni-FL) problem. Introduced by Mahdian and Pál [91], the Uni-FL problem offers a more flexible and comprehensive approach by incorporating both capacitated (with hard and soft constraints) and uncapacitated cases. Advancing the theoretical understanding and algorithmic solutions for the Uni-FL problem can significantly improve the practicality and efficiency of facility location strategies.
This paper systematically examines the Uni-FL problem and its special cases, including the hard capacitated facility location (HCFL) problem and the soft capacitated facility location (SCFL) problem. Unlike previous works that primarily focus on either uncapacitated or capacitated versions in isolation, this research provides a unified perspective on these models. By surveying existing approximation algorithms and theoretical advancements, it highlights the connections between different problem variants. Furthermore, this work identifies open research questions and proposes future directions, contributing to the ongoing development of more efficient and applicable facility location algorithms. The remainder of this paper is organized as follows. Section 2 introduces the Uni-FL problem with its relevant work. Section 3 and Section 4 discuss two special cases of the Uni-FL problem, the HCFL problem and the SCFL problem, respectively, along with their relevant work. Finally, Section 5 summarizes our survey and proposes some future research directions in this field.
2. The Universal Facility Location Problem
The universal facility location (Uni-FL) problem is a generalized framework that extends various facility location problems, including the capacitated facility location problem (both hard and soft capacity constraints) and the UFL problem.
Before formally introducing this problem, we first define the concepts of splittable and unsplittable cases of a client’s demand [27,92]. Notably, in the hard capacitated facility location problem, the splittable case is commonly referred to as the multiple-source capacitated facility location problem, whereas the unsplittable case is referred to as the single-source capacitated facility location problem [93,94].
Definition 7. When a client’s demand can be served by multiple facilities, it is referred to as the splittable case; otherwise, it is referred to as the unsplittable case.
In the Uni-FL problem, we are given a set $\mathcal{F}$ of facilities and a set $\mathcal{D}$ of clients. Each client $j \in \mathcal{D}$ has a demand $d_j$, and the unit connection cost that client $j$ incurs to connect to facility $i$ is denoted as $c_{ij}$, which is metric. The Uni-FL problem is to allocate a certain capacity $u_i$ to each facility $i \in \mathcal{F}$ and connect all demands to the facilities, subject to the constraint that the total demand served by any facility $i$ cannot exceed its allocated capacity $u_i$. Let $f_i(u_i)$ represent the cost of allocating $u_i$ units of capacity at facility i, where $f_i(\cdot)$ satisfies
- (a) Non-decreasing. $f_i(u) \le f_i(u')$ for every $0 \le u \le u'$.
- (b) Normalization. $f_i(0) = 0$.
- (c) Left continuous. $\lim_{x \to u^-} f_i(x) = f_i(u)$ for every $u > 0$.
The facility cost is defined as the total cost of allocating capacity at all facilities, given by $\sum_{i \in \mathcal{F}} f_i(u_i)$. The connection cost is defined as the total cost of connecting all client demands to facilities. The Uni-FL problem is to minimize the sum of the facility cost and connection cost.
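The way the capacity cost function captures the uncapacitated, hard capacitated, and soft capacitated cases can be sketched with three small constructors. The encodings below are the commonly used ones and are stated here as assumptions rather than quoted from the text; the names `open_cost` and `cap` and the numeric values are hypothetical. Each function is non-decreasing, equals 0 at 0, and is left continuous.

```python
import math


def uncapacitated(open_cost):
    """UFL: the open cost is paid once any positive capacity is allocated."""
    return lambda u: 0.0 if u <= 0 else open_cost


def hard_capacitated(open_cost, cap):
    """HCFL: a single copy of the facility that can serve at most cap units."""
    def cost(u):
        if u <= 0:
            return 0.0
        return open_cost if u <= cap else math.inf
    return cost


def soft_capacitated(open_cost, cap):
    """SCFL: any number of copies, each costing open_cost and serving cap units."""
    return lambda u: open_cost * math.ceil(u / cap) if u > 0 else 0.0


def facility_cost(cost_fns, capacities):
    """Total facility cost: sum of f_i(u_i) over all facilities."""
    return sum(f(u) for f, u in zip(cost_fns, capacities))


# Hypothetical facilities: open cost 5, capacity 10 where applicable.
f_ufl = uncapacitated(5)
f_hard = hard_capacitated(5, 10)
f_soft = soft_capacitated(5, 10)
print(f_ufl(25), f_hard(25), f_soft(25))
print(facility_cost([f_ufl, f_soft], [3, 25]))
```

The jump of the soft capacitated function occurs just after each multiple of the capacity, so the value at the multiple itself still counts the smaller number of copies; this is what left continuity expresses.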
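By introducing a binary variable $x_{ij}$ to indicate whether client j is served by facility i or not, we can obtain the integer linear program of the unsplittable Uni-FL problem as follows.
$$
\begin{aligned}
\min\quad & \sum_{i \in \mathcal{F}} f_i(u_i) + \sum_{i \in \mathcal{F}} \sum_{j \in \mathcal{D}} d_j c_{ij} x_{ij} \\
\text{s.t.}\quad & \sum_{i \in \mathcal{F}} x_{ij} = 1, && \forall j \in \mathcal{D}, && (2) \\
& \sum_{j \in \mathcal{D}} d_j x_{ij} \le u_i, && \forall i \in \mathcal{F}, && (3) \\
& x_{ij} \in \{0, 1\}, && \forall i \in \mathcal{F},\ j \in \mathcal{D}, && (4) \\
& u_i \ge 0, && \forall i \in \mathcal{F},
\end{aligned}
$$
where constraint (2) means each client’s demand must be satisfied; constraint (3) means the total demand served by facility i cannot exceed its capacity $u_i$.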
In the splittable case, we define variable $x_{ij}$ as the proportion of client j’s demand assigned to facility i, which lies in [0, 1]. Consequently, in the splittable case, we only need to replace constraint (4) in the program above with the following constraint:
$$0 \le x_{ij} \le 1, \quad \forall i \in \mathcal{F},\ j \in \mathcal{D}.$$
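A candidate solution of the splittable problem can be validated and priced directly from the problem description: every client's demand must be fully assigned, the load at each facility must not exceed its allocated capacity, and the objective is the facility cost plus the connection cost. The following sketch does exactly this; the function name, the two capacity cost functions, and the numbers are hypothetical.

```python
def evaluate_splittable(x, u, demand, conn, cost_fns, tol=1e-9):
    """x[i][j]: fraction of client j's demand sent to facility i.
    u[i]: capacity allocated at facility i.
    Returns facility cost + connection cost, or raises ValueError if the
    solution is infeasible."""
    n_fac, n_cli = len(u), len(demand)
    for j in range(n_cli):
        if any(x[i][j] < -tol for i in range(n_fac)):
            raise ValueError(f"negative assignment for client {j}")
        if abs(sum(x[i][j] for i in range(n_fac)) - 1.0) > tol:
            raise ValueError(f"demand of client {j} is not fully assigned")
    for i in range(n_fac):
        load = sum(demand[j] * x[i][j] for j in range(n_cli))
        if load > u[i] + tol:
            raise ValueError(f"facility {i} exceeds its allocated capacity")
    facility = sum(cost_fns[i](u[i]) for i in range(n_fac))
    connection = sum(
        demand[j] * conn[i][j] * x[i][j]
        for i in range(n_fac) for j in range(n_cli)
    )
    return facility + connection


# Hypothetical instance: 2 facilities, 2 clients.
demand = [4, 6]
conn = [[1, 3], [2, 1]]
cost_fns = [
    lambda u: 0 if u <= 0 else 3 + u,   # fixed charge plus linear cost
    lambda u: 2 * u,                    # purely linear cost
]
x = [[1.0, 0.5], [0.0, 0.5]]            # client 1 is split between both facilities
u = [7, 3]
print(evaluate_splittable(x, u, demand, conn, cost_fns))
```

Once the capacities are fixed, finding the cheapest fractional assignment is a transportation problem, which is one reason the splittable case is more tractable than the unsplittable one.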
Few results are available in the literature for the unsplittable Uni-FL problem. In the remainder of this section, we focus on the splittable Uni-FL problem.
For the non-metric splittable Uni-FL problem, Li and Khuller [
95] proposed a greedy algorithm with an approximation ratio of
, where
.
For the metric splittable Uni-FL problem, Mahdian and Pál [
91] first introduced this problem and developed an approximation algorithm with a ratio of
based on the local search technique. Vygen [
96] improved this result, achieving an approximation ratio of
using the same technique while extending the pivoting operation to a more general structure. Garg et al. [
97] introduced extended operations in the local search technique and proposed an analytical framework to achieve an approximation ratio of
, but one of their operations is unlikely to be polynomially computable. Angel et al. [
98] proposed a polynomially computable operation called open–close, closely following the analysis of [
97,
99], and achieved an approximation algorithm with a ratio of
, which is the current best result.
For ease of understanding, Table 5 summarizes the approximation ratios and the methods of the algorithms for the splittable Uni-FL problem.
Open problem 1. Find a polynomial-time algorithm with an approximation ratio less than for the splittable Uni-FL problem.
2.1. The Prize-Collecting Uni-FL Problem
The prize-collecting Uni-FL (PC-Uni-FL) problem is a variant of the Uni-FL problem and extends the prize-collecting UFL problem. A penalty function $\pi(\cdot)$ is defined over the set of clients. Based on the Uni-FL problem, for any client subset $S \subseteq \mathcal{D}$, the penalty function $\pi(S)$ satisfies the following conditions: (1) $\pi(\emptyset) = 0$; (2) $\pi(\cdot)$ is nondecreasing; (3) $\pi(\cdot)$ is either submodular or modular. The prize-collecting Uni-FL problem is to allocate a certain capacity $u_i$ to each facility $i$ and connect the demands of the non-penalized clients to the facilities, subject to the constraint that the total demand served by any facility $i$ cannot exceed its allocated capacity $u_i$, with the objective of minimizing the total cost, including the facility cost, connection cost, and penalty cost.
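By introducing variable $x_{ij}$ to indicate the proportion of client j’s demand assigned to facility i, and variable $z_S$ to indicate whether client set S is penalized or not, we can obtain the mixed-integer linear program of the splittable prize-collecting Uni-FL problem.
$$
\begin{aligned}
\min\quad & \sum_{i \in \mathcal{F}} f_i(u_i) + \sum_{i \in \mathcal{F}} \sum_{j \in \mathcal{D}} d_j c_{ij} x_{ij} + \sum_{S \subseteq \mathcal{D}} \pi(S) z_S \\
\text{s.t.}\quad & \sum_{i \in \mathcal{F}} x_{ij} + \sum_{S \subseteq \mathcal{D}:\, j \in S} z_S \ge 1, && \forall j \in \mathcal{D}, && (5) \\
& \sum_{j \in \mathcal{D}} d_j x_{ij} \le u_i, && \forall i \in \mathcal{F}, \\
& x_{ij} \ge 0, && \forall i \in \mathcal{F},\ j \in \mathcal{D}, \\
& z_S \in \{0, 1\}, && \forall S \subseteq \mathcal{D}, \\
& u_i \ge 0, && \forall i \in \mathcal{F},
\end{aligned}
$$
where constraint (5) means, for any client $j \in \mathcal{D}$, either its demand is satisfied, or it is penalized.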
When the penalty function $\pi(\cdot)$ is submodular, few results are available in the literature. When $\pi(\cdot)$ is modular, the problem is referred to as the Uni-FL problem with linear penalties. This problem was first proposed by Xu et al. [
100], who proposed an approximation algorithm with a ratio of
based on the local search technique. Subsequently, Xu et al. [
101] further improved the approximation ratio to
by the local search technique, which is currently the best result.
For ease of understanding, Table 6 summarizes the approximation ratios and the methods of the algorithms for the splittable prize-collecting Uni-FL problem when $\pi(\cdot)$ is modular.
Open problem 2. Design a constant approximation algorithm for the splittable prize-collecting Uni-FL problem when $\pi(\cdot)$ is submodular.
2.2. The Robust Uni-FL Problem
The robust Uni-FL (R-Uni-FL) problem represents another variant of the Uni-FL problem with a robust set, and extends the robust UFL problem. This problem is commonly referred to as the Uni-FL problem with outliers and, in some problems, is also known as the partial problem [102,103,104]. Unlike the prize-collecting Uni-FL problem, the robust Uni-FL problem stipulates that at most L clients may remain unserved (without incurring any penalty cost), corresponding to scenarios where the number of penalized clients is explicitly limited (as discussed in [105,106]).
Based on the Uni-FL problem, we are given an integer $L$. The robust Uni-FL problem is to find an outlier set, which is a client subset with a cardinality not exceeding L, allocate a certain capacity $u_i$ to each facility $i$, and connect the total demands of clients (excluding those in the outlier set) to the facilities, subject to the constraint that the total demand served by any facility $i$ cannot exceed its allocated capacity $u_i$, with the objective of minimizing the sum of the facility cost and connection cost.
By introducing variable
to indicate the proportion of client
j’s demand assigned to facility
i, and variable
to indicate whether client
j is an outlier or not, we can obtain the mixed-integer linear program of the splittable robust Uni-FL problem.
where constraint (
6) means, for any client
, either its demand is satisfied, or it is an outlier; constraint (
7) means the number of outliers cannot exceed
L.
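The role of the two constraints described above can be checked on a candidate solution. The sketch below assumes that constraint (6) has the usual covering form (the fractions of client j's demand summed over facilities, plus the outlier indicator, is at least 1) and that constraint (7) bounds the number of outliers by L. The names `x`, `z`, `allocated`, and `max_outliers` are illustrative assumptions, not the notation of the original program.

```python
def is_feasible_robust(x, z, allocated, demands, max_outliers, tol=1e-9):
    """Check a candidate solution of the splittable robust problem.

    x[i][j]      -- fraction of client j's demand assigned to facility i
    z[j]         -- 1 if client j is an outlier, 0 otherwise
    allocated[i] -- capacity allocated to facility i
    """
    n_fac, n_cli = len(x), len(z)

    # Assumed form of constraint (6): fully served, or declared an outlier.
    for j in range(n_cli):
        served = sum(x[i][j] for i in range(n_fac))
        if served + z[j] < 1 - tol:
            return False

    # Assumed form of constraint (7): at most L outliers.
    if sum(z) > max_outliers:
        return False

    # The load of each facility must fit its allocated capacity.
    for i in range(n_fac):
        load = sum(demands[j] * x[i][j] for j in range(n_cli))
        if load > allocated[i] + tol:
            return False
    return True


demands = [2, 1, 3]
x = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 0.0]]
z = [0, 0, 1]                 # client 2 is left unserved
allocated = [2.5, 0.5]

print(is_feasible_robust(x, z, allocated, demands, max_outliers=1))
print(is_feasible_robust(x, z, allocated, demands, max_outliers=0))
```

With one outlier allowed the solution passes all checks; with L = 0 the same solution is rejected because client 2 is unserved.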
Research on the robust Uni-FL problem remains limited; however, numerous studies have been conducted on specific cases, such as the hard-capacitated case, as detailed in
Section 3.
Open problem 3. Design a constant approximation algorithm for the splittable robust Uni-FL problem.
2.3. The k-Uni-FL Problem
The k-Uni-FL problem, referred to as the universal k-facility location problem, is an important variant of the Uni-FL problem and extends the k-UFL problem. The primary distinction between the Uni-FL problem and the k-Uni-FL problem lies in the constraint on the number of open facilities. Intuitively, the Uni-FL problem is a special case of the k-Uni-FL problem with . The k-Uni-FL problem is exactly the k-UFL problem when the cost function for each , where .
Based on the Uni-FL problem, given an integer , the k-Uni-FL problem is to allocate a certain capacity to each facility and connect all client demands to the facilities, subject to two constraints: (1) the total demands served by any facility must not exceed its allocated capacity , and (2) the total number of facilities allocated is no more than k. The objective of the k-Uni-FL problem is to minimize the sum of facility cost and connection cost.
By introducing variable
to indicate the proportion of client
j’s demand assigned to facility
i, we can obtain the mixed-integer linear program of the splittable
k-Uni-FL problem.
where
is the indicator function, if
,
; otherwise,
; constraint (
8) means that the total number of facilities allocated is no more than
k.
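As a small illustration of constraint (8), the indicator can be read as counting the facilities that receive a positive capacity. This reading is an assumption based on the surrounding description, and the function names below are hypothetical.

```python
def num_allocated(capacities, tol=1e-9):
    """Count facilities whose allocated capacity is strictly positive."""
    return sum(1 for u in capacities if u > tol)


def satisfies_cardinality(capacities, k):
    """Assumed reading of constraint (8): at most k facilities are used."""
    return num_allocated(capacities) <= k


capacities = [0.0, 2.5, 0.0, 1.0]   # hypothetical allocation for 4 facilities
print(num_allocated(capacities), satisfies_cardinality(capacities, 2))
```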
Research on the
k-Uni-FL problem remains limited; however, numerous studies have been conducted on specific cases, such as the hard-capacitated and soft-capacitated cases, as detailed in
Section 3 and
Section 4.
Open problem 4. Design a constant approximation algorithm for the splittable k-Uni-FL problem.
2.4. The k-Level Uni-FL Problem
The k-level Uni-FL (kL-Uni-FL) problem, also referred to as the multi-level Uni-FL problem, is a variant of the Uni-FL problem, where k denotes the number of facility levels. The k-level Uni-FL problem is exactly the k-level UFL problem when the cost function for each , where .
Based on the Uni-FL problem, the set of facilities is partitioned into k disjoint subsets, denoted as , such that . Let represent the set of all possible paths. A path is termed open if and only if the capacity of each facility is allocated. The unit connection cost between client and the first-level facility is denoted by , and the unit connection cost between a facility at level l − 1 and a facility at level l is denoted by , both of which are metric. The unit connection cost of client to any path is denoted as , which is metric in . The k-level Uni-FL problem is to allocate a certain capacity to each facility and connect the total demands of clients to open paths along these facilities, subject to the constraint that the total demand served by any facility does not exceed its allocated capacity , with the objective of minimizing the sum of facility cost and connection cost.
By introducing variable
to represent the proportion of client
j’s demand assigned to the facilities in the path
p, the mixed-integer linear program of the splittable
k-level Uni-FL problem can be expressed as follows.
where constraint (
9) means each client must be connected to a path; constraint (
10) means the total demands served by a facility cannot exceed its capacity.
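The path structure of the k-level problem can be made concrete with a short sketch. It assumes that a path picks exactly one facility per level and that the unit cost of connecting a client to a path is the sum of the client-to-first-level cost and the costs between consecutive levels. The facility labels and cost values are hypothetical.

```python
from itertools import product


def enumerate_paths(levels):
    """All paths: one facility from each level, level 1 first."""
    return list(product(*levels))


def path_cost(j, path, c_client, c_between):
    """Unit cost of connecting client j to `path` (assumed additive)."""
    cost = c_client[j][path[0]]
    for a, b in zip(path, path[1:]):
        cost += c_between[(a, b)]
    return cost


# Hypothetical 2-level instance with a single client (index 0).
levels = [["a1", "a2"], ["b1", "b2"]]
c_client = {0: {"a1": 1, "a2": 4}}
c_between = {("a1", "b1"): 2, ("a1", "b2"): 5,
             ("a2", "b1"): 1, ("a2", "b2"): 1}

paths = enumerate_paths(levels)
best = min(paths, key=lambda p: path_cost(0, p, c_client, c_between))
print(len(paths), best, path_cost(0, best, c_client, c_between))
```

The number of paths is the product of the level sizes, which is why a path-indexed program grows quickly with k.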
Research on the
k-level Uni-FL problem remains limited; however, numerous studies have been conducted on specific cases, such as the hard-capacitated and soft-capacitated cases, as detailed in
Section 3 and
Section 4.
Open problem 5. Design a constant approximation algorithm for the splittable k-level Uni-FL problem.
2.5. The Other Variants of Uni-FL Problem
The Uni-FL problem in the
p-th power of metric space (
) was introduced by Xu et al. [
41]. This formulation generalizes the Uni-FL problem by relaxing the assumption that the unit connection cost
is proportional to the distance between the client
j and the facility
i. Based on the local search technique, Xu et al. [
41] proposed a
-approximation algorithm, where
is the root of the following equation:
When
, the algorithm achieves an approximation ratio of
, matching the result of Mahdian and Pál [
91]; when
, the algorithm achieves an approximation ratio of
.
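To see why the p-th power setting is harder than the metric one, recall a standard fact: for p ≥ 1, the p-th power of a metric d satisfies only the relaxed triangle inequality d^p(a, c) ≤ 2^(p−1) (d^p(a, b) + d^p(b, c)). The sketch below checks this numerically for points on a line; it illustrates the general phenomenon and is not part of the analysis of Xu et al.

```python
from itertools import permutations


def powered(a, b, p):
    """p-th power of the distance between two points on the real line."""
    return abs(a - b) ** p


def triangle_holds(points, p, factor, tol=1e-9):
    """Check d^p(a,c) <= factor * (d^p(a,b) + d^p(b,c)) on all triples."""
    return all(
        powered(a, c, p) <= factor * (powered(a, b, p) + powered(b, c, p)) + tol
        for a, b, c in permutations(points, 3)
    )


points = [0.0, 1.0, 2.0, 3.5]

# p = 1: the ordinary triangle inequality holds.
print(triangle_holds(points, p=1, factor=1))
# p = 2: it fails with factor 1 (e.g. 0, 1, 2 gives 4 > 1 + 1) ...
print(triangle_holds(points, p=2, factor=1))
# ... but holds with the relaxed factor 2^(p-1) = 2.
print(triangle_holds(points, p=2, factor=2 ** (2 - 1)))
```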
3. The Hard Capacitated Facility Location Problem
Hard capacity constraints have been extensively studied in the context of the vertex cover problem, as evidenced by [
107,
108,
109,
110,
111]. Similarly, they are widely investigated in the facility location problem, particularly in the hard capacitated facility location (HCFL) problem, which is a special case of the Uni-FL problem.
In the HCFL problem, each facility
is associated with an open cost
and a hard capacity
. An open facility can serve up to
units of demand, and each facility
can be opened at most
times. Consequently, when the cost function is defined as Equation (
11), the Uni-FL problem is exactly the HCFL problem.
where
x is the total service of facility
i. Based on the unsplittable Uni-FL problem, we introduce variable
to indicate the number of times facility
i is opened. The integer linear program of the unsplittable HCFL problem can be obtained as follows.
Typically, it is assumed that
.
One should note that, in the unsplittable case, merely determining whether the HCFL problem has a feasible solution is already NP-complete, by a straightforward reduction from the bin-packing problem. To analyze the unsplittable HCFL problem, the concept of a bi-criteria approximation factor has therefore been introduced.
Definition 8. An (α, β)-bi-criteria approximation algorithm produces a solution of cost at most α times the optimum, while violating the capacities by no more than a factor of β.
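The need for capacity violation in the unsplittable case can be seen on a tiny instance. The following sketch assumes every facility is opened once, ignores costs in the feasibility test, and uses exhaustive backtracking (exponential in general, consistent with the hardness noted above). The function names and the numbers are hypothetical.

```python
def unsplittable_feasible(demands, capacities, beta=1.0, tol=1e-9):
    """Can each demand be assigned whole to one facility if every
    capacity may be exceeded by a factor of at most beta?"""
    limits = [beta * u for u in capacities]
    loads = [0.0] * len(capacities)
    order = sorted(demands, reverse=True)   # place large demands first

    def place(idx):
        if idx == len(order):
            return True
        for i in range(len(limits)):
            if loads[i] + order[idx] <= limits[i] + tol:
                loads[i] += order[idx]
                if place(idx + 1):
                    return True
                loads[i] -= order[idx]      # undo and try the next facility
        return False

    return place(0)


def is_bicriteria(cost, loads, capacities, opt_cost, alpha, beta, tol=1e-9):
    """Check the two conditions of an (alpha, beta)-bi-criteria guarantee
    for one given solution and a known optimal cost."""
    within_cost = cost <= alpha * opt_cost + tol
    within_caps = all(l <= beta * u + tol for l, u in zip(loads, capacities))
    return within_cost and within_caps


demands, capacities = [2, 2, 2], [3, 3]
print(unsplittable_feasible(demands, capacities, beta=1.0))
print(unsplittable_feasible(demands, capacities, beta=1.5))
```

Here the total demand equals the total capacity, so a splittable assignment exists, yet no unsplittable one does unless the capacities are relaxed.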
For the metric unsplittable HCFL problem in the general case, Shmoys et al. [
27] were the first to consider the bi-criteria approximation and proposed a
-factor approximation algorithm based on filtering and rounding techniques.
For the metric unsplittable HCFL problem with uniform capacities, Bateni and Hajiaghayi [
112] developed a bi-criteria
-approximation algorithm for the general metric and a
-approximation algorithm for the tree metric. They also demonstrated that, in the non-metric space, a capacity violation of at least a factor of 1.5 is necessary to achieve any bounded approximation guarantee. Behsaz et al. [
113] introduced a framework for bi-criteria approximation algorithms and developed two new approximation algorithms with factors
and
. Additionally, they presented a quasi-polynomial-time
-approximation algorithm for the Euclidean metric.
In the splittable case, let variable
denote the proportion of client
j’s demand that is served by facility
i, which is in
. Consequently, in the splittable case, we only need to replace the third constraint with the following constraint in the
program.
For the non-metric splittable HCFL problem, Bar-Ilan et al. [
114] proved that there is a constant
such that the problem cannot be approximated within a ratio of
unless
.
For the metric splittable HCFL problem in the general case, Pál et al. [
115] proposed the first constant-factor approximation algorithm using the local search technique, achieving a ratio of
. They further refined this approach using the scaling technique, improving the approximation ratio to
. Subsequently, Mahdian and Pál [
91] enhanced the approximation ratio to
, also leveraging the local search technique. Later, Zhang et al. [
99] further improved the approximation ratio to
based on the local search technique. Bansal et al. [
116] achieved a 5-approximation algorithm based on a local search algorithm, which is currently the best approximation ratio. All of the aforementioned algorithms rely on the local search technique, as the hard capacity constraints present significant challenges for linear program-based methods (such as the LP-rounding and primal–dual techniques). Aardal et al. [
117] conducted a comprehensive study of valid inequalities for the HCFL problem and proposed generalizations of them, but the strength of the resulting formulations remained an open question. Kolliopoulos and Moysoglou [
118] demonstrated that many such formulations are insufficient to achieve a constant integrality gap. They further proved that applying the Sherali–Adams hierarchy to the standard LP formulation does not reduce the integrality gap. To address these limitations, An et al. [
119] proposed a linear programming relaxation with a constant integrality gap for the HCFL problem. By leveraging multicommodity flows and matching theory, they developed a 288-approximation algorithm using LP rounding, without violating the capacity constraints. Kao [
120] refined this approach, presenting an LP-rounding-based approximation algorithm with a ratio of
.
For the metric splittable HCFL problem with uniform capacities, Korupolu et al. [
35] introduced the first constant-factor
-approximation using the local search technique. Chudak and Williamson [
121] later refined this technique, improving the approximation ratio to
. Charikar and Guha [
31] applied the scaling technique, achieving an approximation ratio of
. Aggarwal et al. [
122] further improved this factor to
based on the local search technique, which is currently the best approximation ratio. While most approximation algorithms rely on local search, Grover et al. [
123] proposed an LP-rounding-based algorithm that achieves a constant-factor
-approximation based on a natural LP relaxation, while violating the capacities by a factor of
.
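Since most of the results above rest on local search, a minimal sketch of the idea may be helpful. It uses only the simple add, drop, and swap moves and accepts any strictly improving move; it is not the algorithm of any cited paper and carries no approximation guarantee as written. Assumptions: each facility is opened at most once, demands and capacities are integers, and `c[j][i]` is the unit connection cost of client j to facility i. For a fixed open set, the cheapest splittable assignment is a transportation problem, solved here by successive shortest paths.

```python
def assignment_cost(open_facs, demands, caps, c):
    """Cheapest splittable assignment to the open facilities, or None
    if their total capacity is smaller than the total demand."""
    facs = sorted(open_facs)
    total = sum(demands)
    if sum(caps[i] for i in facs) < total:
        return None
    n_c = len(demands)
    src, snk = 0, n_c + len(facs) + 1
    graph = [[] for _ in range(snk + 1)]    # node -> list of edge ids
    edges = []                               # [head, residual capacity, cost]

    def add_edge(u, v, cap, cost):
        graph[u].append(len(edges))
        edges.append([v, cap, cost])
        graph[v].append(len(edges))          # reverse edge has id ^ 1
        edges.append([u, 0, -cost])

    for j, d in enumerate(demands):
        add_edge(src, 1 + j, d, 0)
        for pos, i in enumerate(facs):
            add_edge(1 + j, 1 + n_c + pos, d, c[j][i])
    for pos, i in enumerate(facs):
        add_edge(1 + n_c + pos, snk, caps[i], 0)

    flow, cost = 0, 0
    while flow < total:
        dist = [float("inf")] * (snk + 1)
        prev = [None] * (snk + 1)
        dist[src] = 0
        for _ in range(snk):                 # Bellman-Ford on the residual graph
            changed = False
            for u in range(snk + 1):
                if dist[u] == float("inf"):
                    continue
                for e in graph[u]:
                    v, cap, w = edges[e]
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v], prev[v], changed = dist[u] + w, e, True
            if not changed:
                break
        push, v = total - flow, snk
        while v != src:                      # bottleneck along the path
            push = min(push, edges[prev[v]][1])
            v = edges[prev[v] ^ 1][0]
        v = snk
        while v != src:                      # augment along the path
            edges[prev[v]][1] -= push
            edges[prev[v] ^ 1][1] += push
            v = edges[prev[v] ^ 1][0]
        flow += push
        cost += push * dist[snk]
    return cost


def total_cost(open_facs, demands, caps, f, c):
    """Opening cost plus cheapest connection cost; None if infeasible."""
    conn = assignment_cost(open_facs, demands, caps, c)
    return None if conn is None else conn + sum(f[i] for i in open_facs)


def local_search(demands, caps, f, c):
    """Start with all facilities open; apply improving add/drop/swap moves."""
    everything = set(range(len(caps)))
    current = set(everything)
    best = total_cost(current, demands, caps, f, c)
    if best is None:
        raise ValueError("total capacity is smaller than total demand")
    improved = True
    while improved:
        improved = False
        closed = everything - current
        moves = ([current | {i} for i in closed]
                 + [current - {i} for i in current]
                 + [(current - {i}) | {k} for i in current for k in closed])
        for candidate in moves:
            value = total_cost(candidate, demands, caps, f, c)
            if value is not None and value < best:
                current, best, improved = candidate, value, True
                break
    return current, best


# Hypothetical instance: 4 clients, 3 facilities.
demands = [2, 1, 2, 1]
caps = [3, 3, 6]
f = [4, 4, 9]
c = [[1, 5, 3], [1, 5, 3], [5, 1, 3], [5, 1, 3]]
open_set, cost = local_search(demands, caps, f, c)
print(sorted(open_set), cost)
```

The analyses in the cited works rely on carefully bounded move sets and on accepting only moves that improve the cost by a sufficient margin, which this sketch omits for brevity.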
For the metric splittable HCFL problem with uniform open costs, Levi et al. [
92] proposed a 5-approximation algorithm using the LP-rounding technique. Aardal et al. [
124] later developed a combinatorial approximation algorithm with a ratio of
. Subsequently, Kao [
125] introduced a two-stage LP-rounding algorithm, further refining the approximation ratio to 4. The current best approximation ratio is
, obtained by Dabas et al. [
126] using the local search technique. Moreover, Miao and Yuan [
127] extended the result of Levi et al. [
92] and proposed a
-approximation algorithm, where
. When the open cost is uniform, their algorithm achieves a ratio of 5, matching the result of Levi et al. [
92].
For ease of understanding, we organize the approximation ratios of the algorithms for the metric HCFL problem and the methods used in
Table 7.
Open problem 6. Design an LP-rounding-based algorithm for the metric splittable HCFL problem in the general case with an approximation ratio no more than 9.093.
3.1. The Prize-Collecting HCFL Problem
The prize-collecting HCFL (PC-HCFL) problem is a variant of the HCFL problem. The prize-collecting Uni-FL problem is exactly the prize-collecting HCFL problem when the cost function is expressed as Equation (
11). By introducing variable
to indicate the proportion of client
j’s demand that is served by facility
i, variable
to indicate whether facility
is open or not, and variable
to indicate whether client set
is penalized or not, we can obtain the mixed-integer linear program of the splittable prize-collecting HCFL problem.
When the penalty function
is submodular, few results are available in the literature. When
is modular, the problem is referred to as the HCFL problem with linear penalties (HCFLPLP). For the general case, Gupta and Gupta [
128] proposed a
-approximation using the local search technique.
For the uniform capacities case, Gupta and Gupta [
128] developed a
-approximation algorithm. Bansal et al. [
129] provided a
-approximation algorithm based on the local search technique. Dabas and Gupta [
130] utilized the LP-rounding technique to design a polynomial-time algorithm achieving a constant factor of
, while violating the capacities by a factor of at most
. Moreover, Dabas and Gupta [
130] investigated a more general model, the uniform capacitated
k-facility location problem with penalties (UC
k-FLPP), and proposed two constant-approximation algorithms based on the LP-rounding technique. The first algorithm achieves an
-approximation, violating the capacities by a factor of at most
and cardinality by a factor of at most 2. The second algorithm achieves an
-approximation, violating the capacities by a factor of at most
.
For the uniform open costs case, Lv and Wu [
131] proposed a 5.732-approximation algorithm based on the LP-rounding technique. Due to the challenges associated with linear penalties under hard capacity constraints, Miao et al. [
132] introduced the hard capacitated uniform facility location problem with soft penalties (HCUFLPSP), where the demand of each client can be fractionally rejected by incurring a penalty cost. They proposed a 5.5122-approximation algorithm using the LP-rounding technique.
For ease of understanding, we organize the approximation ratios of the algorithms for the metric splittable prize-collecting HCFL problem (and its variants) and the methods used in
Table 8.
Open problem 7. Design a constant approximation algorithm for the metric splittable prize-collecting HCFL problem with submodular penalty function.
3.2. The Robust HCFL Problem
The robust HCFL (R-HCFL) problem represents another variant of the HCFL problem with a robust set, and extends the HCFL problem by allowing a subset of clients to remain unconnected without incurring any penalty cost, provided that the total number of unconnected clients does not exceed a given threshold
. In addition, the robust Uni-FL problem is exactly the robust HCFL problem when the cost function is expressed as Equation (
11). By introducing variable
to indicate the proportion of client
j’s demand that is served by facility
i, variable
to indicate whether facility
i is opened or not, and variable
to indicate whether client
j is an outlier or not, we can obtain the mixed-integer linear program of the splittable robust HCFL problem.
Few results are available in the literature for the general case. For the uniform capacities case, Dabas and Gupta [
130] utilized the LP-rounding framework to devise a constant-factor
-approximation algorithm. This algorithm ensures that the maximum capacity violation is bounded by a factor of
while incurring an outlier loss factor of
.
For the uniform open costs case, Dabas et al. [
126] proposed the first constant-factor
-approximation algorithm, using a local search technique that requires only two operations.
For ease of understanding, we organize the approximation ratios of the algorithms for the metric splittable robust HCFL problem and the methods used in
Table 9.
Open problem 8. Design a constant approximation algorithm for the metric splittable robust HCFL problem in the general case.
3.3. The k-HCFL Problem
The
k-HCFL problem, referred to as the hard capacitated
k-facility location problem, is an important variant of the HCFL problem. The
k-Uni-FL problem is exactly the
k-HCFL problem when the cost function is expressed as Equation (
11). By introducing variable
to indicate the proportion of client
j’s demand that is served by facility
i, and variable
to indicate whether facility
i is opened or not, we can obtain the mixed-integer linear program of the splittable
k-HCFL problem as follows.
For the general case of the
k-HCFL problem, Mišković and Stanimirović [
133] proposed an efficient heuristic algorithm based on the variable domain search to solve this problem. Jiang et al. [
134] proposed a
-approximation algorithm based on the primal–dual technique, violating the capacity constraints by no more than a factor of 25, where
.
For the uniform capacities case, Byrka et al. [
135] developed an
-approximation algorithm using the dependent rounding technique, which violates the capacity constraints by a factor of at most
. Grover et al. [
123] proposed a constant-factor
-approximation algorithm based on LP rounding, violating the capacity by a factor of at most
while using at most
facilities. Han et al. [
136] introduced an approximation algorithm with a ratio of
based on the local search technique, using at most
facilities, where
and
. Kong and Zhang [
73] proposed a two-step sampling-based algorithm, demonstrating that the
k-HCFL problem can be approximated by a factor of
in fixed-parameter tractable time.
For the uniform open costs case, Aardal et al. [
124] established that any
-approximation algorithm for the uncapacitated
k-median problem ([
34,
137]) can be extended to achieve a
-approximation algorithm for the
k-HCFL problem with uniform open costs, using at most
facilities. They further obtained a
-approximation guarantee, using at most
facilities.
For the case with both uniform open costs and uniform capacities, Aardal et al. [
124] achieved a
-approximation guarantee, using at most
facilities.
For ease of understanding, we organize the approximation ratios of the algorithms for the metric splittable
k-HCFL problem and the methods used in
Table 10.
Open problem 9. Design a constant approximation algorithm for the splittable k-HCFL problem in the general case.
3.4. The k-Level HCFL Problem
The
k-level HCFL (
kL-HCFL) problem, also referred to as the multi-level HCFL problem, is a variant of the HCFL problem, where
k denotes the number of facility levels. The
k-level Uni-FL problem is exactly the
k-level HCFL problem when the cost function is expressed as Equation (
11). By introducing variable
to represent the proportion of the client
j’s demand served by the facilities in the path
p, and variable
to indicate whether facility
i is opened or not, we can obtain the mixed-integer linear program of the splittable
k-level HCFL problem as follows.
For the splittable
k-level HCFL problem, Chen and Wang [
138] developed a heuristic, the dynamic Vogel approximation method, which is a cost-varied variant of the Vogel approximation method, to solve this problem. Du et al. [
139] proposed the first combinatorial approximation algorithm with a ratio of
.
For ease of understanding, we organize the approximation ratios of the algorithms for the splittable
k-level HCFL problem and the methods used in
Table 11.
Open problem 10. Design a constant approximation algorithm for the splittable k-level HCFL problem or prove that such an algorithm does not exist.
3.5. Other Variants of the HCFL Problem
Some other variants of the HCFL problem have also been investigated in the literature. Ageev et al. [
140] studied the network HCFL problem, in which clients and facilities are located at the vertices of a given edge-weighted transportation network graph. Based on the dynamic programming algorithm proposed by Ageev et al. [
141] for the UFL problem on path graphs, they improved the time complexity to
, where
,
.
For the online hard capacitated facility location problem, Cygan et al. [
142] demonstrated that any online algorithm for the UFL problem with arbitrary (i.e., non-uniform) capacities or arbitrary (i.e., non-uniform) open costs has an unbounded worst-case competitive ratio. Consequently, they focused on the case with uniform capacities and uniform open costs, achieving an optimal competitive ratio of
with no deletions, where
n is the length of the sequence. Additionally, they proposed an online algorithm with a competitive ratio of
in the fully dynamic model with deletions, under the uniform capacities assumption, where
m is the number of points in the input metric and
u is the capacity of each facility.