Article

A Robust Human–Machine Framework for Project Portfolio Selection

Hang Chen, Nannan Zhang, Yajie Dou and Yulong Dai
1 College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
2 Finance and Economics Pearl River College, Tianjin University, Tianjin 300345, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(19), 3025; https://doi.org/10.3390/math12193025
Submission received: 21 August 2024 / Revised: 24 September 2024 / Accepted: 26 September 2024 / Published: 27 September 2024

Abstract

In the project portfolio selection and scheduling problem (PPSS), developing a systematic and scientific project scheduling plan necessitates comprehensive consideration of individual preferences and multiple realistic constraints, rendering the problem NP-hard. At the same time, accurately and swiftly evaluating the value of projects as complex entities is a challenging issue that requires urgent attention. This paper introduces a novel project value assessment process based on qualitative evaluation, which significantly reduces the cost and complexity of project value assessment, and on this basis presents a preference-based deep reinforcement learning method for computing project subsets and time scheduling plans. The paper first determines the key parameter values of the algorithm through specific examples. Then, using the method of controlled variables, it explores the sensitivity of the algorithm to changes in problem size and dimensionality. Finally, the proposed algorithm is compared with two classical algorithms and two heuristic algorithms across different instances. The experimental results demonstrate that the proposed algorithm exhibits higher effectiveness and accuracy.

1. Introduction

Projects serve as fundamental drivers of enterprise development, technological innovation, and national construction. Positioned as a key component of China’s economic diplomacy and further opening up, the Belt and Road initiative encompasses five categories of projects: infrastructure, industry, industrial parks, cultural exchange, and international border cooperation. There are currently 3164 planned and ongoing projects with a combined value exceeding USD 4 trillion [1]. Amidst the US–China trade war, the semiconductor industry has emerged as a focal point in the competition for dominance between both sides. In 2020 alone, China saw nearly 500 semiconductor projects with an investment totaling close to CNY 600 billion. It is evident that projects play a pivotal role in various economic endeavors [2].
The project portfolio selection and scheduling problem (PPSS), namely the allocation of investments across multiple projects and the rational arrangement of project construction timelines, is a critical concern for decision-makers. One challenge in PPSS problems lies in the difficulty of accurately assessing the value of projects. Unlike traditional financial investments such as stocks, projects necessitate a comprehensive evaluation encompassing various dimensions beyond monetary considerations [3]. Indeed, the value of most projects is abstract and resistant to quantification, such as the enhancement of residents’ quality of life resulting from infrastructure initiatives or the long-term benefits accruing to companies from research endeavors. Presently, project valuation predominantly relies on expert scoring methods, which can introduce several practical issues: firstly, subjective expert assessments may compromise objectivity and reliability; secondly, when confronted with large-scale problems, a limited number of experts may struggle to conduct quantitative evaluations across multiple projects.
Currently, there are two primary enhancement measures for the expert scoring method. The first involves aggregating and integrating the results of multiple rounds and stages [4,5,6] of decision-making by numerous experts [7,8,9] into a unified conclusion, with the aim of enhancing objectivity and accuracy through collective wisdom. The second entails utilizing different data structures to represent uncertainty in evaluation results, such as triangular fuzzy numbers [10] and interval fuzzy numbers [11,12]. While these methods have partially addressed the challenge of quantifying project value, they still fall within the realm of quantitative evaluation, and their effectiveness remains contingent upon accurate assessment by a limited number of experts.
In this study, a robust human–machine framework is proposed for PPSS, in which evaluation and optimization are performed through human–machine collaboration rather than separately, as in the general PPSS process. On this basis, a value evaluation method based on qualitative evaluation is proposed, which reduces the reliance on expert knowledge and allows more stakeholders to participate in the evaluation process. In this way, the evaluation result is no longer an exact matrix but an arbitrary point in a convex polytope, enabling comparisons between portfolios once evaluation criteria are given. Finally, a deep preference-based Q network (DPbQN) algorithm is proposed to search for the optimal portfolio investment and scheduling scheme.

2. Literature Review

This section reviews the literature on PPSS, focusing on methodological advances in PPSS in recent years.
Since Markowitz [13] first studied the portfolio selection problem in 1952, extensive research has been conducted on this issue. Current research can be categorized into exact methods, heuristic methods, and deep learning methods, based on the models and approaches used to address the problem.

2.1. Exact Methods

Exact methods encompass mathematical programming models that express the problem through explicit formulas, articulating the optimization objective of PPSS and capturing the constraint relationships among decision variables with clarity and concision.
PPSS is generally modeled as a 0-1 integer programming problem [10], to which constraints are added according to the problem background, for example, whether the data contain ambiguities [15] or multiple types of resources are involved [16]; an appropriate solution method is then selected [14].
Albano et al. [17] proposed a mixed-integer nonlinear optimization model incorporating four key project management indicators: value maximization, strategic alignment, balance, and future planning. Zolfaghari et al. [12] considered resource management, cash flow, delay costs, and robustness for multi-project environments and presented a novel linear structured mixed-integer programming model. Furthermore, to better depict real-world project scenarios, a new solution method based on triangular interval-valued fuzzy random variables was suggested to integrate both fuzzy and stochastic uncertainties into the mathematical programming model. Zhang et al. [18] introduced a distributed two-stage mixed-attribute decision-making and integer programming model. In the first stage, a mathematical programming-based multi-attribute group decision-making approach was utilized to assess the non-financial value of projects under incomplete preference information; a parameter-adjustable compensatory weighted averaging operator was employed to capture experts’ subjective attitude characteristics in the aggregation process. In the second stage, a bi-objective integer programming model was formulated to optimize project portfolios considering both financial and non-financial values under constraints.

2.2. Heuristic Methods

However, as the problem scale expands, the complexity of the problem increases rapidly, and mathematical programming models usually become difficult to solve. In many cases, a satisfactory solution rather than the optimal solution is sought. Heuristic algorithms have therefore been more frequently researched and applied.
For example, Fisher et al. [19] proposed an approximate dynamic programming heuristic to provide options for selecting cost-effective, feasible projects for the future development of the Royal Canadian Navy in 2015. In the same year, Eshlaghy et al. [20] proposed a method combining k-means, a genetic algorithm, and grey theory to address the uncertainty problem in PPS. Manish et al. [21] studied the relationship between the complexity and the problem size of PPS using the tabu search algorithm in 2019.

2.3. Deep Learning Methods

Effective application of deep learning methods in the domain of combinatorial optimization started in 2015. Vinyals et al. [22] modeled combinatorial optimization as a sequence-to-sequence mapping, proposed the pointer network model, and solved the traveling salesman problem (TSP). In 2018, Nazari et al. [23] used a modified pointer network to solve vehicle routing problems of size 100, outperforming multiple heuristics in solution quality while significantly reducing training time. In 2021, Li et al. [24] extended pointer networks to multi-objective combinatorial optimization problems by combining decomposition strategies and parameter transfer methods.
The pointer network model is a supervised learning algorithm that requires a large amount of labeled data for training, and its optimization performance depends on the quality of the training data. In practical applications, collecting large amounts of high-quality labeled data is extremely difficult. Therefore, deep reinforcement learning (DRL) methods, which do not require labeled data, are used for combinatorial optimization problems. Bello et al. [25] used reinforcement learning to train the pointer network model. They treat each problem instance as a training sample, use the objective function of the problem as a feedback signal, train with a DRL algorithm, and introduce a critic network as a baseline to reduce training variance. Since then, there have been many studies combining neural networks with reinforcement learning to solve combinatorial optimization problems, such as pointer networks combined with actor–critic to solve the CVRP and the job-shop scheduling problem [26]; GNNs combined with reinforcement learning to solve the graph coloring problem [27]; graph attention combined with PPO [28]; and Transformer attention combined with reinforcement learning [29] to solve the CVRP.
In terms of applications, DRL has been used for combinatorial optimization problems in various domains, such as network communications, where it solves resource allocation [30,31], topology and routing optimization [32], and computational offloading problems [33]. In the transportation domain, the study in [34] leverages DRL to enable fast online generation of intercity distribution routes. In the field of high-performance computing, several scholars have investigated how to place different computational functions of neural network models on CPUs or GPUs and the impact on training speed [35]. Most current applied research targets online optimization settings, which are difficult for traditional algorithms. DRL-based optimization algorithms have the advantages of fast solution speeds and strong generalization capabilities, and they enjoy a wide range of applications given the large number of combinatorial optimization problems in industry, manufacturing, transportation, and other fields.
Although DRL has emerged only in recent years as a method for solving combinatorial optimization problems, its superior performance and broad application prospects have spurred a large number of related studies, greatly enriching its theoretical foundations and application space. However, PPSS, a typical combinatorial optimization problem, has not yet been solved by DRL.

3. Problem Description

Before presenting the mathematical model of PPSS, the terms used in this paper are summarized in the Abbreviations section.
PPSS is defined as obtaining a scheduling plan that selects appropriate projects from a given set of alternatives and schedules their start times. The scheduling plan must satisfy several hard constraints while extracting the maximum value.
A project is abstractly represented as an entity $p_i$ with multiple attributes, including construction duration $d_i$, construction cost $c_i$, and values $v_{i1}, v_{i2}, \ldots, v_{iJ}$ corresponding to the decision-maker (DM)'s requirements.
For the general case, we need to choose one or more projects from a group of $I$ candidate projects and then schedule the start time of each project within a time window of length $T$ while taking into account a number of constraints. Consequently, the decision variables can be defined as
$$x_{it} = \begin{cases} 1, & \text{if } p_i \text{ starts at time } t \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
where $i = 1, 2, \ldots, I$ and $t = 1, 2, \ldots, T$.
In this situation, any solution can be denoted as a matrix $x \in \{0,1\}^{I \times T}$. As shown in Figure 1, in the example, we decide to start $p_1$ in year 3, start $p_2$ in year 1, and not invest in $p_3$.
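To make the encoding concrete, the following minimal sketch builds the solution matrix for this example (the window length $T = 5$ is an assumed value for illustration, since the exact horizon in Figure 1 is not reproduced here):

```python
import numpy as np

# Decision matrix x[i, t]: 1 iff project p_(i+1) starts at time t+1 (0-indexed arrays).
I, T = 3, 5  # three candidate projects; a five-year window is assumed for illustration
x = np.zeros((I, T), dtype=int)
x[0, 2] = 1  # start p_1 in year 3
x[1, 0] = 1  # start p_2 in year 1
# p_3 is not selected, so its row stays all zeros.

assert (x.sum(axis=1) <= 1).all()  # uniqueness constraint (3): each project starts at most once
print(x)
```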
The optimization goal of the PPSS is to maximize the value, so the objective function can be expressed as
$$\max f_j = \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} x_{it} \right), \quad j = 1, 2, \ldots, J \tag{2}$$
and is subject to the following:
(i).
Uniqueness constraint: Each project can be executed at most once.
$$\sum_{t=1}^{T} x_{it} \le 1, \quad i = 1, 2, \ldots, I \tag{3}$$
(ii).
Cost constraint: The total cost of each year cannot exceed the budget.
$$\sum_{i=1}^{I} x_{it} c_i \le B_t, \quad t = 1, 2, \ldots, T \tag{4}$$
(iii).
Cooperation constraint: The benefits of some projects are contingent on the completion of one another, without a strict sequence of construction times. Suppose that $H(p_i)$ is the cooperation project subset of $p_i$; then the cooperation constraint can be expressed as
$$\sum_{t=1}^{T} x_{it} = \sum_{t=1}^{T} x_{jt}, \quad i = 1, 2, \ldots, I; \; j \in H(p_i) \tag{5}$$
(iv).
Precedence constraint: In some cases, the construction of one project needs to be based on the completion of other projects. Suppose that $p_i$ must be executed subsequent to all projects in subset $\Psi(p_i)$; then the precedence constraint can be expressed as
$$\sum_{t=1}^{T} t\,x_{it} \ge \sum_{t=1}^{T} (t + d_j)\,x_{jt}, \qquad \sum_{t=1}^{T} x_{it} \le \sum_{t=1}^{T} x_{jt}, \quad i = 1, 2, \ldots, I; \; j \in \Psi(p_i) \tag{6}$$
(v).
Exclusive constraint: Some projects are mutually exclusive and cannot coexist. Suppose that $p_i$ cannot coexist with any project in subset $\Phi(p_i)$; then this can be expressed in mathematical form as
$$\left( \sum_{t=1}^{T} x_{it} \right) \left( \sum_{j \in \Phi(p_i)} \sum_{t=1}^{T} x_{jt} \right) = 0, \quad i = 1, 2, \ldots, I \tag{7}$$
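As a concrete illustration of the constraint system, the following sketch checks constraints (3)–(7) for a candidate solution. It assumes, as Equation (4) is written, that a project's full cost $c_i$ is charged in its start year; the dictionaries `H`, `Psi`, and `Phi` standing in for $H(p_i)$, $\Psi(p_i)$, and $\Phi(p_i)$ are our own encoding:

```python
import numpy as np

def is_feasible(x, c, B, d, H, Psi, Phi):
    """Check constraints (3)-(7) for a candidate solution x in {0,1}^(I x T).

    c: project costs, B: yearly budgets, d: project durations;
    H / Psi / Phi: dicts mapping project index i to its cooperation /
    predecessor / exclusion index sets (missing keys mean empty sets).
    """
    I, T = x.shape
    selected = x.sum(axis=1)                      # 1 if project i is executed
    start = np.argmax(x, axis=1) + 1              # start year (meaningful only if selected)

    if (selected > 1).any():                      # (3) uniqueness
        return False
    if ((x * c[:, None]).sum(axis=0) > B).any():  # (4) yearly budget
        return False
    for i in range(I):
        for j in H.get(i, []):                    # (5) cooperation: selected together
            if selected[i] != selected[j]:
                return False
        for j in Psi.get(i, []):                  # (6) precedence: j finishes before i starts
            if selected[i] and (not selected[j] or start[i] < start[j] + d[j]):
                return False
        for j in Phi.get(i, []):                  # (7) exclusion: never both selected
            if selected[i] and selected[j]:
                return False
    return True
```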

4. Method

4.1. Framework

Figure 2 shows the general PPSS process. Subordinate organizations submit projects to the DM, and all submitted projects are aggregated to form an alternative set of projects. The DM quantifies the value of the projects through expert evaluation. Finally, the quantified project set is fed into an optimization algorithm, which produces the portfolio result.
Our proposed human–machine framework is shown in Figure 3. The framework primarily consists of two processes. The first process involves machine-assisted project evaluation by humans, as illustrated by the black lines. First, the project set is evaluated by humans, who are not only experienced experts but also DMs, stakeholders, etc., because the evaluation takes the form of a simple linear qualitative evaluation instead of an exact number. Next, the qualitative evaluation is translated into a number of linear equality or inequality constraints whose intersection determines a convex polytope as the feasible space for the value matrix. Correspondingly, a robust criterion is used to compare dominance relationships between two projects or between two solutions. In this process, the machine's assistance is reflected in two key aspects: first, helping to eliminate contradictory evaluations, and second, distributing evaluation tasks among humans. The second process, indicated by the blue lines, involves the development of a preference-based deep reinforcement learning model to search for the optimal scheduling plan. During the iterative computation, human preferences play a crucial role, guiding the optimization direction of the model.
This study focuses on the following three points:
  • How to model unstructured qualitative evaluation mathematically.
  • How to construct comparison criteria.
  • How to construct the model to search for the optimal solution.
The next three subsections will elaborate on the above three points, respectively.

4.2. Project Value Evaluation Method Based on Qualitative Evaluation

If each value of each project can be accurately and quantitatively evaluated, that is, each $v_{ij}$ has a definite value, then the project set can be represented as follows:
$$V = [V_1, V_2, \ldots, V_J] = \begin{bmatrix} v_{11} & \cdots & v_{1J} \\ \vdots & \ddots & \vdots \\ v_{I1} & \cdots & v_{IJ} \end{bmatrix} \tag{8}$$
In this case, $V$ is a vector in the vector space $\mathbb{R}^{I \times J}$, that is, an exact point. However, as mentioned above, accurate quantitative evaluation is often very difficult, and qualitative evaluation is a more feasible approach.
Qualitative evaluation in this paper is a rough evaluation of the value of a certain aspect of a project, or a rough comparison of the values of multiple projects. Linear constraints are used as the basic elements to describe these qualitative evaluations, each of which can be expressed as one or more linear equations or inequalities.
To illustrate, consider the simplest case where a firm plans to invest in two alternative projects, $p_1$ and $p_2$, and considers only the economic benefits. Several stakeholders gave their own evaluations of $p_1$ and $p_2$:
  • The economic benefits of $p_1$ and $p_2$ are not much different.
  • Investing in $p_1$ and $p_2$ at the same time cannot achieve the maximum economic benefit.
As shown in Figure 4, the above two evaluations can be converted into three linear inequality constraints, together with the lower bound 0 and upper bound 1 on economic benefit. The three colored lines in the figure represent the three constraints:
$$v_{11} - v_{21} \le 0.2, \qquad v_{21} - v_{11} \le 0.2, \qquad v_{11} + v_{21} < 1 \tag{9}$$
The shaded part in Figure 4 is the feasible domain under the three linear inequality constraints, and the value of $(v_{11}, v_{21})$ can be represented by any point in it, such as $(0.3, 0.4)$, $(0.5, 0.38)$, etc.
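As a sketch, the two stakeholder statements above can be encoded directly as the half-space system $Av \le b$ and used to test candidate value points; the strict constraint $v_{11} + v_{21} < 1$ is relaxed to $\le$ for the numerical check:

```python
import numpy as np

# Half-space form A @ v <= b for v = (v11, v21): the three constraints of (9)
# plus the bounds 0 <= v11, v21 <= 1.
A = np.array([
    [ 1, -1],            # v11 - v21 <= 0.2  ("not much different", one direction)
    [-1,  1],            # v21 - v11 <= 0.2  (the other direction)
    [ 1,  1],            # v11 + v21 <= 1    (strict "<" in the text, relaxed here)
    [-1,  0], [0, -1],   # lower bounds v >= 0
    [ 1,  0], [0,  1],   # upper bounds v <= 1
])
b = np.array([0.2, 0.2, 1.0, 0.0, 0.0, 1.0, 1.0])

def in_polytope(v):
    return bool(np.all(A @ v <= b + 1e-12))

print(in_polytope(np.array([0.3, 0.4])))   # True: a feasible value point
print(in_polytope(np.array([0.8, 0.7])))   # False: violates v11 + v21 <= 1
```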
Assuming that the evaluation of $V_j$ can be transformed into $n_j$ linear equality constraints and $m_j$ linear inequality constraints, the feasible domains of $V_j$ and $V$ can be expressed as
$$\mathcal{V}_j = \{ V_j \mid \alpha_{nj}^{\top} V_j = a_{nj},\; n = 1, 2, \ldots, n_j;\; \beta_{mj}^{\top} V_j \le b_{mj},\; m = 1, 2, \ldots, m_j \} \tag{10}$$
$$\mathcal{V} = \mathcal{V}_1 \times \mathcal{V}_2 \times \cdots \times \mathcal{V}_J \tag{11}$$
From Formulas (10) and (11), it can be seen that $\mathcal{V}_j$ is a polytope formed by the intersection of $n_j$ hyperplanes and $m_j$ half-spaces. Compared with accurate quantitative evaluation, the value matrix changes from a definite vector $V$ in the vector space $\mathbb{R}^{I \times J}$ to an uncertain vector in the convex polytope $\mathcal{V}$, and the problem changes from a deterministic problem in a single scenario to an uncertainty problem across multiple scenarios.

4.3. Robust Evaluation Criteria

Before proceeding with portfolio selection, it is necessary to clarify the criteria for judging the dominance relationship between two solutions. For two solutions $x$ and $y$, $x \succ y$ is defined as
$$\sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} x_{it} \right) \ge \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} y_{it} \right), \quad j = 1, 2, \ldots, J \tag{12}$$
where at least one of the $J$ inequalities is strict. Since the value of $v_{ij}$ in Equation (12) is not determined, the dominance relationship is instead defined as
$$\sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} x_{it} \right) \ge \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} y_{it} \right), \quad \forall V_j \in \mathcal{V}_j \tag{13}$$
Obviously, enumerating all values in the continuous vector space $\mathcal{V}_j$ is not feasible. In fact, however, we only need to compare a finite number of points in $\mathcal{V}_j$ to obtain the result of Equation (13).
Let $D(V_j) = \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} x_{it} \right) - \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} y_{it} \right)$. The function $D(V_j)$ is linear in $V_j$, and hence convex on $\mathcal{V}_j$. Then:
$$\sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} x_{it} \right) \ge \sum_{i=1}^{I} \left( v_{ij} \sum_{t=1}^{T} y_{it} \right), \; \forall V_j \in \mathcal{V}_j \;\Leftrightarrow\; D(V_j) \ge 0, \; \forall V_j \in \mathcal{V}_j \;\Leftrightarrow\; \min_{V_j \in \mathcal{V}_j} D(V_j) \ge 0$$
According to the above formula, if the minimum value of the function $D(V_j)$ over $\mathcal{V}_j$ is no less than 0, then the inequality holds. Since $D(V_j)$ is linear, its minimum over the polytope must be attained at a vertex (extreme point) of its domain. That is:
$$\min_{V_j \in \mathcal{V}_j} D(V_j) \ge 0 \;\Leftrightarrow\; \min_{V_j \in \operatorname{ext}(\mathcal{V}_j)} D(V_j) \ge 0 \tag{14}$$
where $\operatorname{ext}(\mathcal{V}_j)$ is the set of all vertices of $\mathcal{V}_j$. Since $\operatorname{ext}(\mathcal{V}_j)$ is a finite set, we only need to enumerate the values at these points for comparison. If one solution is at least as good as the other at all vertices, the relationship between the two solutions can be judged. Enumerating the vertices of a convex polytope is a well-established classical problem that can be solved exactly and efficiently within a short timeframe [36].
The example given in Figure 4 illustrates the criteria presented in this section. Consider two solutions $x = [1, 0]$ and $y = [0, 1]$. As can be seen from Figure 4, $\mathcal{V}_j$ has five vertices: $(0, 0)$, $(0, 0.2)$, $(0.2, 0)$, $(0.4, 0.6)$, and $(0.6, 0.4)$. Evaluating the two solutions at these five points shows that there is no dominance relationship between $x$ and $y$; that is, $x \sim y$.
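The following sketch reproduces this worked example: it evaluates $D(V_j)$ at the five vertices listed above and confirms that neither solution dominates the other. In practice, vertex enumeration would be delegated to a standard method [36]; here the vertices are hard-coded from Figure 4:

```python
import numpy as np

# Vertices ext(V_j) of the polytope from Figure 4 (a single value dimension, J = 1).
vertices = np.array([[0, 0], [0, 0.2], [0.2, 0], [0.4, 0.6], [0.6, 0.4]])

# Selection indicators sum_t x_it for the two solutions: x picks p_1, y picks p_2.
x_sel = np.array([1, 0])
y_sel = np.array([0, 1])

def dominates(a_sel, b_sel, vertices):
    """a dominates b iff D(V_j) = V_j . a_sel - V_j . b_sel >= 0 at every vertex."""
    D = vertices @ a_sel - vertices @ b_sel
    return bool(np.all(D >= 0))

print(dominates(x_sel, y_sel, vertices))  # False: D = -0.2 at vertex (0, 0.2)
print(dominates(y_sel, x_sel, vertices))  # False as well, so x ~ y (no dominance)
```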

4.4. Portfolio Selection Optimization Based on DPbQN

Our approach to optimization problem solving integrates two machine learning methodologies. The first utilizes a deep Q-network (DQN), employing a deep neural network to fit the Q-value table for state–action pairs, making it suitable for addressing high-dimensional and large-scale combinatorial optimization problems. The second method involves preference learning, specifically tailored to handle the unique characteristics of this study where solutions cannot be quantitatively compared. A key departure from traditional supervised machine learning setups is that training information is typically not provided in the form of scalar target values, but rather as paired comparisons expressing preferences between different objects or labels.

4.4.1. Deep Q Network

Deep Q-Network (DQN) is a form of DRL that demonstrates strong performance in the realm of optimization. Fundamentally, it leverages deep neural networks to model Q values and address the scalability issue inherent in high-dimensional problems.
A deep neural network (DNN) with multiple layers, shown in Figure 5 and Table 1, is used in this study; it fits a function $Q: S \times A \to \mathbb{R}$. The initial layer serves as the input layer, which is responsible for transforming the two-dimensional state matrix into a one-dimensional vector. Because the scale of the problem is not fixed, a generous number of nodes is allocated to this input layer; when the problem's scale is smaller than the number of nodes, the redundant nodes pass null values. The middle layers are several hidden layers, in which each node adopts the ReLU activation function, which improves training efficiency. The hidden layers are dense layers; although dense layers require more parameters and computation than other layer types, they offer higher accuracy and reliability, which is acceptable for the small network structure used in this paper. The final output layer uses a linear activation function with dimension $I$, generating a Q value for each action. Similar to the input layer, a fixed, generous number of nodes is employed in the output layer, and redundant nodes produce null values.
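A minimal PyTorch sketch of the network in Table 1; the fixed input width of 128 follows the table, while treating the "null values" of redundant nodes as zero-padding and capping the action dimension at 32 are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Q-network following Table 1: Flatten -> 128 -> 64 -> 32 -> I_max."""

    def __init__(self, in_dim: int = 128, i_max: int = 32):
        super().__init__()
        self.in_dim = in_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, i_max),              # linear output: one Q value per action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        flat = state.flatten(start_dim=1)      # (batch, I*T): state matrix -> vector
        # Zero-pad smaller instances up to the fixed input width (the "null" nodes).
        flat = nn.functional.pad(flat, (0, self.in_dim - flat.shape[1]))
        return self.net(flat)

q = QNetwork()
state = torch.zeros(1, 3, 5)                   # toy instance with I = 3, T = 5
print(q(state).shape)                          # torch.Size([1, 32]): one Q value per action
```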
As depicted in Figure 6, the state is denoted by a matrix of size $I \times T$, which carries the same meaning as the solution matrix in Section 3. The action set comprises $I$ actions, each representing the alteration of the status of $p_i$ from not initiated to initiated; if the project is already underway, its start time is adjusted to match the current time. As shown in Figure 6, if $a_3$ in the red dashed box is selected, it means that $p_3$ is executed at that instant.

4.4.2. Preference-Based Learning

The dominance relation comparison criteria outlined in the preceding section allow for the comparison of advantages and disadvantages between any two projects or project combinations. However, it is important to note that this method does not provide a specific degree of advantage or disadvantage, and thus lacks numerical feedback. Consequently, traditional reinforcement learning algorithms cannot be directly applied, as quantitative rewards or function values are not attainable following the selection of a superior or inferior combination solution.
Preference-based learning generalizes predictive preference models from empirical data, establishing links between machine learning and research areas related to preference modeling and decision-making. The key difference from the traditional supervised machine learning setup is that the training information is usually not given in the form of scalar target values, but in the form of paired comparisons expressing preferences between different objects or labels [37]. The data label format utilized in this study is a triple $(\sigma_1, \sigma_2, \mu)$ recorded in a database $\mathcal{D}$, where $\sigma = ((s_0, a_0), (s_1, a_1), \ldots, (s_{t-1}, a_{t-1}))$ is a segment, i.e., a sequence of state–action pairs, and $\mu$ is a real number based on the dominance relation between $\sigma_1$ and $\sigma_2$ under the criteria in Section 4.3, as shown in Table 2.
The loss function is defined as follows:
$$\mathrm{loss} = -\sum_{(\sigma_1, \sigma_2, \mu) \in \mathcal{D}} \Big( \mu \log \hat{P}[\sigma_1 \succ \sigma_2] + (1 - \mu) \log \hat{P}[\sigma_2 \succ \sigma_1] \Big) \tag{15}$$
where
$$\hat{P}[\sigma_1 \succ \sigma_2] = \frac{\exp Q(s_t^1, a_t^1)}{\exp Q(s_t^1, a_t^1) + \exp Q(s_t^2, a_t^2)} \tag{16}$$
This follows the Bradley–Terry model [38] for estimating score functions from pairwise preferences, and is a specialization of the Luce–Shepard choice rule [39,40] to preferences over trajectory segments.
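A sketch of the loss (15) in PyTorch under one consistent reading of (16), namely that a segment's score is the sum of $Q(s, a)$ along it; the two-way softmax then reproduces the Bradley–Terry probability:

```python
import torch

def preference_loss(seg1_q, seg2_q, mu):
    """Cross-entropy preference loss over a batch of segment pairs.

    seg1_q, seg2_q: (batch, segment_len) tensors of Q(s_t, a_t) along each segment.
    mu: (batch,) preference labels from Table 2 (1, 0, or 0.5).
    """
    score1 = seg1_q.sum(dim=1)   # segment score: sum of Q over the trajectory segment
    score2 = seg2_q.sum(dim=1)
    # log P_hat[sigma_1 > sigma_2] and log P_hat[sigma_2 > sigma_1], numerically stable
    logp = torch.log_softmax(torch.stack([score1, score2], dim=1), dim=1)
    return -(mu * logp[:, 0] + (1 - mu) * logp[:, 1]).mean()

# Example: one pair where sigma_1 is preferred (mu = 1).
s1 = torch.randn(1, 4, requires_grad=True)
s2 = torch.randn(1, 4)
print(preference_loss(s1, s2, torch.tensor([1.0])))
```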

4.4.3. Complete Algorithm Flow

The details of the algorithm are described in Algorithm 1:
Step 1: Randomly initialize the network parameters $\theta$ and the experience pool $\mathcal{D}$.
Step 2: According to the $\epsilon$-greedy strategy, a random action is selected with probability $\epsilon$, and the action $\arg\max_a Q(s_{jk}, a)$ is selected with probability $1 - \epsilon$; the selected action is denoted $a_k$.
Step 3: Execute action $a_k$, observe the environment, adjust the corresponding project to the under-construction state, and obtain the next state $s_{j(k+1)}$ and the data label $(\sigma_{jk}, \sigma_{j(k+1)}, \mu)$.
Step 4: Add $(\sigma_{jk}, \sigma_{j(k+1)}, \mu)$ to the experience pool $\mathcal{D}$.
Step 5: Take a random sample $(\sigma_1, \sigma_2, \mu)$ from $\mathcal{D}$, train the network with the loss function defined in Equation (15), and update the parameters $\theta$.
Steps 2 through 5 are repeated a total of $K$ times, and the resulting state is taken as the output.
Algorithm 1 DRL for PPSS
Input: Initial state matrix $s_0$, initial experience pool $\mathcal{D}$, action set $A$, learning rate $\alpha$, parameter update interval $L$
1:  Randomly initialize network parameters $\theta$
2:  for $j = 1$ to $|T| - 1$ do
3:      for $k = 1$ to $K$ do
4:          if $j = 1$ and $k = 1$ then
5:              $s_{jk} \leftarrow s_0$
6:          end if
7:          Select action $a_k = \pi_\epsilon$
8:          $\mathcal{D} \leftarrow (\sigma_{jk}, \sigma_{j(k+1)}, \mu) \cup \mathcal{D}$
9:          Take a random sample $(\sigma_1, \sigma_2, \mu)$ from $\mathcal{D}$
10:         Train the network by the gradient descent method
11:         Update $\hat{\theta} \leftarrow \theta$ every $L$ steps
12:     end for
13: end for
Output: $s_{JK}$
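Putting the pieces together, the toy loop below mirrors the structure of Algorithm 1 end to end. The environment transition and the preference signal are deliberately simplified placeholders (in the full method, $\mu$ comes from the Section 4.3 dominance comparison), and all hyperparameter values are illustrative:

```python
import random
import torch
import torch.nn as nn

I, T, K, EPSILON, SYNC_EVERY = 3, 5, 20, 0.1, 10   # illustrative sizes and hyperparameters

q_net = nn.Sequential(nn.Flatten(), nn.Linear(I * T, 16), nn.ReLU(), nn.Linear(16, I))
target = nn.Sequential(nn.Flatten(), nn.Linear(I * T, 16), nn.ReLU(), nn.Linear(16, I))
target.load_state_dict(q_net.state_dict())          # theta_hat <- theta (Algorithm 1, line 11)
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)

s = torch.zeros(1, I, T)                            # empty schedule as the initial state s_0
step = 0
for j in range(T - 1):
    for k in range(K):
        with torch.no_grad():                       # epsilon-greedy action selection
            q_vals = q_net(s).squeeze(0)
        a = random.randrange(I) if random.random() < EPSILON else int(q_vals.argmax())
        s_next = s.clone()
        s_next[0, a, :] = 0                         # (re)schedule project a ...
        s_next[0, a, j] = 1                         # ... to start at the current time j
        mu = torch.tensor([1.0])                    # placeholder; really the Section 4.3 label
        score_next = q_net(s_next).sum(dim=1)       # toy segment scores for the sampled pair
        score_cur = q_net(s).sum(dim=1)
        logp = torch.log_softmax(torch.stack([score_next, score_cur], dim=1), dim=1)
        loss = -(mu * logp[:, 0] + (1 - mu) * logp[:, 1]).mean()   # Eq. (15)
        opt.zero_grad(); loss.backward(); opt.step()
        step += 1
        if step % SYNC_EVERY == 0:                  # periodic target-network sync
            target.load_state_dict(q_net.state_dict())
        s = s_next
print(s.squeeze(0))                                 # resulting schedule matrix s_JK
```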

4.4.4. Training

During the training process, each instance in the training set is input into the neural network and propagated through the network to generate an output. These values are then compared to the labels in the training data using the loss function, which measures the accuracy of the prediction. Next, to reduce the value of the loss function in the following iteration (gradient descent), the network adjusts its weights according to their influence on the loss function. After all instances in the training set have been processed, one epoch of training is complete. Training is repeated for a number of epochs until the error rate shows no further improvement.

5. Experiments

5.1. Experimental Setup

The data for the instances were sourced from the National Bureau of Statistics of China website [41], encompassing publicly released annual statistical yearbook data and the directory of local statistical survey projects. Table 3 lists the parameters of the instances, with problem size and dimension increasing incrementally from small to large, and difficulty escalating correspondingly. The objective of the experiments is to answer three questions:
  • What is the optimal parameter value at which the algorithm demonstrates best performance?
  • How do the scale and dimensionality of the problem impact the algorithm’s performance?
  • Is the algorithm superior in comparison to other heuristic methods in the quality of the solutions?
In response to the first query, a range of parameter values were employed within a consistent context. In addressing the subsequent question, an assessment of the algorithmic performance was conducted utilizing identical parameters across instances of diverse scales and dimensions. Finally, a comparative analysis was conducted between the algorithm in question and the most prevalent heuristic algorithms, as well as classical algorithms, in order to address the final question.
Tests were conducted on a PC with a 2.5 GHz Intel Core i9 processor and 16 GB of RAM. The algorithm was implemented in Python 3.9, with PyTorch 2.3.1 as the backend for the implementation of the DNN. The number of epochs was set to 2500, and all networks were trained using the Adam optimizer, which facilitates faster algorithm convergence [42].

5.2. Experiment 1: Comparison of Different Parameter Settings

When using a DNN to conduct the selection, two strategies are utilized: depth-first search and limited discrepancy search. A parameter $w$ is employed to regulate the preference between these two strategies: the probability of selecting the first strategy is $w$, while the probability of utilizing the second strategy is $1 - w$.
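As a small illustration (the function and strategy names are ours, not from the paper), the stochastic strategy choice can be written as:

```python
import random

def choose_strategy(w: float = 0.75) -> str:
    """Pick depth-first search with probability w, limited discrepancy search otherwise."""
    return "depth_first" if random.random() < w else "limited_discrepancy"
```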
The parameter $w$ has a significant impact on the performance of the algorithm. To determine the optimal value, experiments were conducted to test the performance of the algorithm for different parameter values on instances 3–8. Specifically, for each instance, the algorithm was run repeatedly using the same parameter value until no new Pareto solutions were found after 5 runs, at which point the parameter value was adjusted and the algorithm continued to run. Finally, all Pareto solutions obtained on the same instance using different $w$ values were compared; non-Pareto solutions were eliminated, and each parameter setting was measured by the number of remaining Pareto solutions.
Table 4 presents the Pareto solutions obtained by the algorithm for various values of $w$. It is evident that, in each case, the performance of the two groups with $w = 0.75$ and $w = 1$ significantly surpasses that of the other three groups, corresponding to probabilities of 75% and 100% of selecting the depth-first strategy. The former demonstrates strong performance in four instances, while the latter prevails in the remaining two, with minimal discernible difference between them, making it difficult to ascertain superiority. Consequently, further analysis is needed to determine the optimal parameter value.
A Pareto solution obtained under a specific parameter that, when compared with the Pareto solutions obtained under other parameters, still maintains its status as a Pareto solution is referred to as an effective Pareto solution. Upon analyzing the proportion of effective Pareto solutions under the two parameter settings in Figure 7, it is observed that in instance 3 the effectiveness rate is 100% in both cases. In the other instances, except for instance 7, where $w = 0.75$ shows slightly lower effectiveness than $w = 1$, the former consistently demonstrates significantly higher efficiency and effectiveness than the latter, indicating stronger solving capability. Therefore, $w = 0.75$ was used as the parameter setting in subsequent experiments.

5.3. Experiment 2: Comparison of Different Problem Scales and Dimensions

After determining the appropriate parameter values, the subsequent step in the experiment involves testing the algorithm’s performance across various problem scales and dimensions, while observing its sensitivity to changes in these factors. All instances are computed in this experiment, and the effects of problem scale and dimension are analyzed separately using a method that controls variables.
As depicted in Figure 8, the three line charts within the same column exhibit consistent problem scales, with the problem dimension increasing from bottom to top. Similarly, the four line charts within the same row display uniform problem dimensions, with the problem scale escalating from left to right. Consequently, an analysis of the variations in each column’s line charts allows for an examination of the impact of problem dimensions, while an analysis of changes in each row’s line charts facilitates an assessment of the influence of problem scales.
Upon analysis of the change in the loss function value within each column, it is evident that as the problem dimension increases, horizontal line segments appear more frequently, indicating a greater likelihood of the algorithm converging to local optima. Examination of the loss function values within each row reveals that as the problem scale increases, the curve on the right-hand side is smoother and reaches the minimum point on the vertical axis with more difficulty, primarily reflecting a deceleration of the solving process. Notably, instance 5 exhibits a more intricate curve than the other instances, characterized by more horizontal line segments and delayed convergence to the minimum point, which may be attributed to instance 5 having more constraints.

5.4. Experiment 3: Comparison of Different Methods

The algorithm proposed in this paper is compared with two well-established classic algorithms, ejection chain [43] and branch-and-price [44], as well as two heuristic algorithms, CP-ILS [45] and DP-VNS [46], all of which have demonstrated strong performance in the field of combinatorial optimization. Each algorithm is permitted to execute for a duration of 60 min on each instance, following which the Pareto solutions obtained for the same instance are compared, and only valid Pareto solutions are preserved.
As shown in Table 5, the proposed algorithm demonstrates superior performance compared to ejection chain across all 12 instances. While maintaining comparable performance to DP-VNS in the initial three instances and performing slightly below CP-ILS in instance 6, the proposed algorithm exhibits significantly improved results in the remaining instances. The performance of the branch-and-price algorithm is strong in the initial six instances, even yielding the most effective Pareto solutions in instances 4 and 5. However, as the complexity of the problem increases, its effectiveness noticeably declines compared to other algorithms in the subsequent six instances. This suggests that its time cost becomes unacceptable when dealing with a larger problem scale. Furthermore, the proposed algorithm's advantage expands as the complexity of the problem increases.
Figure 9 illustrates the clear superiority of the proposed algorithm. Among all effective Pareto solutions, the 290 solutions obtained by the proposed algorithm cover almost all effective Pareto solutions, with 53 being exclusively derived from this algorithm and only 3 being solved by other algorithms but not by this one.

6. Conclusions and Future Work

This paper presents a novel project value assessment model, developed on the basis of the PPSS problem, and subsequently proposes the DPbQN algorithm. Through experimentation, the optimal algorithm parameters are determined, the impact of problem size and dimensionality on the algorithm’s performance is investigated, and finally, the proposed method is compared with other algorithms to verify its effectiveness and superiority.
In future research, it is conceivable to explore the optimization of network structures, encompassing both input and hidden layers, in order to better accommodate a wider array of intricate problems. Furthermore, the integration of embedding algorithms as integral components within other algorithms to create hybrid algorithms for addressing a broader spectrum of fields also warrants further investigation.

Author Contributions

Conceptualization, H.C. and Y.D. (Yajie Dou); methodology, N.Z.; software, H.C. and Y.D. (Yulong Dai); validation, N.Z. and Y.D. (Yulong Dai); formal analysis, Y.D. (Yajie Dou); investigation, H.C.; resources, Y.D. (Yajie Dou); data curation, H.C. and N.Z.; writing—original draft preparation, H.C.; writing—review and editing, H.C. and N.Z.; visualization, H.C. and Y.D. (Yulong Dai); supervision, Y.D. (Yajie Dou); project administration, Y.D. (Yajie Dou); funding acquisition, N.Z. and Y.D. (Yajie Dou). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China (NNSFC) under grant number 72231011.

Data Availability Statement

Data sharing is not applicable to this article due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

Notations used in this paper.
Parameters:
$I$: number of projects
$T$: length of the planning period
$J$: number of types of project value
$p_i$: project $i$
$d_i$: duration of $p_i$
$c_i$: construction cost of $p_i$
$v_{ij}$: $j$-th value provided by $p_i$
$V_j$: column vector consisting of the $j$-th value of all projects
$V$: value matrix
$B_t$: total budget for year $t$
$H(p_i)$: subset of projects that have cooperation constraints with $p_i$
$\Psi(p_i)$: set of projects that must be executed ahead of time if $p_i$ is selected
$\Phi(p_i)$: set of projects that cannot coexist with $p_i$
$\mathcal{V}_j$: feasible domain of $V_j$
$\mathcal{V}$: feasible domain of $V$
$\alpha_{nj}$: coefficient vector of the $n$-th linear equality constraint of $\mathcal{V}_j$
$\beta_{mj}$: coefficient vector of the $m$-th linear inequality constraint of $\mathcal{V}_j$
Decision variable:
$x_{it}$: whether $p_i$ starts at time $t$

References

  1. Yang, X.A. Belt and Road: The first decade. Int. Aff. 2024, 100, 433–434.
  2. Swenson, D.L. Trade-war Tariffs and Supply Chain Trade. Asian Econ. Pap. 2024, 23, 66–86.
  3. Jiang, J.; Jin, Q.C.; Xu, X.M.; Hou, S.; Li, J.C. Preliminary study on national defense science and technology system engineering in the era of intelligence. Syst. Eng. Electron. 2022, 44, 1880–1888.
  4. Wang, M.W.; Liang, D.C.; Li, D.F. A Two-Stage Method for Improving the Decision Quality of Consensus-Driven Three-Way Group Decision-Making. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 2770–2780.
  5. Zhang, C.H.; Su, W.H.; Zeng, S.Z.; Tomas, B.; Enrique, H. A Two-stage subgroup Decision-making method for processing Large-scale information. Expert Syst. Appl. 2021, 171, 114586.
  6. Madjid, T.; Ghasem, K.; Hassan, M. A new dynamic two-stage mathematical programming model under uncertainty for project evaluation and selection. Comput. Ind. Eng. 2020, 149, 106795.
  7. Yin, X.P.; Xu, X.H.; Pan, B. Selection of Strategy for Large Group Emergency Decision-making based on Risk Measurement. Reliab. Eng. Syst. Saf. 2021, 208, 107325.
  8. Diag, D.; Yuji, K.W. Pivotal voting: The opportunity to tip group decisions skews juries and other voting outcomes. Proc. Natl. Acad. Sci. USA 2022, 119, e2108208119.
  9. Marie, R.; Johannes, P.; Lara, M.; Emma, B.; Margarete, B. In no uncertain terms: Group cohesion did not affect exploration and group decision making under low uncertainty. Front. Psychol. 2023, 14, 1038262.
  10. Zhang, X.X.; Hipel, K.W.; Tan, Y.J. Project Portfolio Selection and Scheduling under a Fuzzy Environment. Memetic Comput. 2019, 11, 391–406.
  11. Mohagheghi, V.; Mousavi, S.M.; Shahabi-Shahmiri, R. Sustainable project portfolio selection and optimization with considerations of outsourcing decisions, financing options and staff assignment under interval type-2 fuzzy uncertainty. Neural Comput. Appl. 2022, 34, 14577–14598.
  12. Zolfaghari, S.; Mousavi, S.M. A Novel Mathematical Programming Model for Multi-mode Project Portfolio Selection and Scheduling with Flexible Resources and Due Dates under Interval-valued Fuzzy Random Uncertainty. Expert Syst. Appl. 2021, 182, 115202.
  13. Markowitz, H. Portfolio selection. J. Financ. 1952, 7, 77–91.
  14. Mohagheghi, V.; Meysam, M.S.; Mojtahedi, M. Project Portfolio Selection Problems: Two decades Review from 1999 to 2019. J. Intell. Fuzzy Syst. 2020, 38, 1675–1689.
  15. Mavrotas, G.; Makryvelios, E. An Integrative Review of Project Portfolio Management Literature: Thematic Findings on Sustainability Mindset, Assessment, and Integration. Oper. Res. 2023, 23, 629–650.
  16. Alvarez, A.; Litvinchev, I.S.; Lopez, F. Large-Scale Public R&D Portfolio Selection by Maximizing a Biobjective Impact Measure. IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 2010, 40, 572–582.
  17. Albano, T.C.L.; Baptista, E.C.; Armellini, F. Proposal and Solution of a Mixed-Integer Nonlinear Optimization Model That Incorporates Future Preparedness for Project Portfolio Selection. IEEE Trans. Eng. Manag. 2021, 68, 1014–1026.
  18. Zhang, X.Y.; Liao, H.C. A Two-stage Mathematical Programming Model For Distributed Photovoltaic Project Portfolio Selection with Incomplete Preference Information. Expert Syst. Appl. 2022, 28, 1545–1571.
  19. Fisher, B.; Brimberg, J.; Hurley, W.J. An Approximate Dynamic Programming heuristic to support non-strategic project selection for the Royal Canadian Navy. J. Def. Model. Simul. 2015, 12, 83–90.
  20. Eshlaghy, A.T.; Razi, F.F. A hybrid grey-based k-means and genetic algorithm for project selection. Int. J. Bus. Inf. Syst. 2015, 18, 141–159.
  21. Manish, K.; Mittal, M.L.; Gunjan, S. A Tabu Search Algorithm for Simultaneous Selection and Scheduling of Projects. Harmon. Search Nat. Inspired Optim. Algorithms 2019, 741, 1111–1121.
  22. Vinyals, O.; Fortunato, M.; Jaitly, N. Pointer Networks. In Advances in Neural Information Processing Systems 28 (NIPS 2015); pp. 2692–2700.
  23. Nazari, M.; Oroojlooy, A.; Snyder, L.V.; Takac, M. Deep Reinforcement Learning for Solving the Vehicle Routing Problem. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018); pp. 9861–9871.
  24. Li, K.; Zhang, T.; Wang, R. Deep Reinforcement Learning for Multiobjective Optimization. IEEE Trans. Cybern. 2021, 51, 3103–3114.
  25. Bello, I.; Pham, H.; Le, Q.; Norouzi, M.; Bengio, S. Neural combinatorial optimization with reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017.
  26. Chen, X.Y.; Tian, Y.D. Learning to perform local rewriting for combinatorial optimization. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 6278–6289.
  27. Yolcu, E.; Poczos, B. Learning local search heuristics for boolean satisfiability. In Proceedings of the 33rd Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 7992–8003.
  28. Gao, L.; Chen, M.X.; Chen, Q.C.; Luo, G.Z.; Zhu, N.Y.; Liu, Z.X. Learn to design the heuristics for vehicle routing problem. arXiv 2020, arXiv:2002.08539.
  29. Lu, H.; Zhang, X.W.; Yang, S. A learning-based iterative method for solving vehicle routing problems. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020.
  30. Mijumbi, R.; Hasija, S.; Davy, S.; Davy, A.; Jennings, B.; Boutaba, R. A connectionist approach to dynamic resource management for virtualised network functions. In Proceedings of the 12th Conference on Network and Service Management (CNSM), Montreal, QC, Canada, 31 October–4 November 2016; pp. 1–9.
  31. Mijumbi, R.; Hasija, S.; Davy, S.; Davy, A.; Jennings, B.; Boutaba, R. Topology-aware prediction of virtual network function resource requirements. IEEE Trans. Netw. Serv. Manag. 2017, 14, 106–120.
  32. Lu, J.Y.; Feng, L.Y.; Yang, J.; Hassan, M.M.; Alelaiwi, A.; Humar, I. Artificial agent: The fusion of artificial intelligence and a mobile agent for energy-efficient traffic control in wireless sensor networks. Future Gener. Comput. Syst. 2019, 95, 45–51.
  33. Jiang, Q.M.; Zhang, Y.; Yan, J.Y. Neural combinatorial optimization for energy-efficient offloading in mobile edge computing. IEEE Access 2020, 8, 35077–35089.
  34. Yu, J.J.Q.; Yu, W.; Gu, J.T. Online vehicle routing with neural combinatorial optimization and deep reinforcement learning. IEEE Trans. Intell. Transp. Syst. 2019, 20, 3806–3817.
  35. Mirhoseini, A.; Goldie, A.; Pham, H.; Steiner, B.; Le, Q.V.; Dean, J. A hierarchical model for device placement. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
  36. Dyer, M.E.; Proll, L.G. An improved vertex enumeration algorithm. Eur. J. Oper. Res. 1982, 9, 359–368.
  37. Christiano, P.F.; Leike, J.; Brown, T.B.; Martic, M.; Legg, S.; Amodei, D. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems 30 (NIPS 2017).
  38. Bradley, R.A.; Terry, M.E. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika 1952, 39, 324–345.
  39. Luce, R.D. Individual choice behavior: A theoretical analysis. Cour. Corp. 1960, 50, 186–188.
  40. Shepard, R.N. Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika 1957, 22, 325–345.
  41. National Bureau of Statistics. Available online: https://www.stats.gov.cn/ (accessed on 14 April 2024).
  42. Ledig, C.; Theis, L.; Huszár, F.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017; pp. 4681–4690.
  43. Burke, E.K.; Curtois, T.; Qu, R.; Vanden, B.G. A time predefined variable depth search for nurse rostering. Informs J. Comput. 2013, 25, 411–419.
  44. Burke, E.K.; Curtois, T. New approaches to nurse rostering benchmark instances. Eur. J. Oper. Res. 2014, 237, 71–81.
  45. Musliu, N.; Winter, F. A hybrid approach for the sudoku problem: Using constraint programming in iterated local search. IEEE Intell. Syst. 2017, 32, 52–62.
  46. Abdelghany, M.; Eltawil, A.B.; Yahia, Z.; Nakata, K. A hybrid variable neighbourhood search and dynamic programming approach for the nurse rostering problem. J. Ind. Manag. Optim. 2021, 17, 2051.
Figure 1. Solution and corresponding mathematical matrix expression.
Figure 2. PPSS process.
Figure 3. Two processes of the human–machine framework for PPSS.
Figure 4. Linear inequality constraints.
Figure 5. DNN model.
Figure 6. State and action.
Figure 7. Effective Pareto proportion for $w = 0.75$ and $w = 1$.
Figure 8. Progress of loss function value in different instances.
Figure 9. Distribution of effective Pareto solutions for instance 12.
Table 1. Network structure.

| Layer Type | Input | Output | Activation Function |
| --- | --- | --- | --- |
| Flatten | $|I| \times T$ | 128 | / |
| Dense | 128 | 64 | ReLU |
| Dense | 64 | 32 | ReLU |
| Dense | 32 | $|I|$ | Linear |
Table 2. Data labels.

| Label | Implication |
| --- | --- |
| $(\sigma_1, \sigma_2, 1)$ | $\sigma_1 \succ \sigma_2$ |
| $(\sigma_1, \sigma_2, 0)$ | $\sigma_1 \prec \sigma_2$ |
| $(\sigma_1, \sigma_2, 0.5)$ | $\sigma_1 \sim \sigma_2$ |
Table 3. Instance data.

| Instance | Projects | Cost Types | Cooperation Constraints | Precedence Constraints | Exclusive Constraints | Qualitative Evaluations |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 40 | 5 | 9 | 12 | 8 | 61 |
| 2 | 40 | 6 | 13 | 18 | 14 | 73 |
| 3 | 40 | 7 | 15 | 22 | 14 | 88 |
| 4 | 80 | 5 | 12 | 23 | 12 | 90 |
| 5 | 80 | 6 | 48 | 62 | 28 | 179 |
| 6 | 80 | 7 | 45 | 52 | 24 | 184 |
| 7 | 120 | 5 | 39 | 55 | 28 | 205 |
| 8 | 120 | 6 | 50 | 73 | 35 | 240 |
| 9 | 120 | 7 | 67 | 85 | 70 | 394 |
| 10 | 160 | 5 | 83 | 102 | 79 | 461 |
| 11 | 160 | 6 | 90 | 133 | 85 | 410 |
| 12 | 160 | 7 | 102 | 151 | 82 | 485 |
Table 4. Results when applying different values of w.

| Instance | w = 0 | w = 0.25 | w = 0.5 | w = 0.75 | w = 1 |
| --- | --- | --- | --- | --- | --- |
| 3 | 16 | 19 | 18 | 22 | 15 |
| 4 | 28 | 35 | 25 | 42 | 37 |
| 5 | 40 | 38 | 43 | 53 | 57 |
| 6 | 56 | 61 | 73 | 85 | 79 |
| 7 | 94 | 86 | 103 | 114 | 121 |
| 8 | 135 | 148 | 141 | 168 | 162 |

Boldface indicates that this parameter value is optimal for this instance.
Table 5. Comparison results of the proposed algorithm with different algorithms.

| Instance | DPbQN | Ejection Chain (Classic) | B&P (Classic) | CP-ILS (Heuristic) | DP-VNS (Heuristic) |
| --- | --- | --- | --- | --- | --- |
| 1 | 18 | 7 | 17 | 16 | 18 |
| 2 | 25 | 4 | 23 | 25 | 25 |
| 3 | 22 | 8 | 18 | 15 | 22 |
| 4 | 42 | 26 | 45 | 35 | 33 |
| 5 | 53 | 28 | 62 | 37 | 41 |
| 6 | 85 | 43 | 56 | 89 | 85 |
| 7 | 114 | 55 | 63 | 82 | 97 |
| 8 | 168 | 61 | 69 | 130 | 126 |
| 9 | 183 | 69 | 78 | 141 | 152 |
| 10 | 227 | 84 | 90 | 159 | 178 |
| 11 | 256 | 90 | 85 | 185 | 192 |
| 12 | 290 | 96 | 103 | 228 | 243 |

The underline indicates the optimal result of the algorithm in this instance.