1. Introduction
Joint operations have become the main trend of modern warfare. Constructing a “system of systems (SoS)” is not only a goal but also a basic guideline for long-term weapon and equipment development. System portfolio selection is a widely used concept in weapon SoS construction, in which evaluation is a key step [1]. Traditional evaluation models rely heavily on subjective judgment, which makes assessment results less accurate and less convincing. With the rise of data science, making decisions according to real data offers an effective way to compensate for the low accuracy and the implementation difficulty of relying on expert experience. Therefore, combining data-driven methods with model-based approaches is a new trend in solving system portfolio selection problems.
Markowitz first proposed portfolio theory in 1952, opening a new era of applying mathematical approaches to resource allocation problems [2,3]. In management science and operations research, portfolio theory is widely used in project research and development (R&D) [4], supplier selection [5], material selection, etc. With the development of SoS science, portfolio theory has become increasingly popular in weapon SoS construction, where the optimal system portfolio is selected by evaluating candidate system portfolios through model-based methods. So far, little research has measured weapon system portfolios without subjective criteria. Typical measurements in the literature, such as benefit-risk analysis [6], cost-efficiency analysis [7], and requirement-satisfaction analysis [8], are inaccurate and unconvincing to some extent because they usually depend heavily on expert experience.
To address the issues mentioned above, this paper combines data-driven methods with traditional model-based approaches to improve the accuracy and credibility of system portfolio evaluation results by reducing the dependence on human expertise. The data-driven methods complement model-based approaches rather than replace them, because data alone, without a model, cannot bridge the scheme variable inputs and the evaluation outputs.
In civilian fields, portfolio selection theory has mainly been studied and applied in project portfolio problems [9]. From the modeling perspective, scenario-based models are frequently used to describe the boundary of possible cases, based on which decision-makers evaluate and select the best-matched system portfolio [10,11,12,13]. Robust models are also widely studied and applied in project portfolio problems to overcome the difficulty of determining the probabilities of future scenarios, aiming to select a system portfolio that performs well in almost all possible situations [14,15,16]. For the evaluation and trade-off of project portfolios, various methods have been proposed and studied, such as risk analysis methods [17], value evaluation methods [18], cost-efficiency methods [19], fuzzy assessment methods [4], preference-based methods [20], game theory, and interactive decision methods [21]. What these methods have in common is that they determine the value and risk of a system portfolio as abstract indicators of what decision-makers do and do not expect. With regard to portfolio planning and optimization, the goal is to select the optimal project portfolio by analyzing and comparing candidate project portfolios. Mixed integer models [22], multi-objective optimization [23], and hybrid and dynamic planning are the most popular optimization methods. In addition, genetic algorithms [23], Monte Carlo simulation [24], and Lagrangian relaxation methods are widely used in the solving process when the solution space is large and specific constraints apply.
In military fields, most system portfolio selection methods are based on specific evaluation models, and the most commonly investigated techniques include multiple objective analysis, multiple criteria analysis [25], value analysis [26], cost-efficiency analysis [27], expert judgment [27], Monte Carlo techniques, and risk analysis. In detail, Yang et al. [26] formulate the weapon system portfolio problem as a mixed integer non-linear optimization model and solve it with an adaptive immune genetic algorithm. Greiner et al. [28] summarize the challenges that the Department of Defense (DoD) faces in determining weapon system value during portfolio selection processes. Cheng et al. [29] use combat networks and operation loops to analyze strategies for the weapon system portfolio selection problem, constructing operational capability evaluation indexes for weapon systems. Zhou et al. [30] address weapon system portfolio selection problems based on fuzzy clustering, applying the maximum deviation method to rank all candidates by calculating the weight of each weapon system. Kangaspunta et al. [27] use the cost-efficiency method to decide on the acquisition and maintenance of military equipment, aiming to build long-term capabilities for future military conflicts. Li et al. [31] adopt a network-based method to formulate and analyze the weapon system portfolio architecting problem by embedding different types of systems into a network. Zhou et al. [32] study the evolving capability requirement-oriented portfolio planning problem with a capability-based approach from the perspective of operational research. Huang et al. [33] regard weapon system portfolio selection as a constrained combinatorial optimization problem and use a self-adaptive memetic algorithm-based decision-making method to maximize the expected damage to hostile targets.
In both civilian and military fields, model-based portfolio selection methods have been studied extensively. As the demand for more accurate and valid approaches grows, the data-driven idea is well suited to portfolio selection. This paper focuses mainly on system portfolio evaluation, where a key part is determining the criteria that influence the evaluation result of an object. Herein, two criteria, value and risk, are used to evaluate system portfolios: the value criterion is determined from the capability gaps of system portfolios, and the risk criterion is determined from the remaining useful life (RUL) of systems. Based on the two criteria, the optimization seeks the system portfolio with the maximal value and minimal risk within a given cost limit. To increase credibility and practicability, both the weight information in the value evaluation and the RUL are determined from simulation data instead of expert experience.
The remaining parts of the paper are structured as follows. In the second section, the capability gap-based value decision method and the RUL-based risk decision method are studied. In the third section, a case is examined to verify the utility and effectiveness of the proposed methods and models. Then, the results are discussed by analyzing the frequency of being selected and the association rules.
3. Results
3.1. Background Description
The case assumes an anti-missile scenario in which a protected object faces a saturation missile attack. The objective is to select a system portfolio from 100 alternative systems (2^100 candidate system portfolios in total) under a budget limitation so as to maximize the system portfolio value while minimizing the overall risk.
Following the observe-orient-decide-act (OODA) operation process, the capabilities discussed in the paper are detection range, communication range, striking range, and decision time, where the first three are benefit-type capabilities and the last is a cost-type capability. For the specific operation scenario, the capability requirements and combination rules are shown in Table 1.
The capabilities of the systems are generated by Monte Carlo simulation from truncated normal distribution functions. The histogram of the generated data is shown in Figure 4. The worst value of the cost-type capability, decision time, is 34.6853, which is used in the value calculation according to Section 2.1.2.
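The data generation step can be illustrated with a minimal sketch that draws capability values from truncated normal distributions by rejection sampling. The distribution parameters in `SPEC`, the function names, and the seed are hypothetical placeholders, since the parameters used in the case study are not listed in this section.

```python
import random


def sample_truncated_normal(mean, std, low, high, rng):
    """Draw one value from N(mean, std^2) restricted to [low, high] by rejection."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


def generate_capabilities(n_systems, spec, seed=0):
    """Generate one capability record per system from the given distributions."""
    rng = random.Random(seed)
    return [
        {name: sample_truncated_normal(*params, rng) for name, params in spec.items()}
        for _ in range(n_systems)
    ]


# Hypothetical (mean, std, low, high) per capability; NOT the paper's settings.
SPEC = {
    "detection_range": (100.0, 30.0, 20.0, 200.0),
    "communication_range": (80.0, 20.0, 10.0, 150.0),
    "striking_range": (60.0, 15.0, 10.0, 120.0),
    "decision_time": (20.0, 5.0, 5.0, 35.0),
}

systems = generate_capabilities(100, SPEC)

# For a cost-type capability such as decision time, the worst value is the largest one.
worst_decision_time = max(s["decision_time"] for s in systems)
```

Rejection sampling is chosen here only because it needs nothing beyond the standard library; it becomes slow when the truncation interval lies far in the tails of the distribution.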
3.2. Value Calculation
3.2.1. Weight Determination
Based on simulations in “Command: Modern Air/Naval Operations”, a military simulator for modern conflicts, the weight information can be deduced by correlation analysis.
The independent variables are the four capabilities, that is, detection range, communication range, striking range, and decision time. The dependent variable is the number of intercepted missiles. Running the simulation automatically 10,000 times generates 10,000 sets of data. The maximal information coefficient (MIC) algorithm is then applied to these data, with the results shown in Table 2. After normalization, the weights of the four capabilities are determined as 0.276, 0.250, 0.174, and 0.300, respectively.
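The normalization step can be sketched as follows, assuming it simply divides each correlation score by the sum of all scores. The function name and the raw scores in `placeholder_scores` are illustrative assumptions and are not the values of Table 2; only `REPORTED_WEIGHTS` comes from the text.

```python
def normalize_weights(scores):
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {name: value / total for name, value in scores.items()}


# Placeholder MIC scores for illustration only (NOT the values of Table 2).
placeholder_scores = {
    "detection_range": 0.4,
    "communication_range": 0.3,
    "striking_range": 0.2,
    "decision_time": 0.5,
}
example_weights = normalize_weights(placeholder_scores)

# Weights reported in the text; they already sum to 1.
REPORTED_WEIGHTS = {
    "detection_range": 0.276,
    "communication_range": 0.250,
    "striking_range": 0.174,
    "decision_time": 0.300,
}
most_influential = max(REPORTED_WEIGHTS, key=REPORTED_WEIGHTS.get)
```

With the reported weights, `most_influential` is the decision time capability, which matches the observation made in the discussion.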
3.2.2. Value Calculation
Because it is impossible to calculate the values of all 2^100 candidate system portfolios, an example of the value calculation process is given. Assume that a system portfolio SP1 has five component systems, S1, S2, S3, S4, and S5, with the capability information shown in Table 3.
According to the capability combination rules in Table 1, the combined capabilities of system portfolio SP1 are shown in the last column of Table 3.
Then, according to Equation (4), the capability gaps of four combined capabilities are calculated as Equation (15) shows. The value of any system portfolio can be calculated based on the same steps.
3.3. Risk Determination Based on RUL Prediction
In the case study, the turbine engine, a key component of weapon systems, is taken as an example for analyzing risks. The data are derived from experiments run with the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) software, as shown in Figure 5.
C-MAPSS simulates the operation of a turbine engine of the 90,000-pound thrust class and records monitoring signals. Based on the principles of thermodynamics, two failure modes are designed: high-pressure compressor degradation and fan degradation. The main functional modules and connections are shown in Figure 6. The simulation runs under the following settings:
- (1) The simulation experiment data contain time series of 21 variables and are divided into a training set and a testing set. Each multivariate time series corresponds to a specific engine, meaning that the data can be considered to be generated by the engines of different systems.
- (2) The initial wear condition of each engine might not be identical and there are manufacturing variations, which are considered normal and are not treated as causes of engine failure.
- (3) There are 3 operational setting parameters that have a substantial impact on an engine’s performance.
- (4) The data contain noise.
- (5) The engines operate normally at the initial moment and begin to degrade at some point in the time series. In the training set, the cumulative degradation continues to grow until it reaches or exceeds the preset threshold. In the testing set, each time series ends at some point before the engine fails.
As a result, 100 degradation tracks are obtained in the training set and 100 tracks before failure in the testing set. The training data is used to establish the RUL prediction model of engines, and the testing set is used to test the feasibility of the model.
The monitoring data are shown in the scatter plots in Figure 7. Each plot visualizes the 100 degradation tracks of one variable in the training set. The engine code, the operation cycle, and the 3 operational setting parameters are not shown in the figure.
Because constant variables cannot reflect the evolution of engine degradation, variables 1, 5, 6, 10, 16, 18, and 19 are not regarded as feature variables. Moreover, the tracks exhibit inconsistent trends for variables 9 and 14, so these two variables are inadequate to describe the degradation process.
Then, the variation coefficients of the remaining 12 variables are calculated based on the degradation data, and the results are shown in Table 4. After eliminating the variables with small variation coefficients, variables 3, 4, 11, 15, 17, 20, and 21 are chosen as the base variables that represent the engine degradation characteristics.
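The screening rule can be sketched as below, assuming the variation coefficient is the population standard deviation divided by the absolute mean and that a fixed cutoff separates small from large coefficients. The cutoff value, the function names, and the toy columns are assumptions for illustration; the paper's cutoff and the values of Table 4 are not reproduced here.

```python
import statistics


def variation_coefficient(values):
    """Population standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0  # convention for this sketch: treat a zero-mean signal as uninformative
    return statistics.pstdev(values) / abs(mean)


def select_features(columns, threshold):
    """Keep the variables whose variation coefficient reaches the threshold."""
    cvs = {name: variation_coefficient(vals) for name, vals in columns.items()}
    kept = sorted(name for name, cv in cvs.items() if cv >= threshold)
    return kept, cvs


# Toy columns, not C-MAPSS data: one constant signal and one trending signal.
toy_columns = {
    "v_flat": [10.0, 10.0, 10.0, 10.0],
    "v_trend": [1.0, 2.0, 3.0, 4.0],
}
kept, cvs = select_features(toy_columns, threshold=0.1)  # threshold is hypothetical
```

In this toy case only the trending signal survives the filter, which mirrors the idea that variables with little relative variation carry little degradation information.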
According to the RUL prediction method, the remaining life of the engines in the testing set can be estimated by matching the testing data with the reference tracks. Then, the risks can be obtained. The results are shown in Table 5.
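The idea of matching a testing series against reference tracks can be illustrated with a generic similarity-based sketch. It is an assumption about how such matching might work, not the method of Section 2: it uses a single health indicator, a squared-error distance, and the remaining length of the single best-matching reference track as the RUL estimate. All names and the toy series are hypothetical.

```python
def best_match(test, reference):
    """Slide the test window along one run-to-failure track.

    Returns (distance, remaining_cycles) for the best-aligned position,
    or None if the reference is shorter than the test window.
    """
    n = len(test)
    if n > len(reference):
        return None
    best = None
    for offset in range(len(reference) - n + 1):
        window = reference[offset:offset + n]
        dist = sum((t - r) ** 2 for t, r in zip(test, window))
        remaining = len(reference) - (offset + n)
        if best is None or dist < best[0]:
            best = (dist, remaining)
    return best


def estimate_rul(test, references):
    """Return the remaining cycles of the closest match over all references."""
    matches = [m for m in (best_match(test, ref) for ref in references) if m is not None]
    if not matches:
        raise ValueError("no reference track is long enough to match")
    _, remaining = min(matches)
    return remaining


# Toy health-indicator tracks that end at failure (not C-MAPSS data).
reference_tracks = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [0, 2, 4, 6, 8, 10],
]
rul = estimate_rul([3, 4, 5], reference_tracks)  # best match leaves 4 cycles
```

A fuller version would typically fuse several feature variables into one indicator and combine the estimates from several close matches instead of trusting only the best one.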
3.4. Portfolio Selection Results Analysis
In total, there are 2^100 possible schemes, which is a huge number. Thus, a heuristic algorithm is needed in the solving process. Considering the value and risk factors, the objectives are to maximize the value and minimize the risk of system portfolios, so a multi-objective algorithm is employed to solve the optimization problem. The non-dominated sorting genetic algorithm (NSGA) is a widely used multi-objective algorithm that performs well at retaining elites in the offspring. Differential evolution (DE), in turn, is a genetic operator that is effective at maintaining population diversity. Thus, the paper uses the non-dominated sorting differential evolution (NSDE) algorithm, which combines the advantages of NSGA and DE, to solve the system portfolio optimization problem. The parameters are set as follows: the population size is Pop = 100, the number of iterations is Gen = 1000, the mutation probability is 0.01, and the crossover probability is 0.2.
Because genetic algorithms are inherently randomized, the NSDE also produces results that fluctuate from run to run. A typical way to improve the quality of the result is to run the algorithm multiple times and then select the best individuals by comparing the results. In this case, the program is run 10 times to generate 10 Pareto results, each containing 200 individuals, as shown in Figure 8a. The 10 sets of Pareto results are then combined to obtain the best 200 individuals among them, as shown in Figure 8b.
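Combining the Pareto results of several runs can be sketched as pooling all individuals and keeping those that no other individual dominates. The sketch assumes each individual is summarized by a `(value, risk)` pair with value maximized and risk minimized; the function names and the toy points are hypothetical, and any truncation to exactly 200 individuals is left out because the text does not describe how it is done.

```python
def dominates(a, b):
    """True if a is at least as good as b in both objectives and better in one.

    Each point is (value, risk); value is maximized and risk is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates, without duplicates."""
    unique = list(dict.fromkeys(points))  # drop duplicates, keep first-seen order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def merge_runs(runs):
    """Pool the Pareto results of several runs and keep the overall front."""
    pooled = [p for run in runs for p in run]
    return non_dominated(pooled)


# Toy results from two runs (not the paper's data).
run_a = [(0.9, 0.5), (0.6, 0.2)]
run_b = [(0.8, 0.6), (0.7, 0.2)]
front = merge_runs([run_a, run_b])
```

The pairwise check is quadratic in the number of pooled individuals, which is acceptable for a pool of 10 × 200 points.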
In detail, the 200 individuals of the best Pareto set are shown in the system option diagram in Figure 9. The rectangular area is divided into 100 × 200 rectangles according to the number of system candidates and the number of non-dominated weapon system portfolios. Each rectangle indicates whether a system candidate is selected in the Pareto set. If weapon system i is selected by the j-th non-dominated system portfolio, the block in the i-th row and j-th column is colored black; otherwise, it is left blank.
From Figure 10, it can be seen that some systems are frequently selected in the Pareto set, whereas others are seldom or never selected. To compare the importance of different systems, the number of times each system is selected in the Pareto set is counted. Systems S6, S9, S15, S16, S22, S25, S32, S39, S50, S52, S65, S69, S70, S71, S72, S73, S75, S79, S83, S90, S98, and S99 are selected by at least one system portfolio in the Pareto set. In particular, systems S9, S25, and S50 are quite important given how often they are selected. The ranking of systems by selection count is S25 > S9 > S50 > S39 > S69 > S75 > S83 > S22 > S71 > S32 > S73 > S16 > S98 > S72 > S15 > S6 > S65 > S99 > S70 > S79 > S52 > S90, which to some extent indicates the importance of the selected systems. The remaining systems can be neglected in the system portfolio selection process.
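Counting how often each system is selected can be sketched as follows, assuming each non-dominated portfolio is represented as a set of system labels. The function names and the toy portfolios are hypothetical and do not reproduce the Pareto set of the case study.

```python
from collections import Counter


def selection_frequency(portfolios):
    """Count in how many portfolios each system appears."""
    counts = Counter()
    for portfolio in portfolios:
        counts.update(set(portfolio))  # each system counts once per portfolio
    return counts


def rank_systems(portfolios):
    """Sort systems by descending selection count, ties broken by label."""
    counts = selection_frequency(portfolios)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy Pareto set of four portfolios (illustrative only).
toy_pareto = [
    {"S9", "S25"},
    {"S25", "S50"},
    {"S9", "S25", "S50"},
    {"S25"},
]
ranking = rank_systems(toy_pareto)
```

Systems that never appear in any portfolio are simply absent from the counter, which corresponds to the systems that can be neglected.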
Deeper analysis reveals that some systems tend to be selected together. Therefore, the Apriori frequent itemset mining algorithm is applied to identify association rules, shown in Table 6. The support of a rule is the ratio of the number of item sets in which its items appear together to the total number of item sets, that is, the probability that they appear simultaneously. The confidence of the rule “A ⇒ B” is the ratio support(A ∪ B) / support(A), that is, the probability that B appears when A appears.
In Table 6, the association rules are ranked by support and by confidence, respectively. According to the ranking by support, “S9 ⇒ S25” is the most frequent rule, which means that these two systems tend to be selected together. In addition, according to the first rule in the ranking by confidence, whenever system S75 is selected, system S25 is also selected. By referring to the association rules, decision-makers can gain a deeper understanding of the significance of system portfolios.
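The two rule measures can be illustrated with a short sketch that computes support and confidence directly from a list of portfolios, using the standard definitions. It does not implement the Apriori candidate generation itself, and the function names and toy portfolios are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A) for the rule A => B."""
    support_a = support(antecedent, transactions)
    if support_a == 0:
        raise ValueError("antecedent never appears in the transactions")
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support_a


# Toy Pareto set of four portfolios (illustrative only).
toy_pareto = [
    {"S9", "S25"},
    {"S25", "S50"},
    {"S9", "S25", "S50"},
    {"S25"},
]
s_pair = support({"S9", "S25"}, toy_pareto)            # 2 of 4 portfolios
c_forward = confidence({"S9"}, {"S25"}, toy_pareto)    # S9 => S25
c_backward = confidence({"S25"}, {"S9"}, toy_pareto)   # S25 => S9
```

In this toy data the rule from S9 to S25 has confidence 1 while the reverse rule has confidence 0.5, which shows that confidence is directional even though support is symmetric.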
4. Discussion
The paper shows the feasibility of replacing subjective expert judgment with knowledge obtained from data. First, the weight information of capabilities is determined by analyzing the correlations between capabilities and the number of intercepted missiles, based on operation simulation data. Then, for the risk criterion, the paper determines the risk by mining information from system operation data. The data-driven methods serve only as components of the model-based approaches, aiming to increase the accuracy and credibility of the results.
In the case study, 100 system candidates are provided for optimization in an anti-missile scenario. Automatically simulating the operation scenario 10,000 times generates 10,000 simulation results, from which the maximal information coefficients between the four capabilities and the number of intercepted missiles are calculated and used as the capability weights. The result indicates quantitatively that decision time has the biggest impact on the number of intercepted missiles. In addition, running the C-MAPSS simulator 200 times generates 200 groups of system operation data, from which the system risks are obtained through RUL prediction.
In the system portfolio optimization, given the enormous number of candidate system portfolios, the NSDE algorithm is applied to solve the optimization problem. To approach the optimum as closely as possible, 10 Pareto sets are obtained by running the NSDE 10 times, and 200 non-dominated individuals are retained by comparing the 10 Pareto sets. However, this set cannot be proven to be the true Pareto-optimal set, because the number of candidate system portfolios is extremely large and genetic algorithms are random. Further analysis of the characteristics of the generated Pareto set shows that 22 systems are selected at least once and that 16 association rules are mined. These characteristics can play an assisting role in helping decision-makers understand the system portfolios more deeply.
In conclusion, system portfolio selection is a mainstream trend in future equipment development. Compared with traditional system portfolio decision and optimization methods, the proposed model- and data-driven approach avoids excessive dependence on subjective expert experience in the evaluation and decision process. Traditionally, determining the model parameters requires cumbersome processes of organizing experts, collecting expert opinions, analyzing expert scores, etc., which are time-consuming and labor-intensive and more likely to be questioned. The model- and data-driven approach makes use of models that have been proven effective on the one hand and, on the other hand, determines the required parameter values in the model through data analysis. Therefore, it supports more efficient, more credible, and more practical evaluation and decisions in system portfolio selection and in applications in other fields.