An Explainable Data-Driven Optimization Method for Unmanned Autonomous System Performance Assessment
Abstract
1. Introduction
2. Problem Statement
2.1. Existing Evaluation Methods
- Subjective evaluation methods assign weights based on expert judgment. A widely used approach is the Delphi method [13], which gathers expert consensus to establish indicator weights. Although simple and able to leverage expert insight, it is prone to bias and inconsistency because expert opinions vary. The Analytic Hierarchy Process (AHP) [14] breaks complex decisions down into hierarchical structures and assigns weights through pairwise comparisons. While systematic, it is subject to individual biases and becomes cumbersome when there are many indicators. The network analysis method [15] studies relationships and dependencies between factors through a network structure, offering insight into factor interactions; however, it can be challenging to model and often requires substantial data for an accurate representation;
- Objective evaluation methods derive weights from the inherent characteristics of the data. Techniques such as principal component analysis (PCA) [16] reduce the dimensionality of the data and assign weights based on the variance explained by the principal components; this is effective for large datasets but may neglect indicators whose significance is not captured by the components. Information weighting techniques [17] allocate weights according to the information contribution of each indicator, for example via information entropy; while quantitative, these methods may overlook contextual factors affecting indicator significance. The CRITIC method [18] assigns weights based on contrast intensity and conflict between criteria, balancing indicator characteristics, though its performance depends on the quality and availability of statistical data. Factor analysis [19] explores hidden relationships between variables and assigns weights based on their contribution to the overall variance, simplifying complex datasets but potentially obscuring the importance of individual indicators. Entropy weighting [20] assigns greater weights to indicators with more variability, offering a data-driven approach, but it may not fully capture practical relevance (a minimal computational sketch of the entropy and CRITIC weighting schemes follows this list);
- Hybrid evaluation methods integrate subjective and objective techniques for a more comprehensive assessment. A commonly used approach is fuzzy AHP [21], which combines fuzzy logic with AHP to address uncertainty in multi-criteria decision-making. This method improves adaptability but can be computationally demanding. The CRITIC-G1 method [22], which merges CRITIC and G1 techniques, is another hybrid method that provides a balanced evaluation but may require substantial computational resources and data.
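To make the objective schemes concrete, the following is a minimal Python sketch of the entropy weighting and CRITIC calculations described above, assuming a decision matrix of benefit-type indicators that has already been cleaned; the function and variable names are illustrative, not from the paper's implementation.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weighting: indicators with more variability carry more weight.
    X is an (alternatives x indicators) decision matrix of benefit-type criteria."""
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)  # min-max normalize
    P = Z / (Z.sum(axis=0) + 1e-12)                  # column-wise proportions p_ij
    E = -(P * np.log(P + 1e-12)).sum(axis=0) / np.log(len(X))  # entropy e_j in [0, 1]
    d = 1.0 - E                                      # divergence: low entropy -> high weight
    return d / d.sum()

def critic_weights(X):
    """CRITIC: weight_j ~ contrast intensity (std of column j) times its
    conflict with the other criteria (one minus pairwise correlation)."""
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)
    sigma = Z.std(axis=0)                            # contrast intensity
    R = np.corrcoef(Z, rowvar=False)                 # criteria correlation matrix
    C = sigma * (1.0 - R).sum(axis=0)                # information content C_j
    return C / C.sum()

X = np.random.rand(200, 4)                           # toy data: 200 alternatives, 4 indicators
print(entropy_weights(X))
print(critic_weights(X))
```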
2.2. Comparison and Analysis of Important Algorithms Within the Last Three Years
2.3. Existing Model Interpretation Methods
- Global interpretability methods: Based on the entire dataset, these methods analyze the impact of each feature on the output of the black-box model, identifying which feature or features influence the model's output the most. Common global interpretability methods include Partial Dependence Plots (PDPs) [23] and Accumulated Local Effects (ALE) plots [24]. A PDP shows the impact of a feature on the model's predictions under the assumption that the other features remain unchanged: it averages the model's prediction over the sample for each value of the feature, revealing the relationship between that feature and the prediction and helping to explain how the model uses the feature. PDPs are intuitive but may be misleading when features are highly correlated. ALE plots compute the local effects of a feature at different values and accumulate them across the sample, showing the feature's impact on the predictions. Because ALE accounts for correlation between features, it avoids the misleading results PDPs can produce and is better suited to data with highly correlated features, providing more reliable explanations; however, its computation and interpretation are more complex than those of PDPs;
- Local interpretability methods: Based on an individual sample or a group of samples, these methods explain the prediction behavior of the black-box model from the perspective of individual instances, i.e., for a specific input sample, they identify which features contribute most to the model's prediction. Common local interpretability methods include Local Interpretable Model-agnostic Explanations (LIME) [25], Individual Conditional Expectation (ICE) [26], and Shapley Additive Explanations (SHAP) [27]. LIME perturbs the data around a specific sample and fits a simple surrogate, such as a linear model or decision tree, to capture the local behavior of the complex model; it provides local interpretability but may not capture global model behavior. ICE plots show how changes in a feature's value affect the prediction for each individual sample, allowing the local effects of features on individual predictions to be observed directly; they offer insight into local effects but do not address interactions between features (a minimal ICE sketch follows this list). SHAP computes the marginal contribution of each feature across different feature combinations and averages these contributions, quantifying the impact of each feature on a specific prediction; it can, however, be computationally expensive and complex.
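As a companion to the descriptions above, here is a minimal ICE sketch in Python; it assumes `model` is any fitted estimator with a scikit-learn-style `predict` and `X` is a NumPy feature matrix (both names are illustrative).

```python
import numpy as np

def ice_curves(model, X, j, grid):
    """ICE: one curve per sample, showing how the prediction responds as
    feature j is swept over `grid` while that sample's other features are kept."""
    curves = np.empty((X.shape[0], len(grid)))
    for k, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, j] = v                    # force feature j to v for every sample
        curves[:, k] = model.predict(Xv)
    return curves                       # row i is the ICE curve of sample i

# The PDP discussed above is simply the average of the ICE curves:
#   pdp = ice_curves(model, X, j, grid).mean(axis=0)
```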
2.4. Limitations of the Existing Evaluation Method
- The evaluation results are often not comprehensive enough for certain targets;
- The results obtained from different methods within the same evaluation framework can vary significantly;
- The impact of individual indicators on the evaluation outcomes is not adequately explained.
3. Proposed Method
3.1. Stability of Data Distribution Characteristics
3.2. Data Feature Stability-Based Optimization Algorithm
3.3. Algorithmic Complexity Analysis
4. Experiments
4.1. Intelligent System Architecture
- (1) Motion planning decision time: the average time taken by a field exploration robot to initiate task execution after receiving a mission during multiple experiments;
- (2) Task decision time: the average time taken from the commencement to the conclusion of the decision-making process in a field exploration robot across multiple experiments;
- (3) Task decision accuracy: the ratio of successful decision-making instances to the total number of decisions made by a field exploration robot in accomplishing specified tasks during multiple experiments;
- (4) Environmental complexity: the degree of complexity of the environment in which a field exploration robot formulates decision plans. A minimal sketch showing how these four indicators can be computed from trial logs follows this list.
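The sketch below computes the four indicators from per-trial logs; the log schema is hypothetical, invented for illustration rather than taken from the paper's data pipeline.

```python
import numpy as np

# Hypothetical per-trial logs for the field exploration robot; the field names
# are illustrative, not the paper's actual data schema.
trials = [
    {"plan_latency_s": 58.6, "decision_time_s": 146.0, "success": True,  "env_complexity": 0.84},
    {"plan_latency_s": 79.4, "decision_time_s": 193.0, "success": True,  "env_complexity": 0.80},
    {"plan_latency_s": 64.7, "decision_time_s": 179.0, "success": False, "env_complexity": 0.89},
]

motion_planning_decision_time = np.mean([t["plan_latency_s"] for t in trials])   # indicator (1)
task_decision_time = np.mean([t["decision_time_s"] for t in trials])             # indicator (2)
task_decision_accuracy = np.mean([t["success"] for t in trials])                 # indicator (3): successes / total
mean_env_complexity = np.mean([t["env_complexity"] for t in trials])             # indicator (4)
```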
4.2. Performance of the Optimizing Evaluation System
4.3. Evaluation System Interpretability Verification
- (1) PDP;
- (2) ALE;
- (3) LIME;
- (4) SHAP.
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Han, Y.; Fang, D.; Li, Y.; Zhang, H. Efficiency Evaluation of Intelligent Swarm Based on AHP Entropy Weight Method. In Proceedings of the Journal of Physics: Conference Series, Hulun Buir, China, 25–27 September 2020. [Google Scholar]
- Alharasees, O.; Abdalla, M.S.; Kale, U. Evaluating AI-UAV Systems: A Combined Approach with Operator Group Comparison. In Proceedings of the 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Istanbul, Turkey, 8–10 June 2023. [Google Scholar]
- Yan, Y.; Pei, W.; Sun, W.; Ye, J. Research on Maintenance Quality Evaluation Method for Unmanned Aerial Vehicle. In Proceedings of the 2019 IEEE 3rd Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China, 11–13 October 2019. [Google Scholar]
- Wang, X.; Zhang, Y.; Wang, L.; Lu, D.; Zeng, G. Robustness Evaluation Method for Unmanned Aerial Vehicle Swarms Based on Complex Network Theory. Chin. J. Aeronaut. 2020, 33, 352–364. [Google Scholar]
- Zhu, Z.; Wang, J.; Zhu, Y.; Chen, Q.; Liang, X. Systematic Evaluation and Optimization of Unmanned Aerial Vehicle Tilt Photogrammetry Based on Analytic Hierarchy Process. Appl. Sci. 2022, 12, 7665. [Google Scholar] [CrossRef]
- Chen, G.; Zhang, W. Comprehensive Evaluation Method for Performance of Unmanned Robot Applied to Automotive Test Using Fuzzy Logic and Evidence Theory and FNN. Comput. Ind. 2018, 98, 48–55. [Google Scholar] [CrossRef]
- Sun, Y.; Yang, H.; Meng, F. Research on an Intelligent Behavior Evaluation System for Unmanned Ground Vehicles. Energies 2018, 11, 1764. [Google Scholar] [CrossRef]
- Dong, S.; Yu, F.; Wang, K. Safety Evaluation of Rail Transit Vehicle System Based on Improved AHP-GA. PLoS ONE 2022, 17, e0273418. [Google Scholar] [CrossRef] [PubMed]
- Fei, C.-W.; Li, H.; Liu, H.-T.; Lu, C.; An, L.-Q.; Han, L.; Zhao, Y.-J. Enhanced Network Learning Model with Intelligent Operator for the Motion Reliability Evaluation of Flexible Mechanism. Aerosp. Sci. Technol. 2020, 107, 106342. [Google Scholar] [CrossRef]
- Sheh, R. Evaluating Machine Learning Performance for Safe, Intelligent Robots. In Proceedings of the 2021 IEEE International Conference on Intelligence and Safety for Robotics (ISR), Virtual, Nagoya, Japan, 4–6 March 2021. [Google Scholar]
- Leitzke, P.M.; Wehrmeister, M.A. Real-Time Performance Evaluation for Robotics. J. Intell. Robot. Syst. 2021, 101, 37. [Google Scholar]
- Jia, L.; Chen, S.; Feng, L. An optimization evaluation approach to enhance the reliability of intelligent system. In Proceedings of the 2023 10th International Forum on Electrical Engineering and Automation, IFEEA 2023, Nanjing, China, 3–5 November 2023; pp. 1208–1211. [Google Scholar]
- Linstone, H.A. The Delphi Technique. In Environmental Impact Assessment, Technology Assessment, and Risk Analysis; Covello, V.T., Mumpower, J.L., Stallen, P.J.M., Uppuluri, V.R.R., Eds.; Springer: Berlin/Heidelberg, Germany, 1985; pp. 621–649. [Google Scholar]
- Nguyen, G. The Analytic Hierarchy Process: A Mathematical Model for Decision Making Problems. Bachelor’s Thesis, The College of Wooster, Wooster, OH, USA, 2014. [Google Scholar]
- Leenders, R.T.A.J. Modeling Social Influence through Network Autocorrelation: Constructing the Weight Matrix. Soc. Netw. 2002, 24, 21–47. [Google Scholar]
- Groth, D.; Hartmann, S.; Klie, S.; Selbig, J. Principal Components Analysis. In Computational Toxicology; Reisfeld, B., Mayeno, A.N., Eds.; Methods in Molecular Biology; Humana Press: Totowa, NJ, USA, 2013; Volume 930, pp. 527–547. [Google Scholar]
- He, D.; Xu, J.; Chen, X. Information-Theoretic-Entropy Based Weight Aggregation Method in Multiple-Attribute Group Decision-Making. Entropy 2016, 18, 171. [Google Scholar] [CrossRef]
- Diakoulaki, D.; Mavrotas, G.; Papayannakis, L. Determining Objective Weights in Multiple Criteria Problems: The Critic Method. Comput. Oper. Res. 1995, 22, 763–770. [Google Scholar] [CrossRef]
- Lawley, D.N.; Maxwell, A.E. Factor Analysis as a Statistical Method. J. R. Stat. Society. Ser. D Stat. 1962, 12, 209–229. [Google Scholar] [CrossRef]
- Wu, R.M.; Zhang, Z.; Yan, W.; Fan, J.; Gou, J.; Liu, B.; Gide, E.; Soar, J.; Shen, B.; Fazal-e-Hasan, S. A Comparative Analysis of the Principal Component Analysis and Entropy Weight Methods to Establish the Indexing Measurement. PLoS ONE 2022, 17, e0262261. [Google Scholar] [CrossRef] [PubMed]
- Ghazanfari, M.; Rouhani, S.; Jafari, M. A Fuzzy TOPSIS Model to Evaluate the Business Intelligence Competencies of Port Community Systems. Group 2014, 12, 14. [Google Scholar] [CrossRef]
- Zhao, H.; Wang, Y.; Liu, X. The Evaluation of Smart City Construction Readiness in China Using CRITIC-G1 Method and the Bonferroni Operator. IEEE Access 2021, 9, 70024–70038. [Google Scholar] [CrossRef]
- Moosbauer, J.; Herbinger, J.; Casalicchio, G.; Lindauer, M.; Bischl, B. Explaining Hyperparameter Optimization via Partial Dependence Plots. Adv. Neural Inf. Process. Syst. 2021, 34, 2280–2291. [Google Scholar]
- Okoli, C. Statistical Inference Using Machine Learning and Classical Techniques Based on Accumulated Local Effects (ALE). arXiv 2024, arXiv:2310.09877. [Google Scholar]
- Zafar, M.R.; Khan, N. Deterministic Local Interpretable Model-Agnostic Explanations for Stable Explainability. Mach. Learn. Knowl. Extr. 2021, 3, 525–541. [Google Scholar] [CrossRef]
- Goldstein, A.; Kapelner, A.; Bleich, J.; Pitkin, E. Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. J. Comput. Graph. Stat. 2015, 24, 44–65. [Google Scholar] [CrossRef]
- Mangalathu, S.; Hwang, S.-H.; Jeon, J.-S. Failure Mode and Effects Analysis of RC Members Based on Machine-Learning-Based SHapley Additive exPlanations (SHAP) Approach. Eng. Struct. 2020, 219, 110927. [Google Scholar] [CrossRef]
| Algorithm | Average Computational Time | Accuracy | Adaptability |
| --- | --- | --- | --- |
| AHP-FCE [1] | 120 s | Subjective judgment, low accuracy | Poor on large datasets |
| AHP-GA [8] | 250 s | Local optimum risk | Sensitive to initial parameters |
| PSO-WOA [12] | 180 s | Improved convergence, noise sensitive | Good across scenarios |
| CRITIC [18] | 90 s | Uneven data reduces accuracy | Limited in dynamic tasks |
| Methods | Time Complexity | Space Complexity |
| --- | --- | --- |
| Proposed method | | |
| AHP-FCE method | | |
| Two-level analytic hierarchy model | | |
| OODA loop + AHP method | | |
| AHP-GA method | | |
| EAHP method | | |
| Fuzzy logic + evidence theory + fuzzy neural networks | | |
| GRNN + MPGA | | |
| Robotstone benchmark | | |
| ML-based evaluation system | | |
| Collection Serial Number | Motion Planning Decision Time | Environment Complexity | Task Decision Time | Task Decision Accuracy |
| --- | --- | --- | --- | --- |
| 1 | 58.6 s | 84% | 146 s | 85% |
| 2 | 79.4 s | 80% | 193 s | 88% |
| … | … | … | … | … |
| 1200 | 64.7 s | 89% | 179 s | 91% |
| Input | Black-box model $f$, dataset $X$ of $n$ samples, feature of interest $x_j$ |
| --- | --- |
| Output | PDP curve |
| Step | S1: Initialization. Set $x_j$ as the feature to analyze. S2: Select feature values. Determine a range of values $v$ for $x_j$ based on the data distribution or specific intervals of interest. S3: Iterate through feature values. For each value $v$ in the determined range of $x_j$, replace $x_j$ in the dataset with $v$, keeping the other features unchanged, and compute the expected value of the model's prediction: $\hat{f}_{\mathrm{PDP}}(v) = \frac{1}{n}\sum_{i=1}^{n} f\big(v, x_{-j}^{(i)}\big)$. S4: Record results. Store each $v$ and its corresponding $\hat{f}_{\mathrm{PDP}}(v)$. S5: Plot PDP. Use the recorded data to plot the PDP curve. |
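A minimal Python implementation of steps S1–S5 above, assuming `model` is a fitted black-box estimator with a scikit-learn-style `predict`, `X` is a NumPy feature matrix, and `j` indexes the feature of interest (all names are illustrative).

```python
import numpy as np

def partial_dependence(model, X, j, grid):
    """PDP steps S3-S4: for each grid value v, force feature j to v for every
    sample, keep the other features unchanged, and average the predictions."""
    pdp = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v                          # S3: replace x_j with v
        pdp.append(model.predict(Xv).mean())  # average of f(v, x_-j^(i))
    return np.asarray(pdp)

# S2/S5 (usage): sweep the observed range of feature j and plot the curve, e.g.
#   grid = np.linspace(X[:, j].min(), X[:, j].max(), 50)
#   plt.plot(grid, partial_dependence(model, X, j, grid))
```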
| Input | Black-box model $f$, dataset $X$ of $n$ samples, feature of interest $x_j$ |
| --- | --- |
| Output | ALE plot |
| Step | S1: Partition. Divide the range of $x_j$ into $K$ intervals $[z_{k-1}, z_k]$ using quantiles of the data. S2: Compute local effects. For each interval $k$ and each sample $i$ with $x_j^{(i)} \in [z_{k-1}, z_k]$, compute the prediction difference $f\big(z_k, x_{-j}^{(i)}\big) - f\big(z_{k-1}, x_{-j}^{(i)}\big)$. S3: Average. Average the local effects within each interval. S4: Accumulate. Sum the averaged effects over intervals $1, \dots, k$ to obtain the accumulated effect at $z_k$. S5: Center and plot. Subtract the mean accumulated effect so the curve averages to zero over the data, then plot the ALE curve. |
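A corresponding first-order ALE sketch under the same assumptions (`model` with a scikit-learn-style `predict`, NumPy matrix `X`, feature index `j`); the quantile binning and centering follow the steps above.

```python
import numpy as np

def ale_curve(model, X, j, n_bins=20):
    """First-order ALE (steps S1-S5): average local prediction differences
    within quantile bins of feature j, then accumulate and center."""
    edges = np.quantile(X[:, j], np.linspace(0.0, 1.0, n_bins + 1))  # S1
    effects = np.zeros(n_bins)
    for k in range(n_bins):
        lo, hi = edges[k], edges[k + 1]
        if k == n_bins - 1:
            mask = (X[:, j] >= lo) & (X[:, j] <= hi)  # include right edge in last bin
        else:
            mask = (X[:, j] >= lo) & (X[:, j] < hi)
        if not mask.any():
            continue
        X_lo, X_hi = X[mask].copy(), X[mask].copy()
        X_lo[:, j], X_hi[:, j] = lo, hi
        # S2/S3: mean prediction change as x_j crosses the interval.
        effects[k] = (model.predict(X_hi) - model.predict(X_lo)).mean()
    ale = np.cumsum(effects)               # S4: accumulate
    return edges[1:], ale - ale.mean()     # S5: center so effects average to zero
```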
| Input | Black-box model $f$, instance $x_0$ to explain |
| --- | --- |
| Output | Local explanation plot |
| Step | S1: Select instance. Choose a specific instance $x_0$ from the dataset whose model prediction is to be explained. S2: Generate perturbations. Perturb $x_0$ to create a dataset of similar instances (perturbed samples) by sampling around $x_0$ and introducing small changes in the feature values. S3: Model prediction. Obtain predictions from the black-box model for the perturbed samples. S4: Fit interpretable model. Fit an interpretable model (e.g., linear regression, decision tree) locally around $x_0$ to approximate the black-box model's predictions on the perturbed samples, weighting samples by their proximity to $x_0$. S5: Interpretation. Interpret the coefficients or feature importances of the interpretable model to understand which features and values contribute most to the prediction for $x_0$. |
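A minimal LIME-style sketch of steps S2–S5, using Gaussian perturbations and a distance-weighted ridge surrogate; the kernel width and sample count are illustrative defaults, not values from the paper or the reference LIME package.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(model, x0, n_samples=500, scale=0.1):
    """Steps S2-S5: perturb around instance x0, query the black-box model, and
    fit a distance-weighted linear surrogate whose coefficients explain f near x0."""
    rng = np.random.default_rng(0)
    Z = x0 + rng.normal(0.0, scale, size=(n_samples, x0.size))  # S2: perturbed samples
    y = model.predict(Z)                                        # S3: black-box predictions
    # Exponential kernel: closer perturbations get more weight in the fit.
    w = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2 / (2 * scale ** 2))
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=w)     # S4: local surrogate
    return surrogate.coef_                                      # S5: local contributions
```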
| Input | Black-box model $f$, instance $x$ to explain, background dataset $D$ |
| --- | --- |
| Output | SHAP values |
| Step | S1: Initialization. Compute the model prediction $f(x)$. S2: Calculate baseline. Compute the average prediction over the background dataset: $\phi_0 = \frac{1}{\lvert D \rvert}\sum_{z \in D} f(z)$, where $\lvert D \rvert$ is the number of instances in $D$. S3: Iterate through features. For each feature $i$, form coalitions $S$ of the remaining features by conditioning on feature $i$ being ① present ($S \cup \{i\}$) or ② absent ($S$). S4: Calculate contributions (SHAP values). For each feature $i$: ① compute the difference in predictions due to the presence of $i$, $f_x(S \cup \{i\}) - f_x(S)$; ② weight the contribution by the Shapley kernel: $\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{\lvert S \rvert!\,(\lvert F \rvert - \lvert S \rvert - 1)!}{\lvert F \rvert!}\big[f_x(S \cup \{i\}) - f_x(S)\big]$. S5: Aggregate SHAP values. The contributions satisfy $f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i$, where $M$ is the number of features. S6: Output. Return the SHAP value $\phi_i$ of each feature $i$, indicating its impact on the prediction for $x$. |
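Exact Shapley values are exponential in the number of features, so the following sketch uses the common Monte Carlo permutation estimate of steps S3–S4; `model`, `x`, and `background` are assumed NumPy-compatible and illustratively named.

```python
import numpy as np

def shap_value_mc(model, x, background, j, n_iter=200, seed=0):
    """Monte Carlo Shapley estimate for feature j of instance x (steps S3-S4):
    average the prediction change from adding feature j to a random coalition,
    with features outside the coalition filled in from a background sample."""
    rng = np.random.default_rng(seed)
    d = x.size
    phi = 0.0
    for _ in range(n_iter):
        z = background[rng.integers(len(background))]  # random background instance
        perm = rng.permutation(d)                      # random feature ordering
        pos = int(np.where(perm == j)[0][0])
        with_j, without_j = z.copy(), z.copy()
        with_j[perm[:pos + 1]] = x[perm[:pos + 1]]     # coalition including j
        without_j[perm[:pos]] = x[perm[:pos]]          # same coalition without j
        diff = model.predict(with_j[None, :]) - model.predict(without_j[None, :])
        phi += float(diff[0])
    return phi / n_iter
```

In practice, the `shap` package provides optimized exact and sampling-based explainers (e.g., TreeExplainer for tree ensembles); the sketch above only mirrors the coalition logic of steps S3–S4.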