*5.3. Model Establishment Based on the XGBoost Algorithm*

Huawei MLS integrates multiple algorithm nodes. Depending on the research task, nodes can be combined by dragging and connecting them to create a visual workflow for data processing, model training, evaluation, and prediction. MLS also integrates Jupyter Notebook, which gives users an interactive development environment for machine learning applications. This environment supports Python scripts and performs data analysis and model building using MLlib, Spark's native machine learning library. Based on such a workflow, a hydraulic valve fault diagnosis model combining PCA and the XGBoost algorithm is established in MLS. The specific process is shown in Figure 5.

**Figure 5.** The model combining principal component analysis (PCA) and eXtreme Gradient Boosting (XGBoost).
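
For readers without access to MLS, the general shape of a PCA followed by XGBoost workflow can be sketched in a Python notebook. The sketch below is illustrative only and does not reproduce the MLS workflow nodes: it assumes scikit-learn and, optionally, the `xgboost` package rather than Spark MLlib, it uses synthetic data in place of the hydraulic valve measurements, and the 95% retained-variance threshold, the four classes, and all hyperparameter values are assumptions chosen for demonstration. If `xgboost` is not installed, scikit-learn's `GradientBoostingClassifier` is substituted as a stand-in gradient boosting model.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

try:
    # Preferred: the XGBoost scikit-learn wrapper.
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Stand-in (assumption): a different gradient boosting implementation.
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Synthetic placeholder data; NOT the hydraulic valve dataset.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=8,
    n_classes=4,  # hypothetical number of fault states
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Workflow: standardize -> PCA (dimensionality reduction) -> boosted trees.
model = Pipeline(
    steps=[
        ("scale", StandardScaler()),
        # Keep enough components to explain 95% of the variance (assumed value).
        ("pca", PCA(n_components=0.95)),
        # Illustrative hyperparameters, not tuned values from the study.
        ("boost", Booster(n_estimators=100, max_depth=4,
                          learning_rate=0.1, random_state=0)),
    ]
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = model.score(X_test, y_test)

print("retained components:", model.named_steps["pca"].n_components_)
print("held-out accuracy:", round(accuracy, 3))
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, so the held-out split does not influence the projection that the classifier sees.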
