Article

Staging of Liver Fibrosis Based on Energy Valley Optimization Multiple Stacking (EVO-MS) Model

by Xuejun Zhang 1,2, Shengxiang Chen 1,*, Pengfei Zhang 1, Chun Wang 1, Qibo Wang 1 and Xiangrong Zhou 3

1 School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
2 Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, China
3 Department of Electrical, Electronic and Computer Engineering, Gifu University, Gifu 501-1193, Japan
* Author to whom correspondence should be addressed.
Bioengineering 2024, 11(5), 485; https://doi.org/10.3390/bioengineering11050485
Submission received: 23 April 2024 / Revised: 9 May 2024 / Accepted: 10 May 2024 / Published: 13 May 2024

Abstract:
Currently, staging the degree of liver fibrosis predominantly relies on liver biopsy, a method fraught with potential risks, such as bleeding and infection. With the rapid development of medical imaging devices, quantification of liver fibrosis through image processing technology has become feasible. Stacking is one of the effective ensemble techniques for this purpose, but manually tuning it to find the optimal configuration is challenging. Therefore, this paper proposes a novel EVO-MS model—a multiple stacking ensemble learning model optimized by the energy valley optimization (EVO) algorithm—to select the most informative features for fibrosis quantification. Liver contours are profiled from 415 biopsy-proven CT cases, from which 10 shape features are calculated and input into a Support Vector Machine (SVM) classifier to generate accurate predictions; the EVO algorithm is then applied to find the optimal parameter combination for fusing six base models—K-Nearest Neighbors (KNN), Decision Tree (DT), Naive Bayes (NB), Extreme Gradient Boosting (XGB), Gradient Boosting Decision Tree (GBDT), and Random Forest (RF)—to create a well-performing ensemble model. Experimental results indicate that selecting 3–5 feature parameters yields satisfactory classification results, with features such as the contour roundness non-uniformity (Rmax), the maximum peak height of the contour (Rp), and the maximum valley depth of the contour (Rmin) significantly influencing classification accuracy. The improved EVO algorithm, combined with a multiple stacking model, achieves an accuracy of 0.864, a precision of 0.813, a sensitivity of 0.912, a specificity of 0.824, and an F1-score of 0.860, which demonstrates the effectiveness of our EVO-MS model in staging the degree of liver fibrosis.

1. Introduction

Liver fibrosis is a common hepatic disease characterized by the abnormal proliferation and deposition of collagen fibers and other extracellular matrix components within the liver, resulting from chronic liver injury [1,2]. This pathological repair response is a critical step in the progression of various chronic liver diseases towards cirrhosis. The process is associated not only with chronic viral hepatitis, such as hepatitis B and C, but also with the incidence of fibrosis due to non-alcoholic fatty liver disease (NAFLD) and autoimmune liver diseases, which have also been increasing in recent years. Early diagnosis and accurate staging of liver fibrosis are of significant importance for treatment and prognosis. Traditional diagnostic methods for liver fibrosis primarily rely on liver tissue examination, namely, liver biopsy. Liver biopsy is considered the gold standard for diagnosing liver fibrosis [3,4,5]; however, its invasive nature, high cost, and associated risks limit its widespread clinical application. Therefore, there is an urgent need for a non-invasive and convenient method for diagnosing liver fibrosis. As fibrosis progresses, the liver surface becomes increasingly irregular, forming nodules and rough edges, leading to increased roughness of the liver’s margin. By calculating the roughness characteristics of the liver’s edge, one can assess the complexity and heterogeneity of the liver surface, which correlates positively with the degree of fibrosis [6].
In recent years, with the advancement of medical imaging technologies [7,8,9,10], such as ultrasound, CT, and MRI, it has become possible to obtain information on the morphology, structure, and function of the liver [11]. Feature extraction, classification, and analysis of medical images can achieve qualitative or quantitative assessment of liver fibrosis, offering possibilities for non-invasive or minimally invasive grading of fibrosis [12]. Features in medical images can be broadly categorized into texture features and shape features. Texture features describe the attributes of gray-scale variations and spatial distribution in an image [13], while shape features are quantitative indicators used to describe the morphology of an object [14]. The performance of individual learners often fails to meet requirements; ensemble learning can combine multiple weak learners into a strong learner [15]. Boosting, bagging, and stacking are classic ensemble learning algorithms. Boosting sequentially builds a series of classifiers, adjusting sample weights each round and focusing on incorrectly classified samples to generate multiple prediction functions [16]. Bagging constructs multiple independent learners in parallel and combines their prediction results in the end [17]. Stacking combines the prediction results of multiple base-learning algorithms through a meta-learning algorithm [18]. Stacking ensemble techniques are widely applied; for instance, a stacking ensemble learning framework (SELF) was constructed by Liang M et al. [19] by integrating three machine learning methods, achieving high accuracy in prediction tasks. Cui S et al. proposed a stacking ensemble learning model based on an improved swarm intelligence optimization algorithm, validating its effectiveness on a Chinese earthquake dataset covering 1996–2017 [20]. Mota L F M et al. combined stacking ensemble learning with real-time milk analysis to predict cheese production characteristics [21]. Zhang H et al. introduced a multi-dimensional feature fusion and stacking ensemble mechanism (MFFSEM) that effectively detects abnormal network traffic behaviors, achieving commendable results on two intrusion detection evaluation datasets (UNSW-NB15 and CIC-IDS-2017) [22]. Rashid M et al. introduced a tree-based stacking ensemble technique (SET) that, by further enhancing feature selection, identified normal and anomalous network traffic better than other existing IDS models [23]. Kardani N et al. used the Artificial Bee Colony (ABC) optimization algorithm to find the best combination of base classifiers and to determine the most suitable meta-classifier from 11 machine learning algorithms; their experiments showed that the improved stacking model significantly enhanced the predictive ability for slope stability [24]. By applying meta-heuristic algorithms, near-optimal solutions for model optimization can be found in a short time. The EVO algorithm has a strong global search capability, allowing it to find global optima in complex optimization problems more effectively, and it tends to have higher search efficiency and better convergence performance than traditional optimization algorithms. Accordingly, this study proposes an EVO-MS model optimized by the energy valley optimization (EVO) algorithm [25,26,27,28].
The authors adapted micro-unevenness indicators from industrial applications to detect the shape characteristics of the liver’s edge, selecting materials with significant deformability, such as silicone models, to stand in for the human liver in preliminary tests. Using an SVM model [29] to analyze liver CT images, the study identified the feature parameters with a significant impact on the classification experiments and trained the EVO-MS ensemble model with these parameters. This research aims to explore the effectiveness and applicability of the EVO-MS-based liver fibrosis grading method, providing a new tool for the diagnosis and monitoring of clinical liver fibrosis.

2. Materials and Methods

2.1. Dataset

All liver CT images in this study were obtained from the Radiology Department of the First Affiliated Hospital of Guangxi Medical University between June 2009 and March 2011 [6,30,31]. The 415 cases comprise patients diagnosed via liver puncture biopsy as well as individuals without a history of liver-related diseases who did not undergo biopsy. The grading of liver fibrosis was based on the chronic hepatitis fibrosis staging standards revised in 2000 by the Infectious Diseases and Parasitology Branch and the Hepatology Branch of the Chinese Medical Association. The cases were divided into the normal group (S0), the mild fibrosis groups (S1 and S2), the severe fibrosis groups (S3 and S4), and the cirrhosis group (CIR), comprising 70, 69, 69, 69, 69, and 69 cases, respectively. The sample set included 39 males and 31 females in the normal group, with an average age of 38.60 years; 118 males and 20 females in the mild fibrosis group, with an average age of 37.25 years; 90 males and 48 females in the severe fibrosis group, with an average age of 38.6 years; and 53 males and 16 females in the cirrhosis group, with an average age of 47.5 years. Each image was verified by experienced radiologists to ensure the accuracy of the grading labels. A CT image of the liver is shown in Figure 1.
In practical medical applications, CT scans typically involve the injection of a contrast agent into the patient. The contrast agent, spreading with the blood flow into various tissues and organs, enhances the sensitivity of the tissues to X-rays during scanning. This allows for clearer X-ray signals and better reconstruction of internal body images. The acquired scan images can be categorized according to the timing of the contrast agent injection, as per Table 1. Each scanning phase yields a complete set of full liver cross-sectional images. The CT scanner used was the 64-slice multi-layer spiral CT machine (GE Lightspeed VCT) produced by GE, USA, with an exposure voltage of 120 kV, a tube current of 250 mA, and an image pixel matrix of 512 × 512. The contrast agent used was iohexol injection fluid, administered through an antecubital vein using a high-pressure injector, with a dosage of 85–90 mL, a concentration of 320 mg/mL, and an injection rate of 3.0 mL/s.

2.2. Microscopic Roughness

The hepatic surface profile is outlined with a red line consisting of more than 128 points, as shown in Figure 2a. An approximate curve is determined on this profile by a least-squares approach, and a one-dimensional profile function is obtained by drawing a straight line between the start and end points and rotating it parallel to the y-axis (Figure 2b). The microscopic roughness of the hepatic surface is then calculated as the shape feature.
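This baseline-alignment step can be illustrated with a short sketch. The Python code below is a minimal example, assuming the contour is available as an ordered array of (x, y) points; the function name and the specific alignment approach (rotating the contour so that the chord between its endpoints becomes the horizontal baseline) are illustrative assumptions rather than the authors’ exact implementation.

```python
import numpy as np

def profile_deviation_function(points):
    """Rotate an ordered contour so that the chord between its start and end
    points becomes the baseline, and return the deviation profile Z(x).

    points: (N, 2) array of ordered (x, y) contour coordinates.
    Returns (x_rot, z): positions along the baseline and deviations from it.
    """
    points = np.asarray(points, dtype=float)
    start, end = points[0], points[-1]

    # Angle of the chord joining the start and end points.
    theta = np.arctan2(end[1] - start[1], end[0] - start[0])

    # Rotate all points by -theta about the start point so the chord is horizontal.
    c, s = np.cos(-theta), np.sin(-theta)
    rot = np.array([[c, -s], [s, c]])
    aligned = (points - start) @ rot.T

    x_rot = aligned[:, 0]  # position along the baseline (sampling direction)
    z = aligned[:, 1]      # deviation of the profile from the baseline
    return x_rot, z

# Example with a synthetic wavy contour standing in for a liver edge segment.
t = np.linspace(0.0, 1.0, 128)
contour = np.column_stack([t * 50.0, 2.0 * np.sin(8 * np.pi * t) + 5.0 * t])
x_prof, z_prof = profile_deviation_function(contour)
```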
Micro-unevenness is a set of quantitative indicators used in mechanical engineering to describe the characteristics of surface morphology. The authors selected a total of ten such parameters as characteristic parameters, defined in terms of $l$, the sampling length, and $Z(x)$, the profile deviation function.
The average arithmetic deviation of the profile, $R_a$, is the arithmetic mean of the absolute distances between the points on the profile line and the baseline, measured along the profile within a sampling length. A smaller $R_a$ means a smoother surface. The calculation formula for $R_a$ is as follows:

$$R_a = \frac{1}{l} \int_0^l \left| Z(x) \right| \, dx$$
The root mean square deviation of the profile, $R_q$, is the square root of the arithmetic mean of the squared distances between the points on the profile line and the baseline within a sampling length. A smaller $R_q$ value means a smoother surface. The calculation formula for $R_q$ is as follows:

$$R_q = \sqrt{\frac{1}{l} \int_0^l Z^2(x) \, dx}$$
The maximum height of profile micro-unevenness, $R_{max}$, is the vertical distance between the highest and lowest points on the profile line within a sampling length. A smaller $R_{max}$ value suggests a smoother surface. The calculation formula for $R_{max}$ is as follows:

$$R_{max} = \max_{0 \le x \le l} Z(x) - \min_{0 \le x \le l} Z(x)$$
The maximum valley depth of the profile, $R_{min}$, is the vertical distance from the lowest point on the profile line to the baseline within a sampling length. The calculation formula for $R_{min}$ is as follows:

$$R_{min} = \left| \min_{0 \le x \le l} Z(x) \right|$$
The maximum peak height of the profile, $R_p$, is the height of the highest peak relative to the mean line within a sampling length. The calculation formula for $R_p$ is as follows:

$$R_p = \max_{0 \le x \le l} Z(x)$$
The average spacing of micro-unevenness of the profile, $S_m$, is the average spacing of the micro-unevenness within a sampling length; the spacing of micro-unevenness refers to the segment length on the mean line between a profile peak and its adjacent valley. Here, $n$ represents the number of profile elements, and $s_i$ denotes the width of the $i$th profile element, where a profile element is defined as the segment of the profile line between a peak and its adjacent valley. The calculation formula for $S_m$ is as follows:

$$S_m = \frac{1}{n} \sum_{i=1}^{n} s_i$$
The average spacing of single peaks of the profile, $S$, is the average distance between individual peaks within a sampling length. Here, $x_i$ and $x_{i+1}$ represent the positions of adjacent peaks, and $n$ is the total number of peaks. The calculation formula for $S$ is as follows:

$$S = \frac{1}{n-1} \sum_{i=1}^{n-1} \left( x_{i+1} - x_i \right)$$
The average height of micro-unevenness of the profile, $R_z$, is the sum of the average height of the five highest peaks and the average depth of the five deepest valleys within a sampling length. Here, $y_{pi}$ represents the height of the $i$th highest peak, and $y_{vi}$ denotes the depth of the $i$th deepest valley. The calculation formula for $R_z$ is as follows:

$$R_z = \frac{1}{5} \sum_{i=1}^{5} \left( y_{pi} + y_{vi} \right)$$
The density of profile peaks, $D$, is the ratio of the number of profile peaks to the sampling length. Here, $n$ represents the number of profile peaks contained within the sampling length, and $l$ is the sampling length. The calculation formula for $D$ is as follows:

$$D = \frac{n}{l}$$
The profile bearing length ratio, $t_p$, is the ratio of the bearing length to the sampling length. Given a horizontal intercept, a line parallel to the mean line is drawn at the intercept depth below the peaks; the total length of the intersections of the profile with this line, $l_1 + l_2 + l_3 + \cdots + l_n$, is called the bearing length. The calculation formula for $t_p$ is as follows:

$$t_p = \frac{l_1 + l_2 + l_3 + \cdots + l_n}{l}$$
Examples of $S_m$ and $t_p$ are illustrated in Figure 3.
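To make the definitions above concrete, the snippet below computes several of the listed parameters from a sampled deviation profile. It is a minimal sketch assuming the profile $Z(x)$ is given as uniformly sampled arrays; the peak/valley detection and the handling of profile elements are simplified, so this is not the authors’ measurement code.

```python
import numpy as np

def roughness_parameters(x, z):
    """Compute simple micro-unevenness indicators from a sampled profile z(x).

    x: 1-D array of positions within one sampling length (ascending).
    z: 1-D array of profile deviations from the baseline at those positions.
    """
    l = x[-1] - x[0]                        # sampling length
    ra = np.trapz(np.abs(z), x) / l         # arithmetic mean deviation R_a
    rq = np.sqrt(np.trapz(z ** 2, x) / l)   # root mean square deviation R_q
    rp = z.max()                            # maximum peak height R_p
    rmin = abs(z.min())                     # maximum valley depth R_min
    rmax = z.max() - z.min()                # peak-to-valley height R_max

    # Crude peak/valley detection: local extrema above/below the mean line.
    peaks = np.where((z[1:-1] > z[:-2]) & (z[1:-1] > z[2:]) & (z[1:-1] > 0))[0] + 1
    valleys = np.where((z[1:-1] < z[:-2]) & (z[1:-1] < z[2:]) & (z[1:-1] < 0))[0] + 1

    # Average height of micro-unevenness R_z from the 5 highest peaks / deepest valleys.
    top_peaks = np.sort(z[peaks])[-5:] if peaks.size >= 5 else z[peaks]
    deep_valleys = np.sort(np.abs(z[valleys]))[-5:] if valleys.size >= 5 else np.abs(z[valleys])
    rz = (top_peaks.sum() + deep_valleys.sum()) / 5.0

    s = np.mean(np.diff(x[peaks])) if peaks.size > 1 else 0.0  # mean peak spacing S
    d = peaks.size / l                                         # peak density D
    return dict(Ra=ra, Rq=rq, Rp=rp, Rmin=rmin, Rmax=rmax, Rz=rz, S=s, D=d)

# Example on a synthetic profile.
x = np.linspace(0.0, 50.0, 512)
z = 1.5 * np.sin(0.8 * x) + 0.3 * np.sin(3.1 * x)
print(roughness_parameters(x, z))
```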

2.3. Overview of the Proposed Method

Because liver imaging resources were scarce, the authors employed a silicone mold with a favorable degree of deformability as a surrogate for the liver in simulation experiments. Beneath the silicone mold, two sets of holes, totaling six, were uniformly spaced to suspend weights, thereby simulating the edge roughness associated with varying degrees of liver fibrosis. Simulation experiments were conducted to verify the correlation between the micro-unevenness parameters and the degree of deformation. The silicone mold and the edges of the silicone are depicted in Figure 4a,b, respectively.
The liver CT experiment comprises three stages: the data extraction module, the feature optimization module, and the EVO-MS classification module. The specific flowchart is illustrated in Figure 5.
(a) The data extraction module involves the extraction of representative edge curves from the lower segment of the left hepatic lobe to the lower segment of the left hepatic outer lobe on the liver contour map after positioning, rotating, and fitting the edge curve. Based on this edge curve, ten characteristic parameters are extracted. Following min–max normalization of the data, the samples are input into an SVM classifier. The leave-one-out method is employed to maximize the input samples, and an exhaustive search method is used to select different combinations of all feature parameters.
(b) The feature optimization module is divided into two parts: optimization of the number of feature parameters, and optimization of the weights of the feature parameters. Optimizing the number of feature parameters involves calculating the highest accuracy rate achieved with each number of feature parameters; this highest accuracy rate is denoted as $P_k$, where $k$ is the number of feature parameters selected, so that, for example, $P_2$ is the highest accuracy rate obtained from the $C_{10}^{2}$ combinations when classifying with two feature values. The weight of a feature parameter refers to the degree of its influence on the accuracy of the SVM classifier: a greater weight indicates a larger impact on accuracy, while a lesser weight suggests a smaller influence on the classification outcome. The frequency of occurrence of the ten feature parameters across the classification combinations is counted (a code sketch of this exhaustive search is given at the end of this subsection).
$$p_k = \frac{1}{N} \sum_{i=1}^{N} n_i(k), \quad k = 1, 2, \ldots, 10$$
$$N = C_{10}^{1} + C_{10}^{2} + C_{10}^{3} + C_{10}^{4} + \cdots + C_{10}^{10} = 2^{10} - 1$$
where $k$ indexes the ten feature parameters and $p_k$ represents the weight of the $k$th feature parameter. The process iteratively traverses every possible combination of feature parameters to define the classification spaces using an exhaustive traversal method; the total number of combinations is denoted as $N$. In this experiment, $n_i(k)$ indicates whether feature $k$ appears in the $i$th classification space: if it appears, $n_i(k) = 1$; otherwise, $n_i(k) = 0$. Min–max normalization is then employed, where $p_{k,max}$ is the maximum value of the weight data and $p_{k,min}$ is the minimum value, and the transformation function is as follows:
$$W_k = \frac{p_k - p_{k,min}}{p_{k,max} - p_{k,min}}$$
(c) After optimizing the number and weights of the feature parameters, the parameters with greater weights were input into the EVO-MS model for training. Initially, the six selected base classifiers—KNN, DT, NB, XGBoost, GBDT, and RF—were used to predict the samples and determine the predicted class probabilities, which are denoted as a matrix $P$:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1k} \\ p_{21} & p_{22} & \cdots & p_{2k} \\ \vdots & \vdots & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nk} \end{pmatrix}$$
where $n$ denotes the number of base models, $k$ represents the number of splits, and $p_{ij}$ signifies the predicted probability of the $i$th base model for the $j$th split. Subsequently, the probability values output by the base models are fed as input data into a mixture layer composed of $m$ mixing units. The predictive probabilities from the mixture layer are then input into the meta-model layer. A logistic regression method is employed to synthesize the predictions of the various mixed classifiers, thereby yielding a more accurate final forecast. The hyperparameter combinations of each base model and the weights of the individual units in the mixture layer are optimized using the energy valley optimization algorithm, with the optimization process utilizing cross-entropy loss as the objective function.
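The exhaustive feature-combination search described in part (b) can be sketched as follows. This is a minimal illustration assuming a feature matrix X (415 samples × 10 shape features) and stage labels y are already available; the scikit-learn classes used here are standard, but the SVM settings and the way occurrences are counted (only in the best combination of each size, one plausible reading of $n_i(k)$) are assumptions, not the authors’ code.

```python
from itertools import combinations

import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def exhaustive_feature_search(X, y):
    """Evaluate every non-empty subset of the shape features with an SVM under
    leave-one-out cross-validation, then derive occurrence-based feature weights."""
    n_features = X.shape[1]
    loo = LeaveOneOut()
    results = []  # (accuracy, subset) for all 2**10 - 1 subsets

    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            model = make_pipeline(MinMaxScaler(), SVC(kernel="rbf"))
            acc = cross_val_score(model, X[:, list(subset)], y, cv=loo).mean()
            results.append((acc, subset))

    # P_k: highest accuracy obtained with each number of features k.
    best_acc_per_size = {k: max(a for a, s in results if len(s) == k)
                         for k in range(1, n_features + 1)}

    # Occurrence-based weights: count how often each feature appears in the best
    # combination of each subset size, then min-max normalize (assumed reading).
    occurrence = np.zeros(n_features)
    for k in best_acc_per_size:
        _, best_subset = max((r for r in results if len(r[1]) == k), key=lambda r: r[0])
        occurrence[list(best_subset)] += 1
    weights = (occurrence - occurrence.min()) / (occurrence.max() - occurrence.min())
    return best_acc_per_size, weights
```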

2.4. Multiple Model

The multiple stacking model is an ensemble learning algorithm that enhances the efficiency of complex data processing by integrating the predictive capabilities of various base models. The multiple stacking architecture comprises three levels: the base models, the blending layer, and the meta-model. The base models include a diverse array of machine learning algorithms, which are independently trained on data using K-fold cross-validation to ensure the model’s generalizability. The output of the base model layer, consisting of the prediction results of each model on the data, is fed into the blending layer. In this layer, the predictions of the base models are used to train multiple ensemble models responsible for learning how to most effectively combine the predictions of the base models. The output of the blending layer is then used as the input for the meta-model, which further optimizes the prediction results to achieve higher accuracy than individual models. The advantage of the multiple stacking model lies in its ability to capture the complementary information between different models. By learning the differences in predictions of various models, it enhances the overall predictive performance and improves the model’s generalization ability on unknown data. The multiple stacking algorithm is illustrated in Figure 6, where $k$ denotes the use of $k$-fold cross-validation, and $Prediction_{i,j}$ denotes the predicted probability of the $i$th model on the $j$th split.
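As an illustration of this three-level architecture, the snippet below nests two scikit-learn StackingClassifier objects: the inner one plays the role of the blending layer over the six base models, and the outer logistic regression acts as the meta-model. This is a minimal sketch of the idea rather than the authors’ implementation; the paper’s blending layer contains several mixing units, which is simplified here to a single blender, and the hyperparameters shown are placeholders that the EVO algorithm would normally tune.

```python
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Level 1: the six base models, each trained with k-fold cross-validation inside the stack.
base_models = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("dt", DecisionTreeClassifier(max_depth=5)),
    ("nb", GaussianNB()),
    ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
    ("gbdt", GradientBoostingClassifier(n_estimators=200)),
    ("rf", RandomForestClassifier(n_estimators=200)),
]

# Level 2: a blending layer that learns how to combine the base-model probabilities.
blending_layer = StackingClassifier(
    estimators=base_models,
    final_estimator=RandomForestClassifier(n_estimators=100),
    stack_method="predict_proba",
    cv=5,
)

# Level 3: a logistic-regression meta-model on top of the blending layer's output.
evo_ms_like = StackingClassifier(
    estimators=[("blend", blending_layer)],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)

# Usage (X_train, y_train, X_test assumed to hold the normalized shape features):
# evo_ms_like.fit(X_train, y_train)
# stage_probabilities = evo_ms_like.predict_proba(X_test)
```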

2.5. Energy Valley Optimization

The energy valley optimizer (EVO) is a metaheuristic algorithm grounded in physical principles, inspired by the stability and decay processes of particles. In the cosmos, the majority of particles are considered unstable, with only a select few capable of maintaining permanence. Unstable particles release energy through decay, with the decay rate varying slightly among different particle types. The energy valley focuses on particle stability, determined by the binding energy of particles and their interactions with others. Depending on the stability level of the particles, each tends to increase its stability level by adjusting the ratio of neutrons to protons and moving towards the stability band or the bottom of the energy valley. During the decay process, a particle with a lower energy level is produced, while excess energy is emitted. The decay processes of particles with different stability levels yield three types of emissions, corresponding to three position update processes. Two of these processes occur within the decision variables, executing the exploration process, while one position update process occurs within the candidate solutions, satisfying exploitation. These principles provide the foundation for the EVO algorithm, enabling it to optimize the performance of solutions by simulating the stability and decay processes of particles.
The initial step of the EVO algorithm is initialization, where particles (candidate solutions) $X_i$ within the search space are established, representing various levels of stability. Assuming the search space is a bounded region, a random initialization is conducted:
$$X_i = \begin{pmatrix} x_1^1 & \cdots & x_1^j & \cdots & x_1^d \\ \vdots & & \vdots & & \vdots \\ x_i^1 & \cdots & x_i^j & \cdots & x_i^d \\ \vdots & & \vdots & & \vdots \\ x_n^1 & \cdots & x_n^j & \cdots & x_n^d \end{pmatrix}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, d$$
$$x_i^j = x_{i,min}^j + rand \times \left( x_{i,max}^j - x_{i,min}^j \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, d$$
where $n$ represents the total number of particles, $d$ denotes the dimensionality of the problem under consideration, $x_i^j$ signifies the $j$th decision variable of the initial position of the $i$th particle, while $x_{i,min}^j$ and $x_{i,max}^j$, respectively, represent the lower and upper bounds of the $j$th decision variable of the $i$th particle; $rand$ is a random number uniformly distributed in the interval $[0, 1]$.
The second step of the energy valley algorithm involves determining the enrichment boundary ($EB$) for the particles. Each particle is assessed through the objective function, establishing its neutron enrichment level ($NEL_i$), which is utilized to distinguish between neutron-poor and neutron-rich particles.
$$EB = \frac{\sum_{i=1}^{n} NEL_i}{n}, \quad i = 1, 2, \ldots, n$$
where $NEL_i$ represents the neutron enrichment level of the $i$th particle, while $EB$ denotes the enrichment boundary for the particles in the universe.
The third step of the energy valley algorithm is to evaluate the stability level of the particles, based on the objective function:
$$SL_i = \frac{NEL_i - BS}{WS - BS}, \quad i = 1, 2, \ldots, n$$
where $SL_i$ denotes the stability level of the $i$th particle, while $BS$ and $WS$ represent the particles with the best and worst stability levels within the universe, respectively; their stability levels are determined by the minimum and maximum values of the objective function.
Within the main search loop of the energy valley optimization (EVO), if the neutron enrichment level of a particle exceeds the enrichment boundary ($NEL_i > EB$), it is postulated that the particle possesses a higher neutron-to-proton ratio. Depending on the stability level of the particle, three decay processes ($\alpha$, $\beta$, $\gamma$) are adopted accordingly. To simulate the stability bound ($SB$) in the cosmos, a random number within the interval $[0, 1]$ is generated. Should the stability level of the particle surpass the stability bound ($SL_i > SB$), $\alpha$ and $\gamma$ decays may occur, as these decays are pertinent for heavy particles with higher stability. In accordance with the physical principles of $\alpha$ decay, the emission of $\alpha$ rays enhances the stability of the reaction products. This process serves as one of the EVO position update mechanisms, thereby generating new candidate solutions. Specifically, two random integers, $Alpha\ Index\ I$ and $II$, are generated: $Alpha\ Index\ I$ within the interval $[1, d]$ to represent the quantity of emitted $\alpha$ rays, and $Alpha\ Index\ II$ within $[1, Alpha\ Index\ I]$ to specify the particular $\alpha$ rays to be emitted. The emitted rays, being decision variables within the candidate solution, are removed and replaced by the corresponding rays from the particle with the highest level of stability ($X_{BS}$). The pertinent mathematical formula is as follows:
$$X_i^{New1} = X_i\big( X_{BS}( x_i^j ) \big), \quad i = 1, 2, \ldots, n; \; j = \text{Alpha Index II}$$
where a new particle is generated, denoted as $X_i^{New1}$, while $X_i$ represents the current position vector of the $i$th particle (solution candidate) within the universe (search space). The position vector of the particle with the optimal stability level is denoted as $X_{BS}$, and $x_i^j$ represents the $j$th decision variable or emitted rays. Moreover, in the gamma decay process, $\gamma$ rays are emitted to elevate the stability level of the excited particles. This process can act as another position update mechanism for EVO, generating new candidate solutions in the process. For this purpose, two random integers, referred to as $Gamma\ Index\ I$ and $II$, are generated: $Gamma\ Index\ I$ within the interval $[1, d]$ to represent the number of $\gamma$ rays to be emitted, and $Gamma\ Index\ II$ within the interval $[1, Gamma\ Index\ I]$ to specify the $\gamma$ rays to be considered within the particle. The $\gamma$ rays within the particle, serving as decision variables in the candidate solution, are removed and replaced by those from adjacent particles or candidate solutions ($X_{Ng}$), emulating the interaction of the excited particles with other particles or even magnetic fields. The total distance between the considered particle and the other particles is calculated as follows:
$$D_{ik} = \sqrt{\left( x_i - x_k \right)^2 + \left( y_i - y_k \right)^2}, \quad i = 1, 2, \ldots, n; \; k = 1, 2, \ldots, n-1$$
where $D_{ik}$ represents the total distance between the $i$th particle and the $k$th adjacent particle, while $(x_i, y_i)$ and $(x_k, y_k)$ denote the coordinates of the particles in the search space. When considering the $i$th particle, its distance to the other $n-1$ particles is computed and the nearest neighboring particle is identified. Utilizing these operations, the position update process for generating the second candidate solution is as follows:
$$X_i^{New2} = X_i\big( X_{Ng}( x_i^j ) \big), \quad i = 1, 2, \ldots, n; \; j = \text{Gamma Index II}$$
where a new particle, denoted as $X_i^{New2}$, is generated, and $X_i$ represents the current position vector of the $i$th particle (solution candidate) within the cosmos (search space). Additionally, $X_{Ng}$ denotes the position vector of the neighboring particle around the $i$th particle, and $x_i^j$ represents the $j$th decision variable or emitted photons. If the stability level of the particle falls below the stability bound ($SL_i \le SB$), $\beta$ decay is presumed to have occurred, as such decay processes occur in unstable particles with lower stability. In accordance with the physical principles of $\beta$ decay, particles emit $\beta$ rays to enhance their stability level; hence, those particles with higher levels of instability should perform larger jumps within the search space. During the position update process, particles move towards the particle with the optimal stability level ($X_{BS}$) and the particle center ($X_{CP}$). This simulates the behavior of particles gravitating towards the stability band, where most known particles congregate, typically exhibiting higher stability. The relevant mathematical formulas are as follows:
$$X_{CP} = \frac{\sum_{i=1}^{n} X_i}{n}, \quad i = 1, 2, \ldots, n$$
$$X_i^{New1} = X_i + \frac{r_1 \times X_{BS} - r_2 \times X_{CP}}{SL_i}, \quad i = 1, 2, \ldots, n$$
where $X_i^{New1}$ and $X_i$, respectively, represent the future and current position vectors of the $i$th particle (solution candidate) within the universe (search space). $X_{BS}$ denotes the position vector of the particle with the optimal stability level, while $X_{CP}$ is the position vector of the particle center. $SL_i$ is the stability level of the $i$th particle, and $r_1$ and $r_2$ are two random numbers within the interval $[0, 1]$, determining the amplitude of the particle’s motion. To enhance the exploitation and exploration capability of the algorithm, an additional position update mechanism, in which the stability level is not involved, is implemented for particles undergoing $\beta$ decay; it moves the particle according to the particle with the highest stability level ($X_{BS}$) and an adjacent particle or candidate solution ($X_{Ng}$). The mathematical formula is as follows:
$$X_i^{New2} = X_i + \left( r_3 \times X_{BS} - r_4 \times X_{Ng} \right), \quad i = 1, 2, \ldots, n$$
where $X_i^{New2}$ and $X_i$ represent the future and current position vectors of the $i$th particle (solution candidate) in the universe (search space), respectively. $X_{BS}$ is the position vector of the particle with the optimal stability level, and $X_{Ng}$ is the position vector of the neighboring particle around the $i$th particle; $r_3$ and $r_4$ are two random numbers within the interval $[0, 1]$ that determine the amount of particle movement. If the neutron enrichment level of a particle is below the enrichment boundary ($NEL_i \le EB$), it is considered that the particle has a relatively small neutron-to-proton ratio, and the particle is more inclined to migrate towards the stability band through processes such as electron capture or positron emission. This random motion in the search space is characterized by the following movement:
$$X_i^{New} = X_i + r, \quad i = 1, 2, \ldots, n$$
where $X_i^{New}$ and $X_i$ represent the future and current position vectors of the $i$th particle (solution candidate) in the universe (search space), and $r$ is a random number within the interval $[0, 1]$ that determines the magnitude of the particle’s movement.
At the end of the EVO main loop, if a particle’s enrichment level is above the enrichment boundary, the particle generates two new position vectors, $X_i^{New1}$ and $X_i^{New2}$, while for particles with lower enrichment levels, only $X_i^{New}$ is generated as the new position vector. In each iteration, the newly generated vectors are merged with the current population, and the best particles participate in the next search cycle of the algorithm. Decision variables that exceed the predefined upper and lower bounds are flagged as boundary violations, and the maximum number of objective function evaluations or the maximum number of iterations is used as the termination criterion. The pseudo-code of the energy valley optimization algorithm is presented in Table 2.
The flow diagram of the energy valley optimization algorithm is shown in Figure 7.
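A compact sketch of the EVO update loop described above, written for a generic minimization problem, is given below. It follows the equations and pseudo-code in this section but simplifies details such as boundary handling and the greedy merge of new candidates, so it should be read as an illustrative implementation under those assumptions rather than a reference one.

```python
import numpy as np

def evo_minimize(objective, bounds, n_particles=30, max_iter=200, seed=0):
    """Minimal energy valley optimization (EVO) loop for minimizing `objective`.

    bounds: array-like of shape (d, 2) with lower/upper limits per decision variable.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    d = bounds.shape[0]
    lo, hi = bounds[:, 0], bounds[:, 1]

    # Random initialization of particle positions and their enrichment levels (fitness).
    X = lo + rng.random((n_particles, d)) * (hi - lo)
    nel = np.array([objective(x) for x in X])

    for _ in range(max_iter):
        eb = nel.mean()                              # enrichment boundary EB
        best, worst = nel.min(), nel.max()
        sl = (nel - best) / (worst - best + 1e-12)   # stability levels SL_i
        x_bs = X[nel.argmin()]                       # particle with the best stability
        x_cp = X.mean(axis=0)                        # particle center

        new_candidates = []
        for i in range(n_particles):
            xi = X[i].copy()
            ng = X[np.argsort(np.linalg.norm(X - xi, axis=1))[1]]  # nearest neighbor
            if nel[i] > eb:
                sb = rng.random()                    # stability bound SB
                if sl[i] > sb:                       # alpha / gamma decay (exploration)
                    idx_a = rng.choice(d, size=rng.integers(1, d + 1), replace=False)
                    xa = xi.copy()
                    xa[idx_a] = x_bs[idx_a]
                    idx_g = rng.choice(d, size=rng.integers(1, d + 1), replace=False)
                    xg = xi.copy()
                    xg[idx_g] = ng[idx_g]
                    new_candidates += [xa, xg]
                else:                                # beta decay (exploitation)
                    r1, r2, r3, r4 = rng.random(4)
                    x1 = xi + (r1 * x_bs - r2 * x_cp) / (sl[i] + 1e-12)
                    x2 = xi + r3 * x_bs - r4 * ng
                    new_candidates += [x1, x2]
            else:                                    # neutron-poor particle: random walk
                new_candidates.append(xi + rng.random(d))

        # Clip to bounds, merge with the population, and keep the best n particles.
        cand = np.clip(np.array(new_candidates), lo, hi)
        cand_fit = np.array([objective(x) for x in cand])
        X = np.vstack([X, cand])
        nel = np.concatenate([nel, cand_fit])
        keep = np.argsort(nel)[:n_particles]
        X, nel = X[keep], nel[keep]

    return X[nel.argmin()], nel.min()

# Example: minimize the sphere function in 5 dimensions.
best_x, best_f = evo_minimize(lambda x: float(np.sum(x ** 2)),
                              bounds=[(-5, 5)] * 5, n_particles=20, max_iter=100)
```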

3. Results

3.1. Feature Optimization

In the feature extraction phase, a silicone simulation experiment was initially conducted to examine the relationship between representative feature parameters ($R_a$, $R_q$, $R_p$, $R_{max}$) and the mass of the suspended weights. As illustrated in Figure 8a–d, as the mass of the weights gradually increases, the values of these feature parameters also rise correspondingly, indicating a significant positive correlation between them. This outcome confirms that the feature parameters can effectively reflect changes in the mass of the weights. The blue data points in Figure 8 represent the feature parameter values measured at different weight masses, while the red fitting line is obtained through least-squares fitting, providing the best estimate of the trend within the dataset.
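The least-squares trend lines in Figure 8 can be reproduced with a first-order fit; the snippet below is a small sketch using hypothetical measurement arrays (the actual silicone-mold measurements are not reproduced here).

```python
import numpy as np

# Hypothetical silicone-mold measurements: suspended mass (g) vs. a roughness feature (e.g., Ra).
mass = np.array([0.0, 50.0, 100.0, 150.0, 200.0, 250.0])
ra_values = np.array([0.42, 0.55, 0.71, 0.83, 0.97, 1.10])

# First-order least-squares fit, as used for the red trend lines in Figure 8.
slope, intercept = np.polyfit(mass, ra_values, deg=1)
fitted = slope * mass + intercept
print(f"Ra ≈ {slope:.4f} * mass + {intercept:.4f}")
```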
In the liver CT experiments, the number of feature parameters was optimized using an SVM classifier. Statistical analysis of the highest accuracy rates achieved with different numbers of shape features reveals that the classifier achieves superior classification performance when the number of feature parameters ranges from three to five. A moderate number of feature parameters helps enhance the classifier’s accuracy and efficiency. In contrast, an excess of feature parameters was observed to negatively impact the classifier’s performance due to data redundancy, leading to a decrease in classification accuracy; conversely, when the number of feature parameters is insufficient, the classifier is unable to effectively distinguish between different categories due to a lack of adequate information. The results of optimizing the number of feature parameters are depicted in Figure 9.
During the optimization of feature parameter weights, the experimental results are displayed in descending order of weight, as shown in Figure 10. The results demonstrate that the weights of five feature parameters—$R_p$, $S$, $S_m$, $R_{min}$, and $R_{max}$—significantly influence the accuracy of the classification. These parameters are highly relevant in the diagnosis of liver fibrosis, providing an accurate reflection of the degree of liver fibrosis pathology. In the field of mechanical engineering, widely recognized, representative measurements of micro-surface unevenness include the maximum height of the profile $R_{max}$, the maximum peak height $R_p$, and the maximum valley depth $R_{min}$. This supports the theoretical basis for applying these micro-unevenness quantification indicators to the medical domain, utilizing them to detect the edge roughness associated with the degree of liver fibrosis and serving as a criterion for grading.

3.2. EVO-MS Model Performance

The dataset for this study encompasses a total of 415 cases, including both patients diagnosed with liver fibrosis via liver biopsy and individuals without a history of liver-related diseases, examined at the First Affiliated Hospital of Guangxi Medical University from June 2009 to March 2011. The data were divided into a training set and a test set at a ratio of 7:3. To validate the effectiveness of the model, its performance was assessed on the test set using various evaluation metrics, including the construction of Receiver Operating Characteristic (ROC) curves and the calculation of AUC, accuracy, precision, sensitivity, specificity, and F1-score. Moreover, the Wilcoxon signed-rank test was employed to compare the EVO-MS model with the other six models.
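The evaluation described here can be reproduced with standard library calls. The following is a minimal sketch assuming the test labels (y_test), the predicted probabilities of the EVO-MS model (proba_evo_ms), and those of a competing model (proba_other) are available, with a prediction threshold of 0.5; the exact quantity paired in the Wilcoxon test is not specified in the paper, so the pairing shown is an assumption.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

def evaluate(y_true, proba, threshold=0.5):
    """Compute the metrics reported in Table 3 from predicted probabilities."""
    y_pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "sensitivity": recall_score(y_true, y_pred),  # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "f1": f1_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, proba),
    }

# Paired significance test between two models' per-sample probability errors
# (one plausible pairing, flagged as an assumption):
# errors_evo_ms = np.abs(y_test - proba_evo_ms)
# errors_other = np.abs(y_test - proba_other)
# stat, p_value = wilcoxon(errors_evo_ms, errors_other)
```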
The performance metrics of the EVO-MS model and the six individual models at a prediction threshold of 0.5, such as the validation set root mean square error (RMSE), test set RMSE, accuracy, precision, sensitivity, and specificity, are shown in Table 3. It is evident from Table 3 that our proposed EVO-MS model outperforms the other six individual models, achieving the highest levels among the seven models, with an accuracy of 0.864, a precision of 0.813, a sensitivity of 0.912, a specificity of 0.824, and an F1-score of 0.860. The EVO-MS model’s accuracy, precision, sensitivity, specificity, and F1-score are, respectively, 5.6%, 7.4%, 3.5%, 8.9%, and 5% higher than those of the lowest-scoring model. The scores of each model on the five metrics are illustrated in Figure 11. The results of the Wilcoxon signed-rank test, as presented in Table 4, reveal significant differences between the EVO-MS model and the other six models ($p < 0.05$). Considering evaluation metrics such as AUC, sensitivity, and specificity, it is inferred that the overall performance of the EVO-MS model is superior to that of the competing models.
The Area Under the Curve (AUC) values for the EVO-MS, KNN, DT, NB, XGB, GBDT, and RF models are 0.940, 0.916, 0.879, 0.868, 0.924, 0.921, and 0.903, respectively. The proposed EVO-MS model achieved the highest AUC. The ROC (Receiver Operating Characteristic) curves for each model are illustrated in Figure 12 below.

4. Discussion

The diagnosis and grading of liver fibrosis is a critical fundamental task. This study has constructed a multiple stacking model, optimized by energy valley optimization (EVO), based on 415 CT images from different stages of liver fibrosis. By analyzing and extracting shape features, accurate grading of the extent of liver fibrosis can be achieved, which has the potential to change the current reliance on invasive tests, such as liver biopsies.
The statistical results indicate that selecting three shape features yields better classification performance, with the maximum peak height of the contour, the average inter-peak distance of the contour, and the average inter-roughness distance of the contour microstructure having significant weights. In terms of model construction, the EVO-MS combines six individual base models (KNN, DT, NB, XGBoost, GBDT, and RF). For parameter tuning, we employ the EVO algorithm to replace the manual selection process. The EVO-MS model demonstrates excellent performance in grading liver fibrosis on the test set, achieving commendable scores across the five main evaluation metrics (an accuracy of 0.864, a precision of 0.8125, a sensitivity of 0.9123, a specificity of 0.8235, and an F1-score of 0.8595), outperforming the lowest-scoring model by 3.2%, 4%, 1.7%, 4.5%, and 3.1%, respectively. This indicates that the EVO-MS model can more effectively detect different stages of liver fibrosis.
Our work is primarily limited by the size of the dataset. Due to patient privacy protection, high data annotation costs, and the dispersed, non-shared nature of the data, it is challenging to obtain large-scale liver fibrosis CT image datasets. With more abundant data in the future, we hope to further improve the model’s performance. Moreover, after further comparison with clinical presentations by doctors, we anticipate providing a new tool for the grading of liver fibrosis.

Author Contributions

Conceptualization, X.Z. (Xuejun Zhang), S.C. and P.Z.; methodology, S.C., C.W. and Q.W.; software, Q.W. and X.Z. (Xiangrong Zhou); validation, S.C.; formal analysis, X.Z. (Xuejun Zhang), S.C. and P.Z.; investigation, X.Z. (Xuejun Zhang), S.C. and C.W.; resources, Q.W. and X.Z. (Xiangrong Zhou); data curation, X.Z. (Xuejun Zhang) and S.C.; writing—original draft preparation, S.C.; writing—review and editing, P.Z. and C.W.; visualization, S.C., Q.W. and X.Z. (Xiangrong Zhou); supervision, X.Z. (Xuejun Zhang) and S.C.; project administration, X.Z. (Xuejun Zhang); funding acquisition, X.Z. (Xuejun Zhang). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61262027, in part by the Science and Technology Key Projects of Guangxi Province under Grant 2020AA21077007, and in part by the Guangxi University Training Program of Innovation and Entrepreneurship under Grant S202310593234.

Institutional Review Board Statement

The Medical Ethics Committee of Guangxi University approved this study (IRB No. GXU-2022-230) and waived the requirements for informed consent, considering the retrospective study design and the use of anonymized patient data. All the methods employed in this study were in accordance with the approved guidelines and the Declaration of Helsinki.

Informed Consent Statement

This work was of a retrospective design, and therefore did not require explicit consent to participate; all data usage complied with the rights and privacy of the patients involved.

Data Availability Statement

The real medical datasets were obtained from the Radiology Department of the First Affiliated Hospital of Guangxi Medical University. The dataset is available on reasonable requests from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Friedman, S.L.; Pinzani, M. Hepatic fibrosis: 2022 Unmet needs and a blueprint for the future. Hepatology 2022, 75, 473–488. [Google Scholar] [CrossRef]
  2. Rockey, D.C. Hepatic fibrosis and cirrhosis. In Yamada’s Textbook of Gastroenterology; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2022; pp. 2000–2023. [Google Scholar]
  3. Heyens, L.J.M.; Busschots, D.; Koek, G.H.; Robaeys, G.; Francque, S. Liver fibrosis in non-alcoholic fatty liver disease: From liver biopsy to non-invasive biomarkers in diagnosis and treatment. Front. Med. 2021, 8, 615978. [Google Scholar] [CrossRef] [PubMed]
  4. Khalifa, A.; Rockey, D.C. The utility of liver biopsy in 2020. Curr. Opin. Gastroenterol. 2020, 36, 184–191. [Google Scholar] [CrossRef] [PubMed]
  5. Chowdhury, A.B.; Mehta, K.J. Liver biopsy for assessment of chronic liver diseases: A synopsis. Clin. Exp. Med. 2022, 23, 273–285. [Google Scholar] [CrossRef] [PubMed]
  6. Zhang, X.J.; Zhou, B.; Ma, K.; Qu, X.H.; Tan, X.M.; Gao, X.; Yan, W.; Long, L.L.; Fujita, H. Selection of optimal shape features for staging hepatic fibrosis on CT image. J. Med. Imaging Health Inform. 2015, 5, 1926–1930. [Google Scholar] [CrossRef]
  7. Zhou, S.K.; Greenspan, H.; Davatzikos, C.; Duncan, J.S.; Van Ginneken, B.; Madabhushi, A.; Prince, J.L.; Rueckert, D.; Summers, R.M. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proc. IEEE 2021, 109, 820–838. [Google Scholar] [CrossRef]
  8. Liu, X.; Gao, K.; Liu, B.; Pan, C.; Liang, K.; Yan, L.; Ma, J.; He, F.; Zhang, S.; Pan, S.; et al. Advances in deep learning-based medical image analysis. Health Data Sci. 2021, 2021, 8786793. [Google Scholar] [CrossRef]
  9. Aggarwal, R.; Sounderajah, V.; Martin, G.; Ting, D.S.W.; Karthikesalingam, A.; King, D.; Ashrafian, H.; Darzi, A. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. npj Digit. Med. 2021, 4, 65. [Google Scholar] [CrossRef]
  10. Singh, S.P.; Wang, L.; Gupta, S.; Goli, H.; Padmanabhan, P.; Gulyás, B. 3D deep learning on medical images: A review. Sensors 2020, 20, 5097. [Google Scholar] [CrossRef]
  11. Panayides, A.S.; Amini, A.; Filipovic, N.D.; Sharma, A.; Tsaftaris, S.A.; Young, A.A.; Foran, D.J.; Do, N.V.; Golemati, S.; Kurc, T.; et al. AI in medical imaging informatics: Current challenges and future directions. IEEE J. Biomed. Health Inform. 2020, 24, 1837–1857. [Google Scholar] [CrossRef]
  12. Loomba, R.; Adams, L.A.J.G. Advances in non-invasive assessment of hepatic fibrosis. Gut 2020, 69, 1343–1352. [Google Scholar] [CrossRef] [PubMed]
  13. Humeau-Heurtier, A. Texture feature extraction methods: A survey. IEEE Access 2019, 7, 8975–9000. [Google Scholar] [CrossRef]
  14. Mayerhoefer, M.E.; Materka, A.; Langs, G.; Häggström, I.; Szczypiński, P.; Gibbs, P.; Cook, G. Introduction to radiomics. J. Nucl. Med. 2020, 61, 488–495. [Google Scholar] [CrossRef] [PubMed]
  15. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2019, 14, 241–258. [Google Scholar] [CrossRef]
  16. Binder, H.; Gefeller, O.; Schmid, M.; Mayr, A. The evolution of boosting algorithms. Methods Inf. Med. 2014, 53, 419–427. [Google Scholar] [CrossRef] [PubMed]
  17. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
  18. Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
  19. Liang, M.; Chang, T.; An, B.; Duan, X.; Du, L.; Wang, X.; Miao, J.; Xu, L.; Gao, X.; Zhang, L.; et al. A stacking ensemble learning framework for genomic prediction. Front. Genet. 2021, 12, 600040. [Google Scholar] [CrossRef]
  20. Cui, S.; Yin, Y.; Wang, D.; Li, Z.; Wang, Y. A stacking-based ensemble learning method for earthquake casualty prediction. Appl. Soft Comput. 2020, 101, 107038. [Google Scholar] [CrossRef]
  21. Mota, L.F.M.; Giannuzzi, D.; Bisutti, V.; Pegolo, S.; Trevisi, E.; Schiavon, S.; Gallo, L.; Fineboym, D.; Katz, G.; Cecchinato, A. Real-time milk analysis integrated with stacking ensemble learning as a tool for the daily prediction of cheese-making traits in Holstein cattle. J. Dairy Sci. 2022, 105, 4237–4255. [Google Scholar] [CrossRef]
  22. Zhang, H.; Li, J.-L.; Liu, X.-M.; Dong, C. Multi-dimensional feature fusion and stacking ensemble mechanism for network intrusion detection. Futur. Gener. Comput. Syst. 2021, 122, 130–143. [Google Scholar] [CrossRef]
  23. Rashid, M.; Kamruzzaman, J.; Imam, T.; Wibowo, S.; Gordon, S. A tree-based stacking ensemble technique with feature selection for network intrusion detection. Appl. Intell. 2022, 52, 9768–9781. [Google Scholar] [CrossRef]
  24. Kardani, N.; Zhou, A.; Nazem, M.; Shen, S.-L. Improved prediction of slope stability using a hybrid stacking ensemble method based on finite element analysis and field data. J. Rock Mech. Geotech. Eng. 2020, 13, 188–201. [Google Scholar] [CrossRef]
  25. Azad, A.; Sajid, I.; Lu, S.-D.; Sarwar, A.; Tariq, M.; Ahmad, S.; Liu, H.-D.; Lin, C.-H.; Mahmoud, H.A. Energy Valley Optimizer (EVO) for Tracking the Global Maximum Power Point in a Solar PV System under Shading. Processes 2023, 11, 2986. [Google Scholar] [CrossRef]
  26. Azizi, M.; Aickelin, U.; Khorshidi, H.A.; Shishehgarkhaneh, M.B. Energy valley optimizer: A novel metaheuristic algorithm for global and engineering optimization. Sci. Rep. 2023, 13, 226. [Google Scholar] [CrossRef] [PubMed]
  27. Fathy, A. Efficient energy valley optimization approach for reconfiguring thermoelectric generator system under non-uniform heat distribution. Renew. Energy 2023, 217, 119177. [Google Scholar] [CrossRef]
  28. Rao, M.R.; Sundar, S. Allocation of Resources in LPWAN Using Hybrid Coati-Energy Valley Optimization Algorithm Based on Reinforcement Learning. IEEE Access 2023, 11, 116169–116182. [Google Scholar] [CrossRef]
  29. Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28. [Google Scholar] [CrossRef]
  30. Zhang, X.; Gao, X.; Liu, B.J.; Ma, K.; Yan, W.; Long, L.; Huang, Y.; Fujita, H. Effective staging of fibrosis by the selected texture features of liver: Which one is better, CT or MR imaging? Comput. Med. Imaging Graph. 2015, 46 Pt 2, 227–236. [Google Scholar] [CrossRef] [PubMed]
  31. Ouyang, G.; Zhang, X.; Wu, D. Staging of Hepatic Fibrosis Based on Optimization of Selected Texture Features. Comput. Sci. Appl. 2018, 8, 1089–1101. [Google Scholar]
Figure 1. Different CT phased images were obtained from a 52-year-old woman with fibrosis stage F2 due to type C viral hepatitis before and after injection of contrast agent.
Figure 2. The outline of the hepatic surface, shown in red (a), is rotated according to its angle of approximate curve to generate a one-dimensional profile function (b).
Figure 3. An example of calculating Sm and tp on the profile.
Figure 4. The silicone mold is hung by different weights to simulate the restraint force on the liver caused by the progression of fibrosis (a), as shown in its profile image (b).
Figure 5. The overall flowchart of staging liver fibrosis based on EVO-MS.
Figure 6. The architecture diagram of the multiple stacking algorithm.
Figure 7. The flow diagram of the energy valley optimization algorithm.
Figure 8. Feature parameter versus weight mass curves. (a) Ra vs. mass. (b) Rq vs. mass. (c) Rp vs. mass. (d) Rmax vs. mass.
Figure 9. Optimization of the number of feature parameters.
Figure 10. Weight of feature parameters.
Figure 11. The scores of each model on the five metrics: (a) KNN. (b) DT. (c) NB. (d) XGB. (e) GBDT. (f) RF. (g) EVO-MS.
Figure 12. The ROC curves for each model.
Table 1. The scanning time for each phase on contrast CT images.

Scan Phase | CT Scan Timing | Contrast Agent Diffusion
N Phase: Non-contrast Phase | <0 s | No contrast agent injected
A Phase: Arterial Phase | 25 s | Contrast agent diffused into hepatic arterial vessels
V Phase: Venous Phase | 60 s | Contrast agent refluxed into hepatic venous vessels
P Phase: Equilibrium Phase | 120 s | Contrast agent diffused into hepatic capillary tissues
Table 2. The pseudo-code of the energy valley optimization algorithm.

EVO Pseudo-Code
Define iteration_max, the problem bounds, the problem dimension (d), the population size (n), and the objective function
Initialize the candidate particles and calculate their fitness values based on the neutron enrichment level (NEL_i)
while iteration < iteration_max do
    Calculate the particle enrichment boundary (EB)
    Determine the particle with the best stability level (X_BS)
    for i = 1 : n do
        Calculate the stability level (SL_i) of the i-th particle
        Calculate the neutron enrichment level (NEL_i) of the i-th particle
        if NEL_i > EB then
            Generate the stability bound (SB)
            if SL_i > SB then
                Generate Alpha Index I and II
                for j = 1 : Alpha Index II do
                    X_i^New1 = X_i(X_BS(x_i^j))
                end
                Generate Gamma Index I and II
                Find a neighboring particle (X_Ng)
                for j = 1 : Gamma Index II do
                    X_i^New2 = X_i(X_Ng(x_i^j))
                end
            else if SL_i <= SB then
                Determine the center of the particles (X_CP)
                X_i^New1 = X_i + (r_1 × X_BS − r_2 × X_CP) / SL_i
                Find a neighboring particle (X_Ng)
                X_i^New2 = X_i + r_3 × X_BS − r_4 × X_Ng
            end
        else if NEL_i <= EB then
            X_i^New = X_i + r
        end
    end
    iteration = iteration + 1
end
Return the particle with the best stability level (X_BS)
Table 3. The scores of each model across five metrics.

Model | Accuracy | Precision | Sensitivity | Specificity | F1-Score
EVO-MS | 0.864 | 0.813 | 0.912 | 0.824 | 0.860
KNN | 0.840 | 0.776 | 0.912 | 0.779 | 0.839
DT | 0.808 | 0.739 | 0.895 | 0.735 | 0.810
NB | 0.854 | 0.813 | 0.912 | 0.824 | 0.860
XGB | 0.848 | 0.797 | 0.895 | 0.809 | 0.843
GBDT | 0.832 | 0.772 | 0.895 | 0.775 | 0.829
RF | 0.831 | 0.781 | 0.877 | 0.794 | 0.826
Table 4. The Wilcoxon signed-rank test between the EVO-MS model and the other models.

Pair Comparison | p Value
EVO-MS vs. KNN | 0.041
EVO-MS vs. DT | 0.040
EVO-MS vs. NB | 0.039
EVO-MS vs. XGB | 0.000
EVO-MS vs. GBDT | 0.045
EVO-MS vs. RF | 0.037
