1. Introduction
Chronic kidney disease (CKD) has become a significant public health issue worldwide, posing a serious threat to human health and contributing to a substantial increase in the burden of cardiovascular disease and other morbidity and mortality [1]. The estimated global prevalence of CKD in 2017 was 9.1% [2], and CKD is projected to become the fifth leading cause of death globally by 2040 [3]. In a cross-sectional study of Chinese adults conducted from 2018 to 2019, the estimated prevalence of CKD was 8.2%, corresponding to approximately 82 million adults. Among these individuals, 73.3% were classified at stages 1 and 2, 25.0% at stage 3, and 1.8% at stages 4 and 5. However, awareness of CKD was only 10.0% [4]. The estimated glomerular filtration rate (eGFR), calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation, has become a critical basis for CKD staging [5]. Patients with CKD are classified into five progressive stages based on eGFR, and mortality increases with disease stage progression [6].
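The eGFR-based staging described above can be sketched as a simple threshold lookup. This is a minimal illustration that assumes the standard KDIGO GFR category cut-offs (90, 60, 45, 30, and 15 mL/min/1.73 m²); the function names are hypothetical and are not taken from the study. Note that an eGFR of 60 or above does not by itself establish CKD, since stages 1 and 2 also require evidence of kidney damage.

```python
def gfr_category(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category."""
    if egfr < 0:
        raise ValueError("eGFR must be non-negative")
    # (lower bound, label) pairs, checked from the highest bound downwards
    thresholds = [(90, "G1"), (60, "G2"), (45, "G3a"), (30, "G3b"), (15, "G4")]
    for lower, label in thresholds:
        if egfr >= lower:
            return label
    return "G5"


def is_stage_1_to_3(egfr: float) -> bool:
    """True when the eGFR falls in G1-G3 (eGFR >= 30), the range studied here."""
    return gfr_category(egfr) in {"G1", "G2", "G3a", "G3b"}
```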
Unlike in developed countries, the proportion of patients in the early to intermediate stages of CKD in China is as high as 98.3%. Early-stage CKD is often asymptomatic and frequently goes unrecognized in clinical practice, yet studies have shown that early diagnosis and intervention can effectively slow the progression of kidney damage [7]. Therefore, there is an urgent need to screen for and identify early-stage CKD, as this is a critical step in halting or even reversing disease progression.
Non-contrast computed tomography (NCCT) is a non-invasive imaging technique with excellent spatial resolution and no risk of contrast-induced nephropathy, making it widely used for kidney disease screening. However, in the early stages of CKD, it is challenging to distinguish between healthy and diseased kidneys on CT images with the naked eye.
Radiomics leverages advanced algorithms to transform standard medical images into high-dimensional data arrays, enabling the identification of subtle structural changes in the kidneys that are beyond human perception. This approach represents a significant advancement over traditional imaging methods and aids in disease characterization. Compared to conventional CT examinations, CT radiomics can extract more detailed information about subtle lesions. It has been applied in various contexts, such as differentiating kidney stones based on NCCT [8], identifying renal tumors [9], and predicting radiation-induced kidney damage using contrast-enhanced CT [10].
Renal fibrosis characterizes virtually all progressive renal diseases [11]. This pathological process leads to microstructural changes in the kidneys, which may alter texture and other features compared to healthy kidneys. This study aims to investigate the diagnostic potential of radiomic features derived from NCCT images in distinguishing between CKD stages 1–3 and healthy kidneys.
2. Materials and Methods
2.1. Ethics Statement
This retrospective study was conducted with approval by the Institutional Review Board of The Second Affiliated Hospital of Xi’an Jiaotong University (Approval No. 2021168), and the requirement for informed consent was waived.
2.2. Selection of Study Participants
The inclusion criteria were as follows: patients admitted to the Department of Nephrology and diagnosed with CKD, aged ≥ 18 years, who underwent abdominal non-contrast CT scans and laboratory tests at the same time. The diagnosis of CKD was based on the KDIGO guidelines [12], and patients with CKD stages 1 to 3, as determined by eGFR, were included. Patients and age- and sex-matched healthy controls from January 2020 to September 2024 were retrospectively reviewed. Exclusion criteria included incomplete clinical or laboratory data, poor or incomplete image quality, large renal cysts (diameter > 3 cm), renal agenesis, asymmetric bilateral kidney atrophy, renal calculi, hydronephrosis, benign or malignant renal tumors, prior kidney biopsy, other urinary tract diseases, or a history of renal radiation therapy.
2.3. CT Imaging
All abdominal CT images were acquired using one of the following CT scanners: Somatom Definition Flash, Somatom go.Top, and Somatom Force (Siemens, Munich, Germany); Revolution CT and LightSpeed 64 (GE Healthcare, Chicago, IL, USA); uCT 780 (United Imaging, Shanghai, China). The CT scanning parameters were as follows: tube voltage, 120 kV; automatic tube current; matrix size, 512 × 512; and reconstructed slice thickness, 1 mm. The images were retrieved from the Picture Archiving and Communication System and uploaded to the research platform.
2.4. Kidney Segmentation
For the segmentation of both kidneys, the VB-net [13,14] automatic kidney segmentation algorithm was used. The algorithm is part of United Imaging Intelligence’s one-stop research platform (uAI Research Portal, V20240730, https://urp.united-imaging.com/; accessed on 31 December 2024) [15]. Based on the differences in CT values between the renal sinus and renal parenchyma, the adipose tissue of the renal sinus was removed, retaining only the renal parenchyma. All segmentation masks were reviewed and verified by an experienced radiologist, with manual adjustments performed when necessary.
2.5. Radiomics Analysis
All CT images were resampled to a uniform voxel spacing of 1 × 1 × 1 mm³ using the B-spline interpolation algorithm, standardizing variable pixel sizes and slice thicknesses. This preprocessing step ensures that the machine learning model receives consistent input data, enhancing its ability to learn and generalize to new images. Following preprocessing, a total of 2264 radiomics features were automatically extracted from the volumes of interest (VOIs) of both kidneys for each patient.
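The effect of resampling on the image grid can be sketched as follows. The snippet computes only the output voxel counts needed to preserve the physical extent of a volume at the new spacing; it does not perform the B-spline interpolation itself, which would normally be delegated to an imaging library. The function name and the example spacing values are hypothetical.

```python
def resampled_shape(shape, spacing, new_spacing=(1.0, 1.0, 1.0)):
    """Voxel counts per axis after resampling to new_spacing (all in mm).

    The physical extent along each axis (voxels * spacing) is kept
    approximately constant; at least one voxel is retained per axis.
    """
    if not (len(shape) == len(spacing) == len(new_spacing)):
        raise ValueError("shape, spacing and new_spacing must have equal length")
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, new_spacing)
    )


# Hypothetical example: a 512 x 512 x 300 volume with 0.7 mm in-plane pixels
# and 1 mm slices becomes 358 x 358 x 300 voxels at 1 mm isotropic spacing.
print(resampled_shape((512, 512, 300), (0.7, 0.7, 1.0)))
```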
These radiomics features included first-order statistics, shape features, and texture features. The texture features comprised Gray Level Size Zone Matrix, Gray Level Co-occurrence Matrix, Gray Level Run Length Matrix, Neighboring Gray Tone Difference Matrix, and Gray Level Dependence Matrix features. Additionally, 24 filters (Box Mean, Additive Gaussian Noise, Binomial Blur, Curvature Flow, Box-sigma, Normalize, Laplacian Sharpening, Discrete Gaussian, Mean, Speckle Noise, Recursive Gaussian, Shot Noise, LoG (sigma: 0.5, 1, 2, 4), and Wavelet (HHH, HLL, HLH, HHL, LLL, LLH, LHL, LHH)) were applied to obtain derived images. First-order statistics and texture features were then extracted from the derived images.
2.6. Feature Selection and Prediction Model Establishment
The mean value of each feature from the right and left kidneys was calculated and used to construct radiomics models. For feature selection, each feature was normalized to standardized z-scores to minimize distortion in the differences among the radiomic features.
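The two preprocessing steps above (bilateral averaging and z-score standardization) can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the population standard deviation is used, and the scaling parameters are fitted on one set of values and then reused, which is a common way to avoid information leaking from the testing set. The source does not state which of these conventions was followed.

```python
from statistics import mean, pstdev


def bilateral_mean(left, right):
    """Average each feature across the left and right kidneys."""
    if len(left) != len(right):
        raise ValueError("left and right feature vectors must have equal length")
    return [(l + r) / 2 for l, r in zip(left, right)]


def fit_zscore(train_column):
    """Return (mean, standard deviation) estimated from one feature column."""
    mu = mean(train_column)
    sd = pstdev(train_column)  # population SD; an assumption of this sketch
    return mu, (sd if sd > 0 else 1.0)  # guard against constant features


def apply_zscore(column, mu, sd):
    """Standardize a feature column with previously fitted parameters."""
    return [(x - mu) / sd for x in column]
```

In use, `fit_zscore` would be called once per feature on the training values, and `apply_zscore` would then be applied to both the training and testing values with the same parameters.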
The data were randomly allocated to the training and testing sets at an 8:2 ratio. A three-step procedure was implemented to identify robust radiomic features within the training set. First, the Relief method was used for preliminary feature selection, retaining the top 50 features. Next, the Max-Relevance and Min-Redundancy (MRMR) algorithm was used to identify the feature sets with the strongest correlation and the least redundancy, retaining the top 30 features. Finally, the Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis was employed to identify significant radiomic features with non-zero coefficients capable of distinguishing CKD patients from healthy controls.
The parameter α of LASSO was set to 0.05 to prevent overfitting. The final selected features were used to train a Gaussian process (GP) machine learning model to differentiate CKD patients from healthy controls.
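The first step of the selection procedure can be illustrated with a minimal sketch of the basic Relief algorithm for binary labels. It is an illustration only and not the implementation used on the research platform: the function names, the range-normalized Manhattan distance, and the sampling scheme are assumptions. In the study, the 50 highest-weighted features were retained before MRMR and LASSO were applied.

```python
import random


def relief_weights(X, y, n_iter=None, seed=0):
    """Basic Relief: reward features that differ on the nearest sample of the
    other class (miss) and agree on the nearest sample of the same class (hit)."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1]
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(d))

    rng = random.Random(seed)
    m = n_iter or n
    weights = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)  # sample with replacement
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / m
    return weights


def top_k(weights, k):
    """Indices of the k highest-weighted features."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:k]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
print(top_k(relief_weights(X, y), 1))
```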
2.7. Radiologist Diagnosis
Radiologist diagnostic classification was performed by two attending physicians, each with six years of experience. Any discrepancies were resolved by senior radiologists. All radiologists were blinded to clinical diagnosis, laboratory tests, and kidney biopsy pathological results. The diagnostic criteria for CKD were defined as structural abnormalities detected through imaging, typically manifested as renal atrophy or abnormal renal contours. The diagnosis was primarily based on clinical experience.
2.8. Statistical Analysis
The Shapiro–Wilk test was employed to assess the normality of the datasets. Continuous variables with a normal distribution are presented as mean ± standard deviation, whereas those with a non-normal distribution are reported as median (interquartile range). Categorical data are described as frequency counts (percentages). For statistical comparisons, Student’s t-test was used for normally distributed continuous variables, the Mann–Whitney U-test was applied for non-normally distributed continuous variables, and the chi-square test was used for categorical variables. All statistical analyses were performed using Python 3.10.6. Two-sided tests were conducted, and p-values < 0.05 were considered statistically significant. Diagnostic performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). Additional metrics, including sensitivity, specificity, accuracy, and F1 score, were also calculated to assess model performance. The AUCs of the radiologists and the radiomics model were compared using the DeLong test.
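The performance metrics listed above can be sketched in a few lines. This is a minimal illustration with hypothetical function names and toy data. The AUC is computed from its rank interpretation, namely the probability that a randomly chosen CKD case receives a higher score than a randomly chosen control, with ties counted as one half. The DeLong comparison is not reproduced here.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and F1 for binary labels (1 = CKD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels, predictions and scores)
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
print(confusion_metrics(y_true, y_pred), roc_auc(y_true, scores))
```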
4. Discussion
The identification of CKD is a prolonged process for patients. CT technology is widely used in the evaluation of renal diseases, including CKD. However, its applications are predominantly limited to manual analysis, which relies heavily on the personal experience of professional physicians and lacks standardization. The results of the current study demonstrate that the AUC and sensitivity of radiologists in identifying early CKD were only 0.570–0.575 and 0.196–0.200, respectively, in the training and testing sets. At this sensitivity, roughly four in five patients with early CKD were not identified. This finding aligns with previous ultrasound studies, which revealed that professional radiologists had low sensitivity for the identification and grading of CKD, with the issue becoming more pronounced as the disease grade decreased [16].
The current results suggest that the constructed radiomics feature model can effectively differentiate early CKD from healthy kidneys, achieving an AUC of 0.790–0.849 and a sensitivity of 0.709–0.750. The AUC and sensitivity of the GP model were 21.5 to 27.9 and 50.9 to 55.4 percentage points higher, respectively, than those of the radiologists in the testing and training sets. A previous ultrasound study also demonstrated that the diagnostic accuracy and sensitivity of the radiomics model for CKD stages 1–3 were significantly higher than those of senior radiologists [16].
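The reported gaps between the GP model and the radiologists correspond to absolute differences between the two sets of values rather than relative gains. The short check below makes the arithmetic explicit. The function names are hypothetical, and the pairing of model and radiologist values is inferred from the reported differences rather than stated in the text.

```python
def percentage_point_gap(model, reader):
    """Absolute difference between two proportions, in percentage points."""
    return round((model - reader) * 100, 1)


def relative_gain_percent(model, reader):
    """Relative improvement over the reader, in percent."""
    return round((model - reader) / reader * 100, 1)


# AUC: 0.790 vs 0.575 and 0.849 vs 0.570
print(percentage_point_gap(0.790, 0.575), percentage_point_gap(0.849, 0.570))
# Sensitivity: 0.709 vs 0.200 and 0.750 vs 0.196
print(percentage_point_gap(0.709, 0.200), percentage_point_gap(0.750, 0.196))
# For comparison, the relative AUC gain for the first pair is much larger
print(relative_gain_percent(0.790, 0.575))
```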
For model validation, the calibration curves show strong alignment with the dotted line, which represents an ideal model. This indicates that the GP model exhibits no significant deviation from a perfect fit, suggesting that its predictions closely mirror the true prevalence of CKD in the given datasets. Furthermore, the GP model outperforms traditional diagnostic methods used by physicians. The clinical decision curve plots net benefit on the y-axis, calculated by balancing the gains from true positives against the costs of false positives. The GP model consistently demonstrates a higher net benefit across the entire range of threshold probabilities. This superior performance in both clinical decision-making and practical application underscores the model’s value in enhancing diagnostic accuracy for CKD stages 1–3 compared to conventional radiologist diagnoses.
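The net benefit plotted on a decision curve is commonly defined as the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold probability. The sketch below assumes this standard definition; the function names and the example counts are hypothetical and are not taken from the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit = TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1 - threshold)


def treat_all_net_benefit(prevalence, threshold):
    """Reference strategy that labels every subject as positive."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical example: 100 subjects, 30 true positives, 10 false positives,
# threshold probability 0.2, compared with treating all at 40% prevalence.
print(net_benefit(30, 10, 100, 0.2), treat_all_net_benefit(0.4, 0.2))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and zero, which corresponds to labeling no one as positive.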
From the perspective of feature extraction, healthy kidney tissues exhibit higher values in features such as Inverse Variance (speckle noise), Large Dependence Emphasis (wavelet-LLH), 90th Percentile (wavelet-LLH), and Dependence Entropy (wavelet-LLH) on CT images. This indicates that healthy kidneys have more homogeneous texture, uniform regions, and higher intensity values on CT images. Inverse Variance reflects the uniformity in pixel intensity, with higher values suggesting more consistent tissue. Large Dependence Emphasis indicates the presence of extensive, similar-intensity regions, which are characteristic of healthy, undamaged kidney tissue. The 90th Percentile value denotes the range of higher-intensity pixels, implying robust and dense renal parenchyma. Additionally, higher Dependence Entropy implies a complex and varied texture pattern, which is characteristic of normal kidney function and structure. These features contrast sharply with the lower values typically observed in chronic kidney disease, which is marked by fibrosis, atrophy, and overall tissue degradation. The reduced values in these features for CKD kidneys reflect less homogeneous texture, smaller homogeneous regions, and lower intensity values due to pathological changes such as fibrosis and atrophy. Lower Large Dependence Emphasis can be attributed to the diminished prevalence of large, homogeneous regions within diseased kidneys, a hallmark of chronic pathological changes such as scarring and loss of functional renal parenchyma.
In contrast, diseased kidney tissue exhibits higher values in features such as Contrast (wavelet-LLH), Gray Level Non-Uniformity (discrete gaussian), Correlation (wavelet-HHL), and Imc2 (wavelet-HLH) on CT images. This signifies that CKD kidneys exhibit more pronounced heterogeneity and irregularity in texture. Higher Contrast values reflect significant variations in pixel intensity, highlighting the uneven distribution of fibrotic and atrophic areas within the kidney tissue. Gray Level Non-Uniformity measures the variability of gray levels in the image, with higher values suggesting a lack of uniformity and increased structural degradation. Furthermore, the Correlation (wavelet-HHL) feature, which captures the linear dependency of gray levels in specific orientations, shows higher values in CKD kidneys, implying disrupted and irregular structural patterns. The Imc2 (wavelet-HLH) feature, which quantifies the complexity and entropy of the image, also exhibits elevated values in diseased kidneys, indicating a more chaotic and disordered tissue architecture. These elevated values in CKD contrast sharply with the uniform and organized texture characteristics of healthy kidneys, underscoring the extent of damage and pathological changes present in CKD.
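The behavior of two of the co-occurrence features discussed above can be illustrated on toy images. The sketch builds a symmetric, normalized gray-level co-occurrence matrix for a single horizontal offset and computes Contrast and Inverse Variance using commonly used definitions (as in PyRadiomics). The research platform's exact definitions, gray-level discretization, and 3D neighborhood handling may differ, and the two 4 × 4 images are purely hypothetical.

```python
def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighbors differ strongly."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def inverse_variance(p):
    """Sum of p(i, j) / (i - j)^2 over i != j: large when differences are small."""
    n = len(p)
    return sum(p[i][j] / (i - j) ** 2 for i in range(n) for j in range(n) if i != j)


smooth = [[0, 0, 1, 1]] * 4                 # two uniform regions
rough = [[0, 3, 0, 3], [3, 0, 3, 0]] * 2    # alternating extremes
for name, img in (("smooth", smooth), ("rough", rough)):
    p = glcm(img, levels=4)
    print(name, round(contrast(p), 3), round(inverse_variance(p), 3))
```

In this toy case the heterogeneous image has the higher Contrast and the lower Inverse Variance, which is the direction of the difference described above between CKD and healthy kidneys.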
These characteristic alterations in CKD can be interpreted based on pathological studies. A common pathological change in CKD is renal fibrosis, defined as excessive accumulation of extracellular matrix (ECM), which is deposited by matrix-producing cells after their activation and expansion. This process affects all parts of the kidney and is referred to as glomerulosclerosis, tubulointerstitial fibrosis, and arteriosclerosis of the arteries and arterioles [17]. This massive pathological proliferation of the ECM involves numerous cytokines and leads to scarring and hardening of the kidney tissue, resulting in a heterogeneous appearance of the kidney [18]. This heterogeneity and hardness alter texture and other features (such as increased gray contrast, complexity, and non-uniformity) compared to healthy kidneys.
The radiologists’ AUC was marginally higher in the testing set (0.575) than in the training set (0.570). This minor difference (Δ = 0.005) is likely due to random variation rather than overfitting, as radiologists rely on clinical expertise rather than data-driven learning. The same train–test split was applied to the radiologists’ readings to allow a fair comparison with the models.
To our knowledge, this study represents the first attempt to investigate early CKD using NCCT analysis of the entire renal parenchyma. A previous prospective cohort study on classifying and predicting radiation-induced CKD also identified 90th Percentile and Dependence Entropy as critical features based on contrast-enhanced CT images of the entire kidney [10]. A small-sample ultrasound study performed texture analysis on the entire kidney, cortex, and medulla of CKD patients and healthy controls, finding that the most accurate results were obtained from the entire kidney and the cortex region [19]. Our results suggest that higher-order texture features can identify diseased kidneys earlier than radiologists, supporting the optimization of the diagnostic process. The radiomics model can significantly enhance the detection rate of CKD stages 1–3 without biochemical examination and guide further clinical evaluation and treatment.
We recognize that this study has several limitations. The retrospective nature of the study and the use of data from a single center may have introduced certain biases. In the future, external validation will be conducted alongside large-sample, multi-center studies to further evaluate predictive performance of the model. Healthy subjects in this study were those with no history or any symptoms of kidney disease; however, only some underwent eGFR testing. Ideally, eGFR measurements should have been performed for all participants in this group.
This study also offers insights and directions for further enhancing the use of CT imaging in the diagnosis of CKD. Future research directions include investigating whether the radiomics model can serve as an effective predictor for the etiology and pathological changes of different CKD stages, and whether renal corticomedullary segmentation based on deep learning can achieve better performance than whole-kidney radiomics in predicting and classifying early CKD. Further follow-up studies could obtain more comprehensive information from CT images and integrate it with other clinical indicators closely associated with CKD to construct a more robust and comprehensive model.
5. Conclusions
This study investigates the diagnostic potential of radiomic features extracted from non-contrast CT (NCCT) images, compared with radiologists’ assessments, in differentiating between early-stage chronic kidney disease (CKD stages 1–3) and normal kidneys. Conventional empirical qualitative evaluations by professional radiologists showed only near-random diagnostic performance. In contrast, machine learning methods transformed imperceptible biological signals into quantifiable biomarkers via radiomics. Our approach extracted 2264 high-dimensional features (shape, texture, wavelet) from kidney VOIs segmented automatically using VB-net. The Relief, MRMR, and LASSO selection steps identified features reflecting early pathophysiology, and the Gaussian process classifier integrated these weakly correlated features into a robust predictive model, outperforming radiologists by shifting from subjective morphology-based assessment to objective, quantitative analysis of latent biomolecular signatures. This paradigm highlights radiomics’ ability to uncover “invisible” disease markers beyond human vision, offering transformative potential for early CKD diagnosis.
The NCCT-based radiomics model demonstrates significant clinical utility by enabling non-invasive, early diagnosis of CKD stages 1–3, outperforming radiologist assessments. This approach minimizes human error and improves diagnostic consistency through automated segmentation and advanced machine learning techniques.