1. Introduction
In modern orthopedics, precise preoperative planning is essential for total knee arthroplasty (TKA), often relying on three-dimensional (3D) surgical planning through computed tomography (CT) scans and robotic systems like Stryker’s Mako
TM [
1,
2]. While CT-based planning is highly accurate, it presents challenges such as financial burdens, longer turnaround times, increased patient radiation exposure, and logistical complexities. Additionally, evolving healthcare reimbursement models in the United States may limit insurance coverage for CT scans, reducing patient access to these advanced technologies [
3].
Precise preoperative planning is critical for TKA to avoid implant sizing and alignment complications. Radiographs play a pivotal role in this process, with 3D CT-based approaches offering practical advantages. According to multiple studies, 3D CT planning of the knees offers advantages in accuracy over 2D knee radiographs and is required for certain robotic workflows. However, it comes with higher costs. Additionally, a major shortcoming of CT scans is that they are taken in non-weight-bearing positions, which fail to capture pathologies related to joint space narrowing while bearing weight.
Personalized surgical instruments (PSIs) have also shown success in total knee replacement in comparison to conventional techniques. A PSI utilizes disposable cutting blocks customized to match a patient’s anatomy, eliminating the need for invasive pins and instrumentation of the intramedullary canals. However, a PSI’s success also depends on accurate digital bone surface data obtained from preoperative CT or MRI scans [
4,
5,
6,
7,
8].
Efforts have been made to explore X-ray-based two-dimensional-to-three-dimensional (2D–3D) reconstruction techniques for knee anatomy. Still, adoption has been slow due to concerns about accuracy in real-world pathological TKA cases. Most validation studies have been limited to healthy knee cadaver specimens, raising doubts about their applicability in clinical practice [
9].
Recent advancements in artificial intelligence (AI) have paved the way for innovative algorithms that can construct 3D bone reconstructions from 2D biplanar radiographs, offering a cost-effective alternative to traditional 3D imaging. These algorithms are based on statistical shape modeling (SSM) and require specialized calibration procedures and image acquisition protocols [
10,
11,
12]. To address these challenges, researchers have explored alternative approaches using SSM and multiple calibrated radiographic images to reconstruct patient-specific 3D surface models. SSM methods provide advantages by transforming morphological characteristics into a space governed by fundamental modes of variation. The quality of the results depends on the extent of morphological variability within the training dataset in addition to the quantity and orientation of the Digitally Reconstructed Radiographs (DRRs) [
13,
14]. RSIP XPlan.ai™ [
15] represents a groundbreaking solution driven by AI principles. This innovation combines machine learning models, trained on a diverse dataset of over 1000 pathological knee samples, with a robust 3D calibration technique for X-ray acquisition geometry. This study’s aim was to assess the accuracy of this novel 2D-to-3D knee reconstruction solution using clinically relevant data from a TKA patient cohort, comparing X-ray-based 3D reconstructions with corresponding patient CT scans.
2. Materials and Methods
The study took place at a major medical center, was approved by the institution’s IRB, and followed the applicable laws and regulations. The inclusion criteria for the study were adults (age > 18 years old) with diagnosed osteoarthritis referred to CT scans by their orthopedic surgeon. The average within-slice resolution of the CT scans was 0.49 mm (median: 0.37 mm; STD: 0.17 mm; max: 0.71 mm), and the average slice thickness was 0.61 mm (median: 0.6 mm; STD: 0.24 mm; max: 1 mm). All patients underwent standard anteroposterior (AP) and lateral knee X-rays while wearing the RSIP XPlan.ai™ calibration jig, which was strapped to the patient’s shin by the X-ray technician (
Figure 1). The pixel size of all X-ray images was 0.14 mm.
X-ray-based 3D reconstructions were generated by running the RSIP XPlan.ai™ algorithm and its associated image processing workflow. The algorithm generates 3D models from X-rays using a proprietary neural network.
The bone model generated by the neural network is represented as a 3D volumetric mask. This mask is then converted into an STL mesh using Marching Cubes. All the CT scans were manually segmented, yielding a 3D volumetric mask for each bone. These masks are then converted into STL meshes using Marching Cubes. The segmentation was performed under the guidance of a certified radiologist or orthopedic surgeon, using ITK SNAP software (version 4.0.2), generating the ground-truth CT-based bone models. All voxels with the appearance of bone, including spongy bone, cortical bone, and osteophytes, were included in the bone segmentations; femurs and tibias were each segmented separately.
Examples of the X-ray-based 3D reconstructions and their corresponding CT-based ground-truth models are shown in
Figure 2.
The Root Mean Square Error (RMSE) distance between the points of the X-ray-based 3D models and the CT segmentations was computed by registering the STL meshes using the Iterative Closest Points algorithm. Subsequently, the RMSE was calculated by taking the square root of the mean of the square distances of each mesh vertex to the closest vertex in the second mesh. This computation was performed using the Python implementation of Open3D (version 0.17.0), using the following formula:
= predicted surface point;
= actual surface point;
= # of points (For X-ray-based models.)
The average number of points (N) was 34,298 (STD: 4669) for the femur, and 29,852 (STD: 4463) for the tibia.
Eight bony landmarks (five on the femur and three on the tibia) were analyzed to assess reconstruction accuracy in clinically relevant regions, as illustrated in
Table 1. Additionally, three anatomical axes (two on the femur and one on the tibia) were assessed, as presented in
Table 2.
To define the locations of the bony landmarks and the anatomical axes, 3D models of the X-ray-based reconstructions and the CT segmentations were annotated by three certified orthopedic surgeons using MITK software. (Version 2023.04, German Cancer Research Center, Heidelberg, Germany.) Each surgeon was presented twice with each 3D model, in a randomized order, and was blinded to the model’s origins—whether CT-based or X-ray-based. Surgeons performed annotations independently and were unaware of the study’s purpose. For each of the two presentations (of each bone model), the surgeon annotated a single point, yielding 6 points (3 surgeons × 2 presentations) per landmark per bone model. Bony landmark regions were then defined by fitting a circle with a predefined radius to all 3 × 2 = 6 point annotations of each landmark and discarding outliers. Examples of the resulting landmark regions are shown in
Figure 3. The X-ray-based bony landmark regions were then compared to the CT-based models using the RMSE.
The TEA was annotated 6 times for each bone model, by asking each of the 3 surgeons to annotate the epicondyles twice (in random order, blinded), yielding 6 annotations per bone model of a 3D TEA axis, defined simply as the line connecting the epicondyles. The 3D PCA was similarly defined as the line connecting the posterior lateral condyle to the posterior medial condyle, yielding 6 annotations per bone model. The 3D MLTA was defined as the line connecting the lateral and medial tibial plateau points, yielding 6 annotations per bone model.
To simulate the distal femoral cutting plane, the anatomical axis of the femur was first computed by fitting a line to the center of the femoral shaft. An estimated mechanical axis was then defined at a 5° rotation angle in the AP plane. The simulated cutting plane was defined as perpendicular to the estimated mechanical axis at an offset of 9 mm from the most distal point on the medial condyle. A proximal tibial cutting plane was defined using a tibial slope at an offset of 9 mm from the most proximal tibial plateau center. The simulated cutting planes are shown in
Figure 4.
A 2-dimensional slice contour was obtained by intersecting the cutting plane with the corresponding bone model to assess model performance on the cut contours. The X-ray-based model’s cut contour was then compared to the CT-based model’s cut contour using the RMSE, which was calculated by computing the square root of the mean of the square distances between each point along the contour from the closest point of the second contour. It is noteworthy that the contours were rendered at 0.1 mm resolution (
Figure 4).
The 2D anatomical axes (TEA, PCA, and MLTA) were then defined by projecting the axes that were annotated by the surgeons on the 3D models onto the relevant 2-dimensional cutting planes, as shown in
Figure 5. This process yielded 6 annotations of each 2D axis per bone model.
Statistical Analysis
RMSE values for whole bones, bony landmark regions, and cut contours were assessed by computing their means and standard deviations (STDs) across the 18 patients in the dataset.
To quantify the baseline human-level variation in anatomical axes annotated in computerized bone models, we calculated the mean angle between all possible pairs of CT-based 2D axes’ annotations for a given bone (3 × 2 = 6 annotations per bone per axis, generating 15 pairs of CT-based annotations per bone per axis), and the STD of those angles.
To assess the deviations between the 2D axes computed from X-ray-based reconstructions and the corresponding CT-based models, we first computed the angles between all possible pairs consisting of an X-ray-based axis annotation and a CT-based annotation of the corresponding axis (6 × 6 = 36 pairs per bone per axis). We then computed the means and the STDs of these 36 angles for each of the 3 axes. The statistical analysis was done using Pandas 2.1 and NumPy 1.25 Python packages.
4. Discussion
In this study, we evaluate a novel 2D-to-3D reconstruction solution, RSIP XPlan.ai™, on data from patients that underwent TKA. Our results demonstrate the accuracy of this solution in both overall bone reconstruction as well as clinically relevant anatomical measurements typically used to plan implant size and positioning. Accuracy was consistently sub-mm, and accuracy measures on bony landmarks were in the 0.5 mm range, indicating excellent performance in clinically relevant regions. Anatomical axes were also represented in X-ray-based reconstructions with CT-level accuracy, as demonstrated by comparison to human-level baselines.
Image-based robotics, and more generally, preoperative planning, present significant benefits to TKA workflows. Two-dimensional-to-three-dimensional reconstruction technologies present the opportunity to significantly increase access to these solutions by eliminating the necessity for a preoperative CT scan [
12,
16]. However, due to the paucity of information in plain-film X-rays, these technologies face significant challenges in providing the needed accuracy. In addition, these technologies are often validated on cadaveric data, which is not reflective of real-world TKA patient anatomical characteristics. For example, Fernandes et al. [
9] supported further investigation into the real-world clinical value of an AI algorithm that demonstrated high accuracy in converting 2D X-rays to 3D bone models for TKA planning. When comparing the AI bone reconstructions to CT scans, four out of six anatomical measurement parameters showed mean absolute errors below 2 mm. Furthermore, when comparing the AI models to direct cadaveric bone measurements, five out of six parameters exhibited errors under 2 mm.
Victor et al. [
17] evaluated the precision of determining bony landmarks on knee CT scans and concluded that these landmarks can be determined with fairly high precision. CT scans allow for reasonably precise and reproducible localization of knee anatomy, facilitating surgical planning and assessment. The mean intra-observer error was approximately 1 mm for most anatomical landmarks, indicating good repeatability. However, the mean inter-observer error ranged from 0.3 to 3.5 mm, depending on the specific landmark, suggesting moderate variability between observers. Notably, femoral epicondyles and posterior points of the tibial condyles exhibited higher inter-observer variability, with mean errors exceeding 2 mm, while joint centers and condyle centers demonstrated lower errors, typically below 1 mm.
In this work, we demonstrate that the XPlan.ai™ X-ray-based reconstructions are sub-mm accurate in clinically relevant regions when compared to CT scans. Further research is warranted to assess the accuracy of CT scans and X-ray-based models when compared to physical bones. This assessment could be conducted using tools such as a laser scanner, digitizer, or similar physical measurement apparatus.
This study has several limitations. Firstly, the data was collected exclusively from a single medical center, involving a cohort of only 18 patients. Consequently, it is advisable to conduct broader investigations to validate and establish the robustness and generalizability of the system’s performance across diverse populations and healthcare settings. It is important to note that the bone landmarks and anatomical axes were determined through surgeon interpretation rather than direct measurement from the anatomical landmarks generated from the CT scan. In addition, clinical suitability was assessed in this study by specific relevant anatomical measurements performed on computerized 3D models. Further studies are recommended for direct assessment of solution performance in the prediction of actual implant sizes and locations in the context of a real-world surgical planning workflow. Additional application-specific measures, such as registration accuracy in robotic surgery workflows, should also be explored.