Article

Estimation of Surgery Durations Using Machine Learning Methods-A Cross-Country Multi-Site Collaborative Study

1
Health Services and Systems Research, Duke-NUS Medical School, Health Services Research Centre, Singapore Health Services, Singapore 169856, Singapore
2
SingHealth Duke-NUS Global Health Institute, SingHealth Duke-NUS Academic Medical Centre, Singapore 168753, Singapore
3
Department of Surgery, Duke University School of Medicine, Durham, NC 27710, USA
4
Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
5
Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC 27708, USA
6
Division of Surgery and Surgical Oncology, Singapore General Hospital, Singapore 168753, Singapore
*
Author to whom correspondence should be addressed.
Healthcare 2022, 10(7), 1191; https://doi.org/10.3390/healthcare10071191
Submission received: 18 April 2022 / Revised: 21 June 2022 / Accepted: 21 June 2022 / Published: 25 June 2022
(This article belongs to the Section Artificial Intelligence in Medicine)

Abstract

The scheduling of operating room (OR) slots requires the accurate prediction of surgery duration. We compared the performance of existing Moving Average (MA)-based estimates with novel machine learning (ML)-based models of surgery durations across two sites in the US and Singapore. We used the Duke Protected Analytics Computing Environment (PACE) to facilitate data-sharing and big data analytics across the US and Singapore. Data from all colorectal surgery patients between 1 January 2012 and 31 December 2017 in Singapore and between 1 January 2015 and 31 December 2019 in the US were used; 7585 and 3597 single and multiple procedure cases from Singapore and the US, respectively, were included. The ML models were based on categorical gradient boosting (CatBoost) models trained on common data fields shared by both institutions. The procedure codes were based on the Table of Surgical Procedure (TOSP) (Singapore) and the Current Procedural Terminology (CPT) codes (US). The two types of codes were mapped by surgical experts. The CPT codes were then transformed into the relative value unit (RVU). The ML models outperformed the baseline MA models. The MA, scheduled durations and procedure codes were found to have higher loadings as compared to surgeon factors. We further demonstrated the use of the Duke PACE in facilitating data-sharing and big data analytics.

1. Introduction

Operating Rooms (ORs) account for a significant proportion of a hospital’s total revenue and about 40% of the hospital’s total expenses [1]. With a global estimate of 312.9 million surgeries performed each year, the effective planning of OR resources is an essential process that has a large impact on hospital surgical processes worldwide [2]. The scheduling of surgical procedures plays an important role in the OR planning process and has a direct impact on resource utilization, patient outcomes and staff welfare. Optimal scheduling of OR slots for surgeries prevents under-utilization of costly surgical resources as well as delays which cause unfavorable waiting times. Overtime resulting from sub-optimal schedules also leads to staff dissatisfaction as well as burnout from long working hours [3,4]. One key factor required in optimal scheduling of OR slots is the accurate prediction of surgery duration [3,5]. However, the variability in patients’ conditions and in the type of surgical procedures and techniques required, together with the uncertainties around these patient- and provider-related factors, presents challenges for the prediction of surgery durations [6,7].
In recent years, many studies around the world have reported the use of various machine-learning (ML) methods to accurately predict surgery duration [8,9,10]. Linear regression techniques have been explored using patient and surgical factors and have reported the importance of such variables in predicting the total surgical procedure time [11]. Distributional modelling methods such as Kernel Density Estimation (KDE) [12] as well as log-normal distributions [13] were also demonstrated to be able to effectively predict surgery duration. More complex methods such as heteroscedastic neural network regression combined with expressive drop-out regularized neural networks have also been shown to have good performance [14]. Ensemble tree-based methods such as random forests were also used to predict surgery duration and cross validation showed that it outperformed other methods, reducing the mean absolute percentage error by 28%, when compared to current hospital estimation approaches [15]. A multi-center study based on two large European teaching hospitals has also demonstrated the use of a parsimonious lognormal modelling approach to improve the estimation of surgery duration and OR efficiency across more than one hospital [13].
The significance of multiple factors affecting surgical duration predictions may vary across hospital systems due to differences in case scheduling practices, surgical techniques and intraoperative processes. A broader understanding of the differences in the characteristics of the scheduling processes as well as scheduling behavior will lead to further insights into improving such estimations. In order to determine hospital level differences for estimating surgery duration, there is a need to go beyond simple distributional approaches to understand the multifactorial effects influencing the length of surgery durations across multiple sites. ML models such as gradient boosted trees, which utilize error residuals to improve the performance of ensemble models, have also been reported for other clinical prediction models, such as anterior chamber depth (ACD) in cataract surgery [16].
Although previous studies have examined the use of various ML prediction modelling methods to provide more accurate estimates, most of these studies offer results that are specific to a single institution, with a minority that derived prediction models validated with external institutional data, albeit from the same country [3,6,10,11,13,14,17]. Similar to other use cases where prediction models are developed based on an extensive use of real-world data, a key reason for the difficulty of conducting multi-site studies across multiple countries is the lack of data-sharing and governance infrastructure to support collaborative work. This impediment can be further magnified when the sharing of data has to occur over multiple jurisdictions. This has resulted in a scarcity of published studies that cover multi-institutional data across countries or continents, thereby reducing the external validity and generalizability of the prediction models.
In this study, we aim to determine the performance of current surgery case duration estimations and the use of machine learning models to predict surgery duration across two large teaching hospitals in the United States and Singapore. To facilitate deep collaboration between both hospitals and the sharing of large-scale datasets required for the development of the ML models, we introduce the use of the Duke Protected Analytics Computing Environment (PACE) [18]. PACE is a collaborative platform for facilitating data-sharing and analysis across both healthcare institutions. This study demonstrates the value of PACE for cross-border, multi-institutional studies evaluating surgical durations.

2. Materials and Methods

The two study hospitals, SH-1 and SH-2, are located in Singapore, a city-state in Southeast Asia, and in Durham, North Carolina, in the United States, respectively. SH-1 is the Singapore General Hospital (SGH), which is one of the largest comprehensive public hospitals in Singapore under the Singapore Health Services (SingHealth) public healthcare cluster. SGH is a tertiary multidisciplinary academic hospital which comprises more than 30 clinical disciplines and approximately 1700 inpatient beds and provides acute and specialist care to over one million patients per year [19]. The hospital saw more than 25,000 surgeries in 2019. SH-2 is Duke University Hospital, a full-service tertiary and quaternary care hospital that is part of the Duke University Health System in Durham, North Carolina. Duke University Hospital has 957 inpatient beds, 51 operating rooms, an endo-surgery center and an ambulatory surgery center with nine operating rooms. The hospital offers multidisciplinary care and serves as a regional emergency/trauma center where 42,554 patients were admitted in 2020 [20,21].
The study was exempted from ethics approval by both SingHealth’s Centralized Institutional Review Board (SingHealth CIRB Reference: 2018-2558) and the Duke Institutional Review Board (Duke IRB Reference: Pro00104275).

2.1. Cross Country Collaborative Platform

The Duke PACE [22] is a secured virtualized network environment where researchers can collaborate and perform analysis with protected health information. PACE simplifies the process of obtaining and sharing protected data from the electronic medical record (EMR) systems. Datasets from both study hospitals are shared and analyzed jointly by the study team through the PACE system. The use of PACE requires video-based training and a rigorous account request and approval process for Duke University employees and affiliates. Data loaded in PACE has to be HIPAA compliant. Ethics approval or exemption has to be given by the ethics review boards of the respective study hospitals.
Data were extracted from the SH-1 EMR system, which is based on Sunrise Clinical Manager (Allscripts) [23], through the enterprise data warehouse, the electronic Health Intelligence System (eHIntS) [24]. Data from the SH-2 EMR system were extracted from the Duke Health enterprise data warehouse and Duke’s Maestro Care (Epic) EMR system [25]. These data were loaded into PACE and then served through a secured Duo multifactor authentication gateway [26] for access by collaborators across the two countries with approved network IDs. The analysis was performed with Python 3.6 (Python Software Foundation) [27], with the required packages loaded into the PACE environment. The hardware provisioned in PACE for this study was an Intel(R) Xeon(R) Gold 6252 CPU @ 2.10 GHz (2 processors) (Intel Corporation) with 32 GB of RAM, running the Windows 10 Enterprise operating system (Microsoft Corporation). Access was time-bound based on the approved period of study according to the respective ethics review boards’ decisions.

2.2. Descriptive Analysis

We performed a retrospective analysis of all patients who had undergone colorectal surgery between 1 January 2012 and 31 December 2017 for SH-1 and between 1 January 2015 and 31 December 2019 for SH-2. Common data fields were mapped between datasets from both study sites and used in the study. Patient demographics included age, gender, height, weight and body mass index (BMI). Surgery-related factors included surgery procedure codes, number of procedures done in the surgery, first and second surgeon codes, principal anesthetist codes, anesthesia type (local, general or regional anesthesia), patient case type (inpatient or day surgery), OR location, OR code and ASA scores. The Table of Surgical Procedures (TOSP) code, used in SH-1, is a categorical variable used for billing purposes [28]. TOSP codes provide some information on the complexity of the procedure, with higher levels representing greater complexity. SH-2 uses the categorical Current Procedural Terminology (CPT) codes, which similarly reflect the complexity of the procedures and services [29]. The TOSP and CPT codes for colorectal surgeries used in this study were mapped by the surgical domain experts and are shown in Table A1, Appendix A. The list of mapped fields across the two institutions is shown in Table 1. In SH-2, the categorical CPT codes per case were transformed into the relative value unit (RVU), which is a single continuous variable (see Table 2). The RVU is a consensus-driven billing indicator that can serve as a proxy for procedure workload and replace CPT codes as a more informative feature for surgical duration predictions [30,31]. The scheduled/listing duration for the surgical case as well as the moving average (MA) durations were also included.
A total of 7585 cases from SH-1 and 3597 cases from SH-2 were included. The data cleaning process is shown in Figure 1 for both SH-1 and SH-2. The mean surgery durations for SH-1 and SH-2 were 102 min and 128 min, respectively. The mean patient ages at the two sites were 54.8 and 54.4 years, respectively.

2.3. Moving Average Estimation

The existing EMR systems used for surgical case management at SH-1 and SH-2 both adopt a moving average (MA) of historical surgery durations to provide an estimated duration for each surgery case. SH-1 uses Allscripts [23], whilst SH-2 uses the Epic system [25]. The MA algorithm calculates a historical moving average of actual surgery duration by grouping surgical procedure codes and surgeon codes over a specified period of time. The OR schedulers, who schedule surgery cases into the respective EMR systems, are able to override the estimates with their own estimated durations.
The historical moving average of the actual case length for each procedure and surgeon code combination was used as the prediction for the next surgery with the same procedure code conducted by the same surgeon. If there are fewer than five cases for a particular surgeon and procedure code combination, the MA of the surgical duration for that particular procedure (regardless of surgeon) is utilized. If the data are insufficient for this grouping as well, no MA estimate is provided and the scheduler has to provide an estimate instead. If there are sufficient data for the MA estimates, data below the 10th percentile and above the 90th percentile are excluded from the MA calculation. If there are no manual overrides on the MA prediction, the MA-based duration is recorded as the scheduled duration and is used to schedule cases in the system. The existing estimates as well as the scheduled durations are used as the baseline against which new predictions developed by machine learning algorithms in this study are compared.
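The fallback logic described above can be sketched in a few lines of Python. This is a minimal illustration and not the vendors' implementation: the function names, the tuple layout of `history`, the reuse of the five-case threshold for the procedure-level fallback, and the choice of the "inclusive" percentile method are all assumptions, and the look-back period mentioned in the text is not modelled.

```python
from statistics import mean, quantiles

MIN_CASES = 5  # threshold for the surgeon + procedure grouping stated in the text


def trimmed_mean(durations):
    """Mean after dropping values below the 10th and above the 90th percentile."""
    deciles = quantiles(durations, n=10, method="inclusive")
    low, high = deciles[0], deciles[-1]
    return mean(d for d in durations if low <= d <= high)


def ma_estimate(history, procedure, surgeon):
    """Return an MA estimate in minutes, or None if the scheduler must supply one.

    history: iterable of (procedure_code, surgeon_code, actual_minutes) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # First choice: same procedure performed by the same surgeon.
    by_pair = [d for p, s, d in history if p == procedure and s == surgeon]
    if len(by_pair) >= MIN_CASES:
        return trimmed_mean(by_pair)
    # Fallback: same procedure regardless of surgeon.
    # Assumption: the same five-case minimum applies at this level.
    by_procedure = [d for p, _, d in history if p == procedure]
    if len(by_procedure) >= MIN_CASES:
        return trimmed_mean(by_procedure)
    return None
```

A surgeon with only two recorded cases of a procedure would therefore receive the procedure-level estimate, while a procedure code with no usable history yields `None` and is left to the scheduler.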
Cases which were listed as emergency cases as well as those with missing actual surgery duration were excluded from the study. Both single and multiple procedure surgeries were included. The data cleaning process is summarized in Figure 1. Cases with missing values for either scheduled duration or MA duration were excluded. Outcome metrics were compared on the available cases for each individual duration type, as well as on a common set of valid cases.
The outcome of interest was the difference between the predicted and the actual surgery durations for each surgical case. Surgery duration was defined as the time between the point when the patient is wheeled into the OR and the point when the patient is wheeled out of the OR. For each comparison, we computed the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and the percentage of cases in which the estimated duration (scheduled, MA or model-predicted) and the actual duration agreed to within 20%.
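For concreteness, the four outcome metrics can be computed as in the sketch below. It assumes durations in minutes and takes the 20% band relative to the actual duration, which is how the Results section phrases it for the ML models; the function name and dictionary keys are illustrative only.

```python
import math


def duration_metrics(predicted, actual, band=0.20):
    """RMSE, MAE, MAPE (%) and share of cases (%) within +/- band of actual."""
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Percentage error is taken relative to the actual duration (assumption).
    mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
    within = 100 * sum(abs(e) <= band * a for e, a in zip(errors, actual)) / n
    return {"rmse": rmse, "mae": mae, "mape": mape, "within_band_pct": within}
```

For example, predictions of 100, 110 and 50 min against three actual durations of 100 min give an MAE of 20 min, a MAPE of 20% and two of the three cases inside the band.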
The predictive models were generated with the categorical gradient boosting (CatBoost, version 0.2.1) [32] package in Python 3.6 [27]. The dataset of each study hospital was split into 80% for training and 20% for testing. The models were trained on the common data fields (Patient and Surgery factors) shared by both datasets, with different permutations of additional key variables such as the Moving Average, Scheduled/Listing Duration and TOSP Code/RVU. The CatBoost models with features listed in Table 2 were compared. SH-1’s models were trained and tested on SH-1’s dataset and SH-2’s models were trained and tested on SH-2’s dataset.
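A per-hospital 80/20 split of this kind could look like the following sketch. The text does not state whether the split was random or chronological; a seeded random shuffle is assumed here, and `split_cases` is a hypothetical helper name.

```python
import random


def split_cases(cases, test_fraction=0.20, seed=0):
    """Shuffle the cases reproducibly and return (train, test) lists."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # assumption: random, not time-based
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Applied to the 7585 SH-1 cases, this yields 6068 training and 1517 test cases; each hospital's dataset would be split separately, since the models are trained and tested per site.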
Hyperparameter optimization was conducted using a grid search with four-fold cross validation to determine the optimal parameters for the models. The following parameters were used for the CatBoost models: Number of Iterations: 200; Maximum Tree Depth: 5, 6, 7, 8, 9, 10, 11; Learning Rate: 0.01, 0.03, 0.05, 0.1, 0.2, 0.3; Loss Function: Root Mean Square Error (RMSE). The parameters which provided the lowest cross validation RMSE score were chosen for the final model. Feature importance for the CatBoost models was evaluated based on the amount by which the prediction value changes with respect to a change in the predictor variable [32].
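The size of this search is easy to make explicit: 7 depths by 6 learning rates gives 42 candidate settings, each fitted on 4 folds. The sketch below enumerates the grid and selects the setting with the lowest mean cross-validation RMSE. The dictionary keys match CatBoost's parameter names, but the scoring callback `cv_rmse` is a hypothetical stand-in for training and scoring a model on one fold, and the contiguous fold layout is an assumption.

```python
from itertools import product

DEPTHS = [5, 6, 7, 8, 9, 10, 11]
LEARNING_RATES = [0.01, 0.03, 0.05, 0.1, 0.2, 0.3]
N_FOLDS = 4


def param_grid():
    """All 7 x 6 = 42 combinations listed in the text."""
    return [
        {"iterations": 200, "depth": d, "learning_rate": lr, "loss_function": "RMSE"}
        for d, lr in product(DEPTHS, LEARNING_RATES)
    ]


def fold_indices(n_cases, n_folds=N_FOLDS):
    """Yield (train_idx, valid_idx) pairs for contiguous, near-equal folds."""
    bounds = [i * n_cases // n_folds for i in range(n_folds + 1)]
    for k in range(n_folds):
        valid = list(range(bounds[k], bounds[k + 1]))
        train = list(range(bounds[k])) + list(range(bounds[k + 1], n_cases))
        yield train, valid


def grid_search(n_cases, cv_rmse):
    """Return (best_params, best_mean_rmse).

    cv_rmse(params, train_idx, valid_idx) is supplied by the caller and is
    expected to fit a model on train_idx and return its RMSE on valid_idx.
    """
    best_params, best_score = None, float("inf")
    for params in param_grid():
        scores = [cv_rmse(params, tr, va) for tr, va in fold_indices(n_cases)]
        mean_score = sum(scores) / len(scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

In practice the callback would wrap a CatBoost regressor built from `params`; keeping it as an argument lets the selection logic be checked without the training data.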

3. Results

Scheduler and System Average Performance

Table 3 compares the performance of the scheduled duration and the MA duration against the actual surgery duration. The MA algorithm provides better performance than the scheduled durations across all evaluation metrics for both datasets. The proportion of cases with the actual duration falling within 80–120% of the listed duration is higher in SH-2. A higher proportion of cases was found to be overestimated in the SH-2 dataset, whereas in SH-1 a higher proportion of cases was found to be underestimated (<80% of actual duration) (see Figure 2).
Table 4 and Table 5 show the performance of the various SH-1 and SH-2 predictive models based on the test dataset. The results showed that, with the ML-based models (Models 0–5), SH-1 could predict at least 40% of cases within +/−20% of the actual duration, while SH-2 could predict at least approximately 50% of its cases within +/−20% of the actual duration. Based on the +/−20% prediction band, the ML-based models in both hospitals showed better prediction accuracy than the existing MA models that each individual hospital uses.
In SH-1, Model 5 showed the best performance as shown in Table 4. Model 4 has slightly higher MAE, MAPE and RMSE, as compared to Model 5, but it shows a better prediction accuracy (within +/−20% deviation from the actual duration). Nonetheless, both Models 4 and 5 have at least a 5% higher prediction accuracy than that of the MA. Among the five models in SH-2, Table 5 shows that Model 5 has the best performance, with 56.11% of its predictions falling within +/−20% of the actual duration. Model 5 prediction accuracy (within +/−20%) is 7.78% higher than that of the MA. Model 5 also has the lowest RMSE, MAE and MAPE at 38.48%, 23.61% and 23.36%, respectively.
The feature importance of each predictor variable is averaged across all of the decision trees within the model. The best performing models (Model 5 for both SH-1 and SH-2) were used to plot the feature importance shown in Figure 3 and Figure 4.

4. Discussion

The objective of this study was to explore and determine the performance of current surgery case duration estimations and the use of machine learning models to predict surgery duration across two large tertiary healthcare institutions located in the United States and Singapore. The two healthcare institutions have different EMR systems, coding and representation of surgical details such as surgical procedure codes. The Table of Surgical Procedure (TOSP) codes [28] were used in SH-1 whilst the Current Procedural Terminology (CPT) [29] codes were used in SH-2. The two types of codes were mapped by surgical domain experts. The mapping table is given in Appendix A.
The validation results showed that, in both study sites, the simple MA-based predictions outperform the scheduled durations provided by the OR schedulers across RMSE, MAE, MAPE and the proportion of cases within the 80–120% band. Every minute of improved duration estimates would help in improving the efficiency of OR performance [15,33]. MA-based predictions have been frequently reported in the literature. Similar to the existing literature [3,6,34], both hospitals have been using simple MA-based methods, such as Last-5 [6], which uses the average of the most recent five cases in the relevant history for the prediction. Simple MA methods can be accurate in the estimation of surgery durations across multiple sites.
The baseline machine-learning (ML) models which considered patient, surgeon and surgery-related factors without the MA (Model 0) show improved performance for SH-1 across all metrics against the scheduled durations and MA estimates. Extending from the baseline model, Models 2, 4 and 5 included the MA features to improve the performance and generalizability of the predictions. The improved performance of ML-based models is similar to results that were recently reported [8,10,17,35]. For both SH-1 and SH-2, the majority of the contributions to the model came from the MA and scheduled duration. For SH-1, the next five variables with the highest contributions are Procedure Surgical Table Code, OT Code, First Surgeon ID, OT Location Code and BMI. For SH-2, the next five variables with the highest contributions are RVU, Patient Class, Primary Physician ID, OR Location and Number of Procedures. In both SH-1 and SH-2, MA, scheduled duration and TOSP Code (for SH-1) or RVU (for SH-2) have higher loadings in the model than surgeon factors. However, the order of importance of the other variables differs slightly between the two sites. For both sites, the variables describing the complexity of the surgery (TOSP Code in SH-1 and RVU in SH-2) have relatively high loadings in the prediction models. The presence of the MA, Scheduled/Listing Duration and Procedure Surgical Table Code (SH-1) or RVU (SH-2) together only in Model 5 may have resulted in the better performance of this model. These features have the highest contributions in the Model 5 feature importance for SH-1 and SH-2, as shown in Figure 3 and Figure 4.
As the study sites utilized similar datasets across different study periods, there may be concerns about model bias. However, at both sites during the study horizon, there were no significant shifts in colorectal surgical procedures, in the design of the EMR systems or in the extract-transform-load (ETL) system within the enterprise data warehouse. Moreover, each hospital has its own trained CatBoost ML model [32], so the different study period used for one model will not affect the other. The framework using CatBoost ML models has been tuned to provide the best prediction models based on the lowest cross validation RMSE. This result can be further evaluated in future studies in collaboration with more study sites. The collaborative PACE platform [22] has been shown to facilitate such studies across two different jurisdictions.
Electronic health record (EHR) data are extremely sensitive and valuable and require a protected environment to work in. This can be difficult and time consuming to achieve even within one institution. Duke PACE [22] provides a secured and protected environment to query and store these data and perform advanced analysis. This study demonstrates that PACE can provide the platform to share EHR data between two institutions in two countries and can facilitate the use of advanced machine-learning tools to predict surgical durations. Similar features were used in the prediction models developed at both sites (see Table 1). This study shows a viable route to facilitating future collaboration between institutions around the world. The collaboration through PACE demonstrated the feasibility of data sharing, hypothesis validation and collaborative development of analytical models to support better clinical decisions that can improve system, process and patient outcomes.

5. Conclusions

In this study, we compared the performance of existing MA-based estimates with novel ML-based predictive models for surgery durations across two large tertiary healthcare institutions. The ML-based models which considered additional patient, surgeon and surgery related factors show improved performance over both the MA-based method and the scheduled durations across multiple accuracy metrics. The ML-based models can be deployed in place of the existing MA-based estimates. Additional patient-related factors (e.g., comorbidities) could potentially help to further improve the accuracies of the predictions.
We further demonstrated the use of the Duke PACE as the collaborative platform for facilitating data-sharing and analysis across both healthcare institutions for cross border and cross-institutional studies. Duke PACE was able to overcome the impediments in data sharing and governance policies to support collaborative work across multiple jurisdictions.

Author Contributions

The work presented here was carried out in collaboration amongst all authors. S.S.W.L., W.W., D.B., C.M. and H.K.T. were responsible for the conception and study design. S.S.W.L., H.Z., D.B. and B.Y.A. did the literature review, modelling, data analysis and drafting the manuscript. S.S.W.L., H.Z., B.Y.A. and D.B. made significant revisions. S.S.W.L., W.W., D.B., C.M. and H.K.T. supervised the analysis, modelling and the interpretation of data. All authors have read and agreed to the published version of the manuscript.

Funding

This project is funded by the Duke/Duke-NUS Research Collaboration Pilot Project Award (Duke/Duke-NUS/RECA(Pilot)/2019/0058).

Institutional Review Board Statement

Ethical review and approval were waived for this study because it does not involve the use of human biological material or individually identifiable health information and hence does not meet the definition of human biomedical research (SingHealth Centralized Institutional Review Board, CIRB Ref: 2018/2558; Duke University Health System Institutional Review Board, Study ID: Pro00104275).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study may be available on request from the corresponding author subject to legal or collaboration agreements. The data are not publicly available due to the proprietary nature of the data.

Acknowledgments

We would like to express our deep gratitude to Ms Ginny Chen from Health Services Research Centre, Singapore Health Services and Health Services Research Institute, SingHealth Duke-NUS Academic Medical Centre for her support in this collaborative research.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Mapping of CPT to TOSP codes.
Table A1. Mapping of CPT to TOSP codes.
Anatomical SystemType of ProcedureCPT CodeTOSP CodesTOSP Table
DigestiveHepatectomy47120SF815L4C
DigestiveHepatectomy47120SF813L5C
DigestiveHepatectomy47122SF809L7C
DigestiveHepatectomy47125SF812L6B
DigestiveHepatectomy47130SF812L6B
DigestiveAppendectomy44950SF849A3B
DigestiveAppendectomy44950SF723A4A
DigestiveAppendectomy44960SF849A3B
DigestiveAppendectomy44960SF723A4A
DigestiveAppendectomy44970SF849A3B
DigestiveAppendectomy44970SF723A4A
DigestiveColorectal44140SF701C6C
DigestiveColorectal44140SF803C5C
DigestiveColorectal44140SF806C5C
DigestiveColorectal44143SF808R5C
DigestiveColorectal44144SF808R5C
DigestiveColorectal44145SF805R6C
DigestiveColorectal44145SF703R6C
DigestiveColorectal44145SF807R6B
DigestiveColorectal44146SF805R6C
DigestiveColorectal44146SF703R6C
DigestiveColorectal44146SF807R6B
DigestiveColorectal44147SF703R6C
DigestiveColorectal44150SF804C6A
DigestiveColorectal44150SF712C6A
DigestiveColorectal44151SF804C6A
DigestiveColorectal44151SF712C6A
DigestiveColorectal44160SF803C5C
DigestiveColorectal44204SF701C6C
DigestiveColorectal44204SF803C5C
DigestiveColorectal44204SF806C5C
DigestiveColorectal44205SF803C5C
DigestiveColorectal44206SF808R5C
DigestiveColorectal44207SF805R6C
DigestiveColorectal44207SF703R6C
DigestiveColorectal44207SF807R6B
DigestiveColorectal44208SF805R6C
DigestiveColorectal44208SF703R6C
DigestiveColorectal44208SF807R6B
DigestiveColorectal44210SF712C6A
DigestiveColorectal44210SF804C6A
DigestiveEsophagectomy43101SF802E5B
DigestiveEsophagectomy43107SF809E7B
DigestiveEsophagectomy43108SM702L7C
DigestiveEsophagectomy43112SF809E7B
DigestiveEsophagectomy43112SM702L7C
DigestiveEsophagectomy43113SF809E7B
DigestiveEsophagectomy43113SM702L7C
DigestiveEsophagectomy43116SF806E7C
DigestiveEsophagectomy43117SF804E6B
DigestiveEsophagectomy43117SF809E7B
DigestiveEsophagectomy43118SF804E6B
DigestiveEsophagectomy43118SF809E7B
DigestiveEsophagectomy43121SF804E6B
DigestiveEsophagectomy43122SF804E6B
DigestiveEsophagectomy43123SF804E6B
DigestiveEsophagectomy43124SF812E3A
DigestiveEsophagectomy43124SF806E7C
DigestivePancreatectomy48120SF705P4C
DigestivePancreatectomy48120SF706P5A
DigestivePancreatectomy48140SF708P5B
DigestivePancreatectomy48145SF809P7C
DigestivePancreatectomy48145SF712P5C
DigestivePancreatectomy48146SF703P7A
DigestivePancreatectomy48146SF704P7A
DigestivePancreatectomy48148SF807B5C
DigestivePancreatectomy48150SF809P7C
DigestivePancreatectomy48152SF809P7C
DigestivePancreatectomy48153SF809P7C
DigestivePancreatectomy48154SF809P7C
DigestivePancreatectomy48155SF809P7C
DigestiveColorectal44155SF712C6A
DigestiveColorectal44155SF804C6A
DigestiveColorectal44155SF805C6B
DigestiveColorectal44156SF805C6B
DigestiveColorectal44157SF805C6B
DigestiveColorectal44157SF713C6C
DigestiveColorectal44158SF713C6C
DigestiveColorectal44211SF713C6C
DigestiveColorectal44212SF712C6A
DigestiveColorectal44212SF804C6A
DigestiveColorectal44212SF805C6B
DigestiveColorectal45110SF845A6B
DigestiveColorectal45110SF805R6C
DigestiveColorectal45111SF805C6B
DigestiveColorectal45111SF701R5C
DigestiveColorectal45112SF807R6B
DigestiveColorectal45113SF807R6B
DigestiveColorectal45114SF701R5C
Digestive | Colorectal | 45116 | SF701R | 5C
Digestive | Colorectal | 45119 | SF807R | 6B
Digestive | Colorectal | 45120 | SF803R | 5C
Digestive | Colorectal | 45120 | SF700R | 5C
Digestive | Colorectal | 45121 | SF803R | 5C
Digestive | Colorectal | 45126 | SF703R | 6C
Digestive | Colorectal | 45126 | SF808R | 5C
Digestive | Colorectal | 45126 | SF805A | 6B
Digestive | Colorectal | 45130 | SF700R | 5C
Digestive | Colorectal | 45135 | SF700R | 5C
Digestive | Colorectal | 45160 | SF701R | 5C
Digestive | Colorectal | 45395 | SF805C | 6B
Digestive | Colorectal | 45395 | SF805C | 6B
Digestive | Colorectal | 45397 | SF713C | 6C
Digestive | Colorectal | 45402 | SF701R | 5C
Digestive | Colorectal | 45550 | SF701R | 5C
Endocrine | Thyroid | 60200 | SJ801T | 3B
Endocrine | Thyroid | 60210 | SJ802T | 4A
Endocrine | Thyroid | 60212 | SJ802T | 4A
Endocrine | Thyroid | 60220 | SJ804T | 6A
Endocrine | Thyroid | 60220 | SJ802T | 4A
Endocrine | Thyroid | 60225 | SJ804T | 6A
Endocrine | Thyroid | 60225 | SJ802T | 4A
Endocrine | Thyroid | 60240 | SJ803T | 5C
Endocrine | Thyroid | 60240 | SJ703T | 6C
Endocrine | Thyroid | 60252 | SJ702T | 6A
Endocrine | Thyroid | 60254 | SJ702T | 6A
Endocrine | Thyroid | 60260 | SJ702T | 6A
Endocrine | Thyroid | 60270 | SJ702T | 6A
Endocrine | Thyroid | 60271 | SJ702T | 6A
Reproductive | Hysterectomy/Myomectomy | 58140 | SI816U | 3B
Reproductive | Hysterectomy/Myomectomy | 58146 | SI815U | 5A
Reproductive | Hysterectomy/Myomectomy | 58150 | SI803U | 4A
Reproductive | Hysterectomy/Myomectomy | 58150 | SI804U | 5C
Reproductive | Hysterectomy/Myomectomy | 58150 | SI805U | 5C
Reproductive | Hysterectomy/Myomectomy | 58150 | SI812U | 5C
Reproductive | Hysterectomy/Myomectomy | 58152 | SI702U | 4C
Reproductive | Hysterectomy/Myomectomy | 58180 | SI802U | 4A
Reproductive | Hysterectomy/Myomectomy | 58210 | SI825U | 5C
Reproductive | Hysterectomy/Myomectomy | 58210 | SI827U | 5A
Reproductive | Hysterectomy/Myomectomy | 58210 | SI828U | 4A
Reproductive | Hysterectomy/Myomectomy | 58240 | SI824U | 6B
Reproductive | Hysterectomy/Myomectomy | 58260 | SI837U | 4A
Reproductive | Hysterectomy/Myomectomy | 58260 | SI713V | 4A
Reproductive | Hysterectomy/Myomectomy | 58262 | SI723U | 4B
Reproductive | Hysterectomy/Myomectomy | 58263 | SI721U | 4B
Reproductive | Hysterectomy/Myomectomy | 58270 | SI713V | 4A
Reproductive | Hysterectomy/Myomectomy | 58290 | SI837U | 4A
Reproductive | Hysterectomy/Myomectomy | 58290 | SI713V | 4A
Reproductive | Hysterectomy/Myomectomy | 58291 | SI723U | 4B
Reproductive | Hysterectomy/Myomectomy | 58292 | SI721U | 4B
Reproductive | Hysterectomy/Myomectomy | 58294 | SI713V | 4A
Reproductive | Hysterectomy/Myomectomy | 58541 | SI713U | 4B
Reproductive | Hysterectomy/Myomectomy | 58542 | SI713U | 4B
Reproductive | Hysterectomy/Myomectomy | 58543 | SI712U | 5A
Reproductive | Hysterectomy/Myomectomy | 58544 | SI712U | 5A
Reproductive | Hysterectomy/Myomectomy | 58545 | SI709U | 3C
Reproductive | Hysterectomy/Myomectomy | 58546 | SI700O | 4B
Reproductive | Hysterectomy/Myomectomy | 58548 | SI800O | 5C
Reproductive | Hysterectomy/Myomectomy | 58548 | SI804O | 4A
Reproductive | Hysterectomy/Myomectomy | 58550 | SI718U | 4B
Reproductive | Hysterectomy/Myomectomy | 58552 | SI718U | 4B
Reproductive | Hysterectomy/Myomectomy | 58553 | SI718U | 4B
Reproductive | Hysterectomy/Myomectomy | 58554 | SI718U | 4B
Reproductive | Hysterectomy/Myomectomy | 58570 | SI713U | 4B
Reproductive | Hysterectomy/Myomectomy | 58572 | SI712U | 5A
Reproductive | Hysterectomy/Myomectomy | 58940 | SI805O | 3B
Reproductive | Hysterectomy/Myomectomy | 58951 | SI800O | 5C
Reproductive | Hysterectomy/Myomectomy | 58951 | SI711U | 6A
Reproductive | Hysterectomy/Myomectomy | 58953 | SI804O | 4A
Reproductive | Hysterectomy/Myomectomy | 58954 | SI800O | 5C
Reproductive | Hysterectomy/Myomectomy | 58954 | SI804O | 4A
Musculoskeletal | THA | 27125 | SB838H | 5C
Musculoskeletal | THA | 27130 | SB839H | 6A
Musculoskeletal | THA | 27130 | SB723H | 6B
Musculoskeletal | THA | 27132 | SB724H | 6C
Musculoskeletal | THA | 27134 | SB724H | 6C
Musculoskeletal | THA | 27137 | SB724H | 6C
Musculoskeletal | THA | 27138 | SB724H | 6C
Kidney | Nephrectomy | 50220 | SG816K | 4B
Kidney | Nephrectomy | 50225 | SG816K | 4B
Kidney | Nephrectomy | 50230 | SG804K | 5C
Kidney | Nephrectomy | 50234 | SG800K | 5C
Kidney | Nephrectomy | 50236 | SG800K | 5C
Kidney | Nephrectomy | 50240 | SG721K | 5C
Kidney | Nephrectomy | 50543 | SG720K | 6A
Kidney | Nephrectomy | 50545 | SG710K | 6A
Kidney | Nephrectomy | 50546 | SG700K | 6A
Kidney | Nephrectomy | 50546 | SG722K | 4C
Kidney | Nephrectomy | 50548 | SG700K | 6A
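
Several CPT codes in the mapping rows above correspond to more than one TOSP code, so a lookup keyed on the CPT code needs to return a list rather than a single value. The following is a minimal Python sketch using a handful of rows copied from the table. The column interpretation (body system, procedure group, CPT code, TOSP code, TOSP table) and the names `build_crosswalk` and `tosp_for_cpt` are assumptions made for illustration and are not taken from the study's code.

```python
from collections import defaultdict

# A few rows copied from the mapping table above.
# ASSUMED column order: system, procedure group, CPT code, TOSP code, TOSP table.
ROWS = [
    ("Digestive", "Colorectal", "45126", "SF703R", "6C"),
    ("Digestive", "Colorectal", "45126", "SF808R", "5C"),
    ("Digestive", "Colorectal", "45126", "SF805A", "6B"),
    ("Endocrine", "Thyroid", "60220", "SJ804T", "6A"),
    ("Endocrine", "Thyroid", "60220", "SJ802T", "4A"),
    ("Kidney", "Nephrectomy", "50546", "SG700K", "6A"),
    ("Kidney", "Nephrectomy", "50546", "SG722K", "4C"),
]


def build_crosswalk(rows):
    """Group (TOSP code, TOSP table) pairs under each CPT code, keeping row order."""
    cpt_to_tosp = defaultdict(list)
    for _system, _group, cpt, tosp, table in rows:
        cpt_to_tosp[cpt].append((tosp, table))
    return dict(cpt_to_tosp)


CROSSWALK = build_crosswalk(ROWS)


def tosp_for_cpt(cpt):
    """Return every (TOSP code, TOSP table) pair mapped to a CPT code, or []."""
    return CROSSWALK.get(cpt, [])


print(tosp_for_cpt("45126"))
```

Returning an empty list for an unmapped code keeps downstream code free of special cases; whether unmapped codes should instead raise an error depends on how the mapping is used.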
Table A2. List of features present in each CatBoost model.
SN | SH-1 Data Fields | SH-2 Data Fields | Model 0 (Baseline) | Model 1 | Model 2 | Model 3 | Model 4 | Model 5
1 | OT Code | Room
2 | Actual Duration | In-Out Duration
3 | First Surgeon Department Code | Service Type
4 | Priority of Operation | Case Class
5 | Department Code | Division
6 | OT Location Code | Location
7 | Procedure Code | CPT List
8 | Type of Anesthesia | Primary Anesthesia Type
9 | ASA Status | ASA Rating
10 | Age | Patient Age
11 | Gender | Sex
12 | Visit Type | Patient Class
13 | BMI | BMI
14 | Height | Height
15 | Weight | Weight
16 | First Surgeon ID | Primary Physician ID
17 | Second Surgeon ID | Secondary Physician ID
18 | Principal Anesthetist ID | First Anesthetist ID
19 | MA 1year_3rd | MA 1year_3rd (calculated)
20 | Number of Procedures | Number of Procedures
21 | Number of Panels | Number of Panels
22 | Multiple Procedure Codes | Sorted CPT List
23 | Listing Duration | Scheduled Duration
Notations: OT: Operating Theatre; CPT: Current Procedural Terminology (transformed into Relative Value Units); ASA: American Society of Anesthesiologists; BMI: Body Mass Index; ID: Unique identifier; MA: Moving Average.

References

  1. Ang, W.; Sabharwal, S.; Johannsson, H.; Bhattacharya, R.; Gupte, C. The cost of trauma operating theatre inefficiency. Ann. Med. Surg. 2016, 7, 24–29. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Weiser, T.G.; Regenbogen, S.E.; Thompson, K.D.; Haynes, A.B.; Lipsitz, S.R.; Berry, W.R.; Gawande, A.A. An estimation of the global volume of surgery: A modelling strategy based on available data. Lancet 2008, 372, 139–144. [Google Scholar] [CrossRef]
  3. Kayış, E.; Khaniyev, T.T.; Suermondt, J.; Sylvester, K. A robust estimation model for surgery durations with temporal, operational, and surgery team effects. Health Care Manag. Sci. 2014, 18, 222–233. [Google Scholar] [CrossRef] [PubMed]
  4. Memon, A.G.; Naeem, Z.; Zaman, A. Occupational Health Related Concerns among Surgeons. Int. J. Health Sci. 2016, 10, 265–277. [Google Scholar] [CrossRef]
  5. Erdogan, S.A.; Denton, B.T. Surgery Planning and Scheduling. In Wiley Encyclopedia of Operations Research and Management Science; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2011. [Google Scholar]
  6. Kayis, E.; Wang, H.; Patel, M.; Gonzalez, T.; Jain, S.; Ramamurthi, R.J.; Santos, C.; Singhal, S.; Suermondt, J.; Sylvester, K. Improving prediction of surgery duration using operational and temporal factors. AMIA Annu. Symp. Proc. 2012, 2012, 456–462. [Google Scholar] [PubMed]
  7. Thiels, C.A.; Yu, D.; Abdelrahman, A.M.; Habermann, E.B.; Hallbeck, S.; Pasupathy, K.S.; Bingener, J. The use of patient factors to improve the prediction of operative duration using laparoscopic cholecystectomy. Surg. Endosc. 2016, 31, 333–340. [Google Scholar] [CrossRef] [PubMed]
  8. Bellini, V.; Guzzon, M.; Bigliardi, B.; Mordonini, M.; Filippelli, S.; Bignami, E. Artificial Intelligence: A New Tool in Operating Room Management. Role of Machine Learning Models in Operating Room Optimization. J. Med. Syst. 2019, 44, 20. [Google Scholar] [CrossRef] [PubMed]
  9. Hosseini, N.; Sir, M.Y.; Jankowski, C.J.; Pasupathy, K.S. Surgical Duration Estimation via Data Mining and Predictive Modeling: A Case Study. AMIA Annu. Symp. Proc. 2015, 2015, 640–648. [Google Scholar] [PubMed]
  10. Tuwatananurak, J.P.; Zadeh, S.; Xu, X.; Vacanti, J.A.; Fulton, W.R.; Ehrenfeld, J.M.; Urman, R.D. Machine Learning Can Improve Estimation of Surgical Case Duration: A Pilot Study. J. Med. Syst. 2019, 43, 44. [Google Scholar] [CrossRef] [PubMed]
  11. Edelman, E.R.; Van Kuijk, S.M.J.; Hamaekers, A.E.W.; De Korte, M.J.M.; Van Merode, G.G.; Buhre, W.F.F.A. Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling. Front. Med. 2017, 4, 85. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Taaffe, K.; Pearce, B.; Ritchie, G. Using kernel density estimation to model surgical procedure duration. Int. Trans. Oper. Res. 2018, 28, 401–418. [Google Scholar] [CrossRef]
  13. Stepaniak, P.S.; Heij, C.; Mannaerts, G.H.H.; de Quelerij, M.; de Vries, G. Modeling Procedure and Surgical Times for Current Procedural Terminology-Anesthesia-Surgeon Combinations and Evaluation in Terms of Case-Duration Prediction and Operating Room Efficiency: A Multicenter Study. Anesth. Analg. 2009, 109, 1232–1245. [Google Scholar] [CrossRef] [PubMed]
  14. Ng, N.H.; Gabriel, R.A.; McAuley, J.; Elkan, C.; Lipton, Z.C. Predicting Surgery Duration with Neural Heteroscedastic Regression. In Proceedings of the 2nd Machine Learning for Healthcare Conference, Boston, MA, USA, 18–19 August 2017; Finale, D.-V., Ed.; PMLR (Proceedings of Machine Learning Research): Freiburg, Germany, 2017; pp. 100–111. [Google Scholar]
  15. ShahabiKargar, Z.; Khanna, S.; Good, N.; Sattar, A.; Lind, J.; O’Dwyer, J. Predicting Procedure Duration to Improve Scheduling of Elective Surgery. In Lecture Notes in Computer Science; Springer International Publishing: Berlin/Heidelberg, Germany, 2014; pp. 998–1009. [Google Scholar]
  16. Li, T.; Yang, K.; Stein, J.D.; Nallasamy, N. Gradient Boosting Decision Tree Algorithm for the Prediction of Postoperative Intraocular Lens Position in Cataract Surgery. Transl. Vis. Sci. Technol. 2020, 9, 38. [Google Scholar] [CrossRef] [PubMed]
  17. Bartek, M.A.; Saxena, R.C.; Solomon, S.; Fong, C.T.; Behara, L.D.; Venigandla, R.; Velagapudi, K.; Lang, J.D.; Nair, B.G. Improving Operating Room Efficiency: Machine Learning Approach to Predict Case-Time Duration. J. Am. Coll. Surg. 2019, 229, 346–354.e3. [Google Scholar] [CrossRef] [PubMed]
  18. Evans, S. Introduction to the PACE Project. In Computers and Medicine; Springer: New York, NY, USA, 1997; pp. 1–25. [Google Scholar]
  19. SingHealth Annual Reports. Available online: https://www.singhealth.com.sg/about-singhealth/newsroom/Documents/SingHealth%20Duke-NUS%20AR%202019-20.pdf (accessed on 6 April 2022).
  20. About Duke University Hospital Durham, NC Duke Health. Available online: https://www.dukehealth.org/hospitals/duke-university-hospital (accessed on 6 April 2022).
  21. Facts & Statistics Duke Health. Available online: https://corporate.dukehealth.org/who-we-are/facts-statistics (accessed on 6 April 2022).
  22. Protected Analytics Computing Environment (PACE). Available online: https://pace.ori.duke.edu/ (accessed on 6 April 2022).
  23. Sunrise™. Allscripts. Available online: https://as.allscripts.com/ (accessed on 6 April 2022).
  24. Electronic Health Intelligence System. Available online: https://www.ihis.com.sg/Project_Showcase/Healthcare_Systems/Pages/eHINTS.aspx (accessed on 6 April 2022).
  25. Maestro Care for Research Duke University School of Medicine. Available online: https://medschool.duke.edu/research/research-support/research-support-offices/duke-office-clinical-research-docr/get-docr-0 (accessed on 6 April 2022).
  26. Duo Access Gateway. Duo Security. Available online: https://duo.com/docs/dag (accessed on 6 April 2022).
  27. Python Language Reference. In Python for Bioinformatics; Chapman and Hall/CRC: Boca Raton, FL, USA, 2009; pp. 457–538.
  28. Ministry of Health Table of Surgical Procedures. Available online: https://www.moh.gov.sg/docs/librariesprovider5/medisave/table-of-surgical-procedures-(1-feb-2021).pdf (accessed on 6 April 2022).
  29. American Medical Association. CPT® Overview and Code Approval. Available online: https://www.ama-assn.org/practice-management/cpt/cpt-overview-and-code-approval (accessed on 18 April 2022).
  30. PFS Relative Value Files CMS. Available online: https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Payment/PhysicianFeeSched/PFS-Relative-Value-Files (accessed on 18 April 2022).
  31. Garside, N.; Zaribafzadeh, H.; Henao, R.; Chung, R.; Buckland, D. CPT to RVU conversion improves model performance in the prediction of surgical case length. Sci. Rep. 2021, 11, 14169. [Google Scholar] [CrossRef]
  32. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost-state-of-the-art open-source gradient boosting library with categorical features support. In Proceedings of the Workshop on ML Systems, NIPS 2017, Long Beach, CA, USA, 8 December 2017. [Google Scholar]
  33. Macario, A. What does one minute of operating room time cost? J. Clin. Anesth. 2010, 22, 233–236. [Google Scholar] [CrossRef] [PubMed]
  34. Zhou, J.; Dexter, F.; Macario, A.; Lubarsky, D.A. Relying solely on historical surgical times to estimate accurately future surgical times is unlikely to reduce the average length of time cases finish late. J. Clin. Anesth. 1999, 11, 601–605. [Google Scholar] [CrossRef]
  35. Strömblad, C.T.; Baxter-King, R.G.; Meisami, A.; Yee, S.-J.; Levine, M.R.; Ostrovsky, A.; Stein, D.; Iasonos, A.; Weiser, M.R.; Garcia-Aguilar, J.; et al. Effect of a Predictive Model on Planned Surgical Duration Accuracy, Patient Wait Time, and Use of Presurgical Resources: A Randomized Clinical Trial. JAMA Surg. 2021, 156, 315–321. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Data-cleaning process (MA: Moving Average).
Figure 2. Proportion of Cases within 20% of Actual Durations.
Figure 3. Model 0 Feature Importance for (a) SH-1; (b) SH-2. (Note: Refer to Table 1 for feature mapping between SH-1 and SH-2).
Figure 4. Model 5 Feature Importance for (a) SH-1 (Procedure.Surgical.Table.Code is “Procedure Code”); (b) SH-2 (RVU_TOTAL_CaseMax is “CPT List”). (Note: Refer to Table 1 for feature mapping between SH-1 and SH-2).
Table 1. Field mapping between SH-1 and SH-2.
SN | SH-1 Data Fields | SH-2 Data Fields
1 | OT Code | Room
2 | Actual Duration | In-Out Duration
3 | First Surgeon Department Code | Service Type
4 | Priority of Operation | Case Class
5 | Department Code | Division
6 | OT Location Code | Location
7 | Procedure Code | CPT List
8 | Type of Anesthesia | Primary Anesthesia Type
9 | ASA Status | ASA Rating
10 | Age | Patient Age
11 | Gender | Sex
12 | Visit Type | Patient Class
13 | BMI | BMI
14 | Height | Height
15 | Weight | Weight
16 | First Surgeon ID | Primary Physician ID
17 | Second Surgeon ID | Secondary Physician ID
18 | Principal Anesthetist ID | First Anesthetist ID
19 | MA 1 year_3rd | MA 1 year_3rd (calculated)
20 | Number of Procedures | Number of Procedures
21 | Number of Panels | Number of Panels
22 | Multiple Procedure Codes | Sorted CPT List
23 | Listing Duration | Scheduled Duration
Notations: OT: Operating Theatre; CPT: Current Procedural Terminology (transformed into Relative Value Units); ASA: American Society of Anesthesiologists; BMI: Body Mass Index; ID: Unique identifier; MA: Moving Average.
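
The field mapping in Table 1 can be applied as a simple renaming step so that records from both sites share one set of column names. The sketch below is a hedged illustration in Python: it copies a subset of the Table 1 pairs, and the function name `harmonize` and the example record values are hypothetical rather than part of the study's pipeline.

```python
# SH-2 field name -> SH-1 field name, copied from a subset of Table 1.
SH2_TO_SH1 = {
    "Room": "OT Code",
    "In-Out Duration": "Actual Duration",
    "Case Class": "Priority of Operation",
    "Patient Age": "Age",
    "Sex": "Gender",
    "Scheduled Duration": "Listing Duration",
}


def harmonize(record, mapping=SH2_TO_SH1):
    """Rename the keys of one SH-2 record to SH-1 names.

    Keys without a mapping entry (for example "BMI", which has the same
    name at both sites) pass through unchanged.
    """
    return {mapping.get(key, key): value for key, value in record.items()}


# Hypothetical record, for illustration only.
example = {"Room": "OR-12", "Patient Age": 63, "Sex": "F", "BMI": 24.1}
print(harmonize(example))
```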
Table 2. List of CatBoost models compared (the features present in each model are listed in Table A2 in Appendix A).
Name | Features Considered
Model 0 | Baseline Model which considered patient and surgery factors only
Model 1 | Baseline Model + RVU/Procedure Surgical Table Code
Model 2 | Baseline Model + Moving Average
Model 3 | Baseline Model + Scheduled Duration
Model 4 | Baseline Model + Moving Average + Scheduled Duration
Model 5 | Baseline Model + Moving Average + Scheduled Duration + RVU/Procedure Surgical Table Code
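
The nested structure of Table 2, where each model is the baseline plus one or more feature groups, can be written down as plain lists. The Python sketch below is illustrative only: the feature names follow the SH-1 column of Table 1, but the exact membership of the baseline group is an assumption (Table 2 only states that it covers patient and surgery factors), and `added_features` is a hypothetical helper.

```python
# Feature groups named after the SH-1 columns of Table 1.
# ASSUMPTION: the baseline membership below is a placeholder subset;
# the source only says the baseline covers patient and surgery factors.
BASELINE = ["Age", "Gender", "BMI", "ASA Status", "Type of Anesthesia",
            "Priority of Operation"]
PROCEDURE_CODE = ["Procedure Code"]   # RVU-transformed CPT List at SH-2
MOVING_AVERAGE = ["MA 1 year_3rd"]
SCHEDULED = ["Listing Duration"]      # "Scheduled Duration" at SH-2

MODEL_FEATURES = {
    "Model 0": BASELINE,
    "Model 1": BASELINE + PROCEDURE_CODE,
    "Model 2": BASELINE + MOVING_AVERAGE,
    "Model 3": BASELINE + SCHEDULED,
    "Model 4": BASELINE + MOVING_AVERAGE + SCHEDULED,
    "Model 5": BASELINE + MOVING_AVERAGE + SCHEDULED + PROCEDURE_CODE,
}


def added_features(model_name):
    """Return the features a model adds on top of the baseline, in order."""
    return [f for f in MODEL_FEATURES[model_name] if f not in BASELINE]


for name in MODEL_FEATURES:
    print(name, "adds", added_features(name) or "nothing")
```

Keeping the groups separate makes it straightforward to train one model per entry and attribute any change in error to the group that was added.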
Table 3. Performance of scheduled vs. MA duration.
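Metric | SH-1 Scheduled | SH-1 MA | SH-2 Scheduled | SH-2 MA
N (cases) | 7685 | 7685 | 3597 | 3597
RMSE | 61.5 | 51.5 | 57.5 | 48.2
MAE (mins) | 37.7 | 29.2 | 34.8 | 29.5
MAPE (%) | 7.49% | 2.40% | 15.91% | 5.54%
<=80% | 41.0% | 36.8% | 20.2% | 24.7%
80–120% | 24.8% | 36.6% | 41.4% | 49.8%
>=120% | 34.2% | 26.6% | 38.4% | 25.5%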
RMSE: Root Mean Squared Error; MAE: Mean Absolute Error; MAPE: Mean Absolute Percentage Error.
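
For readers who want to reproduce metrics of this kind on their own data, the following Python sketch computes RMSE, MAE, MAPE and the share of cases falling in each band of the estimate-to-actual ratio. It uses the standard textbook definitions of the three error metrics; the band boundaries (at or below 80%, strictly between 80% and 120%, at or above 120%), the use of estimate divided by actual as the ratio, and the function name `duration_metrics` are assumptions, and the durations in the example are made up.

```python
import math


def duration_metrics(actual, predicted):
    """Compute error metrics for paired actual and estimated durations (minutes).

    ASSUMPTION: bands are defined on predicted / actual, with the 80% and
    120% boundaries assigned to the outer bands.
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    ratios = [p / a for a, p in zip(actual, predicted)]
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n,
        "pct_under": 100.0 * sum(r <= 0.8 for r in ratios) / n,
        "pct_within": 100.0 * sum(0.8 < r < 1.2 for r in ratios) / n,
        "pct_over": 100.0 * sum(r >= 1.2 for r in ratios) / n,
    }


# Made-up durations, for illustration only.
actual = [100, 200, 50, 120]
predicted = [90, 260, 30, 120]
metrics = duration_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```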
Table 4. SH-1 Accuracy & Error Metrics Comparison of Models.
Model | Percentage within ±20% | RMSE | MAE | MAPE
Listing | 24.68% | 62.31 | 37.505 | 65.57%
MA | 37.66% | 55.16 | 28.844 | 46.85%
Model 0 | 40.31% | 48.15 | 26.323 | 36.74%
Model 1 | 43.15% | 47.88 | 25.221 | 34.61%
Model 2 | 43.28% | 47.30 | 24.938 | 35.56%
Model 3 | 41.34% | 46.26 | 25.426 | 34.97%
Model 4 | 42.89% | 45.30 | 24.325 | 34.50%
Model 5 | 44.06% | 45.18 | 23.986 | 34.40%
RMSE: Root Mean Squared Error; MAE: Mean Absolute Error; MAPE: Mean Absolute Percentage Error.
Table 5. SH-2 Accuracy & Error Metrics Comparison of Models.
Model | Percentage within ±20% | RMSE | MAE | MAPE
Listing | 43.06% | 53.57 | 32.167 | 27.63%
MA | 48.33% | 45.39 | 28.19 | 27.30%
Model 0 | 49.86% | 50.845 | 30.492 | 27.23%
Model 1 | 52.78% | 38.817 | 24.412 | 23.83%
Model 2 | 55.42% | 40.9 | 25.529 | 24.90%
Model 3 | 55.28% | 43.208 | 26.18 | 24.54%
Model 4 | 55.42% | 39.367 | 24.518 | 23.79%
Model 5 | 56.11% | 38.482 | 23.61 | 23.36%
RMSE: Root Mean Squared Error; MAE: Mean Absolute Error; MAPE: Mean Absolute Percentage Error.
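
One way to read Tables 4 and 5 side by side is to express the gap between the MA estimate and Model 5 as a relative reduction in MAE at each site. The Python sketch below does that arithmetic using the MAE values copied from the two tables; the helper name `relative_reduction` is hypothetical, and this summary statistic is offered as an illustration rather than as a figure reported by the authors.

```python
# MAE values (minutes) copied from Table 4 (SH-1) and Table 5 (SH-2).
MAE = {
    "SH-1": {"MA": 28.844, "Model 5": 23.986},
    "SH-2": {"MA": 28.19, "Model 5": 23.61},
}


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` lowers an error metric versus `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


for site, values in MAE.items():
    pct = relative_reduction(values["MA"], values["Model 5"])
    print(f"{site}: Model 5 lowers MAE by {pct:.1f}% relative to MA")
```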
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lam, S.S.W.; Zaribafzadeh, H.; Ang, B.Y.; Webster, W.; Buckland, D.; Mantyh, C.; Tan, H.K. Estimation of Surgery Durations Using Machine Learning Methods-A Cross-Country Multi-Site Collaborative Study. Healthcare 2022, 10, 1191. https://doi.org/10.3390/healthcare10071191