**Consensus Definition and Prediction of Complexity in Transurethral Resection or Bladder Endoscopic Dissection of Bladder Tumours**

**Mathieu Roumiguié 1, Evanguelos Xylinas 2, Antonin Brisuda 3, Maximillian Burger 4, Hugh Mostafid 5, Marc Colombel 6, Marek Babjuk 3, Joan Palou Redorta 7, Fred Witjes <sup>8</sup> and Bernard Malavaud 1,\***


Received: 5 September 2020; Accepted: 6 October 2020; Published: 20 October 2020

**Simple Summary:** Transurethral resection of bladder tumours may be technically challenging. Complexity was defined by consensus from the literature by a panel of ten senior urologists as "any TURBT/En-bloc dissection that results in incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications". Patient and tumour characteristics that the panel suggested might relate to complex surgery were collected and then ranked by Delphi consensus. They were tested in the prediction of complexity in 150 clinical scenarios. After univariate and logistic regression analyses, significant characteristics were organized into a checklist that predicts complexity. Receiver operating characteristic (ROC) curves of the regression model and the corresponding calibration curve showed adequate discrimination (AUC = 0.916) and good calibration. The resulting Bladder Complexity Checklist can be used to deliver optimal preoperative information and personalise the organisation of surgery.

**Abstract:** Ten senior urologists were surveyed to develop a predictive model based on factors from which they could anticipate complex transurethral resection of bladder tumours (TURBT). Complexity was defined by consensus. Panel members then used a five-point Likert scale to grade those factors that, in their opinion, drove complexity. Consensual factors were highlighted through two Delphi rounds. Respective contributions to complexity were quantitated by the median values of their scores. Multivariate analysis with complexity as a dependent variable tested their independence in clinical scenarios obtained by random allocation of the factors. The consensus definition of complexity was "any TURBT/En-bloc dissection that results in incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications". Logistic regression highlighted five domains as independent predictors: patient's history, tumour number, location and size, and access to the bladder. Receiver operating characteristic (ROC) analysis confirmed good discrimination (AUC = 0.92). The sum of the scores of the five domains adjusted to their regression coefficients or Bladder Complexity Score yielded comparable performance (AUC = 0.91, C-statistics, *p* = 0.94) and good calibration. As a whole, preoperative factors identified by expert judgement were organized to quantitate the risk of a complex TURBT, a crucial requisite to personalise patient information, adapt human and technical resources to individual situations and address TURBT variability in clinical trials.

**Keywords:** bladder cancer; transurethral resection; en-bloc resection

#### **1. Introduction**

Bladder cancer is the seventh most prevalent cancer worldwide [1] and the sixth leading cause of cancer in the EU, where it entails a significant burden in healthcare organization and cost [2]. Most patients present with non-muscle invasive bladder cancer (NMIBC), for which endoscopic resection or en-bloc dissection of bladder tumours, collectively referred to as transurethral resection of bladder tumour (TURBT), initiates the treatment and informs the risks of recurrence and progression. Pathology also provides information on the adequacy of surgery, that is, visually complete resection and the presence of muscle at the resection base [3]. Although this is the most common procedure in oncologic urology, with over 120,000 new cases across Europe annually [2], few reports have addressed how individual characteristics may challenge the successful completion of surgery [4,5]. In addition, the reported variability of residual disease [6] and the higher performance of experienced surgeons [7] emphasize the demands of "good-quality" TURBT [7]. Moreover, quality represents latent information for the non-expert, contrary to clinical complications, which are self-evident, closely monitored by the public and insurers and used as a proxy for quality metrics [8].

Any system capable of documenting how individual presentations influence surgical outcomes would be of high clinical relevance. Therefore, the objective of the present consensus was to detail and organize the factors based on which experienced urologists anticipate a complex TURBT.

#### **2. Results**

#### *2.1. Step 1: Definition of Complexity*

A PubMed search of "transurethral resection" (of) "bladder" and "morbidity" or "complication", or "mortality" or "death" yielded 585, 664, 9 and 95 articles, respectively. Of these, 89 articles relevant to the process of defining complexity were analysed, obtaining 36 articles (Table S1) which were instrumental in highlighting adequacy, operative time and morbidity as the three drivers that characterize a complex surgery, as opposed to an uneventful procedure [4,8–42].

After a single round of circulation, all panellists validated the following definition of a complex TURBT: "any TURBT/En-bloc dissection that results in incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications".

#### *2.2. Step 2: Items That Drive Complexity*

Eighty-five characteristics that were suggested by the panellists to influence surgery were organized into six chapters consistent with standard medical practice: patient's characteristics and history, tumour characteristics, access to the bladder, bladder anatomy and surgical environment.

Their relevance was researched in two Delphi rounds, which showed consensus for 42 characteristics in the first round (Figures S1–S4) and 83 in the second (Figures 1–4). For any characteristic or item, the median opinion of the panel (Figures 1–4) was then used as the metric to weight its individual contribution to complexity.
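The weighting step can be illustrated with a minimal sketch. It assumes a numeric coding of the five-point Likert scale (1 = "very unlikely" to 5 = "very likely"); the item names, the individual scores and the function name `item_weights` are hypothetical and serve only to show how a panel median is obtained.

```python
from statistics import median

# Hypothetical five-point Likert scores from ten panellists.
# Assumed coding: 1 = "very unlikely" ... 5 = "very likely".
# Item names and values are invented for illustration only.
panel_scores = {
    "tumour at the dome": [4, 4, 5, 4, 3, 4, 4, 5, 4, 4],
    "tumour at the trigone": [1, 1, 2, 1, 1, 1, 2, 1, 1, 1],
}


def item_weights(scores_by_item):
    """Return the panel median for each item, used as that item's weight."""
    return {item: median(scores) for item, scores in scores_by_item.items()}


weights = item_weights(panel_scores)
for item, weight in weights.items():
    print(f"{item}: median score {weight}")
```

The median is robust to a single outlying opinion, which is one reason it suits a small expert panel.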

**Figure 1.** Distribution of the scores regarding the likelihood of incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications according to patient's characteristics: (1) age, (2) sex, (3) weight and body mass index (BMI), (4) patient's history, (5) American Society of Anaesthesiologists' (ASA) physical status classification, (6) tobacco smoking. MMC: mitomycin C, BCG: Bacille Calmette-Guérin, TURBT: transurethral resection of bladder tumour.

**Figure 2.** Distribution of the scores regarding the likelihood of incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications according to tumour's characteristics: (1) number, (2) location, (3) size, (4) structure, (5) surroundings. CIS: carcinoma in situ.

**Figure 3.** Distribution of the scores regarding the likelihood of incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications according to bladder characteristics and access to the bladder cavity: (1) bladder capacity, (2) bladder structure, (3) prostate volume, (4) bladder neck, (5) others.

**Figure 4.** Distribution of the scores regarding the likelihood of incomplete resection and/or prolonged surgery (>1 h) and/or significant (Clavien-Dindo ≥ 3) perioperative complications according to the surgical environment: (1) anaesthesia, (2) energy, (3) operator, (4) bladder irrigation, (5) instruments.

#### *2.3. Step 3: Construction, Discrimination and Accuracy of the Bladder Complexity Checklist Sum*

#### 2.3.1. Clinical Scenarios

Smoking, underweight, normal weight and American Society of Anaesthesiologists (ASA) class 1–2 or 3, which in the panel's opinion did not relate to the complexity of TURBT, were not included in the scenarios, although age and sex, which were also considered of little influence, were retained, as they are standards in medical reporting. Although the surgical environment was consistently considered to have bearing on the odds of a complex surgery, the corresponding items were not included in the scenarios, as they were considered circumstantial rather than constitutive of the case. As a whole, 150 scenarios that included 9 items organized into five domains (Table 1) were presented to the panel. The members were strongly consistent in their anticipation of complexity, as consensus was observed for 131/150 (87.3%) scenarios that were by design confirmed for univariate and multivariate analysis.


**Table 1.** Univariate analysis of the scores of preoperative characteristics in a cohort of 131 random scenarios for which the panel was consistent in its anticipation of complexity.

n.s. not significant.

#### 2.3.2. Discrimination and Accuracy

In univariate analysis, the items that informed the tumour characteristics (number, location, size) and access to the bladder were significantly associated with complexity (Table 1). Patient's history, which did not reach statistical significance (*p* = 0.07), still qualified for multivariate analysis.

Five domains (Table 2) that in logistic regression were independent predictors of complexity, i.e., history, tumour number, location, and size and access to the bladder cavity, were used to develop the probability function that modelled the probability of a complex surgery.

**Table 2.** Logistic regression analysis showing independent relationships between the complexity of TURBT and patient history, tumour number, main tumour location and size and factors restraining the access to the bladder cavity.


$$p(\text{complex}) = \frac{1}{1 + \exp\left(13.34 - 0.99 \times \text{History} - 0.96 \times \text{Tumour Number} - 1.44 \times \text{Main Tumour Location} - 1.04 \times \text{Main Tumour Size} - 1.1 \times \text{Access}\right)}\tag{1}$$

This function showed good discrimination (AUC: 0.92; 95% CI: 0.87–0.96) in receiver operating characteristic (ROC) analysis (Figure 5).

**Figure 5.** Receiver operating characteristics (ROC) curves of the regression model with the corresponding calibration curve showing adequate discrimination (AUC = 0.916) and good calibration, with calibration slope of 1 and calibration in the large (CITL) of 0, indicating that the predicted prevalence of complexity was in keeping with the observed prevalence (CITL) and that the model was not over fitted (slope).
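As a minimal sketch, the probability function of Equation (1) could be evaluated as follows. The intercept and coefficients are those of Equation (1), and their assignment to the five domains follows the domain order of Table 2. The dictionary keys, the function name `probability_complex` and the example scores are hypothetical, and the sketch assumes that each domain enters the model as its panel score.

```python
import math

# Intercept and coefficients as read from Equation (1). The key names are
# illustrative labels for the five domains, not identifiers from the study.
INTERCEPT = 13.34
COEFFICIENTS = {
    "history": 0.99,
    "tumour_number": 0.96,
    "main_tumour_location": 1.44,
    "main_tumour_size": 1.04,
    "access": 1.1,
}


def probability_complex(scores):
    """Probability of a complex TURBT given one score per domain."""
    missing = set(COEFFICIENTS) - set(scores)
    if missing:
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    # Linear predictor inside the exponential of Equation (1).
    linear = INTERCEPT - sum(
        COEFFICIENTS[domain] * scores[domain] for domain in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(linear))


# Hypothetical domain scores, for illustration only.
example = {
    "history": 2,
    "tumour_number": 1,
    "main_tumour_location": 4,
    "main_tumour_size": 3,
    "access": 1,
}
print(f"p(complex) = {probability_complex(example):.3f}")
```

Because every coefficient enters with the same sign, raising the score of any domain can only increase the predicted probability.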

The simplification offered by the Bladder Complexity Checklist Sum (BCCS, Table 3) yielded comparable performance (C-Statistics *p* = 0.94, Figure 6).



Abbreviations: UTI: urinary tract infection.

**Figure 6.** ROC curves of the Bladder Complexity Checklist Sum (BCCS) and the corresponding calibration curve showing similar discrimination and calibration performances compared to the regression model.
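For readers less familiar with the AUC, the sketch below computes it as the probability that a randomly chosen complex scenario receives a higher score than a randomly chosen non-complex one, with ties counted as one half. This is the standard rank-based formulation, not necessarily the software routine used in the study, and the scores and labels are invented for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: numeric predictions (e.g., checklist sums or probabilities).
    labels: 1 for a complex case, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # complex case correctly ranked higher
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical checklist sums and outcomes, for illustration only.
demo_scores = [3, 5, 7, 8, 10, 12, 12, 15]
demo_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(f"AUC = {roc_auc(demo_scores, demo_labels):.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of complex from non-complex cases.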

Both instruments showed good calibration (Figures 5 and 6).

Figure 7 illustrates the balance between positive and negative predictive values according to increments in BCCS.

**Figure 7.** Negative (**blue**) and positive (**red**) predictive values (NPV and PPV) of increments in the BCCS.
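The trade-off between the two predictive values can be sketched as follows, assuming that a scenario is read as "complex" when its checklist sum reaches a given threshold. That decision rule, the function name `predictive_values` and the data are assumptions made for illustration and are not taken from the study.

```python
def predictive_values(sums, complex_flags, threshold):
    """Return (PPV, NPV) when a sum >= threshold is read as 'complex'.

    A value of None is returned when the corresponding denominator is zero.
    """
    tp = fp = tn = fn = 0
    for total, is_complex in zip(sums, complex_flags):
        predicted_complex = total >= threshold
        if predicted_complex and is_complex:
            tp += 1
        elif predicted_complex and not is_complex:
            fp += 1
        elif not predicted_complex and not is_complex:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical checklist sums and outcomes, for illustration only.
demo_sums = [3, 5, 7, 8, 10, 12, 12, 15]
demo_complex = [False, False, False, True, False, True, True, True]
for threshold in (8, 12):
    ppv, npv = predictive_values(demo_sums, demo_complex, threshold)
    print(f"threshold {threshold}: PPV = {ppv}, NPV = {npv}")
```

Raising the threshold tends to increase the PPV at the expense of the NPV, which is the balance a clinician would weigh when choosing a cut-off.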

#### **3. Discussion**

Anticipation is essential to adapt staff and technical resources to individual challenges of clinical situations. The adoption of standardized instruments of evaluation for major urological procedures [43] spurred us to develop similar instruments for TURBT, the most common procedure in oncologic urology [2].

The first step contextualized complexity, a concept adapted to the rationalization of healthcare [44]. A PubMed search highlighted three dimensions that characterize a complex surgery, as opposed to a satisfactory and uneventful procedure. Adequacy was recently introduced in the European Association of Urology (EAU) guidelines to insist on the importance of complete resection of all visible tumours with the detrusor muscle in the specimen, a surrogate marker of resection quality that controls the risk of early recurrence [9] and may impact adjuvant treatment [11]. Surgery longer than one hour was included following a large population-based report from the American College of Surgeons National Surgical Quality Improvement Program (NSQIP), where it related to postoperative complications independently from age, comorbidities, tumour size and ASA classification [31]. Lastly, postoperative complications requiring surgical, endoscopic or radiological intervention—that is, Grade III and higher in the recently TURBT-adapted Clavien–Dindo classification [29]—were also considered, as they were recently shown [33] to affect a significant minority of patients (8.1%, of which 15% were Grade III and higher). Reminiscent of other major oncologic procedures (e.g., trifecta in kidney and prostate surgery), the consensus therefore encompassed the three reported qualifiers of complexity, oncological, procedural and postoperative into a multidimensional definition.

The second step researched robust clinical predictors. To that end, we relied on expert judgement, a valuable instrument when other methods are intractable for scientific or practical reasons [45]. TURBT appears to fall in that category: although many factors are known to impact surgery and its outcomes [4,5,46], some important ones were not detailed in population-based series (e.g., position of the tumour) or were so infrequent as to elude detection (e.g., diverticulum). Conversely, experienced urologists are bound to encounter them during their careers and to draw some operational conclusions as to the influence they may have on their management. This was confirmed by the extensive list of items drawn from experience and by the broad consensus of the panel on their relative contributions to complexity.

Most of the items that carried a "possibly", "likely" or "very likely" risk of complication were consistent with the current literature. Conversely, some that had eluded cohorts [33] and population-based registries [4,31] made sense to the practising physician, notably, the access to the bladder cavity or the position of the tumours, with TURBT at the dome considered as "likely" to result in visually incomplete, lengthy or morbid surgery, compared to "very unlikely" for the trigone. The increments in scores with tumour sizes presented according to the current US procedural terminology (Figure 2) were in keeping with the increasing risks of complication and 30-day reoperation rates reported in two large NSQIP population-based studies [4,31]. A similar correlation was observed for the number of tumours, which is also a central parameter in the EAU/European Organisation for Research and Treatment of Cancer (EORTC) risk stratification of progression and recurrence [3].

Overall, high consistency between the literature or the practical constraints of surgery and the Delphi scores vindicated the present approach to anticipate complex TURBT.

However relevant, no single factor could possibly drive the entirety of the surgical challenge, which spurred us to the third step to analyse their respective contributions in random scenarios. Although the panel acknowledged the influence of technology in TURBT (Figure 4), elements pertaining to the surgical environment that were considered as adaptive rather than constitutive were not considered in the scenarios. Consistent with the format of clinical presentations, scenarios included age and sex, although they are considered of little bearing in TURBT (Figure 1). To account for the risk of cognitive overload [47], only four aspects were considered: patient's history, tumour and bladder anatomy and access. Although this resulted in a high prevalence of complex cases (58/131 (44.2%) scenarios were classified as "possibly", "likely" or "very likely" to result in incomplete resection or prolonged surgery (>1 h) or significant complications), random scenarios were preferred to collecting real-life clinical cases in the construction of the score, as this ensured that even rare situations were not overlooked.

On univariate analysis, tumour number, size, and location and access to the bladder cavity significantly related to complexity (Table 1). Although not significant in univariate analysis (*p* = 0.07), patient's history still qualified for multivariate analysis, where all five aspects independently related to complexity.

As measured by their regression coefficients (Table 2), although patients' history and bladder contributed to a lesser extent, tumour characteristics carried most of the information, thereby supporting the classical emphasis on thorough preoperative evaluation. The regression model showed excellent discrimination on ROC analysis (AUC: 0.92), while the calibration curve confirmed its accuracy (Figure 5).

The Bladder Complexity Checklist was then developed to facilitate the recording of significant characteristics in the clinic (Table 3). For illustration purposes, the case of a 75-year-old female patient with a thin bladder wall, showing a single 3 cm tumour of the dome would yield a sum of 15, consistent with a positive predictive value (PPV) for complexity of 100% (Figure 7). Summing the weight-adjusted scores of the Bladder Complexity Checklist carried similar discrimination and accuracy as the logistic model (Figure 6). This is to our knowledge the first effort to quantitatively inform with a simple clinical instrument the multidimensional complexity of TURBT. It could readily complement the other checklists proposed to control the quality of the procedure [37] or the step-by-step management of NMIBC [14].

Overall, the present methodology highlighted the factors that drove the anticipation by experienced surgeons of a complex TURBT. It would be amenable to other procedures where the surgical outcome relates to a large number of factors accessible to preoperative evaluation (e.g., radical prostatectomy, kidney transplantation). It also emphasized the variability in complexity of a procedure that is still widely regarded as menial.

The ability to anticipate and document complexity has important practical consequences. First, the Bladder Complexity Checklist could be instrumental in personalising the human and technical resources required for the most common procedure in oncologic urology [2]. This has become an absolute requisite in the current era of value-based care [48], where most procedural terminologies and reimbursement policies for TURBT consider the size and number of tumours compounded by comorbidity indexes, but overlook essential predictors such as the position of the tumour, a key descriptor of complexity in the present consensus. The Bladder Complexity Checklist Sum that organises and quantitates all relevant clinical information could also be used to drive the adaptation of health resources according to increments of complexity and support complexity-adapted coverage from health insurances.

Second, quantitating the difficulties entailed by a "good-quality" TURBT [7] would offer a solid ground to confront the morbidity and oncological outcome of a potentially complex procedure. Documenting variability is also important when analysing the benefits of different systems of resection or evaluating adjuvant treatments in research protocols [11]. Although all controlled trials to date overlooked the bias of complexity, we believe that crucial information such as the complexity score or, at the very least, a minimal dataset including size, number and position of the tumours should be documented and balanced in clinical research.

Third, measuring complexity, which amounts to weighing the risks of the procedure, would constitute an important instrument to inform patients and therefore control part of their anxiety [49]. The constraints of information also include the training and experience of the surgical staff [50]. A large study from the NSQIP concluded that residents' involvement in urology procedures was not associated with increased complications, although it significantly increased the operative time [27].

Regarding TURBT, the relations between operative time and complications [31] and between surgeon experience and the presence of the detrusor muscle in the specimen [9] vindicated the panel's prudent assessment of residents' participation (Figure 4). This observation also has direct bearing on the organisation of care in academic hospitals, in terms not only of informed consent [50] but also of organizing the list so that cases showing high complexity receive proper attention in terms of consultant supervision and position on the surgical list [50].

Several limitations should be considered. First, it is recommended for health indicators to include panellists of different origins, from public health experts to patients' representatives [51]. Here, the sole urologists' perspective was adopted, which certainly contributed to the high degree of consensus and the strong consistency with clinicians' experience. With 10 experts, the panel was positioned at the first quartile of the distribution of panellists in a systematic review [51] of the Delphi methodology and was in line with the number of experts invited to develop other multidimensional instruments in urology [43].

Second, the model was not validated in the clinic, where a lower prevalence of complex cases may be anticipated. However, the review of 416 diagnostic studies showed that a lower prevalence improved specificity and had no systematic effect on sensitivity [52], suggesting that the current model would retain its relevance in the real-life setting. Third, important predictors such as the position or the multiplicity of tumours are best defined by preoperative flexible cystoscopy [53], which is optional when the diagnosis can be ascertained by medical imaging [3]. Last, the process yielded a large number of items (Table 3) that may require streamlining after the first returns of clinical experience.
