**1. Introduction**

Colposcopy is the recommended second-level procedure for assessing the uterine cervix within a cervical cancer screening program; it is indicated after a positive primary test, according to specific guidelines, and its main objective is the early detection of high-grade cervical intraepithelial neoplasia (CIN2+) [1,2]. Colposcopic observation relies on the visual interpretation of macroscopic changes in the color and morphology of the genital mucosae and on the correlation of specific patterns with different degrees of cervical disease. Because of this intrinsic aspect of the procedure, colposcopy performance is markedly observer-dependent, with the attendant risk of reduced sensitivity and accuracy.

The quality of the examination depends mainly on three steps: the identification of the squamocolumnar junction (SCJ), the correct assessment of the Transformation Zone (TZ), and the decision to take one or more biopsies in the most appropriate cervical area.

Although colposcopy plays a fundamental role in the prevention of cervical cancer, as it allows the identification, treatment, and/or follow-up of precancerous lesions, the accuracy of the procedure is strongly affected by a high degree of subjectivity and low reproducibility. This may lead to high rates of underdiagnosis of severe lesions or even underdetection of cancer. In this respect, Artificial Intelligence (AI) may represent a promising option to overcome this limitation.

Colposcopy performance has been widely investigated and reported in different settings and geographic areas [3–5]; almost all published data consistently report large variability in both sensitivity and specificity, with values ranging from 30% to 90% and from 40% to 95%, respectively. In this context, the colposcopic impression (CI), based on the detailed identification and interpretation of the different aspects of the TZ, is the major issue, as it is closely correlated with the operator's decision to perform a targeted biopsy [6,7] and with the success of the cervical cancer prevention strategy.

In the last few years, the application of Quality Control (QC) and Quality Assurance (QA) principles to assess the accuracy and performance of colposcopy has been advocated as being of pivotal importance and is strongly recommended worldwide [8–13].

The present study, conducted with the multicenter involvement of major Italian teaching and academic gynecological institutions, aims to investigate the accuracy and quality assessment of colposcopy and, consequently, to determine the performance of operators with different levels of expertise in the field. In particular, the study was designed to assess the probability that a patient with a histologically confirmed cervical lesion is incorrectly managed during the colposcopic workup (e.g., underdetection of significant TZ alterations, no biopsy performed, or a biopsy taken at an incorrect site). The secondary objective of the study was the development of a user-friendly online platform through which Quality Control of colposcopy could be easily achieved and that could potentially be proposed and promoted for a nationwide QC and QA program.

#### **2. Materials and Methods**

One hundred (n = 100) colposcopic digital images were selected by a panel of experts from a large database of clinical cases with a comprehensive dataset of patients' demographic information, clinical history, cytological, virological (HPV-DNA detection), and pathological data. In particular, 35 were histologically negative (or without any type of lesion), 34 were low-grade lesions (HPV or CIN1), 24 were high-grade lesions (CIN2, CIN3, or in situ carcinoma), and 7 were pathologically proven invasive squamous carcinoma or adenocarcinoma.

Images were deliberately chosen when an objectively "difficult" colposcopic pattern was present. Nevertheless, all images had adequate quality and resolution, complete visibility of the entire cervix, no mucus or blood, and a good representation of normal/abnormal colposcopic patterns; randomly selected images are illustrated as examples in Figures 1–3.

**Figure 1.** *Fully visible* SCJ, G2, biopsy indicated.

**Figure 2.** *Fully visible* SCJ, G2, biopsy indicated.

**Figure 3.** *Fully visible* SCJ, G2, biopsy indicated.

For each case, the expert panel identified and recorded the following items: (1) assessment of colposcopic patterns according to the 2011 International Federation of Cervical Pathology and Colposcopy (IFCPC) nomenclature [14] and the 2017 American Society of Colposcopy and Cervical Pathology (ASCCP) terminology proposal [15]; (2) colposcopic impression, categorized as (2.1) negative, (2.2) favor low-grade lesion (Human Papillomavirus infection or Cervical Intraepithelial Neoplasia grade 1, CIN1), (2.3) favor high-grade lesion (Cervical Intraepithelial Neoplasia grade 2–3, CIN2+, or in situ squamous/adenocarcinoma), (2.4) favor malignant lesion (invasive squamous carcinoma or adenocarcinoma); (3) indication for taking a single biopsy or up to a maximum of 3 biopsies; and (4) the most appropriate area to be biopsied.

Using Qualtrics XM® software (2022 version) (www.qualtrics.com), an online platform was developed that could be accessed via personal computers, tablets, or smartphones; following log-in, the application delivered the high-resolution digital colposcopic images, each accompanied by a caption with details about the patient's age and primary screening results (cervical cytology and/or HPV-DNA detection), and a set of questions focused on: (1) squamocolumnar junction (SCJ) interpretation; (2) Transformation Zone (TZ) assessment; (3) biopsy indication; (4) areas suitable for performing biopsy; and (5) colposcopic impression.

The web link to the platform was forwarded to 10 academic and teaching Ob/Gyn Italian institutions, all having tertiary-level preventive oncological gynecology units, inviting colposcopy operators to take the survey anonymously and to state their level of expertise (<5 years vs. >5 years of colposcopy practice). Almost all juniors were residents/fellows of the participating institutions. The time needed to complete the exam was anticipated to be at least 90 min, given the survey's characteristics, and it had to be finished in a single session; at the end, each participant was provided with a final score but was not informed of the rate of correct/incorrect answers or of which answers were correct or incorrect. After completion, the test could not be taken again because the platform credentials were no longer valid to log in to the application.

Data were collected, centralized, and recorded by the promoting investigators and analyzed using the R statistical software (www.r-project.org); participants' responses to the test were compared with those of the committee, with variables treated as categorical. Pearson's chi-squared test (with Yates' continuity correction) and Cohen's *kappa* coefficient of agreement (with 95% confidence intervals) were used to estimate the strength of associations; a *p* value < 0.05 was considered statistically significant, with *kappa* 0.60–0.80 indicating substantial agreement among observers [16,17]. The study design, methodology, and results were approved by the Scientific Committee of the Italian Society of Colposcopy and Cervico-Vaginal Pathology (SICPCV).
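For readers unfamiliar with the two statistics named in the analysis, the following is a minimal Python sketch of how Cohen's *kappa* and a Yates-corrected chi-squared test on a 2 × 2 table can be computed. It is an illustration only: the study itself used R, and the function names, labels, and counts below are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return (observed - expected) / (1.0 - expected)


def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-squared with Yates' correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Yates' correction: shrink |ad - bc| by n/2, never below zero
    num = max(abs(a * d - b * c) - n / 2.0, 0.0)
    stat = n * num ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical example: reference answers vs. one participant (1 = biopsy indicated)
reference = [1, 1, 0, 0]
participant = [1, 0, 0, 0]
kappa = cohen_kappa(reference, participant)

# Hypothetical 2 x 2 table: rows = expertise group, columns = correct / incorrect
stat, p_value = chi2_yates_2x2(10, 20, 20, 10)
print(f"kappa = {kappa:.2f}, chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Under the threshold quoted above, a *kappa* between 0.60 and 0.80 would be read as substantial agreement; the sketch does not compute the 95% confidence interval, which would require an additional standard-error estimate.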
