#### *2.3. Monitoring Clinical Disease Activity Using icompanion: Real-World Evidence*

#### 2.3.1. Study Synopsis-Characterizing MS Types with ico**mpanion**

The ico**mpanion** app and website were launched in July 2020. We describe the information entered into ico**mpanion** and investigate the validity of the real-world collected PRO data by looking at their sensitivity to clinical differences between MS types based on known-groups validity. These MS types include clinically isolated syndrome (CIS), relapsing-remitting MS (RRMS), secondary progressive MS (SPMS), and primary progressive MS (PPMS). The descriptions of the collected data and the analyses were based on an anonymized dataset of collected ico**mpanion** data. We performed a non-parametric Kruskal–Wallis H test (or a one-way analysis of variance (ANOVA) on ranks), with MS type as independent variable and the variable of interest as dependent variable (mental feeling, physical feeling, prEDSS, SymptoMScreen composite, Neuro-QoL Fatigue, and Neuro-QoL Cognitive Function). We used a significance level of 0.05 and *p*-values of the separate models were corrected for multiple comparisons using false discovery rate (FDR) correction [30]. For the variables that showed significant group effects, post-hoc multiple comparisons tests were carried out for pairwise differences between the different MS types with Dunn–Sidák correction, using a significance level of 0.05.
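The testing steps described above can be illustrated with a minimal sketch. It is not the analysis code used in the study: it relies only on the Python standard library, it assumes that the FDR step is the Benjamini–Hochberg procedure (the text only cites [30]), and all function names and input values are hypothetical.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples."""
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)
    rank_of = {}   # value -> average rank (ties share the mean of their ranks)
    tie_term = 0   # sum of t^3 - t over groups of tied values
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total**3 - n_total))


def benjamini_hochberg(p_values):
    """FDR-adjusted p-values (assumed procedure), returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def sidak_adjust(p_value, n_comparisons):
    """Sidak adjustment of one pairwise p-value for n_comparisons tests."""
    return 1 - (1 - p_value) ** n_comparisons


# Hypothetical scores for three groups, and hypothetical uncorrected p-values.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
fdr_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
# Four MS types give 6 pairwise comparisons.
pairwise_p = sidak_adjust(0.01, n_comparisons=6)
```

With four MS types, the H statistic is compared against a chi-square distribution with 3 degrees of freedom, which is why the results are reported as H(3).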

The percentage of PwMS with a statistically meaningful change in their Neuro-QoL Fatigue and Cognitive Function scores was evaluated based on the conditional minimal detectable change, specifically developed for the Neuro-QoL short forms [31]. A statistically meaningful change can be interpreted as a difference of more than one standard error (SE). The threshold was estimated from the average of the SEs from a normative dataset for any given pair of scores, multiplied by the z score for a 95% confidence interval and by $\sqrt{2}$, or $\left(\frac{SE_{Score1} + SE_{Score2}}{2}\right) \cdot 1.96 \cdot \sqrt{2}$ [31]. Only PwMS with more than one result for these PROs were included.
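As a worked illustration of this threshold, the sketch below evaluates the conditional minimal detectable change for a pair of scores. The SE values in the usage lines are hypothetical; in practice they would come from the normative dataset referred to in [31], and the function names are ours.

```python
import math

Z_95 = 1.96  # z score for a 95% confidence interval, as stated in the text


def conditional_mdc(se_score1, se_score2):
    """Average of the two SEs, multiplied by the z score and by sqrt(2)."""
    return (se_score1 + se_score2) / 2 * Z_95 * math.sqrt(2)


def is_meaningful_change(score1, score2, se_score1, se_score2):
    """True if the absolute score difference exceeds the conditional MDC."""
    return abs(score2 - score1) > conditional_mdc(se_score1, se_score2)


# Hypothetical example: both scores have an SE of 3.0 T-score points.
threshold = conditional_mdc(3.0, 3.0)
changed = is_meaningful_change(50.0, 59.0, 3.0, 3.0)
```

Under these assumed SEs the threshold is roughly 8.3 points, so a change from 50 to 59 would count as statistically meaningful while a change from 50 to 55 would not.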

#### *2.4. Monitoring Subclinical Disease Activity Using icobrain ms: Real-World Evidence*

ico**brain ms** was launched in 2016 and has been adopted by more than 400 hospitals worldwide since then. In the context of standardizing MS care, in this manuscript, we evaluate how:


#### 2.4.1. Study Synopsis-Reliability of Lesion Count with ico**brain ms**

ico**brain ms** lesion segmentations were compared with the assessment of two raters, one experienced radiologist and one assistant neurologist. The experiment consisted of marking and counting MS lesions on fluid-attenuated inversion recovery (FLAIR) and T1-weighted images acquired from 10 PwMS with a 3T MRI scanner (Achieva, Philips Medical Systems) at the University Hospital Brussels. Inclusion criteria were MS diagnosis according to McDonald Criteria 2010 and no MRI contraindication. For more details, see [27], from which a subset was used for this analysis. Two repeated acquisitions with patient repositioning were taken to assess test-retest reliability of the lesion count. The two raters independently assessed all images, which were presented in a shuffled order, first as original MRI scans, then with lesion annotations obtained by ico**brain ms**. In addition, the reporting time was recorded.

Intra-rater and inter-rater agreement of lesion counts was assessed. Of special interest was the question whether there is an improved agreement between the counts reported by the two raters after using ico**brain ms** segmentations, as opposed to the case when each rater counted lesions on the original images without ico**brain ms**.

#### 2.4.2. Study Synopsis-Detecting Subclinical Disease Activity in MRI Follow-Up with ico**brain ms**

In this study, we evaluated how the availability of ico**brain ms** reports might change the findings of radiological reading when assessing follow-up brain MRI scans. Longitudinal MRI acquisitions from 25 PwMS approximately 1 year apart were randomly selected (and limited to 25 for feasibility) from different institutions that use ico**brain ms** in clinical practice, ensuring that these centers obtained informed consent from their PwMS to use fully anonymized MRI scans for research. The inclusion criteria were (1) being diagnosed with RRMS or SPMS, (2) having 2 pairs of scans acquired at least one year apart on the same scanner, (3) having MRI acquired at 1.5T or 3T that included a high-resolution 3D-T1 sequence and a 2D or 3D FLAIR sequence, (4) having an Expanded Disability Status Scale (EDSS) assessment at baseline and at follow-up, and (5) not having steroid treatment or a relapse in the 30 days prior to the MRI scan. Each MRI dataset was presented in random order to an experienced neuroradiologist: once without and once with an automatically generated ico**brain ms** report. In the latter case, besides color-coded lesion and brain segmentation overlays, the expert also had access to the structured ico**brain ms** reports. The two radiological reporting scenarios (without and with volumetric software results) were compared in terms of effect on diagnostic findings and reporting time.

#### 2.4.3. Study Synopsis-Insights into the MS Brain Patterns Using ico**brain ms**

In this study, it was evaluated whether ico**brain ms** is able to reveal different brain or lesion volume patterns when comparing different MS clinical phenotypes. Multiple MR sessions from CIS (*n* = 12), RRMS (*n* = 30), PPMS (*n* = 17), and SPMS (*n* = 28) PwMS, with 3D T1w and 2D FLAIR images acquired on a 1.5T MR system (Sonata Siemens) at CERMEP in Lyon, were evaluated. EDSS was also available at each time point. For more details on the original dataset, see [32]. First, differences between MS groups were calculated in terms of longitudinal lesion evolution by location, where new and enlarging lesions were estimated between two time points at least 2 years apart for each patient. Secondly, we evaluated the known-groups validity or the sensitivity of ico**brain ms** to subclinical differences in brain volumetrics. This was done in the same way as for the ico**mpanion** data (see Section 2.3.1) using the Kruskal–Wallis H tests to look for a group effect, and pairwise post-hoc tests to look for differences between different MS type groups.

#### **3. Results**

#### *3.1. Patient Perspective*

#### 3.1.1. Patient Survey 1: Telemonitoring Tools for Monitoring Clinical Disease Activity

A total of 45 PwMS completed the survey, of whom 80% (*n* = 36) were women. The average age of participants was 45.6 years (SD = 11.5). Of the sample, 55.6% (*n* = 25) had RRMS, 15.6% (*n* = 7) SPMS, and 11.1% (*n* = 5) PPMS, while 17.8% (*n* = 8) did not know their MS type or did not want to disclose it. About one third of PwMS were diagnosed in the last three years (31.1%, *n* = 14) or had a disease duration of 3 to 10 years (31.1%, *n* = 14), while 22.2% (*n* = 10) and 15.5% (*n* = 7) had been diagnosed 10 to 20 years ago and longer than 20 years ago, respectively. Most participants considered themselves very digitally literate (33%, *n* = 15) or quite digitally literate (42.2%, *n* = 19), compared to neutral (22.2%, *n* = 10) and not quite digitally literate (2.2%, *n* = 1), while no PwMS reported being not very digitally literate.

We asked for PwMS' attitude about the use of an app to monitor the disease course, where only one person (2.2%) answered negatively (see Figure 4a). The most important features (see Figure 2 for an overview of the main functions) for this cohort were Knowledge center (97.8%, *n* = 44), Symptom logging (95.5%, *n* = 43), Treatment overview (88.9%, *n* = 40), and Test/PROs (88.9%, *n* = 40). These were also the features reported to be the most probable to be used by the PwMS. When asking whether PwMS had an intention to use the app, 68.9% (*n* = 31) answered yes, 26.7% (*n* = 12) answered maybe and 4.4% answered no (*n* = 2). When asking how frequently they would like to use the app, 26.7% (*n* = 12) answered daily, 31.1% (*n* = 14) multiple times per week, 22.2% once per week (*n* = 10), 2.2% (*n* = 1) once every two weeks and 13.3% (*n* = 6) once every month while 4.4% (*n* = 2) answered Other. *Brain Sci.* **2021**, *11*, x FOR PEER REVIEW 11 of 27

**Figure 4.** Visualization of answers for a selection of survey questions from the two surveys: (**a**) survey on telemonitoring tools for monitoring clinical disease activity, (**b**) survey on MR imaging for monitoring subclinical disease activity in collaboration with iConquerMS.

#### 3.1.2. Patient Survey 2: MR Imaging for Monitoring Subclinical Disease Activity

The survey was answered by 876 PwMS, predominantly located in the U.S. and Canada (91.4%). Of the participants, 80% (*n* = 699) were female. 2.8% of PwMS were aged 75 to 84 years, 19.4% 65 to 74 years, 34.1% 55 to 64 years, 27.2% 45 to 54 years, 11.7% 35 to 44 years, and 4.9% 34 years or younger.

The results of the survey showed that only 0.6% (*n* = 5) of PwMS have never had an MRI performed for the purpose of diagnosing or treating. Only 54.9% (*n* = 474) undergo an MRI scan every year or more frequently (see Figure 4b). Almost 27% (26.9%, *n* = 228) of PwMS have never received an electronic version of their MRI from their clinic or radiology lab. Of the PwMS that received an electronic version of their MRI, 79.9% (*n* = 560) got it on a CD-ROM, 15.6% (*n* = 109) through their clinic's patient portal, 4.0% (*n* = 28) through a direct download into their computer or other device and 0.6% (*n* = 4) on a USB-drive.

Of PwMS that received an electronic version of their MRI, 70.5% (*n* = 431) looked at their MR images on their own. Of those people, only 13.3% (*n* = 57) claimed to completely understand their MR images. Of the PwMS that had access to an electronic MRI but had not looked at it on their own, 70.2% (*n* = 99) would like to do so. Of the reasons for not viewing the MR images, 46.1% (*n* = 83) of PwMS indicated that they did not know how to interpret the images, while 33.9% (*n* = 61) did not have a software application to view them, 32.8% (*n* = 59) did not know how to view the images, and 12.2% (*n* = 22) failed to load the images onto their computer or software program.

Respectively, 98.2% (*n* = 836) and 94.7% (*n* = 767) of PwMS reported being interested in knowing about changes between their MRIs and in knowing whether their MRI scan was performed according to clinical MS guidelines. Finally, 96.6% (*n* = 714) of PwMS indicated that they would be willing to share their MRI scans with researchers.

#### *3.2. icompanion MS Patient App Validation*

#### 3.2.1. Sensitivity to Clinical Differences between MS Types

Summary statistics of ico**mpanion** users' characteristics and entered data are presented in Table 1, including gender distribution, and average age and disease duration of the current user base. In Figure 5, the distribution of entered treatments per MS type is visualized, for PwMS on a DMT. For CIS, 28.6% were on glatiramer acetate, 42.9% on interferons, 14.3% on dimethyl fumarate, and 14.3% on teriflunomide. For people with RRMS, 12.5% indicated that they were on fingolimod, 13.8% on glatiramer acetate, 13.3% on interferons, 4.7% on cladribine, 17.8% on dimethyl fumarate, 9.4% on teriflunomide, 17.5% on ocrelizumab, 1.6% on alemtuzumab, and 9.4% on natalizumab. For people with SPMS, 8.3% indicated that they were on fingolimod, 8.3% on glatiramer acetate, 12.5% on interferons, 16.7% on dimethyl fumarate, 8.3% on teriflunomide, 33.3% on ocrelizumab, 8.3% on alemtuzumab and 4.2% on natalizumab. Finally, 4.4% of people with PPMS were on fingolimod, 4.3% on glatiramer acetate, 8.7% on cladribine, 4% on dimethyl fumarate or teriflunomide, 65.2% on ocrelizumab, and 9% on natalizumab.




**Figure 5.** Treatments logged into ico**mpanion** by PwMS per MS type.

**Table 1.** Descriptive statistics for the complete dataset of app users that indicated to know their MS type (82.8%, *n* = 1301). Except for gender, all variables are described as [average (standard deviation, *n*)]. Group effect column provides the results of the Kruskal–Wallis analyses to look for an effect of MS type on these variables. *p*-values have been corrected using FDR correction.

| | CIS 3.1% (*n* = 42) | RRMS 78.4% (*n* = 1061) | SPMS 10.9% (*n* = 147) | PPMS 7.6% (*n* = 103) | Group Effect |
|---|---|---|---|---|---|
| Gender (female) | 76.2% | 75.7% | 61.2% | 63.1% | |
| Age | 33.9 (10.2, 41) | 39.7 (10.4, 1058) | 47.7 (12.6, 144) | 45.8 (12.6, 101) | |
| Disease duration | 5.24 (6.87, 42) | 7.98 (6.84, 1061) | 16.5 (9.21, 147) | 7.38 (6.25, 103) | |
| Mental feeling | 0.17 (1.17, 33) | 0.49 (0.91, 869) | 0.37 (0.89, 114) | 0.47 (0.91, 81) | H(3) = 5.15, *p* = 0.193 |
| Physical feeling | 0.12 (0.89, 33) | 0.25 (0.88, 862) | -0.05 (0.81, 113) | 0.06 (0.85, 80) | H(3) = 10.85, *p* = 0.025 |
| prEDSS | 2.87 (1.77, 16) | 3.35 (1.57, 336) | 5.13 (1.38, 41) | 5.02 (1.62, 31) | H(3) = 64.47, *p* < 0.001 |
| SymptoMScreen composite | 10.5 (13.2, 40) | 12.8 (13.5, 1024) | 17.0 (15.5, 140) | 17.9 (15.9, 101) | H(3) = 15.40, *p* = 0.005 |
| Neuro-QoL Fatigue | 54.0 (11.91, 15) | 55.1 (8.75, 319) | 56.0 (7.74, 38) | 55.1 (8.95, 30) | H(3) = 0.31, *p* = 0.312 |
| Neuro-QoL Cognitive | 48.5 (8.95, 15) | 43.7 (8.87, 334) | 42.0 (9.09, 41) | 44.2 (7.97, 30) | H(3) = 5.22, *p* = 0.193 |

In context of the validation of ico**mpanion** through known-groups validity, we observed a group effect of MS type on physical feeling (*p* = 0.025) (see Table 1 and Figure 6C). We also observed a significant group effect of MS type on body function or prEDSS (*p* < 0.001; Figure 6D) and general symptom load or SymptoMScreen composite (*p* = 0.005) where progressive PwMS scored higher than people with CIS and RRMS (Table 1). For mental feeling (*p* = 0.193) and the Neuro-QoL Fatigue (*p* = 0.312), we observed no effect of MS type. For the latter, average scores for all MS types were worse than the average general US reference sample, but within the range of 1 SD (50 ± 10) [23]. The same was true for average scores on the Neuro-QoL Cognitive where lower scores indicate worse functioning, and where also no effect of MS type was observed (*p* = 0.193).

Post-hoc tests for the variables that showed significant effects of MS type were carried out, performing pairwise comparisons between the different MS types. For physical feeling, we observed a significantly higher or better physical feeling in people with RRMS compared to SPMS (*p* = 0.014). prEDSS scores for both PPMS and SPMS were each significantly higher than both people with RRMS and CIS (all *p* < 0.001). Average symptom load, or SymptoMScreen composite, was significantly higher in people with PPMS compared to RRMS (*p* = 0.022).

For the Neuro-QoL PROs, we were able to calculate whether two consecutive scores indicated a statistically meaningful change (based on conditional minimal detectable change, described in Section 2.3.1 and [31]). We observed such statistically meaningful change in 25.0% (*n* = 36) of PwMS with more than 1 logged Neuro-QoL Fatigue result score (*n* = 118) and 12.3% (*n* = 18) of PwMS with more than 1 logged Neuro-QoL Cognitive score (*n* = 121).

The scores for the separate 12 symptoms included in the SymptoMScreen are shown in Figure 7. Visually, a clear distinction could be made between CIS and RRMS, and progressive MS types (SPMS, PPMS). Especially concerning Walking problems, SPMS (2.32) and PPMS (2.18) seem to score higher than CIS (0.86) and RRMS (1.13) on average, but also on Spasticity and stiffness (CIS: 0.89; RRMS: 1.20; SPMS: 1.96; PPMS: 1.84), Hand function and dexterity (CIS: 0.95; RRMS: 0.90; SPMS: 1.31; PPMS: 1.39) and Bladder control (CIS: 0.67; RRMS: 0.91; SPMS: 1.58; PPMS: 1.38).

**Figure 6.** Summary statistics of ico**mpanion** users: (**A**) distribution of age based on sex, (**B**) distribution of age based on MS type, (**C**) distribution of average mental and physical feeling based on MS type, (**D**) distribution of average Neuro-QoL Cognitive and Fatigue score based on MS type.

**Figure 7.** Average severity scored by ico**mpanion** users for all SymptoMScreen symptoms based on MS type. Severity is scored on a scale of 0–6.


#### *3.3. icobrain ms Brain MRI Analysis Validation*

#### 3.3.1. Reliability of Lesion Count with ico**brain ms**

Intra-rater test-retest lesion count agreement on scan and rescan images was significantly improved for the assistant neurologist, from a standard deviation (SD) of the differences between test and retest lesion counts of 28.1 without ico**brain ms** to 22.0 with ico**brain ms** (improvement of 21.7%) but was constant for the experienced radiologist (SD = 7.3 in both scenarios). Larger changes were observed in the case of inter-rater agreement: without ico**brain ms** annotations, inter-rater lesion count agreement between experienced radiologist and assistant neurologist was significantly worse (SD = 20.8) than with ico**brain ms** (SD = 15.7), indicating an improvement of 32.5% by using ico**brain ms**. Figure 8 presents all intra- and inter-rater comparisons as Bland–Altman plots, annotated with bias and standard deviation of the lesion count differences, including the raters' comparisons with the automated lesion count obtained from the ico**brain ms** lesion annotations. It can be observed that the assistant neurologist consistently overestimated the counts of ico**brain ms** and of the experienced radiologist. Similar trends were observed when repeating the analysis for T1 hypointensities (blackholes) and lesions per location, see [33].

**Figure 8.** Bland–Altman plots of total FLAIR lesion counts on scan-rescan MRI data from 10 PwMS for 2 different raters (expert radiologist and assistant neurologist) and different scenarios per rater (without and with ico**brain ms**). The main diagonal depicts the intra-rater test-retest agreement over the 10 repeated MRI scans. Non-diagonal plots represent inter-rater or inter-scenario comparisons using the complete dataset of 20 MRI scans.
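The agreement measures annotated in these Bland–Altman plots (bias and SD of the lesion count differences) can be sketched as follows. The lesion counts are made-up illustrative values, not study data; the 1.96 SD limits of agreement are the conventional Bland–Altman choice and are an assumption here, since the text reports only bias and SD.

```python
from statistics import mean, stdev


def bland_altman(counts_a, counts_b):
    """Bias, SD of the paired differences, and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(counts_a, counts_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


def relative_sd_reduction(sd_without, sd_with):
    """Percentage reduction of the SD relative to the unaided scenario."""
    return 100 * (sd_without - sd_with) / sd_without


# Hypothetical test and retest lesion counts for four scans.
bias, sd, limits = bland_altman([10, 12, 15, 20], [8, 13, 14, 17])

# Intra-rater SDs reported for the assistant neurologist (28.1 vs. 22.0).
reduction = relative_sd_reduction(28.1, 22.0)
```

A smaller SD of the differences means tighter limits of agreement, which is how the improvement in test-retest agreement is expressed in the text.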

The timing also differed significantly between the task of performing lesion count without (mean ± SD: 54.3 ± 11.8 min), or with ico**brain ms** (mean ± SD: 26.7 ± 19.8 min).

#### 3.3.2. Detecting Subclinical Disease Activity in MRI Follow-Up with ico**brain ms**

Radiological findings were compared between the scenario when the radiologist examined the raw MRI follow-up scans and the scenario when ico**brain ms** annotations and reports were also available. The PwMS were considered stable, slightly active, and active, as follows [34]:

• stable: if they had no new or enlarging lesions and had normal rate of brain atrophy compared to controls (within 0.2% from normal atrophy rate of sex- and age-matched healthy controls in the case of ico**brain ms** measurements);

• slightly active: if they had enlarging lesions or slightly abnormal atrophy rate compared to controls (further than 0.2% but within 0.4% from normal atrophy rate of sex- and age-matched healthy controls in the case of ico**brain ms** measurements);


Conventional radiological reporting indicated 19 out of 25 stable PwMS (no lesion activity, no apparent atrophy) and 6 active PwMS (new lesion formation or lesion enlargement). The radiological findings with access to ico**brain ms** indicated 7 out of 25 PwMS as stable (normal atrophy, no lesion activity), 7 PwMS with slight disease activity (slightly abnormal atrophy rate and/or enlarging lesions), and 11 active PwMS (5 with new lesions, 10 with abnormal atrophy rate for their age). All stable PwMS identified by ico**brain ms** were also deemed stable by conventional radiological reading. All active PwMS identified by conventional reading were also identified as active or slightly active when using ico**brain ms**. However, the automatic brain MRI measurements indicated several other PwMS as (slightly) active, even if these were part of the stable group according to conventional radiological reading. As such, the percentage of PwMS deemed as having (slight) disease activity or progression grew from 24% in conventional reading to 76% (44% active, 32% slightly active, according to the definitions above) with the ico**brain ms** assisted reading.
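The three-level activity labelling used here can be expressed as a small rule-based sketch. The stable and slightly active rules follow the 0.2% and 0.4% atrophy thresholds given for ico**brain ms** measurements; the rule for the active category (new lesions, or an atrophy rate deviating by more than 0.4% from the norm) is an assumption inferred from the result descriptions above. All names are hypothetical and this is not the logic of the ico**brain ms** software.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    new_lesions: bool
    enlarging_lesions: bool
    # Absolute deviation (in %) of the atrophy rate from the normal rate of
    # sex- and age-matched healthy controls.
    atrophy_deviation_pct: float


def classify_activity(scan: FollowUp) -> str:
    # Assumed rule for "active": new lesions or clearly abnormal atrophy.
    if scan.new_lesions or scan.atrophy_deviation_pct > 0.4:
        return "active"
    # Enlarging lesions or atrophy further than 0.2% but within 0.4%.
    if scan.enlarging_lesions or scan.atrophy_deviation_pct > 0.2:
        return "slightly active"
    # No new or enlarging lesions and atrophy within 0.2% of normal.
    return "stable"


# Hypothetical follow-up results.
labels = [
    classify_activity(FollowUp(False, False, 0.1)),
    classify_activity(FollowUp(False, True, 0.1)),
    classify_activity(FollowUp(False, False, 0.5)),
]
```

Because atrophy enters the rule, a patient with no lesion activity can still be labelled (slightly) active, which is consistent with the shift in findings between conventional and assisted reading.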

With respect to timing, radiological reporting took on average 7 min 28 s (SD: 3 min 6 s) without ico**brain ms** and 5 min 49 s (SD: 2 min 15 s) with ico**brain ms**. In other words, computer-aided radiological reporting with ico**brain ms** was faster than conventional reporting, with approximately 8 conventional reports per hour versus 13 computer-aided reports per hour, which is an improvement by about 40%.

#### 3.3.3. Insights into the MS Brain Patterns: Sensitivity to Subclinical Differences between MS Types

In a two-year MRI follow-up study, new and enlarging lesions assessed with ico**brain ms** were evaluated in different MS subtypes of CIS (*n* = 12), RRMS (*n* = 30), PPMS (*n* = 17), and SPMS (*n* = 28) PwMS, with average age at baseline 31.8, 33.2, 39.5, and 41.1 years and average disease duration 2.9, 8.3, 7.5, and 14.9 years, respectively [35]. The largest volume of new lesion formation (i.e., lesions not touching any older lesion) was observed in CIS, with approximately 0.1 mL new lesion volume over the 2-year follow-up, without a preferred location (juxtacortical, periventricular, deep white matter). Further, it was also observed that people with RRMS exhibited more deep white matter (WM) lesions (either new or pre-existing) in comparison to other MS types. PPMS and SPMS had virtually no new periventricular lesions, but a significant amount of enlargement in that region, consistent with a longer disease duration. Figure 9 illustrates the location-dependent evolution patterns for new and enlarging lesions obtained with ico**brain ms** in the 4 MS clinical phenotypes.


**Figure 9.** Location prevalence and average volumes for new and enlarging lesions in subjects with CIS, RRMS, PPMS, and SPMS. The schematic representation shows three colored layers, where colors represent new or enlarging lesion volumes in the given scales (mL). For each brain slice, the innermost contour represents periventricular lesions; the outermost contour represents juxtacortical lesions; and middle region represents deep white matter lesions.

When examining brain and lesion volumes simultaneously (whole brain volume, gray matter volume, lateral ventricles volume, total FLAIR lesion volume, and T1 blackholes volume), very distinct group patterns were observed for the MRIs corresponding to all time points for which EDSS was lower than or equal to 4 (Figure 10a), with all volumes significantly different between groups according to a non-parametric Kruskal–Wallis H test with MS type as independent variable (*p* < 0.001) (Table 2). The CIS and RRMS groups showed significantly higher whole brain and gray matter volumes and lower ventricular volumes compared to the PPMS and SPMS groups (Table 2). Highest volumes of FLAIR hyperintensities and T1 blackholes were evident in SPMS. At higher EDSS (greater than 4), the patterns corresponding to PPMS and SPMS groups were almost indistinguishable, with no significant volume differences observed for whole brain, gray matter, lateral ventricles and T1 blackholes (Table 2). In contrast, the RRMS group with EDSS > 4 showed higher whole brain volume and lower ventricles volume compared to the progressive groups.


**Figure 10.** Volumetric patterns observed with ico**brain ms** are illustrated for (**a**) EDSS lower than or equal to 4 and (**b**) greater than 4 (right). For each considered volume (whole brain, gray matter, lateral ventricles, FLAIR lesions, T1 blackholes), the represented radius goes from low volume in the center to high volume on the exterior (each volume being rescaled based on 1st and 3rd quartiles in the four groups combined). Solid lines link the median volumes per group.
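One way to read the quartile-based rescaling mentioned in this caption is sketched below. The exact formula is not given in the text, so the mapping (pooled first quartile to 0, pooled third quartile to 1) is an assumption, as are the function name and the example volumes.

```python
from statistics import median, quantiles


def rescaled_group_median(group_values, pooled_values):
    """Median of one group, rescaled by the quartiles of all groups combined."""
    q1, _, q3 = quantiles(pooled_values, n=4)  # 1st, 2nd and 3rd quartiles
    return (median(group_values) - q1) / (q3 - q1)


# Hypothetical volumes (arbitrary units): all groups pooled, and one group.
pooled = [1, 2, 3, 4, 5, 6, 7]
radius = rescaled_group_median([3, 4, 5], pooled)
```

Rescaling each volume by the pooled quartiles puts measures with very different magnitudes, such as whole brain and lesion volumes, on a comparable radial axis.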


**Table 2.** Results from statistical analysis of differences in ico**brain ms** volumetrics between MS types. The group effect rows provide the results of the Kruskal–Wallis analyses looking for an effect of MS type on these variables. Other rows indicate the group differences as observed via post-hoc pairwise tests. *p*-values have been corrected for multiple comparisons.

#### **4. Discussion**

Digital solutions have the potential to assist clinicians to further standardize MS clinical decision making, to allow for an early detection of disease activity and inform therapeutic decisions. As these solutions are now available with the necessary regulatory clearances and hospital integrations, they can be used in a routine clinical setting.

In this paper, we present the initial real-world evidence results of such a novel regulatory cleared and workflow-integrated MS care path solution. It was demonstrated that the ico**mpanion** mHealth application is a response to clear patient needs and that it is a sensitive tool to capture clinically relevant information about MS symptoms and patient wellbeing, as well as significant longitudinal changes in cognition and fatigue over time. In addition, it was shown that ico**brain ms**' MRI volumetric brain reports save radiologists 40% time while also detecting subclinical MRI activity with a significantly higher sensitivity.

#### *4.1. Patient Perspective*

In order to gain insight into the PwMS' perspective on digital telemonitoring solutions, we carried out a survey which was answered by 45 PwMS, the majority of whom (75.2%) considered themselves digitally literate. Only one patient (2.2%) reported a negative attitude towards using an app to monitor their disease course, which is in line with previous reports of positive attitudes among PwMS [13].

Patients rated a knowledge center, symptom logging, tests/PROs, and treatment overviews as the most important features (88–98%), but more than 60% also considered an appointment calendar, viewing their own MRI scans, and viewing the evolution of their MRI scans important. The features that PwMS thought they would actually use most were symptom logging, performing tests/PROs, treatment overviews, and a knowledge center (84–91%). This is in line with recent research indicating a patient demand for medication schedules and reminders [13]. This study also indicated a strong interest in visit overviews, which were recently implemented in ico**mpanion**'s calendar and visit preparation feature. What differentiates ico**mpanion** from other available MS apps is that it integrates all the features mentioned above and, at the same time, is a CE-marked and FDA-cleared medical device.

Of the PwMS in our cohort, 68.9% indicated that they intended to start using an MS app like ico**mpanion**, and 80.1% intended to use it daily or weekly, which is in line with a previous study [36]. While our survey suggests that PwMS are interested in telemonitoring apps and their features, and in actually using them, it must be noted that the sample size of this survey was relatively small and that the results are potentially biased due to the relatively small number of non-digitally literate PwMS.

A second survey was carried out in collaboration with iConquerMS to gain insight into PwMS' perspective on MRI scans. The survey investigated PwMS' experiences with MRI scans as well as their knowledge and viewing behavior, and was answered by a total of 876 PwMS. Responses indicated that about 45% of PwMS did not have the yearly brain MRI scan advised by the MAGNIMS-CMSC-NAIMS recommendations [11]. Of the PwMS who received an electronic version of their MRI, 70.5% looked at their images on their own, but only 13.3% of PwMS reported completely understanding these MR images, in line with previous studies [37]. Considering the key role that MRI plays in clinical decisions in MS care, and the positive outcomes related to increased patient involvement in clinical decisions [38,39], it is important to include an MRI-focused knowledge center and MRI viewer in medical apps for MS.

Relevant to the latter, our survey showed that many technological limitations prevented PwMS from looking at their MRI scans on their own. In total, 33.9% of PwMS reported not having a software application to view the images, 32.8% reported not knowing how to view the images, and 12.2% reported failing to load the images onto their computer or software program. Furthermore, 94.7% of PwMS indicated an interest in knowing whether their MRI scan was performed according to clinical guidelines. This is relevant to PwMS, providers and payers, as it has been demonstrated that less than 10% of the MRI scans for PwMS were acquired according to the local guidelines [40]. Finally, almost all PwMS (98.2%) indicated their interest in knowing about changes between their MRIs. This information is provided to PwMS' care teams via ico**brain ms** in the MS care platform.

#### *4.2. icompanion MS Patient App Validation*

In a first exploratory analysis aimed at investigating the validity and clinical relevance of the real-world collected ico**mpanion** data, we assessed the sensitivity of ico**mpanion** PROs (mental and physical feeling, prEDSS, SymptoMScreen composite, Neuro-QoL Fatigue and Neuro-QoL Cognition) to clinical differences between MS types, so-called known-groups validity. A significant effect of MS type on physical feeling, prEDSS and SymptoMScreen composite was observed. We found physical feeling to be significantly worse in people with SPMS compared to RRMS, while prEDSS scores for both RRMS and CIS were significantly lower than those for both PPMS and SPMS. These results are in line with previous studies that described a significant effect of MS type on EDSS [38,39,41]. Average symptom load or SymptoMScreen composite was found to be significantly higher in people with PPMS compared to RRMS, which is in line with a previous study [39] that found that scores for symptoms associated with spinal cord abnormalities were significantly higher for SPMS and PPMS than for RRMS. These symptoms were included in the SymptoMScreen as Spasticity and stiffness, Sensory symptoms, and Bladder control, and consequently also in the SymptoMScreen composite [21]. While these findings are expected, and in line with the literature described above, they indicate that the prEDSS [22] and SymptoMScreen [21] PROs included in the ico**mpanion** mobile app are able to pick up important clinical differences between MS types.
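
To make the known-groups test concrete, the following is a minimal sketch of the Kruskal–Wallis H statistic with tie correction, written in plain Python. It is an illustration only and not the analysis code used in this study: the function names and the example scores are hypothetical, and the FDR correction and Dunn–Šidák post-hoc tests described in Section 2.3.1 are not shown.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of scores."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over all groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Correction factor for tied observations.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical scores for three groups (not data from this study).
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For k groups, H is usually compared against a chi-square distribution with k − 1 degrees of freedom to obtain a *p*-value; in practice a statistics library would typically be used for this step.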

In addition, we observed statistically meaningful changes, based on the conditional minimal detectable change [31], for the Neuro-QoL (V1.0) Fatigue in 25.0% of PwMS with more than one datapoint, and for the Neuro-QoL (V2.0) Cognitive Function in 12.3% of PwMS with more than one datapoint. This is the first time that this measure, aimed at providing a clinically useful way of interpreting individual change in the Neuro-QoL short-forms, has been employed in MS, and this suggests that ico**mpanion** is able to pick up statistically meaningful and consequently clinically relevant changes in cognition and fatigue in PwMS. In the HCP portal, HCPs can easily evaluate whether changes in the data entered by linked PwMS for these PROs are statistically meaningful based on this measure. This provides them with an indication of changes in clinical symptoms that are large enough to help motivate treatment changes [31].
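
As an illustration of how such a check could work, the sketch below applies the conditional minimal detectable change formula given in Section 2.3.1, ([SE of score 1 + SE of score 2]/2) · 1.96 · √2, to a pair of scores. The function names and the example scores and standard errors are hypothetical; in practice the SEs would be looked up for each score in the normative dataset described in [31]. This is not the implementation used in the HCP portal.

```python
import math


def conditional_mdc(se_score1, se_score2, z=1.96):
    """Conditional minimal detectable change for a pair of scores.

    se_score1, se_score2: standard errors associated with the two scores
    (assumed here to come from a normative lookup, as in [31]).
    z: z score for the chosen confidence level (1.96 for 95%).
    """
    return (se_score1 + se_score2) / 2 * z * math.sqrt(2)


def is_meaningful_change(score1, score2, se_score1, se_score2):
    """True if the absolute change between the two scores exceeds the MDC."""
    return abs(score2 - score1) > conditional_mdc(se_score1, se_score2)


# Hypothetical example: two scores, each with an SE of 2.0.
threshold = conditional_mdc(2.0, 2.0)
changed = is_meaningful_change(50.0, 56.0, 2.0, 2.0)
```

With these placeholder values the threshold is 2.0 · 1.96 · √2, so a change of 6 points would be flagged while a change of 5 points would not.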

In summary, we provide real-world data obtained by a medical device app which are in line with other published studies and provide initial evidence that it is feasible to obtain reliable real-world data which can potentially be used for clinical decision making. Further in-depth analyses will be needed on how mHealth app telemonitoring data can help PwMS, inform clinicians, and impact clinical decision making and outcomes.

#### *4.3. icobrain ms Brain MRI Analysis Validation*

The use of follow-up brain MRI scans to detect disease activity in PwMS is recommended by all international guidelines. Typically, changes in terms of new and enlarging lesions and brain volume compared to the previous brain scan are evaluated visually. However, especially because many lesions can be present in an MS patient's brain and subtle but significant brain atrophy is almost impossible to assess visually, it is known that visual MRI reading is prone to inter-rater variability and potential discrepancies [12]. This was confirmed by the results reported in this paper, which demonstrate a significant inter-rater lesion count difference. This difference can in part be explained by a rater's subjective preference for merging certain nearby lesions into one connected lesion, or for indicating separate nearby lesion foci as distinct lesions. It should be mentioned here that in the presented experiment the raters were asked to provide the best possible lesion count, not to perform a brain MRI reading as they would in a clinical setting. Given the time pressure and distractions in a clinical context, it can be expected that the variability in detecting (new) lesions can be even higher. In this study, it was demonstrated that ico**brain ms** has an excellent test-retest lesion count agreement, and that the expert raters improved their test-retest lesion count agreement when the software annotations were made available. Such results are in line with previous studies that used various other assistive research software approaches [42–45], although we must note that one limitation of this analysis was the small number of raters.

As detecting brain MRI-based disease activity is an essential part of the current MS treatment guidelines, it is important to assess to what extent AI-augmented radiological reading can impact clinical decision making. In this context, it was demonstrated that the use of ico**brain ms** together with the radiological reading detected a significantly higher number of PwMS with disease activity when compared to the visual radiological assessment alone. This is in line with the results reported by [46], where the proportion of PwMS who were found as having evidence of disease activity/progression grew from around 35% based on clinical criteria alone to around 54% based on conventional radiological reading (with lesion activity and/or visually estimated brain atrophy), to 61% and 80% when employing radiological reading assisted by ico**brain ms** (only lesion activity, and lesion activity and estimated annual atrophy thresholded at 0.4%, respectively). In addition, and crucially for implementing new technologies in the clinical setting, it was observed that the radiological reading workflow was improved by 40%, which is significant given the increasing time pressure on radiologists [47].

Finally, known-groups validity was also demonstrated for ico**brain ms** (Table 2) as significant group differences were observed between MS phenotypes for the different ico**brain ms** volumetric measures (Figure 10). These findings, albeit based on limited sample sizes, indicated that lesion evolution and brain volumetry, as well as cognitive performance and symptoms, have distinct patterns in the relapsing and progressive types/phases of MS, but that the patterns seem to become less distinguishable once the disease is more advanced. Indeed, there is more heterogeneity in brain atrophy and lesion burden patterns among different MS groups (relapsing-remitting, primary progressive, secondary progressive) at low EDSS, and, conversely, brain atrophy and lesion burden patterns converge to a common pattern as EDSS increases (Figure 10). The RRMS group is clearly distinct from the progressive forms of MS. This divergence, followed by unification of clinical and subclinical findings, is in line with the unifying concept of MS [48]. This highlights the importance of not allowing the disease to progress beyond a certain stage, by addressing the earliest signs of disease activity before irreversible damage sets in.

The results from these analyses demonstrate that it is feasible to implement brain MRI AI solutions in a clinical routine setting and that they can improve the radiological workflow. In addition, it is shown that the ico**brain ms** software, as an assistive tool for radiological reading, decreases the intra- and inter-rater radiological reading variability. Finally, it was demonstrated that ico**brain ms** results can help differentiate between MS subtypes, in line with the literature, and that they allow for a significantly higher detection rate of MS disease activity.

In further research, we aim to provide the combined ico**mpanion** and ico**brain ms** results to clinicians to evaluate the potential impact of these technologies on clinical decision making and standardization of care.

#### **5. Conclusions**

Given the heterogeneity of the disease, the increasing number of available treatment options, and the long-term outcome effects of early clinical decisions in chronic disorders such as MS, there is a clear need to move towards more personalized decision making in MS. Hence, MS care pathways need to become more data-driven and standardized. In this paper, real-world evidence on how new digital/AI technologies can impact MS patient care was presented, and the feasibility of linking different digital tools into one overarching MS care pathway was demonstrated.

**Author Contributions:** Conceptualization, W.V.H.; Methodology, W.V.H., L.C., A.D., A.R., D.S. and D.M.S.; Software, L.C., A.D. and D.M.S.; Validation, W.V.H., L.C., A.D. and D.M.S.; Formal analysis, L.C., A.D. and D.M.S.; Investigation, W.V.H., L.C., A.D. and D.M.S.; Resources, W.V.H., A.R., D.S. and D.M.S.; Data curation, L.C., A.D. and D.M.S.; Writing—original draft, W.V.H., L.C. and D.M.S.; Writing—review & editing, W.V.H., L.C., A.D., A.R., G.N., D.S. and D.M.S.; Visualization, L.C., A.D. and D.M.S.; Supervision, W.V.H., A.R., G.N. and D.S.; Project administration, W.V.H., A.R., D.S. and D.M.S.; Funding acquisition, W.V.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research did not receive specific funding.

**Institutional Review Board Statement:** All reported studies were conducted according to the guidelines of the Declaration of Helsinki and approved by the relevant Institutional Review Boards and/or Ethics Committees except for the survey studies as these were considered market research studies and did not require specific ethical approval.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the reported studies, except for the iConquerMS survey as it was considered market research and all data was collected anonymously. All ico**mpanion** users approved a Terms of Use (https://files.icometrix.com/icompanion/Terms-of-use/Terms-of-use-short-en-GB.pdf, accessed on 27 August 2021) and a Privacy Policy (https://files.icometrix.com/icompanion/Privacy-policy/Privacy-policy-en-GB.pdf, accessed on 27 August 2021) in which they confirm that ico**metrix** can use their data for research.

**Data Availability Statement:** The data presented in this study are not publicly available due to PwMS' privacy rights.

**Acknowledgments:** The authors would like to thank all the colleagues who contributed to the implementation of the described platform and who assisted in the execution of the surveys (in particular, Hollie Schmidt, Sara Loud, Robert McBurney, and David Gwynne from iConquerMS and the Accelerated Cure Project for Multiple Sclerosis, and Eva de Roey, Liese Steenwinckel, Aske Vloebergs, and Katrien Verhoeven for the survey on MS app patient preferences) and validation studies (in particular, Guido Wilms who performed expert radiological reading, Dominique Sappey-Marinier, and Françoise Durand-Dubief (CERMEP-Imagerie du vivant and CREATIS (UMR5220 CNRS & U1294 INSERM), Université de Lyon) for the long-standing collaboration on the analysis of their longitudinal MRI data, as well as Andreas Lysandropoulos, Than Vân Phan, Thijs Vande Vyvere, and Nathan Torcida for their contribution to the reported MRI studies). Special thanks are due to all the persons with multiple sclerosis who participated in the surveys and studies reported in this paper.

**Conflicts of Interest:** The following authors are employed by icometrix: Lars Costers, Annabel Descamps, Annemie Ribbens, Dirk Smeets, Diana M. Sima. Guy Nagels is medical director neurology at icometrix and minority shareholder; he or his institution (VUB/UZ Brussel) have received research, educational and travel grants from Biogen, Roche, Genzyme, Merck, Bayer and Teva. Wim Van Hecke is CEO, founder, shareholder and member of the board of icometrix.

#### **Appendix A**

Note that these questions are part of a broader questionnaire that surveyed PwMS' expectations of an MS app.
