*Article* **Application of Machine Learning Methods for Epilepsy Risk Ranking in Patients with Hematopoietic Malignancies Using**

**Iaroslav Skiba <sup>1</sup>, Georgy Kopanitsa <sup>2,3,</sup>\*, Oleg Metsker <sup>2</sup>, Stanislav Yanishevskiy <sup>2</sup> and Alexey Polushin <sup>1</sup>**

	- 197101 Saint Petersburg, Russia

**Abstract:** Machine learning methods are currently considered promising for predicting the risk of epilepsy, including vascular epilepsy, in oncohematological patients. In epilepsy research, these methods are used to predict pharmacoresistant epilepsy and surgical treatment outcomes, to determine the epileptogenic zone and functional neural systems in patients with epilepsy, to develop new approaches to classification, and to perform other tasks. This paper presents the results of applying machine learning to data analysis and to the development of diagnostic models of epilepsy in oncohematological and cardiovascular patients. The study addresses the problem of the often unjustified diagnosis of primary epilepsy in patients with oncohematological or cardiovascular pathology, in which antiseizure drugs are prescribed to patients with single seizure syndromes without identifying the disease associated with these cases. We analyzed the hospital database of the V.A. Almazov Scientific Research Center of the Ministry of Health of Russia. The study included 66,723 treatment episodes of patients with vascular diseases (I10–I15, I61–I69, I20–I25) and 16,383 episodes of patients with malignant neoplasms of lymphoid, hematopoietic, and related tissues (C81–C96 according to ICD-10) for the period from 2010 to 2020. Data analysis and model calculations indicate that the best result was shown by gradient boosting, with a mean cross-validation accuracy score of 0.96 (f1-score = 98, weighted avg precision = 93, recall = 96, f1-score = 94). The highest correlation coefficients between G40 and clinical conditions were obtained for fibrillation, hypertension, stenosis or occlusion of the precerebral arteries (0.16), cerebral sinus thrombosis (0.089), arterial hypertension (0.17), age (0.03), non-traumatic intracranial hemorrhage (0.07), atrial fibrillation (0.05), delta absolute neutrophil count (0.05), platelet count at discharge (0.04), and transfusion volume for stem cell transplantation (0.023).
From the clinical point of view, the identified differences in the importance of predictors in a broader patient model are consistent with a practical algorithm for organic brain damage. Atrial fibrillation is one of the leading factors in the development of both ischemic and hemorrhagic strokes. Brain infarction, in turn, can be accompanied both by epileptic seizures in the acute period and by unprovoked epileptic seizures and the development of epilepsy in the early recovery period and later. In addition, microembolism from the left heart chambers can lead to multiple microfocal lesions of the brain, which is one of the pathogenetic mechanisms of epilepsy in elderly patients. The presence of atrial fibrillation requires anticoagulant therapy, the use of which increases the risk of both spontaneous and traumatic intracranial hemorrhage.

**Keywords:** oncohematology; risk factors; machine learning; epilepsy risk; epilepsy modeling

**Citation:** Skiba, I.; Kopanitsa, G.; Metsker, O.; Yanishevskiy, S.; Polushin, A. Application of Machine Learning Methods for Epilepsy Risk Ranking in Patients with Hematopoietic Malignancies Using. *J. Pers. Med.* **2022**, *12*, 1306. https://doi.org/10.3390/jpm12081306

Academic Editors: Bernd Blobel and Mauro Giacomini

Received: 22 June 2022 Accepted: 8 August 2022 Published: 11 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Malignant diseases of the hematopoietic system, despite their relatively low prevalence in the population, remain a socially significant group of diseases. Neurological complications in this cohort of patients occur in correlation with disease or with ongoing treatment. These complications may affect patient survival and may determine whether a therapy protocol can be fully implemented [1]. Acute symptomatic seizure (ASS) is one of the most significant neurological complications because of its high incidence and impact on survival [2]. A number of studies have evaluated the risk of ASS in this cohort of patients [3,4], while assessment of the risks of epilepsy is virtually unreported in the research to date [5,6].

Arterial hypertension is a cardiovascular complication in oncohematological patients that develops due to both disease-related and treatment-related factors [7,8]. Arterial hypertension has also been identified as one of the risk factors for late-onset epilepsy in the general population [9].

Posterior reversible encephalopathy syndrome (PRES) is a brain disease associated with hypertension, which may determine the risk of epilepsy by mechanisms that are indirect in relation to arterial hypertension itself [10]. In the general population of patients with PRES, ASS occurs in 77% of cases [11]. In the cohort of oncohematological patients, the development of PRES may be accompanied by ASS in 97% of cases [12]. In the general population, arterial hypertension is the main etiological factor in the development of PRES (72%) [13]. It is a high-risk factor for the development of this complication in oncohematological patients as well (HR 14.466, 95% CI 7.107–29.443, *p* < 0.001) [12]. The risk of epilepsy in patients with PRES is considered low but may increase significantly in the presence of signs of cytotoxic edema and ASS at the onset of PRES [14].

The use of machine learning methods to predict the risk of complications in oncohematological patients has proven to be promising [15]. In epilepsy, these methods are actively used, for example, to predict pharmacoresistant epilepsy [16] and surgical treatment outcomes [17], to determine the epileptogenic zone [18] and functional neural systems in patients with epilepsy [19], to develop new classification approaches [20,21], and to perform other tasks [22]. Machine learning models are also actively used in decision support systems for treating patients with various forms of epilepsy [23,24]. At the same time, factors associated with the development of epilepsy are usually identified with classical statistical methods within a typical case-control study design. However, the factors related to the presence of epilepsy in oncohematological patients, and the prognostic tools that would substantiate an optimal model for determining the risk of epilepsy in these patients, are not yet fully understood [25]. Currently, there are no risk stratification models for epilepsy in oncohematological patients. The causes of symptomatic epilepsy are heterogeneous and require different approaches to the prevention of new foci of altered electrogenesis (e.g., brain infarcts in atrial fibrillation) [26].

The main goal of the study is to improve algorithms for diagnosing the cause of epilepsy in a group of patients without a previous history of epilepsy.

The groups of patients under consideration are patients with oncohematological diseases and cardiovascular pathology.

The main problem is the often unjustified diagnosis of primary epilepsy in patients with oncohematological or cardiovascular pathology, in which antiseizure drugs are prescribed to patients with single seizure syndromes without identifying the disease associated with these episodes.

This paper presents the results of applying machine learning to data analysis and to the development of diagnostic models of the presence of epilepsy in oncohematological and cardiovascular patients.

We evaluate factors associated with the presence of epilepsy in oncohematological patients and the effect of arterial hypertension and the number of transplanted hematopoietic stem cells on the risk of epilepsy.
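As a rough illustration of how such an association can be quantified, the sketch below computes a correlation coefficient between a binary epilepsy indicator (G40) and binary clinical factors, then ranks the factors by absolute correlation. It assumes a Pearson coefficient (which reduces to the phi coefficient for two binary variables); the text does not state which coefficient was used. The toy data and all names (`pearson`, `rank_factors`, `g40`, `factors`) are hypothetical and are not taken from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    For two 0/1 variables this equals the phi coefficient.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

def rank_factors(outcome, factors):
    """Return (name, r) pairs sorted by |r|, strongest association first."""
    scored = [(name, pearson(outcome, values)) for name, values in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data: one entry per treatment episode (1 = present, 0 = absent).
g40 = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
factors = {
    "arterial_hypertension": [1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
    "atrial_fibrillation": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
}

ranking = rank_factors(g40, factors)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

A correlation of this kind describes only a pairwise linear association; it does not adjust for other factors and is not a substitute for the predictive models discussed in the paper.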

#### **2. Materials and Methods**

A single-center retrospective study was conducted. We analyzed the hospital database of the V.A. Almazov Scientific Research Center of the Ministry of Health of Russia. The study included 35,634 patients with 66,723 inpatient treatment cases of vascular diseases (Dataset II) and 3723 patients with 16,383 inpatient treatment cases of malignant neoplasms of lymphoid, hematopoietic, and related tissues (C81–C96 according to ICD-10) (Dataset I) for the period from 27 January 2010 to 5 January 2020. Laboratory parameters were chosen according to their clinical relevance and the availability of data from real clinical practice. We considered their potential role in metabolism, systemic inflammation, and hemostasis and in the development of epileptic syndromes. Cerebrovascular factors were chosen according to the evidence on the increasing role of cardiovascular complications in predicting long-term outcomes in patients with oncohematological diseases.
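The cohort definition by ICD-10 ranges can be sketched as a simple filter over treatment episodes. This is a minimal illustration, not the authors' extraction procedure: it assumes each episode carries a single primary diagnosis code and that membership is decided by the three-character ICD-10 category. The code ranges are those named in the paper (C81–C96 for Dataset I; I10–I15, I20–I25, and I61–I69 for Dataset II); the function names and the sample episodes are hypothetical.

```python
import re

# ICD-10 category ranges named in the paper, keyed by dataset label.
DATASET_RANGES = {
    "Dataset I": [("C81", "C96")],
    "Dataset II": [("I10", "I15"), ("I20", "I25"), ("I61", "I69")],
}

def icd_category(code):
    """Return the three-character category of an ICD-10 code, e.g. 'C91.0' -> 'C91'."""
    match = re.match(r"^([A-Z])(\d{2})", code.strip().upper())
    if match is None:
        raise ValueError(f"not an ICD-10 code: {code!r}")
    return match.group(1) + match.group(2)

def in_range(code, start, end):
    """True if the code's category lies in the inclusive range start..end."""
    category = icd_category(code)
    same_chapter_letter = category[0] == start[0] == end[0]
    return same_chapter_letter and int(start[1:]) <= int(category[1:]) <= int(end[1:])

def assign_dataset(code):
    """Return the dataset label for a diagnosis code, or None if it matches neither."""
    for label, ranges in DATASET_RANGES.items():
        if any(in_range(code, start, end) for start, end in ranges):
            return label
    return None

# Hypothetical episodes: (episode id, primary ICD-10 diagnosis code).
episodes = [(1, "C91.0"), (2, "I63.9"), (3, "G40.2"), (4, "I48"), (5, "C81.1")]
selected = {episode_id: assign_dataset(code) for episode_id, code in episodes}
print(selected)
```

In this sketch, episodes whose code falls outside every listed range (here G40.2 and I48) are mapped to `None` and would be excluded from both datasets.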
