**1. Introduction**

Within the field of bioinformatics, researchers use metagenomic approaches to characterize microbial genomes isolated directly from the environment [1]. New sequencing technologies generate large volumes of data for analysis, because metagenomic samples contain an abundant variety of species and are characterized by sequences of short length and high complexity. In addition, given the possibility of discovering new species, the taxonomic assignment of short DNA sequence reads becomes extremely challenging [2]. In this respect, metagenomics is considered the study of many genomes in different environments, which may even be compartments or regions of living beings, such as mucous membranes and intestines, among others. Metagenomics is therefore a challenge for computer science researchers who seek to develop methods to make sense of such an amount of genetic information [3]. Within the area of computational intelligence, this work uses a known and previously validated technique based on artificial neural networks. According to Soueidan and Hayssam (2016) [3], machine learning techniques currently offer a large set of promising tools for building predictive models for the classification of biological data. These tools are built under different frameworks, offering the possibility of implementing supervised and unsupervised (clustering) techniques, among others.

CTX-M-type enzymes are a group of class A extended-spectrum β-lactamases (ESBLs) that are rapidly spreading among Enterobacteriaceae worldwide. CTX-M β-lactamases were first recognized almost simultaneously in Europe and South America in early 1989. The first publication to recognize an ESBL from the CTX-M group was a report of an *E. coli* isolate resistant to cefotaxime but susceptible to ceftazidime, obtained from the ear of a four-month-old child suffering from otitis media in Munich [4].

At the regional level, the Manizales Antibiotic Resistance Group (GRAM) is in charge of presenting the accumulated antibiotic resistance data of the main hospitals in the city. Among all isolates from patients in intensive care units, non-intensive care units, and emergency departments, the main bacteria identified are Enterobacteriaceae such as *Escherichia coli*, *Klebsiella pneumoniae*, and *Enterobacter cloacae*, among others. All of these species are able to carry ESBL genes of the CTX-M group. In addition, according to the antibiotic susceptibility analyses carried out by different clinics in the city, resistance to cefotaxime (a broad-spectrum cephalosporin hydrolyzed by CTX-M enzymes) ranges between 15% and 35% [5]. This means that, in Manizales, up to one out of every three isolates of this bacterial group is suspected of carrying a CTX-M-type ESBL. The high frequency of this type of ESBL in our context highlights the importance of developments of this kind for antibiotic surveillance processes based on metagenomic data.

The validation of this pipeline allows us to extend the analysis to other important genes, such as *TEM* and *SHV*, and those encoding metalloenzymes and carbapenemases, which are probably prevalent in our regional context given the characteristics of the population, the clinical management protocols for patients, and health and asepsis conditions in operating rooms. Since this is a common problem, the development of a pipeline that allows the identification of resistance variants becomes a fundamental step in establishing a modern antibiotic surveillance system. The subsequent goal of this study will be to test this development on metagenomic data derived from the surveillance process, in collaboration with research groups in this field.
