**1. Introduction**

A neuromorphic system is a massively multi-core system composed of simple processing units and memory elements that communicate by message exchange [1]. This approach strives to emulate the behaviour of the brain using design principles drawn from biological nervous systems. Neuromorphic systems differ from traditional multi-core systems in the way memory and processing are organised: memory is distributed alongside the processing units rather than centralised and physically separated from the cores. This strategy avoids the memory-access bottleneck of classical Von Neumann architectures. The main idea behind such systems is to process information using an event-driven protocol that lets the cores work asynchronously [2]. The processing units remain idle until an event arrives, triggering a reaction; once the event has been handled, the units return to the idle state. Thanks to this feature, neuromorphic systems are far more energy-efficient than traditional multi-core systems. The idea is inspired by biology: the human brain is composed of billions of neurons connected by synapses, working asynchronously, with a power consumption lower than that of a light bulb [3]. Another peculiarity of neuromorphic systems is the high number of interconnections between the processing units, which speeds up and simplifies communication between the cores.

Neuromorphic HW platforms are attracting the interest of many research groups, mainly for simulating the neural network structures observed in the brain, modelled as Spiking Neural Networks (SNNs). Although initially intended for brain simulation, emerging neuromorphic HW architectures are also appealing in fields such as high-performance computing and robotics [4]. It has been shown that neuromorphic platforms scale better than traditional multi-core architectures and are well suited to problems that require massive parallelism together with the exchange of small messages, for which neuromorphic HW has native, optimised support [5]. However, the tools currently available in this field are still immature and lack many features needed to support the spread of a new neuromorphic-based computational paradigm.

In this paper, we analyse and benchmark the scaling capability of the SpiNNaker neuromorphic architecture. The SpiNNaker machine is a multi-chip, globally asynchronous locally synchronous (GALS) neuromorphic architecture that connects general-purpose ARM cores in a toroidal triangular mesh. It is efficient when used to solve problems modelled as a directed graph with a significant communication component.

Other works have used this platform to execute general-purpose parallel computation, with positive outcomes both in scaling performance and in energy efficiency. Blin et al. [5] customised the neural model of an SNN to reproduce the connection graph of a PageRank problem, showing that the neuromorphic platform scales better than general-purpose architectures, whereas Sugiarto et al. [6] implemented an energy-efficient image-processing algorithm on SpiNNaker, using a task-graph representation to describe the mechanism and behaviour of the method. However, neither of these approaches tested synchronous applications, since both used an adapted SNN simulated with the standard asynchronous framework.

In previous work [7], the authors used a minimal Message Passing Interface (MPI) framework to implement a synchronisation strategy that allows the cores of the board to be configured with a distributed application implementing the N-Body problem. They benchmarked the board on an MPI parallel application simulating 2k particles on 240 processors, obtaining a speed-up of 194× and an efficiency of 80% compared to the serial version running on a single CPU.
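As a quick sanity check on these figures, the standard definitions of speed-up and parallel efficiency can be applied directly (a minimal sketch; the variable names are ours, not taken from [7]):

```python
# Parallel-efficiency check for the figures reported in [7]:
# speed-up S = T_serial / T_parallel, efficiency E = S / P.
P = 240          # SpiNNaker processors used
speedup = 194.0  # reported speed-up over the serial single-CPU version

efficiency = speedup / P
print(f"efficiency = {efficiency:.1%}")  # ~80%, consistent with the reported value
```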

In this paper, we compare the scaling performance of the SpiNNaker system with that of a many-core general-purpose architecture. We implemented a parallel processing approach for a pattern-matching algorithm that identifies the similarity of DNA sequences. In our implementation, we used the Message Passing Interface (MPI), a distributed parallel programming paradigm, to synchronise the communication of the computing cores on the two architectures. By using the MPI framework, we can port to the SpiNNaker platform an algorithm normally executed on a standard architecture, without any need to re-shape the algorithm as a Spiking Neural Network. The focus of the research presented in this paper is threefold.
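The data-parallel scheme underlying such an MPI port can be illustrated with a minimal serial emulation of the rank-parallel decomposition. All names below are illustrative assumptions, not the actual implementation, and exact-match counting stands in for the similarity search: the sequence is split across ranks with a small overlap so that matches spanning chunk boundaries are not lost, and partial counts are summed as an MPI_Reduce would.

```python
# Illustrative serial emulation of an MPI scatter/reduce pattern search.
# NOT the authors' algorithm: names and the exact-match kernel are assumptions.

def count_matches(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    return sum(1 for i in range(len(text) - len(pattern) + 1)
               if text[i:i + len(pattern)] == pattern)

def parallel_count(sequence, pattern, n_ranks):
    overlap = len(pattern) - 1            # extension so boundary matches survive
    chunk = -(-len(sequence) // n_ranks)  # ceiling division: chunk per rank
    partial = []
    for rank in range(n_ranks):           # each iteration plays one MPI rank
        start = rank * chunk
        stop = min(start + chunk + overlap, len(sequence))
        partial.append(count_matches(sequence[start:stop], pattern))
    return sum(partial)                   # the MPI_Reduce (sum) step

seq = "ACGTACGTGACGT"
assert parallel_count(seq, "ACGT", 4) == count_matches(seq, "ACGT")
```

Extending each rank's window by `len(pattern) - 1` characters is what keeps the decomposition correct: a match starting inside a rank's own chunk always completes within its window, and no match is counted twice.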


The rest of the manuscript is organised as follows: Section 2 provides background information on existing neuromorphic architectures, with a detailed focus on the SpiNNaker board and on the DNA search algorithm. Section 3 describes the materials and methods used to carry out the study, whereas Section 4 examines the experimental results. Finally, Section 5 draws the conclusions.
