#### **1. Introduction**

Biosensing has expanded into nearly all areas of research, from medical applications [1] to environmental monitoring [2]. Among the greatest appeals of biosensors are their specificity and sensitivity. These properties are primarily due to bioreceptors such as enzymes [3], antibodies [4], and aptamers [5], which are selected for their inherent specificity. However, the very component that makes biosensors so specific and sensitive can also limit sensor stability, because the bioreceptor can degrade [6]. Additionally, because a bioreceptor is specific to an individual analyte, the sensor's scope is limited to the analyte to which the bioreceptor can bind.

To obviate these issues, many nature-inspired sensors have emerged that are bioreceptor-free. Some of the most notable examples that have made great progress include the electronic nose (Enose) [7–11] and electronic tongue (Etongue) [12–16]. Additionally, surface-enhanced Raman spectroscopy (SERS)-based sensors have demonstrated remarkable chemosensing ability [17–21]. Without a bioreceptor, however, there is a risk of significantly compromised biosensor performance, including the limit of detection (LOD) and specificity. Researchers have introduced machine learning (ML) to bioreceptor-free biosensors to offset this trade-off, improving the LOD and specificity [22]. In a sense, ML can take the place of a bioreceptor by reintroducing specificity during data analysis. This is made possible by powerful ML techniques capable of detecting subtle patterns in sensor responses.

**Citation:** Schackart, K.E., III; Yoon, J.-Y. Machine Learning Enhances the Performance of Bioreceptor-Free Biosensors. *Sensors* **2021**, *21*, 5519. https://doi.org/10.3390/s21165519

Academic Editors: Zihuai Lin and Wei Xiang

Received: 15 July 2021 Accepted: 13 August 2021 Published: 17 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

While this approach has demonstrated success, these systems must still overcome several challenges. A major one is model generalizability. Because many models rely on subtle patterns in the data, they can be quite sensitive to changes in the underlying data. This can make the models susceptible to error when the sensor drifts or when parts of the system are replaced [14].

Since the scope of this review is quite large, covering all bioreceptor-free biosensors that utilize ML, there are a few points to clarify. Many subsets of our scope have already received thorough review. For instance, the use of ML for Enose and Etongue [23–27] and for SERS-based biosensors [28] has previously been described. Since the literature is rich in these areas, we recognize that not all recent original research can be adequately covered here. Rather, our intent is to provide a unified discussion of the relevant methods and challenges to give a bigger picture. We also acknowledge a complementary review that discusses the use of ML in biosensing in general [29], but not specifically for bioreceptor-free biosensors.

In this review, we describe the current state of using ML to enhance the performance of bioreceptor-free biosensors. Section 2 briefly introduces the types of biosensors that have most benefited from ML. Section 3 provides some background on machine learning algorithms and how their performance can be assessed. Section 4 covers electrochemical biosensors, with particular emphasis on Enose and Etongue, discussing successful methods as well as some of the challenges and how they are being addressed with ML. Section 5 discusses optical biosensors, notably imaging- and SERS-based biosensors. Section 6 discusses additional considerations and future perspectives, including what currently prevents many of these systems from being commercialized and what directions may be taken. We also present some considerations on best practices for ML in biosensing, especially regarding communication of methods and reproducibility.

#### **2. How Biosensors Can Benefit from Machine Learning**

Biosensors, in the classic definition, are sensors that utilize a bioreceptor such as an antibody, enzyme, peptide, or nucleic acid. A bioreceptor binds to a target biological molecule and generates a signal when coupled with a transducer. Biosensors have evolved to use a wide range of transducer types, including electrochemical, optical, and spectroscopic transducers. Traditionally, it is the bioreceptor that provides specificity and sensitivity to the biosensor. Increasingly, however, researchers are developing biosensors that lack a specific bioreceptor. A typical example is a semi-specific chemical sensor array, termed an Enose (for gases) or an Etongue (for solutions). Since such a sensor's specificity is not provided by a bioreceptor, a fingerprinting technique is used to recognize signal patterns indicative of a particular analyte. Frequently, machine learning techniques are employed to detect these patterns and provide specificity.
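The fingerprinting idea can be illustrated with a minimal sketch. It assumes a hypothetical four-sensor array, made-up response values, and a simple cosine-similarity match against per-analyte mean patterns; none of these choices comes from the works reviewed here, and practical systems typically use the richer algorithms summarized in Table 1.

```python
import math

def normalize(response):
    """Scale an array response to unit length so the pattern, not the magnitude, is compared."""
    norm = math.sqrt(sum(v * v for v in response))
    if norm == 0:
        raise ValueError("response has zero magnitude")
    return [v / norm for v in response]

def build_fingerprints(training):
    """Average the normalized responses recorded for each analyte."""
    fingerprints = {}
    for analyte, responses in training.items():
        normalized = [normalize(r) for r in responses]
        count = len(normalized)
        mean = [sum(column) / count for column in zip(*normalized)]
        fingerprints[analyte] = normalize(mean)
    return fingerprints

def identify(response, fingerprints):
    """Return the analyte whose fingerprint has the highest cosine similarity."""
    unit = normalize(response)

    def similarity(analyte):
        return sum(a * b for a, b in zip(unit, fingerprints[analyte]))

    return max(fingerprints, key=similarity)

# Hypothetical responses from a four-sensor array (arbitrary units).
training = {
    "ethanol": [[0.9, 0.2, 0.1, 0.4], [1.0, 0.3, 0.1, 0.5]],
    "acetone": [[0.2, 0.8, 0.6, 0.1], [0.3, 0.9, 0.7, 0.1]],
}
fingerprints = build_fingerprints(training)

# A stronger overall signal with an ethanol-like pattern across the array.
unknown = [1.8, 0.5, 0.2, 0.9]
print(identify(unknown, fingerprints))
```

No single sensor in the array is specific to either analyte; the identification comes from the relative pattern across all four channels, which is why normalizing away the overall magnitude is a reasonable first step in this toy setting.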

The use of machine learning to enhance the performance (e.g., specificity, sensitivity, and LOD) of bioreceptor-free biosensors is not limited to chemical sensor arrays; it has been employed with various biosensing mechanisms. Some of the most prominent examples aside from Enose and Etongue are imaging-based biosensors and SERS-based biosensors. Nor is the use of machine learning limited to biosensors that lack bioreceptors: Cui et al. [29] cover several examples of traditional biosensors employing machine learning to enhance performance.

Table 1 provides an overview of the tasks for which machine learning has been applied, the specific algorithms used, and the relevant papers. More information on the algorithms themselves can be found in Section 3. Additionally, Table 2 gives a comparison of each of the major biosensing mechanisms including data type and appropriate feature engineering and ML methods. All information in Table 2 comes from Table 1 and serves as a higher-level summary.


**Table 1.** Machine learning tasks and algorithms used in biosensing.



Note. CV = cyclic voltammetry; ANN = artificial neural network; LSTM = long short-term memory; PCA = principal component analysis; DT = decision tree; RF = random forest; SVM = support vector machine; SVR = support vector regression; BPNN = back-propagation neural network; JDA = joint distribution adaptation; DTBLS = domain transfer broad learning system; GBM = gradient boosting machine; ELM = extreme learning machine; EIS = electrical impedance spectroscopy; EIT = electrical impedance tomography; *k*-NN = *k*-nearest neighbor; RBFN = radial basis function network; CNN = convolutional neural network; *t*-SNE = *t*-distributed stochastic neighbor embedding; Si-CARS-PLS = synergy interval partial least squares with competitive adaptive reweighted sampling; FTIR = Fourier transform infrared; VOC = volatile organic compound; GAN = generative adversarial network; DNN = deep neural network; TLC = thin layer chromatography; SERS = surface-enhanced Raman spectroscopy; PLSR = partial least squares regression; PCR = principal component regression; LDA = linear discriminant analysis; HCA = hierarchical cluster analysis; KPCA = kernel principal component analysis.



#### **3. A Brief Tour of Machine Learning**

In simple terms, machine learning aims to learn patterns in data to make predictions on new data. Generally, this prediction is either categorical classification (into one of a set of classes) or regression (continuous numerical output). In machine learning terms, the data used for prediction (i.e., biosensor data) are termed features or predictors. The set of features associated with one "observation" (e.g., biosensor data from one sample) is termed the feature vector.
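To make this terminology concrete, the following sketch uses *k*-nearest neighbors purely as an illustration. The three-channel readings, class labels, and concentration values are hypothetical. The same feature vectors feed both a classification task (categorical output) and a regression task (continuous output).

```python
from collections import Counter
import math

# Each row is one feature vector: the features from one observation (one sample).
# Values are hypothetical readings from a three-channel biosensor.
X_train = [
    [0.10, 0.80, 0.30],
    [0.15, 0.75, 0.35],
    [0.90, 0.20, 0.60],
    [0.85, 0.25, 0.65],
]
y_class = ["analyte A", "analyte A", "analyte B", "analyte B"]  # categorical targets
y_conc = [1.0, 1.2, 5.0, 4.8]  # continuous targets, e.g., concentration

def nearest(x, X, k):
    """Indices of the k training vectors closest to x (Euclidean distance)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))
    return order[:k]

def knn_classify(x, X, y, k=3):
    """Classification: majority vote among the k nearest neighbors."""
    votes = Counter(y[i] for i in nearest(x, X, k))
    return votes.most_common(1)[0][0]

def knn_regress(x, X, y, k=2):
    """Regression: mean target value of the k nearest neighbors."""
    idx = nearest(x, X, k)
    return sum(y[i] for i in idx) / len(idx)

new_sample = [0.88, 0.22, 0.62]  # feature vector for an unseen observation
print(knn_classify(new_sample, X_train, y_class))
print(knn_regress(new_sample, X_train, y_conc))
```

The only difference between the two tasks here is the type of target and how the neighbors' targets are combined; the feature vectors themselves are unchanged.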
