#### **1. Introduction**

Optimizing the synthesis parameters (factors) to obtain high-quality organic carbon dots (CDs) is a complex process, comparable to a search in the dark. Several researchers have reported low photoluminescence, long synthesis times, and high resource consumption in the production of CDs [1,2]. The synthesis of CDs therefore needs to be optimized with an appropriate mathematical model, one that optimizes the process against quality criteria and predicts the outcome with low error. This is necessary because some factors influence the quality of the yield while others do not. The synthesis of sustainable organic CDs with high-performance yield is still under investigation. In the past, synthesis processes involving multiple factors were studied by varying a single factor while keeping the others constant, an approach known as one variable at a time (OVAT), but this method is time-consuming. It therefore became imperative to use multivariate statistical methods that substantially reduce the number of experiments [3].

Nwobi-Okoye and Ochieze [4] compared Response Surface Methodology (RSM) and Artificial Neural Network (ANN) models for predicting the hardness of aluminum alloy A356/cow horn particulate composites. The study confirmed that the ANN, with R<sup>2</sup> of 0.9921, was more accurate than the RSM, with R<sup>2</sup> of 0.9583, in predicting the hardness values of the composites [4]. The ANN model is based on artificial intelligence (machine learning), in which a predefined set of data is used for training [5,6], validation, and testing for prediction purposes. Because of this constraint, the values predicted by an ANN will often not be the best possible values, but they will lie within the range of the experimental study [7,8].

RSM is a technique that has been widely applied to define the interactions between process parameters and responses under the desired criteria, and to assess the significance of these process parameters for the desired responses [9]. However, RSM is reported to be unsuitable for optimizing non-linear systems that show minimal differences in parts, processing boundaries, or investigated data sets, because this affects the overall properties of the material [9]. RSM predictions are based on a first- or second-order polynomial equation; hence, RSM cannot adequately capture non-linear behavior and can give an unreliable estimate of the photoluminescent quantum yield of fluorescent carbon dots of organic origin. Consequently, an artificial neural network (ANN) can be employed to check and overcome the limitations of using RSM alone to predict a non-linear system. An ANN is an independent method whose model effectively handles the non-linearity in responses related to the synthesis and photoluminescent quantum yield of carbon dots.

This study formulates an ANN model that accepts a small data set of experimental runs while supplying a useful output for the synthesis of an advanced nanoparticle (CDs). From the literature surveyed, no results have been published that use the Levenberg–Marquardt back propagation (LMBP) algorithm in the synthesis of fluorescent carbon dots from tapioca powder. With this in view, the LMBP training algorithm was built and deployed in the current study to predict the photoluminescent quantum yield of carbon dots synthesized from tapioca powder.

In this report, sample data were acquired from the design of experiment for response surface methodology (RSM). Training and prediction were carried out with different multi-input, multi-output ANNs, developed using the Levenberg–Marquardt back propagation (LMBP) algorithm, to predict the fluorescent properties of the carbon dots synthesized in the study.

#### **2. Mathematical Models and Analytical Methods**

#### *2.1. Response Surface Method and Mathematical Model*

The design software used was Design-Expert Version 11.0.5. A central composite design (CCD) was adopted for the analysis of effects. The design consists of four (4) independent variables: Temperature (X1), Dosage (X2), Time (X3), and W/Ace/NaOH ratio (X4), as shown in Table 1. A total of 30 experimental runs were carried out; photoluminescent quantum yield (PLQY) was the response considered and is expressed as the dependent variable shown in Table 2. The input variables were treated individually as independent variables. A second-order polynomial equation was used to express PLQY (Y1) as a function of the independent variables, as given in the equation below:

$$Y = \beta_0 + \sum_{i=1}^{4} \beta_i X_i + \sum_{i=1}^{4} \beta_{ii} X_i^2 + \sum_{i<j} \beta_{ij} X_i X_j \tag{1}$$

where *Y* represents the response variable, β<sub>0</sub> is a constant, and β<sub>*i*</sub>, β<sub>*ii*</sub>, and β<sub>*ij*</sub> are the linear, quadratic, and cross-product coefficients, respectively. *X<sub>i</sub>* and *X<sub>j</sub>* are the respective levels of the independent variables. Three-dimensional (3D) response surface plots were generated by varying two variables at a time within the experimental range while holding the other two constant at the central point. Furthermore, the test of statistical significance was based on the total error criteria with a confidence level of 95.0%. Below is the response surface methodology (RSM) design summary.


**Table 1.** Independent variables used in the Response Surface Methodology (RSM) design.

**Table 2.** Design of experiment for response surface methodology report.


Pred. value = predicted value; Res. value = residual value. Study type: Response Surface; Runs: 30; Initial design: Central Composite; Design model: Quadratic.
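As a rough illustration of the design summarized above, the sketch below generates a coded central composite design for four factors and expands each run into the terms of a second-order model. The axial distance `alpha = 2.0` and the six center points are assumptions: the text states only that the CCD has 30 runs, and 16 factorial + 8 axial + 6 center runs is one common layout that gives that total. The function names `ccd_points` and `quadratic_terms` are hypothetical.

```python
from itertools import combinations, product


def ccd_points(k=4, alpha=2.0, n_center=6):
    """Coded central composite design: factorial, axial, and center runs.

    alpha and n_center are assumed values, not taken from the paper.
    """
    # 2^k factorial corners at coded levels -1 and +1
    factorial = [[float(v) for v in corner] for corner in product((-1, 1), repeat=k)]
    # 2k axial (star) points: one factor at +/- alpha, the rest at 0
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    # Replicated center points
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def quadratic_terms(x):
    """Expand one run into [1, linear, squared, cross-product] model terms."""
    linear = list(x)
    squares = [v * v for v in x]
    cross = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return [1.0] + linear + squares + cross


design = ccd_points()
design_matrix = [quadratic_terms(run) for run in design]
```

With four factors, each run expands to 15 terms (1 intercept, 4 linear, 4 quadratic, 6 cross-product), which is the number of coefficients a least-squares fit of the quadratic model would estimate from the 30 runs.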

#### *2.2. Artificial Neural Network Mathematical Model and Method*

The response surface methodology data, together with the experimental data, were collected from the sample formulation, totaling 30 samples. These sample data were used to train the network. Note that the raw data collected were normalized to a range between 0 and 1 using Equation (2) below.

$$X_{\text{norm}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} \tag{2}$$

Here *X<sub>norm</sub>* is the normalized value, *X* is the variable, and *X<sub>min</sub>* and *X<sub>max</sub>* are the minimum and maximum values in the data set.
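A minimal sketch of the min-max scaling in Equation (2) is given here for one column of data. The temperature values are made up for illustration, and the helper names `min_max_normalize` and `denormalize` are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] following Equation (2)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Equation (2) divides by zero when the column is constant
        raise ValueError("all values are equal; normalization is undefined")
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(x_norm, x_min, x_max):
    """Map a normalized value back to the original scale."""
    return x_norm * (x_max - x_min) + x_min


temperatures = [150.0, 170.0, 190.0]  # illustrative values, not the paper's data
normalized = min_max_normalize(temperatures)
```

The inverse mapping matters in practice because a network trained on normalized targets returns predictions on the 0 to 1 scale, which then have to be converted back to the original units.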

Normalization is necessary for the sigmoid transfer function to work effectively. The network model was programmed as a multilayer perceptron (*MLP*) trained with the back propagation (BP) algorithm. It consists of an input layer (the input variables of temperature, time, dosage, and solvent ratio), a hidden layer, and an output layer (photoluminescent quantum yield), which was the response generated from the experimental values. The nodes in these three layers are neurons that use a non-linear activation function, as shown in Figure 1 below.

**Figure 1.** Architecture of multilayer perceptron neural network.
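To make the architecture concrete, the following sketch runs one forward pass through a small multilayer perceptron with log-sigmoid hidden neurons and a linear output neuron, the transfer functions given in Equations (3) and (4). The number of hidden neurons and all weights are placeholders chosen so that the result is easy to check by hand; they are not the trained values from this study, and the function names are hypothetical.

```python
import math


def logsig(n):
    """Log-sigmoid transfer function, written to avoid overflow for large |n|."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)
    return e / (1.0 + e)


def purelin(n):
    """Linear transfer function used in the output layer."""
    return n


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, hidden_layers, output_layer):
    """Propagate x through logsig hidden layers, then a linear output layer."""
    activations = x
    for weights, biases in hidden_layers:
        activations = dense(activations, weights, biases, logsig)
    out_weights, out_biases = output_layer
    return dense(activations, out_weights, out_biases, purelin)[0]


# Placeholder network: 4 inputs -> 3 hidden neurons -> 1 output
hidden = [([[0.0] * 4, [0.0] * 4, [0.0] * 4], [0.0, 0.0, 0.0])]
output = ([[1.0, 1.0, 1.0]], [0.25])
# Inputs: normalized temperature, time, dosage, solvent ratio (made-up values)
plqy_norm = mlp_forward([0.2, 0.5, 0.1, 0.9], hidden, output)
```

Because `hidden_layers` is a list, the same function covers networks with one or several hidden layers.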

Data tests showed that a pair of hidden layers resulted in high performance. Thus, after multiple iterations to find the best-performing data set, the artificial neural network topologies were selected based on the log-sigmoid transfer function (Equation (3)), a linear transfer function in the output layer (Equation (4)), and the performance criteria of coefficient of determination (R<sup>2</sup>, Equation (5)), mean absolute error (*MAE*, Equation (6)), and root mean square error (*RMSE*, Equation (7)).

$$\text{Logsig}\left(n\right) = \frac{1}{1 + \exp(-n)}\tag{3}$$

$$\text{Purelin}\left(n\right) = n\tag{4}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(y_{\text{pred}}^{i} - y_{\text{targ}}^{i}\right)^2}{\sum_{i=1}^{N} \left(y_{\text{avg,targ}} - y_{\text{targ}}^{i}\right)^2} \tag{5}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left|y_{\text{pred}}^{i} - y_{\text{targ}}^{i}\right| \tag{6}$$

$$RMSE = \sqrt{\sum_{i=1}^{N} \frac{\left(y_{\text{pred}}^{i} - y_{\text{targ}}^{i}\right)^2}{N}} \tag{7}$$

where *N* is the number of experimental data points, *y<sup>i</sup><sub>pred</sub>* is the predicted value, *y<sup>i</sup><sub>targ</sub>* is the real value obtained from experimental data, and *y<sub>avg,targ</sub>* is the average experimental value. R<sup>2</sup> measures the reduction in the variability of the response obtained by using the regressor variables in the model. An R<sup>2</sup> close to 1 is desirable, and the root mean square error (*RMSE*) should be as small as possible [10].
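The three criteria can be computed directly from paired predictions and targets. The sketch below is a minimal plain-Python version; it uses the conventional definition of R<sup>2</sup> (one minus the ratio of the residual sum of squares to the total sum of squares), and the sample values are made up for illustration.

```python
import math


def r_squared(y_pred, y_targ):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_targ = sum(y_targ) / len(y_targ)
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_targ))
    ss_tot = sum((mean_targ - t) ** 2 for t in y_targ)
    return 1.0 - ss_res / ss_tot


def mae(y_pred, y_targ):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_targ)) / len(y_targ)


def rmse(y_pred, y_targ):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_targ)) / len(y_targ))


targets = [1.0, 2.0, 3.0, 4.0]      # illustrative values only
predictions = [2.0, 3.0, 4.0, 5.0]  # every prediction is off by +1
scores = (r_squared(predictions, targets), mae(predictions, targets), rmse(predictions, targets))
```

In this toy case every prediction is off by exactly one unit, so *MAE* and *RMSE* are both 1, while R<sup>2</sup> drops to 0.2 because the residual sum of squares (4) is large relative to the total sum of squares (5).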

The process of developing an artificial intelligence model for the prediction and optimization of fluorescent carbon dots followed the flow chart in Figure 2. The chart demonstrates the procedural flow involved in the formulation of the artificial neural network for photoluminescent quantum yield prediction and optimization for the synthesized fluorescent carbon dots (see Section 2.3 for synthesis approach).

**Figure 2.** Artificial neural network model flow chart.

A multi-input, single-output artificial neural network model was developed using the Levenberg–Marquardt back propagation (*LMBP*) algorithm to predict the photoluminescent quantum yield of the synthesized fluorescent carbon dots [11,12].
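The idea behind Levenberg–Marquardt training can be sketched in plain Python for a very small network. This is not the implementation used in the study: it approximates the Jacobian by finite differences instead of back propagation, and the two hidden neurons, the damping schedule (factor of 10, starting at 0.01), and the toy data are all assumptions made for illustration. Each iteration solves (JᵀJ + μI)Δw = −Jᵀr for the weight update, then lowers μ when the error decreases and raises it otherwise.

```python
import math
import random


def logsig(n):
    """Numerically stable log-sigmoid."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)
    return e / (1.0 + e)


def predict(w, x, n_hidden):
    """Inputs -> n_hidden logsig neurons -> one linear output neuron."""
    n_in = len(x)
    out = w[-1]  # output bias
    for h in range(n_hidden):
        base = h * (n_in + 1)
        z = w[base + n_in] + sum(w[base + i] * x[i] for i in range(n_in))
        out += w[n_hidden * (n_in + 1) + h] * logsig(z)
    return out


def residuals(w, X, y, n_hidden):
    return [predict(w, x, n_hidden) - t for x, t in zip(X, y)]


def sse(w, X, y, n_hidden):
    return sum(e * e for e in residuals(w, X, y, n_hidden))


def jacobian(w, X, y, n_hidden, eps=1e-6):
    """Forward-difference Jacobian of the residuals (stand-in for backprop)."""
    r0 = residuals(w, X, y, n_hidden)
    J = [[0.0] * len(w) for _ in r0]
    for k in range(len(w)):
        w_step = list(w)
        w_step[k] += eps
        r_step = residuals(w_step, X, y, n_hidden)
        for i in range(len(r0)):
            J[i][k] = (r_step[i] - r0[i]) / eps
    return J, r0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def train_lm(X, y, n_hidden=2, iters=20, mu=0.01, seed=0):
    rng = random.Random(seed)
    n_w = n_hidden * (len(X[0]) + 1) + n_hidden + 1
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_w)]
    best = sse(w, X, y, n_hidden)
    history = [best]
    for _ in range(iters):
        J, r = jacobian(w, X, y, n_hidden)
        rows = range(len(r))
        JtJ = [[sum(J[i][a] * J[i][b] for i in rows) for b in range(n_w)]
               for a in range(n_w)]
        grad = [sum(J[i][a] * r[i] for i in rows) for a in range(n_w)]
        while mu < 1e10:
            # Damped normal equations: (J^T J + mu I) dw = -J^T r
            A = [[JtJ[a][b] + (mu if a == b else 0.0) for b in range(n_w)]
                 for a in range(n_w)]
            dw = solve(A, [-g for g in grad])
            w_try = [wi + di for wi, di in zip(w, dw)]
            trial = sse(w_try, X, y, n_hidden)
            if trial < best:  # accept: behave more like Gauss-Newton
                w, best, mu = w_try, trial, max(mu / 10.0, 1e-8)
                break
            mu *= 10.0        # reject: behave more like gradient descent
        history.append(best)
    return w, history


# Toy data: 12 samples with 4 normalized inputs and a smooth made-up response
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(12)]
y = [0.3 * a + 0.5 * b * b - 0.2 * c + 0.1 * d for a, b, c, d in X]
weights, history = train_lm(X, y)
```

The `history` list holds the sum of squared errors after each iteration and never increases, because a step is only accepted when it lowers the error. A production setup would normally rely on an established library routine and a held-out validation set rather than this hand-rolled loop.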

#### *2.3. Synthesis of Carbon Dots (CDs)*

An environmentally friendly technique for producing carbon dots, the hydrothermal synthesis process, was adopted based on the response surface methodology analysis. The conditions that gave the best photoluminescent quantum yield were used here to report the response. A small quantity (0.1 g) of tapioca flour was mixed in 12 mL of the prepared solvent (deionized water + sodium hydroxide + acetone); see Figure 3 for the mechanism flow. This mixture was placed in a stainless steel hydrothermal reactor in a convection oven at a temperature of 170 °C for a period of 1 h 40 min. This study has thus reduced the temperature and time needed for synthesizing CDs [13–15].

**Figure 3.** Mechanism for synthesis of carbon dots.

The mixture was centrifuged for 20 min at 3000 rpm to clarify it (no substantial suspended solids were detected). The photoluminescent quantum yield was then calculated by:

$$Q = Q_R \left(\frac{GRAD}{GRAD_R}\right) \left(\frac{\eta^2}{\eta_R^2}\right) \tag{8}$$
