**1. Introduction**

Biomass gasification produces syngas, composed mainly of hydrogen and carbon monoxide, which can be used as a fuel or converted to other products. It is a renewable source of energy that can use various types of biomass, including wood, straw, and crop residues such as shells or husks.

To aid the design of gasification systems, modeling can be used to predict the output composition for different feedstocks and operating conditions without the cost of experiments [1]. The review of Patra and Sheth describes several categories of biomass gasifier models, including more complex models based on kinetic rate expressions or computational fluid dynamics, relatively simpler models based on thermodynamic equilibrium assumptions, and empirical models based on artificial neural networks [1]. They also mention the possibility of modeling inside a process simulator such as Aspen Plus, where kinetic or equilibrium models can be included inside the process units or associated subroutines [1]. For example, Safarian et al. simulated a gasification process in Aspen Plus using a Gibbs reactor, which calculates the equilibrium point by minimizing the Gibbs free energy [2]. Marcantonio et al. also modeled gasification using a Gibbs reactor inside Aspen Plus, which they compared against a more accurate kinetic model simulated in MATLAB [3].
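For orientation, the Gibbs free energy minimization behind such equilibrium calculations is commonly written in the following non-stoichiometric form. This is a textbook sketch that assumes an ideal gas mixture; the symbols ($n_i$, $\mu_i$, $a_{ik}$, $b_k$) are introduced here for illustration and are not taken from the cited studies or from the simulator's internal formulation.

$$
\min_{n_1,\dots,n_S} \; G = \sum_{i=1}^{S} n_i \mu_i,
\qquad
\mu_i = \mu_i^{\circ}(T) + RT \ln\!\left(\frac{y_i P}{P^{\circ}}\right),
\qquad
y_i = \frac{n_i}{\sum_{l=1}^{S} n_l}
$$

$$
\text{subject to} \quad \sum_{i=1}^{S} a_{ik}\, n_i = b_k \;\; (k = 1,\dots,E),
\qquad n_i \ge 0
$$

Here $n_i$ is the number of moles of species $i$, $a_{ik}$ is the number of atoms of element $k$ in species $i$, and $b_k$ is the total amount of element $k$ supplied by the feed, so the constraints enforce the element balance.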

To avoid the complexities associated with kinetic and computational fluid dynamics (CFD) models, a large number of studies have focused on equilibrium models, artificial neural networks, and other empirical or semi-empirical models, which allow for fast simulation, sensitivity analysis, and optimization of gasification systems. However, equilibrium models are known to be inaccurate because a real gasifier does not necessarily reach equilibrium, which can lead to an overestimation of the hydrogen and carbon monoxide content of the producer gas and an underestimation of the methane content [4]. To address this inaccuracy, a number of studies have proposed adding correction factors or correlations to the equilibrium models to bring the results closer to reality, as detailed in the review of Ferreira et al. [5].

**Citation:** Binns, M.; Ayub, H.M.U. Model Reduction Applied to Empirical Models for Biomass Gasification in Downdraft Gasifiers. *Sustainability* **2021**, *13*, 12191. https://doi.org/10.3390/su132112191

Academic Editors: Julian Scott Yeomans, Mariia Kozlova and Dino Musmarra

Received: 4 October 2021 Accepted: 3 November 2021 Published: 4 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Despite this progress, recent studies have shown that even with corrections added, the equilibrium models still deviate from experimental values, leaving room for improvement [6]. Alternatively, artificial neural networks can be used to predict the performance of gasifiers, as shown by Baruah et al. [7]. Although their networks give relatively accurate predictions, this is achieved by limiting the study to woody biomass in small-scale downdraft gasifiers [7]. Pandey et al. also show that an artificial neural network can achieve accurate predictions, but in that case the study is limited to the gasification of municipal waste in a single lab-scale fluidized bed reactor [8]. Additionally, machine learning in the form of least-squares support vector machines has been applied to predict the output of a downdraft gasifier [9]. Although these and other artificial intelligence methods have shown high accuracy, the resulting models generally do not identify which parameters are important, and fitting them requires a relatively large number of parameters (e.g., weights and bias values in the fitted equations). For example, the neural networks of Baruah et al. require 25 parameters for predicting the hydrogen content and 41 parameters for predicting the carbon monoxide content [7]. Although sensitivity with respect to the different inputs is not required for building this type of model, the relative impact of the inputs can still be calculated. For example, the study of Puig-Arnavat et al. shows that the carbon content of the feed biomass has a large effect on the CO (carbon monoxide) gas yield [10].

Alternatively, simpler empirical expressions have also been considered for predicting the product gas composition as a function of the gasifier inputs and operating conditions. These have the advantage that they typically have fewer parameters to fit, but the resulting model may be less accurate. For example, Chavan et al. compared a power-law type empirical formula against artificial neural networks for the prediction of gas production rate and heating value of gas products from coal gasification and showed that while both methods give a good fit, the artificial neural network method was slightly more accurate [11]. For the case of biomass gasification, Chee experimentally evaluated a downdraft biomass gasifier and proposed various linear and non-linear correlation equations to predict outlet conditions [12]. However, each of these correlations is a function of only a single inlet property and was obtained by varying only that parameter experimentally, so they cannot be used when more than one input is varied [12]. In another example, Pradhan et al. developed a number of thermodynamic models and then fitted linear expressions to predict the results of the best-fitting thermodynamic model [13]. They show that the linear models can adequately predict the output of the equilibrium model but do not show how well the linear expressions can predict experimental values [13]. This same procedure of developing equilibrium models and then fitting linear correlations to the model outputs has also been demonstrated by Rupesh et al., who likewise show that linear models can fit the output of an equilibrium-type model well but do not compare the linear correlations against experimental values [14]. More recently, Pio and Tarelho have compared the prediction accuracy of equilibrium and linear models for predicting the performance of bubbling fluidized bed reactors for biomass gasification [4].
They show that the linear models can accurately predict the output composition of the thermodynamic model (R squared values of 0.93 and 0.79 for hydrogen and carbon monoxide) but have limited accuracy when used to predict the experimental output composition values (R squared values of 0.04 and 0.23 for hydrogen and carbon monoxide) [4]. This could be due to the high variability of experimental composition values for bubbling fluidized bed reactors, as suggested by Pio and Tarelho [4]. Alternatively, Mirmoshtaghi et al. have shown that partial least squares regression yields linear model expressions with higher prediction accuracy (R squared values of 0.8 and 0.53 for hydrogen and carbon monoxide) for circulating fluidized bed gasifiers [15]. However, this higher accuracy could be explained by the much larger number of inputs they use (18 different terms in the linear expressions) [15], compared to the two inputs (temperature and equivalence ratio) considered in the linear relations of Pio and Tarelho [4].
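As a reminder of what the R squared values quoted above measure, the following minimal sketch computes the coefficient of determination in its common form, 1 − SS_res/SS_tot. The function name and the sample numbers are hypothetical, and the cited studies may use a slightly different definition (for example, the squared correlation coefficient).

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: scatter of the data around the model predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: scatter of the data around its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hydrogen contents (vol%), for illustration only.
measured = [15.0, 17.0, 19.0, 21.0]
modeled = [15.5, 16.5, 19.5, 20.5]
print(round(r_squared(measured, modeled), 2))  # 0.95
```

A value near 1 means the model explains most of the variation in the data, while a value near 0 (such as the 0.04 reported for hydrogen above) means the model does little better than predicting the mean.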

In addition to regression, Mirmoshtaghi et al. also present principal component analysis and a statistical analysis of *p*-values from the partial least squares regression to identify significant parameters, showing that the equivalence ratio is the most important parameter [15]. Gil et al. also applied principal component analysis to investigate the influence of different biomass properties on the resulting producer gas for a range of biomass feedstocks fed to a bubbling fluidized bed reactor [16]. This showed which feedstocks lead to higher production of the combustible gases CO (carbon monoxide) and CH<sub>4</sub> (methane) [16]. In a similar study, Dellavedova et al. also used partial least squares regression and principal component analysis for a set of data including different types of biomass gasifiers. While they do not report R squared values, they do find that the most important parameters are equivalence ratio, steam-to-biomass ratio, higher heating value, carbon content of the feedstock, and temperature [17]. They also mention that the limited accuracy of their linear model may be due to the non-complete homogeneity (high variability) of the data set they used [17].

While linear models are simple, they have been shown to have relatively limited accuracy for predicting the output of gasifiers, and it might be assumed that quadratic expressions could achieve better prediction accuracy by accounting for interactions between pairs of different inputs. However, Pan and Pandey have shown that both linear and quadratic expressions give high relative errors when fitted to data for fluidized bed gasifiers fed with municipal solid waste [18]. They also show that an artificial neural network and their proposed Bayesian approach using Gaussian processes can achieve much more accurate predictions, although the main aim of their proposed method is to incorporate uncertainty [18]. However, the high error in the quadratic regression may be because they attempted to fit a very large number of parameters based on combinations of the 9 input values (potentially 45 parameters, or 81 parameters if interaction pairs are counted multiple times) with a full dataset of only 67 points, which could be difficult to fit [18].
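The counts of 45 and 81 quoted above are consistent with counting second-order terms in 9 inputs, either as unordered pairs (squares plus distinct cross terms) or as ordered pairs. The short sketch below reproduces that arithmetic; the function name is hypothetical, and this reading of how the counts arise is an assumption rather than something stated in [18].

```python
from itertools import combinations_with_replacement, product


def count_second_order_terms(n_inputs):
    """Count second-order terms x_i * x_j for a quadratic model in n_inputs inputs."""
    # Unordered pairs with i <= j: the squares x_i^2 plus the distinct cross terms.
    unique_terms = sum(1 for _ in combinations_with_replacement(range(n_inputs), 2))
    # Ordered pairs: x_i * x_j and x_j * x_i are counted separately.
    ordered_terms = sum(1 for _ in product(range(n_inputs), repeat=2))
    return {"unique": unique_terms, "ordered": ordered_terms}


print(count_second_order_terms(9))  # {'unique': 45, 'ordered': 81}
```

Under this counting, 45 second-order coefficients are already a large fraction of the 67 available data points.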

In summary, a number of the studies mentioned above have fitted simple linear empirical models to the outputs of some other model (e.g., an equilibrium model) and have shown that linear empirical models can reproduce the results of the other model quite accurately [4,13,14]. However, the "other model" can itself be inaccurate when compared to experimental values, so the fitted correlations will not necessarily reproduce experimental values well. When simple empirical models are fitted directly to experimental values, the statistical fit appears to be worse [4] (e.g., compared to fitting an empirical model to the output of a thermodynamic model). The use of more complex methods, such as quadratic expressions or artificial neural networks, could achieve a better fit by accounting for non-linear behavior. This prediction accuracy has been demonstrated by a number of studies for artificial neural networks [7–11] but has not been demonstrated for quadratic expressions. Additionally, while dimension-reduction methods (e.g., principal component analysis) have been successfully applied to identify significant parameters [15–17], the LASSO (least absolute shrinkage and selection operator) shrinkage method [19], which aims to eliminate large numbers of less significant parameters, has not so far been applied to the model reduction of biomass gasification models.
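For reference, the LASSO is commonly stated in the following penalized least-squares form. The notation ($N$ data points, $p$ candidate terms $x_{ij}$, coefficients $\beta_j$, penalty weight $\lambda$) and the $1/(2N)$ scaling are conventions assumed here for illustration, not necessarily those used in this study.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \;
\frac{1}{2N} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^{2}
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad \lambda \ge 0
$$

Because the penalty is on the absolute values of the coefficients, sufficiently large $\lambda$ drives some $\beta_j$ exactly to zero, which is what allows less significant terms to be eliminated from the model.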

In this study, both linear and quadratic expressions are fitted to a set of data from a downdraft biomass gasifier. To avoid the problem of fitting large numbers of parameters, model reduction is included using the LASSO method [19], which is implemented together with cross validation to identify significant parameters and eliminate other parameters such that reduced expressions are obtained. This can be used, for example, in cases where the number of data points is less than the total number of parameters used in the full complex expressions (since the model reduction will eliminate most of the parameters such that the number of fitted parameters in the reduced model is less than the number of data points). The resulting models are evaluated based on their ability to predict the gasifier output.
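The idea of combining a quadratic expression, the LASSO, and cross validation can be sketched in a few lines. The example below is a pure-Python illustration on synthetic data, not the implementation used in this study: the coordinate-descent solver, the function names, the penalty grid, and the "true" model (only `x0` and `x1*x2` matter) are all assumptions made for the sketch. In practice, a library routine such as scikit-learn's `LassoCV` would normally be used.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero; exactly 0.0 when |value| <= penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def quadratic_features(x):
    """Linear terms followed by the unique second-order terms x_i * x_j (i <= j)."""
    p = len(x)
    return list(x) + [x[i] * x[j] for i in range(p) for j in range(i, p)]


def fit_lasso(X, y, penalty, n_sweeps=100):
    """Coordinate descent for (1/2N) * RSS + penalty * sum(|b_j|) on standardized columns."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Fall back to a scale of 1.0 for constant columns to avoid dividing by zero.
    scales = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0 for j in range(p)]
    Z = [[(row[j] - means[j]) / scales[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    residual = [yi - y_mean for yi in y]
    coefs = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Non-constant columns of Z have unit mean square, so the update
            # reduces to a plain soft-threshold of the partial correlation.
            rho = sum(Z[i][j] * residual[i] for i in range(n)) / n + coefs[j]
            new_coef = soft_threshold(rho, penalty)
            delta = new_coef - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Z[i][j] * delta
                coefs[j] = new_coef
    raw = [coefs[j] / scales[j] for j in range(p)]  # back to original units
    intercept = y_mean - sum(raw[j] * means[j] for j in range(p))
    return intercept, raw


def cv_error(X, y, penalty, n_folds=5):
    """Mean squared prediction error over interleaved k-fold cross validation."""
    n, total = len(X), 0.0
    for k in range(n_folds):
        held_out = set(range(k, n, n_folds))
        X_train = [X[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        b0, b = fit_lasso(X_train, y_train, penalty)
        total += sum((y[i] - b0 - sum(bj * xj for bj, xj in zip(b, X[i]))) ** 2
                     for i in held_out)
    return total / n


# Synthetic data (hypothetical): 3 inputs, only x0 and x1*x2 affect the output.
rng = random.Random(0)
inputs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(40)]
y = [2.0 * x[0] + 3.0 * x[1] * x[2] + rng.gauss(0, 0.05) for x in inputs]
X = [quadratic_features(x) for x in inputs]  # 3 linear + 6 second-order terms

penalties = [0.5, 0.2, 0.1, 0.05, 0.02, 0.01]
best_penalty = min(penalties, key=lambda lam: cv_error(X, y, lam))
intercept, coefs = fit_lasso(X, y, best_penalty)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print("penalty chosen by cross validation:", best_penalty)
print("indices of retained terms:", selected)
```

With the feature ordering used here, index 0 is `x0` and index 7 is `x1*x2`, so both should appear among the retained terms, while most of the remaining coefficients are set exactly to zero. The penalty is chosen on held-out data rather than on the training fit, which is what keeps the reduced model from simply retaining every term.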
