**3. Materials and Methods**

To analyze the available data, we used Minitab together with the Python programming language and the Spyder integrated development environment. Python is among the most popular programming languages, in part because it is free to use and is widely regarded as more productive than languages such as C++ or Java.

Furthermore, Python is one of the most widely used languages for data analysis, machine learning and artificial intelligence, with an extensive set of libraries dedicated to these applications. For the current analysis, the Seaborn and SciPy libraries were used. Seaborn is a statistical graphics library built on top of matplotlib and integrated with pandas data structures.

In terms of functionality, Seaborn offers an API for examining relationships between multiple variables, support for categorical variables, visualization of univariate and bivariate distributions, estimation and plotting of linear regression models, convenient views of complex datasets, built-in themes for matplotlib figures, and color palettes for revealing patterns in data.

The second library used was SciPy. The name refers both to a Python ecosystem for engineering, mathematics and science and to one library within that ecosystem. The ecosystem comprises six core libraries: NumPy (base N-dimensional array package), SciPy (library for scientific computing), Matplotlib (2-D plotting), IPython (an enhanced interactive console), SymPy (symbolic mathematics) and pandas (data structures and analysis). The SciPy library used here is a core package of the SciPy stack and provides user-friendly numerical routines for integration, interpolation, optimization, linear algebra and statistics.

For the analysis, we used the SciPy functions that minimize an objective function in nonlinear curve-fitting problems.
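The text does not name the SciPy routine that was used. As a minimal sketch, assuming `scipy.optimize.curve_fit` (which minimizes the sum of squared residuals), a curve fit could look as follows. The model `saturating_growth`, its parameters and the data are hypothetical and serve only to illustrate the call.

```python
import numpy as np
from scipy.optimize import curve_fit


def saturating_growth(x, a, b):
    """Hypothetical nonlinear model: y = a * (1 - exp(-b * x))."""
    return a * (1.0 - np.exp(-b * x))


# Synthetic, noise-free data generated from known parameters (illustrative only).
x = np.linspace(0.5, 10.0, 20)
y = saturating_growth(x, 5.0, 0.4)

# curve_fit searches for the parameters that minimize the sum of squared
# differences between the model and the data, starting from the guess p0.
params, covariance = curve_fit(saturating_growth, x, y, p0=(3.0, 0.5))
a_hat, b_hat = params
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

The returned `covariance` matrix can be used to derive approximate standard errors for the estimated parameters (the square roots of its diagonal).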

Given the nature of the data, most curve-fitting cases involved adding polynomial terms, specifically squared predictors, to the linear regression. Typically, the model order is chosen according to the number of bends observed in a plot of the data.

Each increase in the exponent produces one additional bend in the fitted curve. We did not identify any situation that required a cubic or higher-order term. Besides using polynomial terms as predictors, we tested two further scenarios: first, including the reciprocal (1/X) of the predictor variable in the model, as both a linear and a quadratic term; second, transforming the variables with the common (log) or natural (ln) logarithm in the linear regression. A log transformation allows linear regression to fit curves that would otherwise require nonlinear regression.
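The polynomial and reciprocal scenarios described above can be compared with ordinary least squares on different design matrices. The following is a minimal sketch on synthetic data; the helper `fit_ols`, the data values and the use of R-squared as the comparison criterion are assumptions made for illustration, not the procedure of the study.

```python
import numpy as np


def fit_ols(design, y):
    """Ordinary least squares fit; returns the coefficients and R-squared."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# Synthetic data with a single bend (illustrative values, not from the study).
x = np.arange(1.0, 10.0)  # nine observations
y = 2.0 + 3.0 * x - 0.25 * x ** 2

ones = np.ones_like(x)
designs = {
    "linear": np.column_stack([ones, x]),
    "quadratic": np.column_stack([ones, x, x ** 2]),
    "reciprocal": np.column_stack([ones, 1.0 / x]),
    "reciprocal quadratic": np.column_stack([ones, 1.0 / x, 1.0 / x ** 2]),
}

results = {name: fit_ols(design, y) for name, design in designs.items()}
for name, (coef, r_squared) in results.items():
    print(f"{name:22s} R^2 = {r_squared:.4f}")
```

Because every model remains linear in its coefficients, the same least-squares routine handles all four cases; only the columns of the design matrix change.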

As an example, the nonlinear function $Y = e^{B_0} X_1^{B_1} X_2^{B_2}$ can be expressed in the linear form $\ln Y = B_0 + B_1 \ln X_1 + B_2 \ln X_2$.
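This linearization can be checked numerically. The sketch below generates synthetic data from a multiplicative model with known coefficients, takes natural logarithms of both sides, and recovers the coefficients with an ordinary linear fit. The coefficient values and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictors and coefficients (illustrative only).
x1 = rng.uniform(1.0, 10.0, size=30)
x2 = rng.uniform(1.0, 10.0, size=30)
b0, b1, b2 = 0.5, 1.2, -0.7

# Nonlinear (multiplicative) form: Y = exp(B0) * X1^B1 * X2^B2
y = np.exp(b0) * x1 ** b1 * x2 ** b2

# Linear form after taking logs: ln Y = B0 + B1 * ln X1 + B2 * ln X2
design = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

print("estimated B0, B1, B2:", np.round(coef, 4))
```

The transformation requires strictly positive values of Y, X1 and X2, since the logarithm is undefined otherwise.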

The logarithm can be applied to both sides of the equation (double-log form) or to one side only (semi-log form). Log functional forms are powerful, but when many predictors are involved, the number of possible combinations grows quickly. For the current research, nonlinear models were not proposed because of the low number of available data samples.

To test the goodness-of-fit of our regression models, we included four residual plots: a normal probability plot of the residuals; a histogram of the residuals; residuals versus fits; and residuals versus order.

A residual plot is a graph that helps determine whether the OLS (ordinary least squares) assumptions are met; when they are, OLS yields unbiased coefficient estimates with minimum variance. The normal probability plot of the residuals was used to verify the assumption that the residuals are normally distributed.

The histogram of the residuals shows whether the data are skewed or whether outliers exist in the data. The residuals versus fits plot verifies the assumption that the residuals have constant variance, while the residuals versus order plot verifies the assumption that the residuals are independent of one another.
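The quantities behind the four residual plots can be computed directly with NumPy and SciPy. The sketch below uses a simple linear fit on synthetic data and is not the Minitab output of the study; each pair of arrays can then be passed to a plotting library such as matplotlib.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic data: a linear trend plus normal noise (illustrative only).
x = np.arange(1.0, 31.0)
y = 1.5 + 0.8 * x + rng.normal(0.0, 1.0, size=x.size)

# Ordinary least squares fit of a straight line, then the residuals.
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted

# 1. Normal probability plot: theoretical normal quantiles versus ordered
#    residuals; pp_r near 1 suggests approximately normal residuals.
(theoretical_q, ordered_resid), (pp_slope, pp_intercept, pp_r) = stats.probplot(
    residuals, dist="norm"
)

# 2. Histogram of the residuals: bin counts and bin edges.
counts, bin_edges = np.histogram(residuals, bins="auto")

# 3. Residuals versus fits: plot `residuals` against `fitted`;
#    a band of constant width suggests constant variance.

# 4. Residuals versus order: plot `residuals` against observation order;
#    the absence of a pattern suggests independent residuals.
order = np.arange(1, residuals.size + 1)

print(f"probability plot correlation: {pp_r:.3f}")
```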

The analytical framework generates various perspectives that can be used to elaborate a strategy for the sustainable economic development of the Moldavian agriculture sector. To identify the importance of the agriculture sector for the Moldavian economy, total GDP, agricultural GDP, GDP per capita, total GVA, agricultural GVA and GVA per capita were included among the analyzed parameters of the present study. GDP gives the economic output from the consumers' side, while GVA describes the state of economic activity from the producers' or supply side. In addition, to characterize the sector in terms of financial input, production value and production capacity, the value of governmental agricultural subsidies, agricultural loans, and the production quantity and production value of the main crops were included among the analyzed parameters.

Wheat, maize, grapes and vegetables were considered to have considerable potential to influence the economic performance of the agriculture sector. To verify this, the total agricultural plant production and production values were included among the analyzed parameters. In addition, since only a small number of farms manage to access governmental subsidies, the value of subsidies per subsidized farm was also included in the list of analyzed parameters. This was intended to improve the results of the analytical framework and thus to generate more accurate perspectives for supporting the sustainable economic development of the Moldavian agriculture sector.

Our framework development was based on a dataset containing 21 parameters considered relevant, as previously explained in the introduction and further in the results section, and used here to describe the evolution of the agricultural sector during a period of nine years, between 2008 and 2016. The 21 parameters, statistically described in Figure 1, are as follows:



**Figure 1.** Descriptive statistics of the variables.

The distribution of the data relative to the mean and the percentiles reveals the presence of outliers. Several outliers were identified, for example in Farm Subsidies (16,100.00); GDP Agriculture (1079.98); Total Plants Value (638.32 and 1355.50); Grape Production (482,000.00 and 685,000.00); and Number of Farms (2198.00 and 2357.00).
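The text does not state the exact rule used to flag outliers. One common percentile-based convention, assumed here purely for illustration, is the interquartile range (IQR) rule, which flags values lying more than 1.5 times the IQR below the 25th or above the 75th percentile. The function `iqr_outliers` and the series values are hypothetical.

```python
import numpy as np


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k * IQR, Q3 + k * IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]


# Hypothetical nine-observation series (not data from the study).
series = [10.0, 11.0, 12.0, 11.5, 10.5, 12.5, 11.0, 30.0, 10.8]
flagged = iqr_outliers(series)
print("flagged values:", flagged)
```

With only nine observations per parameter, such a rule is best treated as a screening aid, and flagged values should be checked against the source data before being excluded.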
