Onion (Allium cepa) Profit Maximization via Ensemble Learning-Based Framework for Efficient Nitrogen Fertilizer Use

Kim, Youngjin; Kim, Sumin; Kim, Sojung

doi:10.3390/agronomy14092130

by Youngjin Kim ¹, Sumin Kim ²,* and Sojung Kim ¹,*

¹ Department of Industrial and Systems Engineering, Dongguk University-Seoul, Seoul 04620, Republic of Korea

² Department of Environmental Horticulture & Landscape Architecture, College of Life Science & Biotechnology, Dankook University, Cheonan-si 31116, Chungnam, Republic of Korea

* Authors to whom correspondence should be addressed.

Agronomy 2024, 14(9), 2130; https://doi.org/10.3390/agronomy14092130

Submission received: 26 July 2024 / Revised: 4 September 2024 / Accepted: 17 September 2024 / Published: 19 September 2024

(This article belongs to the Special Issue Advanced Machine Learning in Agriculture)

Abstract:

Onion (Allium cepa) is a major field vegetable in South Korea and has long been produced along with cabbage, radish, garlic, and dried peppers. However, as field vegetables, including onions, have recently been imported at low prices, the profitability of onion production in South Korea is at risk. To maximize farmers’ profits from onion production, this study develops onion yield prediction models via an ensemble learning-based framework involving linear regression, polynomial regression, support vector regression, decision tree, ridge regression, and lasso regression. The use of nitrogen fertilizers is considered an independent variable in the development of the yield prediction model. This is because the use of nitrogen fertilizers accounts for the highest production cost (13.47%) after labor cost (41.21%) and seed cost (17.42%), and it also directly affects onion yields. For the model development, five existing research datasets on changes in onion yield according to changes in nitrogen fertilizer use were used. In addition, a non-linear optimization model was devised using the onion yield prediction models for the profit maximization of onion production. As a result, the developed non-linear optimization model using polynomial regression enables an increase in profits from onion production of 67.28%.

Keywords:

onion; machine learning; crop yield estimation; economic analysis; nitrogen fertilizer

1. Introduction

Recently, as the problem of global warming has become more serious [1], the importance of the efficient management of crop growth taking into account climate change and efforts to improve yields is increasing [2]. The problem of climate change does not simply mean changes in environmental variables such as temperature, humidity, and rainfall, but also causes changes in the growth of plants [3]. This eventually results in increased instability in the supply and demand of food resources [4].

To address climate change, countries around the world hold the United Nations Climate Change Conference of the Parties every year, and in 2021 the Glasgow Climate Pact sought to promote the development of renewable energy and to address climate change through the reduction of fossil fuel use [5]. The Korean government also announced a ‘mid- to long-term food security strengthening plan’ with the goal of securing an overall food self-sufficiency rate of 55.5% by 2027. In addition, it announced the ‘Supply and Demand Management Plan for Major Field Vegetables’, which aims to strengthen the supply stability of major field vegetables (cabbage, radish, garlic, onion, and dried pepper) in South Korea [6]. Through these efforts, the Korean government is trying to limit the decline in field vegetable production caused by weather disasters (typhoons, rainy seasons, heat waves) and to solve the problem of decreased onion supply due to climate change in winter [7,8].

There are various approaches to solving this yield reduction issue, but it is important to first recognize changes in the agricultural production environment, including climate change, and respond preemptively to produce crops. In fact, the Korean government emphasized the importance of ‘improving the ability to predict agricultural activities’ in order to implement preemptive supply and demand stabilization measures [6]. As a result, various studies are being conducted on predicting crop yields through the use of highly reliable prediction models and building highly profitable crop distribution networks using the predicted information.

There are various approaches involving machine learning techniques to develop a yield prediction model. Chomba et al. [9] utilized the growth curve of maize to estimate its yield according to the use of nitrogen fertilizers in western Kenya. Shastry et al. [10] used multi-variate regression techniques (e.g., linear regression (LR) and polynomial regression (PR)) to estimate the yield of wheat, maize, and cotton crops taking into account climate variables (e.g., temperature, rainfall, and solar radiation). In fact, the advantage of regression techniques is that they are analytical models that can explain the impact of individual independent variables on dependent variables through coefficients [11]. For complex predictions of growth results under various conditions, not just crop yield, Yoon et al. [12] adopted crop growth simulation software entitled the Agricultural Land Management Alternative with Numerical Assessment Criteria (ALMANAC), and it was applied to the simulation of soybean growth under climate change scenarios in South Korea. To be more specific, using k-means clustering, which is one of the most popular unsupervised learning techniques, three cultivar groups were classified based on the cultivar similarities (growing degree days (GDDs), height, 100-grain weight, number of pods per plant, lodging, and yield across years) of six soybean sprout cultivars, and crop growth predictions were performed using ALMANAC under climate change scenarios. Machine learning techniques are not only used alone, but are also integrated with existing yield prediction modeling techniques.

In addition, other machine learning techniques, such as support vector regression (SVR), decision tree (DT), ridge regression (ridge), and lasso regression (lasso), are used for yield estimation. Shaflee et al. [13] developed spring wheat yield prediction models via SVR and LAR from unmanned aerial vehicle (UAV) images. Cedric et al. [14] utilized DT for the yield estimation of rice, maize, cassava, seed cotton, yams, and bananas in West African countries (i.e., Burkina Faso, Gambia, Ghana, Guinea, Mali, Mauritania, Niger, Senegal, and Togo). Kim et al. [15] estimated the yield of corn, sesame, soybean, mung bean, and red bean via ridge regression. The advantage of machine learning modeling techniques is that they have high prediction accuracies when sufficient data are obtained, regardless of the complex relationship between independent and dependent variables [16].

Unlike other major summer crops such as corn and rice, the simulation of onion (Allium cepa) yields has not been well studied. In Asia, onion is planted in late fall and is harvested in the spring of the next year. To prevent flowering in the first year, the bulb must undergo a vernalization period in which the vegetative apical meristem is converted to a reproductive meristem [17]. Premature flowering during plant growth, also called bolting, negatively affects the economic yields of onion. The occurrence of bolting is significantly affected by temperature [18]. In addition to temperature, soil nutrient status (its nitrogen and phosphorus contents) also significantly affects the incidence of bolting [19,20,21]. These factors, including weather and cropping management (e.g., fertilizer use), induce both uncertainty and seasonality in production, resulting in high fluctuations in the onion price. To stabilize the price and production, accurate forecasting of production yield and price is critical to provide useful information to farmers and policymakers who can undertake efficient supply chain management of the onion market.

This study aims at developing a yield prediction model and an optimization model to stably produce economically viable onions. In particular, a yield prediction model based on nitrogen fertilizer use will be developed through an ensemble learning technique involving linear regression (LR), polynomial regression (PR), support vector regression (SVR), decision tree (DT), ridge regression (ridge), and lasso regression (lasso), and a non-linear crop profit optimization model will be developed using the developed yield estimation model with the highest prediction accuracy. Onions were selected because they are one of the major field vegetables in South Korea, with 1,172,848 tons produced in 2023 [22], and are very important crops for improving farm income and national food security. Moreover, this study discovers the best machine learning technique for predicting the yield of onions among six techniques and, through the proposed methodology, it is possible to propose the optimal amount of nitrogen fertilizer to maximize farmers’ profits.

The rest of the paper is organized as follows. Section 2 illustrates the field experiment data from five different studies and explains the ensemble learning module for yield estimation and the optimization module for economic production. Section 3 shows the economic feasibility comparison results by different levels of fertilizer use. Section 4 and Section 5 will discuss the study results and conclude the findings.

2. Materials and Methods

As mentioned in Section 1, the goal of this study is to propose an optimization model that maximizes onion production profits and uses appropriate quantities of nitrogen fertilizer to enable continuous onion production. To this end, Figure 1 illustrates the overall framework of the proposed models.

The proposed framework consists of a total of 10 processes under two modules, the Ensemble module and the Optimization module. In the data collection and data labeling phase, onion growth data are collected according to the designed experimental conditions, and data labeling is performed with yield as the dependent variable. The processed dataset is divided into a training dataset and a test dataset for later evaluation through cross validation of the developed model. Correlation analysis between independent and dependent variables is first performed through the training dataset, and ensemble learning is performed when a statistically significant correlation is found. If the relationship between the independent and dependent variables is not statistically significant, it is necessary to collect additional data again. Ensemble learning basically utilizes supervised learning algorithms where target values are given, and polynomial regression, support vector regression, decision tree, ridge regression, lasso regression, and linear regression are considered in this study (see Section 2.1 for more detail). However, new supervised learning algorithms may be added in the future as needed. The developed machine learning models are evaluated through a test dataset, and the coefficient of determination (R²), which is widely used to indicate prediction accuracy in statistics, is used.
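The evaluation steps described above (data split, correlation check, and R² scoring) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the fixed random seed, and the use of a t-statistic as the significance check are assumptions made for illustration, since the text does not specify how significance is tested.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed: an illustrative choice
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t-statistic for H0: no correlation (assumed significance check), |r| < 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In this sketch, the t-statistic would be compared with a critical value of the t distribution with n − 2 degrees of freedom before any model is fitted, and R² would be computed on the test subset only.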

In addition to developing an onion yield model, an optimization model scenario for economic analysis is determined, and related data and parameters are also collected. The optimization model is defined by setting nitrogen fertilizer usage as a decision variable with the goal of maximizing onion sales profits, and detailed explanations related to the optimization model can be found in Section 2.3. Once the optimal solution is obtained through the optimization model, it is analyzed and used in actual agricultural operation plans.

2.1. Data Collection

This study utilizes datasets from five different studies on the yield of onions under different nitrogen fertilizer application rates (see Table 1).

In Study A, Jilani et al. [23] conducted a field study to investigate the yields of Faisalabad Early onions under a randomized complete block design (RCBD) with three replications using a split-plot arrangement. The plot size was 2.60 m² for all treatments. Seeds were planted at 10 cm intervals and irrigated using sprinklers. The first irrigation was performed at sowing, and irrigation was performed at 15-day intervals thereafter. Weeding was carried out manually, and seed germination began a week after sowing. Transplanting was carried out with a distance of 30 cm between rows and a distance of 10 cm between plants. All necessary cultivation practices, including irrigation, weed control, and pest and disease control, were kept constant in all experimental plots. A total of 1944 samples were collected to enable the observation of yield changes according to nitrogen treatment.

Study B was conducted by Halvorson et al. [24] to determine the yield of Ranchero onions, which were sown on 8 March in 2005 and 2006 at a seeding rate of approximately 320,000 seeds/ha at a testbed in the Arkansas Valley, U.S. Two rows of onions were planted 25 cm apart in the center of a 76 cm wide seedbed. Onions were harvested on 30 August in both years for fresh weight yield. The experimental design was a split-plot RCBD with four replicates. The plot size was 7.6 m × 15.2 m in 2005 and 9.1 m × 15.2 m in 2006. Furrow irrigation with a diameter of 3.2 cm and siphon tube irrigation were conducted in the study. A total of 73,033 samples were collected.

Study C is an experiment conducted by Gonçalves et al. [25] from June to October 2016 in the rural area of Mossoró municipality (UFESA located at 5° 3′ 37″ S, 37° 23′ 50″ W). The RCBD experiment was carried out on the cultivar Rio das Antas at different nitrogen (N) doses, with four replicates. The size of the plot was 1.0 m × 3.0 m, and 8 rows of plants were planted in each plot at intervals of 0.10 m × 0.06 m. Irrigation was applied using a drip system with four tubes per bed spaced 0.20 m apart and drippers spaced 0.30 m apart with an average flow rate of 1.5 L/ha. Irrigation was performed daily, and a total depth of 888.3 mm/ha was applied based on crop evapotranspiration [26]. A total of 5040 samples were collected.

Study D was conducted by Tekeste et al. [27] to find the influence of nitrogen on the total bulb yield of onion. The experiment was conducted at the Melkassa Agricultural Research Centre, which is located in the Central Rift Valley Region of Ethiopia (8° 24′ N, 39° 21′ E). An RCBD was adopted with three replications. The plot size was 3.6 m × 2.0 m and 80 plants were planted. Transplanting was carried out in January 2011, 50 days after sowing, and the crop was harvested in April 2011. A total of 1200 samples were collected.

Study E is an experiment conducted in South Korea by Lee and Min [28]. From 2014 to 2015, the study was conducted in an open field located in Changnyeong, Gyeongsangnam-do (35° 51′ 49″ N, 128° 45′ 13″ E). In each bed, six rows of transplants were planted with 21.7 cm between rows and an in-row spacing of 12 cm. Each plot area was 16.0 m², and plant density was 31.3 plants per m². An RCBD was adopted with three replications. Seeding was conducted on 6 September 2014, and the crops were transplanted on 3 November 2014 and harvested on 1 June 2015. A total of 7512 samples were collected.

Table 2 describes the climate during the experimental period of each experimental region [28,29,30]. The highest temperatures ranged from 17.5 °C to 34.8 °C, the lowest temperatures ranged from 4.9 °C to 23.9 °C, the average temperatures ranged from 10.6 °C to 28.5 °C, the humidity ranged from 53% to 66.2%, and the precipitation ranged from 20.4 mm/month to 71 mm/month. The climates of Study B and Study E are similar, and Study A and Study C have similar climates. The climate of Study D is intermediate among the five datasets.

The five datasets collected from existing research were used to develop a yield estimation model through the ensemble learning module involving the six supervised learning algorithms (i.e., linear regression, polynomial regression, support vector regression, decision tree, ridge regression, and lasso regression) described in Section 2.2.

2.2. Ensemble Learning Module for Development of a Yield Estimation Model

Ensemble learning is the process of combining multiple models that are made to solve a particular computational intelligence problem [31]. Although a linear regression model (or multivariate regression) that assumes a linear relationship between independent and dependent variables is widely utilized [32], when considering variables that may have non-linear or hierarchical relationships, finding the most appropriate machine learning model from among various models ultimately helps provide high prediction accuracy [33]. Machine learning is largely divided into supervised learning, which utilizes labeled data in a situation where the output values of input data are known, and unsupervised learning, which is performed in a situation where the output values of input data are unknown [34]. In supervised learning, although new modeling techniques are continuously being developed, the most popular models include regression-based models (polynomial regression, least absolute shrinkage and selection operator (lasso) regression, ridge regression, and random forest regression) that improve upon existing linear regression models [35] and neural network-based deep learning models (recurrent neural network (RNN) and convolutional neural network (CNN)) [36]. Considering that deep learning models such as CNN and RNN or multi-layer neural network models are mainly used for image data processing, regression-based models still play an important role in predicting the output values of input data [37]. Unsupervised learning is widely used to classify given data when no information about the output values of the input data is available, and representative models include k-means clustering, principal component analysis (PCA), and latent Dirichlet allocation (LDA) [38,39,40].

Despite this distinction between supervised and unsupervised learning, support vector machines and decision trees, which have been widely used in data classification, are now also used for regression in supervised learning in the form of support vector regressors [41] and decision tree regressors [42] through continuous model development. In this study, six supervised learning models, namely linear regression, ridge regression, lasso regression, support vector regressor, decision tree regressor, and polynomial regression, are used in the ensemble learning process.

2.2.1. Linear Regression

Linear regression assumes that the regression function is linear, or that a linear model is a reasonable approximation, and takes the form of Equation (1) [43].

f(X) = β_{0} + \sum_{j=1}^{p} X_{j} β_{j}

(1)

The parameters β_{0}, β_{1}, \dots, β_{p} are estimated from the data by minimizing the residual sum of squares (RSS) [44].
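For a single predictor (p = 1), minimizing the RSS in Equation (1) has a well-known closed-form solution. The following minimal sketch shows it; the function names and the sample values used in the usage note are hypothetical and are not taken from the study datasets.

```python
def fit_simple_ols(x, y):
    """Least squares estimates (beta_0, beta_1) for y = beta_0 + beta_1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_sum_of_squares(x, y, intercept, slope):
    """RSS of the fitted line on the given observations."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
```

For example, the hypothetical points x = [0, 1, 2, 3] and y = [1, 3, 5, 7] lie exactly on a line, so the fit recovers an intercept of 1 and a slope of 2 with an RSS of 0.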

2.2.2. Ridge and Lasso Regression

Ridge regression and least absolute shrinkage and selection operator (lasso) regression are shrinkage regression models that penalize the regression coefficients to handle multicollinearity [45]. Ridge regression shrinks the coefficients toward 0 but not exactly to 0, so it retains all predictors [45,46]. In contrast, lasso regression shrinks some coefficients exactly to 0 and retains only the most informative features [46]. Figure 2 shows the estimation for ridge and lasso regression. The green area shows the constraint region, and the blue ellipses are the contours of the least squares error function [43]. The shape of the constraint region differs between ridge and lasso regression. Note that β_{1} and β_{2} are the parameters of a regression model, and \hat{β} is the least squares estimate at the center of the contours.
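The different shrinkage behavior of the two penalties can be seen in the special case of a single standardized predictor (x'x = 1), where both estimators have closed forms. The sketch below assumes the objectives (1/2)·RSS + (λ/2)·β² for ridge and (1/2)·RSS + λ·|β| for lasso; this scaling of λ and the function names are illustrative assumptions, and general multi-predictor problems require an iterative solver instead.

```python
import math


def ridge_shrink(beta_ols, lam):
    """Ridge estimate for one standardized predictor: proportional shrinkage."""
    return beta_ols / (1.0 + lam)


def lasso_shrink(beta_ols, lam):
    """Lasso estimate for one standardized predictor: soft thresholding."""
    magnitude = max(abs(beta_ols) - lam, 0.0)
    return math.copysign(magnitude, beta_ols)
```

With λ = 0.5, a least squares coefficient of 0.3 becomes about 0.2 under ridge but exactly 0 under lasso, which is the variable selection effect described above.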

2.2.3. Support Vector Regressor

SVR finds a hyperplane that fits the given data together with the support vectors closest to it; the hyperplane that contains the largest number of points within its margin is taken as the best-fit line [47,48]. The difference between SVR and other regression models is that SVR finds the best-fit line within a threshold value, whereas other regression models try to minimize the error between the real value and the predicted value [48]. SVR finds the linear function of Equation (2) by optimizing Equations (3)–(6), as shown in Figure 3 [49]. In Equations (3)–(6), ε is the distance from the fitted regression function (a blue line) to an ε-insensitive tube boundary (a blue dotted line); ξ_{i} is the slack variable for the distance by which an observation lies above the upper ε-insensitive tube boundary; ξ_{i}^{*} is the slack variable for the distance by which an observation lies below the lower ε-insensitive tube boundary; w is the coefficient of the input variable x; b is the constant term in the regression model; C is a regularization parameter; and y_{i} is an observed response value (a black dot).

f(x) = \langle w, x \rangle + b \quad \text{with } w \in \mathcal{X},\ b \in \mathbb{R}

(2)

\text{Minimize} \quad \frac{1}{2} \|w\|^{2} + C \sum_{i=1}^{l} (ξ_{i} + ξ_{i}^{*})

(3)

Subject to

y_{i} - \langle w, x_{i} \rangle - b \leq ε + ξ_{i}

(4)

\langle w, x_{i} \rangle + b - y_{i} \leq ε + ξ_{i}^{*}

(5)

ξ_{i}, ξ_{i}^{*} \geq 0

(6)
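To make the role of the slack variables in Equations (3)–(6) concrete, the following sketch evaluates them for a given linear function with a single input variable. At the optimum, each slack variable takes the smallest non-negative value that satisfies its constraint, which is what the code computes. The function names and the numerical values in the usage note are hypothetical; the sketch evaluates the objective only and does not solve the optimization problem.

```python
def slack_variables(y, prediction, epsilon):
    """Smallest slacks (xi, xi_star) satisfying constraints (4)-(6)."""
    xi = max(0.0, y - prediction - epsilon)        # observation above the tube
    xi_star = max(0.0, prediction - y - epsilon)   # observation below the tube
    return xi, xi_star


def svr_objective(w, b, xs, ys, epsilon, C):
    """Value of objective (3) for f(x) = w * x + b with one input variable."""
    total_slack = 0.0
    for x, y in zip(xs, ys):
        xi, xi_star = slack_variables(y, w * x + b, epsilon)
        total_slack += xi + xi_star
    return 0.5 * w * w + C * total_slack
```

For instance, with w = 2, b = 0, ε = 1, and C = 10, the hypothetical observations (1, 2.5) and (2, 6) give residuals of 0.5 and 2. Only the second lies outside the tube, by 1, so the objective is 0.5 · 4 + 10 · 1 = 12.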

2.2.4. Decision Tree Regressor

A decision tree regressor is a tree-structured regressor that represents all inputs and outputs in a form similar to a flowchart [50]. Figure 4 shows the structure of the decision tree regressor. A decision tree regressor is composed of a root node (a green box at the top), branches (arrows), and leaf nodes (child green boxes). The branches leading to the leaf nodes carry the attribute values that are evaluated [51]. Because data can be used without normalization or scaling, the model is easily interpretable [51]. However, a decision tree has a risk of overfitting to the data [50].
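The basic building block of a decision tree regressor is a single split on one input variable. A minimal sketch follows, assuming the common criterion of minimizing the sum of squared errors (SSE) of the two resulting leaves, each of which predicts the mean of its observations. The function names and the data in the usage note are hypothetical.

```python
def sum_squared_error(values):
    """SSE of a leaf that predicts the mean of its values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, sse, left_mean, right_mean) of the best single split."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [value for _, value in pairs[:i]]
        right = [value for _, value in pairs[i:]]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best is None or cost < best[1]:
            best = (threshold, cost, sum(left) / len(left), sum(right) / len(right))
    return best
```

With hypothetical nitrogen rates [0, 50, 100, 150, 200] and yields [10, 12, 30, 31, 32], the best split falls at 75, separating a low-yield leaf (mean 11) from a high-yield leaf (mean 31). Applying the split recursively to each leaf grows the tree, and stopping too late produces the overfitting noted above.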

2.2.5. Polynomial Regression

A polynomial regression (PR) model finds the relationship between the independent variable and dependent variable with polynomial functions. PR uses any kind of polynomial functions; it has an advantage of finding the non-linear relationship [52]. A PR model is represented as Equation (7).

Y = g(X_{1}, \dots, X_{n}) = β_{0} + f_{1}(X_{1}) + \dots + f_{n}(X_{n}) + ε, \quad ε \sim N(0, \sum_{j=1}^{n} σ_{j}^{2})

(7)

The advantage of the PR model is that it shows good performance on various datasets. As data are not always linear, LR shows low performance on non-linear data because a straight line is not the best-fit line [48]. In a PR model, it is not necessary to have a linear relationship between the independent and dependent variables.
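For a single independent variable and a second-degree polynomial, a PR model can be fitted by ordinary least squares on the terms 1, N, and N². The sketch below solves the corresponding normal equations with the standard library only. The internal rescaling of the input, the function names, and the data in the usage note are illustrative assumptions and not the implementation used in this study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x


def fit_quadratic(n_rates, yields):
    """Least squares (a0, a1, a2) for yield = a0 + a1 * N + a2 * N**2."""
    scale = max(abs(v) for v in n_rates) or 1.0  # rescale N for numerical stability
    rows = [[1.0, v / scale, (v / scale) ** 2] for v in n_rates]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    c0, c1, c2 = solve_linear_system(xtx, xty)
    return c0, c1 / scale, c2 / scale ** 2  # coefficients in the original units
```

As a check, hypothetical yields generated from 3 + 0.2 N − 0.0005 N² at N = 0, 50, ..., 300 are recovered by the fit up to rounding error.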

2.3. Optimization Module for Economic Production of Onions

Using polynomial regression, which showed the highest prediction accuracy (see Section 3.1), an optimization model was created to find the appropriate amount of nitrogen fertilizer to maximize onion sales profits (see Equations (8)–(10)). Here, N_{t} is the nitrogen fertilizer usage in period t, R(N_{t}) and C(N_{t}) are the corresponding revenue and total production cost, P_{\text{sales}, t} is the sales price, and Y_{t}(N_{t}) is the predicted yield.

\max Z = \sum_{t \in T} [R(N_{t}) - C(N_{t})]

(8)

Subject to

R(N_{t}) = P_{\text{sales}, t} \times Y_{t}(N_{t}) \quad \text{for } \forall t \in T

(9)

C(N_{t}) = C_{\text{seed}} + C_{\text{fertilizer}}(N_{t}) + C_{\text{pesticide}} + C_{\text{depreciation}} + C_{\text{material}} + C_{\text{rent}} + C_{\text{labor}} + C_{\text{others}}

(10)

The complexity of the proposed optimization model may change depending on the given yield prediction model. However, considering the results in Section 3.1, if the yield prediction model is a polynomial regression in the form of a quadratic function, Equation (9) can be written as Equation (11).

R(N_{t}) = P_{\text{sales}, t} \times [a_{0} + a_{1} N_{t} + a_{2} N_{t}^{2}]

(11)

Therefore, Equation (8) can be re-expressed as Equation (12), and Equation (13) can be obtained by differentiating Equation (12) with respect to N_{t} to find the optimal solution of the quadratic function. In Equations (13) and (14), C_{\text{fertilizer}} denotes the cost per unit of nitrogen fertilizer, i.e., the derivative of C_{\text{fertilizer}}(N_{t}) with respect to N_{t}; the other cost terms in Equation (10) vanish because they do not depend on N_{t}.

Z = \sum_{t \in T} [P_{\text{sales}, t} \times [a_{0} + a_{1} N_{t} + a_{2} N_{t}^{2}] - C(N_{t})]

(12)

\frac{dZ}{dN_{t}} = \sum_{t \in T} [P_{\text{sales}, t} \times [a_{1} + 2 a_{2} N_{t}] - C_{\text{fertilizer}}]

(13)

Setting Equation (13) equal to 0 to find the optimal solution, the proposed optimization model obtains the amount of nitrogen fertilizer that maximizes profit through Equation (14).

2 a_{2} \sum_{t \in T} P_{\text{sales}, t} N_{t}^{*} = \sum_{t \in T} C_{\text{fertilizer}} - a_{1} \sum_{t \in T} P_{\text{sales}, t}

(14)

If a quadratic polynomial regression model is selected, the optimal solution can therefore be calculated quickly from Equation (14) in closed form.
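For a single period, Equation (14) reduces to N* = (c − a₁P) / (2a₂P), where P is the sales price and c is the cost per unit of nitrogen. The sketch below implements this closed form under the assumption that the fertilizer cost is linear in N (a constant unit cost c). All numerical values in the usage note are hypothetical and chosen only to make the arithmetic easy to follow; the price must be expressed per the same yield unit as the regression model.

```python
def optimal_nitrogen(price, unit_n_cost, a1, a2):
    """Profit-maximizing N for one period, from Equation (14)."""
    if a2 >= 0:
        raise ValueError("a2 must be negative for the profit to have a maximum")
    return (unit_n_cost - a1 * price) / (2.0 * a2 * price)


def profit(n, price, unit_n_cost, fixed_cost, a0, a1, a2):
    """Revenue minus cost, assuming a fertilizer cost of unit_n_cost * n."""
    revenue = price * (a0 + a1 * n + a2 * n * n)
    return revenue - (fixed_cost + unit_n_cost * n)
```

For example, with a hypothetical price of USD 500/ton, a unit nitrogen cost of USD 2/kg, a₁ = 0.2, and a₂ = −0.0005, the optimum is (2 − 100) / (−0.5) = 196 kg/ha. This is slightly below the yield-maximizing rate of −a₁/(2a₂) = 200 kg/ha, because the last units of fertilizer cost more than the extra yield they produce.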

3. Results

3.1. Yield Estimation Modeling

Based on the data in Table 1 and Table 2, ensemble learning is conducted. This study allocates 70% of the data for training and 30% for testing. Outlier detection is conducted via studentized deleted residuals (SDR): an observation is treated as an outlier and removed when its SDR value is three or greater [53]. Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9 represent the estimation performance results of the six supervised learning algorithms for five different datasets. In Figure 5, the black dots represent the observed yield of onions according to the use of nitrogen fertilizer, and the solid lines represent the yield estimated by the developed machine learning models. The Pearson correlation coefficient between harvest yield and nitrogen fertilizer use in Figure 5 is 0.9587, so we can conclude that there is a strong positive correlation between the two variables. The R² values of PR, LR, DT, SVR, ridge, and lasso are 0.92, 0.83, 0.93, 0.85, 0.83, and 0.82, respectively. PR and DT show relatively good prediction accuracy compared to the other four models. Looking at the observed field study data in Figure 5, as the amount of nitrogen used increases, the yield increases in the form of a concave function, so the rate of yield increase gradually decreases. For this reason, it is believed that PR and DT are suitable for representing the non-linear trend, resulting in a high prediction accuracy. Equation (15) represents the PR model, which can be expressed as a simple quadratic function.

Y = 3.7722 + 0.1676 N - 0.0004 N^{2}

(15)

where N is nitrogen usage (kg/ha), and Y represents onion yield (ton/ha).

In Figure 6, the R² values of PR, LR, DT, SVR, ridge, and lasso are 0.90, 0.77, 0.70, 0.89, 0.77, and 0.77, respectively. Similar to Figure 5, PR shows high prediction accuracy in predicting onion yield under various amounts of nitrogen fertilizer usage. Compared to Figure 5, the observed yield in Figure 6 has a smoother concave function shape, so the PR model is considered to have shown the highest prediction accuracy. However, the change in yield due to the use of nitrogen fertilizer was relatively small compared to Figure 5. Therefore, SVR, which predicts an almost constant value of about 90 ton/ha regardless of the amount of nitrogen fertilizer used, also shows high prediction accuracy. Given that the Pearson correlation coefficient between harvest yield and nitrogen fertilizer use in Figure 6 is 0.8784 (i.e., a strong positive correlation), it is possible that SVR modeling is inappropriate. This result also shows the importance of securing appropriate observation data when performing ensemble learning. Equation (16) represents the PR model of Figure 6.

Y = 81.464 + 0.1782 N - 0.0005 N^{2}

(16)

In Figure 7, the Pearson correlation coefficient between harvest yield and nitrogen fertilizer use is 0.8390 (i.e., a strong positive correlation). The R² values of PR, LR, DT, SVR, ridge, and lasso are 0.93, 0.77, 0.88, 0.89, 0.77, and 0.77, respectively. Similar to Figure 6, PR shows high prediction accuracy in predicting onion yield under various amounts of nitrogen fertilizer usage. The observed yield in Figure 7 also shows the shape of a concave function, so PR has the highest prediction accuracy. Equation (17) represents the PR model of Figure 7.

Y = 43.08 + 0.3686 N - 0.0011 N^{2}

(17)

In Figure 8, the R² values of PR, LR, DT, SVR, ridge, and lasso are 0.97, 0.92, 0.89, 0.95, 0.92, and 0.92, respectively. The Pearson correlation coefficient between harvest yield and nitrogen fertilizer use in Figure 8 is 0.9614 (i.e., a strong positive correlation). Equation (18) represents the PR model of Figure 8.

Y = 24.65 + 0.1744 N - 0.0005 N^{2}

(18)

In Figure 9, the R² values of PR, LR, DT, SVR, ridge, and lasso are 0.99, 0.47, 0.66, 0.85, 0.47, and 0.47, respectively. The Pearson correlation coefficient between harvest yield and nitrogen fertilizer use is 0.684 (i.e., a strong positive correlation). Equation (19) represents the PR model of Figure 9.

Y = 32.92 + 0.1148 N - 0.0003 N^{2}

(19)
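Each PR model in Equations (15)–(19) has a negative second-order coefficient, so the predicted yield peaks at N = −a₁/(2a₂). The following sketch computes this yield-maximizing rate for the five models. The coefficients are copied from the equations above; because they are rounded, the results should be read as approximate. The dictionary and function names are illustrative.

```python
# (a0, a1, a2) for Y = a0 + a1 * N + a2 * N**2, N in kg/ha and Y in ton/ha
PR_MODELS = {
    "Eq. (15)": (3.7722, 0.1676, -0.0004),
    "Eq. (16)": (81.464, 0.1782, -0.0005),
    "Eq. (17)": (43.08, 0.3686, -0.0011),
    "Eq. (18)": (24.65, 0.1744, -0.0005),
    "Eq. (19)": (32.92, 0.1148, -0.0003),
}


def yield_maximizing_rate(a1, a2):
    """Nitrogen rate at the vertex of a quadratic that opens downward."""
    if a2 >= 0:
        raise ValueError("a2 must be negative for the yield to have a peak")
    return -a1 / (2.0 * a2)


def predicted_yield(n, a0, a1, a2):
    """Yield predicted by the quadratic PR model at nitrogen rate n."""
    return a0 + a1 * n + a2 * n * n


for label, (a0, a1, a2) in PR_MODELS.items():
    n_peak = yield_maximizing_rate(a1, a2)
    y_peak = predicted_yield(n_peak, a0, a1, a2)
    print(f"{label}: peak at N = {n_peak:.1f} kg/ha, Y = {y_peak:.2f} ton/ha")
```

These rates maximize yield, not profit. Whenever fertilizer has a positive unit cost, the profit-maximizing rate given by Equation (14) lies below the yield-maximizing rate.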

Table 3 summarizes the estimation accuracy results in Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9. Considering the average prediction accuracy in terms of R², PR (0.95) appears to have the highest prediction accuracy, SVR and DT follow in that order, and three models (i.e., LR, ridge, and lasso) have an R² value of 0.74. Although all models show high yield prediction accuracy, the PR model was selected as the best model because the yield change trend according to nitrogen use has a concave function form.

To test model performance, the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) are measured. Figure 10, Figure 11 and Figure 12 show the performance of each learning model. As MAE, RMSE, and MAPE are calculated based on the difference between observed values and predicted values, the lower these metrics are, the better the model performs. On average, PR and DT showed the lowest values of the error-based metrics. However, as the DT model is highly overfitted to Study C, the PR model is considered to show the best general performance among the machine learning techniques.
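The three error metrics follow their standard definitions, which can be sketched as below. The function names and the values in the usage note are hypothetical; MAPE is undefined when an observed value is 0.

```python
import math


def mean_absolute_error(observed, predicted):
    """MAE: average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """RMSE: square root of the average squared difference."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)


def mean_absolute_percentage_error(observed, predicted):
    """MAPE in percent; every observed value must be non-zero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

For hypothetical observed yields [10, 20, 40] and predictions [12, 18, 40], the absolute errors are 2, 2, and 0, giving an MAE of 4/3 and a MAPE of (20% + 10% + 0%) / 3 = 10%.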

3.2. Onion Profit Maximization

Using the proposed optimization model in Section 2.3, the optimal amount of nitrogen fertilizer usage to maximize the total profit by year is obtained. For this purpose, the production cost and sales profit data of onions collected from 2010 to 2014 by the Rural Development Administration of South Korea [54] are used, and Table 4 shows the collected data.

The sale price of onions varies greatly from year to year depending on economic conditions and demand, but ranges from USD 0.26/kg to USD 0.76/kg. Due to these price changes and changes in onion production prices each year, differences in profits occur every year. On average, revenue is USD 23,628.57/ha, total production cost is USD 9511.43/ha, and profit is USD 14,117.14/ha. Considering the average production cost, labor costs account for 41.21%, seed costs account for 17.42%, and fertilizer costs account for 13.47%, confirming that these three factors are relatively important compared to other costs. In particular, while seed and labor costs, which are essential elements of production, are difficult to change, the use of fertilizers, which directly affects production efficiency, is a relatively easy decision for farmers to make. Moreover, as shown in Section 3.1, appropriate use of nitrogen fertilizer has a positive effect on onion yield and reduces the total production cost, making it an appropriate decision-making variable for maximizing profits from onion production.

As can be seen from the results in Table 5, the proposed optimization model in Section 2.3 obtains the optimal amount of nitrogen fertilizer usage to maximize profits and shows a relatively uniform fertilizer use of 1585.60 kg/ha to 1640.00 kg/ha as the optimal solution. In this case, the onion yield appears relatively stable, from 73,840 kg/ha to 73,940 kg/ha, but due to changes in sales price, revenues vary from −43.01% (USD 18,935.71/ha) to 69.31% (USD 56,250.00/ha) relative to the average annual revenue of USD 32,795.71/ha. This shows that it is important to keep the selling price of onions stable. In addition, considering the existing average profit of USD 14,117.14/ha (see Table 4), the onion profit obtained through optimal fertilizer use reaches USD 23,615.71/ha in Table 5, which is a 67.28% improvement over the existing average profit. This result is due to the efficient use of nitrogen fertilizer, which reduced the existing average total cost by 3.48% from USD 9511.43/ha to USD 9180.00/ha and increased average sales by 38.80% from USD 23,628.57/ha to USD 32,795.71/ha.
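The reported improvements in profit, cost, and sales follow directly from the averages quoted above, as the short calculation below shows. The per-hectare figures are taken from the text; the variable and function names are illustrative.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Average values in USD/ha quoted in the text (Table 4 baseline, Table 5 optimized)
baseline = {"revenue": 23628.57, "cost": 9511.43, "profit": 14117.14}
optimized = {"revenue": 32795.71, "cost": 9180.00, "profit": 23615.71}

for item in ("revenue", "cost", "profit"):
    change = percent_change(baseline[item], optimized[item])
    print(f"{item}: {change:+.2f}%")
```

In both cases the profit equals revenue minus total cost, for example 32,795.71 − 9180.00 = 23,615.71 USD/ha for the optimized scenario.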

4. Discussion

Onion is one of the major field vegetables in South Korea (the others being cabbage, radish, garlic, and dried pepper), so its efficient production has received nationwide attention for decades. As mentioned in Section 1, the Korean government announced a ‘mid- to long-term food security strengthening plan’ with the goal of securing an overall food self-sufficiency rate of 55.5% by 2027, and onion makes up a large portion of that plan. For this reason, there is a dedicated plan entitled ‘Supply and Demand Management Plan for Major Field Vegetables’. However, efficient onion production is difficult to achieve once economic conditions (e.g., market status) are taken into account. Whenever farmers and government agencies try to increase onion yield, additional inputs such as fertilizer, pesticide, and labor are required, which raises production costs and reduces profit. The proposed framework addresses this trade-off by seeking the maximum sales profit from onion production.

Ensemble learning involving six supervised learning algorithms was used to develop the yield estimation model, and polynomial regression (PR) performed best among them. Unlike machine learning algorithms that take a black-box approach, PR has the analytical advantage of expressing the influence of the independent variables on the dependent variable through their coefficients. Since the change in onion yield with nitrogen fertilizer use follows a quadratic pattern, PR showed the highest prediction accuracy, with an average R² of 0.95. Other yield prediction studies report similar findings. Sangeeta [55] used ML models to predict crop yield from soil and weather data. With the data split into 75% for training and 25% for testing, the accuracy was 88% for PR and 77% for the decision tree (DT) [55]. Unlike PR, DT is unable to reuse a variable that was used to split the data in the first step; therefore, if the first step of DT is overfitted, the results can be poor [55]. Kuradusenge et al. [48] also used ML models to predict the yield of Irish potatoes and maize from weather data, which are a non-controllable factor. The R² of the support vector regression (SVR) model was 0.560 for Irish potatoes and 0.549 for maize, whereas the R² of the PR model was 0.773 and 0.716, respectively [48]. This is because the PR model was better at expressing the influence of the independent variables.
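To illustrate why a quadratic term helps, the following sketch fits a straight line and a second-degree polynomial to the Study A observations in Table 1 and compares their R² values. It is a minimal in-sample illustration written in plain Python; the helper names (`solve`, `polyfit`, `r_squared`) are hypothetical, and whether the authors computed R² in-sample or under a validation scheme is an assumption not confirmed by this section.

```python
# Study A observations from Table 1: nitrogen rate (kg/ha) and yield (ton/ha).
n_rate = [0, 40, 80, 120, 160, 200]
yield_t = [5.0, 8.2, 12.4, 20.4, 19.4, 18.8]

# Rescale nitrogen to units of 100 kg/ha to keep the normal equations well conditioned.
x = [n / 100.0 for n in n_rate]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - s) / m[i][i]
    return sol


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] via the normal equations."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in xs) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(xs, ys)) for i in range(k)]
    return solve(a, b)


def r_squared(xs, ys, coef):
    """Coefficient of determination of the fitted polynomial on the same data."""
    mean_y = sum(ys) / len(ys)
    pred = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in xs]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(ys, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in ys)
    return 1.0 - ss_res / ss_tot


linear = polyfit(x, yield_t, 1)
quadratic = polyfit(x, yield_t, 2)
r2_lin = r_squared(x, yield_t, linear)
r2_quad = r_squared(x, yield_t, quadratic)
print(round(r2_lin, 2), round(r2_quad, 2))
```

For Study A the in-sample values come out at about 0.83 for the linear fit and about 0.92 for the quadratic fit, in line with the LR and PR entries of Table 3, and the fitted quadratic coefficient is negative, reflecting the plateau in yield at high nitrogen rates.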

The developed polynomial regression model was used to derive onion yields under various levels of nitrogen fertilizer use. Based on the predicted yield and the amount of nitrogen fertilizer used, the proposed optimization methodology calculated the cost, revenue, and profit under each condition and thereby derived the amount of nitrogen fertilizer that maximizes profit. Comparing these results with past onion production cost and profit data shows that the proposed framework can improve the average profit by 67.28%. This study thus confirms that appropriate use of nitrogen fertilizer offers an opportunity to increase the production of onions, a major field crop in South Korea, and that it can also increase farmers’ profits. Although future research should seek further increases in onion production while taking into account various factors, including climate change, the significance of this study lies in identifying opportunities for increased profits and cost reductions.
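The profit-maximization step can be sketched as follows, under the assumption that profit equals sale price times predicted yield, minus a per-kilogram nitrogen cost and a fixed remainder of production costs. The exact formulation of Section 2.3 may differ. With a quadratic yield curve Y(N) = b0 + b1·N + b2·N² and b2 < 0, setting the derivative of profit to zero gives N* = (c/p − b1) / (2·b2), where p is the sale price and c the nitrogen unit cost. The yield coefficients, nitrogen unit cost, and fixed cost below are hypothetical illustration values, not results from the study; only the sale price is the Table 4 average.

```python
def yield_kg(n, b0, b1, b2):
    """Quadratic yield response (kg/ha) to nitrogen rate n (kg/ha)."""
    return b0 + b1 * n + b2 * n * n


def profit(n, price, n_cost, fixed_cost, coef):
    """Assumed profit model: revenue minus nitrogen cost minus all other costs (USD/ha)."""
    return price * yield_kg(n, *coef) - n_cost * n - fixed_cost


def optimal_nitrogen(price, n_cost, coef, n_max):
    """Closed-form maximizer of the assumed profit model, clipped to [0, n_max]."""
    _, b1, b2 = coef
    if b2 >= 0:
        raise ValueError("closed form needs a concave yield curve (b2 < 0)")
    n_star = (n_cost / price - b1) / (2.0 * b2)
    return min(max(n_star, 0.0), n_max)


coef = (5000.0, 150.0, -0.4)  # hypothetical yield coefficients (b0, b1, b2)
price = 0.44                  # average sale price from Table 4 (USD/kg)
n_cost = 0.60                 # hypothetical nitrogen unit cost (USD/kg)
fixed_cost = 8000.0           # hypothetical non-nitrogen costs (USD/ha)

n_star = optimal_nitrogen(price, n_cost, coef, n_max=400.0)

# Cross-check the closed form against a brute-force grid search in 0.5 kg/ha steps.
grid = [i * 0.5 for i in range(801)]
grid_best = max(grid, key=lambda n: profit(n, price, n_cost, fixed_cost, coef))
print(round(n_star, 1), grid_best)
```

The grid search is included only as a sanity check. For a concave quadratic the closed form is exact, while a yield model without a closed-form optimum (for example, a decision tree) would need a numerical search of this kind.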

5. Conclusions

In this study, an ensemble learning-based framework for efficient nitrogen fertilizer use has been proposed. To maximize onion production profits through an appropriate amount of nitrogen fertilizer, the framework consists of two major modules: (1) an ensemble learning module that develops the best onion yield estimation model and (2) an optimization module that considers the onion market situation to maximize onion sales profits. The ensemble learning module utilized multiple supervised machine learning methods: linear regression, polynomial regression, support vector regression, decision tree, ridge regression, and lasso regression. On the field study datasets [23,24,25,27,28], polynomial regression showed the best estimation accuracy, with a coefficient of determination (R²) of 0.95 on average. The optimization module used the selected estimation model to estimate onion yields under different nitrogen fertilizer use cases and identified the nitrogen use that maximizes sales profits. As a result, the developed framework using polynomial regression increased profits from onion production by 67.28% compared to a historical case in South Korea. Farmers and policymakers can therefore use the proposed framework to determine how to increase their profits from onion production.

Although the proposed methodology can increase the profitability of onion production through economical nitrogen use, further research should be conducted in the future to develop a more reliable machine learning model. In particular, since the production of onions is affected by various independent variables such as climate, soil nutrients (nitrogen, phosphorus, potassium, etc.), and soil moisture, additional experiments should be conducted for each independent variable. In addition, since crops are harvested only once during the growing period, data collection is more difficult than in other fields (manufacturing processes, self-driving cars, etc.), and additional research should be conducted so that various experiments can be performed over multiple years to develop a reliable machine learning model.

Author Contributions

Conceptualization, S.K. (Sojung Kim); methodology, S.K. (Sojung Kim) and Y.K.; software, S.K. (Sojung Kim) and Y.K.; validation, S.K. (Sojung Kim) and S.K. (Sumin Kim); resources, S.K. (Sumin Kim); writing—original draft preparation, S.K. (Sojung Kim), Y.K. and S.K. (Sumin Kim); writing—review and editing, S.K. (Sojung Kim) and S.K. (Sumin Kim); visualization, S.K. (Sojung Kim) and S.K. (Sumin Kim); project administration, S.K. (Sumin Kim); funding acquisition, S.K. (Sojung Kim) and S.K. (Sumin Kim). All authors have read and agreed to the published version of the manuscript.

Funding

This work was carried out with the support of the “Cooperative Research Program for Agriculture Science and Technology Development (Project No. RS-2024-00394437)”, Rural Development Administration, Republic of Korea.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors gratefully acknowledge the support of the Rural Development Administration, Republic of Korea. The views expressed in this paper are solely those of the authors and do not represent the opinions of the funding agency.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Schlenker, W.; Hanemann, W.M.; Fisher, A.C. The impact of global warming on US agriculture: An econometric analysis of optimal growing conditions. Rev. Econ. Stat. 2006, 88, 113–125.
2. Shah, F.; Wu, W. Soil and crop management strategies to ensure higher crop productivity within sustainable environments. Sustainability 2019, 11, 1485.
3. Kang, Y.; Khan, S.; Ma, X. Climate change impacts on crop yield, crop water productivity and food security–A review. Prog. Nat. Sci. 2009, 19, 1665–1674.
4. Ziervogel, G.; Ericksen, P.J. Adapting to climate change to sustain food security. Wiley Interdiscip. Rev. Clim. Change 2010, 1, 525–540.
5. Kim, Y.; On, Y.; So, J.; Kim, S.; Kim, S. A Decision support software application for the design of agrophotovoltaic systems in Republic of Korea. Sustainability 2023, 15, 8830.
6. Rural Development Administration. The 3rd Basic Plan for Rural Development Projects. Available online: https://www.rda.go.kr/org/pln/pln_defaultQuery.do?mode=html&prgId=pln_defaultQuery (accessed on 2 July 2024).
7. 2050 Carbon Neutral Green Growth Committee. Sustainable Agriculture Strategy Forum for Climate Crisis Adaptation. Available online: https://www.2050cnc.go.kr/storage/board/base/2023/11/10/BOARD_ATTACH_1699599840812.pdf (accessed on 1 September 2024).
8. Rural Development Administration. Onion Seeks Response to Climate Change. Available online: https://www.nics.go.kr/bbs/file/dwld.do?fileSn=6016&bbsId=news&m=100000020 (accessed on 1 September 2024).
9. Chomba, S.K.; Okalebo, J.R.; Imo, M.; Mutuo, P.K.; Walsh, M. Predicting maize response to fertilizer application using growth curves in western Kenya. Glob. J. Arable Crop Prod. 2013, 1, 78–85.
10. Shastry, A.; Sanjay, H.; Bhanusree, E. Prediction of crop yield using regression techniques. Int. J. Soft Comput. 2017, 12, 96–102.
11. Kim, S.; Kim, S.; Green, C.H.; Jeong, J. Multivariate polynomial regression modeling of total dissolved-solids in rangeland stormwater runoff in the Colorado River Basin. Environ. Model. Softw. 2022, 157, 105523.
12. Yoon, C.Y.; Kim, S.; Cho, J.; Kim, S. Modeling the impacts of climate change on yields of various Korean soybean sprout cultivars. Agronomy 2021, 11, 1590.
13. Shafiee, S.; Lied, L.M.; Burud, I.; Dieseth, J.A.; Alsheikh, M.; Lillemo, M. Sequential forward selection and support vector regression in comparison to LASSO regression for spring wheat yield prediction based on UAV imagery. Comput. Electron. Agric. 2021, 183, 106036.
14. Cedric, L.S.; Adoni, W.Y.H.; Aworka, R.; Zoueu, J.T.; Mutombo, F.K.; Krichen, M.; Kimpolo, C.L.M. Crops yield prediction based on machine learning models: Case of West African countries. Smart Agric. Technol. 2022, 2, 100049.
15. Kim, S.; Kim, S.; Yoon, C.-Y. An efficient structure of an agrophotovoltaic system in a temperate climate region. Agronomy 2021, 11, 1584.
16. Lischeid, G.; Webber, H.; Sommer, M.; Nendel, C.; Ewert, F. Machine learning in crop yield modelling: A powerful tool, but no surrogate for science. Agric. For. Meteorol. 2022, 312, 108698.
17. Jones, H.; Mann, L. Onions and Their Allies; London Leonard Hill: London, UK; Limited Interscience Publishers Inc.: New York, NY, USA, 1963; Volume 284.
18. Khokhar, K.M.; Hadley, P.; Pearson, S. Effect of photoperiod and temperature on inflorescence appearance and subsequent development towards flowering in onion raised from sets. Sci. Hortic. 2007, 112, 9–15.
19. Brewster, J. Effects of photoperiod, nitrogen nutrition and temperature on inflorescence initiation and development in onion (Allium cepa L.). Ann. Bot. 1983, 51, 429–440.
20. Paterson, D.; Blackhurst, H.; Siddiqui, S. Some effects of nitrogen and phosphoric acid on premature seedstalk development, yield and composition of three onion varieties. Proc. Am. Soc. Hortic. Sci. 1960, 76, 460–467.
21. Stuart, N.; Griffin, D. The influence of nitrogen nutrition on onion seed production in the greenhouse. Proc. Am. Soc. Hortic. Sci. 1946, 48, 398–402.
22. Korean Statistical Information Service. Vegetable Production (Spice & Culinary Vegetables). Available online: https://kosis.kr/statHtml/statHtml.do?orgId=101&tblId=DT_1ET0291&conn_path=I2&language=en (accessed on 10 July 2024).
23. Jilani, M.S.; Ghaffoor, A.; Waseem, K.; Farooqi, J.I. Effect of different levels of nitrogen on growth and yield of three onion varieties. Int. J. Agric. Biol. 2004, 6, 507–510.
24. Halvorson, A.D.; Bartolo, M.E.; Reule, C.A.; Berrada, A. Nitrogen effects on onion yield under drip and furrow irrigation. Agron. J. 2008, 100, 1062–1069.
25. Gonçalves, F.D.C.; Grangeiro, L.C.; Sousa, V.D.F.D.; Santos, J.P.D.; Souza, F.I.D.; Silva, L.R.D. Yield and quality of densely cultivated onion cultivars as function of nitrogen fertilization. Rev. Bras. Eng. Agrícola Ambient. 2019, 23, 847–851.
26. Allen, R.G.; Pereira, L.S.; Raes, D.; Smith, M. Crop Evapotranspiration-Guidelines for Computing Crop Water Requirements-FAO Irrigation and Drainage Paper 56; FAO: Rome, Italy, 1998; Volume 300, p. D05109.
27. Tekeste, N.; Dechassa, N.; Woldetsadik, K.; Dessalegne, L.; Takele, A. Influence of nitrogen and phosphorus application on bulb yield and yield components of onion (Allium cepa L.). Open Agric. J. 2018, 12, 194–206.
28. Lee, J.; Min, B. Evaluation of controlled release fertilizer on bulb yield, nutrient content, and storage quality of overwintering intermediate-day onions. Korean J. Soil Sci. Fertil. 2022, 55, 324–342.
29. Visual Crossing Corporation. Historical Weather Data. Available online: https://www.visualcrossing.com/weather-data (accessed on 1 September 2024).
30. World Bank Group. Historical Weather Data. Available online: https://climateknowledgeportal.worldbank.org/country/pakistan/climate-data-historical (accessed on 1 September 2024).
31. Polikar, R. Ensemble learning. In Ensemble Machine Learning: Methods and Applications; Springer: New York, NY, USA, 2012; pp. 1–34.
32. Gu, Z.; Qi, Z.; Burghate, R.; Yuan, S.; Jiao, X.; Xu, J. Irrigation scheduling approaches and applications: A review. J. Irrig. Drain. Eng. 2020, 146, 04020007.
33. Sagi, O.; Rokach, L. Ensemble learning: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1249.
34. Alpaydin, E. Introduction to Machine Learning; MIT Press: Cambridge, MA, USA, 2020.
35. Maulud, D.; Abdulazeez, A.M. A review on linear regression comprehensive in machine learning. J. Appl. Sci. Technol. Trends 2020, 1, 140–147.
36. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
37. Kim, S.; Seo, J.; Kim, S. Machine Learning Technologies in the Supply Chain Management Research of Biodiesel: A Review. Energies 2024, 17, 1316.
38. Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Society. Ser. C (Appl. Stat.) 1979, 28, 100–108.
39. Abdi, H.; Williams, L.J. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 433–459.
40. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
41. Awad, M.; Khanna, R. Support vector regression. In Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Apress: Berkeley, CA, USA, 2015; pp. 67–80.
42. Tso, G.K.; Yau, K.K. Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy 2007, 32, 1761–1768.
43. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009; Volume 2.
44. James, G.; Witten, D.; Hastie, T.; Tibshirani, R.; Taylor, J. Linear regression. In An Introduction to Statistical Learning: With Applications in Python; Springer: Berlin/Heidelberg, Germany, 2023; pp. 69–134.
45. Sridhara, S.; Ramesh, N.; Gopakkali, P.; Das, B.; Venkatappa, S.D.; Sanjivaiah, S.H.; Kumar Singh, K.; Singh, P.; El-Ansary, D.O.; Mahmoud, E.A. Weather-based neural network, stepwise linear and sparse regression approach for rabi sorghum yield forecasting of Karnataka, India. Agronomy 2020, 10, 1645.
46. Enwere, K.; Nduka, E.; Ogoke, U. Comparative Analysis of Ridge, Bridge and Lasso Regression Models In the Presence of Multicollinearity. IPS Intelligentsia Multidiscip. J. 2023, 3, 1–8.
47. Paidipati, K.K.; Chesneau, C.; Nayana, B.; Kumar, K.R.; Polisetty, K.; Kurangi, C. Prediction of rice cultivation in India—Support vector regression approach with various kernels for non-linear patterns. AgriEngineering 2021, 3, 182–198.
48. Kuradusenge, M.; Hitimana, E.; Hanyurwimfura, D.; Rukundo, P.; Mtonga, K.; Mukasine, A.; Uwitonze, C.; Ngabonziza, J.; Uwamahoro, A. Crop yield prediction using machine learning models: Case of Irish potato and maize. Agriculture 2023, 13, 225.
49. Basak, D.; Pal, S.; Patranabis, D.C. Support vector regression. Neural Inf. Process. Lett. Rev. 2007, 11, 203–224.
50. Panigrahi, B.; Kathala, K.C.R.; Sujatha, M. A machine learning-based comparative approach to predict the crop yield using supervised learning with regression models. Procedia Comput. Sci. 2023, 218, 2684–2693.
51. Iniyan, S.; Varma, V.A.; Naidu, C.T. Crop yield prediction using machine learning techniques. Adv. Eng. Softw. 2023, 175, 103326.
52. Kim, S.; Kim, S. Performance estimation modeling via machine learning of an agrophotovoltaic system in South Korea. Energies 2021, 14, 6724.
53. The Pennsylvania State University. Deleted Residuals. Available online: https://online.stat.psu.edu/stat501/lesson/11/11.4 (accessed on 2 September 2024).
54. Choi, C.G. Agricultural Management Guide 1: Onion Management (No.11-1390000-003926-10). Rural Development Administration of South Korea: Jeonju, Republic of Korea, 2015.
55. Sangeeta, S.G. Design and implementation of crop yield prediction model in agriculture. Int. J. Sci. Technol. Res. 2020, 8, 544–549.

Figure 1. Proposed optimization framework for efficient nitrogen fertilizer use with ensemble learning.

Figure 2. Estimation for the ridge (left) and lasso regression (right) (edited from [43]).

Figure 3. Parameters of support vector regressor.

Figure 4. Structure of the decision tree regressor.

Figure 5. Estimation model performance comparison using the dataset of Study A.

Figure 6. Estimation model performance comparison using the dataset of Study B.

Figure 7. Estimation model performance comparison using the dataset of Study C.

Figure 8. Estimation model performance comparison using the dataset of Study D.

Figure 9. Estimation model performance comparison using the dataset of Study E.

Figure 10. Mean absolute error of each machine learning technique.

Figure 11. Root mean square error of each machine learning technique.

Figure 12. Mean absolute percentage error of each machine learning technique.

Table 1. Field experiment data summary from five studies.

Study A		Study B		Study C		Study D		Study E
N (kg/ha)	Yield (ton/ha)	N (kg/ha)	Yield (ton/ha)	N (kg/ha)	Yield (ton/ha)	N (kg/ha)	Yield (ton/ha)	N (kg/ha)	Yield (ton/ha)
0	5.0	0	83	0	42.66	0	25.3	0	32.78
40	8.2	45	85	45	56.66	34.5	28.64	48	38.07
80	12.4	90	95	90	70.65	69	34.35	96	40.89
120	20.4	135	99	135	71.21	103.5	37.94	288	40.44
160	19.4	180	97	180	71.21	138	37.82
200	18.8	225	98	225	71.21

Table 2. Meteorological data from five studies during the experiment period.

Category	Year	Region	Maximum Temperature (°C)	Minimum Temperature (°C)	Average Temperature (°C)	Relative Humidity (%)	Precipitation (mm/day)
Study A	2003	Khyber Pakhtunkhwa, Pakistan	30.1	20.9	25.6	60.5	0.98
Study B	2005	Colorado, USA	21.0	5.2	12.7	53.0	0.68
Study C	2016	Rio Grande do Norte, Brazil	34.8	23.9	28.5	66.2	1.54
Study D	2011	Oromiya, Ethiopia	25.3	10.1	16.6	61.4	1.43
Study E	2014–2015	Gyeongsangnam-do, South Korea	17.5	4.9	10.6	63.2	2.37
Average			25.74	13.00	18.8	60.86	1.40
Standard deviation			6.92	8.89	7.90	4.90	0.64

Table 3. R² values of ensemble learning models.

Category	PR	LR	DT	SVR	Ridge	Lasso
Study A	0.92	0.83	0.93	0.85	0.83	0.83
Study B	0.90	0.77	0.70	0.89	0.77	0.77
Study C	0.97	0.70	0.99	0.83	0.70	0.70
Study D	0.97	0.92	0.89	0.95	0.92	0.92
Study E	0.99	0.47	0.66	0.85	0.47	0.47
Average	0.95	0.74	0.83	0.87	0.74	0.74

Table 4. Cost and price data of onions in South Korea.

Category	2010	2011	2012	2013	2014	Average
Sales price (USD/kg)	0.57	0.30	0.26	0.76	0.33	0.44
Revenue (USD/ha)	28,192.86	16,521.43	14,800.00	37,957.14	20,671.43	23,628.57
Profit (USD/ha)	19,521.43	7185.71	5200.00	28,078.57	10,600.00	14,117.14
Total production cost (USD/ha)	8671.43	9335.71	9600.00	9878.57	10,071.43	9511.43
Seed (USD/ha)	1457.14	1707.14	1692.86	1750.00	1678.57	1657.14
Fertilizer (USD/ha)	1135.71	928.57	1021.43	1500.00	1821.43	1281.43
Pesticide (USD/ha)	457.14	535.71	485.71	528.57	492.86	500.00
Labor (USD/ha)	3428.57	3864.29	4171.43	3764.29	4371.43	3920.00
Land rent (USD/ha)	764.29	728.57	692.86	828.57	35.71	610.00
Depreciation (USD/ha)	514.29	528.57	392.86	478.57	371.43	457.14
Material cost ^a (USD/ha)	750.00	914.29	985.71	857.14	535.71	808.57
Other costs ^b (USD/ha)	164.29	128.57	157.14	171.43	764.29	277.14

^a Material cost includes the cost of consumable materials such as the seedbed and mulching tools required for onion growth; ^b other costs include electricity bills, taxes, and other management fees.

Table 5. Economic feasibility analysis by nitrogen fertilizer use.

Category	2010	2011	2012	2013	2014	Average
Nitrogen fertilizer use (kg/ha)	1628.20	1585.60	1570.40	1640.00	1594.10	1603.66
Nitrogen fertilizer cost (USD/ha)	964.29	942.86	928.57	971.43	942.86	950.00
Yield (kg/ha)	73,930	73,870	73,840	73,940	73,890	73,894
Revenue (USD/ha)	42,142.86	22,164.29	18,935.71	56,250.00	24,485.71	32,795.71
Total production cost (USD/ha) ^a	8500.00	9350.00	9507.14	9350.00	9192.86	9180.00
Profit (USD/ha)	33,642.86	12,814.29	9428.57	46,900.00	15,292.86	23,615.71

^a Total production cost includes nitrogen fertilizer cost and all costs described in Table 4.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kim, Y.; Kim, S.; Kim, S. Onion (Allium cepa) Profit Maximization via Ensemble Learning-Based Framework for Efficient Nitrogen Fertilizer Use. Agronomy 2024, 14, 2130. https://doi.org/10.3390/agronomy14092130

