Article

A Smart Farm DNN Survival Model Considering Tomato Farm Effect

1 Department of Statistics, Pukyong National University, Busan 48513, Republic of Korea
2 Institute of Technology, Jinong Inc., Anyang 14067, Republic of Korea
3 Department of Mathematics/Statistics, Chonnam National University, Gwangju 61186, Republic of Korea
* Author to whom correspondence should be addressed.
Agriculture 2023, 13(9), 1782; https://doi.org/10.3390/agriculture13091782
Submission received: 28 July 2023 / Revised: 5 September 2023 / Accepted: 7 September 2023 / Published: 8 September 2023

Abstract

Recently, smart farming research based on artificial intelligence (AI) has been widely applied in the field of agriculture to improve crop cultivation and management. Predicting the harvest time (time-to-harvest) of crops is important in smart farming to solve problems such as planning the production schedule of crops and optimizing the yield and quality. This helps farmers plan their labor and resources more efficiently. In this paper, our concern is to predict the time-to-harvest (i.e., survival time) of tomatoes on a smart farm. For this, it is first necessary to develop a deep learning modeling approach that takes into account the farm effect on the tomato plants, as each farm has multiple tomato plant subjects and outcomes on the same farm can be correlated. In this paper, we propose deep neural network (DNN) survival models to account for the farm effect as a fixed effect using one-hot encoding. The tomato data used in our study were collected on a weekly basis using the Internet of Things (IoT). We compare the predictive performance of our proposed method with that of existing DNN and statistical survival modeling methods. The results show that our proposed DNN method outperforms the existing methods in terms of the root mean squared error (RMSE), concordance index (C-index), and Brier score.

1. Introduction

Recently, smart farm research based on artificial intelligence (AI), which is the foundation of the fourth industrial revolution, has been widely applied in the field of agriculture to improve crop production and management [1,2]. It is particularly important to predict the harvest time of crops in smart farm research in order to solve problems such as planning the production schedule of crops and optimizing yield and quality, which can help farmers plan their labor and resources more efficiently [3]. If these smart farms are universally implemented, it will become possible to enhance the competitiveness of agriculture even further. This can be achieved by optimally predicting output variables (outcomes) based on appropriate input variables. As a result, agriculture can lead the way as a future growth industry.
On a smart farm that grows crops with the help of AI, a single farm has a variety of different plants, and outcomes from the same farm can be correlated. Thus, using a prediction model that considers a single farm’s characteristics or identity can enhance the predictive power for the outcomes. The farm characteristic can be represented as a categorical input variable (feature) or as individual-level covariates based on one-hot encoding (OHE). In particular, OHE is a standard first-stage method for handling categorical features [4].
In this paper, we are interested in predicting the outcomes of tomato crop data collected weekly from a smart farm via the IoT as part of joint research with the Rural Development Administration of Korea [5,6]. Here, the outcome of interest is the time-to-harvest (i.e., survival time) of tomato plants. IoT-enabled sensors measure and monitor the growth and environmental information of the tomatoes in real time. In general, this IoT technology enables automatic crop management by establishing an optimal growth management system, leading to a significant increase in productivity and quality [1,3]. The unmeasured effect that represents a particular farm’s characteristics is called the “farm effect”. Existing research on deep neural networks (DNNs) [7,8,9,10,11,12] has ignored the farm effect, which may lead to inaccurate predictions when a model is applied to new datasets from different farms. For example, Kim et al. [7] studied the prediction of harvest time using DNN and machine learning methods without considering the farm effect.
Accordingly, in this paper we propose a DNN survival model that describes the farm effect as a fixed effect based on the OHE. The DNN with OHE is applied to two types of existing survival models, namely, the Accelerated Failure Time (AFT) and Cox Proportional Hazards (PH) models, which are two broad classes of survival regression models [13,14]. We compare our proposed modelling method with the existing survival modelling methods in terms of predictive measures such as the RMSE, C-index, and Brier score. The main objective of this paper is to demonstrate the superior performance of the proposed DNN survival model incorporating OHE for predicting the harvest time.
The rest of this paper is organized as follows. In Section 2, we describe the tomato dataset. In Section 3, we outline a brief description of classical survival regression models. In Section 4, we explain the DNN method and present the proposed model based on the OHE. The prediction results for the tomato data are presented in Section 5. Finally, we discuss the results and conclude the paper in Section 6.

2. Tomato Data

2.1. Data Description

As part of a joint study with the Rural Development Administration, we used a raw dataset [5,6] comprising data from a total of 83 farm households collected over 2017–2018 and 2018–2019 in three regions of South Korea, namely, Gyeongnam, Jeonbuk, and Jeonnam. The dataset combines greenhouse environment data measured every minute or hour with tomato growth data measured every 1–2 weeks on different smart farms. In this context, a day is defined as running from sunrise to sunrise the following day. For the analysis that considered the farm effect, farms with fewer than 11 tomato subjects (i.e., farm size) were removed. As a result, 65 farms remained in the dataset, with sizes ranging from 14 to 37 (mean 25.1, median 24.0). The dataset used in this paper consisted of 30 input variables and 1633 observations.
The description and summary of the input variables are presented in Table 1, as given in [6]. The input variables x_1–x_28, which describe tomato cultivation, are all continuous, while greenhouse type (x_29) and region type (x_30) are categorical. The 28 continuous variables were dichotomized into binary variables as follows: the average production (yield) per square of the top ten farms was taken as the group-level standard average, and a variable was coded 1 if the farm’s value was above this average and 0 otherwise.
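The dichotomization rule described above can be sketched as follows; the data frame, yields, and feature values here are hypothetical, and `x1` stands in for any of the 28 continuous variables.

```python
import pandas as pd

# Hypothetical farm-level data: yield per square and one continuous feature.
df = pd.DataFrame({
    "farm": range(1, 16),
    "yield_per_sq": [30, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15],
    "x1": [5.1, 4.8, 5.0, 4.2, 4.9, 4.4, 4.1, 4.6, 4.0, 4.3, 3.9, 3.8, 4.7, 3.7, 3.6],
})

# Group-level standard: mean of x1 over the ten highest-yield farms.
top10 = df.nlargest(10, "yield_per_sq")
threshold = top10["x1"].mean()

# Code 1 if the farm's value is above the standard average, 0 otherwise.
df["x1_bin"] = (df["x1"] > threshold).astype(int)
```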

2.2. Definition of Harvest Time Data

The harvest time of tomatoes is determined based on the number of flower clusters in the fruit group and the number of flower clusters in the harvest group. Typically, the harvest time ranges from 6 to 10 weeks. If the harvest time is 6 weeks, it can be calculated as follows [6]:
Harvest time = (the number of flower clusters in the fruit group 6 weeks before) − (the number of flower clusters in the harvest group of the corresponding week) + 6.
In this case, if the corresponding week is week 40, the harvest time is 6.3112, as shown in Table 2. This is calculated as (the number of flower clusters in the fruit group at week 34, i.e., 6 weeks before) − (the number of flower clusters in the harvest group at week 40) + 6.
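The rule above amounts to a one-line calculation; a minimal sketch, where the two cluster counts are hypothetical values chosen only to reproduce a difference of 0.3112:

```python
def harvest_time(fruit_clusters_before, harvest_clusters_now, k=6):
    """Harvest time (weeks) = (flower clusters in the fruit group k weeks
    before) - (flower clusters in the harvest group this week) + k."""
    return fruit_clusters_before - harvest_clusters_now + k

# Hypothetical counts at weeks 34 and 40 giving the paper's value 6.3112.
t = harvest_time(8.5, 8.1888, k=6)
```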
Below, we provide a basic summary and the distribution of the harvest time. As shown in Table 2, a harvest time with a positive real value indicates no censoring. The mean harvest time is 7.649 weeks, with a standard deviation of 1.048 weeks. Figure 1 displays three histograms of the harvest time: (a) the harvest time itself; (b) its log transformation; and (c) its square root transformation. The corresponding skewness statistics (SW), which indicate the degree of asymmetry, are (a) −0.211, (b) −0.610, and (c) −0.406, respectively. These SW values confirm that the untransformed harvest time in histogram (a) is the most symmetric.

3. Survival Regression Models

3.1. Accelerated Failure Time Model

Let T denote the harvest time (i.e., survival time) and let x be a vector of p-dimensional input variables. In survival analysis, the functional relationship between T and x is typically described by the following AFT regression model [15]:
g ( T ) = f ( x ) + ϵ ,
where g ( · ) is a transformation of T (usually, log ( T ) ), f ( x ) is a function of x , and ϵ is a random error with E ( ϵ ) = 0 .
For a simple analysis, we aim to find a transformation g(·) that yields a symmetric distribution for the harvest time T, as depicted in the histograms of Figure 1; this is feasible because the harvest time T is not censored. In particular, Kumar [16] pointed out that the predictive performance of regression and neural network models can be improved by obtaining a symmetric distribution through an appropriate transformation of the output (dependent) variable. As indicated by the SW values in Figure 1, the original scale (g(T) = T) exhibits the greatest symmetry. Therefore, in this paper we consider the following AFT model:
T = f ( x ) + ϵ ,
where f ( x ) = x T β is a linear predictor, β is a vector of regression parameters, and E ( ϵ ) = 0 . Hence, the regression parameters β can be easily estimated via the least squares method, resulting in the prediction of output variable T provided by T ^ = x T β ^ .
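On the untransformed scale, fitting this AFT model reduces to ordinary least squares. A self-contained sketch with simulated data (the coefficients, intercept, and noise level below are illustrative, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 0.25])          # illustrative coefficients
T = 7.5 + X @ beta_true + rng.normal(scale=0.5, size=n)  # harvest times, E(eps)=0

# Least-squares fit of T = beta0 + x^T beta + eps.
Xd = np.column_stack([np.ones(n), X])            # add intercept column
beta_hat, *_ = np.linalg.lstsq(Xd, T, rcond=None)

# Predicted harvest times T_hat = x^T beta_hat.
T_hat = Xd @ beta_hat
```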

3.2. The Cox Proportional Hazards Model

The hazard function of the harvest time T with given input variables x is defined as
λ(t|x) = lim_{Δt→0} Pr(t ≤ T < t + Δt | T ≥ t, x) / Δt.
The functional relationship between the hazard function of T and x can be described as the following hazard model:
λ ( t | x ) = λ 0 ( t ) exp ( f ( x ) ) ,
where λ 0 ( t ) is an unknown baseline hazard function. Here, Model (2) is called the Cox Proportional Hazards (PH) model [17] when f ( x ) = x T β is a linear predictor without the intercept β 0 . The regression parameters β are estimated by maximizing a partial likelihood (e.g., the Breslow or Efron likelihood) with λ 0 ( t ) eliminated.
Under the Cox PH model, the survival function of T given x, S ( t | x ) = P r ( T > t | x ) can be expressed as follows:
S(t|x) = exp{ −Λ_0(t) exp(xᵀβ) },
where the cumulative baseline hazard function Λ 0 ( t ) = 0 t λ 0 ( u ) d u is estimated using the Breslow [18] estimator. Then, the estimated (or predicted) survival function given x is provided by
Ŝ(t|x) = exp{ −Λ̂_0(t) exp(xᵀβ̂) }.
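The predicted survival function in (3) and (4) can be computed directly once β̂ is available. A minimal numpy sketch of the Breslow estimator under no censoring; here β̂ is taken as given, whereas in practice it would come from maximizing the partial likelihood:

```python
import numpy as np

def breslow_cumhaz(times, X, beta):
    """Breslow estimator of the cumulative baseline hazard Lambda_0
    at each distinct event time (no censoring assumed)."""
    risk = np.exp(X @ beta)                     # e^{x^T beta} per subject
    order = np.argsort(times)
    t_sorted, risk_sorted = times[order], risk[order]
    # Risk set at the i-th smallest time = subjects with T >= t_(i).
    rev_cumsum = np.cumsum(risk_sorted[::-1])[::-1]
    lam0 = np.cumsum(1.0 / rev_cumsum)          # Lambda_0 at each sorted time
    return t_sorted, lam0

def predict_survival(t, x_new, times, X, beta):
    """Predicted S(t | x) = exp{-Lambda_0(t) * e^{x^T beta}}."""
    t_sorted, lam0 = breslow_cumhaz(times, X, beta)
    idx = np.searchsorted(t_sorted, t, side="right") - 1
    L = lam0[idx] if idx >= 0 else 0.0          # Lambda_0 is 0 before the first event
    return float(np.exp(-L * np.exp(x_new @ beta)))
```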

4. DNN Survival Models

This section provides an overview of the fundamental DNN framework followed by the introduction of the proposed DNN–OHE survival models.

4.1. DNN Model

The DNN models are structured neural networks that include input, hidden, and output layers representing and modeling the nonlinear relationship between the input and output variables [19,20]. The primary objective is to find a nonlinear predictor for the output variable y given input variables x.
In a given dataset D = {(y_i, x_i); i = 1, …, n}, y_i is the output (or target) variable of the ith subject and x_i = (x_{i1}, …, x_{ip})ᵀ is the corresponding p-dimensional input (or feature) vector. Here, the y_i are assumed to be independent. The general structure of the DNN consists of one input layer, hidden layers ℓ = 1, 2, …, L, and one output layer, as shown in Figure 2. A network with one hidden layer is called a neural network (NN), while one with two or more hidden layers is called a DNN. Here, p_ℓ is the number of nodes in the ℓth hidden layer, p_0 = p is the number of nodes in the input layer, and W^(ℓ) = (w_0^(ℓ), w_1^(ℓ), …, w_{p_{ℓ−1}}^(ℓ)) is the weight matrix of the ℓth hidden layer, where w_i^(ℓ) = (w_{i1}^(ℓ), w_{i2}^(ℓ), …, w_{ip_ℓ}^(ℓ)) is the weight vector of the ith node and w_{ij}^(ℓ) is the jth weight of the ith node in that layer. In addition, w_0^(ℓ) is the bias (intercept) vector for the p_ℓ nodes of the ℓth hidden layer, and B = (β_0, β_1, …, β_{p_L})ᵀ is the weight vector of the output layer, including the bias β_0. The three layers (input, hidden, and output) constituting the DNN can then be expressed as follows:
  • Input layer:
    h_i^(0) = x_i, i = 1, 2, …, p_0;
  • Hidden layer:
    h_j^(1) = f^(1)( Σ_{i=1}^{p_0} w_{ij}^(1) h_i^(0) + w_{0j}^(1) ), j = 1, …, p_1,
    h_j^(2) = f^(2)( Σ_{i=1}^{p_1} w_{ij}^(2) h_i^(1) + w_{0j}^(2) ), j = 1, …, p_2,
    ⋮
    h_j^(L) = f^(L)( Σ_{i=1}^{p_{L−1}} w_{ij}^(L) h_i^(L−1) + w_{0j}^(L) ), j = 1, …, p_L;
  • Output layer:
    ŷ = f^(y)( Σ_{j=1}^{p_L} β_j h_j^(L) + β_0 ) = f^(y)( NN(x) );
where f ( ) ( · ) and f ( y ) ( · ) are the activation functions of the hidden layer and output layer, respectively, and
NN(x) = NN(x; w, β) = Σ_{j=1}^{p_L} β_j h_j^(L) + β_0
denotes a neural network (NN) predictor that describes a nonlinear function of x . Activation functions typically include linear, sigmoid, Tanh (hyperbolic tangent), and ReLU (rectified linear unit) functions. Specifically, a nonlinear function such as sigmoid or ReLU is commonly used as an activation function in the hidden layer, while the activation function in the output layer is selected as either a linear or nonlinear function depending on the type of output variables [20].
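The forward pass defined by the three layers above can be written compactly. A sketch with ReLU hidden activations and a linear output activation, as would be used for a continuous outcome; the weight shapes here are illustrative:

```python
import numpy as np

def relu(z):
    """ReLU activation for the hidden layers."""
    return np.maximum(0.0, z)

def nn_predict(x, weights, biases, out_weights, out_bias):
    """Forward pass: h^(l) = f^(l)(W^(l) h^(l-1) + w_0^(l)) through all
    hidden layers, then a linear output layer y_hat = B^T h^(L) + beta_0."""
    h = x
    for W, b in zip(weights, biases):
        h = relu(h @ W + b)                # hidden layer l
    return h @ out_weights + out_bias      # linear f^(y) for continuous output
```

With identity weights and zero biases, the prediction is simply the sum of the (nonnegative) inputs, which makes the mechanics easy to check by hand.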

4.2. Learning Procedure of DNN

Next, we describe the learning procedure used to estimate the weights (i.e., model parameters) θ = (w, β) of the DNN in (5) using the training data D. Estimation (or learning) is conducted by optimizing a loss function (also called the objective function), denoted by L(θ) = L(f(x), y), which involves the target y and the regression function f(x) as in (1) and (2). Note that a negative log-likelihood can be used as the loss function.
The model parameters θ are estimated by optimizing, that is, by minimizing the loss function L ( θ ) ; their estimates are defined by
θ ^ = arg min θ L ( θ ) ,
and are equivalent to the maximum likelihood (ML) estimators of θ , obtained by solving
∂L(θ)/∂θ = 0.
However, the estimating equation (6) is generally complex and highly nonlinear; therefore, gradient descent (GD) methods are often used to solve it. For a given loss function L, the GD update of the parameters θ at iteration k is
θ_{k+1} = θ_k − α ∂L(θ)/∂θ |_{θ = θ_k},
where α (> 0) is the learning rate (or step size). Thus, for given initial values of θ and the learning rate α, parameter estimation (or learning) proceeds by computing the gradient vector ∂L(θ)/∂θ in (7) via back-propagation [21]. Note that the optimization of L(θ) is usually performed with a mini-batch stochastic GD (SGD) method, particularly for very large datasets.
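The update rule (7) can be illustrated on a simple squared-error loss. A sketch with full-batch gradient descent on simulated noiseless data; the learning rate and iteration count are illustrative, not tuned values from the paper:

```python
import numpy as np

# Minimize L(theta) = mean((y - X theta)^2) by gradient descent.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
theta_true = np.array([2.0, -1.0])
y = X @ theta_true

theta = np.zeros(2)
alpha = 0.1                                     # learning rate (step size)
for _ in range(500):
    grad = -2.0 / len(y) * (X.T @ (y - X @ theta))  # dL/dtheta at theta_k
    theta = theta - alpha * grad                    # theta_{k+1} = theta_k - alpha * grad
```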

4.3. One-Hot Encoding (OHE)

The growth environment of individual tomato subjects may be similar within the same farm even if it is heterogeneous across different farms. This farm effect can be treated as a fixed effect, with the farm feature represented through the OHE method. OHE, sometimes known as dummy encoding, converts a categorical variable into binary variables taking values 0 or 1. If there are multiple farms, each with several tomato plants, OHE creates binary variables (features) z_1, …, z_q, where z_{ki} = 1 if tomato plant i belongs to farm k and z_{ki} = 0 otherwise. Figure 3 illustrates the OHE method for the farm index. Note that q = 65 for the tomato data in Section 2.1.
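A minimal sketch of OHE for the farm index; the farm IDs below are hypothetical:

```python
import numpy as np

def one_hot(farm_ids):
    """Map integer farm IDs to a binary indicator matrix Z of shape (n, q),
    where z_{ik} = 1 if subject i belongs to farm k and 0 otherwise."""
    farms = np.unique(farm_ids)                   # the q distinct farms
    return (farm_ids[:, None] == farms[None, :]).astype(int)

# Five tomato subjects belonging to three farms.
Z = one_hot(np.array([3, 1, 3, 2, 1]))
```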

4.4. DNN–OHE Survival Models

To consider the tomato farm effect as a fixed effect, we propose a DNN model based on OHE. The proposed models can be described by the following two types according to their method of applying OHE to the DNN model:
  • DNN-I (DNN OHE-input): the DNN model applies OHE to the input layer (I).
  • DNN-L (DNN OHE-last): the DNN model applies OHE to the last hidden layer (L).
Figure 4 presents a schematic representation of the DNN-I model, in which the farm ID is applied to the input layer (I). That is, the ID variables represented by OHE are combined with the 30 input variables from Table 1 in the input layer. DNN-I is a natural choice, as the binary variables generated by OHE are simply treated as additional input variables. On the other hand, Figure 5 shows a schematic representation of the DNN-L model, in which the farm ID is applied to the last hidden layer (L), i.e., just before the output layer. Here, a separate input layer carrying only the ID variables is concatenated with the last hidden layer before feeding the output layer.
Accordingly, the two DNN–OHE models, namely, DNN-I and DNN-L, can be easily applied to the AFT model (1) and Cox model (2) to analyze the harvest time. The resulting models are described below. The DNN–OHE model based on the AFT model follows
T = f ( x ) + ϵ ,
where f ( x ) = N N ( x ; w , β ) is the NN predictor in (5). The DNN–OHE model based on the Cox model follows
λ ( t | x ) = λ 0 ( t ) exp ( f ( x ) ) ,
where f ( x ) = N N ( x ; w , β ) is the same NN predictor except without the bias β 0 in (5).
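Under the paper's TensorFlow-Keras setting, the DNN-L architecture can be sketched with the functional API; the layer widths and activations below are illustrative, not the tuned values of Table 3:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

p, q = 30, 65                                   # input features; farms (OHE width)
x_in = layers.Input(shape=(p,), name="features")
z_in = layers.Input(shape=(q,), name="farm_ohe")

h = layers.Dense(64, activation="relu")(x_in)   # hidden layers on the features only
h = layers.Dense(32, activation="relu")(h)
h = layers.Concatenate()([h, z_in])             # farm OHE joins the last hidden layer
out = layers.Dense(1, activation="linear")(h)   # linear output for harvest time (AFT)

model = Model(inputs=[x_in, z_in], outputs=out)
model.compile(optimizer="adam", loss="mse")
```

For the Cox variant, the output layer would drop the bias term and the loss would be replaced by the negative Efron partial log-likelihood; for DNN-I, `z_in` would instead be concatenated with `x_in` before the first hidden layer.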

5. Prediction Performance Results of DNN Survival Models

5.1. Model Fitting and Predictive Measures

In this section, we aim to compare the predictive performance of the proposed DNN-I and DNN-L survival models with the existing AFT and Cox PH models and their corresponding DNN models. All DNN models, including the proposed models, were computed using Python-based TensorFlow-Keras, while the Cox PH model was implemented using the lifelines package in Python.
The total dataset was divided into three separate sets; within each farm, the last three observations were assigned to the test set, the middle four observations were assigned to the validation set, and the remaining observations were assigned to the training set.
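The per-farm split described above can be reproduced with a groupby; a sketch on hypothetical data with two farms of nine weekly observations each:

```python
import pandas as pd

# Hypothetical data: two farms, nine weekly observations each.
df = pd.DataFrame({
    "farm": [1] * 9 + [2] * 9,
    "week": list(range(9)) * 2,
})

parts = []
for _, g in df.groupby("farm"):
    g = g.sort_values("week")
    n = len(g)
    # Last 3 -> test, the 4 before them -> validation, remainder -> training.
    labels = ["train"] * (n - 7) + ["valid"] * 4 + ["test"] * 3
    parts.append(pd.Series(labels, index=g.index))
df["set"] = pd.concat(parts)
```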
The optimal hyperparameters used in the DNN models are summarized in Table 3; early stopping was employed to prevent overfitting [20]. In the Cox-based DNN models, the negative Efron log-likelihood [22] was used as the loss function, while the RMSE loss was used in the AFT-based DNN models because it penalizes larger errors more heavily, which is appropriate when there is no censoring.
Note that y i = T i in Section 4.1 due to no censoring. The predictive performance for survival models based on the T i s without censoring was evaluated using the following measures. For the AFT-type models, we used the RMSE and mean absolute error (MAE), defined as follows:
RMSE = √( (1/n) Σ_{i=1}^{n} (T_i − T̂_i)² )
and
MAE = (1/n) Σ_{i=1}^{n} |T_i − T̂_i|,
respectively, where T i is the ith observed harvest time and T ^ i is the ith harvest time predicted by the fitted model. The MAE is more robust against outliers compared to the RMSE. Notably, AFT-type models are particularly useful for predicting the harvest time because they directly model the survival times.
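Both measures are direct to compute; a short sketch with hypothetical observed and predicted harvest times:

```python
import numpy as np

def rmse(T, T_hat):
    """Root mean squared error between observed and predicted harvest times."""
    return float(np.sqrt(np.mean((np.asarray(T) - np.asarray(T_hat)) ** 2)))

def mae(T, T_hat):
    """Mean absolute error; more robust to outlying errors than the RMSE."""
    return float(np.mean(np.abs(np.asarray(T) - np.asarray(T_hat))))
```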
For the Cox-type hazard models we used the concordance index (C-index), defined as
C_H = P( T̂_i < T̂_j | T_i < T_j ) = P( f(x_i) > f(x_j) | T_i < T_j );
here, f ( x i ) is the risk function of x i in (2), and is estimated as follows [23]:
Ĉ_H = Σ_{i≠j} δ_i I(T_i < T_j){ I(T̂_i < T̂_j) + 0.5 I(T̂_i = T̂_j) } / Σ_{i≠j} δ_i I(T_i < T_j),
where δ_i is the censoring indicator of the ith observation and T̂_i is the corresponding predicted harvest time obtained from the fitted model. The C-index takes a value between 0 and 1, with values closer to 1 indicating better predictive performance. Note that Cox-type models are particularly useful for predicting the survival probability, i.e., the survival function, which can be easily computed from the Cox hazard model as shown in (3) and (4). We used the time-dependent Brier Score (BS) to evaluate the accuracy of the predicted survival function. The BS represents the average squared distance between the observed survival status and the predicted survival probability at a given time point t, and is defined as
BS(t) = E{ I(T > t) − S(t|x) }².
Without censoring, the BS is estimated as follows [24]:
B̂S(t) = (1/n) Σ_{i=1}^{n} { I(T_i > t) − Ŝ(t|x_i) }²,
where S ^ ( t | x i ) represents the predicted survival function obtained from the fitted model. Note that a lower BS indicates better prediction performance, similar to the RMSE. The integrated BS (IBS) provides an overall evaluation of model performance across all available times ( t 1 < t < t max ). The IBS over the interval [ 0 , t max ] for t 1 = 0 is defined as
IBS = (1/t_max) ∫_0^{t_max} BS(s) ds.
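Under no censoring (δ_i = 1 for all i), these measures reduce to simple sums. A numpy sketch, with an O(n²) loop for clarity and the integral approximated on a user-supplied time grid:

```python
import numpy as np

def c_index(T, T_hat):
    """Harrell's C-index with delta_i = 1 (no censoring); predicted times
    stand in for the risk ordering, ties credited 0.5."""
    num = den = 0.0
    n = len(T)
    for i in range(n):
        for j in range(n):
            if i != j and T[i] < T[j]:
                den += 1.0
                if T_hat[i] < T_hat[j]:
                    num += 1.0
                elif T_hat[i] == T_hat[j]:
                    num += 0.5
    return num / den

def brier(t, T, S_pred):
    """BS(t), where S_pred[i] is the predicted S(t | x_i)."""
    T, S_pred = np.asarray(T), np.asarray(S_pred)
    return float(np.mean(((T > t).astype(float) - S_pred) ** 2))

def ibs(t_grid, T, S_fn):
    """IBS over t_grid by the trapezoidal rule; S_fn(t) returns the vector
    of predicted survival probabilities at time t."""
    bs = np.array([brier(t, T, S_fn(t)) for t in t_grid])
    dt = np.diff(t_grid)
    return float(np.sum((bs[:-1] + bs[1:]) / 2 * dt) / (t_grid[-1] - t_grid[0]))
```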

5.2. Prediction Results for AFT-Type DNN Models

We first consider the four AFT-type models: AFT, AFT-DNN, AFT-DNN-I, and AFT-DNN-L. Figure 6 illustrates the predicted values of T against the observed values of T on the test set. The results suggest that the AFT-DNN-L model effectively predicts the output variable, with a Pearson sample correlation coefficient of 0.624. Furthermore, Table 4 demonstrates that the AFT-DNN-L model achieves the lowest RMSE (0.8067) and MAE (0.6090), indicating superior performance.
Table 4 compares the prediction performance of the AFT-type methods with that of three popular machine learning (ML) methods: random forest (RF; [25]), XGBoost (XGB; [26]), and support vector regression (SVR; [27]). The hyperparameters of the three ML methods were tuned using ten-fold cross-validation with scikit-learn’s RandomForestRegressor and svm modules and the xgboost package, and the resulting optimal settings were as follows: (i) for the RF model, the number of trees was 500, the number of features randomly selected as split candidates at each node was √p, and the maximum tree depth was ten; (ii) for the XGB model, the number of trees was 300, the learning rate was 0.1, and the maximum tree depth was one; (iii) for the SVR model, the trade-off between maximizing the margin and minimizing the error (C) was 0.1, the width of the margin (ε) was 0.01, and the kernel was linear. The resulting predictive results are summarized as follows: for RF, the RMSE was 1.1929 and the MAE was 0.9384; for XGB, the RMSE was 1.2256 and the MAE was 0.9443; and for SVR, the RMSE was 1.1973 and the MAE was 0.9314. It is worth noting that none of these three ML methods can directly account for the farm effect. Consequently, all three yield inferior predictive results compared to the AFT-DNN methods in Table 4.
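The two scikit-learn baselines can be configured with the reported settings as below; the data here are simulated, and the XGB model would be set up analogously with xgboost's XGBRegressor (n_estimators=300, learning_rate=0.1, max_depth=1):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Simulated stand-in data (the real tomato data are not publicly available).
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

# RF: 500 trees, sqrt(p) candidate features per split, maximum depth 10.
rf = RandomForestRegressor(n_estimators=500, max_features="sqrt",
                           max_depth=10, random_state=0).fit(X, y)
# SVR: linear kernel, C = 0.1 (margin/error trade-off), epsilon = 0.01.
svr = SVR(kernel="linear", C=0.1, epsilon=0.01).fit(X, y)

pred_rf, pred_svr = rf.predict(X), svr.predict(X)
```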

5.3. Prediction Results for Cox-Type DNN Hazard Models

Next, we consider the four Cox-type hazard models: Cox, Cox-DNN, Cox-DNN-I, and Cox-DNN-L. Table 5 presents the C-index and IBS results for the four Cox-type models on the test set. Among these models, the Cox-DNN-L model demonstrates the highest performance in terms of both the C-index and the IBS. Figure 7 displays the time-dependent BS results on the test set for the four Cox-type models. The BS values of the proposed Cox-DNN-L model and the Cox-DNN model are very similar at each time point (week), and are consistently lower than those of the other two models (Cox and Cox-DNN-I) across almost all time points. Notably, the base Cox model exhibits exceptionally high BS values for weeks 6 and 8. These results indicate that the proposed Cox-DNN-L model outperforms the other three Cox-type models (Cox, Cox-DNN, and Cox-DNN-I) in terms of overall prediction performance.

6. Discussion and Conclusions

In this paper, we have presented DNN modelling approaches for tomato plants that take into account the farm effect. Because each farm has multiple tomato plant subjects, outcomes within the same farm can be correlated. Through analysis of the tomato data in Table 4 and Table 5, we observe that the proposed DNN survival models (AFT-DNN-L and Cox-DNN-L) demonstrate the best model performance in terms of predictive measures (RMSE, MAE, C-index, and IBS) for both harvest time and hazard rate as compared to the existing AFT-type and Cox-type models. In particular, we find that the three machine learning methods (i.e., RF, XGB, and SVR) used for the AFT-type models show relatively poor performance compared to all the AFT–DNN methods in Table 4. These results confirm that taking the farm effect into consideration enhances the models’ predictive capability.
In conclusion, the proposed DNN–OHE models (DNN-I and DNN-L) can be easily implemented using existing DNN modeling approaches. However, we recommend the DNN-L model when considering farm effects: the DNN-I model can pose computational problems because the number of parameters in the neural network grows with the number of farms [4], and it provided lower prediction performance than the DNN-L model on our tomato data. One advantage of the proposed DNN-L method is that it can predict harvest times for individual tomato farms, which can assist farmers in improving or modifying their strategies to optimize yield and quality while reducing labor and resource usage [28].
Because it incorporates the farm effect based on OHE, the proposed DNN-L model can be applied to various types of data from smart farms, including yield data, sequential data, and image data. Belouz et al. [29] used artificial neural networks for the prediction of tomato yields, while Cho et al. [5] studied an encoding attention-based long short-term memory (LSTM) network. Minagawa and Kim [3] demonstrated the prediction of harvest times using a mask region-based convolutional neural network (Mask R-CNN [30]) to detect tomato bunch images. Furthermore, Nugroho et al. [31] compared the prediction accuracy of models based on Faster R-CNN, multibox Single-Shot Detector (SSD), and You Only Look Once (YOLO) for detecting tomato ripeness using input images [32,33,34,35,36]. Developing a DNN-L framework that allows for the above deep learning methods would be an interesting task for future research.
Another potential avenue for future work is to develop a new deep learning survival model that treats the farm effect considered in this paper as a random effect.

Author Contributions

Conceptualization, I.D.H., S.K. and M.H.N.; methodology, I.D.H. and M.H.N.; software, J.K. and I.D.H.; validation, J.K., S.K. and I.J.; formal analysis, J.K. and S.K.; investigation, J.K., I.J. and S.K.; data curation, J.K., S.K. and I.J.; writing—original draft preparation, J.K., S.K. and I.J.; writing—review and editing, I.D.H.; visualization, I.D.H. and M.H.N.; supervision, I.D.H., I.J. and M.H.N.; project administration, M.H.N. and I.J.; funding acquisition, M.H.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korean Institute of Planning and Evaluation for Technology in Food, Agriculture, and Forestry (IPET) and the Korean Smart Farm R&D Foundation (KosFarm) through the Smart Farm Innovation Technology Development Program funded by the Ministry of Agriculture, Food, and Rural Affairs (MAFRA) and the Ministry of Science and ICT (MSIT), Rural Development Administration (RDA) (421010-02).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are not available due to government restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AFT	Accelerated Failure Time
BS	Brier Score
DNN	Deep Neural Network
IBS	Integrated Brier Score
IoT	Internet of Things
OHE	One-Hot Encoding

References

  1. Na, M.H.; Park, Y.; Cho, W.H. A study on optimal environmental factors of tomato using smart farm data. JKDISS 2008, 10, 1427–1435. [Google Scholar]
  2. Gadekallu, T.R.; Rajput, D.S.; Reddy, M.P.K.; Lakshmanna, K.; Bhattacharya, S.; Singh, S.; Jolfaei, A.; Alazab, M. A novel PCA–whale optimization-based deep neural network model for classification of tomato plant diseases using GPU. J. Real-Time Image Process. 2021, 18, 1383–1396. [Google Scholar] [CrossRef]
  3. Minagawa, D.; Kim, J. Prediction of harvest time of tomato using mask R-CNN. AgriEngineering 2022, 4, 356–366. [Google Scholar] [CrossRef]
  4. Hancock, J.T.; Khoshgoftaar, T.M. Survey on categorical data for neural networks. J. Big Data 2020, 7, 28. [Google Scholar] [CrossRef]
  5. Cho, W.; Kim, S.; Na, M.; Na, I. Forecasting of tomato yields using attention-based LSTM network and ARMA Model. Electronics 2021, 10, 1576. [Google Scholar] [CrossRef]
  6. Kim, J.C.; Kwon, S.; Ha, I.D.; Na, M.H. Survival analysis for tomato big data in smart farming. JKDISS 2021, 32, 361–374. [Google Scholar] [CrossRef]
  7. Kim, J.C.; Kwon, S.; Ha, I.D.; Na, M.H. Prediction of smart farm tomato harvest time: Comparison of machine learning and deep learning approaches. JKDISS 2022, 33, 283–298. [Google Scholar] [CrossRef]
  8. Luna, R.; Dadios, E.; Bandala, A.; Vicerra, R. Tomato growth stage monitoring for smart farm using deep transfer learning with machine learning-based maturity grading. Agrivita 2020, 42, 24–36. [Google Scholar]
  9. Haggag, M.; Abdelhay, S.; Mecheter, A.; Gowid, S.; Musharavati, F.; Ghani, S. An intelligent hybrid experimental-based deep learning algorithm for tomato-sorting controllers. IEEE Access 2019, 7, 106890–106898. [Google Scholar] [CrossRef]
  10. Alajrami, A.; Abu-Naser, S. Type of tomato classification using deep learning. IJAPR 2019, 12, 21–25. [Google Scholar]
  11. Grimberg, R.; Teitel, M.; Ozer, S.; Levi, A.; Levy, A. Estimation of greenhouse tomato foliage temperature using DNN and ML models. Agriculture 2022, 12, 1034. [Google Scholar] [CrossRef]
  12. Jeong, S.; Jeong, S.; Bong, J. Detection of tomato leaf miner using deep neural network. Sensors 2022, 22, 9959. [Google Scholar] [CrossRef] [PubMed]
  13. Lawless, J.F. Statistical Models and Methods for Lifetime Data, 2nd ed.; Wiley: New York, NY, USA, 2003. [Google Scholar]
  14. Ha, I.D.; Jeong, J.H.; Lee, Y. Statistical Modelling of Survival Data with Random Effects: H-Likelihood Approach; Springer: Singapore, 2017. [Google Scholar]
  15. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; Wiley: New York, NY, USA, 1980. [Google Scholar]
  16. Kumar, U.A. Comparison of neural networks and regression analysis: A new insight. Expert Syst. Appl. 2005, 29, 424–430. [Google Scholar] [CrossRef]
  17. Cox, D.R. Regression models and life tables (with Discussion). J. R. Stat. Soc. B 1972, 74, 187–220. [Google Scholar]
  18. Breslow, N.E. Covariance analysis of censored survival data. Biometrics 1974, 30, 89–99. [Google Scholar] [CrossRef] [PubMed]
  19. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  20. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  21. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536. [Google Scholar] [CrossRef]
  22. Sun, T.; Wei, Y.; Chen, W.; Ding, Y. Genome-wide association study-based deep learning for survival prediction. Stat. Med. 2020, 39, 4605–4620. [Google Scholar] [CrossRef]
  23. Harrell, F.E.; Lee, K.L.; Mark, D.B. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat. Med. 1996, 15, 361–387. [Google Scholar] [CrossRef]
  24. Graf, E.; Schmoor, C.; Sauerbrei, W.; Schumacher, M. Assessment and comparison of prognostic classification schemes for survival data. Stat. Med. 1999, 18, 2529–2545. [Google Scholar] [CrossRef]
  25. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  26. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  27. Vapnik, V.; Golowich, S.E.; Smola, A.J. Support vector method for function approximation, regression estimation, and signal processing. Adv. Neural Inf. Process. Syst. 1997, 9, 281–287. [Google Scholar]
  28. Liu, S.-C.; Jian, Q.-Y.; Wen, H.-Y.; Chung, C.-H. A crop harvest time prediction model for better sustainability, integrating feature selection and artificial intelligence methods. Sustainability 2022, 14, 14101. [Google Scholar] [CrossRef]
  29. Belouz, K.; Nourani, A.; Zereg, S.; Bencheikh, A. Prediction of greenhouse tomato yield using artificial neural networks combined with sensitivity analysis. Sci. Hortic. 2022, 293, 110666. [Google Scholar] [CrossRef]
30. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
31. Nugroho, D.P.; Widiyanto, S.; Wardani, D.T. Comparison of deep learning-based object classification methods for detecting tomato ripeness. Int. J. Fuzzy Log. Intell. Syst. 2022, 22, 223–232. [Google Scholar] [CrossRef]
  32. Afonso, M.; Fonteijn, H.; Fiorentin, F.S.; Lensink, D.; Mooij, M.; Faber, N.; Polder, G.; Wehrens, R. Tomato fruit detection and counting in greenhouses using deep learning. Front. Plant Sci. 2020, 11, 571299. [Google Scholar] [CrossRef] [PubMed]
33. Mishra, A.M.; Harnal, S.; Gautam, V.; Tiwari, R.; Upadhyay, S. Weed density estimation in soya bean crop using deep convolutional neural networks in smart agriculture. J. Plant Dis. Prot. 2022, 129, 593–604. [Google Scholar] [CrossRef]
  34. Kaur, P.; Harnal, S.; Tiwari, R.; Upadhyay, S.; Bhatia, S.; Mashat, A.; Alabdali, A.M. Recognition of leaf disease using hybrid convolutional neural network by applying feature reduction. Sensors 2022, 22, 575. [Google Scholar] [CrossRef]
  35. Ireri, D.; Belal, E.; Okinda, C.; Makange, N.; Ji, C. A computer vision system for defect discrimination and grading in tomatoes using machine learning and image processing. Artif. Intell. Agric. 2019, 2, 28–37. [Google Scholar] [CrossRef]
36. da Costa, A.Z.; Figueroa, H.E.H.; Fracarolli, J.A. Computer vision based detection of external defects on tomatoes using deep learning. Biosyst. Eng. 2020, 190, 131–144. [Google Scholar]
Figure 1. Histograms of the transformation of the harvest time [7].
Figure 2. A schematic diagram of a deep neural network; bias terms are omitted for brevity (see the main text).
Figure 3. A schematic diagram of one-hot encoding (OHE) for the farm index.
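As a concrete illustration of the encoding in Figure 3, the sketch below builds the OHE matrix for integer farm indices with NumPy. The function name and 0-based index convention are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def one_hot_farm(farm_idx, n_farms):
    """Encode integer farm indices 0..n_farms-1 as one-hot rows,
    so each tomato plant carries an indicator of its farm."""
    farm_idx = np.asarray(farm_idx)
    ohe = np.zeros((farm_idx.size, n_farms))
    ohe[np.arange(farm_idx.size), farm_idx] = 1.0
    return ohe

# e.g., three plants belonging to farms 0, 2, and 1 (out of 3 farms)
print(one_hot_farm([0, 2, 1], 3))
```

Each row has exactly one 1, in the column of that plant's farm; appending these columns to the covariates (DNN-I) or to the last hidden layer (DNN-L) is how the farm is treated as a fixed effect.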
Figure 4. Schematic diagram of the DNN–OHE-input model (DNN-I).
Figure 5. Schematic diagram of the DNN–OHE-last model (DNN-L).
Figure 6. Predicted value of T against observed value of T and correlation (corr) results of the four AFT-type models on the tomato data test set.
Figure 7. Time-dependent Brier scores for four Cox-type hazard models on the tomato data test set.
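The time-dependent Brier score of Figure 7 [24] can be sketched, for the uncensored case only, as the mean squared difference between the survival indicator 1{T_i > t} and the predicted survival probability; the IPCW weighting used in practice for censored observations is omitted here.

```python
def brier_score(t, times, surv_prob):
    """Time-dependent Brier score at time t for uncensored data:
    mean of (1{T_i > t} - S_hat(t | x_i))^2 over subjects.
    Censored data would additionally require inverse-probability-
    of-censoring (IPCW) weights, as in Graf et al. [24]."""
    sq_err = [(float(ti > t) - s) ** 2 for ti, s in zip(times, surv_prob)]
    return sum(sq_err) / len(sq_err)

# three plants harvested at weeks 2, 4, 6; hypothetical predicted S(3 | x_i)
print(brier_score(3, [2, 4, 6], [0.2, 0.8, 0.9]))
```

Integrating this score over a grid of time points (weighted by the time increments) gives the integrated Brier score (IBS) reported in Table 5.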
Table 1. Description and summary of input variables [6].
Variable | Description | Average | Variable | Description | Average
--- | --- | --- | --- | --- | ---
x1 | Cumulative insolation | 1275.89 | x16 | Internal humidity-sunset | 80.59
x2 | Internal temperature-all | 19.36 | x17 | Internal humidity-evening | 84.24
x3 | Internal temperature-daytime1 | 21.94 | x18 | Internal humidity-night | 86.36
x4 | Internal temperature-daytime2 | 16.67 | x19 | Internal humidity-dawn | 87.40
x5 | Internal temperature-am | 20.28 | x20 | CO2-all | 417.91
x6 | Internal temperature-pm | 24.34 | x21 | CO2-daytime1 | 433.11
x7 | Internal temperature-sunset | 20.05 | x22 | CO2-daytime2 | 507.47
x8 | Internal temperature-evening | 17.33 | x23 | CO2-am | 478.38
x9 | Internal temperature-night | 16.53 | x24 | CO2-pm | 404.59
x10 | Internal temperature-dawn | 16.76 | x25 | CO2-sunset | 398.67
x11 | Internal humidity-all | 82.25 | x26 | CO2-evening | 429.18
x12 | Internal humidity-daytime1 | 78.74 | x27 | CO2-night | 506.41
x13 | Internal humidity-daytime2 | 86.08 | x28 | CO2-dawn | 580.32
x14 | Internal humidity-am | 81.92 | x29 | Greenhouse type † | ·
x15 | Internal humidity-pm | 74.41 | x30 | Region ‡ | ·
Note: Average, average at the group level; am, ante meridiem; pm, post meridiem; CO2, carbon dioxide; † Greenhouse type: 0 = vinyl (864), 1 = glass (4313); ‡ Region: 0 = outside Jangsu (1044), 1 = Jangsu (4133).
Table 2. Definition of harvest time [6].
Week | fgroup | hgroup | Harvtime
--- | --- | --- | ---
34 | 0.9775 | · | ·
35 | 2.0000 | · | ·
36 | 2.7275 | · | ·
37 | 3.6625 | · | ·
38 | 4.3975 | · | ·
39 | 5.0000 | · | ·
40 | 5.6413 | 0.6663 | 6.3112
41 | 6.2825 | 1.3325 | 6.6675
42 | 7.0625 | 1.8750 | 6.8525
Note: fgroup, fruit group; hgroup, harvest group; Harvtime, harvest time.
Table 3. Optimal hyperparameter settings.
Hyperparameter | Setting
--- | ---
No. of hidden layers | 3
No. of nodes per layer | {2, 32, 16}
Learning rate | 0.001
Batch size | length of the validation set of y
No. of epochs | 1000
Activation function (hidden layers) | elu
Activation function (output layer) | linear
Optimizer (AFT-type models) | AdamW
Optimizer (Cox-type models) | Nadam
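The architecture in Table 3 can be sketched as a plain NumPy forward pass: three elu hidden layers with {2, 32, 16} nodes and a single linear output node. The 30-dimensional input matches the variables x1–x30 of Table 1; the random weights below are placeholders only, since the paper trains with AdamW/Nadam in a deep learning framework that we do not reproduce here.

```python
import numpy as np

def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha*(exp(x)-1) otherwise."""
    return np.where(x > 0, x, alpha * np.expm1(x))

def dnn_forward(x, weights, biases):
    """Forward pass of the Table 3 architecture: elu hidden layers,
    then a single node with a linear output activation."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = elu(h @ W + b)
    return h @ weights[-1] + biases[-1]  # linear output layer

rng = np.random.default_rng(0)
sizes = [30, 2, 32, 16, 1]  # 30 inputs (x1..x30), hidden nodes {2, 32, 16}, 1 output
weights = [0.1 * rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
y_hat = dnn_forward(rng.normal(size=(5, 30)), weights, biases)
print(y_hat.shape)  # (5, 1): one predicted value per plant
```

For the DNN-I variant the OHE farm columns would be concatenated to the 30 inputs; for DNN-L they would be concatenated to the last (16-node) hidden layer before the linear output.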
Table 4. RMSE and MAE results of the four AFT-type models on the tomato data test set.
Predictive Measure | AFT | AFT-DNN | AFT-DNN-I | AFT-DNN-L
--- | --- | --- | --- | ---
RMSE | 0.8257 | 0.8124 | 0.9726 | 0.8067
MAE | 0.6487 | 0.6167 | 0.7375 | 0.6090
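For reference, the two measures in Table 4 are computed as below, where y holds the observed harvest times and yhat the model predictions (a pure-Python sketch; variable names are ours):

```python
import math

def rmse(y, yhat):
    """Root mean squared error between observed and predicted times."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mae(y, yhat):
    """Mean absolute error between observed and predicted times."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # one prediction off by 2
print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Lower values of both measures indicate better prediction; RMSE penalizes large errors more heavily than MAE.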
Table 5. C-index and IBS results of four Cox-type hazard models on the tomato data test set.
Predictive Measure | Cox | Cox-DNN | Cox-DNN-I | Cox-DNN-L
--- | --- | --- | --- | ---
C-index | 0.6582 | 0.6527 | 0.6506 | 0.6600
IBS | 0.1125 | 0.0471 | 0.0584 | 0.0468
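The C-index in Table 5 is Harrell's concordance [23]. A minimal pure-Python sketch for right-censored data is given below; here risk is the model's predicted risk score (higher risk should mean earlier harvest), and events marks observed (1) versus censored (0) harvest times.

```python
def c_index(times, events, risk):
    """Harrell's C-index: among comparable pairs (the earlier time is
    an observed event), count pairs where the earlier subject has the
    higher predicted risk; ties in risk count as 0.5."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# toy example: risks perfectly rank the harvest times, so C-index = 1.0
print(c_index([2, 4, 6], [1, 1, 1], [0.9, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the values around 0.66 in Table 5 indicate moderate discriminative ability for all four Cox-type models.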

Share and Cite


Kim, J.; Ha, I.D.; Kwon, S.; Jang, I.; Na, M.H. A Smart Farm DNN Survival Model Considering Tomato Farm Effect. Agriculture 2023, 13, 1782. https://doi.org/10.3390/agriculture13091782