Article

Growth Monitoring of Greenhouse Tomatoes Based on Context Recognition

1 Graduate School of Science and Engineering, Saga University, Saga 840-8502, Japan
2 Faculty of Agriculture, Saga University, Saga 840-8502, Japan
* Author to whom correspondence should be addressed.
AgriEngineering 2024, 6(3), 2043-2056; https://doi.org/10.3390/agriengineering6030119
Submission received: 1 May 2024 / Revised: 17 June 2024 / Accepted: 25 June 2024 / Published: 1 July 2024

Abstract

To alleviate social problems in agriculture such as the aging workforce and labor shortages, automatic growth monitoring based on image measurement has been introduced to tomato cultivation in greenhouses. However, the overlap of leaves and fruits makes precise observation challenging. In this study, we applied context recognition to tomato growth monitoring by using a Bayesian network. The proposed method combines image recognition using convolutional networks with context recognition using Bayesian networks. It enables not only the recognition of individual tomatoes but also the evaluation of whole tomato plants. An accurate number of tomatoes and the condition of the plants can be estimated from the numbers of ripened and unripened tomatoes together with their density information. Verification experiments clarified that the number of tomatoes could be estimated more accurately than with simple tomato detection, and that the plant states could also be evaluated correctly. Compared to the conventional method, the method used in this study improved tomato detection accuracy by 23%.

1. Introduction

Aging farmers and workforce crises have been affecting the agriculture industry in Japan. Japan’s Ministry of Agriculture, Forestry, and Fisheries (MAFF) released data showing a 17% decrease in the number of agricultural workers over five years from 2019 to 2023. The average age of farmers employed in Japan is over 65, and the percentage of young farmers is decreasing [1]. To address this issue, MAFF is conducting research on smart agriculture that utilizes robots and information technology. For example, cameras and other network-connected sensors can be used to automatically monitor crop growth, which helps forecast future growth and improve growth strategies. Furthermore, this visual representation of the measured data may also facilitate information sharing among the operators. In addition, the ability to predict crop growth without requiring years of experience is helpful for new farmers. Another growing trend is the development and deployment of agricultural robots to resolve labor shortages, for example, in greenhouse tomato farming. In line with this trend, image measurement has also been tested for tomato detection using these robots.
Camera-based crop growth monitoring and automated harvesting using robot technologies are limited by the performance of image recognition. For example, current image processing algorithms for detecting tomatoes struggle to accurately detect all tomatoes due to challenges such as tomato overlapping and occlusion by leaves. Moreover, conventional image processing detects only individual tomatoes and does not consider the attributes of multiple tomatoes, such as their color, location, and distribution. In other words, it is impossible to monitor the conditions of an entire tomato plant or greenhouse. If we can recognize the characteristics of the entire image and factor in the relationships between the multiple detected tomatoes, we can expect significant advances in growth monitoring and automatic harvesting robotics technology.
Context recognition not only allows the detection of individual tomatoes but also enables recognition of the state of the entire tomato plant in a greenhouse. Further, context models also contribute to the accurate estimation of the number of tomatoes on a tomato plant. This study aimed to monitor the growth of greenhouse tomatoes by combining image recognition based on convolutional networks and context recognition based on Bayesian networks to accurately detect multiple tomatoes and recognize the state of the entire tomato plant.
The proposed system was expected to, first, estimate the total number of tomatoes in an image with high accuracy. The total number of tomatoes covered by other tomatoes and leaves was predicted by the context model, which was also adjusted for differences between the number identified from the image and the actual number. Second, the system was expected to recognize the status of individual tomato plants based on the number and color of tomatoes, which enables the operator to monitor the growth of the tomato plants and predict future harvest times. Visualization of the conditions of tomato plants in a greenhouse allows the operator to efficiently and easily identify the locations of harvestable tomato plants and can be used for estimating yields.
The remainder of this paper is organized as follows: Relevant research is introduced in Section 2, and the system configuration is discussed in detail in Section 3. The experiments employing this system are presented in Section 4, and a summary and outlook are given in Section 5.

2. Related Works

In recent years, automatic harvesting robots have attracted attention as a method for reducing the time and labor required to harvest plants. Various automatic harvesting robots have been proposed and developed for tomato cultivation [2,3,4,5]. For example, studies have been conducted to identify fully ripened tomatoes for automated harvesting [6,7,8,9]. These studies proposed methods that recognize fully ripened tomatoes while adapting to ambient light conditions and varied backgrounds, and that also recognize the condition of tomatoes before they are fully ripened; this matters because farmers may have to adjust their harvest time based on the distance and duration of distribution. Studies on tomato ripeness classification employing real-time object detection algorithms have also been conducted [10,11,12]. A convolutional neural network (CNN)-based system was created to increase the precision and speed of tomato maturity analysis with a limited quantity of training data.
Current research on image recognition has focused on the early detection and prevention of diseases caused by insect pests and nutrient deficiencies during plant growth [13], because of their significant impact on plant yield. In [14], deep learning was used to detect diseases and insect pests from images of tomato leaves. Other studies have identified nutritional stress conditions and predicted nutrient deficiencies, which weaken plant defense mechanisms against pathogens, increase the risk of infection, and can lead to epidemics. A study by [15] used a CNN to predict nutrient deficiencies from images of tomatoes and their leaves. However, previous studies were only able to detect and evaluate individual tomatoes and leaves and were unable to evaluate the overall condition of tomato plants. In other words, they could not assess the context of an abnormality, whether in a single tomato or across the entire tomato plant.
Bayesian networks, which can construct models based on accumulated data and subject-matter expertise, have been used for research in the agricultural field. Based on such models, statistical predictions can be made from observed information. For example, Bayesian networks have been used to predict appropriate climatic conditions for greenhouse cultivation and to evaluate the nutritional value of water and nutrient recycling in hydroponic systems. In addition, crop growth has been predicted using dynamic Bayesian networks [16]. In that study, lettuce growth was predicted under multiple environmental conditions, and accurate predictions several days ahead were shown to be possible. This Bayesian network technique can be applied to context recognition of tomato plants. Studies on image recognition combining Bayesian networks and convolutional networks have also been conducted, for example to separate gangue from raw coal, and they have shown better and steadier identification performance [17].
In this study, we demonstrated a combination of convolutional networks and Bayesian networks for monitoring the growth of tomato plants. The Bayesian network is versatile because it allows new elements and factors necessary for estimation to be added after the model is built. In addition, even if data are missing when conducting estimations with the model, the estimation can still be performed according to prior probabilities. These features may be applied in the future not only to growth monitoring but also to estimating diseases in tomato plants.

3. System Components

The proposed system consists of three parts; Figure 1 shows the system configuration. The first part is the tomato detection section, which detects tomatoes in images captured by a web camera mounted on a mobile robot and collects information such as the position and color of the detected tomatoes. The second part is the context recognition section, which uses a Bayesian network; the information obtained from the tomato detection section is used for context recognition of the entire image. The third part is the automatic monitoring robot for tomato growth, which travels through the greenhouse autonomously, stopping in front of individual tomato plants to photograph the upper, middle, and lower portions of each plant.

3.1. Object Detection

You Only Look At CoefficienTs (YOLACT), a fast instance segmentation model, is used for tomato detection. YOLACT runs in real time while maintaining recognition accuracy, which is difficult to achieve with two-stage models such as Mask R-CNN and FCIS [18]. The YOLACT structure is shown in Figure 2. YOLACT decomposes the task into two parallel branches, “Prediction of Prototype Masks” and “Prediction of Mask Coefficients”, whose outputs are linearly combined and passed through a sigmoid function to generate the instance masks:
M = σ(PC^T), (1)
where P is the h × w × k matrix of prototype masks, and C is the n × k matrix of mask coefficients that survive Non-Maximal Suppression (NMS) and score thresholding; h, w, n, and k denote the height, width, number of instances, and number of prototypes, respectively. Instance masks of this kind are difficult to obtain with a faster detector such as YOLO alone.
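As an illustration of Equation (1), the following NumPy sketch assembles instance masks from prototypes and coefficients; the array shapes follow the definitions above, while the function name and toy sizes are our own, not part of YOLACT's published code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def assemble_masks(P, C):
    # P: (h, w, k) prototype masks; C: (n, k) mask coefficients that
    # survived NMS and score thresholding. Returns (h, w, n) masks.
    # Equation (1): M = sigma(P C^T), a linear combination of the
    # prototypes squashed to [0, 1] by the sigmoid.
    return sigmoid(P @ C.T)

# Toy example: k = 4 prototypes of size 8 x 8, n = 2 instances.
P = np.random.randn(8, 8, 4)
C = np.random.randn(2, 4)
M = assemble_masks(P, C)  # shape (8, 8, 2)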
The detection of tomatoes in clusters often fails because of overlapping fruit, and it is difficult to determine the number and maturity level of tomatoes hidden behind other tomatoes. Therefore, we also evaluated tomato clusters. For this, we used DBSCAN, a density-based clustering algorithm that does not require the number of clusters to be specified in advance and can easily identify outliers. As shown in Figure 3, data points are classified into three types: “core points”, “reachable points”, and “noise points”; the noise points were then grouped together into a single cluster. In DBSCAN, a point is considered a core point if at least minpoints points are contained in a circle of radius eps centered on that point. Figure 3 shows the case where minpoints = 3. Points within a core point's circle that are not themselves core points are reachable points, and all remaining points are noise points. In this study, eps = 75 and minpoints = 2 were determined through trial and error.
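A minimal sketch of this clustering step using scikit-learn's DBSCAN with the parameters reported above; the tomato center coordinates are invented for illustration.

import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical pixel centers of detected tomatoes in one image.
centers = np.array([[120, 340], [150, 360], [135, 400],
                    [620, 210], [650, 230], [900, 700]])

# eps = 75 and min_samples (minpoints) = 2, as determined in this study.
labels = DBSCAN(eps=75, min_samples=2).fit_predict(centers)

# scikit-learn marks noise points with the label -1.
for cluster_id in sorted(set(labels)):
    name = "noise" if cluster_id == -1 else f"cluster {cluster_id}"
    print(name, (labels == cluster_id).sum(), "tomato(es)")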

3.2. Context Recognition

The number of detected tomatoes was corrected, and the state of the tomato plants was estimated, based on this count along with contextual factors such as tomato color and the number of clusters. In this study, a Bayesian network, a graphical model that describes causal relationships using probabilities, was used for context recognition. Complex cause-and-effect interactions are represented by a directed acyclic graph, and conditional probabilities describe the links between individual constituents [19].
A dependency between random variables in a Bayesian network can be described as X_i → Y_i, where X_i is the parent node and Y_i is the child node, as shown in Figure 4. In this situation, the parent node is the cause of an event, and the child node is the result. The probability of an event Y_i occurring conditionally upon the occurrence of an event X_i is known as the conditional probability and is denoted by P(Y_i | X_i). In this study, Bayes' theorem is used to obtain the probability of the event X_i occurring under the condition that the event Y_i has occurred, P(X_i | Y_i). This theorem allows us to infer the cause of an event from its outcome.
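Written out, Bayes' theorem gives
P(X_i | Y_i) = P(Y_i | X_i) P(X_i) / P(Y_i),
where the denominator P(Y_i) = Σ_x P(Y_i | X_i = x) P(X_i = x) sums over all states of the parent node.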
The model constructed in this study is shown in Figure 4. The parent nodes of the model were set to “Difference in number of tomatoes” and “Plant condition”. The child nodes were set to “Number of tomatoes in a cluster” and “Recognition result”; the cluster-count nodes were named “Cluster1” and “Cluster2”, and the recognition result was divided into “Green”, “Half-ripened”, and “Fully ripened”. The state variables of each node were adjusted through trial and error based on the experimental environment.
The difference between the actual and YOLACT-detected numbers of tomatoes is represented by the “Difference in number of tomatoes” node, which has five state variables: “0”, “1”, “2”, “3”, and “>=4”. To estimate the actual number of tomatoes, the difference judged most probable by the Bayesian network is added to the number of tomatoes detected by YOLACT. The status of the tomato plants comprises five state variables, as shown in Figure 5a (Plant condition): “0: Before ripening”, “1: Ripe in 1–3 weeks”, “2: Ripe within one week”, “3: Fully ripened”, and “4: No detection”. In “State 0: Before ripening”, all tomatoes are green. “State 1: Ripe in 1–3 weeks” indicates that some green tomatoes in a cluster have started to turn red. “State 2: Ripe within one week” indicates that most of the tomatoes in the image are half-ripened. An average farmer typically harvests in states 1 or 2, depending on the distribution distance [20]. “State 3: Fully ripened” indicates that most tomatoes are fully ripened, and “State 4: No detection” indicates no tomatoes. The number of tomatoes in a cluster had a significant impact on missed detections: Figure 5b shows an example in which tomatoes overlap within a cluster; the cluster consists of four tomatoes, but one cannot be identified because of the overlap. In this experimental environment, there are only two clusters in one image; therefore, there are two cluster-count nodes. The number of tomatoes in the first cluster has five state variables (“0”, “2”, “3”, “4”, “>=5”), and the number in the second cluster has three (“0”, “2”, “>=3”). For the numbers of Green, Half-ripened, and Fully ripened tomatoes detected, Green has six state variables (“0”, “1”, “2”, “3”, “4”, “>=5”), while Half-ripened and Fully ripened each have five (“0”, “1”, “2”, “3”, “>=4”).
The conditional probability tables (CPTs) for each node of the Bayesian network were generated by learning from actual data. With this model, the precise number of tomatoes and the condition of the tomato plants can be predicted from the detection of individual tomatoes. Bayesian networks also provide significant model scalability: in this study, five child nodes were used, but various other nodes could be added, such as “date”, “location of tomato plants”, “humidity”, and “temperature”.
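As an illustration of how such a model can be built and queried, here is a sketch using the pgmpy library; the node names, CSV file, and evidence values are hypothetical stand-ins for the nodes in Figure 4, and the paper does not state which software was actually used.

import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Two parent nodes, five child nodes, an edge from every parent to every child.
parents = ["Difference", "Condition"]
children = ["Cluster1", "Cluster2", "Green", "HalfRipened", "FullyRipened"]
model = BayesianNetwork([(p, c) for p in parents for c in children])

# One row per annotated image, columns holding the discretized state values.
data = pd.read_csv("bn_training_data.csv")  # hypothetical file
model.fit(data, estimator=MaximumLikelihoodEstimator)  # learns the CPTs

# Correct a count: find the most probable "difference" given the observation,
# then add it to the number of tomatoes YOLACT detected (here, 3).
infer = VariableElimination(model)
evidence = {"Cluster1": 3, "Cluster2": 0, "Green": 2,
            "HalfRipened": 1, "FullyRipened": 0}
diff = infer.map_query(variables=["Difference"], evidence=evidence)["Difference"]
estimated_total = 3 + diff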

3.3. Autonomous Growth Monitoring Robot

The system components of the mobile robot are shown in Figure 6. The robot comprises a web camera, a servo motor, drive wheels, passive casters, a mobile battery, a camera module, a control microcontroller, and a laptop. It has two wheels on each side and an additional caster, and it moves around the greenhouse using a line-tracing method.
Object detection and data collection for the tomato plants were performed using a web camera (C922n Pro) with default settings [21]. We used the CMUcam5 Pixy2 camera module, developed through a collaboration between Carnegie Mellon University and Charmed Labs, for barcode scanning and line tracing [22]. Control signals were transmitted via Bluetooth from the laptop to the motor based on the detection information obtained from the camera module.
Figure 7 shows the robot moving in the greenhouse. The camera module detects the line drawn on the ground and calculates e, the shift of the line's x-coordinate from the center of the image. The robot's orientation is then controlled so that the x-coordinate of the line moves closer to the image center. The velocities of the left and right wheels, v_L and v_R, were calculated using Equations (2) and (3):
v_L = v + K_p e, (2)
v_R = v − K_p e, (3)
where K_p is the proportional gain and v is the wheel's straight-line velocity. The difference in velocity between the left and right wheels increases in proportion to the magnitude of e, allowing the mobile robot to correct its heading and follow the line. Barcodes were painted on the ground in front of the tomato plants; when the robot detects a barcode, it stops and captures images of the tomato plant using three cameras. To avoid duplicate detection of the same barcode, barcodes on the right and left sides of the line were detected alternately. After image capture was completed, line detection resumed.
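A minimal sketch of this proportional control law; the velocity and gain values are illustrative, not the paper's settings.

def wheel_speeds(e, v=0.2, kp=0.004):
    # e: x-offset (pixels) of the line from the image center.
    # Equations (2) and (3): the wheels speed up/slow down in proportion
    # to the offset, steering the robot back over the line.
    v_left = v + kp * e
    v_right = v - kp * e
    return v_left, v_right

# Line detected 50 px to the right of center: the left wheel speeds up
# and the right wheel slows down, turning the robot toward the line.
print(wheel_speeds(50))  # (0.4, 0.0)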

4. Experiments

To confirm the effectiveness of the proposed system, verification experiments were conducted using 18 tomato plants grown in a greenhouse (Figure 7). The tomatoes used in the present study were Solanum lycopersicum L. var. culta ‘Momotarou Peace’ (Takii & Co., Ltd., Kyoto, Japan), cultivated in solid-medium culture in a glass greenhouse (9.1 × 7.3 m; 4.5 m height) from October 2021 to June 2022 at the Faculty of Agriculture, Saga University. The nutrient medium was prepared following a balanced medium prescription using commercial fertilizer (OAT Agrio Co., Ltd., Tokyo, Japan) and supplied through irrigation tubes. Barcodes were placed at 15 cm intervals so that the mobile robot could stop every 15 cm to photograph the tomato plants. In the greenhouse, four rows containing seven to ten tomato plants each were photographed row by row. YOLACT was trained using “Laboro Tomato”, a tomato instance segmentation dataset published by Laboro.AI [23]. The dataset contains three classes of tomatoes—“Fully ripened”, “Half ripened”, and “Green”—corresponding to three levels of tomato maturity (see Figure 8). In this study, 392 images were used as training data.
In this research, the three classes of tomatoes were defined as follows—Fully ripened: red (warm-color percentage greater than 90%); Half-ripened: partially green (warm-color percentage between 30 and 89%); and Green: totally green or white (warm-color percentage between 0 and 29%). To create the learning data, we used the YOLACT recognition results for 287 tomato plant images captured in the greenhouse. The recognition results were saved in CSV format, from which only the data necessary for learning were extracted and used for training.
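One plausible realization of this warm-color measure, sketched with OpenCV; the hue and saturation thresholds are our assumption, as the paper does not specify them.

import cv2

def warm_color_percentage(bgr_crop):
    # Fraction of pixels with warm hues (reds/oranges/yellows) in a
    # cropped tomato region. OpenCV hue runs 0-179 and red wraps around,
    # so warm colors occupy both ends of the hue axis.
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    low_end = cv2.inRange(hsv, (0, 60, 60), (35, 255, 255))
    high_end = cv2.inRange(hsv, (160, 60, 60), (179, 255, 255))
    warm = cv2.countNonZero(low_end | high_end)
    return 100.0 * warm / (hsv.shape[0] * hsv.shape[1])

def ripeness_class(pct):
    # Thresholds as defined in this section (boundary handling is our choice).
    if pct > 90:
        return "Fully ripened"
    if pct >= 30:
        return "Half ripened"
    return "Green"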

4.1. Experimental Results

4.1.1. Detection of Tomatoes

Tomatoes on the 18 tomato plants were detected using YOLACT. The confusion matrix is listed in Table 1. Some half-ripened tomatoes were misidentified as Fully ripened or Green, and some objects that were not tomatoes were misidentified as Green. A total of 49 tomatoes were not identified due to fruit overlap. Figure 9 shows how the F1-score of the trained YOLACT model changes with the confidence threshold. The model achieved an overall F1-score of 84%.
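From the counts in Table 1, per-class precision, recall, and F1-score can be recomputed as follows (the paper reports only the overall F1 of 84%); treating “No detection” entries as false negatives is our assumption.

import numpy as np

# Rows: actual Fully ripened, Half ripened, Green, Others;
# columns: predicted Fully ripened, Half ripened, Green, No detection.
cm = np.array([[55,  0,  0, 15],
               [ 4, 13,  1,  6],
               [ 0,  0, 59, 28],
               [ 0,  0,  1,  0]])

for i, name in enumerate(["Fully ripened", "Half ripened", "Green"]):
    tp = cm[i, i]
    precision = tp / cm[:, i].sum()  # among everything predicted as class i
    recall = tp / cm[i, :].sum()     # among everything actually in class i
    f1 = 2 * precision * recall / (precision + recall)
    print(f"{name}: P={precision:.2f}, R={recall:.2f}, F1={f1:.2f}")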
The effects of context recognition using Bayesian networks are shown in Figures 10 and 11. These graphs compare the number of tomatoes detected by the convolutional network (YOLACT), labeled “Recognition result”; the number of detections corrected by context recognition, i.e., the output of the proposed YOLACT + Bayesian network combination, labeled “Estimation result”; and the actual number of tomatoes, labeled “Actual number”.
Figure 10 shows the number of tomatoes detected on each of the 18 tomato plants; the counts corrected by context recognition are closer to the actual numbers. Comparing the totals for all 18 plants, the recognition result (YOLACT) was 133 tomatoes, the estimation result (YOLACT + Bayesian network) was 159, and the actual number was 182; the proposed method thus accounted for 26 more tomatoes than YOLACT alone, bringing the count closer to the actual number. These results confirm that combining convolutional and Bayesian networks enables more accurate tomato counting. However, as Figure 11 shows, the difference between the actual number and the estimated value exceeded 2 for tomato plants 3, 6, 9, 11, 15, and 17, owing to the large number of tomatoes hidden within clusters. The sample size of the dataset must be increased to further improve accuracy.

4.1.2. Recognition of Tomato Plant Status

The results of the tomato plant status estimation are presented in Figure 12. The top plot shows the tomato detection results of the proposed method, and the bottom plot shows the estimated plant conditions. The horizontal axis represents the tomato plant number, and I to III represent the camera heights. The plot size is proportional to the number of tomatoes detected, and the five levels of tomato plant status are indicated by color.
The status of the tomato plants was estimated almost correctly, but an error was observed for the 17th tomato plant (Figure 13). At camera height III, only green was detected, and the status was correctly identified as “Before ripening”. Green and half-ripened tomatoes were recognized in the same cluster at camera height II, which was correctly identified as “Ripe in 1–3 weeks”. At camera height I, however, recognition was slightly inaccurate: only half-ripened and fully ripened clusters were recognized in the image, and the estimated probabilities were highest in the order “Ripe within one week”, “Fully ripened”, and “Ripe in 1–3 weeks”. This may be because the dataset, which included information on plant density and ambient lighting, had not yet been sufficiently sampled; additional samples should be added in the future. The problem of overlapping tomatoes also contributed to this result. As it is very difficult to eliminate these issues completely, we included the Bayesian network to increase recognition accuracy based on an empirical model. However, the proposed method struggles with plant conditions that deviate from the empirical model, such as localized growth anomalies. Therefore, future studies should include additional samples.

4.1.3. Visualization of Information

In the developed system, an automatic monitoring robot moves around to monitor the growth of tomatoes throughout the greenhouse. Visualizing the conditions of the tomato plants is extremely useful for planning harvest and cultivation, so the developed system includes an information visualization function to display the monitoring results.
Figure 14 shows the interface for visualizing conditions in the greenhouse. Figure 14a shows the arrangement of tomato plants in the greenhouse, with four rows of tomato plants starting from the right. Pressing the green button displays the tomato detection results for the plants in that row (Figure 14b), and pressing the yellow button visualizes the cluster information along with the detection results (Figure 14c). The blue “Clear” button hides the detection results, and the “Cancel” button closes the greenhouse window.
The horizontal axis of the graph in Figure 14b shows the tomato plant number, and I–III represent the heights of the cameras used to capture the images. Information such as the color, location, and number of tomatoes can be visualized in an intuitive and easy-to-understand manner. As in Figure 12, the plot size is proportional to the number of tomatoes. In this example, the upper part of the tomato plants (camera position III) contained no tomatoes, whereas the lower part (camera position I) contained many half-ripened and fully ripened tomatoes; it can be inferred that the tomatoes at the top had been harvested and that those at the bottom were either ready to harvest or still growing green.
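A minimal matplotlib sketch of this kind of display, with marker area proportional to the tomato count and color encoding the plant status; the records below are invented for illustration.

import matplotlib.pyplot as plt

# (plant index, camera height 1=I..3=III, tomato count, status) tuples.
records = [(1, 1, 4, "Fully ripened"), (1, 2, 2, "Ripe within one week"),
           (2, 1, 5, "Fully ripened"), (2, 2, 3, "Ripe in 1-3 weeks"),
           (3, 2, 1, "Before ripening")]
colors = {"Before ripening": "green", "Ripe in 1-3 weeks": "yellowgreen",
          "Ripe within one week": "orange", "Fully ripened": "red"}

for plant, height, count, status in records:
    plt.scatter(plant, height, s=80 * count, c=colors[status])

plt.xticks([1, 2, 3])
plt.yticks([1, 2, 3], ["I", "II", "III"])
plt.xlabel("Tomato plant")
plt.ylabel("Camera height")
plt.show()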
Figure 14c provides a detailed visualization of the number of clusters and the color and number of tomatoes within each cluster; the black circles represent the clusters. For example, at camera position I on tomato plant 2, one cluster contained two Green tomatoes, one Green tomato was outside any cluster, and two clusters were Fully ripened. Because tomato plant 2 had three clusters, the actual number of tomatoes was expected to be slightly higher than the estimate owing to overlapping and hidden fruit. This visualization of greenhouse conditions reduces the labor required for observations in the field.

5. Conclusions

This project aimed to monitor tomato growth in greenhouses by identifying the condition of entire plants and detecting multiple tomatoes with high accuracy, using a combination of convolutional networks and Bayesian networks for context recognition in tomato detection. The proposed algorithm was incorporated into an automatic robot designed to monitor tomato growth. Experiments conducted in a greenhouse to verify the effectiveness of the proposed system showed that context recognition could detect tomatoes with high accuracy and successfully recognize the state of the tomato plants. An interface for visualizing the monitoring results was also constructed to provide feedback to the operator. Our proposed model, which combines convolutional and Bayesian networks, increased the accuracy by 23% compared to the convolutional model alone. We were unable to compare our results with conventional methodologies other than the convolutional network, and we acknowledge this limitation. We also found room for further improvement, given the insufficient sample size of the datasets on which the model was trained. In the future, we will increase the sample size and improve the model to increase the accuracy of estimating the number of tomatoes and the status of tomato plants.

Author Contributions

Methodology, M.T. (Miho Takanayagi); Validation, W.L.Y., N.Y. and H.O.; Investigation, F.A.R.; Resources, M.T. (Munehiro Tanaka) and S.I.; Visualization, T.E.; Supervision, O.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Japan Society for the Promotion of Science research fellowship, JSPS KAKENHI, grant number 24K15072. https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-24K15072/ (accessed on 10 April 2024).

Data Availability Statement

The tomato data for training are available in a publicly accessible repository: https://github.com/laboroai/LaboroTomato (accessed on 14 January 2022).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Ministry of Agriculture, Forestry and Fisheries. Agricultural Statistics. Available online: https://www.maff.go.jp/j/tokei/kouhyou/noukou/ (accessed on 7 June 2024).
  2. Ling, X.; Zhao, Y.; Gong, L.; Liu, C.; Wang, T. Dual-arm cooperation and implementing for robotic harvesting tomato using binocular vision. Robot. Auton. Syst. 2019, 114, 134–143. [Google Scholar] [CrossRef]
  3. Wang, L.; Zhao, B.; Fan, J.; Hu, X.; Wei, S.; Li, Y.; Zhou, Q.; Wei, C. Development of a tomato harvesting robot used in greenhouse. Int. J. Agric. Biol. Eng. 2017, 10, 140–149. [Google Scholar]
  4. Jun, J.; Kim, J.; Seol, J.; Kim, J.; Son, H.I. Towards an Efficient Tomato Harvesting Robot: 3D Perception, Manipulation, and End-Effector. IEEE Access 2021, 9, 17631–17640. [Google Scholar] [CrossRef]
  5. Kondo, N.; Yamamoto, K.; Shimizu, H.; Yata, K.; Kurita, M.; Shiigi, T.; Monta, M.; Nishizu, T. A Machine Vision System for Tomato Cluster Harvesting Robot. Eng. Agric. Environ. Food 2009, 2, 60–65. [Google Scholar] [CrossRef]
  6. Arefi, A.; Motlagh, A.M.; Mollazade, K.; Teimourlou, R.F. Recognition and localization of ripen tomato based on machine vision. Aust. J. Crop. Sci. 2011, 5, 1144–1149. [Google Scholar]
  7. Malik, M.H.; Zhang, T.; Li, H.; Zhang, M.; Shabbir, S.; Saeed, A. Mature Tomato Fruit Detection Algorithm Based on improved HSV and Watershed Algorithm. IFAC-PapersOnLine 2018, 51, 431–436. [Google Scholar] [CrossRef]
  8. Zhao, Y.; Gong, L.; Zhou, B.; Huang, Y.; Liu, C. Detecting tomatoes in greenhouse scenes by combining AdaBoost classifier and colour analysis. Biosyst. Eng. 2016, 148, 127–137. [Google Scholar] [CrossRef]
  9. Zhao, Y.; Gong, L.; Huang, Y.; Liu, C. Robust Tomato Recognition for Robotic Harvesting Using Feature Images Fusion. Sensors 2016, 16, 173. [Google Scholar] [CrossRef] [PubMed]
  10. Liu, G.; Nouaze, J.C.; Mbouembe, P.L.T.; Kim, J.H. YOLO-Tomato: A Robust Algorithm for Tomato Detection Based on YOLOv3. Sensors 2020, 20, 2145. [Google Scholar] [CrossRef] [PubMed]
  11. Su, F.; Zhao, Y.; Wang, G.; Liu, P.; Yan, Y.; Zu, L. Tomato Maturity Classification Based on SE-YOLOv3-MobileNetV1 Network under Nature Greenhouse Environment. Agronomy 2022, 12, 1638. [Google Scholar] [CrossRef]
  12. Magalhães, S.A.; Castro, L.; Moreira, G.; Santos, F.N.; Cunha, M.; Dias, J.; Moreira, A.P. Evaluating the Single-Shot MultiBox Detector and YOLO Deep Learning Models for the Detection of Tomatoes in a Greenhouse. Sensors 2021, 21, 3569. [Google Scholar] [CrossRef] [PubMed]
  13. Ampatzidis, Y.; Bellis, L.D.; Luvisi, A. iPathology: Robotic Applications and Management of Plants and Plant Diseases. Sustainability 2017, 9, 1010. [Google Scholar] [CrossRef]
  14. Fuentes, A.; Yoon, S.; Kim, S.C.; Park, D.S. A Robust Deep-Learning-Based Detector for Real-Time Tomato Plant Diseases and Pests Recognition. Sensors 2017, 17, 2022. [Google Scholar] [CrossRef]
  15. Trung-Tin, T.; Jae-Won, C.; Thien-Tu, L.H.; Jong-Wook, K. A Comparative Study of Deep CNN in Forecasting and Classifying the Macronutrient Deficiencies on Development of Tomato Plant. Appl. Sci. 2019, 9, 1601. [Google Scholar] [CrossRef]
  16. Kocian, A.; Massa, D.; Cannazzaro, S.; Incrocci, L.; Lonardo, S.D.; Milazzo, P.; Chessa, S. Dynamic Bayesian network for crop growth prediction in greenhouses. Comput. Electron. Agric. 2020, 169, 105167. [Google Scholar] [CrossRef]
  17. Hu, F.; Zhou, M.; Yan, P.; Liang, Z.; Li, M. A Bayesian optimal convolutional neural network approach for classification of coal and gangue with multispectral imaging. Opt. Lasers Eng. 2022, 156, 107081. [Google Scholar] [CrossRef]
  18. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. Yolact: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9157–9166. [Google Scholar]
  19. Neapolitan, R.E. Learning Bayesian Networks; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 2003. [Google Scholar]
  20. Johnson, L.K.; Bloom, J.D.; Dunning, R.D.; Gunter, C.C.; Boyette, M.D.; Creamer, N.G. Farmer harvest decisions and vegetable loss in primary production. Agric. Syst. 2019, 176, 102672. [Google Scholar] [CrossRef]
  21. Logicool. C922n Pro Stream Webcam. Available online: https://www.logicool.co.jp/ja-jp/products/webcams/c922n-pro-stream-webcam.960-001262.html (accessed on 7 June 2024).
  22. Charmed Labs and Carnegie Mellon. Pixy (CMUcam5): A Fast, Easy-to-Use Vision Sensor. Kickstarter. Available online: https://www.kickstarter.com/projects/254449872/pixy-cmucam5-a-fast-easy-to-use-vision-sensor (accessed on 14 February 2022).
  23. Laboro.AI Inc. Laboro Tomato: Instance Segmentation Dataset. GitHub. Available online: https://github.com/laboroai/LaboroTomato.git (accessed on 15 January 2022).
Figure 1. System components.
Figure 2. YOLACT architecture.
Figure 3. The DBSCAN process. Black circle refers to the core points; grey circle refers to reachable points; white circle refers to noise points.
Figure 4. Constructed model.
Figure 5. Example of Bayesian network training data.
Figure 6. Autonomous mobile robot.
Figure 7. Autonomous mobile robot movement in a real environment.
Figure 8. Objects in the YOLACT dataset.
Figure 9. The F1 confidence curve of YOLACT.
Figure 10. Comparison of the recognition result (YOLACT) and the estimation result (YOLACT + Bayesian network). The estimation result gives a higher number of detected tomatoes than the recognition result.
Figure 11. The difference between the estimated and actual numbers. The estimation result is closest to the actual number.
Figure 12. Results of plant status estimation.
Figure 13. The 17th tomato plant. I to III represent the tomato status at the corresponding camera heights.
Figure 14. Visualization of results.
Table 1. Confusion matrix.

                            Predicted
Actual           Fully Ripened   Half Ripened   Green   No Detection
Fully ripened         55              0            0         15
Half ripened           4             13            1          6
Green                  0              0           59         28
Others                 0              0            1          0
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
