Feasibility of Combining Deep Learning and RGB Images Obtained by Unmanned Aerial Vehicle for Leaf Area Index Estimation in Rice
Round 1
Reviewer 1 Report
This study developed a deep learning approach to estimate LAI from RGB images taken by a UAV. The technique is novel and can achieve high accuracy. However, I think there are several limitations.
First, the significance of this study is not well elaborated. The deep learning approach does not estimate LAI directly, but requires destructive LAI ground truth. To build a representative model, a large amount of ground truth needs to be collected along with collocated UAV images, which is unrealistic. Even if we have adequate ground truth data, why not just use high spatiotemporal resolution multispectral satellite data, e.g., CubeSat or even HLS, to scale up?
Second, the experiment is not well designed. The authors only compared a super-complex deep learning approach with super-simple linear/exponential/logarithmic approaches. This cannot fully demonstrate the advantage of the deep learning approach. I suggest the authors incorporate some commonly used machine learning approaches, such as PLSR, ANN and RF, with 9 CIs + RGB + multiple textural features as inputs. In this way, we can see whether the deep learning approach offers a significant improvement over well-designed machine learning or not.
Third, the deep learning part is too much of a black box and lacks justification and interpretation. How were the complex network architecture and the various model parameters determined? How sensitive are the results to these configurations? What is the performance on the training, validation and testing datasets, respectively? What are the underlying mechanisms and potential features leading to the better performance of deep learning? How much (n=48 in this study) and what kind (8 hills (60 cm x 60 cm) in this study) of ground truth is needed? I have rich experience in destructive LAI measurements for rice. I'd say the ground truth data in this study is extremely costly.
Fourth, the comparison with PCA is unfair. PCA does not need destructive LAI as ground truth. The data collection is pretty easy. The LAI-2200 is not that expensive compared to a high-quality UAV, and by using DHP and DCP cameras (Ryu et al., 2010, 2012) the cost is even lower. It's a scalable solution. For the LAI-2200, the underestimation issue can be mitigated by using 4-ring data instead of 5-ring data (Fang et al., 2014; AFM).
Overall, I feel this study looks like playing with a sexy technology without solid science behind it. I hope the authors can think more, and more deeply.
Author Response
Dear Editorial Office
Enclosed is our manuscript entitled "Feasibility of combining deep learning and RGB images obtained by UAV for LAI estimation in rice" by Tomoaki Yamaguchi, Yukie Tanaka, Yuto Imachi, Megumi Yamashita and Keisuke Katsura. This is the first study to demonstrate that a model using a deep learning algorithm with RGB images obtained by UAV can estimate rice LAI as accurately as conventional methods, such as those using vegetation indices obtained from a multispectral camera, which shows that our model could be an alternative for estimating crop growth. We declare that all the materials and results are original, that no part has been submitted to any other journal, and that all authors agree to submission. This is the second draft, with revisions based on comments from the reviewers. The changes are described below. We believe we have provided a satisfactory response.
We are grateful to Reviewers #1 and #2 for the critical comments and valuable suggestions that have helped us to improve our paper. Based on the second comment of Reviewer #1, we performed an additional analysis of machine-learning algorithms and show the results in Table 5. Also, Figure 12 was revised to compare the machine-learning algorithms with the other estimation methods. The details, including responses to other comments, are as follows.
Reviewer 1
- First, the significance of this study is not well elaborated. The deep learning approach does not estimate LAI directly, but requires destructive LAI ground truth. To build a representative model, a large amount of ground truth needs to be collected along with collocated UAV images, which is unrealistic. Even if we have adequate ground truth data, why not just use high spatiotemporal resolution multispectral satellite data, e.g., CubeSat or even HLS, to scale up?
Response:
We think destructive sampling is necessary because accurate ground truth data is needed to develop a precise estimation model. However, as you mentioned, it is difficult to collect a large amount of ground data by direct sampling. We added a comment that ground truth could be obtained easily by using PCA correctly (L379 - 382).
One significant aspect of this study is that a visible camera mounted on a UAV enables detailed monitoring of the growth of rice. Hence, we briefly added more information about the advantages of UAVs in the introduction section (L56 - 58). We also added future perspectives on UAV image analysis for scaling up to the satellite level in the discussion section (L356 - 359).
- Second, the experiment is not well designed. The authors only compared a super-complex deep learning approach with super-simple linear/exponential/logarithmic approaches. This cannot fully demonstrate the advantage of the deep learning approach. I suggest the authors incorporate some commonly used machine learning approaches, such as PLSR, ANN and RF, with 9 CIs + RGB + multiple textural features as inputs. In this way, we can see whether the deep learning approach offers a significant improvement over well-designed machine learning or not.
Response:
We really appreciate your comments. In this revised version, we newly evaluated four kinds of machine-learning algorithms (artificial neural network, partial least squares regression, random forest and support vector regression) for comparison with the DL approach. As a result, DL was superior to the other machine-learning algorithms, which demonstrated its potential. We added an explanation of model development (L186 - 193). Table 5 was revised to show the estimation accuracy of each model, and a comment was added to the results section (L259 - 269). Figure 12 was also revised, and a brief discussion comparing the estimation accuracy of the machine-learning algorithms and the other methods was added (L315, L329 - 335).
- Third, the deep learning part is too much of a black box and lacks justification and interpretation. How were the complex network architecture and the various model parameters determined? How sensitive are the results to these configurations? What is the performance on the training, validation and testing datasets, respectively? What are the underlying mechanisms and potential features leading to the better performance of deep learning? How much (n=48 in this study) and what kind (8 hills (60 cm x 60 cm) in this study) of ground truth is needed? I have rich experience in destructive LAI measurements for rice. I'd say the ground truth data in this study is extremely costly.
Response:
As you pointed out, further study is needed to determine the optimal network structure. In this study, however, ResNeXt was used to develop the estimation model from images because it is an improved version of ResNet, which brought a breakthrough in the task of image recognition. Another reason is that ResNeXt has shown high potential in agricultural research (e.g. Afonso et al., 2019; Liu et al., 2020). We added more explanation of these points (L200 - 203) and added Table 6, which lists the hyperparameters of the DL model, to improve reproducibility (L215 - 216). Moreover, we added the results for the training data, indicating that the training was conducted correctly (L273 - 275, 286 - 288).
- Fourth, the comparison with PCA is unfair. PCA does not need destructive LAI as ground truth. The data collection is pretty easy. The LAI-2200 is not that expensive compared to a high-quality UAV, and by using DHP and DCP cameras (Ryu et al., 2010, 2012) the cost is even lower. It's a scalable solution. For the LAI-2200, the underestimation issue can be mitigated by using 4-ring data instead of 5-ring data (Fang et al., 2014; AFM).
Response:
We agree with you that PCA is a powerful tool for obtaining ground-truth LAI data with low labor input. However, as Fang et al. (2014) show, some corrections may be necessary to obtain accurate data with PCA, and the correction would differ depending on environmental conditions and genotypes. Therefore, we added a discussion of this point (L379 - 382).
Reviewer 2
- I can point out two concerns about the work. First, it is not clear when the authors use DN (Digital Number) or reflectance. This is crucial because LAI is an indirect measure that can be accessed through VIs. However, the relationship is between area and physical aspects of the electromagnetic radiation after interacting with the target. In works like this, the physical measurements should be well clarified.
Response:
Thank you for your advice. Since it was unclear that the reflectance from the multispectral camera was used for the VI calculation and the DN from the RGB camera was used for the CI calculation, we added an explanation to make this clearer (L136-137, L154).
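To illustrate the distinction drawn in this response, the following is a minimal sketch of how a color index can be computed directly from the DN of an RGB image, with no conversion to reflectance. Excess Green (ExG) is used only as a widely known example; whether it is among the nine CIs in the manuscript is not stated in this record, and the function names are hypothetical.

```python
import numpy as np


def excess_green(rgb_dn):
    """Per-pixel Excess Green (ExG = 2g - r - b) from raw digital numbers.

    rgb_dn: array of shape (H, W, 3) holding the DN of the R, G and B bands.
    r, g and b are chromatic coordinates, i.e. each band divided by R + G + B.
    """
    rgb = rgb_dn.astype(np.float64)
    total = rgb.sum(axis=-1)
    total[total == 0] = 1.0  # black pixels: avoid division by zero, ExG becomes 0
    r = rgb[..., 0] / total
    g = rgb[..., 1] / total
    b = rgb[..., 2] / total
    return 2.0 * g - r - b


def plot_mean_exg(rgb_dn):
    """Average ExG over an image patch (hypothetical plot-level summary)."""
    return float(excess_green(rgb_dn).mean())


# Tiny synthetic patch: one pure-green pixel and one gray pixel.
patch = np.array([[[0, 255, 0], [100, 100, 100]]], dtype=np.uint8)
print(excess_green(patch))
print(plot_mean_exg(patch))
```

Because the bands are normalized by their sum, an index of this kind is a ratio of DN values rather than a physical reflectance quantity, which is why the distinction between DN and reflectance has to be stated explicitly.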
- Second, the paper must make clear in the discussion that this kind of measurement is only possible in crops where the target is relatively homogenous and in small areas. In sugarcane, for example, the variability is very high and it is not possible to reach the same conclusion using only a few samples. The discussion should therefore be expanded. For large areas, the use of drones is not feasible; please take this aspect into consideration as well.
Response:
As you mentioned, this study is limited to relatively homogenous conditions. However, we believe that we can develop a more robust estimation model for various agricultural conditions in the future, because DL is known to have the potential to yield a general model when trained on varied data (L351 - 354). Although this research is certainly limited to small areas, it may be possible to scale up to wider fields by conducting research at various altitudes using fixed-wing UAVs. Therefore, we added this future perspective to the discussion section (L354 - 357).
Sincerely yours,
Keisuke Katsura
International Environmental and Agricultural Sciences,
The Graduate School of Agriculture,
Tokyo University of Agriculture and Technology
3-5-8, Saiwai-cho, Fuchu, Tokyo, 183-8509, Japan
Tel: +81-42-367-5952
Email: [email protected]
Reviewer 2 Report
The article "Feasibility of combining deep learning and RGB images obtained by UAV for LAI estimation in rice" aims to find the relationship between drone measurements, using CIs and VIs, and LAI.
I can point out two concerns about the work. First, it is not clear when the authors use DN (Digital Number) or reflectance. This is crucial because LAI is an indirect measure that can be accessed through VIs. However, the relationship is between area and physical aspects of the electromagnetic radiation after interacting with the target. In works like this, the physical measurements should be well clarified.
Second, the paper must make clear in the discussion that this kind of measurement is only possible in crops where the target is relatively homogenous and in small areas. In sugarcane, for example, the variability is very high and it is not possible to reach the same conclusion using only a few samples. The discussion should therefore be expanded. For large areas, the use of drones is not feasible; please take this aspect into consideration as well.
Thanks in advance for your support,
Best regards.
Please see the annex with my comments and questions.
Comments for author File: Comments.pdf
Reviewer 3 Report
RGB images are the most common UAV data. The authors propose a deep learning method for LAI estimation using UAV RGB images over paddy fields. The manuscript first analyzes and compares the errors when different CIs and VIs are used as input parameters, then uses a fully connected network and a convolutional neural network to estimate LAI. The work is substantial, but some specific details and logical analysis are missing. There are some basic errors in the text, including but not limited to spelling, figure names, and the format of reference citations. A careful revision is necessary for possible publication.
Major concerns:
- First of all, the authors state that "LAI plays an important role in crop growth estimation and yield prediction". I would therefore like to know what accuracy of LAI estimation is acceptable in applications, and whether the improvements are meaningful.
- When establishing the deep neural network, the authors input 9 CIs and RGB images. It is recommended to use sensitivity analysis methods to analyze these 9 parameters. When inputting RGB images, did you control the data quality of the images? If so, please give a brief description.
- In my understanding, when the authors use RGB images for convolutional network processing, they actually give the network both spatial information and RGB three-band spectral information. In comparing the results of the regression methods, the authors found that the parameters obtained by multispectral cameras have more advantages than those from RGB cameras. The traditional analysis method simply uses the spectral information of each pixel to estimate the LAI, and the spectral information of the multispectral camera is more abundant, so its results usually perform better. Can the authors try applying the DL method to the multispectral camera data to see if the accuracy can be further improved?
- In the input data preparation stage, the authors used methods such as changing the brightness, flipping left and right, and flipping upside down to expand the image samples. However, given the number of parameters in the deep network, the 576 samples obtained by expanding 48 samples are still few. One question is why only 0.7 and 1.4 times the brightness are used. It is recommended to widen the brightness variation range and add image rotation to further expand the sample size.
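For reference, the augmentation described in the last comment can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes three brightness levels (0.7, 1.0 and 1.4 times the original) combined with four flip states (none, left-right, upside-down, both), which gives 12 variants per image and is one combination consistent with 48 samples becoming 576. The function name and the random placeholder images are hypothetical.

```python
import numpy as np

# Assumed brightness multipliers; 1.0 keeps the original image.
BRIGHTNESS_FACTORS = (0.7, 1.0, 1.4)


def augment(image):
    """Return 12 variants (3 brightness levels x 4 flip states) of an H x W x 3 uint8 image."""
    variants = []
    for factor in BRIGHTNESS_FACTORS:
        # Scale the DN values and clip back into the valid 8-bit range.
        scaled = np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
        variants.append(scaled)                         # no flip
        variants.append(np.fliplr(scaled))              # left-right flip
        variants.append(np.flipud(scaled))              # upside-down flip
        variants.append(np.flipud(np.fliplr(scaled)))   # both flips
    return variants


# Placeholder data standing in for the 48 plot images.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8) for _ in range(48)]
augmented = [variant for img in images for variant in augment(img)]
print(len(augmented))  # 48 images x 12 variants = 576
```

Adding rotations or further brightness levels, as the comment suggests, would amount to extending `BRIGHTNESS_FACTORS` or appending more transforms inside the loop.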
Minor comments:
- Please give a brief introduction of ResNeXt
- Please use a picture to show the whole study area, and mark the area of flight and the area where the LAI is manually measured in the picture.
- The author’s process of destructive sampling on the ground to measure the true value of LAI is not described in detail in the manuscript, and there is no corresponding data details but the results are given. I hope the author can add the data details in the attachment.
- Line 24, what’s “PCR”?
- Line 33, do you mean direct LAI measurement?
- Line 106, does the 80% overlap rate here refer to forward or lateral overlap, or both?
- Line 107, “LAI was measured at 10 points”, please show the points in figures.
- Line 148, which 12 VIs are meant?
- Lines 196 and 198, “Error! Reference source not found”. Please check for this kind of problem!
- Line 222, it should be figure 7.
- Figure 6, please stagger the marks of the polylines, they are all overlapped now and cannot be distinguished.
- Line250, figure 2?
- [1], wrong format and content.
Author Response
2, Dec, 2020
Dear Editorial Office
Enclosed is our manuscript entitled "Feasibility of combining deep learning and RGB images obtained by UAV for LAI estimation in rice" by Tomoaki Yamaguchi, Yukie Tanaka, Yuto Imachi, Megumi Yamashita and Keisuke Katsura. This is the first study to demonstrate that a model using a deep learning algorithm with RGB images obtained by UAV can estimate rice LAI as accurately as conventional methods, such as those using vegetation indices obtained from a multispectral camera, which shows that our model could be an alternative for estimating crop growth. We declare that all the materials and results are original, that no part has been submitted to any other journal, and that all authors agree to submission. This is the second draft, with revisions based on comments from the reviewers. The changes are described below. We believe we have provided a satisfactory response.
We are grateful to the three reviewers for the critical comments and valuable suggestions that have helped us to improve our paper. In particular, based on the second comment of Reviewer #1, we performed an additional analysis of machine-learning algorithms and show the results in Table 5. Also, Figure 12 was revised to compare the machine-learning algorithms with the other estimation methods. The details, including responses to other comments, are as follows.
Reviewer 1
- First, the significance of this study is not well elaborated. The deep learning approach does not estimate LAI directly, but requires destructive LAI ground truth. To build a representative model, a large amount of ground truth needs to be collected along with collocated UAV images, which is unrealistic. Even if we have adequate ground truth data, why not just use high spatiotemporal resolution multispectral satellite data, e.g., CubeSat or even HLS, to scale up?
Response:
We think destructive sampling is necessary because accurate ground truth data is needed to develop a precise estimation model. However, as you mentioned, it is difficult to collect a large amount of ground data by direct sampling. We added a comment that ground truth could be obtained easily by using PCA correctly (L396-400).
One significant aspect of this study is that a visible camera mounted on a UAV enables detailed monitoring of the growth of rice. Hence, we briefly added more information about the advantages of UAVs in the introduction section (L59-62). We also added a comment about resolution and future perspectives on UAV image analysis for scaling up to wider areas in the discussion and conclusion sections (L370-372, 375-376, 407-410).
- Second, the experiment is not well designed. The authors only compared a super-complex deep learning approach with super-simple linear/exponential/logarithmic approaches. This cannot fully demonstrate the advantage of the deep learning approach. I suggest the authors incorporate some commonly used machine learning approaches, such as PLSR, ANN and RF, with 9 CIs + RGB + multiple textural features as inputs. In this way, we can see whether the deep learning approach offers a significant improvement over well-designed machine learning or not.
Response:
We really appreciate your comments. In this revised version, we newly evaluated four kinds of machine-learning algorithms (artificial neural network, partial least squares regression, random forest and support vector regression) for comparison with the DL approach. As a result, DL was superior to the other machine-learning algorithms, which demonstrated its potential. We added an explanation of model development (L197-204). Table 5 was revised to show the estimation accuracy of each model, and a comment was added to the results section (L271 - 280). Figure 12 was also revised, and a brief discussion comparing the estimation accuracy of the machine-learning algorithms and the other methods was added (L327, L341-347).
- Third, the deep learning part is too much of a black box and lacks justification and interpretation. How were the complex network architecture and the various model parameters determined? How sensitive are the results to these configurations? What is the performance on the training, validation and testing datasets, respectively? What are the underlying mechanisms and potential features leading to the better performance of deep learning? How much (n=48 in this study) and what kind (8 hills (60 cm x 60 cm) in this study) of ground truth is needed? I have rich experience in destructive LAI measurements for rice. I'd say the ground truth data in this study is extremely costly.
Response:
As you pointed out, further study is needed to determine the optimal network structure. In this study, however, ResNeXt was used to develop the estimation model from images because it is an improved version of ResNet, which brought a breakthrough in the task of image recognition. Another reason is that ResNeXt has shown high potential in agricultural research (e.g. Afonso et al., 2019; Liu et al., 2020). We added more explanation of these points (L213 - 217) and added Table 6, which lists the hyperparameters of the DL model, to improve reproducibility (L226-227). Moreover, we added the results for the training data, indicating that the training data was fitted (L285-287, 298-300).
- Fourth, the comparison with PCA is unfair. PCA does not need destructive LAI as ground truth. The data collection is pretty easy. The LAI-2200 is not that expensive compared to a high-quality UAV, and by using DHP and DCP cameras (Ryu et al., 2010, 2012) the cost is even lower. It's a scalable solution. For the LAI-2200, the underestimation issue can be mitigated by using 4-ring data instead of 5-ring data (Fang et al., 2014; AFM).
Response:
We agree with you that PCA is a powerful tool for obtaining ground-truth LAI data with low labor input. However, as Fang et al. (2014) show, some corrections may be necessary to obtain accurate data with PCA, and the correction would differ depending on environmental conditions and genotypes. Therefore, we added a discussion of this point (L396-400).
Reviewer 2
- I can point out two concerns about the work. First, it is not clear when the authors use DN (Digital Number) or reflectance. This is crucial because LAI is an indirect measure that can be accessed through VIs. However, the relationship is between area and physical aspects of the electromagnetic radiation after interacting with the target. In works like this, the physical measurements should be well clarified.
Response:
Thank you for your advice. Since it was unclear that the reflectance from the multispectral camera was used for the VI calculation and the DN from the RGB camera was used for the CI calculation, we added an explanation to make this clearer (L147-148, 165).
- Second, the paper must make clear in the discussion that this kind of measurement is only possible in crops where the target is relatively homogenous and in small areas. In sugarcane, for example, the variability is very high and it is not possible to reach the same conclusion using only a few samples. The discussion should therefore be expanded. For large areas, the use of drones is not feasible; please take this aspect into consideration as well.
Response:
As you mentioned, this study is limited to relatively homogenous conditions. However, we believe that we can develop a more robust estimation model for various agricultural conditions in the future, because DL is known to have the potential to yield a general model when trained on varied data (L365-368). Although this research is certainly limited to small areas, it may be possible to scale up to wider fields by conducting research at various altitudes using fixed-wing UAVs. Therefore, we added this future perspective to the discussion section (L370-372, 375-376) and revised the conclusions (L407-410).
- L311:
This result was consistent with the previous studies by Maruyama et al. [54] and Fang et al. [55], which reported that PCA underestimates the LAI measurements of rice canopy.
-> Here, the authors should explain in which part of the growing season the measurements were done. Was it the same as in this study?
Response:
We appreciate your comments. Both of these studies monitored rice LAI throughout the growth stage (from transplanting until maturity). We added the phrase "throughout the growth stage" at the end of the sentence (L388).
Reviewer 3
Major concerns:
- First of all, the authors state that "LAI plays an important role in crop growth estimation and yield prediction". I would therefore like to know what accuracy of LAI estimation is acceptable in applications, and whether the improvements are meaningful.
Response:
We think this is a very important point. However, we think the acceptable accuracy depends on the purpose for which the data are used. Therefore, in this study, we tried to establish a method that measures LAI as accurately as possible using RGB image data. We ask for your kind understanding. To emphasize the importance of accurate LAI estimation, we added comments to the introduction (L33-36).
- When establishing the deep neural network, the authors input 9 CIs and RGB images. It is recommended to use sensitivity analysis methods to analyze these 9 parameters. When inputting RGB images, did you control the data quality of the images? If so, please give a breif description.
Response:
Certainly, it would be valuable to use sensitivity analysis to find out which parameters contribute most to the estimation accuracy. However, the purpose of this paper was to show, by comparing various estimation methods, that sufficient explanatory accuracy can be achieved with RGB image data alone. We therefore judged the sensitivity test to be somewhat outside the scope of the paper and omitted it. We would appreciate your understanding. Regarding the quality of the image data, we did not control it. However, we think image resolution is a very important point when applying the estimation model to a wider area. We added comments (L370-372, 375-376) and revised the conclusions (L407-410).
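For readers unfamiliar with the kind of sensitivity analysis the reviewer suggests, the following is a minimal sketch of one common variant, permutation importance. It is not the authors' method and uses synthetic data only: the 48 x 9 matrix stands in for 9 CIs over 48 plots, a least-squares linear model stands in for the trained network, and the names `permutation_importance`, `predict`, `X` and `y` are hypothetical.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in RMSE when each input column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base_rmse = np.sqrt(np.mean((predict(X) - y) ** 2))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between column j and the target, keep the rest intact.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            rmse = np.sqrt(np.mean((predict(X_perm) - y) ** 2))
            increases.append(rmse - base_rmse)
        importances[j] = np.mean(increases)
    return importances


# Synthetic stand-in data (assumption): 48 plots, 9 input indices.
rng = np.random.default_rng(1)
X = rng.normal(size=(48, 9))
y = 3.0 * X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=48)

# Stand-in model (assumption): ordinary least squares with an intercept.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(M):
    return np.column_stack([M, np.ones(len(M))]) @ coef


scores = permutation_importance(predict, X, y)
ranking = np.argsort(scores)[::-1]  # most influential input first
print(ranking)
```

Because the procedure only needs a `predict` function, the same loop could in principle be applied to a trained network, although with only 48 plots the resulting ranking would be noisy.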
- As I understand it, when the authors use RGB images for convolutional network processing, they actually give the network both spatial information and RGB three-band spectral information. In comparing the results of the regression methods, the authors found that the parameters obtained by multispectral cameras have more advantages than those from RGB cameras. The traditional analysis method simply uses the spectral information of each pixel to estimate the LAI, and the spectral information of the multispectral camera is more abundant, so its results usually perform better. Can the authors try applying the DL method to the multispectral camera data to see if the accuracy can be further improved?
Response:
Thank you for your valuable feedback. Using multispectral images for deep learning could further improve the accuracy of LAI prediction. However, the purpose of this paper is to show clearly that RGB images, which are easily accessible to everyone, can achieve the same level of accuracy in predicting LAI as the conventional method using a multispectral camera. For the sake of clarity, we would like to leave some of the points you raised out of this paper. We do, however, think your point is an important one, and we have included it in L372-374 as future work.
- In the input data preparation stage, the authors used methods such as changing the brightness and flipping left-right and upside down to expand the image samples. However, relative to the number of parameters of the deep network, the 576 samples obtained by expanding 48 samples are still few. One question is why only 0.7 and 1.4 times the brightness were used. It is recommended to widen the brightness variation range and add image rotation to further expand the dataset.
Response:
Thank you for your valuable feedback. We considered increasing the number of images, but the number was sufficient to fulfill the purpose of this study, so we decided not to widen the brightness variation range. However, increasing the number of images by changing brightness, flipping, etc. could further increase the accuracy, depending on the situation. We added comments in L368-370.
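To make the 48 to 576 expansion discussed above concrete, here is a minimal sketch of one augmentation scheme that yields exactly 12 variants per image: three brightness levels (1.0, 0.7, 1.4) times four flip states (none, left-right, upside-down, both). The 3 x 4 decomposition is an assumption chosen to match the reported total, not a confirmed description of the authors' pipeline, and the random 8 x 8 arrays are placeholders for real plot images.

```python
import numpy as np

# Original brightness plus the two factors named in the reviewer's comment.
BRIGHTNESS_FACTORS = (1.0, 0.7, 1.4)


def augment(image):
    """Return 12 variants of an H x W x 3 uint8 image (3 brightness x 4 flips)."""
    variants = []
    for factor in BRIGHTNESS_FACTORS:
        # Scale in float and clip so bright pixels saturate instead of wrapping.
        scaled = np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
        variants.append(scaled)              # no flip
        variants.append(scaled[:, ::-1])     # left-right flip
        variants.append(scaled[::-1, :])     # upside-down flip
        variants.append(scaled[::-1, ::-1])  # both flips
    return variants


# Placeholder data (assumption): 48 small random RGB images.
rng = np.random.default_rng(0)
plots = [rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8) for _ in range(48)]

augmented = [variant for img in plots for variant in augment(img)]
print(len(augmented))  # 576
```

Under this scheme, adding more brightness factors or 90-degree rotations, as the reviewer recommends, would multiply the count further; every variant of a plot would share that plot's measured LAI as its label.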
Minor comments:
- Please give a brief introduction of ResNeXt
We added an explanation (L214-217).
- Please use a picture to show the whole study area, and mark the area of flight and the area where the LAI is manually measured in the picture.
Other experimental materials are grown in the same field as this experiment, and they are photographed together by the UAV. Because of conflicts with these other experiments, an image of the whole field is not shown here. The flight altitude of the UAV and the overlap rate of the images are described in the manuscript, so we believe this is sufficient.
- The authors' process of destructive sampling on the ground to measure the true value of LAI is not described in detail in the manuscript, and there are no corresponding data details although the results are given. I hope the authors can add the data details in the attachment.
The procedure for destructive LAI sampling is described in L118-121, and the LAI data are shown in Figure 7 (L235-238), although only the average values for each treatment are shown. Could you please tell us what further data are required?
- Line 24, what’s “PCR”?
Thank you for your comment. "PCR" was a typo for "PCA". We revised the manuscript (L24 and L407).
- Line 33, do you mean direct LAI measurement?
Thank you for your comment. Yes, we mean direct LAI measurement. We revised the manuscript (L36).
- Line 106, does the 80% overlap rate here refer to forward or lateral overlap, or both?
Thank you for your comment. The overlap value applies to both forward and lateral overlap. We revised the manuscript (L112-113).
- Line 107, "LAI was measured at 10 points": please show the points in figures.
Thank you for your comment. We revised the manuscript and added Figure 1 (L115-116, 124-130).
- Line 148, which 12 VIs are meant?
Please see the text in L161-164 and the caption of Table 2 (L168-169). Three patterns of reflectance combination were substituted into the equations of four kinds of VIs, so in total, 12 VIs were calculated.
- Lines 196 and 198, "Error! Reference source not found". Please check for this kind of problem!
Thank you for your comment. We have checked the reference list again.
- Line 222, it should be figure 7.
Thank you for your comment. We revised all in the manuscript.
- Figure 6, please stagger the marks of the polylines, they are all overlapped now and cannot be distinguished.
Thank you for your comment. We revised Figure 7 so that you can distinguish markers more clearly.
- Line 250, figure 2?
Thank you for your comment. We have checked the manuscript.
- [1], wrong format and content.
Thank you for your comment. We revised the reference (L424-426).
Sincerely yours,
Keisuke Katsura
International Environmental and Agricultural Sciences,
The Graduate School of Agriculture,
Tokyo University of Agriculture and Technology
3-5-8, Saiwai-cho, Fuchu, Tokyo, 183-8509, Japan
Tel: +81-42-367-5952
Email: [email protected]
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The manuscript has been improved and can be published in Remote Sensing.
Author Response
We are grateful to the three reviewers for their critical comments and valuable suggestions, which have helped us to improve our paper. For Reviewers 1 and 2, we think there are no further points that need to be revised. For Reviewer 3, we think almost all the comments are the same as in the previous round except for two minor comments. Therefore, we resend the same response as last time. New points are shown in red, and new responses have been added (see attached file).
Reviewer 2 Report
I have no more comments.
Author Response
We are grateful to the three reviewers for their critical comments and valuable suggestions, which have helped us to improve our paper. For Reviewers 1 and 2, we think there are no further points that need to be revised. For Reviewer 3, we think almost all the comments are the same as in the previous round except for two minor comments. Therefore, we resend the same response as last time. New points are shown in red, and new responses have been added (see attached file).
Author Response File: Author Response.docx
Reviewer 3 Report
RGB images are the most common UAV data. The authors propose a deep learning method for LAI estimation using UAV RGB images over paddy fields. The manuscript first analyzes and compares the errors obtained when the input parameters are different CIs and VIs, then uses a fully connected network and a convolutional neural network to estimate LAI. Finally, the authors compare the accuracy of six different methods and explain the results in detail. The work is substantial, but some specific details and logical analysis are absent. There are still some errors in the reference citations; a careful revision is necessary for possible publication.
Major concerns:
- First of all, the authors think "LAI plays an important role in crop growth estimation and yield prediction". I would therefore like to know what accuracy of LAI estimation is acceptable in applications, and whether the improvements are meaningful.
- When establishing the deep neural network, the authors input 9 CIs and RGB images. It is recommended to use sensitivity analysis methods to analyze these 9 parameters. When inputting RGB images, did you control the data quality of the images? If so, please give a brief description.
- As I understand it, when the authors use RGB images for convolutional network processing, they actually give the network both spatial information and RGB three-band spectral information. In comparing the results of the regression methods, the authors found that the parameters obtained by multispectral cameras have more advantages than those from RGB cameras. The traditional analysis method simply uses the spectral information of each pixel to estimate the LAI, and the spectral information of the multispectral camera is more abundant, so its results usually perform better. Can the authors try applying the DL method to the multispectral camera data to see if the accuracy can be further improved?
- In the input data preparation stage, the authors used methods such as changing the brightness and flipping left-right and upside down to expand the image samples. However, relative to the number of parameters of the deep network, the 576 samples obtained by expanding 48 samples are still few. One question is why only 0.7 and 1.4 times the brightness were used. It is recommended to widen the brightness variation range and add image rotation to further expand the dataset.
Minor comments:
- Please give a brief introduction of ResNeXt
- Please use a picture to show the whole study area, and mark the area of flight and the area where the LAI is manually measured in the picture.
- The author’s process of destructive sampling on the ground to measure the true value of LAI is not described in detail in the manuscript, and there are no corresponding data details but the results are given. I hope the author can add the data details in the attachment.
- Line 33, do you mean direct LAI measurement?
- Line 106, does the 80% overlap rate here refer to forward or lateral overlap, or both?
- Line 107, "LAI was measured at 10 points": please show the points in figures.
- Figure 6, please stagger the marks of the polylines, they are all overlapped now and cannot be distinguished.
- [1], wrong format and content.
- [14], please do not cite articles that have not been published.
- [30], [48], wrong format and content.
Author Response
We are grateful to the three reviewers for their critical comments and valuable suggestions, which have helped us to improve our paper. For Reviewers 1 and 2, we think there are no further points that need to be revised. For Reviewer 3, we think almost all the comments are the same as in the previous round except for two minor comments. Therefore, we resend the same response as last time. New points are shown in red, and new responses have been added (see attached file).
Author Response File: Author Response.pdf