Optimization of Crop Recommendations Using Novel Machine Learning Techniques
Abstract
1. Introduction
- A hierarchical and partitioning approach to building clusters based on factors such as location, output, and productivity;
- A comparative analysis to identify the best method for structuring zones into clusters;
- Recommendation of areas or fields with the potential for high crop yields, based on the specified scale value.
2. Materials and Methods
2.1. Dataset Overview
2.2. Proposed Framework for Determination of Yield Trend
2.3. Algorithm for Determination of Yield Trends
Algorithm 1: Determination of Yield Trends

Input: D, a crop yield dataset; attrList, the set of numeric candidate attributes.
Output: The crop yields of a set of k territories, based on the scales defined.

    // find the correlated variables
    apply FeatureSelection(D, attrList);
    if redundantVars then
        attrList := attrList − redundantVars;
    end
    var := “area”;
    DetectionAndInspection(D, var, locations);
    // find the best k
    k := kSelection(mean(D[var]), locations);
    // construct k partitions
    pObj := pGroup(mean(D[var]), k, locations);
    var := “yield”;
    for each outcome j of the k partitions do
        let Dj be the set of observations in D satisfying outcome j;
        DetectionAndInspection(Dj, var, locations);
        // find the best number of yield zones within partition j
        kj := kSelection(mean(Dj[var]), locations);
        hObj := hGroup(mean(Dj[var]), kj, locations);
    end
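The two-stage structure of Algorithm 1 can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: `kSelection` is modelled by the average silhouette width, `pGroup` by a simple one-dimensional k-means, and `hGroup` by complete-linkage agglomerative clustering. `FeatureSelection` and `DetectionAndInspection` are omitted, and all function names and the sample records are hypothetical.

```python
from statistics import mean

def silhouette(values, labels):
    """Average silhouette width of a labelling of 1-D values (singletons score 0)."""
    if len(set(labels)) < 2:
        return 0.0
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, (w, l) in enumerate(zip(values, labels))
               if l == lab and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean([abs(v - w) for w, l in zip(values, labels) if l == other])
                for other in set(labels) if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

def kmeans_1d(values, k, iters=50):
    """Partitioning step (stand-in for pGroup): k-means with deterministic start."""
    srt = sorted(values)
    centres = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        new = [mean([v for v, l in zip(values, labels) if l == c] or [centres[c]])
               for c in range(k)]
        if new == centres:
            break
        centres = new
    return labels

def hclust_1d(values, k, linkage=max):
    """Hierarchical step (stand-in for hGroup): merge until k clusters remain."""
    clusters = [[i] for i in range(len(values))]
    def dist(a, b):
        return linkage([abs(values[i] - values[j]) for i in a for j in b])
    while len(clusters) > k:
        pairs = [(p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))]
        a, b = min(pairs, key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters.pop(b)          # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(values)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels

def best_k(values, cluster_fn, k_range=range(2, 5)):
    """Stand-in for kSelection: the k with the largest average silhouette width."""
    candidates = [k for k in k_range if k < len(values)]
    return max(candidates, key=lambda k: silhouette(values, cluster_fn(values, k)))

def yield_trends(records):
    """records: (location, area, yield) tuples. Returns {area partition: {location: zone}}."""
    locations = sorted({r[0] for r in records})
    mean_area = [mean([r[1] for r in records if r[0] == loc]) for loc in locations]
    area_labels = kmeans_1d(mean_area, best_k(mean_area, kmeans_1d))
    zones = {}
    for part in sorted(set(area_labels)):
        locs = [loc for loc, l in zip(locations, area_labels) if l == part]
        mean_yield = [mean([r[2] for r in records if r[0] == loc]) for loc in locs]
        if len(locs) < 3:                 # too few locations to sub-cluster
            zones[part] = dict.fromkeys(locs, 0)
            continue
        kj = best_k(mean_yield, hclust_1d)
        zones[part] = dict(zip(locs, hclust_1d(mean_yield, kj)))
    return zones

# Hypothetical records: four small-area and four large-area locations.
records = [("A", 10, 1.0), ("B", 12, 1.2), ("C", 15, 3.0), ("D", 11, 3.1),
           ("E", 1000, 2.0), ("F", 1100, 2.1), ("G", 950, 4.0), ("H", 1050, 4.2)]
zones = yield_trends(records)
print(zones)
```

The area stage separates small from large locations first, and the yield stage then runs independently inside each area partition, which is why each partition can end up with its own number of zones.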
- Cluster validation determines whether the obtained clustering solution is precise;
- It also helps to obtain an appropriate number of clusters for the yield dataset, judged by cluster compactness or cluster separation.
2.3.1. Calinski–Harabasz Index
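As an illustration, the standard Calinski–Harabasz index is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The sketch below assumes one-dimensional inputs (such as per-location mean yields); the function name and sample values are hypothetical and not taken from the paper.

```python
from statistics import mean

def calinski_harabasz(values, labels):
    """CH = [B / (k - 1)] / [W / (n - k)] for 1-D values.

    B is the between-cluster sum of squares, W the within-cluster sum of squares,
    n the number of observations and k the number of clusters.
    """
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    overall = mean(values)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (between / (k - 1)) / (within / (n - k))

values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]                # hypothetical sample
compact = calinski_harabasz(values, [0, 0, 0, 1, 1, 1])   # well-separated clusters
mixed = calinski_harabasz(values, [0, 1, 0, 1, 0, 1])     # interleaved clusters
print(compact, mixed)
```

A larger value indicates more compact, better separated clusters, so the well-separated labelling scores far higher than the interleaved one.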
2.3.2. Average Silhouette Width Index
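A minimal sketch of the average silhouette width and of how it might be used to compare candidate numbers of clusters. For each observation, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster; its silhouette is (b − a) / max(a, b). The one-dimensional sample, the candidate labellings, and the convention that singletons score 0 are assumptions made for illustration.

```python
from statistics import mean

def average_silhouette_width(values, labels):
    """Mean of per-observation silhouettes for 1-D values (needs >= 2 clusters)."""
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, (w, l) in enumerate(zip(values, labels))
               if l == lab and j != i]
        if not own:                      # singleton cluster: silhouette taken as 0
            scores.append(0.0)
            continue
        a = mean(own)
        b = min(mean([abs(v - w) for w, l in zip(values, labels) if l == other])
                for other in set(labels) if other != lab)
        scores.append((b - a) / max(a, b))
    return mean(scores)

values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]   # hypothetical sample
candidates = {                               # hypothetical labellings for k = 2 and k = 3
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2],
}
asw = {k: average_silhouette_width(values, labels) for k, labels in candidates.items()}
best = max(asw, key=asw.get)
print(asw, best)
```

Values close to 1 indicate that observations sit well inside their clusters; the candidate with the highest width is preferred.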
- Maximum (complete) linkage clustering measures the dissimilarity between two groups as the largest distance between any pair of observations, one from each group; Equation (5) gives it as the distance D(X, Y) between clusters X and Y.
- Minimum (single) linkage clustering measures the dissimilarity between two groups as the smallest distance between any pair of observations, one from each group; Equation (6) gives it as the distance D(X, Y) between clusters X and Y.
- Average linkage measures the dissimilarity between two groups as the average distance over all pairs of observations, one from each group; Equation (7) gives it as the mean distance between the elements of the two clusters.
- Ward’s method aims to minimize the total within-cluster variance. At each step, the pair of clusters with the minimum between-cluster distance, that is, the pair whose merger least increases the within-cluster variance, is merged; Equation (8) defines the distance between points as the squared Euclidean distance.
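The four criteria can be compared on two small clusters. The sketch below uses the standard textbook definitions, which may differ in notation from Equations (5)–(8); for Ward's method it computes the increase in the within-cluster sum of squares caused by a merge, |X||Y| / (|X| + |Y|) times the squared distance between centroids. The function names and the sample points are hypothetical.

```python
from math import dist
from statistics import mean

def pairwise(X, Y):
    """All Euclidean distances between a point of X and a point of Y."""
    return [dist(x, y) for x in X for y in Y]

def complete_linkage(X, Y):
    return max(pairwise(X, Y))

def single_linkage(X, Y):
    return min(pairwise(X, Y))

def average_linkage(X, Y):
    return mean(pairwise(X, Y))

def centroid(P):
    """Coordinate-wise mean of a list of points."""
    return tuple(mean(c) for c in zip(*P))

def ward_increase(X, Y):
    """Increase in total within-cluster sum of squares if X and Y are merged."""
    nx, ny = len(X), len(Y)
    return nx * ny / (nx + ny) * dist(centroid(X), centroid(Y)) ** 2

# Hypothetical one-dimensional clusters, written as 1-tuples.
X = [(1.0,), (2.0,)]
Y = [(4.0,), (6.0,)]
print(complete_linkage(X, Y), single_linkage(X, Y),
      average_linkage(X, Y), ward_increase(X, Y))
```

Single linkage reacts to the closest pair and complete linkage to the farthest pair, so they bracket the average-linkage value for the same two clusters.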
3. Data Analysis
3.1. Eliminating the Unwanted Observations
3.2. Removing the Impossible Data Combination
3.3. Fill in the Missing Values
3.4. Removing the Impossible Data Mixes
4. Results and Discussion
- Calinski–Harabasz Index (“CH”).
- Average Silhouette Width Index (“ASW”).
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Wolfert, S.; Ge, L.; Verdouw, C.; Bogaardt, M.-J. Big Data in Smart Farming—A Review. Agric. Syst. 2017, 153, 69–80.
- Kamilaris, A.; Kartakoullis, A.; Prenafeta-Boldú, F.X. A Review on the Practice of Big Data Analysis in Agriculture. Comput. Electron. Agric. 2017, 143, 23–37.
- Xu, X.; Gao, P.; Zhu, X.; Guo, W.; Ding, J.; Li, C.; Zhu, M.; Wu, X. Design of an Integrated Climatic Assessment Indicator (ICAI) for Wheat Production: A Case Study in Jiangsu Province, China. Ecol. Indic. 2019, 101, 943–953.
- Filippi, P.; Jones, E.J.; Wimalathunge, N.S.; Somarathna, P.D.S.N.; Pozza, L.E.; Ugbaje, S.U.; Jephcott, T.G.; Paterson, S.E.; Whelan, B.M.; Bishop, T.F.A. An Approach to Forecast Grain Crop Yield Using Multi-Layered, Multi-Farm Data Sets and Machine Learning. Precis. Agric. 2019, 20, 1015–1029.
- Van Klompenburg, T.; Kassahun, A.; Catal, C. Crop Yield Prediction Using Machine Learning: A Systematic Literature Review. Comput. Electron. Agric. 2020, 177, 105709.
- Rosenzweig, C.; Jones, J.W.; Hatfield, J.L.; Ruane, A.C.; Boote, K.J.; Thorburn, P.; Antle, J.M.; Nelson, G.C.; Porter, C.; Janssen, S.; et al. The Agricultural Model Intercomparison and Improvement Project (AgMIP): Protocols and Pilot Studies. Agric. For. Meteorol. 2013, 170, 166–182.
- Ciscar, J.-C.; Fisher-Vanden, K.; Lobell, D.B. Synthesis and Review: An Inter-Method Comparison of Climate Change Impacts on Agriculture. Environ. Res. Lett. 2018, 13, 070401.
- Lobell, D.B.; Asseng, S. Comparing Estimates of Climate Change Impacts from Process-Based and Statistical Crop Models. Environ. Res. Lett. 2017, 12, 015001.
- Schlenker, W.; Roberts, M.J. Nonlinear Temperature Effects Indicate Severe Damages to U.S. Crop Yields under Climate Change. Proc. Natl. Acad. Sci. USA 2009, 106, 15594–15598.
- Roberts, M.J.; Schlenker, W.; Eyer, J. Agronomic Weather Measures in Econometric Models of Crop Yield with Implications for Climate Change. Am. J. Agric. Econ. 2012, 95, 236–243.
- Roberts, M.J.; Braun, N.O.; Sinclair, T.R.; Lobell, D.B.; Schlenker, W. Comparing and Combining Process-Based Crop Models and Statistical Models with Some Implications for Climate Change. Environ. Res. Lett. 2017, 12, 095010.
- Urban, D.W.; Sheffield, J.; Lobell, D.B. The Impacts of Future Climate and Carbon Dioxide Changes on the Average and Variability of US Maize Yields under Two Emission Scenarios. Environ. Res. Lett. 2015, 10, 045003.
- Majumdar, J.; Naraseeyappa, S.; Ankalaki, S. Analysis of Agriculture Data Using Data Mining Techniques: Application of Big Data. J. Big Data 2017, 4, 20.
- Crop Production Statistics for Selected States, Crops and Range of Year. Available online: https://aps.dac.gov.in/APY/Public_Report1.aspx (accessed on 2 January 2021).
- Gandhi, N.; Armstrong, L.J.; Petkar, O. Proposed decision support system (DSS) for Indian rice crop yield prediction. In Proceedings of the 2016 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), Chennai, India, 15–16 July 2016; pp. 13–18.
- Pearson Correlation Coefficient—Wikipedia. Available online: https://en.wikipedia.org/wiki/Pearson_correlation_coefficient (accessed on 22 March 2021).
- Wei, H. How to Measure Clustering Performances When There Are No Ground Truth? Available online: https://medium.com/@haataa/how-to-measure-clustering-performances-when-there-are-no-ground-truth-db027e9a871c (accessed on 2 January 2021).
- Torgo, L. Data Mining with R; Chapman and Hall/CRC Data Mining and Knowledge Discovery Series; CRC Press: Boca Raton, FL, USA, 2016; ISBN 978-1-315-39909-6.
- Han, J.; Kamber, M.; Pei, J. Data Mining: Concepts and Techniques; The Morgan Kaufmann Series in Data Management Systems; Morgan Kaufmann: Burlington, MA, USA, 2011; ISBN 978-0-12-381479-1.
- Williams, G.J. The Essentials of Data Science: Knowledge Discovery Using R; Chapman and Hall/CRC the R Series; Chapman & Hall/CRC: Boca Raton, FL, USA, 2017; ISBN 978-1-4987-4001-2.
- Toomey, D. R for Data Science|Packt. Available online: https://www.packtpub.com/product/r-for-data-science/9781784390860 (accessed on 22 March 2021).
- Spector, P. Stat 133 Class Notes—Spring. UC Berkeley Statistics. 2011. Available online: https://www.stat.berkeley.edu/~s133/all2011.pdf (accessed on 21 March 2022).
- Bock, T. What Is a Dendrogram? Available online: https://www.displayr.com/what-is-dendrogram/ (accessed on 22 March 2021).
Crop | Year | Season | Area | Production | Yield |
---|---|---|---|---|---|
Rice | 1998–1999 | kharif | 197 | 316 | 1.6 |
Rice | 1999–2000 | kharif | 128 | 202 | 1.58 |
Rice | 2000–2001 | kharif | 171 | 311 | 1.82 |
Rice | 2001–2002 | kharif | 171 | 411 | 2.4 |
Rice | 2001–2002 | summer | 13 | 19 | 1.46 |
Rice | 2001–2002 | total | 184 | 430 | 2.34 |
Rice | 2002–2003 | kharif | 112 | 230 | 2.05 |
Rice | 2002–2003 | summer | 15 | 16 | 1.07 |
Rice | 2002–2003 | total | 127 | 246 | 1.94 |
Rice | 2003–2004 | kharif | 93 | 210 | 2.26 |
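The sample rows above allow two quick consistency checks: the reported yield should equal production divided by area, and each "total" row should equal the sum of that year's seasonal rows. The sketch below runs both checks and then keeps only the seasonal rows so that nothing is counted twice. Treating "total" rows as unwanted observations is an assumption about the cleaning step, and the rounding tolerance is hypothetical.

```python
# Sample rows from the table above: (year, season, area, production, yield)
rows = [
    ("1998-1999", "kharif", 197, 316, 1.6),
    ("1999-2000", "kharif", 128, 202, 1.58),
    ("2000-2001", "kharif", 171, 311, 1.82),
    ("2001-2002", "kharif", 171, 411, 2.4),
    ("2001-2002", "summer", 13, 19, 1.46),
    ("2001-2002", "total", 184, 430, 2.34),
    ("2002-2003", "kharif", 112, 230, 2.05),
    ("2002-2003", "summer", 15, 16, 1.07),
    ("2002-2003", "total", 127, 246, 1.94),
    ("2003-2004", "kharif", 93, 210, 2.26),
]

TOLERANCE = 0.01  # assumed allowance for rounding in the published yields

# Rows whose reported yield disagrees with production / area.
inconsistent = [r for r in rows if abs(r[3] / r[2] - r[4]) > TOLERANCE]

def total_matches_seasons(year):
    """True if the "total" row equals the sum of that year's seasonal rows."""
    seasonal = [r for r in rows if r[0] == year and r[1] != "total"]
    total = next(r for r in rows if r[0] == year and r[1] == "total")
    sums = (sum(r[2] for r in seasonal), sum(r[3] for r in seasonal))
    return sums == (total[2], total[3])

years_with_total = sorted({r[0] for r in rows if r[1] == "total"})
totals_ok = {year: total_matches_seasons(year) for year in years_with_total}

# Keep only seasonal rows so that area and production are counted once.
seasonal_rows = [r for r in rows if r[1] != "total"]

print(inconsistent, totals_ok, len(seasonal_rows))
```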
Crop | Year | Season | Area | Production | Yield |
---|---|---|---|---|---|
moong | 2015–2016 | kharif | 1 | NA | 0 |
linseed | 2016–2017 | rabi | 2 | NA | 0 |
urad | 2005–2006 | kharif | 2 | NA | 0 |
cowpea | 2015–2016 | summer | 1 | NA | 0 |
rapeseed | 2015–2016 | kharif | 2 | NA | 0 |
urad | 2016–2017 | kharif | 1 | NA | 0 |
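The rows above pair a positive cultivated area with a missing (NA) production and a yield of 0, a combination that cannot describe a real harvest. A minimal sketch of how such rows might be flagged follows; whether they were removed or imputed is not stated in this table, so the filter itself is an assumption, `None` stands in for NA, and the final rice row is added only for contrast.

```python
# (crop, year, season, area, production, yield); None stands in for NA
records = [
    ("moong", "2015-2016", "kharif", 1, None, 0),
    ("linseed", "2016-2017", "rabi", 2, None, 0),
    ("urad", "2005-2006", "kharif", 2, None, 0),
    ("cowpea", "2015-2016", "summer", 1, None, 0),
    ("rapeseed", "2015-2016", "kharif", 2, None, 0),
    ("urad", "2016-2017", "kharif", 1, None, 0),
    ("rice", "2003-2004", "kharif", 93, 210, 2.26),  # complete row, for contrast
]

def is_impossible(record):
    """Positive area, but production is missing and yield is recorded as 0."""
    _, _, _, area, production, crop_yield = record
    return area > 0 and production is None and crop_yield == 0

flagged = [r for r in records if is_impossible(r)]
usable = [r for r in records if not is_impossible(r)]

print(len(flagged), len(usable))
```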
Variable 1 | Variable 2 | Correlation |
---|---|---|
production | area | 95% |
yield | production | 36% |
yield | area | 26% |
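A minimal sketch of the kind of pairwise Pearson correlation screen that produces a table like the one above. It uses only the ten sample rice rows from the dataset overview, so its values need not match the percentages reported for the full dataset. The 0.9 cut-off for calling a pair redundant is a hypothetical choice, not a value taken from the paper.

```python
from itertools import combinations
from math import sqrt

# Columns of the ten sample rice rows from the dataset overview table.
columns = {
    "area": [197, 128, 171, 171, 13, 184, 112, 15, 127, 93],
    "production": [316, 202, 311, 411, 19, 430, 230, 16, 246, 210],
    "yield": [1.6, 1.58, 1.82, 2.4, 1.46, 2.34, 2.05, 1.07, 1.94, 2.26],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

THRESHOLD = 0.9  # hypothetical cut-off for treating a pair as redundant

correlations = {(a, b): pearson(columns[a], columns[b])
                for a, b in combinations(columns, 2)}
redundant = [pair for pair, r in correlations.items() if abs(r) > THRESHOLD]

for pair, r in correlations.items():
    print(pair, round(r, 2))
print("redundant pairs:", redundant)
```

When two attributes are this strongly correlated, one of them carries little additional information, which is the rationale for removing redundant variables before clustering.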
Number of clusters (clust) | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
Criterion value (crit) | 0.6 | 0.6 | 0.7 | 0.7 | 0.7 |
Clust | Min | Max | Mean |
---|---|---|---|
area 1 | 4 | 92,740 | 13,284 |
area 2 | 21,625 | 201,286 | 80,863 |
k | Complete (Small) | Complete (Large) | Single (Small) | Single (Large) | Average (Small) | Average (Large) | ward.D2 (Small) | ward.D2 (Large) |
---|---|---|---|---|---|---|---|---|
2 | 0.55 | 0.58 | 0.39 | 0.41 | 0.62 | 0.55 | 0.55 | 0.58 |
3 | 0.47 | 0.59 | 0.34 | 0.53 | 0.54 | 0.53 | 0.47 | 0.59 |
4 | 0.41 | 0.71 | 0.46 | 0.42 | 0.48 | 0.71 | 0.41 | 0.71 |
5 | 0.48 | 0.68 | 0.43 | 0.68 | 0.52 | 0.68 | 0.48 | 0.68 |
6 | 0.43 | 0.61 | 0.32 | 0.61 | 0.43 | 0.61 | 0.43 | 0.65 |
7 | 0.43 | 0.62 | 0.35 | 0.56 | 0.41 | 0.62 | 0.43 | 0.62 |
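Reading the best setting off this table is a simple arg-max. The sketch below assumes the entries are validation scores for which higher is better (for example average silhouette widths; the table itself does not name the index) and reports, for the small-area and large-area groups, the highest score and every linkage method and k that reach it. The helper name is hypothetical.

```python
METHODS = ("complete", "single", "average", "ward.D2")
GROUPS = ("small", "large")

# k -> scores in table order: for each method, the small-area then large-area value.
table = {
    2: (0.55, 0.58, 0.39, 0.41, 0.62, 0.55, 0.55, 0.58),
    3: (0.47, 0.59, 0.34, 0.53, 0.54, 0.53, 0.47, 0.59),
    4: (0.41, 0.71, 0.46, 0.42, 0.48, 0.71, 0.41, 0.71),
    5: (0.48, 0.68, 0.43, 0.68, 0.52, 0.68, 0.48, 0.68),
    6: (0.43, 0.61, 0.32, 0.61, 0.43, 0.61, 0.43, 0.65),
    7: (0.43, 0.62, 0.35, 0.56, 0.41, 0.62, 0.43, 0.62),
}

def best_setting(group):
    """Return (top score, sorted list of (method, k) pairs achieving it)."""
    offset = GROUPS.index(group)
    scores = {(method, k): row[2 * i + offset]
              for k, row in table.items()
              for i, method in enumerate(METHODS)}
    top = max(scores.values())
    return top, sorted(key for key, score in scores.items() if score == top)

for group in GROUPS:
    print(group, best_setting(group))
```

Under that assumption the small-area group peaks at k = 2 with average linkage, and the large-area group peaks at k = 4 for complete, average, and ward.D2 linkage alike, which agrees with the two small-area zones and four large-area zones listed in the zone table.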
Zone (small area) | Min yield | Max yield | Mean yield | Zone (large area) | Min yield | Max yield | Mean yield |
---|---|---|---|---|---|---|---|
zone 1 | 0.4 | 7.62 | 3.19 | zone 1 | 1.65 | 6.1 | 4.7 |
zone 2 | 2.73 | 11.6 | 7.17 | zone 2 | 4.04 | 9.29 | 6.18 |
- | - | - | - | zone 3 | 5.16 | 10.9 | 7.38 |
- | - | - | - | zone 4 | 5.62 | 12.4 | 8.65 |
Season | Mean Area (hectares) | Mean Production (tons) | Mean Yield (tons per hectare) |
---|---|---|---|
Kharif | 37,112 | 95,330 | 2.47 |
Rabi | 3240.6 | 7468.4 | 2.4 |
Summer | 9845.6 | 30,887 | 2.9 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lahza, H.; Naveen Kumar, K.R.; Sreenivasa, B.R.; Shawly, T.; Alsheikhy, A.A.; Hiremath, A.K.; Lahza, H.F.M. Optimization of Crop Recommendations Using Novel Machine Learning Techniques. Sustainability 2023, 15, 8836. https://doi.org/10.3390/su15118836