1. Introduction
The applications of operations research are numerous and span a wide variety of fields. One of these fields is agriculture [1, 2]. In particular, precision agriculture is the area that deals with crop planning to obtain crops with specific physical and chemical characteristics, which are reflected in the quality of the products.
In [3], Ortega et al. proposed the following problem. A farmer who wants to know how much fertilizer to use on a crop takes a soil sample from the area of interest. The amount of fertilizer needed depends on the nutrient levels detected in the sample. However, farmland does not have uniform soil properties, so several samples must be taken to account for this variability.
The Site Specific Management Zone (SSMZ) problem limits the number of changes in the required fertilizer level, since the relative variance between samples that belong to the same region is taken into account. This relative variance is measured globally by an indicator known as homogeneity. Farmers use this parameter to determine whether the delimited regions have a small variance among the samples that form them [3]. The shape of the delimited regions is designed to work with current, widely used agricultural technology (tractors).
The weather is a factor that can alter soil properties and affect subsequent crops. If we want to consider how much the climate affects these properties over time, we arrive at a robust delimitation problem, in which the delimitation is simultaneously resistant to changes in all soil properties. This work is limited to specific measurements of soil properties and to delimitations that consider the properties separately.
The SSMZ manages soil, pests, and crops based on the spatial variation within a field. Therefore, nutrient management plans for site-specific situations should minimize undesired environmental effects while optimizing whole-farm profits and production. The SSMZ comprises areas delimited according to specific physicochemical characteristics of the soil, so that the planned crop uses them optimally. This decision about the soil allows us to take advantage of its characteristics according to the crop. However, Corwin [4] points out that it is essential to consider the complex interactions between the following factors:
Edaphic, such as salinity, nutrients;
Biological, e.g., earthworms, microbes;
Anthropogenic, e.g., irrigation management, among others;
Topographic, e.g., slope, elevation;
Meteorological, e.g., temperature, rainfall;
How to divide the facility into regions,
where the facility is the land to be studied. In [5], Ortega and Santibáñez solved a case study related to corn crops using clustering techniques in an irregularly shaped facility, based on a k-means method proposed by the same authors in [6]; solutions based on fuzzy k-means were also proposed in [7, 8]. The key problem with these k-means and fuzzy k-means approaches is that they generate solutions with shapes that are difficult to process with current agricultural technology.
Next, in [9], Cid-Garcia et al. present a solution of SSMZ for rectangular facilities based on a delimitation of zones with rectangular area units, as shown in Figure 1. In the same work, an integer linear program with a pre-processing step is proposed that generates feasible solutions with rectangular regions, as in Figure 2. In this work, we assume that the facility is rectangular and that it is divided into rectangular area units. Finally, in [10], Velasco et al. introduce a metaheuristic approach based on Estimation of Distribution Algorithms (EDA), which extends the rectangular-based solution in [9] by allowing orthogonal regions in order to improve the quality and shape of the regions in a solution. In this work, we deal with that approach.
The criteria for dividing the facility depend on the proportion of some physicochemical properties of the soil, such as pH, organic matter rate (OM), amount of phosphorus (P), and the sum of bases (SB). In this sense, each area unit is, simultaneously, a soil sample with an associated numerical value. In [11, 12], the authors show how to obtain this numerical value for each area unit. We assume this information is part of the input of SSMZ. The objective of SSMZ is to minimize the number of regions of the facility under a homogeneity constraint. The homogeneity constraint is a global measure of the variation of specific properties of the soil and the number of regions of a facility.
3. Methodology
In this work, we propose a greedy heuristic that relies on the following definitions.
Definition 1. A valid area unit is an area unit with indices in set .
We build a graph whose set of vertices is such that each vertex is associated with a valid area unit. We put an edge between two vertices if the associated area units share a side. In Figure 3, the vertices are shown as blue circles and the edges as red lines.
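The graph construction can be illustrated with a minimal sketch. It assumes, for illustration only, that every cell of a rows x cols grid is a valid area unit and that area units are indexed by (row, column) pairs; the name `build_grid_graph` is hypothetical and does not come from the original implementation.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) index of an area unit (assumed indexing)


def build_grid_graph(rows: int, cols: int) -> Dict[Cell, List[Cell]]:
    """Return an adjacency list with one vertex per area unit.

    Two vertices are joined by an edge when their area units share a side.
    Assumption: every cell of the rows x cols grid is a valid area unit.
    """
    graph: Dict[Cell, List[Cell]] = {}
    for i in range(rows):
        for j in range(cols):
            candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            # Keep only the side neighbors that fall inside the facility.
            graph[(i, j)] = [
                (a, b) for a, b in candidates if 0 <= a < rows and 0 <= b < cols
            ]
    return graph


graph = build_grid_graph(2, 3)
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(len(graph), num_edges)  # 6 vertices, 7 edges
```

An adjacency list keeps neighbor lookups cheap, which matters because the heuristic repeatedly inspects the side neighbors of the current area unit.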
Definition 2. A labeling of the graph is a map ℓ from the set of vertices to the set of labels (regions).
Definition 3. A path over the graph is a sequence of non-repeated vertices connected by edges of the graph.
Definition 4. A monochromatic path is a path with the same label in all vertices of the path.
Definition 5. Two vertices u and v with the same label are path connected if there exists a monochromatic path over the graph from u to v, as in Figure 4.
Definition 6. A region is a set of valid area units whose associated vertices have the same label.
Definition 7. An orthogonal region is a region in which all pairs of vertices are path connected, or a region with a single associated vertex.
Definition 8. A feasible labeling ℓ of the graph is a labeling with only orthogonal regions.
Definition 9. A feasible solution Z is a feasible labeling that satisfies the homogeneity constraint.
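As an illustration of Definitions 5 to 8, the following sketch checks whether a labeling contains only orthogonal regions, using a breadth-first search restricted to vertices with the same label. The function names and the (row, column) indexing are assumptions made for this example; the homogeneity condition of Definition 9 is not checked here.

```python
from collections import deque
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) index of an area unit (assumed indexing)


def side_neighbors(cell: Cell, labeling: Dict[Cell, int]) -> List[Cell]:
    """Valid area units that share a side with `cell`."""
    i, j = cell
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [c for c in candidates if c in labeling]


def is_feasible_labeling(labeling: Dict[Cell, int]) -> bool:
    """True if every region (cells sharing a label) is path connected."""
    seen = set()
    labels_done = set()
    for start, label in labeling.items():
        if start in seen:
            continue
        if label in labels_done:
            # A second, separate component with an already explored label.
            return False
        labels_done.add(label)
        seen.add(start)
        queue = deque([start])
        while queue:  # BFS along monochromatic paths only
            cell = queue.popleft()
            for nb in side_neighbors(cell, labeling):
                if nb not in seen and labeling[nb] == label:
                    seen.add(nb)
                    queue.append(nb)
    return True


connected = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 1}
diagonal = {(0, 0): 1, (0, 1): 2, (1, 0): 2, (1, 1): 1}
print(is_feasible_labeling(connected), is_feasible_labeling(diagonal))  # True False
```

The second labeling fails because its two cells with label 1 touch only at a corner, so no monochromatic path joins them.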
First, we propose an initial solution in which each vertex is a region with a single area unit, that is, no two distinct vertices share a label, as with the blue labels in Figure 3. In [9] we can observe that, when the homogeneity parameter is near 1, the number of regions needed to satisfy the homogeneity constraint is near the number of area units; therefore, the initial solution is a feasible solution for SSMZ. To refer to each of the area units of the facility, we label them as shown in Figure 3.
Next, we try to merge adjacent area units, i.e., if two vertices are connected by an edge and have different labels, we try to change the labels so that both vertices have the same label. For a vertex, we have up to four potential neighbors to merge with, as in Figure 5. It is easy to see that this merging step produces solutions with only orthogonal regions. Here, we do not directly consider the neighbors marked in red in Figure 5, because they can produce regions with path-disconnected vertices. There are two ways to avoid generating solutions with disconnected regions:
Only consider the neighbors to merge with an area unit ;
We can consider the set of neighbors to merge with an area unit .
The elements of the second set depend on the elements of the first. The activation of the elements of the second set follows the procedure below, which is visualized in Figure 6.
If a valid area unit is merged with , then area units and are added to if they are valid.
If a valid area unit is merged with , then area units and are added to if they are valid.
If a valid area unit is merged with , then area units and are added to if they are valid.
If a valid area unit is merged with , then area units and are added to if they are valid.
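The activation procedure can be sketched under two assumptions that are not confirmed by the text: that area units are indexed by (row, column) pairs, and that the neighbors marked in red in Figure 5 are the diagonal ones. Under these assumptions, merging an area unit with a side neighbor activates the two diagonal neighbors that share a side with that side neighbor. The function name `activated_diagonals` is hypothetical.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]  # (row, column) index of an area unit (assumed indexing)


def activated_diagonals(cell: Cell, merged_with: Cell, valid: Set[Cell]) -> Set[Cell]:
    """Diagonal neighbors of `cell` that share a side with `merged_with`.

    Assumed rule: once `cell` is merged with a side neighbor, these diagonal
    units can join the region through that neighbor without disconnecting it.
    """
    i, j = cell
    a, b = merged_with
    if abs(a - i) + abs(b - j) != 1:
        raise ValueError("merged_with must share a side with cell")
    if a != i:
        # Vertical merge: the activated diagonals lie in the row of merged_with.
        candidates = {(a, j - 1), (a, j + 1)}
    else:
        # Horizontal merge: the activated diagonals lie in its column.
        candidates = {(i - 1, b), (i + 1, b)}
    return candidates & valid  # only valid area units are added


valid_units = {(r, c) for r in range(3) for c in range(3)}
print(sorted(activated_diagonals((1, 1), (0, 1), valid_units)))  # [(0, 0), (0, 2)]
```

Each activated unit shares a side with the merged neighbor, which is why adding it cannot create a path-disconnected region.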
Considering these activated neighbors avoids solutions with disconnected regions, because this construction allows only orthogonal regions. The criterion for deciding whether an area unit of N is merged with the current area unit is that, after the potential neighbor is merged, the value of H remains greater than or equal to the value of the homogeneity parameter. The vertices whose labels were changed in this step are marked so that their labels cannot be altered in subsequent merging steps. If this marking is not included, there is a risk of generating an infeasible labeling.
After the merging step, the current area unit must be marked as visited and added to the visited set (V). The new current area unit is chosen from the set N; if N is empty, the new current area unit is the pair with the smallest index values among those marked as not visited. This is repeated until all area units have been visited. The whole procedure is shown in Algorithm 1.
Procedure generates an initial solution where each area unit corresponds to a region and . Procedure gets the set of area unit , similar to . Procedure relabels the area units. After the merge step in the move step, there are three ways to perform this procedure, as follows:
: New is the first nonvisited neighbor in lexicographical order from N;
: New is a random nonvisited neighbor from N;
: New is a nonvisited neighbor that produces a minimal decrease in H from N.
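The three ways of choosing the next area unit can be sketched as follows. The function names and the `h_after` callback (which is assumed to return the value of H obtained if a given neighbor is chosen) are hypothetical and serve only to illustrate the selection rules; "minimal decrease in H" is interpreted here as the largest resulting H.

```python
import random
from typing import Callable, Iterable, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # (row, column) index of an area unit (assumed indexing)


def _candidates(neighbors: Iterable[Cell], visited: Set[Cell]) -> List[Cell]:
    """Nonvisited neighbors in lexicographical order."""
    return sorted(c for c in neighbors if c not in visited)


def move_lexicographic(neighbors: Iterable[Cell], visited: Set[Cell]) -> Optional[Cell]:
    """First nonvisited neighbor in lexicographical order."""
    cands = _candidates(neighbors, visited)
    return cands[0] if cands else None


def move_random(neighbors: Iterable[Cell], visited: Set[Cell],
                rng: random.Random) -> Optional[Cell]:
    """Random nonvisited neighbor."""
    cands = _candidates(neighbors, visited)
    return rng.choice(cands) if cands else None


def move_min_decrease(neighbors: Iterable[Cell], visited: Set[Cell],
                      h_after: Callable[[Cell], float]) -> Optional[Cell]:
    """Nonvisited neighbor whose choice keeps H as large as possible."""
    cands = _candidates(neighbors, visited)
    return max(cands, key=h_after) if cands else None


neighbors = [(1, 0), (0, 1)]
print(move_lexicographic(neighbors, set()))  # (0, 1)
print(move_min_decrease(neighbors, set(), {(1, 0): 0.9, (0, 1): 0.7}.get))  # (1, 0)
```

Each function returns `None` when no nonvisited neighbor exists, which corresponds to the case where the heuristic has to restart from another nonvisited area unit.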
Note that Algorithm 1 has no explicit objective function; the optimization process is implicit, because the heuristic tries to generate fewer regions by merging the existing ones. We also remark that the search space of our method is larger than the search space of the integer linear program, because the shapes of the regions in our method are a superset of the shapes available in the integer linear program.
Algorithm 1. A simple greedy heuristic for the SSMZ problem.
Require: A rectangular facility M divided into area units, each one with a soil sample value.
Ensure: A feasible partition of facility M divided into Z regions.
Steps 1-3: setup before the main loop.
Steps 4-11: main while loop.
Step 12: return.
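The following is a minimal sketch of a greedy merge-and-move loop in the spirit of Algorithm 1, not the authors' implementation. It rests on several assumptions: area units are indexed by (row, column) pairs, only the four side neighbors are considered, the move step is the lexicographical one, visited area units are never relabeled, and the homogeneity is a relative-variance measure, H = 1 - SS_within / (s_T^2 (n - k)), whose exact form is not given in this section. The parameter name `alpha` and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) index of an area unit (assumed indexing)


def homogeneity(values: Dict[Cell, float], labeling: Dict[Cell, int]) -> float:
    """Assumed measure: 1 - within-region sum of squares / (total variance * (n - k))."""
    n = len(values)
    mean_all = sum(values.values()) / n
    total_var = sum((v - mean_all) ** 2 for v in values.values()) / (n - 1)
    regions: Dict[int, List[float]] = {}
    for cell, label in labeling.items():
        regions.setdefault(label, []).append(values[cell])
    k = len(regions)
    if k == n or total_var == 0:
        return 1.0  # single-unit regions or constant soil values
    within = 0.0
    for vals in regions.values():
        m = sum(vals) / len(vals)
        within += sum((v - m) ** 2 for v in vals)
    return 1.0 - within / (total_var * (n - k))


def greedy_ssmz(values: Dict[Cell, float], alpha: float) -> Dict[Cell, int]:
    """Greedy merge-and-move sketch; returns a label for each area unit."""
    cells = sorted(values)
    labeling = {c: idx for idx, c in enumerate(cells)}  # one region per unit
    visited, locked = set(), set()
    current = cells[0]
    while len(visited) < len(cells):
        i, j = current
        side = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        neighbors = [c for c in side if c in values]
        for nb in neighbors:  # merge step
            if nb in locked or nb in visited or labeling[nb] == labeling[current]:
                continue
            old = labeling[nb]
            labeling[nb] = labeling[current]
            if homogeneity(values, labeling) >= alpha:
                locked.add(nb)  # relabeled units cannot be altered again
            else:
                labeling[nb] = old  # undo: the constraint would be violated
        visited.add(current)
        candidates = [c for c in neighbors if c not in visited]  # move step
        if candidates:
            current = min(candidates)
        else:
            remaining = [c for c in cells if c not in visited]
            if not remaining:
                break
            current = remaining[0]  # smallest nonvisited pair
    return labeling


soil = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 5.0, (1, 1): 5.0}
solution = greedy_ssmz(soil, alpha=0.9)
print(len(set(solution.values())))  # 2 regions: the top row and the bottom row
```

In this sketch, only unlocked and nonvisited neighbors are relabeled, so each relabeled unit is a single-unit region that shares a side with the current unit; this is what keeps every region path connected.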
4. Experimental Results
The computational experiments were carried out on a personal computer with an AMD FX-8800P Radeon R7, Four-Core Processor @3.40 GHz, running the Linux operating system with Ubuntu 20.04 LTS, and 16 GB of RAM. The simple greedy heuristic was implemented in Python 3.8. For each facility size, there are ten instances where the fundamental difference is in the variance of sample soils of area units. For each instance, the homogeneity parameter takes the values , giving us 150 artificial instances and 40 real data instances, totaling 190 instances.
To evaluate the performance of the heuristic, we run each of the 190 instances with each algorithm variant. We calculate the percentage of each class of instances where the solutions of our methods are better than or equal to the solutions provided by the MILP and EDA methods. In Table 1, we can observe that our method has better results than the MILP method in most instances. In contrast, the EDA method gives better results than our method in most instances.
We can observe the performance of our heuristics on each real and artificial instance in Table 2, Table 3, Table 4 and Table 5. For the two deterministic heuristics, we report the best solution found (Z). For the randomized heuristic, we report the best solution after 100 runs, the mean value of the solutions, and the standard deviation (s), because of its inherent stochastic component.
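The reporting for the randomized variant can be illustrated with a short sketch that repeats a stochastic run and summarizes the number of regions obtained. The `run_once` callback and the toy run below are hypothetical stand-ins for the heuristic; the sample standard deviation is assumed for s.

```python
import random
import statistics
from typing import Callable, Tuple


def summarize_runs(run_once: Callable[[random.Random], int],
                   runs: int = 100, seed: int = 0) -> Tuple[int, float, float]:
    """Return (best, mean, sample standard deviation) of the number of regions."""
    rng = random.Random(seed)  # fixed seed so the summary is reproducible
    results = [run_once(rng) for _ in range(runs)]
    best = min(results)  # fewer regions is better
    return best, float(statistics.mean(results)), statistics.stdev(results)


def toy_run(rng: random.Random) -> int:
    """Stand-in for one randomized run: returns between 5 and 7 regions."""
    return 5 + rng.randrange(3)


best, mean_z, s = summarize_runs(toy_run)
print(best, round(mean_z, 2), round(s, 2))
```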
Table 3, Table 4 and Table 5 show, in bold, all instances where our method yields solutions that are better than or equal to the MILP-based solutions. Grey shading marks the solutions that are better than or equal to the EDA solutions. Finally, Figure 7 shows an example of the results of each method for a specific instance.
Furthermore, Figure 8 shows the runtime of the MILP-based solution, the EDA-based solution, and the three versions of our approach. Based on the runtime data shown in Figure 8 and Table 1, the runtime of our method is much smaller than that of the EDA but greater than that of the MILP. In terms of quality, our method does not provide solutions as good as those of the EDA, but it provides better solutions than the MILP.
5. Conclusions and Future Work
This paper introduces a simple greedy heuristic for the site-specific management zone problem (SSMZ). This problem consists of partitioning the field into small regions considering a specific soil property. The objective is to minimize the number of regions of the facility under a homogeneity constraint. The homogeneity constraint is a global measure of the variation of specific properties of the soil and the number of regions of a facility.
Our methodology was tested on a set of real data instances as well as a set of artificial instances, and it was compared with other methodologies presented in the literature. The experimental results show that all versions of our algorithm produce better solutions than those obtained using the MILP method in most instances. Our method can deal with non-rectangular shapes, similarly to the EDA approach. Our heuristic produces feasible solutions with a reasonable running time compared with the running times of the EDA.
The present work is a significant advancement towards solving the problem. In future work, the structure of the solutions and the deterministic aspect of our approach could be used to provide an approximation guarantee, yielding an approximation algorithm rather than just a heuristic method. Finally, we will consider hybridizing our simple greedy algorithm with other metaheuristics.