Article

A Classification Tree for Modeling Ground Fractures from Subsidence

1 Engineering Institute, National Autonomous University of Mexico, Mexico City 04510, Mexico
2 Department of Mechatronics, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(5), 3123; https://doi.org/10.3390/app13053123
Submission received: 25 November 2022 / Revised: 25 January 2023 / Accepted: 28 January 2023 / Published: 28 February 2023
(This article belongs to the Section Civil Engineering)

Abstract

This article presents a classification tree with predictive and prescriptive capabilities for the management of ground fractures in a crowded suburb in Tláhuac, a municipality of Mexico City. The tree is trained with observations of fractures parametrized with basic geotechnical and geological variables and specifications of the urban environment where they manifest. The trained tree clarifies the complex scenario affected by subsidence because the relations between parameters can be viewed easily. It also makes it possible to recognize the influence of stratigraphic arrangements (geotechnical properties), geological conditions, size of roads and inhabited units, and location of water-pumping infrastructure on the appearance and exacerbation of cracks in soils. This offers citizens and government administrators the possibility of anticipating damage and working on programs for improving structures and foundations, including relocation programs for communities at risk.

1. Introduction

In areas with rapid urbanization and demographic growth, the prolonged exploitation of groundwater is causing the land to sink, with dangerous consequences [1,2,3]. One response to soil sinking is the collapse of the superficial layers, which affects fragile buildings and buried facilities that fail in blocks or over kilometers, resulting in huge economic losses and detriments to the quality of life of citizens. Superficial ground fracturing is one of the soil phenomena that Mexico City must face with extreme urgency and humanism, since it occurs in large portions of the deprived regions of the metropolis (Figure 1). Understanding what provokes a soil fracture is vital for the Mexican metropolis. Soil rupture, with opening and/or vertical displacement, is a geomechanical process that can be associated with the pumping of groundwater from unconsolidated sedimentary aquifer systems, and it normally develops in arid or semiarid basins.
The generation and propagation of ruptures requires the development of failure conditions, that is, tensile and/or shear stress that the shallow alluvial sediments cannot support. There are numerous analyses of stress fields that model the occurrence of ruptures, with locations in Mexico [4,5,6,7,8,9], China [10,11,12,13], and the United States [14,15,16,17,18,19,20,21,22,23] being among the most cited investigations. In Mexico City (CDMX), cracks associated with pumping have been studied from local and regional, timeless and evolutionary, and forensic and predictive perspectives [24,25,26,27,28,29,30,31,32], which have represented great contributions to the state-of-the-art research and the documentation of events. However, the discussion is still open in regard to defining the risk levels in specific properties (vulnerability and threats to the management of the effects) and recommending (preventing) certain structural solutions or accepting urban projects in localities that are affected by this phenomenon.
Tláhuac, a deprived mayoralty in southeastern Mexico City, is the subject of this research. In this study area, cracks, fissures, and strong steps due to consolidation of soft layers on abrupt basement slopes, and differential subsidence in heterogeneous contacts, have been recognized [33]. The southern region of Mexico City has high levels of heterogeneity, and its typical stratigraphy is composed of superficial fillings (garbage and/or tuff strata) with poor or null load-bearing capacity, resting on extremely soft clays that were deposited on the steep slope of the perimeter of the basin (basement). These arrangements of materials manifest in complex behaviors whose prediction is a challenging open task. Identification of the processes is particularly difficult because the manifestations (cracks) are transfigured by interactions with inefficient foundations, inoperative communication routes, damaged buried infrastructure (leaks in water pipes), and the chaotic dynamics of the layers of debris (used to form a level when the urbanization of the area began) deforming under the stresses provoked by pumping wells.
When the best-known models that use groundwater flow equations and a complex geomechanical characterization are used to evaluate vertical deformations (associated with groundwater extraction) in the Tláhuac scenarios, the predictions fail because they entail difficulties and uncertainties, most of which are related to the parameterization of the environment. The transformation of specific information (often scarce) into a 3D space, with near and far borders, is a fragile point when predictions are compared with field observations. The disconnect between interpretations and reality has its most negative effect when ruptures occur without it having been possible to alert government actors, developers, and the public about the risk in their environment. The behaviors and properties of soils have been analyzed extensively with artificial intelligence and machine learning in recent years, showing the enormous potential of this technology to solve many geotechnical problems [34,35,36,37,38,39,40,41,42]; however, few sufficiently solid investigations can be found on ground fracturing and subsidence. Using machine learning in the simulation of fracturing from subsidence makes it possible to generate concrete information about the aspects of the environment (natural and anthropic) that activate or exacerbate the breaks, while the processes that extend from the depths of the exploited aquifer to the superficial layers remain embedded in the data.
This article presents a CART model (classification and regression tree) in classification mode that analyzes ground fracturing in a comprehensive way, eliminating subjective approaches and pointing out the spatial factors that most influence the adverse manifestations (cracks, fissures, and steps). The CART is trained with geotechnical and geological records that characterize the soils and rocks (composition and geometry), information on the buried infrastructure (pumping well location), and the arrangements of houses and streets in the most heavily fractured areas in Tláhuac. The tree makes it possible to define the areas that are most threatened by the phenomenon, the components (natural and anthropic) that exacerbate the manifestation, and the points on the surface that could potentially crack if their intrinsic characteristics and the water pumping patterns do not change. Its outstanding ability to predict breakpoints allows the use of machine learning to provide effective responses to city designers and government administrators for anticipating these kinds of events, work which is essential when the threatened areas have low resilience and a high level of poverty.

2. CART Basics

The CART model is a nonparametric learning technique that produces regression trees when the output is numerical and classification trees when the dependent variable is categorical. The general result of the CART algorithm [43] is a tree where the branches represent sets of decisions and each decision generates successive rules that continue the classification (also known as partition), thus forming mutually exclusive homogeneous groups with respect to the discriminated variable. Trees are built using a recursive segmentation algorithm, which ends once a stopping criterion is reached.
The method uses historical data to build the tree and, once completed, the tree can be used to classify new data. Let $Y$ be the dependent variable and $x_1, x_2, \ldots, x_n$ the $n$ predictor variables, where $x$ is considered fixed and $Y$ is a random variable. The statistical problem is to establish a relationship between $Y$ and $x$ in such a way that it is possible to predict $Y$ based on the values of $x$. Mathematically, this requires studying the conditional probability $P[Y = y \mid x_1, x_2, \ldots, x_n]$ or a function of it, such as the conditional expectation [44].

2.1. Elements of a Tree

The tree in Figure 2 is made up of an initial node called the root node, formed from the attribute that creates the purest subset. The root is divided into two groups or decision nodes, and the partition procedure is then applied separately to each group. The divisions are selected in such a way that the purity of the decision nodes is greater than that of the root node. The goal is to partition the response into homogeneous groups while keeping the tree small enough. The recursive segmentation process continues until the tree is saturated, i.e., the subjects in the descendant nodes cannot be divided further because no additional division would yield nodes purer than the node from which they come. Nodes that cannot be divided further are known as terminal nodes.
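The recursive segmentation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes numeric predictors, binary splits of the form "feature ≤ threshold", the Gini index as the impurity measure, and a hypothetical toy data set. The names `gini`, `best_split`, `build_tree`, and `predict` are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain, feature, threshold) for the split with the largest impurity decrease."""
    parent = gini(labels)
    n = len(rows)
    best = (0.0, None, None)
    for f in range(len(rows[0])):
        # The largest observed value cannot separate anything, so it is skipped.
        for thr in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
            if gain > best[0]:
                best = (gain, f, thr)
    return best

def build_tree(rows, labels):
    """Split recursively; stop (terminal node) when no split improves purity."""
    _, f, thr = best_split(rows, labels)
    if f is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return {
        "feature": f,
        "threshold": thr,
        "left": build_tree([r for r, _ in left], [y for _, y in left]),
        "right": build_tree([r for r, _ in right], [y for _, y in right]),
    }

def predict(tree, row):
    """Follow the branches from the root node down to a terminal node."""
    while "leaf" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]

# Hypothetical toy data: (distance to a pumping well, housing level) -> class
rows = [(5, 3), (8, 2), (12, 3), (20, 1), (25, 3), (30, 0)]
labels = ["crack", "crack", "crack", "no", "no", "no"]
tree = build_tree(rows, labels)
```

In this toy case a single split on the first feature already yields two pure descendant nodes, so the tree saturates after one division.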
The methodology for developing regression and classification trees generally consists of three steps [45]:
  • Construction of the tree;
  • Pruning the tree;
  • Selection of the optimal tree through a cross-validation procedure.

2.2. Node Impurity Function

This function is a measure that allows determining the quality of a node, expressed with i(t). The impurity measures that allow the analysis of different types of responses in the classification trees are available in [43,46].
The entropy index is defined as follows:
$$ i(t) = -\sum_{j} p(j \mid t)\,\ln p(j \mid t) \tag{1} $$
where
$p(j \mid t)$ = relative frequency of class $j$ in $t$ (the probability that an object in $t$ is classified in class $j$)
$j$ = class
$t$ = node (data set)
The objective is to find the partition that maximizes $\Delta i(t)$ of the following equation:
$$ \Delta i(t) = -\sum_{j=1}^{k} p(j \mid t)\,\ln p(j \mid t) \tag{2} $$
The Gini index is defined as follows:
$$ i(t) = \sum_{i \neq j} p(j \mid t)\,p(i \mid t) \tag{3} $$
The following equation is used to find the partition that maximizes $\Delta i(t)$:
$$ \Delta i(t) = 1 - \sum_{j=1}^{k} p(j \mid t)^{2} \tag{4} $$
The most common algorithms for regression trees are based on the calculation of standard deviation, variance, and sum of squares.
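As a worked illustration of the two classification impurity measures, the sketch below computes the entropy index and the Gini index of a node from its class counts, and the impurity decrease produced by a candidate split. It also shows that the pairwise form of the Gini index (a sum over pairs of different classes) equals one minus the sum of squared class probabilities. The function names and the example counts are assumptions made for illustration.

```python
import math

def class_probabilities(counts):
    """Relative frequency p(j|t) of each class j in node t."""
    total = sum(counts)
    return [c / total for c in counts]

def entropy(counts):
    """Entropy index: -sum_j p(j|t) ln p(j|t); empty classes contribute zero."""
    return -sum(p * math.log(p) for p in class_probabilities(counts) if p > 0)

def gini_pairwise(counts):
    """Gini index as a sum over pairs of different classes, p(i|t) p(j|t) with i != j."""
    p = class_probabilities(counts)
    k = len(p)
    return sum(p[i] * p[j] for i in range(k) for j in range(k) if i != j)

def gini(counts):
    """Equivalent closed form of the Gini index: 1 - sum_j p(j|t)^2."""
    return 1.0 - sum(p * p for p in class_probabilities(counts))

def impurity_decrease(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = sum(parent)
    return (impurity(parent)
            - sum(left) / n * impurity(left)
            - sum(right) / n * impurity(right))

# Hypothetical node with 5 objects of each class, split into two pure children
decrease = impurity_decrease(gini, [5, 5], [5, 0], [0, 5])
```

A split that separates the two classes completely removes all of the parent impurity, which is the largest decrease a split of this node can achieve.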

2.3. Tree Pruning

In the first stages, the tree obtained is overfitted, so terminal nodes must be cut successively until the most suitable size is found. To solve this problem, an alternative is to look for a series of nested trees of decreasing size [47], each of which is the best of all trees of its size. These smaller trees are compared to determine the optimum. This comparison is based on a cost–complexity function, $R_\alpha(T)$.
For each tree $T$, this function [48] is defined as follows:
$$ R_\alpha(T) = R(T) + \alpha\,\tilde{T} \tag{5} $$
where $R(T)$ is the error of the tree, which can be the total misclassification rate or the total residual sum of squares (depending on the type of tree); $\tilde{T}$ is the complexity of the tree, defined as the total number of nodes in the subtree; and $\alpha$ is the complexity parameter. The parameter $\alpha$ is a real number greater than or equal to zero; when $\alpha = 0$ the result is the largest tree, and as it increases, the size of the tree decreases.
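The effect of the complexity parameter can be seen with a small numerical sketch. The list of nested subtrees below, with its error and complexity values, is hypothetical and chosen only to illustrate how the preferred subtree shrinks as α grows; the function names are likewise illustrative.

```python
def cost_complexity(error, complexity, alpha):
    """Cost-complexity value: error of the tree plus alpha times its complexity."""
    return error + alpha * complexity

def best_subtree(subtrees, alpha):
    """Pick the (error, complexity) pair with the smallest cost-complexity value."""
    return min(subtrees, key=lambda t: cost_complexity(t[0], t[1], alpha))

# Hypothetical nested sequence: (misclassification rate, complexity of the subtree)
subtrees = [(0.02, 69), (0.05, 30), (0.09, 10), (0.20, 1)]

# Preferred subtree for increasing values of the complexity parameter
chosen = {alpha: best_subtree(subtrees, alpha) for alpha in (0.0, 0.001, 0.01, 0.1)}
```

With α = 0 the largest subtree is preferred because only the error counts; each larger α in this example selects the next smaller subtree in the sequence.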
Taken alone, the error term $R(T)$ will always be minimized by the largest tree; therefore, better estimates of the error are needed. From the sequence of nested trees, it is necessary to select the one that is optimal, and to achieve this, it is not effective to use comparison or a complexity penalty [48]; instead, the prediction error must be estimated accurately, which is generally done using a cross-validation procedure.
The objective is to find the optimal ratio between the misclassification rate (quotient between the misclassified observations and the total number of observations) and the complexity of the tree. The cross-validation procedure can be implemented in two ways, depending on the amount of data [49]. With sufficient data, the sample is split; half or less of the data are removed, and then the sequence of trees is constructed using the data that remain. Afterwards, the predictions of each tree are calculated on the removed data, and the error of the estimations is computed to finally select the tree with the smallest prediction error. If there are not enough data, k-fold cross-validation is applied.
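The k-fold procedure can be sketched as follows. This is a generic illustration, not the authors' code: `k_fold_indices` and `cross_validated_error` are hypothetical helper names, the `fit` and `predict` callables stand for any tree-building and prediction routines, and a majority-class baseline with made-up labels is used only to keep the example self-contained.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Yield (train, validation) index lists for k folds over n samples (requires n >= k)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation

def cross_validated_error(fit, predict, rows, labels, k=10, seed=0):
    """Average misclassification rate over the k validation folds."""
    errors = []
    for train, validation in k_fold_indices(len(rows), k, seed):
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        wrong = sum(predict(model, rows[i]) != labels[i] for i in validation)
        errors.append(wrong / len(validation))
    return sum(errors) / len(errors)

# Stand-in model for the example: always predict the majority class of the training fold
def majority_fit(rows, labels):
    return Counter(labels).most_common(1)[0][0]

def majority_predict(model, row):
    return model

rows = list(range(20))
labels = ["no"] * 18 + ["crack"] * 2  # hypothetical, imbalanced labels
baseline_error = cross_validated_error(majority_fit, majority_predict, rows, labels, k=5)
```

To choose among candidate trees, the same function would be called once per candidate size, and the size with the smallest average validation error would be kept.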

2.4. J48 Algorithm

J48 is an implementation of Quinlan’s C4.5 algorithm that creates a pruned C4.5 decision tree. At each step, the data are split into smaller subsets based on a decision. J48 examines the normalized information gain that results from splitting the data on each attribute, and the attribute with the highest normalized information gain is used to make the decision. The algorithm then recurs on the smaller subsets. Splitting stops when all instances in a subset belong to the same class; J48 then creates a leaf node that predicts that class. The J48 decision tree can deal with different attribute types, missing attribute values, and varying attribute costs. Accuracy can be improved by pruning [50].
The decision trees produced by J48 can be used for classification. At every node of the tree, J48 chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. The splitting criterion is the normalized information gain (difference in entropy) [51]. Some successful CART applications in engineering and geosciences are [52,53,54].
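The normalized information gain used as the splitting criterion in C4.5 is commonly called the gain ratio: the information gain of a split divided by the entropy of the split itself. The sketch below is a simplified illustration for nominal attributes only. It does not reproduce the handling of numeric attributes, missing values, or pruning in J48, and the function names and example values are assumptions.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on a nominal attribute, divided by the split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the partition sizes
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical nominal attribute (inclusion class) and class labels
labels = ["no", "no", "crack", "crack"]
ratio_two_values = gain_ratio(["IV", "IV", "V", "V"], labels)
ratio_four_values = gain_ratio(["a", "b", "c", "d"], labels)
```

Both attributes separate the classes perfectly, but the attribute with four distinct values obtains a lower ratio: the normalization penalizes splits that fragment the data into many small subsets.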

3. Modeling of Ground Cracking

The investigation focuses on Tláhuac, one of the municipalities of Mexico City that has experienced the strongest and most dangerous ruptures in the last 50 years. Within the 85 km² that make up this municipality, specific areas are recognized where this phenomenon manifests forcefully, particularly the Del Mar neighborhood (approx. 2 km²) (Figure 3), where cracks of tens of centimeters with openings 4 m deep and huge steps approximately 120 cm high have been recorded. According to previous studies [7,55,56], ground fracturing is caused by the interaction of different factors, the most important being preexisting geological discontinuities (because of variations in the depositional environment), the slope of the basement underlying the compressible layers, thermal cycles, heterogeneity in compressibility and permeability, and the extensive exploitation of aquifers. This last aspect is considered the trigger factor of subsidence due to the corresponding vertical and horizontal tensile stresses on certain soil strata [57,58].
When the water level of the aquifer system drops, it induces a gradual compaction of the sedimentary filling, causing superficial sinking. In Tláhuac, this sinking is uneven, so superficial cracks and scarps are formed. There are intrinsic (natural) factors, such as material heterogeneity and basement characteristics, that control the trace, shape, and size of the cracks (Figure 4) [22,59]. Anthropic settings, such as foundations, buried installations, loads (imposed by houses and buildings), and covert anomalies (e.g., pre-Hispanic structures), are also recognized as responsible for the extent and spatial definition of these cracks [60].
For training the CART model, the selected physical variables (natural and anthropic) are integrated as a spatial database, to construct the training matrix. Natural information (geological and geotechnical) and man-made conditions (depletion of groundwater levels, location of dwelling arrangement, and communication routes) were compiled and related to cracks, fissures, and steps in the studied area.

3.1. Exploration and Expression of Variables

From geotechnical and geophysical campaigns (49 geotechnical boreholes, aerial photogrammetry of the area, and 31 measurements of environmental seismic noise), important parameters were defined. This information was mapped in 2D/flat units (squares of 121 m²) georeferenced to the center of each unit. The square was selected for the ease of translating its information into pixels and voxels. The basic geotechnical inputs are the water content (W%), the Atterberg limits (liquid limit, LL; plastic limit, LP; and plasticity index, IP), the SPT blow count NSPT (average for the first 35 m of the column), the materials classification according to the USCS (Unified Soil Classification System), the cohesion/friction values (average in the first 35 m), and the presence of a semirigid layer embedded in the clay matrix. Additionally, data concerning soft deposit thickness (clay strata), amplification, and soil period were included, as were the depth and slope of the basement that underlies the clay formation. Some examples of these maps are shown in Figures 5–8. The included anthropic parameters are the housing level (number of floors and type of structure), the size of the streets (number of lanes), the position relative to the pumping wells, and the groundwater level. A summary of the inputs is listed in Table 1.
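The mapping of point information into square units georeferenced at their centers can be sketched as below. A square of 121 m² has a side of 11 m; everything else in this example is an assumption made for illustration, including the local metric coordinate system, the grid origin, and the function names.

```python
import math

CELL_SIDE_M = 11.0  # side of a 121 m^2 square unit

def cell_index(x, y, x0=0.0, y0=0.0, side=CELL_SIDE_M):
    """Column and row of the square unit containing the point (x, y), in meters."""
    return math.floor((x - x0) / side), math.floor((y - y0) / side)

def cell_center(col, row, x0=0.0, y0=0.0, side=CELL_SIDE_M):
    """Coordinates of the center of a square unit, used to georeference it."""
    return x0 + (col + 0.5) * side, y0 + (row + 0.5) * side

# Hypothetical borehole location, in meters from an assumed grid origin
col, row = cell_index(10.9, 11.0)
center = cell_center(col, row)
```

Each measurement can then be attached to the unit that contains it, and the unit indices double as pixel coordinates when the maps are rasterized.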

3.2. Tree Structure

The CART was trained using 3645 lines with 17 inputs and 1 output. The appropriate depth was determined by evaluating the tree on the held-out data set (700 lines) via k-fold cross-validation. The best tree depth (bias–variance trade-off) was found by resampling the data many times, splitting the data into training and validation folds, fitting trees of different sizes on the training folds, and examining the classification accuracy on the validation folds. The optimal tree (69 leaves and 137 nodes in total) correctly classified 3338 instances (91.5775%), and 307 were incorrectly classified (8.4225%). Examples of the branches are shown in Figure 9. This tree predicts whether a site (square unit of 121 m²) cracks or not based on its intrinsic and extrinsic conditions, considering that consolidation (due to pumping) remains constant and acts on the same layers (those susceptible to consolidation) that drain toward sufficiently permeable strata.
Beyond its predictive power, this tree uncovers parametric relationships of paramount importance. The first partitioning of the data, the one that yields the purest subset, is based on the characteristics (stiffness, thickness, and depth) of a stratum called inclusion. This clayey-sand layer is minor (its thickness varies between 50 and 400 cm) and ranges from a compact and fragile material to a soft, ductile soil mass. The depth of this stratum increases as it moves away from the steep gradient of the bedrock (underlying the clay layers), practically vanishing when the slope approaches zero (Figure 10). The CART clearly distinguishes the behavior of sites with a superficial and fragile inclusion (labeled with ordinals IV and V, with V being the more breakable) from that of sites where this layer is ductile and deeper (labeled with ordinals from I to III, with I indicating increasing ductility). The next partition is based on the position relative to the pumping wells, and the load imposed by houses and traffic is essential for the unit to crack.
Based on the results obtained with the optimal CART, the confusion matrix was constructed (Table 2). The total samples in the positive class are 309 and the number of samples in the negative class is 3336. The correct classifications are 143 for the positive class and 3313 for the negative class. Now, 166 samples that were expected to be of the positive class were classified as the negative class by the model (false negatives) while 23 samples were expected to be of the negative class but were classified as “positive” by the model (false positives).
With this confusion matrix, the accuracy of the tree was calculated using Equation (6):
$$ \text{Accuracy} = \frac{TP + TN}{\text{Total population}} = \frac{143 + 3313}{3645} = 94.81\% \tag{6} $$
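The arithmetic of the accuracy calculation can be reproduced directly from the four counts of the confusion matrix. The sketch below also derives recall, precision, and specificity from the same counts; these additional metrics are not reported in the source and are included only as an illustration of what the confusion matrix allows one to compute. The function name is an assumption.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard metrics derived from the four cells of a binary confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all samples
        "recall": tp / (tp + fn),        # share of positive samples that were detected
        "precision": tp / (tp + fp),     # share of positive predictions that were correct
        "specificity": tn / (tn + fp),   # share of negative samples that were detected
    }

# Counts taken from the confusion matrix described in the text (Table 2)
metrics = classification_metrics(tp=143, fn=166, fp=23, tn=3313)
accuracy_percent = round(metrics["accuracy"] * 100, 2)
```

Because the negative class is much larger than the positive class, the per-class metrics give a complementary view to the overall accuracy.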
More than 94% of all the samples present in the test set were correctly classified. To evaluate the worth of each attribute, the correlation (Pearson’s) between it and the output class was calculated (Table 3). The nominal attributes were considered on a value-by-value basis by treating each value as an indicator, so, for a nominal attribute, the correlation was arrived at via a weighted average. The heterogeneity of the soil environment (represented by the inclusion) and the anthropic load of the streets are directly related to the occurrence of superficial cracks. For government administrators, it is essential to recognize that the distance in x that separates the inhabited sites from the pumping wells has a very significant effect on the manifestation. Based on these findings, more threatened areas could be outlined: not just traces of breaks, but entire neighborhoods that will suffer the consequences of subsidence. The attention programs, and the allocation of resources, can thus be designed more efficiently.
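The attribute evaluation described above, in which each value of a nominal attribute is treated as an indicator and the per-value correlations are combined in a weighted average, can be sketched as follows. Two details are assumptions, since the text does not specify them: the weights are taken as the relative frequency of each value, and the absolute value of each correlation is averaged. The function names and the example data are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def nominal_correlation(values, classes):
    """Frequency-weighted average of |r| between each value indicator and the class."""
    n = len(values)
    total = 0.0
    for v in set(values):
        indicator = [1.0 if x == v else 0.0 for x in values]
        weight = sum(indicator) / n  # assumed weighting: relative frequency of the value
        total += weight * abs(pearson(indicator, classes))
    return total

# Hypothetical nominal attribute and binary class (1 = cracked, 0 = not cracked)
inclusion = ["V", "V", "I", "I"]
cracked = [1, 1, 0, 0]
score = nominal_correlation(inclusion, cracked)
```

In this toy case the attribute determines the class exactly, so the score reaches its maximum of 1; attributes unrelated to the class would score close to 0.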
An illustration of the prediction process with CART is shown in Figure 11. On a section of the studied area, each patch (square unit) is filled with its corresponding input information so that the branches can be followed. When a patch is selected, for example, one with inclusion = V, the right branch of the tree is followed and the next node asks about its position in space relative to the pumping wells (x ≤ 13 or x > 13, and y ≤ 35 or y > 35). Because the unit is located at x = 11 and y = 32, the following node asks about its housing level (>0 or ≤0, and >2 or ≤2). As the unit has a housing level = 3, the next node asks for the thickness of material with 10 < W% < 80%, and then its position in the final leaf categorizes it as “cracking not-activated”.
If, for example, the selected unit has an inclusion = V and its position in space is x ≤ 13 and y ≤ 35, the next suitable branch asks for the housing level (>2 or ≤2); then, due to its position relative to the pumping wells, it is categorized as “cracking activated”.
The application of CART to an expanded area is shown in Figure 12, and photographs of the evidence in the field are presented to point out the remarkable predictive capacity of the tree. It is important to mention that these cases are used for validation of the model, i.e., they are not contained in the training file.

4. Discussion

This article presents an alternative method to provide answers to citizens and governments that must build urban settlements and ensure their safety. The most sophisticated geomechanical models of fracturing due to subsidence have shown very low predictive capacity, but what is most serious is that they do not allow for simple and unambiguous property-by-property evaluations. It is even practically impossible to include aspects of the city that exacerbate these breaks, such as the weight of the houses, heavy traffic, or the distance to pumping wells. The presented classification tree does not intend to explain the process of exploitation of groundwater but the connection between soils, urban scenarios, and fracturing.
This model could be built due to the enormous effort of the research team but also due to the government’s willingness to support field questionnaires and the execution of detailed geotechnical and geological tests. Until the 2017 earthquake, no campaign of this scale had been carried out in these lands. For this reason, the CART presented can be placed as one of the most detailed and best supported in this regard.
Streets, sidewalks, pipes, large buildings, and family homes are somehow affected by the “broken” condition of the environment, and idealization of the soils as a continuum (homogeneous layers) with regular rigid bases are strategies that have to be used when applying modeling tools that are not based on data. The CART presented here accepts inputs that describe heterogeneous stratigraphies (of very different materials, rigidities and continuities) that lie on a semi-firm and irregular, steep slope basement, but also, as in no other model, it accounts for houses and traffic loads, as well as the distance from pumping wells.
As this model was created using information contained in square pixel units (patches), it is possible to expand them to 3D (as voxels), which would allow the tree to be refined. The next step for this CART would be to generate metrics to qualify sites and to simulate subsidence evolution scenarios.

Author Contributions

Conceptualization, S.G.; methodology, P.T.; validation, S.G. and P.T.; formal analysis, S.G., P.T. and S.V.; investigation, P.T.; writing—original draft preparation, S.G. and S.V.; writing—review and editing, S.G.; visualization, P.T. All authors have read and agreed to the published version of the manuscript.

Funding

The investigations on which this publication is based were sponsored by the Mayor’s Office of Tláhuac, under the provisions of agreement IISGCONV-010-2018 between this Mayor’s Office and the Instituto de Ingeniería of the Universidad Nacional Autónoma de México.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained under an agreement with the Mayor’s Office of Tláhuac and are available from the authors with the permission of the Mayor’s Office of Tláhuac.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors of this work wish to express their gratitude to the kind and supportive neighbors of Colonia Del Mar, Nopalera, Miguel Hidalgo, and Villa Centroamericana who pushed the materialization of the project and shared experiences, anecdotes, tours, and hot coffee.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Voraakhom Kotchakorn: How Can We Better Design Cities to Fight Floods? TED Radio Hour, 13 November 2020.
  2. Collados-Lara, A.-J.; Pulido-Velazquez, D.; Mateos, R.M.; Ezquerro, P. Potential Impacts of Future Climate Change Scenarios on Ground Subsidence. Water 2020, 12, 219.
  3. Thompson, J. As Temperatures Rise, Arizona Sinks, Climate Change and Unregulated Wells Are Depleting the West’s Groundwater Reserves. HighCountry News, 1 April 2020.
  4. Carreón-Freyre, D.; Cerca, M.; Ochoa-González, G.; Teatini, P.; Zuñiga, F.R. Shearing along faults and stratigraphic joints controlled by land subsidence in the Valley of Queretaro, Mexico. Hydrogeol. J. 2016, 24, 657–674.
  5. Ochoa-González, G.; Carreón-Freyre, D.; Franceschini, A.; Cerca, M.; Teatini, P. Overexploitation of groundwater resources in the faulted basin of Querétaro, Mexico: A 3D deformation and stress analysis. Eng. Geol. 2018, 245, 192–206.
  6. Pacheco, J.; Arzate, J.; Rojas, E.; Arroyo, M.; Yutsis, V.; Ochoa, G. Delimitation of ground failure zones due to land subsidence using gravity data and finite element modeling in the Querétaro valley, México. Eng. Geol. 2006, 84, 143–160.
  7. Carreon-Freyre, D.; Gutierrez-Calderon, R.I.; Cerca, M.; Alcantara-Duran, C.F. Factors that condition physical vulnerability to ground fracturing in Mexico City. IAHS 2020, 382, 571–576.
  8. Teatini, P.; Carreón-Freyre, D.; Ochoa-González, G.; Ye, S.; Galloway, D.; Hernández-Marin, M. Ground ruptures attributed to groundwater overexploitation damaging Jocotepec city in Jalisco, Mexico: 2016 field excursion of IGCP-641. J. Int. Geosci. 2018, 41, 69–73.
  9. Carreón Freyre, D. Keynote. Land subsidence processes and associated ground fracturing in Central Mexico. In Land Subsidence, Associated Hazards and the Role of Natural Resources Development, Proceedings of the EISOLS, Querétaro, Mexico, 17–22 October 2010; IAHS Press: Wallingford, Oxfordshire, UK, 2010; p. 339.
  10. Ye, S.; Xue, Y.; Wu, J.; Yan, X.; Yu, J. Progression and mitigation of land subsidence in China. Hydrogeol. J. 2016, 24, 685–693.
  11. Wei, X.Y.; Wang, G. Model Test Studies on Ground Subsidence and Cracks Induced by Tunneling in Weak Rock Mass with High Water Content. In Proceedings of the Ninth Asia Pacific Transportation Development Conference, Colombo, Sri Lanka, 6–8 August 2014.
  12. Xiaoqing, S.; Xue, Y.; Wu, J.; Ye, S.; Zhang, Y.; Wei, Z.; Yu, J. Characterization of regional land subsidence in Yangtze Delta, China the example of SuXiChang area and the city of Shanghai. Hydrogeol. J. 2008, 16, 593–608.
  13. Sakineh, P.; Ajdary, K.; Emamgholizadeh, G.A.K.S. Predicting water level drawdown and assessment of land subsidence in Damghan aquifer by combining GMS and GEP models. Geopersia 2015, 5, 63–80.
  14. Bouwer, H. Land subsidence and cracking due to groundwater depletion. Groundwater 1977, 15, 358–364.
  15. Narasimhan, T.N.; Holzer, T.L. Possibility of soil deformation in the partially saturated zone due to pore pressure changes. Geol. Soc. Am. Abstr. Program. 1978, 10, 138.
  16. Jachens, R.C.; Holzer, T.L. Geophysical investigation of ground failure related to ground water withdrawal-Picacho Basin, Arizona. Groundwater 1979, 176, 574–585.
  17. Holzer, T.L.; Pompeyan, E.H. Earth fissures and localized differential subsidence. Water Resour. Res. 1981, 17, 223–227.
  18. Bell, J.W. Subsidence in Las Vegas Valley. Nevada Bureau of Mines and Geology, Bull, 95, 84 p. Bell F.G., 1988, Subsidence associated with the extraction of fluids. Engineering Geology of Underground Movements. Geol. Soc. Eng. Geol. Sp. Pub. 1981, 5, 363–376.
  19. Carpenter, M.C. Earth-Fissure Movements Associated with Fluctuations in Ground-Water Levels Near the Picacho Mountains. South-Central Arizona 1980–84; USGS: Louisville, KY, USA, 1983.
  20. Carpenter, M.C. Land Subsidence in the United States: Circular 1182; Galloway, D., Jones, D.R., Ingebritsen, S.E., Eds.; Part I, Mining Ground Water, South-Central Arizona; U. S. Geological Survey: Louisville, KY, USA, 1999; pp. 65–81.
  21. Hem, D.C. Hydraulic forces that play a role in generating fissures at depth. Bull. Assoc. Eng. Geol. 1994, 31, 293–302.
  22. Pacheco-Martínez, J.; Hernández-Marín, M.; Burbey, T.J.; González-Cervantes, N.; Ortiz, J.; Solís-Pinto, A. Land subsidence and ground failure associated to groundwater exploitation in the Aguascalientes Valley, México. Eng. Geol. 2013, 164, 172–186.
  23. Galloway, D.L.; Sneed, M. Analysis and Simulation of Regional Subsidence Accompanying Groundwater Abstraction and Compaction of Susceptible Aquifer Systems in the USA. In Boletín de la Sociedad Geológica Mexicana; USGS: Louisville, KY, USA, 2013; Volumen 65, pp. 123–136.
  24. Alberro, J.; Hernández, R. Génesis de las Grietas de Tensión en el Valle de México, el Subsuelo de la Cuenca del Valle de México y su Relación con la Ingeniería de Cimentaciones a Cinco años del Sismo; Sociedad Mexicana de Mecánica de Suelos: Ciudad de México, Mexico, 1990; pp. 95–106.
  25. Alberro, J.; Hernández, R. Fuerzas de filtración y fracturamiento hidráulico. UNAM. Serie Azul Instituto de Ingeniería. 1990, 528, 109p.
  26. Auvinet, G. Land subsidence in Mexico City. In Proceedings of the XVIIth IMSSGE Conference Geotechnical Engineering in Urban Areas Affected by Land Subsidence, Alejandria, Egypt, 5–9 October 2009; Volume 36, pp. 3–13.
  27. Carreon-Freyre, D.C.; Cerca, M.L.; Ochoa, G.H. Estudio de propagación del fracturamiento ocasionado por subsidencia en dos áreas urbanas geológicamente contrastantes de México: Las ciudades de México D.F. y Querétaro. In Proceedings of the XVIIth IMSSGE Conference Geotechnical Engineering in Urban Areas Affected by Land Subsidence, Alejandria, Egypt, 5–9 October 2009; Volume 36, pp. 49–57.
  28. Garduño-Monroy, V.H.; Arreygue-Rocha, E.; Isra de Alcántara, I.; Rodríguez-Torres, G.M. Efectos de las fallas asociadas a sobreexplotación de acuíferos y la presencia de fallas potencialmente sísmicas en Morelia, Michoacán, México. Rev. Mex. De Cienc. Geológicas 2001, 181, 37–54.
  29. Lermo-Samaniego, J.; Nieto-Obregón, J.; Zermeño, M. Fault and fractures in the valley of Aguascalientes, preliminary microzonification. In Proceedings of the 11th World Conference on Earthquake Engineering, Acapulco, Mexico, 23–28 June 1996; Elsevier Paper: Amsterdam, The Netherlands, 1651.
  30. Ortega-Guerrero, A.; Rudolph, L.D.; Chery, A.J. Analysis of long-term land subsidence near Mexico City; field investigations and predictive modeling. Water Resour. Res. 1999, 35, 3327–3341. [Google Scholar] [CrossRef]
  31. Ortiz Zamora, D.C.; Guerrero-Ortega, M.A. Origen y evolución de un nuevo lago en la planicie de chalco; implicaciones de peligro por subsidencia e inundación de áreas urbanas en el valle de Chalco (Estado de México) y Tlahuac (Distrito Federal). Investig. Geog. 2007, 64, 26–42. [Google Scholar]
  32. Chaussard, E.; Wdowinski, S.; Cabral-Cano, E.; Amelung, F. Land subsidence in central Mexico detected by ALOS InSAR time-series. Remote Sens. Environ. 2014, 140, 94–106. [Google Scholar] [CrossRef]
  33. García, S. Estudios Geotécnicos, Geofísicos y Geológicos en Diferentes Pueblos y Colonias de la Delegación Tláhuac; Mayor’s Office of Tláhuac Private Report; Mayor’s Office: Albany, NY, USA, 2019.
  34. Golubiewski, N.E. Urbanization Increases Grassland Carbon Pools: Effects of Landscaping in Colorado’s Front Range. Ecol. Appl. 2006, 16, 555–571. [Google Scholar] [CrossRef] [PubMed]
  35. Vega, F.; Matías, J.; Andrade, M.; Reigosa, M.; Covelo, E. Classification and regression trees (CARTs) for modelling the sorption and retention of heavy metals by soil. J. Hazard. Mater. 2009, 167, 615–624. [Google Scholar] [CrossRef] [PubMed]
  36. Han, J.; Mao, K.; Xu, T.; Guo, J.; Zuo, Z.; Gao, C. A soil moisture estimation framework based on the CART algorithm and its application in China. J. Hydrol. 2018, 563, 65–75. [Google Scholar] [CrossRef]
  37. Bag, R.; Mondal, I.; Dehbozorgi, M.; Bank, S.P.; Das, D.N.; Bandyopadhyay, J.; Pham, Q.B.; Al-Quraishi, A.M.F.; Nguyen, X.C. Modelling and mapping of soil erosion susceptibility using machine learning in a tropical hot sub-humid environment. J. Clean. Prod. 2022, 364, 132428. [Google Scholar] [CrossRef]
  38. García, V.J.; Márquez, C.O.; Isenhart, T.M.; Rodríguez, M.; Crespo, S.D.; Cifuentes, A.G. Evaluating the conservation state of the páramo ecosystem: An object-based image analysis and CART algorithm approach for central Ecuador. Heliyon 2019, 5, e02701. [Google Scholar] [CrossRef] [Green Version]
  39. Ma, G.; Ding, J.; Han, L.; Zhang, Z.; Ran, S. Digital mapping of soil salinization based on Sentinel-1 and Sentinel-2 data combined with machine learning algorithms. Reg. Sustain. 2021, 2, 177–188. [Google Scholar] [CrossRef]
  40. Hamze-Ziabari, S.; Bakhshpoori, T. Improving the prediction of ground motion parameters based on an efficient bagging ensemble model of M5′ and CART algorithms. Appl. Soft Comput. 2018, 68, 147–161. [Google Scholar] [CrossRef]
  41. Ospina-Gutiérrez, J.P.; Aristizábal, E. Aplicación de inteligencia artificial y técnicas de aprendizaje automático para la evaluación de la susceptibilidad por movimientos en masa. Revista Mexicana de Ciencias Geológicas 2021, 38, 43–54. [Google Scholar] [CrossRef]
  42. Morales, D.A. Técnicas de inteligencia artificial aplicadas a problemas de ingeniería civil. Rev. Arquit. E Ing. 2017, 11, 3. [Google Scholar]
  43. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth: Belmont, CA, USA, 1984. [Google Scholar]
  44. Bater, M.; Pradeepta, M.; Nathan, D.; Richard, H. R: Mining Spatial, Text, Web, and Social Media Data; Packt: Birmingham, UK, 2017; 267p, ISBN 9781788293747. [Google Scholar]
  45. Timofeev, R. Classification and Regression Trees (Cart). In Theory and Applications; Humboldt University: Berlin, Germany, 2004. [Google Scholar]
  46. Bertsimas, D.; Dunn, J. Optimal classification trees. Mach. Learn. 2017, 106, 1039–1082. [Google Scholar] [CrossRef]
  47. De’ath, G.; Fabricius, K. Classification and Regression Trees: A Powerful Yet Simple Techinque for Ecological Data Analysis. Ecology 2000, 81, 3178–3192. [Google Scholar] [CrossRef]
  48. Deconick, E.; Zhang, M.H.; Coomans, D.; Heyden, Y.V. Classification Trees Models for the Prediction of Blood-brain Barrier Passage of Drugs. J. Chem. Inf. Model. 2006, 46, 1410–1419. [Google Scholar] [CrossRef] [PubMed]
  49. Boehmke, B.; Greenwell, B. Hands-On Machine Learning with R. Chapman and Hall/CRC; Taylor & Francis: Abingdon, UK, 2020; 488p, ISBN 9781138495685. [Google Scholar]
  50. Venkatesan, E.V. Performance Analysis of Decision Tree Algorithms for Breast Cancer Classification. Indian J. Sci. Technol. 2015, 8, 1–8. [Google Scholar]
  51. Saravanan, N.; Gayathri, V. Performance and Classification Evaluation of J48 Algorithm and Kendall’s Based J48 Algorithm (KNJ48). Int. J. Comput. Intell. Inform. 2018, 7. [Google Scholar] [CrossRef]
  52. Gao, C.; Elzarka, H. The use of decision tree based predictive models for improving the culvert inspection process. Adv. Eng. Inform. 2021, 47, 101203. [Google Scholar] [CrossRef]
  53. Shi, X.; Yu, X.; Esmaeili-Falak, M. Improved arithmetic optimization algorithm and its application to carbon fiber reinforced polymer-steel bond strength estimation. Compos. Struct. 2023, 306, 116599. [Google Scholar] [CrossRef]
  54. Ye, J.; Koopialipoor, M.; Zhou, J.; Armaghani, D.J.; He, X. A Novel Combination of Tree-Based Modeling and Monte Carlo Simulation for Assessing Risk Levels of Flyrock Induced by Mine Blasting. Nat. Resour. Res. 2021, 30, 225–243. [Google Scholar] [CrossRef]
  55. Carreon-Freyre, D.; Cerca, M.; Gutierrez-Calderon, R.; AlcantaraDuran, C.; Strozzi, T.; Teatini, P. Land Subsidence and associated ground fracturing in urban areas, Study cases in central Mexico. In Proceedings of the XVI Pan-American Conference on Soil Mechanics and Geotechnical Engineering, Cancun, Mexico, 17–20 November 2019; p. 9. [Google Scholar]
  56. García, S.; Trejo, P. CART para los suelos de Tláhuac. In Proceedings of the XXX Reunión Nacional de Ingeniería Geotécnic, Guadalajara, Mexico, 11–14 November 2020. [Google Scholar]
  57. Rivera, A.; Ledoux, E.; Marsily, G.D. Nonlinear Modeling of Groundwater Flow and Total Subsidence of the Mexico City Aquifer-Aquitard System. In Proceedings of the Fourth International Symposium on Land Subsidence, Houston, TX, USA, 12–17 May 1991; pp. 45–58. [Google Scholar]
  58. Holzer, T.L.; Bluntzer, R.L. Land Subsidence Near Oil and Gas Fields, Houston, Texas. Groundwater 1984, 22, 450–459. [Google Scholar] [CrossRef]
  59. Zermeño de León, Z.M.; Mendoza-Otero, E.; Calvillo-Silva, G. Medición del Hundimiento y modelo para estudiar el agrietamiento de la ciudad de Aguascalientes. Investig. Y Cienc. De La Univ. Autónoma De Aguascalientes 2004, 31, 35–40. [Google Scholar]
  60. García, S.; KALTIA. Ambiente de Trabajo para Explorar Mapas Ligados a la Susceptibilidad al Agrietamiento en la Alcaldía Tláhuac; Mayor’s Office of Tláhuac Private Report; Mayor’s Office: Albany, NY, USA, 2019.
Figure 1. Collapse of the superficial layers of soil in southern Mexico City. (a) Exacerbated ruptures during the earthquake of 19 September 2017, (b) step-like displacements that have grown gradually for more than 10 years, and (c) cracks discovered during the construction of new houses.
Figure 2. Decision Tree.
Figure 3. Tláhuac, in southern Mexico City, and the Del Mar neighborhood, the subject of this study.
Figure 4. Scheme to represent the environment and mechanism. Left, a typical arrangement of the region with a steeply sloping irregular basement which underlies layers of material susceptible to compression and a superficial water table; right, after years, the pumping process lowers the water level and, in a certain area, on or near the steep slope of the basement, subsidence and deformations are manifested, the most alarming expression of which are cracks and steps.
Figure 5. Examples of input parameters. (a) Thickness of material with 10 < W% < 80%, and (b) thickness of material with 250 < W% < 350%.
Figure 6. Examples of input parameters. (a) NSPT inclusion depth; (b) inclusion thickness.
Figure 7. Examples of input parameters. (a) Basement slope, (b) period, (c) amplification (number of times soft ground amplifies rock motion during an earthquake).
Figure 8. Example of input parameter housing level.
Figure 9. Example of a branch of the optimal CART.
Figure 10. Schematic representation of the analyzed environment.
Figure 11. Example of the analysis sequence. (1) In the analysis area, the patches that go to the right branch, with a danger value due to an inclusion greater than 4, are separated; the areas susceptible to further analysis are shown on the right side. (2) The next division is made by position X, Y with respect to the origin, which yields the subarea within the dotted line. (3) In this subarea, the patches that contain houses less than two stories high are indicated (solid circles). (4) For each patch, it is determined whether the W% values go to the right branch. (5) The next division is made by Y. (6) The final leaf is reached, where the patches are classified as cracking activated or not-activated (lines represent recognized cracks and fractures).
Figure 12. The 69 branches of the CART applied to the area shown; continuous lines on the map are the cracks, fissures, or steps recognized through extensive in situ geo-campaigns. Predictions are shown as circles: red for “cracking activated” and blue for “cracking not-activated”.
Table 1. Input parameters description.
Parameter | Definition | Units
Geotechnical inputs
Water content W% | W%, also known as natural water content or natural moisture content, is the ratio of the weight of water to the weight of the solids in a mass of soil. | %
Liquid limit LL | LL is the water content at which the soil starts to behave as a liquid. | %
Plastic limit PL | PL is the moisture content at which a fine-grained soil cannot be remolded without cracking. | %
Plasticity index PI | PI is the range of water content over which the soil remains in a plastic state. | %
NSPT | The number of blows required to penetrate 30 cm, known as the “standard penetration resistance SPT”, used to estimate some important geotechnical engineering properties of the soil. | Number of blows
Inclusion | Sufficiently thin semirigid layer. | I–V, danger levels (function of stiffness, thickness, and depth)
Superficial filling | Surface layer composed of low-quality fill (garbage, construction waste). | Meters
USCS | The Unified Soil Classification System, the widely accepted geotechnical soil classification. |
Cohesion/friction values | Cohesion refers to shear strength under zero normal stress, or the intercept of a material’s failure envelope with the shear stress axis in the shear stress–normal stress space. Friction angle is derived from the Mohr–Coulomb failure criterion and is used to describe the frictional shear resistance of soils (together with the normal effective stress). | kPa/°
Soft deposits thickness | Thickness of the soft clay layers, which contain the materials susceptible to consolidation due to pumping in deep layers. | Meters
Amplification | The amplification factor used to account for site effects, i.e., how a site condition modifies the rock acceleration response spectra. | Dimensionless
Soil period | The time interval required for one full cycle of a wave. It depends on the characteristics of the soil deposits. | Seconds
Depth of the basement | Depth, measured from the surface, at which the layer considered firm is found. | Meters
Slope of the basement | Slope of the layer considered firm. | Degrees
Anthropic parameters
Housing level | Classification of houses, including their weight (related to building height) and structural solution (including foundation and superstructure). | Number of floors, from one to five, and structuration as “precarious”, “barely”, or “complete”
Size of the streets | Traffic load, ranging from a minimum local flow to very heavy motor transport circulation. | Number of lanes
Pumping wells | Distance between the closest pumping well(s) and the studied site. | X and Y, meters
Groundwater level | Also referred to as the water table, this represents the top of the saturated zone; above the water table lies the unsaturated zone. | Meters
Table 2. Confusion matrix.
Total Population: 3645 | Predicted Positive (PP): 166 | Predicted Negative (PN): 3479
Actual Positive (AP): 309 | True Positive (TP): 143 | False Negative (FN): 166
Actual Negative (AN): 3336 | False Positive (FP): 23 | True Negative (TN): 3313
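The counts in Table 2 can be condensed into the usual summary rates. The following is a minimal sketch, not the authors' evaluation code: the function name `classification_metrics` and the choice of metrics are assumptions made here for illustration, and only the four counts (TP, FP, FN, TN) come from the table.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard rates derived from a binary confusion matrix.

    Hypothetical helper for illustration; the definitions are the
    textbook ones, not taken from the article.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,        # share of all patches classified correctly
        "precision": tp / (tp + fp),          # share of predicted positives that are real
        "recall": tp / (tp + fn),             # share of actual positives that are found
        "specificity": tn / (tn + fp),        # share of actual negatives that are found
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


# Counts copied from Table 2 ("cracking activated" is the positive class).
metrics = classification_metrics(tp=143, fp=23, fn=166, tn=3313)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy comes out near 0.95 while recall is near 0.46, which is worth keeping in mind when reading the table: the negative class is much larger than the positive one (3336 versus 309), so accuracy alone says little about how many activated cracks are detected.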
Table 3. Correlation attribute level.
Parameter | Correlation Attribute Level
Size of the streets | 0.174244
Inclusion | 0.17366
x | 0.14621
Depth of the basement | 0.13615
Water content % | 0.07877
Groundwater level | 0.02658
y | 0.02252
Housing level | 0.01159
Soft deposit thickness | 0.00996
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Trejo, P.; García, S.; Vincent, S. A Classification Tree for Modeling Ground Fractures from Subsidence. Appl. Sci. 2023, 13, 3123. https://doi.org/10.3390/app13053123