1. Introduction
The Rio del Rey Basin in Cameroon is located on the northeastern edge of the Niger Delta Basin and is rich in oil and gas resources, making it the primary oil-producing region in Cameroon. The deep turbidite sand bodies within this block represent the next targets for exploration and production. These deep turbidite sandstone reservoirs are buried at depths ranging from 2500 to 3500 m, with a maximum formation pressure coefficient of 2.10, the highest bottom-hole temperature reaching 161 °C, and a maximum geothermal gradient of 4.6 °C/100 m, classifying them as typical high-temperature and high-pressure reservoirs. These reservoirs are relatively young in terms of diagenesis and are influenced by fault blocks and mud diapirism. The causes of overpressure are complex, potentially influenced by the undercompaction of strata, high-temperature fluid expansion, mud diapirism, and fault block development, leading to a multifaceted pressure system. Conventional prediction methods based on sediment compaction theory primarily apply to the undercompaction mechanism of sandstone and mudstone strata, while the Bowers unloading model is predominantly used for fluid expansion strata [
1]. However, for the multi-mechanism superposition observed in this area, single-pressure mechanism prediction methods can yield significant errors and fail to provide precise predictions. Since the 1990s, the importance of accurately determining the pressure mechanism has been increasingly recognized. Early studies primarily relied on qualitative inferences related to sedimentary evolution, stratigraphic mineral composition, and hydrocarbon migration characteristics; however, these methods often lack substantial data support and can vary greatly depending on the individual researcher’s understanding of the geological context. In later studies, the incorporation of well logging data and rock mechanics has led to more intuitive and convincing judgments [
2]. For instance, Teige [
3] summarized well logging data from a Norwegian field, noting that low acoustic velocity and resistivity, without distinct abnormal high-pressure sections in porosity and density curves, are likely indicative of fluid expansion. Bowers [
4] introduced the acoustic-density crossover method to discern the causes of undercompaction, pore fluid expansion, and tectonic compression by analyzing the positional relationships between measured data points and the ideal loading curve. Nevertheless, this method has limitations in identifying subtle causes such as diapirism. Ye et al. [
5] proposed a method to distinguish between undercompaction and tectonic compression mechanisms, considering specific regional structures and sedimentation. Although the processes of undercompaction and tectonic compression share similarities, they fundamentally differ; undercompaction typically involves a one-dimensional perspective where vertical stress from sediment loading leads to abnormal pressure due to incomplete dewatering, while tectonic compression operates in three dimensions, with horizontal and vertical stresses interacting due to tectonic forces such as faulting or folding. Ye et al. [
5] pointed out that in areas where the folds are not tight, high-stress tectonic compression generally does not occur; however, near mountain fronts and active large reverse faults, tectonic compression can be significant. The acoustic emission test of underground coring can thus be used to determine the magnitude of in situ stress. Building on this line of reasoning, Zhang et al. [
6] added the factor of overpressure transmission and proposed an identification pattern diagram based on four mechanisms (unbalanced compaction, tectonic compression, fluid expansion, and overpressure transmission), which can distinguish between unbalanced compaction and tectonic compression; however, they noted difficulties in estimating fluid expansion and overpressure transmission. In 2017, Fan et al. emphasized that when acoustic, density, and resistivity logging curves are plotted on the same depth coordinate system, significant deviations in all three curves at the same logging depth confirm the occurrence of undercompaction [
7]. Recently, Guo et al. [
8] utilized a multi-well logging curve combination method and Bowers’ acoustic density crossover diagram to identify the causes of overpressure in the Yinggehai Basin, highlighting the relevance of advanced logging techniques in pressure assessment. Furthermore, Ai et al. [
9] combined mud mineral analysis with acoustic time difference-density crossover diagram analysis to determine different compaction stages, yielding ideal predicted pressure results. This showcases the application of innovative methodologies in addressing traditional overpressure assessment challenges.
In addition, recent advances in machine learning and artificial intelligence have shown great promise in predicting abnormal pressure conditions. For instance, Chen et al. [
10] explored the use of support vector machines and neural networks for predicting pore pressure in complex geological environments, demonstrating improved accuracy and reliability over traditional methods. Similarly, Liu et al. [
11] applied deep learning techniques to analyze logging data, revealing patterns in abnormal pressure occurrences that were previously undetectable through conventional approaches. The primary objective of this study is to provide a robust framework for identifying and quantifying the overpressure mechanisms in the Rio del Rey Basin. We integrate theoretical judgment with a hierarchical clustering algorithm to label the causes of overpressure and calculate their weights. We then train a LightGBM model, with the training set balanced by the SMOTE algorithm and the hyperparameters tuned by Bayesian optimization, to identify the mechanisms responsible for abnormal pressure. This method enables formation pressure to be calculated under the combined effects of undercompaction, high-temperature fluid expansion, and structural diapirism, and provides a scientific basis for subsequent oil and gas exploration. A deeper, quantitative understanding of these mechanisms supports drilling and production decisions, reducing operational risk and improving the efficiency and safety of resource extraction. The approach may also offer ideas and methods for research in similar geological environments.
2. Mechanisms of Abnormal Formation Pressure
Abnormal formation pressure or abnormal formation pore pressure primarily includes abnormally low pressure and abnormally high pressure, both of which are widely present in major oil and gas basins around the world. Among these, abnormally high pressure is more common, while abnormally low pressure is relatively rare. The mechanisms behind abnormally high pressure are highly complex and can be triggered by a single factor or by the combined effect of multiple factors. Geological deposition, chemical reactions, and physical processes can all contribute to an abnormal increase in underground fluid pressure. Common causes of abnormally high pressure include undercompaction, hydrocarbon generation from source rock cracking, tectonic compression, fault activity, and diapirism. Undercompaction occurs when sediment layers fail to consolidate adequately due to rapid sedimentation, leading to an increase in pore pressure [
6]. Hydrocarbon generation involves the thermal cracking of organic matter in source rocks, which can produce additional fluid and increase pore pressure [
12]. Tectonic compression arises from the interaction of tectonic plates and can significantly alter stress conditions in the subsurface [
8]. Fault activity can create pathways for fluid migration, resulting in localized pressure increases [
1]. Diapirism, characterized by the upward movement of less dense materials, can also impact fluid dynamics and pressure distributions [
9]. These factors often interact, leading to even more complex pressure systems. For instance, Zhang et al. [
6] demonstrated that undercompaction and tectonic forces can coexist, complicating the pressure profiles in sedimentary basins. To better understand these interactions, Fan et al. [
13] classified the causes of abnormally high pressure into four categories based on changes in vertical effective stress during sedimentary loading. These categories include:
- 1.
Primary Sedimentary Loading Mechanism—Undercompaction
During diagenesis, as sediments accumulate, pressure from the overlying rock layers increases, continuously squeezing water out of the sediments. As a result, vertical effective stress increases and porosity decreases with the enhanced compaction. The normal compaction process involves an increase in vertical effective stress, i.e., continuous loading. Undercompaction occurs when water in the sediments cannot be expelled smoothly, resulting in porosity that does not decrease with increased compaction. Vertical effective stress either slows its rate of increase or remains unchanged, while the overlying rock pressure continues to increase normally. According to effective stress theory, this leads to the occurrence of abnormally high pressure [
14,
15,
16].
- 2.
Reloading Mechanism—Tectonic Activity
Under certain conditions, tectonic activity can close open fractures, making it difficult for fluids to be expelled, reducing porosity while the overlying rock pressure remains unchanged. As vertical effective stress decreases, abnormally high pressure naturally occurs according to rock effective stress theory. Common tectonic compression involves the distribution of stress between rock volume and fluids, including rock porosity, rock and fluid compressibility properties, and rock stress in sedimentary basins. Stress can only be converted into pore fluid pressure when fluid flow is restricted (tectonic compression can locally open fluid channels, leading to abnormally low pressure). Similar effects are seen with diapirism, fault activity, and salt domes, provided that fluid channels can be closed. Since fault activity can sometimes connect fluid pathways, geological conditions must be carefully considered when dealing with such abnormal high pressures [
17,
18,
19,
20].
- 3.
Unloading Mechanism—Pore Fluid Expansion
During or after compaction, pore fluid volume increases or the overlying rock pressure decreases due to one or more reasons, reducing vertical effective stress.
- a.
Hydrocarbon Generation and Thermal Cracking of Hydrocarbons
Hydrocarbon generation: Kerogen in source rock transforms into less dense oil and gas, causing pore fluid volume to expand. After hydrocarbon expulsion, the fluid migrates to the reservoir, with the pore fluid bearing part of the overlying rock load, resulting in abnormally high pressure.
- b.
Transformation from Montmorillonite to Illite
Montmorillonite is a very common clay mineral found in shale, and its crystal structure contains abundant interlayer water. Freshly deposited montmorillonite is hydrated, having absorbed free water between the crystal particles. As burial depth and temperature increase, montmorillonite dehydrates and converts to illite, releasing the interlayer water into the pores and increasing the pore fluid volume.
- c.
Transformation from Gypsum to Anhydrite
The transformation from gypsum to anhydrite releases water of crystallization into the pores, which raises the pore fluid pressure. In the reverse process, anhydrite absorbs water, converts back to gypsum, and expands, increasing rock volume and reducing rock porosity, which can also lead to higher pressure.
- 4.
Minimal Change in Porosity
Although the porosity remains essentially unchanged, abnormal high pressure can still occur. For example, mineral transformations (such as the conversion of montmorillonite to illite) may lead to the redistribution of fluids and an increase in pressure, even though there is little change in porosity. Additionally, chemical reactions between fluids can result in the generation of dissolved gases or minerals, thereby increasing the pressure of the pore fluids. For instance, reactions between groundwater and dissolved carbonate minerals can lead to the formation of gas bubbles, which causes fluid pressure to rise. These mechanisms are crucial for understanding the complexity of subsurface pressures and their implications for oil and gas exploration [
21,
22].
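The four categories above are all framed in terms of vertical effective stress. The following is a minimal sketch of that relation, assuming the Terzaghi form (pore pressure equals overburden pressure minus vertical effective stress) and a fresh-water hydrostatic reference column. The function names and all numerical values are illustrative assumptions, not data from the study area.

```python
# Minimal sketch of the effective stress relation (Terzaghi form, assumed):
#   pore_pressure = overburden_pressure - vertical_effective_stress
# All numbers are illustrative and are not taken from the study area.

G = 9.81                # gravitational acceleration, m/s^2
WATER_DENSITY = 1000.0  # kg/m^3, fresh-water reference column (assumption)


def pore_pressure(overburden_mpa, effective_stress_mpa):
    """Pore pressure in MPa from overburden and vertical effective stress."""
    return overburden_mpa - effective_stress_mpa


def pressure_coefficient(pore_pressure_mpa, depth_m):
    """Ratio of pore pressure to the hydrostatic pressure at the same depth."""
    hydrostatic_mpa = WATER_DENSITY * G * depth_m / 1e6
    return pore_pressure_mpa / hydrostatic_mpa


depth = 3000.0
overburden = 66.0  # MPa, hypothetical value at 3000 m

# Normal compaction: effective stress keeps increasing with burial.
normal = pore_pressure(overburden, effective_stress_mpa=36.6)

# Undercompaction: overburden is the same, but effective stress has stalled
# because pore water could not be expelled.
undercompacted = pore_pressure(overburden, effective_stress_mpa=15.0)

print("normal coefficient:", round(pressure_coefficient(normal, depth), 2))
print("undercompacted coefficient:",
      round(pressure_coefficient(undercompacted, depth), 2))
```

With the overburden held fixed, any mechanism that lowers or stalls the vertical effective stress shows up directly as a higher pressure coefficient.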
2.1. Data Collection for the Target Work Area
- (1)
Basic Data
Comprehensive and accurate input data can reflect the overall characteristics of a block and assist researchers in analyzing specific wells. The input data collected in this study primarily include the stratigraphy of the Rio del Rey Basin in Cameroon, a comprehensive geological logging database for key wells, drilling fluid usage, drill bit usage, and records of drilling complexities.
- (2)
Logging Data
The conventional method for calculating formation pore pressure from logging data involves calculating the effective stress from sonic transit time data and the overlying formation (overburden) pressure from density logging data. The pore pressure is then determined by combining these values through effective stress theory. Two main factors influence the reliability of formation pressure calculations: ① the quality of the logging data and the preprocessing methods used, and ② the model used to derive the formation pore pressure. Therefore, it is important to use high-quality sonic transit time logging data (generally with a clear mudstone trend and minimal borehole enlargement) and available density logging data, with sufficiently long logging intervals. Interference factors and false data points should also be eliminated as far as possible. Preliminary processing of the original sonic transit time and density data yields relatively pure mudstone sonic transit time and mudstone density data, which are closer to the true formation conditions. On this basis, a pore pressure calculation model that matches the actual regional conditions can be selected to achieve ideal detection results. In the studied block, the collected logging data include well depth, sonic transit time, natural gamma, P wave velocity, porosity, shale content, density, and overlying formation pressure [
23,
24,
25].
- (3)
Seismic Data
Two-dimensional seismic data inversion requires seismic line data and individual well logging data. Preferably, seismic lines that pass through wells or are adjacent to wells (if no lines directly pass through wells) should be used. Post-stack migration pure wave seismic data and corresponding wells should be selected, with priority given to two cross-lines that intersect at wells and their corresponding wells, lines passing through multiple wells, or lines near many wells. The selected lines should reflect the research area’s spatial arrangement. Corresponding interpreted horizon data, geological layering, and time–depth conversion data (VSP, geological layering data such as boundary depths, formation codes, or names) should also be used [
26,
27,
28].
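The logging workflow described above (overburden pressure from the density log, and preprocessing to keep relatively pure mudstone points) can be sketched as follows. This is a minimal illustration under stated assumptions: the trapezoidal integration, the extension of the first logged density up to the surface, the `shale_cutoff` value, and the sample values are all hypothetical choices, not the study's procedure.

```python
# Sketch: overburden pressure by integrating a density log over depth, plus a
# simple mudstone filter. Cutoff and sample values are illustrative assumptions.

G = 9.81  # gravitational acceleration, m/s^2


def overburden_mpa(depths_m, densities_g_cm3):
    """Trapezoidal integration of rho * g over depth; returns MPa per sample.

    Assumes the first logged density also applies from the surface down to
    the first sample (a simplification for unlogged shallow intervals).
    """
    prev_depth, prev_rho = 0.0, densities_g_cm3[0]
    total_pa = 0.0
    result = []
    for depth, rho in zip(depths_m, densities_g_cm3):
        mean_rho_kg_m3 = 0.5 * (prev_rho + rho) * 1000.0
        total_pa += mean_rho_kg_m3 * G * (depth - prev_depth)
        result.append(total_pa / 1e6)
        prev_depth, prev_rho = depth, rho
    return result


def mudstone_samples(samples, shale_cutoff=0.6):
    """Keep samples whose shale content reaches the (hypothetical) cutoff."""
    return [s for s in samples if s["shale_content"] >= shale_cutoff]


depths = [500.0, 1000.0, 1500.0, 2000.0]
densities = [2.0, 2.2, 2.3, 2.4]
sv = overburden_mpa(depths, densities)

log = [
    {"depth": 1000.0, "shale_content": 0.8, "transit_time": 110.0},
    {"depth": 1500.0, "shale_content": 0.3, "transit_time": 85.0},
    {"depth": 2000.0, "shale_content": 0.7, "transit_time": 95.0},
]
mudstone = mudstone_samples(log)

print([round(v, 2) for v in sv])
print([s["depth"] for s in mudstone])
```

The effective stress term still has to come from a sonic-based model chosen for the region, and that choice is not shown here.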
2.2. Data Sorting and Analysis
Based on the collected logging data, this study extracted and organized P wave velocity and density data at equivalent depths in the block, presenting them in a velocity–density cross-plot. The cross-plot shows the relationship between P wave velocity and density over the entire well depth, with each data point corresponding to specific values at certain depths. Detailed analysis of the cross-plot can extract important information about formation properties, especially the mechanisms behind abnormal formation pressures. In the plot, normal formations typically exhibit a linear trend representing the relationship between porosity and mineral composition. However, any anomalies deviating from this trend may indicate abnormal formations. The velocity–density cross-plot’s variations over the entire well depth are shown in
Figure 1.
From the P wave velocity and density cross-plot, three main mechanisms for abnormal pressure can be identified: undercompaction, fluid expansion, and mudstone diapirism.
- (1)
Undercompaction: Rapid sedimentation may not allow for proper compaction, resulting in higher porosity and lower density. P wave velocity and resistivity can reflect porosity to some extent; thus, lower velocity and higher resistivity may indicate undercompaction. This corresponds to the lower value area along the normal trend line in the velocity–density cross-plot.
- (2)
Tectonic Compression: Compaction in addition to that caused by sedimentary loading leads to overcompaction, in which pores are compressed and fractures are closed, resulting in increased P wave velocity (reduced resistivity, decreased porosity), though density increases only slightly. Mudstone diapirism differs from regional tectonic displacement and transport in the direction of compression. This corresponds to the higher value area along the normal trend line in the velocity–density cross-plot, similar to an inverse unloading line.
- (3)
Fluid Expansion: This includes hydrothermal pressurization and hydrocarbon generation (oil and gas). Starting from normal sedimentary compaction, the rock volume is expanded by fluids, which reduces P wave velocity (and increases resistivity) while density changes only slightly. The formation temperature is higher than that expected from the normal geothermal gradient. Pressure changes caused by hydrothermal pressurization can be quantified using the thermal expansion coefficient of water. Tectonic compression can be regarded as overloading and fluid expansion as unloading. This corresponds to the lower velocity value area along the normal trend line in the velocity–density cross-plot, positioned on the unloading line.
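As a rough illustration of quantifying hydrothermal pressurization with the thermal expansion coefficient of water, the sketch below assumes a perfectly sealed pore space of constant volume, for which the pressure rise is ΔP = (α/β)·ΔT, where α is the thermal expansion coefficient and β the compressibility of water. The default coefficient values, the surface temperature, and the normal gradient are order-of-magnitude assumptions for illustration only. Because real formations leak and deform, the result is an upper bound rather than a prediction.

```python
# Sketch: upper-bound aquathermal pressure increase for a sealed, constant-
# volume pore space: dP = (alpha / beta) * dT. Coefficients are assumed,
# order-of-magnitude values for hot formation water.


def excess_temperature(depth_m, measured_t_c, surface_t_c,
                       normal_gradient_c_per_100m):
    """Temperature above what the normal geothermal gradient would give."""
    expected = surface_t_c + normal_gradient_c_per_100m * depth_m / 100.0
    return measured_t_c - expected


def aquathermal_pressure_increase(delta_t_c, alpha_per_c=5.0e-4,
                                  beta_per_mpa=4.5e-4):
    """Pressure rise in MPa for a temperature excess in degrees Celsius."""
    return alpha_per_c / beta_per_mpa * delta_t_c


# Hypothetical point: 3000 m, 125 C measured, 25 C surface temperature,
# 3.0 C/100 m taken as the normal gradient.
delta_t = excess_temperature(3000.0, 125.0, 25.0, 3.0)
delta_p = aquathermal_pressure_increase(delta_t)

print("excess temperature (C):", delta_t)
print("upper-bound pressure rise (MPa):", round(delta_p, 1))
```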
In the studied area’s velocity–density cross-plot, actual data for high-pressure anomalies show distinct characteristics. Specifically, undercompaction occupies a larger proportion throughout the area, concentrated in shallower and mid-level formations. Further analysis indicates fluid expansion phenomena are mainly in mid-level and deeper formations, while mudstone diapirism is primarily found to occur in deep formations.
4. Analysis of Abnormal Formation Pressure Mechanisms
Utilizing well logging data, we extracted and calculated essential parameters, including effective well depth, sonic velocity, density, resistivity, and natural gamma radiation. A detailed analysis of the sonic velocity as a function of depth was conducted to identify the location of the normally compacted section, which is determined to be situated at depths exceeding 1300 m. The corresponding data for this normally compacted section are presented in Figure 7. Subsequently, this dataset was employed to establish the Gardner correction equation, facilitating the derivation of the normal trend line for the velocity–density relationship. This trend line serves as a critical reference point for identifying and analyzing abnormal pressures in the subsequent stages of the study.
Before hierarchical clustering is applied, the number of clusters must be determined; this step directly affects the quality and usefulness of the final clustering results. This paper combines the elbow method and the silhouette method to determine the optimal number of clusters. Figure 8 is the elbow method graph, which plots the distortion value (Y axis) against the number of clusters (X axis). The elbow point is the position where the curve bends sharply, and the number of clusters at the elbow point is generally taken as optimal. According to the elbow method graph, the optimal number of clusters is n = 6. Building on the elbow method, clustering results under different numbers of clusters were compared. When n = 6, hierarchical clustering identifies clusters that conform to the pressure recognition mechanism, and the classification of the dataset is clearer. Some clustering results under different numbers of clusters are shown in Figure 9.
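The selection of the cluster number can be illustrated with a small, self-contained sketch. It assumes a naive agglomerative scheme with centroid linkage (the linkage used in the study is not stated), uses within-cluster squared distance as the distortion, and runs on a toy set of three well-separated groups of normalized (velocity, density) points. The toy data therefore favour three clusters, not the n = 6 obtained for the real dataset.

```python
import math

# Sketch: agglomerative clustering with distortion (elbow) and silhouette
# scores for several cluster counts. Linkage choice and data are assumptions.


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def agglomerative(points, n_clusters):
    """Merge the two clusters with the closest centroids until n remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def distortion(clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(dist(p, centroid(c)) ** 2 for c in clusters for p in c)


def silhouette(clusters):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for ci, c in enumerate(clusters):
        for p in c:
            if len(c) == 1:
                scores.append(0.0)
                continue
            a = sum(dist(p, q) for q in c) / (len(c) - 1)
            b = min(sum(dist(p, q) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy normalized (velocity, density) points in three separated groups.
points = [(0.10, 0.10), (0.12, 0.14), (0.15, 0.11),
          (0.50, 0.52), (0.53, 0.49), (0.48, 0.55),
          (0.90, 0.88), (0.92, 0.93), (0.87, 0.91)]

results = {}
for k in range(2, 6):
    found = agglomerative(points, k)
    results[k] = (distortion(found), silhouette(found))

best_k = max(results, key=lambda k: results[k][1])
for k, (d, s) in results.items():
    print(k, round(d, 4), round(s, 3))
print("best k by silhouette:", best_k)
```

The distortion always falls as clusters are added, which is why the elbow is read from the shape of the curve, while the silhouette score gives a separate check that peaks at a specific cluster count.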
To accurately divide the clustering results among the different abnormal pressure mechanisms, a normal compaction trend line is drawn using the Gardner correction equation established from the normal compaction section data. The Gardner empirical formula relates formation density to P wave velocity, but applying it directly to the velocity–density relationship of a specific area often introduces a certain error. Therefore, the Gardner correction equation is obtained by fitting the Gardner formula to the normal compaction section data from the #A well and #B well in Cameroon. The fitted Gardner correction equation is shown in Figure 10.
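One way such a correction could be obtained is sketched below, assuming the corrected equation keeps the power-law form of the Gardner relation, ρ = a·V^b, with a and b refitted by least squares in log space. The actual form of the study's corrected equation appears only in Figure 10, so this form is an assumption. The samples are synthetic, generated from the classic coefficients (a = 0.31, b = 0.25 for V in m/s and ρ in g/cm³) so that the fit can be checked.

```python
import math

# Sketch: refit the Gardner power law rho = a * V**b on normal-compaction
# samples by linear least squares on ln(rho) = ln(a) + b * ln(V).
# The power-law form of the corrected equation is an assumption.


def fit_gardner(velocities_m_s, densities_g_cm3):
    """Return (a, b) of rho = a * V**b fitted in log space."""
    xs = [math.log(v) for v in velocities_m_s]
    ys = [math.log(r) for r in densities_g_cm3]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def trend_density(velocity_m_s, a, b):
    """Density predicted by the fitted normal compaction trend."""
    return a * velocity_m_s ** b


# Synthetic normal-compaction samples from the classic Gardner coefficients.
velocities = [2000.0, 2500.0, 3000.0, 3500.0, 4000.0]
densities = [0.31 * v ** 0.25 for v in velocities]

a, b = fit_gardner(velocities, densities)
print("a =", round(a, 4), "b =", round(b, 4))

# Offset of a hypothetical measured point from the trend at its velocity.
residual = 2.10 - trend_density(3000.0, a, b)
print("density residual at 3000 m/s:", round(residual, 3))
```

The sign and size of such residuals are what place a data point above or below the normal trend line on the cross-plot.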
In the Rio Del Rey Basin, by combining the geological data of the target area and the velocity–density cross-plot, it can be seen that various mechanisms such as undercompaction, fluid expansion, and shale diapirism are superimposed and jointly affect the formation of abnormal high pressure. The corrected Gardner curve is used as a loading curve on the velocity–density cross-plot to identify the abnormal pressure mechanism. As shown in
Figure 11, Categories 1, 4, and 5 are on the left and right sides below the normal trend line, deviating towards the lower values of velocity and density relative to the normal compaction trend section of this depth range but still distributed around the loading curve or having a similar trend to the loading curve, and their abnormal causes correspond to undercompaction; Category 3 is slightly below the normal trend line, relatively concentrated near the unloading curve of this depth range, and its abnormal cause corresponds to hydrothermal pressure increase; Categories 0 and 2 are above the normal trend line, similar to the opposite process of fluid expansion, with a certain increase in velocity relative to the normal velocity and a slight increase or unchanged density, and their abnormal causes correspond to shale diapirism.
When processing data from the abnormal high-pressure section, logging parameters such as well depth, velocity, shale content, porosity, and density are selected as features to identify pressure mechanisms. The labelled abnormal pressure samples are randomly divided into training and testing sets in a 7:3 ratio, and the training set is balanced by the SMOTE algorithm to improve the robustness and generalization ability of the model. To eliminate the effect of differing units and value ranges among the features, min-max normalization is applied to the dataset so that every feature has the same scale, which improves the training effect.
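The preprocessing steps above can be sketched in plain Python. This is an illustrative outline under stated assumptions: the 7:3 split is done per class (stratification is an assumption made so that the toy minority class is never empty), the SMOTE routine is a simplified version that interpolates between a minority sample and one of its nearest minority neighbours, and the two-feature toy data are hypothetical. The steps follow the order given in the text, so neighbour distances here are computed on unscaled features.

```python
import random

# Sketch: 7:3 split, simplified SMOTE on the training set, then min-max
# scaling fitted on the training data. Data and helper names are hypothetical.


def stratified_split(rows, labels, test_ratio=0.3, seed=0):
    """Random split per class (stratification is an assumption)."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        rng.shuffle(members)
        n_test = round(len(members) * test_ratio)
        test += [(r, cls) for r in members[:n_test]]
        train += [(r, cls) for r in members[n_test:]]
    return train, test


def smote(minority, n_new, k=3, seed=0):
    """Create n_new points between minority samples and near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((u - v) ** 2 for u, v in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(u + gap * (v - u) for u, v in zip(base, target)))
    return synthetic


def fit_min_max(rows):
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(row, lows, highs)) for row in rows]


# Toy (velocity m/s, density g/cm^3) samples: 10 of class 0, 4 of class 1.
class0 = [(2000.0 + 100.0 * i, 2.00 + 0.02 * i) for i in range(10)]
class1 = [(3500.0 + 50.0 * i, 2.45 + 0.01 * i) for i in range(4)]
rows, labels = class0 + class1, [0] * 10 + [1] * 4

train, test = stratified_split(rows, labels)
majority = [r for r, y in train if y == 0]
minority = [r for r, y in train if y == 1]
minority += smote(minority, n_new=len(majority) - len(minority))
balanced = [(r, 0) for r in majority] + [(r, 1) for r in minority]

lows, highs = fit_min_max([r for r, _ in balanced])
train_scaled = apply_min_max([r for r, _ in balanced], lows, highs)
test_scaled = apply_min_max([r for r, _ in test], lows, highs)
print(len(train), len(test), len(majority), len(minority))
```

Only the training set is oversampled, so the test set keeps the original class proportions and the reported accuracy is not inflated by synthetic samples.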
The Bayesian optimization algorithm based on the Gaussian process is introduced to optimize the number of base models, the depth of decision trees, and the learning rate of the LightGBM model. The search ranges for these parameters are [1, 100], [1, 50], and [0, 1], respectively. Through 50 iterations of the Bayesian optimization algorithm, the combination parameters of the LightGBM model with the highest recognition accuracy are finally obtained.
The search and optimization process of the Bayesian optimization algorithm is shown in Figure 12 and Figure 13. Figure 12 is a slice graph of the parameter relationships in the model, showing how the different hyperparameters progress over multiple trials. The horizontal coordinates are the three hyperparameters of the model (learning rate, tree depth, and number of trees), the vertical coordinates are the objective function values, which represent the model accuracy, and the legend on the right gives the number of iterations. Figure 13 is a history graph of the hyperparameter optimization, showing how the model performance improves over multiple iterations. The horizontal coordinate is the number of iterations and the vertical coordinate is the objective function value, which represents the model accuracy; the blue points represent the accuracy at the current iteration and the red points the best accuracy so far.
After comparative testing, the original LightGBM model has an accuracy rate of 0.906 on the test set. However, after model optimization using the Bayesian optimization algorithm, the new model achieved a higher accuracy rate on the same test set, reaching 0.942. This means that the new model is more accurate and reliable in identifying the mechanisms of abnormal pressure.
Figure 14 is the confusion matrix of the results of the two models in identifying the abnormal pressure mechanism in the test set, used to evaluate the performance of the classification model. The horizontal and vertical axes represent the predicted values and true values of the model, respectively, with 0, 1, and 2 representing the abnormal pressure mechanisms of undercompaction, hydrothermal pressure increase, and shale diapirism, respectively. The color bar on the right explains the range of sample quantities corresponding to the color of each cell in the confusion matrix. It can be found that the optimized model is superior to the original model in the three abnormal pressure mechanisms, indicating that it is feasible to identify the mechanisms of abnormal pressure through machine learning.
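For readers unfamiliar with the evaluation, the sketch below shows how a confusion matrix with the layout described above (rows for true values, columns for predicted values) and the resulting accuracy can be computed. The label vectors are hypothetical and are not the study's test set.

```python
# Sketch: confusion matrix (rows = true class, columns = predicted class),
# overall accuracy and per-class recall. Labels follow the text:
# 0 = undercompaction, 1 = hydrothermal pressure increase, 2 = shale diapirism.
# The label vectors below are hypothetical.


def confusion_matrix(y_true, y_pred, n_classes=3):
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall_per_class(matrix):
    """Share of each true class that was identified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 0]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0, 0]

cm = confusion_matrix(y_true, y_pred)
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
print("recall per class:", [round(r, 2) for r in recall_per_class(cm)])
```

Per-class recall is worth reading alongside overall accuracy here, because undercompaction samples dominate and a model can score well overall while misclassifying the rarer mechanisms.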
In the Rio Del Rey Basin, various mechanisms such as undercompaction, fluid expansion, and shale diapirism are superimposed and jointly affect the formation of abnormal high pressure. By calculating the weights of the high-pressure mechanism categories at different well depths in the high-pressure layer segments of the #A well, #B well, and BRM1 well in the Rio Del Rey Basin, the relative importance of each mechanism in the formation of abnormal high pressure can be determined. The experimental results show that undercompaction is the dominant mechanism for abnormal high pressure in the Rio Del Rey Basin, accounting for about 70%, while fluid expansion and shale diapirism account for 10% and 20%, respectively. The proportion of each mechanism is shown in
Figure 15. This means that in this area, undercompaction is the main factor causing abnormal high pressure, and the tectonic compression caused by shale diapirism and the fluid expansion caused by hydrothermal pressure increase further promote the increase in pore fluid pressure in the strata. This finding is of great significance for a deeper understanding of the geological pressure characteristics and sedimentary action in this area. Through the weight analysis of these mechanisms, the formation process of abnormal high pressure can be more accurately predicted and explained, providing strong support for geological research and engineering applications.
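The text does not spell out how the weights are computed. One plausible reading, sketched below as an assumption, is that the weight of each mechanism is the share of depth samples in the high-pressure intervals assigned to it. The label counts are hypothetical and were chosen only so that the output echoes the reported proportions.

```python
from collections import Counter

# Sketch: mechanism weights as the share of classified depth samples.
# This reading of "weight" is an assumption; the labels are hypothetical.

MECHANISMS = {
    0: "undercompaction",
    1: "fluid expansion",
    2: "shale diapirism",
}


def mechanism_weights(predicted_labels):
    """Fraction of samples assigned to each mechanism."""
    counts = Counter(predicted_labels)
    total = sum(counts.values())
    return {name: counts.get(code, 0) / total
            for code, name in MECHANISMS.items()}


def dominant_mechanism(weights):
    return max(weights, key=weights.get)


# Hypothetical per-sample labels over the high-pressure intervals.
labels = [0] * 14 + [1] * 2 + [2] * 4

weights = mechanism_weights(labels)
for name, w in weights.items():
    print(name, round(w, 2))
print("dominant:", dominant_mechanism(weights))
```

The same function can be applied to the samples of a single well or depth interval to see how the mix of mechanisms changes with depth.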
According to the aforementioned method for identifying overpressure mechanisms, the causes of overpressure in the different strata of the #A well in the work area were identified and analyzed using the limited data available, and the calculation results are shown in Table 1.