1. Introduction
Oil is the lifeblood of industry, and China has emerged as the world’s largest oil importer. As China’s dependence on foreign oil intensifies, the mismatch between supply and demand becomes increasingly apparent [1]. One effective strategy for mitigating this supply–demand conflict is the accurate identification of oil reservoirs, which can stabilize oil production and enhance the development of oil reserves.
A logging data curve is a signal that records how physical properties change with well depth, and it serves as the foundation for determining the parameters of oil and gas reservoirs in oil logging recognition. Logging data curves offer invaluable insights for identifying subsurface sedimentation and analyzing the distribution of subsurface material layers [2,3].
Employing logging data curves for lithology identification is faster and more cost-effective than other methods. Logging data processing primarily comprises data pre-processing, attribute reduction, and classification, and denoising is the most critical pre-processing task. Denoising is challenging because sparse and inhomogeneous sampling introduces redundant information and noise, which can even lead to misinterpretation. By reconstructing the logging data [4], we can render the geophysical information more comprehensive, accurately reflect the characteristics of the underground geological body, ensure the precision of complex geological reconstructions, and provide more effective guidelines and references for logging data mining.
The application of robust principal component analysis (RPCA) to logging data processing [5] merits in-depth study, particularly for logging data cleaning and data mining. The measurement instruments and sampling methods used to collect well logging data often introduce substantial redundancy and noise in the data from each field, especially in harsh down-hole environments where various errors are inevitable. Consequently, to further improve the effectiveness of logging data mining, novel denoising methods for logging data are urgently needed.
Several common methods exist for solving the standard RPCA model, such as the Iterative Thresholding (IT) algorithm [6], the Accelerated Proximal Gradient (APG) method [7], the Augmented Lagrange Multiplier (ALM) method [8], the Exact ALM (EALM) method [9], and the Inexact ALM (IALM) method [10].
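To make the solver family above concrete, the following is a minimal sketch of an inexact-ALM-style iteration for the standard RPCA model, which decomposes an observation matrix D into a low-rank part L and a sparse part S by minimizing ‖L‖_* + λ‖S‖_1 subject to D = L + S. It is an illustration only and is not the improved method proposed in this paper. The function name `rpca_ialm`, the default λ = 1/√max(m, n), and the step parameters (initial μ, growth factor ρ) are assumptions taken from commonly used settings, not values stated in this text.

```python
import numpy as np

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Sketch of inexact ALM for: min ||L||_* + lam * ||S||_1  s.t.  D = L + S.

    All defaults below are illustrative assumptions.
    """
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))      # commonly used default weight
    norm_D = np.linalg.norm(D, "fro")
    spectral = np.linalg.norm(D, 2)         # largest singular value of D
    # Scaled initial Lagrange multiplier and penalty parameter (assumed values).
    Y = D / max(spectral, np.abs(D).max() / lam)
    mu = 1.25 / spectral
    mu_max = mu * 1e7
    rho = 1.5
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(max_iter):
        # L-update: singular value thresholding with threshold 1/mu.
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        sig = np.maximum(sig - 1.0 / mu, 0.0)
        L = (U * sig) @ Vt
        # S-update: entrywise soft thresholding with threshold lam/mu.
        R = D - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Multiplier update on the constraint residual, then increase mu.
        Z = D - L - S
        Y = Y + mu * Z
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(Z, "fro") <= tol * norm_D:
            break
    return L, S
```

In a denoising setting, D could be a matrix whose rows are depth samples and whose columns are logging curves (a hypothetical layout); L would then be read as the cleaned data and S as the sparse, outlier-like noise.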
At present, the RPCA model and its improved optimization algorithms have been applied to feature extraction, dimensionality reduction, subspace segmentation, etc., for image signals and voice data [11,12,13]. Similarly, the sparse decomposition concept at the core of the RPCA model can be applied to logging data processing, combining the advantages of both to achieve superior denoising results [14,15,16].
The fundamental idea behind improved RPCA algorithms is to address the norm optimization problem intrinsic to the RPCA model [12]. However, a prevalent issue with these algorithms is that they slow down and their recovery error grows as the dimensionality of the input matrix increases. The study in [17] suggests employing a smoothing function that follows specific value rules to approximate the norm being minimized. Under identical experimental conditions, this algorithm runs faster and recovers the input matrix more accurately than other algorithms. Candes [18] introduced the concept of weighted norm minimization to improve the sparse decomposition capacity of the low-rank matrix recovery model. However, the reported results indicate that this approach affects the stability of the solution to some degree. Peng [19] proposed the idea of a weighted nuclear norm to strengthen the low-rankness and sparsity of the matrix, thereby increasing the efficiency of matrix recovery. Zou [20] demonstrated that, when solving the low-rank matrix recovery model, the stability and the sparsity of the solution cannot both be fully achieved at once. To guarantee the accuracy of the solution, the algorithm analysis must therefore take both factors into account. In addition, RPCA algorithms can leverage both static and dynamic functions to improve their performance under sparsity constraints [21,22,23]. Rekavandi [24] introduced four α-divergence-based RPCA methods, which provide a robust and flexible framework for signal recovery in fMRI and for foreground–background separation in video analysis. The main idea of those algorithms is to extract the principal loading vectors and their corresponding PCs by minimizing a cost function derived from the α-divergence between the sample density (obtained from the observed data) and the nominal density model. These methods effectively mitigate the impact of both structured and unstructured outliers.
Our aim is to construct a more robust and faster RPCA-based matrix recovery method that, while ensuring solution stability, makes more effective use of the low-rankness of the observation matrix and the sparsity of the sparse matrix. There is also a practical demand for oil logging denoising methods. We therefore present two main contributions in this study. First, we propose an approximate zero norm, based on the structure of the fractional function, which is used to construct the objective function; we then use a weighted nuclear norm and penalty terms to optimize the RPCA model, yielding the improved RPCA (IRPCA). Second, we propose a new method for oil logging data denoising based on IRPCA.
This paper is organized as follows. Section 2 introduces the basic principles of RPCA. Section 3 elucidates the details of the improved RPCA (IRPCA). Section 4 presents experimental results that demonstrate the efficiency of IRPCA in both simulation experiments and logging data denoising experiments. Lastly, we present our conclusions in Section 5.