Article

Improvement of the Semi-Analytical Algorithm Integrating Ultraviolet Band and Deep Learning for Inverting the Absorption Coefficient of Chromophoric Dissolved Organic Matter in the Ocean

1
National Engineering Research Center of Port Hydraulic Construction Technology, Tianjin Research Institute for Water Transport Engineering, Tianjin 300456, China
2
Key Laboratory of Space Ocean Remote Sensing and Application, Ministry of Natural Resources of the People’s Republic of China, Beijing 100081, China
3
Nanjing Hydraulic Research Institute, Nanjing 210029, China
4
China Three Gorges Corporation, Wuhan 430010, China
5
Shanghai Investigation, Design & Research Institute Co., Ltd., Shanghai 200335, China
6
School of Marine Science and Technology, Tianjin University‌, Tianjin 300072, China
7
School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin 300384, China
8
National Satellite Ocean Application Service, Ministry of Natural Resources of the People’s Republic of China, Beijing 100081, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(2), 207; https://doi.org/10.3390/rs18020207
Submission received: 27 November 2025 / Revised: 5 January 2026 / Accepted: 6 January 2026 / Published: 8 January 2026
(This article belongs to the Special Issue Artificial Intelligence in Hyperspectral Remote Sensing Data Analysis)

Highlights

What are the main findings?
  • The DQAAG algorithm significantly improves the retrieval accuracy of ag(443) in both coastal and ocean waters by integrating UV bands and a deep learning model.
  • Compared with established models (S2011, A2018, and QAA-CDOM), DQAAG achieves superior performance, demonstrating high accuracy across both simulated (IOCCG) and in situ (NOMAD) datasets.
What are the implications of the main findings?
  • The integration of UV bands provides a more effective approach for future ocean color satellite missions to retrieve CDOM accurately.
  • Combining deep learning with semi-analytical algorithms offers a robust and adaptable method for processing hyperspectral ocean color data.

Abstract

Chromophoric Dissolved Organic Matter (CDOM) is an important water constituent that affects ocean color and the underwater ecological environment, and its accurate assessment is crucial for observing continuous changes in the marine ecosystem. However, remote sensing estimation of CDOM remains challenging for both coastal and oceanic waters because of its weak optical signal and the complex optical conditions. Therefore, the development of efficient, practical, and robust models for estimating the CDOM absorption coefficient in both coastal and oceanic waters remains an active research focus. This study presents a novel algorithm (denoted as DQAAG) that incorporates ultraviolet bands into the inversion model. The design leverages the distinct spectral absorption characteristics of phytoplankton versus detrital particles in the ultraviolet (UV) region, enabling improved discrimination of water color parameters. Furthermore, the algorithm replaces empirical formulas commonly used in semi-analytical approaches with an artificial intelligence model (deep learning) to achieve enhanced inversion accuracy. Four algorithms, Shanmugam (2011) (S2011), Aurin et al. (2018) (A2018), Zhu et al. (2011) (QAA-CDOM), and DQAAG, were evaluated using IOCCG hyperspectral simulation data and the NOMAD dataset. The results indicate that the ag(443) derived from DQAAG exhibits good agreement with the validation data, with root mean square deviation (RMSD) < 0.3 m−1, mean absolute relative difference (MARD) < 0.30, mean bias (bias) < 0.028 m−1, and coefficient of determination (R2) > 0.78. The DQAAG algorithm was applied to SeaWiFS remote sensing data, and validation was performed through match-up analysis with the NOMAD dataset. The results show RMSD = 0.14 m−1, MARD = 0.39, and R2 = 0.62. Through a sensitivity analysis of the algorithm, the study reveals that Rrs(670) and Rrs(380) are the most influential input bands. These results demonstrate that UV bands play a crucial role in enhancing the retrieval accuracy of ocean color parameters. In addition, DQAAG, which integrates semi-analytical algorithms with artificial intelligence, presents an encouraging approach for processing ocean color imagery to retrieve ag(443).

1. Introduction

Dissolved organic matter (DOM) represents a major reservoir of organic carbon in aquatic systems and serves as the primary carrier of organic carbon in the Earth’s hydrosphere, playing a significant role in the global carbon cycle [1,2,3,4,5]. Chromophoric Dissolved Organic Matter (CDOM), the photosensitive component of DOM, is not only conservative in nature but also capable of absorbing ultraviolet and visible light [6]. It is thus employed as an effective tracer for evaluating both the concentration and spatial distribution of DOM in aquatic environments [7,8,9]. The absorption of ultraviolet and visible radiation by CDOM drives marine photochemistry and modulates the penetration of UV radiation into the water column [10,11]. Therefore, knowledge of the global distribution and dynamic processes of CDOM improves our prior understanding of biogeochemical behaviors and processes, thereby enhancing the accuracy of global marine ecosystem and climate modeling.
Quantifying CDOM through remote sensing serves as a valuable tool for investigating changes in marine ecosystems and studying the global-scale carbon cycle [2,12,13,14]. The absorption coefficient of CDOM, denoted as ag(λ), is used to represent CDOM concentration in ocean color remote sensing. Numerous algorithms have been developed to estimate ag(λ), primarily including empirical algorithms [15,16,17,18] and semi-analytical algorithms [19,20,21,22,23,24,25]. Empirical algorithms are established based on statistical relationships between water constituents and remote sensing reflectance (Rrs), exhibiting regional applicability [18,19].
The semi-analytical algorithms have unique theoretical foundations and mathematical solutions. Taking the widely used QAA as an example, this method employs the absorption ratio of two blue bands (e.g., 410 and 440 nm) to partition the total absorption (a) spectrum into contributions by phytoplankton (aph) and combined detrital and dissolved organic matter (adg) [25]. However, Wei and Lee (2015) [26] pointed out that in the UV region, adg is significantly higher than aph. By introducing UV wavelengths into the QAA (resulting in QAA-UV), they enhanced the retrieval accuracy of both adg and aph [27]. Recent and planned ocean color hyperspectral satellite missions increasingly include spectral bands within the UV range (for instance, OLCI, SGLI, HY1C, and PACE) [28,29,30]. These advancements establish a critical data foundation for improving the accurate retrieval of ag [11,31].
It is noteworthy that neither QAA nor QAA-UV decomposes adg into ad and ag [25,26], which limits their ability to accurately characterize the spatial and temporal variability of CDOM. In addition, Wang et al. (2021) [11] demonstrated that the complex variations in water constituents often lead to nonlinear relationships among optical parameters, making it difficult for conventional methods to establish reliable functional approximations. In recent years, the rise of artificial intelligence models (deep learning or neural networks) has significantly advanced the retrieval of optical parameters [32,33,34,35,36,37,38]. Deep learning is inherently an empirical model, and relying solely on it limits physical interpretability. Many scholars nevertheless construct deep learning models directly for the inversion of optical parameters. Chen et al. (2014) [34] developed a neural network model for retrieving the absorption coefficient. However, in the inversion of CDOM, statistical formulas were still employed [39]. Such an approach often lacks clear physical interpretability. Therefore, integrating deep learning with semi-analytical algorithms to effectively separate ag and enhance inversion accuracy represents an important and meaningful research direction.
This study aims to develop a new algorithm, deep learning-enhanced QAA with UV bands for CDOM retrieval (named DQAAG), that combines semi-analytical algorithms and deep learning models to improve the retrieval of ag. The performance of the DQAAG algorithm was evaluated using both simulated data and the NOMAD in situ dataset. Using Sea-viewing Wide Field-of-view Sensor (SeaWiFS) data as an example, we demonstrate the performance of the DQAAG algorithm in estimating ag(443) on a global scale and illustrate its impact on ocean color retrieval accuracy. The remainder of this paper is organized as follows: Section 2 describes the data used to establish the ag(λ) inversion algorithm; Section 3 introduces the DQAAG algorithm; Section 4 presents the results; Section 5 provides a comprehensive evaluation of the algorithm performance and demonstrates its application to global ocean data; and Section 6 summarizes the key findings and outlines future prospects.

2. Data and Materials

2.1. Training Data

The availability of a large and diverse dataset is critical for the development of any deep learning-based algorithm. We constructed an inclusive hyperspectral synthetic dataset containing IOP and Rrs to train DQAAG. To accommodate diverse water types, the generation of a synthetic dataset encompassing a wide range of Rrs(λ) necessitates that both the a(λ) and the backscattering coefficient bb(λ) cover broad yet plausible ranges. Both a(λ) and bb(λ) consist of contributions from pure water and water constituents, including phytoplankton pigments, CDOM, and detrital minerals. The generation of this dataset generally follows the methodology described in IOCCG Report 5 [40,41]. For detailed formulations, please refer to https://ioccg.org/wp-content/uploads/2016/03/lee-data.pdf, accessed on 25 July 2025.
The hyperspectral synthetic dataset generation system is driven by aph(440), with randomized parameters designed to ensure broad coverage of diverse water types. The generation of these constrained random values follows the methodology established in IOCCG-OCAG [41]. We set the range of aph(440) to 0.001–1 m−1 and generated 200,000 synthetic IOP–Rrs(λ) pairs, of which 80% were used for training the model and 20% for validating it. The wavelength ranges from 350 to 800 nm at an interval of 1 nm. The range of Rrs(555) is 6.0 × 10−4–0.059 sr−1, that of bbp(443) is 4.6 × 10−5–0.71 m−1, that of a(443) is 6.8 × 10−3–10.08 m−1, and that of ag(443) is 3.9 × 10−4–8.05 m−1.
Figure 1a shows an example of simulating Rrs(λ) spectra, demonstrating that the synthetic dataset covers both clear open ocean waters and optically complex coastal waters. Figure 1b illustrates the statistical distribution of synthesized data ag(443), and the statistical indicators are shown in Table 1. Figure 1e,f illustrate the relationships between Rrs(443) and Rrs(410), as well as between ag(443) and ag(410), in simulated data versus natural observational data. The range covered by the simulated dataset far exceeds that of the IOCCG and NOMAD datasets. This indicates that the IOCCG and NOMAD data are well encompassed within the synthetic dataset range, though some combinations of inherent optical properties (IOPs) in the simulated set may not exist or are extremely rare in natural aquatic environments. In the simulated dataset, the coefficient of variation (CV) for Rrs(555) = 1.18, while that for ag(443) = 2.19. These relatively high values indicate a substantial degree of dispersion among the constructed data points, which effectively covers a wide range of water types. As a result, the deep learning model developed on this dataset is expected to possess stronger generalization capability.
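The coefficient of variation quoted above is the ratio of the standard deviation to the mean. A minimal sketch follows; the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return std / mean for a sequence of positive values.

    Assumption: population standard deviation (pstdev); the article does
    not state whether the sample or population form was used.
    """
    return pstdev(values) / mean(values)


# Illustrative numbers only, not values from the synthetic dataset.
example_ag443 = [0.01, 0.05, 0.2, 1.5]
cv = coefficient_of_variation(example_ag443)
```

A CV above 1, as reported for ag(443), means the spread of the data exceeds its mean, which is typical of quantities that span several orders of magnitude.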

2.2. Validation Data

2.2.1. Simulated Data

The International Ocean Color Coordinating Group (IOCCG) established a hyperspectral dataset containing 500 inherent optical property (IOP) spectra. The spectral range of the original IOCCG dataset is 400–800 nm (with a wavelength increment of 10 nm), and the solar zenith angle is 30° [40]. We extended the IOCCG simulation data into the ultraviolet down to 360 nm and interpolated it to 1 nm intervals. The data extension methodology was adopted from Wang et al. (2021) [11]. The data range of the IOCCG simulation data is shown in Table 1: the range of Rrs(555) is 1.0 × 10−3–0.029 sr−1, that of bbp(443) is 6.4 × 10−4–0.13 m−1, that of a(443) is 1.6 × 10−2–3.17 m−1, and that of ag(443) is 2.5 × 10−3–2.37 m−1. The Rrs(λ) spectra of the IOCCG simulated dataset are shown in Figure 1c. The CV for ag(443) in the IOCCG simulated dataset is 1.45 (Table 1), indicating that the validation data cover a wide range of waters, thereby enabling a better assessment of the model's generalization capability.

2.2.2. NOMAD Dataset

The NOMAD dataset is a publicly available, globally distributed, high-quality in situ bio-optical dataset widely used for ocean color algorithm development and validation of satellite data products [42]. The NOMAD dataset (Version 2.a) was downloaded from the SeaBASS website: http://seabass.gsfc.nasa.gov/, accessed on 25 July 2025. In addition to in situ measured Rrs(λ), the dataset also includes matched IOPs, such as particle absorption (ap), aph, ag, ad, and bb. A total of 4559 spectra were obtained. Records were excluded if they were incomplete or of low quality, that is, if they had missing or invalid values (−999) in Rrs(410, 443, 490, 555, 670) or ag(443), or a quality assurance score lower than 0.8. After matching Rrs(λ) with ag, a total of 287 spectra remained for further analysis. The distribution of the matched data is shown in Figure 2 (indicated by red circles). The NOMAD data range is shown in Table 1: the range of Rrs(555) is 6.4 × 10−4–0.040 sr−1, and that of ag(443) is 5.4 × 10−4–1.12 m−1. The NOMAD Rrs(λ) spectra are shown in Figure 1d. Although the CV for ag(443) in the NOMAD dataset is lower than that of the simulated data, it remains effective for evaluating the model.
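The screening described above can be sketched as a simple record filter. The dictionary layout and key names below are hypothetical (they are not the NOMAD column names); only the −999 sentinel, the required bands, and the 0.8 quality threshold come from the text.

```python
# Hypothetical record layout: one dict per station with Rrs bands, ag(443)
# and a quality assurance score. Key names are illustrative assumptions.
REQUIRED_KEYS = ("rrs410", "rrs443", "rrs490", "rrs555", "rrs670", "ag443")
MISSING_VALUE = -999.0
QA_MIN = 0.8


def is_valid(record):
    """Keep a record only if every required field is present and valid
    and its QA score is not lower than 0.8."""
    for key in REQUIRED_KEYS:
        if record.get(key, MISSING_VALUE) == MISSING_VALUE:
            return False
    return record.get("qa_score", 0.0) >= QA_MIN


def filter_records(records):
    """Return the subset of records that pass the screening."""
    return [r for r in records if is_valid(r)]


sample = [
    {"rrs410": 0.004, "rrs443": 0.004, "rrs490": 0.004, "rrs555": 0.002,
     "rrs670": 0.0002, "ag443": 0.03, "qa_score": 0.9},
    {"rrs410": -999.0, "rrs443": 0.004, "rrs490": 0.004, "rrs555": 0.002,
     "rrs670": 0.0002, "ag443": 0.03, "qa_score": 0.9},
    {"rrs410": 0.004, "rrs443": 0.004, "rrs490": 0.004, "rrs555": 0.002,
     "rrs670": 0.0002, "ag443": 0.03, "qa_score": 0.5},
]
kept = filter_records(sample)
```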

2.2.3. Remote Sensing Image Data

To advance the understanding of satellite ocean color remote sensing applications, we took SeaWiFS as an example and used satellite-to-in situ match-ups to evaluate the performance of the algorithm in retrieving ag(443). The SeaWiFS Level-2 Rrs(λ) data were obtained from the NASA website (https://oceancolor.gsfc.nasa.gov, accessed on 25 July 2025).
For each field station, the median value of a 3 × 3 pixel window centered on the station is used to represent the satellite measurement [43]. The time window between the in situ and satellite data is set to ±5 h [42]. In addition, the quality of the spectral data is judged based on Quality Assurance (QA) scores [44] and Level-2 processing flags (l2_flags): only data with QA scores > 0.8 are retained, and SeaWiFS pixels carrying any of the following l2_flags are excluded: atmospheric correction failure, land, possible cloud or ice contamination, strong sun glint contamination, and cloud clutter or shadow contamination [45]. In the end, we obtained 81 match-ups between SeaWiFS and in situ measurements (Figure 2, yellow squares).
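The two match-up criteria above (a 3 × 3 median and a ±5 h window) can be sketched as follows. This is an illustration under stated assumptions: the image is a plain list of lists, invalid or flagged pixels are represented by None and skipped, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def window_median(grid, row, col):
    """Median of the valid pixels in the 3 x 3 window centred on (row, col).

    Assumption: flagged pixels are stored as None and ignored; pixels
    outside the image are ignored as well.
    """
    values = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                if grid[r][c] is not None:
                    values.append(grid[r][c])
    return median(values) if values else None


def within_time_window(t_satellite, t_in_situ, hours=5):
    """True if the two observations are at most `hours` apart."""
    return abs(t_satellite - t_in_situ) <= timedelta(hours=hours)


rrs_image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
centre_value = window_median(rrs_image, 1, 1)
ok = within_time_window(datetime(2003, 6, 1, 12, 0), datetime(2003, 6, 1, 16, 0))
```

The median is less sensitive than the mean to a single contaminated pixel in the window, which is presumably why it is used here.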

2.3. Accuracy Assessment

The performance of the retrieval algorithm was evaluated using four statistical metrics: the Root Mean Square Difference (RMSD), the Mean Absolute Relative Difference (MARD), the mean bias (bias), and the coefficient of determination (R2). The formulas for the first three metrics are provided below:
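$$\mathrm{RMSD} = \sqrt{\frac{\sum_{i=1}^{N} \left( X_{\mathrm{est},i} - X_{\mathrm{mea},i} \right)^{2}}{N}},$$
$$\mathrm{MARD} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| X_{\mathrm{est},i} - X_{\mathrm{mea},i} \right|}{X_{\mathrm{mea},i}},$$
$$\mathrm{bias} = \frac{1}{N} \sum_{i=1}^{N} \left( X_{\mathrm{est},i} - X_{\mathrm{mea},i} \right),$$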
where N represents the number of samples, and Xest,i and Xmea,i denote the estimated value from the inversion and the measured value from the reference data for the i-th sample, respectively. This study also computed the coefficient of determination (R2) between Xest,i and Xmea,i.
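The metrics of this section can be sketched in a few lines of Python. The function names are hypothetical, and computing R2 as the squared Pearson correlation is an assumption, since the text does not give its formula.

```python
import math


def rmsd(est, mea):
    """Root mean square difference between estimates and measurements."""
    return math.sqrt(sum((e - m) ** 2 for e, m in zip(est, mea)) / len(est))


def mard(est, mea):
    """Mean absolute relative difference (measurements must be non-zero)."""
    return sum(abs(e - m) / m for e, m in zip(est, mea)) / len(est)


def mean_bias(est, mea):
    """Mean signed difference; positive values indicate overestimation."""
    return sum(e - m for e, m in zip(est, mea)) / len(est)


def r_squared(est, mea):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(est)
    mean_e, mean_m = sum(est) / n, sum(mea) / n
    cov = sum((e - mean_e) * (m - mean_m) for e, m in zip(est, mea))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_m = sum((m - mean_m) ** 2 for m in mea)
    return cov ** 2 / (var_e * var_m)


estimated = [1.0, 2.0, 3.0]
measured = [1.0, 2.0, 4.0]
scores = (rmsd(estimated, measured), mard(estimated, measured),
          mean_bias(estimated, measured), r_squared(estimated, measured))
```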

3. Methods

3.1. S2011

S2011 introduced a novel approach for coastal and oceanic waters designed to accurately model the absorption spectrum of ag(λ) [46]. This modeling method utilizes two spectral slopes, an exponential curve fit (S) and a hyperbolic curve fit (γ), for the inversion of ag(λ). The specific formulations are as follows:
$$a_g(\lambda) = a_g(350) \exp\left( -S(\lambda - 350) - \gamma_0 \right)$$
where ag(350) can be calculated using the following formula:
$$a_g(350) = 0.5567 \left( \frac{R_{rs}(443)}{R_{rs}(555)} \right)^{-2.0421}$$
The spectral slope S is as follows:
$$S = 0.0058 \left( \frac{R_{rs}(412)}{R_{rs}(350)} \right)^{-0.9677}$$
Furthermore, Rrs(443) and Rrs(555) are utilized to estimate ag(412):
$$a_g(412) = 0.1866 \left( \frac{R_{rs}(443)}{R_{rs}(555)} \right)^{-1.9668}$$
The parameter γ0 serves as a crucial link accounting for the substantial variability of CDOM across transitional coastal and oceanic waters, and is calculated as follows:
$$\gamma_0 = \frac{a_g(350) - 1/\gamma}{a_g(350) + 1/\gamma}$$
The spectral slope γ of the hyperbolic model is as follows:
$$\gamma = 2.9332 \left( \frac{R_{rs}(412)}{R_{rs}(350)} \right)^{-0.7506}$$

3.2. A2018

Aurin et al. (2018) [2] employed the Global Ocean Carbon Algorithm Database (GOCAD) to derive an empirical model for ag(λ) through multiple linear regression (named A2018). This model establishes a functional relationship between the natural logarithm of Rrs(λ) at four distinct visible wavelengths and the natural logarithm of ag(λ), as represented by the following equation:
$$\ln a_g(443) = \beta_0 + \beta_1 \ln R_{rs}(\lambda_1) + \beta_2 \ln R_{rs}(\lambda_2) + \beta_3 \ln R_{rs}(\lambda_3) + \beta_4 \ln R_{rs}(\lambda_4)$$
where λ1 = 443, λ2 = 490, λ3 = 510, and λ4 = 555 nm for SeaWiFS, and β0 = −6.410, β1 = −0.743, β2 = −0.145, β3 = −0.367, and β4 = 0.547.
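The A2018 regression is a linear model in log space, so it can be evaluated directly from the coefficients quoted above. The sketch below is an illustration only; the function name and the example reflectance values are hypothetical.

```python
import math

# SeaWiFS coefficients (beta0..beta4) as quoted in the text.
BETAS = (-6.410, -0.743, -0.145, -0.367, 0.547)


def a2018_ag443(rrs443, rrs490, rrs510, rrs555):
    """Estimate ag(443) in m^-1 from Rrs (sr^-1) at the four SeaWiFS bands."""
    b0, b1, b2, b3, b4 = BETAS
    ln_ag = (b0
             + b1 * math.log(rrs443)
             + b2 * math.log(rrs490)
             + b3 * math.log(rrs510)
             + b4 * math.log(rrs555))
    return math.exp(ln_ag)


# Illustrative input, not a measured spectrum.
ag443_example = a2018_ag443(0.005, 0.005, 0.005, 0.005)
```

Because β1 is negative, a lower Rrs(443) with the other bands fixed yields a higher ag(443), which is consistent with stronger blue absorption by CDOM.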

3.3. QAA-CDOM

QAA-CDOM is an enhancement of the QAA developed by Zhu et al. (2011) [23], specifically designed for the separation of ag. QAA, developed by Lee et al. (2002) [25], is designed to derive IOPs in optically deep waters. The inversion process is divided into two consecutive parts. In the first part, a reference wavelength λ0 is selected, and semi-analytical models are applied to estimate bb and a at various wavelengths. In the second part, aph and adg are calculated from the total absorption coefficient obtained in the first part. QAA is currently at version 6 (https://ioccg.org/wp-content/uploads/2020/11/qaa_v6_202011.pdf, accessed on 25 July 2025). For detailed formulations, see Table 2.
Due to the similar spectral shapes of ad and ag, there is a significant challenge in distinguishing them [36]. Zhu et al. (2011) [23] estimated ad(443) based on the bbp(555) derived from QAA_v6, which can further separate ag from adg. The specific formula is as follows:
$$a_d(443) = j_1 \, b_{bp}(555)^{j_2}$$
$$a_g(443) = a_{dg}(443) - a_d(443)$$
where j1 = 0.966 and j2 = 1.038 (the parameters used in this step are derived from empirical fits to in situ data, as reported in the study by Zhu et al. (2011) [23]).
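The separation step of QAA-CDOM, a power law for ad(443) in bbp(555) followed by a subtraction from adg, can be sketched as below. The function names and example inputs are hypothetical; the coefficients are the j1 and j2 values quoted in the text.

```python
J1 = 0.966
J2 = 1.038


def ad443_from_bbp555(bbp555):
    """Detrital absorption at 443 nm (m^-1) from bbp(555) (m^-1)."""
    return J1 * bbp555 ** J2


def ag443_from_adg(adg443, bbp555):
    """CDOM absorption at 443 nm as the remainder of adg after removing ad.

    No clipping is applied, so a negative result signals that the estimated
    ad exceeds the retrieved adg.
    """
    return adg443 - ad443_from_bbp555(bbp555)


# Illustrative inputs, not retrieved values.
ag_example = ag443_from_adg(adg443=0.10, bbp555=0.01)
```

As noted later in the article, the retrieved ag(443) is sensitive to the choice of j1 and j2, so these constants are best treated as dataset-dependent.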

3.4. DQAAG

DQAAG is an algorithm developed for retrieving ag(443) by combining the QAA semi-analytical algorithm with three sets of deep learning models.
In this study, we adopt the QAA-UV strategy by incorporating UV band into the inversion process to improve the separation of aph and adg. The UV band of 380 nm was selected. This wavelength choice was made for two primary reasons: (1) Rrs(380) has been widely adopted in existing research [26], (2) robust models are available for its retrieval [11]. For in situ measurements or satellite data lacking Rrs(380), the UVISRdl model developed by Wang et al. (2021) [11] was employed for estimation. The method demonstrates strong reliability, with a MARD < 5% for Rrs(380) estimates.
In the first part of the QAA model construction process, the parameter a0 in Step 3 and the spectral exponent Y in Step 5 are derived through empirical formulations. Since these empirically retrieved parameters are primarily used to obtain bbp(λ), where bbw is a wavelength-dependent constant, we replace this segment with a deep learning model that directly establishes an inversion relationship from Rrs(λ) to bbp(λ). The model therefore takes Rrs(380), Rrs(410), Rrs(443), Rrs(490), Rrs(555), and Rrs(670) from the hyperspectral remote sensing dataset as input parameters and outputs bbp at the corresponding wavelengths.
In the second part of the QAA model, two empirical formulas, ζ in Step 7 and ξ in Step 8, were originally introduced to further separate the water components. In our model construction, Rrs(380) is added to the inputs of a deep learning model with ζ and ξ as the output targets, thereby establishing a deep learning module that maps Rrs(380), Rrs(410), Rrs(443), Rrs(490), Rrs(555), and Rrs(670) to ζ and ξ.
Moreover, since the standard QAA framework does not directly retrieve ag, we introduced an additional deep learning model that takes bbp(380), bbp(410), bbp(443), bbp(490), bbp(555), and bbp(670) as input and outputs ad(443), thereby establishing a dedicated model for ad inversion. A detailed flowchart of the entire process is presented in Figure 3. Figure 3 shows the algorithm flowchart on the left (Step 0–Step 9), and the deep learning model framework diagrams for Step 2, Step 4, and Step 6 on the right.
Like other deep neural networks, the deep learning models included in DQAAG consist of an input layer, multiple hidden layers, and an output layer. In this study, the Keras framework [47], which is deeply integrated with TensorFlow, was selected for implementing DQAAG. Keras offers clear, concise, and readable code together with robust backend support and a rich ecosystem, which enables rapid model development and experimentation. The number of hidden layers and the number of neurons in each layer were determined by minimizing the loss function [48]. After extensive experimentation, three hidden layers were found to yield the best performance for DQAAG, with 256 neurons in the first layer, 128 in the second, and 16 in the third. The Rectified Linear Unit (ReLU) was employed as the activation function; it is favored for its computational efficiency and its effectiveness in mitigating the vanishing gradient problem [49]. The Adaptive Moment Estimation (Adam) algorithm was used as the optimizer, with a learning rate of 2 × 10−5 and a batch size of 64. Adam combines the advantages of the momentum method and adaptive learning rates, enabling it to handle sparse gradients efficiently and stably [50]. To prevent overfitting, a dropout rate of 0.1 was set. The loss function is the mean absolute error. Training of DQAAG is complete when the loss function converges and iteration stops.
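To make the layer sizes concrete, the sketch below builds the forward pass of a 6-256-128-16-1 network in plain Python, matching the ad(443) model (six bbp inputs, one output). It is not the authors' Keras implementation: the random weights, the linear output layer, and all function names are assumptions, and training (Adam, dropout, mean absolute error loss) is omitted.

```python
import random

# Six bbp inputs, three hidden layers (256, 128, 16), one output ad(443).
LAYER_SIZES = [6, 256, 128, 16, 1]


def init_params(sizes, seed=0):
    """Create small random weights and zero biases for each layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, inputs):
    """Propagate one sample; ReLU on hidden layers, linear output (assumed)."""
    x = list(inputs)
    for index, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


network = init_params(LAYER_SIZES)
n_trainable = count_params(network)
prediction = forward(network, [0.01] * 6)
```

With these sizes the network has 36,769 trainable parameters, small relative to the 160,000 training samples implied by the 80% split of the synthetic dataset.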

4. Results

4.1. Evaluation of bbp(λ) and a(λ)

QAA_v6 recommends selecting a reference wavelength of 55x or 670 nm depending on the water type, and using an exponential decay model to calculate bbp(λ). This approach relies on two empirical formulas, one for a(λ0) and one for the spectral slope Y, which can lead to error propagation. In contrast, DQAAG employs a deep learning model to derive bbp(λ) directly from Rrs(λ), eliminating the need for water classification and avoiding the cumulative errors associated with the two-step empirical approximations. The bbp(λ) retrieval results of the DQAAG model are illustrated in Figure 4a–e, and the accuracy metrics are provided in Table 3. The results indicate that the bbp retrieved by DQAAG is consistent with the simulated data, with RMSD < 0.0074 m−1, MARD < 0.1, bias < 0.0012 m−1, and R2 > 0.96. It should be noted that no comparison with NOMAD data was performed due to the lack of in situ bbp(λ) measurements in that dataset. In addition, Aurin and Dierssen (2012) [51] pointed out that the specific values of g1 and g2 may vary with the water type, and the use of constants for different waters may not be appropriate [19,34]. The use of DQAAG for bbp(λ) retrieval effectively circumvents this issue.
Figure 5a–e shows the comparison between a(λ) derived from DQAAG and the IOCCG simulated data at 410, 443, 490, 555, and 670 nm. The results demonstrate that the a(λ) values retrieved by DQAAG show good agreement with the simulated data. For waters with a(555) ranging from 0.06 to 0.99 m−1, the data are evenly distributed on both sides of the 1:1 line. The performance metrics are RMSD < 0.23 m−1, MARD < 0.083, and R2 > 0.95. The retrieval of a(670) is slightly less consistent, with RMSD = 0.075 m−1, MARD = 0.10, and R2 = 0.73. The accuracy evaluation indicators are shown in Table 4.
Compared with the NOMAD measured data, the accuracy of the model retrievals decreases, but the overall consistency remains good, as shown in Figure 6a–e. The retrieval results for a(λ) show RMSD < 0.31 m−1, MARD < 0.23, and R2 > 0.72. When a(410) < 0.04 m−1, the retrieval slightly underestimates the measurements (Figure 6a,b). DQAAG performs accurately in moderately to highly turbid waters, further demonstrating its capability for retrieving a(λ) and bbp(λ) in global ocean applications.

4.2. Evaluation of ag(443)

Four estimation algorithms for ag(443) described in Section 3, including the empirical models S2011 and A2018, the semi-analytical model QAA-CDOM, and the DQAAG model combining deep learning with a semi-analytical algorithm, were evaluated using both simulated data from IOCCG and the publicly available NOMAD dataset. Figure 7 and Figure 8 and Table 5 present the inversion results and performance metrics of these algorithms. In the data comparison, water types were categorized based on the Rrs(490)/Rrs(555) ratio. A ratio greater than 0.85 was defined as Case 1 non-turbid water, while a ratio less than or equal to 0.85 was classified as Case 2 turbid water [6]. This classification was used to investigate the applicability of the algorithm across different water types. The specific outcomes of each algorithm are detailed below.
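The water-type rule used for the comparison is a single threshold on a band ratio. A minimal sketch, with a hypothetical function name:

```python
def classify_water(rrs490, rrs555, threshold=0.85):
    """Return 'Case 1' if Rrs(490)/Rrs(555) > threshold, else 'Case 2'."""
    ratio = rrs490 / rrs555
    return "Case 1" if ratio > threshold else "Case 2"


# Illustrative spectra: a blue-dominated and a green-dominated example.
clear_label = classify_water(0.004, 0.002)
turbid_label = classify_water(0.003, 0.005)
```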

4.2.1. S2011

The S2011 algorithm exhibited certain deviations in the IOCCG simulated dataset (Figure 7a,c), with RMSD = 0.29 m−1 and MARD = 0.53 (Table 5). A clear underestimation phenomenon is observed in Case 1 waters, whereas the consistency is relatively better in Case 2 waters. However, its performance on the NOMAD dataset proved satisfactory, demonstrating RMSD = 0.15 m−1 and MARD = 0.44.
The accurate inversion results of S2011 in NOMAD come from two characteristics. First, the algorithm leverages the high responsiveness and variability of CDOM around 350 nm to effectively discriminate CDOM signatures [46]. Second, its exponential model incorporates two spectral slope parameters that effectively characterize ag(λ) across both UV and visible light [6,15,46]. The observed discrepancies in the IOCCG dataset may be attributed to its construction methodology. The IOCCG data were generated using Hydrolight simulations with randomized parameters to ensure broad coverage, which potentially includes optical scenarios rarely encountered in natural environments.

4.2.2. A2018

The A2018 model demonstrated significant deviations in both the IOCCG and NOMAD datasets, irrespective of water type (Case 1 or Case 2 waters), with RMSD > 0.17 m−1, MARD > 0.8, and R2 < 0.45 (Figure 7b,d and Table 5). The inversion results exhibited systematic biases, with overestimation at low ag(443) values and underestimation at high values, causing the retrieved ag(443) to cluster within the narrow range of 0.01–0.5 m−1. Aurin et al. (2018) [2] also acknowledged the model's limited accuracy in retrieving ag(443) when using SeaWiFS bands as input [6].

4.2.3. QAA-CDOM

The inversion accuracy of QAA-CDOM on the simulated dataset is RMSD = 0.17 m−1, MARD = 0.24, and R2 = 0.89, as shown in Figure 8a, while for the NOMAD dataset it yielded RMSD = 0.17 m−1, MARD = 0.38, and R2 = 0.59 (Figure 8c).
The performance of QAA-CDOM is influenced by its calibration using multiple datasets, including IOCCG, NOMAD, Hudson, Mississippi, and Neponset [52], which contributes to its relatively good accuracy with the IOCCG simulated and NOMAD data. In addition, QAA-CDOM was fitted separately to the IOCCG simulation data and the NOMAD data for the inversion of ad(443), which yields different j1 and j2 values. It should be noted that the choice of these parameters significantly impacts the final ag(443) retrieval across different water types. For instance, using the parameters suggested by Zhu et al. [23] (j1 = 10.51, j2 = 1.56) leads to a MARD > 0.9 when compared to NOMAD data.

4.2.4. DQAAG

DQAAG achieves the best inversion performance on both the IOCCG simulated dataset and the NOMAD in situ dataset, particularly in waters with low to moderate turbidity (Rrs(555) < 0.06 sr−1), with RMSD < 0.13 m−1, MARD < 0.30, and R2 > 0.89, as shown in Figure 8b,d.
The robust performance of DQAAG on the IOCCG simulated dataset (covering both Case 1 and Case 2 waters) stems directly from its training methodology. The model was developed using a synthetically generated dataset created with IOCCG-recommended algorithms, which maintains strong consistency with Hydrolight simulations (deviation < 1%). This comprehensive training dataset spans an extended dynamic range of optical conditions, significantly enhancing the model’s robustness and applicability across diverse aquatic environments. The effectiveness of DQAAG on the IOCCG simulated data is therefore to be expected. It should be noted that the construction of deep learning models depends heavily on the training dataset, so the applicability of DQAAG to highly turbid waters (Rrs(555) > 0.06 sr−1) still requires further validation.

4.3. Comparison of SeaWiFS Remote Sensing ag(443) Data

Since the ultimate objective of the algorithm is to apply it to ocean color satellites for obtaining global distributions of ag, we further evaluated the ag(443) values derived from DQAAG using SeaWiFS data. The locations of the satellite and in situ match-up stations are shown in Figure 2 (yellow squares). Since the in situ data cover 1999–2006, SeaWiFS data were used for matching with the NOMAD data. Rrs(380) was first estimated from the SeaWiFS data using the UVISRdl model established by Wang et al. (2021) [11], and DQAAG was then applied to retrieve ag(443). Figure 9 shows the scatter plot between the ag(443) retrieved from SeaWiFS and the ag(443) measured in NOMAD, where RMSD = 0.14 m−1, MARD = 0.36, and R2 = 0.51. These evaluation indicators are slightly worse than those obtained when ag(443) is retrieved from in situ Rrs(λ). The reasons for this performance degradation include: (1) the lack of perfect “match-ups” between satellite and field measurements due to temporal and spatial mismatches [53,54], which is a primary source of data bias [42]; and (2) despite efforts to minimize error propagation, residual uncertainties in Rrs(λ) products, caused by sensor noise and incomplete atmospheric correction in marine and coastal areas, can propagate to the estimation of ag(443) [55,56,57].

5. Discussion

5.1. Model Performance

The DQAAG algorithm includes three sets of deep learning models: (1) one that uses Rrs(λ) as input to obtain bbp(λ), (2) one that takes Rrs(λ) as input and produces ζ and ξ as outputs, and (3) one that takes bbp(λ) as input to estimate ad(443). To further investigate the importance of input features, we computed SHapley Additive exPlanations (SHAP) values between the input and output parameters of the different models. A SHAP value measures the impact of each input feature on the variability of the corresponding output parameter in a deep learning model [58]. In brief, SHAP values are generated for each input variable to estimate its marginal effect on the output. The SHAP summary plot combines feature importance with the direction of feature effects, where a wider distribution of SHAP values indicates a stronger influence of the variable on the predicted parameter. Given the characteristics of the model, the SHAP interpreter was configured as the DeepExplainer, which estimates contribution values based on a layer-wise backpropagation mechanism, effectively reducing computation time. We employed global bar plots, global bee swarm plots, and SHAP value scatter plots for the analysis.
Figure 10a–c show the effect of Rrs(λ) on bbp(λ), taking bbp(555) as an example. It can be observed that Rrs(670) and Rrs(380) exert the most significant influence on bbp(555) (Figure 10a). Variations in Rrs(670) can lead to changes in bbp(555) ranging from −0.2 to 1 m−1 (Figure 10b). The feature dependency analysis (Figure 10c) indicates that higher values of Rrs(670) correspond to a stronger positive effect on bbp(555) (R2 = 0.99, p value < 0.01). Rrs(670) serves as a criterion for water classification and provides a rough estimate of water constituents. Rrs(670) > 0.0015 sr−1 indicates Case 2 water [59], in which the increased concentration of suspended particles also increases bbp(555). Therefore, it is reasonable that Rrs(670) has a greater impact on bbp(λ) than other bands.
Figure 10d–f present the impact of Rrs(λ) on ζ. Similarly, Rrs(670) shows the strongest effect on ζ, with variations potentially causing changes in ζ between −0.04 and 0.06 (Figure 10d,e). Furthermore, Rrs(670) exhibits a positive correlation with ζ (R2 = 0.98, p value < 0.01), although this positive influence diminishes when Rrs(670) exceeds 0.01 sr−1 (Figure 10f). Figure 10g–i demonstrate the effect of Rrs(λ) on ξ. It is worth noting that Rrs(380) has the greatest impact on ξ (Figure 10g), and a change in Rrs(380) can cause a change in ξ of −0.2 to 0.2 (Figure 10h). Specifically, Rrs(380) has a significant negative impact on ξ (Figure 10i).
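To illustrate what a Shapley-based attribution measures, the sketch below computes exact Shapley values for a toy three-input function by averaging marginal contributions over all feature orderings. This brute-force approach is not the DeepExplainer used in the study, which approximates these values efficiently for neural networks; the toy model, the zero baseline, and the function names are all assumptions made for illustration.

```python
from itertools import permutations


def shapley_values(model, sample, baseline):
    """Exact Shapley values by enumerating every feature ordering.

    Features start at their baseline value and are switched to the sample
    value one by one; each switch's change in output is credited to that
    feature and averaged over all orderings.
    """
    n = len(sample)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = sample[feature]
            new_output = model(current)
            totals[feature] += new_output - previous_output
            previous_output = new_output
    return [total / len(orderings) for total in totals]


def toy_model(v):
    """Hypothetical model: one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


attributions = shapley_values(toy_model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

The attributions sum to the difference between the model output for the sample and for the baseline, and the interaction term is shared equally between the two features involved.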
Figure 10j–l show the influence of bbp(λ) on ad(443). Consistent with previous findings, bbp(670) demonstrates the strongest positive effect on the estimation of ad(443) (R2 = 0.98, p value < 0.01). Despite the differences in spectral bands, it is well understood that a clear positive correlation exists between the bbp and ad.

5.2. Sensitivity Analysis

In the inversion of ad(443), the DQAAG algorithm feeds bbp(λ) into a deep learning model, and bbp(λ) is itself calculated from Rrs(λ); any bias in Rrs(λ) therefore accumulates in ad(443). To quantify this effect, we added random noise within ranges of ±5%, ±10%, ±20%, and ±50% to Rrs(λ) of the 500 simulated samples provided by IOCCG and evaluated the resulting impact on ad(443) retrieval. Figure 11 illustrates the variation in RMSD, MARD, and R2 for different wavelengths under the influence of random noise, taking Rrs(380), Rrs(443), Rrs(555), and Rrs(670) as examples. Overall, Rrs(443) exerts the strongest influence on the retrieval of ad(443) (Figure 11d–f). When ±20% random noise was added to Rrs(380), the RMSD and MARD of ad(443) increased by less than 5% (Figure 11a,b), and R2 decreased by less than 3% (Figure 11c). In contrast, introducing ±20% noise to Rrs(443) led to an approximately 200% increase in MARD and a reduction in R2 of nearly 30% (Figure 11e,f). The influence of noise on Rrs(555) (Figure 11g–i) and Rrs(670) (Figure 11j–l) follows the same trend as that at 380 nm, but its impact is less pronounced. These results indicate that error accumulation in the deep learning model is most significant at 443 nm, while other wavelengths have minimal impact on the retrieval accuracy.
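The noise experiment can be sketched as follows. This is a minimal illustration under stated assumptions: the noise is drawn independently per sample from a uniform distribution within ±level and applied multiplicatively to one band at a time, and `retrieve` is a hypothetical stand-in for the full Rrs(λ) to ad(443) chain. The actual noise distribution and retrieval code are not specified in the text.

```python
import numpy as np


def perturb_band(rrs, band_index, level, rng):
    """Scale one band by (1 + e), with e ~ Uniform(-level, +level) per sample."""
    noisy = np.array(rrs, dtype=float, copy=True)
    eps = rng.uniform(-level, level, size=noisy.shape[0])
    noisy[:, band_index] *= 1.0 + eps
    return noisy


def sensitivity(retrieve, rrs, truth, band_index, levels, seed=0):
    """Return the MARD of the retrieval for each noise level on one band.

    `retrieve` maps an (n_samples, n_bands) array to n_samples estimates.
    """
    rng = np.random.default_rng(seed)
    truth = np.asarray(truth, dtype=float)
    result = {}
    for level in levels:
        est = retrieve(perturb_band(rrs, band_index, level, rng))
        result[level] = float(np.mean(np.abs(est - truth) / truth))
    return result
```

Running `sensitivity` once per band with `levels=(0.05, 0.10, 0.20, 0.50)` gives one MARD curve per wavelength, which is the kind of comparison summarized in Figure 11.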

5.3. Global CDOM Distribution Patterns

Based on the results presented above, DQAAG can be applied to SeaWiFS data, with RMSD = 0.14 m−1 and R2 = 0.51 against the NOMAD match-ups. Figure 12 shows the climatological distribution of ag(443) derived from SeaWiFS using DQAAG. Extending the estimation of ag(443) to a global scale helps provide key information on the global distribution of CDOM, thereby facilitating the assessment of potential photochemical and photobiological processes in the ocean.
The global spatial distribution of ag(443) aligns with reports from previous studies. Notably, this parameter is highly variable in space, with measured values spanning more than three orders of magnitude (ag(443) ~ 0.001–2 m−1) [6]. As shown in Figure 12a, ag(443) in the equatorial Pacific is slightly higher in spring, mainly due to the influence of equatorial upwelling, which leads to an increase in biological activity [60,61]. In the South Pacific Gyre, ag(443) was about 0.01 m−1 (Figure 12b), because photodegradation limits the CDOM content [45]. In the Yangtze River estuary, mainly due to the influence of land-based sources, ag(443) can reach 1 m−1 [62]. In the southwestern Atlantic, ag(443) increases during autumn and winter (Figure 12c,d). In addition, across all seasons, ag(443) values in regions north of 30°N consistently exceed those in the Southern Hemisphere, a pattern that aligns with the findings of Bricaud et al. (2012) [63].
Previous algorithms [16,64,65] face significant challenges in accurately retrieving ag(443) near the sea surface in complex marine environments. In contrast, our statistical analyses show that such conditions do not substantially affect the performance of DQAAG. Within the range of Rrs(555) from 6.0 × 10−4 to 0.06 sr−1, the DQAAG algorithm consistently yields effective results. Therefore, DQAAG is an effective algorithm for obtaining ag(443), which is helpful for modeling marine ecosystems and estimating heating of the upper ocean.

6. Conclusions

In this study, a deep learning model for retrieving ag(443) was developed using extensive simulated data. We evaluated the inversion of inherent optical properties using globally available datasets, namely the IOCCG simulated data and the NOMAD field measurements, which cover both Case 1 and Case 2 waters. For a(λ), the comparison against the IOCCG simulated dataset and the NOMAD in situ data showed RMSD < 0.31 m−1, MARD < 0.23, and R2 > 0.72. Compared with QAA-CDOM, S2011, and A2018, DQAAG yields better ag(443) retrievals, with RMSD < 0.3 m−1, MARD < 0.30, bias < 0.028 m−1, and R2 > 0.78. This study also demonstrates the universality of algorithms based on radiative transfer and further underscores the powerful capability of combining deep learning with semi-analytical methods to address ocean color retrieval challenges. In addition, when satellite-derived ag(443) was matched with NOMAD data, DQAAG achieved RMSD = 0.14 m−1 and MARD ≈ 0.39. Accurate retrieval of CDOM makes it possible to monitor the spatiotemporal distribution of river plume inputs into the ocean, thereby advancing our understanding of land-ocean interaction processes. Since freshwater plumes often carry significant nutrient loads that can stimulate phytoplankton growth, such monitoring provides valuable information for environmental monitoring agencies and fisheries management.
In summary, DQAAG relies on two important characteristics: the use of an ultraviolet band for ag separation, and the use of deep learning models instead of simple empirical formulas. The adoption of ultraviolet spectral bands enables full utilization of existing satellite data with UV sensing capabilities, while simultaneously opening new methodological avenues for subsequent research on water color parameters. As current satellites are equipped with bands below 380 nm (such as HY-1C or OCI), further exploration of the application scenarios of wavelengths below 380 nm in ocean color remote sensing is a worthwhile endeavor. During this process, attention must be paid to the influence of mycosporine-like amino acids (MAAs). Furthermore, the integration of deep learning models with semi-analytical approaches represents a practical and effective way to process hyperspectral ocean color remote sensing data. This integration not only enhances the capability to retrieve ag(443) from both global oceanic and coastal waters but also improves the accuracy of marine primary productivity estimates, making an important contribution to the field of ocean optics.

Author Contributions

Y.W. was responsible for data analysis, model training, and manuscript writing; X.W. and Q.S. contributed to the design, organization, and revision of the manuscript; Q.X. and L.X. contributed to the collection and analysis of remote sensing images; J.B. and K.B. contributed to the collection of remote sensing image data. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 42406180), the Key Laboratory of Space Ocean Remote Sensing and Application Open Fund (No. 202301002), the Guangxi Disclosure System Technology Project (No. 2025JBGS008), the Central Basic Research Business Fund Projects (No. TKS20250205), and the Research Project of China Three Gorges Corporation (No. 202103552).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We thank the NASA OBPG for providing satellite ocean color products (http://oceancolor.gsfc.nasa.gov/, accessed on 25 July 2025) and NASA for providing the NOMAD dataset. We thank the reviewers for their suggestions, which significantly improved the presentation of the paper.

Conflicts of Interest

Author Xiaodao Wei was employed by the company Shanghai Investigation, Design & Research Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Cao, F.; Tzortziou, M.; Hu, C.; Mannino, A.; Fichot, C.G.; Del Vecchio, R.; Najjar, R.G.; Novak, M. Remote sensing retrievals of colored dissolved organic matter and dissolved organic carbon dynamics in North American estuaries and their margins. Remote Sens. Environ. 2018, 205, 151–165.
  2. Aurin, D.; Mannino, A.; Lary, D.J. Remote sensing of CDOM, CDOM spectral slope, and dissolved organic carbon in the global ocean. Appl. Sci. 2018, 8, 2687.
  3. Stedmon, C.A.; Nelson, N.B. The optical properties of DOM in the ocean. In Biogeochemistry of Marine Dissolved Organic Matter; Elsevier: Amsterdam, The Netherlands, 2015; pp. 481–508.
  4. Nebbioso, A.; Piccolo, A. Molecular characterization of dissolved organic matter (DOM): A critical review. Anal. Bioanal. Chem. 2013, 405, 109–124.
  5. Huang, J.; Chen, J.; Mu, Y.; Cao, C.; Shen, H. Remote-sensing monitoring of colored dissolved organic matter in the Arctic Ocean. Mar. Pollut. Bull. 2024, 204, 116529.
  6. Bonelli, A.G.; Vantrepotte, V.; Jorge, D.S.F.; Demaria, J.; Jamet, C.; Dessailly, D.; Mangin, A.; D’Andon, O.F.; Kwiatkowska, E.; Loisel, H. Colored dissolved organic matter absorption at global scale from ocean color radiometry observation: Spatio-temporal variability and contribution to the absorption budget. Remote Sens. Environ. 2021, 265, 112637.
  7. Jiao, N.; Luo, T.; Chen, Q.; Zhao, Z.; Xiao, X.; Liu, J.; Jian, Z.; Xie, S.; Thomas, H.; Herndl, G.J.; et al. The microbial carbon pump and climate change. Nat. Rev. Microbiol. 2024, 22, 408–419.
  8. Ducklow, H.W.; Steinberg, D.K.; Buesseler, K.O. Upper ocean carbon export and the biological pump. Oceanography 2001, 14, 50–58.
  9. Norman, L.; Thomas, D.N.; Stedmon, C.A.; Granskog, M.A.; Papadimitriou, S.; Krapp, R.H.; Meiners, K.M.; Lannuzel, D.; van der Merwe, P.; Dieckmann, G.S. The characteristics of dissolved organic matter (DOM) and chromophoric dissolved organic matter (CDOM) in Antarctic sea ice. Deep. Sea Res. Part II Top. Stud. Oceanogr. 2011, 58, 1075–1091.
  10. De Mora, S.; Demers, S.; Vernet, M. The Effects of UV Radiation in the Marine Environment; Cambridge University Press: Cambridge, UK, 2000.
  11. Wang, Y.; Lee, Z.; Wei, J.; Shang, S.; Wang, M.; Lai, W. Extending satellite ocean color remote sensing to the near-blue ultraviolet bands. Remote Sens. Environ. 2021, 253, 112228.
  12. Mahrad, B.E.; Newton, A.; Icely, J.D.; Kacimi, I.; Abalansa, S.; Snoussi, M. Contribution of remote sensing technologies to a holistic coastal and marine environmental management framework: A review. Remote Sens. 2020, 12, 2313.
  13. Kutser, T.; Pierson, D.C.; Kallio, K.Y.; Reinart, A.; Sobek, S. Mapping lake CDOM by satellite remote sensing. Remote Sens. Environ. 2005, 94, 535–540.
  14. Isada, T.; Hooker, S.B.; Taniuchi, Y.; Suzuki, K. Evaluation of retrieving chlorophyll a concentration and colored dissolved organic matter absorption from satellite ocean color remote sensing in the coastal waters of Hokkaido, Japan. J. Oceanogr. 2022, 78, 263–276.
  15. Nguyen, V.S.; Loisel, H.; Vantrepotte, V.; Mériaux, X.; Tran, D.L. An Empirical Algorithm for Estimating the Absorption of Colored Dissolved Organic Matter from Sentinel-2 (MSI) and Landsat-8 (OLI) Observations of Coastal Waters. Remote Sens. 2024, 16, 4061.
  16. Mannino, A.; Russ, M.E.; Hooker, S.B. Algorithm development and validation for satellite-derived distributions of DOC and CDOM in the US Middle Atlantic Bight. J. Geophys. Res. Ocean. 2008, 113, C07051.
  17. Sathyendranath, S.; Cota, G.; Stuart, V.; Maass, M.; Platt, T. Remote sensing of phytoplankton pigments: A comparison of empirical and theoretical approaches. Int. J. Remote Sens. 2001, 22, 249–273.
  18. Lee, Z.P.; Carder, K.L.; Steward, R.G.; Peacock, T.G.; Davis, C.O.; Patch, J.S. An empirical algorithm for light absorption by ocean water based on color. J. Geophys. Res. 1998, 103, 27967–27978.
  19. Wang, Y.; Shen, F.; Sokoletsky, L.; Sun, X. Validation and Calibration of QAA Algorithm for CDOM Absorption Retrieval in the Changjiang (Yangtze) Estuarine and Coastal Waters. Remote Sens. 2017, 9, 1192.
  20. Hoge, F.E.; Lyon, P.E. Satellite retrieval of inherent optical properties by linear matrix inversion of oceanic radiance models: An analysis of model and radiance measurement errors. J. Geophys. Res. Ocean. 1996, 101, 16631–16648.
  21. D’Sa, E.J.; Miller, R.L.; Del Castillo, C. Bio-optical properties and ocean color algorithms for coastal waters influenced by the Mississippi River during a cold front. Appl. Opt. 2006, 45, 7410–7428.
  22. Barnard, A.H.; Zaneveld, J.R.V.; Pegau, W.S. In situ determination of the remotely sensed reflectance and the absorption coefficient: Closure and inversion. Appl. Opt. 1999, 38, 5108–5117.
  23. Zhu, W.; Yu, Q.; Tian, Y.Q.; Chen, R.F.; Gardner, G.B. Estimation of chromophoric dissolved organic matter in the Mississippi and Atchafalaya river plume regions using above-surface hyperspectral remote sensing. J. Geophys. Res. Ocean. 2011, 116, C02011.
  24. Maritorena, S.; Siegel, D.A.; Peterson, A.R. Optimization of a semianalytical ocean color model for global-scale applications. Appl. Opt. 2002, 41, 2705–2714.
  25. Lee, Z.; Carder, K.L.; Arnone, R.A. Deriving inherent optical properties from water color: A multiband quasi-analytical algorithm for optically deep waters. Appl. Opt. 2002, 41, 5755–5772.
  26. Wei, J.; Lee, Z. Retrieval of phytoplankton and colored detrital matter absorption coefficients with remote sensing reflectance in an ultraviolet band. Appl. Opt. 2015, 54, 636–649.
  27. Liu, H.; He, X.; Li, Q.; Kratzer, S.; Wang, J.; Shi, T.; Hu, Z.; Yang, C.; Hu, S.; Zhou, Q. Estimating ultraviolet reflectance from visible bands in ocean colour remote sensing. Remote Sens. Environ. 2021, 258, 112404.
  28. Zheng, L.; Lee, Z.; Wang, Y.; Yu, X.; Lai, W.; Shang, S. Evaluation of near-blue UV remote sensing reflectance over the global ocean from SNPP VIIRS, PACE OCI, and GCOM-C SGLI. Opt. Express 2025, 33, 40465–40488.
  29. Siswanto, E. Assessing optical water types in Asian coastal ocean waters from space using GCOM-C/SGLI observations. Int. J. Remote Sens. 2025, 46, 2337–2357.
  30. Li, S.; Chen, S.; Ma, C.; Peng, H.; Wang, J.; Hu, L.; Song, Q. Construction of a radiometric degradation model for ocean color sensors of HY1C/D. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–13.
  31. Wang, J.; Wang, Y.; Lee, Z.; Wang, D.; Chen, S.; Lai, W. A revision of NASA SeaDAS atmospheric correction algorithm over turbid waters with artificial Neural Networks estimated remote-sensing reflectance in the near-infrared. ISPRS J. Photogramm. Remote Sens. 2022, 194, 235–249.
  32. Zhao, D.; Feng, L.; Yang, Z.; Yu, X.; Wang, M. A deep-learning assisted algorithm to improve inherent optical properties estimations over inland and nearshore coastal waters. IEEE Trans. Geosci. Remote Sens. 2025.
  33. Zhang, Z.; Chen, P.; Zhang, S.; Huang, H.; Pan, Y.; Pan, D. A Review of Machine Learning Applications in Ocean Color Remote Sensing. Remote Sens. 2025, 17, 1776.
  34. Chen, J.; Quan, W.; Cui, T.; Song, Q.; Lin, C. Remote sensing of absorption and scattering coefficient using neural network model: Development, validation, and application. Remote Sens. Environ. 2014, 149, 213–226.
  35. Sauzède, R.; Claustre, H.; Uitz, J.; Jamet, C.; Dall’Olmo, G.; d’Ortenzio, F.; Gentili, B.; Poteau, A.; Schmechtig, C. A neural network-based method for merging ocean color and Argo data to extend surface bio-optical properties to depth: Retrieval of the particulate backscattering coefficient. J. Geophys. Res. Ocean. 2016, 121, 2552–2571.
  36. Ioannou, I.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S. Deriving ocean color products using neural networks. Remote Sens. Environ. 2013, 134, 78–91.
  37. Yuan, Q.; Shen, H.; Li, T.; Li, Z.; Li, S.; Jiang, Y.; Xu, H.; Tan, W.; Yang, Q.; Wang, J. Deep learning in environmental remote sensing: Achievements and challenges. Remote Sens. Environ. 2020, 241, 111716.
  38. Sun, X.; Zhang, Y.; Zhang, Y.; Shi, K.; Zhou, Y.; Li, N. Machine learning algorithms for chromophoric dissolved organic matter (CDOM) estimation based on Landsat 8 images. Remote Sens. 2021, 13, 3560.
  39. Chen, J.; He, X.; Zhou, B.; Pan, D. Deriving colored dissolved organic matter absorption coefficient from ocean color with a neural quasi-analytical algorithm. J. Geophys. Res. Ocean. 2017, 122, 8543–8556.
  40. IOCCG. Remote Sensing of Inherent Optical Properties: Fundamentals, Tests of Algorithms, and Applications. In Reports of the International Ocean-Colour Coordinating Group, No. 5; Lee, Z.-P., Stuart, V., Eds.; IOCCG: Dartmouth, NS, Canada, 2006; Volume 5, p. 126.
  41. IOCCG-OCAG (International Ocean Colour Coordinating Group). Model, Parameters, and Approaches That Used to Generate Wide Range of Absorption and Backscattering Spectra. 2003. Available online: http://www.ioccg.org/groups/OCAG_data.html (accessed on 25 July 2025).
  42. Werdell, P.J.; Bailey, S.W. An improved in-situ bio-optical data set for ocean color algorithm development and satellite data product validation. Remote Sens. Environ. 2005, 98, 122–140.
  43. Bailey, S.W.; Werdell, P.J. A multi-sensor approach for the on-orbit validation of ocean color satellite data products. Remote Sens. Environ. 2006, 102, 12–23.
  44. Wei, J.; Lee, Z.; Shang, S. A system to measure the data quality of spectral remote sensing reflectance of aquatic environments. J. Geophys. Res. Ocean. 2016, 121, 8189–8207.
  45. Wang, Y.; Lee, Z.; Ondrusek, M.; Li, X.; Zhang, S.; Wu, J. An evaluation of remote sensing algorithms for the estimation of diffuse attenuation coefficients in the ultraviolet bands. Opt. Express 2022, 30, 6640–6655.
  46. Shanmugam, P. New models for retrieving and partitioning the colored dissolved organic matter in the global ocean: Implications for remote sensing. Remote Sens. Environ. 2011, 115, 1501–1521.
  47. Chollet, F. Keras. 2015. Available online: https://github.com/fchollet/keras (accessed on 25 July 2025).
  48. Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and Tensorflow: Concepts, Tools, and Techniques to Build Intelligent Systems; O’Reilly Media: Santa Rosa, CA, USA, 2019.
  49. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105.
  50. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
  51. Aurin, D.A.; Dierssen, H.M. Advantages and limitations of ocean color remote sensing in CDOM-dominated, mineral-rich coastal and estuarine waters. Remote Sens. Environ. 2012, 125, 181–197.
  52. Zhu, W.; Yu, Q.; Tian, Y.Q.; Becker, B.L.; Zheng, T.; Carrick, H.J. An assessment of remote sensing algorithms for colored dissolved organic matter in complex freshwater environments. Remote Sens. Environ. 2014, 140, 766–778.
  53. Antoine, D.; d’Ortenzio, F.; Hooker, S.B.; Bécu, G.; Gentili, B.; Tailliez, D.; Scott, A.J. Assessment of uncertainty in the ocean reflectance determined by three satellite ocean color sensors (MERIS, SeaWiFS and MODIS-A) at an offshore site in the Mediterranean Sea (BOUSSOLE project). J. Geophys. Res. Ocean. 2008, 113, C07013.
  54. Zibordi, G.; Berthon, J.-F.; Mélin, F.; D’Alimonte, D.; Kaitala, S. Validation of satellite ocean color primary products at optically complex coastal sites: Northern Adriatic Sea, Northern Baltic Proper and Gulf of Finland. Remote Sens. Environ. 2009, 113, 2574–2591.
  55. Wei, J.; Lee, Z.; Garcia, R.; Zoffoli, L.; Armstrong, R.A.; Shang, Z.; Sheldon, P.; Chen, R.F. An assessment of Landsat-8 atmospheric correction schemes and remote sensing reflectance products in coral reefs and coastal turbid waters. Remote Sens. Environ. 2018, 215, 18–32.
  56. Wang, M. Remote sensing of the ocean contributions from ultraviolet to near-infrared using the shortwave infrared bands: Simulations. Appl. Opt. 2007, 46, 1535–1547.
  57. Wang, J.; Lee, Z.; Wei, J.; Du, K. Atmospheric correction in coastal region using same-day observations of different sun-sensor geometries with a revised POLYMER model. Opt. Express 2020, 28, 26953–26976.
  58. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 4–9 December 2017.
  59. Najah, A.; Al-Shehhi, M.R. Performance of the ocean color algorithms: QAA, GSM, and GIOP in inland and coastal waters. Remote Sens. Earth Syst. Sci. 2021, 4, 235–248.
  60. Nelson, N.B.; Siegel, D.A. The global distribution and dynamics of chromophoric dissolved organic matter. Annu. Rev. Mar. Sci. 2013, 5, 447–476.
  61. Nolan, C.; Overpeck, J.T.; Allen, J.R.; Anderson, P.M.; Betancourt, J.L.; Binney, H.A.; Brewer, S.; Bush, M.B.; Chase, B.M.; Cheddadi, R. Past and future global transformation of terrestrial ecosystems under climate change. Science 2018, 361, 920–923.
  62. Zhang, Y.; Zhou, L.; Zhou, Y.; Zhang, L.; Yao, X.; Shi, K.; Jeppesen, E.; Yu, Q.; Zhu, W. Chromophoric dissolved organic matter in inland waters: Present knowledge and future challenges. Sci. Total Environ. 2021, 759, 143550.
  63. Bricaud, A.; Ciotti, A.M.; Gentili, B. Spatial-temporal variations in phytoplankton size and colored detrital matter absorption at global and regional scales, as derived from twelve years of SeaWiFS data (1998–2009). Glob. Biogeochem. Cycles 2012, 26, GB1010.
  64. Mannino, A.; Novak, M.G.; Hooker, S.B.; Hyde, K.; Aurin, D. Algorithm development and validation of CDOM properties for estuarine and continental shelf waters along the northeastern U.S. coast. Remote Sens. Environ. 2014, 152, 576–602.
  65. Bai, Y.; Pan, D.; Cai, W.J.; He, X.; Wang, D.; Tao, B.; Zhu, Q. Remote sensing of salinity from satellite-derived CDOM in the Changjiang River dominated East China Sea. J. Geophys. Res. Ocean. 2013, 118, 227–243.
Figure 1. Rrs(λ) spectra and ag(443) data statistics used in this study. (a) Hyperspectral Rrs(λ) of the synthesized data, (b) the statistical distribution of ag(443) in the synthesized data, (c) hyperspectral Rrs(λ) of the simulated data provided by IOCCG, (d) the measured Rrs(λ) spectra provided by the NOMAD dataset. Relationships between Rrs(λ) and ag(λ) for the synthetic, IOCCG, and NOMAD datasets: (e) Rrs(443) vs. Rrs(410), (f) ag(443) vs. ag(410).
Figure 2. The NOMAD in situ data used for evaluating the ag(443) retrieval algorithm. The red circle represents the measurement positions of NOMAD data, and the yellow square represents the NOMAD data stations that match the SeaWiFS satellite.
Figure 3. Schematic diagram of the system for estimating ag(443) using deep learning: DQAAG.
Figure 4. Comparison between bbp(λ) derived from DQAAG and the bbp(λ) from the IOCCG simulated dataset: (a) bbp(410), (b) bbp(443), (c) bbp(490), (d) bbp(555), (e) bbp(670).
Figure 5. Comparison between a(λ) derived from DQAAG and a(λ) from the IOCCG simulated dataset: (a) a(410), (b) a(443), (c) a(490), (d) a(555), and (e) a(670).
Figure 6. Comparison between a(λ) derived from DQAAG and a(λ) from the NOMAD dataset: (a) a(410), (b) a(443), (c) a(490), (d) a(555), and (e) a(670).
Figure 7. Comparison between ag(443) derived from S2011 and A2018 and the reference ag(443). IOCCG dataset: (a) S2011 and (b) A2018; NOMAD dataset: (c) S2011 and (d) A2018.
Figure 8. Comparison between ag(443) derived from QAA-CDOM and DQAAG and the reference ag(443). IOCCG dataset: (a) QAA-CDOM and (b) DQAAG; NOMAD dataset: (c) QAA-CDOM and (d) DQAAG.
Figure 9. Comparison between ag(443) derived by DQAAG from SeaWiFS data and the NOMAD-measured ag(443).
Figure 10. SHAP summary plots for the DQAAG model applied to the synthetic dataset. (a) mean |SHAP| values of Rrs(λ) for predicting bbp(555), (b) SHAP values of Rrs(λ) for bbp(555), (c) the feature dependence trend of Rrs(670) on bbp(555), (d) mean |SHAP| values of Rrs(λ) for predicting ζ, (e) SHAP values of Rrs(λ) for ζ, (f) the feature dependence trend of Rrs(670) on ζ, (g) mean |SHAP| values of Rrs(λ) for predicting ξ, (h) SHAP values of Rrs(λ) for ξ, (i) the feature dependence trend of Rrs(380) on ξ, (j) mean |SHAP| values of bbp(λ) for predicting ad(443), (k) SHAP values of bbp(λ) for ad(443), and (l) the feature dependence trend of bbp(670) on ad(443).
Figure 11. Impact of adding random noise (±5%, ±10%, ±20%, and ±50%) to Rrs(λ) on the retrieval accuracy of ad(443): (a–c) represent the RMSD, MARD, and R2 for Rrs(380), (d–f) correspond to the RMSD, MARD, and R2 for Rrs(443), (g–i) represent the RMSD, MARD, and R2 for Rrs(555), and (j–l) represent the RMSD, MARD, and R2 for Rrs(670).
Figure 12. Global distribution of seasonal climatology of SeaWiFS derived ag(443): (a) Spring, (b) Summer, (c) Autumn, and (d) Winter.
Table 1. Statistical description of the Rrs(555) and ag(443) datasets used for model training, validation, and testing, with Rrs(555) shown as an example band (the coefficient of variation (CV) is the ratio of the standard deviation to the mean).

| Data | Data source (N) | Parameter | Min | Max | Mean | CV |
|---|---|---|---|---|---|---|
| Training data | Simulated data (N = 200,000) | Rrs(555) [sr−1] | 6.0 × 10−4 | 0.059 | 0.0062 | 1.18 |
| | | ag(443) [m−1] | 3.9 × 10−4 | 8.05 | 0.54 | 2.19 |
| Validation data | IOCCG data (N = 500) | Rrs(555) [sr−1] | 1.0 × 10−3 | 0.029 | 0.0061 | 0.77 |
| | | ag(443) [m−1] | 2.5 × 10−3 | 2.37 | 0.33 | 1.45 |
| | NOMAD data (N = 287) | Rrs(555) [sr−1] | 6.4 × 10−4 | 0.040 | 0.0061 | 1.08 |
| | | ag(443) [m−1] | 5.4 × 10−4 | 1.12 | 0.17 | 1.18 |
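Table 2. QAA_v6.

| Step | Property | Derivation | Approach |
|---|---|---|---|
| 0 | $r_{rs}(\lambda)$ | $r_{rs}(\lambda) = \dfrac{R_{rs}(\lambda)}{0.52 + 1.7\,R_{rs}(\lambda)}$ | Semi-analytical |
| 1 | $u(\lambda)$ | $u(\lambda) = \dfrac{-g_0 \pm \sqrt{g_0^2 + 4 g_1 r_{rs}(\lambda)}}{2 g_1}$ | Semi-analytical |
| 2 | $a(\lambda_0)$ | If $R_{rs}(670) < 0.0015\ \mathrm{sr}^{-1}$: $a(\lambda_0) = a_w(\lambda_0) + 10^{-1.146 - 1.366x - 0.469x^2}$, with $x = \log_{10}\left(\dfrac{r_{rs}(443) + r_{rs}(490)}{r_{rs}(\lambda_0) + 5\,r_{rs}(667)\,\dfrac{r_{rs}(667)}{r_{rs}(490)}}\right)$. Else: $a(\lambda_0) = a_w(670) + 0.39\left(\dfrac{R_{rs}(670)}{R_{rs}(443) + R_{rs}(490)}\right)^{1.14}$ | Empirical |
| 3 | $b_{bp}(\lambda_0)$ | $b_{bp}(\lambda_0) = \dfrac{u(\lambda_0)\,a(\lambda_0)}{1 - u(\lambda_0)} - b_{bw}(\lambda_0)$ | Analytical |
| 4 | $Y$ | $Y = 2.0\left(1 - 1.2\,e^{-0.9\,\frac{r_{rs}(443)}{r_{rs}(555)}}\right)$ | Empirical |
| 5 | $b_{bp}(\lambda)$ | $b_{bp}(\lambda) = b_{bp}(\lambda_0)\left(\dfrac{\lambda_0}{\lambda}\right)^{Y}$ | Semi-analytical |
| 6 | $a(\lambda)$ | $a(\lambda) = \dfrac{[1 - u(\lambda)]\,b_b(\lambda)}{u(\lambda)}$ | Analytical |
| 7 | $\zeta$ | $\zeta = p_1 + \dfrac{p_2}{p_3 + \frac{r_{rs}(443)}{r_{rs}(555)}}$, with $p_1 = 0.74$, $p_2 = 0.2$, $p_3 = 0.8$ | Empirical |
| 8 | $\xi$ | $\xi = e^{S(443 - 412)}$, $S = p_1 + \dfrac{p_2}{p_3 + \frac{r_{rs}(443)}{r_{rs}(555)}}$, with $p_1 = 0.015$, $p_2 = 0.002$, $p_3 = 0.6$ | Empirical |
| 9 | $a_{dg}(443)$ | $a_{dg}(443) = \dfrac{a(412) - \zeta\,a(443)}{\xi - \zeta} - \dfrac{a_w(412) - \zeta\,a_w(443)}{\xi - \zeta}$ | Analytical |
| 10 | $a_{ph}(\lambda)$, $a_{dg}(\lambda)$ | $a_{ph}(\lambda) = a(\lambda) - a_{dg}(\lambda) - a_w(\lambda)$, $a_{dg}(\lambda) = a_{dg}(443)\,e^{-S(\lambda - 443)}$ | Analytical |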
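The steps of Table 2 can be traced in a short script. The following is a sketch under stated assumptions, not the authors' implementation. Table 2 does not list g0 and g1, so the commonly used QAA values 0.089 and 0.1245 are assumed. Only the positive root of Step 1 is kept, because u must be positive. The 667 nm and 670 nm bands are treated as the same band. The reference wavelength λ0 is assumed to be 555 nm in the first branch of Step 2 and 670 nm otherwise. The pure-water values `aw` and `bbw` are rough placeholders that should be replaced with tabulated values.

```python
import math

G0, G1 = 0.089, 0.1245  # assumed QAA constants (not listed in Table 2)

def qaa_v6(Rrs, aw, bbw):
    """Rrs, aw, bbw: dicts keyed by wavelength in nm (412, 443, 490, 555, 670)."""
    # Step 0: below-surface remote sensing reflectance
    rrs = {wl: R / (0.52 + 1.7 * R) for wl, R in Rrs.items()}
    # Step 1: u = bb / (a + bb), positive root of g1*u^2 + g0*u - rrs = 0
    u = {wl: (-G0 + math.sqrt(G0**2 + 4 * G1 * r)) / (2 * G1) for wl, r in rrs.items()}
    # Step 2: total absorption at the reference wavelength (670 stands in for 667)
    if Rrs[670] < 0.0015:
        lam0 = 555  # assumed reference band for this branch
        x = math.log10((rrs[443] + rrs[490])
                       / (rrs[lam0] + 5 * rrs[670] / rrs[490] * rrs[670]))
        a0 = aw[lam0] + 10 ** (-1.146 - 1.366 * x - 0.469 * x**2)
    else:
        lam0 = 670
        a0 = aw[670] + 0.39 * (Rrs[670] / (Rrs[443] + Rrs[490])) ** 1.14
    # Step 3: particle backscattering at the reference wavelength
    bbp0 = u[lam0] * a0 / (1 - u[lam0]) - bbw[lam0]
    # Step 4: spectral slope of particle backscattering
    ratio = rrs[443] / rrs[555]
    Y = 2.0 * (1 - 1.2 * math.exp(-0.9 * ratio))
    # Steps 5-6: spectral bbp and total absorption, with bb = bbw + bbp
    bbp = {wl: bbp0 * (lam0 / wl) ** Y for wl in Rrs}
    a = {wl: (1 - u[wl]) * (bbw[wl] + bbp[wl]) / u[wl] for wl in Rrs}
    # Steps 7-8: empirical ratios zeta and xi
    zeta = 0.74 + 0.2 / (0.8 + ratio)
    S = 0.015 + 0.002 / (0.6 + ratio)
    xi = math.exp(S * (443 - 412))
    # Step 9: detritus-plus-CDOM absorption at 443 nm
    adg443 = ((a[412] - zeta * a[443]) - (aw[412] - zeta * aw[443])) / (xi - zeta)
    # Step 10: spectral adg and phytoplankton absorption
    adg = {wl: adg443 * math.exp(-S * (wl - 443)) for wl in Rrs}
    aph = {wl: a[wl] - adg[wl] - aw[wl] for wl in Rrs}
    return {"a": a, "bbp": bbp, "adg": adg, "aph": aph, "Y": Y, "lambda0": lam0}

# Placeholder pure-water coefficients (approximate, for illustration only)
aw = {412: 0.0046, 443: 0.0070, 490: 0.0150, 555: 0.0596, 670: 0.439}
bbw = {412: 0.0033, 443: 0.0024, 490: 0.0016, 555: 0.0009, 670: 0.0004}
# Hypothetical clear-water spectrum [sr^-1]
sample = {412: 0.005, 443: 0.0045, 490: 0.004, 555: 0.002, 670: 0.0003}
result = qaa_v6(sample, aw, bbw)
```

Note that Step 6 evaluated at λ0 returns the Step 2 value of a(λ0) by construction, which offers a quick internal consistency check on any implementation.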
Table 3. Statistics of DQAAG applied to the IOCCG simulated dataset for bbp(λ) inversion.

| Data | N | RMSD (m⁻¹) | MARD | Bias (m⁻¹) | R² |
|---|---|---|---|---|---|
| bbp(410) | 500 | 0.0052 | 0.087 | −0.00039 | ~0.97 |
| bbp(443) | 500 | 0.0053 | 0.078 | −0.00085 | ~0.97 |
| bbp(490) | 500 | 0.0060 | 0.084 | 0.00032 | ~0.96 |
| bbp(555) | 500 | 0.0068 | 0.081 | 0.00075 | ~0.96 |
| bbp(670) | 500 | 0.0074 | 0.073 | 0.0012 | ~0.96 |
Table 4. Statistics of DQAAG applied to the IOCCG simulated dataset and NOMAD dataset for a(λ) inversion.

| Parameter | Data | N | RMSD (m⁻¹) | MARD | Bias (m⁻¹) | R² |
|---|---|---|---|---|---|---|
| a(410) | IOCCG | 500 | 0.23 | 0.069 | 0.031 | 0.96 |
| | NOMAD | 287 | 0.31 | 0.23 | −0.064 | 0.75 |
| a(443) | IOCCG | 500 | 0.14 | 0.12 | −0.042 | 0.97 |
| | NOMAD | 287 | 0.23 | 0.21 | −0.045 | 0.76 |
| a(490) | IOCCG | 500 | 0.082 | 0.063 | 0.0094 | 0.96 |
| | NOMAD | 287 | 0.12 | 0.20 | −0.016 | 0.82 |
| a(555) | IOCCG | 500 | 0.033 | 0.083 | −0.0066 | 0.95 |
| | NOMAD | 287 (286) * | 0.059 | 0.17 | 0.011 | 0.72 |
| a(670) | IOCCG | 500 | 0.075 | 0.10 | 0.039 | 0.73 |
| | NOMAD | 287 (283) * | 0.15 | 0.22 | 0.083 | 0.79 |

* The number in parentheses indicates the count of valid observations.
Table 5. Statistical metrics of ag(443) derived from S2011, A2018, QAA-CDOM, and DQAAG for the IOCCG and NOMAD datasets.

| Algorithm | Data | N | RMSD (m⁻¹) | MARD | Bias (m⁻¹) | R² |
|---|---|---|---|---|---|---|
| QAA-CDOM | IOCCG | 500 | 0.20 | 0.32 | −0.078 | 0.89 |
| | NOMAD | 287 | 0.15 | 0.42 | −0.047 | 0.50 |
| S2011 | IOCCG | 500 | 0.29 | 0.53 | 0.023 | 0.64 |
| | NOMAD | 287 | 0.15 | 0.44 | −0.0046 | 0.54 |
| A2018 | IOCCG | 500 | 0.45 | 1.01 | −0.18 | 0.38 |
| | NOMAD | 287 | 0.17 | 0.82 | −0.064 | 0.45 |
| DQAAG | IOCCG | 500 | 0.11 | 0.19 | 0.0076 | 0.96 |
| | NOMAD | 287 | 0.13 | 0.30 | 0.028 | 0.78 |
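The four statistics reported in Tables 3 to 5 can be computed along the following lines. The metric definitions are given elsewhere in the article, so the forms used here are assumptions: RMSD as the root mean square of the differences, MARD as the mean absolute relative difference, bias as the mean difference, and R² as the squared Pearson correlation in linear space. The function name and the toy values are hypothetical.

```python
import numpy as np

def retrieval_metrics(estimated, measured):
    """Compute RMSD, MARD, bias, and R2 under the assumed definitions above."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(measured, dtype=float)
    diff = est - obs
    return {
        "RMSD": float(np.sqrt(np.mean(diff**2))),      # same units as the input
        "MARD": float(np.mean(np.abs(diff) / obs)),    # dimensionless
        "Bias": float(np.mean(diff)),                  # signed mean difference
        "R2": float(np.corrcoef(est, obs)[0, 1] ** 2)  # squared Pearson r
    }

# Toy values (illustrative only, not from the paper)
metrics = retrieval_metrics([1.1, 1.8, 3.3], [1.0, 2.0, 3.0])
```

If the article evaluates R² or the relative difference in log-transformed space, the inputs would need to be transformed before calling this function.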
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, Y.; Xin, Q.; Wei, X.; Xu, L.; Bi, J.; Bao, K.; Song, Q. Improvement of the Semi-Analytical Algorithm Integrating Ultraviolet Band and Deep Learning for Inverting the Absorption Coefficient of Chromophoric Dissolved Organic Matter in the Ocean. Remote Sens. 2026, 18, 207. https://doi.org/10.3390/rs18020207
