Long-Term Hourly Ozone Forecasting via Time–Frequency Analysis of ICEEMDAN-Decomposed Components: A 36-Hour Forecast for a Site in Beijing

Lv, Taotao; Yi, Yulu; Zheng, Zhuowen; Yang, Jie; Li, Siwei

doi:10.3390/rs17142530


by Taotao Lv ¹, Yulu Yi ¹, Zhuowen Zheng ¹, Jie Yang ¹,²,³,* and Siwei Li ¹,²,³

¹ School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China

² Perception and Effectiveness Assessment for Carbon-Neutrality Efforts, Engineering Research Center of Ministry of Education, Institute for Carbon Neutrality, Wuhan University, Wuhan 430072, China

³ Hubei Luojia Laboratory, Wuhan University, Wuhan 430079, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(14), 2530; https://doi.org/10.3390/rs17142530

Submission received: 16 May 2025 / Revised: 3 July 2025 / Accepted: 19 July 2025 / Published: 21 July 2025

(This article belongs to the Section Environmental Remote Sensing)


Abstract

Surface ozone is a pollutant linked to higher risks of cardiopulmonary diseases with long-term exposure. Timely forecasting of ozone levels helps authorities implement preventive measures to protect public health and safety. However, few studies have been able to reliably provide long-term hourly ozone forecasts due to the complexity of ozone’s diurnal variations. To address this issue, this study constructs a hybrid prediction model integrating improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), bi-directional long short-term memory neural network (BiLSTM), and the persistence model to forecast the hourly ozone concentrations for the next continuous 36 h. The model is trained and tested at the Wanshouxigong site in Beijing. The ICEEMDAN method decomposes the ozone time series data to extract trends and obtain intrinsic mode functions (IMFs) and a residual (Res). Fourier period analysis is employed to elucidate the periodicity of the IMFs, which serves as the basis for selecting the prediction model (BiLSTM or persistence model) for different IMFs. Extensive experiments have shown that a hybrid model of ICEEMDAN, BiLSTM, and persistence model is able to achieve a good performance, with a prediction accuracy of R² = 0.86 and RMSE = 18.70 µg/m³ for the 36th hour, outperforming other models.

Keywords:

ozone forecast; ICEEMDAN; BiLSTM; persistence model

1. Introduction

Surface ozone is one of the primary atmospheric pollutants in China. Numerous studies have shown that excessive surface ozone can damage the vascular and respiratory systems [1,2] and harm human health more broadly [3,4], and many plants are also highly sensitive to ozone concentration [5,6]. Since 2013, Beijing, the capital city, has vigorously carried out atmospheric pollution control efforts. However, the results for ozone control have not been encouraging, with the maximum daily 8-h average ozone (MDA8 O₃) concentration remaining persistently high [7,8]. Since ozone exhibits pronounced diurnal variations and high-pollution episodes frequently overlap with peak human activity periods [9,10,11], accurate long-term hourly ozone forecasting is crucial: it provides sufficient warning time to take measures and accurate timing of pollution episodes to reduce public exposure.

Currently, methods for predicting surface ozone concentrations can be broadly divided into deterministic techniques and statistical models. Deterministic techniques include numerical simulation models with spatial prediction capabilities, such as the WRF/Chem model [12,13,14]. To initialize the WRF simulation, users must provide initial conditions that define the state of the atmosphere at the beginning of the model run [15]. These initial conditions may include atmospheric pressure, temperature, humidity, wind speed and direction, as well as various O₃ precursors such as VOCs and NOx [16]. The prediction accuracy of these models is highly dependent on the consistency between the input conditions and the actual atmospheric state, and they are computationally complex and require high-performance hardware. In contrast, statistical models can more easily extract the characteristics of ozone changes [17]. Statistical models include traditional machine learning methods, such as generalized linear models [18,19,20], decision trees [21], support vector machines (SVM) [22], etc. These models can learn and fit the future trends of surface ozone with a sufficient number of actual observations from monitoring sites [17]. However, due to the nonlinear characteristic of surface ozone [23,24], achieving long-term hourly ozone prediction for traditional machine learning methods is not easy.

In recent years, various deep neural network models have been used for ozone concentration prediction, including temporal models such as recurrent neural networks (RNN) [25] and long short-term memory networks (LSTM) [26,27], as well as convolutional neural networks (CNN) [28,29,30,31], which are often used to capture spatiotemporal features. However, due to the lack of constraints, errors can accumulate quickly, making long-term hourly ozone prediction very challenging. To address this issue, Ref. [32] introduced meteorological variable prediction values from WRF and periodic time encoding into artificial neural networks (ANN) to achieve continuous 24-h prediction. Ref. [28] proposed temporal convolutional neural networks (TCNN) to overcome the limitation of fixed receptive field sizes in CNNs and capture the long-term temporal dependencies of ozone concentration sequences, achieving continuous 72-h forecasts. However, these studies still suffer from data unavailability, and the characteristics of ozone concentration changes have not been fully considered.

The ozone trend is dominated by diurnal [10], weekly [33,34], and seasonal [35] cycles. To improve long-term predictions, hybrid models utilizing decomposition techniques can simplify complex patterns [36,37]. For instance, Ref. [38] used wavelet decomposition with gated recurrent unit neural networks (GRU) and support vector regression to achieve a 66.29% reduction in RMSE for MDA8 O₃ predictions. Ref. [39] created a hybrid model (CEEMD-Subset-OASVR-GRNN) for better daily average ozone forecasting, while Ref. [40] developed a model (RF-CEEMDAN-Attention-LSTM) to predict PM2.5 and O₃ in Chengdu. However, these studies train all decomposed sequences with the same network without analyzing the characteristics of these sequences, which may limit forecasting accuracy.

This study first decomposes the ozone time series to analyze its characteristics and then selects appropriate prediction models. Data decomposition enhances model performance by breaking the original time series into stable sub-sequences [38]. Next, we use the Fourier period analysis to classify the IMFs (intrinsic mode functions) from the ICEEMDAN (improved complete ensemble empirical mode decomposition with adaptive noise) into noise, short-periodic, and long-periodic IMFs. Two models are applied to different decomposed sequences. The BiLSTM model is effective for short-period non-stationary signals [41], while the persistence model is appropriate for long-period stationary signals, facilitating long-term hourly ozone prediction [42]. Extensive experiments show that the new ICEEMDAN-BiLSTM-persistence model improves the stability and accuracy in long-term hour-by-hour predictions.

2. Materials and Methods

2.1. Data

Beijing, China’s political and cultural center, has a permanent population of 21.858 million and a total area of 16,410.54 square kilometers, making it one of China’s most densely populated cities with severe ozone pollution. We select the Wanshouxigong monitoring site in the urban area and compile hourly ozone concentration data from 2015 to 2023 for the study dataset. During these nine years, there were 341 days with 1760 h of ozone pollution at this site, according to China’s ambient air quality standards [43].

Figure 1a shows that ozone concentration changes exhibit a distinct annual single-peak distribution. To better analyze the annual variation characteristics of ozone, the average weekly ozone concentrations for different years were calculated and compared between 2015 and 2023, as shown in Figure 1b. The ozone concentration peaks around June each year and reaches its lowest point from October to the following January. The trends are similar across different years, but with some differences. Figure 1c shows the average daily variation curves of ozone over 24 h in different seasons. The ozone concentration is highest in summer, followed by spring and autumn, with the lowest concentration in winter. The daily variation trends of ozone concentration in different seasons are similar, showing a single-peak distribution, with the concentration rising at 09:00, peaking at 16:00–18:00, and then declining until 09:00 the next day. However, although 16:00–18:00 is statistically the period with the highest incidence of ozone pollution, the actual ozone concentration changes on pollution days are often more complicated. We analyzed the ozone concentration changes on ozone pollution days (27 days in total) in the second quarter of 2023, classified the pollution days into four categories according to the time of reaching the peak concentration, and plotted the hour-by-hour median concentration values of each category in Figure 1d. The ozone concentration reached its lowest point at 06:00, and the time of the peak value spanned a wide range from 13:00 to 20:00, with the shortest duration of pollution being 1 h and the longest being 12 h. This indicates that the daily variation of ozone concentration is complex and that hour-by-hour prediction is difficult.

2.2. The Framework of This Study

We propose a hybrid model named ICEEMDAN-BiLSTM-persistence to achieve the prediction of ozone concentrations, which consists of a decomposition model and two time series prediction models. The ICEEMDAN method effectively extracts variations in time series data [41] and decomposes the sequence into intrinsic mode functions (IMFs) and a residual (Res). Then, fast Fourier transform is applied to classify the IMFs into short-periodic IMFs and long-periodic IMFs in order to select a more appropriate prediction model.

Short-periodic IMFs exhibit complex variations within the prediction time frame and are suitable for prediction using the BiLSTM model, which has strong nonlinear fitting capabilities. In contrast, long-periodic IMFs, with minimal changes over the prediction period, are well-suited for prediction using the persistence model, which directly uses the current input as the output. Each IMF is trained with a single prediction model. For the current moment T, all prediction models output the continuous 36-h forecast values for their respective IMFs. The forecast values at corresponding time steps are then summed to obtain the reconstructed forecast sequence. The specific construction process includes data preprocessing, model construction, and prediction results, as shown in Figure 2.
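The reconstruction step described above can be illustrated with a minimal sketch. It assumes that each component already has a forecasting function attached; the names `persistence_forecast`, `reconstruct_forecast`, and the toy component values are hypothetical and are not taken from the study. The BiLSTM is not implemented here; any callable that returns a 36-step forecast could take its place.

```python
from typing import Callable, Dict, List, Sequence

HORIZON = 36  # forecast length in hours, as stated in the text


def persistence_forecast(history: Sequence[float], horizon: int = HORIZON) -> List[float]:
    """Persistence model: repeat the most recent value for every future step."""
    return [history[-1]] * horizon


def reconstruct_forecast(
    components: Dict[str, Sequence[float]],
    models: Dict[str, Callable[[Sequence[float]], List[float]]],
    horizon: int = HORIZON,
) -> List[float]:
    """Sum the per-component forecasts step by step.

    Components without an assigned model (for example noise IMFs that are
    not used for prediction) are skipped.
    """
    total = [0.0] * horizon
    for name, series in components.items():
        model = models.get(name)
        if model is None:
            continue
        forecast = model(series)
        for step in range(horizon):
            total[step] += forecast[step]
    return total


# Toy usage with made-up values: two slowly varying components handled by
# persistence, and one noise component that is left out of the sum.
components = {
    "IMF13": [1.0, 1.1, 1.2],
    "Res": [40.0, 40.0, 40.1],
    "IMF1": [0.3, -0.2, 0.1],
}
models = {"IMF13": persistence_forecast, "Res": persistence_forecast}
forecast = reconstruct_forecast(components, models)
```

In this toy case every forecast step equals the sum of the last values of IMF13 and Res, because both are predicted by persistence and IMF1 is excluded.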

2.2.1. Dataset Division and Decomposition

Overall, the hourly ozone concentration data from 2015 to 2022 are used as the training set, and the data from 2023 are used as the test set. Firstly, the ICEEMDAN method is applied to the training set to obtain IMFs and Res. Secondly, the test set is merged with the training set and then decomposed. The IMFs and Res corresponding to the year 2023 are extracted from the decomposition results to serve as the decomposed test set. It should be noted that the two decompositions use the same parameters (Nstd = 0.2, NR = 32, MaxIter = 1000). This differs from Ref. [40], which decomposes the entire dataset before dividing it into training and test sets, so that the decomposition results of the training set may be affected by the test set. In everyday operation, new data are continuously added to the existing dataset and then decomposed, similar to how the test set is treated in this study. Moreover, because the test set is decomposed together with the training set rather than on its own, inconsistencies in the number of components between the training and test sets are highly unlikely [36].
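The two-pass decomposition scheme can be sketched as follows. This is only an illustration of the data flow: `toy_decompose` is a hypothetical stand-in (a trailing moving average plus remainder) and is not ICEEMDAN, and the function names are assumptions made for this sketch. In practice, `decompose` would be an ICEEMDAN implementation called with the same parameters in both passes.

```python
from typing import Callable, List, Sequence, Tuple

Decomposer = Callable[[Sequence[float]], List[List[float]]]


def toy_decompose(series: Sequence[float], window: int = 3) -> List[List[float]]:
    """Stand-in decomposer (NOT ICEEMDAN): detail part plus trailing-mean trend."""
    trend = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        trend.append(sum(chunk) / len(chunk))
    detail = [x - t for x, t in zip(series, trend)]
    return [detail, trend]


def decompose_train_and_test(
    train: Sequence[float],
    test: Sequence[float],
    decompose: Decomposer,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Decompose the training set alone, then train+test, and slice out the test part."""
    # Pass 1: the training components never see the test data.
    train_components = decompose(train)
    # Pass 2: the test data are appended to the training data and decomposed together.
    full_components = decompose(list(train) + list(test))
    # Keep only the segment of each component that corresponds to the test period.
    test_components = [component[len(train):] for component in full_components]
    return train_components, test_components


# Toy usage with made-up hourly values.
train = [10.0, 12.0, 11.0, 13.0]
test = [15.0, 14.0]
train_components, test_components = decompose_train_and_test(train, test, toy_decompose)
```

The design choice mirrored here is that the training components are computed without any access to the test period, while the test components are obtained the way an operational system would obtain them, by decomposing the accumulated series.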

2.2.2. Fourier Period Analysis

IMFs and Res have different periodic properties, as shown in Figure 3, and Figure 4a shows the results of Fourier spectral analysis on all IMFs (excluding Res). The y-axis amplitude corresponds to the sine wave's amplitude [44], and the x-axis represents the period. To quantify the changes in each IMF over a continuous 36-h period, we define the moving variance (MV) [45] with a specific temporal window $T_{w}$ (in this study, $T_{w} = 36$), as shown in Equation (1). $N$ = 70,128 denotes the total number of samples (covering 2015–2022). $\bar{C_{t_{0}}}$ represents the average concentration within the window $T_{w}$, as given by Equation (2). MV captures both the dispersion of ozone values and the information density within the window, as illustrated in Figure 4b.

M V = \frac{1}{N} \sum_{t_{0} = 1}^{N} \sqrt{\frac{\sum_{i = 1}^{T_{w}} {(C_{t_{0} + i} - \bar{C_{t_{0}}})}^{2}}{T_{w}}}

(1)

\bar{C_{t_{0}}} = \frac{1}{T_{w}} \sum_{i = 1}^{T_{w}} C_{t_{0} + i}

(2)
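As a rough illustration, Equations (1) and (2) can be evaluated as in the following sketch. One assumption is made that the equations do not spell out: the average is taken over the windows that fit completely inside the series, so the divisor is the number of complete windows rather than the full sample count. The function name `moving_variance` and the synthetic sine series are hypothetical.

```python
import math
from typing import Sequence


def moving_variance(series: Sequence[float], window: int = 36) -> float:
    """Average, over all complete windows, of the within-window standard deviation."""
    n_windows = len(series) - window + 1
    if n_windows < 1:
        raise ValueError("series is shorter than the window")
    total = 0.0
    for start in range(n_windows):
        chunk = series[start:start + window]
        mean = sum(chunk) / window  # Equation (2): window mean
        # Inner term of Equation (1): root of the mean squared deviation in the window.
        total += math.sqrt(sum((c - mean) ** 2 for c in chunk) / window)
    return total / n_windows


# Synthetic check: a daily sine wave changes far more within 36 h than a yearly one.
daily = [math.sin(2 * math.pi * t / 24) for t in range(240)]
yearly = [math.sin(2 * math.pi * t / 8760) for t in range(240)]
mv_daily = moving_variance(daily, window=36)
mv_yearly = moving_variance(yearly, window=36)
```

On these synthetic series, the daily component yields a much larger MV than the yearly one, which matches the qualitative behaviour the text attributes to short- and long-periodic IMFs.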

From the amplitude perspective, Figure 4a shows two distinct peaks at IMF 4–5 and IMF 13–14, with smaller amplitudes for other IMFs. From the frequency perspective, IMFs 1–3 are high-frequency components with periods under 1 day, especially IMF 3 at 0.5 days. IMFs 4–5 have a main period of 1 day, while IMFs 6–12 range from 1 day to 1 year. IMFs 13–14 have a main period of one year, and IMFs 15–16 exceed one year, indicating multi-year ozone variation trends.

From the MV perspective, Figure 4b shows a single-peak distribution, and IMF 4 is the most rapidly changing component within 36 h. The MV values of the IMFs after IMF 4 gradually decrease and tend towards zero.

In summary, the results from the frequency–amplitude analysis and the MV categorize the ICEEMDAN decompositions into three types: Noise-IMFs (1–2) exhibit high frequency and low amplitude without periodicity and are not used for prediction; short-periodic IMFs (3–12) show distinct periods and significant changes within 36 h, and are fitted using BiLSTM; long-periodic IMFs (13–16) and Res, which change little within 36 h, are predicted using the persistence model based on previous values.

It is worth noting that, as shown in Figure 4b, the MV of the IMFs and the Res following IMF4 gradually decrease and approach zero. There is no definitive threshold value to distinguish between short- and long-periodic IMFs. Also, additional experiments have demonstrated that the overall prediction results are minimally affected by whether the persistence model or the BiLSTM model is used to predict IMF11 to IMF13. This conclusion is discussed in detail in Section 4.3. Therefore, considering the significant annual periodicity exhibited by IMF13 and IMF14 (as shown in Figure 4a) and the overall prediction efficiency of the model, IMFs with periods shorter than one year are classified as short-periodic IMFs, while those with periods longer than one year are classified as long-periodic IMFs.
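The period-based part of this classification can be sketched as follows. The sketch uses a naive discrete Fourier transform from the standard library purely for illustration (a fast Fourier transform would normally be used on a series of this length), and it only separates short- from long-periodic components by main period. Noise IMFs, which the text identifies by their lack of periodicity, are not detected here. The function names and the one-year threshold parameter are assumptions of this sketch.

```python
import cmath
import math
from typing import List, Sequence, Tuple

HOURS_PER_YEAR = 365 * 24


def amplitude_spectrum(series: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (period in samples, sine amplitude) pairs from a naive DFT.

    The zero frequency and the Nyquist frequency are left out.
    """
    n = len(series)
    spectrum = []
    for k in range(1, (n - 1) // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series))
        spectrum.append((n / k, 2.0 * abs(coeff) / n))
    return spectrum


def dominant_period(series: Sequence[float]) -> float:
    """Period (in samples, i.e. hours for hourly data) with the largest amplitude."""
    period, _ = max(amplitude_spectrum(series), key=lambda pair: pair[1])
    return period


def classify_component(period_hours: float, long_threshold_hours: float = HOURS_PER_YEAR) -> str:
    """Label a non-noise component by its main period (threshold is an assumption)."""
    return "long-periodic" if period_hours >= long_threshold_hours else "short-periodic"


# Toy usage: a synthetic component with a 24-h cycle sampled hourly for 10 days.
daily_component = [math.sin(2 * math.pi * t / 24) for t in range(240)]
main_period = dominant_period(daily_component)
label = classify_component(main_period)
```

For the synthetic daily component, the main period comes out at 24 h and the label is short-periodic, so such a component would be assigned to the BiLSTM in the scheme described above.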

2.2.3. Time Step Setting

The selection of input time steps is crucial for training the BiLSTM model. Short input lengths may lack sufficient historical information, while overly long lengths can cause redundancy, affecting prediction efficiency. To quantify the contribution of the input ozone concentrations to the predictions, the autocorrelation function (ACF) and partial autocorrelation function (PACF) are calculated as shown in Figure 5. The ACF represents the correlation between a signal at different lagged time points, and the PACF reflects the correlation between any two time points, eliminating the linear effects of the points between them [46]. Since we focus on forecasting future 36-h ozone concentrations hourly, the lag time is set from 1 to 36 h.

For IMFs 3–8, both the ACF and PACF show a trailing behavior, indicating a long history of dependence [47]. For IMF 8, the ACF tends to zero at T+36, suggesting that forecasting beyond 36 h is difficult. Meanwhile, the PACF decays to zero within 36 h, suggesting that ozone from more than 36 h in the past no longer provides valid information. For IMFs 9–12, the high ACF and low PACF for most lag points indicate a strong correlation between T and T+36, but the contribution to forecasting decreases rapidly as the time interval increases. For IMF 12, the PACF is close to zero at T+36, indicating that the data from the past 36 moments are beneficial for forecasting T+1 and that the data at T are beneficial for forecasting T+36. To sum up, it is reasonable to set the time step at 36 h.

Additionally, for the Noise-IMFs (IMF1–2), the ACF and PACF are truncated rapidly within a very short lag time, indicating that the value at T contributes little to the prediction at T+36. It is impossible to accurately forecast the next 36 h using the information inherent in the sequence itself. For the long-periodic IMFs, the ACF remains close to 1 within 36 h. The PACF shows a truncation after lag = 1, indicating that the value at T+1 depends mainly on the value at T, and this strong dependence can be propagated to T+36. This suggests that there is no need to introduce long-periodic trends for forecasting and that the most recent moment is sufficient for accurate forecasting. Hence, the persistence model is used to predict long-periodic IMFs.
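For readers who want to reproduce this kind of lag analysis, the following sketch computes a sample ACF and a PACF with the Durbin-Levinson recursion, using only the standard library. It is an illustrative implementation, not the code used in the study, and the function names are assumptions.

```python
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series: Sequence[float], max_lag: int) -> List[float]:
    """Partial autocorrelation for lags 0..max_lag via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result = [1.0]
    prev: List[float] = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new partial autocorrelation.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result
```

Applied to a component whose next value depends mainly on the current one, this PACF is large at lag 1 and close to zero afterwards, which is the truncation pattern the text describes for long-periodic IMFs.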

2.3. Accuracy Evaluation

The goodness of fit between the predicted and observed ozone concentrations is assessed on hourly and daily scales. Our model provides a continuous 36-h prediction, and the maximum daily 8-h average ozone (MDA8 O₃) can be derived from the first 24 h (today) of midnight predictions or the last 24 h (the next day) of noon predictions. Evaluation metrics include the coefficient of determination (R²) and the root mean square error (RMSE).
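A minimal sketch of these metrics is given below. Two assumptions are made that the text does not specify: R² is computed as one minus the ratio of residual to total sum of squares (rather than as a squared correlation coefficient), and MDA8 is taken as the maximum of the 8-h running means that fit entirely within the supplied hourly values. Function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mda8(hourly: Sequence[float]) -> float:
    """Maximum 8-h running mean over the supplied hourly values (e.g. 24 values)."""
    if len(hourly) < 8:
        raise ValueError("need at least 8 hourly values")
    return max(sum(hourly[i:i + 8]) / 8 for i in range(len(hourly) - 7))


# Toy usage with made-up concentrations (µg/m³).
day = [0.0] * 8 + [80.0] * 8 + [0.0] * 8
daily_max_8h = mda8(day)
```

For a midnight forecast, `mda8` would be applied to the first 24 predicted hours; for a noon forecast, to the last 24 predicted hours, following the description above.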

Besides the concentration validation, high pollution episodes are also of concern in this study. China's air quality standards set the 1-h ozone limit at 200 µg/m³ and the MDA8 O₃ limit at 160 µg/m³ for Class II areas. Hour-pollution refers to events exceeding the 1-h ozone limit, while day-pollution refers to events exceeding the MDA8 O₃ limit. Based on the continuous 36-h prediction, we can forecast high pollution episodes on both hourly and daily scales and assess the high pollution forecast accuracy (HPFA) as the ratio of successfully forecasted pollution episodes to the total. The RMSE between the predicted and observed ozone concentrations during pollution events is calculated to reflect the accuracy of the model's pollution concentration predictions. Moreover, the hourly prediction can clearly identify the start and end times of pollution episodes. The accuracy of the model's timing predictions is evaluated by computing the mean absolute time deviation (MATD) and the mean time deviation (MTD) for the start time and end time of pollution events. The former reflects the uncertainty, while the latter indicates the systematic bias.
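The episode metrics can be sketched as follows. The exact definitions are not given in the text, so the sketch makes explicit assumptions: an hour counts as successfully forecast when both the observed and the predicted value exceed the limit, and time deviations are computed as observed minus predicted time, a sign convention inferred from how the results section interprets negative start-time deviations as delays. All function names are hypothetical.

```python
from typing import Sequence, Tuple

HOURLY_LIMIT = 200.0  # µg/m³, 1-h ozone limit quoted in the text


def hpfa(observed: Sequence[float], predicted: Sequence[float], limit: float = HOURLY_LIMIT) -> float:
    """Share of observed exceedances that the prediction also flags (assumed definition)."""
    events = [i for i, o in enumerate(observed) if o > limit]
    if not events:
        raise ValueError("no observed pollution events")
    hits = sum(1 for i in events if predicted[i] > limit)
    return hits / len(events)


def time_deviations(observed_hours: Sequence[float], predicted_hours: Sequence[float]) -> Tuple[float, float]:
    """Return (MATD, MTD) in hours, using observed minus predicted (assumed sign)."""
    diffs = [o - p for o, p in zip(observed_hours, predicted_hours)]
    matd = sum(abs(d) for d in diffs) / len(diffs)
    mtd = sum(diffs) / len(diffs)
    return matd, mtd


# Toy usage with made-up values: two observed exceedances, one of them forecast.
obs = [150.0, 210.0, 220.0, 190.0]
pred = [160.0, 205.0, 195.0, 180.0]
accuracy = hpfa(obs, pred)
# Start times (hour of day) of two episodes: observed vs. predicted.
matd, mtd = time_deviations([12.0, 14.0], [13.0, 13.0])
```

In the toy timing example the two deviations cancel, so MTD is zero while MATD is one hour, which illustrates why the text reports both: one captures systematic bias and the other captures spread.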

3. Results

3.1. Concentration Prediction

The predictive performance of the model was evaluated across the entire test dataset. As the analysis of the annual ozone concentration trends in Section 2.1 identified June as the peak period for ozone concentrations within a year, Figure 6a–d shows the predicted ozone concentrations for June 2023 compared to the actuals at T+1, T+12, T+24, and T+36. The predictions at T+1 almost coincide with the observed values, indicating excellent performance. As time extends, the gap between the predicted and observed ozone concentrations gradually increases, especially in concentration peaks and valleys. Nevertheless, the overall trend remains consistent with the observed values, and the model shows a reasonable response to pollution events. Figure 6e–f compares predicted and observed noon-MDA8 and midnight-MDA8. The performance of midnight-MDA8 is generally better than that of noon-MDA8, with a more timely response to pollution events.

Figure 7 illustrates the model’s prediction accuracy across various prediction durations in hourly forecasts and how ozone concentration levels affect prediction accuracy. It can be seen that the predictions show a strong correlation with the observations. The model achieves excellent performance at T+1, with an R² of 0.997 and an RMSE of only 2.65 µg/m³. Until T+12, the prediction accuracy maintains an R² greater than 0.93 and an RMSE less than 14 µg/m³. Even in long-term predictions at T+24 and T+36, the R² remains above 0.86, with an RMSE below 19 µg/m³. Additionally, it can be seen from Figure 7f that the model is generally more accurate in forecasting ozone concentrations that do not reach pollution levels. For samples that exceed the pollution threshold, the model’s predicted values tend to be lower than the actual values, as shown in Figure 7a–e.

Table 1 compares predictions from three models with varying components. For hourly prediction, the RMSE of the ICEEMDAN-BiLSTM-persistence model (our work) improves by 77% at T+1 (from 11.80 to 2.68 µg/m³) compared to the BiLSTM, and by 42% at T+36. The ICEEMDAN-BiLSTM also shows similar improvements, highlighting that the ICEEMDAN helps the BiLSTM capture the trends in ozone concentration sequences. Compared with ICEEMDAN-BiLSTM, our model further reduces the RMSE, from 3.10 to 2.68 µg/m³ at T+1 and from 20.03 to 18.70 µg/m³ at T+36, demonstrating the persistence model's capability for long-period IMFs.
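The percentages quoted here follow from the relative RMSE reduction with respect to the reference model. As a worked check using only the values stated in this paragraph (the last two percentages are derived here and are not quoted from Table 1):

```latex
\text{Improvement} = \frac{\mathrm{RMSE}_{\text{ref}} - \mathrm{RMSE}_{\text{new}}}{\mathrm{RMSE}_{\text{ref}}} \times 100\%

% BiLSTM -> ICEEMDAN-BiLSTM-persistence at T+1
\frac{11.80 - 2.68}{11.80} \times 100\% \approx 77.3\%

% ICEEMDAN-BiLSTM -> ICEEMDAN-BiLSTM-persistence at T+1
\frac{3.10 - 2.68}{3.10} \times 100\% \approx 13.5\%

% ICEEMDAN-BiLSTM -> ICEEMDAN-BiLSTM-persistence at T+36
\frac{20.03 - 18.70}{20.03} \times 100\% \approx 6.6\%
```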

Table 1 also indicates daily prediction performance, with R² for MDA8 O₃ exceeding 0.9 for both midnight and noon predictions, and low RMSEs of 5.70 µg/m³ and 13.27 µg/m³, respectively. The midnight prediction outperforms the noon prediction due to its proximity to the predicted day. Our model shows a 69% RMSE improvement for midnight and 61% for noon compared to the BiLSTM, while the persistence model contributes an additional 10% and 8% improvement, respectively, compared to the ICEEMDAN-BiLSTM.

In comparison with similar studies, our model achieves better accuracy for short- and long-term predictions. For instance, our 3-h RMSE of 5.268 µg/m³ is 74% lower than the 20.036 µg/m³ reported in [40], and in the 24-h prediction, our model's R² of 0.89 is 19% higher than the 0.75 reported in [32]. Additionally, our model's midnight prediction RMSE of 5.70 µg/m³ shows a 47% improvement over the 10.72 µg/m³ reported in [38]. Although the datasets and study areas differ, these results suggest that the ICEEMDAN-BiLSTM-persistence model performs favorably.

3.2. High Pollution Episodes Prediction

In daily life, people are concerned about the pollution level to ensure timely early warning. Using the ICEEMDAN-BiLSTM-persistence model, we forecast hour- and day-pollution episodes and assess their accuracy.

HPFA, MATD, and MTD are applied to assess the hour-pollution episodes. The HPFA is presented in Table 1. For the 165 hour-pollution events and 48 day-pollution events recorded in 2023, our model achieves over 60% HPFA for hour-pollution predictions made 24 h in advance and 98% HPFA for one-hour forecasts. With the BiLSTM alone, HPFA beyond 24 h drops below 10%, while adding the ICEEMDAN improves HPFA to 40% for 36-h forecasts.

Across midnight and noon predictions, our model achieves MATD of 1.75 and 1.95 h for pollution event start times, alongside 2.05 and 1.81 h for end times, indicating that the model has an uncertainty of about 2 h in predicting the start and end times of pollution events. Meanwhile, for midnight and noon predictions, the MTD shows −0.76 and −1.26 h for pollution start times, and 0.48 and 0.36 h for end times. This suggests systematic delays in predicting the start times of pollution events and minor early estimates for end times. However, overall, the model demonstrates satisfactory predictive accuracy.

For day-pollution episodes, our model achieves an HPFA of 67% in noon predictions, compared to 17% for the BiLSTM and 65% for the ICEEMDAN-BiLSTM. In midnight predictions, the HPFA reaches 90%, slightly surpassing ICEEMDAN-BiLSTM’s 87% and ahead of BiLSTM’s 79%.

3.3. Cases in Different Seasons

Ozone concentration in Beijing shows a clear annual cycle, peaking in spring and summer with higher levels and more pollution events, while autumn and winter have lower concentrations. The study area, Beijing, located in northern China, has a temperate monsoon climate. March to August is spring and summer, and September to February is autumn and winter.

The ICEEMDAN-BiLSTM-persistence model exhibits strong predictive performance for high pollution episodes, particularly in spring and summer. For example, Figure 8a shows the ozone concentration prediction for 1 July 2023, made at midnight on 1 July. The prediction indicates that ozone concentrations reached pollution levels for 9 h, from 13:00 to 22:00 on 1 July, while the actual pollution event lasted for 11 h, from 12:00 to 23:00. The forecast timing closely aligns with the actual occurrence, with only a 1-h difference in the start and end times. The predicted peak of 277.12 µg/m³ is about 5.0% higher than the actual peak of 264 µg/m³. Additionally, the model accurately captured a sharp drop in ozone levels on 18 June 2023, as shown in Figure 8b: the input data for 16 and 17 June contained ozone pollution events, whereas the concentration dropped sharply on 18 June. This reflects the model's accuracy in forecasting sudden fluctuations.

Figure 8c,d show the prediction results for autumn and winter. During these seasons, near-surface ozone concentrations are typically low, showing a more stable daily variation trend and fewer pollution events. As a result, the model performs exceptionally well during this period. The seasonal differences in model performance are reflected numerically in the MDA8 O₃ accuracy, with slightly lower accuracy observed in spring and summer compared to autumn and winter (spring and summer RMSE = 7.21 µg/m³; autumn and winter RMSE = 5.70 µg/m³). In summary, the model performs well across all four seasons.

4. Discussion

4.1. Auxiliary Variables

Research has shown that meteorological conditions and pollutant variables may strongly correlate with ozone concentration [48,49,50,51]. Many studies on pollutant prediction have considered the role of auxiliary variables [32,52], which can enhance the accuracy of model predictions. Data from the same monitoring site including AQI, PM2.5, PM10, SO₂, NO₂, O₃, and CO, and meteorological data from NOAA 2-m dew point temperature (d2m), surface pressure (sp), 2-m temperature (t2m), total column water (tcw), total precipitation (tp), 10-m wind speed U (10 u), and 10-m wind speed V (10 v), are collected and analyzed for their correlation with the ozone concentration sequence, as shown in Figure 9.

Two distinct regions of strong mutual correlation are evident in Figure 9: one in the meteorological data, involving d2m, t2m, and tcw (with t2m showing the strongest correlation with ozone at 0.65), and another in the pollutant data, involving AQI, PM2.5, PM10, SO₂, NO₂, and CO (with NO₂ having the strongest correlation with ozone at −0.55). To avoid redundancy in auxiliary variables, we select t2m as the representative meteorological variable and NO₂ as the representative pollutant variable. Furthermore, we apply ICEEMDAN to these two variables. The impact of meteorological factors and pollutant concentrations on prediction is assessed through three sets of experiments: one without auxiliary variables, one with auxiliary variables, and one with decomposed auxiliary variables.

The results, presented in Table 2, indicate that adding auxiliary variables, whether in their original form or decomposed, does not effectively improve the model's accuracy. To investigate the underlying reasons, we computed partial correlation coefficients between ozone concentration and auxiliary variables (t2m and NO₂) at various time lags, controlling for intermediate ozone concentrations. Results show that while t2m at time T is positively correlated with ozone concentration at T+1 (0.65), T+3 (0.60), T+12 (0.37), T+24 (0.63), and T+36 (0.35), the partial correlations between t2m at T and ozone at T+3, T+12, T+24, and T+36 are near zero (0.012, 0.11, 0.08, and 0.05). Similarly, NO₂ concentrations at T are negatively correlated with ozone at T+1 (−0.52), T+3 (−0.43), T+12 (−0.11), T+24 (−0.34), and T+36 (−0.05), whereas the partial correlations at T+3, T+12, T+24, and T+36 are 0.012, 0.017, 0.032, and 0.004, which are close to zero. These results indicate that the initial correlations between auxiliary variables and future ozone concentrations are largely explained by the intermediate ozone concentrations. NO₂ and t2m at the current time step therefore provide limited predictive value for ozone concentrations beyond short time horizons, particularly after T+3, and including auxiliary variables in long-term predictions of ozone concentration up to T+36 may be redundant for the model.
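The idea behind this analysis can be illustrated with a first-order partial correlation, which removes the linear effect of a single control variable. This is a simplified sketch: the study controls for intermediate ozone concentrations, which may involve several control variables, whereas the formula below handles only one. The function names and the synthetic data are hypothetical.

```python
import math
import random
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of one control z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Synthetic chain x -> z -> y: x influences y only through the intermediate z.
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
z = [a + random.gauss(0.0, 1.0) for a in x]
y = [b + random.gauss(0.0, 1.0) for b in z]
raw = pearson(x, y)
partial = partial_correlation(x, y, z)
```

In the synthetic chain, the raw correlation between `x` and `y` is clearly positive while the partial correlation given `z` is close to zero, mirroring the pattern reported above for t2m and NO₂ once intermediate ozone is controlled for.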

4.2. The Importance of Different IMFs

The prediction accuracy of each IMF and its effect on the overall performance are different. Each intrinsic mode function’s RMSE (IMFE) is calculated, along with the ratio of IMFE to the overall RMSE, as illustrated in Figure 10. Figure 10a shows the IMFE across different prediction durations, and all IMFEs increase as the prediction period extends. Notably, during the period (T+6, T+36), IMF 4, with a main period of 1 day, peaks in error, indicating its significant influence on long-term predictions.

Figure 10b shows the ratio of IMFE to the overall RMSE, and this ratio reflects the extent to which each IMF contributes to the overall prediction error. At T+1, the IMFE of all IMFs remains at a relatively low level, with IMF3–4 showing slightly higher values. For long-term predictions within (T+6, T+36), IMF 4 consistently contributes the most to the overall error, suggesting that enhancing long-term prediction accuracy primarily relies on the accuracy of IMF 4. Furthermore, as the prediction period extends, the contribution of IMFs with main periods greater than 1 day (IMFs 5–12) gradually increases (green line to blue line), indicating that to improve long-term prediction accuracy further, it is also necessary to pay attention to the IMFs with main periods greater than 1 day.

4.3. Selection of IMFs’ Classification Thresholds

We classified the non-noise IMFs into short-periodic IMFs and long-periodic IMFs based on frequency and MV in Section 2.2.2, and applied different prediction models for them. Therefore, the selection of IMFs’ classification thresholds is of great significance. In Figure 4b, the MV of IMFs and Res after IMF4 gradually decreases and approaches zero. The smaller the MV, the less the change of the IMF within 36 h, and the more suitable it is for prediction using the persistence model. However, since the decline of MV is a gradual process, it is difficult to accurately determine the threshold for classification. Thus, we attempted various classification methods and demonstrated the impact of different classification methods on prediction accuracy. The experimental results are presented in Table 3.

As indicated by the results in Table 3, the prediction outcomes are similar when the classification threshold is set between IMF11 and IMF14. This suggests that for the IMFs (IMF11–13) with monthly and quarterly main cycles, employing either the BiLSTM model or the persistence model for prediction has a negligible impact on the overall prediction results. However, when the classification threshold is set to IMF8, the RMSE at T+36 increases by 36.4% compared to that with the boundary set at IMF13. When the classification threshold is set to the maximum (all IMFs and Res are predicted using the BiLSTM model), the RMSE at T+36 rises by 5.7% compared to that with the threshold at IMF13. The prediction performance deteriorates when the classification threshold is set either too high or too low. The inferred reason is that the persistence model is not suitable for predicting IMFs with large variations within the prediction period, while the BiLSTM model may struggle to capture the signal characteristics of components with very low magnitudes, such as Res [37].

4.4. O₃-Associated Health Effects

June is a peak period for both ozone pollution and sports activity. In June 2023, Beijing hosted 19 sports events with over 100,000 participants, according to the Beijing Sports Competitions Administration and International Exchange Center (https://www.bjcac.org.cn/, accessed on 3 July 2025). In the same month, the China Meteorological Administration reported 11 days with hourly ozone pollution events at the Wanshouxigong monitoring site. On these days, ozone concentrations exceeded 200 µg/m³ for 2 to 12 h per day (7.2 h on average), mostly in the afternoon and evening, which coincides with peak times for sports activities and commuting. This overlap poses a serious threat to public health. Hourly forecasting can capture the complex intra-day variations in ozone concentrations, and a warning issued one day in advance gives event organizers time to plan and implement measures to reduce exposure.

The 2023 International Football Invitational was held at the Beijing Workers Stadium on 15 June 2023, with over 60,000 participants. Figure 11 shows the distribution of monitoring sites across Beijing and the observed MDA8 O₃ concentrations on 15 June. On that day, all stations experienced pollution events, with MDA8 O₃ concentrations significantly exceeding the pollution threshold of 160 µg/m³. The Beijing Workers Stadium is located in the urban area, just 2.5 km from the nearest site, Nongzhanguan. We therefore used the ozone concentrations monitored at the Nongzhanguan site as the observations for the area around the stadium.
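For readers unfamiliar with the metric, MDA8 O₃ is the maximum daily 8-h average ozone concentration. The Python sketch below shows one common way to compute it from 24 hourly values and compare it with the 160 µg/m³ threshold quoted above. The windowing convention (only 8-h windows lying fully within the day) and the function names are assumptions; operational definitions differ in how they treat windows that cross midnight and missing data.

```python
MDA8_LIMIT = 160.0  # µg/m³, pollution threshold quoted in the text


def mda8(hourly):
    """Maximum of the 8-h running means over one day of 24 hourly values.

    Assumption: only windows that start and end within the same day are used.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    window = 8
    means = [sum(hourly[i:i + window]) / window for i in range(24 - window + 1)]
    return max(means)


def is_polluted_day(hourly):
    """True if the day's MDA8 exceeds the threshold."""
    return mda8(hourly) > MDA8_LIMIT


# Hypothetical day: a clean morning, an 8-h afternoon plateau, a clean night.
example_day = [50.0] * 12 + [300.0] * 8 + [50.0] * 4
example_mda8 = mda8(example_day)
```

For the hypothetical day above, the best 8-h window covers the whole plateau, so the MDA8 equals the plateau value and the day is flagged as polluted.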

On 15 June 2023, the area around the Beijing Workers Stadium, which hosted the 2023 International Football Invitational, experienced 11 h of ozone concentrations above the hourly pollution limit of 200 µg/m³ (13:00 to 24:00), with an MDA8 O₃ of 284 µg/m³, far above the pollution limit of 160 µg/m³. Over 60,000 attendees were exposed to short-term ozone pollution during this event. We assess the negative health effects of this episode statistically using Equation (3), which is widely used to estimate the premature deaths attributable to short-term ozone exposure [53]. In this equation, BMR is the baseline all-cause mortality rate, pop is the number of people involved in the event, β is the concentration response factor, c is the ozone concentration, and c₀ is the baseline concentration above which O₃-related health effects are considered. To align with Chinese health assessments, the baseline mortality rate is taken from the China National Disease Surveillance Points (DSP). The baseline O₃ concentration is set at 65 µg/m³, in accordance with Chinese health studies and WHO guidelines [54,55]. The value of β is taken from relevant studies [53,56]. The results indicate that, among the 60,000 participants in this event, 2.9 premature deaths would be attributable to short-term ozone exposure.

$$\Delta \mathrm{Mort} = \mathrm{Mort} \times \mathrm{BMR}_{d} \times \mathrm{pop} \times \left(1 - \exp\left(-(c - c_{0}) \times \beta\right)\right) \qquad (3)$$
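The exposure-response relationship can be sketched in Python as follows. The sketch evaluates only the factors defined in the text (BMR, pop, β, c, c₀) and treats any further scaling term in the printed equation as 1, which is an assumption. The numerical values of `BMR_EXAMPLE` and `BETA_EXAMPLE` are hypothetical placeholders, not the values used in the study or in refs. [53,56], so the outputs are illustrative only. Setting the effect to zero at or below c₀ is also an assumption.

```python
import math


def excess_deaths(bmr, pop, beta, c, c0=65.0):
    """Premature deaths attributable to exposure at concentration c (µg/m³).

    bmr  : baseline mortality rate for the exposure period (per person)
    pop  : number of exposed people
    beta : concentration response factor (per µg/m³)
    c0   : baseline concentration; exposure at or below c0 adds nothing
           (this clipping is an assumption of the sketch)
    """
    delta_c = max(c - c0, 0.0)
    return bmr * pop * (1.0 - math.exp(-beta * delta_c))


# Hypothetical placeholder inputs, NOT taken from the paper.
BMR_EXAMPLE = 2.0e-5
BETA_EXAMPLE = 5.0e-4
POP = 60_000  # attendance quoted in the text

observed = excess_deaths(BMR_EXAMPLE, POP, BETA_EXAMPLE, c=284.0)
controlled = excess_deaths(BMR_EXAMPLE, POP, BETA_EXAMPLE, c=160.0)
print(f"ratio controlled/observed: {controlled / observed:.2f}")
```

Because β multiplied by (c − c₀) is small in this sketch, the response is nearly linear in the excess concentration, so the ratio between two scenarios depends mainly on their excess concentrations and only weakly on the placeholder values.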

We attempted to predict the high pollution episode on 15 June using our model. The Nongzhanguan and Wanshouxigong sites are both urban sites, and we assume that they have similar ozone variation characteristics. Therefore, we used the model trained with data from the Wanshouxigong site to predict the ozone concentration at the Nongzhanguan site for the year 2023. The R² between the predicted and observed values at T+1, T+12, and T+36 is 0.997, 0.923, and 0.837, respectively. Although the accuracy is somewhat lower than at the Wanshouxigong site, the overall prediction performance is still good. Notably, on the afternoon of 14 June our model predicted that the MDA8 O₃ on 15 June would exceed the pollution threshold (252.1 µg/m³, 11.2% lower than the observed value of 284 µg/m³). The pollution was forecast to start at 12:00 (which matched the observations) and end at 23:00 (it actually ended at 24:00). If the relevant departments had promptly implemented emergency emission controls based on this early warning and kept ozone concentrations below the standard level (160 µg/m³) on 15 June, the estimated number of premature deaths would have decreased to 1.3, a 55% reduction.
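The figures quoted in this paragraph can be reproduced with a few lines of Python. All inputs are taken from the text; the variable names are illustrative.

```python
MDA8_LIMIT = 160.0       # µg/m³, pollution threshold

observed_mda8 = 284.0    # µg/m³, observed on 15 June 2023
forecast_mda8 = 252.1    # µg/m³, forecast issued on 14 June

# Relative underestimate of the forecast with respect to the observation.
underestimate_pct = (observed_mda8 - forecast_mda8) / observed_mda8 * 100.0

# A warning would be issued because the forecast exceeds the threshold.
warning_issued = forecast_mda8 > MDA8_LIMIT

deaths_observed = 2.9    # estimated premature deaths without controls
deaths_controlled = 1.3  # estimated premature deaths at 160 µg/m³
reduction_pct = (deaths_observed - deaths_controlled) / deaths_observed * 100.0

print(f"underestimate: {underestimate_pct:.1f}%")
print(f"reduction in attributable deaths: {reduction_pct:.0f}%")
```

Although the forecast underestimates the peak, it remains well above the 160 µg/m³ threshold, so the warning decision itself is unaffected by the error.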

5. Conclusions

This study proposes a model for predicting ozone concentration based on ICEEMDAN, the BiLSTM model, and the persistence model. A key step in this model is using the ICEEMDAN method to decompose the ozone concentration sequence into multiple components, so that either the BiLSTM model or the persistence model can be selected for each component according to its characteristics.

For the Wanshouxigong site in Beijing, the model achieved an R² of 0.86 and an RMSE of 18.70 µg/m³ at T+36, which is superior to existing research results, and the 36-h lead time is sufficient for decision-makers to implement relevant policies and actions. Additionally, the model is sensitive to peak ozone levels in spring and summer, and it performs well in predicting sudden changes in ozone concentration.

For long-term ozone concentration prediction, we draw the following conclusions:

(1): The persistence model demonstrates superior accuracy and speed in predicting long-periodic IMFs and Res compared to the BiLSTM model, owing to its simplicity and direct reliance on historical data, which better captures these gradually changing trends.
(2): In the 36-h-ahead prediction, the prediction accuracy of IMF4 (with a period of 24 h) largely determines the overall prediction accuracy.
(3): Adding meteorological factors (t2m) and precursors (NO₂) as auxiliary variables to the model has no significant impact on the prediction accuracy.
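The component-wise strategy summarized in these conclusions can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the persistence forecast is assumed to repeat the last observed value, and `seasonal_naive` is only a stand-in for the trained BiLSTM so that the example runs without a deep learning library. All function names and the toy components are hypothetical.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def persistence_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value for every future step (assumed form)."""
    return [history[-1]] * horizon


def seasonal_naive(history: Sequence[float], horizon: int, period: int = 24) -> List[float]:
    """Stand-in for the short-periodic model: repeat the last 24-h cycle."""
    return [history[-period + (h % period)] for h in range(horizon)]


def hybrid_forecast(
    components: Sequence[Sequence[float]],
    first_persistence_index: int,
    short_model: Forecaster,
    horizon: int = 36,
) -> List[float]:
    """Forecast each component separately and sum the results.

    components are ordered IMF1, IMF2, ..., Res. Components whose 1-based
    index is >= first_persistence_index use the persistence model; the
    others use short_model.
    """
    total = [0.0] * horizon
    for idx, history in enumerate(components, start=1):
        use_persistence = idx >= first_persistence_index
        model = persistence_forecast if use_persistence else short_model
        forecast = model(history, horizon)
        if len(forecast) != horizon:
            raise ValueError("forecaster returned the wrong number of steps")
        total = [t + f for t, f in zip(total, forecast)]
    return total


# Toy decomposition: one 24-h sawtooth component and one flat component.
toy_components = [[float(i % 24) for i in range(48)], [5.0] * 48]
result = hybrid_forecast(toy_components, first_persistence_index=2,
                         short_model=seasonal_naive, horizon=36)
```

Passing the threshold as a parameter makes it straightforward to repeat the comparison in Table 3 by looping over candidate values of `first_persistence_index`.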

Author Contributions

T.L.: writing—original draft, visualization, validation, methodology, formal analysis, data curation, and conceptualization. Y.Y.: investigation and visualization. Z.Z.: investigation and data curation. J.Y.: methodology, conceptualization, project administration, funding acquisition, visualization, and resources. S.L.: methodology, conceptualization, funding acquisition, and project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 42205129) and supported by the Youth Project from the Hubei Research Center for Basic Disciplines of Earth Sciences (No. HRCES-202408), and the Open Fund of Hubei Luojia Laboratory (No. 250100011 and No. 250100008).

Data Availability Statement

The data in this study can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kinney, P.L.; Ware, J.H.; Spengler, J.D. A Critical Evaluation of Acute Ozone Epidemiology Results. Arch. Environ. Health 1988, 43, 168–173.
2. Mirowsky, J.E.; Carraway, M.S.; Dhingra, R.; Tong, H.; Neas, L.; Diaz-Sanchez, D.; Cascio, W.; Case, M.; Crooks, J.; Hauser, E.R.; et al. Ozone Exposure Is Associated with Acute Changes in Inflammation, Fibrinolysis, and Endothelial Cell Function in Coronary Artery Disease Patients. Environ. Health 2017, 16, 126.
3. Lu, X.; Zhang, L.; Wang, X.; Gao, M.; Li, K.; Zhang, Y.; Yue, X.; Zhang, Y. Rapid Increases in Warm-Season Surface Ozone and Resulting Health Impact in China Since 2013. Environ. Sci. Technol. Lett. 2020, 7, 240–247.
4. Li, K.; Jacob, D.J.; Shen, L.; Lu, X.; De Smedt, I.; Liao, H. Increases in Surface Ozone Pollution in China from 2013 to 2019: Anthropogenic and Meteorological Influences. Atmos. Chem. Phys. 2020, 20, 11423–11433.
5. Pell, E.J.; Schlagnhaufer, C.D.; Arteca, R.N. Ozone-Induced Oxidative Stress: Mechanisms of Action and Reaction. Physiol. Plant. 1997, 100, 264–273.
6. Feng, Z.; Sun, J.; Wan, W.; Hu, E.; Calatayud, V. Evidence of Widespread Ozone-Induced Visible Injury on Plants in Beijing, China. Environ. Pollut. 2014, 193, 296–301.
7. Zhao, S.; Yin, D.; Yu, Y.; Kang, S.; Qin, D.; Dong, L. PM2.5 and O₃ Pollution during 2015–2019 over 367 Chinese Cities: Spatiotemporal Variations, Meteorological and Topographical Impacts. Environ. Pollut. 2020, 264, 114694.
8. Li, W.; Shao, L.; Wang, W.; Li, H.; Wang, X.; Li, Y.; Li, W.; Jones, T.; Zhang, D. Air Quality Improvement in Response to Intensified Control Strategies in Beijing during 2013–2019. Sci. Total Environ. 2020, 744, 140776.
9. Chen, W.; Tang, H.; Zhao, H. Diurnal, Weekly and Monthly Spatial Variations of Air Pollutants and Air Quality of Beijing. Atmos. Environ. 2015, 119, 21–34.
10. Chen, S.; Wang, H.; Lu, K.; Zeng, L.; Hu, M.; Zhang, Y. The Trend of Surface Ozone in Beijing from 2013 to 2019: Indications of the Persisting Strong Atmospheric Oxidation Capacity. Atmos. Environ. 2020, 242, 117801.
11. Xia, N.; Du, E.; Guo, Z.; de Vries, W. The Diurnal Cycle of Summer Tropospheric Ozone Concentrations across Chinese Cities: Spatial Patterns and Main Drivers. Environ. Pollut. 2021, 286, 117547.
12. Hu, J.; Chen, J.; Ying, Q.; Zhang, H. One-Year Simulation of Ozone and Particulate Matter in China Using WRF/CMAQ Modeling System. Atmos. Chem. Phys. 2016, 16, 10333–10350.
13. Wang, P.; Qiao, X.; Zhang, H. Modeling PM2.5 and O₃ with Aerosol Feedbacks Using WRF/Chem over the Sichuan Basin, Southwestern China. Chemosphere 2020, 254, 126735.
14. Wei, W.; Li, Y.; Ren, Y.; Cheng, S.; Han, L. Sensitivity of Summer Ozone to Precursor Emission Change over Beijing during 2010–2015: A WRF-Chem Modeling Study. Atmos. Environ. 2019, 218, 116984.
15. Sharma, A.; Ojha, N.; Pozzer, A.; Mar, K.A.; Beig, G.; Lelieveld, J.; Gunthe, S.S. WRF-Chem Simulated Surface Ozone over South Asia during the Pre-Monsoon: Effects of Emission Inventories and Chemical Mechanisms. Atmos. Chem. Phys. 2017, 17, 14393–14413.
16. Guo, Y.; Roychoudhury, C.; Mirrezaei, M.A.; Kumar, R.; Sorooshian, A.; Arellano, A.F. Investigating Ground-Level Ozone Pollution in Semi-Arid and Arid Regions of Arizona Using WRF-Chem v4.4 Modeling. Geosci. Model Dev. 2024, 17, 4331–4353.
17. Yafouz, A.; Ahmed, A.N.; Zaini, N.; El-Shafie, A. Ozone Concentration Forecasting Based on Artificial Intelligence Techniques: A Systematic Review. Water Air Soil Pollut. 2021, 232, 79.
18. Sun, W.; Palazoglu, A.; Singh, A.; Zhang, H.; Wang, Q.; Zhao, Z.; Cao, D. Prediction of Surface Ozone Episodes Using Clusters Based Generalized Linear Mixed Effects Models in Houston–Galveston–Brazoria Area, Texas. Atmos. Pollut. Res. 2015, 6, 245–253.
19. Sousa, S.I.V.; Martins, F.G.; Alvim-Ferraz, M.C.M.; Pereira, M.C. Multiple Linear Regression and Artificial Neural Networks Based on Principal Components to Predict Ozone Concentrations. Environ. Model. Softw. 2007, 22, 97–103.
20. Ghazali, N.A.; Ramli, N.A.; Yahaya, A.S.; Yusof, N.F.F.M.; Sansuddin, N.; Al Madhoun, W.A. Transformation of Nitrogen Dioxide into Ozone and Prediction of Ozone Concentrations Using Multiple Linear Regression Techniques. Environ. Monit. Assess. 2010, 165, 475–489.
21. Balogun, A.-L.; Tella, A. Modelling and Investigating the Impacts of Climatic Variables on Ozone Concentration in Malaysia Using Correlation Analysis with Random Forest, Decision Tree Regression, Linear Regression, and Support Vector Regression. Chemosphere 2022, 299, 134250.
22. Luna, A.S.; Paredes, M.L.L.; de Oliveira, G.C.G.; Corrêa, S.M. Prediction of Ozone Concentration in Tropospheric Levels Using Artificial Neural Networks and Support Vector Machine at Rio de Janeiro, Brazil. Atmos. Environ. 2014, 98, 98–104.
23. Xing, J.; Wang, S.X.; Jang, C.; Zhu, Y.; Hao, J.M. Nonlinear Response of Ozone to Precursor Emission Changes in China: A Modeling Study Using Response Surface Methodology. Atmos. Chem. Phys. 2011, 11, 5027–5044.
24. Song, G.; Li, S.; Xing, J.; Yang, J.; Dong, L.; Lin, H.; Teng, M.; Hu, S.; Qin, Y.; Zeng, X. Surface UV-Assisted Retrieval of Spatially Continuous Surface Ozone with High Spatial Transferability. Remote Sens. Environ. 2022, 274, 112996.
25. Biancofiore, F.; Verdecchia, M.; Di Carlo, P.; Tomassetti, B.; Aruffo, E.; Busilacchio, M.; Bianco, S.; Di Tommaso, S.; Colangeli, C. Analysis of Surface Ozone Using a Recurrent Neural Network. Sci. Total Environ. 2015, 514, 379–387.
26. Guo, Q.; He, Z.; Wang, Z. Assessing the Effectiveness of Long Short-Term Memory and Artificial Neural Network in Predicting Daily Ozone Concentrations in Liaocheng City. Sci. Rep. 2025, 15, 6798.
27. Pak, U.; Kim, C.; Ryu, U.; Sok, K.; Pak, S. A Hybrid Model Based on Convolutional Neural Networks and Long Short-Term Memory for Ozone Concentration Prediction. Air Qual. Atmos. Health 2018, 11, 883–895.
28. Salman, A.K.; Choi, Y.; Singh, D.; Kayastha, S.G.; Dimri, R.; Park, J. Temporal CNN-Based 72-h Ozone Forecasting in South Korea: Explainability and Uncertainty Quantification. Atmos. Environ. 2025, 343, 120987.
29. Mu, X.; Wang, S.; Jiang, P.; Wu, Y. Estimation of Surface Ozone Concentration over Jiangsu Province Using a High-Performance Deep Learning Model. J. Environ. Sci. 2023, 132, 122–133.
30. Sayeed, A.; Choi, Y.; Eslami, E.; Lops, Y.; Roy, A.; Jung, J. Using a Deep Convolutional Neural Network to Predict 2017 Ozone Concentrations, 24 Hours in Advance. Neural Netw. 2020, 121, 396–408.
31. Eslami, E.; Choi, Y.; Lops, Y.; Sayeed, A. A Real-Time Hourly Ozone Prediction System Using Deep Convolutional Neural Network. Neural Comput. Appl. 2020, 32, 8783–8797.
32. Zavala-Romero, O.; Segura-Chavez, P.A.; Camacho-Gonzalez, P.; Zavala-Hidalgo, J.; Garcia, A.R.; Oropeza-Alfaro, P.; Romero-Centeno, R.; Gomez-Ramos, O. Operational Ozone Forecasting System in Mexico City: A Machine Learning Framework Integrating Forecasted Weather and Historical Ozone Data. Atmos. Environ. 2025, 344, 121017.
33. Wang, Z.; Li, J.; Liang, L. Spatio-Temporal Evolution of Ozone Pollution and Its Influencing Factors in the Beijing-Tianjin-Hebei Urban Agglomeration. Environ. Pollut. 2020, 256, 113419.
34. Wang, Z.; Li, Y.; Chen, T.; Zhang, D.; Sun, F.; Wei, Q.; Dong, X.; Sun, R.; Huan, N.; Pan, L. Ground-Level Ozone in Urban Beijing over a 1-Year Period: Temporal Variations and Relationship to Atmospheric Oxidation. Atmos. Res. 2015, 164–165, 110–117.
35. Cheng, N.; Li, R.; Xu, C.; Chen, Z.; Chen, D.; Meng, F.; Cheng, B.; Ma, Z.; Zhuang, Y.; He, B.; et al. Ground Ozone Variations at an Urban and a Rural Station in Beijing from 2006 to 2017: Trend, Meteorological Influences and Formation Regimes. J. Clean. Prod. 2019, 235, 11–20.
36. Teng, M.; Li, S.; Yang, J.; Wang, S.; Fan, C.; Ding, Y.; Dong, J.; Lin, H.; Wang, S. Long-Term PM2.5 Concentration Prediction Based on Improved Empirical Mode Decomposition and Deep Neural Network Combined with Noise Reduction Auto-Encoder- A Case Study in Beijing. J. Clean. Prod. 2023, 428, 139449.
37. Teng, M.; Li, S.; Xing, J.; Song, G.; Yang, J.; Dong, J.; Zeng, X.; Qin, Y. 24-Hour Prediction of PM2.5 Concentrations by Combining Empirical Mode Decomposition and Bidirectional Long Short-Term Memory Neural Network. Sci. Total Environ. 2022, 821, 153276.
38. Cheng, Y.; He, L.-Y.; Huang, X.-F. Development of a High-Performance Machine Learning Model to Predict Ground Ozone Pollution in Typical Cities of China. J. Environ. Manag. 2021, 299, 113670.
39. Zhu, S.; Wang, X.; Shi, N.; Lu, M. CEEMD-Subset-OASVR-GRNN for Ozone Forecasting: Xiamen and Harbin as Cases. Atmos. Pollut. Res. 2020, 11, 744–754.
40. Chu, Y.; Yao, J.; Qiao, D.; Zhang, Z.; Zhong, C.; Tang, L. Three-Hourly PM2.5 and O₃ Concentrations Prediction Based on Time Series Decomposition and LSTM Model with Attention Mechanism. Atmos. Pollut. Res. 2023, 14, 101879.
41. Bhardwaj, D.; Ragiri, P.R. A Deep Learning Approach to Enhance Air Quality Prediction: Comparative Analysis of LSTM, LSTM with Attention Mechanism and BiLSTM. In Proceedings of the 2024 IEEE Region 10 Symposium (TENSYMP), New Delhi, India, 27–29 September 2024; pp. 1–8.
42. Jung, J.; Broadwater, R.P. Current Status and Future Advances for Wind Speed and Power Forecasting. Renew. Sustain. Energy Rev. 2014, 31, 762–777.
43. Wang, S.; Hao, J. Air Quality Management in China: Issues, Challenges, and Options. J. Environ. Sci. 2012, 24, 2–13.
44. Dehghan, Y.; Sadrinasab, M.; Chegini, V. Empirical Mode Decomposition and Fourier Analysis of Caspian Sea Level’s Time Series. Ocean Eng. 2022, 252, 111114.
45. Box, G.E.P.; Jenkins, G.M. Time Series Analysis: Forecasting and Control, 3rd ed.; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1994; ISBN 978-0-13-060774-4.
46. Bögl, M.; Aigner, W.; Filzmoser, P.; Lammarsch, T.; Miksch, S.; Rind, A. Visual Analytics for Model Selection in Time Series Analysis. IEEE Trans. Vis. Comput. Graph. 2013, 19, 2237–2246.
47. Kumar, U.; De Ridder, K. GARCH Modelling in Association with FFT–ARIMA to Forecast Ozone Episodes. Atmos. Environ. 2010, 44, 4252–4265.
48. Liu, P.; Song, H.; Wang, T.; Wang, F.; Li, X.; Miao, C.; Zhao, H. Effects of Meteorological Conditions and Anthropogenic Precursors on Ground-Level Ozone Concentrations in Chinese Cities. Environ. Pollut. 2020, 262, 114366.
49. Chen, Y.; Zhu, L.; Wang, S.; Henze, D.K.; Fu, T.-M.; Zhang, L.; Wang, X. Unraveling the Complexities of Ozone and PM2.5 Pollution in the Pearl River Delta Region: Impacts of Precursors Emissions and Meteorological Factors, and Effective Mitigation Strategies. Atmos. Pollut. Res. 2025, 16, 102368.
50. Chen, Z.; Zhuang, Y.; Xie, X.; Chen, D.; Cheng, N.; Yang, L.; Li, R. Understanding Long-Term Variations of Meteorological Influences on Ground Ozone Concentrations in Beijing During 2006–2016. Environ. Pollut. 2019, 245, 29–37.
51. Su, J.; Jiao, L.; Xu, G. Intensified Exposure to Compound Extreme Heat and Ozone Pollution in Summer across Chinese Cities. npj Clim. Atmos. Sci. 2025, 8, 78.
52. Lyu, Y.; Ju, Q.; Lv, F.; Feng, J.; Pang, X.; Li, X. Spatiotemporal Variations of Air Pollutants and Ozone Prediction Using Machine Learning Algorithms in the Beijing-Tianjin-Hebei Region from 2014 to 2021. Environ. Pollut. 2022, 306, 119420.
53. Li, S.; Song, G.; Xing, J.; Dong, J.; Zhang, M.; Fan, C.; Meng, S.; Yang, J.; Dong, L.; Gong, W. Unraveling Overestimated Exposure Risks through Hourly Ozone Retrievals from Next-Generation Geostationary Satellites. Nat. Commun. 2025, 16, 3364.
54. Fu, G.; Cheng, H.; Lu, Q.; Liu, H.; Zhang, X.; Zhang, X. The Synergistic Effect of High Temperature and Ozone on the Number of Deaths from Circulatory System Diseases in Shijiazhuang, China. Front. Public Health 2023, 11, 1266643.
55. Xue, T.; Geng, G.; Meng, X.; Xiao, Q.; Zheng, Y.; Gong, J.; Liu, J.; Wan, W.; Zhang, Q.; Kan, H.; et al. New WHO Global Air Quality Guidelines Help Prevent Premature Deaths in China. Natl. Sci. Rev. 2022, 9, nwac055.
56. Sun, Q.; Wang, W.; Chen, C.; Ban, J.; Xu, D.; Zhu, P.; He, M.Z.; Li, T. Acute Effect of Multiple Ozone Metrics on Mortality by Season in 34 Chinese Counties in 2013–2015. J. Intern. Med. 2018, 283, 481–488.

Figure 1. Characteristics of ozone concentration variations. The data from (a–c) were calculated based on hourly ozone concentration data from 2015 to 2023, and (d) was derived from ozone pollution days in the second quarter of 2023. (a) The 2015–2023 hourly and daily ozone concentration, (b) variation pattern every week within a year, (c) ozone level’s daily change in different seasons, (d) daily concentration pattern of ozone pollution days, classified based on the time reaching peak concentration (13 and 14, etc.).

Figure 2. The workflow of the ICEEMDAN-BiLSTM-persistence model. The new model employs the ICEEMDAN to decompose the ozone sequence. The resulting IMFs undergo fast Fourier transform for period analysis and are classified as short-periodic or long-periodic. Short-periodic IMFs are predicted with a BiLSTM model, while long-periodic IMFs are forecasted using a persistence model.

Figure 3. ICEEMDAN decomposition results for the training-set ozone concentration time series.

Figure 4. The frequency and MV characteristics of different IMFs. (a) Fourier spectral analysis of all IMFs (excluding Res); (b) MV of all IMFs and Res, which reflects the richness of information contained in each IMF and Res within the 36 h.

Figure 5. The autocorrelation function (ACF) and partial autocorrelation function (PACF) for each IMF.

Figure 6. The ICEEMDAN-BiLSTM-persistence model’s ozone concentration prediction results. (a–d) compare predicted and observed values at T+1, T+12, T+24, and T+36; (e,f) compare predicted and observed values for noon-MDA8 and midnight-MDA8. The dashed lines represent the threshold for high pollution episodes.

Figure 7. The impact of ozone concentration levels and prediction timestep on model performance. (a–e) show the predicted results at T+1, T+6, T+12, T+24, and T+36, and (f) shows the differences in hourly errors between clean and polluted hours. Ozone concentrations above 200 µg/m³ are considered polluted, while those below 200 µg/m³ are considered clean.

Figure 8. Four typical prediction examples. (a–d) are the comparisons between predicted and observed ozone concentrations for 1 July, 18 June, 22 September, and 22 December 2023.

Figure 9. The correlation between ozone and a series of meteorological conditions and pollutant variables.

Figure 10. The error distribution of different IMFs. (a) IMFE of short-periodic IMFs, and (b) the ratio of IMFE to the overall RMSE of IMFs 1–12.

Figure 11. Distribution of MDA8 O₃ in Beijing sites on 15 June 2023 and the location of the Beijing Workers Stadium.

Table 1. The hourly and daily predictions from different methods. The unit of RMSE is µg/m³.

Prediction	Metric	ICEEMDAN-BiLSTM-Persistence	ICEEMDAN-BiLSTM	BiLSTM
Hourly Prediction
T+1	R²	0.997	0.987	0.950
	RMSE	2.68	3.10	11.80
	HPFA	98%	91%	66%
T+6	R²	0.966	0.961	0.770
	RMSE	9.29	10.32	25.27
	HPFA	76%	70%	26%
T+12	R²	0.930	0.923	0.700
	RMSE	13.24	14.40	28.84
	HPFA	66%	61%	15%
T+24	R²	0.892	0.889	0.666
	RMSE	16.40	17.31	30.46
	HPFA	61%	53%	7%
T+36	R²	0.860	0.851	0.615
	RMSE	18.70	20.03	32.70
	HPFA	45%	40%	1%
Daily Prediction
noon-MDA8 O₃	R²	0.939	0.928	0.594
	RMSE	13.27	14.37	34.22
	HPFA	67%	65%	17%
midnight-MDA8 O₃	R²	0.989	0.986	0.877
	RMSE	5.70	6.32	18.70
	HPFA	90%	87%	79%

Note: HPFA represents high-pollution forecast accuracy.

Table 2. The impact of different processing methods of auxiliary variables on prediction.

	T+1		T+12		T+36
	R²	RMSE	R²	RMSE	R²	RMSE
Without auxiliary variables	0.997	2.67	0.933	13.00	0.856	18.95
With auxiliary variables	0.997	2.70	0.930	13.20	0.842	18.96
With decomposed auxiliary variables	0.996	3.02	0.930	13.27	0.855	19.02

Table 3. The impact of classification methods on prediction.

IMFs Predicted by Persistence Model	T+1		T+12		T+36
	R²	RMSE	R²	RMSE	R²	RMSE
≥ IMF8	0.997	2.79	0.913	14.75	0.733	25.84
≥ IMF11	0.997	2.73	0.930	13.16	0.855	19.05
≥ IMF12	0.997	2.74	0.930	13.15	0.856	18.99
≥ IMF13	0.997	2.68	0.933	13.24	0.856	18.95
≥ IMF14	0.997	2.69	0.931	13.22	0.856	18.97
All IMFs predicted by BiLSTM	0.987	3.10	0.923	14.40	0.851	20.03

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
