1. Introduction
In the context of increasing global warming and heightened environmental awareness, the issues of carbon dioxide (CO2) emissions and fuel consumption in the shipping industry have garnered extensive attention. As one of the major sources of global greenhouse gas emissions, the maritime transportation sector faces escalating emission challenges with the growth of international trade volumes [1,2,3]. According to data from the International Maritime Organization (IMO), the shipping industry accounts for approximately 2.5% of global greenhouse gas emissions, and this proportion is expected to rise further as global trade continues to expand. Therefore, enhancing ship operational efficiency and reducing carbon footprints have become core objectives for industry development.
However, collecting fuel consumption and emission data during actual shipping operations encounters numerous challenges. Firstly, the acquired data may be incomplete or inaccurate, introducing substantial noise into the models. Secondly, shipping data typically exhibit highly nonlinear characteristics, making it difficult for traditional linear regression models to capture these complex interactions effectively, thereby limiting predictive performance. Additionally, the variability of the marine environment (such as weather conditions, route choices, and ship loading) further increases the complexity of the data and the difficulty of prediction.
To address these issues and achieve more accurate predictions, many researchers have adopted advanced machine learning methods such as Random Forest [4,5], Support Vector Machines (SVMs) [6,7], and Neural Networks [8,9]. These methods have, to some extent, improved the accuracy and robustness of predictions, but they also present significant limitations. Firstly, these models often require large amounts of high-quality data to perform well on high-dimensional features, making them highly dependent on data quantity and quality. Secondly, they have high computational complexity: when dealing with large-scale datasets, the training and prediction processes consume substantial computational resources. Furthermore, the "black-box" [10,11,12] nature of these methods makes it difficult to interpret the contribution of specific features to the prediction results, limiting their interpretability and transparency in practical applications. Finally, in scenarios with insufficient data or significant noise, these models are prone to overfitting, resulting in a substantial decline in their generalization capabilities.
In contrast, existing linear regression models [13,14] perform poorly on multi-dimensional, heterogeneous feature data, especially when features of different scales have unbalanced effects on the model. Although feature scaling (e.g., standardization) can alleviate this issue to some extent, the model remains susceptible to interference from irrelevant features, leading to decreased prediction accuracy. Traditional machine learning methods applied directly to features have advantages over deep learning methods in terms of interpretability and performance; however, a single model struggles to achieve high-precision results. Additionally, traditional machine learning methods lack deep learning's ability to filter informative features and suppress low-relevance ones through mechanisms such as attention, which can lead to training failures and increased prediction errors when handling a large number of features. These issues introduce significant errors and challenges to CO2 prediction.
To address these challenges, this study proposes a Voting Regressor model (Voting-BRL) that combines Bayesian Ridge Regression and Lasso Regression, aiming to enhance the prediction performance of ship CO2 emissions and fuel consumption. Specifically, this study first employs Analysis of Variance (ANOVA) to select features highly correlated with the dependent variable, thereby reducing the dimensionality of the independent variables and effectively filtering out features with low relevance to the prediction task, thus decreasing model complexity and noise interference. Subsequently, an ensemble learning method that combines Bayesian Ridge Regression and Lasso Regression is utilized, leveraging the advantages of Bayesian Ridge Regression in handling uncertainty and feature correlations, and Lasso Regression’s capability for automatic feature selection. Through a voting mechanism, the predictions of the two models are integrated, further enhancing the model’s generalization capability and prediction stability.
The main contributions of this study include the following:
Proposed the Voting-BRL (Voting-Bayesian Ridge and Lasso) method: This method combines Bayesian Ridge Regression and Lasso Regression through a voting mechanism to achieve more precise carbon dioxide emission predictions.
Conducted detailed ablation experiments: These experiments analyze the impact of different modules on the performance of the Voting-BRL model across multiple datasets, validating the effectiveness of each component of the model.
Validated the method using real-world data: Utilizing four years of actual data from the THETIS-MRV platform managed by the European Maritime Safety Agency (EMSA), the experimental results demonstrate that the Voting-BRL model achieves an R² of 0.99 or higher in prediction performance, significantly outperforming traditional methods and showcasing its efficiency and reliability in practical applications.
2. Related Work
2.1. Ship Energy Saving and Emission Reduction
The study of energy saving and emission reduction in shipping faces many challenges. The complexity of data acquisition and quality assurance arises from the inconsistency of multi-source heterogeneous data and its inherent noise, which seriously affects the accuracy and reliability of analyses. Feature selection, together with the identification and extraction of relevant data, also plays a key role in a model's performance and predictive results. Moreover, the highly complex models commonly used in current research place significant demands on computing resources in practical applications. The complexity of shipping data requires researchers to develop methods that adapt to changing levels of uncertainty.
In the study of carbon dioxide emissions prediction, many methods have been applied. Song et al. [1], using data from 2010 to 2018, identified causal relationships between factors influencing the logistics environment and specific modes of transportation through the OLS method. Zincir's research [15] focused on the potential of ammonia as an alternative fuel. The adoption of ammonia fuel could effectively reduce CO2 emissions in shipping, but commercialization still faces challenges related to infrastructure and fuel efficiency. Wang et al. [16] explored the main challenges of decarbonizing the shipping industry, including the long-term sustainability of the industry's response to regulatory and policy changes on emissions. Mocerino et al. [17] provided a detailed review of the mutual impacts between climate change and the shipping industry, which offers significant insights into CO2 emissions prediction. Nguyen et al. [18] reviewed the application of electric propulsion systems in the shipping industry and highlighted their potential to reduce CO2 emissions. Xing et al. [19] proposed several measures to reduce CO2 emissions from ships, including reducing ship speed and optimizing sailing routes. Mersin et al. [20] analyzed multiple existing emission reduction methods, noting that the inconsistency of multi-source data and the accurate identification and extraction of relevant features are major challenges in this field. It is evident that reducing CO2 emissions during shipping has become a significant issue, and predicting the impact of different shipping methods on CO2 emissions is a crucial step in addressing this problem.
2.2. Bayesian Ridge Regression
Bayesian Ridge Regression is a linear regression method that incorporates a probabilistic perspective, adding regularization to prevent overfitting by applying a prior distribution to the regression coefficients. This method is particularly useful when dealing with multicollinearity or when the number of features exceeds the number of samples. In shipping energy-saving and emission reduction studies, Bayesian Ridge Regression has been applied to estimate and predict various factors influencing CO2 emissions, including fuel consumption, ship speed, and cargo load. The use of Bayesian Ridge Regression allows researchers to introduce uncertainty into the model, which is essential when dealing with noisy or incomplete multi-source data in shipping. For instance, Crosby et al. [21] applied Bayesian logistic regression to examine the relationship between occupants' perceived thermal comfort and indoor CO2 levels in buildings, providing an effective predictive method. Affholder et al. [22] used Bayesian statistical methods to quantify the likelihood that methanogenesis (biomethane production) could explain the escape rates of molecular hydrogen and methane in the plumes of Enceladus. Michimae et al. [23] employed vine copulas to construct a copula-based joint prior distribution, yielding more accurate estimates in cases of multicollinearity. As computing power increases, the application of Bayesian methods is expected to grow, offering more robust and accurate predictions in energy-saving and emission reduction efforts.
2.3. Lasso Regression
Lasso Regression (Least Absolute Shrinkage and Selection Operator) is a popular regression method that performs both feature selection and regularization to enhance the prediction accuracy and interpretability of the model. By imposing a penalty on the absolute values of the coefficients, Lasso forces some of them to be exactly zero, effectively selecting a simpler model by excluding less important features. This makes Lasso particularly well suited for high-dimensional datasets with many predictors, a common scenario in shipping research, where multiple variables such as fuel types, operating speeds, and environmental factors need to be considered. Michalakopoulos et al. [24] conducted a comparative analysis of machine learning algorithms used to predict CO2 emissions in the maritime sector. Zhou et al. [25] proposed an adaptive hyperparameter tuning method combining ANN and Lasso, taking into account the impact of marine environmental factors on fuel consumption. Monisha et al. [26] developed two machine learning models, including Lasso, using actual voyage data collected from noon reports of ships in Bangladesh. Lasso's application is expected to increase as the volume and complexity of shipping data continue to grow.
3. Voting-BRL
In maritime emission and fuel consumption prediction models, ship CO2 emission data are influenced by a myriad of environmental and operational factors, such as weather conditions, cargo load levels, and route choices. These factors not only complicate the data but also introduce varying degrees of uncertainty in prediction outcomes. The inconsistent collection of emission data further exacerbates the prediction bias inherent in traditional models. Given that the relationship between emissions and fuel consumption is typically complex and nonlinear, traditional linear models, such as ordinary least squares (OLS) regression, are often insufficient to capture the intricate dynamics between variables. This underscores the need for more sophisticated modeling approaches to ensure higher accuracy.
Moreover, the environmental factors—such as wind speed, sea waves, and temperature—directly influence the emission levels. These variables are often neglected in traditional models or are difficult to quantify accurately, further contributing to prediction errors. The interactions between various factors, including ship type, navigation conditions, and fuel type, create a high-dimensional dataset that poses additional challenges for traditional modeling techniques. Therefore, to manage the inherent nonlinearity and complexity, more advanced methods that can effectively handle such high-dimensional data are required.
3.1. Overall Structure
To address data nonlinearity, complexity, and uncertainty in predicting ship emissions, we drew inspiration from Bayes’ theorem. Our approach leverages the probabilistic correlation between known and unknown events to improve the accuracy of predictions. Based on this principle, we developed a Bayesian-based predictive model that utilizes both historical data and the inherent uncertainty in maritime operations to make more informed predictions about ship emissions.
The overall structure of our model is illustrated in Figure 1. In this framework, we first apply an ANOVA (Analysis of Variance) technique for feature selection, then perform regression using a hybrid method that combines Bayesian Ridge Regression and Lasso Regression, ultimately outputting final predictions through an ensemble learning strategy known as a Voting Regressor.
In the following sections, we introduce the individual components of the model in detail, explaining the rationale behind each step and its contribution to improving the prediction accuracy of maritime emissions.
3.2. ANOVA Feature Selection
Feature selection plays a pivotal role in the development of predictive models, especially when dealing with high-dimensional data that contain numerous independent variables. In our model, we employ one-way ANOVA analysis to identify the most critical features that have a statistically significant impact on the target variable, which is ship emissions in this context.
Prior to conducting ANOVA, we applied one-hot encoding to all categorical independent variables. While this transformation enables the model to capture essential information from categorical variables, it also increases the dimensionality of the feature space. Too many features can reduce the efficiency of model fitting and introduce overfitting risks, whereas too few features may fail to represent the true complexity of the data. Therefore, feature selection is crucial for balancing model complexity and accuracy.
The ANOVA method allows us to test the hypothesis of whether there are significant differences between the means of different groups with respect to the target variable. Specifically, for each independent variable, we test the following hypotheses:
Null Hypothesis (H₀): There is no significant relationship between the feature and the target variable.
Alternative Hypothesis (H₁): There is a significant relationship between the feature and the target variable.
The test statistic for one-way ANOVA, the F-statistic, is calculated as follows:
$$F = \frac{\sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 / (k - 1)}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 / (N - k)}$$
where $k$ is the number of groups, $n_i$ is the sample size of the $i$-th group, $\bar{x}_i$ is the mean of the $i$-th group, $\bar{x}$ is the overall mean, and $N$ is the total sample size.
Based on the resulting F-statistic and its corresponding p-value, we select features whose significance level falls below a threshold of α = 0.05. These selected features are deemed important for the subsequent regression analyses.
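For concreteness, the following minimal Python sketch shows how this selection step can be performed with scikit-learn's SelectKBest. Here f_regression supplies the per-feature F-statistic and p-value (for a one-hot column this reduces to a two-group one-way ANOVA); the synthetic data and column names are illustrative assumptions, not fields of the actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic stand-in for the MRV data; column names are illustrative only.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "distance_nm": rng.uniform(100, 5000, n),
    "time_at_sea_h": rng.uniform(10, 500, n),
    "ship_type": rng.choice(["bulk carrier", "oil tanker", "container ship"], n),
})
y = 3.1 * df["distance_nm"] + 0.5 * df["time_at_sea_h"] + rng.normal(0, 50, n)

# One-hot encode categorical variables before the ANOVA test.
X = pd.get_dummies(df, columns=["ship_type"])

# f_regression computes a per-feature F-statistic and p-value
# (H0: no linear relationship between the feature and the target).
selector = SelectKBest(score_func=f_regression, k=min(100, X.shape[1]))
X_selected = selector.fit_transform(X, y)

# Keep only features significant at alpha = 0.05.
significant = X.columns[selector.pvalues_ < 0.05].tolist()
print(significant)
```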
3.3. Ensemble Learning Methods
To improve the predictive performance of our model, we employ an ensemble learning approach by combining Bayesian Ridge Regression with Lasso Regression. This approach harnesses the strengths of both regression techniques: Bayesian Ridge Regression excels in handling uncertainty and correlated features, while Lasso Regression is effective at performing automatic feature selection.
Bayesian Ridge Regression incorporates a probabilistic framework that introduces prior distributions for the model parameters, leading to more robust estimates. The regression model is represented as
$$Y = X\beta + \epsilon$$
where $Y$ is the target variable, $X$ is the matrix of independent variables, $\beta$ represents the regression coefficients, and $\epsilon$ is the error term. The use of priors allows for quantifying the uncertainty in the parameter estimates, which is especially beneficial in high-dimensional settings.
Lasso Regression, on the other hand, is well known for its feature selection capability due to its regularization term, which encourages sparsity in the model coefficients. The objective function of Lasso Regression is
$$\min_{\beta} \left( \frac{1}{2n} \| Y - X\beta \|_2^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right)$$
where $\lambda$ is the regularization parameter and $p$ is the number of features. Lasso effectively shrinks irrelevant feature coefficients to zero, which not only improves model interpretability but also prevents overfitting.
By combining these two methods in an ensemble learning framework, we enhance the model’s overall predictive power. We first obtain parameter estimates from Bayesian Ridge Regression and then pass these estimates into the Lasso Regression model to refine them further. This two-step approach ensures that we benefit from the strengths of both techniques.
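A minimal sketch of the two base learners using scikit-learn is shown below; the hyperparameter values are library defaults or illustrative choices rather than the tuned settings of Table 2, and X_selected and y are carried over from the feature-selection sketch above.

```python
from sklearn.linear_model import BayesianRidge, Lasso
from sklearn.model_selection import train_test_split

# X_selected and y come from the feature-selection sketch above.
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, random_state=0
)

# Gamma hyper-priors: alpha_1/alpha_2 govern the noise precision,
# lambda_1/lambda_2 the precision of the weights (library defaults shown).
bayes = BayesianRidge(alpha_1=1e-6, alpha_2=1e-6, lambda_1=1e-6, lambda_2=1e-6)
lasso = Lasso(alpha=0.1)  # L1 strength; an illustrative value

bayes.fit(X_train, y_train)
lasso.fit(X_train, y_train)

# BayesianRidge can also return a per-sample predictive standard deviation.
y_bayes, y_std = bayes.predict(X_test, return_std=True)
y_lasso = lasso.predict(X_test)
```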
3.4. Voting Regressor
The final step in our model involves the use of a Voting Regressor, an ensemble method that generates predictions by taking a weighted average of the outputs from multiple base regression models. In our case, we combine the predictions from Bayesian Ridge Regression and Lasso Regression to form a more robust prediction framework. The final prediction $\hat{y}$ can be expressed as
$$\hat{y} = \frac{\sum_{i=1}^{n} w_i \hat{y}_i}{\sum_{i=1}^{n} w_i}$$
where $w_1$ and $w_2$ are the weights assigned to the predictions from the Bayesian Ridge and Lasso models, respectively, and $n$ is the number of models ($n = 2$ here). This method effectively balances the strengths of each model and reduces the potential bias that a single model might introduce.
The overall structure of the Voting Regressor is depicted in Figure 1, illustrating how the predictions from the Bayesian Ridge model and the Lasso model are aggregated to produce the final result. This approach improves the stability, accuracy, and robustness of the model by leveraging the complementary strengths of Bayesian Ridge and Lasso Regression.
Algorithm 1 Voting-BRL algorithm framework.
1: procedure VotingRegressor(X, y)
2:   X_enc ← OneHotEncode(X)  ▹ One-hot encode the independent variables
3:   F, p ← ANOVA(X_enc, y)  ▹ Feature selection
4:   X_sel ← SelectFeatures(X_enc, p)  ▹ Select relevant features
5:   M_BR ← TrainBayesianRidge(X_sel, y)  ▹ Train Bayesian Ridge model
6:   M_Lasso ← TrainLasso(X_sel, y)  ▹ Train Lasso model
7:   ŷ_BR ← Predict(M_BR, X_sel)  ▹ Bayesian Ridge prediction
8:   ŷ_Lasso ← Predict(M_Lasso, X_sel)  ▹ Lasso prediction
9:   w_1, w_2 ← CalculateWeights(ŷ_BR, ŷ_Lasso)  ▹ Calculate weights
10:  ŷ ← (w_1 · ŷ_BR + w_2 · ŷ_Lasso) / (w_1 + w_2)  ▹ Weighted average to obtain final prediction
11:  return ŷ  ▹ Return final prediction results
12: end procedure
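Algorithm 1 maps naturally onto scikit-learn's VotingRegressor. The sketch below is one possible realization under stated assumptions: synthetic data stand in for the encoded MRV features, and equal voting weights replace the weight-calculation step, which is not given in closed form here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import BayesianRidge, Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic high-dimensional stand-in for the encoded MRV feature matrix.
X, y = make_regression(n_samples=1000, n_features=150, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Voting-BRL as one pipeline: standardize -> ANOVA top-100 -> weighted vote.
# Equal weights are an assumption; the paper derives them from model scores.
voting_brl = make_pipeline(
    StandardScaler(),
    SelectKBest(score_func=f_regression, k=100),
    VotingRegressor(
        estimators=[("bayes", BayesianRidge()), ("lasso", Lasso(alpha=0.1))],
        weights=[1.0, 1.0],
    ),
)

voting_brl.fit(X_train, y_train)
y_pred = voting_brl.predict(X_test)
```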
4. Experiments
4.1. Data Preparation
The data originate from the THETIS-MRV platform managed by the European Maritime Safety Agency (EMSA), focusing on the monitoring, reporting, and verification (MRV) system for ship emissions. This platform provides public ship emission data, helping users view and analyze CO2 emissions from ships within EU waters. The data cover detailed information such as fuel consumption, navigation distance, and CO2 emissions, aiding in promoting transparency and environmental compliance in the shipping industry. Table 1 presents relevant information about these data.
For the original data series, unnecessary rows and irrelevant data are first removed to make the data suitable for model construction. Several feature columns related to the target are selected, and missing values are filled. Numerical data are converted to numeric types and standardized, while categorical data are transformed through one-hot encoding. Finally, the ANOVA method is used for feature selection to prepare the data for model training.
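A hedged sketch of this preprocessing flow is given below; the file name and column names are hypothetical placeholders, not the exact headers of the THETIS-MRV export.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative preprocessing; file and column names are assumptions.
df = pd.read_csv("thetis_mrv_export.csv")
df = df.dropna(subset=["total_co2_emissions"])  # drop rows without a target

numeric_cols = ["total_fuel_consumption", "total_distance", "total_time_at_sea"]
categorical_cols = ["ship_type"]

# Coerce numeric columns to numbers and fill remaining gaps with the median.
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Standardize numeric features; one-hot encode categorical features.
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])
X = pd.get_dummies(df[numeric_cols + categorical_cols], columns=categorical_cols)
y = df["total_co2_emissions"]
```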
4.2. Parameter Settings
Table 2 shows the parameters for Bayesian Ridge, Lasso, and Voting Regressor models.
For Bayesian Ridge Regression, hyperparameters such as alpha and lambda control the strength of regularization. These hyperparameters constrain the model's complexity, reducing overfitting and enhancing the model's generalization ability on new data. Specifically, α₁ and α₂ are used to specify the strength of the prior distribution, while λ₁ and λ₂ help control the distribution of the weights. Since Bayesian Ridge Regression involves uncertainty estimation, regularization can effectively improve the model's robustness.
In Lasso Regression, the alpha parameter directly affects the strength of regularization. By applying L1 regularization to the regression coefficients, Lasso Regression can achieve feature selection, effectively removing unimportant features. This setup helps enhance the model’s interpretability and improve computational efficiency, especially when the number of features is large.
In Voting Regressor, using multiple base models’ predictions can effectively improve prediction stability and accuracy. Combining different models can compensate for each model’s shortcomings, resulting in more balanced and reliable predictions.
In the feature engineering part, selecting the top 100 features using SelectKBest aims to extract the most influential features on the target variable, further reducing noise interference in the model. Additionally, data standardization ensures all features have a mean of 0 and a variance of 1, eliminating the impact of different units and scales on model training, facilitating faster convergence, and improving prediction accuracy.
4.3. Test Results
4.3.1. Comparative Experiments
To effectively demonstrate the superiority of our method, this experiment compares the regression results from 2020 to 2023 with existing classical machine learning methods using the RMSE and R² metrics. The comparison results are shown in Table 3.
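Both metrics can be computed directly with scikit-learn; a small sketch, reusing y_test and y_pred from the pipeline example in Section 3:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# y_test and y_pred as produced by the Voting-BRL pipeline sketch above.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"RMSE = {rmse:.2f}, R^2 = {r2:.4f}")
```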
As shown in the comparison results above, advanced methods applied to this dataset exhibit varying performances. For instance, GaussianProcessRegressor is entirely unsuitable for these data from 2020 to 2023. This may be partly due to the sparsity or insufficiency of the CO2 emission data; in addition, GPR is extremely sensitive to noise, so existing noise and outliers significantly degrade its performance.
MLPRegressor, SVR, RandomForestRegressor, and LinearSVR can handle nonlinear relationships to some extent, but their ability to capture the complex interactions underlying the data still depends on feature quality, resulting in poor test results. Although LinearSVR achieves good R² values on the 2020 and 2021 data, its RMSE still shows significant errors, indicating that the model has learned excessive noise and randomness; such R² values are a result of overfitting and lack generalizability.
Among the subsequent methods, ExtraTreeRegressor, DecisionTreeRegressor, and XGBRegressor achieve relatively high R² values due to their unique tree structures and nonlinear processing characteristics. However, they still struggle to handle data features with strong uncertainty and complexity.
In both the R² and RMSE performance metrics, Voting-BRL consistently demonstrates superior predictive performance. This is because Voting-BRL combines multiple models, allowing it to learn the diverse features and patterns captured by different models, effectively avoiding overfitting.
To visualize the comparison among different methods more intuitively, we plot bar charts. However, due to the significant differences in magnitude between the values, smaller values are difficult to discern. Therefore, we normalize the values to the range [0, 1], as shown in Figure 2.
4.3.2. Visualization Analysis
To more accurately analyze the model's predictive performance, we plot scatter plots to observe the discrepancies between the predicted and actual values, as shown in Figure 3.
In this scatter plot, each point represents an observation from the test dataset. The x-axis shows the actual values, and the y-axis shows the corresponding predictions generated by the model. The dashed line indicates the ideal fit: if the predictions matched the actual values perfectly, all points would lie on this line. The result graphically demonstrates the alignment between actual and predicted values.
To observe the distribution of results, we also plot a box plot, as shown in Figure 4.
The median line within the box represents the median of negative MSE values, indicating that most scores are concentrated around this value. The upper and lower edges of the box represent the first quartile (Q1) and third quartile (Q3), covering the middle 50% of the data. This span shows that MSE exhibits some degree of variation. The whiskers extend to the non-outlier minimum and maximum values, indicating that there are few extreme error values, and most of the model’s prediction errors are within a reasonable range.
To understand the relationship between predicted values and residuals and to evaluate potential issues with the model, we plot a residual plot, as shown in Figure 5.
Most residuals are concentrated around the zero horizontal line, meaning that most predictions are close to the actual values. However, there are larger residuals at low prediction values, indicating significant errors in these predictions. The residual plot shows no obvious non-random patterns, suggesting that the model may not have significant systematic errors. A few residuals are observed in the lower right and left areas, indicating that the model performs poorly on these data points. This suggests that the model may overfit on some data, necessitating the addition of regularization methods for optimization.
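For reference, a residual plot of this kind can be generated with a few lines of matplotlib; this is a sketch reusing y_test and y_pred from the evaluation example, not the exact plotting code behind Figure 5.

```python
import matplotlib.pyplot as plt

# Residuals vs. predictions, with y_test and y_pred from the evaluation sketch.
residuals = y_test - y_pred
plt.scatter(y_pred, residuals, s=10, alpha=0.5)
plt.axhline(0.0, color="red", linestyle="--")  # zero-error reference line
plt.xlabel("Predicted CO2 emissions")
plt.ylabel("Residual (actual - predicted)")
plt.title("Residuals vs. predicted values")
plt.show()
```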
5. Discussion
5.1. Ablation Experiments
To verify that the Voting-BRL method indeed combines the advantages of each model and outperforms other models, we conduct ablation experiments comparing single models with their ensemble, as shown in Table 4.
The performance of the Lasso and Bayesian Ridge models is relatively similar, with high R² values indicating strong flexibility and good adaptation to the data structure. However, their RMSE values are still relatively large, indicating significant prediction errors. This suggests that while they have advantages in regularization, they may struggle to capture key patterns in the presence of substantial data noise.
Compared to Lasso and Bayesian Ridge, BRL exhibits stronger fitting capabilities, and its RMSE values are significantly lower. However, in 2023, it still experiences considerable errors, indicating that BRL can still overfit in complex data scenarios.
Voting-BRL combines the predictions of multiple models, meaning it likely utilizes different weight assignments or voting mechanisms to leverage the strengths of each model while reducing the errors associated with single models.
5.2. Significance Tests
To demonstrate which mechanisms in the model optimization process have more significant impacts on the results, we conduct t-tests on RMSE and R². The results are shown in Figure 6 and Figure 7.
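Such pairwise tests can be computed with scipy.stats; the sketch below uses hypothetical per-fold RMSE scores purely to illustrate the procedure, not values from the paper.

```python
from scipy import stats

# Hypothetical per-fold RMSE scores for two models from repeated CV runs;
# the numbers below are placeholders, not results reported in this study.
rmse_lasso = [45.2, 44.8, 46.1, 45.5, 44.9]
rmse_voting_brl = [9.1, 8.7, 8.4, 8.9, 8.6]

t_stat, p_value = stats.ttest_ind(rmse_lasso, rmse_voting_brl)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 -> significant
```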
In Figure 6, each bar represents the p-value of the t-test between the RMSE of two models. The red dashed line indicates the significance level of 0.05. If a p-value is below this line, the difference between the models is typically considered statistically significant. All p-values are above 0.05, indicating that the RMSE differences between these models are not statistically significant. However, they are all below 0.4, suggesting that the changes yield some improvement in RMSE.
In Figure 7, different significance results are observed. The highest p-value occurs for "Lasso vs. BayesianRidge", indicating no significant difference. The p-values for "Lasso vs. BRL" and "Lasso vs. Voting-BRL" are close to 0.05, suggesting that BRL significantly improves R² compared to Lasso and Bayesian Ridge. Similarly, Voting-BRL also shows a significant improvement over BRL. This demonstrates the effectiveness of the Voting-BRL method.
Thus, combining multiple models significantly enhances results, leveraging each model’s strengths to effectively predict CO2 emissions.
6. Conclusions
In this study, we introduced the Voting-BRL model, an innovative ensemble learning approach that integrates Bayesian Ridge Regression and Lasso Regression, to predict ship carbon dioxide (CO2) emissions and fuel consumption with high accuracy and robustness. By leveraging Analysis of Variance (ANOVA) for feature selection, the model effectively reduced dimensionality and minimized noise interference, enhancing its predictive performance. Experimental results demonstrated that Voting-BRL achieved an outstanding R² of 0.9981 and a Root Mean Square Error (RMSE) of 8.53, markedly outperforming traditional machine learning models such as XGBRegressor, which attained an R² of 0.97 and an RMSE of 45.03. Ablation studies confirmed that the ensemble strategy harnesses the complementary strengths of Bayesian Ridge and Lasso Regression, resulting in superior generalization capabilities and prediction stability.
The exceptional performance of the Voting-BRL model underscores its potential as a reliable tool for emission management and operational optimization within the maritime industry. Accurate predictions of CO2 emissions and fuel consumption are crucial for developing strategies to enhance environmental sustainability and comply with increasingly stringent regulatory standards. By providing precise forecasts, the Voting-BRL model can assist stakeholders in making informed decisions that contribute to reducing the carbon footprint of shipping operations.
Future work may focus on expanding the model to incorporate additional environmental and operational factors, thereby further enhancing its predictive accuracy and applicability. Additionally, integrating real-time data streams could enable dynamic emission monitoring and adaptive decision-making in response to changing maritime conditions. Exploring the application of the Voting-BRL framework to other sectors within the transportation industry may also yield valuable insights and broaden its impact on global efforts to mitigate greenhouse gas emissions.