**1. Introduction**

Natural gas has been proposed as a solution to increase the security of the energy supply and to reduce environmental pollution around the world. It is the second most widely used energy commodity after oil [1]. With the replacement of coal and the widespread use of natural gas, gas spot price forecasting has become one of the most critical issues in many sectors. The accurate forecasting of natural gas spot prices is of high importance, as these forecasts are used in the energy market, in power system planning and in regulatory decision making, covering both supply and demand in the natural gas market.

Because accurate forecasts carry significant economic value, many techniques have been explored and studied, especially in electric load forecasting, such as artificial neural networks (ANN) [2] and support vector machines (SVM) [3], among many other works. Current studies on energy market forecasting focus mainly on crude oil prices [4], whereas publications on natural gas price forecasting remain relatively rare [1].

One of the few studies that has tried to forecast the direction of natural gas price movements in the U.S. market is [5], which analyzed trader positions published on a weekly basis. Ref. [6] forecasted gas prices one day ahead but relied on monthly forward products and futures instead of current prices. The authors combined the wavelet transform (WT) with fixed and adaptive machine learning and time series models: multi-layer perceptron (MLP), radial basis functions (RBF), linear regression, and generalized autoregressive conditional heteroskedasticity (GARCH). According to their results, the best models for electricity demand/gas price forecasting are the adaptive MLP/GARCH.

Another study analyzing gas prices is [7], in which several nonlinear models were trained with the aid of a Gamma test: local linear regression (LLR), dynamic local linear regression (DLLR), and artificial neural networks (ANN). The authors used daily, weekly, and monthly Henry Hub spot prices from 1997 to 2012. They concluded that an ANN model can forecast daily spot prices accurately and that ANN models outperform the LLR and DLLR models.

**Citation:** Mouchtaris, D.; Sofianos, E.; Gogas, P.; Papadimitriou, T. Forecasting Natural Gas Spot Prices with Machine Learning. *Energies* **2021**, *14*, 5782. https://doi.org/10.3390/en14185782

Academic Editor: Javier Reneses

Received: 28 July 2021 Accepted: 9 September 2021 Published: 14 September 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Ref. [8] tried to determine whether natural gas futures prices can predict natural gas spot prices. They used daily observations of the spot and futures prices for natural gas for all trading days between 1 January 1997 and 3 March 2014, collected from the U.S. Energy Information Administration (EIA), for a total of 4294 observations. According to their results, gas futures prices are not superior to a random walk (RW) model in forecasting natural gas spot prices.

Ref. [9] compared the long-horizon forecasting performance of traditional econometric models with machine learning methods (neural networks and random forests) for the main energy commodities in the world: oil, coal and gas. Their results showed that machine learning methods outperform traditional econometric methods and that they present an additional advantage, which is the ability to predict turning points.

Ref. [10] combined machine learning methodologies (XGBoost, SVM, logistic regression, random forests, and neural networks) with dynamic moving windows and expanded windows to forecast crises in the U.S. natural gas market for a period spanning from 1994 to 2019. According to their results, the best forecasting accuracy was achieved with XGBoost combined with the dynamic moving window, reaching 49% accuracy and a false alarm rate of no more than 25%.

Ref. [11] presented a literature survey of the published papers forecasting natural gas prices, amongst others. According to their survey, predicting the exact future evolution of natural gas prices is impossible.

The literature review indicates that machine learning methodologies produce higher prediction accuracy than standard econometric methods. Therefore, in this paper we trained models that have the potential to successfully predict gas prices: support vector machines (SVM), regression trees, linear regression, Gaussian process regression (GPR), and ensembles of trees. We focus on the short-term forecasting of the natural gas spot price 1, 3, 5, and 10 days ahead, and we compare the effectiveness of the machine learning models in natural gas price forecasting with that of a random walk model.
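As a rough illustration of the random walk benchmark at the four horizons considered, the following Python sketch pairs each h-day-ahead random walk forecast (the last observed price) with the realized price. It is not the implementation used in this paper: the function names, the toy price series, and the choice of mean absolute error as the error measure are assumptions made only for the example.

```python
from typing import List, Sequence, Tuple


def random_walk_forecasts(
    prices: Sequence[float], horizon: int
) -> List[Tuple[float, float]]:
    """Pair each h-step-ahead random walk forecast with the realized price.

    Under a random walk, the forecast of the price at day t + horizon is
    simply the price observed at day t.
    """
    if horizon < 1:
        raise ValueError("horizon must be a positive integer")
    return [
        (prices[t], prices[t + horizon])
        for t in range(len(prices) - horizon)
    ]


def mean_absolute_error(pairs: Sequence[Tuple[float, float]]) -> float:
    """Average absolute gap between forecast and realized price."""
    if not pairs:
        raise ValueError("no forecast/actual pairs to evaluate")
    return sum(abs(forecast - actual) for forecast, actual in pairs) / len(pairs)


# Hypothetical daily spot prices, invented for illustration only.
toy_prices = [2.50, 2.55, 2.48, 2.60, 2.65, 2.58,
              2.70, 2.75, 2.72, 2.80, 2.85, 2.90]

# The four forecasting horizons named in the text.
for h in (1, 3, 5, 10):
    pairs = random_walk_forecasts(toy_prices, h)
    print(f"h={h}: {len(pairs)} forecasts, "
          f"MAE={mean_absolute_error(pairs):.4f}")
```

A machine learning model would be evaluated on the same forecast/actual pairs, so that its error can be compared directly with the benchmark error at each horizon.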

For the training of the models, we used the lags of the natural gas spot prices and a set of 21 explanatory variables that were selected based on the relevant literature (for instance, [1,8,12]), and we assessed their ability to improve the accuracy of natural gas price forecasts. The selected variables were then fed into the forecasting models through a training–testing learning process, resulting in the most efficient and least error-prone models for natural gas price forecasting.
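The use of lagged prices as inputs in a training–testing scheme can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure of this paper: the helper names, the number of lags, and the 80/20 split are hypothetical, and the 21 explanatory variables are omitted for brevity.

```python
from typing import List, Sequence, Tuple

Rows = List[List[float]]


def make_lagged_dataset(
    prices: Sequence[float], n_lags: int, horizon: int
) -> Tuple[Rows, List[float]]:
    """Build rows [p_t, p_(t-1), ..., p_(t-n_lags+1)] with target p_(t+horizon)."""
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive integers")
    features: Rows = []
    targets: List[float] = []
    # Start at the first day with n_lags observations available and stop
    # while the target day t + horizon is still inside the sample.
    for t in range(n_lags - 1, len(prices) - horizon):
        features.append([prices[t - k] for k in range(n_lags)])
        targets.append(prices[t + horizon])
    return features, targets


def chronological_split(
    features: Rows, targets: List[float], train_fraction: float = 0.8
) -> Tuple[Tuple[Rows, List[float]], Tuple[Rows, List[float]]]:
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(len(features) * train_fraction)
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])


# Hypothetical prices; two lags and a one-day horizon chosen for illustration.
toy_prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_dataset(toy_prices, n_lags=2, horizon=1)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, 0.75)
print(len(train_X), "training rows;", len(test_X), "test rows")
```

The split is chronological rather than random because shuffling a time series would let information from later dates leak into the training set.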

The paper is organized as follows: Section 2 briefly discusses the methodologies and the data used in our study, Section 3 describes our empirical results, and Section 4 concludes the paper.
