Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization

Zhu, Yingjie; Ma, Jiageng; Gu, Fangqing; Wang, Jie; Li, Zhijuan; Zhang, Youyao; Xu, Jiani; Li, Yifan; Wang, Yiwen; Yang, Xiangqun

doi:10.3390/math11061335

Open AccessArticle

Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization

by

Yingjie Zhu

^1,*,

Jiageng Ma

^1,*

,

Fangqing Gu

^2,*

,

Jie Wang

¹,

Zhijuan Li

¹

,

Youyao Zhang

³

,

Jiani Xu

⁴,

Yifan Li

⁵

,

Yiwen Wang

¹

and

Xiangqun Yang

¹

School of Science, Changchun University, Changchun 130022, China

²

School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou 510520, China

³

School of Philosophy, Shaanxi Normal University, Xi’an 710119, China

⁴

School of Economics and Management, Beijing University of Chemical Technology, Beijing 100029, China

⁵

HSBC Business School, Peking University, Beijing 100871, China

^*

Authors to whom correspondence should be addressed.

Mathematics 2023, 11(6), 1335; https://doi.org/10.3390/math11061335

Submission received: 5 January 2023 / Revised: 2 March 2023 / Accepted: 7 March 2023 / Published: 9 March 2023

(This article belongs to the Special Issue Emerging Topics in Machine Learning, Image Processing and Pattern Recognition for AI-Related Applications)

Download

Browse Figures

Versions Notes

Abstract

:

Bitcoin is one of the most successful cryptocurrencies, and research on price predictions is receiving more attention. To predict Bitcoin price fluctuations better and more effectively, it is necessary to establish a more abundant index system and prediction model with a better prediction effect. In this study, a combined prediction model with twin support vector regression was used as the main model. Twenty-seven factors related to Bitcoin prices were collected. Some of the factors that have the greatest impact on Bitcoin prices were selected by using the XGBoost algorithm and random forest algorithm. The combined prediction model with support vector regression (SVR), least-squares support vector regression (LSSVR), and twin support vector regression (TWSVR) was used to predict the Bitcoin price. Since the model’s hyperparameters have a great impact on prediction accuracy and algorithm performance, we used the whale optimization algorithm (WOA) and particle swarm optimization algorithm (PSO) to optimize the hyperparameters of the model. The experimental results show that the combined model, XGBoost-WOA-TWSVR, has the best prediction effect, and the EVS score of this model is significantly better than that of the traditional statistical model. In addition, our study verifies that twin support vector regression has advantages in both prediction effect and computation speed.

Keywords:

bitcoin price prediction; twin support vector regression; XGBoost; feature selection; whale optimization algorithm

MSC:

68U35

1. Introduction

Since the official birth of Bitcoin in 2009, it has attracted worldwide attention. As a cryptocurrency market representative, Bitcoin price fluctuation affects the stability and development of the world economy. The effective price prediction of Bitcoin at a low time cost is an important issue that investors and policy-makers urgently need to solve (e.g., [1,2]). Researchers have used traditional statistical methods and machine learning models to predict and analyze the price of Bitcoin. For example, Khedr et al. [3] summarized the research progress in the cryptocurrency price-prediction field from 2010 to 2020 and proposed gaps in existing research and prospects for related future research. Traditional statistical methods include Bayesian regression (e.g., [1,2]), logistic regression [4,5], linear regression [6], and autoregressive comprehensive moving average (ARIMA) (e.g., [7,8]). Machine learning models include support vector machines (SVMs) [9], random forest (RF) classifiers [10], artificial neural networks (ANNs) (e.g., [11,12,13]), and long- and short-term memory (LSTM) networks (e.g., [14,15]).

Further research has divided Bitcoin prices into separate periods of time. Chen Z et al. [16] divided Bitcoin price prediction into 5 min price prediction and daily price prediction models according to data characteristics. In the 5 min price prediction model, the accuracy of the machine learning algorithm was 67.2%, which is better than that of the traditional statistical methods. In daily price prediction, traditional statistical methods such as logistic regression and linear discriminant analysis have accuracy rates of about 66%, which is higher than those of machine learning models. Similarly, Mudassir M. et al. [17] used classification and prediction models, i.e., ANN, SVM, and LSTM, to study Bitcoin prices for 1 day, 7 days, 30 days, and 90 days. The results show that the accuracy of the classification model is higher than 60%, and the error percentage of the prediction model is lower than

5 %

. Mallqui D.C. et al. [18] used machine learning technology to predict the direction, maximum value, minimum value, and closing price of the daily Bitcoin price. Compared to the original model, the accuracy of Mallqui’s final model was improved by more than

10 %

.

Current Bitcoin price prediction studies have the following two drawbacks. First, the selection of appropriate influencing factors and prediction models for Bitcoin prices is of great significance. How to choose the factors that affect the price of Bitcoin is a central issue. Yechen Zhu et al. [19] verified the impacts of various economic factors on the price of Bitcoin, and the research results show that the dollar index has the greatest impact on the price of Bitcoin, and the price of gold has the least impact. In addition to economic factors, user sentiment can also play a role in the price of Bitcoin. Based on a study of the impact of macroeconomic factors on the price of Bitcoin, Aggarwal A. et al. [20] creatively considered Twitter emotion as an influencing factor and found that Twitter emotion had a positive correlation with the price of Bitcoin. Based on the study of traditional determinants, Ciaian P. et al. [21] found that the attractiveness of Bitcoin represented by Wikipedia to investors and users significantly affected the change in the Bitcoin price. Kim Y.B. et al. [22] extracted all comments and responses posted in crypto-related online communities and rated them based on the positive or negative sentiments of the comments and responses. Based on the averaged one-dependence estimators, a prediction was made by a machine learning model. Three cryptocurrencies (Bitcoin, Ethereum, Ripple) were tested to verify the scientific nature and feasibility of the model.

Furthermore, Wei C. et al. [23] developed a two-stage machine learning model for predicting Bitcoin prices. In the first stage, an artificial neural network and random forest method were used to extract important economic and technical factors affecting the price of Bitcoin. In the second stage, LSTM, ARIMA, the SVR machine model, and the adaptive network fuzzy inference system (ANFIS) model were used to predict the price of Bitcoin. Finally, it is concluded that the prediction effect of LSTM was better than those of other prediction models. However, in it, the change in Bitcoin price is limited to the impacts of a particular category of factors, rather than a comprehensive consideration of all the influencing factors. At the same time, the multicollinearity of selection influencing factors is not considered in the establishment of the model, and a feature-selection stage is missing.

In this study, the XGBoost algorithm and random forest algorithm were used to select the influencing variables. Previous studies have shown that both the XGBoost and random forest algorithms are effective feature selection models (e.g., [24,25,26,27]). The comparison between the selected variable group and the original full variable group is shown in Section 4. The XGBoost algorithm is a type of ensemble learning boosting system. The idea is to integrate many weak classifiers to form a strong classifier (e.g., [28,29]). The random forest algorithm is an ensemble learning bagging approach. It combines the ability to solve regression problems and classification problems. A random forest integrated multiple decision trees through the idea of ensemble learning and averaged the output of each decision tree to obtain the final output result in [30].

There are various types of Bitcoin price prediction models, but they do not remove the influence of model differences on prediction. In addition, the differences between the strengths and weaknesses of the prediction models were not considered in the different periods of the datasets, and the generalization abilities of the models were poor. Therefore, in this study, support vector regression (SVR), least-squares support vector regression (LSSVR), and twin support vector regression (TWSVR) were used as the main predictive models. TWSVR has good generalization, high prediction accuracy, relatively simple calculations, and good robustness to outliers (e.g., [31,32,33]). These three models all fit the data by establishing the prediction belt method, which solves the influence of model-type differences in the overall prediction to a certain extent. At the same time, the ARIMA model was set as the benchmark model and was compared with the main prediction model to obtain the best prediction model of Bitcoin price.

In addition, selecting a reasonable hyperparameter for the model can improve the accuracy of the prediction model (e.g., [34,35,36]). The intrinsic relationship between hyperparameters and prediction accuracy is very complex and unknown, which brings great challenges to the selection of hyperparameters. In this study, the whale optimization algorithm (WOA) and particle swarm optimization (PSO) algorithm were selected to optimize the hyperparameters of the models. Both algorithms are fast and effective community search algorithms.

The WOA is a new optimization algorithm inspired by whale hunting behavior (e.g., [37,38]). Whales are social animals and cooperate to drive and round up prey when hunting. The most important feature of the WOA is to simulate the hunting behavior of humpback whales with random individuals or optimal individuals and the bubble-net attack mechanism of a humpback whale with a spiral. The WOA performs a global search in the early stage and a local search in the late stage. The WOA seeks an optimal solution to the problem by mimicking the way whales hunt (e.g., [37,39]). The WOA randomly generates a group of whales, and the position of each whale represents a feasible solution. During a pod hunt, each whale behaves in two ways. During each iteration, each whale randomly chooses one of two behaviors to hunt. One behavior is where all whales move toward each other to surround prey. The other behavior is when whales swim in circles and blow bubbles to repel the prey. Each whale follows two rules of behavior until the best solution is found. Therefore, the WOA has the advantage of fast convergence while avoiding local optimization [40].

PSO is a random search algorithm based on group cooperation developed by simulating the foraging behavior of birds [41]. In PSO, the solution to each optimization problem is a bird in the search space, also known as a “particle”. Each bird (particle) looks for food in the search space, that is, the optimal solution to the optimization problem in [42]. Each particle does not know exactly where the food is but can sense its general direction. Each particle carries on the next search in the direction it decides, records the position of each iteration during the search and shares it with other particles. Finally, in the process of continuous iteration, the specific location of the optimal solution is searched (e.g., [43,44,45]).

The major contributions in this paper are described as follows:

(a): Expanding the factors that influence Bitcoin price movements. This article considers 27 influencing factors—that is, the impact of other cryptocurrencies on Bitcoin’s price—and uses the Crypto Fear and Greed Index (F&G) as an indicator of the degree of user demand for Bitcoin. In this study, XGBoost and random forest models were used to select the influencing factors of the price of Bitcoin, the variables selected by the two methods were compared with the full variables in the prediction model, and the index system affecting the prediction of the Bitcoin price was obtained.
(b): Three models, i.e., SVR, LSSVR, and TWSVR, were used as the main prediction models in this study. Since the prediction accuracies of SVR, LSSVR, and TWSVR are easily affected by the hyperparameters of the models, two optimization algorithms, i.e., WOA and PSO, were used in this study to optimize the hyperparameters of the prediction model.

The remainder of this article is organized as follows. The raw data of the Bitcoin price and its influencing factors are explained and interpreted in Section 2. In Section 3, we provide a detailed description of the prediction models and the optimization algorithm for the hyperparameters of the models. In Section 4, feature selection is reported on the original data, and the combined prediction model is constructed and solved. Section 5 provides a discussion on the prediction models and concludes this paper.

2. Data Collection and Preprocessing

Figure 1 plots the main processes of Bitcoin price prediction. It mainly includes four processes: data collection, data processing, feature selection, prediction model, and parameter optimization. We provide a detailed description of these four processes in the following sections.

2.1. Data Collection

This paper mainly studies the prediction of Bitcoin prices. Bitcoin prices have generally tended to rise over time, rising sharply from 2018 and peaking twice in March and November 2021. We selected relevant daily data from 1 January 2018 to 1 April 2022. The variation trend is shown in Figure 2.

The price of Bitcoin is affected by the internal factors of cryptocurrency and external economic factors. The internal factors of cryptocurrency are divided into three secondary indices: Bitcoin-supply factors, Bitcoin-demand factors, and other cryptocurrency’s prices. External economic factors are divided into three secondary indices: financial-indicator factors, macroeconomic factors, and exchange-rate factors. Twenty-seven factors were considered in this study. Figure 3 summarizes the relevant influencing factors of the Bitcoin price.

The Bitcoin and other encryption currency price data were from https://history.btc123.fans/ (accessed on 4 January 2023). Data on Bitcoin-supply factors and Bitcoin-demand factors are available from https://alternative.me/ (accessed on 4 January 2023). The financial-indicator factor data were from https://www.10jqka.com.cn/ (accessed on 4 January 2023). Macroeconomic factors and the exchange rate data were from https://fred.stlouisfed.org/ (accessed on 4 January 2023). The specific data are explained as follows.

Mining difficulty (MD) means the difficulty of successfully mining Bitcoin data blocks in transaction information, which represents the level of Bitcoin supply.
The Crypto Fear and Greed Index (F&G) analyzes the current sentiment in the Bitcoin market. Data levels range from 0 to 100. Zero means "extreme fear", while 100 means "extreme greed". It takes into account the volatility of Bitcoin price changes (25%), the current volume of Bitcoin trading (25%), market momentum, and the public’s appreciation of Bitcoin (50%). The Crypto Fear and Greed Index represents public demand for Bitcoin.
Ethereum Price (ETH), Litecoin Price (LTC), Bitcoin Cash Price (BCH), and the USDT Price Index (USDT) are the prices of other types of cryptocurrencies, which indirectly affect the price fluctuations of Bitcoin.
The Dow Jones Industrial Average Index (DJIA), National Association of Securities Dealers Automated Quotations Index (NASDAQ), Standard and Poor’s 500 Index (S&P 500), and CBOE Volatility Index (VIXCLS) are important measures of the U.S. stock market volatility.
The Financial Times Stock Exchange 100 Index (FTSE 100) is an important measure of volatility in the UK stock market.
The Hang Seng China Enterprises Index (HSCEI), Hang Seng Index (HSI), A/H share premium index (AHP), and Shanghai Securities Composite Index (SSEC) are important measure indices of volatility in China’s stock market.
Gold price (GOLD) and silver price (SILVER) are important indicators to measure the price fluctuation of rare metals in the United States and are also macroeconomic factors affecting the Bitcoin price change.
West Texas Intermediate (WTI) is an important index for measuring the price fluctuation of US crude oil and is a macroeconomic factor affecting the Bitcoin price change.
The effective federal funds rate (EFFR) is an important index to measure the fluctuation of the funds rate in the United States, and it is a macroeconomic factor affecting the change in Bitcoin price.
The 10-year break-even inflation rate (T10YIE) is an important index for measuring the inflation level in the United States and is a macroeconomic factor affecting the Bitcoin price change.
The US dollar index &reg (USDX) is an important index that comprehensively reflects the exchange rate of the US dollar in the international foreign exchange market.
Australian dollar to US dollar exchange rate (AUD∖USD), European dollar to US dollar exchange rate (EUR∖USD), Great Britain pound to US dollar exchange rate (GBP∖USD), 100 Japanese yen to US dollar exchange rate (100 JPY∖USD), Swiss Franc to US dollar exchange rate (CHY∖USD), and Canadian dollar to US dollar exchange rate (CAD∖USD) represent the exchange rates of the US dollar in each foreign exchange market.

2.2. Data Preprocessing

2.2.1. Missing Data Completion through Linear Interpolation

Bitcoin prices and related data are collected through websites and databases, so there are missing data and noise. Missing values are filled in by linear interpolation. The specific formula is expressed as follows:

\begin{matrix} \frac{Y - Y_{0}}{Y_{1} - Y_{0}} = \frac{X - X_{0}}{X_{1} - X_{0}}, \end{matrix}

(1)

where

X_{0}, X_{1}

, and X are the values of the previous period, the next period, and the current period of the known sample data, respectively; and

Y_{0}, Y_{1}

, and Y are the values of the previous period, the next period, and the current period of the missing sample data, respectively.

2.2.2. Data Normalization and Division

Normalization is used to process the data to eliminate the influence of dimensions of the model [46,47]. The specific formula is expressed as follows:

\begin{matrix} X_{n e w} = \frac{X_{o l d} - X_{m i n}}{X_{m a x} - X_{m i n}}, \end{matrix}

(2)

where

X_{m i n}

,

X_{m a x}

,

X_{o l d}

, and

X_{n e w}

are the minimum, maximum, original, and normalized values of the sample data, respectively. Figure 4 shows the normalized historical data, including the supply and demand of Bitcoin, other cryptocurrency prices, financial indicators, macroeconomics, and exchange rates.

Through the data of this period, this study aimed to determine the key variables that affect the price change of Bitcoin and more effectively predict the price of Bitcoin on the basis of previous studies. In this study, the data of the first 1315 periods are defined as the training set to fit the model, and the data of the last 200 periods are defined as the test set to evaluate the quality of the model. The test set is approximately 13.2% of the total data.

2.2.3. Feature Selection for Price Prediction of Bitcoin

We drew on the experience of previous studies and adaptively selects more external economic and financial factors of Bitcoin. The total number of influencing variables is as high as 27. The 27 variables were completely included in the prediction model, which may affect the prediction accuracy, so we used the XGBoost algorithm and random forest algorithm to screen the influencing variables. Previous studies have shown that both the XGBoost algorithm and random forest algorithm are effective feature screening models [25,27,48,49]. The comparison between the selected variable group and the original full variable group is shown in Section 4.

3. Price Prediction of Bitcoin

The main prediction models used in this study are support SVR, LSSVR, and TWSVR. Complete features (27), XGBoost features (11), and random forest features (6) were substituted into the prediction model; and the WOA and PSO were used for optimization to form a hybrid model, including data screening, optimization, and prediction. Finally, we used the expected variance score (EVS), coefficient of determination (

R^{2}

), mean absolute error (MAE), mean square error (MSE), root-mean-square error (RMSE) and mean absolute percentage error (MAPE) to measure the prediction effect of the hybrid model. At the same time, this study uses CPU time to measure the running rate of the hybrid model (e.g., [50,51]).

3.1. Main Forecasting Model

In this study, support vector regression and its variants were selected as the main prediction model of the experiment. SVR, LSSVR, and TWSVR have good generalization, high prediction accuracy, relatively simple calculation, and good robustness to outliers [33]. These models are applicable to the prediction of Bitcoin price at this time.

3.1.1. Support Vector Regression

SVR is a model that introduces the idea of a support vector machine into a regression problem [52]. Differently from the traditional regression model, which directly calculates the loss based on the difference between the model predicted value

f (x)

and the real value y, SVR assumes that it can tolerate the deviation of

ε

. It is equivalent to establishing an interval band with width

2 ε

. If the predicted value of the model is included in the interval band, then the prediction is considered accurate. The solution procedure of the SVR problem is similar to that of the SVM problem. The interval band must be maximized; that is,

{∥ω∥}^{- 1}

is maximized and

{∥ω∥}^{2}

is minimized. Thus, the SVR problem can be formulated as shown in Figure 5.

\begin{matrix} min \frac{1}{2} {∥ ω ∥}^{2}, \\ s . t . |y_{i} - (ω^{T} x_{i} + b)| \leq ε, \forall i . \end{matrix}

(3)

Introduce the insensitive loss function:

min_{ω, b} \frac{1}{2} {∥ ω ∥}^{2} + C \sum_{i = 1}^{m} l_{ε} (f (x_{i} - y_{i}),

(4)

where C is the regularization coefficient, which determines the importance of the loss function.

l_{ε}

is the insensitive loss function, and the specific form is:

\begin{matrix} l_{ε} (Z) = \{\begin{matrix} 0, i f | z | \leq ε; \\ | z | - ε, o t h e r w i s e . \end{matrix} \end{matrix}

(5)

Here, the inequality constraints of the original problem are not convex functions, and slack variables

\hat{ξ}

and

\overset{ˇ}{ξ}

are introduced

\begin{matrix} \begin{matrix} y_{i} - (ω^{T} + b) - ε \leq ξ_{i}, \\ \underset{l o w e r}{\underset{⏟}{- ε - ξ_{i} \leq ω_{i} + b - y_{i}}} \leq ε + ξ_{i}, \\ - ε - ξ_{i} \leq \underset{u p p e r}{\underset{⏟}{ω_{i} + b - y_{i} \leq ε + ξ_{i}}} . \end{matrix} \end{matrix}

(6)

\begin{matrix} \begin{matrix} s . t . f (x_{i}) - y_{i} \leq ε + \hat{ξ}, \\ y_{i} - f (x_{i}) \leq ε + \overset{ˇ}{ξ}, \\ \hat{ξ} \geq 0, \overset{ˇ}{ξ} \geq 0, \forall i . \end{matrix} \end{matrix}

(7)

Next, we use the Lagrange multiplier method. The Lagrange multipliers

\hat{μ} \geq 0

,

\overset{ˇ}{μ} \geq 0

,

\hat{α} \geq 0

, and

\overset{ˇ}{α} \geq 0

are introduced to obtain the Lagrange function £º:

\begin{matrix} L (ω, b, {\hat{μ}}_{i}, {\overset{˘}{μ}}_{i}, {\hat{α}}_{i}, {\overset{ˇ}{α}}_{i}, {\hat{ξ}}_{i}, {\overset{˘}{ξ}}_{i}) \\ = \frac{1}{2} {∥ ω ∥}^{2} + C \sum_{i = 1}^{m} ({\hat{ξ}}_{i} + {\overset{˘}{ξ}}_{i}) - \sum_{i = 1}^{m} {\overset{˘}{μ}}_{i} {\overset{˘}{ξ}}_{i} - \sum_{i = 1}^{m} {\hat{μ}}_{i} {\hat{ξ}}_{i} \\ + \sum_{i = 1}^{m} {\overset{˘}{α}}_{i} (f (x_{i}) - y_{i} - ε - {\overset{˘}{ξ}}_{i}) + \sum_{i = 1}^{m} {\hat{α}}_{i} (f (x_{i}) - y_{i} - ε - {\hat{ξ}}_{i}) . \end{matrix}

(8)

Take the partial derivatives separately and bring them into the original function to solve the KKT conditions:

\{\begin{matrix} {\overset{˘}{α}}_{i} (f (x_{i}) - y_{i} - ε - {\overset{˘}{ξ}}_{i}) = 0, \\ {\hat{α}}_{i} (f (x_{i}) - y_{i} - ε - {\hat{ξ}}_{i}) = 0, \\ {\hat{α}}_{i} {\overset{˘}{α}}_{i} = 0, \hat{ξ_{i}} {\overset{˘}{ξ}}_{i} = 0, \\ (C - {\overset{˘}{α}}_{i}) {\overset{˘}{ξ}}_{i} = 0, (C - {\hat{α}}_{i}) {\hat{ξ}}_{i} = 0 . \end{matrix}

(9)

The final support vector regression model is as follows:

f_{S V R} (x) = ω^{T} x + b,

(10)

where

\begin{matrix} ω = \sum_{i = 1}^{m} ({\hat{α}}_{i} - {\overset{˘}{α}}_{i}) x_{i}, \\ b = y_{i} - \sum_{j = 1}^{m} ({\hat{α}}_{j} - {\overset{˘}{α}}_{j}) x_{j}^{T} x_{i} . \end{matrix}

(11)

3.1.2. Least-Squares Support Vector Regression

LSSVR is a fusion algorithm of SVR and ordinary least squares regression [31]. While inheriting the advantages of SVR, LSSVR creatively replaces the insensitive loss function

l_{ε}

with the error two-norm and replaces the inequality constraint with the equality constraint. The original convex quadratic programming problem is transformed into a linear equation system, which reduces the complexity of the algorithm and improves the running speed of the algorithm. However, while simplifying, LSSVR gives up the robustness of the SVR model for outliers and reduces the generalization ability. The original problem for LSSVR is

\begin{matrix} min_{ω, b} \frac{1}{2} {∥ ω ∥}^{2} + \frac{1}{2} γ \sum_{i = 1}^{n} ε_{i}^{2}, \\ s . t . y_{i} = ω^{T} ϕ (x_{i}) + b + ε_{i} . \end{matrix}

(12)

The Lagrangian function of least-squares support vector regression is £º

\begin{matrix} L (ω, b, ε_{i}, α_{i}) = \frac{1}{2} {∥ ω ∥}^{2} + \frac{1}{2} γ \sum_{i = 1}^{n} ε_{i}^{2} - \sum_{i = 1}^{n} α_{i} (ω^{T} ϕ (x_{i}) + b + ε_{i} - y_{i}), \\ [\begin{matrix} I & 0 & 0 & - ϕ \\ 0 & 0 & 0 & 1^{T} \\ 0 & 0 & γ I & - I \\ ϕ & 1 & I & 0 \end{matrix}] \cdot [\begin{matrix} ω \\ b \\ ε_{i} \\ α_{i} \end{matrix}] = [\begin{matrix} 0 \\ 0 \\ 0 \\ y \end{matrix}], \\ [\begin{matrix} 0 & 1^{T} \\ 1 & k (x, x^{T}) + γ^{- 1} I \end{matrix}] [\begin{matrix} b \\ α_{i} \end{matrix}] = [\begin{matrix} 0 \\ y \end{matrix}], \end{matrix}

(13)

where

k (x_{i}, x_{j}^{T})

is the kernel function, and the specific expression is as follows:

k (x_{i}, x_{j}^{T}) = φ {(x_{i})}^{T} φ (x_{j}) .

(14)

The final least-squares support vector regression model is:

f_{L S S V R} (x) = \sum_{i = 1}^{n} α_{i} k (x_{i}, x_{j}^{T}) + b .

(15)

Twin support vector regression is shown in Figure 6.

On the basis of SVR, TWSVR creatively transformed the original hyperplane into two nonparallel hyperplanes and constrained the upper and lower hyperplanes of [32]. The final prediction result is the average of the prediction results of the two hyperplanes. Figure 6 shows that the TWSVR algorithm decomposes an original quadratic programming problem into two smaller quadratic programming problems. The computational speed and generalization ability of TWSVR were effectively improved. Experimental results show that the computational efficiency of TWSVR is four times that of SVR. The original problem for TWSVR is as follows:

\begin{matrix} (TWSVR 1) & min_{ω_{1}, b_{1}, ξ_{1}} C_{1} e^{T} ξ_{1} + \frac{C_{3}}{2} ({∥ω_{1}∥}_{2} + b_{1}^{2}) + \frac{1}{2} ∥ (Y + e ε_{2} - (A ω_{1} + e b_{1}) ∥_{2} \\ subject to (A ω_{1} + e b_{1}) - (Y - e ε_{1}) \leq ξ_{1}, \\ ξ_{1} \geq 0 . \\ (TWSVR 2) & min_{ω_{2}, b_{2}, ξ_{2}} C_{2} e^{T} ξ_{2} + \frac{C_{4}}{2} ({∥ω_{2}∥}_{2} + b_{2}^{2}) + \frac{1}{2} ∥ (Y + e ε_{1} - (A ω_{2} + e b_{2}) ∥_{2} \\ subject to (Y - e ε_{2}) - (A ω_{2} + e b_{2}) \leq ξ_{2}, \\ ξ_{2} \geq 0 . \end{matrix}

(16)

The Lagrangian function is as follows:

\begin{matrix} L (ω_{1}, b_{1}, ξ_{1}, α, β) = C_{1} e^{T} ξ_{1} + α^{T} ((A ω_{1} + e b_{1}) - (Y - e ε_{1}) - ξ_{1}) \\ + \frac{1}{2} ({(Y + e ε_{2} - (A ω_{1} + e b_{1}))}^{T} \times (Y + e ε_{2} - (A ω_{1} + e b_{1}))) \\ + \frac{C_{3}}{2} (ω_{1}^{T} ω_{1} + b_{1}^{2}) - β^{T} ξ_{1} . \end{matrix}

(17)

The KKT conditions are as follows:

\{\begin{matrix} C_{3} ω_{1} - A^{T} (Y + e ε_{2} - (A ω_{1} + e b_{1})) + A^{T} α & = 0, \\ C_{3} b_{1} - e^{T} (Y + e ε_{2} - (A ω_{1} + e b_{1})) + e^{T} α & = 0, \\ C_{1} e - α - β & = 0 \\ (A ω_{1} + e b_{1} - (Y - e ε_{1})) \leq ξ_{1}, ξ_{1} & \geq 0, \\ α^{T} ((A ω_{1} + e b_{1}) - (Y - e ε_{1}) - ξ_{1})) = 0, β^{T} ξ_{1} & = 0, \\ α \geq 0, β & \geq 0 . \end{matrix}

(18)

The final twin support vector regression model is as follows:

\begin{matrix} f_{T W S V R} = \frac{1}{2} ({(ω_{1} + ω_{2})}^{T} x + (b_{1} + b_{2}) . \end{matrix}

(19)

3.1.3. Optimization Model

In this study, two optimization algorithms, namely, WOA and PSO, were selected to optimize the prediction. Both algorithms are fast and effective community search algorithms and have solutions for falling into local optima, which are suitable for this study.

(1): Whale optimization algorithm

Algorithm 1 General framework of the WOA algorithm.

Require: Fitness function, Whale swarm

Ω

.
Ensure: Global optimal solution.

1:: Initialize the parameters.
2:: Initialize the whale position
3:: Calculate the fitness value of each whale.
4:: while termination condition does not satisfy do
5:: for $i = 1$ to $| Ω |$ do
6:: The “best and closest” whale Y sought.
7:: if Y exists then
8:: Move according to the formula under the guidance of Y.
9:: Review $Ω_{i}$ .
10:: end if
11:: end for
12:: end while
13:: Return the global optimal solution.

When the whales are hunting, each whale behaves in two ways. During each iteration, each whale randomly chooses one of two behaviors to hunt. One behavior is where all whales move toward each other to surround prey. Another behavior is when whales swim in circles and blow bubbles to repel prey. Each whale follows two rules of behavior until the best solution is found. The rules for both behaviors are as follows.

①: Surrounding prey behavior

The search scope of the whale algorithm is the global solution space. Since the position of the optimal solution in the search range is not known a priori, the WOA assumes that the current best candidate solution is the target prey or near-optimal solution. After defining the best search agent, other search agents will attempt to update their locations to that of the best search agent. This phase is defined as surrounding prey behavior.

\begin{matrix} A (t) = 2 r_{1} a (t) - a (t), \\ C (t) = 2 r_{2}, \\ a (t) = 2 - 2 \frac{t}{t_{max}}, \end{matrix}

(20)

where a is the convergence factor. As the number of iterations decreases linearly from 2 to 0, the random number between [0, 1] is taken modulo of

r_{1}

and

r_{2}

. With respect to the hunt, the behavior of whales rounding up prey is defined as follows:

\begin{matrix} \vec{D} (t) = |C (t) \cdot \vec{X_{p}} (t) - \vec{X} (t)|, \\ \vec{X} (t + 1) = \vec{X_{p}} (t) - A \cdot \vec{D} (t) . \end{matrix}

(21)

Formula (20) represents the distance between the individual and the prey, and Formula (21) is the position-updating formula of the whale, where t is the current iteration algebra; a and c are the coefficient vectors; and

X_{p}

and X are the position vectors of the prey and gray wolf, respectively. The surrounding prey behavior of the schematic diagram of the whale optimization algorithm is shown in Figure 7a.

②: Spiral bubble hunting behavior

Another hunting behavior of whales is the creation of a spiral bubble net. The individual calculates the distance between himself and the best individual in the current population, and then humpback whales pump out bubbles as they spiral to seal off prey. The position update between the humpback whale and prey is expressed by the logarithmic spiral equation.

\begin{matrix} \vec{D^{'}} (t) = |\vec{X_{p}} (t) - \vec{X} (t)|, \\ \vec{X} (t + 1) = {\vec{D}}^{'} (t) \cdot e^{b l} \cdot cos (2 π l) + \vec{X_{p}} (t), \end{matrix}

(22)

where b is the constant that defines the shape of a logarithmic spiral and l is a random number in [−1, 1]. The spiral-bubble hunting behavior of schematic diagram of the whale optimization algorithm is shown in Figure 7b.

(2): Particle swarm optimization algorithm

The next direction and position of each particle is indicated by its velocity (v). The velocity of particles is affected in the following three ways: the motion state of the particle itself, the direction and distance between the particle and the optimal solution, and the position information of other particles. The velocity iteration formula is:

v_{i d}^{k + 1} = ω v_{i d}^{k} + c_{1} r_{1} (p_{i d, p b e s t}^{k} - x_{i d}^{k}) + c_{2} r_{2} (p_{d, g b e s t}^{k} - x_{i d}^{k}),

(23)

where k is the number of iterations,

ω

is the inertia weight,

c_{1}

is the individual learning factor, and

c_{2}

is the group learning factor.

r_{1}

and

r_{2}

are random numbers for the search between [0, 1].

v_{i d}^{k}

is the velocity vector of particle i in dimension d in the KTH iteration, and

x_{i d}^{k}

is the position vector of particle i in dimension d in the KTH iteration.

p_{i d, p b e s t}^{k}

represents the historical optimal position of particle i in dimension d of the KTH iteration, and

p_{d, g b e s t}^{k}

represents the historical optimal position of the population in dimension d of the KTH iteration.

4. Results and Discussion

4.1. Experiment Settings

To evaluate the accuracy and reliability of the prediction of each model in this study, EVS,

R^{2}

, MAE, MSE, RMSE, and MAPE were used to measure the relationship between the actual Bitcoin price and the predicted Bitcoin price. At the same time, the CPU time was used to measure the running speed of each mixed model. The six measures of accuracy are defined as follows:

\begin{matrix} EVS = 1 - \frac{Var (y - \hat{y})}{Var (y)}, \\ R^{2} = 1 - \frac{S S E}{S S T}, \\ M A E = \frac{1}{n} \sum_{i = 1}^{n} |{\hat{y}}_{i} - y_{i}|, \\ M S E = \frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}, \\ RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}, \\ M A P E = \frac{100 %}{n} \sum_{i = 1}^{n} |\frac{{\hat{y}}_{i} - y_{i}}{y_{i}}|, \end{matrix}

(24)

where

\hat{y_{i}}

is the predicted value,

y_{i}

is the real price of Bitcoin at time ith, and n is the number of predictions. EVS and

R^{2}

are the numbers of [0, 1], and the larger the value is, the better the prediction effect of the model. MAE, MSE, RMSE, and MAPE are all methods used to calculate the prediction error, and the smaller the value is, the better the prediction effect of the model. To remove the error of a single experiment, the CPU running time is exactly the 100th time. The computer configuration used in the experiment was as follows. Processor: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz 2.80 GHz. Belt: RAM 8.00 GB (7.86 GB). Computational speed was limited by Python 3.9.

4.2. Feature Selection

In this study, the XGBoost algorithm and random forest algorithm are used to extract the main factors affecting the price fluctuation of Bitcoin. Both the XGBoost algorithm and the random forest algorithm are effective machine learning methods for variable extraction. The accuracy and scientificity of extracting variables are improved by comparing the two methods. The main factors extracted by the XGBoost algorithm and random forest algorithm replace the original factors to make the prediction of the model more accurate by subsequent modeling and analysis. The screening results of the two models are represented by a bar chart and nightingale rose chart, in which the column length of the bar chart and the size of the rose petal in the rose chart both represent the importance of each variable. The bar chart better shows the difference in the importance of the same variable in different models, as shown in Figure 8. The nightingale rose chart better shows the difference in importance of different variables in the same model in Figure 9.

(a): XGBoost screened out six important factors in Figure 8. The S&P 500 index is the most important. Moreover, the ETH, LTC, MD, F&G, and GOLD indices are also very important. The six important factors are the supply factor of Bitcoin, the demand factor of Bitcoin, the price factor of other cryptocurrencies, the financial-indicator factor, and the macroeconomic factor, but the exchange rate factor was not included. Eleven important factors were screened by the random forest algorithm, focusing on internal cryptocurrency factors and U.S. stock market factors. The importance of the S&P500 indices screened by the random forest algorithm is significantly higher than that of the other indices. These are the BCH, DJIA, ETH, F&G, GOLD, HSCEI, LTC, MD, NASDAQ, and WTI indicators. Compared with the important indicators extracted by the two methods, the indicators extracted by the XGBoost algorithm are more important and representative. The eleven important factors screened by the random forest algorithm include six important factors screened by the XGBoost algorithm.
(b): From the perspective of economics, the internal factors of cryptocurrency and the U.S. stock market are the most important factors affecting the price of Bitcoin. The internal factors of cryptocurrency directly affect the price of Bitcoin from the perspective of supply and demand. Bitcoin is also popular in Western countries, so its volatile price is indirectly influenced by the U.S. stock market.

4.3. Combined Model Prediction

First, three models, SVR, LSSVR, and TWSVR, were selected to forecast the Bitcoin price, and the ARIMA model was selected as the benchmark model. At the same time, the prediction model and feature screening model were combined to obtain the best prediction effect. We assume that the regularization factor satisfies the conditions

c_{1} = c_{2} = c_{3} = c_{4}

, and the RBF is selected as the kernel function in the prediction model.

Second, the parameters of the combined model were debugged by the WOA and PSO, and the prediction index EVS of the combined model was taken as the objective function of the optimization algorithm. Then, six combined prediction models (WOA-SVR, WOA-LSSVR, WOA-TWSVR, PSO-SVR, PSO-LSSVR, and PSO-TWSVR) were obtained.

Finally, we used EVS,

R^{2}

, MAE, MSE, RMSE, MAPE, and the CPU running time to evaluate and compare the effects of each combination model, as shown in Table 1.

It can be seen in Table 1 that the prediction effect of the combined prediction model under the condition of “XGBoost feature” was better than those of the “full feature” and “random forest feature”. In this study, six variables selected by the XGBoost algorithm, S&P 500, ETH, F&G, GOLD, LTC, and MD, were taken as the final variables of SVR and its variant model. In addition, the TWSVR model has a better prediction effect than the other two models according to EVS,

R^{2}

, MAE, MSE, RMSE, and MAPE. As shown in Figure 10, the column height of the histogram represents the EVS index size of each combination model. The blue columns represent the full features, the red columns represent the random forest features, and the green columns represent the XGBoost features.

Other than PSO-TWSVR, other hybrid models based on the “XGBoost feature” were better than those based on the “full feature” and “random forest feature”. In the SVR, the prediction effect of the model based on “random forest characteristics” was worse than that of the model based on “full characteristics”. With both LSSVR and TWSVR, the predictions of the models based on “random forest characteristics” were better than those of the models based on “full characteristics”. In addition, among all models, PSO-TWSVR and WOA-TWSVR exhibited the best prediction effect. The optimal EVS value of the WOA-SVR model was up to 0.9547, and that of the PSO-SVR model was up to 0.9491.

Among the three prediction models, LSSVR runs the fastest, followed by TWSVR. The slowest is the SVR model. When “full feature” was adopted, in Table 1, we can see that the running time of TWSVR was improved from the 2.5631–2.6417 s of the SVR model to 0.6736 s by parameter tuning of the WOA and PSO, and that the running time was reduced by 1.8895 to 1.9681 s. The running times of the SVR model were 2.5631 and 2.6417 s and the running times of LSSVR were reduced to 0.5268 and 0.5416 s. When the “random forest feature” was adopted, the running time of TWSVR was reduced by 1.4558 or 1.4443 s through parameter tuning of the WOA and PSO. The LSSVR runtime was reduced by 1.563 and 1.5648 s. When using the “XGBoost feature”, the TWSVR running time is reduced by 1.8181 and 1.7801 s through parameter tuning of the WOA and PSO. The LSSVR runtime is reduced by 1.945 and 1.7801 s.

In conclusion, the TWSVR model is superior to the SVR model in both prediction effect and calculation speed, which proves the superiority of this variant model. TWSVR is slightly slower than the LSSVR, but its prediction is much better.

Figure 11 shows the prediction results of three prediction models: SVR, LSSVR, and TWSVR. The red solid line represents the true Bitcoin price, the purple dashed line represents the SVR model’s prediction, the deep pink dotted line represents the LSSVR model’s prediction, and the green dot-dash line represents the TWSVR model’s prediction. The optimal combination model of three prediction curves was selected, that is, the condition of XGBoost feature and parameter optimization with WOA. Through the image in Figure 11, we can clearly see the difference between the predicted results and the real values. The TWSVR model’s predictions were closer to the real prices of Bitcoin. The four curves showed slight differences in the first 70 stages and great differences in the 85–95, 120–125, 140–165, and 180–200 stages. The reason for this is that the SVR and its variant models do not fit the data with large fluctuations well.

Figure 12 shows the fitness curves of the SVR, LSSVR, and TWSVR models using the WOA and PSO for parameter tuning. The objective function of the optimization is EVS. The WOA and PSO are both excellent optimization algorithms, and both optimize the model within 100 iterations. From the image in Figure 12, the WOA is faster than the PSO. Compared with PSO, the WOA more easily jumps out of the local optimum and finds the overall optimum. In this study, the WOA is superior to PSO with respect to the parameter optimization of the prediction model.

Figure 13 shows the EVS of the SVR and its variants, as the explained variance score (EVS) varies with the regularization factor (C) and kernel RBF bandwidth (

γ

) based on XGBoost variable 6. Since the variation in the fitting effect of the model is more sensitive when the parameter range is [0, 1], we chose the selected range to be

2^{- 9}

to

2^{10}

. The z-axis coordinate and surface color represent the fitting effect of the model, and the larger the value is, and the brighter the color is, the better the fitting effect of the model. In other words, the TWSVR model has a better fitting effect than the SVR and LSSVR models. In Figure 13a,c, the SVR model and the TWSVR model are more affected by the regularization factor than the kernel RBF bandwidth. The fitting effect of the SVR model is at a higher level when the regularization factor is less than

2^{- 1}

and reaches a peak when the regularization factor is equal to

2^{- 5}

. When the regularization factor is greater than

2^{- 1}

, the fitting effect of the SVR model is low, even reaching zero. The EVS values of the TWSVR models were all greater than 0.4, and the overall fitting effect was better. The EVS value reached its peak when the regularization factor equals

2^{1}

and the EVS value is higher than 0.95. The fitting effect of the LSSVR model is greatly affected by the regularization factor and kernel RBF bandwidth. In Figure 13c, the effect of hills with high middle and low four corners is shown. The EVS value of LSSVR reaches the peak value when the regularization factor is equal to

2^{- 1}

and the regularization factor is equal to

2^{- 1}

, and the peak value is 0.90.

5. Conclusions

Bitcoin has been the most successful and popular cryptocurrency in recent years. In this study, XGBoost and random forest algorithms were used to screen the influencing factors. Six important factor groups were obtained via XGBoost, and eleven important factor groups were obtained via random forest. The main prediction models used in this study were SVR, LSSVR, and TWSVR. Complete features (27), XGBoost features (11), and random forest features (6) were substituted into the prediction model; and the WOA and PSO were used to optimize and establish a mixed model including data screening, optimization, and prediction. Finally, we used EVS,

R^{2}

, MAE, MSE, RMSE, and MAPE to measure the prediction effect of the mixed model. Meanwhile, the CPU time was used to measure the running speed of the hybrid model. The following conclusions were reached:

The combined model, XGBoost-WOA-TWSVR, demonstrated the best prediction effect, and the EVS score reached 0.9547. Our model can effectively predict the price of Bitcoin. In terms of computing speed, TWSVR’s computing speed was not the fastest, but it was not significantly different from the fastest LSSVR model. Through feature screening of the XGBoost and random forest algorithms and parameter tuning of the WOA and PSO, the accuracy of the model was significantly improved.

SVR, LSSVR, and TWSVR were all effective prediction algorithms with significantly improved accuracy and speed compared to the benchmark model. During the Bitcoin price prediction experiment, we found that LSSVR is greatly improved in computing speed compared with the traditional SVR model, but it is somewhat less accurate. Compared with the SVR model, TWSVR improved not only the calculation speed of the model, but also accuracy, and it was approximately four times faster than SVR.

Author Contributions

Conceptualization, J.M. and Y.Z. (Yingjie Zhu); methodology, J.M.; software, J.M.; validation, J.M., Y.Z. (Yingjie Zhu), and F.G.; formal analysis, J.W.; investigation, Z.L.; resources, Y.Z. (Youyao Zhang); data curation, J.X.; writing—original draft preparation, J.M.; writing—review and editing, Y.L.; visualization, Y.W.; supervision, X.Y.; project administration, F.G.; funding acquisition, Y.Z. (Yingjie Zhu). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China (NO. 41701054), in part by the Ministry of Education China University Industry University Research Project (NO. 2021ALA03004), in part by the Education Science of the 14th Five-Year Plan Project of Jilin Province (ZD21057), and in part by the Department of Education Project (NO. JJKH20200556KJ).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Bitcoin and other encryption currency price data from https://history.btc123.fans/ (accessed on 4 January 2023). Data on Bitcoin supply factors and Bitcoin demand factors are available from https://alternative.me/ (accessed on 4 January 2023). The financial-indicator factors data from https://www.10jqka.com.cn/ (accessed on 4 January 2023). Macroeconomic factors and the exchange rate data from https://fred.stlouisfed.org/ (accessed on 4 January 2023).

Acknowledgments

The authors would like to thank the Northeast Institute of Geography and Agroecology Chinese Academy of Sciences and University of Ottawa for providing standard test sets of points and answering our questions regarding the algorithm. They are also deeply grateful to Changchun Song and F. Lustcher for their enthusiasm regarding the initial results and for providing the software and bibliography.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SVR	Support Vector Regression
LSSVR	Least-Square Support Vector Regression
TWSVR	Twin Support Vector Regression
WOA	Whale Optimization Algorithm
PSO	Particle Swarm Optimization Algorithm
ARIMA	Autoregressive comprehensive moving average
SVM	Support Vector Machine
RF	Random Forest
ANN	Artificial Neural Network
LSTM	Long- and Short-Term Memory Neural Network
ANFIS	Adaptive Network Fuzzy Inference System model
MD	Mining Difficulty
F&G	Crypto Fear and Greed Index
ETH	Ethereum Price
LTC	Litecoin Price
BCH	Bitcoin Cash Price
USDT	USDT Price Index
DJIA	Dow Jones Industrial Average Index
NASDAQ	National Association of Securities Dealers Automated Quotations Index
S&P 500	Standard & Poor’s 500 Index
VIXCLS	CBOE Volatility Index
FTSE 100	Financial Times Stock Exchange 100 Index
HSCEI	Hang Seng China Enterprises Index
HSI	Hang Seng Index
AHP	A/H share premium index
SSEC	Shanghai Securities The Composite Index
GOLD	Gold Price
SILVER	Silver Price
WTI	West Texas Intermediate
EFFR	Effective Federal Funds Rate
T10YIE	10-Year Breakeven Inflation Rate
AUD∖USD	Australian Dollar to US Dollar Exchange Rate
EUR∖USD	European Dollar to US Dollar Exchange Rat
GBP∖USD	Great Britain Pound to US Dollar Exchange Rat
100JPY∖USD	100 Japanese Yen to US Dollar Exchange Rat
CHY∖USD	Swiss Franc to US Dolla Exchange Ratx
CAD∖USD	Canadian dollar to US Dolla Exchange Rat
EVS	Expected Variance Score
$R^{2}$	Coefficient of Determination
MSE	Mean Square Error
RMSE	Root Mean Square Erro
MAPE	Mean Absolute Percentage Error

References

Shah, D.; Zhang, K. Bayesian regression and Bitcoin. In Proceedings of the 2014 52nd annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 30 September–3 October 2014; pp. 409–414. [Google Scholar] [CrossRef] [Green Version]
Jang, H.; Lee, J. An empirical study on modeling and prediction of bitcoin prices with bayesian neural networks based on blockchain information. IEEE Access 2018, 6, 5427–5437. [Google Scholar] [CrossRef]
Khedr, A.M.; Arif, I.; El-Bannany, M.; Alhashmi, S.M.; Sreedharan, M. Cryptocurrency price prediction using traditional statistical and machine-learning techniques: A survey. Intell. Syst. Account. Financ. Manag. 2021, 28, 3–34. [Google Scholar] [CrossRef]
Andi, H.K. An accurate bitcoin price prediction using logistic regression with LSTM machine learning model. J. Soft Comput. Paradig. 2021, 3, 205–217. [Google Scholar] [CrossRef]
Liu, X.; Hu, Z.; Ling, H.; Cheung, Y.M. MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 964–981. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Ali, M.; Shatabda, S. A data selection methodology to train linear regression model to predict bitcoin price. In Proceedings of the 2020 2nd International Conference on Advanced Information and Communication Technology (ICAICT), Dhaka, Bangladesh, 28–29 November 2020; pp. 330–335. [Google Scholar] [CrossRef]
Poongodi, M.; Vijayakumar, V.; Chilamkurti, N. Bitcoin price prediction using ARIMA model. Int. J. Internet Technol. Secur. Trans. 2020, 10, 396–406. [Google Scholar] [CrossRef]
Abu Bakar, N.; Rosbi, S. Autoregressive integrated moving average (ARIMA) model for forecasting cryptocurrency exchange rate in high volatility environment: A new insight of bitcoin transaction. Int. J. Adv. Eng. Res. Sci. 2017, 4, 130–137. [Google Scholar] [CrossRef]
Erfanian, S.; Zhou, Y.; Razzaq, A.; Abbas, A.; Safeer, A.A.; Li, T. Predicting Bitcoin (BTC) Price in the Context of Economic Theories: A Machine Learning Approach. Entropy 2022, 24, 1487. [Google Scholar] [CrossRef]
Derbentsev, V.; Babenko, V.; Khrustalev, K.; Obruch, H.; Khrustalova, S. Comparative performance of machine learning ensemble algorithms for forecasting cryptocurrency prices. Int. J. Eng. 2021, 34, 140–148. [Google Scholar] [CrossRef]
Charandabi, S.E.; Kamyar, K. Prediction of cryptocurrency price index using artificial neural networks: A survey of the literature. Eur. J. Bus. Manag. Res. 2021, 6, 17–20. [Google Scholar] [CrossRef]
Ho, A.; Vatambeti, R.; Ravichandran, S. Bitcoin Price Prediction Using Machine Learning and Artificial Neural Network Model. Indian J. Sci. Technol. 2021, 14, 2300–2308. [Google Scholar] [CrossRef]
Pour, E.S.; Jafari, H.; Lashgari, A.; Rabiee, E.; Ahmadisharaf, A. Cryptocurrency price prediction with neural networks of LSTM and Bayesian optimization. Eur. J. Bus. Manag. Res. 2022, 7, 20–27. [Google Scholar] [CrossRef]
Wu, C.H.; Lu, C.C.; Ma, Y.F.; Lu, R.S. A new forecasting framework for bitcoin price with LSTM. In Proceedings of the 2018 IEEE International Conference on Data Mining Workshops, Singapore, 17–20 November 2018; pp. 168–175. [Google Scholar] [CrossRef]
Ye, Z.; Wu, Y.; Chen, H.; Pan, Y.; Jiang, Q. A stacking ensemble deep learning model for Bitcoin price prediction using Twitter comments on Bitcoin. Mathematics 2022, 10, 1307. [Google Scholar] [CrossRef]
Chen, Z.; Li, C.; Sun, W. Bitcoin price prediction using machine learning: An approach to sample dimension engineering. J. Comput. Appl. Math. 2020, 365, 112395. [Google Scholar] [CrossRef]
Mudassir, M.; Bennbaia, S.; Unal, D.; Hammoudeh, M. Time-series forecasting of Bitcoin prices using high-dimensional features: A machine learning approach. Neural Comput. Appl. 2020, 1–15. [Google Scholar] [CrossRef]
Mallqui, D.C.; Fernandes, R.A. Predicting the direction, maximum, minimum and closing prices of daily Bitcoin exchange rate using machine learning techniques. Appl. Soft Comput. 2019, 75, 596–606. [Google Scholar] [CrossRef]
Zhu, Y.; Dickinson, D.; Li, J. Analysis on the influence factors of Bitcoin’s price based on VEC model. Financ. Innov. 2017, 3, 37–49. [Google Scholar] [CrossRef] [Green Version]
Aggarwal, A.; Gupta, I.; Garg, N.; Goel, A. Deep learning approach to determine the impact of socio economic factors on bitcoin price prediction. In Proceedings of the 2019 Twelfth International Conference on Contemporary Computing (IC3), Noida, India, 8–10 August 2019; pp. 1–5. [Google Scholar] [CrossRef]
Ciaian, P.; Rajcaniova, M.; Kancs, D. The economics of BitCoin price formation. Appl. Econ. 2016, 48, 1799–1815. [Google Scholar] [CrossRef] [Green Version]
Kim, Y.B.; Kim, J.G.; Kim, W.; Im, J.H.; Kim, T.H.; Kang, S.J.; Kim, C.H. Predicting fluctuations in cryptocurrency transactions based on user comments and replies. PLoS ONE 2016, 11, e0161197. [Google Scholar] [CrossRef] [Green Version]
Chen, W.; Xu, H.; Jia, L.; Gao, Y. Machine learning model for Bitcoin exchange rate prediction using economic and technology determinants. Int. J. Forecast. 2021, 37, 28–43. [Google Scholar] [CrossRef]
Ma, B.; Yan, G.; Chai, B.; Hou, X. XGBLC: An improved survival prediction model based on XGBoost. Bioinformatics 2022, 38, 410–418. [Google Scholar] [CrossRef]
Srinivas, P.; Katarya, R. hyOPTXg: OPTUNA hyper-parameter optimization framework for predicting cardiovascular disease using XGBoost. Biomed. Signal Process. Control 2022, 73, 103456. [Google Scholar] [CrossRef]
Huo, W.; Li, W.; Zhang, Z.; Sun, C.; Zhou, F.; Gong, G. Performance prediction of proton-exchange membrane fuel cell based on convolutional neural network and random forest feature selection. Energy Convers. Manag. 2021, 243, 114367. [Google Scholar] [CrossRef]
Saraswat, M.; Arya, K. Feature selection and classification of leukocytes using random forest. Med. Biol. Eng. Comput. 2014, 52, 1041–1052. [Google Scholar] [CrossRef]
Wu, Y.; Zhang, Q.; Hu, Y.; Sun-Woo, K.; Zhang, X.; Zhu, H.; Li, S. Novel binary logistic regression model based on feature transformation of XGBoost for type 2 Diabetes Mellitus prediction in healthcare systems. Future Gener. Comput. Syst. 2022, 129, 1–12. [Google Scholar] [CrossRef]
Zhang, X.; Yan, C.; Gao, C.; Malin, B.A.; Chen, Y. Predicting missing values in medical data via XGBoost regression. J. Healthc. Inform. Res. 2020, 4, 383–394. [Google Scholar] [CrossRef]
Zhang, W.; Wu, C.; Li, Y.; Wang, L.; Samui, P. Assessment of pile drivability using random forest regression and multivariate adaptive regression splines. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2021, 15, 27–40. [Google Scholar] [CrossRef]
Xie, G.; Wang, S.; Zhao, Y.; Lai, K.K. Hybrid approaches based on LSSVR model for container throughput forecasting: A comparative study. Appl. Soft Comput. 2013, 13, 2232–2241. [Google Scholar] [CrossRef]
Khemchandani, R.; Goyal, K.; Chandra, S. TWSVR: Regression via twin support vector machine. Neural Netw. 2016, 74, 14–21. [Google Scholar] [CrossRef]
Peng, X. TSVR: An efficient twin support vector machine for regression. Neural Netw. 2010, 23, 365–372. [Google Scholar] [CrossRef]
Erkan, U.; Toktas, A.; Ustun, D. Hyperparameter optimization of deep CNN classifier for plant species identification using artificial bee colony algorithm. J. Ambient. Intell. Humaniz. Comput. 2022, 1–12. [Google Scholar] [CrossRef]
Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
Bischl, B.; Binder, M.; Lang, M.; Pielok, T.; Richter, J.; Coors, S.; Thomas, J.; Ullmann, T.; Becker, M.; Boulesteix, A.L.; et al. Hyperparameter optimization: Foundations, algorithms, best practices and open challenges. arXiv 2021, arXiv:2107.05847. [Google Scholar] [CrossRef]
Qiu, Y.; Zhou, J.; Khandelwal, M.; Yang, H.; Yang, P.; Li, C. Performance evaluation of hybrid WOA-XGBoost, GWO-XGBoost and BO-XGBoost models to predict blast-induced ground vibration. Eng. Comput. 2022, 38, 4145–4162. [Google Scholar] [CrossRef]
Hitam, N.A.; Ismail, A.R.; Samsudin, R.; Ameerbakhsh, O. The Influence of Sentiments in Digital Currency Prediction Using Hybrid Sentiment-based Support Vector Machine with Whale Optimization Algorithm (SVMWOA). In Proceedings of the 2021 International Congress of Advanced Technology and Engineering (ICOTEN), Taiz, Yemen, 4–5 July 2021; pp. 1–7. [Google Scholar] [CrossRef]
Wu, K.; Zhu, Y.; Shao, D.; Wang, X.; Ye, C. A Method of Trading Strategies for Bitcoin and Gold. In Proceedings of the 2022 International Conference on Artificial Intelligence, Internet and Digital Economy (ICAID 2022), Xi’an, China, 15–17 April 2022; Atlantis Press: Amsterdam, The Netherlands, 2022; pp. 186–197. [Google Scholar] [CrossRef]
Chen, X.; Cheng, L.; Liu, C.; Liu, Q.; Liu, J.; Mao, Y.; Murphy, J. A WOA-based optimization approach for task scheduling in cloud computing systems. IEEE Syst. J. 2020, 14, 3117–3128. [Google Scholar] [CrossRef]
Balyan, A.K.; Ahuja, S.; Lilhore, U.K.; Sharma, S.K.; Manoharan, P.; Algarni, A.D.; Elmannai, H.; Raahemifar, K. A hybrid intrusion detection model using ega-pso and improved random forest method. Sensors 2022, 22, 5986. [Google Scholar] [CrossRef]
Xing, Z.; Zhu, J.; Zhang, Z.; Qin, Y.; Jia, L. Energy consumption optimization of tramway operation based on improved PSO algorithm. Energy 2022, 258, 124848. [Google Scholar] [CrossRef]
Huang, C.L.; Dun, J.F. A distributed PSO–SVM hybrid system with feature selection and parameter optimization. Appl. Soft Comput. 2008, 8, 1381–1391. [Google Scholar] [CrossRef]
Cuong-Le, T.; Nghia-Nguyen, T.; Khatir, S.; Trong-Nguyen, P.; Mirjalili, S.; Nguyen, K.D. An efficient approach for damage identification based on improved machine learning using PSO-SVM. Eng. Comput. 2022, 38, 3069–3084. [Google Scholar] [CrossRef]
Chtita, S.; Motahhir, S.; El Hammoumi, A.; Chouder, A.; Benyoucef, A.S.; El Ghzizal, A.; Derouich, A.; Abouhawwash, M.; Askar, S. A novel hybrid GWO–PSO-based maximum power point tracking for photovoltaic systems operating under partial shading conditions. Sci. Rep. 2022, 12, 10637. [Google Scholar] [CrossRef]
Vehtari, A.; Gelman, A.; Simpson, D.; Carpenter, B.; Bürkner, P.C. Rank-normalization, folding, and localization: An improved $\hat{R}$ for assessing convergence of MCMC (with discussion). Bayesian Anal. 2021, 16, 667–718. [Google Scholar] [CrossRef]
Petetin, H.; Bowdalo, D.; Soret, A.; Guevara, M.; Jorba, O.; Serradell, K.; Pérez García-Pando, C. Meteorology-normalized impact of the COVID-19 lockdown upon NO 2 pollution in Spain. Atmos. Chem. Phys. 2020, 20, 11119–11141. [Google Scholar] [CrossRef]
Wang, Y.; Ni, X.S. A XGBoost risk model via feature selection and Bayesian hyper-parameter optimization. arXiv 2019, arXiv:1901.08433. [Google Scholar] [CrossRef]
Kursa, M.B.; Rudnicki, W.R. Feature selection with the Boruta package. J. Stat. Softw. 2010, 36, 1–13. [Google Scholar] [CrossRef] [Green Version]
Prayudani, S.; Hizriadi, A.; Lase, Y.; Fatmi, Y. Analysis accuracy of forecasting measurement technique on random K-nearest neighbor (RKNN) using MAPE and MSE. In Proceedings of the 1st International Conference of SNIKOM 2018, Medan, Indonesia, 23–24 November 2018; Volume 1361, p. 012089. [Google Scholar] [CrossRef]
Rai, N.K.; Saravanan, D.; Kumar, L.; Shukla, P.; Shaw, R.N. RMSE and MAPE analysis for short-term solar irradiance, solar energy, and load forecasting using a Recurrent Artificial Neural Network. In Applications of AI and IOT in Renewable Energy; Elsevier: Amsterdam, The Netherlands, 2022; pp. 181–192. [Google Scholar] [CrossRef]
Dhiman, H.S.; Deb, D.; Guerrero, J.M. Hybrid machine intelligent SVR variants for wind forecasting and ramp events. Renew. Sustain. Energy Rev. 2019, 108, 369–379. [Google Scholar] [CrossRef]

Figure 1. Technical route.

Figure 2. Time series of Bitcoin price.

Figure 3. Relationship diagram of the factors influencing Bitcoin price.

Figure 4. Normalized historical data of the internal factors of cryptocurrencies, financial-indicator factors, macroeconomic factors, and exchange rate factors.

Figure 5. Schematic diagram of the SVR algorithm.

Figure 6. Schematic diagram of the TWSVR algorithm.

Figure 7. Schematic diagram of the whale optimization algorithm.

Figure 8. Feature-screening bar graph based on the XGBoost algorithm and random forest algorithm.

Figure 9. Feature-screening nightingale rose graph based on the XGBoost algorithm and random forest algorithm.

Figure 10. Performances of different models on EVS.

Figure 11. Prediction results of SVR, LSSVR, and TWSVR.

Figure 12. Fitness curves of SVR, LSSVR, and TWSVR.

Figure 13. Variation in the EVS of SVR, LSSVR, and TWSVR with RBF bandwidth and a regularization factor.

Table 1. Performance indexes of each combination model.

CATEGORY	MODEL	EVS	$R^{2}$	MAE	MSE	RMSE	MAPE	CPU Time (s)
Complete Features	ARIMA	0.723	0.691	0.091	0.008	0.091	0.132	1.173
	WOA-SVR	0.9022	0.879	0.042	0.003	0.055	0.0602	2.5631
	WOA-LSSVR	0.8868	0.869	0.043	0.003	0.057	0.0622	0.5268
	WOA-TWSVR	0.9484	0.923	0.032	0.002	0.044	0.0442	0.6736
	PSO-SVR	0.8964	0.876	0.042	0.003	0.056	0.0614	2.6417
	PSO-LSSVR	0.8835	0.867	0.043	0.003	0.057	0.0625	0.5416
	PSO-TWSVR	0.9484	0.923	0.032	0.002	0.044	0.0442	0.6736
XGBoost Features	WOA-SVR	0.9161	0.894	0.036	0.002	0.049	0.0512	1.9301
	WOA-LSSVR	0.9041	0.887	0.042	0.003	0.054	0.0594	0.3671
	WOA-TWSVR	0.9547	0.929	0.031	0.002	0.042	0.0433	0.4743
	PSO-SVR	0.9074	0.89	0.041	0.003	0.052	0.0571	1.9319
	PSO-LSSVR	0.9041	0.887	0.042	0.003	0.054	0.0594	0.3671
	PSO-TWSVR	0.9239	0.896	0.035	0.002	0.046	0.0475	0.4876
RF Features	WOA-SVR	0.8067	0.78	0.054	0.005	0.072	0.0786	2.3297
	WOA-LSSVR	0.9015	0.881	0.042	0.003	0.055	0.0607	0.3847
	WOA-TWSVR	0.9502	0.919	0.031	0.002	0.042	0.0438	0.5113
	PSO-SVR	0.8041	0.771	0.055	0.005	0.073	0.0792	2.2744
	PSO-LSSVR	0.8994	0.878	0.042	0.003	0.055	0.0611	0.4228
	PSO-TWSVR	0.9491	0.924	0.032	0.002	0.044	0.0441	0.4943

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhu, Y.; Ma, J.; Gu, F.; Wang, J.; Li, Z.; Zhang, Y.; Xu, J.; Li, Y.; Wang, Y.; Yang, X. Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization. Mathematics 2023, 11, 1335. https://doi.org/10.3390/math11061335

AMA Style

Zhu Y, Ma J, Gu F, Wang J, Li Z, Zhang Y, Xu J, Li Y, Wang Y, Yang X. Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization. Mathematics. 2023; 11(6):1335. https://doi.org/10.3390/math11061335

Chicago/Turabian Style

Zhu, Yingjie, Jiageng Ma, Fangqing Gu, Jie Wang, Zhijuan Li, Youyao Zhang, Jiani Xu, Yifan Li, Yiwen Wang, and Xiangqun Yang. 2023. "Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization" Mathematics 11, no. 6: 1335. https://doi.org/10.3390/math11061335

APA Style

Zhu, Y., Ma, J., Gu, F., Wang, J., Li, Z., Zhang, Y., Xu, J., Li, Y., Wang, Y., & Yang, X. (2023). Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization. Mathematics, 11(6), 1335. https://doi.org/10.3390/math11061335

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Price Prediction of Bitcoin Based on Adaptive Feature Selection and Model Optimization

Abstract

1. Introduction

2. Data Collection and Preprocessing

2.1. Data Collection

2.2. Data Preprocessing

2.2.1. Missing Data Completion through Linear Interpolation

2.2.2. Data Normalization and Division

2.2.3. Feature Selection for Price Prediction of Bitcoin

3. Price Prediction of Bitcoin

3.1. Main Forecasting Model

3.1.1. Support Vector Regression

3.1.2. Least-Squares Support Vector Regression

3.1.3. Optimization Model

4. Results and Discussion

4.1. Experiment Settings

4.2. Feature Selection

4.3. Combined Model Prediction

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI