Article
Peer-Review Record

Detecting Predictable Segments of Chaotic Financial Time Series via Neural Network

Electronics 2020, 9(5), 823; https://doi.org/10.3390/electronics9050823
by Tianle Zhou *, Chaoyi Chu, Chaobin Xu, Weihao Liu and Hao Yu
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 16 April 2020 / Revised: 13 May 2020 / Accepted: 13 May 2020 / Published: 16 May 2020
(This article belongs to the Special Issue Applications of Bioinspired Neural Network)

Round 1

Reviewer 1 Report

This is an interesting and possibly also useful paper, the result of rather thorough research. The focus of this paper is on detecting the predictable part of financial time series. A new idea of how to analyze the financial market and detect price fluctuations is presented, based on an integration of Phase Space Reconstruction (PSR) technology and Kohonen's Self-Organizing Map (SOM) neural network algorithm. The application of this integration resulted in clusters of linear components, and these were tested with a Long Short-Term Memory (LSTM) neural network.

The research conducted seems to be scientifically sound, and the contents can be considered significant and original. However, I have several comments that could contribute to improving the quality of the paper.

First of all, I miss the part usually called "Related Research" or "Related Work". Instead, there is a rather extensive chapter 2, called Methodology. It is OK to have explained all the methods and techniques used later on, but are you sure that there are no related works in the published literature? 

Second, some parts of the paper are not too understandable, although the overall quality of the paper is rather good. These are commented on and marked in yellow in the enclosed version of your paper. Try to correct them, please. Thank you.
 

Comments for author File: Comments.pdf

Author Response

Response to Reviewer 1 Comments

Point 1: First of all, I miss the part usually called "Related Research" or "Related Work". Instead, there is a rather extensive chapter 2, called Methodology. It is OK to have explained all the methods and techniques used later on, but are you sure that there are no related works in the published literature?

 

Response 1:

Thank you for your suggestion. Related works published in the past are listed in reference 4, mentioned in line 23. In this study, the content of reference 4 has been supplemented, and the two research objectives are explained in lines 54-63.

 

Point 2: Second, some parts of the paper are not too understandable, although the overall quality of the paper is rather good. These are commented on and marked in yellow in the enclosed version of your paper. Try to correct them, please. Thank you.

 

Response 2:

Thank you for your advice. The marked content has been supplemented and corrected, and the revised paper has been attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper presents an interesting idea of segmenting financial time series to detect predictable segments and then to apply LSTM for the prediction of these segments. Therefore, I believe that the paper has merit but needs some revisions to clarify the methodology.

1) Clustering time series is not a novel idea and should be properly discussed in the paper. However, finding predictable segments has not been investigated sufficiently in previous studies. So, the authors should explain their contribution also from this point of view.

2) In fact, I would recommend inclusion of a literature review section to better explain the weaknesses of the earlier research in this field. The three paragraphs in the introduction are good but cannot provide the complete picture.

3) In Section 2.2 it should be clarified how the time series are mapped to the SOM and how the SOM nodes (representatives) are evaluated in terms of PSC.

4) Generally, the technical quality of the paper is very good but all variables should be written in italics in the text.

5) The settings of the training parameters in the LSTM neural network should be supported by some references. LSTM has been used many times for financial time series prediction and the number of neurons and other parameters of the training algorithm should be properly justified compared with related studies.

6) The same holds for SOM, the learning parameters are not given and 10*10 structure should be justified. There exist several heuristics for this setting and, in fact, the structure strongly affects the quality of SOM training. A proper explanation is therefore necessary.

7) It would be beneficial to present the results in terms of other measures such as R or MAPE to make comparison with alternative models possible. I would also recommend performing the experiments with other deep NN-based state-of-the-art models.

8) Typos and errors: Korhonen network should be Kohonen network

Author Response

Response to Reviewer 2 Comments

 

Point 1: Clustering time series is not a novel idea and should be properly discussed in the paper. However, finding predictable segments has not been investigated sufficiently in previous studies. So, the authors should explain their contribution also from this point of view.

 

Response 1: 

Thank you for your suggestion. A brief summary of clustering technology has been added in Chapter 2.2. Clustering is one attempt at analyzing financial data. In this experiment, it is not certain whether the original data satisfies the same hypothesis in phase space as it does in the real-time interval; the experiment therefore detects this in the financial time-series data and verifies it with supervised learning.

Point 2: In fact, I would recommend the inclusion of a literature review section to better explain the weaknesses of the earlier research in this field. The three paragraphs in the introduction are good but cannot provide the complete picture.

 

Response 2:

In lines 30-38, the premise of current time-series prediction research is pointed out. Currently, most choices of models for predicting time series are logically blind, and the only result of those experiments is a comparatively better approximation ability for specific data. In this study, however, a clustering algorithm is used to detect the predictable interval of the time-series data, and the result is that the range of the predictable interval is very small. Although this is contrary to the results of most previous time-series forecasting papers, it is a fact that financial data are difficult to predict, and there has never been a fixed algorithm that makes profits all the time.

 

Point 3: In Section 2.2 it should be clarified how the time series are mapped to the SOM and how the SOM nodes (representatives) are evaluated in terms of PSC.

 

Response 3:

Thank you for your advice; the related contents have been supplemented in Chapter 2.2. In fact, the time-series data are projected into Euclidean space in Section 3.1.2 before being learned by the SOM neural network, so that the network can better recognize the difference between the linear and non-linear parts of the data, as well as the transition between them; the calculation is then carried out in Section 3.2.1.

 

Point 4: Generally, the technical quality of the paper is very good but all variables should be written in italics in the text.

 

Response 4: 

Thank you for your suggestion and it has been revised.

 

Point 5: The settings of the training parameters in the LSTM neural network should be supported by some references. LSTM has been used many times for financial time series prediction and the number of neurons and other parameters of the training algorithm should be properly justified compared with related studies.

 

Response 5: 

 

Related references have been added at line 162. In the process of supervised verification of the unsupervised algorithm, dynamic parameter adjustment is used to make the difference between the two parts of the data clear. Therefore, no detailed introduction to the optimization of the supervised learning model's parameters is given; a similar response is given in Response 7.

 

Point 6: The same holds for SOM, the learning parameters are not given and 10*10 structure should be justified. There exist several heuristics for this setting and, in fact, the structure strongly affects the quality of SOM training. A proper explanation is therefore necessary.

 

Response 6: 

Thank you for your suggestion. A detailed explanation has been added in Chapter 3.2.1: a suitable SOM structure can be determined only after it has been verified by learning, but this is of no help for detecting financial-market data without prior assumptions. Therefore, the 10 × 10 structure was chosen as the initial setting.
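For context, the kind of 10 × 10 SOM discussed here can be sketched with a minimal numpy implementation. This is an illustrative reconstruction with assumed learning-rate and neighborhood parameters (`lr0`, `sigma0`, `epochs`), not the configuration used in the paper:

```python
import numpy as np

def train_som(data, grid=(10, 10), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a grid[0] x grid[1] Self-Organizing Map on `data`
    (shape: n_samples x n_features). Learning rate and neighborhood
    radius decay exponentially over training. Returns the weight grid."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    # Grid coordinates of every node, used for neighborhood distances.
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            lr = lr0 * np.exp(-step / n_steps)
            sigma = sigma0 * np.exp(-step / n_steps)
            # Best-matching unit (BMU): node whose weight vector is closest to x.
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # Gaussian neighborhood around the BMU on the 2-D grid.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

# Demo: two well-separated clusters should occupy different regions of the grid.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0.0, 0.1, (50, 3)), rng.normal(5.0, 0.1, (50, 3))])
weights = train_som(data)
```

Similar input vectors then map to nearby nodes; the BMU of a new phase-track segment is found the same way as `bmu` inside the loop, which is how segments would be assigned to clusters without fixing the number of clusters in advance.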

 

 

Point 7: It would be beneficial to present the results in terms of other measures such as R or MAPE to make comparisons with alternative models possible. I would also recommend performing the experiments with other deep NN-based state-of-the-art models.

 

Response 7: 

Thank you for your suggestion; the R and MAPE validation results have been added in Table 3. Instead of comparing fitting ability with other supervised models, the focus of this experiment is on clustering to detect the predictable region, as well as on verifying the significance of segmenting linear and non-linear data.

Point 8: Typos and errors: Korhonen network should be Kohonen network

 

Response 8: 

Thank you for your suggestion and it has been revised.

 

Author Response File: Author Response.pdf

Reviewer 3 Report

This paper is an interesting combination of concepts from monetary economics, non-linear phenomena, fractals, entropy, chaotic financial time series, energy-based models, etc. Dividing the phase-track segments of a time series into a linear group and a non-linear group indicates when and where the time series can be predicted, and vice versa. After being divided into linear and non-linear groups by clustering, the two groups of time series are predicted by an LSTM neural network to verify the difference between them. The concept is very simple but has tremendous impact, since it explains why the majority of today's trading algorithms are at times ineffective and unreliable: we overfit LSTM and other ML models to predict time series and approximate quickly over short horizons, without regard for regions of predictability and external factors causing chaos in higher dimensions.

 

The explanation is fluid and easy to understand, with very relatable examples from different financial markets (using 4 datasets: DJI, GEI, NIKKEI225, AUL8). There is a decent amount of data analysis and experimentation to explain the procedure and its mathematical validity.

Few important underlying points in this paper are:

- The dynamic properties of a chaotic financial time series should be restored before it is analyzed
- LSTM is suitable for financial time series with large amounts of data and high volatility
- After applying PSR to the financial time-series data, the maximum Lyapunov exponent is calculated by the Wolf algorithm. If the Lyapunov exponent of any dimension in the embedding dimension is positive, the reconstructed dynamic system is considered chaotic
- The predictable range of a time series is reflected by the reciprocal of the Lyapunov index. It is meaningful to predict the time series only when the Lyapunov index satisfies 0 < L < 1. Although the L index is able to estimate the average predictable step length, the predictable area is uncertain
- When the embedding dimension is larger than 2d + 1 (d is the fractal dimension), the time-delayed versions [y(t), y(t − τ), y(t − 2τ), . . . , y(t − 2nτ)] of one generic signal are sufficient to embed the n-dimensional manifold, and the embedded dynamics are diffeomorphic to the original state-space dynamics once the right τ and enough dimensions are chosen
- The Grassberger-Procaccia algorithm is applied to determine the embedding dimension m, and the mutual information function is used to calculate the time delay τ
- Due to the topological property of the SOM hidden layer, an input of any dimension can be discretized into a one-dimensional or two-dimensional (higher dimensions are rare) discrete space. Since the phase trajectories are highly complex and high-dimensional, it is also possible for the SOM to handle nonlinear data by increasing the number of network layers and hidden-layer nodes. Consequently, instead of defining the number of clusters, the SOM is selected to map the reconstructed time series onto the SOM network, and similar phase-track segments cluster in the network automatically
- Too large a value of τ always leads to a failure of PSR performance; it is then necessary to reduce τ appropriately to continue the transformation
- For each embedding dimension, the reconstructed phase trajectory is divided into several parts with a time delay of 3, and every second point is taken as the vertex, so the angle formed by the three points is calculated as θ. It is believed that the closer θ is to 180°, the closer the segment is to linearity, and vice versa
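The delay embedding and the angle-based linearity test summarized in the last bullets can be sketched as follows. This is a reconstruction from the description above, not the authors' code; the function names are illustrative:

```python
import numpy as np

def delay_embed(x, m, tau):
    """Phase Space Reconstruction: build m-dimensional delay vectors
    [x(t), x(t + tau), ..., x(t + (m - 1) * tau)] from a 1-D series."""
    n = len(x) - (m - 1) * tau
    return np.array([x[i : i + m * tau : tau] for i in range(n)])

def segment_angle(p0, p1, p2):
    """Angle (in degrees) at vertex p1 of the phase-track segment p0-p1-p2.
    Values near 180 indicate an almost straight (linear) segment."""
    v1, v2 = p0 - p1, p2 - p1
    cos_t = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0))))

# Three collinear points on the phase track give an angle close to 180
# degrees (a linear segment); a sharp turn gives a much smaller angle.
track = delay_embed(np.arange(12.0), m=2, tau=3)     # a perfectly linear track
theta = segment_angle(track[0], track[1], track[2])  # close to 180
```

Thresholding `theta` per segment is then what separates the linear group from the non-linear group before the two are fed to the LSTM.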

Author Response

Thank you very much for your valuable suggestions and comments. They will be great treasures for my future study.

Reviewer 4 Report

The quality and the content of the manuscript correspond to the Journal's requirements. However, there are a few suggestions to make the content of the study more understandable.

 

Suggestions:

  • Figure 3 is the most important part of the manuscript for understanding the whole study process. I suggest moving this figure forward (placing it after the introduction, before the introduction of the applied methods) and extending the manuscript with a detailed description of the proposed methodology. Please describe the aim of all subprocesses. For example, explain what the linear and the nonlinear parts of the time series mean, and why the time series needs to be divided into these groups. It would be very nice to illustrate this with an illustrative example as well.
  • Please separate the introduction of the methodology from the presentation of the results.
  • Please cite all the applied methods, such as the Lyapunov index, Hurst index, Wolf algorithm, etc.
  • The manuscript contains only a few grammatical errors; please read it carefully through and revise them.
  • Notations: Line 90: L_0' instead of L0'; furthermore, please explain all the applied notations (some of them are missing). Furthermore, please explain the role of the time delay.
  • What was the size of the applied 2 LSTM layers?

Author Response

Response to Reviewer 4 Comments

 

Point 1: Figure 3 is the most important part of the manuscript for understanding the whole study process. I suggest moving this figure forward (placing it after the introduction, before the introduction of the applied methods) and extending the manuscript with a detailed description of the proposed methodology. Please describe the aim of all subprocesses. For example, explain what the linear and the nonlinear parts of the time series mean, and why the time series needs to be divided into these groups. It would be very nice to illustrate this with an illustrative example as well.


 

Response 1:

Thank you for your suggestion. The flow chart has been moved to the beginning of the second chapter, and each step is described at the beginning of the third chapter.

Point 2: Please separate the introduction of the methodology from the presentation of the results.

 

Response 2:

The description of the method in the second chapter is more theoretical, while the discussion of the pre-processing process and the appropriate model parameters in Chapter 3 is more application-oriented. Therefore, they are discussed separately.

Point 3: Please cite all the applied methods, such as the Lyapunov index, Hirst index, WOLF algorithm, etc.

 

Response 3:

References for the Lyapunov index, Hurst index, and Wolf algorithm have been added in the introduction chapter.

Point 4: The manuscript contains only a few grammatical errors; please read it carefully through and revise them.

 

Response 4: 

Thank you for your suggestion. The grammar has been checked, and errors in both the text and the format have been corrected.

Point 5: Notations: Line 90: L_0' instead of L0'; furthermore, please explain all the applied notations (some of them are missing). Furthermore, please explain the role of the time delay.

 

Response 5:

L_0' has been revised. In line 78, a description of the selection of the time delay τ has been added, along with references, and an example of an inapplicable attractor under τ reduction is given in Figure 4 (B1). This example shows that too small a τ leads to the failure of the time-series expansion in phase space.

 

 

Point 6: What was the size of the applied 2 LSTM layers?

 

Response 6:

Thank you for your reminder. The description of the LSTM method has been supplemented: there are 100 hidden units in each layer of the two-layer LSTM structure.

 

Author Response File: Author Response.pdf
