This paper presents a hierarchical reinforcement learning architecture that employs various low-level agents to act in the trading environment, i.e. the market. However, some descriptions are not clear. Some revisions are necessary in the manuscript.

1. The mathematical description of the algorithms and models is insufficient; please expand it.

2. Please explain why hierarchical reinforcement learning is used to solve single-asset trading problems.

3. Please make sure all parameters are defined.

4. Please further explain how the proposed method improves on traditional hierarchical reinforcement learning.

5. In the paper, the authors focus on deep reinforcement learning for trading. A comparison with other intelligent methods should be analyzed to indicate the advantages of your work; for reference, see:

[a] IEEE Transactions on Power Systems, vol. 37, no. 5, pp. 4067-4077, 2022

[b] IEEE Transactions on Smart Grid, vol. 12, no. 6, pp. 5185-5200, Nov. 2021

[c] IEEE Transactions on Energy Markets, Policy and Regulation, vol. 1, no. 1, pp. 23-36, 2023

Author Response

Thank you for your feedback.

Please see below.

1. The mathematical description of the algorithms and models is insufficient; please expand it.

I added more equations and more details at the bottom of page 4.

2. Please explain why hierarchical reinforcement learning is used to solve single-asset trading problems.

I added a paragraph (page 4 – Overview) explaining more clearly why hierarchy is used for the single-asset case.

3. Please make sure all parameters are defined.

There is a huge number of hyperparameters for some of the algorithms. All I could do is refer the reader to the original source code for the values of all the hyperparameters. I added a footnote to the hyperparameters page.

4. Please further explain how the proposed method improves on traditional hierarchical reinforcement learning.

I added a clear list of differences between this work and more classical HRL in the conclusion.

I added the two suggested citations that seemed relevant. A comparison is not applicable.

Reviewer 2 Report

[Analytics] Manuscript ID: analytics-2394316-peer-review-v1

Article:

Hierarchical Model-Based Deep Reinforcement Learning For Single-Asset Trading

Review Comments

· Line 17. At the end of the abstract, please add a sentence stating "Implications/usefulness of research results for users and/or stakeholders"

· Abstract, please include the title "Abstract" and "Keywords"

· In the final paragraph of the introduction, please rearrange or add a discussion about NOVELTY, which is stated clearly and unequivocally. NOVELTY must be discussed, which includes: There is a gap in the results of previous research; The aims and objectives of this research; Broadly speaking the methods used to achieve the goal; What are the advantages or differences in the results of this study compared to the results of previous studies; and Benefits/contribution of the results of this research for users or stakeholders or the development of science.

· Line 132. What does "PPO" stand for? Every time you write an abbreviation for the first time, please include the long sentence. In the discussion that follows use the abbreviation.

· In the paper, there is no section entitled "Methods". If you read the description in "Overview", it is similar to the contents of the method. Please clarify.

· What can be explained from Figure 1? So that readers can understand the meaning of the diagram in Figure 1.

· What can be explained from the graphs in Figure 3? So that readers can understand the meaning of the graphs in Figure 3.

· What can be explained from the graphs in Figure 4? So that readers can understand the meaning of the graphs in Figure 4.

· Please provide an explanation, the sign "-" in equation (7), what does it mean, or does it have anything to do with the previous discussions?

· Figure 7 and Figure 8, suddenly appeared, where did it come from, please explain. Also, please explain what Figure 7 and Figure 8 mean, so that readers can also understand.

· It would be nice if before the "Conclusion" section there was a "Discussion" section.

· Line 400. Please title "References"

· In the previous discussion, I don't think there was mention of "Appendix". Then why is there an Appendix?

· Table naming in "Appendix" and others, please follow the template from Analytics journal.

Comments for author File: Comments.docx

Author Response

Thank you for your feedback. Please see below.

Line 17. At the end of the abstract, please add a sentence stating "Implications/usefulness of research results for users and/or stakeholders"

Done.

Abstract, please include the title "Abstract" and "Keywords"

Done.

In the final paragraph of the introduction, please rearrange or add a discussion about NOVELTY, which is stated clearly and unequivocally. NOVELTY must be discussed, which includes: There is a gap in the results of previous research; The aims and objectives of this research; Broadly speaking the methods used to achieve the goal; What are the advantages or differences in the results of this study compared to the results of previous studies; and Benefits/contribution of the results of this research for users or stakeholders or the development of science.

Done

Line 132. What does "PPO" stand for? Every time you write an abbreviation for the first time, please include the long sentence. In the discussion that follows use the abbreviation.

Added explanation.

In the paper, there is no section entitled "Methods". If you read the description in "Overview", it is similar to the contents of the method. Please clarify.

Renamed section to Overview and Methods.

What can be explained from Figure 1? So that readers can understand the meaning of the diagram in Figure 1.

Added explanation.

What can be explained from the graphs in Figure 3? So that readers can understand the meaning of the graphs in Figure 3.

What can be explained from the graphs in Figure 4? So that readers can understand the meaning of the graphs in Figure 4.

Added explanation.

Please provide an explanation, the sign "-" in equation (7), what does it mean, or does it have anything to do with the previous discussions?

Added explanation.

Figure 7 and Figure 8, suddenly appeared, where did it come from, please explain. Also, please explain what Figure 7 and Figure 8 mean, so that readers can also understand.

Added explanation.

It would be nice if before the "Conclusion" section there was a "Discussion" section.

Added section.

Line 400. Please title "References"

Added.

In the previous discussion, I don't think there was mention of "Appendix". Then why is there an Appendix?

Added.

Table naming in "Appendix" and others, please follow the template from Analytics journal.

Done.

Reviewer 3 Report

The article "Hierarchical Model-Based Deep Reinforcement Learning For Single-Asset Trading" presents a hierarchical reinforcement learning (RL) architecture that uses different low-level agents to operate in the trading environment, i.e., the market. The highest-level agent selects one agent from a group of specialized agents, and the selected agent then decides when to sell or buy a particular asset over a certain time period. This period may be variable, depending on the completion function. The paper assumes that, because of the different market regimes, more than one agent is required when learning from heterogeneous data, since multiple agents perform better when each agent is specialized on a subset of the data. The paper uses k-means clustering to partition the data and trains each agent on a different cluster.
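To make the partitioning step summarized above concrete, the following is a minimal sketch, not the authors' implementation. It assumes each market window is described by a hypothetical two-element feature vector (mean return, volatility); the function names `kmeans` and `partition_by_cluster` and the toy data are illustrative only. Each resulting partition would serve as the training set for one low-level agent.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def partition_by_cluster(labels, k):
    """Map each cluster id to the indices of the windows assigned to it."""
    partitions = {c: [] for c in range(k)}
    for idx, lab in enumerate(labels):
        partitions[lab].append(idx)
    return partitions

# Hypothetical window features: (mean return, volatility).
windows = [
    (0.010, 0.10), (0.020, 0.10), (0.015, 0.12),     # calm regime
    (-0.50, 0.90), (-0.60, 1.00), (-0.55, 0.95),     # turbulent regime
]
labels = kmeans(windows, k=2)
partitions = partition_by_cluster(labels, k=2)
# Each entry of `partitions` lists the windows one low-level agent would train on.
```

In this toy example the two regimes are well separated, so the calm and turbulent windows end up in different partitions; with real market data the number of clusters and the choice of features would need to be validated.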

The results of several experiments demonstrating the strengths of the hierarchical approach are presented, and different prediction models at both levels are tested, including one with a high-level risk reward.

The results presented suggest that, in general, the hierarchical approach reveals considerable promise, especially when the pool of low-level agents is very diverse.

As minor comments, it should be noted that the paper does not consider a variant of agent behavior in which one agent controls the learning of another (meta-learning). In addition, the paper does not fully study training the agents of one level with respect to some measure of risk and the agents of another level with respect to profitability.

In general, the work is at a good level.

Author Response

Thank you for your feedback.

Round 2

Reviewer 1 Report

The author has addressed the reviewer's comments well, and the current manuscript can be accepted.

Reviewer 2 Report

Thank you, the paper has been revised according to comments and suggestions.
