Multi-Objective Deep Reinforcement Learning for Personalized Dose Optimization Based on Multi-Indicator Experience Replay
Round 1
Reviewer 1 Report
1. It is not clear why such a weighting function is used in Section 3.3.4.
2. Please briefly introduce scenarios one and two.
3. Figures 5(e) and 7(e) should be labeled separately, since only the MIER-MO-DQN algorithm is used in them.
4. The typesetting of the charts could be improved. Some tables appear before the text that refers to them.
5. Figure 9 shows that the MIER-MO-DQN algorithm fluctuates greatly when the number of iterations is small; a brief explanation of this phenomenon is recommended.
6. The content of the discussion and the summary overlaps and is repeated.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The topic of the paper is quite interesting and I find the paper well-written and easy to read. The paper can be accepted.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The subject is very important. However, some things should be improved:
1. There are other model formulations for dealing with this subject. Why was the one proposed by Pillis et al. chosen?
2. How were the parameters in Table 1 obtained? Are they general or specific to the various forms of cancer?
3. Place a short introduction between items 3 and 3.1 of the text.
4. The references are few, and few of them are recent.
5. The content of item 5 could be distributed throughout item 4.
6. The conclusion reads more like a discussion than a set of conclusions. What, in fact, are the conclusions that the research presents?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
The authors have greatly improved the work.