Open Access Article

Peer-Review Record

Three Metaheuristic Approaches for Tumor Phylogeny Inference: An Experimental Comparison

Algorithms 2023, 16(7), 333; https://doi.org/10.3390/a16070333

by Simone Ciccolella¹, Gianluca Della Vedova¹,*, Vladimir Filipović² and Mauricio Soto Gomez³

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 3 June 2023 / Revised: 29 June 2023 / Accepted: 4 July 2023 / Published: 12 July 2023

(This article belongs to the Special Issue Algorithms for Natural Computing Models)

Round 1

Reviewer 1 Report

This paper explores some metaheuristic approaches for phylogeny problems motivated by tumor phylogenetics, with a specific focus on the perfect phylogeny and Dollo-k models. The problem area is important and getting a lot of attention, so there is value in considering the effectiveness of alternative algorithms. The main focus here is considering three different approaches and how they compare to a more specialized prior heuristic, SASC, published by some of the same authors. The three approaches are all versions of previously developed general heuristic optimization strategies. While it is hard to evaluate heuristics other than empirically, the authors demonstrate good knowledge of these heuristic strategies and their implementations for the present problems appear reasonable. Their effectiveness is evaluated by comparing their accuracy on simulated data. There is a good effort to explore a range of reasonable simulation parameter values, with a few exceptions noted below. None of the methods appears superior to the prior work, with one (particle swarm optimization) comparable to SASC in most scenarios, and the other two notably inferior. The paper is therefore essentially a negative result: these techniques do not lead to any improvement in the state of the art on these problems. Given that, I cannot consider this paper especially significant, in that there is no particular insight to be gained, either theoretically or empirically, into how to solve the tumor phylogeny problem more effectively. But negative results can be useful to the field and the work could also be useful as a case study for exploring heuristic optimization in general. I do think some points should be clarified or explored in more depth before any publication.

1. These particular problems, perfect phylogeny and Dollo-k, have been used in a good deal of prior tumor phylogeny work, but they are highly simplified versions of the real problem. While the paper talks about recurrent mutation and reversion, there is a lot more that can occur in the real system, such as copy number changes, genome duplications, and structural variations. Although there is some legitimate motivation to focus on these simpler problems, I think it is worth discussing the biology a bit more and the gap between the models and the real system.

2. It is unusual to use heuristics for the perfect phylogeny problem, since there is an exact linear time algorithm known for the problem, as the paper notes. I can appreciate that it might be interesting as a point of comparison since it is a special case of the Dollo-k problem, but it could perhaps be clearer that there is no real practical value to better heuristics for perfect phylogeny (unless perhaps one can find sub-linear heuristic algorithms).

3. Dollo-k is more plausibly a problem for heuristics since it is intractable. Even here, though, I think the paper should do more to consider past literature and how such problems have been solved in practice previously. There are exact approaches one can use, such as direct branch-and-bound searches or integer programming, that are likely to do well for the ranges of parameters considered. Ideally, the paper should compare run times to some standard exact algorithms, or at least discuss the alternatives and why one might still prefer heuristic approaches.

4. On a related note, it would be fair to consider how well any of these solutions do compared to other real state-of-the-art tumor phylogeny methods, of which there are now many in the literature. I realize one cannot necessarily do head-to-head comparisons since most tumor phylogeny code is not solving the same optimization problem, but it would still be appropriate to discuss more in the text what alternative approaches exist and what advantages and disadvantages they have compared to the approaches considered in this paper.

5. Related to the above point, the paper adds an extra parameter, d, that I believe would make the Dollo-k problem substantially easier for small d. This makes it somewhat unfair to compare the methods here to ones from prior work that do not have this parameter d. I think it would be appropriate to do some testing where d is not used, or is so high it would not be restrictive.

6. The paper does not comment on run time, which would be an important consideration for choice of method. Many of these heuristics also can trade off more run time for more time searching for better solutions, so it is important to know how much time is allocated to each method. It would be appropriate to include run time numbers on a common hardware platform.

7. With respect to simulation parameter values, I think the paper generally does a good job picking realistic values. The number of mutations N can be much larger in real cases. It is worth considering how effectively the methods could scale to larger N.

8. It would also be worth extending the comparison to at least one real data set. While I recognize that one does not know the ground truth on real data, one can at least consider optimality of solutions and run time to achieve them, as well as compare to what more sophisticated tumor phylogeny algorithms have found on the same data.

9. The paper has to solve the problem of comparing tree distances in implementing the particle swarm method and the authors come up with their own approach to do this. I think the approach should be better justified and related to prior work. There is some literature on this problem of measuring distance between trees, including prior work on matching-based distances similar to that proposed here as well as work on tree distances specifically for tumor phylogenies. The paper actually cites and uses one of these specialized tumor phylogeny distance methods, MP3, in a later section. It would be appropriate to briefly review and cite prior work on this problem and justify the chosen approach in relation to it.

10. On lines 336-343, the authors make some guarantees about the performance of Algorithm 6, that it only produces valid solutions and that the search can reach all valid solutions. I was unclear on why these properties are true and would appreciate some clarification.
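As a small illustration of point 2 above, the sketch below checks whether a noise-free binary cell-by-mutation matrix admits a perfect phylogeny rooted at the all-zero (mutation-free) state. It uses the classical pairwise conflict test (often called the three-gamete test), which is quadratic in the number of mutations; it is not the linear-time algorithm the report refers to, and it is not taken from the paper under review. The function names and the example matrices are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Matrix = List[List[int]]  # rows = cells, columns = mutations, entries 0 or 1


def find_conflict(matrix: Matrix) -> Optional[Tuple[int, int]]:
    """Return the first pair of columns that violates the three-gamete test.

    Two mutations conflict when some cell has both, some cell has only the
    first, and some cell has only the second. With an all-zero root, such a
    pair cannot be placed on a tree without a recurrent or lost mutation.
    """
    if not matrix:
        return None
    n_cols = len(matrix[0])
    forbidden = {(1, 1), (1, 0), (0, 1)}
    for i, j in combinations(range(n_cols), 2):
        # Collect the distinct (column i, column j) value pairs over all cells.
        seen = {(row[i], row[j]) for row in matrix}
        if forbidden <= seen:
            return (i, j)
    return None


def admits_perfect_phylogeny(matrix: Matrix) -> bool:
    """True when no pair of columns conflicts (assumes error-free data)."""
    return find_conflict(matrix) is None


# Hypothetical toy inputs, not data from the paper.
nested = [[1, 0], [1, 1], [0, 0]]       # mutation 1 occurs only within mutation 0
conflicting = [[1, 0], [0, 1], [1, 1]]  # all three forbidden patterns appear
```

The test applies only to complete, error-free matrices. With false negatives, false positives, or missing entries, as in single-cell data, the task becomes an optimization problem, which is the setting where heuristics such as those in the paper are used.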

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The reconstruction of the evolutionary process of tumor cells is a hot research area that is crucial for discovering key carcinogenic factors, understanding the molecular processes of tumor development, and achieving precise tumor treatment. In this article, the authors built three new algorithms based on three metaheuristic approaches, namely Particle Swarm Optimization (PSO), Genetic Programming (GP) and Variable Neighbourhood Search (VNS), for inferring the phylogenetic tree of tumor cells. The results showed that the PSO algorithm had similar effectiveness to SASC, suggesting that this method could be a focus of future development. The article is well-written, featuring clear and coherent prose, a detailed account of the algorithm design process, and reasonable treatment of various aspects. It is overall a high-quality research paper. However, there are several areas that could benefit from improvement:

1. The three new algorithms developed by the authors did not demonstrate superior performance compared to SASC. Therefore, the reviewer suggests that the authors include a discussion on the potential advantages of the new methods to emphasize the need for developing and improving this algorithm.

2. False negatives are a major confounding factor in single-cell genome sequencing. In the simulated data used in this article, the authors set the false negative rate at 0.15. The reviewer considers this setting to be too low and suggests increasing it to 0.2 for comparison.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have been responsive to the criticisms in the revision. They made most of the requested revisions and I believe they had good arguments for the ones they declined to make. While I am not completely satisfied with the response (for example, I would still prefer to see a comparison to state-of-the-art methods other than the authors' own), I think it is acceptable. They added some new experiments that were suggested and I do think these improve the paper. The run time experiments make a good case for at least a couple of the heuristic approaches, although the experiments on scaling do raise further concern that even the most successful heuristic may fall far short of the current state of the art in accuracy on some realistic scenarios. Overall, I believe my prior concerns about whether the validation was sufficiently thorough have been adequately addressed.

My concerns about the significance of the work remain, and I do not think they could be better addressed with further revisions. The work essentially remains a negative result, namely that these classes of heuristic techniques do not appear to lead to a competitive approach for the tumor phylogeny problem, although one could perhaps conclude that at least PSO may have promise for especially hard problem instances. As before, I believe a negative result can still be useful to the field even if it is hard to argue that it leads to any great new insights.
