Review
Peer-Review Record

Gamma-Ray Dark Matter Searches in Milky Way Satellites—A Comparative Review of Data Analysis Methods and Current Results

by Javier Rico
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 29 July 2019 / Revised: 20 February 2020 / Accepted: 24 February 2020 / Published: 17 March 2020
(This article belongs to the Special Issue The Role of Halo Substructure in Gamma-Ray Dark Matter Searches)

Round 1

Reviewer 1 Report

I have just finished reading the paper.   In general the paper is a good summary of the field and provides a lot of useful information.   However the clarity of the language is really not yet adequate for a review paper.   There are numerous imprecise statements that could easily be misunderstood by readers.   There are also many statements where, although the author's intention is clear, because of the imprecision or wording of the statement, it is technically not quite correct.  

For example, the very first sentence of the abstract reads, "Annihilation or decay ... produces gamma rays that are searched for ..."

In fact, that should be qualified and read something more like "Annihilation or decay ... may produce gamma rays that could be searched for ..."

Similarly, the very first sentence of the body of the text read "In one of the most plausible and thoroughly studied theoretical scenarios, dark matter is composed ..."

This sentence lacks context; it doesn't define what this scenario seeks to explain.

These are just a couple of examples, there are many, many more such in the rest of the paper.  

Before proceeding with the review I'd like to ask the author to have the paper carefully re-read for precision and language usage.   Once they have done that I'd be happy to review the paper more fully.  

As I said, the paper is a nice summary and covers many important details.   There is a lot of good material here.   The presentation should be improved to do it justice.

Author Response

Dear referee, thank you for your report.

Please find below my replies to your comments.

R: I have just finished reading the paper.   In general the paper is a good summary of the field and provides a lot of useful information.   However the clarity of the language is really not yet adequate for a review paper.   There are numerous imprecise statements that could easily be misunderstood by readers.   There are also many statements where, although the author's intention is clear, because of the imprecision or wording of the statement, it is technically not quite correct.  

JR: I went again through the whole paper trying to identify those sentences that could be misinterpreted, following in particular the couple of examples provided by the referee. I changed all the ones that I could clearly identify, and also tried to improve the overall clarity of the text. I also removed some sentences or even full paragraphs that I considered non-essential and maybe causing some confusion or lacking clarity. I hope you will find this new version greatly improved with respect to the previous one. 

 

R: For example, the very first sentence of the abstract reads, "Annihilation or decay ... produces gamma rays that are searched for ..."

In fact, that should be qualified and read something more like "Annihilation or decay ... may produce gamma rays that could be searched for ..."

JR: Changed to "If dark matter is composed of weakly-interacting particles
with mass in the GeV-TeV range, their annihilation or decay may
produce gamma rays that are searched for by gamma-ray telescopes." (the telescopes do search for the gamma-rays, whether dark matter exists or not)

 

R: Similarly, the very first sentence of the body of the text read "In one of the most plausible and thoroughly studied theoretical scenarios, dark matter is composed ..."

This sentence lacks context, it doesn't define what this scenario is seeks to explain.

JR: Added introductory sentence: "The existence of a dominant non-baryonic, neutral, cold matter component in the Universe, called \emph{dark matter}, has been postulated in order to explain the kinematics of galaxies in galaxy clusters~\cite{ref:zwicky33} and stars in spiral galaxies~\cite{ref:babcock39}, as well as the power spectrum of temperature anisotropies of the cosmic microwave background~\cite{ref:planck2018}."

 

R: These are just a couple of examples, there are many, many more such in the rest of the paper. 

JR: As said, I tried to look for all of them. If there are still some unclear sentences, I would appreciate knowing which ones they are.

 

Reviewer 2 Report

The paper "Gamma-ray dark matter searches in Milky Way satellites" is well written and reviews the present status of indirect dark matter search with gamma rays from dSphs. The author estimated the gamma-ray signals from WIMPs in dSphs, introduces the current gamma-ray experiments, describe the analysis methods and compare the latest results. I have a few comments which should be considered in more detail in the revised version of the paper.

- L50, "which is of the order of the point-spread-function of most of the current-generation gamma-ray telescopes" -> here you probably want to compare with the angular resolution of current gamma-ray telescopes.

- L81, thanks to the high mass-to-light ratio and the absence of gas in dSphs, a precise prediction of the dark matter distribution can be obtained, but could you explain further what is "generally within one order of magnitude"? Do you mean the uncertainty?

-L112, "primary gamma-rays"->"primary gamma rays"

- Eq 2., the third term of the right side of the equation should be dJ_ann/d\Omega. Please check.

- Eq. 8, the author uses "a" as a prefactor, however, in the later text, \alpha is used instead, for example in L195. Please check here.

- L145, "This will allow to perform Fermi-LAT archival dark matter searches should new dSphs will be discovered in the future.", please elaborate the sentence.

- L188, as is known, dark matter signals are quite weak compared to the background; how can we clearly distinguish them from the background?

- L195, "bounds to \alpha", is this \alpha the same as "a" in Eq8? Or if not, please clarify it here.

- The paragraph under L248, here I guess N_E' and N_P' are the number of energy and spatial bins.

- L250, for my understanding, the IRF used in Eq. 15 is not the probability as you defined in the paper. It should have the unit of cm^2 as you formulated in Eq. 16.

- L281, "Note that this problem would not arise should confidence intervals partially or totally contained in the non-physical region were considered acceptable results (which they are from a pure statistical point of view)." Please elaborate this sentence, which is quite confusing to me.

- L357, please write the full name of DES here.

- L381, in this study, the 1% systematic uncertainty is only from the geometrical differences of the on and off regions? Other systematic uncertainties are not taken into account? I think for current IACTs, the 1% sys refers to the total sys, from the effective area, PSF, energy scale, etc.

- L437, "i.e. the smaller the dSph" -> "i.e. the smaller dSph", "the larger the dSph" -> "the larger dSph"

- L571, what is the searching energy range for HAWC? Could you elaborate the meaning of side bins in the text?

 

 

Author Response

Dear referee, 

thank you very much for your useful comments, which I have tried to implement in this new version. Please be aware that there are some other major changes in the text, as requested by the other referee. Below is my point-by-point response:

L50, "which is of the order of the point-spread-function of most of the current-generation gamma-ray telescopes" -> here you probably want to compare with the angular resolution of current gamma-ray telescopes.

This sentence has disappeared after major revision requested by the other referee.

L81, thanks to the high mass-to-light ratio and the absence of gas in dSphs, a precise prediction of the dark matter distribution can be obtained, but could you explain further what is "generally within one order of magnitude"? Do you mean the uncertainty?

Yes. Changed to: "Furthermore, dSphs contain in general no
significant amount of gas, which allows their mass distribution to be
inferred from the stellar motions. This results in relatively precise
predictions of the dark matter distribution in dSphs, enabling in turn
robust predictions of the intensity of the associated gamma-ray
signals, generally within a precision of one order of
magnitude~\cite{ref:Geringer2014}"

L112, "primary gamma-rays"->"primary gamma rays"

Done

Eq 2., the third term of the right side of the equation should be dJ_ann/d\Omega. Please check.

Done

Eq. 8, the author uses "a" as a prefactor, however, in the later text, \alpha is used instead, for example in L195. Please check here.

Yes, the problem is that during the editing of the text alpha slipped into L195, before it is properly defined later in the text. I changed L195 (now L191) to "assuming an incorrect morphology may bias the result of the search"

L145, "This will allow to perform Fermi-LAT archival dark matter searches should new dSphs will be discovered in the future.", please elaborate the sentence. 

I just wanted to mention the fact that (contrary to what happens with Cherenkov telescopes) Fermi can look back at its data archive and analyze newly discovered dSphs. I changed the sentence to: "Thanks to this full-sky coverage, Fermi-LAT will be able to perform dark matter searches using its data archive should new dSphs will be discovered in the future."

L188, as is known, dark matter signals are quite weak compared to the background; how can we clearly distinguish them from the background?

What I say is that the spatial and energy distributions for signal and background are clearly distinct (they are). I have changed the word "distribution" to "PDF" to hopefully make the point clearer.

L195, "bounds to \alpha", is this \alpha the same as "a" in Eq8? Or if not, please clarify it here.

Removed the reference to alpha, which is properly defined only later.

The paragraph under L248, here I guess N_E' and N_P' are the number of energy and spatial bins. 

Yes, added "with N_E' the number of bins of reconstructed energy and N_P' the number of bins of reconstructed arrival direction"

L250, for my understanding, the IRF used in Eq. 15 is not the probability as you defined in the paper. It should have the unit of cm^2 as you formulated in Eq. 16.

Yes, thanks for catching this. Changed to: "IRF(E′, p′|E, p)dE′ dΩ′ is the effective collection area of the detector times the probability for a gamma ray with true energy E and direction p to be assigned an estimated energy in the interval [E′,E′ +dE′] and p′ in the solid angle dΩ′ (see more details below)."

    L281, "Note that this problem would not arise should confidence intervals partially or totally contained in the non-physical region were considered acceptable results (which they are from a pure statistical point of view)." Please elaborate this sentence, which is quite confusing to me.

I mean that if we allowed for intervals partially or fully contained in the non-physical region, the coverage property would be fulfilled; we would just know that those unphysical intervals are part of the 1-CL fraction of intervals that do not contain the true value. Changed to "Note that this problem would not arise should confidence intervals partially or totally contained in the non-physical region were considered acceptable results (what they are from a pure statistical point of view because they fulfill the definition of coverage)."
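(For illustration, a minimal Monte Carlo sketch of this coverage argument, with hypothetical numbers and not taken from the paper: a Gaussian estimate of a non-negative parameter keeps its nominal coverage even when the central interval spills into the unphysical region.)

```python
import numpy as np

# Toy illustration (hypothetical numbers, not from the paper): a Gaussian
# estimator x of a non-negative parameter alpha_true with known sigma.
# Central 95% intervals [x - 1.96*sigma, x + 1.96*sigma] keep their nominal
# coverage even when they extend into the unphysical region alpha < 0.
rng = np.random.default_rng(0)
alpha_true, sigma, n_trials = 0.5, 1.0, 100_000

x = rng.normal(alpha_true, sigma, n_trials)
lo, hi = x - 1.96 * sigma, x + 1.96 * sigma

coverage = np.mean((lo <= alpha_true) & (alpha_true <= hi))
frac_unphysical = np.mean(lo < 0)   # intervals reaching into alpha < 0

print(f"coverage = {coverage:.3f} (nominal 0.95)")
print(f"fraction of intervals extending below alpha = 0: {frac_unphysical:.3f}")
```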

L357, please write the full name of DES here.

Done

L381, in this study, the 1% systematic uncertainty is only from the geometrical differences of the on and off regions? Other systematic uncertainties are not taken into account? I think for current IACTs, the 1% sys refers to the total sys, from the effective area, PSF, energy scale, etc.

No, the 1% is really just for the background estimation (the energy scale uncertainty is more like 10-15%), but you are right that it is not just due to geometrical differences between the On and Off regions; it could be anything causing them to have different acceptances (e.g. a nearby star or dead pixels in the camera). I have rephrased it as: "The systematic uncertainty takes into account the residual differences of exposure between the Off and On regions, and it is normally assumed to be of the order of $1\%$ for the current generation of Cherenkov telescopes"

L437, "i.e. the smaller the dSph" -> "i.e. the smaller dSph", "the larger the dSph" -> "the larger dSph"

Changed to : "A_eff can be better approximated by a constant value for smaller signal regions, i.e.\ smaller dSphs, whereas the effect of the angular resolution in the distribution of measured events is smaller for larger dSphs."

L571, what is the searching energy range for HAWC? Could you elaborate the meaning of side bins in the text?

I added the energy range. The spatial bins have no special meaning here; they are the bins referred to in Equation 15. I understand that giving their exact size here may suggest that the number has some special meaning or role, so I have rephrased it as:

"Data were binned in reconstructed energy E' (referred to as \f_hit in HAWC
publications~\cite{ref:Abeysekara2017}) covering the range between 500
GeV and 100 TeV, and in reconstructed arrival direction p', covering an area of $5^\circ$ radius around each of the analyzed dSphs."

Round 2

Reviewer 1 Report

This paper is a nice overview of searches for DM annihilation signals from dSphs using gamma-ray data and in particular of the statistical details of those searches. The current version has a number of weaknesses, but I am confident that the author can address them. Once he has done so, this paper will be wholly suitable for publication.

For the most part, the paper is strongest in its discussion of the differences between the statistical methodologies of the various IACTs. Those sections are quite clearly written, informative and insightful. The description of the VERITAS methodology is an exception; it is difficult to follow and I'm not certain the methodology warrants as much detail as is given. On the other hand, the first three sections and the description of the Fermi-LAT analysis suffer a bit from lack of precision, and contain a number of statements that are either potentially confusing or inaccurate, and which should be clarified or fixed.

The paper would also benefit from some more quantitative discussion of the scale of the effects caused by the approximations used in the various analyses. Given the relatively large uncertainties in the J-factors, I suspect that in many cases the associated uncertainty dominates most other uncertainties, and the decision to neglect those other uncertainties is wholly justified. In my comments I've tried to identify places where some more quantitative information would be particularly useful.
Table 1 has 6 columns that tabulate different choices made in the statistical analysis by different collaborations. It would be great if the authors could give some indication as to the size of these effects for the various analyses.

The paper could also benefit from a bit more high-level discussion about the difference between analyses with Fermi, the IACTs and HAWC. Fermi analyses are often signal-limited, particularly at high energies, whereas IACT and HAWC analyses are background limited. What's more, the background subtraction methods used by HAWC and by the IACTs differ in significant ways. It would be useful to give an overview of this (perhaps towards the beginning of either sections 3 or 4, before jumping into the details).

The LAT data are publicly available and several authors outside the LAT collaboration have analyzed the dSphs for DM annihilation signals. Although many of the papers depart significantly from the common statistical framework described in this paper, and describing the various analysis details is probably outside the scope of this paper, that should be explicitly stated and the papers acknowledged in some form.

The paper leaves out any discussion of random direction control studies, which are, in fact, an excellent tool for evaluating the PDF of the test statistic.

 


Specific Comments-------------------

Title. I think that the title is perhaps overly broad given that the paper is focused in particular on the statistical aspects of searches targeting the dSphs.

L 2. " that are searched for" : "that could be detected by" ( or "that could be searched for by" )

L 4. "of their dark matter content" ( i think that "their" here is technically referring to the observations, not the dSphs themselves, maybe reword to clarify)

L 5. "of these searches" : "of searches targeting dSphs" (you haven't said anything about "searches" yet, in the previous sentence you mentioned "observations" and "constrains", but neither of these is quite the same thing as a "search")

L 5. "is optimized" : "can be optimized"

L 5-6. overall this sentence is a bit misleading in my mind, I think that the "advanced" statistical techniques actually help deal with things like nuisance parameters and J-factor uncertainties, and target stacking. Modelling the spectra and the morphology of a source is pretty standard for gamma-ray analysis.

L 50-57. In practical terms I don't think there is much hope in measuring the branching ratios to various individual quark final states, as the spectra are all pretty similar and separating out the various parts of an admixture would be very challenging. You might consider rewording this sentence to avoid giving the impression that they could. It would probably be fair to say measuring the spectra could help distinguish between quark final states and lepton final states; I'd be hesitant to claim much more than that.

L 72. "peculiar" : :"particular:" ( There isn't anything that strange about the expected DM spectra or morphology, it is just different from the background)

L 77. "Among these latter ones we find" : "These latter ones include"

L 79. "or the treatment" : "and the treatment"

L 79-80: "treatment of the ... estimation of the dark mater signal and background intensities." Are you referring to the modeling, i.e., J-factor, uncertainty, or to the measurement uncertainty, maybe reword to clarify.

Figure 1. This figure does not look correct. The stated J-factor (5 x 10^21) and cross sections (3 x 10-26) are quite large, but the flux curves are well below the sensitivity curves. If this were correct then we should not be able to set limits anywhere near 3 x 10-26, but in fact the limits are much deeper than that.
The sensitivity curves seem ok, I suspect the problem is with the expected flux curves, perhaps they are being constructed in MeV cm^-2 s^-1 instead of erg cm^-2 s^-1.

L 95. "stellar activity that could produce a relevant background." I don't think that "stellar activity" is quite the right choice of words here; the point is more specifically that they don't contain gamma-ray emitting sources such as pulsars and supernova remnants, and in general don't contain objects that could accelerate particles up to GeV energies. It is true that some galactic gamma-ray sources, e.g., young pulsars & SNRs, are correlated to star formation, but other sources, e.g., millisecond pulsars, are not. Maybe reword this to emphasize the point that dSphs don't contain objects that can produce lots of high energy gamma rays.

L 97. "contain no significant amount of gas, which allows their mass distribution to be inferred from stellar motions." This is a non-sequitur, the mass distribution can be inferred from the stellar motions, the amount of gas in the system is not relevant. Maybe the point is more that there isn't a lot of dark gas that could explain the very high mass-to-light ratio.

L 98-100. the logic presented in these lines is not quite right. It is true that the J-factors of the dSphs are well known compared to other astrophysical systems, but in fact this is b/c the stellar motions constrain the total mass quite well. The absence of gas allows us to attribute all of the mass to dark matter.

L 102. "sit on relatively clean interstellar environments." This is vague, it would be better to define what about the environment makes it "clean", e.g., that the dSphs are well out into the milky way halo, and that the particle densities, cosmic ray fluxes and radiation fields are all very small.

L 107. " relatively low amount of assumptions" : "relatively few assumptions" Also, you might want to get rid of both "relatively" in this sentence, it is true that the dSphs analysis includes fewer assumptions than say the GC analysis, but since you don't discuss other targets in this paper it isn't obvious what the "relatively" here is comparing to.

L 120. "bosons, that would" : "boson, which would"

L 121. "This allows a relatively straightforward computation" : "It is straightforward to compute"

L 147. "larger J-factor central values" This could be read to mean "the J-factor at the center of the dSphs" instead of "the central value of the PDF (or posterior distribution) for the J-factor, I recommend you reword it to avoid the ambiguity.

L 150. Somewhere around here it would be useful to give typical J-factor uncertainties.

L 157-158: As worded this sentence is potentially misleading. As the next couple of sentences suggest, we could also measure higher-energy gamma rays above the atmosphere; it is just that it would take an impracticably large detector, both because of the low fluxes and because of the need to contain the interaction shower to accurately measure the energy. I recommend rewording the sentence slightly to more explicitly state those points.

L 169. "and until recently" The Fermi-LAT is still in a sky survey mode, it is just that the survey parameters have been modified to account for the solar panel anomaly.

L 181. "allow to estimate" : "allow for the estimation of"

L 209. "The currently most advanced" : "The most advanced current" (or "Currently, the most advanced")

L 214. "should new dSphs will be" : ":should new dSphs be"

L 225. "astrophysical conventional" : "conventional astrophysical"

L 226. "On the other hard, however" : "However"

L 226. "exact shape" : "morphology" (you have been using that word all along, there is not need to change it in the middle of the paper)

L 227. "because such shape" : "because the morphology"

L 237-239. I don't think this is always true. As a counter-example, typical pulsar spectra are quite similar to ~30 GeV DM decaying to quarks.

L 250-254. I don't entirely agree with the assessment, we don't actually know what the branching fractions of the DM interactions are, so typically what we do is _assume_ a particular decay channel. That is not really the same thing as using the spectral information to either increase the credibility of a detection or to strengthen the constraints. In fact, searching for different masses and decay channels actually results in a pretty large trials factor which weakens the search sensitivity.

L 266. In practical terms it is often Chernoff's theorem that applies, not Wilks' theorem. For example, for Fermi-LAT searches the signal is not allowed to be negative, so we actually have a bounded DOF and expect 50% of the trials to have TS=0 and the other 50% to follow a chi^2 distribution.
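(As a quick numerical illustration of this expectation, here is a minimal sketch, with hypothetical numbers and not tied to any actual Fermi-LAT analysis: for a signal strength bounded at zero, the null-hypothesis TS is a 50/50 mixture of zeros and a chi^2 with one degree of freedom.)

```python
import numpy as np
from scipy import stats

# Toy check of the Chernoff-type expectation for a parameter bounded at zero:
# if the unconstrained estimate s_hat is N(0, sigma^2) under the null, the
# bounded-fit TS = (max(s_hat, 0) / sigma)^2 is zero half the time and follows
# a chi^2 distribution with 1 dof the other half.
rng = np.random.default_rng(1)
s_hat = rng.normal(0.0, 1.0, 1_000_000)
ts = np.maximum(s_hat, 0.0) ** 2

print(f"fraction of trials with TS = 0: {np.mean(ts == 0):.3f}")   # ~0.5
# Among trials with TS > 0, the tail probability should match chi^2_1:
print(f"P(TS > 2.71 | TS > 0) = {np.mean(ts[ts > 0] > 2.71):.3f}, "
      f"chi2_1 tail at 2.71 = {stats.chi2.sf(2.71, df=1):.3f}")
```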

L 286+. Well, the uncertainties affecting s_ij can certainly affect the upper limits. Consider for example uncertainties of the effective area, they would directly lead to uncertainties in the expected number of signal events and hence the resulting upper limits. So I think you might want to rephrase this statement. It is certainly true that we typically ignore the explicit dependence of s_ij on nuisance parameters.

Eq 16. This equation is simply not correct for Fermi-LAT analysis. The IRFs are defined in instrument coordinates, and then we must calculate "effective IRFs" for a particular ROI based on the entire observation history of the mission. Pulling T_obs out of the integrals does not at all reflect what we actually do. Rather, we integrate the IRFs in instrument coordinates over the "observing profile", i.e., the livetime a particular direction in the sky is at a particular spot in the instrument reference frame. We often average the "observing profile" over our region of interest.
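(To make the "observing profile" idea concrete, here is a minimal sketch with entirely hypothetical effective-area and livetime values, not the actual Fermi-LAT IRFs or science tools: the exposure for a sky direction is built by weighting the instrument-coordinate response by the livetime accumulated at each off-axis angle, rather than by pulling a single T_obs out of the integral.)

```python
import numpy as np

# Hypothetical sketch of an "effective" response: the effective area is defined
# in instrument coordinates (off-axis angle theta), and the exposure for a given
# sky direction is the effective area weighted by the observing profile, i.e.
# the livetime accumulated at each off-axis angle over the observation history.
theta_edges = np.linspace(0.0, 60.0, 13)             # deg, instrument coordinates
theta = 0.5 * (theta_edges[:-1] + theta_edges[1:])   # bin centers

# Hypothetical effective area vs off-axis angle at one fixed energy [cm^2]
aeff = 8000.0 * np.cos(np.radians(theta)) ** 3

# Hypothetical observing profile: livetime [s] spent at each off-axis angle
# for one sky direction, accumulated over the mission.
livetime = np.array([1.2, 1.5, 1.8, 2.0, 2.1, 1.9,
                     1.6, 1.2, 0.8, 0.5, 0.2, 0.1]) * 1e6

exposure = np.sum(aeff * livetime)                   # cm^2 s
print(f"effective exposure at this energy: {exposure:.3e} cm^2 s")
```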

L 296-8. As worded, this statement is not true. The statement also depends on the size of the region of interest. Taking a large region of interest and then using a small number of spatial bins would very negatively impact the performance of the analysis. I think that what you really want to be talking about here is the number of spatial bins _inside the PSF_, or perhaps the number of bins _inside the target_. But in general in Fermi-LAT analysis the ROI is much larger than either of these.

L 303. As with Eq. 16, this description is not quite accurate for Fermi-LAT analysis, it is true that we could define effective PSF and energy response functions for a particular observation and direction in the sky, but that should be stated explicitly.

L 318-323. This description doesn't really apply to Fermi data. It isn't really feasible to allow for negative values of <sigma v> in Fermi analysis. In practical terms allowing for negative values of <sigma v> does not work because the PSF is sharply peaked, and at high energies the backgrounds are very small, so the total model contributions can easily become negative for the central pixels in the target model (i.e., the signal PDF becomes more negative than the total background PDF); if there are any data counts in the same pixel this will cause the Poisson likelihood to be undefined. Although there are a number of possible workarounds for this, they all introduce their own set of potential problems. In fact, I think the point is more that Chernoff's theorem, rather than Wilks' theorem, should be applied.
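(The practical obstacle can be seen in a one-line toy example, with hypothetical numbers: the per-pixel Poisson term cannot be evaluated once the total model prediction goes negative.)

```python
import numpy as np

# Per-pixel Poisson log-likelihood term: n*ln(mu) - mu - ln(n!).
# If a negative <sigma v> drives the total prediction mu = b + s below zero in a
# pixel that contains counts, ln(mu) is undefined and the likelihood breaks down.
def poisson_logl(n, mu):
    if mu <= 0:
        return np.nan                      # term not defined for non-positive mu
    return n * np.log(mu) - mu - np.sum(np.log(np.arange(1, n + 1)))

b, s = 0.05, -0.2                          # hypothetical background and (negative) signal
print(poisson_logl(1, b + s))              # nan: the likelihood cannot be evaluated
```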

L 332-336. These sentences are potentially confusing. Typically, Wilks' and Chernoff's theorems are statements about the distribution of the test statistic in the case that the null hypothesis is true, so it is confusing to refer to the PDF of the test statistic for non-null values of alpha_true. The subtlety here is that when setting upper limits what one does is essentially to redefine the null hypothesis to be the case that alpha_true = alpha_hat and then applying Wilks' or Chernoff's theorem to that case. Maybe rephrase these sentences to explain more clearly that 1) what we are treating as the null hypothesis in the context of Wilks' or Chernoff's theorem depends on whether we are quoting a detection significance or setting upper limits and 2) that we have to account for the difference in the product of alpha and J.

L 379. Which one of these backgrounds is dominant depends on the particular dSphs being considered. It is probably useful to explicitly point that out.

L 387-389. The fit is "broadband" in the sense that it is performed over several energy bins and includes a spatial model for each of the energy bins.

L 390-392. It is potentially a bit misleading to say that the Fermi-LAT analysis does not include any nuisance parameters for the background model. In fact, the background model of the ROIs has a large number of parameters that were fit in work that predates the dSphs analysis (i.e., the construction of the catalogs used to build the models of the ROIs, and then the refitting of the normalizations of the sources in the ROI). It would be better to state that the analysis does not treat any parameters as explicit nuisance parameters in the final stage of the ROI analysis, when fitting for the signal fluxes. The fact that adding a putative source to the model does not result in changes to the background parameters effectively demonstrates that the backgrounds are well constrained by the fitting procedures. It would be worthwhile to add that studies showed that the effect of the background uncertainty contributed at the level of a few % of the statistical uncertainty of the signal and is safe to neglect.

L 392+ and 395-397. In fact, in LAT pass 8 analyses of the dSphs the energy dispersion was considered. Also, studies showed that for the spectra being considered, the effects of using the energy-bin likelihoods as an intermediate step were negligible compared to the uncertainties in the J-factors.

Figure 2. "from 300 realizations" : "from 300 simulated realizations"

Lines 428-438 and Equation 22. It would be useful to describe the equation a bit more precisely in the text. The equation is the product of Poisson likelihoods for the signal region and the background region and a Gaussian prior on the scaling factor between the counts expected in the signal and background regions. Maybe just adding something to that effect after the equation would be helpful. It is also probably worth commenting on the effect of the typical 1% uncertainty in the background normalization. That 1% uncertainty implies that the best an analysis can reasonably expect to do is to have an uncertainty of about 100 counts on the signal, or a 95% UL of somewhere around 200 counts. Is that in fact typical for IACT results?
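(To make the structure explicit, here is a minimal sketch of a likelihood of the kind described: two Poisson terms plus a Gaussian constraint on the On/Off normalization tau. The counts are hypothetical and the 1% width only mimics the quoted background systematic; this is not the actual implementation of Equation 22.)

```python
import numpy as np
from scipy import stats, optimize

# Sketch of an On/Off likelihood of the type described for Eq. 22:
#   Poisson(N_on | s + tau*b) * Poisson(N_off | b) * Gauss(tau | tau_obs, sigma_tau)
# Hypothetical counts; sigma_tau = 1% mimics the quoted background systematic.
N_on, N_off = 10_000, 10_000
tau_obs, sigma_tau = 1.0, 0.01

def neg_logl(params):
    s, b, tau = params
    if b <= 0 or tau <= 0 or s + tau * b <= 0:
        return np.inf
    return -(stats.poisson.logpmf(N_on, s + tau * b)
             + stats.poisson.logpmf(N_off, b)
             + stats.norm.logpdf(tau, tau_obs, sigma_tau))

res = optimize.minimize(neg_logl, x0=[0.0, float(N_off), tau_obs], method="Nelder-Mead")
s_hat, b_hat, tau_hat = res.x
print(f"best-fit signal counts: {s_hat:.1f}, tau: {tau_hat:.4f}")
# With ~1e4 background counts, the 1% uncertainty on tau alone corresponds to
# ~100 counts of uncertainty on s, i.e. a 95% upper limit of roughly 200 counts,
# consistent with the back-of-the-envelope estimate in the comment above.
```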

L 460-462. I suspect that these approximations are pretty much negligible compared to the J-factor uncertainties. It would be useful to give order of magnitude estimates of the size of the effects to confirm this.

L 516-517. Elsewhere in the paper the uncertainties of the J-factors are included in the limits. Here they are not, and that is artificially making the limits much more constraining than they should be. It would be good to quantify this effect.

L 531 "fist" : "first:"

Figure 3. "the median of the distribution of limits under the null hypothesis" : "the median of the distribution of expected limits under the null hypothesis" Just to be explicit that the bands are expectation bands, and derived from simulations.

L 546-551. Some estimates of the magnitude of the effects caused by the different analysis choices described here would be very useful.

L 567. "In addition, for the first time, the Off/On" It isn't clear if this is the first time the MAGIC collaboration considered the uncertainty on tau, or if it is first time anyone did. Part of the confusion is that equation 22 includes the tau as a nuisance parameter, and the wording there seems to indicate that this is the standard formulation for the likelihood used by IACTs.

L 593+ "We remind that" : "We remind the reader that"

L 605-606. Is it really the mean that is reported? Not the median? If the uncertainties in J are expressed in log-space these could be quite different. Also, as noted, this prescription for taking into account the J-factor uncertainties is very different from profiling over the J-factor uncertainty, as done by others. It would be useful to comment on the scale of the effect.

L 607-628. This discussion is confusing and not really convincing. If I understand correctly, the method presented here ignores the contribution to the Poisson likelihood of the total number of observed events, i.e., essentially considering the PDF of the total number of events as a delta function at the number of events actually observed, and focuses instead on the distributions of E\prime and Theta\prime for those events. I believe that this approach requires the signal and background PDFs to be properly normalized over the ranges relevant for the analysis. It is also stated that b (i.e., the "number" of background events) is fixed during the likelihood maximization. Since s (the "number" of signal events) is allowed to vary, and all the quantities in the likelihood consist of combinations of s and b, this is essentially equivalent to varying the fraction of signal events in the fitting. f_b is also held fixed during the fitting; it isn't stated explicitly, but I imagine that f_s is also held fixed, as the PDF in E' and Theta' seems to contain all the information about the signal that is being fitted. So basically, it sounds like the only thing that is being freed in the fitting is the fraction of events attributable to the signal. Then each event has two contributions to the log-likelihood, basically the current signal fraction times the value of the signal PDF given E' and Theta' and the current background fraction times the value of the background PDF given E' and Theta'. It should also be stated clearly if this methodology requires that the signal and background PDFs (f_s and f_b) must be normalized, and if they are how that normalization is performed; i.e., if E' and Theta' are truly independent or if Theta' is conditional on E'.
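(To spell out that reading in code, here is a minimal sketch of such a mixture likelihood, with hypothetical one-dimensional PDFs standing in for the (E', Theta') dependence; it is not the actual implementation of the paper, only an illustration of fitting the signal fraction with b and the event PDFs held fixed.)

```python
import numpy as np
from scipy import stats, optimize

# Sketch of the per-event mixture likelihood described above: with b fixed and
# only s free, each event contributes ln[w_s * f_s(x_i) + w_b * f_b(x_i)],
# where w_s = s/(s+b) and w_b = b/(s+b). A single observable x stands in
# for (E', Theta'); the PDFs are hypothetical and normalized on the fit range.
rng = np.random.default_rng(2)
f_s = stats.norm(0.0, 0.5).pdf          # hypothetical signal PDF
f_b = stats.uniform(-3.0, 6.0).pdf      # hypothetical flat background PDF on [-3, 3]

# Hypothetical data set: 950 background events plus 50 signal events
x = np.concatenate([rng.uniform(-3.0, 3.0, 950), rng.normal(0.0, 0.5, 50)])

b_fixed = 950.0                         # background "number of events", held fixed

def neg_logl(s):
    w_s, w_b = s / (s + b_fixed), b_fixed / (s + b_fixed)
    return -np.sum(np.log(w_s * f_s(x) + w_b * f_b(x)))

res = optimize.minimize_scalar(neg_logl, bounds=(0.1, 500.0), method="bounded")
print(f"fitted number of signal events: {res.x:.1f}  (50 injected)")
```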

At lines 619-620 it is stated that "Fourier transform of the PDF for a compound Poisson distribution can be simply computed as a function of mu and the PDF for x_i." It isn't clear if the rest of the paragraph is then a description of that process, or if the "function of mu and the PDF for x_i" is not given in the text. It would be good to clarify that, and give the relevant function, if possible.

It isn't clear exactly how this maps into the stated condition for compound poisson variables. The statement at lines 620-622: "In our case, N is the number of observed signal or background events, with mu = s or b, respectively, and the x_i are the single-event contributions to 2ln lambda_p from signal or background events, respectively." could be interpreted in a number of different ways. What does the index i run over? At what point in the process do we define s and b, i.e., are they the best fit values, or does it even matter? Etc...

Anyway, I think that the point here is just that the log-likelihood consists of some number of events taken from a signal distribution and some other number of events taken from a background distribution, and all the signal events share the same PDF, and similarly all the background events share the same PDF. It might be helpful to state that in plainer language in addition to the formal definition given here.

Similarly, the sentence at lines 622 to 625 is very difficult to follow. In part b/c it is not quite clear what some parts of the sentence refer to: 1) what is the "it" in "by computing it in sufficiently fine bins" (literally what the text is saying is that you can compute a quantity by computing it in small bins, that can't be quite right, maybe you mean to say that you can compute the contribution from each bin) 2) what are "the obtained values" in "and introducing the obtained values in a distribution" (literally these could be the values from each of the bins, or the value for each of the events, I'm pretty sure that it is the value for each bin, but it would be good to be explicit) 3) what is the "distribution" referred to by "and introducing the obtained values in a distribution weighted by", do you just mean, "and weighting them by" 4) what is the "they" referred to by "their corresponding s_ij value", I think you mean something like "the expected number of signal events in that bin of E' and Theta' (i.e., s_ij)" Anyway, this sentence would be clearer if these quantities were referred to by standard terms or symbols.

It is also not clear from the text how one performs the Fourier transform in question. What is the variable (or variables) that is being transformed over? Is it the input variables to the PDF (E', Theta'), or the weighted distribution of the partial log-likelihoods (I think so, but it is not clear from the text)? What sort of boundary conditions need to be applied, if any, etc... I think that the Fourier transform in question is on the weighted distribution of the log-likelihoods for signal and background PDFs, and that the transforming variable is the log-likelihood itself, but this should be stated explicitly.
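(For reference, one generic way to carry out such a computation, not necessarily the exact procedure of the paper under review: use the characteristic function of a compound Poisson variable, phi_S(t) = exp[mu * (phi_x(t) - 1)], with phi_x obtained from the discretized per-event PDF, and invert it with an FFT. A minimal sketch with hypothetical inputs:)

```python
import numpy as np

# Generic sketch of the characteristic-function / FFT method for a compound
# Poisson sum S = x_1 + ... + x_N, with N ~ Poisson(mu) and i.i.d. x_i:
#   phi_S(t) = exp( mu * (phi_x(t) - 1) ).
# For simplicity the hypothetical per-event contribution x_i is kept
# non-negative and the grid is wide enough to avoid FFT wrap-around.
mu = 20.0                                   # hypothetical mean number of events

n_bins = 8192
dx = 120.0 / n_bins
grid = np.arange(n_bins) * dx               # values 0 ... ~120

# Hypothetical per-event PDF (Gaussian around 2), discretized to bin probabilities
p_x = np.exp(-0.5 * ((grid - 2.0) / 0.7) ** 2)
p_x /= p_x.sum()

phi_x = np.fft.fft(p_x)                     # discrete characteristic function of x_i
phi_S = np.exp(mu * (phi_x - 1.0))          # characteristic function of the sum S
p_S = np.real(np.fft.ifft(phi_S))           # bin probabilities of S on the same grid

print(f"sum of probabilities: {p_S.sum():.3f}")
print(f"mean of S: {np.sum(grid * p_S):.2f}  (expected mu * E[x] ~ {mu * 2.0:.1f})")
```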

It is also not clear how the PDF of the likelihood function is actually turned into something like a p-value, i.e., the probability to observe a particular set of results if the null hypothesis were true, using this methodology. The description given just explains how to compute the PDF of the likelihood function; it doesn't distinguish between the true values of the parameters and the fitted values of the parameters. Anyway, given my imperfect understanding of this methodology, I believe that it could be used to determine the PDF for log-likelihood values, assuming that s = 0 (i.e., that the null hypothesis is true), or, alternatively, that it could be used to derive the PDF for log-likelihood values for a particular value of s (say the best fit value). It would be good to clarify how it is being used.

I'm dubious about presenting this methodology as an improvement over the standard methodology based on Wilks' theorem. It seems to me like there are some details that could adversely affect the results, in particular, the choice of binning used to compute the distribution of the log-likelihoods, the limits of E' and Theta' used to define the integration range to normalize the PDFs, and the fact that the fluctuations in the total number of events have been ignored in the treatment. If the methodology requires that the signal and background PDFs be normalized, I think that is a pretty heavy downside, as the details of how exactly the normalization is done could present some significant complications.

Overall, I'm not sure that this methodology warrants quite so much space in the paper. If the author does want to include a description of it, they might consider moving it to an appendix.


L 648-649. It is my understanding that the output of the f_hit analysis is more a proxy for the energy than a reconstructed energy (with an associated energy dispersion matrix) in the sense of equation 17. I think in this context it is being treated as if it were a reconstructed energy, with the caveat that MC simulations were used to generate the PDFs for f_hit for different input spectra, rather than using explicitly tabulated IRFs, as is done with Fermi. Also, I believe that the HAWC collaboration has developed a new energy estimation analysis, with a much tighter response matrix, which they will be able to treat more as a reconstructed energy than the f_hit (or n_hit) analysis.

Figure 4. The presentation of the expectation bands in the left-hand plot is pretty non-standard. Is the dashed line the median expectation and then the colored bands the expectation bands? If so, why are only the positive bands shown? And why does the actual result track the median expectation so closely? If, on the other hand, what is being plotted is actually something different, then it should be explained in detail.

667-670. Here is another place where it would be helpful to include quantitative details about the size of the effects that the HAWC collaboration did consider.

672-675. "The significance of rejection of the null hypothesis for all targets ... is within +- 2 sigma, except for few marginally larger negative fluctuations." You can't have a negative significance, this is mixing two different statistical concepts. It would be better to phrase this to say that the maximum likelihood estimates are consistent with zero to within +- 2 sigma.

 

Author Response

This paper is a nice overview of searches for DM annihilation signals from dSphs using gamma-ray data and in particular of the statistical details of those searches. The current version has a number of weaknesses, but I am confident that the author can address them. Once he has done so, this paper will be wholly suitable for publication.

For the most part, the paper is strongest in its discussion of the differences between the statistical methodologies of the various IACTs. Those sections are quite clearly written, informative and insightful. The description of the VERITAS methodology is an exception; it is difficult to follow and I'm not certain the methodology warrants as much detail as is given.

-> This section has been greatly shortened (see more below in reply to specific comment)

On the other hand, the first three sections and the description of the Fermi-LAT analysis suffer a bit from lack of precision, and contain a number of statements that are either potentially confusing or inaccurate, and which should be clarified or fixed.

-> Also fixed according to specific comments


The paper would also benefit from some more quantitative discussion of the scale of the effects caused by the approximations used in the various analyses. Given the relatively large uncertainties in the J-factors, I suspect that in many cases the associated uncertainty dominates most other uncertainties, and the decision to neglect those other uncertainties is wholly justified. In my comments I've tried to identify places where some more quantitative information would be particularly useful.
Table 1 has 6 columns that tabulate different choices made in the statistical analysis by different collaborations. It would be great if the authors could give some indication as to the size of these effects for the various analyses.

-> I have tried to include some more information when possible. In other cases it is not possible, due to the impossibility of accessing the needed information. In those cases, it is my opinion that it should be the responsibility of the authors of the different works summarized here to show whether a certain approximation is justified or not.


The paper could also benefit from a bit more high-level discussion about the difference between analyses with Fermi, the IACTs and HAWC. Fermi analyses are often signal-limited, particularly at high energies, whereas IACT and HAWC analyses are background limited. What's more, the background subtraction methods used by HAWC and by the IACTs differ in significant ways. It would be useful to give an overview of this (perhaps towards the beginning of either sections 3 or 4, before jumping into the details).

-> This goes against how the storyline is organized, which is related to a certain message I want to convey: that all analyses can be described by a general framework valid for all, which then needs to be particularized/adapted to the different experimental situations.


The LAT data are publicly available and several authors outside the LAT collaboration have analyzed the dSphs for DM annihilation signals. Although many of the papers depart significantly from the common statistical framework described in this paper, and describing the various analysis details is probably outside the scope of this paper, that should be explicitly stated and the papers acknowledged in some form.

-> I have included a sentence at the beginning of section 4.1 and cited all papers about searches for DM in dSphs that are in the Fermi-LAT publications website https://www-glast.stanford.edu/cgi-bin/pubpub

The paper leaves out any discussion of random direction control studies, which are, in fact, an excellent tool for evaluating the PDF of the test statistic.

-> This is mentioned twice: in lines 324-325 and in the caption of Figure 2.


Specific Comments-------------------

Title. I think that the title is perhaps overly broad given that the paper is focused in particular on the statistical aspects of searches targeting the dSphs.

-> "Gamma-ray Dark Matter Searches in Milky Way Satellites - A Comparative Review of Data Analysis Methods and Current Results"

L 2. " that are searched for" : "that could be detected by" ( or "that could be searched for by" )

-> Ok

L 4. "of their dark matter content" ( i think that "their" here is technically referring to the observations, not the dSphs themselves, maybe reword to clarify)

-> Changed to "Observations of dwarf spheroidal satellite galaxies of the Milky Way (dSphs) benefit from the relatively accurate predictions of dSph dark matter content"

L 5. "of these searches" : "of searches targeting dSphs" (you haven't said anything about "searches" yet, in the previous sentence you mentioned "observations" and "constrains", but neither of these is quite the same thing as a "search")

-> "The sensitivity of these observations for the search for dark matter signals can be optimized"

L 5. "is optimized" : "can be optimized"

-> Ok

L 5-6. overall this sentence is a bit misleading in my mind, I think that the "advanced" statistical techniques actually help deal with things like nuisance parameters and J-factor uncertainties, and target stacking. Modelling the spectra and the morphology of a source is pretty standard for gamma-ray analysis.

-> I leave this unchanged. Spectral and morphological modeling are common now, but were not some time ago; see, e.g., Abdo et al. ApJ 712 (2010) 147 for LAT, Aharonian et al. Astropart. Phys. 29 (2008) 55 for HESS, Aleksic et al. JCAP 06 (2011) 035 for MAGIC, or Aliu et al. Phys. Rev. D 85, 062001 (2012) for VERITAS. All other considerations mentioned by the referee are also discussed in the main text, and in my opinion not suited for an abstract.


L 50-57. In practical terms I don't think there is much hope in measuring the branching ratios to various individual quark final states, as the spectra are all pretty similar and separating out the various parts of an admixture would be very challenging. You might consider rewording this sentence to avoid giving the impression that they could. It would probably be fair to say measuring the spectra could help distinguish between quark final states and lepton final states; I'd be hesitant to claim much more than that.

-> Change "measured" by "studied"

L 72. "peculiar" : :"particular:" ( There isn't anything that strange about the expected DM spectra or morphology, it is just different from the background)

-> Ok

L 77. "Among these latter ones we find" : "These latter ones include"

-> Ok

L 79. "or the treatment" : "and the treatment"

-> Ok

L 79-80: "treatment of the ... estimation of the dark mater signal and background intensities." Are you referring to the modeling, i.e., J-factor, uncertainty, or to the measurement uncertainty, maybe reword to clarify.

-> "These latter ones include the methods for computing the spectral and morphological models for the expected gamma-ray signal and associated background, their use in the statistical analysis, and the treatment of the related statistical and systematic uncertainties"

Figure 1. This figure does not look correct. The stated J-factor (5 x 10^21) and cross sections (3 x 10-26) are quite large, but the flux curves are well below the sensitivity curves. If this were correct then we should not be able to set limits anywhere near 3 x 10-26, but in fact the limits are much deeper than that.
The sensitivity curves seem ok, I suspect the problem is with the expected flux curves, perhaps they are being constructed in MeV cm^-2 s^-1 instead of erg cm^-2 s^-1.

-> Yes, thanks! Embarrassing mistake, indeed... There was a factor of 1e3 missing... I fixed it and changed the value of J to a more typical value.

L 95. "stellar activity that could produce a relevant background." I don't think that "stellar activity" is quite the right choice of words here; the point is more specifically that they don't contain gamma-ray emitting sources such as pulsars and supernova remnants, and in general don't contain objects that could accelerate particles up to GeV energies. It is true that some galactic gamma-ray sources, e.g., young pulsars & SNRs, are correlated to star formation, but other sources, e.g., millisecond pulsars, are not. Maybe reword this to emphasize the point that dSphs don't contain objects that can produce lots of high energy gamma rays.

-> "they harbor no known astrophysical gamma-ray sources that could produce a relevant background"

L 97. "contain no significant amount of gas, which allows their mass distribution to be inferred from stellar motions." This is a non-sequitur, the mass distribution can be inferred from the stellar motions, the amount of gas in the system is not relevant. Maybe the point is more that there isn't a lot of dark gas that could explain the very high mass-to-light ratio.

-> see next

L 98-100. the logic presented in these lines is not quite right. It is true that the J-factors of the dSphs are well known compared to other astrophysical systems, but in fact this is b/c the stellar motions constrain the total mass quite well. The absence of gas allows us to attribute all of the mass to dark matter.

-> "Furthermore, dSphs contain in general no significant amount of dark gas, which allows their dark matter distribution to be inferred with relatively good precision from the stellar motions, enabling in turn robust predictions of the intensity of the associated gamma-ray signals, generally within a accuracy of one order of magnitude~\cite{ref:Geringer2014}. "

L 102. "sit on relatively clean interstellar environments." This is vague, it would be better to define what about the environment makes it "clean", e.g., that the dSphs are well out into the milky way halo, and that the particle densities, cosmic ray fluxes and radiation fields are all very small.

-> "Finally, given how most of the known dSphs sit on relatively clean interstellar environments (i.e.\ well out into the Milky Way halo, where the particle densities, cosmic ray fluxes and radiation fields are all small.), the expected gamma-ray signal would come from well-understood prompt processes"

L 107. " relatively low amount of assumptions" : "relatively few assumptions" Also, you might want to get rid of both "relatively" in this sentence, it is true that the dSphs analysis includes fewer assumptions than say the GC analysis, but since you don't discuss other targets in this paper it isn't obvious what the "relatively" here is comparing to.

-> "Therefore, since flux predictions rely on relatively few assumptions compared to other typical observational targets like e.g. the Galactic center or clusters of galaxies, the bounds on the WIMP properties that can be inferred from the presence or absence of a gamma-ray signal are also relatively robust."

L 120. "bosons, that would" : "boson, which would"

-> Ok

L 121. "This allows a relatively straightforward computation" : "It is straightforward to compute"

-> Ok

L 147. "larger J-factor central values" This could be read to mean "the J-factor at the center of the dSphs" instead of "the central value of the PDF (or posterior distribution) for the J-factor, I recommend you reword it to avoid the ambiguity.

-> "usually have larger estimated J-factors but also larger uncertainties"

L 150. Somewhere around here it would be useful to give typical J-factor uncertainties.

-> "In general, the classical dSphs, with relatively large stellar populations ($O(100-1000)$), have relatively low associated J-factors (typically between $3\times 10^{17}$ and
$7 \times 10^{18}$~GeV$^2$cm$^{-5}$ within an integrating angle of $0.5^\circ$), with associated uncertainties also relatively low (typically below $50\%$), suitable for setting robust limits to dark matter properties. On the other hand, members of the ultra-faint population (those discovered by the Sloan Digital Sky Survey or later, with $O(10-100)$ members or less stellar populations) can have larger estimated J-factors (some above $10^{19}$~GeV$^2$cm$^{-5}$) but also larger uncertainties (some above a factor 10)"

L 157-158: As worded this sentence is potentially misleading. As the next couple of sentences suggest, we could also measure higher-energy gamma rays above the atmosphere; it is just that it would take an impracticably large detector, both because of the low fluxes and because of the need to contain the interaction shower to accurately measure the energy. I recommend rewording the sentence slightly to more explicitly state those points.

-> "At energies below $\sim100$ GeV, we can efficiently measure gamma rays [...]"

L 169. "and until recently" The Fermi-LAT is still in a sky survey mode, it is just that the survey parameters have been modified to account for the solar panel anomaly.

-> removed

L 181. "allow to estimate" : "allow for the estimation of"

-> ok

L 209. "The currently most advanced" : "The most advanced current" (or "Currently, the most advanced")

-> Ok

L 214. "should new dSphs will be" : ":should new dSphs be"

-> Ok

L 225. "astrophysical conventional" : "conventional astrophysical"

-> Ok

L 226. "On the other hard, however" : "However"

-> Ok

L 226. "exact shape" : "morphology" (you have been using that word all along, there is not need to change it in the middle of the paper)

-> Ok

L 227. "because such shape" : "because the morphology"

-> Ok

L 237-239. I don't think this is always true. As a counter-example, typical pulsar spectra are quite similar to ~30 GeV DM decaying to quarks.

-> I think the sentence is true if one does not consider instrumental effects, which is the case at this point in the text. Also, I say "in general" and then categorize the possibilities, starting from the "most extreme" case, which is the spectral line, to underline the idea that some DM spectra are more peculiar than others.

L 250-254. I don't entirely agree with the assessment, we don't actually know what the branching fractions of the DM interactions are, so typically what we do is _assume_ a particular decay channel. That is not really the same thing as using the spectral information to either increase the credibility of a detection or to strengthen the constraints. In fact, searching for different masses and decay channels actually results in a pretty large trials factor which weakens the search sensitivity.

-> The sentence says that the uncertainty of dN/dE can be considered negligible *for a given annihilation/decay channel*, so there is no contradiction with the fact that we don't know the BRs for the different channels. The rest of this comment is discussed later in lines 269-276.


L 266. In practical terms it is often Chernoff's theorem that applies, not Wilks' theorem. For example, for Fermi-LAT searches the signal is not allowed to be negative, so we actually have a bounded DOF and expect 50% of the trials to have TS=0 and the other 50% to follow a chi^2 distribution.

-> Yes, it is actually discussed later (l316 on) how the conditions of Wilks' theorem are not fulfilled. I prefer to refer to Wilks' theorem, which I believe is better known among potential readers. Since it is later said that the conditions are not met, I think all statements are correct.


L 286+. Well, the uncertainties affecting s_ij can certainly affect the upper limits. Consider for example uncertainties of the effective area, they would directly lead to uncertainties in the expected number of signal events and hence the resulting upper limits. So I think you might want to rephrase this statement. It is certainly true that we typically ignore the explicit dependence of s_ij on nuisance parameters.

-> "However, uncertainties affecting $s_{ij}$ are usually considered to be largely dominated by the uncertainty in the J-factor and the dependence of $s_{ij}$ on $\bm{\mu}$ therefore ignored"

Eq 16. This equation is simply not correct for Fermi-LAT analysis. The IRFs are defined in instrument coordinates, and then we must calculate "effective IRFs" for a particular ROI based on the entire observation history of the mission. Pulling T_obs out of the integrals does not at all reflect what we actually do. Rather, we integrate the IRFs in instrument coordinates over the "observing profile", i.e., the livetime a particular direction in the sky is at a particular spot in the instrument reference frame. We often average the "observing profile" over our region of interest.

-> Yes, actually that is true for all instruments; the expression has been corrected accordingly, and so has the following explanatory paragraph.


L 296-8. As worded, this statement is not true. The statement also depends on the size of the region of interest. Taking a large region of interest and then using a small number of spatial bins would very negatively impact the performance of the analysis. I think that what you really want to be talking about here is the number of spatial bins _inside the PSF_, or perhaps the number of bins _inside the target_. But in general in Fermi-LAT analysis the ROI is much larger than either of these.

-> "It must be noted that, defining several spatial bins within the source produces relatively minor improvement in sensitivity to dark matter searches for not significantly extended sources (i.e.\ those well described by a point-like source, as it is the case for many dSphs)~\cite{ref:Nievas2016}. For significantly extended sources, on the other hand, using a too fine spatial binning makes the obtained result more sensitive to the systematic uncertainties in the dark matter spatial distribution within the dSph halo."

L 303. As with Eq. 16, this description is not quite accurate for Fermi-LAT analysis; it is true that we could define effective PSF and energy response functions for a particular observation and direction in the sky, but that should be stated explicitly.

-> Added a time dependence, in accordance with changes in Eq 16


L 318-323. This description doesn't really apply to Fermi data. It isn't really feasible to allow for negative values of <sigma v> in Fermi analysis. In practical terms, allowing for negative values of <sigma v> does not work because the PSF is sharply peaked and at high energies the backgrounds are very small, so the total model contributions can easily become negative for the central pixels in the target model (i.e., the signal PDF becomes more negative than the total background PDF). If there are any data counts in the same pixel, this will cause the Poisson likelihood to be undefined. Although there are a number of possible workarounds for this, they all introduce their own set of potential problems. In fact, I think the point is more that Chernoff's theorem, rather than Wilks' theorem, should be applied.

-> Yes, removed
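A two-line illustration of the problem described above (a generic scipy call, not tied to any specific Fermi tool): a Poisson term with a negative expectation value is undefined:

```python
from scipy.stats import poisson

# If the (signal + background) expectation in a pixel becomes negative while
# the pixel contains counts, the Poisson log-likelihood is undefined
# (scipy returns NaN here).
print(poisson.logpmf(3, 5.0))    # finite
print(poisson.logpmf(3, -0.5))   # nan: negative expectation value
```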

L 332-336. These sentences are potentially confusing. Typically, Wilks' and Chernoff's theorems are statements about the distribution of the test statistic in the case that the null hypothesis is true, so it is confusing to refer to the PDF of the test statistic for non-null values of alpha_true. The subtlety here is that when setting upper limits what one does is essentially to redefine the null hypothesis to be the case that alpha_true = alpha_hat and then apply Wilks' or Chernoff's theorem to that case. Maybe rephrase these sentences to explain more clearly that 1) what we are treating as the null hypothesis in the context of Wilks' or Chernoff's theorem depends on whether we are quoting a detection significance or setting upper limits and 2) that we have to account for the difference in the product of alpha and J

-> True, the paragraph has been modified to account for this
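For concreteness, a sketch of the upper-limit construction being discussed (the m2lnL function is a hypothetical stand-in for the full likelihood, assumed to be already profiled over nuisance parameters): the one-sided 95% CL limit is the parameter value at which -2 ln(lambda) rises by 2.71 above its minimum, with the physical bound alpha >= 0 enforced:

```python
from scipy.optimize import minimize_scalar, brentq

# Sketch of a one-sided 95% CL upper limit from a likelihood scan (hypothetical
# m2lnL standing in for the full likelihood, assumed already profiled over the
# nuisance parameters): the limit is where -2 ln(lambda) rises by 2.71 above
# its minimum, with the physical bound alpha >= 0 enforced.
def upper_limit(m2lnL, alpha_max=1.0, delta=2.71):
    res = minimize_scalar(m2lnL, bounds=(0.0, alpha_max), method="bounded")
    alpha_hat, m2lnL_min = res.x, res.fun
    return brentq(lambda a: m2lnL(a) - m2lnL_min - delta, alpha_hat, alpha_max)

# Toy parabolic -2 ln(lambda) with its unconstrained minimum below zero
m2lnL = lambda alpha: ((alpha + 0.1) / 0.2) ** 2
print(upper_limit(m2lnL))
```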

L 379. Which one of these backgrounds is dominant depends on the particular dSph being considered. It is probably useful to explicitly point that out.

-> Ok

L 387-389. The fit is "broadband" in the sense that it is performed over several energy bins and includes the spatial model for each of the energy bins.

-> Hope this version is more clear: "The flux \emph{normalizations} of the different background components are determined by means of a maximum-likelihood fit to the spatial and spectral distribution of the observed events, with the rest of the spectral parameters fixed to the values listed in the updated third LAT source catalog~\cite{ref:3fgl}."


L 390-392. It is potentially a bit misleading to say that the Fermi-LAT analysis does not include any nuisance parameters for the background model. In fact, the background model of the ROI has a large number of parameters that were fit in work that predates the dSph analysis (i.e., the construction of the catalogs used to construct the models of the ROIs, and then the refitting of the normalization of the sources in the ROI). It would be better to state that the analysis does not treat any parameters as explicit nuisance parameters in the final stage of the ROI analysis, when fitting for the signal fluxes. The fact that adding a putative source to the model does not result in changes to the background parameters effectively demonstrates that the backgrounds are well constrained by the fitting procedures. It would be worthwhile to add that studies showed that the effect of the background uncertainty contributes at the level of a few percent of the statistical uncertainty of the signal and is safe to neglect.

-> " The analysis does not explicitly treat the relevant background parameters $\bm{\mu}$ in Equation~\ref{eq:binnedLkl} as nuisance parameters. Instead, the spectral parameters (e.g.\ normalization, photon index, etc.) of the different background sources are fixed using the following simplified method. The flux \emph{normalizations} of the different background components are determined by means of a global, broadband, maximum-likelihood fit to the spacial a spectral distribution of the observed events, with the rest of spectral parameters fixed to the values listed in the updated third LAT source catalog~\cite{ref:3fgl}. Then, it is checked that the values of the background normalization factors obtained using this method do not change significantly by including an extra weak source at the locations of the dSph, which shows that the background are well-constrained by this procedure. Studies showed that the effect of the background uncertainty from this procedure contributed at a few percent of statistical uncertainty of the signal and are safe to neglect."

L 392+ and 395-397. In fact, in LAT pass 8 analyses of the dSphs the energy dispersion was considered. Also, studies showed that for the spectra being considered, the effects of using the energy-bin likelihoods as an intermediate step were negligible compared to the uncertainties in the J-factors.

-> Phys Rev Lett 115 (2015) 231301 page 4, bottom, reads: "After fixing the background normalizations, we scan the likelihood as a function of the flux normalization of the putative DM signal independently in each energy bin (this procedure is similar to that used to evaluate the spectral energy distribution of a source). Within each bin, we model the putative dSph source with a power-law spectral model (dN/dE \propto E^{−\Gamma}) with spectral index of \Gamma = 2. By analyzing each energy bin separately, we avoid selecting a single spectral shape to span the entire energy range at the expense of introducing additional degrees of freedom into the fit." This procedure necessarily neglects "migration" of events (from true to reconstructed energy), which depends on the entire energy spectrum and not only on the one within a given bin; therefore it is effectively neglecting the energy dispersion in Equation 16.


Figure 2. "from 300 realizations" : "from 300 simulated realizations"

-> Ok

Lines 428-438 and Equation 22. It would be useful to describe the equation a bit more precisely in the text. The equation is the product of Poisson likelihoods for the signal region and the background region, and a Gaussian prior on the scaling factor between the counts expected in the signal and background regions. Maybe just adding something to that effect after the equation would be helpful.

-> Done
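A minimal sketch of the likelihood structure described in the comment above (an assumed form following that description of Eq. 22; all numbers are hypothetical): the On and Off Poisson terms multiplied by a Gaussian constraint on the Off/On exposure ratio tau:

```python
from scipy.stats import poisson, norm

# Minimal sketch of the structure described above (assumed form; all numbers
# are hypothetical): On and Off Poisson terms times a Gaussian constraint on
# the Off/On exposure ratio tau.
def minus2lnL(s, b, tau, N_on, N_off, tau_obs, sigma_tau):
    """s, b: expected signal and background counts in the On region."""
    lnL = (poisson.logpmf(N_on, s + b)                          # On region
           + poisson.logpmf(N_off, tau * b)                     # Off region
           + norm.logpdf(tau_obs, loc=tau, scale=sigma_tau))    # constraint on tau
    return -2.0 * lnL

print(minus2lnL(s=20.0, b=1.0e4, tau=3.0, N_on=10050, N_off=30000,
                tau_obs=3.0, sigma_tau=0.03))
```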

It is also probably worth commenting on the effect of the typical 1% uncertainty in the background normalization.
That 1% uncertainty implies that the best an analysis can reasonably expect to do is to have an uncertainty of about 100 counts on the signal, or a 95% UL of somewhere around 200 counts. Is that in fact typical for IACT results?

-> Ok, I have added a comment, although the topic probably deserves a separate paper


L 460-462. I suspect that these approximations are pretty much negligible compared to the J-factor uncertainties. It would be useful to give order of magnitude estimates of the size of the effects to confirm this.

-> As I had written, that really depends on whether or not the acceptance is flat across the region of the camera onto which the potential DM "image" of the dSph would be projected. Therefore, it depends at least on the DM profile of the dSph, the telescope observing it, and where in the FoV of that telescope it is observed. Also, this is a bias in the Aeff computation that necessarily goes in the direction of overestimating the Aeff, i.e. providing better limits in the case of no detection, whereas the PDF for the log-J-factor estimator is (or at least is assumed to be) Gaussian distributed and could go both ways. I also consider it important to point this out here because one could imagine the J-factors of these or new sources to be low enough that the Aeff effect would become comparatively more relevant, but the same method was used based on the fact that it was already validated in previous works.

L 516-517. Elsewhere in the paper the uncertainties of the J-factors are included in the limits. Here they are not, and that is artificially making the limits much more constraining than they should be. It would be good to quantify this effect.

-> For this analysis, the STATISTICAL uncertainties in the J-factors are included in the usual way described by Equations 16 and 17. The SYSTEMATIC uncertainties are not, but this is not particular to this analysis, as discussed in lines 305-315

L 531 "fist" : "first:"

-> ok

Figure 3. "the median of the distribution of limits under the null hypothesis" : "the median of the distribution of expected limits under the null hypothesis" Just to be explicit that the bands are expectation bands, and derived from simulations.

-> the expected limit or sensitivity is the median of the distribution; I do not think the proposed change is correct


L 546-551. Some estimates of the magnitude of the effects caused by the different analysis choices described here would be very useful.

-> It is written in line 551: "This unbinned analysis hence typically produces results that are several tens of percent artificially more constraining than the binned one."

L 567. "In addition, for the first time, the Off/On" It isn't clear if this is the first time the MAGIC collaboration considered the uncertainty on tau, or if it is first time anyone did. Part of the confusion is that equation 22 includes the tau as a nuisance parameter, and the wording there seems to indicate that this is the standard formulation for the likelihood used by IACTs.

-> After Eq 22 rephrased: "and $G$ an (often neglected) Gaussian PDF with mean the measured value $\tauobs$ and width $\sigmatau = \sqrt{\sigmataustat^2 + \sigmatausys^2}$". In line 567: "for the first time in the analysis of Cherenkov telescope data"

L 593+ "We remind that" : "We remind the reader that"

-> ok

L 605-606. Is it really the mean that is reported? Not the median? If the uncertainties in J are expressed in log-space these could be quite different. Also, as noted, this prescription for taking into account the J-factor uncertainties is very different from profiling over the J-factor uncertainty, as done by others. It would be useful to comment on the scale of the effect.

-> It is actually the median, thanks for pointing that out; fixed. And the effect (factor ~2) has been quantified: "However, the main reported result in this case is still the median of such distribution, which is only sensitive to the central J-factor and not to its uncertainty, producing limits a factor $\sim 2$ more constraining than if $J$ was considered a nuisance parameter."

L 607-628. This discussion is confusing and not really convincing. If I understand correctly, the method presented here ignores the contribution to the Poisson likelihood of the total number of observed events, i.e., it essentially considers the PDF of the total number of events as a delta function at the number of events actually observed, and focuses instead on the distributions of E\prime and Theta\prime for those events. I believe that this approach requires the signal and background PDFs to be properly normalized over the ranges relevant for the analysis. It is also stated that b (i.e., the "number" of background events) is fixed during the likelihood maximization. Since s (the "number" of signal events) is allowed to vary, and all the quantities in the likelihood consist of combinations of s and b, this is essentially equivalent to varying the fraction of signal events in the fitting. f_b is also held fixed during the fitting; it isn't stated explicitly, but I imagine that f_s is also held fixed, as the PDF in E' and Theta' seems to contain all the information about the signal that is being fitted. So basically, it sounds like the only thing that is being freed in the fitting is the fraction of events attributable to the signal. Then each event has two contributions to the log-likelihood: basically the current signal fraction times the value of the signal PDF given E' and Theta', and the current background fraction times the value of the background PDF given E' and Theta'. It should also be stated clearly whether this methodology requires that the signal and background PDFs (f_s and f_b) be normalized, and, if they are, how that normalization is performed; i.e., whether E' and Theta' are truly independent or whether Theta' is conditional on E'.

-> I have removed the more technical parts of the explanation in lines 607-628, since they are probably too much detail, difficult to understand, and off topic, so I instead refer the reader interested in the details to the original paper
f_b and f_s are PDFs and therefore normalized to integral 1. The f_s normalization is guaranteed by construction (see Eq. 28).
As for the dependence on E' and theta', as already mentioned: "The dependence of f_b on E′ is modeled by smearing the distribution of E′ measured for events of the background-control (Off) region, whereas the spatial distribution is assumed to be uniform within the On region", for the case of f_b, and for the case of f_s see Equation 28 and the line below it.
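A rough sketch of the kind of event-wise unbinned likelihood discussed in this exchange (a simplified assumed form with toy, approximately normalized PDFs, not the exact formulation of the original paper): each event contributes a mixture of the signal and background PDFs in (E', theta'), weighted by the signal and background fractions:

```python
import numpy as np

# Rough sketch of an event-wise unbinned likelihood of the kind discussed above
# (simplified assumed form with toy, approximately normalized PDFs; not the
# exact formulation of the original paper).
def minus2lnL_unbinned(s, b, E_prime, theta_prime, f_s, f_b):
    w_s, w_b = s / (s + b), b / (s + b)           # signal and background fractions
    per_event = w_s * f_s(E_prime, theta_prime) + w_b * f_b(E_prime, theta_prime)
    return -2.0 * np.sum(np.log(per_event))

# Toy PDFs (exponential in E'; signal peaked in theta', background uniform)
f_s = lambda E, th: np.exp(-E) * np.exp(-0.5 * (th / 0.1) ** 2) / (0.1 * np.sqrt(2 * np.pi))
f_b = lambda E, th: 2.0 * np.exp(-2.0 * E) * np.ones_like(th) / 0.3

rng = np.random.default_rng(1)
E_prime, theta_prime = rng.exponential(0.5, 1000), rng.uniform(0.0, 0.3, 1000)
print(minus2lnL_unbinned(30.0, 1000.0, E_prime, theta_prime, f_s, f_b))
```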


At lines 619-620 it is stated that the "Fourier transform of the PDF for a compound Poisson distribution can be simply computed as a function of mu and the PDF for x_i." It isn't clear if the rest of the paragraph is then a description of that process, or if the "function of mu and the PDF for x_i" is not given in the text. It would be good to clarify that, and give the relevant function, if possible.

-> removed
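For reference, the standard result being alluded to (a textbook property of compound Poisson sums, not quoted from the removed text) is:

```latex
% Characteristic function of a compound Poisson sum S = \sum_{i=1}^{N} x_i,
% with N \sim \mathrm{Poisson}(\mu) and x_i i.i.d.
\varphi_S(t) \equiv \mathrm{E}\!\left[e^{itS}\right]
             = \exp\!\left[\mu\left(\varphi_x(t)-1\right)\right],
\qquad \varphi_x(t) = \mathrm{E}\!\left[e^{itx_1}\right],
```

so the PDF of S follows by inverse Fourier transform of \varphi_S.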

It isn't clear exactly how this maps into the stated condition for compound Poisson variables. The statement at lines 620-622, "In our case, N is the number of observed signal or background events, with mu = s or b, respectively, and the x_i are the single-event contributions to 2ln lambda_p from signal or background events, respectively.", could be interpreted in a number of different ways. What does the index i run over? At what point in the process do we define s and b, i.e., are they the best fit values, or does it even matter? Etc...

-> removed

Anyway, I think that the point here is just that the log-likelihood consists of some number of events taken from a signal distribution and some other number of events taken from a background distribution, and all the signal events share the same PDF, and similarly all the background events share the same PDF. It might be helpful to state that in plainer language in addition to the formal definition given here.

-> I hope that it is now more clear once the more technical part has been removed

Similarly, the sentence at lines 622 to 625 is very difficult to follow, in part because it is not quite clear what some parts of the sentence refer to: 1) what is the "it" in "by computing it in sufficiently fine bins" (literally what the text is saying is that you can compute a quantity by computing it in small bins; that can't be quite right, maybe you mean to say that you can compute the contribution from each bin) 2) what are "the obtained values" in "and introducing the obtained values in a distribution" (literally these could be the values from each of the bins, or the value for each of the events; I'm pretty sure that it is the value for each bin, but it would be good to be explicit) 3) what is the "distribution" referred to by "and introducing the obtained values in a distribution weighted by", do you just mean "and weighting them by" 4) what is the "they" referred to by "their corresponding s_ij value", I think you mean something like "the expected number of signal events in that bin of E' and Theta' (i.e., s_ij)". Anyway, this sentence would be clearer if these quantities were referred to by standard terms or symbols.

-> removed

It is also not clear from the text how one performs the Fourier transform in question. What is the variable (or variables) that is being transformed over? Is it the input variables to the PDF (E', Theta'), or the weighted distribution of the partial log-likelihoods (I think so, but it is not clear from the text)? What sort of boundary conditions need to be applied, if any, etc... I think that the Fourier transform in question is on the weighted distribution of the log-likelihoods for signal and background PDFs, and that the transforming variable is the log-likelihood itself, but this should be stated explicitly.

-> removed
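One common numerical route (a generic sketch, not necessarily the procedure of the original paper, assuming non-negative single-event contributions on a uniform grid; negative contributions can be handled with a constant per-event offset) is to build the discrete characteristic function of the single-event distribution with an FFT, exponentiate it, and invert:

```python
import numpy as np

# Generic numerical sketch (not necessarily the procedure of the original
# paper): the distribution of a compound Poisson sum S = sum_{i=1}^{N} x_i,
# with N ~ Poisson(mu), obtained by exponentiating the FFT of the single-event
# distribution. Assumes x_i >= 0 on a uniform grid wide enough to contain S.
def compound_poisson_pdf(single_event_pdf, mu):
    phi_x = np.fft.rfft(single_event_pdf)      # characteristic function of one x_i
    phi_S = np.exp(mu * (phi_x - 1.0))         # exp[mu (phi_x - 1)]
    return np.fft.irfft(phi_S, n=single_event_pdf.size)

# Toy single-event distribution (illustrative only)
nbins, xmax = 2 ** 14, 200.0
x = np.linspace(0.0, xmax, nbins, endpoint=False)
single = np.exp(-x)
single /= single.sum()                         # normalize as a discrete PDF
pdf_S = compound_poisson_pdf(single, mu=50.0)  # includes the N = 0 mass at S = 0
```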

It is also not clear how the PDF of the likelihood function is actually turned into something like a p-value, i.e., the probability to observe a particular set of results if the null hypothesis were true, using this methodology. The description given only explains how to compute the PDF of the likelihood function; it doesn't distinguish between the true values of the parameters and the fitted values of the parameters. Anyway, given my imperfect understanding of this methodology, I believe that it could be used to determine the PDF for log-likelihood values assuming that s = 0 (i.e., that the null hypothesis is true), or, alternatively, that it could be used to derive the PDF for log-likelihood values for a particular value of s (say the best fit value). It would be good to clarify how it is being used.

I'm dubious about presenting this methodology as an improvement over the standard methodology based on Wilks' theorem. It seems to me like there are some details that could adversely affect the results, in particular the choice of binning used to compute the distribution of the log-likelihoods, the limits of E' and Theta' used to define the integration range to normalize the PDFs, and the fact that the fluctuations in the total number of events have been ignored in the treatment. If the methodology requires that the signal and background PDFs be normalized, I think that is a pretty heavy downside, as the details of how exactly the normalization is done could present some significant complications.

Overall, I'm not sure that this methodology warrants quite so much space in the paper. If the author does want to include a description of it, they might consider it as an appendix.

-> agreed, and removed; also, I now start the paragraph with "A possible advantage" and end it with "under the assumption that the likelihood function was correct"


L 648-649. It is my understanding that the output of the f_hit analysis is more a proxy for the energy than a reconstructed energy (with an associated energy dispersion matrix) in the sense of Equation 17. I think in this context it is being treated as if it were a reconstructed energy, with the caveat that MC simulations were used to generate the PDFs for f_hit for different input spectra, rather than using explicitly tabulated IRFs, as is done with Fermi. Also, I believe that the HAWC collaboration has developed a new energy estimation analysis, with a much tighter response matrix, which they will be able to treat more as a reconstructed energy than the f_hit (or n_hit) analysis.

-> f_hit is an energy estimator (any function of the data formally is), which could be characterized by its PDF as a function of true energy, even if HAWC didn't do it. They instead used MC simulations to do the convolution of the physical signal spectrum with the IRF, which, as I already mention, is equivalent to doing it using Equation 16. The text is intended to highlight these two aspects, so I do not change anything here.
Also, it is out of the scope of this review to comment on analysis improvements that have not been applied to the published dSph results
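A schematic of the forward folding being referred to (a generic illustration with a hypothetical response matrix and toy spectrum; an MC-based convolution is equivalent in spirit): expected counts in bins of an energy estimator from a true-energy spectrum:

```python
import numpy as np

# Schematic forward folding (generic illustration with hypothetical response
# and spectrum; an MC-based convolution is equivalent in spirit): expected
# counts in bins of an energy estimator from a true-energy spectrum.
def expected_counts(dphi_dE, dE_true, aeff, migration, T_obs):
    """migration[j, i] = P(estimator bin j | true-energy bin i)."""
    rate_true = dphi_dE * aeff * dE_true      # [events/s] per true-energy bin
    return T_obs * migration @ rate_true      # [events] per estimator bin

edges = np.logspace(0, 2, 21)                 # true-energy bin edges [TeV]
E_true, dE_true = edges[:-1], np.diff(edges)
dphi_dE = 1e-12 * E_true ** -2                # toy spectrum [cm^-2 s^-1 TeV^-1]
aeff = 1e9 * np.ones_like(E_true)             # toy effective area [cm^2]
migration = np.full((5, E_true.size), 0.2)    # toy response, columns sum to 1
print(expected_counts(dphi_dE, dE_true, aeff, migration, T_obs=3.15e7))
```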


Figure 4. The presentation of the expectation bands in the left-hand plot is pretty non-standard. Is the dashed line the median expectation and then the colored bands the expectation bands? If so, why are only the positive bands shown? And why does the actual result track the median expectation so closely? If, on the other hand, what is being plotted is actually something different, then it should be explained in detail.

-> The bands correspond to what is defined in the caption; they look asymmetric because MAGIC takes the sensitivity as the minimum possible upper limit in this paper, as explained in lines 558-564


667-670. Here is another place where it would be helpful to include quantitative details about the size of the effects that the HAWC collaboration did consider.

-> I agree, but I do not have access to the necessary information, so I point out that it has not been done


672-675. "The significance of rejection of the null hypothesis for all targets ... is within +- 2 sigma, except for few marginally larger negative fluctuations." You can't have a negative significance, this is mixing two different statistical concepts. It would be better to phrase this to say that the maximum likelihood estimates are consistent with zero to within +- 2 sigma.

-> Absolutely, removed the "+/-"

Round 3

Reviewer 1 Report

The author has done a very nice job addressing my fairly extensive comments, which is very much appreciated. The paper is almost ready for publication. I have only a small number of follow-up comments. I would like to see the next draft, but I expect to be able to recommend it for publication very quickly.

Residual comments:

-----------------------------

The paper could also benefit from a bit more high-level discussion about the differences between analyses with Fermi, the IACTs and HAWC. Fermi analyses are often signal-limited, particularly at high energies, while IACT and HAWC analyses are background limited. What's more, the background subtraction methods used by HAWC and by the IACTs differ in significant ways. It would be useful to give an overview of this (perhaps towards the beginning of either Section 3 or 4, before jumping into the details).

-> This goes against how the storyline is organized, which is related to a certain message I want to convey: that all analyses can be described by a general framework valid for all, which then needs to be particularized/adapted to the different experimental situations.


While I'm sympathetic to this, by trying to put everything into a single framework, the paper has glossed over some very important differences in the data analysis between the different types of telescopes. Although the information is available in the more detailed descriptions of the analysis frameworks used by the different experiments, it would be good if it were presented more explicitly. A paragraph or two, either of introduction at the beginning of Section 4 or of summary at the end of that same section, that explicitly lists the most relevant similarities and differences in data analysis between the different telescopes would be extremely useful.

-----------------------------

The LAT data are publicly available and several authors outside the LAT collaboration have analyzed the dSphs for DM annihilation signals. Although many of the papers depart significantly from the common statistical framework described in this paper, and describing the various analysis details is probably outside the scope of this paper, that should be explicitly stated and the papers acknowledged in some form.

-> I have included a sentence at the beginning of section 4.1 and cited all papers about searches for DM in dSphs that are on the Fermi-LAT publications website https://www-glast.stanford.edu/cgi-bin/pubpub

I was mainly referring to papers from authors outside the LAT collaboration, sorry for the lack of clarity. There are a number of different approaches used by external authors. Although it is beyond the scope of this paper to describe them, they should at least be acknowledged.

-----------------------------


The paper leaves out any discussion of random direction control studies, which are, in fact, an excellent tool for evaluating the PDF of the test statistic.

-> This is mentioned twice: in lines 324-325 and in the caption of Figure 2

Sorry for the confusion, I meant specifically in regards to the calibration of the p-values. The previous draft had a detailed description of the method that Veritas used to calibrate p-values, but spent almost no time on random fields, which, when feasible, are a very robust way to empirically calibrate p-values. Anyway, in the new draft less time is spent on p-values, so this doesn't seem as relevant; however, explicitly adding something like "such as those obtained by analyzing randomly selected directions as potential DM targets" at line 305 wouldn't hurt.

-----------------------------


L 266. In practical terms it is often Chernoff's theorem that applies, not Wilks' theorem. For example, for Fermi-LAT searches the signal is not allowed to be negative, so we actually have a bounded DOF and expect 50% of the trials to have TS=0 and the other 50% to follow a chi^2 distribution.

-> Yes, it is actually discussed later (l316 on) how the conditions of Wilks' theorem are not fulfilled. I prefer to refer to Wilks' theorem, which I believe is better known among potential readers. Since it is later said that the conditions are not met, I think all statements are correct.

When I read the current text it seems to me that it presents two things as similarly important issues: 1) non-Gaussianity coming from imperfect modeling and 2) the fact that half the trials will give TS=0 if the null hypothesis is true b/c it lies at the edge of the physically allowed region. To my mind the first issue is very challenging to deal with, while the second can be handled trivially by invoking Chernoff's theorem. It would be good to reword the text to avoid giving the impression that these two issues are equally challenging. Chernoff's theorem (and its predictions for the distribution of the TS) can be described concisely as analogous to Wilks' theorem.

-----------------------------


L 392+ and 395-397. In fact, in LAT pass 8 analyses of the dSphs the energy dispersion was considered. Also, studies showed that for the spectra being considered, the effects of using the energy-bin likelihoods as an intermediate step were negligible compared to the uncertainties in the J-factors.

-> Phys Rev Lett 115 (2015) 231301 page 4, bottom, reads: "After fixing the background normalizations, we scan the likelihood as a function of the flux normalization of the putative DM signal independently in each energy bin (this procedure is similar to that used to evaluate the spectral energy distribution of a source). Within each bin, we model the putative dSph source with a power-law spectral model (dN/dE \propto E^{−\Gamma}) with spectral index of \Gamma = 2. By analyzing each energy bin separately, we avoid selecting a single spectral shape to span the entire energy range at the expense of introducing additional degrees of freedom into the fit.", this procedure necessarily neglects "migration" of events (from true to reconstructed energy), which depends on the entire energy spectrum, and not only on the one within a given bin, therefore it is effectively neglecting the energy dispersion in Equation 16.

Actually, it isn't quite right that doing single energy bin fits is equivalent to neglecting the energy dispersion. The single energy bin fits can include the effect of migration from adjacent energy bins when computing the model (the Fermi software certainly supports this and has access to the information required to do these computations, such as the binned exposure and the PSF-convolved source model over the full energy range of the entire fit). There are a number of subtleties about how exactly the spectral shape is handled in computing the migration, etc., that are well beyond the scope of the paper. However, to first order, it is accurate to say that the effects of energy dispersion are considered, and that the likelihoods are computed in True Energy Flux vs. True Energy space, as opposed to Reconstructed Energy Flux vs. Reconstructed Energy space. So, again, to first order the energy redistribution has been considered.

-----------------------------


Figure 3. "the median of the distribution of limits under the null hypothesis" : "the median of the distribution of expected limits under the null hypothesis" Just to be explicit that the bands are expectation bands, and derived from simulations.

-> the expected limit or sensitivity is the median of the distribution; I do not think the proposed change is correct

That the "expected limit" is the same as the "sensitivity" is the same as the median of the distribution is not really a universal convention . "Expected limit" could also mean "limit you obtained from one simulated realization". Anyway, perhaps "the median of the distribution of limits obtained for simulated realizations of the null hypothesis" is both precise and accurate?

-----------------------------


Figure 4. The presentation of the expectation bands in the left-hand plot is pretty non-standard. Is the dashed line the median expectation and then the colored bands the expectation bands? If so, why are only the positive bands shown? And why does the actual result track the median expectation so closely? If, on the other hand, what is being plotted is actually something different, then it should be explained in detail.

-> The bands correspond to what is defined in the caption, they look asymmetric because MAGIC takes the sensitivity as the minimum possible upper limit in this paper, it is explained in lines 558-564

Ok, I had not understood that that is what they did. I realize that the author probably has no control over this, but for the record, presenting limits this way is a bad idea. Limits that dive well below the expectation band are a clear sign that something is wrong with the modeling; doing anything that could obscure that information is dangerous. Anyway, since this is the way that the results are presented by the MAGIC collaboration, I guess we are stuck with it.

 

Author Response


The paper could also benefit from a bit more high-level discussion about the differences between analyses with Fermi, the IACTs and HAWC. Fermi analyses are often signal-limited, particularly at high energies, while IACT and HAWC analyses are background limited. What's more, the background subtraction methods used by HAWC and by the IACTs differ in significant ways. It would be useful to give an overview of this (perhaps towards the beginning of either Section 3 or 4, before jumping into the details).

-> This goes against how the storyline is organized, which is related to a certain message I want to convey: that all analyses can be described by a general framework valid for all, which then needs to be particularized/adapted to the different experimental situations.


While I'm sympathetic to this, by trying to put everything into a single framework, the paper has glossed over some very important differences in the data analysis between the different types of telescopes. Although the information is available in the more detailed descriptions of the analysis frameworks used by the different experiments, it would be good if it were presented more explicitly. A paragraph or two, either of introduction at the beginning of Section 4 or of summary at the end of that same section, that explicitly lists the most relevant similarities and differences in data analysis between the different telescopes would be extremely useful.


-> As recognized by the referee, the relevant information is in the paper, only organized in a different way than she/he suggests. As already mentioned in the previous reply, this is intentional and part of the message I want to convey. I have however given this suggestion another deep thought and found no way of accommodating it without majorly disrupting the mentioned storyline. I will therefore not include the suggested paragraph.


-----------------------------

The LAT data are publicly available and several authors outside the LAT collaboration have analyzed the dSphs for DM annihilation signals. Although many of the papers depart significantly from the common statistical framework described in this paper, and describing the various analysis details is probably outside the scope of this paper, that should be explicitly stated and the papers acknowledged in some form.

-> I have included a sentence at the beginning of section 4.1 and cited all papers about searches for DM in dSphs that are on the Fermi-LAT publications website https://www-glast.stanford.edu/cgi-bin/pubpub

I was mainly referring to papers from authors outside the LAT collaboration, sorry for the lack of clarity. There are a number of different approaches used by external authors. Although it is beyond the scope of this paper to describe them, they should at least be acknowledged.

-> I have added a few more references that I could find by running "find t dark matter dwarf lat" at inspirehep.net

-----------------------------


The paper leaves out any discussion of random direction control studies, which are, in fact, an excellent tool for evaluating the PDF of the test statistic.

-> This is mentioned twice: in lines 324-325 and in the caption of Figure 2

Sorry for the confusion, I meant specifically in regards to the calibration of the p-values. The previous draft had a detailed description of the method that Veritas used to calibrate p-values, but spent almost no time on random fields, which, when feasible, are a very robust way to empirically calibrate p-values. Anyway, in the new draft less time is spent on p-values, so this doesn't seem as relevant; however, explicitly adding something like "such as those obtained by analyzing randomly selected directions as potential DM targets" at line 305 wouldn't hurt.

-> Added
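A sketch of such an empirical calibration (the ts_of_direction function is a hypothetical stand-in for the full analysis chain applied to a blank-field direction; the toy TS values below are drawn just for illustration):

```python
import numpy as np

# Sketch of an empirical p-value calibration with random sky directions used
# as dummy targets. ts_of_direction is a hypothetical stand-in for the full
# analysis chain run on a blank-field direction.
def empirical_p_value(ts_obs, random_directions, ts_of_direction):
    ts_null = np.array([ts_of_direction(d) for d in random_directions])
    # fraction of blank-field trials with TS at least as large as observed
    return np.mean(ts_null >= ts_obs)

# Toy usage: 300 random directions; TS values drawn here just for illustration
rng = np.random.default_rng(2)
directions = rng.uniform(0.0, 360.0, size=(300, 2))
toy_ts = lambda d: rng.chisquare(1) * rng.integers(0, 2)   # Chernoff-like mixture
print(empirical_p_value(ts_obs=4.0, random_directions=directions, ts_of_direction=toy_ts))
```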

-----------------------------


L 266. In practical terms it is often Chernoff's theorem that applies, not Wilks' theorem. For example, for Fermi-LAT searches the signal is not allowed to be negative, so we actually have a bounded DOF and expect 50% of the trials to have TS=0 and the other 50% to follow a chi^2 distribution.

-> Yes, it is actually discussed later (l316 on) how the conditions of Wilks' theorem are not fulfilled. I prefer to refer to Wilks' theorem, which I believe is better known among potential readers. Since it is later said that the conditions are not met, I think all statements are correct.

When I read the current text it seems to me that it presents two things as similarly important issues: 1) non-Gaussianity coming from imperfect modeling and 2) the fact that half the trials will give TS=0 if the null hypothesis is true b/c it lies at the edge of the physically allowed region. To my mind the first issue is very challenging to deal with, while the second can be handled trivially by invoking Chernoff's theorem. It would be good to reword the text to avoid giving the impression that these two issues are equally challenging. Chernoff's theorem (and its predictions for the distribution of the TS) can be described concisely as analogous to Wilks' theorem.


-> Added a sentence right after talking about case 2): "This can be avoided by using the correct $-2\ln\lp$ PDFs for this situation \cite{Chernoff, H. On the Distribution of the Likelihood Ratio, Ann. Math. Statist. 1954, 25 573-578.}"

-----------------------------


L 392+ and 395-397. In fact, in LAT pass 8 analyses of the dSphs the energy dispersion was considered. Also, studies showed that for the spectra being considered, the effects of using the energy-bin likelihoods as an intermediate step were negligible compared to the uncertainties in the J-factors.

-> Phys Rev Lett 115 (2015) 231301 page 4, bottom, reads: "After fixing the background normalizations, we scan the likelihood as a function of the flux normalization of the putative DM signal independently in each energy bin (this procedure is similar to that used to evaluate the spectral energy distribution of a source). Within each bin, we model the putative dSph source with a power-law spectral model (dN/dE \propto E^{−\Gamma}) with spectral index of \Gamma = 2. By analyzing each energy bin separately, we avoid selecting a single spectral shape to span the entire energy range at the expense of introducing additional degrees of freedom into the fit.", this procedure necessarily neglects "migration" of events (from true to reconstructed energy), which depends on the entire energy spectrum, and not only on the one within a given bin, therefore it is effectively neglecting the energy dispersion in Equation 16.

Actually, it isn't quite right that doing single energy bin fits is equivalent to neglecting the energy dispersion. The single energy bin fits can include the effect of migration from adjacent energy bins when computing the model (the Fermi software certainly supports this and has access to the information required to do these computations, such as the binned exposure and the PSF-convolved source model over the full energy range of the entire fit). There are a number of subtleties about how exactly the spectral shape is handled in computing the migration, etc., that are well beyond the scope of the paper. However, to first order, it is accurate to say that the effects of energy dispersion are considered, and that the likelihoods are computed in True Energy Flux vs. True Energy space, as opposed to Reconstructed Energy Flux vs. Reconstructed Energy space. So, again, to first order the energy redistribution has been considered.

-> Thanks for this clarification. I removed the part about the migration matrix being approximated by a delta. What I actually want to convey is that the tabulated lkl values are useful provided the spectral shape searched for is not *significantly* different from the one used when computing the table, where "significantly" is difficult to quantify and, on top of that, depends on the migration matrix (so, for example, for Cherenkov telescopes smaller spectral differences could be significant). There would be no such problem, of course, if the energy estimator PDF was a delta function, but this does not mean that one is assuming that, as I was writing before.

-----------------------------


Figure 3. "the median of the distribution of limits under the null hypothesis" : "the median of the distribution of expected limits under the null hypothesis" Just to be explicit that the bands are expectation bands, and derived from simulations.

-> the expected limit or sensitivity is the median of the distribution; I do not think the proposed change is correct

That the "expected limit" is the same as the "sensitivity" is the same as the median of the distribution is not really a universal convention . "Expected limit" could also mean "limit you obtained from one simulated realization". Anyway, perhaps "the median of the distribution of limits obtained for simulated realizations of the null hypothesis" is both precise and accurate?

-> Ok

-----------------------------


Figure 4. The presentation of the expectation bands in the left-hand plot is pretty non-standard. Is the dashed line the median expectation and then the colored bands the expectation bands? If so, why are only the positive bands shown? And why does the actual result track the median expectation so closely? If, on the other hand, what is being plotted is actually something different, then it should be explained in detail.

-> The bands correspond to what is defined in the caption, they look asymmetric because MAGIC takes the sensitivity as the minimum possible upper limit in this paper, it is explained in lines 558-564

Ok, I had not understood that that is what they did. I realize that the author probably has no control over this, but for the record, presenting limits this way is a bad idea. Limits that dive well below the expectation band are a clear sign that something is wrong with the modeling; doing anything that could obscure that information is dangerous. Anyway, since this is the way that the results are presented by the MAGIC collaboration, I guess we are stuck with it.

-> Yes, I fully agree with this

Round 4

Reviewer 1 Report

I would like to thank the author for the careful thought he put into my comments.  I am satisfied with all of his replies and modifications.  

I would have preferred if they were a bit more explicit about the differences between the analysis methods applied to the different types of telescopes, but at this point I'm willing to let him proceed with the draft as it is. This is a very useful paper, and publishing it now has large benefits to the community.

Congratulations on a job well done.

 

 

 

 
