Review
Peer-Review Record

Replication Papers

Publications 2019, 7(3), 53; https://doi.org/10.3390/publications7030053
by Peter Harremoës
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 31 December 2018 / Revised: 5 July 2019 / Accepted: 17 July 2019 / Published: 22 July 2019

Round 1

Reviewer 1 Report

The opinion paper by Peter Harremoës proposes that a new type of paper be introduced: the replication. In his definition – if I got it right – a replication is the independent achievement of a similar conclusion with a (slightly) different method. This is put in contrast to repeatability, which is the ability to do what is written in the method section, and reproducibility, which is the obtainment of the same conclusion with the same method. Repeatability – I think – is checked as much as possible during peer review. I’m a theoretical biologist, working mostly with simulations. When I review a method section, if I can get a reasonably good mental image of how the events unfold in the simulation, and if all parameters are listed, then I have a feeling that it can be repeated. However, there could be factors that the method section does not include but that affect the results. This might be less important in numerical investigations, but in psychology and sociology (and to some extent ethology) anything can influence the outcome, even the weather or the color of the room the experiment takes place in. So even repeatability is not that easy to judge without trying to reproduce the results.

While the claim includes reproduction, the author champions replication of studies. I think the distinction should be made clearer. The example given about mathematical proofs redone with other methods is a telling one, but as the author also mentions, the proofs are hard to follow in the first place, and there might be no other way to prove them (or it will come later, and it will be a new result in itself). However, the reproducibility/replication crisis is more about clinical studies, biomedical experiments, and how we see ourselves (cf. the Stanford Prison Experiment and to what extent it can be replicated).

Can an experiment be reproduced, i.e. redone in the same manner? When it comes to physics and organic chemistry, for example, it is my belief that the answer is yes. Can a sociological experiment be reproduced? I think not. You cannot test the exact same subject pool. Deviation from the results of the previous experiment could either point to a flaw in the design, interpretation, statistical test, conclusion, etc., or it could be a genuine new result. For example, estimates of heritability in genetic studies will be different, because theory suggests that heritability changes with population and environment. Thus, repeating the experiment cannot be a reproduction, only a replication, and will tell nothing about the validity of the original experiment. Similarly, we know that Public Good experiments in experimental economics yield different outcomes in different cultures, or even depending on whether the experiment is done with economics students or psychology undergraduates. One cannot reproduce the experiment, as it may require naïve subjects, or the changing environment may have an effect (short-term financial hardship, or global financial ups and downs, could affect the results of the PG game).

Still, in many cases – especially in the natural sciences – reproduction is possible and should be done. Replication, however, raises its own questions. The author states that (L108) “replication paper should not introduce new ideas or explanations”. But if they are not reproductions, then by definition they have new ideas. If slight changes in methodology have an effect, then it is a new result, but it is also (potentially) a new result if they do not.

I think some of the above needs to be discussed, the distinction between replication and reproduction should be made clearer, and their usage throughout the text should be consistent.

In summary, this is an opinion paper, and everyone is entitled to his/her opinion. It also raises a valid argument, and scientific publication should include debates, not only one-way declarations of new results. In these senses, I suggest that the manuscript be accepted. But allow me to add a snarky quote from the manuscript: “Nowadays any piece of research can be published somewhere.”

Minor comments on the scientific content

L36-L44: The type II error is still unknown. There are too many ways to fudge with statistics. Simply stating alpha does not tell much, and reporting P basically tells nothing (only whether it is below or above the a priori set threshold of alpha).

L54: I would add “or desirable” to the end of the sentence, or replace possible with desirable. While I have read papers in biology that no ethical board would allow to be conducted now, this is more the realm of ethics and desirability than the realm of objective possibility.

L119: If all replication studies were published in the same journal, there would be a flood of such papers replicating research in top-tier journals. It might not be a bad thing, but the management of said journals would not allow it. Nowadays papers can be searched online, so replication studies are not necessarily buried if not published in the same journal.

L134: I think it is a replication and not a reproduction, then.

L170-174: This is called a robustness check, and it should be done in every simulation study. I do not think hardware errors would have an effect; scientists should know if their computer has bad components. Mostly it will kill the program. Software errors (bugs) are another problem, and there reproduction can help a lot. Replication is usually sold as a novel paper (incremental complexification or trying out new parameter combinations).
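The earlier point about significance levels (L36-L44) can be illustrated with a small simulation: under a true null hypothesis, the long-run fraction of “significant” results – and hence the expected success rate of exact replications of a false-positive finding – is simply the significance level alpha. The Python sketch below is illustrative only; the sample size, effect size, and t threshold are assumptions, not figures from the manuscript:

```python
import random
import statistics

def significant(n=30, alpha_t=2.0, effect=0.0, trials=2000, seed=1):
    """Fraction of simulated two-sample tests whose t statistic exceeds
    the threshold, when the true effect is `effect` (0.0 = null is true)."""
    random.seed(seed)
    hits = 0
    for _ in range(trials):
        a = [random.gauss(0.0, 1.0) for _ in range(n)]
        b = [random.gauss(effect, 1.0) for _ in range(n)]
        # Welch-style t statistic computed by hand (no external libraries)
        va, vb = statistics.variance(a), statistics.variance(b)
        t = (statistics.mean(b) - statistics.mean(a)) / ((va / n + vb / n) ** 0.5)
        # |t| > 2.0 approximates a two-sided 5% test at these sample sizes
        if abs(t) > alpha_t:
            hits += 1
    return hits / trials

# Under the null, the "replication rate" hovers near alpha (~5%);
# with a real effect, the same procedure succeeds far more often.
print(significant(effect=0.0))
print(significant(effect=0.8))
```

This is why comparing the success rate of replication studies against the significance level, as the author proposes, is informative: a false-positive finding should replicate at roughly the rate alpha, while a real effect replicates much more often.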


Minor comments

The paper has quite a number of spelling errors, ones that should be spotted by leaving spell checking on in the document processing application. I think there are a number of punctuation errors as well, but as I’m not an expert on English punctuation, and some might be open to debate, I will not mention them.

L10: advises -> advices

L13: survays -> surveys

L14: I would spell out the millions here.

L21: expression "publish or parish" is has become an important advise -> expression "publish or perish" has become an important advice. I would be very interested to know what “publish or parish” is.

L22: Rephrase: “has lead to much low quality research”

L28: survays -> surveys

L41: “want to the repeat” -> “want the repeat”

L54: delete one neither

L55: hypotese -> hypotheses; survays -> surveys; nul -> null

L58: psychology should not be capitalized here, also there is a space missing after it and before “was”

L60: put a “can” at the end of the line, the sentence misses a verb.

L62: “and here will only be given a brief overview” -> and here only a brief overview will be given

L68: I think it should be reproduced instead of reproducing

L78: delete “they”

L82: space missing between studies and were

L90: consider “replicated previous” -> “replicated a previous”

L91: repllications -> replications

L93: saem -> same

L116: the correct name is “Public Finance Review”

Table 1: I do not feel it is necessary.

L127: Consider “details about how the original” -> “details of how the original”

L127: survays -> surveys

L128: repoirted -> reported

L144: “it may be difficult find reviewers” -> “it may be difficult to find reviewers”

L147: allthough -> although

L168: forshortened ->  shortened


Author Response

Dear reviewer


Thanks for your constructive comments. I have tried to address all the comments from you and the other reviewers. Your comments have been addressed as detailed below.


Minor comments on the scientific content

Line 36-44: I have extended the terminology section and described in more detail why it is worth mentioning the significance level. I am fully aware that significance levels and p-values are heavily misused, but I think the significance level is to a large extent the number that should be compared with the success rate of replication studies.

Line 54: Has been changed as suggested.

Line 119: The statement has been modified so that I just state it would be desirable that replication papers could be published in the same journal as the journal of the original paper. This would ensure that the replications were reviewed according to the same standards as the original papers.

Line 134: I keep "reproduced" in this line. If the setup is slightly different but gives the same results, that means that the slightly changed setup was irrelevant, so that it is a reproduction and not a replication.

Line 170-174: I have added the term robustness and some discussion to the section on terminology, so I will not repeat this point in the subsection on simulations.

Minor comments

These have been addressed as suggested. 

Reviewer 2 Report

This opinion piece addresses an issue of great importance--the role of publishers in supporting research integrity. Specifically, this piece calls for this publisher to introduce a new type of paper category: replications. This new type of submission has the potential to mitigate some aspects of the replication crisis by allowing a venue for these non-novel reports.

The author may wish to add reproducibility and replicability as keywords.

In the introduction, it would be helpful to cite references that confirm the author's assertion about the number of articles published it might take to achieve each academic rank. This varies considerably across institutions and disciplines, so it would be good to understand where this demarcation comes from. [lines 16-19]

Brian Nosek's quote on lines 23-24 should be cited.

As further evidence for the author's assertion that "nowadays any piece of research can get published somewhere," a brief discussion or sentence about the literature that shows how articles rejected from one journal end up in the literature elsewhere could be helpful. That would demonstrate that it is not just hoax/fake publications. If this is not discussed, this sentence should have a citation to some evidence. [line 24]

The phrase, "is kind of depressing," in line 29 is indeed accurate for me, anyway, but perhaps could be rephrased.

I found the discussion of terminology rather confusing, I think perhaps because the definitions given for repeatability, reproducibility, and replicability did not necessarily conform to others' definitions of the terms. Many times, repeatability and reproducibility are used interchangeably to mean what the author refers to as repeatability, for instance, and replicability is sometimes used for what is here considered reproducibility or replicability. This may be due to my interpretation only, but I would suggest citing where these definitions came from, perhaps along with some of the debate around the definition of these terms. One potential article is "What does research reproducibility mean?" (DOI: 10.1126/scitranslmed.aaf5027), though many other definitions exist.

On line 73, is it 43%?

Center for Open Science [line 77]

Section 3.1 (Medicine) could be elaborated upon. For instance, one of the key articles about preclinical reproducibility "Raise standards for preclinical cancer research" in Nature (https://www.nature.com/articles/483531a) is not cited. This was a fairly seminal study that kicked off much of the concern in this field.

In line 82, I am not clear on what the "lower replication rate" refers to--lower than what?

Throughout the paper, the journal names should be italicized.

I am not convinced that replication papers do not qualify as articles, brief reports, or other types of article. That being said, I think the author's premise, that these are important types of article that should be given special consideration, is worthy and well-described.

Section 5.3 presents some interesting ideas. I wonder if the author has considered how this type of replication may fit in with open data and data sharing ideals. It also led me to think about how some disciplines, such as computer science, are noting these types of replications using badges. Though this may not fit with the author's argument, I thought this presented an interesting complement to those efforts elsewhere.

The bibliography has a few references which appear incomplete, including #12 and #15.

Author Response

Dear reviewer


Thanks for your constructive comments. I have tried to address all the comments from you and the other reviewers. Your comments have been addressed as detailed below.


Reproducibility and replicability have been added as keywords.


I added some references to support my claim that a huge number of publications is needed in order to fill all the academic positions.


I added some references to editors that recommend authors of rejected manuscripts to submit to another journal (after revision).


Former line 73 has been reformulated. It is 43 studies and not 43 %.


In line 77 "US Center for Open Science" has been replaced by "Center for Open Science".


I added a reference to the paper "Raise standards for preclinical cancer research" and added a few comments to their conclusions.


Former line 82 has been reformulated.


In Section 5.3 I have mentioned that there may be other places where replications can be published.


The (former) references [12] and [15] have been completed. 

Reviewer 3 Report

The paper is titled “Replication Papers” and is an opinion piece arguing for the establishment of replication paper series within scientific journals. The paper suggests a specific type of article, which solely focuses on replication (or reproduction) of an already existing piece of research, without adding any novelty beyond proving or falsifying the previous results. The idea the paper poses is not entirely new; in several fields of research a replication literature already exists. The author claims that the social sciences are behind in terms of replicating previous results; however, several initiatives already aim at bridging this gap, so the proposal made in this article may be yet another way to address the replication crisis within the social sciences.


The general comment for this paper: it is difficult to assess its overall impact. Not many editors will read the article and then establish a new type of research article in their journal, especially since journal space is usually scarce. Even if they did, they would still face authors in a scientific world where the acceptance of a replication study that simply reproduces already known results does (sadly) not yield many (if any) citations or any other benefit to the replicating authors. Hence, on both sides of the table, replication studies are a risk – if authors detect a serious flaw in a well-published paper, both authors and editors will be willing to invest time and journal space to publish the work. If it is a further confirmation of an existing result, this is less likely to happen, even though it would be clearly important. The author does not suggest a solution to this dilemma.


Specific comments to the paper:

1)      In the introduction, the author mentions the “traditional experimental sciences like physics and chemistry”, where reproduction of experiments is an integral part of the education, “while this is not the case in social sciences”. It is not clear whether the remainder of the paper focuses on this gap in the social sciences or targets all fields, i.e. science as a whole. In addition, there seem to be several reasons for the social sciences not being a “replication” science, one of them probably being lack of money to reproduce (or rather repeat) large surveys in order to reproduce results with a different data source.

2)      The author mentions that 225,000 PhD students are educated – is this worldwide? Do 225,000 PhD students graduate each year? In all scientific fields? If numbers are used here, they should be precisely defined.

3)      The author further elaborates on the number of papers needed for different career steps. Again, this is somewhat imprecise: is this an average for all fields, and if so, would that be adequate, as there seems to be a strong dependence between field and the scientific output needed to move up the career ladder? In addition, this is very country-specific, which the author does not address at all.

4)      In the second part of the paper, the author defines some terminology about replication in general, framing “repeatability” as the pure repetition of a previous result. A sentence may need further explanation for the readers: “The more one aims at repeteability [sic] the more restricted are the conclusions that one can draw from the experiments.” This may be true, but for the issue at hand (can the previous result be repeated or not) this seems to be sufficient. In addition, many studies repeat first, then add further ideas to a problem. For the further flow of the paper, it is somewhat irritating that the author does not seem to stick with his own terminology: in line 91, he mentions “direct repllications [sic] rather than a conceptual repetitions”, where the reader is left unclear on what the difference between the two may be in relation to the introduced terminology.

5)      In the section “The Replication Crisis” it is unclear why these four fields (medicine, psychology, educational sciences, social sciences) are picked. Do other fields not have a replication crisis? Are there no studies about those fields? The author mentions that comparing different fields is difficult (line 63), that there will be only a brief overview, and that the numbers “should be taken with great caution”. If the author truly wants to establish a solution to a problem, the problem should be precisely defined; otherwise the proposed solution may not fit. The question is really why these numbers and fields are selected and mentioned at all. It is quite clear that there is some issue with replicating results, but the claim at the start of the paper was that one would look at what publishers can do to help.

6)      The author suggests publishing replication papers in the same journal as the original article. While the author marks this as his opinion, he does not give a reason for this suggestion. One may be that there will be a discussion within the journal when the paper is published there as well; another could be that the replicating author has an incentive to publish in a top journal, even if “only” with a replication study. There is, on the other hand, the idea of a replication journal, which has been established in some fields (e.g. economics, International Journal for Re-Views in Empirical Economics, https://www.iree.eu/). Here the likelihood of having an article accepted may be larger, and the mission of the journal is very clear.

7)      The author then mentions replication types (Section 5) but does not introduce this section at all, so it remains unclear what the goal of this section may be: is it to show all possible types of replication work? Is it to provide some examples? What is the reader to take from this section, and what, on the other hand, is its motivation? The proposed three types do not seem to be comprehensive. Other types may be data collection by technical means, like astronomers’ data, or data collected for testing drugs. There are possibly more types than listed here, but there is no argument for this selection nor for leaving out others.

8)      In Section 5.2 the question arises whether mathematical proofs are subject to a replication crisis as well. The point is interesting nevertheless, but may not fit into the replication discussion. In addition, it seems that proving a mathematical theorem in a different way is actually new knowledge rather than replication.

9)      In Section 5.3 the author mentions replications of simulations and numerical calculations. One suggestion is to “run the calculations using different software” (line 171). However, if this leads to different results, the reason could be differences a) in programming, b) in hardware, or c) in software. Which of the three it is cannot be resolved unless one first starts with the same hardware and the same software and tries to re-program the simulations. Either way, this type of replication is likely to be very time-consuming and thus adds an additional barrier for potential replicating authors.

10)      The idea to integrate replications into the coursework when becoming a scientist seems useful and has been proposed elsewhere already (see below).

Fecher, Benedikt / Fräßdorf, Mathis / Wagner, Gert G. (2016). “Perceptions and Practices of Replication by Social and Behavioral Scientists: Making Replications a Mandatory Element of Curricula Would Be Useful”. DIW Discussion Paper 1572. Berlin.
https://www.diw.de/documents/publikationen/73/diw_01.c.531685.de/dp1572.pdf


The paper is blessed with quite a few spelling errors:
- line 2: “healthy” instead of “heathy”
- line 8: “has” instead of “have”
- line 13, 28, 55, 127: “surveys” instead of “survays”
- line 19: “need” instead of “ned”
- line 21: “perish” instead of “parish” (although I like this one)
- line 21: drop “is”
- line 22: “pressure” instead of “presure"
- line 27: “has been” instead of “have been”
- line 33, 39, 42, 43, 178: “repeatability” instead of “repeteability”
- line 36, 41, 42: “experiment” instead of “experiement”
- line 41: “repeat” instead of “repeate”
- line 45: “a similar experiment” instead of “a similar experiments”
- line 49: “if it is not” instead of “if it not”
- line 50: “influence” instead of “influences”
- line 55: “hypothesis” instead of “hypotese”
- line 55: “null” instead of “nul”
- line 58: “Psychology was” instead of “Psychologywas”
- line 60: “studies are” instead of “studies”
- line 61: drop “be”
- line 61: “has been” instead of “have been”
- line 82: “studies were” instead of “studieswere”
- line 91: “studies” instead of “study”
- line 91: “replications” instead of “repllications”
- line 92: “that had produced” rather than “that have produced”
- line 93: “same” instead of “saem”
- line 99: “were significant effects” instead of “were a significant effects”
- line 112: “attempt” instead of “attemt”
- line 112: “turns out” instead of “turn out”
- line 113: “due to flaws” instead of “due flaws”
- line 113: “attempts” instead of “attemts”
- line 114: “Public Finance” instead of “Public Finanse”
- line 117: “allows” instead of “allow”
- line 117: “in the area” rather than “with the area”
- line 117: “and has developed” rather than “and they have developed”
- line 120: “strict” instead of “stricht”
- line 127, 136: “experiment” instead of “experiement”
- line 128: “reported” instead of repoirted”
- line 130: “Experiments” rather than “Experiment”
- line 136: “a repetition” instead of “an repetition”
- line 147: “although” instead of “allthough"
- line 148: “expertise” instead of “expertice”
- line 167: “calculations” instead of “culations”
- line 174: “variations” instead of “variantions”


Author Response

Dear reviewer,


Thanks for your constructive comments. I have tried to address all the comments from you and the other reviewers. Your comments have been addressed as detailed below.


Regarding the potential impact of a paper like this, I can tell the background for this submission. Recently I had a conversation with Dr. Shu-Kun Lin, the CEO of MDPI, which publishes this journal and about 200 other journals. We discussed the replication crisis, and I proposed to add replication papers as one of the paper types that can be submitted to the MDPI journals. He said that it might be a good idea to introduce this type of paper, but instead of just introducing it he encouraged me to present the idea in the journal Publications, to get my proposal into better shape by use of the peer-review system. Therefore I think the publication of this manuscript may have the consequence that the roughly 200 MDPI journals may allow replication papers as a special category.


Specific comments to the paper

1) I have clarified that I consider replication papers to be relevant for all experimental sciences.

2) Yes, I have added "worldwide" to the text.

3) I have rewritten this paragraph and I have also added some references to illustrate that with a huge number of researchers that each need publications to climb the career ladder, the number of publications also has to be huge.

4)

5) I want to convince the reader that there is a problem and that it is not confined to a single academic discipline. Since there is a problem, I suggest introducing replication papers, which may be part of a solution to it. The extent of the replication crisis is difficult to say anything precise about, but I think it should be sufficient to document that the extent of the crisis is large enough to justify the introduction of a new paper category.

6) I have extended the text to address these questions.

7) I added an introduction to Section 5. I hope it clarifies the matter.

8) Finding new proofs of existing mathematical theorems is a well established activity, but normally one would add new theorems to a manuscript before submission, because the manuscript may otherwise risk rejection due to lack of novelty. As a mathematician I know this very well, and I have a number of unpublished new proofs of various theorems. The usual way of getting the new proofs published is to write a book.

9) All types of reproductions or replications may be very time-consuming, and they may also take other resources. If there is a risk that the work will never be published, one may abstain from doing these activities, which is exactly the problem that replication papers may solve (or be part of a solution to).

10) I have included this reference and reformulated the sentence.


Minor comments

These have been addressed as suggested. 


Round 2

Reviewer 3 Report

Dear author,


Thank you for revising the manuscript, and especially for the introductory remarks in your reply – this was helpful towards understanding the aim of the paper.

The introduction is now clearer and more concise than before, and thus I consider my points generally to have been taken into account. I still have issues with Section 5 (“Replication Types”). It is not comprehensive, i.e. as mentioned in my previous point 7, there are other types of replication data. And still, the goal of this section is not clear – if it is to show all possible replication types, it fails; if it is to show examples, this needs to be stated, and it should be clear why these examples were chosen. In principle, this section could give more specific advice to editors of scientific journals on how to deal with replication papers and what things to look for (which would probably make the section much longer).

The additions to Section 4 are helpful as well, although the relation to the impact factor (line 184) shows the key problem with this metric – if a replication and the original paper are cited because of issues with the replication, it is not for the quality and relevance of their content, which is what is usually associated with a higher impact factor.

Before I raise my next points, let me make clear that I really think that replications are important. However, I believe that the following points are important and they need to be addressed by the scientific community in order for replications to become a true part of our scientific culture.

·         It seems that replications are thought to be less worthy than original research. This makes sense, as the community believes that only new, exciting work is good, similar maybe to the real world outside academia. This then goes hand in hand with the previously raised question on incentives, or rather benefits, from replication. A successful replication study will not get you tenure in a top university. If you had the choice between doing a replication study or investing your time in some original work, I imagine most would choose the latter, and in terms of future prospects and recognition, rightfully so. If you have tenure in a top university, replication studies may be too boring for you to do. So we would need to change the culture around replication in the scientific communities (at least in the disciplines I know of; there may be others where this is different).

·         Replication is still often seen as an attack on the integrity of others’ work – “why would you replicate my work, don’t you trust me?”. So, again, a cultural change is needed here.

·         How many replications are needed to establish a result? If we take the statistical route, would we need 100 replications with at least 95 in the same direction to say that the result is correct? Obviously this seems too much in terms of resources, but two studies with the same result should not be considered proof of a result either. This then ties in with the first point – if you were the sixth person to do a replication of the same problem, would you actually do this, or would you rather be the first to replicate another study? And if all replications are done in collaboration with the original authors, imagine the time this costs them each time a replication is conducted.
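The question of how many replications establish a result can be made concrete with a simple binomial calculation. The sketch below is illustrative only; the assumed success rates (80% for a real effect, 5% – a typical significance level – for a false positive) are hypothetical numbers, not figures from the manuscript:

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance that at least k of n
    independent replications succeed when each succeeds with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical rates: a real effect replicates 80% of the time,
# a false positive "replicates" 5% of the time (the significance level).
print(prob_at_least(2, 2, 0.80))    # real effect, 2 of 2 succeed: ~0.64
print(prob_at_least(2, 2, 0.05))    # false positive, 2 of 2 succeed: ~0.0025
# The 95-of-100 criterion mentioned above would reject even a real effect
# that replicates at 80%: the probability of meeting it is essentially zero.
print(prob_at_least(95, 100, 0.80))
```

Under these assumed rates, two agreeing replications already shift the odds considerably, while a 95-of-100 criterion would be far too strict unless the true replication rate were extremely high – consistent with the point that the community needs a convention somewhere in between.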

Overall, then, I do not object to publication of the paper (after the correction of several typos); it should hopefully lead to more discussion on the topic.


Author Response

Thanks for the comments to the revised version. In order to avoid a significant extension of the manuscript I have addressed your comments as briefly as possible. 


In Section 5 I have clarified that I mention a few very different replication types in order to illustrate that one will need different guidelines to review different types of replication papers.


In Section 4 I have mentioned that a replication paper may increase the number of citations of the original paper. Even without replication papers, it is a problem when counting citations is used to evaluate the quality of a paper. This problem in bibliometrics should not affect our decisions. On the contrary, we should be aware that bibliometrics is often quite problematic. It is the job of people working in bibliometrics to develop more accurate measures of paper quality.


I have added a few lines of discussion to the conclusion in order to address your last point. I think we agree on this point, and the only way I really see to promote replications is as part of the education of researchers.


I have corrected a number of typos and also made other minor corrections to the language.
