Tutorial
Peer-Review Record

Reproducible Research in R: A Tutorial on How to Do the Same Thing More Than Once

by Aaron Peikert 1,2,*, Caspar J. van Lissa 3,4 and Andreas M. Brandmaier 1,5,6
Reviewer 1: Anonymous
Reviewer 2:
Psych 2021, 3(4), 836-867; https://doi.org/10.3390/psych3040053
Submission received: 11 October 2021 / Revised: 24 November 2021 / Accepted: 25 November 2021 / Published: 9 December 2021

Round 1

Reviewer 1 Report

The current manuscript presents a revised account and tutorial of an automated workflow for reproducible research (with version control, build automation, containerization, automated document generation), its implementation with the R package “repro”, and its application in preregistrations as code (PAC).

I thoroughly appreciated the authors’ responsiveness and transparency, and I agree with many of their arguments. The revised text is now clearer about its scope and intended audience, and it strikes a better balance between the merits and limitations of the proposed workflow. Personally, I would have wished for a more substantial revision of the tutorial parts (before PAC), which still lack a real worked example and are still a bit heavy on hypotheticals/light on specifics, but I also understand the authors’ reasoning and their preference to go into the specifics elsewhere.

I have a few additional comments that the authors might want to consider, some general, some specific. However, I want to emphasize that these are all (in my opinion) fairly minor and mostly subjective.

Regarding the issues I had running the code, I also attach an R script with output at the end of this review. None of these issues really kept me from following the tutorial, and I’m sure that some of these can be considered a “won’t fix” (e.g., those related to editors that aren’t RStudio), but perhaps they still provide some useful pointers for future development.

General comments:

1. Scope. The scope and target audience of the article are now much clearer. One point that could still be improved: the introduction should state clearly what is within the scope of this article (a tutorial for the workflow and PAC for simple R-based projects) and what is outside it (e.g., external software, extensions for projects with many files and dependencies, conversion of existing projects).

2. Length. Due to the many useful additions, the article has become quite long and might benefit from cutting or moving certain parts (e.g., to a footnote, supplement). This is obviously subjective, so I simply included a list of potential candidates for this in my specific comments below (marked with a “*”).

Specific comments (page, lines):

2, 63: Typo (archived)

2, 64-66: Finding one’s own solution is rarely actionable advice, and this might go without saying.

2, Footnote 1: Not sure if this adds relevant information (*).

4, 153: Typo (scientists, apostrophe)

5, 205-222: I was still not sure if this is useful here. Considering that even the predefined template requires adding targets to the Makefile, this might be better placed in Section 3.3 (*).

6, 229: Typo (run the desired software the software, missing comma?)

6, 259-261: Partial duplicate of an earlier statement (*).

8, 292: Typo (aims to simplifies)

10, Figure 3.1: There is no reference to this figure in the text.

10, 368: Typo (Git, did you mean GitHub?)

11, 403: Typo (the R Markdown, word missing?)

12, 408-412: Not sure how much this adds (*).

13, 439-442: Partial duplicate (*).

13, 443: Typo (setup)

14, 465: Typo (the)

15, Section 4: The cases discussed here are very technical and somewhat removed from the “advanced topics” that researchers are likely to encounter first (*).

17, 559-561: Text states that commands should be wrapped in RUN1 and RUN2, but the code only wraps them in RUN1.

19, 675-686: Not sure how much this adds (*).

20, Sections 5.3.1 to 5.3.3: Not sure how much this adds or if it’s needed in this much detail (*).

21, 750: Typo (an R Markdown, missing word?).

21, 1: Typo (planned_analyis)

17, 578: The word “bias” has a clear meaning in statistics that differs from the kind of statistical decision errors described here, so I maintain that this should be changed (see my previous review; see also p. 22, 785 and p. 23, 817). 

22, 786-788: I wondered whether this wouldn’t contradict the previous statements about preregistrations after data collection, blinded analysis, and shuffling. Where’s the line?

26, 881-885: This is a nice touch but also a bit convoluted and confusing (*).

27, 884: Typo (preregistratered)

27, 1 (code block 2): Typo (three parentheses open, only one closes)

Example code and output:

library(repro)
path <- getwd()

repro::check_git()
# ✔ Git is installed, don't worry.

repro::check_make()
# ✔ Make is installed, don't worry.

# before post-install steps
repro::check_docker()
# ✖ Docker is not installed.
# ℹ Adapt to your native package manager (deb, rpm, brew, csw, eopkg).
# ℹ You may need admin rights, use `sudo` in this case.
# • Run `apt install docker` in a terminal to install Docker.
# • Add your user to the docker user group. Follow instructions on:
# 'https://docs.docker.com/install/linux/linux-postinstall/'
# • Consider restarting your computer.
# ℹ For more infos visit:
# 'https://docs.docker.com/install/'

# after post-install steps (add user/group, log out, start background services)
repro::check_docker()
# ✔ Docker is installed, don't worry.

repro::use_repro_template(path = path)

usethis::use_git_ignore(".gitignore")
# ✔ Setting active project to '...'
# ✔ Writing 'markdown.Rmd'
# ✔ Writing 'R/clean.R'

usethis::use_git()
# ✔ Initialising Git repo
# ✔ Adding '.Rproj.user', '.Rhistory', '.Rdata', '.httr-oauth', '.DS_Store' to '.gitignore'
# There are 4 uncommitted files:
# * 'data/'
# * 'markdown.Rmd'
# * 'R/'
# * 'repro_tutorial_test_20211104.R'
# Is it ok to commit them?

# 1: Negative
# 2: No way
# 3: Yeah

# first try
# NOTE: error message not too helpful here (if the error is from a system
# command that was supposed to open these files, you might try vi or nano)
repro::automate()
# ✔ Directory `.repro` created!
# ✔ Writing '.repro/Dockerfile_base'
# ✔ Writing '.repro/Dockerfile_packages'
# ✔ Writing '.repro/Dockerfile_manual'
# ✔ Writing 'Dockerfile'
# ✔ Writing '.repro/Makefile_Rmds'
# ✔ Writing '.repro/Makefile_Docker'
# ✔ Writing '.dockerignore'
# • Modify '.dockerignore'
# Error in editor(file = file, title = title) :
# Feature not implemented. Use nvim to edit files.

# second try (no change to dockerignore)
repro::automate()
# ✔ Writing '.repro/Dockerfile_packages'
# ✔ Writing 'Dockerfile'
# ✔ Writing '.repro/Makefile_Rmds'
# ✔ Writing 'Makefile'
# • You probably want to add:
# 'markdown.html'
# to the 'Makefile'-target 'all'.
# • Modify 'Makefile'
# Error in editor(file = file, title = title) :
# Feature not implemented. Use nvim to edit files.

# third try (after editing Makefile)
# NOTE: typo (allready)
repro::automate()
# ✔ Writing '.repro/Dockerfile_packages'
# ✔ Writing 'Dockerfile'
# ✔ Writing '.repro/Makefile_Rmds'
# ✔ `Makefile` allready exists, skip creating it.
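
# (illustration: the manual edit before this third try amounts to listing
# the rendered document under the Makefile's `all` target, e.g.
#   all: markdown.html
# -- a sketch; the exact target list depends on the project)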

repro::reproduce()
# • To reproduce this project, run the following code in a terminal:
# make docker &&
# make -B DOCKER=TRUE
# [Copied to clipboard]

# NOTE: this seems to write several GB of files to the root partition, which
# is really not ideal (possible to define a different directory, e.g., via
# symlink?)
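# (aside: Docker's storage location can be changed by setting "data-root"
# in /etc/docker/daemon.json and restarting the Docker service)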
system(repro::reproduce())

Author Response

Action Letter Reviewer I

Aaron Peikert, Caspar J. Van Lissa and Andreas M. Brandmaier

The current manuscript presents a revised account and tutorial of an automated workflow for reproducible research (with version control, build automation, containerization, automated document generation), its implementation with the R package “repro,” and its application in preregistrations as code (PAC). I thoroughly appreciated the authors’ responsiveness and transparency, and I agree with many of their arguments. The revised text is now clearer about its scope and intended audience, and it strikes a better balance between the merits and limitations of the proposed workflow.

Thank you very much for your thoughtful review. We are glad that we could improve the manuscript based on your last set of comments. In the following, we address your remaining concerns and comments point by point. Our notes are submitted in plain text and as PDF but are identical in content. In accordance with the workflow we suggest in this manuscript, we track all changes to the manuscript using Git and GitHub. Whenever we made changes in response to your suggestions, we provide a link to the changes on GitHub that allows everyone to track and compare the changes to the manuscript exactly. You can find all changes we made since the second submission here:

https://github.com/aaronpeikert/repro-tutorial/compare/v0.0.4.0-resubmission..main#diff-a9a4aad3fa8c9c10c5404b632bc3a01a25d2d8430eb932bc35c76769963e4b70

Personally, I would have wished for a more substantial revision of the tutorial parts (before PAC), which still lack a real worked example and are still a bit heavy on hypotheticals/light on specifics, but I also understand the authors’ reasoning and their preference to go into the specifics elsewhere.

We appreciate that you understand our decision about the revision of the tutorial part.

I have a few additional comments that the authors might want to consider, some general, some specific. However, I want to emphasize that these are all (in my opinion) fairly minor and mostly subjective.

Regarding the issues I had running the code, I also attach an R script with output at the end of this review. None of these issues really kept me from following the tutorial, and I’m sure that some of these can be considered a “won’t fix” (e.g., those related to editors that aren’t RStudio), but perhaps they still provide some useful pointers for future development.

Thank you for attaching the script; we could now track down what caused this problem. If no RStudio installation is available, we fall back on utils::file.edit(), which opens the file in the editor specified in getOption("editor"). The default value for this is vi on Linux, which is not installed by default on Ubuntu and therefore fails. repro will simply skip opening the file on Linux if there is no vi installed. While this needs a fix in the long run, it does not break our proposed workflow.
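
A possible workaround in the meantime is to point R's fallback editor at one that is installed (a sketch; nano is only an example):

options(editor = "nano")  # e.g., in ~/.Rprofile
getOption("editor")       # verify which editor utils::file.edit() will use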

General comments:

  1. Scope. The scope and target audience of the article is now much clearer. One point that can still be improved is for the introduction to provide a clear statement of what is within (tutorial for the workflow and PAC for simple projects based on R) and outside (e.g., external software, extensions for projects with many files and dependencies, conversion of existing projects) the scope of this article.

Thank you for pointing this out; we now clearly state what we mean by “small-scale”:

https://github.com/aaronpeikert/repro-tutorial/commit/3cfc8a3ab2924c416c34539392e128ae8bf3d3b4

  2. Length. Due to the many useful additions, the article has become quite long and might benefit from cutting or moving certain parts (e.g., to a footnote, supplement). This is obviously subjective, so I simply included a list of potential candidates for this in my specific comments below (marked with a “*”).

We agree that the article is somewhat lengthy, but in the end we decided to remove only a few parts. After all, the second reviewer even asked for more specifics, so we try to maintain the current balance of generality and specificity at the given length. Many of the sections that you point out were written in response to specific requests by users or reviewers, so we opted to keep them as they are. We could, however, reduce the manuscript by three pages.

Specific comments (page, lines):

Thank you for pointing out typos. We have edited our manuscript extensively to improve language and style. We therefore do not reply point by point to the typos:

https://github.com/aaronpeikert/repro-tutorial/pull/260/files

2, 64-66: Finding one’s own solution is rarely actionable advice, and this might go without saying.

We agree. It was meant as an encouragement to ignore our advice in the face of hurdles we did not anticipate, but this is probably not helpful. We removed the sentence altogether.

https://github.com/aaronpeikert/repro-tutorial/commit/5616558044471af77e112fc1002e6507f7e83f06

2, Footnote 1: Not sure if this adds relevant information (*).

We agree and removed it.

5, 205-222: I was still not sure if this is useful here. Considering that even the predefined template require adding targets to the Makefile, this might be better placed in Section 3.3 (*).

In this section, we give an overview of each piece of software we recommend; removing just the part about Make did not strike us as particularly helpful.

6, 259-261: Partial duplicate of an earlier statement (*).

We mention the difference between a Dockerfile and a Docker image in several places, but with different functions: in the introduction, it explains how Docker works; in the discussion, the implication for archival is emphasized.

10, Figure 3.1: There is no reference to this figure in the text.

Thank you for noticing this oversight:

https://github.com/aaronpeikert/repro-tutorial/commit/396d762b5f6cc998b9d9d66840256bc581edbaa0

12, 408-412: Not sure how much this adds (*).

We believe it to be important to clearly link the static manuscript to the version-controlled code. Otherwise, a reader could end up trying to reproduce a manuscript with the wrong version of the code.

15, Section 4: The cases discussed here are very technical and somewhat removed from the “advanced topics” that researchers are likely to encounter first (*).

The few users that repro has attracted to this point are technically well versed and had some issues related to these concepts; however, we agree that this could as well be moved to an appendix.

17, 559-561: Text states that commands should be wrapped in RUN1 and RUN2, but the code only wraps them in RUN1.

Thanks for spotting this oversight.

https://github.com/aaronpeikert/repro-tutorial/commit/f27115556485ba7ae5e9ec630c7d8f703e158cb8

19, 675-686: Not sure how much this adds (*).

We wrote this in response to another reviewer's comment, which we take the liberty of citing here:

However, data can often have unexpected properties, and only expected data-dependent decisions can be formulated as code (line 600). Unexpected things like multicollinearity can occur, or sometimes interpretation is required, for instance in cases of determining measurement invariance. What to do in such cases?

We highlight here that deviations must be interpreted in light of previous results and existing theory.

20, Sections 5.3.1 to 5.3.3: Not sure how much this adds or if it’s needed in this much detail (*).

Following your advice, we removed these sections.

17, 578: The word “bias” has a clear meaning in statistics that differs from the kind of statistical decision errors described here, so I maintain that this should be changed (see my previous review; see also p. 22, 785 and p. 23, 817).

We have replaced any mention of bias with “opportunistic bias.” Indeed, through the discussion we had with you and others about how bias and researchers’ degrees of freedom relate, we reconsidered the use of both terms. Our previous use was not only sloppy but may have misrepresented the goals of preregistration. We hope to have addressed this concern in:

https://github.com/aaronpeikert/repro-tutorial/pull/277

22, 786-788: I wondered whether this wouldn’t contradict the previous statements about preregistrations after data collection, blinded analysis, and shuffling. Where’s the line?

We have clarified in the manuscript that the purpose of preregistration is to distinguish confirmatory analyses from exploratory ones. The value of preregistration is greatest when a researcher has had no prior exposure to the data. When researchers are exposed to the data, there is a risk that the line between confirmatory and exploratory analyses begins to blur. This decreases the value of preregistration, but not in an all-or-nothing manner. Therefore, researchers ought to disclose the degree of prior exposure to the data.

https://github.com/aaronpeikert/repro-tutorial/commit/781b7bf76be942e9cd07bce14c6182ed4afd2fe0

26, 881-885: This is a nice touch but also a bit convoluted and confusing (*).

We moved this suggestion to a footnote and suggested a clearer way to inspect changes. GitHub switches to a different view for big diffs (150+ commits), which indeed made this a bit difficult.

27, 1 (code block 2): Typo (three parentheses open, only one closes)

Thanks for spotting this. It is fixed now:

https://github.com/aaronpeikert/repro-tutorial/commit/a7e9791736ed0dae39817220f16ef438d606e02f

Author Response File: Author Response.pdf

Reviewer 2 Report

I am very happy with the adjustments made by the authors. The following are only very minor comments:

line 52: I do not understand this sentence.

line 63: "achieved" instead of "archived"

line 225: "threaten" instead of "threatens"

lines 228-230: Is there a word missing (e.g., "becomes")?

line 292: "simplify" instead of "simplifies"

line 413: either "in a RMarkdown document" or "in RMarkdown"

line 589: Is there a word missing ("is")?

line 623: "." is missing

line 714: "in such cases" or "in such a case"

line 724: A "," is missing.

lines 745-747: I think this sentence is misleading in the sense that a formal power analysis can never in itself increase the planned sample size of a study. (This is like writing that cigarettes increase lung cancer.) Researchers who conduct a formal power analysis before conducting a study might evaluate their planned sample size differently (or not).

line 788: "sometimes" instead of "sometime"

line 920: An "a" is missing.

Limitations section: I think it is a bit unclear to the reader what this section refers to. Are these limitations of the manuscript, limitations of the package repro, or limitations of PAC?

Author Response

Action Letter Reviewer II

Aaron Peikert, Caspar J. Van Lissa and Andreas M. Brandmaier

I am very happy with the adjustments made by the authors.

We are glad that we could improve the manuscript based on your previous set of comments. In the following, we address your remaining concerns and comments point by point. Our notes are submitted in plain text and as PDF but are identical in content. In accordance with the workflow we suggest in this manuscript, we track all changes to the manuscript using Git and GitHub. Whenever we made changes in response to your suggestions, we provide a link to the changes on GitHub that allows everyone to track and compare the changes to the manuscript exactly. You can find all changes we made since the second submission here:

https://github.com/aaronpeikert/repro-tutorial/compare/v0.0.4.0-resubmission..main#diff-a9a4aad3fa8c9c10c5404b632bc3a01a25d2d8430eb932bc35c76769963e4b70

The following are only very minor comments:

We have extensively edited our manuscript to improve style and language. We therefore do not reply point by point to all the typos mentioned (but we are very thankful that you made us aware of them):

https://github.com/aaronpeikert/repro-tutorial/pull/260/files

line 52: I do not understand this sentence.

We revised this sentence to improve clarity:

https://github.com/aaronpeikert/repro-tutorial/commit/077a9f9092baac186a337a6aa9b4a3be53067b52

lines 745-747: I think this sentence is misleading in the sense that a formal power analysis can never in itself increase the planned sample size of a study. (This is like writing that cigarettes increase lung cancer.) Researchers who conduct a formal power analysis before conducting a study might evaluate their planned sample size differently (or not).

Thank you for clarifying what “misleading” means here; we removed this implication:

https://github.com/aaronpeikert/repro-tutorial/commit/68665c53c6eeb8ba6f7ff110921c4356e2305f79

Limitations section: I think it is a bit unclear to the reader what this section refers to. Are these limitations of the manuscript, limitations of the package repro, or limitations of PAC?

We shortened the summary of PAC to clarify that the discussion relates to the workflow and repro:

https://github.com/aaronpeikert/repro-tutorial/commit/13510cbb5aff30224311bdd32928abf7b4a0299d

Author Response File: Author Response.pdf

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The submitted tutorial on the R package repro for automating reproducible research workflows is well and concisely written. The R package itself seems to be well documented and to a high degree self-explanatory in its usage. While reproducible research is an important and trending topic and the tools used by repro are state of the art, I have some concerns regarding the manuscript and the applicability for other researchers:

Repro expects users to have admin rights on their machines, which is frequently not the case on, for example, university hardware. This is an important limitation that should be stated clearly and early in the manuscript. (If extensions for the package are planned, alternatives to Docker that do not rely on admin rights could be a priority. Could approaches like the R packages groundhog or checkpoint work as alternatives for pure R projects?)
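
To illustrate the date-pinning idea behind such packages, a minimal sketch (the package name and date are placeholders):

library(groundhog)
groundhog.library("lavaan", "2021-09-01")  # installs/loads the version that was on CRAN at that date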

The authors suggest RMarkdown for dynamic document generation. However, there are only very few hints at how actual manuscript submissions can be done via RMarkdown. Often, journals only accept Word or (sometimes) LaTeX submissions. Are there any specific resources on how RMarkdown can be automatically converted to Word documents that are, for example, APA compliant?

The manuscript suggests at various points that data can and should be stored at openly accessible repositories. While this seems reasonable from an open science perspective, this raises data privacy issues which should at least be mentioned in the manuscript.

The suggested workload for applied researchers wanting to do rigorous Preregistration as Code (PAC) seems enormous. Simulation studies are complex, and how to conduct simulation studies is rarely taught in study programs. While the manuscript points to some general literature on simulation studies, no hands-on tutorials specific to R are referenced. (Also: To a certain degree it seems ironic that software development has moved to agile development practices while PAC asks researchers to do the exact opposite: foresee and program all eventualities up front.)

While the suggested three-function workflow seems intuitive for small-scale research questions, it is not clear to the reader how it should be scaled up. If very different types of analyses are performed (e.g., some manipulation checks, linear regressions with different predictor sets, random forests), how should these analyses be integrated? Should all analyses be performed in planned_analysis() and the return object be extended? Should multiple planned_analysis() functions be written? Would very large and complex planned_analysis() functions not reduce reusability for other research projects?
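
One hypothetical structure, sketched here with invented names and placeholder models (not taken from the manuscript), would be to compose several smaller planned_* functions:

planned_checks <- function(data) summary(data)           # manipulation checks
planned_models <- function(data) lm(y ~ x, data = data)  # one of several model sets
planned_analysis <- function(data) {
  list(checks = planned_checks(data), models = planned_models(data))
}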

There are various alternatives to all chosen tools. Some of these alternatives are discussed at greater length (e.g., renv), some only briefly mentioned (e.g., Bitbucket, GitLab). I think it would increase the readability of the manuscript if such listings and discussions could be bundled at one point in the manuscript, also to give the reader a better understanding of the reproducible-research ecosystem.

Connecting Git, RStudio, and GitHub is not trivial, especially for the presented “local repository first, GitHub last” approach. While repro and the manuscript help the reader with installing Git, the manuscript does not give any hints on how to generate SSH or HTTPS keys and how to set up credentials; in fact, it does not even mention that this is a necessary step. Note that the setup is also quite error-prone; for example, it took me quite a while to figure out that I had to update the R package gh for the setup to work (see https://community.rstudio.com/t/usethis-use-github-error-in-validate-gh-pat/105737).
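
For reference, one common HTTPS credential setup from within R is the following sketch (both functions come from the usethis and gitcreds packages; details vary by setup):

usethis::create_github_token()  # opens GitHub in the browser to generate a personal access token
gitcreds::gitcreds_set()        # stores that token so Git can authenticate over HTTPS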

Some minor issues:  

Page 1 line 26 “applications” instead of “application”?

Page 2 line 52 Is there a word missing?

Page 2 line 60 “setting up” instead of “setup”?

Page 4 line 134 Is commit defined twice intentionally?

Page 5/6 lines 212/213 How is this quote linked to the text?

Page 6 line 227 “Ubuntu” is first used here but defined in line 237/238

Page 7 line 263 “in the future” instead of “in future”

Page 9 line 322 The package “usethis” is used without any prior mention in the manuscript. Even if it is automatically installed alongside repro as a dependency, this might confuse readers. Also, if the package is essential for the recommended workflow, it should be cited in the manuscript.

First paragraph on Comprehensiveness: Is this specific to PAC or a general remark on preregistration?

Page 19 line 654 Writing that “a formal power analysis increases the planned sample size” can be misleading.

Multiple links in the manuscript are broken or not working as intended (e.g., Page 22 Line 738, Page 23 Line 757-759)

Curious question:

It appears to me that “Make” is distributed alongside Rtools40, but it is not recognized by repro. Is this intentional or a limitation of the Rtools distribution?

Reviewer 2 Report

The present manuscript introduces a reproducible workflow for project management and scientific writing. The workflow is focused on R and relies on four key components to manage, run, and compile project-related files: version control (with Git), dynamic documents (with “knitr”), build automation (with Make), and containerization (with Docker). This workflow was originally presented by Peikert and Brandmaier (https://doi.org/10.5964/qcmb.3763). The novel contribution of the manuscript is that it provides a tutorial for implementing this workflow with the R package “repro” and that it explains how to apply these techniques to create “preregistrations as code” (PAC).

In my opinion, there is a lot to like about this manuscript and the ideas that it presents, for example, the thorough discussion of different threats to reproducibility, the didactic nature of the tutorial, and the inclusion of code snippets for the “repro” package. That being said, I have a number of comments for the authors to consider.

Major comments:

1. Feasibility and practical limitations. The suggested workflow, though elegant and powerful, also has some significant limitations that make me doubt how feasible it would be in practice, especially for larger projects that involve multiple parties and a variety of tools and software. For example: Containerization makes it difficult to include external software, and I am not sure how non-free software (e.g., “Mplus”) could be included in this approach. Make executes command-line instructions, but some tasks may be hard to represent in this paradigm. Git tracks changes in text files, whereas binary files can only be tracked “as is” or to the extent that there are text files that generate them. In addition, the workflow has a number of “soft” dependencies and restrictions, such as the reliance of “repro” on RStudio and the need to use platforms like GitHub for sharing and collaboration, which (in my experience) is better than nothing but does not replace collaborative tools.

The authors do mention some of these limitations throughout the manuscript, but I think this often comes too late or in too little detail (e.g., the pros and cons of containerization). This is especially noticeable in the introduction and the discussion, which strongly emphasize the potential benefits but do little to prepare readers for the practical limitations. I understand that many of these limitations are due to the underlying tools, that some can be solved in principle (i.e., with more effort), and that some research practices are simply not compatible with this workflow. However, I think the authors should clarify the scope of this workflow early on and provide a more thorough and balanced discussion of its practical limitations.

2. Target audience. Although the manuscript is intended for a general audience, the entry barrier still seems quite high. Specifically, although the “repro” package is useful and the prepared templates work well, even simple adjustments often require manual intervention, at which point readers will need significant familiarity with tools like Git, Make, or Docker. This is exacerbated by the fact that (a) the manuscript is currently missing a real worked example and relies on the default template for illustration (except in the PAC section) and (b) there is relatively little step-by-step instruction for how editing and compilation tasks interact (e.g., when and in what order to run which commands upon a change in the analyses or the final document).

To address this issue and lower the entry barrier, I think it would be helpful if the manuscript included a fully worked example that is used throughout the text (e.g., the one used in the PAC section). In addition, it might be extremely helpful for some readers if the authors walked them through some common tasks in a step-by-step manner (e.g., perform/commit an edit and update all relevant targets; add a target to the makefile and update; add R and non-R dependencies; convert an existing project; create a Docker image for archiving). This is also important given that the “repro” package is one of the novel aspects of this manuscript, so the authors should use this opportunity to highlight their contribution here.

3. PAC. Although I agree that PAC can be useful, I was not convinced by some of the arguments made in favor of PAC over conventional preregistration. For example, researcher degrees of freedom cannot always be avoided or converted into simple if-else statements; nor are they always bad. In other words, the ambiguity in conventional preregistration may be intentional, because not all decisions or problems can be anticipated in advance, and the purpose of preregistration may be primarily to make this transparent. In addition, writing only one document may not always be realistic (e.g., in larger projects) or beneficial (e.g., because preregistrations serve a different purpose and often contain more detailed information than the manuscript). Finally, preregistering analyses with simulated data seems unrealistic to me. Not only does this require significant familiarity with simulation techniques; simulated data are also often much simpler and more “well-behaved” than real data and may serve as a poor guide for planning statistical analyses.

Finally, I felt that this section is a bit disconnected from the rest of the manuscript, because it primarily discusses the structure of the dynamic document in a PAC, whereas the other components of the suggested workflow were left out. Given that PAC is one of the main contributions of this manuscript, I think it is important that the relation between the suggested workflow and PAC is made clear. By contrast, I think this section currently contains some material that I would consider optional, because it is not specific to PAC (e.g., rules for writing a preregistration, rules for academic writing in general, power simulations).

Minor comments (page, lines):

3, 93: Maybe state here that RStudio is required (not just R). I tried running the code in a different editor but was unsuccessful.

4, 145-146: Would the authors also recommend storing projects on GitHub long term? Given that GitHub is a proprietary service, it may be useful to distinguish hosting projects for collaboration versus storage.

5, 209-211: What restrictions are there regarding what software can and cannot be containerized?

6, 225-226: Apart from the installation, I was also wondering if differences between OSs (e.g., with an Ubuntu container on a Windows machine) could lead to additional threats to reproducibility. It is well known that differences in system libraries can cause differences in computational results between OSs.

6, 231-233: I was a bit confused whether this was a general preference or also the one used in “repro”. More generally, I think the benefit of not “being stuck” may not be clear to readers who aren’t already familiar with Docker. There are also some typos here (“with a unmaintainable because“).

7, 287-288: It might be helpful to some readers if these installation issues (e.g., on Windows) were discussed more. Both the manuscript and “repro” currently only refer to “chocolatey” instead of providing step-by-step instructions, which may deter some readers.

8, 1 (check git): This line is missing the “::” operator.

9, 326-328: Some readers may find it helpful if this was illustrated with screenshots or the “… -> …” syntax used earlier.

10, 368: On what settings does this depend and how?

11, 375-378: This recommendation assumes significant familiarity with Git, and I don’t think that many readers will be able to follow it. I was also wondering if this wasn’t a problem that Makefiles were supposed to solve.

11, 380: Typo (“metadate”)

12, 1 (check make/docker): These lines are missing the “::” operator.

12, 394: I was unsure in what context “automate” was supposed to be executed, and this only became clear with the summary (on pp. 12-13). In addition, I had some errors here. First, the function required “bookdown” and “rticles”, which weren’t named in the DESCRIPTION file. Second, this seems to require RStudio (but this is not a problem for the manuscript).

12, 409-410: To me, it was confusing that the reproduction (reproduce) was explained before the creation of the project (automate). It might be helpful to some readers if the tutorial followed the same order as the summary (on pp. 12-13). In addition, I had some errors with this command, because on my (fresh, Linux) installation of Docker, the Docker daemon was not running and the user permissions were not set automatically, so this code did not run “as is” for me.

12, 412-413: The meaning of these commands is only explained much later, and it might be helpful to discuss this here.

13, 1 (use repro template): This command required a “path” argument that had no default value, so it returned an error.

14, 460: Typo (“these limitation”)

15, 489-493: I was wondering what “repro” does if the locally installed package is outdated (i.e., older than the CRAN version at build time). Will the container then use the more recent version?

15, 503: Typo (“snipped”)

15, 1 (apt-get): I was wondering how this workflow deals with different versions of system software. Specifically, if a project is built/reproduced from a Dockerfile, how does the workflow ensure that the same version is used for system/external software (e.g., when installed with “apt”)? How does it deal with packages being removed from the Ubuntu repositories?
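
One partial mitigation would be to pin exact versions in the Dockerfile, sketched here with a placeholder package name and version string (and this only works as long as that version remains in the repositories):

RUN apt-get update && apt-get install -y somepackage=1.2.3-1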

16, 543: In what way do researcher dfs by themselves “bias” results?

17, 581: Typo (“did included”)

19, 678-679: This recommendation was not clear to me. Maybe I misunderstood, but doesn’t this defeat the purpose of preregistration?

21, 1-14 (planned analyses): It might be helpful for some readers if the code inside this function were commented a bit more.

22, 735-740: For many readers, this might be too difficult without additional guidance, especially when the relevant steps aren’t shown.

24, 765: How does PAC facilitate spotting deviations?

24, 799: Typo (“revise”)

24, 799-800: Not sure if I agree with this. I assume this refers to inspecting diffs in Git, but I don’t think that this necessarily facilitates spotting deviations for human readers (at least not unless the number of changes between the PAC and the final document is trivial).

25, 815: Typo (“to”)

Reviewer 3 Report

In this article, two main things are described: the package repro and the concept of preregistration as code (PAC). I’ll split my reaction to cover both topics.

Repro

The package repro seems a useful addition to the toolset of researchers trying to achieve reproducibility of their work. It lowers the barriers one might face before using Docker and Make, and this is a welcome addition. In general, the part about repro is well written and explained. I have some small comments that might be addressed:

  • At the start of Section 2, threats to reproducibility, the use of seed values could be emphasized. I might be mistaken, but I think that, even with a reproducible environment, seed values might be required to reproduce results from, for instance, (Markov chain) Monte Carlo methods (see the first sketch after this list).
  • I like the reflection on earlier work of the authors in their workflow for open reproducible code in science. Perhaps a short table with an overview of how the current work extends this previous work could be a nice addition.
  • Lines 231-233, I think there might be something missing related to unmaintainable. Not sure what should be here.
  • Lines 245-247 made me wonder how GitHub-installed versions of repro are handled. In their example they use “aaronpeikert/repro@adb5fa569”; it could be good to put some emphasis on this @ reference to ensure that not only “aaronpeikert/repro”-like references are used (see the second sketch after this list).
  • The use of the repro:: prefix is not consistent everywhere (sometimes it is given, sometimes withheld), e.g., lines 298, 406, 407, 415. This code would not run when copy-pasted.
  • Naively following the manuscript in Section 3 can lead to some confusion. For example, I do not get “you are inside a Docker container” with repro::check_docker(), only the message that I don’t need to worry, it is installed. Additionally, repro::automate() deserves its own line so that people will copy and paste it; going naively through the document and copying the lines, I missed this command, and thus repro::reproduce() did not behave as expected.
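
A minimal illustration of the seed point (a sketch; the seed value is arbitrary):

set.seed(1234)
mean(rnorm(1e4))  # same result on every run with this seed; without set.seed(), it varies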
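
And a sketch of pinning a GitHub install to a specific commit via the @ reference (remotes is one package providing this; the hash is the one quoted above):

remotes::install_github("aaronpeikert/repro@adb5fa569")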

A bigger point that should be addressed relates to the next part, on PAC, but I’ll name it here first.

  • Scalability should be discussed. How does the proposed method work for large studies? In such cases, multi-center data might be involved, with all sorts of restrictions related to sharing of data between centers. Is this workflow suitable for such situations? The larger point, also related to PAC, that I would like to see addressed is a clearer discussion of which types of research projects the proposed methods are useful for and for which they might not be useful.

PAC

Regarding PAC. In general, I’m not convinced that this workflow would be widely adopted, and I think it might be unsuitable for many types of research. However, I’m not against this proposal, and it would seem to deserve a place in the literature such that it can be incorporated in discussions. Again, the authors present a well-written document. Related to PAC I have the following comments.

  • First, as mentioned before, I think there should be a clear description of when the authors think a study would qualify as suitable for PAC and when not. I would see this being very useful when studies are relatively straightforward, confirmatory, and preferably single center, or without major data-sharing hurdles. I would see many hurdles in large (consortium) collaborations where sharing of data is complex. One does not always know exactly what the data will look like when they are obtained from consortium partners, and the data cleaning alone might already pose a great challenge to incorporate in PAC.
  • Fake-data simulation is encouraged, which is great. However, data can often have unexpected properties, and only expected data-dependent decisions can be formulated as code (line 600). Unexpected things like multicollinearity can occur, or sometimes interpretation is required, for instance in cases of determining measurement invariance. What to do in such cases?
  • There seems to be a related part of reproducible workflows that is not discussed. Lines 582-583 mention that preregistered code has not gained much traction, which seems correct. However, in clinical trials, for instance, a protocol is often produced, followed by a statistical analysis plan (SAP) and accompanied by a specification of tables, listings, and figures (TLF). The TLF specifications are extremely detailed and, short of the actual code being given, together with the SAP they leave very little room for imagination about what will be analyzed and how it will be interpreted. PAC might be suitable in such situations, and to increase the audience/readership for the paper, a discussion of this might be worthwhile.
  • Line 612, exploratory analyses should be clearly noted as such without PAC too.
  • Lines 668-672 provide insight that PAC is not as rigid as it first sounds: it can be adapted if the data seem unexpected. It might be good to emphasize this early on.
  • From line 677, on alternatives to simulated data: “researchers can delay their preregistration till they have gathered data..”. This, to me, seems like a weird proposal in the context of PAC. PAC is meant to remove researchers’ degrees of freedom. Collecting data before writing up a PAC seems to negate this effect and leave much room for intentional, or unintentional, bad actors. I would suggest removing this from the manuscript.

In general, the manuscript is well written and would deserve a place in the literature. I hope the researchers find the above comments useful to refine their manuscript.
