Software for Reproducible Research with R

Author

Steven J. Pierce

Modified

2025-09-13 10:41:49 EDT

Statisticians can demonstrate leadership in research projects by advocating for and engaging in reproducible research (Hochheimer et al., 2024). Reproducible research can be facilitated by using a set of software packages that combine into a powerful, integrated set of tools. The statistical software R, paired with RStudio Desktop (RStudio Team, 2025), is the foundation for a set of tools that can generate dynamic documents. These documents mix narrative text with R code and can be compiled to automatically produce a fully formatted report, manuscript, website, or set of presentation slides complete with text, equations, figures, tables, and reference sections (Mair, 2016).

While Mair (2016) introduces R Markdown as a tool for writing dynamic documents, Quarto (Allaire et al., 2025) is a next-generation tool that replaces R Markdown. Quarto will be a key part of the future of scientific publishing for those who want to do reproducible research. It can produce documents in many different formats (e.g., HTML, PDF, Word, PowerPoint, and many more). Combining Quarto with TinyTeX enables production of very nice PDF files, which are great for distribution and archiving. Meanwhile, Git (Torvalds et al., 2025) and GitHub solve source code version control problems and facilitate collaboration on code (Bryan, 2018).

Marwick et al. (2018) discuss creating research compendiums that bundle collections of documents along with data files into custom R packages to make a research project more reproducible. These software packages become additional scholarly products that make manuscripts more reproducible.

1 Purpose

This document describes the software packages I personally use and recommend to enable reproducible research via R packages that serve as research compendiums. It covers which packages one will need, why they are needed, and how to install them. I share these notes when helping other people set up the software environment that I rely on for my daily research and statistical work. If you want to see how I use these software tools, check out my recent presentation on reproducible research (Pierce, 2025, March 6).

Tip

I update this page whenever I install an update to R, RTools, RStudio, Quarto, or Git.

2 Assumptions

  1. You will be using either MS Windows 10 or 11 or MacOS. The notes below about MacOS are a result of helping a few collaborators who rely on MacOS; they may be less detailed (and less up-to-date) because I don’t personally use MacOS.
  2. You have administrative access to that computer to install and configure software.
  3. You aim to use custom R packages as research compendiums containing Quarto scripts for creating dynamic documents that compile into formatted PDF or HTML files (or other output formats).
  4. You want to use version control software to track how the code and documents change over time.

3 R, version 4.5.1 (or later)

R (R Development Core Team, 2025) is a free, open-source software environment for statistical computing and graphics. This is the foundational software for my approach to reproducible research. It will handle most of your data management and analysis tasks. You can download R from the Comprehensive R Archive Network (CRAN).
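
If R is already installed, the running version can be compared against the minimum recommended here. This is a minimal sketch; the helper name `meets_minimum_r` is hypothetical and not part of R itself.

```r
# Hypothetical helper: TRUE when the running R is at least `minimum`.
meets_minimum_r <- function(minimum = "4.5.1") {
  getRversion() >= numeric_version(minimum)
}

# Report the running version and whether it satisfies the requirement.
message("Running ", R.version.string)
meets_minimum_r()
```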

3.1 R for Windows

If you are using Windows, follow the links labeled Download R for Windows, then base, then Download R-4.5.1 for Windows or use the direct link to the Windows installer. Save that installation file to your computer, then run it.

3.2 R for MacOS

If you are using MacOS, follow the link labeled Download R for macOS, then download the latest release that matches your version of MacOS and your processor type (Intel or ARM). The two processor types have different requirements for which versions of Xcode and other tools must be installed to compile R packages.

3.3 R Installation Tips

I recommend selecting the following options when installing R.

  1. 64-bit User installation
  2. Customized startup
  3. SDI (separate windows)
  4. HTML help
  5. Save version number in registry
  6. Associate R with .RData files

If you previously were using an older version of R (any version in the 4.4.x series), you should plan to reinstall all your R packages from scratch under R 4.5.0 or later. The best way to do that is to use a script such as Reinstall_Packages.R under the older version to save a data file containing the names of installed packages, then remove the older version of R and replace it with the newest version of R, and use the remainder of that script to read in that list of packages and install them. That will take several minutes if you have a lot of packages.
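
The two-step process described above can be sketched as follows. This is not the content of Reinstall_Packages.R; it is a minimal illustration under stated assumptions. The function names and the file name `installed_packages.rds` are hypothetical, and the sketch assumes every package can be installed from CRAN.

```r
# Step 1 (run under the OLD version of R): save the names of installed packages.
save_package_list <- function(file = "installed_packages.rds") {
  pkgs <- rownames(installed.packages())
  saveRDS(pkgs, file)
  invisible(pkgs)
}

# Step 2 (run under the NEW version of R): find saved packages that are missing
# and, unless dry_run is TRUE, install them from CRAN.
restore_package_list <- function(file = "installed_packages.rds",
                                 dry_run = TRUE) {
  wanted  <- readRDS(file)
  missing <- setdiff(wanted, rownames(installed.packages()))
  if (!dry_run && length(missing) > 0) {
    install.packages(missing)
  }
  missing
}
```

Calling `save_package_list()` before removing the old version and `restore_package_list(dry_run = FALSE)` after installing the new one would perform the reinstall. The default `dry_run = TRUE` only lists what is missing. Packages that came from sources other than CRAN (e.g., GitHub) would need separate handling.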

4 Development Tools

Developing your own custom R packages requires more software than just using R and relying solely on R packages created by others. The development tools you need to install depend on your operating system. Use the subsection below relevant to your OS.

4.1 RTools for Windows, version 4.5 (6608-6492 or later)

Installing RTools is important because it contains tools for compiling software packages, and one of the R packages we will use later (the devtools package) depends on having RTools available. You can download RTools from CRAN. Either follow the CRAN links labeled Download R for Windows, then RTools, then RTools 4.5, then Rtools45 installer, or use the direct link to the installer file. Save that installation file to your computer, then run it.

4.2 MacOS: Xcode, GNU Fortran, and Mandatory Libraries

If you are using MacOS instead of Windows, then you need a few MacOS-specific development tools that fulfill the compiler functions that RTools serves on Windows. The details are available at https://mac.r-project.org/tools/. Specifically, you need Xcode, GNU Fortran 12.2 Compiler, and some mandatory libraries.
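
On either operating system, one way to check whether R can see a working build toolchain is sketched below. It assumes the pkgbuild package may be present (it is installed as a dependency of devtools) and skips that part of the check otherwise. An empty string from `Sys.which()` does not necessarily mean a tool is missing, because R may locate it by other means, so the pkgbuild result is likely the more informative of the two.

```r
# Look for the compilers and `make` on the search path; "" means not found.
tools <- Sys.which(c("make", "gcc", "gfortran"))
print(tools)

# If the pkgbuild package is installed, ask it for a more thorough check
# that a working build toolchain is available.
if (requireNamespace("pkgbuild", quietly = TRUE)) {
  print(pkgbuild::has_build_tools(debug = TRUE))
}
```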

5 RStudio Desktop, version 2025.09.0+387 (or later)

RStudio Desktop (RStudio Team, 2025) is an integrated development environment (IDE) that provides many convenient features for working with R. It provides a better user interface for R than the raw R software. It contains a very good programming editor, plus a nice way to browse the list of objects R has in memory. You can download the desktop version from https://posit.co/download/rstudio-desktop/.

5.1 RStudio for Windows

You can use this direct link to the Windows installer. Save that installation file to your computer’s Downloads folder, then run it.

5.2 RStudio for MacOS

You can use this direct link for the MacOS installer.

6 Quarto, version 1.8.24 (or later)

Quarto (Allaire et al., 2025) is an open-source scientific and technical publishing system that makes it easy to create reproducible, dynamic documents that interweave narrative text, code, and the result of running code (including tables, figures, and other statistical output). Quarto is the evolution of, and successor to, R Markdown. An older version of Quarto may come bundled into RStudio, but we will usually want the most recent stable release, so install it separately. You can download the latest stable version from https://quarto.org/docs/get-started/.

6.1 Quarto for Windows

You can use this direct link to the Windows installer. Save that installation file to your computer’s Downloads folder, then run it.

6.2 Quarto for MacOS

You can use this direct link to the MacOS installer.

7 Git, version 2.50.1.1 (or later)

Git (Torvalds et al., 2025) is version control software developed by programmers to help them manage the source code they write. It is a very sophisticated tool for tracking the history of what changed in a text file from one version to the next, who made the changes, when they were made, and why. It allows you to easily revert to any tracked version of a file. After you install Git, it is helpful to explore the site https://happygitwithr.com/ (Bryan et al., n.d.).

7.1 Git for Windows

You can download it from either https://git-scm.com or https://gitforwindows.org. You can use this direct link for the Windows installer. Save that installation file to your computer, then run it.

7.2 Git for MacOS

Download instructions are available at https://git-scm.com/download/mac.
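
Once Quarto and Git are installed, it can be useful to confirm from within R that both command-line tools are on the search path and to see which versions were found. The sketch below is one way to do that on either operating system; the helper name `cli_version` is hypothetical.

```r
# Hypothetical helper: return the first line of `<cmd> --version`, or NA if
# the command cannot be found on the search path.
cli_version <- function(cmd) {
  path <- unname(Sys.which(cmd))
  if (!nzchar(path)) {
    return(NA_character_)
  }
  out <- tryCatch(
    system2(path, "--version", stdout = TRUE, stderr = TRUE),
    error = function(e) NA_character_
  )
  if (length(out) == 0) NA_character_ else out[[1]]
}

# Check both tools at once; the result is a named character vector.
versions <- vapply(c("quarto", "git"), cli_version, character(1))
print(versions)
```

An `NA` entry suggests the tool is either not installed or not on the search path that R sees. Restarting RStudio after installing new software is often enough to refresh that path.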

8 Installing R Packages

User-contributed R packages add extra features to R. They are a large part of what makes R so useful. The specific packages we plan to use can be installed from CRAN by running the following command in R’s console.

install.packages(c("devtools", "here", "kableExtra", "knitr", "quarto", 
                   "rmarkdown", "tidyverse", "tinytex"))

Each of these may also automatically install additional packages on which they depend. We may recommend installing additional packages later.
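
After the installation finishes, a quick check like the following can confirm that each package loads. It is a minimal sketch that uses only base R functions.

```r
pkgs <- c("devtools", "here", "kableExtra", "knitr", "quarto",
          "rmarkdown", "tidyverse", "tinytex")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need to be installed.
names(available)[!available]
```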

The devtools package provides developer tools and functions that simplify creating and quality-checking custom R packages.

The here package facilitates using relative filenames when one is using RStudio projects to organize files.
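
As a brief illustration of what that looks like in practice, the sketch below builds a file path relative to the project root. The folder name `data` and file name `example.csv` are hypothetical placeholders, and the code is skipped if the here package is not installed.

```r
if (requireNamespace("here", quietly = TRUE)) {
  # Project root that here detected (e.g., the folder holding an .Rproj file).
  print(here::here())

  # Build a path relative to that root; the folder and file names below are
  # hypothetical placeholders.
  data_file <- here::here("data", "example.csv")
  print(data_file)
}
```

Because the path is resolved from the project root rather than the current working directory, the same code should work no matter which subfolder a script is run from.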

The rmarkdown and knitr packages work together to translate R Markdown and/or Quarto scripts into Pandoc markdown files. The quarto package provides support for interfacing with Quarto. Meanwhile, the kableExtra package extends knitr’s tools for making nice, formatted tables in R Markdown or Quarto. Finally, the tinytex package provides an easy way to install and start using TinyTeX to convert R Markdown or Quarto documents to fully formatted PDF files.

Tip

In the paragraph above, note the capitalization of the software names: quarto is an R package that helps you interface with the stand-alone Quarto software. Similarly, tinytex is an R package that helps you install and use TinyTeX (a LaTeX distribution created by the same author).

The tidyverse is a collection of related R packages that provide a unified set of functions that make working with R much easier. They support creating piped operations that chain together easily.
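
A small example of a piped operation is sketched below, using the dplyr package (part of the tidyverse) and the mtcars data set that ships with R. It uses R's native `|>` pipe; tidyverse code often uses the `%>%` pipe instead, which works similarly here. The code is skipped if dplyr is not installed.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Mean fuel economy of four-cylinder cars, by number of gears.
  result <- mtcars |>
    dplyr::filter(cyl == 4) |>
    dplyr::group_by(gear) |>
    dplyr::summarise(mean_mpg = mean(mpg), n_cars = dplyr::n())
  print(result)
}
```

Each step receives the output of the previous step as its first argument, so the code reads top to bottom in the order the operations are applied.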

9 TinyTeX

TinyTeX is a specific distribution of LaTeX, which is document preparation software that allows high-quality typesetting. It takes plain text LaTeX files (*.tex files) that describe the structure of a document and compiles them into fully formatted PDF files with nice fonts and layout. We can use the tinytex R package to install TinyTeX via the following command inside R.

tinytex::install_tinytex()

You can use alternative LaTeX distributions and tools (e.g., MiKTeX) instead, but TinyTeX is very convenient because of how well it integrates with the other tools we’re using. Yihui Xie, the author of tinytex and TinyTeX, is a contributor to R Markdown, Quarto, and the knitr and rmarkdown packages in R.

If it has been a while since you installed TinyTeX, you can easily update to the latest version with the following command. The reinstall should also restore any LaTeX packages you had previously accumulated along the way. In my experience, the first PDF file you render after a reinstall takes longer than normal, but rendering speed returns to normal after that.

tinytex::reinstall_tinytex()
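
After installing or reinstalling, a check along the following lines can confirm that R finds a LaTeX distribution and whether it is TinyTeX. This is a minimal sketch; the part that uses the tinytex package is skipped if that package is not installed.

```r
# Locations of common LaTeX executables; "" means not found on the search path.
latex_paths <- Sys.which(c("pdflatex", "xelatex", "tlmgr"))
print(latex_paths)

if (requireNamespace("tinytex", quietly = TRUE)) {
  # TRUE if the LaTeX distribution that R finds is TinyTeX.
  print(tinytex::is_tinytex())
}
```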

10 Pandoc

Pandoc is software that allows you to convert documents between multiple formats. A copy of Pandoc comes bundled with RStudio and another copy comes bundled with Quarto, so you do not need to download and install it separately. If you want PDF output, Pandoc is what converts documents into the LaTeX files that can be processed by TinyTeX. Quarto is built on top of Pandoc because it facilitates producing many different kinds of output files.
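
To see which copies of Pandoc are available, something like the sketch below may help. It assumes the rmarkdown package and the Quarto command-line tool are installed and skips each check otherwise.

```r
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # TRUE if rmarkdown can locate a copy of Pandoc.
  print(rmarkdown::pandoc_available())

  # Version and folder of the copy that rmarkdown would use.
  print(rmarkdown::find_pandoc())
}

# Quarto exposes its bundled copy through the `quarto pandoc` subcommand.
if (nzchar(Sys.which("quarto"))) {
  print(system2("quarto", c("pandoc", "--version"), stdout = TRUE)[1])
}
```

The two checks may report different versions, because RStudio and Quarto each bundle their own copy.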

11 rcrpkg.template

rcrpkg.template (Pierce, 2024) is a Git repository I created to serve as a starter template for new, project-specific Git repositories. It streamlines creating a new custom R package that serves as a research compendium (Marwick et al., 2018) for a project. Inside the scripts subfolder, the Example_Render_to_HTML.qmd script demonstrates rendering to HTML output and the Example_Render_to_PDF.qmd script demonstrates rendering to PDF output. The latter file also uses the title.tex and compact-title.tex files stored in the same folder to control the formatting of space around the title and showing author affiliation info. References to those can be removed from the YAML header to simplify the script and reduce dependency on additional files. The Setup_as_Package.qmd Quarto script guides one through some key steps in setting up a custom R package.

The basic approach to using this template is:

  1. Fork and clone the rcrpkg.template repository to your GitHub account.
  2. Use GitHub settings to set that as a repository template for yourself.
  3. Use GitHub to create a new repository (with a suitable name of your choice) from that template repository.
  4. Clone your new repository to your local computer.
  5. Start using the Setup_as_Package.qmd script inside your new local repository to take additional steps.
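
Step 4 can be done through RStudio or a Git client, or from R itself. The sketch below shows one possible way to do it from R. The helper name `clone_repo` is hypothetical, as are the account name `your-user-name` and repository name `my.project`. It assumes Git is on the search path and that the repository is hosted on GitHub.

```r
# Hypothetical helper: build (and optionally run) the `git clone` command for
# a repository hosted on GitHub. With dry_run = TRUE nothing is executed and
# the arguments that would be passed to git are returned for inspection.
clone_repo <- function(owner, repo, dest = repo, dry_run = TRUE) {
  url  <- sprintf("https://github.com/%s/%s.git", owner, repo)
  args <- c("clone", url, dest)
  if (!dry_run) {
    system2("git", args)
  }
  args
}

# Placeholder account and repository names; substitute your own.
clone_repo("your-user-name", "my.project")
```

Setting `dry_run = FALSE` would run the clone into a folder named after the repository inside the current working directory. After that, opening the new folder as an RStudio project and working through Setup_as_Package.qmd is the next step. The example scripts could then be rendered with, for instance, `quarto::quarto_render("scripts/Example_Render_to_HTML.qmd")`.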

12 References

Allaire, J. J., Dervieux, C., Scheidegger, C., Teague, C., & Xie, Y. (2025). Quarto (Version 1.8.24) [Computer Program]. Posit Software, PBC. https://quarto.org
Bryan, J. (2018). Excuse me, do you have a moment to talk about version control? The American Statistician, 72(1), 20–27. https://doi.org/10.1080/00031305.2017.1399928
Bryan, J., the STAT 545 TAs, & Hester, J. (n.d.). Happy Git and GitHub for the useR [Web Page]. https://happygitwithr.com
Hochheimer, C. J., Bosma, G. N., Gunn-Sandell, L., & Sammel, M. D. (2024). Reproducible research practices: A tool for effective and efficient leadership in collaborative statistics. Stat, 13(1), e653. https://doi.org/10.1002/sta4.653
Mair, P. (2016). Thou shalt be reproducible! A technology perspective. Frontiers in Psychology, 7(1079), 1–17. https://doi.org/10.3389/fpsyg.2016.01079
Marwick, B., Boettiger, C., & Mullen, L. (2018). Packaging data analytical work reproducibly using R (and friends). The American Statistician, 72(1), 80–88. https://doi.org/10.1080/00031305.2017.1375986
Pierce, S. J. (2024). rcrpkg.template: Research compendium R package template (Version 0.6.0) [Computer Program, R package, Public Repository]. GitHub. https://github.com/sjpierce/rcrpkg.template
Pierce, S. J. (2025, March 6). Reproducible research: Principles, practices, and tools for generating reproducible statistical analyses and reports [Online seminar]. Center for Statistical Training and Consulting webinar series on Responsible and Ethical Conduct of Research. https://github.com/sjpierce/CSTAT.RR2025
R Development Core Team. (2025). R: A language and environment for statistical computing (Version 4.5.1) [Computer Program]. R Foundation for Statistical Computing. http://www.R-project.org
RStudio Team. (2025). RStudio Desktop: Integrated development environment for R (Version 2025.09.0+387) [Computer Program]. Posit Software, PBC. https://posit.co
Torvalds, L., Hamano, J. C., & other contributors to the Git Project. (2025). Git for Windows (Version 2.50.1(1)) [Computer Program]. Software Freedom Conservancy. https://git-scm.com