Advanced Reproducible Research in R

A workshop on creating collaborative and automated analysis pipelines

Authors

Luke William Johnston (Steno Diabetes Center Aarhus)

Anders Askeland

Published

September 16, 2025

Welcome!

Three people working together to brainstorm, design, and develop a project.


Reproducibility and open scientific practices are increasingly in demand in modern research environments. At the same time, our work as researchers increasingly involves close collaboration on scientific projects. Consequently, many new challenges arise that we often lack the training or knowledge to resolve.

These challenges include:

  • Establishing common coding styles and standards to make it easier to read or review each other’s code;
  • Documenting the software dependencies of a project to synchronize computing environments among collaborators and potentially with servers;
  • Documenting (and automating) the steps taken to process, analyze, and present data and findings in a way that allows collaborators to regenerate the most recent results.

Training in, and awareness of, the skills and knowledge necessary to create reproducible and transparent data analysis pipelines are still significantly lacking among researchers. Partly due to this gap, exactly how an analysis is done (including data processing and wrangling) to produce a given result is often poorly described, if at all, in scientific studies. This can have a major impact on the reproducibility and, ultimately, the reliability of studies.

This 3-day workshop is designed to address these issues by using participatory live-typing or “code-alongs”, where the teacher demonstrates the tasks on their computer connected to a projector while learners type along on their own computer. The workshop also includes reading tasks, discussion activities, hands-on exercises using a real-world dataset, and group work on a project to apply the skills gained from the workshop.

This website contains all of the material for the workshop, including readings, exercises, presentations, live-typing material, and images. It is structured as a book, with “chapters” as sessions, in order of appearance. We make heavy use of the website throughout the workshop: the “type-along” sessions follow the material on the website almost identically (with slight modifications for time or more detailed explanations).

Check out the overview section of the workshop, starting with the Syllabus.

If you plan on attending the workshop, please make sure to complete the Pre-workshop tasks to get set up and ready for the workshop. The pre-workshop tasks include a survey that you need to fill out before the workshop starts.

Tip: Do you find this workshop material useful?

If yes, please consider “starring” our GitHub repository. Starring the repository will save it to your list of saved repositories, so it’s easy for you to find again later. As a plus, it helps give our project more visibility 🌟

Target audiences

This website and its content are targeted to three groups:

  1. For the learners to use during the workshop, both to follow along and to use as a reference after the workshop ends. A more detailed description of who the learner is can be found in the “Is this for you?” section.
  2. For the teachers to use as a guide for when they do the type-along sessions.
  3. For those who are interested in teaching, who may not have much experience or may not know where to start, to use this website as a guide to running and teaching their own workshops.

Re-use and licensing

The workshop material is licensed under the Creative Commons Attribution 4.0 License, so the material can be used, re-used, and modified, as long as there is attribution to this source. Check out the For teachers section for more details and tips on using this material for teaching.

Contributing

Want to contribute to this workshop? Look through our CONTRIBUTING page for contribution guidelines on how to get started.

Contributors

These are the people who have contributed by submitting changes through pull requests 🎉

@lwjohnst86, @AndersAskeland, @signekb, @nestanyol, @Mortendall

How the website is made

The workshop material is created using Quarto to write the material and create the book format, GitHub to host the Git repository of the material, and GitHub Actions with Netlify to build and host the website. The original source material for this workshop is found on the r-cubed-advanced GitHub repository.

Acknowledgements

The cover illustration is by Storyset.

The Danish Diabetes and Endocrinology Academy hosts, organizes, and sponsors this workshop. A huge thanks to them for their involvement and support! Steno Diabetes Center Aarhus and Aarhus University employ Luke Johnston, who is the lead instructor and curriculum developer.

Logo for Steno Diabetes Center Aarhus

Logo for Danish Diabetes and Endocrinology Academy