Advanced Reproducible Research in R
A workshop on creating collaborative and automated analysis pipelines
Welcome!
Reproducibility and open scientific practices are increasingly in demand in modern research environments. Our work as researchers also increasingly involves a high level of collaboration on scientific projects. Consequently, many new challenges arise that we often lack the training or knowledge to resolve.
These challenges include:
- Establishing common coding styles and standards to make it easier to read or review each other’s code;
- Documenting the software dependencies of a project to synchronize computing environments among collaborators and potentially with servers;
- Documenting (and automating) the steps taken to process, analyze, and present data and findings in a way that allows collaborators to regenerate the most recent results.
Training in and awareness of the skills and knowledge necessary to create reproducible and transparent data analysis pipelines are still significantly lacking among researchers. Partly due to this gap, how exactly an analysis is done (including data processing and wrangling) to produce a given result is often poorly described, if at all, in scientific studies. This can have a major impact on the reproducibility and, ultimately, the reliability of studies.
This 3-day workshop is designed to address these issues by using participatory live-typing or “code-alongs”, where the teacher demonstrates the tasks on their computer connected to a projector while learners type along on their own computers. The workshop also includes reading tasks, discussion activities, hands-on exercises using a real-world dataset, and group work on a project to apply the skills gained from the workshop.
This website contains all of the material for the workshop, including readings, exercises, presentations, live-typing material, and images. It is structured as a book, with “chapters” as sessions, in order of appearance. We make heavy use of the website throughout the workshop, where the “type-along” sessions follow the material on the website almost identically (with slight modifications for time or more detailed explanations).
Check out the overview section of the workshop, starting with the Syllabus.
If you plan on attending the workshop, please make sure to complete the Pre-workshop tasks to get set up and ready for the workshop. The pre-workshop tasks include a survey that you need to fill out before the workshop starts.
Target audiences
This website and its content are targeted to three groups:
- For the learners to use during the workshop to follow along, and as a reference after the workshop ends. A more detailed description of who the learner is can be found in Is this for you?
- For the teachers to use as a guide for when they do the type-along sessions.
- For those who are interested in teaching, who may not have much experience or may not know where to start, to use this website as a guide to running and teaching their own workshops.
Re-use and licensing
The workshop material is licensed under the Creative Commons Attribution 4.0 License, so the material can be used, re-used, and modified, as long as there is attribution to this source. Check out the For teachers section for more details and tips on using this material for teaching.
Contributing
Want to contribute to this workshop? Look through our CONTRIBUTING page for contribution guidelines on how to get started.
Contributors
These are the people who have contributed by submitting changes through pull requests 🎉
@lwjohnst86, @AndersAskeland, @signekb, @nestanyol, @Mortendall
How the website is made
The workshop material is created using Quarto to write the material and create the book format, GitHub to host the Git repository of the material, and GitHub Actions with Netlify to build and host the website. The original source material for this workshop is found on the r-cubed-advanced GitHub repository.
Acknowledgements
The cover illustration is by Storyset.
The Danish Diabetes and Endocrinology Academy hosts, organizes, and sponsors this workshop. A huge thanks to them for their involvement and support! Steno Diabetes Center Aarhus and Aarhus University employ Luke Johnston, who is the lead instructor and curriculum developer.