Session 10
October 3, 2024
Make it easier for people to cite your code
Provide a CITATION file on GitHub
Archive to get a digital object identifier (DOI)
Include DOI and/or citation in your paper’s Data/Code Availability Statement
A CITATION.cff file contains citation information written in YAML
Adding a CITATION.cff file to your repo…
Puts a “cite this repository” button on GitHub
Helps code archive tools fill out metadata correctly when you archive your repo
Learn more and create your own: https://citation-file-format.github.io/
See example here
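As a rough illustration, a minimal CITATION.cff might look like the sketch below. The title, author name, ORCID, version, date, and DOI are all placeholders, not values from any real repository; `cff-version`, `message`, `title`, and `authors` are the fields the format requires.

```yaml
# CITATION.cff (place in the root of the repository)
# All values below are hypothetical placeholders; replace them with your own.
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "My Analysis Code"
authors:
  - family-names: "Doe"
    given-names: "Jane"
    orcid: "https://orcid.org/0000-0000-0000-0000"
version: "1.0.0"
date-released: "2024-10-03"
doi: "10.5281/zenodo.0000000"  # add after archiving, once a DOI exists
```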
Service | Versioned DOIs? | Free? | GitHub integration? | Notes |
---|---|---|---|---|
Zenodo | Yes | Yes | Yes | Backed by CERN, built with code and data in mind |
Dryad | Yes | No, but some publishers cover cost | No | Intended for data, not code. Partners with Zenodo |
Figshare | Yes | Yes | Yes | Can’t choose your license |
UA ReDATA | Yes | Yes (for UA researchers) | No | University of Arizona Libraries |
No hard rules on this, but my preference:
Congrats! Your code is reproducible! But what about…
in 3 years when an R package is updated with breaking changes?
on a different operating system with different versions of system libraries?
Capture the computational environment for ultimate reproducibility
renv
The renv package records the R packages, and their versions, used in your project
Projects are isolated with their own set of packages
Can restore exact versions of packages recorded
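A typical renv workflow, sketched under the assumption that you run these commands from the project's root directory in an R session:

```r
# install.packages("renv")  # one-time install from CRAN

renv::init()      # set up a project-specific library and create renv.lock

# ...install or update packages as you work on the project...

renv::status()    # check whether the library and renv.lock are in sync
renv::snapshot()  # record the currently used package versions in renv.lock

# Later, or on a collaborator's machine, after getting the project files:
renv::restore()   # reinstall the package versions recorded in renv.lock
```

Committing renv.lock to version control alongside your code is what lets someone else run `renv::restore()` and get the same package versions.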
renv
Exercise
Install renv and activate it for a project with renv::init(). Inspect the files that were created.
If you change your mind …
To deactivate renv, run renv::deactivate(). To also remove all the files it created, run renv::deactivate(clean = TRUE) instead.
renv
Only tracks R packages
Can’t reproduce operating system or system libraries
Sometimes quite annoying to use (but it’s getting better!)
Docker containers…
Are isolated “virtual machines”
Run Linux regardless of the host machine OS
Can be built with specific versions of OS, system libraries, and R packages (using renv)
Can be downloaded and run from the command line
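To make the idea concrete, here is a minimal Dockerfile sketch that combines a pinned R version with renv. It assumes the project already has an renv.lock file, uses the community-maintained rocker/r-ver base image, and runs a hypothetical script named analysis.R; adjust the R version and file names to match your project.

```dockerfile
# Pin the R version (and with it the underlying Linux release)
FROM rocker/r-ver:4.4.1

WORKDIR /project

# Restore the R packages recorded in renv.lock
COPY renv.lock renv.lock
RUN R -e "install.packages('renv')"
RUN R -e "renv::restore()"

# Copy the rest of the project and run the (hypothetical) analysis script
COPY . .
CMD ["Rscript", "analysis.R"]
```

Under these assumptions, the image could be built with `docker build -t my-analysis .` and run with `docker run my-analysis`, where `my-analysis` is an arbitrary image name.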
There is a reproducibility trade-off for using renv and Docker: robust computational reproducibility, but harder for novices to reproduce
If you use these tools, provide:
Tuesday 10/8: Drop-in co-working session.
Thursday 10/10: Reproducibility Colloquium!