dplyr and tidyr
Session 6
September 21, 2023
Informal poll:
Do you collaborate with anyone who currently uses GitHub, or who could?
Chat:
What would you say to encourage them to start?
Chat:
How would you get set up? What would be your core workflow?
Create a branch for today’s work.
Note
Create this branch from a repo you own, not the demo from Tuesday!
We might need to create a repo for R scripting for this workshop.
We'll use dplyr and tidyr to work with data in R.
Let’s install some packages:
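A minimal sketch of the install step, assuming the two packages named in this session (dplyr and tidyr) are the ones needed and that a CRAN mirror is already configured. The variable names `pkgs` and `missing_pkgs` are illustrative only.

```r
# Packages used in this session.
pkgs <- c("dplyr", "tidyr")

# Install only the packages that are not already available
# (assumes a CRAN mirror is configured for install.packages).
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Attach the packages for the current R session.
library(dplyr)
library(tidyr)
```

Installing happens once per machine; the `library()` calls are needed in every new R session.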
Key points:
data.frame or tibble
NAs
dplyr
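As a quick sketch of what this section covers, the example below runs each core verb and both pipe operators on a small made-up dataset. The tables `plots` and `sites` and all of their columns are hypothetical, invented for illustration. The base pipe `|>` requires R 4.1.0 or later.

```r
library(dplyr)

# Hypothetical example data, invented for illustration.
plots <- tibble(
  plot_id  = c(1, 1, 2, 2, 3),
  species  = c("oak", "pine", "oak", "pine", "oak"),
  height_m = c(12.5, 8.0, 15.2, NA, 9.8)
)
sites <- tibble(
  plot_id = c(1, 2, 3),
  site    = c("north", "north", "south")
)

# select pulls columns; filter pulls rows based on values.
# This chain uses the older magrittr-style pipe.
oaks <- plots %>%
  select(plot_id, species, height_m) %>%
  filter(species == "oak")

# mutate adds or modifies a column (here with the base R pipe).
plots_cm <- plots |>
  mutate(height_cm = height_m * 100)

# group_by + summarize calculates group-wise summary statistics.
# na.rm = TRUE drops the missing height before averaging.
mean_heights <- plots |>
  group_by(species) |>
  summarize(mean_height_m = mean(height_m, na.rm = TRUE))

# left_join combines data frames based on the matching plot_id column.
plots_with_site <- plots |>
  left_join(sites, by = "plot_id")
```

Without `na.rm = TRUE`, the mean for any group containing an `NA` would itself be `NA`, which is one reason missing values deserve attention before summarizing.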
Core dplyr verbs:
select pulls columns
filter pulls rows based on values
mutate adds or modifies a column
group_by + summarize calculates group-wise summary statistics
*_join functions combine data frames based on matching columns
Pipes:
%>%: older, included in dplyr (ultimately depends on magrittr)
|>: included in base R as of 4.1.0
tidyr
Key tidyr manipulations:
pivot_longer turns column names into row values
pivot_wider creates new columns based on the values of a given field.
Note
Data should be as long as is reasonable (but not longer)!
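The two pivots above can be sketched as a round trip on a small made-up table. The data frame `wide` and its `count_2021` / `count_2022` columns are hypothetical, invented for illustration.

```r
library(tidyr)

# Hypothetical wide data: one count column per year.
wide <- data.frame(
  site       = c("north", "south"),
  count_2021 = c(10, 7),
  count_2022 = c(12, 9)
)

# pivot_longer turns column names into row values:
# the year moves out of the column names and into a "year" column.
long <- wide |>
  pivot_longer(
    cols         = starts_with("count_"),
    names_to     = "year",
    names_prefix = "count_",
    values_to    = "count"
  )

# pivot_wider creates new columns based on the values of a given field:
# each distinct year becomes its own column again.
wide_again <- long |>
  pivot_wider(
    names_from   = year,
    values_from  = count,
    names_prefix = "count_"
  )
```

The long form has one row per site and year, which is usually the easier shape for grouping and summarizing with dplyr.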
For help, see ?tidyr::pivot_wider and the package index pages.
Work through the steps to synchronize the changes you’ve made today with GitHub.
Command-line instructions
git add <your script name>
git commit
git push