In January of this year, I posted a summary of my Essential PhD Toolkit. In that post, I discussed the trifecta of RStudio, Markdown, and
knitr. I also mentioned ProjectTemplate, an R package that helps keep files
organized and can automate some analyses. I learned about and started
using all of these tools at the same time, so I’m not always clear on how to
use them independently. That said, I learned an integrated routine from a
colleague and have adopted it for myself. I’ll share that routine with you in
this post. I’m sure it can be improved, so please feel free to
leave some tips in the comments. Also, ask me if anything is unclear.
Pre-steps: Install R, RStudio, and packages
I'm assuming you have a basic understanding of R and already have it installed. Install RStudio and then the ProjectTemplate package. I don't recall whether knitr gets installed automatically, so install it too just in case; installing it explicitly does no harm.
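As a minimal sketch of the package install step, the snippet below checks which of the two packages are missing and installs only those. The helper name `missing_packages` is made up for this illustration, and the `interactive()` guard is an assumption on my part so that knitting a document never triggers a download.

```r
# Hypothetical helper (not from any package): return the entries of `pkgs`
# that are not currently installed. system.file() returns "" for a package
# that cannot be found.
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))]
}

needed <- c("ProjectTemplate", "knitr")
to_install <- missing_packages(needed)

# Only install when running interactively (an assumption of this sketch).
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Running this in the RStudio console a second time should do nothing, since both packages will already be present.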
Step 1: Create the project
Within RStudio, run the following lines of code. Change "MyProjectName" to whatever you would like your project to be called.
    require(ProjectTemplate)
    create.project("MyProjectName")

Figure: Folder structure created by ProjectTemplate
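If you want to confirm that the folders used later in this post exist, a quick check along these lines may help. This is only a sketch: `missing_folders` is a hypothetical helper, and the default folder list covers just the four folders this post mentions ("src", "munge", "lib", "cache"), not everything ProjectTemplate creates.

```r
# Hypothetical helper: report which expected sub-folders are absent
# from a project directory.
missing_folders <- function(project_dir,
                            expected = c("src", "munge", "lib", "cache")) {
  expected[!dir.exists(file.path(project_dir, expected))]
}

# Returns character(0) when every expected folder is present.
missing_folders("MyProjectName")
```

An empty result means the project is laid out the way the rest of this post assumes.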
Step 2: Open the project within RStudio
From the Project menu, select "Create Project". Within the pop-up window, select "Existing Directory".
Using the Browse... button, find the folder you just created using create.project(), above. In this example, it's called "MyProjectName".
Click "Create Project".
Figures: selecting "Create Project" from the Project menu, then "Existing Directory" in the pop-up window; browsing for the folder created in Step 1.
Using Markdown and knitr with ProjectTemplate
I won't dwell on the specifics and mechanics of ProjectTemplate; for that, visit the website, which contains much more detail than I will give here. Here I'm simply providing the extra bits of code I add to ensure that my scripts run smoothly without any working directory issues.
ProjectTemplate relies on the folder structure it created to automate some scripts and analyses. I don't take advantage of this as much as I should, but I do benefit from auto-loading packages and auto-loading any basic scripts (i.e. custom functions I wrote for this project). In order for scripts to be run automatically, they must be saved in specific places and the working directories must be specified in specific ways.
I keep my analysis scripts in my "src" folder. Any data-manipulation and pre-processing scripts are in the "munge" folder. My custom scripts are in the "lib" folder. The first chunk of each of my scripts is a startup chunk, which checks which folder the file is saved in (e.g. "src" or "munge") and then sets the working directory one level higher, to the project root. This step is necessary because load.project() must be run from the project's root folder.
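```{r startup}
require(ProjectTemplate)
# If the script is being run from "src", move up to the project root
if (basename(getwd()) == "src") setwd("..")
# Load the project (packages, lib scripts, etc.) from the root folder
load.project()
```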
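If you keep scripts in more than one folder, the same check can be wrapped in a small function. The sketch below is one possible way to do it; `go_to_project_root` is a hypothetical helper, not part of ProjectTemplate, and it assumes your scripts sit exactly one level below the project root.

```r
# Hypothetical helper: if the working directory is one of the script
# folders, move up one level to the project root. Returns the previous
# working directory invisibly so it can be restored later.
go_to_project_root <- function(script_dirs = c("src", "munge")) {
  old_wd <- getwd()
  if (basename(old_wd) %in% script_dirs) {
    setwd("..")
  }
  invisible(old_wd)
}

# Small demonstration in a throwaway folder under tempdir()
demo_root <- file.path(tempdir(), "MyProjectName")
dir.create(file.path(demo_root, "src"), recursive = TRUE, showWarnings = FALSE)
old <- setwd(file.path(demo_root, "src"))
go_to_project_root()
basename(getwd())  # "MyProjectName"
setwd(old)         # restore the original working directory
```

In a real script, the call to `go_to_project_root()` would sit right before `load.project()`, replacing the single-folder `if` check.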
To my understanding, knitr resets the working directory to the folder containing the script ("src" or "munge") for each chunk, so I add a conditional working directory change at the start of each one. This conditional setwd() allows me to run the script within RStudio AND to let knitr knit the script, without confusing working directories. So, for example:
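```{r load_files}
# When run from the project root (e.g. interactively), step into "src"
# so that the relative path below resolves the same way as under knitr
if (basename(getwd()) == "MyProjectName") setwd("src")
load("../cache/oldanalyses.RData")
```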
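An alternative to changing the working directory in every chunk is to build file paths from the project root, wherever the code happens to be running. The sketch below is only an illustration: `project_root` and `project_path` are hypothetical helpers, and they assume the working directory is either the project root itself or one of the script folders directly beneath it.

```r
# Hypothetical helper: work out the project root from a working directory.
# If `wd` is a script folder ("src" or "munge"), the root is its parent;
# otherwise `wd` is assumed to already be the root.
project_root <- function(wd = getwd(), script_dirs = c("src", "munge")) {
  if (basename(wd) %in% script_dirs) dirname(wd) else wd
}

# Hypothetical helper: build a path relative to the project root.
project_path <- function(..., wd = getwd()) {
  file.path(project_root(wd), ...)
}

# The same call gives the same path from the root or from "src":
project_path("cache", "oldanalyses.RData", wd = "/home/me/MyProjectName/src")
# "/home/me/MyProjectName/cache/oldanalyses.RData"
project_path("cache", "oldanalyses.RData", wd = "/home/me/MyProjectName")
# "/home/me/MyProjectName/cache/oldanalyses.RData"
```

With this approach, the chunk above could call `load(project_path("cache", "oldanalyses.RData"))` without any `setwd()`, whether it is run in RStudio or knitted.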
Those are the basics of how I start up a new project using ProjectTemplate, and how I navigate some of the issues with working directories using Markdown/knitr/RStudio. If you have any questions, leave a comment. I'm not an expert, but I'll try my best to address your question.