SC1

Statistical Computing 1

Projects: organizing your code


RStudio has a nice interface for programming in R. You may have only been using it to create R scripts / RMarkdown files and execute / render them.

It is generally a good idea to keep code for a “project” in one place. RStudio can help with this by letting you create a “Project”, which is little more than a directory with files in it. Opening a project in RStudio automatically sets R’s working directory to that directory.

Project structure

The structure of the files in a project should of course be dictated by the purpose of the project. However, the following top-level folder structure is often used:

project/
├── R/
├── README.md
├── data/
├── doc/
└── output/

A specific project may require additional folders. For example, in some cases it is useful to write some C or C++ code for performance reasons, and this code would normally reside in a src/ folder. One popular way to incorporate C++ code is through the Rcpp package.

If your R code is complicated enough, you may want to add a tests folder with test scripts that you can run automatically to check that the functions do what you expect them to do. In some such cases, you might wish to convert your project into a package to take advantage of various tools that work with packages.

Loading code from files and packages

When code is split across multiple files, there needs to be a mechanism to load it. For R projects and scripts, one can use the source command. For example, one could load the code in the “algorithms.R” file in the “R” directory.

source("R/algorithms.R")

To use an R package, one uses the library or requires command. For example, one could load the Matrix package, which has support for some sparse matrix computations.

library(Matrix)

Example project

An example R project can be found here. You can clone it and knit the RMarkdown document if you like.