Figure path mess with R Markdown/knitr/jekyll/GitHub
25 Apr 2017
Thanks largely to this awesome blog post, it took me less than an hour to set up a local jekyll site on my machine and familiarize myself with the structure. Next, I started pulling and modifying files from the hyde theme until I had the basic look I wanted (a minimalist black-and-white theme with a permanent sidebar and blog). I added some basic content (About Me, publications, etc.), and by nightfall I was happy enough with my new site to go live on GitHub Pages. Not bad for a day’s work.
But the smooth sailing ended the next day when I started trying to incorporate R Markdown documents with R code and graphical output. I read several blog posts on using R Markdown and knitr to create pages for your jekyll/GitHub site (e.g., here and here). The basic idea is this:
- Write your page in an R Markdown document, including chunks of R code
- Use knitr to “knit” the *.Rmd file, which produces a simple Markdown *.md file and output files, including any figures produced by your R code
- Add the *.md file and output to the GitHub repository for your site
- Sit back and relax as GitHub Pages automatically builds and serves your site, rendering your *.md and dependent files into an attractive *.html page with syntax highlighting for your code and R output displayed.
I added my first *.Rmd -> *.md page with R figures (my silly cards page). I then built and previewed my site locally on my machine, and when I saw that everything was beautiful, I pushed it to GitHub. But when I went to view the live site, I was horrified to see that all of the R figures were missing! How could this be?
I checked the html source code for the page and discovered the immediate problem: the file paths to the figures had an extra forward slash prepended to the path, e.g., <img src="//project/figure/fig.png" /> rather than <img src="/project/figure/fig.png" />. Apparently the offending slash was being added when my *.md files were rendered into *.html by GitHub/jekyll.
I am not going to attempt to figure out what sorcery caused this extra slash to appear in my html. I’m sure there is a good reason why GitHub/jekyll behave this way. What I AM going to do is explain what exactly I did that seemed to cause the problem, and how I fixed it with a few lines of R code.
The problem
Locally, my website lives in the directory /Users/nunnlab/Desktop/GitHub/rgriff23.github.io/. After creating my caRds.Rmd file, I added it to my website’s projects folder, and then ran knit like so:
# load knitr
library(knitr)
# navigate to directory of caRds.Rmd
setwd("/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/projects/")
# knit
knit("caRds.Rmd")
As expected, knit produced the file caRds.md in my projects directory, as well as a new folder called figure which was filled with all the figures produced by my R code. Opening the Markdown file reveals that caRds.md points to each figure with a relative path like this:
![figure1](figure/figure1.png)
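To see at a glance which kind of path a knitted file uses, something like the following base R sketch can help. The helper name extract_image_paths and the sample lines are made up for illustration; the regular expression only handles the plain ![alt](target) form shown above.

```r
# Hypothetical helper (not part of knitr): pull the image targets
# out of Markdown lines of the form ![alt](target)
extract_image_paths <- function(md_lines) {
  pattern <- "!\\[[^]]*\\]\\(([^)]+)\\)"
  hits <- unlist(regmatches(md_lines, gregexpr(pattern, md_lines)))
  # keep only the part inside the parentheses
  sub(pattern, "\\1", hits)
}

# Illustrative input: one relative path, one path from the site root
md <- c("Some text",
        "![figure1](figure/figure1.png)",
        "![figure2](/projects/figure/figure2.png)")

paths <- extract_image_paths(md)
# TRUE for paths that start at the site root, FALSE for relative ones
is_root_relative <- startsWith(paths, "/")
```

For a real file, the input would come from readLines("caRds.md") instead of the made-up vector.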
I had no problem previewing the Markdown file in RStudio, nor did I have a problem building the site locally on my machine. The issue only occurred when I pushed the site to GitHub. The html generated by GitHub/jekyll seemingly wanted the image paths to be absolute, and also prefixed an extra slash to each absolute path: <img src="//projects/figure/figure1.png" />. After a bit of tinkering, I discovered that if I manually edited the Markdown file to include the absolute paths from my site root, like this:
![figure1](/projects/figure/figure1.png)
then GitHub/jekyll would leave my paths alone and produce proper html tags linking to my figures. I suppose I could do that manually for all my Markdown files every time I knit, but the whole point of blogging with R Markdown/knitr/jekyll/GitHub is supposed to be the convenient workflow, and having to edit all the figure paths in my Markdown files after knitting them is not my idea of convenience. I reasoned that there were two general approaches to solving this problem:
- Make some adjustment to the knitting process so that the resulting *.md files include the absolute path to the figures on my site
- Make some adjustment to the jekyll configuration so that GitHub/jekyll don’t mess up my paths
I ended up going with the first approach, mostly because I am more comfortable with R/knitr than I am messing with jekyll settings. I should note that because my approach changes the paths in the *.md files such that they are no longer relative, you will no longer be able to preview them in RStudio: you will get an error telling you that your document points to files that don’t exist. I don’t love it, but it isn’t a big deal to me because I rarely feel the need to preview the intermediate *.md files.
The solution
The key is to recognize that the figure paths produced by knit are a combination of two parameters, base.url + fig.path, which can be modified through the knitr objects opts_knit and opts_chunk respectively (via their $set() methods). The defaults are base.url = NULL and fig.path = "figure/", such that the default figure path ends up being simply figure/. Importantly, since fig.path is a chunk option, changing it will not just change the paths in the resulting *.md file; it will also change where your figures are actually sent when you run knit. Specifically, your figures will be sent to base.dir + fig.path (note that base.dir and base.url are different!). Wow, that’s a bit confusing! Let’s distill this down:
knit SENDS your figures to base.dir + fig.path, and the resulting Markdown POINTS to figures in base.url + fig.path.
When I run knit for my *.Rmd files, what I want is for the figures to be SENT to the correct directory on my machine, so I want base.dir + fig.path to define an absolute path on my machine:
/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/projects/figure/
However, in order for the paths to be specified correctly when GitHub/jekyll build my site, I want the resulting *.md file to point towards figures with an absolute path from the root of my website:
/projects/figure/
Here is how this can work in R:
# define paths
base.dir <- "/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/"
base.url <- "/"
fig.path <- "projects/figure/"
# this is where figures will be sent
paste0(base.dir, fig.path)
## [1] "/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/projects/figure/"
# this is where markdown will point for figures
paste0(base.url, fig.path)
## [1] "/projects/figure/"
Okay, that looks like what I want! Now, set the parameters using opts_knit and opts_chunk.
# set knitr parameters
opts_knit$set(base.dir = base.dir, base.url = base.url)
opts_chunk$set(fig.path = fig.path)
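As a quick sanity check (my own addition, not a required step), the values can be read back with the $get() methods of the same two objects. This sketch repeats the settings from above so that it runs on its own; the variable names current_* are just for illustration.

```r
library(knitr)

# same settings as above, repeated so this snippet is self-contained
opts_knit$set(base.dir = "/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/",
              base.url = "/")
opts_chunk$set(fig.path = "projects/figure/")

# read the options back to confirm they took effect
current_base_dir <- opts_knit$get("base.dir")
current_base_url <- opts_knit$get("base.url")
current_fig_path <- opts_chunk$get("fig.path")

# where the Markdown will point for figures
paste0(current_base_url, current_fig_path)
```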
And finally, navigate to the projects directory containing caRds.Rmd and run knit (this ensures that knit sends caRds.md to the same directory as caRds.Rmd, which is what I want):
# change directory
setwd("/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/projects/")
# knit
knit("caRds.Rmd")
And that does the trick! The resulting Markdown file now points towards figures with absolute paths from my site root, e.g., /projects/figure/figure1.png, and GitHub/jekyll produce an html page that displays the figures properly.
I cleaned this up by setting the knit options within the *.Rmd file itself. I added the following lines of code to an R chunk placed right after the front matter in caRds.Rmd:
knitr::opts_knit$set(base.dir = "/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/", base.url = "/")
knitr::opts_chunk$set(fig.path = "projects/figure/")
Now I can just navigate to the projects directory and run knit without setting any options outside of the *.Rmd file. Assuming I want the figures for all of my ‘projects’ to be stored in /projects/figure/, I can use the same knitr settings for all of my project *.Rmd files.
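An alternative to repeating those two lines in every *.Rmd file would be a small wrapper function kept in a separate script. The following is only a sketch of that idea: knit_project and its arguments are names I made up, and it assumes every page lives in a folder directly under the site root with its figures in a figure/ subfolder.

```r
library(knitr)

# Hypothetical wrapper: knit one page with the site-wide figure settings.
# site_dir is the local site root; project is the folder under that root.
knit_project <- function(rmd_name, site_dir, project = "projects") {
  # figures are SENT to base.dir + fig.path,
  # and the Markdown POINTS to base.url + fig.path
  opts_knit$set(base.dir = site_dir, base.url = "/")
  opts_chunk$set(fig.path = paste0(project, "/figure/"))

  # knit from inside the project folder so the *.md lands next to the *.Rmd,
  # then return to the original working directory
  old_wd <- setwd(file.path(site_dir, project))
  on.exit(setwd(old_wd))

  knit(rmd_name)
}
```

With this in place, a call such as knit_project("caRds.Rmd", site_dir = "/Users/nunnlab/Desktop/GitHub/rgriff23.github.io/") would stand in for the setwd/knit steps shown earlier. Note that the knitr options stay set for the rest of the R session.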