Every new Kaggle kernel must begin as a branch off an existing Kaggle dataset. For example, see the “New Notebook” button on the right side of this dataset page:
https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results
If you click that button, you will be creating a notebook or ‘kernel’ that is only allowed to see the data that is included in this dataset (which can include multiple files).
Now let’s see what the inside of a Kaggle kernel looks like. Look at the ‘Code’ view of this kernel and see the raw R Markdown code that was used to generate it:
https://www.kaggle.com/heesoo37/olympic-history-data-a-thorough-analysis/code
This is the rendered report:
https://www.kaggle.com/heesoo37/olympic-history-data-a-thorough-analysis/report
The R Markdown script loads the data directly from Kaggle, and the kernel only has access to the data that is stored in its associated Kaggle dataset. You can see that the kernel is explicitly associated with this dataset:
https://www.kaggle.com/heesoo37/olympic-history-data-a-thorough-analysis/data
This means that the code in the kernel must point to the location of the data files on Kaggle, just as you had to point to the files on your computer when running the code locally. As I will explain later, the data files on Kaggle always live in the relative path ../input/.
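As a quick check of what the kernel can actually see, here is a minimal sketch that lists everything under ../input/. The variable names are only illustrative, and the files returned depend entirely on the dataset attached to your kernel.

```r
# Directory where Kaggle places the files of the attached dataset.
input_dir <- "../input"

# List every file under the input directory, including files in subfolders.
# If the directory does not exist (e.g., when run locally), this returns
# an empty character vector rather than raising an error.
data_files <- list.files(input_dir, recursive = TRUE, full.names = TRUE)

print(data_files)
```

Running this in the first chunk of a kernel is an easy way to confirm the exact file names before you write any code that reads them.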
Your Kaggle kernel will require code that is very similar to your existing R Markdown documents, but with a few important differences:
Leave out the knitr::opts_chunk block of code at the top of your R Markdown documents (that is only there for RStudio, so you should not include it in the Kaggle code).
Include a YAML header (the block delimited by --- at the very top of your R Markdown script). Replace the title with your project title. You don’t have to put your name the way you do in the homework, because Kaggle will already show your profile with the kernel. Note that you can modify these settings if you wish. Start simple to get your kernel up and running, but then you can look around at other R Markdown kernels you like and see what they put in their YAML header to improve the appearance of their document. You can change the theme, figure size options, etc. This example YAML header is taken from the Olympic History kernel I pointed to earlier:
---
title: 'Olympic history: a thorough analysis'
output:
  html_document:
    number_sections: true
    toc: true
    fig_width: 8
    fig_height: 5
    theme: cosmo
    highlight: tango
    code_folding: hide
---
Do not add the knitr::opts_chunk block of code. You do NOT want to include that.
Your existing code loads the data from your own computer (using file.choose() or the file path on your system), and you will need to replace that with the correct paths on Kaggle’s platform. The files for your dataset will always be located at the path ../input/, and then you add the names of your files, e.g., ../input/data.csv. Check the organization of your data to see what the file structure is.
If you feel stuck, please look at other Kaggle kernels for guidance. You can look at the code in the R Markdown kernels to see exactly how they wrote their kernel in order to get their work to display.
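To make the path change described above concrete, here is a rough sketch of swapping a local file load for a Kaggle one. It assumes the example file name data.csv used above; the local path in the comment and the variable names are hypothetical, and your dataset may keep its files in a subfolder of ../input/.

```r
# Locally, you may have loaded your data in one of these ways:
#   my_data <- read.csv(file.choose())
#   my_data <- read.csv("C:/Users/me/Documents/data.csv")  # hypothetical local path

# On Kaggle, build the path from ../input/ instead.
# "data.csv" is only an example name; substitute your own file name.
data_path <- file.path("..", "input", "data.csv")

if (file.exists(data_path)) {
  # Read the file and show a compact summary of its columns.
  my_data <- read.csv(data_path, stringsAsFactors = FALSE)
  str(my_data)
} else {
  # Point toward the likely cause instead of stopping with an error.
  message("File not found: ", data_path,
          ". Check the file structure of your dataset under ../input/.")
}
```

The file.exists() check is optional. It is only there so the chunk reports a missing file clearly instead of halting the whole report.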