How to make a submission

Become a github collaborator

To make a submission, you need to become a collaborator with the Data Challenge.

If you have not done already, register a github account.
Send your github user ID to oliver.ratmann@imperial.ac.uk
Log on to your github account and remain logged on for the day.
Accept our invitation to collaborate on the DataChallenge.2017 repository.

Prepare your submission

Store your predictions in a csv file.

The following R code illustrates how to do this:

require(DataChallenge.2017)
data(train)
#   set the file name of your prediction
#   please replace TEAM with your team name
#   and SUBMISSION with your submission ID
outfile.base <- 'DataChallenge_TEAM_SUBMISSION_'
#   extract the countries and years for which you need to make a prediction
#   and make your prediction
sub <- subset(train, YEAR>=2015)    
sub[, PREDICTION:= as.integer(runif(nrow(sub),0,1)>0.5)]

#   submissions must have the following format: 5 columns with names
#   ISO (country code as in sub above)
#   YEAR (2015 and 2016 as in sub above)
#   TEAM_ID (allocated to you)
#   SUBMISSION_ID (unique to this submission)
#   PREDICTION (real numbers are allowed). 
#   You need to make a prediction for each country and each year in `subset(train, YEAR>=2015)`.
sub     <- unique(subset(sub, select=c(ISO,YEAR,PREDICTION)))
sub[, TEAM_ID:='olli0601']
sub[, SUBMISSION_ID:='random']

#   check your submission, this must evaluate to TRUE. 
dc.check.submission(sub)

#   write submission to file
write.csv(sub, row.names=FALSE, file=paste0(outfile.base,'predictions.csv'))

Upload your submission

Next, you need to upload your submission. This is easy once you accepted the github invitation to collaborate on the Data Challenge.

Go to the github repository of the Data Challenge.
Make sure you are logged in on github with your github user ID.
Click on the submissions folder.
Click the Upload Files button, drag and drop the submission you created.
Commit your submission to the master branch. That s it!

** First rule for today: submissions that do not pass dc.check.submission will be ignored. **

** Second rule for today: your team can make at most 30 submissions. **

How your submission is evaluated

We will calculate the mean squared error between all of your submissions and the true number of epidemic outbreaks in 2015 and 2016 across all countries.

Submissions that do not pass the dc.check.submission will be ignored. If more than 30 entries are submitted by your team, only the first 30 will be considered.

Beating our baseline prediction is not difficult, do not worry. We hope you enjoy working with real & messy data today, which is a bit different to what you typically see in your courses. =J

2017-11-21

Become a github collaborator

Prepare your submission

Upload your submission

How your submission is evaluated