# Reproducibility in Statistics and Data Science (Discussion)

2016/08/03 @ JSM Chicago

# Jenny

• I admire Jenny’s courage to tackle this problem
• spreadsheets can store any mess
• actions are not recorded
• The Importance of Reproducible Research in High-Throughput Biology (Keith Baggerly) https://youtu.be/7gYIs7uYbMo
• two possible solutions
• spreadsheet users, we hate you
• or perhaps we can help
• I wish spreadsheets could die eventually, but this is probably not possible or will take an extremely long time
• how do we encourage people to switch to better tools/approaches?

# Karl

• steps toward reproducible research
• personally I tend to avoid the phrase “reproducible research” (e.g. in my book “Dynamic Documents with R and knitr”)
• click click click vs type type type
• In Code We Trust
• Karl is pretty good at writing short tutorials, so please don’t buy my book; read this instead: http://kbroman.org/steps2rr/ (and a few of his other tutorials)

# Karthik

• computational reproducibility
• reproducible != correct, but better than not being able to reproduce
• testing in a nutshell
• `if (output != expected) stop("did not get expected output")`
• e.g. you can use testthat from the Hadleyverse (I use my little package testit)
• research with big data / intensive computing?
• making reproducible research tools more accessible
• incentives
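
The "testing in a nutshell" bullet above can be spelled out as a small runnable sketch. The function names `standardize()` and `check_output()` are hypothetical, made up for illustration only; the comparison uses `all.equal()` instead of `!=` on the assumption that the output is numeric and should be compared with a tolerance. The testthat part only runs if that package happens to be installed.

```r
# Hypothetical analysis step, used only for illustration.
standardize <- function(x) {
  (x - mean(x)) / sd(x)
}

# The check from the notes, wrapped in a small helper (hypothetical name).
# all.equal() compares with a numeric tolerance, which is safer than != for
# floating-point results.
check_output <- function(output, expected) {
  if (!isTRUE(all.equal(output, expected))) {
    stop("did not get expected output")
  }
  invisible(TRUE)
}

# Stops with an error if the result ever changes.
check_output(standardize(c(1, 2, 3)), c(-1, 0, 1))

# The same expectation written with testthat, if it is available.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("standardize() centers and scales", {
    testthat::expect_equal(standardize(c(1, 2, 3)), c(-1, 0, 1))
  })
}
```

A passing check says nothing about whether the analysis is correct; it only tells you that the code still produces what it produced before, which is the "reproducible != correct" point above.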

# Mine

• perspective from education
• reminded me of my (dark) life as a PhD student