Yihui Xie, RStudio PBC

### 2021/09/09 @ Brazilian R-Day

---

## About me

- Currently a software engineer at RStudio
- Majored in statistics from 2002 to 2013 (2002-09 Renmin University of China, and 2009-13 Iowa State University)
- Homepage:

---

## About me

- I love programming (not an expert), mostly in R and occasionally JavaScript
- I love writing even more (not an expert, either)
- Programming + Writing = Literate Programming (LP)

---

background-image: url(
background-size: contain
background-position: right center

## Where

## I am

## now

---

## How the journey started

- Sweave:
- I started using Sweave in 2007 and fell in love with it
- I did my homework assignments in Sweave whenever possible, and tried my best to promote it to my classmates

---

## Why Sweave?

- Reproducible research?
  - For a student like me doing my homework assignments, that was not the most convincing reason.

--

- Convenience!
  - Before Sweave, I had to run the code separately, copy the results, and paste them into my homework assignments.
  - If I had to change the code for some reason, I would have to repeat this boring process (copy and paste).

---

## Sweave syntax

```tex
\documentclass{article}
\begin{document}

<<>>=
# this is a code chunk
1 + 1
@

This is a narrative.

\end{document}
```

- Filename extension is `.Rnw`, e.g., `test.Rnw`
- `<<>>=` starts a code chunk; `@` starts a narrative (prose / text / documentation)

---

## Sweave syntax

- Sweave is much more powerful and flexible than its ancestor (WEB/noweb)
- You can use chunk options to control the behavior of a code chunk, e.g.,

```tex
<<echo=FALSE, width=8>>=
plot(1:10)
@
```

---

## Where did the syntax come from?

- Donald Knuth, Literate Programming (1984):
  - computer program + natural language
  - tangle and weave
- WEB and CWEB (1987):
- noweb (1989):
- They all use the `<<>>=` and `@` syntax

---

## A lesser-known fact about LP

A very powerful feature is that you can label code chunks, and re-organize them freely in a document, e.g.,

```tex
<<chunk-A>>=
add_one = function(x) {
<<chunk-B>>
}
@

<<chunk-B>>=
x + 1
@
```

See an example at and more at

---

## The birth of knitr

- Sweave's implementation of Literate Programming was clever, novel, and simple (starting in 2002)
- but it has some limitations, e.g.,
  - no easy way to specify the width of a plot in a code chunk
  - primarily supports LaTeX output
  - not easy to extend

---

## The birth of knitr

- Extension packages existed, e.g., **cacheSweave** (caching), **pgfSweave** (high-quality TikZ graphics), **highlight** (syntax highlighting), and **R2HTML** (HTML output)
- but you could only have one extension, e.g., you cannot have both caching and TikZ graphics
- One package to combine all these nice features and remain extensible at the same time

---

## The idea of Literate Programming, again

- Program code (for computer) + narratives (for human)
- The program code can be in any language, e.g., R, Python, Julia, JavaScript, C++, SQL, etc.
- The narratives can also be written in any documentation language, e.g., LaTeX, HTML, Markdown, etc.
- knitr's design is language-agnostic.

---

## knitr releases

- 2011-10-16: initial development
- 2012-01-17: initial CRAN release
- 50 CRAN releases in 10 years
- The latest release (v1.34) was from today!

---

## My first book in 2013

A few months before I graduated from Iowa State University, I published the book "Dynamic Documents with R and knitr".

.center[]

---

## Dynamic Documents with R and knitr

- I had to document the knitr package anyway. The full documentation would be lengthy. Why not write a book?
- As a PhD student, I was free and *fearless*.
- Content: (1) Package documentation (2) Q&A from knitr mailing list and Stack Overflow (3) Software internals.

---

## The 2nd edition

- The 1st edition focused on Rnw documents (R + LaTeX)
- R Markdown became much more popular, so I updated the book and published the 2nd edition in 2015.

---

## Lessons learned?

- Tackle a problem that you run into very frequently (e.g., for me, I wanted to do my homework assignments more efficiently)
- You will be excited when the problem is solved, and that can be where the Flywheel effect starts

> [...] relentlessly pushing a giant, heavy flywheel, turn upon turn, building momentum until a point of breakthrough, and beyond

---

## Lessons learned?

- If you seek to make an impact, I'd recommend depth before breadth (see my blog post "Impact: Depth or Breadth?")
- For example, consider writing a book or something of substantial length.
- Make use of the community help, e.g., **knitr** has received 441 pull requests on GitHub in 10 years and has 99 contributors

---

## Lessons learned?

- The existing "best" choice could be improved or even challenged. Software is written by humans, and no human is perfect.
- But...

--

- In retrospect 10 years later, I was not nice enough to Sweave from the beginning and appeared ungrateful.
- Recognize the second-mover advantage. Show respect and be grateful to pioneers in the field.

---

## The first generation of R Markdown

- Based on the R package **markdown** (2013-14): (N.B. not **rmarkdown**!)
- Only supports HTML output
- But quickly became popular:

---

## R Markdown: the second generation

- Based on Pandoc
- Multiple output formats: HTML, Word, PDF, PowerPoint, E-book, etc.
- Initial experiment: `knitr::pandoc()` (2014)
- Matured as the **rmarkdown** package (2015):

---

## How R Markdown works

`rmarkdown::render()` ≈ `knitr::knit()` + Pandoc

---

## R Markdown supports multiple computer languages

Despite the name "R Markdown", it is not only for "R".
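
As a small illustration (a sketch only, assuming the **knitr** package is installed; the variable names `engines` and `wanted` are made up for this example), the language engines that knitr registers can be listed from R:

```r
# A minimal sketch: list the language engines registered in knitr.
# Assumes the knitr package is installed; `engines` and `wanted` are
# illustrative names, not part of knitr.
engines = names(knitr::knit_engines$get())

# Show a few engine names in alphabetical order
print(head(sort(engines)))

# Check whether some non-R languages are available as chunk engines
wanted = c("python", "sql", "bash")
print(wanted %in% engines)
```

An engine name is what goes in a chunk header, e.g., a chunk whose header is `{python}` is handled by the `python` engine instead of R.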
---

.center[]

(2018)

---

.center[]

(2020)

---

## The bookdown package (2016)

- An R package for writing books with R Markdown
- Supports HTML, LaTeX/PDF, and EPUB output

---

## The blogdown package (2017)

- Creating websites with R Markdown, based on static site generators, such as Hugo, Jekyll, and Hexo
- Another year, another package, another book:

---

## More in the R Markdown ecosystem

- rticles:
- tufte:
- pagedown:
- flexdashboard:
- xaringan:
- shower:
- distill:

---

## More in the R Markdown ecosystem

- learnr:
- pkgdown:
- htmlwidgets:
- prettydoc:
- minidown:
- rmdformats:
- ...

---

## The future?