A 10-second introduction:
Write _narratives_ in Markdown, and _computer code_ in backticks:
```{r}
mean(rnorm(100))
```
Write **more** code chunks and narratives.
The idea comes from Knuth’s Literate Programming.
I was fascinated by Sweave¹ in R when I came across it around 2007
but found it too limited after using it for 4 years (for most of my homework assignments)
I started developing the knitr package in 2011, which became a backend of R Markdown
LaTeX as the documentation language is too difficult for beginners, and Markdown is much simpler
R Markdown = Markdown + computing languages (not limited to R)
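To make the "not limited to R" point concrete, here is a small sketch that lists the language engines registered in knitr. It assumes the knitr package is installed, and the exact set of engines depends on the installed version.

```r
# List the language engines that knitr knows about (assumes knitr is installed).
engines <- sort(names(knitr::knit_engines$get()))

# Peek at a few engine names and check that Python is among them.
head(engines)
"python" %in% engines
```

A chunk header such as `{python}` or `{bash}` selects the matching engine instead of R.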
Appears so.
Number of files on GitHub:
Another piece of evidence: 1600+ books on bookdown.org.
| | 2012 | 2024 |
|---|---|---|
| Output formats | HTML | HTML, LaTeX, PDF, Word, PowerPoint, EPUB, … |
| Applications | Reports | Reports, slides, articles, books, websites, dashboards, … |
| Backend | sundown (a tiny C library) | Pandoc |
| R package dependencies | 0 | 25 |
| Package size | ~2 MB | ~83 MB (Pandoc: 152 MB) |
The above comparison does not include R Markdown extension packages such as bookdown and blogdown, which are even heavier.
Reproducibility is hard without stability, i.e., we don’t want something that worked yesterday to break today.
As a software developer, I’ve gradually become tired of adding features endlessly and managing dependencies.
It feels like playing an infinite game.
We chose Markdown for simplicity, right?
If we make Markdown do everything, is it still simple?
The more features we add, the more challenging it is to maintain the software. We may change our minds or regret a decision, and then deprecate certain features in the future, which means… breakage!
[…] While it has achieved its mission of demonstrating that unifying computational reproducibility and provenance tracking is doable and useful, it has also demonstrated that Python is not a suitable platform to build on for reproducible research. Breaking changes at all layers of the software stack are too frequent. The ActivePapers framework itself (this project) uses an API that was removed in Python 3.9, and while it can be updated with reasonable effort, there is little point in doing so: Published ActivePapers cannot be expected to work with a current Python stack for more than a year.
[…] If you came here to learn about reproducible research practices, the best advice I can give is not to use Python.
Some examples in my software development career:
Pandoc (relatively stable, but breakage/regression happens)
Bootstrap (v2, v3, v4, v5, …)
jQuery (v1, v2, v3, …, and security issues)
Hugo (moving too fast and hard to follow)
GitBook (impossible to maintain after importing into bookdown)
Yes, users could adopt tools like `renv`, `virtualenv`, or even Docker to manage dependencies, but that could bring another problem:
If your results are reproducible only in a highly specific environment, can we really call them reproducible?
Are they really useful?
Computer languages often have package repositories, such as CPAN, CRAN, CTAN, PyPI, and NPM.
A perhaps unique feature of CRAN is that package maintainers must check their reverse dependencies before they are able to publish new versions of their packages to CRAN, which means normally you are not allowed to break packages that depend on your package.
This is a huge pain for me as the maintainer of some popular R packages including knitr and rmarkdown (which have > 10K reverse dependencies), but is enormously beneficial to the whole R community.
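As a rough illustration of what "reverse dependencies" means here, the sketch below looks up the packages that directly depend on a given package. `reverse_deps()` is a hypothetical helper written for this example, not part of CRAN's actual check process. It relies on `tools::package_dependencies()` from base R, and the count it returns depends on which dependency types are included.

```r
# Hypothetical helper: list packages that directly depend on `pkg`.
# `db` is a package database such as the one returned by
# utils::available.packages(); the default needs a reachable CRAN mirror.
reverse_deps <- function(pkg, db = utils::available.packages()) {
  deps <- tools::package_dependencies(
    pkg,
    db = db,
    # Only "strong" dependency types; adding "Suggests" would give more.
    which = c("Depends", "Imports", "LinkingTo"),
    reverse = TRUE
  )
  sort(deps[[pkg]])
}

# Example (requires network access, so it is left commented out):
# length(reverse_deps("knitr"))
```

Each package returned is one that a new release of `pkg` could potentially break.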
litedown: https://github.com/yihui/litedown
The goal: litedown = min{knitr + evaluate + rmarkdown + bookdown + blogdown + pagedown + xaringan + tufte} - Pandoc - Hugo - GitBook - Bootstrap - jQuery
Re-implementing 12 years of work with fewer features and dependencies
Easier to develop, install, and manage
Also easier to be embedded in other applications (e.g., WebAssembly)
Be determined to say No (if you want rich features, you should use rmarkdown or Quarto instead)
Choose a stable foundation to develop software on top of
Hoping to declare “feature-complete” in a few months
GitHub: https://github.com/yihui
Personal website: https://yihui.org
¹ RIP, Fritz Leisch!