The Lean Stack: Rethinking Reproducible Reporting and Visualization in R

Yihui Xie

2026-04-03 @ UNMC School of Public Health

It is not the man who has too little that is poor, but the one who hankers after more.

— Seneca, “Letters from a Stoic”


Two richest men in Omaha

  1. Warren Buffett

  2. ???

(if you think from the perspective of “degrees of freedom”…)


What does “lean” even mean?



Heaviest object in the universe


Part I

⚡ litedown ⚡

Reproducible Reporting, Reimagined

Do one thing, and do it well


What is litedown?

A minimal reimplementation of the R Markdown ecosystem:

$$\mathrm{litedown} = \min{\{R\}} + \{D_i\} - \{D_e\} + \{J\}$$

install.packages('litedown')
litedown::roam()  # get started
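As a rough illustration of the Markdown-to-HTML core, the sketch below renders a few lines of Markdown held in a character vector. It assumes `litedown::mark()` accepts a `text` argument and returns the rendered HTML when no output file is given (check `?litedown::mark` for your installed version). The names `md` and `html` are only for illustration.

```r
# Markdown source kept in a character vector, one element per line
md = c('# Hello, litedown', '', 'Plain Markdown in, **HTML** out.')

# Assumption: mark(text = ...) returns the rendered HTML as character
# output when no output file is specified; skipped if litedown is absent
if (requireNamespace('litedown', quietly = TRUE)) {
  html = litedown::mark(text = md)
  cat(html, sep = '\n')
}
```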

In HTML I Trust

Do you know how awesome web browsers are?…

(now cast the spell “go up”, “go down”, “dark mode”, “mirror my slides”, “edit my slides”, “hi *anyone…” (voice input), or “reset my slides”)


What you get with litedown

All from one package. Two dependencies. No Pandoc.

I re-implemented 12 years of work from scratch in a couple of months in 2024, with fewer features and dependencies—not because I’m smart, but because I had a decade to think about what really matters to me.


Demo time


When litedown falls short

Minimalism has costs:


Part II

💡 gglite 💡

Interactive Visualization, the Light Way


ggplot2

ggplot2 is brilliant. The Grammar of Graphics changed how we think about data visualization. I have nothing but respect for this masterpiece of software engineering.

But…

What if we could have the Grammar of Graphics and interactivity, and still keep it light?


What is gglite?

gglite = a lightweight R interface to G2, a JavaScript visualization library built on the Grammar of Graphics.

install.packages('gglite', repos = 'https://yihui.r-universe.dev')

ggplot2 → gglite: a quick comparison

ggplot2                  gglite                       Notes
ggplot(data, aes(x, y))  g2(data, x = 'x', y = 'y')   Column names are character strings
                         g2(data, y ~ x)              Formula interface
+ operator               |> pipe or +
geom_point()             mark_point()                 “geom” → “mark”
geom_line()              mark_line()
theme_minimal()          theme_light()
labs(title = ...)        titles(...)
Static PNG/PDF           Interactive HTML/JS          Built-in tooltips, brushing

No aes(), no non-standard evaluation, no .data[[]] gymnastics.
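One practical consequence of string column names is that plots can be generated in a loop without any quoting helpers. The sketch below is an illustration, not part of the package: it uses only `g2()` and `mark_point()` as they appear in the comparison above, and the names `num_cols`, `pairs`, and `plots` are made up for this example.

```r
# Numeric columns of iris, found programmatically (base R only)
num_cols = names(iris)[sapply(iris, is.numeric)]

# Every unordered pair of numeric columns: a 2 x 6 character matrix
pairs = combn(num_cols, 2)

# Build one scatter plot per pair; the column names are plain strings,
# so they can be passed straight through. Skipped if gglite is absent.
if (requireNamespace('gglite', quietly = TRUE)) {
  library(gglite)
  plots = lapply(seq_len(ncol(pairs)), function(i) {
    g2(iris, x = pairs[1, i], y = pairs[2, i], color = 'Species') |>
      mark_point()
  })
}
```

The same loop in ggplot2 would typically need `.data[[...]]` or a similar tidy-evaluation idiom inside `aes()`.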


Demo: Scatter plot (Hello, iris!)

library(gglite)
g2(iris, x = 'Sepal.Width', y = 'Sepal.Length', color = 'Species')

Try: hover over points, click legend entries to filter species.


Demo: Box plots

g2(iris, x = 'Species', y = 'Petal.Width') |>
  mark_boxplot()

Box plots with hover tooltips showing quartiles—no extra packages needed.


Demo: Bar chart with dodging

# Simulated clinical trial enrollment by site and treatment arm
enrollment = data.frame(
  site = rep(c('UNMC', 'Johns Hopkins', 'Mayo Clinic', 'UCSF'), each = 2),
  arm = rep(c('Treatment', 'Placebo'), 4),
  count = c(45, 42, 38, 35, 52, 48, 30, 33)
)
g2(enrollment, x = 'site', y = 'count', color = 'arm') |>
  mark_interval() |>
  transform('dodgeX') |>
  titles('Trial Enrollment by Site and Arm') |>
  interact('elementHighlightByX')

Demo: Pie chart (everyone’s guilty pleasure)

# Causes of missing data (we've all been there)
missing = data.frame(
  reason = c('Patient withdrew', 'Lab error', 'Lost to follow-up',
             'Data entry mistake', 'The dog ate it'),
  count = c(25, 15, 30, 20, 10)
)
g2(missing, y = 'count', color = 'reason') |>
  mark_interval() |>
  transform('stackY') |>
  coord_theta(innerRadius = 0.4) |>
  titles('Reasons for Missing Data') |>
  labels(text = 'reason')

Demo: Interactive time series with slider

# Simulated daily case counts
set.seed(42)
n = 365
dates = as.character(seq(as.Date('2025-01-01'), by = 'day', length.out = n))
cases = data.frame(
  date = dates,
  count = cumsum(rpois(n, lambda = 5)) + round(50 * sin(seq(0, 4*pi, length.out = n)))
)
g2(cases, x = 'date', y = 'count') |>
  mark_area(style = list(fill = 'steelblue', fillOpacity = 0.4)) |>
  mark_line(style = list(stroke = 'steelblue')) |>
  titles('Daily Case Count (2025)', subtitle = 'Drag the slider to zoom') |>
  slider_x()

Demo: Radar chart for grant reviewer scores

scores = data.frame(
  criterion = rep(c('Significance', 'Innovation', 'Approach',
                     'Investigators', 'Environment'), 2),
  score = c(8, 7, 9, 8, 6, 6, 9, 5, 7, 8),
  reviewer = rep(c('Reviewer 1', 'Reviewer 2'), each = 5)
)
g2(scores, x = 'criterion', y = 'score', color = 'reviewer') |>
  mark_area(style = list(fillOpacity = 0.3)) |>
  mark_line(style = list(lineWidth = 2)) |>
  mark_point() |>
  coord_polar() |>
  scale_x(padding = 0.5, align = 0) |>
  scale_y(domainMin = 0, domainMax = 10) |>
  axis_y(grid = TRUE, title = FALSE) |>
  titles('NIH Grant Review Scores')

Demo: Word cloud (your dissertation committee’s feedback)

feedback = data.frame(
  text = c('revise', 'resubmit', 'interesting', 'methods', 'typo',
           'references', 'sample size', 'p-value', 'bias', 'limitation',
           'future work', 'well-written', 'novel', 'unclear', 'Table 1'),
  value = c(50, 45, 30, 35, 40, 25, 38, 42, 28, 33, 20, 15, 18, 35, 22)
)
g2(feedback) |>
  mark_word_cloud() |>
  encode(text = 'text', value = 'value', color = 'text') |>
  titles('Dissertation Committee Feedback')

Demo: Sankey diagram (patient flow)

flow = data.frame(
  source = c('Screened', 'Screened', 'Eligible', 'Eligible',
             'Randomized', 'Randomized', 'Treatment', 'Placebo'),
  target = c('Eligible', 'Excluded', 'Randomized', 'Declined',
             'Treatment', 'Placebo', 'Completed', 'Completed'),
  value = c(200, 50, 150, 30, 60, 60, 55, 52)
)
g2(flow) |>
  mark_sankey(layout = list(nodeAlign = 'center', nodePadding = 0.03)) |>
  encode(source = 'source', target = 'target', value = 'value') |>
  titles('Clinical Trial Patient Flow (CONSORT-ish)')
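Before plotting a flow table like this, it can help to check how much enters and leaves each node. The base R sketch below is one way to do that. It repeats the `flow` data from the slide, and the helper names (`nodes`, `inflow`, `outflow`, `gap`) are hypothetical.

```r
# Same flow table as in the Sankey demo
flow = data.frame(
  source = c('Screened', 'Screened', 'Eligible', 'Eligible',
             'Randomized', 'Randomized', 'Treatment', 'Placebo'),
  target = c('Eligible', 'Excluded', 'Randomized', 'Declined',
             'Treatment', 'Placebo', 'Completed', 'Completed'),
  value = c(200, 50, 150, 30, 60, 60, 55, 52)
)

# Total flow into and out of every node (0 when a node has no such links)
nodes = union(flow$source, flow$target)
inflow  = sapply(nodes, function(n) sum(flow$value[flow$target == n]))
outflow = sapply(nodes, function(n) sum(flow$value[flow$source == n]))

# For intermediate nodes, inflow minus outflow is the number of
# participants that no outgoing link accounts for
intermediate = inflow > 0 & outflow > 0
gap = (inflow - outflow)[intermediate]
print(gap)
```

With the values above, the gaps are 20 at Eligible, 30 at Randomized, 5 at Treatment, and 8 at Placebo. If every participant should be visible in the diagram, one option is to add explicit rows for them, for example with a hypothetical 'Dropped out' target.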

Many more examples on the package site


When gglite falls short

For exploratory analysis, dashboards, HTML reports, and presentations? gglite shines. For Figure 2 in The New England Journal of Medicine? Probably stick with ggplot2 for now.


Part III

Conclusion

When “lighter” is actually better


When minimalism wins


When the full stack wins

The right tool depends on the job. I built litedown and gglite not to replace everything, but to offer a lighter path when the heavy machinery isn’t needed.


Take-home message

litedown:

install.packages('litedown')
litedown::roam()  # live preview (like a mini RStudio viewer)

gglite:

install.packages('gglite', repos = 'https://yihui.r-universe.dev')

Thank you!

https://yihui.org

Slides made with litedown (of course). Plots made with gglite (of course).