Make Shiny Fast

…by doing as little work as possible

Alan Dipert (@alandipert)

February 2, 2018

Agenda

  1. Introduce methodology
  2. Learn to measure and analyze with Rprof & profvis
  3. CRAN Explorer optimization tour

Optimization Loop Method

Benchmark

What’s in a benchmark?

  1. Model: Representative user actions
  2. Metrics: Latencies experienced by model user

Model example

Reserving flights

😱

Results took > 20 seconds!

It’s OK.

  • Users expect to wait, UI confirms expectation
  • It’s Fast Enough™

Benchmarking in practice

Best done casually!

  • Fast Enough is easy to see
  • Only when it’s not Fast Enough must we Analyze

Analyze

Analysis

  1. Exercise model to produce metric data
  2. Identify the one slowest thing

Optimizing the slowest thing gives the highest payoff

Rprof and profvis

  • “Feels slow” usually means R is busy
  • Rprof: sample what R is doing
    • Computing (ggplot2, dplyr)
    • Waiting (database, network, disk)
  • profvis: visualize Rprof output
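
A minimal sketch of the sampling step, using only base R's Rprof() and summaryRprof(). The slow_step() workload is hypothetical and stands in for whatever the app does when it feels slow. profvis draws the same kind of samples as an interactive graph.

```r
# Hypothetical workload: keeps R busy for roughly `seconds` of wall-clock time.
slow_step <- function(seconds = 0.5) {
  start <- proc.time()[["elapsed"]]
  total <- 0
  while (proc.time()[["elapsed"]] - start < seconds) {
    total <- total + sum(sqrt(runif(1e4)))
  }
  total
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling the call stack every 10 ms
invisible(slow_step())
Rprof(NULL)                        # stop sampling

prof <- summaryRprof(prof_file)
head(prof$by.self)  # functions ranked by time spent in their own code
```

The by.self table is a reasonable first place to look for the one slowest thing.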

The call stack

Traceback
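
To make the call stack concrete, here is a small sketch with three hypothetical functions, f, g and h, that call each other. It captures the stack from the innermost one using base R's sys.calls().

```r
# Three hypothetical functions that call each other: f -> g -> h.
h <- function() {
  calls <- sys.calls()  # every call that is active right now, outermost first
  as.list(calls)
}
g <- function() h()
f <- function() g()

stack <- f()

# Keep the last three frames, since the caller (e.g. source()) may add its own.
frames <- tail(stack, 3)
frame_names <- vapply(frames, function(cl) as.character(cl[[1]]), character(1))
print(frame_names)
```

Rprof records a snapshot like this at every sampling interval, which is what a "call stack over time" view is built from.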

Call stack over time

🤔

profvis in action

Short profvis Demo

example_apps/profvis_demo

In Practice

CRAN explorer

Optimizing CRAN explorer

Organization

cran_explorer/
├── app.R
├── deps.csv
├── packages.csv
├── plot_cache.R
└── utils.R

  • app.R: Shiny app
  • deps.csv, packages.csv: data
  • plot_cache.R: Disk-based plot cache
  • utils.R: Download, prepare .csv files

Architecture

  • utils.R for downloading .csv files
  • Data loaded as global reactiveVals on app.R startup
  • dplyr used to search, filter
  • ggplot2 used for plots

Optimization #1: Pre-process

  • Didn’t download from METACRAN every time
  • Winston’s experience saved time
  • Rule of thumb: if the data is big, pre-process
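
A sketch of the pre-processing idea in base R: do the expensive work once, offline, and have the app read only the prepared file. The prepare_data() helper, the file name, and the use of mtcars as stand-in data are all assumptions for illustration, not the code in utils.R.

```r
# Offline step (run once, e.g. from a preparation script): do the expensive
# work and save a small, ready-to-use file. mtcars stands in for the raw data.
prepared_path <- file.path(tempdir(), "prepared.csv")

prepare_data <- function(raw, path) {
  summary_df <- aggregate(mpg ~ cyl, data = raw, FUN = mean)
  write.csv(summary_df, path, row.names = FALSE)
  invisible(path)
}

prepare_data(mtcars, prepared_path)

# App startup: only read the prepared file; no download or aggregation here.
app_data <- read.csv(prepared_path)
print(app_data)
```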

Optimization #2: Beware dplyr::group_by()

group_by() takes an existing tbl and converts it into a grouped tbl where operations are performed “by group”.

group_by() example

filter() after group_by() Slowdown

  • First filter applied only to mtcars
  • Second filter applied to each group
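
When the filter condition does not depend on the group, the per-group work can be avoided by filtering first. A sketch, assuming dplyr is installed and using the built-in mtcars data. The mpg > 20 condition is illustrative only and is not taken from the app.

```r
library(dplyr)

# Filter after grouping: the condition is applied to each group.
filter_after <- mtcars %>%
  group_by(cyl) %>%
  filter(mpg > 20)

# Filter before grouping: the condition is applied once, to the whole table.
filter_before <- mtcars %>%
  filter(mpg > 20) %>%
  group_by(cyl)

# Same rows either way, because `mpg > 20` does not use any per-group value.
identical(filter_after$mpg, filter_before$mpg)
```

A condition such as mpg > mean(mpg) is different: after group_by() the mean is taken within each group, so the two orderings are no longer interchangeable.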

Offending reactive

app.R at 0f7560

Optimization #3: CSVs read faster than RDS

expr                        mean
read_csv("packages.csv")    661.4826
readRDS("packages.rds")     851.1554
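
A way to run this kind of comparison on your own data, sketched with base R only. It uses read.csv() rather than read_csv(), and system.time() rather than a benchmarking package, so it is not how the numbers above were produced. Which format wins depends on the data, the reader, and compression (saveRDS() compresses by default).

```r
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

# mtcars is a small stand-in; real differences only show up on realistic data.
write.csv(mtcars, csv_path, row.names = FALSE)
saveRDS(mtcars, rds_path)

# Time 100 reads of each format.
csv_time <- system.time(for (i in 1:100) from_csv <- read.csv(csv_path))
rds_time <- system.time(for (i in 1:100) from_rds <- readRDS(rds_path))

# Elapsed wall-clock seconds for each format.
print(c(csv = csv_time[["elapsed"]], rds = rds_time[["elapsed"]]))
```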

Sidenote: scopes

  • R process-global (top-level)
  • Per-session (inside server function)
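
The two scopes can be illustrated without Shiny. In this plain-R stand-in, the hypothetical load_data() counts how often it runs, and server() is called by hand to simulate sessions connecting. In a real app, Shiny calls the server function once per session.

```r
load_count <- 0

# Hypothetical loader that counts how often it runs.
load_data <- function() {
  load_count <<- load_count + 1
  mtcars
}

# Process-global scope: runs once, when the app file is first evaluated.
shared_data <- load_data()

# Per-session scope: the server function body runs once per session.
server <- function(input, output, session) {
  session_data <- load_data()  # repeated for every session
  nrow(shared_data) + nrow(session_data)
}

# Simulate three sessions connecting.
for (i in 1:3) server(NULL, NULL, NULL)

print(load_count)  # 1 global load + 3 per-session loads
```

Data that every session shares is usually cheaper to load at the top level, where it is read once per R process.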

app.R at 698b8fc

Optimization #4: Plot caching

  • plotCache: read-through cache for plots
  • Coming soon to Shiny
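
A minimal sketch of the read-through idea in base R. This is not the plotCache API. The make_read_through_cache() and render_plot() names are hypothetical, the cache lives in memory rather than on disk, and a string stands in for a rendered plot.

```r
# Build a read-through cache around `render`, a function of a single key.
make_read_through_cache <- function(render) {
  store <- new.env()
  function(key) {
    if (!exists(key, envir = store, inherits = FALSE)) {
      assign(key, render(key), envir = store)  # miss: render and remember
    }
    get(key, envir = store, inherits = FALSE)  # hit (or freshly stored value)
  }
}

render_count <- 0

# Hypothetical expensive renderer; returns a string instead of a real plot.
render_plot <- function(key) {
  render_count <<- render_count + 1
  paste("plot for", key)
}

cached_plot <- make_read_through_cache(render_plot)

cached_plot("ggplot2")  # miss: calls render_plot()
cached_plot("ggplot2")  # hit: served from the cache
cached_plot("dplyr")    # miss: a different key

print(render_count)
```

Callers only ever ask the cache, which decides whether to render, so repeated requests for the same key cost one render.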

Thank you!

https://twitter.com/alandipert

https://github.com/alandipert/rstudio-conf-2017-shiny-perf