ggdist and Epidemic Curves

library(ggplot2)
library(dplyr)
library(tidyr)
library(ggdist)
library(odin)
theme_set(theme_minimal())

This little post is based on a recent paper by Juul et al. (2020). For infectious disease modelling we are often running many scenarios through the models. Why? Because there are many parameters to estimate, but very few things that we actually observe (cases by day or incidence) so in some sense our models all suffer from identifiability issues. Additionally, many of the parameters that we do measure have some uncertainty around them. For this reason it is very common to run models with different parameters estimates or distributions in order to understand a range of possible values. A simple way to summarise these models is through point-wise quantiles through time. The genius of Juul’s paper is that this can underestimate many outcomes because it might pick up the beginning of one epidemic curve and miss the peak.

Enter curve estimates. These measures actually look at the curves proper through the trajectory rather than at discrete time points. Here we can rank the entire curve and capture the contribution of the curve rather than the point-wise estimate.

To see this let’s run a simple example from the odin package (FitzJohn 2019). Additionally, the amazing ggdist (Kay 2020) provides an easy way to summarise curves.

sir_model <- system.file("examples/discrete_stochastic_sir_arrays.R", package = "odin")

writeLines(readLines(sir_model))
## Core equations for transitions between compartments:
update(S[]) <- S[i] - n_SI[i]
update(I[]) <- I[i] + n_SI[i] - n_IR[i]
update(R[]) <- R[i] + n_IR[i]

## Individual probabilities of transition:
p_SI[] <- 1 - exp(-beta * I[i] / N[i])
p_IR <- 1 - exp(-gamma)

## Draws from binomial distributions for numbers changing between
## compartments:
n_SI[] <- rbinom(S[i], p_SI[i])
n_IR[] <- rbinom(I[i], p_IR)

## Total population size
N[] <- S[i] + I[i] + R[i]

## Initial states:
initial(S[]) <- S_ini
initial(I[]) <- I_ini
initial(R[]) <- 0

## User defined parameters - default in parentheses:
S_ini <- user(1000)
I_ini <- user(1)
beta <- user(0.2)
gamma <- user(0.1)

## Number of replicates
nsim <- user(100)
dim(N) <- nsim
dim(S) <- nsim
dim(I) <- nsim
dim(R) <- nsim
dim(p_SI) <- nsim
dim(n_SI) <- nsim
dim(n_IR) <- nsim
x <- odin(sir_model)
── R CMD INSTALL ───────────────────────────────────────────────────────────────
* installing *source* package ‘discrete.stochastic.sir.arraysb3edd559’ ...
** using staged installation
** libs
/usr/local/opt/llvm/bin/clang  -I"/Library/Frameworks/R.framework/Versions/4.1/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -c odin.c -o odin.o
In file included from odin.c:2:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/R.h:76:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Memory.h:48:17: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   48 | int     R_gc_running();
      |                     ^
      |                      void
In file included from odin.c:2:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/R.h:78:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Random.h:61:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   61 | Sampletype R_sample_kind();
      |                         ^
      |                          void
In file included from odin.c:4:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:51:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Rdynload.h:38:26: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   38 | typedef void * (*DL_FUNC)();
      |                          ^
      |                           void
In file included from odin.c:4:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1228:21: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1228 | SEXP R_GetCurrentEnv();
      |                     ^
      |                      void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1229:26: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1229 | const char *R_curErrorBuf();
      |                          ^
      |                           void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1306:19: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1306 | void R_init_altrep();
      |                   ^
      |                    void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1321:22: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1321 | SEXP R_MakeUnwindCont();
      |                      ^
      |                       void
7 warnings generated.
/usr/local/opt/llvm/bin/clang  -I"/Library/Frameworks/R.framework/Versions/4.1/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -UNDEBUG -Wall -pedantic -g -O0 -c registration.c -o registration.o
In file included from registration.c:1:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/R.h:76:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Memory.h:48:17: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   48 | int     R_gc_running();
      |                     ^
      |                      void
In file included from registration.c:1:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/R.h:78:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Random.h:61:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   61 | Sampletype R_sample_kind();
      |                         ^
      |                          void
In file included from registration.c:2:
In file included from /Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:51:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/R_ext/Rdynload.h:38:26: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
   38 | typedef void * (*DL_FUNC)();
      |                          ^
      |                           void
In file included from registration.c:2:
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1228:21: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1228 | SEXP R_GetCurrentEnv();
      |                     ^
      |                      void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1229:26: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1229 | const char *R_curErrorBuf();
      |                          ^
      |                           void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1306:19: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1306 | void R_init_altrep();
      |                   ^
      |                    void
/Library/Frameworks/R.framework/Versions/4.1/Resources/include/Rinternals.h:1321:22: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
 1321 | SEXP R_MakeUnwindCont();
      |                      ^
      |                       void
7 warnings generated.
/usr/local/opt/llvm/bin/clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Versions/4.1/Resources/lib -L/usr/local/lib -o discrete.stochastic.sir.arraysb3edd559.so odin.o registration.o -F/Library/Frameworks/R.framework/Versions/4.1 -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/var/folders/0x/6bnjy4n15kz8nbfbbwbyk3pr0000gn/T/Rtmpo81zqq/devtools_install_12a5098da5a2/00LOCK-file12a5072af16eb/00new/discrete.stochastic.sir.arraysb3edd559/libs
** checking absolute paths in shared objects and dynamic libraries
* DONE (discrete.stochastic.sir.arraysb3edd559)
fit <- x()

sampz <- as.data.frame(fit$run(step = 100))

Now that we have values, a simple analysis would look like the following:

val_long <- sampz %>%
  select(step, contains("I")) %>%
  tidyr::gather(key, value, -step) %>%
  mutate(.draw = readr::parse_number(stringr::str_extract(key,"\newlined+")))

sumz_point <- val_long %>%
  group_by(step) %>%
  summarise(q05 = quantile(value, probs = .05),
            q95 = quantile(value, probs = .95))

val_long %>%
  ggplot(aes(step, y = value, group = .draw))+
  geom_line(alpha = .2)+
  geom_ribbon(data = sumz_point,
              aes(x = step, ymin = q05, ymax = q95),
              inherit.aes = FALSE, fill = "orange", alpha =.2)+
  labs(
    title = "Simulated SIR Curve for Infections"
  )

If we take the curve distribution approach we can see the following

p <- val_long %>%
  group_by(step) %>%
  curve_interval(value, .width = c(.5, .8, .95)) %>%
  ggplot(aes(x = step, y = value)) +
  geom_hline(yintercept = 1, color = "gray75", linetype = "dashed") +
  geom_lineribbon(aes(ymin = .lower, ymax = .upper)) +
  scale_fill_brewer() +
  labs(
    title = "Simulated SIR Curve for Infections"
  )

p

Now when we lay these ontop of one another, we see the issue with using the time interval method:

p+
  geom_ribbon(data = sumz_point,
              aes(x = step, ymin = q05, ymax = q95),
              inherit.aes = FALSE, fill = "orange", alpha =.5)

We see that using the fixed time method we underestimate the peak much of the time! This is very dangerous and important when setting policy and establishing needs.

FitzJohn, Rich. 2019. Odin: ODE Generation and Integration. https://CRAN.R-project.org/package=odin.

Juul, Jonas L., Kaare Græsbøll, Lasse Engbo Christiansen, and Sune Lehmann. 2020. “Fixed-Time Descriptive Statistics Underestimate Extremes of Epidemic Curve Ensembles.” arXiv:2007.05035 [Physics, q-Bio, Stat], July. http://arxiv.org/abs/2007.05035.

Kay, Matthew. 2020. ggdist: Visualizations of Distributions and Uncertainty. https://doi.org/10.5281/zenodo.3879620.


Citation

BibTex citation:

@online{dewitt2020
author = {Michael E. DeWitt},
title = {ggdist and Epidemic Curves},
date = 2020-08-09,
url = {https://michaeldewittjr.com/articles/2020-08-09-ggdist-and-epidemic-curves},
langid = {en}
}

For attribution, please cite this work as:

Michael E. DeWitt. 2020. "ggdist and Epidemic Curves." August 9, 2020. https://michaeldewittjr.com/articles/2020-08-09-ggdist-and-epidemic-curves