Scattered Thoughts - Epidemic modeling

Motivation for write-up

This is the 2nd part of a multi-part series blog post on modeling in epidemiology.

The COVID-19 pandemic has brought a lot of attention to the study of epidemiology and more specifically to the various mathematical models that are used to inform public health policies. Everyone has been trying to understand the growth or slowing of new cases and trying to predict the necessary sanitary resources. This blog post attempts to explain the foundations for some of the most used models and enlighten the reader on two key points.

After introducing the concepts of compartmentalization and disease dynamics in the first blog post, this second part is focused on developing a deterministic numerical solution for the SEIR model discussed there.

While normally the goal is to use real-world data to infer characteristics of the underlying disease (as will be done in later blog posts), here we want to use simulate the spread of a COVID-19 like disease in a population of 10000, and look at the effects of the different parameters on the spread.

Recall SEIR model equations

See the first blog post for derivation.

Continuous-time:
- \(\frac{ds(t)}{dt}=-\beta i(t) s(t)\)
- \(\frac{de(t)}{dt}=\beta i(t) s(t) - \sigma e(t)\)
- \(\frac{di(t)}{dt}=\sigma e(t) - \gamma i(t)\)
- \(\frac{dr(t)}{dt}=\gamma i(t)\)
Discrete-time:
- \(\Delta S = -\beta I S \Delta T\)
- \(\Delta E = (\beta I S -\sigma E) \Delta T\)
- \(\Delta I = (\sigma E - \gamma I) \Delta T\)
- \(\Delta R = \gamma I \Delta T\)

Numerical solution to this deteministic population level model

Coding the SEIR model

To build the SEIR model we simply use the discrete-time set of equations above.

The model will thus take as input the following:

Initial proportion of S, E, I, and R in the population
\(\beta\) parameter pertaining to the population in question
\(\sigma\) and \(\gamma\) parameters pertaining to the disease
Numbers of days to run the simulation

Code

# Import required libraries
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

import plotly
import plotly.io as pio
from IPython.display import display, HTML

## Tomas Mazak's workaround
plotly.offline.init_notebook_mode()
display(HTML(
    #'<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-MML-AM_SVG"></script>'
    '<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_SVG"></script>'
    '<script src="https://cdn.plot.ly/plotly-3.0.1.js" charset="utf-8"></script>'
))

pio.renderers.default = "plotly_mimetype+notebook_connected"

pio.templates.default = "plotly_dark"

Code

# Let's build a numerical solution

def seir_model(init, parms, days):
    S_0, E_0, I_0, R_0 = init
    Epd, Ipd, Rpd = [0], [0], [0]
    S, E, I, R = [S_0], [E_0], [I_0], [R_0]
    dt=0.1
    t = np.linspace(0,days,int(days/dt))
    sigma, beta, gam = parms
    for _ in t[1:]:
        next_S = S[-1] - beta*S[-1]*I[-1]*dt
        Epd.append(beta*S[-1]*I[-1]*dt)
        next_E = E[-1] + (beta*S[-1]*I[-1] - sigma*E[-1])*dt
        Ipd.append(sigma*E[-1]*dt)
        next_I = I[-1] + (sigma*E[-1] - gam*I[-1])*dt
        Rpd.append(gam*I[-1]*dt)
        next_R = R[-1] + (gam*I[-1])*dt
        S.append(next_S)
        E.append(next_E)
        I.append(next_I)
        R.append(next_R)
    return np.stack([S, E, I, R, Epd, Ipd, Rpd]).T

COVID-19 parameters

Simulation parameters used for plot below:

Days = 100
Population = 10000
Number of susceptible people on day 0 = 9999
Number of exposed people on day 0 = 1
No infected or recovered people on day 0

A lot of research is ongoing into the COVID-19 characteristics of \(\beta\), \(\sigma\), and \(\gamma\).

However, these are complex studies that require a lot of data and so far we have little information to go on.

The literature suggests the following:

\(\underline{T_{Incubation}}\):

The mean is 5-6 days but it can range anywhere from 2-14 days ¹ ²

Another paper reports a mean incubation period of 5.2 days and the 95th percentile at 12.5 days ³.

There are reports of pre-symptomatic infections⁴, but these are reportedly rare ⁵ so in the following models we will assume: \[T_{Incubation} = T_{Latent}\] And so: \[\sigma = \frac{1}{5.2} days^{-1}\]

\(\underline{T_{Infectious}}\):

Again it is very difficult to say for sure and the period of communicability is very uncertain for COVID-19.

Research suggests a median of 20 days of viral shedding after onset of symptoms ⁶.

Ranging from 8 to 37 days in survivors.

While it is noted PCR positivity does not necessarily reflect the infectious period (virus may not be viable but the PCR amplification will result in a positive), for the purpose of this blog post we will assume the following: \[T_{Infectious} = T_{Clinical}\] To obtain an exponential distribution with median M, the scale A is calculated as follows: \[A = \frac{M}{\ln2} = \frac{20}{\ln2}\] This results in \[\gamma = \frac{\ln2}{20} = \frac{1}{28.85}\ days^{-1}\] * \(\underline{Beta= \beta}\):

While difficult to estimate this parameter as there is a lot of variation between countries, cultures, societal norms, etc.. a little thought experiment can help us evaluate the value for \(\beta = r\rho\) in Switerland or France for example.

If no control measures are put in place and people do not change habits (as is the case in this blog post), we can expect the following:

Average number of contacts per day:

\[r = 10\ contacts\ per\ day\]

Average probability of transmission from contact:

\[\rho = 5\%\]

And so: \[\beta = r\rho = 0.5\]

Running the simulation

Code

#Define parameters
days = 200
N = 10000
init = 1 - 1/N, 1/N, 0, 0
sigma = 1/5.2
beta = 0.5
gam = 1/28.85
parms = sigma, beta, gam

# Run simulation
results_avg = seir_model(init, parms, days)

Code

fig = go.Figure(data=[       
    go.Scatter(name='S', x=np.linspace(0,days,days*10), y=results_avg.T[0], line={'dash':'solid', 'color':'blue'}),
    go.Scatter(name='E', x=np.linspace(0,days,days*10), y=results_avg.T[1], line={'dash':'solid', 'color':'yellow'}),
    go.Scatter(name='I', x=np.linspace(0,days,days*10), y=results_avg.T[2], line={'dash':'solid', 'color':'red'}), 
    go.Scatter(name='R', x=np.linspace(0,days,days*10), y=results_avg.T[3], line={'dash':'solid', 'color':'green'}),
])

fig.update_layout(
    xaxis_title = 'Day',
    yaxis_title = 'Proportion of population',
  title={
      'text':'Deterministic SEIR model - COVID-19 parameters',
      'x':0.5,
      'xanchor':'center'
    }
)
fig.update_layout(height=500, template="plotly_dark")

fig.show()

Qualitative analysis of \(\beta\), \(\sigma\), and \(\gamma\)

Effect of \(\sigma\) (\(T_{Latent}\))

Let’s have a look at the effect of \(\sigma\) (or inversely, the latent period) on the SEIR simulation.

A higher \(\sigma\) means shorter average latent period, and vice-versa.

Code

## Let's try to see how the model changes 
days = 1000
N = 10000
init = 1 - 1/N, 1/N, 0, 0
sigma_high = 1   # 1 --> Average 1 day from E --> I (ressembles SIR model)
sigma_low = 1/100 #10 days on average, twice as long as COVID-19
sigma_covid = 1/5.2
beta = 0.5
gam = 1/28.85
parms_fastEI = sigma_high, beta, gam
parms_slowEI = sigma_low, beta, gam
parms_avg = sigma_covid, beta, gam

# Run simulation
results_fastEtoI = seir_model(init, parms_fastEI, days)
results_slowEtoI = seir_model(init, parms_slowEI, days)
results_avg = seir_model(init, parms_avg, days)

Code

fig = go.Figure(data=[    
    go.Scatter(name=r'$S:\sigma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[0], line={'dash':'solid', 'color':'blue'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[1], line={'dash':'solid', 'color':'yellow'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[2], line={'dash':'solid', 'color':'red'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[3], line={'dash':'solid', 'color':'green'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{high}$', x=np.linspace(0,days,days*10), y=results_fastEtoI.T[0], line={'dash':'dash','color':'blue'}, legendgroup="high", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{high}$', x=np.linspace(0,days,days*10), y=results_fastEtoI.T[1], line={'dash':'dash', 'color':'yellow'}, legendgroup="high", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{high}$', x=np.linspace(0,days,days*10), y=results_fastEtoI.T[2], line={'dash':'dash', 'color':'red'}, legendgroup="high", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{high}$', x=np.linspace(0,days,days*10), y=results_fastEtoI.T[3], line={'dash':'dash', 'color':'green'}, legendgroup="high", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{low}$', x=np.linspace(0,days,days*10), y=results_slowEtoI.T[0], line={'dash':'dot', 'color':'blue'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{low}$', x=np.linspace(0,days,days*10), y=results_slowEtoI.T[1], line={'dash':'dot', 'color':'yellow'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{low}$', x=np.linspace(0,days,days*10), y=results_slowEtoI.T[2], line={'dash':'dot', 'color':'red'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\sigma_{low}$', x=np.linspace(0,days,days*10), y=results_slowEtoI.T[3], line={'dash':'dot', 'color':'green'}, legendgroup="slow", hoverinfo='x+y'),
])

fig.update_layout(
    template='plotly_dark',
    height=500,
    xaxis_title = 'Day',
    yaxis_title = 'Proportion of population',
    title={
        'text':r'$\text{Effect of } \sigma \ \text{on Deterministic SEIR model}$',
        #'text':'Effect of sigma on Deterministic SEIR model',
        'x':0.5,
        'xanchor':'center'
    }
)

fig.show()

We notice a few things from the plot above on the impact of the average time from E → I:

The shorter the latent period:
- the faster the epidemic propogates in the population
- the higher the peak of infected individuals will be (meaning a higher chance hospital resources will be saturated)
However, the latent period has no impact on the total number of individuals infected over the entire time of the epidemic.

Effect of \(\beta = r~\rho\)

Let’s have a look at the effect of \(\beta\) on the SEIR simulation.

A higher \(\beta\) can either mean a higher average number of contacts per day (\(r\)) in the population and/or a higher probability of transmission of disease from I → S.

The opposite holds also.

Code

## Let's try to see how the model changes 
days = 500
N = 10000
init = 1 - 1/N, 1/N, 0, 0
sigma_avg = 1/5.2
beta_avg = 0.5
beta_noepi = 1/30
beta_low = 0.1
beta_high = 4
gam = 1/28.85
parms_avg = sigma_avg, beta_avg, gam
parms_noepi = sigma_avg, beta_noepi, gam 
parms_low = sigma_avg, beta_low, gam
parms_high = sigma_avg, beta_high, gam

# Run simulation
results_avg = seir_model(init, parms_avg, days)
results_noepi = seir_model(init, parms_noepi, days)
results_low = seir_model(init, parms_low, days)
results_high = seir_model(init, parms_high, days)

Code

fig = go.Figure(data=[    
    go.Scatter(name=r'$S:\beta_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[0], line={'dash':'solid', 'color':'blue'}, legendgroup="COVID"),
    go.Scatter(name=r'$E:\beta_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[1], line={'dash':'solid', 'color':'yellow'}, legendgroup="COVID"), 
    go.Scatter(name=r'$I:\beta_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[2], line={'dash':'solid', 'color':'red'}, legendgroup="COVID"),
    go.Scatter(name=r'$R:\beta_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[3], line={'dash':'solid', 'color':'green'}, legendgroup="COVID"),
    go.Scatter(name=r'$S:\beta_{noepi}$', x=np.linspace(0,days,days*10), y=results_noepi.T[0], line={'dash':'dashdot','color':'blue'}, legendgroup="noepi"),
    go.Scatter(name=r'$E:\beta_{noepi}$', x=np.linspace(0,days,days*10), y=results_noepi.T[1], line={'dash':'dashdot', 'color':'yellow'}, legendgroup="noepi"),
    go.Scatter(name=r'$I:\beta_{noepi}$', x=np.linspace(0,days,days*10), y=results_noepi.T[2], line={'dash':'dashdot', 'color':'red'}, legendgroup="noepi"),
    go.Scatter(name=r'$R:\beta_{noepi}$', x=np.linspace(0,days,days*10), y=results_noepi.T[3], line={'dash':'dashdot', 'color':'green'}, legendgroup="noepi"),
    go.Scatter(name=r'$S:\beta_{low}$', x=np.linspace(0,days,days*10), y=results_low.T[0], line={'dash':'dash','color':'blue'}, legendgroup="low"),
    go.Scatter(name=r'$E:\beta_{low}$', x=np.linspace(0,days,days*10), y=results_low.T[1], line={'dash':'dash', 'color':'yellow'}, legendgroup="low"),
    go.Scatter(name=r'$I:\beta_{low}$', x=np.linspace(0,days,days*10), y=results_low.T[2], line={'dash':'dash', 'color':'red'}, legendgroup="low"),
    go.Scatter(name=r'$R:\beta_{low}$', x=np.linspace(0,days,days*10), y=results_low.T[3], line={'dash':'dash', 'color':'green'}, legendgroup="low"),
    go.Scatter(name=r'$S:\beta_{high}$', x=np.linspace(0,days,days*10), y=results_high.T[0], line={'dash':'dot', 'color':'blue'}, legendgroup="high"),
    go.Scatter(name=r'$E:\beta_{high}$', x=np.linspace(0,days,days*10), y=results_high.T[1], line={'dash':'dot', 'color':'yellow'}, legendgroup="high"),
    go.Scatter(name=r'$I:\beta_{high}$', x=np.linspace(0,days,days*10), y=results_high.T[2], line={'dash':'dot', 'color':'red'}, legendgroup="high"),
    go.Scatter(name=r'$R:\beta_{high}$', x=np.linspace(0,days,days*10), y=results_high.T[3], line={'dash':'dot', 'color':'green'}, legendgroup="high"),
])

fig.update_layout(
    #template='ggplot2',
    height=500,
    xaxis_title = 'Day',
    yaxis_title = 'Proportion of population',
    title={
        'text':r'$\text{Effect of } \beta \ \text{on Deterministic SEIR model}$',
        'x':0.5,
        'xanchor':'center'
    }
)

fig.show()

We notice a few things from the plot above on the impact of \(\beta\):

The higher \(\beta\) is:
- the faster the epidemic seems to propogate in the population
- the higher the peak of infected individuals seems to be (meaning a higher chance hospital resources will be saturated)
\(\beta\) also appears to affect the overall number of people infected over the course of the epidemic
A low \(\beta\) means a low \(R_0\) and we have seen in the first part of this blog that no epidemic occurs when \(R_0 < 1\)
But even if \(R_0 > 1\), keeping \(\beta\) low reduces the total number of people infected

Effect of \(\gamma\) (\(T_{Infectious}\))

Let’s have a look at the effect of \(\gamma\) on the SEIR simulation.

A higher \(\gamma\) means a shorter infectious period, and vice-versa.

Code

## Let's try to see how the model changes 
days = 500
N = 10000
init = 1 - 1/N, 1/N, 0, 0
sigma_avg = 1/5.2
beta = 0.5
gam_avg = 1/28.85
gam_low = 1/200
gam_high = 0.2
parms_fastIR = sigma_avg, beta, gam_high
parms_slowIR = sigma_avg, beta, gam_low
parms_avg = sigma_avg, beta, gam_avg

# Run simulation
results_fastItoR = seir_model(init, parms_fastIR, days)
results_slowItoR = seir_model(init, parms_slowIR, days)
results_avg = seir_model(init, parms_avg, days)

Code

fig = go.Figure(data=[    
    go.Scatter(name=r'$S:\gamma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[0], line={'dash':'solid', 'color':'blue'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[1], line={'dash':'solid', 'color':'yellow'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[2], line={'dash':'solid', 'color':'red'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{COVID}$', x=np.linspace(0,days,days*10), y=results_avg.T[3], line={'dash':'solid', 'color':'green'}, legendgroup="COVID", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{high}$', x=np.linspace(0,days,days*10), y=results_fastItoR.T[0], line={'dash':'dash','color':'blue'}, legendgroup="fast", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{high}$', x=np.linspace(0,days,days*10), y=results_fastItoR.T[1], line={'dash':'dash', 'color':'yellow'}, legendgroup="fast", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{high}$', x=np.linspace(0,days,days*10), y=results_fastItoR.T[2], line={'dash':'dash', 'color':'red'}, legendgroup="fast", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{high}$', x=np.linspace(0,days,days*10), y=results_fastItoR.T[3], line={'dash':'dash', 'color':'green'}, legendgroup="fast", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{low}$', x=np.linspace(0,days,days*10), y=results_slowItoR.T[0], line={'dash':'dot', 'color':'blue'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{low}$', x=np.linspace(0,days,days*10), y=results_slowItoR.T[1], line={'dash':'dot', 'color':'yellow'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{low}$', x=np.linspace(0,days,days*10), y=results_slowItoR.T[2], line={'dash':'dot', 'color':'red'}, legendgroup="slow", hoverinfo='x+y'),
    go.Scatter(name=r'$S:\gamma_{low}$', x=np.linspace(0,days,days*10), y=results_slowItoR.T[3], line={'dash':'dot', 'color':'green'}, legendgroup="slow", hoverinfo='x+y'),
])

fig.update_layout(
    #template='ggplot2',
    height=500,
    xaxis_title = 'Day',
    yaxis_title = 'Proportion of population',
    title={
        'text':r'$\text{Effect of } \gamma \ \text{on Deterministic SEIR model}$',
        'x':0.5,
        'xanchor':'center'
    }
)

fig.show()

We notice a few things from the plot above on the impact of the infectious period:

The longer the infectious period:
- the faster the epidemic propogates in the population
- the higher the peak of infectious individuals will be and the longer it will last (meaning a higher chance hospital resources will be saturated)
As opposed to the latent period above, but similarly as \(\beta\), the infectious period has an impact on the total number of individuals infected over the entire time of the epidemic
With no epidemic if \(\gamma > \beta\)

Discussion

So we can see the latent and infectious periods, along with the value of \(\beta\) are critical components in how the model will react.

Worth noting also is that the higher \(R_0\) is, the faster the epidemic spreads and the higher the peak of infectious individuals will be (see further blog posts for some nuance on this).

Notably, and as predicted in part 1 of the blog series, no epidemic occurs if: \[R_0 < 1\] In other words, no epidemic if: \[\beta < \gamma\]

There are major flaws with this model however. While this model is deterministic and uses average time to model \(\sigma\) and \(\gamma\), this is a major flaw and does not represent the reality for most diseases.

Part 3 of this blog series will discuss this further.

Epidemic modeling - Part 2

Motivation for write-up

Recall SEIR model equations

Numerical solution to this deteministic population level model

Coding the SEIR model

COVID-19 parameters

Running the simulation

Qualitative analysis of \(\beta\), \(\sigma\), and \(\gamma\)

Effect of \(\sigma\) (\(T_{Latent}\))

Effect of \(\beta = r~\rho\)

Effect of \(\gamma\) (\(T_{Infectious}\))

Discussion

Footnotes