HIV modeling

Building a new stochastic SEIR model to deal with probability distributions
probability distributions
modeling
SEIR
epidemiology
stochastic
HIV
AIDS
Author

Jeffrey Post

Published

May 7, 2021

Motivation for write-up

This is the 4th part of a multi-part series blog post on modeling in epidemiology.

The COVID-19 pandemic has brought a lot of attention to study of epidemiology and more specifically to the various mathematical models that are used to inform public health policies. Everyone has been trying to understand the growth or slowing of new cases and trying to predict the necessary sanitary resources. This blog post attempts to explain the foundations for some of the most used models and enlighten the reader on two key points.

After introducing the concepts of compartmentalization and disease dynamics in the first blog post, the second part looked at a deterministic numerical solution for the SEIR model discussed, and the effects of the parameters \(\beta\), \(\sigma\), and \(\gamma\) in parts 1 and 2.

Part 3 made the argument that most models ignore individual-level disease dynamics in favor of averaging population-level \(\sigma\) and \(\gamma\) parameters and showed some big discrepancies between actual COVID-19 probability distributions for those parameters and those used in research.

This 4th part is where I build a numerical SEIR model that takes into account these probability distributions in order to tweak the model as close to COVID-19 data as possible.

Building a stochastic model

As opposed to the deterministic model from Part 2, this model is going to focus on individual level disease dynamics to model the disease propagation.

The basic idea of this model is to have a dataframe with the number of rows equal to the population size (each individual is a row) and two columns: * State column to describe the state of each individual (S, E, I, or R) * Day column to save the day of transition of the individual into that state

However, the population-level rates of transmission still apply here i.e. a person goes from S → E following three points: 1. the number of contacts the person has per unit time (given by \(r\)) 2. the chance a given contact is with an I - infectious individual (the higher thenumber of I, the higher the chance) 3. the chance of an S contracting the disease from a contact with an I (given by \(\rho\))

This is done stochastically.

Once a person becomes E, their progression is unique to them. This progression is calculated in advance for computational reason, but it allows to use the time ditributions we want.

#collapse_hide
!pip install plotly==4.14.3
import pandas as pd
import numpy as np
import math
import plotly.graph_objects as go
import plotly.express as px
from scipy.stats import expon
from scipy.stats import gamma
from scipy.stats import weibull_min
from numpy.random import default_rng
rng = default_rng()

# Let's build a numerical solution
def seir_model(init, parms, days):
    S_0, E_0, I_0, R_0 = init
    Epd, Ipd, Rpd = [0], [0], [0]
    S, E, I, R = [S_0], [E_0], [I_0], [R_0]
    dt=0.1
    t = np.linspace(0,days,int(days/dt))
    sigma, beta, gam = parms
    for _ in t[1:]:
        next_S = S[-1] - beta*S[-1]*I[-1]*dt
        Epd.append(beta*S[-1]*I[-1]*dt)
        next_E = E[-1] + (beta*S[-1]*I[-1] - sigma*E[-1])*dt
        Ipd.append(sigma*E[-1]*dt)
        next_I = I[-1] + (sigma*E[-1] - gam*I[-1])*dt
        Rpd.append(gam*I[-1]*dt)
        next_R = R[-1] + (gam*I[-1])*dt
        S.append(next_S)
        E.append(next_E)
        I.append(next_I)
        R.append(next_R)
    return np.stack([S, E, I, R, Epd, Ipd, Rpd]).T
Collecting plotly==4.14.3
  Downloading https://files.pythonhosted.org/packages/1f/f6/bd3c17c8003b6641df1228e80e1acac97ed8402635e46c2571f8e1ef63af/plotly-4.14.3-py2.py3-none-any.whl (13.2MB)
     |████████████████████████████████| 13.2MB 303kB/s 
Requirement already satisfied: retrying>=1.3.3 in /usr/local/lib/python3.7/dist-packages (from plotly==4.14.3) (1.3.3)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from plotly==4.14.3) (1.15.0)
Installing collected packages: plotly
  Found existing installation: plotly 4.4.1
    Uninstalling plotly-4.4.1:
      Successfully uninstalled plotly-4.4.1
Successfully installed plotly-4.14.3

Creating the initial population dataframe

Below is a function to create the initial population dataframe: * \(p\) is the population number * \(num_E\) is the number of people exposed on day 0 * \(num_I\) is the number of infectious on day 0 * \(num_R\) is the number of people recovered on day 0

#collapse_hide
# Need this new function for model below:
def make_df(p, num_I, num_R):
  df = pd.DataFrame(np.full((p,1), 'S').T[0], columns=['State'])
  df['Year'] = 0
  df['Age'] = (np.random.random(p)*35+15).astype(int)
  tochange=df.loc[rng.choice(p, size=num_I+num_R, replace=False),'State'].index
  df.loc[tochange[0:num_I],'State'] = 'I'
  df.loc[tochange[num_I:num_I+num_R],'State'] = 'R'
  return df

Building the model

#np.random.random(size=(p,days))
#np.log(4)
j=11
over = 10
#10/np.cumsum(np.ones(100))
0.05 + (0.3/np.exp((j+1-over)/10))
0.29561922592339457
#collapse_hide
def seir_model_stoch(beta, beta2, p, num_I, num_R, years, T_Infectious, ART, control):

    # Initialize population dataframe with data given by user
    df = make_df(p, num_I, num_R)
    
    # This variable is used to track daily value of beta if it varies over time
    xxbeta=np.array([],dtype=float)

    # Initialize the arrays to return
    # Below are numbers of S, I, R total
    S=np.array([],dtype=int)
    I=np.array([],dtype=int)
    R=np.array([],dtype=int)
    # Below are the daily additions in S, I, R
    Spd=np.array([],dtype=int)
    Ipd=np.array([],dtype=int)
    Rpd=np.array([],dtype=int)

    b=beta
    #b2=beta[0]
    b2=np.array([],dtype=float)
    b1=b

    # signal diminshing beta
    over = 0 

    # signal end of deaths due to ART
    art1 = 0
    art2 = 0

    
    # Stochastic model so use random values to decide on progression
    rand = np.random.random(size=(p,years))

    # Depending if you want exponential, gamma, or Weibull distribution for T_Infectious
    # Uses distributions found on blog part 3
    if T_Infectious == 'expon':
      ItoR = expon.rvs(loc=0,scale=10,size=p)
    elif T_Infectious == 'gamma':
      ItoR = gamma.rvs(4,loc=3,scale=2,size=p)    
    else:
      ItoR = weibull_min.rvs(2.3, loc=2, scale=20.11, size=p)

    # Iterate over every day the simulation is run
    for j in range(0,years-1):

        # Record daily beta values
        xxbeta=np.append(xxbeta, b[j])

        # First we get the index of the individuals that will change state today:

        # Random number tells you which 'S' have been exposed on this day 
        #StoE_index = df.loc[(df.State == 'S') & (rand[:,j] < b[j]*len(np.where(df.State=='I')[0])/p)].index
        if ART < 2:
          StoI_index = df.loc[(df.State == 'S') & (df.Age < 49) & (rand[:,j] < b[j]*len(np.where(df.State=='I')[0])/(len(np.where(df.State=='I')[0])+len(np.where(df.State=='S')[0])))].index
          StoS_index = df.loc[(df.State == 'S') & (df.Age < 49) & (rand[:,j] < b[j]*len(np.where(df.State=='I')[0])/(len(np.where(df.State=='I')[0])+len(np.where(df.State=='S')[0])))].index
        elif ART == 2:
          if art2 == 0:
            StoI_index = df.loc[(df.State == 'S') & (df.Age < 49) & (rand[:,j] < b[j]*len(np.where(df.State=='I')[0])/(len(np.where(df.State=='I')[0])+len(np.where(df.State=='S')[0])))].index
            StoS_index = df.loc[(df.State == 'S') & (df.Age < 49) & (rand[:,j] < b[j]*len(np.where(df.State=='I')[0])/(len(np.where(df.State=='I')[0])+len(np.where(df.State=='S')[0])))].index
          elif art2 == 1:
            StoI_index = df.loc[(df.State == 'S') & (df.Age > 55)].index
            StoS_index = df.loc[(df.State == 'S') & (df.Age < 49)].index

        StoRem_index = df.loc[(df.State == 'S') & (df.Age == 49)].index

        # For each row, if a person has been a certain number of days in E, they will go to I
        # This follows EtoI variable which is either exponential or gamma distributed according to above
        #EtoI_index = df.loc[(df.State == 'E') & (j-df.Day >= EtoI)].index
        
        # Similaraly as above
        # For each row, if a person has been a certain number of days in I, they will go to R
        # This follows EtoI variable which is either exponential or gamma distributed according to above
        ItoRem_index = df.loc[(df.State == 'I') & (df.Age == 49)].index
        if ART == 0: #don't use ART
          ItoR_index = df.loc[(df.State == 'I') & (j-df.Year >= ItoR) & (df.Age < 49)].index
          ItoI_index = df.loc[(df.State == 'I') & (j-df.Year < ItoR) & (df.Age < 49)].index
        elif ART > 0:
          if art2 == 0:
            ItoR_index = df.loc[(df.State == 'I') & (j-df.Year >= ItoR) & (df.Age < 49)].index
            ItoI_index = df.loc[(df.State == 'I') & (j-df.Year < ItoR) & (df.Age < 49)].index
          elif art2 ==1:
            ItoR_index = df.loc[(df.State == 'I') & (df.Age > 49)].index
            ItoI_index = df.loc[(df.State == 'I') & (df.Age < 49)].index            

        RtoRem_index = df.loc[(df.State == 'R') & (df.Age == 49)].index
        RtoR_index = df.loc[(df.State == 'R') & (df.Age < 49)].index

        # Use indexes collected above to populate per day values
        #Epd = np.append(Epd,len(StoE_index))
        #Ipd = np.append(Ipd,len(EtoI_index))
        Ipd = np.append(Ipd,len(StoI_index))
        Rpd = np.append(Rpd,len(ItoR_index))

        # Now we use the indexes collected above randomly to change the actual population dataframe to the new states
        df.iloc[ItoRem_index] = ['S', j, 15]
        df.loc[ItoR_index, ['State','Year']] = ['S', j]
        df.loc[ItoR_index, 'Age'] = df.loc[ItoR_index, 'Age'] + 1
        df.loc[ItoI_index, 'Age'] = df.loc[ItoI_index, 'Age'] + 1
        df.iloc[StoRem_index] = ['S', j, 15]
        df.loc[StoI_index, ['State','Year']] = ['I', j]
        df.loc[StoI_index, 'Age'] = df.loc[StoI_index, 'Age'] + 1
        df.loc[StoS_index, 'Age'] = df.loc[StoS_index, 'Age'] + 1
        
        df.iloc[RtoRem_index] = ['S', j, 15]
        df.loc[RtoR_index, 'Age'] = df.loc[RtoR_index, 'Age'] + 1
        
        
        # Append the S, I, and R arrays
        S=np.append(S,len(np.where(df.State=='S')[0]))
        I=np.append(I,len(np.where(df.State=='I')[0]))
        R=np.append(R,len(np.where(df.State=='R')[0]))

        # Code below for control measures to reduce beta values
        if control == 1:
          if (I[-1]/p > 0.015):
            art1 = 1
            if over == 0:
              over = j
          
          if art1 == 1:
            if j > over + 15:    
            #if Ipd[-2] > Ipd[-1]:
              art2 = 1

          if over != 0:
            #b = beta2+(b1/np.exp((j+3-over)/15))
            b = beta2+(b1/np.exp((j+1-over)/10))

        if control == 2:
          if (I[-1]/p > 0.3):
            art1 = 1
            if over == 0:
              over = j
              #print(over)
          
          if art1 == 1:
            if j > over + 15:    
            #if Ipd[-2] > Ipd[-1]:
              art2 = 1

          if over != 0:
            #b = beta2+(b1/np.exp((j+3-over)/15))
            b = beta2+(b1/np.exp((j+1-over)/10))


        xxbeta2 = ((S[j-1]+I[j-1])/I[j-1])*Ipd[j]/S[j-1]
        #xxbeta2 = 0.5
        #print(xxbeta2)
        b2 = np.append(b2, xxbeta2)
                
    #Epd[0]+=num_E
    Ipd[0]+=num_I
    Rpd[0]+=num_R

    #return S,E,I,R, Epd, Ipd, Rpd, xxbeta
    return S, I, R, Spd, Ipd, Rpd, xxbeta, b2, over

Testing the model

#collapse_hide
# Define parameters for stochastic model
days = 200
p = 10000
num_E = 0
num_I = 1
num_R = 0
beta_stoch = 0.3*np.ones(days)
beta_stoch2 = 0.05

# Run 3 stochastic simulations
results_stoch1 = seir_model_stoch(beta_stoch,beta_stoch2, p, num_I, num_R, years, 'gamma', 0, 1)
results_stoch2 = seir_model_stoch(beta_stoch, beta_stoch2, p, num_I, num_R, years, 'gamma', 0, 1)
results_stoch3 = seir_model_stoch(beta_stoch, beta_stoch2, p, num_I, num_R, years, 'gamma', 0, 2)
results_stoch4 = seir_model_stoch(beta_stoch, beta_stoch2, p, num_I, num_R, years, 'gamma', 0, 2)
results_stoch1[8]
26
#collapse_hide
fig = go.Figure(data=[    
    go.Scatter(name='Beta_stoch1', x=np.arange(len(results_stoch1[0])), y=results_stoch1[6], line={'dash':'dot','color':'yellow'}, legendgroup="stoch1"),
    go.Scatter(name='Beta_meas1', x=np.arange(len(results_stoch1[0])), y=results_stoch1[7], line={'dash':'dot','color':'yellow'}, legendgroup="stoch1"),
    go.Scatter(name='I_stoch1', x=np.arange(len(results_stoch1[0])), y=results_stoch1[1]/p, line={'dash':'dot', 'color':'red'}, legendgroup="stoch1"),
    go.Bar(name='Ip_stoch1', x=np.arange(len(results_stoch1[0])), y=results_stoch1[4]*10/p, legendgroup="stoch1"),
    go.Scatter(name='R_stoch1', x=np.arange(len(results_stoch1[0])), y=results_stoch1[2]/p, line={'dash':'dot', 'color':'green'}, legendgroup="stoch1"),
    go.Scatter(name='Beta_stoch2', x=np.arange(len(results_stoch2[0])), y=results_stoch2[6], line={'dash':'dot','color':'yellow'}, legendgroup="stoch2"),
    go.Scatter(name='Beta_meas2', x=np.arange(len(results_stoch2[0])), y=results_stoch2[7], line={'dash':'dot','color':'yellow'}, legendgroup="stoch2"),
    go.Scatter(name='I_stoch2', x=np.arange(len(results_stoch2[0])), y=results_stoch2[1]/p, line={'dash':'dot', 'color':'red'}, legendgroup="stoch2"),
    go.Bar(name='Ip_stoch2', x=np.arange(len(results_stoch2[0])), y=results_stoch2[4]*10/p, legendgroup="stoch2"),
    go.Scatter(name='R_stoch2', x=np.arange(len(results_stoch2[0])), y=results_stoch2[2]/p, line={'dash':'dot', 'color':'green'}, legendgroup="stoch2"),
    go.Scatter(name='Beta_stoch3', x=np.arange(len(results_stoch3[0])), y=results_stoch3[6], line={'dash':'dot', 'color':'yellow'}, legendgroup="stoch3"),
    go.Scatter(name='Beta_meas3', x=np.arange(len(results_stoch3[0])), y=results_stoch3[7], line={'dash':'dot','color':'yellow'}, legendgroup="stoch3"),
    go.Scatter(name='I_stoch3', x=np.arange(len(results_stoch3[0])), y=results_stoch3[1]/p, line={'dash':'dot', 'color':'red'}, legendgroup="stoch3"),
    go.Bar(name='Ip_stoch3', x=np.arange(len(results_stoch3[0])), y=results_stoch3[4]*10/p, legendgroup="stoch3"),
    go.Scatter(name='R_stoch3', x=np.arange(len(results_stoch3[0])), y=results_stoch3[2]/p, line={'dash':'dot', 'color':'green'}, legendgroup="stoch3"),
    go.Scatter(name='Beta_stoch4', x=np.arange(len(results_stoch4[0])), y=results_stoch4[6], line={'dash':'dot', 'color':'yellow'}, legendgroup="stoch4"),
    go.Scatter(name='Beta_meas4', x=np.arange(len(results_stoch4[0])), y=results_stoch4[7], line={'dash':'dot','color':'yellow'}, legendgroup="stoch4"),
    go.Scatter(name='I_stoch4', x=np.arange(len(results_stoch4[0])), y=results_stoch4[1]/p, line={'dash':'dot', 'color':'red'}, legendgroup="stoch4"),
    go.Bar(name='Ip_stoch4', x=np.arange(len(results_stoch4[0])), y=results_stoch4[4]*10/p, legendgroup="stoch4"),
    go.Scatter(name='R_stoch4', x=np.arange(len(results_stoch4[0])), y=results_stoch4[2]/p, line={'dash':'dot', 'color':'green'}, legendgroup="stoch4")
])

fig.update_layout(
    xaxis_title = 'Day',
    yaxis_title = 'Proportion of population',
    title={
        'text':r'$\text{Effect of stochasticity on Deterministic SEIR model}$',
        'x':0.5,
        'xanchor':'center'
    }
)

fig.show()