```
import pandas as pd # Loading the NYC dataset
import numpy as np # performing fast calculations to build our simulation
import plotly.express as px # Quick visuals for data exploration
from ipywidgets import widgets # sliders, buttons and visual layout of the dashboard
import plotly.graph_objects as go # plotly objects to place on our graph
import math # python mathematical functions
```

# Python Coffee Courses 103 - Basic SIR modeling and how to visualize it

Import the libraries we will need to execute on the calculations and modelling today.

# SIR Model

In order to prepare to visualize our SIR model we will need to calculate the number of people in the susceptible, infectious, and removed categories for every single day. Because we’re making this model we can make things simple for ourselves, in the end we will have 4 arrays with data in them which we can plot on a chart:

```
days ---> [ 1, 2, 3, 4, 5, 6, 7, 8, 9, ..... 196 ]
susceptible ---> [1000, 999, 990, 960, 910, 850, 780, 700, 600, ..... 0 ]
infectious ---> [ 0, 1, 10, 35, 80, 140, 190, 250, 320, ..... 0 ]
removed ---> [ 0, 0, 0, 5, 10, 20, 30, 50, 80, ..... 1000]
```

From looking a this we can already understand some of the limits of the SIR model we’re going to build:

- At any given day, the sum of susceptible, infectious, and removed will be the same, that will be our population.
- We’re going to have to define how many days we want to simulate.
- Given that our model calculates what today’s values for S, I, and R will be based on yesterday’s this is a perfect opportunity for a for loop.

Given these insights let’s define our starting conditions now:

```
= 50
days = 1000
population = 1
initial_infection
= 1.6
infection_rate = .10
removal_rate
= np.linspace(0,days, days)
x_days
= [population]
y_susceptible = [initial_infection]
y_infected = [0] y_removed
```

With the initial conditions of our model created, let’s look at how we can calculate how many people we need to move from Susceptible to Infectious and from Infectious to Removed every day.

## Susceptible to Infectious

For Susceptible to Infectious, we’re going to multiply the infectious rate by the number of people who have been infected and by the ratio of people still susceptible.

$ \[\begin{align} \large \mathit{daily\_infected} = \mathit{ infection\_rate} * \mathit{previous\_day\_infected} * \frac{\mathit{previous\_day\_susceptible}}{\mathit{population}} \end{align}\] $

written in python as:

`= (infection_rate * y_infected[day-1] * (y_susceptible[day-1])/population) daily_infected `

## Infectious to Removed

To move people from the infectious category to the removed category we’re going to multiply the number of infectious people on the day before by the removal rate.

$ \[\begin{align} \large \mathit{daily\_removed} = \mathit{removal\_rate} * \mathit{previous\_day\_infected} \end{align}\] $

written in python as:

`= removal_rate * y_infected[day-1] daily_removed `

With the figures for the number of people who have moved from Susceptible to Infectious and have moved from Infectious to Removed calculated, we can substract them from the previous day values to get our new Susceptible, Infectious, Removed numbers.

Written out in python it looks like the code written below:

```
for day in x_days[1:days-1]:
# convert our day value to integer because numpy works with floating point
# values by default instead of integers.
= int(day)
day
# Calculate the number of people to move from one category to another category
# as described above.
= (infection_rate * y_infected[day-1] * (y_susceptible[day-1])/population)
daily_infected = removal_rate * y_infected[day-1]
daily_removed
# Add and remove the calculated amounts from each category for thie calculated day
max(y_susceptible[day-1] - daily_infected, 0) )
y_susceptible.append( -1] + daily_infected - daily_removed )
y_infected.append( y_infected[day-1] + daily_removed )
y_removed.append( y_removed[day
```

We now have the data we need to build our visualization and show how the different groups of people interact with each other.

```
# This figure builder allows us to create an interactive figure to which
# we are going to add the data we calculated one by one
= go.FigureWidget()
fig
# This visualization only looks good if the bars are stacked, so we're going to
# set the stacking option here so that the new series added to the graph will
# stack.
='stack'
fig.update_layout(barmode= 900)
, height fig
```

The first data we’re going to add is the most common one you see in the press which is the model of the number of infected people by day.

```
# add the series data, in plotly this is called a trace.
fig.add_trace(# This series is going to be a bar chart
go.Bar(= x_days # the x-values are defined by the list of days generated earlier
x = y_infected # the y-values are defined by the daily infected figures calculated in our loop
,y ='Infected' # name the series
, name= dict(color='red') # color the series red
, marker
))
=1 # ignore this, I put this here to avoid seing the output in the notebook x
```

Next, we’ll add the susceptible numbers to see how they evolve in relation to the infected population

```
fig.add_trace(= x_days
go.Bar( x = y_susceptible
, y ='Susceptible'
, name= dict(color='rgba(0,0,255,0.5)') # interesting thing about the color values in plotly,
, marker # you can use rgba to set an alpha value to increase the
# transparency
)
)
=1 # ignore this, I put this here to avoid seing the output in the notebook x
```

Finally, we’ll add the removed category to the figure.

```
fig.add_trace(= x_days
go.Bar(x = y_removed
,y ='Removed'
, name=0.5))
, opacity
=1 # ignore this, I put this here to avoid seing the output in the notebook x
```

# SIR Interactive example

The first thing we’re going to do to make our calculator interactive is to put the model calculation code into a method that we can call when variables change.

We’ll take the starting values from the code above as the parameters for the function and return a dictionary which contains the three arrays we calculated which contain the number of infected, removed, and susceptible people for each day.

```
# We'll set some easy defaults for this function so that we immediately have outputs which make some sense when we execute this method.
def calculate_values_SIR( removal_rate = .1
= 1.1
, infection_rate = 1000
, population = 150
, days = 1
, initial_infection
# we've added this additional parameter which allows us to cut the
# infection rate by 2 at a given day to simulate drastic policy action.
= 50
, intervention_day
):
# build out starting positions
= np.linspace(0,days, days)
x_days = [population]
y_susceptible = [initial_infection]
y_infected = [0]
y_removed
for day in x_days[1:days-1]:
# if this day is after the day drastic intervention is made, reduce the infection rate by 2.
if day > intervention_day:
= infection_rate / 2
infection_rate
= int(day)
day
= min((infection_rate * y_infected[day-1] * (y_susceptible[day-1])/population), y_susceptible[day-1])
daily_infected = removal_rate * y_infected[day-1]
daily_removed
-1] - daily_infected )
y_susceptible.append( y_susceptible[day-1] + daily_infected - daily_removed )
y_infected.append( y_infected[day-1] + daily_removed )
y_removed.append( y_removed[day
return {'infected': np.rint(y_infected)
'removed' : np.rint(y_removed)
, 'susceptible': np.rint(y_susceptible)}
,
```

With the smarts of our model safely tucked away in a method let’s build some controls we can use to update our chart live.

We’re going to create a layout using the widgets library which looks like this:

```
+--------------------------------+
| first group of sliders |
+--------------------------------+
| second group of sliders |
+--------------------------------+
| Chart |
| |
| |
| |
+--------------------------------+
```

Fortunately this is an easy layout to program, it’s a vertical layout where we’re stacking three layout elements. The first two layout elements will be collections of widgets arranged horizontally which is called a Horizontal Box denoted HBox and the final element is the chart.

Let’s create our two slider groups.

```
# This slider works with floating point numbers (hence being called Float Slider) and
# allows us to set a variable with this slider. This is going to be the way we set the
# infection rate.
= widgets.FloatSlider(
ir =1.187, # this is the initial value of our slider when it appears
valuemin=0.0, # the minimum value we'll allow
max=5.0, # the maximum value we'll allow
=.001, # by what increments the slider will change when we move it
step='Infection_rate:', # the name of the slider
description=False # Will this slider wait until it stops moving to
continuous_update# call it's update function or does it call the
# update function immediately?
)
= widgets.FloatSlider(
rr =.46,
valuemin=0.1, # this is set to greater than 0 because this is the denominator in the R0 calculation
max=2.0,
=.01,
step='Removal_Rate:',
description=False
continuous_update
)
= widgets.IntSlider(
ii =1,
valuemin=1,
max=50,
=1,
step='Initially Infected:',
description=False
continuous_update
)
= widgets.IntSlider(
ip =1000,
valuemin=500,
max=10_000_000,
=500,
step='Initial Population:',
description=False
continuous_update
)
= widgets.IntSlider(
iday =15,
valuemin=1,
max=500,
=1,
step='Day of intervention (reducing infection rate):',
description=False
continuous_update
)
= widgets.HBox(children=[ir, rr])
first_slider_group = widgets.HBox(children=[ii, ip, iday]) second_slider_group
```

The the top part of our interactive chart built, let’s build the chart.

```
# First, we use the method created above to calculate a model using the initial
# values of the sliders we just created. Given that at this point we haven't
# displayed the sliders yet, their values will be the default values we set above.
= calculate_values_SIR( removal_rate = rr.value
data = ir.value
, infection_rate = ip.value
, population = 150
, days = ii.value
, initial_infection = iday.value
, intervention_day
)
# Next we add all the data traces to the chart
= go.Bar(x = list(range(1,len(data['infected'])))
infected_trace = data['infected']
,y ='Infected'
, name= dict(color='red')
, marker
)
= go.Bar(x = list(range(1, len(data['susceptible'])))
susceptible_trace = data['susceptible']
, y ='Susceptible'
, name= dict(color='rgba(0,0,255,0.5)')
, marker =0.5
, opacity
)
= go.Bar(x =list(range(1, len(data['removed'])))
removed_trace = data['removed']
,y ='Removed'
, name= dict(color='rgba(0,128,0,0.5)')
, marker =0.5)
, opacity
# This trace is interesting as it's basically just drawing a straight line on the
# selected intervention day.
= go.Scatter(x = [iday.value, iday.value]
intervention_day = [0, ip.value]
,y ='Intervention day'
, name= dict(color='darkblue')
, marker = dict(width=5)
, line
)
# We create our figure adding all the traces we created to the data list, and setting some layout values in the layout parameter.
= go.FigureWidget(data=[ infected_trace, removed_trace, susceptible_trace, intervention_day ],
g =go.Layout(
layout={
title'text': f'R0 = {ir.value / rr.value} <br /> Post-Intervention R0: {ir.value/ 2* rr.value} <br />Infection_rate={ir.value} Removal_rate={rr.value}',
'y':.95,
'x':0.5,
'xanchor': 'center',
'yanchor': 'top'}
='stack'
,barmode='x'
,hovermode=900
,height=dict(title='Number of Days')
,xaxis=dict(title='Number of People')
,yaxis
))
# This is to update the x-axis range to show only the days where we have cases
range=[0,np.where(data['infected']==0)[0][0]])
g.update_xaxes(
=1 x
```

With all of the elements of our quick analysis dashboard created, let’s link them up so that the buttons actually do something.

```
# This method will be called any time one of the sliders is modified. It will re-run our model calculation
# with the new values and update the data for the 4 traces we added to the figure.
def response(change):
=150
num_days
# recalculate the model using the new values defined by the sliders
= calculate_values_SIR(removal_rate = rr.value
pop_values =ir.value
, infection_rate=ii.value
, initial_infection=ip.value
, population=num_days
, days= iday.value)
, intervention_day
# Try to find the first day where we have no more infections,
# if that fails and we get an error, use the maximum number of
# days for which we've calculated the model. We use this later
# to update the x-axis range to keep our curve centered.
try:
= np.where(pop_values['infected']==0)[0][0]
end_infection except IndexError:
= num_days
end_infection
# plotly updates much faster if you send all the updates in one go,
# this construct allows us to open an update session on our chart
# and when we leave it, it will send all the updates to the figure
# at once.
with g.batch_update():
# update the y-axis values from the model
0].y = pop_values['infected']
g.data[1].y = pop_values['removed']
g.data[2].y = pop_values['susceptible']
g.data[
# update the x-axis values
0].x = list(range(1,num_days))
g.data[1].x = list(range(1,num_days))
g.data[2].x = list(range(1,num_days))
g.data[
# Add the intervention day line
3].y = [0 , ip.value]
g.data[3].x = [iday.value, iday.value]
g.data[
# update the title to show the R values as well as the infection rate and the removal rate
= {
g.layout.title 'text': f'R0 = {ir.value / rr.value} <br /> Post-Intervention R0: {ir.value/ 2* rr.value} <br /> Infection_rate={ir.value} Removal_rate={rr.value}',
'x':0.5,
'xanchor': 'center',
'yanchor': 'top'}
# change the x-axis range to keep the infection curve in scope
= dict(range=[0,end_infection])
g.layout.xaxis
# Update each of the widgets and register our update method as the method to
# call when they change.
='value')
ir.observe(response, names='value')
rr.observe(response, names='value')
ii.observe(response, names='value')
ip.observe(response, names='value')
iday.observe(response, names
# put the widgets, and our chart together into our layout
widgets.VBox([first_slider_group, second_slider_group, g])
```

### Challenge 1: Add NYC data to the visualization

```
# here's some code to get you started.
= pd.read_csv('us_jhu_data/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv')
df = df[(df['Province_State'] == 'New York')] ## Fix a bug here to make sure you select only NY
df = df.melt(id_vars=['UID', 'iso2', 'iso3', 'code3', 'FIPS', 'Admin2', 'Province_State',
df 'Country_Region', 'Lat', 'Long_', 'Combined_Key']
='day'
, var_name='confirmed_cases')
, value_name
# Convert the loaded day to a date object
'day'] = pd.to_datetime(df['day']) df[
```

`'Admin2'].unique() df[`

```
array(['Albany', 'Allegany', 'Bronx', 'Broome', 'Cattaraugus', 'Cayuga',
'Chautauqua', 'Chemung', 'Chenango', 'Clinton', 'Columbia',
'Cortland', 'Delaware', 'Dutchess', 'Erie', 'Essex', 'Franklin',
'Fulton', 'Genesee', 'Greene', 'Hamilton', 'Herkimer', 'Jefferson',
'Kings', 'Lewis', 'Livingston', 'Madison', 'Monroe', 'Montgomery',
'Nassau', 'New York', 'Niagara', 'Oneida', 'Onondaga', 'Ontario',
'Orange', 'Orleans', 'Oswego', 'Otsego', 'Putnam', 'Queens',
'Rensselaer', 'Richmond', 'Rockland', 'St. Lawrence', 'Saratoga',
'Schenectady', 'Schoharie', 'Schuyler', 'Seneca', 'Steuben',
'Suffolk', 'Sullivan', 'Tioga', 'Tompkins', 'Ulster', 'Warren',
'Washington', 'Wayne', 'Westchester', 'Wyoming', 'Yates',
'Out of NY', 'Unassigned'], dtype=object)
```

`= df[df['Admin2'].isin(['Bronx','Kings','New York','Queens','Richmond',])] df `

```
# collect all the confirmed cases for NY per day using groupby and agg
# after that replace the many days with 0 infected with NA so that we
# can drop those rows with dropna
# We reset the index twice as a shorcut to quickly and easily get a sequential
# number representing the days from first to last because the day column
# became an index durign the group by and indexes in pandas are 'usually' sorted
= df.groupby('day')\
df 'confirmed_cases': 'sum'})\
.agg({0, np.nan)\
.replace(\
.dropna()\
.reset_index()
.reset_index()
# Increment the index by 1 to have the first case start on the first day
'index'] = df['index'] + 1
df[
# Add the NY data to the plot with the add trace method.
= df['index']
g.add_trace(go.Scattergl( x= df['confirmed_cases']
, y='NYC Cases')) , name
```

### Challenge 2: Export NYC Data and predicted cases to an Excel

```
# Hint: you may need to play around with some numbers in the interactive sheet and then use the model method to get the data you need before you can start writing out to Excel.
# From the previous challenge we have the actual data, we can pull the data we generated by playing with the sliders
# from the graph by using the same methods we used to set the data in our update method:
# g.data[0].y = pop_values['infected']
# g.data[1].y = pop_values['removed']
# g.data[2].y = pop_values['susceptible']
'predicted_cases'] = g.data[0].y[:len(df)]
df['predicted_removed_cases'] = g.data[1].y[:len(df)]
df['predicted_susceptible'] = g.data[2].y[:len(df)]
df[
'day'
df[['confirmed_cases'
, 'predicted_cases'
, 'predicted_removed_cases'
, 'predicted_susceptible']].to_excel('solution1.xlsx') ,
```

### Challenge 3 (hard): Use the supplied write_to_sheet method below to write the NYC Data compared to the predicted data to a templated excel sheet with a coversheet.

```
def write_to_sheet(excel_file, in_scope_df
='1'
, sheet_name='Lorem'
, title=True
, new_file=True
, new_sheet=True):
, include_indexfrom openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.worksheet.table import Table, TableStyleInfo
from openpyxl.utils.dataframe import dataframe_to_rows
# Load template
if new_file:
= load_workbook('AR_pyTemplate.xlsx')
wb else:
= load_workbook(excel_file)
wb
if new_sheet:
= wb.copy_worksheet(wb['basic_sheet'])
ws = sheet_name
ws.title else:
try:
= wb[sheet_name]
ws except Exception as e:
print("Exception: ", e)
return
# Convert the data frame to string
for col in in_scope_df.select_dtypes(include=['datetime64']).columns:
= in_scope_df[col].astype('str')
in_scope_df[col]
# set the title of the sheet
'A1'] = title
ws[
# add column headings. NB. these must be strings
ws.append(in_scope_df.columns.tolist())
# Add the data rows
for idx, row in in_scope_df.iterrows():
ws.append(row.tolist())
def get_excel_col(number):
= "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
cols = (number % len(cols))-1
second_letter = (number // len(cols))-1
first_letter if first_letter == -1:
return cols[second_letter]
else:
return cols[first_letter] + cols[second_letter]
# Create range to create table
= "A2:" + get_excel_col(len(in_scope_df.columns)) + str(len(in_scope_df)+2)
ref
# Create the table
= Table(displayName=sheet_name, ref=ref )
tab
# Add a default style with striped rows
= TableStyleInfo(name="TableStyleLight12"
style =False
, showFirstColumn=False
, showLastColumn=True
, showRowStripes=False)
, showColumnStripes
# Apply table style
= style
tab.tableStyleInfo
# Apply the table to the sheet
ws.add_table(tab)
#
wb.save(excel_file)
```

```
"solution3.xlsx"
write_to_sheet('day'
, df[['confirmed_cases'
, 'predicted_cases'
, 'predicted_removed_cases'
, 'predicted_susceptible']]
, ='case_prediction') , title
```

```
/Users/jvanderhoeven/miniconda/envs/analysis-101/lib/python3.7/site-packages/ipykernel_launcher.py:31: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
```