|
@@ -13,7 +13,7 @@ This chapter provides the methods, that we employ to address the problem that we
|
|
|
present in~\Cref{chap:introduction}.~\Cref{sec:preprocessing} outlines
|
|
|
our approaches to preprocessing the available data and has two
|
|
|
sections. The first section describes the publicly available data provided by
|
|
|
-the \emph{Robert Koch Institute} (RKI)\footnote[1]{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
|
|
|
+the \emph{Robert Koch Institute} (RKI)\footnote{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
|
|
|
The second section outlines the techniques we use to process this data to fit
|
|
|
our project's requirements. Subsequently, we give a theoretical overview of the
|
|
|
PINNs that we employ. These latter sections establish the foundation for the
|
|
@@ -72,7 +72,7 @@ a weekly basis.\\
|
|
|
\label{fig:rki_data}
|
|
|
\end{figure}
|
|
|
|
|
|
-The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}.
|
|
|
+The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}.
|
|
|
This dataset contains comprehensive data regarding the infections of each county
|
|
|
on a daily basis. The counties are encoded using the \emph{Community Identification Number}\footnote{\url{https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/_inhalt.html}},
|
|
|
wherein the first two digits denote the state, the third digit represents the
|
|
@@ -182,7 +182,7 @@ Their approach, which they refer to as the \emph{disease-informed neural network
|
|
|
the two transition rates $\alpha$ and $\beta$. This method
|
|
|
achieves this by finding an approximate solution to the inverse problem of
|
|
|
physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
|
|
|
-the SIR model, a PINN addresses the inverse problemin two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
|
|
|
+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes the mean of~\Cref{eq:SIR_obs_term}
|
|
|
by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
|
|
|
closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
|
|
|
for each time point. Second, it reduces the residuals of the ODEs that
|
|
@@ -223,7 +223,7 @@ The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}},
|
|
|
Thus,
|
|
|
\begin{equation}
|
|
|
\begin{split}
|
|
|
- \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
|
|
|
+ \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
|
|
|
+    + &\frac{1}{N_t}\sum_{i=1}^{N_t} \bigg(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\bigg),
|
|
|
\end{split}
|
|
|
\end{equation}
|
|
@@ -237,9 +237,76 @@ reproduction number $\Rt$ on the German data of the RKI.
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
-\section{PINN for the reduced SIR Model 2}
|
|
|
+\section{Estimating the Reproduction Number using PINNs 2}
|
|
|
\label{sec:pinn:rsir}
|
|
|
|
|
|
+The previous section shows the methodology we use to determine the
|
|
|
+non-time-dependent transmission and recovery rates from a data set obtained from
|
|
|
+the COVID-19 pandemic in Germany. In this section, we employ PINNs to identify
|
|
|
+the time-dependent reproduction number $\Rt$, while reducing the number of state
|
|
|
+variables and the reliance on assumptions, by reducing the system of ODEs
|
|
|
+comprising the SIR model. The methodology presented in this section is based on
|
|
|
+the approach developed by Millevoi \etal~\cite{Millevoi2023}.\\
|
|
|
|
|
|
+In real-world pandemics, the rate of infection is affected by a multitude of
|
|
|
+factors. Events like the rising awareness of the disease in the population, the
|
|
|
+implementation of non-pharmaceutical mitigations such as social distancing
|
|
|
+policies, and the emergence of new variants have an impact on the transmission
|
|
|
+rate $\beta$. Accordingly, a transmission rate that is not time-dependent and
|
|
|
+constant across the whole duration of the pandemic may not accurately reflect
|
|
|
+the dynamics of the spread of a real-world disease. Although we set
|
|
|
+the transmission rate to be time-dependent, the recovery time is assumed to be
|
|
|
+relatively constant in time. The Robert Koch
|
|
|
+Institute\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}
|
|
|
+posits that the typical recovery period for the illness under normal conditions
|
|
|
+is 14 days, while individuals with severe cases take about 28 days to
|
|
|
+recover. Given the negligible number of severe cases compared to the number of
|
|
|
+normal cases, we can set the recovery time to $D=14$ days, resulting in $\alpha = \nicefrac{1}{14}$.
|
|
|
+The reproduction number, $\Rt$ (see~\Cref{sec:pandemicModel:rsir}), represents
|
|
|
+the number of infections that occur as a result of one infectious individual. It
|
|
|
+indicates if a pandemic is emerging or if it is spreading rapidly through the
|
|
|
+susceptible population. By inserting the definition in~\Cref{eq:repr_num} into the
|
|
|
+system of ODEs of the SIR model, we can derive the single ODE in~\Cref{eq:reduced_sir_ODE}. In
|
|
|
+order to solve this, we must identify a function that maps a time point to the
|
|
|
+size of the infectious compartment and the corresponding reproduction number.\\
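+
+As a brief sketch of this reduction, assume that~\Cref{eq:repr_num} defines the
+reproduction number as $\Rt = \frac{\beta(t)S(t)}{\alpha N}$. This form is an
+assumption, since the definition is not restated in this section. Under this
+assumption, $\beta(t)\frac{S(t)}{N} = \alpha\Rt$, and substituting this into the
+ODE of the infectious compartment gives
+\begin{equation*}
+    \frac{dI}{dt} = \beta(t)\frac{S(t)I(t)}{N} - \alpha I(t) = \alpha\Rt I(t) - \alpha I(t) = \alpha(\Rt - 1)I(t),
+\end{equation*}
+which no longer contains $S$ or $R$ explicitly.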
|
|
|
+
|
|
|
+As with the constant transition rates, we employ a data-driven approach for
|
|
|
+identifying the time-dependent reproduction number $\Rt$. The PINN approximates
|
|
|
+the size of the infectious compartment $\boldsymbol{I}$ with its model prediction $\hat{\boldsymbol{I}}$ by
|
|
|
+minimizing the term,
|
|
|
+\begin{equation}\label{eq:rSir_squared_err}
|
|
|
+ \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2,
|
|
|
+\end{equation}
|
|
|
+for each $i\in\{1,\dots,N_t\}$. In order to identify the reproduction number, the
|
|
|
+PINN minimizes the residuals of the ODE during the training process. The
|
|
|
+training process is analogous to that of the PINN that identifies $\beta$
|
|
|
+and $\alpha$ (see~\Cref{sec:pinn:sir}). The distinction lies in the absence of
|
|
|
+trainable parameters and the inclusion of an additional state variable that
|
|
|
+fluctuates in response to the input. While the state variable $\boldsymbol{I}$
|
|
|
+is approximated using the error between the training data and the predicted
|
|
|
+values, the state variable $\Rt$ is approximated exclusively based on the
|
|
|
+residual of the ODE.\\
|
|
|
+
|
|
|
+The PINN receives the input $\boldsymbol{t}^{(i)}$ and generates a prediction of
|
|
|
+($\hat{\boldsymbol{I}}^{(i)}$, $\Rt^{(i)}$). As previously stated, the PINN minimizes
|
|
|
+the distance between the true values of $\boldsymbol{I}$ and the model predictions
|
|
|
+$\hat{\boldsymbol{I}}$ by minimizing the mean squared error. Consequently, the
|
|
|
+observation loss function is defined by,
|
|
|
+\begin{equation}
|
|
|
+ \mathcal{L}_{\text{rSIR}}(\boldsymbol{I}, \hat{\boldsymbol{I}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
|
|
|
+\end{equation}
|
|
|
+The physics loss function is defined as the squared error of the residual of the
|
|
|
+ODE. The residual of the reduced SIR model is given by,
|
|
|
+\begin{equation}
|
|
|
+ 0 = \frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s).
|
|
|
+\end{equation}
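+The factor $(t_f - t_0)$ stems from rescaling the variables. As a sketch, assume
+that~\Cref{eq:reduced_sir_ODE} reads $\frac{dI}{dt} = \alpha(\Rt - 1)I$, that the
+scaled time is $t_s = \frac{t - t_0}{t_f - t_0}$, and that the scaled infectious
+compartment is $I_s = \nicefrac{I}{C}$ with a constant $C > 0$. All three forms
+are assumptions, since they are not defined in this section. The chain rule then
+yields
+\begin{equation*}
+    \frac{dI_s}{dt_s} = \frac{1}{C}\frac{dI}{dt}\frac{dt}{dt_s} = \frac{t_f - t_0}{C}\,\alpha(\Rt - 1)I = \alpha(t_f - t_0)(\Rt - 1)I_s,
+\end{equation*}
+so the residual vanishes exactly when the prediction satisfies the reduced ODE.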
|
|
|
+By combining the observation loss with the physics loss, we arrive at the total loss for
|
|
|
+the PINN that solves the reduced SIR model, which is given by,
|
|
|
+\begin{equation}
|
|
|
+ \mathcal{L}_{\text{rSIR}}(\boldsymbol{t}, \boldsymbol{I}, \hat{\boldsymbol{I}}) = \bigg\|\frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s)\bigg\|^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
|
|
|
+\end{equation}
|
|
|
+
|
|
|
+The process of determining the reproduction number, along with the other
|
|
|
+techniques that this chapter presents, finds application in the following chapter.
|
|
|
|
|
|
% -------------------------------------------------------------------
|