@@ -17,13 +17,13 @@ the \emph{Robert Koch Institute} (RKI). The second section outlines the
techniques we use to process this data to fit our project's requirements.
Subsequently, we give a theoretical overview of the PINNs that we employ.
These latter sections establish the foundation for the implementations
-described in~\Cref{sec:sir:setup} and~\Cref{sec:rsir:setup}.
+used in~\Cref{sec:sir:setup} and~\Cref{sec:rsir:setup}.

% -------------------------------------------------------------------

\section{Epidemiological Data}
\label{sec:preprocessing}
-In this thesis we want to analyze the COVID-19 pandemic in Germany utilizing
+In this thesis, we want to analyze the COVID-19 pandemic in Germany utilizing
the SIR model and PINNs. For a PINN to learn the parameters of the SIR model,
we need pandemic data in the correct format for the approach. Let $N_t$ be the
number of training points, then let $i\in\{1, \dots, N_t\}$
@@ -44,7 +44,7 @@ employ to transform it into the correct structure.
\subsection{RKI Data}
\label{sec:preprocessing:rki}
The RKI is a biomedical institute in Germany responsible for
-the on monitoring and prevention of diseases. As the central institution of the
+the monitoring and prevention of diseases. As the central institution of the
German government in the field of biomedicine, one of its tasks during the
COVID-19 pandemic was to track the number of infections and deaths in
Germany. The data was collected by university hospitals, research facilities
@@ -89,7 +89,7 @@ date is equivalent to the report date.\\
The RKI assumes that the duration of the illness under normal conditions is 14 days,
while the duration of severe cases is assumed to be 28 days. The recovery cases
in the dataset are calculated using these assumptions, by adding the duration to
-the reference date if it is given. As stated, the recovery data should be used
+the reference date if it is given. As the RKI states, the recovery data should be used
with caution. Since we require the recovery data for further calculations, the
following section presents the solutions we employed to address this issue.

@@ -99,7 +99,7 @@ following section presents the solutions we employed to address this issue.
\label{sec:preprocessing:rq}

At the outset of this section, we establish the format of the data that is
-necessary for training the PINNs. In this subsection, we present the method, that we
+necessary for training the PINNs. In this section, we present the method that we
employ to preprocess and transform the RKI data (see~\Cref{sec:preprocessing:rki})
into the training data.\\

@@ -132,7 +132,7 @@ and releases them into the removed group $D$ days later.\\

In order to solve the reduced SIR model, we employ a similar algorithm to that
used for the SIR model. However, in contrast to the recovery queue, we utilize
-a set recovery rate $\alpha$ to transfer a portion $\alpha\boldsymbol{I}^{(i)}$
+a constant recovery rate $\alpha$ to transfer a portion $\alpha\boldsymbol{I}^{(i)}$
of infections, which have recovered or died on the $i$th day, and put them into
the $\boldsymbol{R}^{(i+1)}$ compartment of the next day, which is irrelevant to
our purposes. The transformed data for both the SIR model and the reduced SIR
@@ -172,7 +172,7 @@ for a set of time points, with the aim of minimizing the term,
\begin{equation}\label{eq:SIR_obs_term}
\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
\end{equation}
-for each data point in the set of training dataset of a cardinality $N_t$ and with
+for each data point in the training set of cardinality $N_t$, with
$i\in\{1, \dots, N_t\}$. Moreover, the aforementioned parameters must satisfy the system
of differential equations that governs the SIR model. For this reason, Shaier
\etal~\cite{Shaier2021} utilize a PINN framework to satisfy both requirements.
@@ -199,7 +199,7 @@ range of $[-1, 1]$. Therefore, we regularize the parameters using the
\begin{equation}
\tilde{\beta} = \tanh(\hat{\beta}),\quad \tilde{\alpha} = \tanh(\hat{\alpha}),
\end{equation}
-where $\tilde{\alpha}$ are regularized model predictions.\\
+where $\tilde{\alpha}$ and $\tilde{\beta}$ are regularized model predictions.\\
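
As a hedged numerical illustration of this regularization, with raw network outputs chosen arbitrarily for the example (they are not values from our experiments), the hyperbolic tangent yields
\[
\hat{\beta} = 0.5 \;\Rightarrow\; \tilde{\beta} = \tanh(0.5) \approx 0.4621, \qquad
\hat{\alpha} = 2 \;\Rightarrow\; \tilde{\alpha} = \tanh(2) \approx 0.9640 .
\]
Because $|\tanh(x)| < 1$ for every real $x$, the regularized values stay inside the interval $(-1, 1)$ regardless of the magnitude of the raw outputs.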

The input data must include the time point $\boldsymbol{t}^{(i)}$ and its
corresponding measured true values of $\Psi^{(i)}$.
@@ -218,7 +218,7 @@ residual of each ODE. The mean square error of the residuals constitutes the
physics loss
$\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \Psi, \hat{\Psi})$.
The residuals are calculated using the model predictions $\hat{\Psi}$
-and the regularized model predictions of the parameters, $\tilde{\beta}$ and $\tilde{\alpha}$.
+and the regularized model predictions of the parameters, $\tilde{\alpha}$ and $\tilde{\beta}$.
The residuals are given by,
\begin{equation}
0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \tilde{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \tilde{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \tilde{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \tilde{\alpha}\hat{\boldsymbol{I}}.
@@ -243,7 +243,7 @@ reproduction number $\Rt$ on the German data of the RKI.
\label{sec:pinn:rsir}

The previous section illustrates the methodology we employ to determine the
-constant transmission and recovery rates from a data set obtained from
+constant transmission and recovery rates from a data set obtained during
the COVID-19 pandemic in Germany. In this section, we utilize PINNs to identify
the time-dependent reproduction number, $\Rt$, while reducing the number of
state variables and the reliance on assumptions by decreasing the number of ODEs
@@ -277,8 +277,8 @@ minimizing the term,
\end{equation}
for each $i\in\{1,\dots,N_t\}$. In order to identify the reproduction number, the
PINN minimizes the residuals of the ODE during the training process. The
-training process is analogous to the PINN, which identifies $\beta$
-and $\alpha$ (see~\Cref{sec:pinn:sir}). However, there are two key differences. Firstly, the absence of
+training process is analogous to that of the PINN that identifies $\alpha$
+and $\beta$ (see~\Cref{sec:pinn:sir}). However, there are two key differences: firstly, the absence of
free, trainable parameters, and secondly, the inclusion of an additional state variable that
fluctuates in response to the input. While the state variable $\boldsymbol{I}$
is approximated using the error between the training data and the predicted
@@ -307,10 +307,10 @@ Then we train on composite loss function given by,
to achieve a better solution.\\

Although we set the transmission rate to be time-dependent, we define the
-recovery time constant over time to reduce the complexity of the problem. The
+recovery time to be constant over time to reduce the complexity of the problem. The
RKI~\cite{GHInf} posits that the typical recovery period for the illness under
normal conditions is 14 days, while individuals with severe cases require
-approximately 28 days to recover. As we assume the case with normal condition,
+approximately 28 days to recover. As we assume normal conditions,
we can set the recovery time to $D=14$, which yields $\alpha = \nicefrac{1}{14}$.\\
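
As a worked check of this choice, assuming the reciprocal relation $\alpha = \nicefrac{1}{D}$ between the recovery rate and the recovery time, we obtain
\[
\alpha = \frac{1}{D} = \frac{1}{14} \approx 0.0714 \text{ per day}.
\]
For comparison only, the 28-day duration of severe cases would correspond to $\nicefrac{1}{28} \approx 0.0357$ per day, a value we do not use.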

We perform extensive empirical evaluations of the methodology employed to