|
@@ -23,12 +23,13 @@ implementations described in~\Cref{sec:sir:setup} and~\Cref{sec:rsir:setup}.
|
|
|
|
|
|
\section{Epidemiological Data}
|
|
|
\label{sec:preprocessing}
|
|
|
-In order for the PINNs to be effective with the data available to us, it is
|
|
|
-necessary for the data to be in the format required by the epidemiological
|
|
|
-models, which the PINNs will solve. Let $N_t$ be the number of training points,
|
|
|
-then let $i\in\{1, ..., N_t\}$ be the index of the training points. The data
|
|
|
-required by the PINN for solving the SIR model (see~\Cref{sec:pinn:dinn}),
|
|
|
-consists of pairs $(\boldsymbol{t}^{(i)}, (\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)}))$.
|
|
|
+In this thesis we want to analyze the COVID-19 pandemic in Germany utilizing
|
|
|
+the SIR model and PINNs. For a PINN to learn the parameters of the SIR model,
|
|
|
+we need pandemic data in the correct format for the approach. Let $N_t$ be the
|
|
|
+number of training points, then let $i\in\{1, \dots, N_t\}$
|
|
|
+be the index of the training points. The data required by the PINN for solving
|
|
|
+the SIR model (see~\Cref{sec:pinn:dinn}) consists of pairs
|
|
|
+$(\boldsymbol{t}^{(i)}, (\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)}))$.
|
|
|
Given that the system of differential equations representing the reduced SIR
|
|
|
model (see~\Cref{sec:pandemicModel:rsir}) consists of a single differential
|
|
|
equation for $I$, it is necessary to obtain pairs of the form
|
|
@@ -40,19 +41,20 @@ the correct structure.
|
|
|
|
|
|
\subsection{RKI Data}
|
|
|
\label{sec:preprocessing:rki}
|
|
|
-The Robert Koch Institute is responsible for the on monitoring and prevention of
|
|
|
-diseases. As the central institution of the German government in the field of
|
|
|
-biomedicine, one of its tasks during the COVID-19 pandemic was it to track the
|
|
|
-number of infections and death cases in Germany. The data was collected by
|
|
|
-university hospitals, research facilities and laboratories through the
|
|
|
-conduction of tests. Each new case must be reported within a period of 24 hours
|
|
|
-at the latest to the respective state authority. Each state authority collects
|
|
|
-the cases for a day and must report them to the RKI by the following working
|
|
|
-day. The RKI then refines the data and releases statistics and updates its
|
|
|
-repositories holding the information for the public to access. For the purposes
|
|
|
-of this thesis we concentrate on two of these repositories.\\
|
|
|
+The Robert Koch Institute is a biomedical institute in Germany responsible for
|
|
|
+the monitoring and prevention of diseases. As the central institution of the
|
|
|
+German government in the field of biomedicine, one of its tasks during the
|
|
|
+COVID-19 pandemic was to track the number of infections and deaths in
|
|
|
+Germany. The data was collected by university hospitals, research facilities
|
|
|
+and laboratories through testing. Each new case must be
|
|
|
+reported within 24 hours to the respective state
|
|
|
+authority. Each state authority collects the cases for a day and must report
|
|
|
+them to the RKI by the following working day. The RKI then refines the data and
|
|
|
+releases statistics and updates its repositories holding the information for
|
|
|
+the public to access. For the purposes of this thesis we concentrate on two of
|
|
|
+these repositories.\\
|
|
|
|
|
|
-The first repository is called \emph{COVID-19-Todesfälle in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/COVID-19-Todesfaelle_in_Deutschland.git}}.
|
|
|
+The first repository is called \emph{COVID-19-Todesfälle in Deutschland}~\cite{GHDead}.
|
|
|
The dataset comprises discrete data points, each with a date indicating the
|
|
|
point in time at which the respective data was collected. The dates span from
|
|
|
March 9, 2020, to the present day. For each date, the dataset provides the total
|
|
@@ -72,7 +74,7 @@ a weekly basis.\\
|
|
|
\label{fig:rki_data}
|
|
|
\end{figure}
|
|
|
|
|
|
-The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}.
|
|
|
+The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}~\cite{GHInf}.
|
|
|
This dataset contains comprehensive data regarding the infections of each county
|
|
|
on a daily basis. The counties are encoded using the \emph{Community Identification Number}\footnote{\url{https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/_inhalt.html}},
|
|
|
wherein the first two digits denote the state, the third digit represents the
|
|
@@ -85,10 +87,9 @@ date is equivalent to the report date.\\
|
|
|
The RKI assumes that the duration of the illness under normal conditions is 14 days,
|
|
|
while the duration of severe cases is assumed to be 28 days. The recovery cases
|
|
|
in the dataset are calculated using these assumptions, by adding the duration to
|
|
|
-the reference date if it is given. As stated in the ReadMe, the recovery data
|
|
|
-should be used with caution. Since we require the recovery data for further
|
|
|
-calculations, the following section presents the solutions we employed to address
|
|
|
-this issue.
|
|
|
+the reference date if it is given. As stated, the recovery data should be used
|
|
|
+with caution. Since we require the recovery data for further calculations, the
|
|
|
+following section presents the solutions we employed to address this issue.
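As a rough illustration of the recovery assumption described above, the following Python sketch adds the assumed illness duration to a reference date. Only the 14-day and 28-day durations come from the text; the function name `estimated_recovery_date` and the `severe` flag are hypothetical and do not correspond to fields of the RKI dataset.

```python
from datetime import date, timedelta

# Illness durations assumed by the RKI according to the text above.
NORMAL_DURATION_DAYS = 14
SEVERE_DURATION_DAYS = 28


def estimated_recovery_date(reference_date, severe=False):
    """Shift the reference date by the assumed illness duration.

    Hypothetical helper: the name and the `severe` flag are illustrative only.
    """
    days = SEVERE_DURATION_DAYS if severe else NORMAL_DURATION_DAYS
    return reference_date + timedelta(days=days)


print(estimated_recovery_date(date(2020, 3, 9)))
print(estimated_recovery_date(date(2020, 3, 9), severe=True))
```

Because the result depends entirely on the assumed durations, such estimated recovery dates inherit the uncertainty noted above.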
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
@@ -144,7 +145,7 @@ employed by the PINN models, which we describe in the subsequent section.
|
|
|
In the preceding section, we present the methods we employ to preprocess and
|
|
|
format the data from the RKI in accordance with the specifications required for
|
|
|
the work of this thesis. In this section, we will present the method we employ
|
|
|
-to identify the non-time-dependent SIR parameters $\beta$ and $\alpha$ for the
|
|
|
+to identify the SIR parameters $\beta$ and $\alpha$ for the
|
|
|
data. As a foundation for our work, we draw upon the work of Shaier et
|
|
|
al.~\cite{Shaier2021}, to solve the SIR system of ODEs using PINNs.\\
|
|
|
|
|
@@ -182,7 +183,7 @@ Their approach, which they refer to as the \emph{disease-informed neural network
|
|
|
the two transition rates $\alpha$ and $\beta$. This method
|
|
|
achieves this by finding an approximate solution to the inverse problem of
|
|
|
physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
|
|
|
-the SIR model, a PINN addresses the inverse problemin two ways. First, it minimizes the mean of~\Cref{eq:SIR_obs_term}
|
|
|
+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes the mean of~\Cref{eq:SIR_obs_term}
|
|
|
by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
|
|
|
closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
|
|
|
for each time point. Second, it reduces the residuals of the ODEs that
|
|
@@ -191,8 +192,8 @@ inverse problem presets that a parameter is unknown. Thus, we designate the para
|
|
|
$\beta$ and $\alpha$ as free, learnable parameters, $\widehat{\beta}$ and
|
|
|
$\widehat{\alpha}$. These separate trainable parameters are values that are
|
|
|
optimized during the training process and must fit the equations of the set of
|
|
|
-ODEs. Furthermore, we know, that the transition rates
|
|
|
-do not surpass the value of $1$. Consequently, we force the value of both rates to be in a
|
|
|
+ODEs. Assuming that the values of the transition rates stay below
|
|
|
+1~\cite{Shaier2021}, we force the value of both rates to be in a
|
|
|
range of $[-1, 1]$. Therefore, we regularize the parameters using the
|
|
|
\emph{hyperbolic tangent}. This results in the terms,
|
|
|
\begin{equation}
|
|
@@ -205,31 +206,30 @@ The input data must include the time point $\boldsymbol{t}^{(i)}$ and its
|
|
|
corresponding measured true values of $(\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)})$.
|
|
|
In its forward pass, the PINN receives the time point $\boldsymbol{t}^{(i)}$ as its input, from which it
|
|
|
calculates its model prediction $(\hat{\boldsymbol{S}}^{(i)}, \hat{\boldsymbol{I}}^{(i)}, \hat{\boldsymbol{R}}^{(i)})$
|
|
|
-based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the observation loss by taking the
|
|
|
+based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the data loss by taking the
|
|
|
mean squared error of~\Cref{eq:SIR_obs_term} over all $N_t$ training samples.
|
|
|
-Therefore, the term for the observation loss is,
|
|
|
+Thus,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{\text{obs}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
|
|
|
+ \mathcal{L}_{\text{data}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
|
|
|
\end{equation}
|
|
|
-is the term for the observation loss. Given superior performance in practical applications
|
|
|
+is the term for the data loss. Given superior performance in practical applications
|
|
|
relative to the ODEs of~\Cref{eq:sir}, we utilize the ODEs of~\Cref{eq:modSIR}
|
|
|
in our physics loss. In order for the model to learn the system of differential equations,
|
|
|
it is necessary to obtain the residual of each ODE. The mean squared error of the residuals constitutes
|
|
|
-the physics loss $\mathcal{L}_{\text{physiks}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
|
|
|
+the physics loss $\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
|
|
|
The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$ and the regularized model predictions of the parameters $\widehat{\beta}$ and $\widehat{\alpha}$. The residuals are given by,
|
|
|
\begin{equation}
|
|
|
0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}.
|
|
|
\end{equation}
|
|
|
Thus,
|
|
|
\begin{equation}
|
|
|
- \begin{split}
|
|
|
- \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
|
|
|
- + &\frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
|
|
|
- \end{split}
|
|
|
+ \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) + \mathcal{L}_{\text{data}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})
|
|
|
\end{equation}
|
|
|
-is the equation of the total loss for our approach. This loss value is then
|
|
|
-back-propagated through our network, while the model predictions of the
|
|
|
-parameters $\beta$ and $\alpha$ are optimized using the loss as well.\\
|
|
|
+is the multi-objective loss equation encapsulating both the physics loss and the
|
|
|
+data loss for our approach. By minimizing these loss terms, our model learns not only the
|
|
|
+given training data but also the physics of the system. This enables our model
|
|
|
+to simultaneously learn the values of the parameters $\beta$ and $\alpha$
|
|
|
+during training. \\
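To make the structure of this composite loss concrete, the following Python sketch evaluates a data loss and a physics loss on a small trajectory. It rests on several assumptions that are not taken from the thesis: forward differences stand in for the automatic differentiation a PINN would use, the standard SIR relation $dR/dt = \alpha I$ is assumed for the third residual, and all function names (`regularize`, `data_loss`, `physics_loss`, `sir_loss`) are hypothetical.

```python
import math


def regularize(raw):
    """Map an unconstrained trainable value into (-1, 1) with tanh."""
    return math.tanh(raw)


def data_loss(pred, obs):
    """Mean over all samples of the squared errors in S, I and R."""
    return sum(
        (p_s - s) ** 2 + (p_i - i) ** 2 + (p_r - r) ** 2
        for (p_s, p_i, p_r), (s, i, r) in zip(pred, obs)
    ) / len(obs)


def physics_loss(t, pred, raw_beta, raw_alpha, n_pop):
    """Mean squared ODE residual over consecutive time points.

    Assumptions: forward differences replace automatic differentiation,
    and dR/dt = alpha * I (standard SIR) is used for the third residual.
    """
    beta, alpha = regularize(raw_beta), regularize(raw_alpha)
    total = 0.0
    for k in range(len(t) - 1):
        dt = t[k + 1] - t[k]
        s, i, r = pred[k]
        ds = (pred[k + 1][0] - s) / dt
        di = (pred[k + 1][1] - i) / dt
        dr = (pred[k + 1][2] - r) / dt
        infection = beta * s * i / n_pop
        total += (ds + infection) ** 2
        total += (di - infection + alpha * i) ** 2
        total += (dr - alpha * i) ** 2
    return total / (len(t) - 1)


def sir_loss(t, pred, obs, raw_beta, raw_alpha, n_pop):
    """Composite loss: physics loss plus data loss."""
    return physics_loss(t, pred, raw_beta, raw_alpha, n_pop) + data_loss(pred, obs)
```

In an actual PINN, `pred` would be the network output and the raw parameter values would be updated by the optimizer together with the network weights; the sketch only shows how the two loss terms are assembled.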
|
|
|
|
|
|
As this section concentrates on finding the time-constant parameters
|
|
|
$\beta$ and $\alpha$, the next section will show our approach of finding the
|
|
@@ -257,7 +257,7 @@ time-dependent and constant across the entire duration of the pandemic may not
|
|
|
accurately reflect the dynamics of the spread of a real-world disease.
|
|
|
Although we set the transmission rate to be time-dependent, the recovery time
|
|
|
is assumed to be relatively constant over time. The Robert Koch
|
|
|
-Institute\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}
|
|
|
+Institute~\cite{GHInf}
|
|
|
posits that the typical recovery period for the illness under normal conditions
|
|
|
is 14 days, while those individuals with severe cases require approximately 28
|
|
|
days to recover. In the light of the negligible number of severe cases in
|
|
@@ -292,20 +292,22 @@ The PINN receives the input of $\boldsymbol{t}^{(i)}$ and generates a prediction
|
|
|
($\hat{\boldsymbol{I}}^{(i)}$, $\Rt^{(i)}$). As previously stated, the PINN minimizes
|
|
|
the distance between the true values of $\boldsymbol{I}$ and the model predictions
|
|
|
$\hat{\boldsymbol{I}}$ by minimizing the mean squared error. Consequently, the
|
|
|
-observation loss function is defined by,
|
|
|
+data loss function is defined by,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{\text{rSIR}}(\boldsymbol{I}, \hat{\boldsymbol{I}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
|
|
|
+ \mathcal{L}_{\text{data}}(\boldsymbol{I}, \hat{\boldsymbol{I}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
|
|
|
\end{equation}
|
|
|
The physics loss function is defined as the squared error of the residual of the
|
|
|
ODE. The residual of the reduced SIR model is given by,
|
|
|
\begin{equation}
|
|
|
0 = \frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s).
|
|
|
\end{equation}
|
|
|
-By combining the observation loss with the physics loss, we arrive at the total loss for
|
|
|
-the PINN that solves the reduced SIR model, which is given by,
|
|
|
+During training, we first fit the data without the physics term, utilizing only the
|
|
|
+data loss $\mathcal{L}_{\text{data}}(\boldsymbol{I}, \hat{\boldsymbol{I}})$.
|
|
|
+Then we train on the composite loss function given by,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{\text{rSIR}}(\boldsymbol{t}, \boldsymbol{I}, \hat{\boldsymbol{I}}) = \bigg\|\frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s)\bigg\|^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
|
|
|
+ \mathcal{L}_{\text{rSIR}}(\boldsymbol{t}, \boldsymbol{I}, \hat{\boldsymbol{I}}) = \bigg\|\frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s)\bigg\|^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2,
|
|
|
\end{equation}
|
|
|
+to achieve a better solution.\\
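The two-stage use of the loss can be sketched in Python as follows. This is an illustration under assumptions rather than the implementation used in this thesis: forward differences approximate $dI_s/dt_s$ in place of automatic differentiation, the reproduction number is passed as a plain list of per-time-point values, and the function names and the `use_physics` switch are hypothetical.

```python
def rsir_data_loss(i_pred, i_obs):
    """Mean squared error between predicted and observed infections."""
    return sum((p - o) ** 2 for p, o in zip(i_pred, i_obs)) / len(i_obs)


def rsir_physics_loss(t_s, i_pred, r_t, alpha, t0, tf):
    """Mean squared residual of dI_s/dt_s - alpha (t_f - t_0)(R_t - 1) I_s.

    Assumption: forward differences replace automatic differentiation.
    """
    total = 0.0
    for k in range(len(t_s) - 1):
        di = (i_pred[k + 1] - i_pred[k]) / (t_s[k + 1] - t_s[k])
        residual = di - alpha * (tf - t0) * (r_t[k] - 1.0) * i_pred[k]
        total += residual ** 2
    return total / (len(t_s) - 1)


def rsir_loss(t_s, i_pred, i_obs, r_t, alpha, t0, tf, use_physics):
    """Stage one (use_physics=False) uses only the data loss;
    stage two (use_physics=True) adds the physics residual."""
    loss = rsir_data_loss(i_pred, i_obs)
    if use_physics:
        loss += rsir_physics_loss(t_s, i_pred, r_t, alpha, t0, tf)
    return loss
```

A training loop would call `rsir_loss` with `use_physics=False` for the initial epochs and switch to `use_physics=True` afterwards; how many epochs each stage lasts is a design choice this sketch does not prescribe.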
|
|
|
|
|
|
The process of determining the reproduction number, along with the other
|
|
|
techniques that this chapter presents, is applied in the following chapter.
|