|
@@ -124,7 +124,7 @@ and releases them into the removed group $D$ days later.\\
|
|
|
\includegraphics[width=\textwidth]{recovery_queue.pdf}
|
|
|
\caption{The recovery queue takes in the infected individuals for the $k$'th
|
|
|
day and releases them $D$ days later into the removed group.}
|
|
|
- \label{fig:rki_data}
|
|
|
+ \label{fig:recovery_queue}
|
|
|
\end{figure}
|
|
|
|
|
|
In order to solve the reduced SIR model, we employ a similar algorithm to that
|
|
@@ -141,58 +141,105 @@ employed by the PINN models, which we describe in the subsequent section.
|
|
|
\section{Estimating Epidemiological Parameters using PINNs 3}
|
|
|
\label{sec:pinn:sir}
|
|
|
|
|
|
-In the last section we present the methods, we use to transform the RKI data
|
|
|
-(see~\Cref{sec:preprocessing}) into the format that is used by the PINNs to seek
|
|
|
-a solution for the SIR models. In this section we lay out the methodology we
|
|
|
-employ for this thesis concerning PINNs for SIR models.\\
|
|
|
-
|
|
|
-The data, which is yielded by the preprocessing, is in the structure of pairs of
|
|
|
-$(\boldsymbol{t^{(i)}}, (\boldsymbol{S^{(i)}},\boldsymbol{I^{(i)}},\boldsymbol{R^{(i)}}))$,
|
|
|
-which contain the sizes of the susceptible, infectious, and removed compartments
|
|
|
-together with their respective time point with the index $i$. This means that
|
|
|
-this training data contains the measured solutions of the functions $S(t)$,
|
|
|
-$I(t),$ and $R(t)$, which a neural network may use to approximate these
|
|
|
-functions. Furthermore, a PINN can carry out this task with a higher precision
|
|
|
-for more complex problems were the unknown function is more complex and just a
|
|
|
-system of differential equations is given.\\
|
|
|
-
|
|
|
-In this thesis we want to find the solutions of the SIR models belonging to the
|
|
|
-cases of the datasets. The SIR model is given through the system of differential
|
|
|
-equations (see~\Cref{eq:sir}), which describes the relations and fluctuations of
|
|
|
-the three compartments through transition rates $\beta$ and $\alpha$. As we
|
|
|
-explain in~\Cref{sec:pandemicModel:sir}, these parameters influence course of
|
|
|
-the pandemic, which is described by their respective model. Mathematically, when
|
|
|
-we find a pair of parameters for a dataset, these parameters describe a
|
|
|
-function, that solves the system of differential equations for our data set. A
|
|
|
-PINN finds parameters for a given set of differential equations by solving the
|
|
|
-inverse problem. As Shaier \etal~\cite{Shaier2021} propose, a DINN solves inverse
|
|
|
-problems by setting the parameters $\beta$ and $\alpha$ to trainable parameters
|
|
|
-$\widehat{\beta}$ and $\widehat{\alpha}$. As described in~\Cref{sec:pinn}, the
|
|
|
-DINN learns the parameters to optimize its model predictions $\hat{\boldsymbol{S}}$,
|
|
|
-$\hat{\boldsymbol{I}}$, and $\hat{\boldsymbol{R}}$, to fit the differential
|
|
|
-equations through the usage of their residuals and the given data.\\
|
|
|
-
|
|
|
-The PINN uses the loss function to determine how far it is away from the true
|
|
|
-solution. For the DINN~\cite{Shaier2021} this loss function includes the mean
|
|
|
-squared error of each residual in addition to the mean squared error of the
|
|
|
-model predictions concerning their respective true solutions. On the contrary to
|
|
|
-Shaier \etal, who use the set of differential equations of~\Cref{eq:sir} for
|
|
|
-their loss function, we use~\Cref{eq:modSIR}. The reason for this choice is that
|
|
|
-we encountered a better practical performance during our work than when using
|
|
|
-the equation, used by Shaier \etal. Let $N$ be the size of the population and
|
|
|
-$N_t$ the number of training point of the used dataset then,
|
|
|
-
|
|
|
+In the preceding section, we present the methods we employ to preprocess and
|
|
|
+format the data from the RKI in accordance with the specifications required for
|
|
|
+the work of this thesis. In this section, we will present the method we employ
|
|
|
+to identify the non-time-dependent SIR parameters $\beta$ and $\alpha$ for the
|
|
|
+data. As a foundation for our work, we draw upon the work of Shaier et
|
|
|
+al.~\cite{Shaier2021} to solve the SIR system of ODEs using PINNs.\\
|
|
|
+
|
|
|
+In order to conduct an analysis of a pandemic, it is necessary to have a quantifiable measure
|
|
|
+that indicates whether the disease in question has the capacity to spread rapidly through a
|
|
|
+population or is not successful in infecting a significant number of
|
|
|
+individuals. We employ the SIR model to construct an abstraction of the complex
|
|
|
+relations inherent to real-world pandemics. The SIR model divides the population into three
|
|
|
+compartments. It is accompanied by a system of ODEs that encapsulates the
|
|
|
+fluctuations and relationships between these compartments (see~\Cref{eq:sir}).
|
|
|
+The transmission rate $\beta$ and the recovery rate $\alpha$ work as the
|
|
|
+aforementioned quantifiers. We obtain data from the preprocessing stage. It
|
|
|
+provides insight into the progression of the COVID-19 pandemic in Germany.
|
|
|
+The objective is to identify a function that solves the system of differential
|
|
|
+equations of the SIR model, by returning the size of each compartment at a
|
|
|
+specific point in time. This function is supposed to be able to reconstruct the
|
|
|
+training data and is defined by the values of the transition rates $\beta$ and
|
|
|
+$\alpha$. From a mathematical and semantic perspective, it is essential to
|
|
|
+determine the values of these parameters.\\
|
|
|
+
|
|
|
+In order to ascertain the transmission rate $\beta$ and the recovery rate $\alpha$
|
|
|
+from the preprocessed RKI data of $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
|
|
|
+for a given set of time points, it is necessary to employ a data-driven approach that outputs
|
|
|
+a model prediction of $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
|
|
|
+for a set of time points, with the aim of minimizing the term,
|
|
|
+\begin{equation}\label{eq:SIR_obs_term}
|
|
|
+ \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
|
|
|
+\end{equation}
|
|
|
+for each data point in the training dataset of cardinality $N_t$, with
|
|
|
+$i\in\{1, \dots, N_t\}$. Moreover, the aforementioned parameters must satisfy the system
|
|
|
+of differential equations that govern the SIR model. For this reason, Shaier
|
|
|
+\etal~\cite{Shaier2021} utilize a PINN framework to satisfy both requirements.
|
|
|
+Their approach, which they refer to as the \emph{disease-informed neural network}
|
|
|
+(see~\Cref{sec:pinn:dinn}), takes epidemiological data as the input and returns
|
|
|
+the two transition rates $\alpha$ and $\beta$. This method
|
|
|
+achieves this by finding an approximate solution to the inverse problem of
|
|
|
+physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
|
|
|
+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
|
|
|
+by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
|
|
|
+closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
|
|
|
+for each time point. Second, it reduces the residuals of the ODEs that
|
|
|
+constitute the SIR model. While the forward problem concludes at this point, the
|
|
|
+inverse problem presupposes that at least one parameter is unknown. Thus, we designate the parameters
|
|
|
+$\beta$ and $\alpha$ as free, learnable parameters, $\widehat{\beta}$ and
|
|
|
+$\widehat{\alpha}$. These separate trainable parameters are values that are
|
|
|
+optimized during the training process and must fit the equations of the set of
|
|
|
+ODEs. Furthermore, we know that the transition rates
|
|
|
+do not surpass the value of $1$. Consequently, we force the value of both rates to be in a
|
|
|
+range of $(-1, 1)$. To this end, we regularize the parameters using the
|
|
|
+\emph{hyperbolic tangent}. This results in the terms,
|
|
|
+\begin{equation}
|
|
|
+ \widehat{\beta} = \tanh(\tilde{\beta}),\quad \widehat{\alpha} = \tanh(\tilde{\alpha}),
|
|
|
+\end{equation}
|
|
|
+where $\tilde{\beta}$ and $\tilde{\alpha}$ are the predicted values of the model
|
|
|
+and $\widehat{\beta}$ and $\widehat{\alpha}$ are regularized model predictions.\\
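+
+As a brief illustrative aside, which is our own sketch and not part of the
+approach of Shaier \etal, the chain rule shows how a loss reaches the
+unconstrained values through the hyperbolic tangent. Here, $\mathcal{L}$ is a
+placeholder for any differentiable loss and is assumed only for this example,
+\begin{equation}
+  \frac{\partial \mathcal{L}}{\partial \tilde{\beta}}
+  = \frac{\partial \mathcal{L}}{\partial \widehat{\beta}}\,\big(1 - \tanh^2(\tilde{\beta})\big)
+  = \frac{\partial \mathcal{L}}{\partial \widehat{\beta}}\,\big(1 - \widehat{\beta}^{\,2}\big),
+\end{equation}
+and analogously for $\tilde{\alpha}$. For instance, $\tilde{\beta} = 0.5$ gives
+$\widehat{\beta} = \tanh(0.5) \approx 0.462$ and a factor of
+$1 - \widehat{\beta}^{\,2} \approx 0.786$. Since this factor tends to zero as
+$|\tilde{\beta}|$ grows, updates can be expected to become small when a rate
+approaches the bounds of the interval.\\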
|
|
|
+
|
|
|
+The input data must include the time point $\boldsymbol{t}^{(i)}$ and its
|
|
|
+corresponding measured true values of $(\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)})$.
|
|
|
+In its forward pass, the PINN receives the time point $\boldsymbol{t}^{(i)}$ as its input, from which it
|
|
|
+calculates its model prediction $(\hat{\boldsymbol{S}}^{(i)}, \hat{\boldsymbol{I}}^{(i)}, \hat{\boldsymbol{R}}^{(i)})$
|
|
|
+based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the observation loss by taking the
|
|
|
+mean squared error of~\Cref{eq:SIR_obs_term} over all $N_t$ training samples.
|
|
|
+Thus,
|
|
|
+\begin{equation}
|
|
|
+ \mathcal{L}_{\text{obs}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\Big),
|
|
|
+\end{equation}
|
|
|
+is the term for the observation loss. Given superior performance in practical applications
|
|
|
+relative to the ODEs of~\Cref{eq:sir}, we utilize the ODEs of~\Cref{eq:modSIR}
|
|
|
+in our physics loss. In order for the model to learn the system of differential equations,
|
|
|
+it is necessary to obtain the residual of each ODE. The mean squared error of the residuals constitutes
|
|
|
+the physics loss $\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
|
|
|
+The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$ and the regularized model predictions of the parameters $\widehat{\beta}$ and $\widehat{\alpha}$. The residuals are given by,
|
|
|
+\begin{equation}
|
|
|
+ 0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}.
|
|
|
+\end{equation}
|
|
|
+Thus,
|
|
|
\begin{equation}
|
|
|
\begin{split}
|
|
|
\mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
|
|
|
+ + &\frac{1}{N_t}\sum_{i=1}^{N_t} \Big(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\Big),
|
|
|
\end{split}
|
|
|
\end{equation}
|
|
|
+is the equation of the total loss for our approach. This loss value is then
|
|
|
+back-propagated through our network, while the model predictions of the
|
|
|
+parameters $\beta$ and $\alpha$ are optimized using the loss as well.\\
|
|
|
|
|
|
-is the loss function, that employ to find the transition parameters $\beta$ and
|
|
|
-$alpha$ for the given dataset.
|
|
|
+While this section concentrates on finding the time-constant parameters
|
|
|
+$\beta$ and $\alpha$, the next section presents our approach to finding the
|
|
|
+reproduction number $\Rt$ on the German data of the RKI.
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\section{PINN for the reduced SIR Model 2}
|
|
|
-\label{sec:pinn:rsir}
|
|
|
+\label{sec:pinn:rsir}
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+% -------------------------------------------------------------------
|