add finished methods dinn part

Phillip Rothenbeck 9 months ago
parent
commit
046b801c22
3 changed files with 93 additions and 45 deletions
  1. 1 0
      .gitignore
  2. 92 45
      chapters/chap03/chap03.tex
  3. BIN
      thesis.pdf

+ 1 - 0
.gitignore

@@ -1,3 +1,4 @@
+*.synctex(busy)
 *.aux
 *.lof
 *.log

+ 92 - 45
chapters/chap03/chap03.tex

@@ -124,7 +124,7 @@ and releases them into the removed group $D$ days later.\\
     \includegraphics[width=\textwidth]{recovery_queue.pdf}
     \caption{The recovery queue takes in the infected individuals for the $k$'th
         day and releases them $D$ days later into the removed group.}
-    \label{fig:rki_data}
+    \label{fig:recovery_queue}
 \end{figure}
 
 In order to solve the reduced SIR model, we employ a similar algorithm to that
@@ -141,58 +141,105 @@ employed by the PINN models, which we describe in the subsequent section.
 \section{Estimating Epidemiological Parameters using PINNs  3}
 \label{sec:pinn:sir}
 
-In the last section we present the methods, we use to transform the RKI data
-(see~\Cref{sec:preprocessing}) into the format that is used by the PINNs to seek
-a solution for the SIR models. In this section we lay out the methodology we
-employ for this thesis concerning PINNs for SIR models.\\
-
-The data, which is yielded by the preprocessing, is in the structure of pairs of
-$(\boldsymbol{t^{(i)}}, (\boldsymbol{S^{(i)}},\boldsymbol{I^{(i)}},\boldsymbol{R^{(i)}}))$,
-which contain the sizes of the susceptible, infectious, and removed compartments
-together with their respective time point with the index $i$. This means that
-this training data contains the measured solutions of the functions $S(t)$,
-$I(t),$ and $R(t)$, which a neural network may use to approximate these
-functions. Furthermore, a PINN can carry out this task with a higher precision
-for more complex problems were the unknown function is more complex and just a
-system of differential equations is given.\\
-
-In this thesis we want to find the solutions of the SIR models belonging to the
-cases of the datasets. The SIR model is given through the system of differential
-equations (see~\Cref{eq:sir}), which describes the relations and fluctuations of
-the three compartments through transition rates $\beta$ and $\alpha$. As we
-explain in~\Cref{sec:pandemicModel:sir}, these parameters influence course of
-the pandemic, which is described by their respective model. Mathematically, when
-we find a pair of parameters for a dataset, these parameters describe a
-function, that solves the system of differential equations for our data set. A
-PINN finds parameters for a given set of differential equations by solving the
-inverse problem. As Shaier \etal~\cite{Shaier2021} propose, a DINN solves inverse
-problems by setting the parameters $\beta$ and $\alpha$ to trainable parameters
-$\widehat{\beta}$ and $\widehat{\alpha}$. As described in~\Cref{sec:pinn}, the
-DINN learns the parameters to optimize its model predictions $\hat{\boldsymbol{S}}$,
-$\hat{\boldsymbol{I}}$, and $\hat{\boldsymbol{R}}$, to fit the differential
-equations through the usage of their residuals and the given data.\\
-
-The PINN uses the loss function to determine how far it is away from the true
-solution. For the DINN~\cite{Shaier2021} this loss function includes the mean
-squared error of each residual in addition to the mean squared error of the
-model predictions concerning their respective true solutions. On the contrary to
-Shaier \etal, who use the set of differential equations of~\Cref{eq:sir} for
-their loss function, we use~\Cref{eq:modSIR}. The reason for this choice is that
-we encountered a better practical performance during our work than when using
-the equation, used by Shaier \etal. Let $N$ be the size of the population and
-$N_t$ the number of training point of the used dataset then,
-
+In the preceding section, we presented the methods we employ to preprocess and
+format the RKI data as required for this thesis. In this section, we present
+the method we employ to identify the time-constant SIR parameters $\beta$ and
+$\alpha$ for the data. As a foundation, we draw upon the work of Shaier
+\etal~\cite{Shaier2021} to solve the SIR system of ODEs using PINNs.\\
+
+In order to analyze a pandemic, it is necessary to have a quantifiable measure
+that indicates whether the disease in question has the capacity to spread
+rapidly through a population or fails to infect a significant number of
+individuals. We employ the SIR model to construct an abstraction of the complex
+relations inherent to real-world pandemics. The SIR model divides the population into three
+compartments. It is accompanied by a system of ODEs that encapsulates the
+fluctuations and relationships between these compartments (see~\Cref{eq:sir}).
+The transmission rate $\beta$ and the recovery rate $\alpha$ serve as the
+aforementioned measures. The data we obtain from the preprocessing stage
+provides insight into the progression of the COVID-19 pandemic in Germany.
+The objective is to identify a function that solves the system of differential
+equations of the SIR model by returning the size of each compartment at a
+specific point in time. This function should reconstruct the
+training data and is defined by the values of the transition rates $\beta$ and
+$\alpha$. From both a mathematical and a semantic perspective, it is therefore
+essential to determine the values of these parameters.\\
+
+In order to ascertain the transmission rate $\beta$ and the recovery rate $\alpha$
+from the preprocessed RKI data of $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
+for a given set of time points, it is necessary to employ a data-driven approach that outputs
+a model prediction of $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
+for a set of time points, with the aim of minimizing the term,
+\begin{equation}\label{eq:SIR_obs_term}
+    \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
+\end{equation}
+for each data point of the training dataset of cardinality $N_t$, with
+$i\in\{1, \dots, N_t\}$. Moreover, the aforementioned parameters must satisfy the system
+of differential equations that govern the SIR model. For this reason, Shaier
+\etal~\cite{Shaier2021} utilize a PINN framework to satisfy both requirements.
+Their approach, which they refer to as the \emph{disease-informed neural network}
+(see~\Cref{sec:pinn:dinn}), takes epidemiological data as the input and returns
+the two transition rates $\alpha$ and $\beta$. The method
+achieves this by finding an approximate solution to the inverse problem of
+physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
+by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
+closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
+for each time point. Second, it reduces the residuals of the ODEs that
+constitute the SIR model. While the forward problem concludes at this point, the
+inverse problem presupposes that a parameter is unknown. Thus, we designate the parameters
+$\beta$ and $\alpha$ as free, learnable parameters, $\widehat{\beta}$ and
+$\widehat{\alpha}$. These separate trainable parameters are
+optimized during the training process and must fit the equations of the system of
+ODEs. Furthermore, we know that the transition rates
+do not surpass the value of $1$. Consequently, we force the value of both rates to lie in the
+range $(-1, 1)$. Therefore, we regularize the parameters using the
+\emph{hyperbolic tangent}. This results in the terms,
+\begin{equation}
+    \widehat{\beta} = \tanh(\tilde{\beta}),\quad \widehat{\alpha} = \tanh(\tilde{\alpha}),
+\end{equation}
+where $\tilde{\beta}$ and $\tilde{\alpha}$ are the predicted values of the model
+and $\widehat{\beta}$ and $\widehat{\alpha}$ are regularized model predictions.\\
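+
+As a brief aside, the following worked step sketches why this reparametrization
+remains trainable by gradient descent. It relies only on the standard derivative
+of the hyperbolic tangent; the symbol $\mathcal{L}$ is used here as a placeholder
+for whichever loss is minimized and is an assumption of this illustration. By the
+chain rule,
+\begin{equation}
+    \frac{\partial \mathcal{L}}{\partial \tilde{\beta}} = \frac{\partial \mathcal{L}}{\partial \widehat{\beta}}\,\big(1-\tanh^2(\tilde{\beta})\big) = \frac{\partial \mathcal{L}}{\partial \widehat{\beta}}\,\big(1-\widehat{\beta}^2\big),
+\end{equation}
+and analogously for $\tilde{\alpha}$. The factor $1-\widehat{\beta}^2$ is strictly
+positive for every finite $\tilde{\beta}$, so the gradient is never cut off
+entirely, although it shrinks as $\widehat{\beta}$ approaches $\pm 1$.\\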
+
+The input data must include the time point $\boldsymbol{t}^{(i)}$ and its
+corresponding measured true values of $(\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)})$.
+In its forward pass, the PINN receives the time point $\boldsymbol{t}^{(i)}$ as its input, from which it
+calculates its model prediction $(\hat{\boldsymbol{S}}^{(i)}, \hat{\boldsymbol{I}}^{(i)}, \hat{\boldsymbol{R}}^{(i)})$
+based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the observation loss by taking the
+mean squared error of~\Cref{eq:SIR_obs_term} over all $N_t$ training samples.
+Therefore,
+\begin{equation}
+    \mathcal{L}_{\text{obs}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
+\end{equation}
+is the term for the observation loss. Since the ODEs of~\Cref{eq:modSIR} performed better in practice
+than the ODEs of~\Cref{eq:sir}, we utilize the ODEs of~\Cref{eq:modSIR}
+in our physics loss. In order for the model to learn the system of differential equations,
+it is necessary to obtain the residual of each ODE. The mean squared error of the residuals constitutes
+the physics loss $\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
+The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$ and the regularized model predictions of the parameters $\widehat{\beta}$ and $\widehat{\alpha}$. The residuals are given by,
+\begin{equation}
+    0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}.
+\end{equation}
+Thus,
 \begin{equation}
     \begin{split}
         \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
         + &\frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
     \end{split}
 \end{equation}
+is the total loss for our approach. This loss value is then
+back-propagated through our network, and the trainable
+parameters $\widehat{\beta}$ and $\widehat{\alpha}$ are optimized with the same loss.\\
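+
+Read compactly, and under the assumption that the three squared residual terms
+above make up the physics loss, the total loss can be sketched as the sum
+\begin{equation}
+    \mathcal{L}_{\text{SIR}} = \mathcal{L}_{\text{physics}} + \mathcal{L}_{\text{obs}}.
+\end{equation}
+In this reading, the network parameters $\theta$ receive gradient contributions
+from both terms, whereas $\widehat{\beta}$ and $\widehat{\alpha}$ appear only in
+the residuals and are therefore updated through the physics loss alone.\\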
 
-is the loss function, that employ to find the transition parameters $\beta$ and
-$alpha$ for the given dataset.
+While this section concentrates on finding the time-constant parameters
+$\beta$ and $\alpha$, the next section presents our approach to finding the
+reproduction number $\Rt$ for the German RKI data.
 
 % -------------------------------------------------------------------
 
 \section{PINN for the reduced SIR Model   2}
-\label{sec:pinn:rsir}
+\label{sec:pinn:rsir}
+
+
+
+% -------------------------------------------------------------------

BIN
thesis.pdf