9 months ago · 046b801c22
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,4 @@
 
				+*.synctex(busy)
			
 
				 *.aux
			
 
				 *.lof
			
 
				 *.log
			
--- a/chapters/chap03/chap03.tex
+++ b/chapters/chap03/chap03.tex
@@ -124,7 +124,7 @@ and releases them into the removed group $D$ days later.\\
 
				     \includegraphics[width=\textwidth]{recovery_queue.pdf}
			
 
				     \caption{The recovery queue takes in the infected individuals for the $k$'th
			
 
				         day and releases them $D$ days later into the removed group.}
			
 
				-    \label{fig:rki_data}
			
 
				+    \label{fig:recovery_queue}
			
 
				 \end{figure}
			
 
				 
			
 
				 In order to solve the reduced SIR model, we employ a similar algorithm to that
			
@@ -141,58 +141,105 @@ employed by the PINN models, which we describe in the subsequent section.
 
				 \section{Estimating Epidemiological Parameters using PINNs  3}
			
 
				 \label{sec:pinn:sir}
			
 
				 
			
 
				-In the last section we present the methods, we use to transform the RKI data
			
 
				-(see~\Cref{sec:preprocessing}) into the format that is used by the PINNs to seek
			
 
				-a solution for the SIR models. In this section we lay out the methodology we
			
 
				-employ for this thesis concerning PINNs for SIR models.\\
			
 
				-
			
 
				-The data, which is yielded by the preprocessing, is in the structure of pairs of
			
 
				-$(\boldsymbol{t^{(i)}}, (\boldsymbol{S^{(i)}},\boldsymbol{I^{(i)}},\boldsymbol{R^{(i)}}))$,
			
 
				-which contain the sizes of the susceptible, infectious, and removed compartments
			
 
				-together with their respective time point with the index $i$. This means that
			
 
				-this training data contains the measured solutions of the functions $S(t)$,
			
 
				-$I(t),$ and $R(t)$, which a neural network may use to approximate these
			
 
				-functions. Furthermore, a PINN can carry out this task with a higher precision
			
 
				-for more complex problems were the unknown function is more complex and just a
			
 
				-system of differential equations is given.\\
			
 
				-
			
 
				-In this thesis we want to find the solutions of the SIR models belonging to the
			
 
				-cases of the datasets. The SIR model is given through the system of differential
			
 
				-equations (see~\Cref{eq:sir}), which describes the relations and fluctuations of
			
 
				-the three compartments through transition rates $\beta$ and $\alpha$. As we
			
 
				-explain in~\Cref{sec:pandemicModel:sir}, these parameters influence course of
			
 
				-the pandemic, which is described by their respective model. Mathematically, when
			
 
				-we find a pair of parameters for a dataset, these parameters describe a
			
 
				-function, that solves the system of differential equations for our data set. A
			
 
				-PINN finds parameters for a given set of differential equations by solving the
			
 
				-inverse problem. As Shaier \etal~\cite{Shaier2021} propose, a DINN solves inverse
			
 
				-problems by setting the parameters $\beta$ and $\alpha$ to trainable parameters
			
 
				-$\widehat{\beta}$ and $\widehat{\alpha}$. As described in~\Cref{sec:pinn}, the
			
 
				-DINN learns the parameters to optimize its model predictions $\hat{\boldsymbol{S}}$,
			
 
				-$\hat{\boldsymbol{I}}$, and $\hat{\boldsymbol{R}}$, to fit the differential
			
 
				-equations through the usage of their residuals and the given data.\\
			
 
				-
			
 
				-The PINN uses the loss function to determine how far it is away from the true
			
 
				-solution. For the DINN~\cite{Shaier2021} this loss function includes the mean
			
 
				-squared error of each residual in addition to the mean squared error of the
			
 
				-model predictions concerning their respective true solutions. On the contrary to
			
 
				-Shaier \etal, who use the set of differential equations of~\Cref{eq:sir} for
			
 
				-their loss function, we use~\Cref{eq:modSIR}. The reason for this choice is that
			
 
				-we encountered a better practical performance during our work than when using
			
 
				-the equation, used by Shaier \etal. Let $N$ be the size of the population and
			
 
				-$N_t$ the number of training point of the used dataset then,
			
 
				-
			
 
				+In the preceding section, we present the methods we employ to preprocess and
			
 
				+format the data from the RKI in accordance with the specifications required for
			
 
				+the work of this thesis. In this section, we will present the method we employ
			
 
				+to identify the non-time-dependent SIR parameters $\beta$ and $\alpha$ for the
			
 
				+data. As a foundation for our work, we draw upon the work of Shaier
			
 
				+\etal~\cite{Shaier2021} to solve the SIR system of ODEs using PINNs.\\
			
 
				+
			
 
				+In order to conduct an analysis of a pandemic, it is necessary to have a quantifiable measure
			
 
				+that indicates whether the disease in question has the capacity to spread rapidly through a
			
 
				+population or fails to infect a significant number of
			
 
				+individuals. We employ the SIR model to construct an abstraction of the complex
			
 
				+relations inherent to real-world pandemics. The SIR model divides the population into three
			
 
				+compartments. It is accompanied by a system of ODEs that encapsulates the
			
 
				+fluctuations and relationships between these compartments (see~\Cref{eq:sir}).
			
 
				+The transmission rate $\beta$ and the recovery rate $\alpha$ work as the
			
 
				+aforementioned quantifiers. We obtain data from the preprocessing stage. It
			
 
				+provides insight into the progression of the COVID-19 pandemic in Germany.
			
 
				+The objective is to identify a function that solves the system of differential
			
 
				+equations of the SIR model, by returning the size of each compartment at a
			
 
				+specific point in time. This function is supposed to be able to reconstruct the
			
 
				+training data and is defined by the values of the transition rates $\beta$ and
			
 
				+$\alpha$. From a mathematical and semantic perspective, it is essential to
			
 
				+determine the values of these parameters.\\
			
 
				+
			
 
				+In order to ascertain the transmission rate $\beta$ and the recovery rate $\alpha$
			
 
				+from the preprocessed RKI data of $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
			
 
				+for a given set of time points, it is necessary to employ a data-driven approach that outputs
			
 
				+a model prediction of $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
			
 
				+for a set of time points, with the aim of minimizing the term,
			
 
				+\begin{equation}\label{eq:SIR_obs_term}
			
 
				+    \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
			
 
				+\end{equation}
			
 
				+for each data point in the training dataset of cardinality $N_t$ and with
			
 
				+$i\in\{1, ..., N_t\}$. Moreover, the aforementioned parameters must satisfy the system
			
 
				+of differential equations that govern the SIR model. For this reason, Shaier
			
 
				+\etal~\cite{Shaier2021} utilize a PINN framework to satisfy both requirements.
			
 
				+Their approach, which they refer to as the \emph{disease-informed neural network}
			
 
				+(see~\Cref{sec:pinn:dinn}), takes epidemiological data as the input and returns
			
 
				+the two transition rates $\alpha$ and $\beta$. This method
			
 
				+achieves this by finding an approximate solution to the inverse problem of
			
 
				+physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
			
 
				+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
			
 
				+by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
			
 
				+closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
			
 
				+for each time point. Second, it reduces the residuals of the ODEs that
			
 
				+constitute the SIR model. While the forward problem concludes at this point, the
			
 
				+inverse problem presupposes that the parameters are unknown. Thus, we designate the parameters
			
 
				+$\beta$ and $\alpha$ as free, learnable parameters, $\widehat{\beta}$ and
			
 
				+$\widehat{\alpha}$. These separate trainable parameters are values that are
			
 
				+optimized during the training process and must fit the equations of the set of
			
 
				+ODEs. Furthermore, we know that the transition rates
			
 
				+do not surpass the value of $1$. Consequently, we force the value of both rates to lie in the
			
 
				+range $(-1, 1)$. To this end, we regularize the parameters using the
			
 
				+hyperbolic tangent. This results in the terms,
			
 
				+\begin{equation}
			
 
				+    \widehat{\beta} = \tanh(\tilde{\beta}),\quad \widehat{\alpha} = \tanh(\tilde{\alpha}),
			
 
				+\end{equation}
			
 
				+where $\tilde{\beta}$ and $\tilde{\alpha}$ are the predicted values of the model
			
 
				+and $\widehat{\beta}$ and $\widehat{\alpha}$ are regularized model predictions.\\
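			
 
				+
			
 
				+As a brief illustration of this regularization (the numerical values are
			
 
				+rounded and serve only as an example), the hyperbolic tangent maps any real
			
 
				+value into the open interval $(-1, 1)$, for instance,
			
 
				+\begin{equation*}
			
 
				+    \tanh(0) = 0,\quad \tanh(0.5) \approx 0.462,\quad \tanh(2) \approx 0.964.
			
 
				+\end{equation*}
			
 
				+By the chain rule, the gradient of any differentiable loss $\mathcal{L}$ with
			
 
				+respect to the unconstrained value is scaled by the derivative of the hyperbolic tangent,
			
 
				+\begin{equation*}
			
 
				+    \frac{\partial\mathcal{L}}{\partial\tilde{\beta}} = \big(1-\widehat{\beta}^2\big)\frac{\partial\mathcal{L}}{\partial\widehat{\beta}},
			
 
				+\end{equation*}
			
 
				+and analogously for $\tilde{\alpha}$, so that the updates shrink as a rate approaches the boundary of the interval.\\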
			
 
				+
			
 
				+The input data must include the time point $\boldsymbol{t}^{(i)}$ and its
			
 
				+corresponding measured true values of $(\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)})$.
			
 
				+In its forward pass, the PINN receives the time point $\boldsymbol{t}^{(i)}$ as its input, from which it
			
 
				+calculates its model prediction $(\hat{\boldsymbol{S}}^{(i)}, \hat{\boldsymbol{I}}^{(i)}, \hat{\boldsymbol{R}}^{(i)})$
			
 
				+based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the observation loss by taking the
			
 
				+mean squared error of~\Cref{eq:SIR_obs_term} over all $N_t$ training samples.
			
 
				+The observation loss is therefore given by
			
 
				+\begin{equation}
			
 
				+    \mathcal{L}_{\text{obs}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \bigg(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\bigg).
			
 
				+\end{equation}
			
 
				+Because the ODEs of~\Cref{eq:modSIR} performed better in practice than those
			
 
				+of~\Cref{eq:sir}, we utilize the ODEs of~\Cref{eq:modSIR}
			
 
				+in our physics loss. In order for the model to learn the system of differential equations,
			
 
				+it is necessary to obtain the residual of each ODE. The mean squared error of the residuals constitutes
			
 
				+the physics loss $\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
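			
 
				+Written out, and denoting the residuals of the three ODEs at the time point
			
 
				+$\boldsymbol{t}^{(i)}$ by $r_S^{(i)}$, $r_I^{(i)}$, and $r_R^{(i)}$ (symbols we introduce here for
			
 
				+illustration only), this mean squared error can be sketched as
			
 
				+\begin{equation*}
			
 
				+    \mathcal{L}_{\text{physics}} = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big(\big|r_S^{(i)}\big|^2 + \big|r_I^{(i)}\big|^2 + \big|r_R^{(i)}\big|^2\Big),
			
 
				+\end{equation*}
			
 
				+under the assumption that the residuals are evaluated at the same $N_t$ time points as the observation loss.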
			
 
				+The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$ and the regularized model predictions of the parameters $\widehat{\beta}$ and $\widehat{\alpha}$. The residuals are given by,
			
 
				+\begin{equation}
			
 
				+    0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}.
			
 
				+\end{equation}
			
 
				+Thus,
			
 
				 \begin{equation}
			
 
				     \begin{split}
			
 
				         \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
			
 
				         + &\frac{1}{N_t}\sum_{i=1}^{N_t} \bigg(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\bigg),
			
 
				     \end{split}
			
 
				 \end{equation}
			
 
				+is the equation of the total loss for our approach. This loss value is then
			
 
				+back-propagated through our network, while the model predictions of the
			
 
				+parameters $\beta$ and $\alpha$ are optimized using the loss as well.\\
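			
 
				+
			
 
				+As an illustration of this joint optimization, a single training step can be
			
 
				+sketched as follows. We assume plain gradient descent with a learning rate
			
 
				+$\eta$; both the update rule and $\eta$ are illustrative assumptions, and the
			
 
				+optimizer used in practice may differ:
			
 
				+\begin{equation*}
			
 
				+    \theta \leftarrow \theta - \eta\,\nabla_{\theta}\mathcal{L}_{\text{SIR}},\quad \tilde{\beta} \leftarrow \tilde{\beta} - \eta\,\frac{\partial\mathcal{L}_{\text{SIR}}}{\partial\tilde{\beta}},\quad \tilde{\alpha} \leftarrow \tilde{\alpha} - \eta\,\frac{\partial\mathcal{L}_{\text{SIR}}}{\partial\tilde{\alpha}}.
			
 
				+\end{equation*}
			
 
				+The rates that enter the residuals are then recomputed as
			
 
				+$\widehat{\beta} = \tanh(\tilde{\beta})$ and $\widehat{\alpha} = \tanh(\tilde{\alpha})$.\\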
			
 
				 
			
 
				-is the loss function, that employ to find the transition parameters $\beta$ and
			
 
				-$alpha$ for the given dataset.
			
 
				+While this section concentrates on finding the time-constant parameters
			
 
				+$\beta$ and $\alpha$, the next section presents our approach to finding the
			
 
				+reproduction number $\Rt$ for the German data of the RKI.
			
 
				 
			
 
				 % -------------------------------------------------------------------
			
 
				 
			
 
				 \section{PINN for the reduced SIR Model   2}
			
 
				-\label{sec:pinn:rsir}
			
 
				+\label{sec:pinn:rsir}
			
 
				+
			
 
				+
			
 
				+
			
 
				+% -------------------------------------------------------------------
			
--- a/thesis.pdf
+++ b/thesis.pdf