11 months ago · 1ce0a959ca
--- a/chapters/chap02/chap02.tex
+++ b/chapters/chap02/chap02.tex
@@ -363,10 +363,10 @@ $t_0$, the \emph{reproduction number},
 
															 \end{equation}
														
 
															 represents the number of susceptible individuals, that one infectious individual
														
 
															 infects at the onset of the pandemic. In light of the effects of $\beta$ and
														
 
															-$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO > 1$ indicates that the
														
 
															+$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO < 1$ indicates that the
														
 
															 pandemic is emerging. In this scenario $\alpha$ is relatively low due to the
														
 
															 limited number of infections resulting from $I(t_0) << S(t_0)$.\\ Further,
														
 
															-$\RO < 1$ leads to the disease spreading rapidly across the population, with an
														
 
															+$\RO > 1$ leads to the disease spreading rapidly across the population, with an
														
 
															 increase in $I$ occurring at a high rate. Nevertheless, $\RO$ does not cover
														
 
															 the entire time span. For this reason, Millevoi \etal~\cite{Millevoi2023}
														
 
															 introduce $\Rt$ which has the same interpretation as $\RO$, with the exception
														
@@ -402,7 +402,7 @@ calculate the scaled groups,
 
															 \end{equation}
														
 
															 using a large constant scaling factor $C\in\mathbb{N}$. Applying this to the
														
 
															 variable $I$, results in,
														
 
															-\begin{equation}
														
 
															+\begin{equation}\label{eq:reduced_sir_ODE}
														
 
															   \frac{dI_s}{dt_s} = \alpha(t_f - t_0)(\Rt - 1)I_s(t_s),
														
 
															 \end{equation}
														
 
															 which is a further reduced version of~\Cref{eq:sir}. This less complex
														
--- a/chapters/chap03/chap03.tex
+++ b/chapters/chap03/chap03.tex
@@ -13,7 +13,7 @@ This chapter provides the methods, that we employ to address the problem that we
 
															 present in~\Cref{chap:introduction}.~\Cref{sec:preprocessing} outlines
														
 
															 our approaches for preprocessing of the available data and has two
														
 
															 sections. The first section describes the publicly available data provided by
														
 
															-the \emph{Robert Koch Institute} (RKI)\footnote[1]{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
														
 
															+the \emph{Robert Koch Institute} (RKI)\footnote{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
														
 
															 The second section outlines the techniques we use to process this data to fit
														
 
															 our project's requirements. Subsequently, we give a theoretical overview of the
														
 
															 PINN's that we employ. These latter sections, establish the foundation for the
														
@@ -72,7 +72,7 @@ a weekly basis.\\
 
															     \label{fig:rki_data}
														
 
															 \end{figure}
														
 
															-The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}.
														
 
															+The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}.
														
 
															 This dataset contains comprehensive data regarding the infections of each county
														
 
															 on a daily basis. The counties are encoded using the \emph{Community Identification Number}\footnote{\url{https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/_inhalt.html}},
														
 
															 wherein the first two digits denote the state, the third digit represents the
														
@@ -182,7 +182,7 @@ Their approach, which they refer to as the \emph{disease-informed neural network
 
															 the two transition rates $\alpha$ and $\beta$. This method
														
 
															 achieves this by finding an approximate solution of to the inverse problem of
														
 
															 physics-informed neural networks (see~\Cref{sec:pinn}). In terms of the terms of
														
 
															-the SIR model, a PINN addresses the inverse problemin two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
														
 
															+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes the mean of~\Cref{eq:SIR_obs_term}
														
 
															 by bringing the model predictions $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
														
 
															 closer to the actual values $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
														
 
															 for each time point. Second, it reduces the residuals of the ODEs that
														
@@ -223,7 +223,7 @@ The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}},
 
															 Thus,
														
 
															 \begin{equation}
														
 
															     \begin{split}
														
 
															-        \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
														
 
															+        \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
														
 
															         + &\frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
														
 
															     \end{split}
														
 
															 \end{equation}
														
@@ -237,9 +237,76 @@ reproduction number $\Rt$ on the German data of the RKI.
 
															 % -------------------------------------------------------------------
														
 
															-\section{PINN for the reduced SIR Model   2}
														
 
															+\section{Estimating the Reproduction Number using PINNs   2}
														
 
															 \label{sec:pinn:rsir}
														
 
															+The previous section shows the methodology we utilize to ascertain the
														
 
															+non-time-dependent transmission and recovery rates from a data set obtained from
														
 
															+the COVID-19 pandemic in Germany. In this section we employ PINNs to identify
														
 
															+the time-dependent reproduction number $\Rt$, while reducing the number of state
														
 
															+variables and the reliance on assumptions, by reducing the system of ODEs
														
 
															+comprising the SIR model. The methodology presented in this section is based on
														
 
															+the approach developed by Millevoi \etal~\cite{Millevoi2023}.\\
														
 
															+In real-world pandemics the rate of infection is affected by a multitude of
														
 
															+factors. Events such as rising awareness of the disease in the population, the
														
 
															+implementation of non-pharmaceutical mitigations such as social distancing
														
 
															+policies, and the emergence of new variants have an impact on the transmission
														
 
															+rate $\beta$. Accordingly, a transmission rate that is not time-dependent and
														
 
															+constant across the whole duration of the pandemic may not accurately reflect
														
 
															+the dynamics of the spread of a real-world disease. Although we set
														
 
															+the transmission rate to be time-dependent, the recovery time is assumed to be
														
 
															+relatively constant in time. The Robert Koch
														
 
															+Institute\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}
														
 
															+posits that the typical recovery period for the illness under normal conditions
														
 
															+is 14 days, while those individuals with severe cases take about 28 days to
														
 
															+recover. Given the negligible number of severe cases compared to the number of
														
 
															+normal cases, we can set the recovery time to $D=14$ days, resulting in $\alpha = \nicefrac{1}{14}$.
														
 
															+The reproduction number, $\Rt$ (see~\Cref{sec:pandemicModel:rsir}), represents
														
 
															+the number of infections that occur as a result of one infectious individual. It
														
 
															+indicates if a pandemic is emerging or if it is spreading rapidly through the
														
 
															+susceptible population. By inserting the definition in~\Cref{eq:repr_num} into the
														
 
															+system of ODEs of the SIR model, we obtain the single ODE in~\Cref{eq:reduced_sir_ODE}. In
														
 
															+order to solve this, we must identify a function that maps a time point to the
														
 
															+size of the infectious compartment and the specific reproduction number.\\
														
 
															+
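															+As a sketch of this derivation, assume that the definition of the reproduction
															+number has the form $\Rt = \nicefrac{\beta S(t)}{\alpha N}$. This form is an
															+assumption made for illustration and should be checked against~\Cref{eq:repr_num}.
															+The ODE for the infectious compartment then becomes,
															+\begin{equation*}
															+    \frac{dI}{dt} = \beta\frac{SI}{N} - \alpha I = \alpha\left(\frac{\beta S}{\alpha N} - 1\right)I = \alpha(\Rt - 1)I.
															+\end{equation*}
															+Assuming further the scaling $t_s = \nicefrac{(t - t_0)}{(t_f - t_0)}$ and
															+$I_s = \nicefrac{I}{C}$, the chain rule gives
															+$\frac{dI_s}{dt_s} = \frac{t_f - t_0}{C}\frac{dI}{dt}$, so that,
															+\begin{equation*}
															+    \frac{dI_s}{dt_s} = \alpha(t_f - t_0)(\Rt - 1)I_s(t_s),
															+\end{equation*}
															+which agrees with~\Cref{eq:reduced_sir_ODE}. As a consistency check, for
															+$I_s > 0$ the value $\Rt = 1$ yields $\frac{dI_s}{dt_s} = 0$, while $\Rt > 1$
															+yields growth and $\Rt < 1$ yields decay of $I_s$.
															+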
														
 
															+As with the constant transition rates, we employ a data-driven approach for
														
 
															+identifying the time-dependent reproduction number $\Rt$. The PINN approximates
														
 
															+the size of the infectious compartment, $\boldsymbol{I}$, with its model prediction $\hat{\boldsymbol{I}}$ by
														
 
															+minimizing the term,
														
 
															+\begin{equation}\label{eq:rSir_squared_err}
														
 
															+    \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2,
														
 
															+\end{equation}
														
 
															+for each $i\in\{1,\dots,N_t\}$. In order to identify the reproduction number, the
														
 
															+PINN minimizes the residuals of the ODE during the training process. The
														
 
															+training process is analogous to that of the PINN that identifies $\beta$
														
 
															+and $\alpha$ (see~\Cref{sec:pinn:sir}). The distinction lies in the absence of
														
 
															+trainable parameters and the inclusion of an additional state variable that
														
 
															+fluctuates in response to the input. While the state variable $\boldsymbol{I}$
														
 
															+is approximated using the error between the training data and the predicted
														
 
															+values, the state variable $\Rt$ is approximated exclusively based on the
														
 
															+residual of the ODE.\\
														
 
															+
														
 
															+The PINN receives the input of $\boldsymbol{t}^{(i)}$ and generates a prediction of
														
 
															+($\hat{\boldsymbol{I}}^{(i)}$, $\Rt^{(i)}$). As previously stated, the PINN minimizes
														
 
															+the distance between the true values of $\boldsymbol{I}$ and the model predictions
														
 
															+$\hat{\boldsymbol{I}}$ by minimizing the mean squared error. Consequently, the
														
 
															+observation loss function is defined by,
														
 
															+\begin{equation}
														
 
															+    \mathcal{L}_{\text{rSIR}}(\boldsymbol{I}, \hat{\boldsymbol{I}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
														
 
															+\end{equation}
														
 
															+The physics loss function is defined as the squared error of the residual of the
														
 
															+ODE. The residual of the reduced SIR model is given by,
														
 
															+\begin{equation}
														
 
															+    0 = \frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s).
														
 
															+\end{equation}
														
 
															+By combining the observation loss with the physics loss, we arrive at the total loss for
														
 
															+the PINN that solves the reduced SIR model, which is given by,
														
 
															+\begin{equation}
														
 
															+    \mathcal{L}_{\text{rSIR}}(\boldsymbol{t}, \boldsymbol{I}, \hat{\boldsymbol{I}}) = \bigg\|\frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s)\bigg\|^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
														
 
															+\end{equation}
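															+As one possible reading of the first term, which is an assumption rather than
															+a statement of the actual implementation, the residual can be evaluated at the
															+$N_t$ scaled time points $t_s^{(i)}$ with the network outputs
															+$(\hat{I}_s^{(i)}, \Rt^{(i)})$ and averaged,
															+\begin{equation*}
															+    \frac{1}{N_t}\sum_{i=1}^{N_t}\bigg|\frac{d\hat{I}_s}{dt_s}\Big(t_s^{(i)}\Big) - \alpha(t_f - t_0)\big(\Rt^{(i)} - 1\big)\hat{I}_s^{(i)}\bigg|^2,
															+\end{equation*}
															+where the derivative $\nicefrac{d\hat{I}_s}{dt_s}$ would be obtained by
															+differentiating the network output with respect to its input. The symbols
															+$\hat{I}_s^{(i)}$ and $t_s^{(i)}$ are introduced here for illustration only.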
														
 
															+
														
 
															+The process of determining the reproduction number, along with the other
														
 
															+techniques that this chapter presents, finds application in the following chapter.
														
 
															 % -------------------------------------------------------------------
														
--- a/thesis.pdf
+++ b/thesis.pdf