add section 3.3 finished

Phillip Rothenbeck 11 months ago
parent
commit
1ce0a959ca
3 changed files with 75 additions and 8 deletions
  1. 3 3
      chapters/chap02/chap02.tex
  2. 72 5
      chapters/chap03/chap03.tex
  3. BIN
      thesis.pdf

+ 3 - 3
chapters/chap02/chap02.tex

@@ -363,10 +363,10 @@ $t_0$, the \emph{reproduction number},
 \end{equation}
 represents the number of susceptible individuals that one infectious individual
 infects at the onset of the pandemic. In light of the effects of $\beta$ and
-$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO > 1$ indicates that the
+$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO < 1$ indicates that the
 pandemic is emerging. In this scenario $\alpha$ is relatively low due to the
 limited number of infections resulting from $I(t_0) \ll S(t_0)$.\\ Further,
-$\RO < 1$ leads to the disease spreading rapidly across the population, with an
+$\RO > 1$ leads to the disease spreading rapidly across the population, with an
 increase in $I$ occurring at a high rate. Nevertheless, $\RO$ does not cover
 the entire time span. For this reason, Millevoi \etal~\cite{Millevoi2023}
 introduce $\Rt$ which has the same interpretation as $\RO$, with the exception
@@ -402,7 +402,7 @@ calculate the scaled groups,
 \end{equation}
 using a large constant scaling factor $C\in\mathbb{N}$. Applying this to the
 variable $I$ results in,
-\begin{equation}
+\begin{equation}\label{eq:reduced_sir_ODE}
   \frac{dI_s}{dt_s} = \alpha(t_f - t_0)(\Rt - 1)I_s(t_s),
 \end{equation}
 which is a further reduced version of~\Cref{eq:sir}. This less complex
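
For orientation, the reduced equation labelled in this hunk can be recovered in two short steps. The following is only a sketch under assumptions that the diff does not show: that \Cref{eq:repr_num} defines $\Rt = \beta S(t) / (\alpha N)$, and that the scaling uses $t_s = (t - t_0)/(t_f - t_0)$ and $I_s = I/C$.

```latex
% Sketch under the assumptions named above (definition of \Rt and the scaling).
\begin{align*}
  \frac{dI}{dt} &= \beta\frac{SI}{N} - \alpha I
                 = \alpha\left(\frac{\beta S}{\alpha N} - 1\right) I
                 = \alpha(\Rt - 1)\,I, \\
  \frac{dI_s}{dt_s} &= \frac{t_f - t_0}{C}\,\frac{dI}{dt}
                     = \alpha(t_f - t_0)(\Rt - 1)\,I_s(t_s).
\end{align*}
```

The first line factors out $\alpha$, and the second applies the chain rule for the rescaled time and divides by $C$, so the constant $C$ cancels on the right-hand side.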

+ 72 - 5
chapters/chap03/chap03.tex

@@ -13,7 +13,7 @@ This chapter provides the methods, that we employ to address the problem that we
 present in~\Cref{chap:introduction}.~\Cref{sec:preprocessing} outlines
 our approaches for preprocessing of the available data and has two
 sections. The first section describes the publicly available data provided by
-the \emph{Robert Koch Institute} (RKI)\footnote[1]{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
+the \emph{Robert Koch Institute} (RKI)\footnote{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
 The second section outlines the techniques we use to process this data to fit
 our project's requirements. Subsequently, we give a theoretical overview of the
 PINNs that we employ. These latter sections establish the foundation for the
@@ -72,7 +72,7 @@ a weekly basis.\\
     \label{fig:rki_data}
 \end{figure}
 
-The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}.
+The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}.
 This dataset contains comprehensive data regarding the infections of each county
 on a daily basis. The counties are encoded using the \emph{Community Identification Number}\footnote{\url{https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/_inhalt.html}},
 wherein the first two digits denote the state, the third digit represents the
@@ -182,7 +182,7 @@ Their approach, which they refer to as the \emph{disease-informed neural network
 the two transition rates $\alpha$ and $\beta$. This method
 achieves this by finding an approximate solution to the inverse problem of
 physics-informed neural networks (see~\Cref{sec:pinn}). In terms of
-the SIR model, a PINN addresses the inverse problemin two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
+the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes the mean of~\Cref{eq:SIR_obs_term}
 by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
 closer to the actual values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
 for each time point. Second, it reduces the residuals of the ODEs that
@@ -223,7 +223,7 @@ The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}},
 Thus,
 \begin{equation}
     \begin{split}
-        \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
+        \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2 + \bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
         + &\frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2  + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
     \end{split}
 \end{equation}
@@ -237,9 +237,76 @@ reproduction number $\Rt$ on the German data of the RKI.
 
 
 % -------------------------------------------------------------------
 
-\section{PINN for the reduced SIR Model   2}
+\section{Estimating the Reproduction Number using PINNs   2}
 \label{sec:pinn:rsir}
 
+The previous section shows the methodology we utilize to ascertain the
+non-time-dependent transmission and recovery rates from a data set obtained from
+the COVID-19 pandemic in Germany. In this section we employ PINNs to identify
+the time-dependent reproduction number $\Rt$, while reducing the number of state
+variables and the reliance on assumptions, by reducing the system of ODEs
+comprising the SIR model. The methodology presented in this section is based on
+the approach developed by Millevoi \etal~\cite{Millevoi2023}.\\
 
 
+In real-world pandemics the rate of infection is affected by a multitude of
+factors. Events like the rising awareness of the disease in the population, the
+implementation of non-pharmaceutical mitigations such as social distancing
+policies, and the emergence of new variants have an impact on the transmission
+rate $\beta$. Accordingly, a transmission rate that is not time-dependent and
+constant across the whole duration of the pandemic may not accurately reflect
+the dynamics of the spread of a real-world disease. Although we set
+the transmission rate to be time-dependent, the recovery time is assumed to be
+relatively constant in time. The Robert Koch
+Institute\footnote{\url{https://github.com/robert-koch-institut/SARS-CoV-2-Infektionen_in_Deutschland.git}}
+posits that the typical recovery period for the illness under normal conditions
+is 14 days, while those individuals with severe cases take about 28 days to
+recover. Given the negligible number of severe cases compared to the number of
+normal cases, we can set the recovery time to $D=14$ resulting in $\alpha = \nicefrac{1}{14}$.
+The reproduction number, $\Rt$ (see~\Cref{sec:pandemicModel:rsir}), represents
+the number of infections that occur as a result of one infectious individual. It
+indicates if a pandemic is emerging or if it is spreading rapidly through the
+susceptible population. By inserting the definition in~\Cref{eq:repr_num} into the
+system of ODEs of the SIR model, we can derive~\Cref{eq:reduced_sir_ODE}. In
+order to solve this, we must identify a function that maps a time point to the
+size of the infectious compartment and the specific reproduction number.\\
+
+As with the constant transition rates, we employ a data-driven approach for
+identifying the time-dependent reproduction number $\Rt$. The PINN approximates
+the size of the infectious compartment, $\boldsymbol{I}$, with its model
+prediction $\hat{\boldsymbol{I}}$ by minimizing the term,
+\begin{equation}\label{eq:rSir_squared_err}
+    \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2,
+\end{equation}
+for each $i\in\{1,\dots,N_t\}$. In order to identify the reproduction number, the
+PINN minimizes the residuals of the ODE during the training process. The
+training process is analogous to that of the PINN that identifies $\beta$
+and $\alpha$ (see~\Cref{sec:pinn:sir}). The distinction lies in the absence of
+trainable parameters and the inclusion of an additional state variable that
+fluctuates in response to the input. While the state variable $\boldsymbol{I}$
+is approximated using the error between the training data and the predicted
+values, the state variable $\Rt$ is approximated exclusively based on the
+residual of the ODE.\\
+
+The PINN receives the input of $\boldsymbol{t}^{(i)}$ and generates a prediction of
+($\hat{\boldsymbol{I}}^{(i)}$, $\Rt^{(i)}$). As previously stated, the PINN minimizes
+the distance between the true values of $\boldsymbol{I}$ and the model predictions
+$\hat{\boldsymbol{I}}$ by minimizing the mean squared error. Consequently, the
+observation loss function is defined by,
+\begin{equation}
+    \mathcal{L}_{\text{rSIR}}(\boldsymbol{I}, \hat{\boldsymbol{I}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
+\end{equation}
+The physics loss function is defined as the squared error of the residual of the
+ODE. The residual of the reduced SIR model is given by,
+\begin{equation}
+    0 = \frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s).
+\end{equation}
+By combining the observation loss with the physics loss, we arrive at the total loss for
+the PINN that solves the reduced SIR model, which is given by,
+\begin{equation}
+    \mathcal{L}_{\text{rSIR}}(\boldsymbol{t}, \boldsymbol{I}, \hat{\boldsymbol{I}}) = \bigg\|\frac{dI_s}{dt_s} - \alpha(t_f - t_0)(\Rt - 1)I_s(t_s)\bigg\|^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2.
+\end{equation}
+
+The process of determining the reproduction number, along with the other
+techniques that this chapter presents, finds application in the following chapter.
 
 
 % -------------------------------------------------------------------
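
The total loss that the new section defines for the reduced SIR model can be sketched numerically as follows. This is a minimal illustration and not the thesis implementation: a PINN would obtain $dI_s/dt_s$ by automatic differentiation of the network output, whereas this sketch swaps in central finite differences on a fixed grid. The function name `reduced_sir_loss`, the 70-day time span, and the synthetic data are assumptions made only for the example; $\alpha = 1/14$ is taken from the text.

```python
import math


def reduced_sir_loss(t_s, i_obs, i_pred, r_pred, alpha, t_span):
    """Return (total, physics, observation) for the reduced SIR model.

    physics:     squared norm of dI_s/dt_s - alpha * t_span * (R_t - 1) * I_s,
                 evaluated at interior grid points with central differences
                 (an assumption standing in for automatic differentiation).
    observation: mean squared error between predicted and observed I_s.
    """
    n = len(t_s)
    physics = 0.0
    for k in range(1, n - 1):
        # Central finite difference for the time derivative of the prediction.
        di_dt = (i_pred[k + 1] - i_pred[k - 1]) / (t_s[k + 1] - t_s[k - 1])
        rhs = alpha * t_span * (r_pred[k] - 1.0) * i_pred[k]
        physics += (di_dt - rhs) ** 2
    observation = sum((p - o) ** 2 for p, o in zip(i_pred, i_obs)) / n
    return physics + observation, physics, observation


# Synthetic check (hypothetical values): with a constant reproduction number
# the reduced ODE has the closed-form solution I_s = I_0 * exp(rate * t_s).
alpha = 1.0 / 14.0           # recovery rate from the text (D = 14 days)
t_span = 70.0                # t_f - t_0 in days, chosen only for the example
r_true = 1.5                 # constant reproduction number for the test data
rate = alpha * t_span * (r_true - 1.0)

t_grid = [k / 200 for k in range(201)]
i_data = [0.01 * math.exp(rate * t) for t in t_grid]

# Predictions that match the data and use the matching reproduction number.
total, physics, observation = reduced_sir_loss(
    t_grid, i_data, i_data, [r_true] * len(t_grid), alpha, t_span
)
# Same predictions for I_s, but an inconsistent reproduction number.
_, physics_wrong, _ = reduced_sir_loss(
    t_grid, i_data, i_data, [0.5] * len(t_grid), alpha, t_span
)
print(physics, observation, physics_wrong)
```

In this setup the observation term cannot distinguish the two calls, because both use the same prediction of the infectious compartment. Only the physics term reacts to the reproduction number, which mirrors the statement in the section that $\Rt$ is approximated exclusively through the residual of the ODE.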

BIN
thesis.pdf