|
|
@@ -19,21 +19,21 @@ government employed a multifaceted approach~\cite{RKI}, encompassing the
|
|
|
introduction of vaccines and non-pharmaceutical mitigation policies such as
|
|
|
lockdowns. Between mitigation policies and varying strains of COVID-19, which
|
|
|
have exhibited varying degrees of infectiousness and lethality~\cite{RKIa},
|
|
|
-Germany had recorded over 38,400,000 infection cases and 174,000 deaths, as of
|
|
|
+Germany had recorded over 38,400,000 infection cases and 174,000 deaths, by
|
|
|
the end of June in 2023~\cite{SRD}. In light of these figures the need for an
|
|
|
analysis arises.\\
|
|
|
|
|
|
-The dynamics of the spread of disease transmission in the real-world are
|
|
|
-complex. A multitude of factors influence the course of a disease, and it is
|
|
|
+The dynamics of disease transmission in the real world are complex. A multitude
|
|
|
+of factors influence the course of a disease, and it is
|
|
|
challenging to gain a comprehensive understanding of these factors and develop
|
|
|
-tools that allows for the comparison of disease courses across different
|
|
|
+tools that allow for the comparison of disease courses across different
|
|
|
diseases and time points. The common approach in epidemiology to address this is
|
|
|
the utilization of epidemiological models that approximate the dynamics by
|
|
|
focusing on specific factors and modeling these using mathematical tools. These
|
|
|
models provide epidemiological parameters that determine the behavior of a
|
|
|
disease within the boundaries of the model. A seminal epidemiological model is
|
|
|
the \emph{SIR model}, which was first proposed by Kermack and McKendrick~\cite{1927}
|
|
|
-in 1927. The SIR model is a compartmentalized model that divides the entire
|
|
|
+in 1927. The SIR model is a compartmental model that divides the entire
|
|
|
population into three distinct groups: the \emph{susceptible} compartment, $S$;
|
|
|
the \emph{infectious} compartment, $I$; and the \emph{removed} compartment, $R$.
|
|
|
In the context of the SIR model, the constant parameters of the transmission
|
|
|
@@ -41,7 +41,7 @@ rate $\beta$ and the recovery rate $\alpha$ serve to quantify and determine the
|
|
|
course of a pandemic. However, a pandemic is not a static entity; therefore, Liu
|
|
|
and Stechlinski~\cite{Liu2012}, and Setianto and Hidayat~\cite{Setianto2023}
|
|
|
propose an SIR model with time-dependent epidemiological parameters and
|
|
|
-reproduction number $\Rt$. The SIR model is defined by a system of differential
|
|
|
+reproduction numbers $\Rt$. The SIR model is defined by a system of differential
|
|
|
equations that incorporate the parameters $\alpha$ and $\beta$, thereby
|
|
|
depicting the fluctuation between the three compartments. For a given set of
|
|
|
data, the epidemiological parameters can be identified by solving the set of
|
|
|
@@ -56,7 +56,7 @@ Italian COVID-19 data using an approach based on a reduced version of the SIR
|
|
|
model.\\
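For orientation, the system of differential equations that defines the SIR
model mentioned above can be sketched as follows. This sketch assumes the
common formulation with a constant total population $N = S + I + R$ and
transmission normalized by $N$; the symbol $N$ and this normalization are
assumptions and are not fixed by the text above.
\begin{align*}
  \frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
  \frac{dI}{dt} &= \beta \frac{S I}{N} - \alpha I, \\
  \frac{dR}{dt} &= \alpha I.
\end{align*}
Here $\beta$ is the transmission rate and $\alpha$ is the recovery rate. Under
these assumptions the three right-hand sides sum to zero, so $N$ stays
constant over time.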
|
|
|
|
|
|
The objective of this thesis is to identify the epidemiological parameters
|
|
|
-$\beta$ and $\alpha$, as well as the reproduction number $\Rt$ of COVID-19 over
|
|
|
+$\alpha$ and $\beta$, as well as the reproduction number $\Rt$ of COVID-19 over
|
|
|
the first 1200 days of recorded data in Germany and its federal states. The
|
|
|
Robert Koch Institute (RKI)\footnote{\url{https://www.rki.de/EN/Home/homepage_node.html}} has compiled data on both reported cases and
|
|
|
associated mortalities from the beginning of the outbreak in Germany to the
|
|
|
@@ -64,21 +64,20 @@ present. We utilize and preprocess this data according to the required format of
|
|
|
our approaches. As the raw data lacks information on recovery incidence, we
|
|
|
introduce the recovery queue that simulates a recovery period. To estimate the
|
|
|
epidemiological parameters we adopt the approach of Shaier
|
|
|
-\etal~\cite{Shaier2021}, which utilizes a PINN learning the data, which consists
|
|
|
+\etal~\cite{Shaier2021}, which trains a PINN on data that consists
|
|
|
of time points with their respective sizes of the $S, I$ and $R$ compartments,
|
|
|
to predict the epidemiological parameters based on the data and the governing
|
|
|
-system of differential equations. Moreover, we utilize the methodology proposed
|
|
|
-by Millevoi \etal~\cite{Millevoi2023} that estimates the reproduction number for
|
|
|
-each day across the 1200-day span for each German state and Germany as a whole,
|
|
|
-in the reduced SIR model. Thus needing only the size of the $I$ group for each
|
|
|
-time step. To validate the effectiveness of these methods, we first conduct
|
|
|
-experiments on a small synthetic dataset before applying the techniques to
|
|
|
-real-world data. We then analyze the plausibility of our results by comparing
|
|
|
-them to real-world events and data such as vaccination ratios of each region or
|
|
|
-the peaks of impactful variants to demonstrate the relevance of these numbers.
|
|
|
-This analysis demonstrates the relevance of our findings and reveals a
|
|
|
-correlation between our results and real-world developments, thus supporting the
|
|
|
-effectiveness of our approach.\\
|
|
|
+system of differential equations. Additionally, we apply the methodology by
|
|
|
+Millevoi \etal~\cite{Millevoi2023} to estimate the time-dependent reproduction
|
|
|
+number, $\Rt$, over a 1200-day period for each German federal state and Germany
|
|
|
+as a whole in the reduced SIR model, which requires only the size of the $I$
|
|
|
+group for each time step. To validate the effectiveness of these methods, we
|
|
|
+first conduct experiments on a small synthetic dataset before applying the
|
|
|
+techniques to real-world data. We then analyze the plausibility of our results
|
|
|
+by comparing them to real-world events and data such as vaccination ratios of
|
|
|
+each region or the peaks of impactful variants. This analysis demonstrates the
|
|
|
+relevance of our findings and reveals a correlation between our results and
|
|
|
+real-world developments, thus supporting the effectiveness of our approach.\\
|
|
|
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
@@ -89,36 +88,36 @@ In this section, we categorize our work into the context of existing literature
|
|
|
on the topic of solving the epidemiological models for real-world data. The
|
|
|
first work, by Smirnova \etal~\cite{Smirnova2017}, endeavors to identify a
|
|
|
stochastic methodology for estimating the time-dependent transmission rate
|
|
|
-$\beta(t)$. They achieve this by projecting the time-dependent transmission rate
|
|
|
-onto a finite subspace, that is defined by Legendre polynomials. Subsequently,
|
|
|
-they compare the three regularization techniques of variational (Tikhonov's)
|
|
|
-regularization, truncated singular value decomposition (TSVD), and modified TSVD
|
|
|
-to ascertain the most reliable method for forecasting with limited data. Their
|
|
|
-findings indicate that modified TSVD provides the most stable forecasts on
|
|
|
-limited data, as demonstrated on both simulated data and real-world data from
|
|
|
-the 1918 influenza pandemic and the Ebola epidemic. In contrast, we
|
|
|
-utilize PINNs to find the constant epidemiological parameters
|
|
|
-and the reproduction number for Germany and its states.\\
|
|
|
+$\beta(t)$. They achieve this by projecting the time-dependent transmission
|
|
|
+rate onto a finite-dimensional subspace that is defined by Legendre polynomials.
|
|
|
+Subsequently, they compare the three regularization techniques of variational
|
|
|
+(Tikhonov's) regularization, truncated singular value decomposition (TSVD), and
|
|
|
+modified TSVD to ascertain the most reliable method for forecasting with
|
|
|
+limited data. Their findings indicate that modified TSVD provides the most
|
|
|
+stable forecasts, as demonstrated on both simulated data and real-world data
|
|
|
+from the 1918 influenza pandemic and the Ebola epidemic. In contrast, we
|
|
|
+utilize PINNs to find the constant epidemiological parameters and the
|
|
|
+reproduction number for Germany and its states.\\
|
|
|
|
|
|
-Some related works similar to our approach apply PINN approaches to COVID-19 and
|
|
|
+Some related works similar to our method apply PINN approaches to COVID-19 and
|
|
|
other real-world disease examples~\cite{Shaier2021,Millevoi2023,Berkhahn2022,Olumoyin2021}.
|
|
|
-Specifically Shaier \etal~\cite{Shaier2021} put forth a data-driven approach
|
|
|
+Specifically, Shaier \etal~\cite{Shaier2021} put forth a data-driven method
|
|
|
which they refer to as \emph{Disease-Informed Neural Networks} (DINN). In their
|
|
|
-work, they demonstrate the capacity of DINNs to forecast the trajectory of
|
|
|
+work, they demonstrate the capacity of PINNs to forecast the trajectory of
|
|
|
epidemics and pandemics. They underpin the efficacy of their approach by
|
|
|
applying it to 11 diseases that have previously been modeled. In their
|
|
|
experiments they employ the SIDR (susceptible, infectious, dead, recovered)
|
|
|
-model. Finally, they present that this method is a robust and effective means of
|
|
|
-identifying the parameters of a SIR model.\\
|
|
|
+model. Finally, they show that this method is a robust and effective means
|
|
|
+of identifying the parameters of an SIR model.\\
|
|
|
|
|
|
Similarly, Berkhahn and Ehrhard~\cite{Berkhahn2022} employ the susceptible,
|
|
|
vaccinated, infectious, hospitalized and removed (SVIHR) model. The proposed
|
|
|
PINN methodology initially estimates the SVIHR model parameters for German
|
|
|
-COVID-19 data, covering the time span from the inceptions of the outbreak to the
|
|
|
-end of 2021. For comparative purposes, Berkhahn and Ehrhard employ the method of
|
|
|
-non-standard finite differences (NSFD) as well. The authors employ both
|
|
|
-forecasting methods project the trajectory of COVID-19 from mid-April 2023
|
|
|
-onwards. Berkhahn and Ehrhard find that the PINN is able to adapt to varying
|
|
|
+COVID-19 data, covering the time span from the inception of the outbreak to
|
|
|
+the end of 2021. For comparative purposes, Berkhahn and Ehrhard employ the
|
|
|
+method of non-standard finite differences (NSFD) as well. The authors utilize
|
|
|
+both forecasting methods to project the trajectory of COVID-19 from mid-April
|
|
|
+2023 onwards. Berkhahn and Ehrhard find that PINNs are able to adapt to varying
|
|
|
vaccination rates and emerging variants.\\
|
|
|
|
|
|
Furthermore, Olumoyin \etal~\cite{Olumoyin2021} employ an alternative
|
|
|
@@ -128,39 +127,40 @@ approach they introduce, utilizes the cumulative and daily reported infection
|
|
|
cases and symptomatic recovered cases, to demonstrate the effect of different
|
|
|
mitigation measures and to ascertain the proportion of non-symptomatic
|
|
|
individuals and asymptomatic recovered individuals. With this they can
|
|
|
-illustrate the influence of vaccination and a set non-pharmaceutical mitigation
|
|
|
-methods on the transmission of COVID-19 on data from Italy, South Korea, the
|
|
|
-United Kingdom, and the United States.\\
|
|
|
+illustrate the influence of vaccinations and a set of non-pharmaceutical
|
|
|
+mitigation methods on the transmission of COVID-19 on data from Italy, South
|
|
|
+Korea, the United Kingdom, and the United States.\\
|
|
|
|
|
|
Finally, Millevoi \etal~\cite{Millevoi2023} address the issue of the changes in
|
|
|
the transmission rate due to the dynamics of a pandemic. The authors employ the
|
|
|
-reproduction number to reduce the system of differential equations to a single
|
|
|
-equation and introduce a reduced-split version of the PINN, which initially
|
|
|
-trains on the data and then trains to minimize the residual of the ordinary
|
|
|
-differential equation. They test their approach on five synthetic and two
|
|
|
-real-world scenarios from the early stages of the COVID-19 pandemic in Italy.
|
|
|
-This method yields an increase in both accuracy and training speed. In contrast,
|
|
|
-to these works, we estimate the rates and the reproduction number for Germany
|
|
|
-for the entirety of the span from early March in 2020 to late June in 2023.
|
|
|
+reproduction number $\Rt$ to reduce the system of differential equations to a
|
|
|
+single equation and introduce a reduced-split version of the PINN, which
|
|
|
+initially trains on the data and then trains to minimize the residual of the
|
|
|
+ordinary differential equation. They test their approach on five synthetic and
|
|
|
+two real-world scenarios from the early stages of the COVID-19 pandemic in
|
|
|
+Italy. This method yields an increase in both accuracy and training speed. In
|
|
|
+contrast to these works, we estimate the epidemiological parameters $\alpha$ and
|
|
|
+$\beta$ and the reproduction number $\Rt$ for Germany for the entirety of the
|
|
|
+span from early March in 2020 to late June in 2023.
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\section{Overview}
|
|
|
|
|
|
This thesis is comprised of four chapters. \Cref{chap:background}
|
|
|
-presents with the theoretical overview of mathematical modeling in epidemiology,
|
|
|
+starts with a theoretical overview of mathematical modeling in epidemiology,
|
|
|
with a particular focus on the SIR model. Subsequently, it shifts its focus to
|
|
|
neural networks, specifically on the background of PINNs and their use in
|
|
|
solving ordinary differential equations.~\Cref{chap:methods} outlines the
|
|
|
-methodology employed in this thesis. First we present the data, that was
|
|
|
-collected by the RKI. Then we present the PINN approaches, which are inspired by
|
|
|
-the work of Shaier \etal~\cite{Shaier2021} and Millevoi
|
|
|
-\etal~\cite{Millevoi2023}.~\Cref{chap:evaluation} presents the setups and
|
|
|
-results of the experiments that we conduct. This chapter is divided into two
|
|
|
-sections. The first section presents and discusses the results concerning the
|
|
|
-epidemiological parameters of $\beta$ and $\alpha$. The subsequent section
|
|
|
+methodology employed in this thesis. First, we present the data that was
|
|
|
+collected by the RKI and our preprocessing. Then, we present the PINN
|
|
|
+approaches, which are inspired by the work of Shaier \etal~\cite{Shaier2021}
|
|
|
+and Millevoi \etal~\cite{Millevoi2023}.~\Cref{chap:evaluation} provides the
|
|
|
+setups and results of the experiments that we conduct. This chapter is divided
|
|
|
+into two sections. The first section shows and discusses the results concerning
|
|
|
+the epidemiological parameters of $\alpha$ and $\beta$. The subsequent section
|
|
|
presents the results concerning the reproduction number $\Rt$. Finally, in
|
|
|
-\Cref{chap:conclusions}, we connect our results with the events of the
|
|
|
-real-world and give an overview of potential further work.
|
|
|
+\Cref{chap:conclusions}, we conclude our work and provide an overview
|
|
|
+of potential further work.
|
|
|
|
|
|
% -------------------------------------------------------------------
|