|
@@ -9,12 +9,12 @@
|
|
|
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
\chapter{Experiments}
|
|
|
\label{chap:evaluation}
|
|
|
-In ~\Cref{chap:methods}, we explain the methods based the theoretical
|
|
|
+In~\Cref{chap:methods}, we explain the methods based on the theoretical
|
|
|
background that we established in~\Cref{chap:background}. In this chapter, we
|
|
|
present the setups and results from the experiments and simulations. First, we
|
|
|
discuss the experiments dedicated to identifying the epidemiological transition
|
|
|
rates $\alpha$ and $\beta$ in synthetic and real-world data. Second, we
|
|
|
-examine the reproduction number in synthetic and real-world data of Germany.
|
|
|
+examine the reproduction number $\Rt$ in synthetic and real-world data of Germany.
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
@@ -22,9 +22,9 @@ examine the reproduction number in synthetic and real-world data of Germany.
|
|
|
\label{sec:sir}
|
|
|
In this section, we aim to identify the transmission rate $\beta$ and the
|
|
|
recovery rate $\alpha$ from either synthetic or preprocessed real-world data.
|
|
|
-The methodology that we employ to identify the epidemiological parameters is described
|
|
|
+The methodology that we employ to identify these epidemiological parameters is described
|
|
|
in~\Cref{sec:pinn:sir}. Meanwhile, the methods we utilize to preprocess the
|
|
|
-real-world data are detailed in~\Cref{sec:preprocessing:rq}. In the first part
|
|
|
+real-world data are detailed in~\Cref{sec:preprocessing:rq}. In the first part,
|
|
|
we present the setup of our experiments; then we provide the results, including a
|
|
|
discussion.\\
|
|
|
|
|
@@ -41,11 +41,12 @@ infectious individuals is $I_0 = 10$. We conduct the simulation over 150
|
|
|
days, resulting in a dataset of the form of~\Cref{fig:datasets_sir}.\\
|
|
|
|
|
|
\paragraph{Real-World Data:}In order to process the real-world RKI data, it is
|
|
|
-necessary to preprocess the raw data for each state and Germany separately.
|
|
|
+necessary to preprocess the raw data for each state from the infection
|
|
|
+dataset~\cite{GHInf} and for Germany from the death case dataset~\cite{GHDead} separately.
|
|
|
This is achieved by utilizing a recovery queue with a recovery period of 14
|
|
|
days. With regard to the population size of each state, we set it to the respective
|
|
|
value counted at the end of
|
|
|
-2019\footnote{{\tiny \url{https://de.statista.com/statistik/kategorien/kategorie/8/themen/63/branche/demographie/\#overview}}}.
|
|
|
+2019\footnote{{\url{https://datacommons.org/?hl=de}, last accessed 2024-07-20}}.
|
|
|
The initial number of infectious individuals is set to the number of infected
|
|
|
people on 2020-03-09 from the dataset. The data we extract spans from
|
|
|
2020-03-09 to 2023-06-22, encompassing a period of 1200 days and
|
|
@@ -86,13 +87,18 @@ is the average error across all three compartments.
|
|
|
\label{sec:sir:results}
|
|
|
|
|
|
In this section, we start by examining the results for the synthetic dataset,
|
|
|
-focusing the accuracy and reproducibility. We then proceed to present and
|
|
|
+focusing on the accuracy and reproducibility. We then proceed to present and
|
|
|
discuss the results for the German states and Germany.\\
|
|
|
|
|
|
The results of the experiment regarding the synthetic data can be seen
|
|
|
in~\Cref{table:alpha_beta_synth}. The error and the standard deviation for both
|
|
|
parameters are negligibly small. Taking the mean of the parameters across the
|
|
|
-five iterations yields more accurate results.\\
|
|
|
+five iterations yields more accurate results. The results demonstrate that the
|
|
|
+model is capable of approximating the correct parameters for the small,
|
|
|
+synthetic dataset in each of the five iterations. Averaging the predicted
|
|
|
+values yields a sufficiently small error. Thus, we argue that
|
|
|
+our selected method is well suited to analyze real-world pandemic data
|
|
|
+collected in Germany.\\
|
|
|
|
|
|
\begin{table}[t]
|
|
|
\begin{center}
|
|
@@ -112,12 +118,6 @@ five iterations yields more accurate results.\\
|
|
|
\end{center}
|
|
|
\end{table}
|
|
|
|
|
|
-The results demonstrate that the model is capable of approximating the correct
|
|
|
-parameters for the small, synthetic dataset in each of the five iterations.
|
|
|
-The mean of the predicted values results in values with a sufficiently small
|
|
|
-error. Thus, we argue that our selected method is well suited to analyze real
|
|
|
-world pandemic data collected in Germany.\\
|
|
|
-
|
|
|
In~\Cref{table:state_mean_std} we present the results of the training for the
|
|
|
real-world data. The results are presented from top to bottom, in the order of
|
|
|
the community identification number, with the last entry being Germany. Both
|
|
@@ -135,7 +135,7 @@ $\nu$ for each state provided by the Robert Koch Institute~\cite{FMH}.\\
|
|
|
$\Delta\beta_{\text{Germany}} = \beta_{\text{state}} - \beta_{\text{Germany}}$
|
|
|
across the 5 iterations that we conducted for each German state (MWP=Mecklenburg-Western Pomerania, NRW=North Rhine-Westphalia) and Germany
|
|
|
as a whole. Furthermore, we include the vaccination percentage
|
|
|
- $\nu$ provided from the RKI~\cite{FMH}.}
|
|
|
+ $\nu$ provided by the German Federal Ministry of Health~\cite{FMH}.}
|
|
|
\label{table:state_mean_std}
|
|
|
\begin{tabular}{lccccc}
|
|
|
\toprule
|
|
@@ -196,7 +196,7 @@ It is evident that there is a correlation between the values of $\alpha$ and
|
|
|
$\beta$ for each state. States with a high transmission rate tend to have a
|
|
|
high recovery rate, and vice versa. The correlation between $\alpha$ and
|
|
|
$\beta$ can be explained by the implicit definition of $\alpha$ using a
|
|
|
-recovery queue with a constant recovery period of 14 days. This might result to
|
|
|
+recovery queue with a constant recovery period of 14 days. This might result in
|
|
|
the PINN not learning $\alpha$ as a standalone parameter but rather as a
|
|
|
function of the transmission rate $\beta$. This phenomenon occurs because the
|
|
|
transmission rate determines the number of individuals that get infected per
|
|
@@ -282,7 +282,7 @@ by a factor of $\expnumber{1}{-6}$, whereas the data loss belonging to Germany
|
|
|
is also weighted with a high factor of $\expnumber{1}{4}$, relative to the total
|
|
|
loss. We found this approach to yield the best results. The model is trained
|
|
|
using a base learning rate of $\expnumber{1}{-3}$, with the same scheduler and
|
|
|
-optimizer as we describe in~\Cref{sec:sir:setup}. We train the model for the
|
|
|
+optimizer as we describe in~\Cref{sec:sir:setup}. We train the model for the federal
|
|
|
states for 20000 epochs and start the physics training after 10000 epochs, while we
|
|
|
train for Germany for 25000 epochs and start the physics training after 15000 epochs.
|
|
|
To ensure the reliability of the results, we conduct ten trials of each experiment. For
|
|
@@ -293,12 +293,6 @@ evaluation, we use the error $e_G$ as we do in the subsequent section.\\
|
|
|
\subsection{Results and Discussion}
|
|
|
\label{sec:rsir:results}
|
|
|
|
|
|
-\Cref{fig:synth_results} illustrates the results of our experiments conducted on
|
|
|
-the synthetic dataset, which can be seen in~\Cref{fig:Rt_dataset}. It is evident
|
|
|
-that the model is capable of learning the infection data across all data points.
|
|
|
-The error for this is, $e_I = 0.0016$, which is of a negligible
|
|
|
-magnitude.\\
|
|
|
-
|
|
|
\begin{figure}[t]
|
|
|
\centering
|
|
|
\begin{subfigure}{0.45\textwidth}
|
|
@@ -308,21 +302,25 @@ magnitude.\\
|
|
|
\begin{subfigure}{0.45\textwidth}
|
|
|
\includegraphics[width=\textwidth]{synthetic_R_t_statistics.pdf}
|
|
|
\end{subfigure}
|
|
|
- \label{fig:synth_results}
|
|
|
\caption{Results for the reproduction number $\Rt$ on synthetic data. The
|
|
|
left graphic shows the prediction of the model for the $I$ group. The
|
|
|
right graphic presents the predicted $\Rt$ against the true value, with the
|
|
|
standard deviation.}
|
|
|
+ \label{fig:r_t_synth_res}
|
|
|
\end{figure}
|
|
|
|
|
|
-An examination of the predictions for the representation value $\Rt$ reveals
|
|
|
-that here as well, the model is capable of accurately delineating the value at
|
|
|
-each time point. However, during the first 30 days, the standard deviation
|
|
|
-exhibits an upward trend, while during the final 120 days, the predictions
|
|
|
-demonstrate remarkable precision.\\
|
|
|
+\Cref{fig:r_t_synth_res} illustrates the results of our experiments conducted on
|
|
|
+the synthetic dataset, which can be seen in~\Cref{fig:Rt_dataset}. It is evident
|
|
|
+that the model is capable of learning the infection data across all data points.
|
|
|
+The corresponding error is $e_I = 0.0016$, which is of a negligible
|
|
|
+magnitude. An examination of the predictions for the reproduction number $\Rt$
|
|
|
+reveals that, here as well, the model is capable of accurately capturing the
|
|
|
+value at each time point. However, during the first 30 days, the standard
|
|
|
+deviation exhibits an upward trend, while during the final 120 days, the
|
|
|
+predictions demonstrate remarkable precision.\\
|
|
|
|
|
|
In~\Cref{fig:state_results}, we present the graphs of $\Rt$ for the state with
|
|
|
-the highest value of $\beta$, namely Thuringia, and for the state with the
|
|
|
+the highest value of $\beta$, namely Thuringia, and for the state with the lowest
|
|
|
$\beta$, namely Bremen. Further visualizations of the results
|
|
|
can be found in~\Cref{chap:appendix}. In all datasets, the graphs with $\alpha =
|
|
|
\nicefrac{1}{5}$ are of a smaller size than those with
|
|
@@ -357,7 +355,7 @@ that the error for all experiments falls within a range of values that is not
|
|
|
negligible and will have an influence on the resulting reproduction values that
|
|
|
are learned while fitting the data. A comparison of the results for the various
|
|
|
values of $\alpha$ reveals that the errors associated with $\alpha = \nicefrac{1}{14}$
|
|
|
-are consistently smaller, with the exception of Saxony and Germany. This can be
|
|
|
+are consistently smaller than those for $\alpha = \nicefrac{1}{5}$, with the exception of Saxony and Germany. This can be
|
|
|
attributed to the differing sizes of infection counts, particularly in relation
|
|
|
to the normalization factor $C$. The model is unable to learn effectively if the
|
|
|
values of the data loss $\mathcal{L}_{\text{data}}$ are too large or too small
|