10 months ago · 3dffda5dc5
--- a/chapters/chap04/chap04.tex
+++ b/chapters/chap04/chap04.tex
@@ -9,12 +9,12 @@
 
				 % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
			
 
				 \chapter{Experiments}
			
 
				 \label{chap:evaluation}
			
 
				-In ~\Cref{chap:methods}, we explain the methods based the theoretical
			
 
				+In~\Cref{chap:methods}, we explain the methods based on the theoretical
			
 
				 background that we established in~\Cref{chap:background}. In this chapter, we
			
 
				 present the setups and results from the experiments and simulations. First, we
			
 
				 discuss the experiments dedicated to identifying the epidemiological transition
			
 
				 rates of $\alpha$ and $\beta$ in synthetic and real-world data. Second, we
			
 
				-examine the reproduction number in synthetic and real-world data of Germany.
			
 
				+examine the reproduction number $\Rt$ in synthetic and real-world data of Germany.
			
 
				 
			
 
				 % -------------------------------------------------------------------
			
 
				 
			
@@ -22,9 +22,9 @@ examine the reproduction number in synthetic and real-world data of Germany.
 
				 \label{sec:sir}
			
 
				 In this section, we aim to identify the transmission rate $\beta$ and the
			
 
				 recovery rate $\alpha$ from either synthetic or preprocessed real-world data.
			
 
				-The methodology that we employ to identify the epidemiological parameters is described
			
 
				+The methodology that we employ to identify these epidemiological parameters is described
			
 
				 in~\Cref{sec:pinn:sir}. Meanwhile, the methods we utilize to preprocess the
			
 
				-real-world data are detailed in~\Cref{sec:preprocessing:rq}. In the first part
			
 
				+real-world data are detailed in~\Cref{sec:preprocessing:rq}. In the first part,
			
 
				 we present the setup of our experiments, then we provide the results including a
			
 
				 discussion.\\
			
 
				 
			
@@ -41,11 +41,12 @@ infectious individuals is $I_0 = 10$. We conduct the simulation over 150
 
				 days, resulting in a dataset of the form of~\Cref{fig:datasets_sir}.\\
			
 
				 
			
 
				 \paragraph{Real-World Data:}In order to process the real-world RKI data, it is
			
 
				-necessary to preprocess the raw data for each state and Germany separately.
			
 
				+necessary to preprocess the raw data for each state from the infection
			
 
				+dataset~\cite{GHInf} and for Germany from the death case dataset~\cite{GHDead} separately.
			
 
				 This is achieved by utilizing a recovery queue with a recovery period of 14
			
 
				 days. With regard to the population size of each state, we set it to the respective
			
 
				 value counted at the end of
			
 
				-2019\footnote{{\tiny \url{https://de.statista.com/statistik/kategorien/kategorie/8/themen/63/branche/demographie/\#overview}}}.
			
 
				+2019\footnote{{\url{https://datacommons.org/?hl=de} Last accessed: 2024-07-20}}.
			
 
				 The initial number of infectious individuals is set to the number of infected
			
 
				 people on 2020-03-09 from the dataset. The data we extract spans from
			
 
				 2020-03-09 to 2023-06-22, encompassing a period of 1200 days and
			
@@ -86,13 +87,18 @@ is the average error across all three compartments.
 
				 \label{sec:sir:results}
			
 
				 
			
 
				 In this section, we start by examining the results for the synthetic dataset,
			
 
				-focusing the accuracy and reproducibility. We then proceed to present and
			
 
				+focusing on the accuracy and reproducibility. We then proceed to present and
			
 
				 discuss the results for the German states and Germany.\\
			
 
				 
			
 
				 The results of the experiment regarding the synthetic data can be seen
			
 
				 in~\Cref{table:alpha_beta_synth}. The error and the standard deviation for both
			
 
				 parameters are negligibly small. Taking the mean of the parameters across the
			
 
				-five iterations yields more accurate results.\\
			
 
				+five iterations yields more accurate results. The results demonstrate that the
			
 
				+model is capable of approximating the correct parameters for the small,
			
 
				+synthetic dataset in each of the five iterations. The mean of the predicted
			
 
				+values results in values with a sufficiently small error. Thus, we argue that
			
 
				+our selected method is well suited to analyze real-world pandemic data
			
 
				+collected in Germany.\\
			
 
				 
			
 
				 \begin{table}[t]
			
 
				     \begin{center}
			
@@ -112,12 +118,6 @@ five iterations yields more accurate results.\\
 
				     \end{center}
			
 
				 \end{table}
			
 
				 
			
 
				-The results demonstrate that the model is capable of approximating the correct
			
 
				-parameters for the small, synthetic dataset in each of the five iterations.
			
 
				-The mean of the predicted values results in values with a sufficiently small
			
 
				-error. Thus, we argue that our selected method is well suited to analyze real
			
 
				-world pandemic data collected in Germany.\\
			
 
				-
			
 
				 In~\Cref{table:state_mean_std} we present the results of the training for the
			
 
				 real-world data. The results are presented from top to bottom, in the order of
			
 
				 the community identification number, with the last entry being Germany. Both
			
@@ -135,7 +135,7 @@ $\nu$ for each state provided by the Robert Koch Institute~\cite{FMH}.\\
 
				             $\Delta\beta_{\text{Germany}} = \beta_{\text{state}} - \beta_{\text{Germany}}$
			
 
				             across the 5 iterations, that we conducted for each German state (MWP=Mecklenburg-Western Pomerania, NRW=North Rhine-Westphalia) and Germany
			
 
				             as a whole. Furthermore, we include the vaccination percentage
			
 
				-            $\nu$ provided from the RKI~\cite{FMH}.}
			
 
				+            $\nu$ provided by the German Federal Ministry for Health~\cite{FMH}.}
			
 
				         \label{table:state_mean_std}
			
 
				         \begin{tabular}{lccccc}
			
 
				             \toprule
			
@@ -196,7 +196,7 @@ It is evident that there is a correlation between the values of $\alpha$ and
 
				 $\beta$ for each state. States with a high transmission rate tend to have a
			
 
				 high recovery rate, and vice versa. The correlation between $\alpha$ and
			
 
				 $\beta$ can be explained by the implicit definition of $\alpha$ using a
			
 
				-recovery queue with a constant recovery period of 14 days. This might result to
			
 
				+recovery queue with a constant recovery period of 14 days. This might result in
			
 
				 the PINN not learning $\alpha$ as a standalone parameter but rather as a
			
 
				 function of the transmission rate $\beta$. This phenomenon occurs because the
			
 
				 transmission rate determines the number of individuals that get infected per
			
@@ -282,7 +282,7 @@ by a factor of $\expnumber{1}{-6}$, whereas the data loss belonging to Germany
 
				 is also weighted with a high factor of $\expnumber{1}{4}$, relative to the total
			
 
				 loss. We found this approach to yield the best results. The model is trained
			
 
				 using a base learning rate of $\expnumber{1}{-3}$, with the same scheduler and
			
 
				-optimizer as we describe in~\Cref{sec:sir:setup}. We train the model for the
			
 
				+optimizer as we describe in~\Cref{sec:sir:setup}. We train the model for the federal
			
 
				 states for 20000 epochs and start the physics training after 10000 epochs, while we
			
 
				 train for Germany for 25000 epochs and start the physics training after 15000 epochs.
			
 
				 To ensure the reliability of the results, we conduct ten trials of each experiment. For
			
@@ -293,12 +293,6 @@ evaluation, we use the error $e_G$ as we do in the subsequent section.\\
 
				 \subsection{Results and Discussion}
			
 
				 \label{sec:rsir:results}
			
 
				 
			
 
				-\Cref{fig:synth_results} illustrates the results of our experiments conducted on
			
 
				-the synthetic dataset, which can be seen in~\Cref{fig:Rt_dataset}. It is evident
			
 
				-that the model is capable of learning the infection data across all data points.
			
 
				-The error for this is, $e_I = 0.0016$, which is of a negligible
			
 
				-magnitude.\\
			
 
				-
			
 
				 \begin{figure}[t]
			
 
				     \centering
			
 
				     \begin{subfigure}{0.45\textwidth}
			
@@ -308,21 +302,25 @@ magnitude.\\
 
				     \begin{subfigure}{0.45\textwidth}
			
 
				         \includegraphics[width=\textwidth]{synthetic_R_t_statistics.pdf}
			
 
				     \end{subfigure}
			
 
				-    \label{fig:synth_results}
			
 
				     \caption{Results for the reproduction number $\Rt$ on synthetic data. The
			
 
				         left graphic shows the prediction of the model regarding the $I$ group. The
			
 
				         right graphic presents the predicted $\Rt$ against the true value, with the
			
 
				         standard deviation.}
			
 
				+    \label{fig:r_t_synth_res}
			
 
				 \end{figure}
			
 
				 
			
 
				-An examination of the predictions for the representation value $\Rt$ reveals
			
 
				-that here as well, the model is capable of accurately delineating the value at
			
 
				-each time point. However, during the first 30 days, the standard deviation
			
 
				-exhibits an upward trend, while during the final 120 days, the predictions
			
 
				-demonstrate remarkable precision.\\
			
 
				+\Cref{fig:r_t_synth_res} illustrates the results of our experiments conducted on
			
 
				+the synthetic dataset, which can be seen in~\Cref{fig:Rt_dataset}. It is evident
			
 
				+that the model is capable of learning the infection data across all data points.
			
 
				+The error for this is $e_I = 0.0016$, which is of a negligible
			
 
				+magnitude. An examination of the predictions for the reproduction number $\Rt$
			
 
				+reveals that here as well, the model is capable of accurately delineating the
			
 
				+value at each time point. However, during the first 30 days, the standard
			
 
				+deviation exhibits an upward trend, while during the final 120 days, the
			
 
				+predictions demonstrate remarkable precision.\\
			
 
				 
			
 
				 In~\Cref{fig:state_results}, we present the graphs of $\Rt$ for the state with
			
 
				-the highest value of $\beta$, namely Thuringia, and for the state with the
			
 
				+the highest value of $\beta$, namely Thuringia, and for the state with the lowest
			
 
				 $\beta$, namely Bremen. Further visualizations of the results
			
 
				 can be found in~\Cref{chap:appendix}. In all datasets, the graphs with $\alpha =
			
 
				     \nicefrac{1}{5}$ are of a smaller size than those with
			
@@ -357,7 +355,7 @@ that the error for all experiments falls within a range of values that is not
 
				 negligible and will have an influence on the resulting reproduction values that
			
 
				 are learned while fitting the data. A comparison of the results for the various
			
 
				 values of $\alpha$ reveals that the errors associated with $\alpha = \nicefrac{1}{14}$
			
 
				-are consistently smaller, with the exception of Saxony and Germany. This can be
			
 
				+are consistently smaller than for $\alpha = \nicefrac{1}{5}$, with the exception of Saxony and Germany. This can be
			
 
				 attributed to the differing sizes of infection counts, particularly in relation
			
 
				 to the normalization factor $C$. The model is unable to learn effectively if the
			
 
				 values of the data loss $\mathcal{L}_{\text{data}}$ are too large or too small
			
--- a/thesis.pdf
+++ b/thesis.pdf
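
The "Real-World Data" paragraph in the diff above mentions preprocessing the raw case counts with a recovery queue and a recovery period of 14 days. The actual procedure is described in the thesis section referenced as `sec:preprocessing:rq`, which is not part of this diff, so the following is only a minimal sketch of one plausible reading: each reported case is counted as infectious for exactly `recovery_days` days and then moved to the removed group. The function name `recovery_queue` and its parameters are hypothetical and chosen for illustration.

```python
from collections import deque


def recovery_queue(new_cases, population, recovery_days=14):
    """Derive S, I, R time series from daily new case counts.

    Assumption (not taken from the thesis text in this diff): every
    reported case stays infectious for exactly `recovery_days` days
    and is then counted as removed.
    """
    queue = deque()   # daily case counts that are still infectious
    infectious = 0
    removed = 0
    S, I, R = [], [], []
    for cases in new_cases:
        queue.append(cases)
        infectious += cases
        # Once the queue holds more than `recovery_days` entries, the
        # oldest day's cases have completed their recovery period.
        if len(queue) > recovery_days:
            done = queue.popleft()
            infectious -= done
            removed += done
        S.append(population - infectious - removed)
        I.append(infectious)
        R.append(removed)
    return S, I, R


# Toy example with a 2-day recovery period instead of 14 days.
S, I, R = recovery_queue([10, 0, 0, 5], population=100, recovery_days=2)
```

With a fixed recovery period, the number of removals on a given day is fully determined by the infections reported `recovery_days` earlier, which is consistent with the diff's remark that $\alpha$ is defined only implicitly by the queue.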