% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Author: Phillip Rothenbeck
% Title: Your Thesis
% File: conclusions/conclusions.tex
% Part: conclusions
% Description:
%         summary of the content in this chapter
% Version: 01.09.2024
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Conclusions}
\label{chap:conclusions}

The severe COVID-19 pandemic~\cite{WHO} infected millions of people, while hundreds of thousands succumbed to it in Germany alone~\cite{SRD}. Over three years, the course of the pandemic changed under the influence of various mitigation policies and numerous emerging variants. Understanding such a complex situation requires quantitative analysis. Therefore, the objective of this thesis is to quantify the COVID-19 pandemic in Germany and its 16 federal states by identifying several epidemiological parameters that describe the spread of the disease. \\
We use the SIR model~\cite{1927} to describe the dynamics of COVID-19 infections over time, which offers an approximation of reality. In this model, the transmission rate $\beta$ and the recovery rate $\alpha$ describe the infectiousness and development of the disease that the respective population experiences. These rates serve as global evaluation measures throughout the entire duration of the pandemic. Meanwhile, the time-dependent reproduction number indicates how many individuals a single infectious individual infects at a given point in time. The relations between these parameters are defined by the system of differential equations that governs the SIR model.\\
In order to obtain these epidemiological parameters and the reproduction number for Germany, it is necessary to solve the system of ordinary differential equations (ODEs) for real-world pandemic data recorded in each state and in Germany as a whole.
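For reference, the SIR system referred to above is commonly written in the following form. This is a sketch in standard notation rather than a restatement of the exact formulation used in this thesis: $S$, $I$, and $R$ denote the susceptible, infectious, and removed compartments, and the constant population size $N = S + I + R$ is an assumption of this illustration.
% Illustrative sketch: standard SIR form; assumes amsmath is loaded and a constant population size N.
\begin{align*}
    \frac{dS}{dt} &= -\beta \frac{S I}{N}, &
    \frac{dI}{dt} &= \beta \frac{S I}{N} - \alpha I, &
    \frac{dR}{dt} &= \alpha I.
\end{align*}
Under the same assumptions, one common definition of the time-dependent reproduction number is $\Rt = \frac{\beta}{\alpha} \cdot \frac{S(t)}{N}$, so that $\frac{dI}{dt} = \alpha (\Rt - 1) I$ and the number of infectious individuals grows exactly when $\Rt > 1$.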
One method that has gained significant attention in recent years for solving systems of differential equations is the data-driven approach known as \emph{Physics-Informed Neural Networks} (PINNs)~\cite{Raissi2019}. PINNs integrate knowledge in the form of physical models while learning an approximation of the solution by fitting data points. We adapt previous epidemiological PINN approaches~\cite{Shaier2021,Millevoi2023} to solve the set of ODEs of the SIR model. The training data is collected by the Robert Koch Institute and made publicly available on GitHub~\cite{GHDead,GHInf}. After preprocessing, we use PINNs to solve the inverse problem posed by the SIR model in order to find the epidemiological parameters and the reproduction number for the given data. Using this approach, we conduct experiments on synthetic data and on the data for the federal states and Germany as a whole. The results for the synthetic data yield a small error, which demonstrates the efficacy of our approach on small datasets.\\
We divide our analysis of the real-world data into two parts. First, we consider the time-constant epidemiological parameters $\alpha$ and $\beta$, which provide insight into the overall trajectory of the pandemic in a given region. Given the assumed constant recovery period (see~\Cref{sec:preprocessing:rq}), there is a dependency between the two parameters. Therefore, we focus our analysis on the transmission rate $\beta$. The states with the highest estimated transmission rates are Thuringia, Saxony-Anhalt, and Mecklenburg-Western Pomerania, which means that these states had a higher average number of infections during the pandemic. Furthermore, it is evident that the six eastern states exhibit a higher transmission rate than the overall German rate (see~\Cref{fig:alpha_beta_mean_std}).
Our results align with similarly observed differences in vaccination rates~\cite{FMH} and highlight perceived discrepancies between the eastern and western federal states~\cite{FMH,Desson2022}. We further substantiate this observation by calculating the correlation coefficient between the vaccination ratio $\nu$ of each state and our estimate of $\beta$, which yields a strong negative correlation. In other words, a lower vaccination rate is an indicator of higher infection rates. The results of our second experiment underscore these findings. Here, we approximate the time-dependent reproduction number $\Rt$ from the data. When $\Rt>1$, each infectious individual infects more than one other person on average, so the disease spreads through the population. Our results indicate a tendency for states with a high $\beta$ to experience longer periods with $\Rt>1$. Furthermore, we can identify the points in time at which the most impactful events of the pandemic in Germany occurred, such as the peak of the Omicron variant~\cite{COVIDChronik} at around 700 days after the start of data collection on 2020-03-09.\\
In conclusion, our approach has proven effective in yielding meaningful results for the epidemiological parameters $\alpha$ and $\beta$, as well as the reproduction number $\Rt$, for Germany and its federal states. Although the SIR model is an approximation of reality, there is a clear connection between the results and real-world data and events. We hope that this work will prove useful in the analysis of the events of the COVID-19 pandemic in Germany.

% -------------------------------------------------------------------
\section{Further Work}
\label{sec:furtherWork}

Our findings demonstrate that our methods enable the quantification of the course of the COVID-19 pandemic in Germany using the data provided by the Robert Koch Institute~\cite{GHDead,GHInf}. Here, we present some limitations of our work and propose future directions to address these points.
First, we find that our model does not reconstruct the input data to the desired level of precision. To address this, we propose a comprehensive hyperparameter search to find the best configuration. Moreover, the SIR model does not account for individuals who may be immune due to their vaccination status or who are not infectious due to quarantine. In this section, we explore epidemiological models that incorporate such dynamics observed in real-world pandemics and recommend further investigation for Germany.

% -------------------------------------------------------------------
\subsection{Further Compartmental Models}

As our results demonstrate, the SIR model is capable of approximating the dynamics of real-world pandemics. However, the model is not without limitations. The SIR model assumes that recovered individuals remain immune and does not account for the reduced exposure of susceptible individuals that results from non-pharmaceutical mitigation policies, such as social distancing. These shortcomings can be addressed by incorporating additional compartments and transmission rates into the model. For example, the SEIRD model~\cite{Korolev2021} incorporates an \emph{Exposed} group and subdivides the \emph{Removed} group into \emph{Dead} and \emph{Recovered} compartments. Furthermore, the model is extended with four additional parameters: the contact rate, the manifestation index, the incubation rate, and the infection fatality rate. Doerre and Doblhammer~\cite{Doerre2022} introduce an approach utilizing a SEIRD model that they specialize to be age- and gender-specific. For Germany, they show the impact of non-pharmaceutical mitigation policies, solving the model using a numerical approximation method.\\
Additionally, Cooke and van den Driessche~\cite{Cooke1996} propose the SEIRS model with two delays.
This model is capable of describing diseases that have an immune period, after which the recovered individual becomes susceptible again. These are just a few examples of the numerous modifications of the basic SIR model that can capture real-world dynamics in greater detail and may be used to approximate and consequently quantify a pandemic.

% -------------------------------------------------------------------
\subsection{Agent-Based Models}

Compartmental models, such as the SIR model, treat the population as a set of groups. Each group represents a specific characteristic that all members of that group share. In contrast, an \emph{Agent-Based Model} (ABM) focuses on the individual. Each individual, or agent, has specific attributes that determine its behavior and interactions with other agents during the simulation. As Gilbert~\cite{Gilbert2010} states, ABMs simulate the behavior of large groups. Each individual follows simple rules, which leads to the emergence of complex and stochastic behavior at the macroscopic level of the system~\cite{Bodine2020}. With regard to COVID-19, Kerr \etal~\cite{Kerr2021} put forth a simulation tool, \emph{Covasim}, which they base on an ABM. The ABM employs local data, including demographic data, disease incidence data from the region, and contact data for households, schools, and workplaces, to define its simulation for a specific region. Maziarz and Zach~\cite{Maziarz2020} address the criticism leveled against ABMs for simplifying the dynamics and lacking empirical support for the assumptions they make. The authors utilize an ABM and data specific to Australia to demonstrate the efficacy of ABMs in portraying the dynamics of the COVID-19 pandemic. They state that ABMs can serve as a tool for assessing the impact of non-pharmaceutical mitigation policies. This illustrates that ABMs play a distinct role in analyzing the COVID-19 pandemic.
As the quantity of available data has grown, it would be worthwhile to investigate the potential of ABMs as a tool to assess the course of the pandemic in Germany in greater detail.

% -------------------------------------------------------------------