- % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- % Author: Phillip Rothenbeck
- % Title: Investigating the Evolution of the COVID-19 Pandemic in Germany Using Physics-Informed Neural Networks
- % File: chap02/chap02.tex
- % Part: theoretical background
- % Description:
- % summary of the content in this chapter
- % Version: 05.08.2024
- % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \chapter{Theoretical Background}
- \label{chap:background}
- This chapter introduces the theoretical knowledge that forms the foundation of
- the work presented in this thesis. In~\Cref{sec:domain}
- and~\Cref{sec:differentialEq}, we discuss differential equations and the
- underlying theory. In these sections, both the explanations and the approach
- are strongly based on the book on analysis by Rudin~\cite{Rudin2007} and the
- book about ordinary differential equations by Tenenbaum and
- Pollard~\cite{Tenenbaum1985}. Subsequently, we employ this knowledge to examine
- various pandemic models in~\Cref{sec:epidemModel}.
- Finally, we address the topic of neural networks with a focus on the multilayer
- perceptron in~\Cref{sec:mlp} and physics-informed neural networks
- in~\Cref{sec:pinn}.
- % -------------------------------------------------------------------
- \section{Mathematical Modelling using Functions}
- \label{sec:domain}
- To model a physical problem using mathematical tools, it is necessary to define
- a set of fundamental numbers or quantities upon which the subsequent calculations
- will be based. These sets may represent, for instance, a specific time interval
- or a distance. The term \emph{domain} describes these fundamental sets of
- numbers or quantities~\cite{Rudin2007}. A \emph{variable} is a changing entity
- living in a certain domain. In this thesis, we will focus on domains of real
- numbers in $\mathbb{R}$.\\
- The mapping between variables enables the modeling of the process and depicts
- the semantics. We use functions in order to facilitate this mapping. Let
- $A, B\subset\mathbb{R}$ be two subsets of the real numbers, then we define a
- function as the mapping
- \begin{equation}
- f: A\rightarrow B.
- \end{equation}
- In other words, the function $f$ maps elements $x\in A$ to values
- $f(x)\in B$. $A$ is the \emph{domain} of $f$, while $B$ is the \emph{codomain}
- of $f$. Functions are capable of representing the state of a system as a value
- based on an input value from their domain. One illustrative example is a
- function that maps a time point to the distance covered since a starting point.
- In this case, time serves as the domain, while the distance is the codomain.
- % -------------------------------------------------------------------
- \section{Basics of Differential Equations}
- \label{sec:differentialEq}
- Often, the change of a system is more interesting than its current state.
- Functions are able to give us the latter, but only passively give information
- about the change of a system. The objective is to determine an effective method
- for calculating the change of a function across its domain. Let $f$ be a
- function and $[a, b]\subset \mathbb{R}$ an interval of real numbers. Then the
- expression
- \begin{equation}
- m = \frac{f(b) - f(a)}{b-a}
- \end{equation}
- gives the average rate of change. While the average rate of change is useful in
- many cases, the momentary rate of change is more accurate. To calculate this,
- we need to narrow the interval down to an infinitesimal one. For each
- $x\in[a, b]$ we calculate
- \begin{equation} \label{eqn:differential}
- \frac{df}{dx} = \lim_{t\to x} \frac{f(t) - f(x)}{t-x},
- \end{equation}
- if it exists. $\frac{df}{dx}$ is the \emph{derivative} of $f$; it returns the
- momentary rate of change of $f$ for each value $x$ of $f$'s domain. An equation
- that relates a function to its derivatives is called a
- \emph{differential equation}. Repeating this process on $\frac{df}{dx}$ yields
- $\frac{d^2f}{dx^2}$, which is the function that calculates the rate of change of
- the rate of change and is called the second order derivative. Iterating this $n$
- times results in $\frac{d^nf}{dx^n}$, the derivative of the $n$th order.
- Another method for obtaining a differential equation is to create it from the
- semantics of a problem. This method is useful if no basic function exists for a
- system. Differential equations find application in several areas such as
- engineering, physics, economics, epidemiology, and beyond.\\
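- As a brief illustration of the two rates of change defined above, consider the
- function $f(x) = x^2$, which is chosen here purely as an example and does not
- appear elsewhere in this thesis. Its average rate of change on $[a, b]$ is
- \begin{equation}
- \frac{f(b) - f(a)}{b - a} = \frac{b^2 - a^2}{b - a} = a + b,
- \end{equation}
- while the limit in~\Cref{eqn:differential} gives the momentary rate of change
- \begin{equation}
- \frac{df}{dx} = \lim_{t\to x} \frac{t^2 - x^2}{t - x} = \lim_{t\to x} (t + x) = 2x.
- \end{equation}
- On the interval $[1, 3]$, for instance, the average rate of change is $4$,
- whereas the momentary rate of change ranges from $2$ at $x=1$ to $6$ at
- $x=3$.\\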
- In the context of functions, it is possible to have multiple domains, meaning
- that a function has more than one parameter. To illustrate, consider a function
- operating in two-dimensional space, wherein each parameter represents one axis,
- or one that takes time and location as inputs. The term
- \emph{partial differential equations} (\emph{PDEs}) describes differential
- equations of such functions, which involve a derivative for each of their
- domains. In contrast, \emph{ordinary differential equations} (\emph{ODEs})
- involve only derivatives of a function with a single domain. In this thesis, we
- only need ODEs.\\
- A \emph{system of differential equations} is the name for a set of differential
- equations. The derivatives in a system of differential equations each have their
- own codomain, which is part of the problem, while they all share the same
- domain.\\
- Tenenbaum and Pollard~\cite{Tenenbaum1985} provide many examples of ODEs,
- including the \emph{Motion of a Particle Along a Straight Line}. In this
- context, Newton's second law states that ``the rate of change of the momentum
- of a body ($\text{momentum} = \text{mass} \cdot \text{velocity}$) is
- proportional to the resultant external
- force $F$ acted upon it''~\cite{Tenenbaum1985}. Let $m$ be the mass of the body
- in kilograms, $v$ its velocity in meters per second and $t$ the time in seconds.
- Then, Newton's second law translates mathematically to
- \begin{equation} \label{eq:newtonSecLaw}
- F = m\frac{dv}{dt}.
- \end{equation}
- It is evident that the acceleration, $a=\frac{dv}{dt}$, as the rate of change of
- the velocity is part of the equation. Additionally, the velocity of a body is
- the derivative of the distance $s$ traveled by that body, $v=\frac{ds}{dt}$.
- Based on these findings, we can rewrite~\Cref{eq:newtonSecLaw} as
- \begin{equation}
- F=ma=m\frac{d^2s}{dt^2}.
- \end{equation}\\
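- To sketch how such an equation is used, assume for illustration that the
- force $F$ is constant and that the body starts at time $t=0$ with velocity
- $v_0$ and position $s_0$ (these initial values are assumptions introduced only
- for this example). Integrating $\frac{dv}{dt} = \frac{F}{m}$ once and
- $\frac{ds}{dt} = v$ a second time yields
- \begin{equation}
- v(t) = v_0 + \frac{F}{m}t, \qquad s(t) = s_0 + v_0 t + \frac{F}{2m}t^2.
- \end{equation}
- Differentiating $s(t)$ twice with respect to $t$ recovers
- $\frac{d^2s}{dt^2} = \frac{F}{m}$, in agreement with the equation above.\\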
- This explanation of differential equations focuses on the aspects deemed crucial
- for this thesis and is not intended to be a complete explanation of the subject.
- To gain a better understanding of it, we recommend the books mentioned
- above~\cite{Rudin2007,Tenenbaum1985}. In the following section we
- describe the application of these principles in epidemiological models.
- % -------------------------------------------------------------------
- \section{Epidemiological Models}
- \label{sec:epidemModel}
- Pandemics such as \emph{COVID-19} have resulted in a significant
- number of fatalities. The question arises: How should we fight a pandemic
- correctly? It is also essential to study whether the employed countermeasures
- are efficacious in combating the pandemic. Given the unfavorable public response
- to measures such as lockdowns, it is imperative to investigate whether their
- efficacy remains commensurate with the costs incurred by those affected. Where
- alternative and novel technologies were in use, such as the mRNA vaccines
- in the context of COVID-19, it is necessary to test their effect and find the
- optimal variant. In order to shed light on the aforementioned questions, we need
- to develop a method to quantify the pandemic along with its course of
- progression.\\
- The real world is a highly complex system, which makes it a significant
- challenge to describe it fully in a model. Therefore, the model must
- reduce the complexity while retaining the essential information. Furthermore, it
- must address the issue of limited data availability. For instance, during
- COVID-19 institutions such as the Robert Koch Institute
- (RKI)\footnote[1]{\url{https://www.rki.de/EN/Home/homepage_node.html}} were only
- able to collect data on infections and mortality cases. Consequently, we require
- a model that employs an abstraction of the real world to illustrate the events
- and relations that are pivotal to understanding the problem.
- % -------------------------------------------------------------------
- \subsection{SIR Model}
- \label{sec:pandemicModel:sir}
- In 1927, Kermack and McKendrick~\cite{1927} introduced the \emph{SIR Model},
- which subsequently became one of the most influential epidemiological models.
- This model enables the modeling of infections during epidemiological events such as pandemics.
- The book \emph{Mathematical Models in Biology}~\cite{EdelsteinKeshet2005}
- reiterates the model and serves as the foundation for the following explanation
- of SIR models.\\
- The SIR model is capable of describing diseases that are transmitted through
- contact or proximity between an individual carrying the illness and a healthy
- individual. This is possible due to the distinction between infected
- individuals, who are carriers of the disease, and the part of the population
- that is susceptible to infection. In the model, individuals can move between
- these groups, e.g., healthy individuals become infected. In the real world, the
- size of a population is subject to a number of factors that can contribute to
- change. The population is increased by the occurrence of births and decreased by
- the occurrence of deaths. There are different reasons for mortality, including
- the natural aging process or the development of other diseases. To omit this
- factor of complexity, the model assumes the size $N$ of the population remains
- constant throughout the duration of the epidemic. The population $N$ comprises
- three distinct groups: the \emph{susceptible} group $S$, the \emph{infectious}
- group $I$ and the \emph{removed} group $R$ (hence SIR model). For $S$, $I$, $R$
- and $N$, the following holds:
- \begin{equation} \label{eq:N_char}
- N = S + I + R.
- \end{equation}
- The model makes another assumption by stating that recovered people are immune
- to the illness and infectious individuals cannot infect them. The individuals in
- the $R$ group are either recovered or deceased, and thus unable to transmit or
- carry the disease.
- \begin{figure}[h]
- \centering
- \includegraphics[scale=0.3]{sir_graph.png}
- \caption{A visualization of the SIR model, illustrating $N$ being split in the
- three groups $S$, $I$ and $R$.}
- \label{fig:sir_model}
- \end{figure}
- As visualized in~\Cref{fig:sir_model}, the
- individuals may transition between groups based on transition rates. The
- transmission rate $\beta$ is responsible for individuals becoming infected,
- while the rate of removal or recovery rate $\alpha$ (also referred to as
- $\delta$ or $\nu$, e.g.,~\cite{EdelsteinKeshet2005,Millevoi2023}) moves
- individuals from $I$ to $R$.\\
- We can describe this problem mathematically using a system of differential
- equations (see~\Cref{sec:differentialEq}). Thus, Kermack and
- McKendrick~\cite{1927} propose the following set of differential equations:
- \begin{equation}\label{eq:sir}
- \begin{split}
- \frac{dS}{dt} &= -\beta S I,\\
- \frac{dI}{dt} &= \beta S I - \alpha I,\\
- \frac{dR}{dt} &= \alpha I.
- \end{split}
- \end{equation}
- This, according to Edelstein-Keshet, is based on the following assumption:
- ``The rate of transmission of a microparasitic disease is proportional to the
- rate of encounter of susceptible and infective individuals modelled by the
- product ($\beta S I$)''~\cite{EdelsteinKeshet2005}. The system shows the change
- in size of the groups per time unit due to infections, recoveries, and deaths.\\
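- A quick consistency check, added here as an illustrative sketch, is to sum
- the three equations of~\Cref{eq:sir}:
- \begin{equation}
- \frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} = -\beta S I + (\beta S I - \alpha I) + \alpha I = 0.
- \end{equation}
- The total $S + I + R$ therefore does not change over time, which agrees with
- the assumption of a constant population size $N$ in~\Cref{eq:N_char}.\\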
- The term $\beta SI$ describes the rate of encounters of susceptible and infected
- individuals. This term is dependent on the size of $S$ and $I$, thus Anderson
- and May~\cite{Anderson1991} propose a modified model:
- \begin{equation}\label{eq:modSIR}
- \begin{split}
- \frac{dS}{dt} &= -\beta \frac{SI}{N},\\
- \frac{dI}{dt} &= \beta \frac{SI}{N} - \alpha I,\\
- \frac{dR}{dt} &= \alpha I.
- \end{split}
- \end{equation}
- Here, the term $\beta SI$ is normalized by the population size $N$, which is
- more realistic.\\
- The initial phase of a pandemic is characterized by the infection of a small
- number of individuals, while the majority of the population remains susceptible.
- The infectious group has not yet infected any individuals; thus,
- neither recovery nor mortality is possible. Let $I_0\in\mathbb{N}$ be
- the number of infected individuals at the beginning of the disease. Then,
- \begin{equation}
- \begin{split}
- S(0) &= N - I_{0},\\
- I(0) &= I_{0},\\
- R(0) &= 0,
- \end{split}
- \end{equation}
- describes the initial configuration of a system in which a disease has just
- emerged.\\
- \begin{figure}[h]
- \centering
- \begin{subfigure}[h]{0.3\textwidth}
- \centering
- \includegraphics[width=\textwidth]{reference_params_synth.png}
- \caption{Basic configuration, $\alpha=0.35$, $\beta=0.5$}
- \label{fig:synth_norm}
- \end{subfigure}
- \hfill
- \begin{subfigure}[h]{0.3\textwidth}
- \centering
- \includegraphics[width=\textwidth]{high_beta_synth.png}
- \caption{High $\alpha$ configuration, $\alpha=0.45$, $\beta=0.5$}
- \label{fig:synth_high_beta}
- \end{subfigure}
- \hfill
- \begin{subfigure}[h]{0.3\textwidth}
- \centering
- \includegraphics[width=\textwidth]{low_beta_synth.png}
- \caption{Low $\alpha$ configuration, $\alpha=0.25$, $\beta=0.5$}
- \label{fig:synth_low_beta}
- \end{subfigure}
- \hfill
- \begin{subfigure}[b]{0.3\textwidth}
- \centering
- \includegraphics[width=\textwidth]{high_alpha_synth.png}
- \caption{High $\beta$ configuration, $\alpha=0.35$, $\beta=0.6$}
- \label{fig:synth_high_alpha}
- \end{subfigure}
- \begin{subfigure}[b]{0.3\textwidth}
- \centering
- \includegraphics[width=\textwidth]{low_alpha_synth.png}
- \caption{Low $\beta$ configuration, $\alpha=0.35$, $\beta=0.3$}
- \label{fig:synth_low_alpha}
- \end{subfigure}
- \caption{Synthetic data, using~\Cref{eq:modSIR} and $N=7.9\cdot 10^6$,
- $I_0=10$ with different sets of parameters.}
- \label{fig:synth_sir}
- \end{figure}
- In the SIR model the temporal occurrence and the height of the peak (or peaks)
- of the infectious group are of paramount importance for understanding the
- dynamics of a pandemic. A low peak occurring at a late point in time indicates
- that the disease is unable to keep pace with the rate of recovery, resulting
- in its demise before it can exert a significant influence on the population. In
- contrast, an early and high peak means that the disease is rapidly transmitted
- through the population, with a significant proportion of individuals having been
- infected. \Cref{fig:synth_sir} illustrates the impact of modifying either
- $\beta$ or $\alpha$ while simulating a pandemic using a model such
- as~\Cref{eq:modSIR}. It is evident that both the transmission rate $\beta$
- and the recovery rate $\alpha$ influence the height and time of the peak of $I$.
- When the number of infections exceeds the number of recoveries, the peak of $I$
- will occur early and will be high. On the other hand, if recoveries occur at a
- faster rate than new infections, the peak will occur later and will be low. This
- means that it is crucial to know both $\beta$ and $\alpha$ to be able to
- quantify a pandemic using the SIR model.
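- The role of the two rates can be made explicit with a short calculation,
- given here as a sketch based only on~\Cref{eq:modSIR}. Factoring the second
- equation gives
- \begin{equation}
- \frac{dI}{dt} = \left(\beta\frac{S}{N} - \alpha\right) I,
- \end{equation}
- so $I$ grows as long as $\frac{S}{N} > \frac{\alpha}{\beta}$ and reaches its
- peak when $\frac{S}{N} = \frac{\alpha}{\beta}$. For the parameter set
- $\alpha=0.35$, $\beta=0.5$, this threshold is $\frac{0.35}{0.5} = 0.7$, i.e.,
- the peak occurs once the susceptible share of the population has dropped to
- $0.7$. For $\alpha=0.35$, $\beta=0.3$, the threshold is
- $\frac{0.35}{0.3}\approx 1.17$, which $\frac{S}{N}\leq 1$ can never exceed,
- so $I$ decreases from the start.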
- % -------------------------------------------------------------------
- \subsection{Reduced SIR Model and the Reproduction Number}
- \label{sec:pandemicModel:rsir}
- \Cref{sec:pandemicModel:sir} presents the classical SIR model. The model
- comprises two parameters $\beta$ and $\alpha$, which describe the course of a
- pandemic over its duration. This is beneficial when examining the overall
- pandemic; however, in the real world, disease behavior is dynamic, and the
- values of the parameters $\beta$ and $\alpha$ change over time. This is due to
- events such as the implementation of countermeasures
- that reduce the contact between the infectious and susceptible individuals, the
- emergence of a new variant of the disease that increases its infectivity or
- deadliness, or the administration of a vaccination that provides previously
- susceptible individuals with immunity without ever being infectious. To address
- this, Millevoi et al.~\cite{Millevoi2023} introduce a model that simultaneously
- reduces the size of the system of differential equations and solves the problem
- of time scaling at hand.\\
- First, they alter the definition of $\beta$ and $\alpha$ to be dependent on the time interval
- $\mathcal{T} = [t_0, t_f]\subseteq \mathbb{R}_{\geq0}$,
- \begin{equation}
- \beta: \mathcal{T}\rightarrow\mathbb{R}_{\geq0}, \quad\alpha: \mathcal{T}\rightarrow\mathbb{R}_{\geq0}.
- \end{equation}
- Another crucial element is $D(t) = \frac{1}{\alpha(t)}$, which represents the
- time span an infected individual requires to recuperate. Subsequently, at the
- initial time point $t_0$, the \emph{reproduction number},
- \begin{equation}
- \RO = \beta(t_0)D(t_0) = \frac{\beta(t_0)}{\alpha(t_0)},
- \end{equation}
- represents the number of susceptible individuals that one infectious individual
- infects at the onset of the pandemic. In light of the effects of $\beta$ and
- $\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO > 1$ indicates that the
- pandemic is emerging: infections occur faster than recoveries, so $I$ grows
- while $I(t_0) \ll S(t_0)$. When $\RO < 1$, each infectious individual infects
- fewer than one other individual on average, so $I$ decreases and the disease
- dies out. Nevertheless, $\RO$ does not cover the entire time
- span. For this reason, Millevoi et al.~\cite{Millevoi2023} introduce $\Rt$,
- which has the same interpretation as $\RO$, with the exception that $\Rt$ is
- dependent on time. The definition of the time-dependent reproduction number on
- the time interval $\mathcal{T}$ with the population size $N$,
- \begin{equation}\label{eq:repr_num}
- \Rt=\frac{\beta(t)}{\alpha(t)}\cdot\frac{S(t)}{N}
- \end{equation}
- combines the ratio of the transmission and recovery rates, which carries
- information about the spread of the disease, with the decreasing share of
- susceptible individuals in the population. In contrast to $\beta$ and $\alpha$,
- $\Rt$ is not a parameter but a state variable in the model, which enables the
- following reduction of the SIR model.\\
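- As a numerical illustration, take the parameter set of the basic configuration
- in~\Cref{fig:synth_sir}, with constant rates $\alpha=0.35$ and $\beta=0.5$,
- $N=7.9\cdot 10^6$ and $I_0=10$, and assume for this example that $t_0=0$. Then
- \begin{equation}
- D = \frac{1}{0.35}\approx 2.86, \qquad \RO = \frac{0.5}{0.35}\approx 1.43.
- \end{equation}
- Since $\frac{S(t_0)}{N} = \frac{N - I_0}{N}\approx 1$, \Cref{eq:repr_num}
- gives a value of $\Rt$ at $t_0$ that is approximately equal to $\RO$. As $S$
- decreases over the course of the pandemic, $\Rt$ falls below $\RO$.\\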
- \Cref{eq:N_char} allows for the calculation of the value of the group $R$ using
- $S$ and $I$, with the term $R(t)=N-S(t)-I(t)$. Thus,
- \begin{equation}
- \begin{split}
- \frac{dS}{dt} &= -\alpha\Rt I(t),\\
- \frac{dI}{dt} &= \alpha(\Rt-1)I(t),
- \end{split}
- \end{equation}
- is the reduction of~\Cref{eq:modSIR} on the time interval $\mathcal{T}$ using
- this characteristic and the reproduction number $\Rt$ (see~\Cref{eq:repr_num}).
- Another issue that Millevoi et al.~\cite{Millevoi2023} seek to address is the
- extensive range of values that the SIR groups can assume, spanning from $0$ to
- $10^7$. Accordingly, they initially scale the time interval $\mathcal{T}$ using
- its borders to calculate the scaled time $t_s = \frac{t - t_0}{t_f - t_0}\in
- [0, 1]$. Subsequently, they calculate the scaled groups,
- \begin{equation}
- S_s(t_s) = \frac{S(t)}{C},\quad I_s(t_s) = \frac{I(t)}{C},\quad R_s(t_s) = \frac{R(t)}{C},
- \end{equation}
- using a large constant scaling factor $C\in\mathbb{N}$. Applying this to the
- variable $I$ results in
- \begin{equation}
- \frac{dI_s}{dt_s} = \alpha(t_f - t_0)(\Rt - 1)I_s(t_s),
- \end{equation}
- a further reduced version of~\Cref{eq:modSIR}. This reduction results in a more
- streamlined and efficient process, as it entails the elimination of a parameter
- ($\beta$) and two state variables ($S$ and $R$), while adding the state variable
- $\Rt$. This is a crucial aspect for the automated resolution of such
- differential equation systems, as we describe in~\Cref{sec:mlp}.
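- The scaled equation for $I_s$ can be retraced with the chain rule; the
- following steps are a sketch that assumes the normalized form
- in~\Cref{eq:modSIR} as the starting point. Since $t = t_0 + (t_f - t_0)t_s$, we
- have $\frac{dt}{dt_s} = t_f - t_0$, and with $I_s = \frac{I}{C}$ and
- $\beta\frac{S}{N} = \alpha\Rt$ from~\Cref{eq:repr_num},
- \begin{equation}
- \begin{split}
- \frac{dI_s}{dt_s} &= \frac{1}{C}\frac{dI}{dt}\frac{dt}{dt_s}
- = \frac{t_f - t_0}{C}\left(\beta\frac{S}{N} - \alpha\right)I\\
- &= \alpha(t_f - t_0)(\Rt - 1)\frac{I}{C}
- = \alpha(t_f - t_0)(\Rt - 1)I_s.
- \end{split}
- \end{equation}
- The scaling factor $C$ cancels, so the equation keeps its form for any choice
- of $C$.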
- % -------------------------------------------------------------------
- \section{Multilayer Perceptron}
- \label{sec:mlp}
- % -------------------------------------------------------------------
- \section{Physics Informed Neural Networks}
- \label{sec:pinn}
- % -------------------------------------------------------------------
- \subsection{Disease Informed Neural Networks}
- \label{sec:pinn:dinn}
- % -------------------------------------------------------------------