|
@@ -117,10 +117,12 @@ Then, Newton's second law translates mathematically to
|
|
|
It is evident that the acceleration, $a=\frac{dv}{dt}$, as the rate of change of
|
|
|
the velocity, is part of the equation. Additionally, the velocity of a body is
|
|
|
the derivative of the distance traveled by that body. Based on these findings,
|
|
|
-we can rewrite the~\Cref{eq:newtonSecLaw} to
|
|
|
+we can rewrite~\Cref{eq:newtonSecLaw} as
|
|
|
\begin{equation}
|
|
|
- F=ma=m\frac{d^2s}{dt^2}.
|
|
|
-\end{equation}\\
|
|
|
+ F=ma=m\frac{d^2s}{dt^2},
|
|
|
+\end{equation}
|
|
|
+showing that the force $F$ governs how the body's position $s$ changes over
|
|
|
+time.\\
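+
+To make this concrete, the following minimal Python sketch (an illustration
+only; the mass and force values are arbitrary assumptions) integrates the
+equation above by rewriting it as a first-order system:
+\begin{verbatim}
+from scipy.integrate import solve_ivp
+
+# Newton's second law, m * s'' = F, rewritten as the first-order
+# system s' = v, v' = F / m.
+m, F = 2.0, 10.0  # assumed mass [kg] and constant force [N]
+
+def rhs(t, y):
+    s, v = y
+    return [v, F / m]
+
+# Integrate from rest at the origin over five seconds.
+sol = solve_ivp(rhs, t_span=(0.0, 5.0), y0=[0.0, 0.0])
+print(sol.y[0, -1])  # final position; analytically F*t^2/(2m) = 62.5
+\end{verbatim}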
|
|
|
|
|
|
To conclude, note that this explanation of differential equations focuses on the
|
|
|
aspects deemed crucial for this thesis and is not intended to be a complete
|
|
@@ -323,18 +325,18 @@ faster rate than new infections the peak will occur later and will be low. Thus,
|
|
|
it is crucial to know both $\beta$ and $\alpha$, as these parameters
|
|
|
characterize how the pandemic evolves.\\
|
|
|
|
|
|
-The SIR model makes a number of assumptions that are intended to reduce the
|
|
|
-model's overall complexity while simultaneously increasing its divergence from
|
|
|
-actual reality. One such assumption is that the size of the population, $N$,
|
|
|
-remains constant, as the daily change is negligible to the total population.
|
|
|
-This depiction is not an accurate representation of the actual relations \todo{other assumptions in a bad light?}
|
|
|
-observed in the real world, as the size of a population is subject to a number
|
|
|
-of factors that can contribute to change. The population is increased by the
|
|
|
-occurrence of births and decreased by the occurrence of deaths. Other examples
|
|
|
-are the impossibility for individuals to be susceptible again, after having
|
|
|
-recovered, or the possibility for the transition rates to change due to new
|
|
|
-variants or the implementation of new countermeasures. We address this latter
|
|
|
-option in the next~\Cref{sec:pandemicModel:rsir}.
|
|
|
+The SIR model is based on a number of assumptions that are intended to reduce
|
|
|
+the overall complexity of the model while still representing the processes
|
|
|
+observed in the real world. For example, the size of a population
|
|
|
+in the real world is subject to a number of factors that can contribute to
|
|
|
+change. The population is increased by the occurrence of births and decreased
|
|
|
+by the occurrence of deaths. One assumption of the SIR model is that
|
|
|
+the size of the population, $N$, remains constant, as the daily change is
|
|
|
+negligible compared to the total population. Other examples include the impossibility
|
|
|
+for individuals to become susceptible again after having recovered, or the
|
|
|
+possibility for the transition rates to change due to new variants or the
|
|
|
+implementation of new countermeasures. We address this latter option
|
|
|
+in~\Cref{sec:pandemicModel:rsir}.
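+
+To illustrate how $\beta$ and $\alpha$ shape the course of an epidemic under
+the constant-population assumption, the following minimal Python sketch
+integrates the SIR dynamics of~\Cref{eq:sir}; all parameter values are
+illustrative assumptions:
+\begin{verbatim}
+from scipy.integrate import solve_ivp
+
+N, beta, alpha = 1_000_000, 0.35, 0.1  # assumed population and rates
+
+def sir(t, y):
+    S, I, R = y
+    new_infections = beta * S * I / N
+    recoveries = alpha * I
+    return [-new_infections, new_infections - recoveries, recoveries]
+
+# Start with ten infectious individuals; N stays constant by construction.
+sol = solve_ivp(sir, (0, 300), [N - 10, 10, 0], max_step=1.0)
+print(sol.y[1].max())  # approximate height of the infection peak
+\end{verbatim}
+Increasing $\alpha$ relative to $\beta$ in this sketch lowers and delays the
+peak, matching the behavior described above.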
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
@@ -350,11 +352,11 @@ countermeasures that reduce the contact between the infectious and susceptible
|
|
|
individuals, the emergence of a new variant of the disease that increases its
|
|
|
infectivity or deadliness, or the administration of a vaccination that provides
|
|
|
previously susceptible individuals with immunity without ever being infected.
|
|
|
-To address this, based on the time-dependent transition rates introduced by Liu
|
|
|
-and Stechlinski~\cite{Liu2012}, and Setianto and Hidayat~\cite{Setianto2023},
|
|
|
-Millevoi \etal~\cite{Millevoi2023} present a model that simultaneously reduces
|
|
|
-the size of the system of differential equations and solves the problem of time
|
|
|
-scaling at hand.\\
|
|
|
+Since constant transition rates cannot capture such time-dependent effects,
|
|
|
+Liu and Stechlinski~\cite{Liu2012}, and Setianto and Hidayat~\cite{Setianto2023},
|
|
|
+introduce time-dependent transition rates and the time-dependent reproduction
|
|
|
+number to address this issue. Building on this, Millevoi \etal~\cite{Millevoi2023} present a
|
|
|
+reduced version of the SIR model.\\
|
|
|
|
|
|
First, they alter the definition of $\beta$ and $\alpha$ to be dependent on the time interval
|
|
|
$\mathcal{T} = [t_0, t_f]\subseteq \mathbb{R}_{\geq0}$,
|
|
@@ -371,7 +373,7 @@ represents the number of susceptible individuals, that one infectious individual
|
|
|
infects at the onset of the pandemic. In light of the effects of $\beta$ and
|
|
|
$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO < 1$ indicates that the
|
|
|
pandemic is subsiding, as recoveries outpace new infections. In this scenario, the
|
|
|
-limited number of infections resulting from $I(t_0) << S(t_0)$.\\ Further,
|
|
|
+number of new infections remains limited, as $I(t_0) \ll S(t_0)$. Further,
|
|
|
$\RO > 1$ leads to the disease spreading rapidly across the population, with an
|
|
|
increase in $I$ occurring at a high rate. Nevertheless, $\RO$ does not cover
|
|
|
the entire time span. For this reason, Millevoi \etal~\cite{Millevoi2023}
|
|
@@ -381,12 +383,13 @@ defined as,
|
|
|
\begin{equation}\label{eq:repr_num}
|
|
|
\Rt=\frac{\beta(t)}{\alpha(t)}\cdot\frac{S(t)}{N},
|
|
|
\end{equation}
|
|
|
-on the time interval $\mathcal{T}$. This definition includes the transition
|
|
|
-rates for information about the spread of the disease and information of the
|
|
|
-decrease of the ratio of susceptible individuals in the population. In contrast
|
|
|
-to $\beta$ and $\alpha$, $\Rt$ is not a parameter but \todo{Sai comment - earlier?}
|
|
|
-a state variable in the model and enabling the following reduction of the SIR
|
|
|
-model.\\
|
|
|
+on the time interval $\mathcal{T}$ and the population size $N$. This definition
|
|
|
+includes the transition rates, which carry information about the spread of the
|
|
|
+disease, as well as the decreasing ratio of susceptible individuals in the
|
|
|
+population. In contrast to $\beta$ and $\alpha$, $\Rt$ is not a parameter but
|
|
|
+a state variable in the model, which describes the reproduction of the disease
|
|
|
+for each day. As Millevoi \etal~\cite{Millevoi2023} show, $\Rt$ enables the
|
|
|
+following reduction of the SIR model.\\
|
|
|
|
|
|
\Cref{eq:N_char} allows for the calculation of the value of the group $R$ using
|
|
|
$S$ and $I$, with the term $R(t)=N-S(t)-I(t)$. Thus,
|
|
@@ -414,19 +417,20 @@ variable $I$, results in,
|
|
|
which is a further reduced version of~\Cref{eq:sir}. This less complex
|
|
|
differential equation results in a less complex solution, as it entails the
|
|
|
elimination of a parameter ($\beta$) and the two state variables ($S$ and $R$).
|
|
|
-The reduced SIR model, is more precise in applications with a worse data
|
|
|
-situation, due to its fewer input variables.
|
|
|
+The reduced SIR model requires fewer input variables, making it
|
|
|
+advantageous in situations with limited data, such as when recovery data is
|
|
|
+missing.
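+
+A minimal sketch of this reduction: substituting~\Cref{eq:repr_num} into the
+infection dynamics of~\Cref{eq:sir} yields
+$\frac{dI}{dt}=\alpha(t)\left(\Rt-1\right)I(t)$, a single ODE in $I$. The
+Python fragment below integrates it; the recovery rate and the course of
+$\Rt$ are assumed placeholder choices:
+\begin{verbatim}
+import numpy as np
+from scipy.integrate import solve_ivp
+
+alpha = 0.1                            # assumed constant recovery rate
+R_t = lambda t: 2.5 * np.exp(-t / 50)  # assumed decaying reproduction number
+
+# dI/dt = alpha * (R_t - 1) * I: infections grow while R_t > 1
+# and decay once R_t falls below one.
+def reduced_sir(t, y):
+    return alpha * (R_t(t) - 1.0) * y
+
+sol = solve_ivp(reduced_sir, (0, 150), [10.0])
+print(sol.y[0, -1])  # infectious count at the end of the time span
+\end{verbatim}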
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\section{Multilayer Perceptron}
|
|
|
\label{sec:mlp}
|
|
|
-In~\Cref{sec:differentialEq}, we demonstrate the significance of differential
|
|
|
+In~\Cref{sec:differentialEq}, we discuss the modeling of systems using differential
|
|
|
equations, illustrating how they can be utilized to elucidate the
|
|
|
impact of a specific parameter on the system's behavior.
|
|
|
In~\Cref{sec:epidemModel}, we show specific applications of differential
|
|
|
-equations in an epidemiological context. The final objective is to solve these
|
|
|
-equations by finding a function that fits. Fitting measured data points to
|
|
|
+equations in an epidemiological context. Solving such systems is crucial and
|
|
|
+involves finding a function that satisfies the equations. Fitting measured data points to
|
|
|
approximate such a function is one of several methods to achieve this
|
|
|
goal. The \emph{Multilayer Perceptron} (MLP)~\cite{Rumelhart1986} is a
|
|
|
data-driven function approximator. In the following section, we provide a brief
|
|
@@ -499,7 +503,7 @@ calculates the squared difference between each model prediction and true value
|
|
|
of a training sample and takes the mean across the whole training data.\\
|
|
|
|
|
|
Ultimately, the objective is to utilize this information to optimize the parameters in order to minimize the
|
|
|
-loss. One of the most fundamental optimization strategy is \emph{gradient
|
|
|
+loss. One of the most fundamental optimization strategies is \emph{gradient
|
|
|
descent}. In this process, the derivatives are employed to identify the location
|
|
|
of local or global minima within a function, which lie where the gradient is
|
|
|
zero. Given that a positive gradient
|
|
@@ -525,7 +529,7 @@ error backwards through the neural network.\\
|
|
|
|
|
|
In practical applications, an optimizer often accomplishes the optimization task
|
|
|
by executing backpropagation in the background. Furthermore, modifying the
|
|
|
-learning rate during training can be advantageous. For instance, making larger \todo{leave whole paragraph out? - Niklas}
|
|
|
+learning rate during training can be advantageous. For instance, making larger
|
|
|
steps at the beginning and minor adjustments at the end. Therefore, schedulers
|
|
|
are algorithms that employ diverse learning rate alteration
|
|
|
strategies.\\
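+
+The following minimal PyTorch sketch combines an optimizer with a learning
+rate scheduler; the model, learning rates, and schedule are placeholder
+assumptions:
+\begin{verbatim}
+import torch
+
+model = torch.nn.Linear(1, 1)  # placeholder model
+optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
+# Halve the learning rate every 100 steps: large steps early,
+# minor adjustments late in training.
+scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100,
+                                            gamma=0.5)
+
+x = torch.linspace(0, 1, 32).unsqueeze(1)
+y = 3 * x + 1  # toy regression target
+
+for step in range(300):
+    optimizer.zero_grad()
+    loss = torch.mean((model(x) - y) ** 2)  # mean squared error
+    loss.backward()   # backpropagation of the error
+    optimizer.step()  # parameter update
+    scheduler.step()  # learning rate adjustment
+\end{verbatim}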
|
|
@@ -542,40 +546,47 @@ solutions to differential systems.
|
|
|
\label{sec:pinn}
|
|
|
|
|
|
In~\Cref{sec:mlp}, we describe the structure and training of MLPs, which are
|
|
|
-wildely recognized tools for approximating any kind of function. In this
|
|
|
-section, we apply this capability to create a solver for ODE's and PDE's
|
|
|
-as Legaris \etal~\cite{Lagaris1997} describe in their paper. In this approach,
|
|
|
-the model learns to approximate a function using provided data points while
|
|
|
+widely recognized tools for approximating any kind of function. In 1997,
|
|
|
+Lagaris \etal~\cite{Lagaris1997} provide a method that utilizes gradient
|
|
|
+descent to solve ODEs and PDEs. Building on this approach, Raissi
|
|
|
+\etal~\cite{Raissi2019} introduce the methodology under the name
|
|
|
+\emph{Physics-Informed Neural Network} (PINN) in 2017. In this approach, the
|
|
|
+model learns to approximate a function using provided data points while
|
|
|
leveraging the available knowledge about the problem in the form of a system of
|
|
|
-differential equations. The \emph{physics-informed neural network} (PINN)
|
|
|
-learns the system of differential equations during training, as it optimizes
|
|
|
-its output to align with the equations.\\
|
|
|
-
|
|
|
-In contrast to standard MLP's, PINNs are not only data-driven. The loss term of a PINN comprises two
|
|
|
-components. The first term incorporates the equations of the aforementioned prior knowledge to pertinent the problem. As Raissi
|
|
|
-\etal~\cite{Raissi2017} propose, the residual of each differential equation in
|
|
|
-the system must be minimized in order for the model to optimize its output in accordance with the theory.
|
|
|
-We obtain the residual $r_i$, with $i\in\{1, ...,N_d\}$, by rearranging the differential equation and
|
|
|
-calculating the difference between the left-hand side and the right-hand side
|
|
|
-of the equation. $N_d$ is the number of differential equations in a system. As
|
|
|
-Raissi \etal~\cite{Raissi2017} propose the \emph{physics
|
|
|
- loss} of a PINN,
|
|
|
+differential equations.\\
|
|
|
+
|
|
|
+In contrast to standard MLP models, PINNs are not solely data-driven. The differential
|
|
|
+equation,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
|
|
|
+ \boldsymbol{y}=\mathcal{D}(\boldsymbol{x}),
|
|
|
\end{equation}
|
|
|
-takes the input data and the model prediction to calculate the mean square
|
|
|
-error of the residuals. The second term, the \emph{observation loss}
|
|
|
-$\Loss{obs}$, employs the mean square error of the distances between the
|
|
|
-predicted and the true values for each training point. Additionally, the
|
|
|
-observation loss may incorporate extra terms of inital and boundary conditions. Let $N_t$
|
|
|
-denote the number of training points. Then,
|
|
|
-\begin{equation}
|
|
|
- \mathcal{L}_{PINN}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
|
|
|
-\end{equation}\\
|
|
|
-represents the comprehensive loss function of a physics-informed neural network. \\
|
|
|
-
|
|
|
-Given the nature of residuals, calculating the loss term of
|
|
|
-$\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
|
|
|
+includes both the solution $\boldsymbol{y}$ and the operator $\mathcal{D}$,
|
|
|
+which incorporates all derivatives with respect to the input $\boldsymbol{x}$.
|
|
|
+This equation encodes the physical properties and dynamics of
|
|
|
+$\boldsymbol{y}$. In order to find the solution $\boldsymbol{y}$, we must solve the
|
|
|
+differential equation with respect to data related to the problem at hand. As
|
|
|
+Raissi \etal~\cite{Raissi2019} propose, we employ a neural network with the
|
|
|
+parameters $\theta$. The MLP is then supposed to optimize its parameters so that
|
|
|
+its output $\hat{\boldsymbol{y}}$ approximates the solution $\boldsymbol{y}$.
|
|
|
+In order to achieve this, we train the model on data containing input-output
|
|
|
+pairs with measurements of $\boldsymbol{y}$. The output $\hat{\boldsymbol{y}}$ is
|
|
|
+fitted to the data through the mean square error data loss $\mathcal{L}_{\text{data}}$.
|
|
|
+Moreover, the data loss function may include additional terms for initial and boundary
|
|
|
+conditions. The physics are incorporated through an additional loss
|
|
|
+term, the physics loss $\mathcal{L}_{\text{physics}}$, which includes the
|
|
|
+differential equation through its residual $r=\boldsymbol{y} - \mathcal{D}(\boldsymbol{x})$.
|
|
|
+This leads to the PINN loss function,
|
|
|
+\begin{equation}\label{eq:PINN_loss}
+  \begin{split}
+    \mathcal{L}_{\text{PINN}}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) & = \mathcal{L}_{\text{data}}(\boldsymbol{y},\hat{\boldsymbol{y}}) + \mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})\\
+    & = \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2 + \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
+  \end{split}
+\end{equation}
|
|
|
+with $N_d$ the number of differential equations in a system and $N_t$ the
|
|
|
+number of training samples. Utilizing~\Cref{eq:PINN_loss}, the
|
|
|
+PINN simultaneously optimizes its parameters $\theta$ to minimize both the data
|
|
|
+loss and the physics loss. This makes it a multi-objective optimization problem.\\
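+
+A sketch of~\Cref{eq:PINN_loss} in PyTorch for a single assumed ODE,
+$\frac{dy}{dx}+y=0$, so that the residual is
+$r=\frac{d\hat{y}}{dx}+\hat{y}$; the network architecture is a placeholder:
+\begin{verbatim}
+import torch
+
+net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
+                          torch.nn.Linear(32, 1))
+
+def pinn_loss(x_data, y_data, x_phys):
+    # Data loss: mean squared error on measured input-output pairs.
+    loss_data = torch.mean((net(x_data) - y_data) ** 2)
+
+    # Physics loss: mean squared residual of the assumed ODE y' + y = 0,
+    # with dy/dx obtained via automatic differentiation.
+    x = x_phys.requires_grad_(True)
+    y_hat = net(x)
+    dy_dx = torch.autograd.grad(y_hat, x, torch.ones_like(y_hat),
+                                create_graph=True)[0]
+    loss_physics = torch.mean((dy_dx + y_hat) ** 2)
+
+    return loss_data + loss_physics
+\end{verbatim}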
|
|
|
+
|
|
|
+Given the nature of differential equations, calculating the loss term of
|
|
|
+$\mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
|
|
|
calculation of the derivative of the output with respect to the input of
|
|
|
the neural network. As we outline in~\Cref{sec:mlp}, during the process of
|
|
|
back-propagation we calculate the gradients of the loss term with respect to a
|
|
@@ -587,26 +598,28 @@ compute the respective gradients. The term,
|
|
|
\begin{equation}
|
|
|
\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} = \frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\boldsymbol{x}}f^{(1)},
|
|
|
\end{equation}
|
|
|
-illustrates that, in contrast to the procedure described in~\cref{eq:backprop},
|
|
|
-this procedure the \emph{automatic differenciation} goes one step further and
|
|
|
+illustrates that, in contrast to the procedure described in~\Cref{eq:backprop},
|
|
|
+\emph{automatic differentiation} goes one step further and
|
|
|
calculates the gradient of the output with respect to the input
|
|
|
$\boldsymbol{x}$. In order to calculate the second derivative
|
|
|
$\frac{d^2\hat{\boldsymbol{y}}}{d\boldsymbol{x}^2}=\nabla_{\boldsymbol{x}} (\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}}),$
|
|
|
this procedure must be repeated.\\
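+
+In PyTorch, for example, this repetition corresponds to two nested calls of
+the automatic differentiation routine; the sine function below merely stands
+in for a network output:
+\begin{verbatim}
+import torch
+
+x = torch.linspace(0, 1, 10).reshape(-1, 1).requires_grad_(True)
+y = torch.sin(3 * x)  # stands in for a model output y_hat(x)
+
+# First derivative dy/dx; create_graph=True keeps the computational
+# graph so that the result can be differentiated again.
+dy = torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
+
+# Repeating the procedure yields the second derivative d2y/dx2.
+d2y = torch.autograd.grad(dy, x, torch.ones_like(dy), create_graph=True)[0]
+\end{verbatim}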
|
|
|
|
|
|
-Above we present a method for approximating functions through the use of
|
|
|
-systems of differential equations. As previously stated, we want to find a
|
|
|
-solver for systems of differential equations. In problems, where we must solve
|
|
|
+Above we present a method by Raissi \etal~\cite{Raissi2019} for approximating
|
|
|
+functions through the use of systems of differential equations. As previously
|
|
|
+stated, we want to find a
|
|
|
+solution for systems of differential equations. In problems where we must solve
|
|
|
an ODE or PDE, we have to find a set of parameters that satisfies the system
|
|
|
-for any input $\boldsymbol{x}$. In terms of the context of PINN's this is the
|
|
|
-inverse problem, where we have a set of training data from measurements, for
|
|
|
-example, is available along with the respective differential equations but
|
|
|
-information about the parameters of the equations is lacking. To address this
|
|
|
-challenge, we set these parameters as distinct learnable parameters within the
|
|
|
-neural network. This enables the network to utilize a specific value, that
|
|
|
-actively influences the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$.
|
|
|
-During the training phase the optimizer aims to minimize the physics loss,
|
|
|
-which should ultimately yield an approximation of the true value.\\
|
|
|
+for any input $\boldsymbol{x}$. In the context of PINNs, this is an inverse
|
|
|
+problem. We have training data from measurements and the corresponding
|
|
|
+differential equations, but the parameters of these equations are unknown. To
|
|
|
+address this challenge, we implement these parameters as distinct learnable
|
|
|
+parameters within the neural network. This enables the network to utilize a
|
|
|
+specific value that actively influences the physics loss
|
|
|
+$\mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})$. During the
|
|
|
+training phase, the optimizer aims to minimize the physics loss, which should
|
|
|
+ultimately yield an approximation of the true parameter value fitting the
|
|
|
+observations.\\
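+
+A sketch of this construction in PyTorch: an assumed unknown equation
+parameter (here named \texttt{mu}) is registered as a learnable parameter
+next to the network weights $\theta$, so the optimizer updates both jointly:
+\begin{verbatim}
+import torch
+
+class InversePINN(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.net = torch.nn.Sequential(torch.nn.Linear(1, 32),
+                                       torch.nn.Tanh(),
+                                       torch.nn.Linear(32, 1))
+        # Unknown equation parameter, learnable alongside the weights;
+        # its value enters the physics loss through the residual.
+        self.mu = torch.nn.Parameter(torch.tensor(1.0))
+
+    def forward(self, x):
+        return self.net(x)
+\end{verbatim}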
|
|
|
|
|
|
\begin{figure}[h]
|
|
|
\centering
|
|
@@ -615,11 +628,11 @@ which should ultimately yield an approximation of the true value.\\
|
|
|
underdamped case. With $m=1kg$, $\mu=4\frac{Ns}{m}$ and $k=200\frac{N}{m}$.}
|
|
|
\label{fig:spring}
|
|
|
\end{figure}
|
|
|
-One illustrative example of a potential application for PINN's is the
|
|
|
-\emph{damped harmonic oscillator}~\cite{Demtroeder2021}. In this problem, we
|
|
|
+In order to illustrate how a PINN works, we use the example of a
|
|
|
+\emph{damped harmonic oscillator} taken from~\cite{Moseley}. In this problem, we
|
|
|
displace a body, which is attached to a spring, from its resting position. The
|
|
|
body is subject to three forces: firstly, the inertia exerted by the
|
|
|
-displacement $u$, which points in the direction the displacement $u$; secondly
|
|
|
+displacement $u$, which points in the direction of the displacement; secondly,
|
|
|
the restoring force of the spring, which attempts to return the body to its
|
|
|
original position; and thirdly, the friction force, which points in the opposite
|
|
|
direction of the movement. In accordance with Newton's second law and the
|
|
@@ -631,7 +644,7 @@ stiffness of the spring. The residual of the differential equation,
|
|
|
    m\frac{d^2u}{dt^2}+\mu\frac{du}{dt}+ku=0,
|
|
|
\end{equation}
|
|
|
shows the relation of these parameters in reference to the problem at hand. As
|
|
|
-Tenenbaum and Morris provide, there are three potential solutions to this
|
|
|
+Tenenbaum and Pollard~\cite{Tenenbaum1985} provide, there are three potential solutions to this
|
|
|
issue. However, only the \emph{underdamped case} results in an oscillating
|
|
|
movement of the body, as illustrated in~\Cref{fig:spring}. In order to apply a
|
|
|
PINN to this problem, we require a set of training data $x$. This consists of
|
|
@@ -641,22 +654,29 @@ we know the mass $m=1kg$, and the spring constant $k=200\frac{N}{m}$ and the
|
|
|
initial displacement $u^{(1)} = 1$ and $\frac{du(0)}{dt} = 0$. However, we do
|
|
|
not know the value of the friction $\mu$. In this case, the loss function,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{osc}(\boldsymbol{x}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = (u^{(1)}-1)+\frac{du(0)}{dt}+||m\frac{d^2u}{dx^2}+\mu\frac{du}{dx}+ku||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{u}}^{(i)}-\boldsymbol{u}^{(i)}||^2,
|
|
|
+ \begin{split}
|
|
|
+    \mathcal{L}_{\text{osc}}(\boldsymbol{x}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = & (\hat{u}^{(1)}-1)^2+\left(\frac{d\hat{u}(0)}{dt}\right)^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{u}}^{(i)}-\boldsymbol{u}^{(i)}||^2 \\
|
|
|
+    + & ||m\frac{d^2\hat{u}}{dt^2}+\mu\frac{d\hat{u}}{dt}+k\hat{u}||^2,
|
|
|
+ \end{split}
|
|
|
\end{equation}
|
|
|
includes the border conditions, the residual, in which $\mu$ is a learnable
|
|
|
-parameter and the observation loss.
|
|
|
+parameter, and the data loss. This demonstrates how PINNs are capable
|
|
|
+of learning the parameters of physical systems, such as the damped harmonic oscillator.
|
|
|
+In the following section, we present the approach of Shaier \etal~\cite{Shaier2021}
|
|
|
+to find the transmission rate and recovery rate of the SIR model using PINNs.
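+
+A sketch of the corresponding physics residual in PyTorch, with $m$ and $k$
+fixed to the known values and the friction $\mu$ learnable (its initial value
+is an assumption):
+\begin{verbatim}
+import torch
+
+m, k = 1.0, 200.0                           # known mass and stiffness
+mu = torch.nn.Parameter(torch.tensor(1.0))  # unknown friction, learnable
+net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
+                          torch.nn.Linear(32, 1))
+
+def oscillator_residual(t):
+    # Residual m*u'' + mu*u' + k*u, evaluated on the prediction u_hat(t).
+    t = t.requires_grad_(True)
+    u = net(t)
+    du = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
+    d2u = torch.autograd.grad(du, t, torch.ones_like(du),
+                              create_graph=True)[0]
+    return m * d2u + mu * du + k * u
+
+# During training, pass [mu] together with net.parameters() to the
+# optimizer so that the friction estimate is updated as well.
+\end{verbatim}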
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\subsection{Disease-Informed Neural Networks}
|
|
|
\label{sec:pinn:dinn}
|
|
|
-In this section, we describe the capability of MLP's to solve systems of
|
|
|
-differential equations. In~\Cref{sec:pandemicModel:sir}, we describe the SIR
|
|
|
-model, which models the relations of susceptible, infectious and removed
|
|
|
-individuals and simulates the progress of a disease in a population with a
|
|
|
-constant size. A system of differential equations models these relations. Shaier
|
|
|
-\etal~\cite{Shaier2021} propose a method to solve the equations of the SIR model
|
|
|
-using a PINN, which they call a \emph{disease-informed neural network} (DINN).\\
|
|
|
+In the preceding section, we present a methodology, based on the work of Lagaris
|
|
|
+\etal~\cite{Lagaris1997} and Raissi \etal~\cite{Raissi2019}, for solving systems of differential equations by employing
|
|
|
+PINNs. In~\Cref{sec:pandemicModel:sir}, we describe the SIR model, which models
|
|
|
+the relations of susceptible, infectious and removed individuals and simulates
|
|
|
+the progress of a disease in a population with a constant size. A system of
|
|
|
+differential equations models these relations. Shaier \etal~\cite{Shaier2021}
|
|
|
+propose a method to solve the equations of the SIR model using a PINN, which
|
|
|
+they call a \emph{Disease-Informed Neural Network} (DINN).\\
|
|
|
|
|
|
To solve~\Cref{eq:sir} we need to find the transmission rate $\beta$ and the
|
|
|
recovery rate $\alpha$. As Shaier \etal~\cite{Shaier2021} point out, there are
|
|
@@ -672,7 +692,7 @@ could be defined using the amount of days a person between the point of
|
|
|
infection and the start of isolation $d$, $\alpha = \frac{1}{d}$. The analytical
|
|
|
solutions to the SIR model often use heuristic methods and require knowledge
|
|
|
such as the initial sizes $S_0$ and $I_0$. A data-driven approach such as the one that
|
|
|
-Shaier \etal~\cite{Shaier2021} propose does not have these problems. Since the
|
|
|
+Shaier \etal~\cite{Shaier2021} propose does not suffer from these problems, since the
|
|
|
model learns the parameters $\beta$ and $\alpha$ while fitting the training
|
|
|
data consisting of the time points $\boldsymbol{t}$ and the corresponding
|
|
|
measured sizes of the groups $\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}$.
|
|
@@ -684,11 +704,10 @@ and $r_R=\frac{d \hat{\boldsymbol{R}}}{dt} - \alpha \hat{\boldsymbol{I}}$ the
|
|
|
residuals of each differential equation using the model predictions. Then,
|
|
|
\begin{equation}
|
|
|
\begin{split}
|
|
|
- \mathcal{L}_{SIR}() = ||r_S||^2 + ||r_I||^2 + ||r_R||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}||^2 &+\\
|
|
|
- ||\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}||^2 &+\\
|
|
|
- ||\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}||^2 &,
|
|
|
+    \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\,||r_S||^2 + ||r_I||^2 + ||r_R||^2\\
|
|
|
+    + &\frac{1}{N_t}\sum_{i=1}^{N_t} \left(||\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}||^2 + ||\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}||^2 + ||\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}||^2\right),
|
|
|
\end{split}
|
|
|
\end{equation}
|
|
|
-is the loss function of a DINN, with $\alpha$ and $beta$ being learnable
|
|
|
-parameters.
|
|
|
+is the loss function of a DINN, with $\alpha$ and $\beta$ being learnable
|
|
|
+parameters that enter the loss through the residuals of the ODEs.
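+
+A condensed PyTorch sketch of this loss (the architecture and initial values
+are placeholder assumptions; the contact term uses the normalization
+of~\Cref{eq:sir}):
+\begin{verbatim}
+import torch
+
+class DINN(torch.nn.Module):
+    def __init__(self, N):
+        super().__init__()
+        self.net = torch.nn.Sequential(torch.nn.Linear(1, 64),
+                                       torch.nn.Tanh(),
+                                       torch.nn.Linear(64, 3))  # S, I, R
+        self.beta = torch.nn.Parameter(torch.tensor(0.5))   # learnable
+        self.alpha = torch.nn.Parameter(torch.tensor(0.5))  # learnable
+        self.N = N
+
+    def loss(self, t, S, I, R):
+        t = t.requires_grad_(True)
+        S_hat, I_hat, R_hat = self.net(t).unbind(dim=1)
+        d_dt = lambda y: torch.autograd.grad(
+            y, t, torch.ones_like(y), create_graph=True)[0].squeeze(1)
+        # Residuals of the three SIR equations.
+        r_S = d_dt(S_hat) + self.beta * S_hat * I_hat / self.N
+        r_I = d_dt(I_hat) - self.beta * S_hat * I_hat / self.N \
+              + self.alpha * I_hat
+        r_R = d_dt(R_hat) - self.alpha * I_hat
+        physics = (r_S ** 2 + r_I ** 2 + r_R ** 2).mean()
+        data = ((S_hat - S) ** 2 + (I_hat - I) ** 2
+                + (R_hat - R) ** 2).mean()
+        return physics + data
+\end{verbatim}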
|
|
|
% -------------------------------------------------------------------
|