|
@@ -117,10 +117,12 @@ Then, Newton's second law translates mathematically to
|
|
|
It is evident that the acceleration, $a=\frac{dv}{dt}$, as the rate of change of
|
|
|
the velocity, is part of the equation. Additionally, the velocity of a body is
|
|
|
the derivative of the distance traveled by that body. Based on these findings,
|
|
|
-we can rewrite the~\Cref{eq:newtonSecLaw} to
|
|
|
+we can rewrite~\Cref{eq:newtonSecLaw} as
|
|
|
\begin{equation}
|
|
|
- F=ma=m\frac{d^2s}{dt^2}.
|
|
|
-\end{equation}\\
|
|
|
+ F=ma=m\frac{d^2s}{dt^2},
|
|
|
+\end{equation}
|
|
|
+showing that the force $F$ governs how the body's position $s$ changes over
|
|
|
+time.\\
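+
+To make this concrete, the following minimal Python sketch (an illustration
+only; the mass and force values are arbitrary assumptions) integrates the
+equation above by rewriting it as a first-order system:
+\begin{verbatim}
+from scipy.integrate import solve_ivp
+
+# Newton's second law, m * s'' = F, rewritten as the first-order
+# system s' = v, v' = F / m.
+m, F = 2.0, 10.0  # assumed mass [kg] and constant force [N]
+
+def rhs(t, y):
+    s, v = y
+    return [v, F / m]
+
+# Integrate from rest at the origin over five seconds.
+sol = solve_ivp(rhs, t_span=(0.0, 5.0), y0=[0.0, 0.0])
+print(sol.y[0, -1])  # final position; analytically F*t^2/(2m) = 62.5
+\end{verbatim}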
|
|
|
|
|
|
To conclude, note that this explanation of differential equations focuses on the
|
|
|
aspects deemed crucial for this thesis and is not intended to be a complete
|
|
@@ -323,18 +325,18 @@ faster rate than new infections the peak will occur later and will be low. Thus,
|
|
|
it is crucial to know both $\beta$ and $\alpha$, as these parameters
|
|
|
characterize how the pandemic evolves.\\
|
|
|
|
|
|
-The SIR model makes a number of assumptions that are intended to reduce the
|
|
|
-model's overall complexity while simultaneously increasing its divergence from
|
|
|
-actual reality. One such assumption is that the size of the population, $N$,
|
|
|
-remains constant, as the daily change is negligible to the total population.
|
|
|
-This depiction is not an accurate representation of the actual relations \todo{other assumptions in a bad light?}
|
|
|
-observed in the real world, as the size of a population is subject to a number
|
|
|
-of factors that can contribute to change. The population is increased by the
|
|
|
-occurrence of births and decreased by the occurrence of deaths. Other examples
|
|
|
-are the impossibility for individuals to be susceptible again, after having
|
|
|
-recovered, or the possibility for the transition rates to change due to new
|
|
|
-variants or the implementation of new countermeasures. We address this latter
|
|
|
-option in the next~\Cref{sec:pandemicModel:rsir}.
|
|
|
+The SIR model is based on a number of assumptions that are intended to reduce
|
|
|
+the overall complexity of the model while still representing the processes
|
|
|
+observed in the real world. For example, the size of a population
|
|
|
+in the real world is subject to a number of factors that can contribute to
|
|
|
+change. The population is increased by the occurrence of births and decreased
|
|
|
+by the occurrence of deaths. One assumption of the SIR model is that
|
|
|
+the size of the population, $N$, remains constant, as the daily change is
|
|
|
+negligible compared to the total population. Other examples include the impossibility
|
|
|
+for individuals to become susceptible again after having recovered, or the
|
|
|
+possibility for the transition rates to change due to new variants or the
|
|
|
+implementation of new countermeasures. We address this latter option
|
|
|
+in~\Cref{sec:pandemicModel:rsir}.
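+
+To illustrate how $\beta$ and $\alpha$ shape the course of an epidemic under
+the constant-population assumption, the following minimal Python sketch
+integrates the SIR dynamics of~\Cref{eq:sir}; all parameter values are
+illustrative assumptions:
+\begin{verbatim}
+from scipy.integrate import solve_ivp
+
+N, beta, alpha = 1_000_000, 0.35, 0.1  # assumed population and rates
+
+def sir(t, y):
+    S, I, R = y
+    new_infections = beta * S * I / N
+    recoveries = alpha * I
+    return [-new_infections, new_infections - recoveries, recoveries]
+
+# Start with ten infectious individuals; N stays constant by construction.
+sol = solve_ivp(sir, (0, 300), [N - 10, 10, 0], max_step=1.0)
+print(sol.y[1].max())  # approximate height of the infection peak
+\end{verbatim}
+Increasing $\alpha$ relative to $\beta$ in this sketch lowers and delays the
+peak, matching the behavior described above.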
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
@@ -350,11 +352,11 @@ countermeasures that reduce the contact between the infectious and susceptible
|
|
|
individuals, the emergence of a new variant of the disease that increases its
|
|
|
infectivity or deadliness, or the administration of a vaccination that provides
|
|
|
previously susceptible individuals with immunity without ever being infected.
|
|
|
-To address this, based on the time-dependent transition rates introduced by Liu
|
|
|
-and Stechlinski~\cite{Liu2012}, and Setianto and Hidayat~\cite{Setianto2023},
|
|
|
-Millevoi \etal~\cite{Millevoi2023} present a model that simultaneously reduces
|
|
|
-the size of the system of differential equations and solves the problem of time
|
|
|
-scaling at hand.\\
|
|
|
+Since constant transition rates cannot capture such time-dependent effects,
|
|
|
+Liu and Stechlinski~\cite{Liu2012}, and Setianto and Hidayat~\cite{Setianto2023},
|
|
|
+introduce time-dependent transition rates and the time-dependent reproduction
|
|
|
+number to address this issue. Building on this, Millevoi \etal~\cite{Millevoi2023} present a
|
|
|
+reduced version of the SIR model.\\
|
|
|
|
|
|
First, they alter the definition of $\beta$ and $\alpha$ to be dependent on the time interval
|
|
|
$\mathcal{T} = [t_0, t_f]\subseteq \mathbb{R}_{\geq0}$,
|
|
@@ -371,7 +373,7 @@ represents the number of susceptible individuals, that one infectious individual
|
|
|
infects at the onset of the pandemic. In light of the effects of $\beta$ and
|
|
|
$\alpha$ (see~\Cref{sec:pandemicModel:sir}), $\RO < 1$ indicates that the
|
|
|
pandemic is subsiding, as recoveries outpace new infections. In this scenario, the
|
|
|
-limited number of infections resulting from $I(t_0) << S(t_0)$.\\ Further,
|
|
|
+number of new infections remains limited, as $I(t_0) \ll S(t_0)$. Further,
|
|
|
$\RO > 1$ leads to the disease spreading rapidly across the population, with an
|
|
|
increase in $I$ occurring at a high rate. Nevertheless, $\RO$ does not cover
|
|
|
the entire time span. For this reason, Millevoi \etal~\cite{Millevoi2023}
|
|
@@ -381,12 +383,13 @@ defined as,
|
|
|
\begin{equation}\label{eq:repr_num}
|
|
|
\Rt=\frac{\beta(t)}{\alpha(t)}\cdot\frac{S(t)}{N},
|
|
|
\end{equation}
|
|
|
-on the time interval $\mathcal{T}$. This definition includes the transition
|
|
|
-rates for information about the spread of the disease and information of the
|
|
|
-decrease of the ratio of susceptible individuals in the population. In contrast
|
|
|
-to $\beta$ and $\alpha$, $\Rt$ is not a parameter but \todo{Sai comment - earlier?}
|
|
|
-a state variable in the model and enabling the following reduction of the SIR
|
|
|
-model.\\
|
|
|
+on the time interval $\mathcal{T}$ and the population size $N$. This definition
|
|
|
+includes the transition rates, which carry information about the spread of the
|
|
|
+disease, as well as the decreasing ratio of susceptible individuals in the
|
|
|
+population. In contrast to $\beta$ and $\alpha$, $\Rt$ is not a parameter but
|
|
|
+a state variable in the model, which describes the reproduction of the disease
|
|
|
+for each day. As Millevoi \etal~\cite{Millevoi2023} show, $\Rt$ enables the
|
|
|
+following reduction of the SIR model.\\
|
|
|
|
|
|
\Cref{eq:N_char} allows for the calculation of the value of the group $R$ using
|
|
|
$S$ and $I$, with the term $R(t)=N-S(t)-I(t)$. Thus,
|
|
@@ -414,19 +417,20 @@ variable $I$, results in,
|
|
|
which is a further reduced version of~\Cref{eq:sir}. This less complex
|
|
|
differential equation results in a less complex solution, as it entails the
|
|
|
elimination of a parameter ($\beta$) and the two state variables ($S$ and $R$).
|
|
|
-The reduced SIR model, is more precise in applications with a worse data
|
|
|
-situation, due to its fewer input variables.
|
|
|
+The reduced SIR model requires fewer input variables, making it
|
|
|
+advantageous in situations with limited data, such as when recovery data is
|
|
|
+missing.
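+
+A minimal sketch of this reduction: substituting~\Cref{eq:repr_num} into the
+infection dynamics of~\Cref{eq:sir} yields
+$\frac{dI}{dt}=\alpha(t)\left(\Rt-1\right)I(t)$, a single ODE in $I$. The
+Python fragment below integrates it; the recovery rate and the course of
+$\Rt$ are assumed placeholder choices:
+\begin{verbatim}
+import numpy as np
+from scipy.integrate import solve_ivp
+
+alpha = 0.1                            # assumed constant recovery rate
+R_t = lambda t: 2.5 * np.exp(-t / 50)  # assumed decaying reproduction number
+
+# dI/dt = alpha * (R_t - 1) * I: infections grow while R_t > 1
+# and decay once R_t falls below one.
+def reduced_sir(t, y):
+    return alpha * (R_t(t) - 1.0) * y
+
+sol = solve_ivp(reduced_sir, (0, 150), [10.0])
+print(sol.y[0, -1])  # infectious count at the end of the time span
+\end{verbatim}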
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\section{Multilayer Perceptron}
|
|
|
\label{sec:mlp}
|
|
|
-In~\Cref{sec:differentialEq}, we demonstrate the significance of differential
|
|
|
+In~\Cref{sec:differentialEq}, we discuss the modeling of systems using differential
|
|
|
equations, illustrating how they can be utilized to elucidate the
|
|
|
impact of a specific parameter on the system's behavior.
|
|
|
In~\Cref{sec:epidemModel}, we show specific applications of differential
|
|
|
-equations in an epidemiological context. The final objective is to solve these
|
|
|
-equations by finding a function that fits. Fitting measured data points to
|
|
|
+equations in an epidemiological context. Solving such systems is crucial and
|
|
|
+involves finding a function that satisfies the equations. Fitting measured data points to
|
|
|
approximate such a function is one of several methods to achieve this
|
|
|
goal. The \emph{Multilayer Perceptron} (MLP)~\cite{Rumelhart1986} is a
|
|
|
data-driven function approximator. In the following section, we provide a brief
|
|
@@ -499,7 +503,7 @@ calculates the squared difference between each model prediction and true value
|
|
|
of a training sample and takes the mean across the whole training data.\\
|
|
|
|
|
|
Ultimately, the objective is to utilize this information to optimize the parameters in order to minimize the
|
|
|
-loss. One of the most fundamental optimization strategy is \emph{gradient
|
|
|
+loss. One of the most fundamental optimization strategies is \emph{gradient
|
|
|
descent}. In this process, the derivatives are employed to identify the location
|
|
|
of local or global minima within a function, which lie where the gradient is
|
|
|
zero. Given that a positive gradient
|
|
@@ -525,7 +529,7 @@ error backwards through the neural network.\\
|
|
|
|
|
|
In practical applications, an optimizer often accomplishes the optimization task
|
|
|
by executing backpropagation in the background. Furthermore, modifying the
|
|
|
-learning rate during training can be advantageous. For instance, making larger \todo{leave whole paragraph out? - Niklas}
|
|
|
+learning rate during training can be advantageous. For instance, making larger
|
|
|
steps at the beginning and minor adjustments at the end. Therefore, schedulers
|
|
|
are algorithms that employ diverse learning rate alteration
|
|
|
strategies.\\
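+
+The following minimal PyTorch sketch combines an optimizer with a learning
+rate scheduler; the model, learning rates, and schedule are placeholder
+assumptions:
+\begin{verbatim}
+import torch
+
+model = torch.nn.Linear(1, 1)  # placeholder model
+optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
+# Halve the learning rate every 100 steps: large steps early,
+# minor adjustments late in training.
+scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100,
+                                            gamma=0.5)
+
+x = torch.linspace(0, 1, 32).unsqueeze(1)
+y = 3 * x + 1  # toy regression target
+
+for step in range(300):
+    optimizer.zero_grad()
+    loss = torch.mean((model(x) - y) ** 2)  # mean squared error
+    loss.backward()   # backpropagation of the error
+    optimizer.step()  # parameter update
+    scheduler.step()  # learning rate adjustment
+\end{verbatim}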
|
|
@@ -542,40 +546,47 @@ solutions to differential systems.
|
|
|
\label{sec:pinn}
|
|
|
|
|
|
In~\Cref{sec:mlp}, we describe the structure and training of MLPs, which are
|
|
|
-wildely recognized tools for approximating any kind of function. In this
|
|
|
-section, we apply this capability to create a solver for ODE's and PDE's
|
|
|
-as Legaris \etal~\cite{Lagaris1997} describe in their paper. In this approach,
|
|
|
-the model learns to approximate a function using provided data points while
|
|
|
+widely recognized tools for approximating any kind of function. In 1997,
|
|
|
+Lagaris \etal~\cite{Lagaris1997} provide a method that utilizes gradient
|
|
|
+descent to solve ODEs and PDEs. Building on this approach, Raissi
|
|
|
+\etal~\cite{Raissi2019} introduce the methodology under the name
|
|
|
+\emph{Physics-Informed Neural Network} (PINN) in 2017. In this approach, the
|
|
|
+model learns to approximate a function using provided data points while
|
|
|
leveraging the available knowledge about the problem in the form of a system of
|
|
|
-differential equations. The \emph{physics-informed neural network} (PINN)
|
|
|
-learns the system of differential equations during training, as it optimizes
|
|
|
-its output to align with the equations.\\
|
|
|
-
|
|
|
-In contrast to standard MLP's, PINNs are not only data-driven. The loss term of a PINN comprises two
|
|
|
-components. The first term incorporates the equations of the aforementioned prior knowledge to pertinent the problem. As Raissi
|
|
|
-\etal~\cite{Raissi2017} propose, the residual of each differential equation in
|
|
|
-the system must be minimized in order for the model to optimize its output in accordance with the theory.
|
|
|
-We obtain the residual $r_i$, with $i\in\{1, ...,N_d\}$, by rearranging the differential equation and
|
|
|
-calculating the difference between the left-hand side and the right-hand side
|
|
|
-of the equation. $N_d$ is the number of differential equations in a system. As
|
|
|
-Raissi \etal~\cite{Raissi2017} propose the \emph{physics
|
|
|
- loss} of a PINN,
|
|
|
+differential equations.\\
|
|
|
+
|
|
|
+In contrast to standard MLP models, PINNs are not solely data-driven. The differential
|
|
|
+equation,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
|
|
|
+ \boldsymbol{y}=\mathcal{D}(\boldsymbol{x}),
|
|
|
\end{equation}
|
|
|
-takes the input data and the model prediction to calculate the mean square
|
|
|
-error of the residuals. The second term, the \emph{observation loss}
|
|
|
-$\Loss{obs}$, employs the mean square error of the distances between the
|
|
|
-predicted and the true values for each training point. Additionally, the
|
|
|
-observation loss may incorporate extra terms of inital and boundary conditions. Let $N_t$
|
|
|
-denote the number of training points. Then,
|
|
|
-\begin{equation}
|
|
|
- \mathcal{L}_{PINN}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
|
|
|
-\end{equation}\\
|
|
|
-represents the comprehensive loss function of a physics-informed neural network. \\
|
|
|
-
|
|
|
-Given the nature of residuals, calculating the loss term of
|
|
|
-$\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
|
|
|
+includes both the solution $\boldsymbol{y}$ and the operator $\mathcal{D}$,
|
|
|
+which incorporates all derivatives with respect to the input $\boldsymbol{x}$.
|
|
|
+This equation encodes the physical properties and dynamics of
|
|
|
+$\boldsymbol{y}$. In order to find the solution $\boldsymbol{y}$, we must solve the
|
|
|
+differential equation with respect to data related to the problem at hand. As
|
|
|
+Raissi \etal~\cite{Raissi2019} propose, we employ a neural network with the
|
|
|
+parameters $\theta$. The MLP is then supposed to optimize its parameters so that
|
|
|
+its output $\hat{\boldsymbol{y}}$ approximates the solution $\boldsymbol{y}$.
|
|
|
+In order to achieve this, we train the model on data containing input-output
|
|
|
+pairs with measurements of $\boldsymbol{y}$. The output $\hat{\boldsymbol{y}}$ is
|
|
|
+fitted to the data through the mean square error data loss $\mathcal{L}_{\text{data}}$.
|
|
|
+Moreover, the data loss function may include additional terms for initial and boundary
|
|
|
+conditions. The physics are incorporated through an additional loss
|
|
|
+term, the physics loss $\mathcal{L}_{\text{physics}}$, which includes the
|
|
|
+differential equation through its residual $r=\boldsymbol{y} - \mathcal{D}(\boldsymbol{x})$.
|
|
|
+This leads to the PINN loss function,
|
|
|
+\begin{equation}\label{eq:PINN_loss}
+  \begin{split}
+    \mathcal{L}_{\text{PINN}}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) & = \mathcal{L}_{\text{data}}(\boldsymbol{y},\hat{\boldsymbol{y}}) + \mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})\\
+    & = \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2 + \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
+  \end{split}
+\end{equation}
|
|
|
+with $N_d$ the number of differential equations in a system and $N_t$ the
|
|
|
+number of training samples. Utilizing~\Cref{eq:PINN_loss}, the
|
|
|
+PINN simultaneously optimizes its parameters $\theta$ to minimize both the data
|
|
|
+loss and the physics loss. This makes it a multi-objective optimization problem.\\
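+
+A sketch of~\Cref{eq:PINN_loss} in PyTorch for a single assumed ODE,
+$\frac{dy}{dx}+y=0$, so that the residual is
+$r=\frac{d\hat{y}}{dx}+\hat{y}$; the network architecture is a placeholder:
+\begin{verbatim}
+import torch
+
+net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
+                          torch.nn.Linear(32, 1))
+
+def pinn_loss(x_data, y_data, x_phys):
+    # Data loss: mean squared error on measured input-output pairs.
+    loss_data = torch.mean((net(x_data) - y_data) ** 2)
+
+    # Physics loss: mean squared residual of the assumed ODE y' + y = 0,
+    # with dy/dx obtained via automatic differentiation.
+    x = x_phys.requires_grad_(True)
+    y_hat = net(x)
+    dy_dx = torch.autograd.grad(y_hat, x, torch.ones_like(y_hat),
+                                create_graph=True)[0]
+    loss_physics = torch.mean((dy_dx + y_hat) ** 2)
+
+    return loss_data + loss_physics
+\end{verbatim}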
|
|
|
+
|
|
|
+Given the nature of differential equations, calculating the loss term of
|
|
|
+$\mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
|
|
|
calculation of the derivative of the output with respect to the input of
|
|
|
the neural network. As we outline in~\Cref{sec:mlp}, during the process of
|
|
|
back-propagation we calculate the gradients of the loss term with respect to a
|
|
@@ -587,26 +598,28 @@ compute the respective gradients. The term,
|
|
|
\begin{equation}
|
|
|
\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} = \frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\boldsymbol{x}}f^{(1)},
|
|
|
\end{equation}
|
|
|
-illustrates that, in contrast to the procedure described in~\cref{eq:backprop},
|
|
|
-this procedure the \emph{automatic differenciation} goes one step further and
|
|
|
+illustrates that, in contrast to the procedure described in~\Cref{eq:backprop},
|
|
|
+\emph{automatic differentiation} goes one step further and
|
|
|
calculates the gradient of the output with respect to the input
|
|
|
$\boldsymbol{x}$. In order to calculate the second derivative
|
|
|
$\frac{d^2\hat{\boldsymbol{y}}}{d\boldsymbol{x}^2}=\nabla_{\boldsymbol{x}} (\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}}),$
|
|
|
this procedure must be repeated.\\
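+
+In PyTorch, for example, this repetition corresponds to two nested calls of
+the automatic differentiation routine; the sine function below merely stands
+in for a network output:
+\begin{verbatim}
+import torch
+
+x = torch.linspace(0, 1, 10).reshape(-1, 1).requires_grad_(True)
+y = torch.sin(3 * x)  # stands in for a model output y_hat(x)
+
+# First derivative dy/dx; create_graph=True keeps the computational
+# graph so that the result can be differentiated again.
+dy = torch.autograd.grad(y, x, torch.ones_like(y), create_graph=True)[0]
+
+# Repeating the procedure yields the second derivative d2y/dx2.
+d2y = torch.autograd.grad(dy, x, torch.ones_like(dy), create_graph=True)[0]
+\end{verbatim}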
|
|
|
|
|
|
-Above we present a method for approximating functions through the use of
|
|
|
-systems of differential equations. As previously stated, we want to find a
|
|
|
-solver for systems of differential equations. In problems, where we must solve
|
|
|
+Above we present a method by Raissi \etal~\cite{Raissi2019} for approximating
|
|
|
+functions through the use of systems of differential equations. As previously
|
|
|
+stated, we want to find a
|
|
|
+solution for systems of differential equations. In problems where we must solve
|
|
|
an ODE or PDE, we have to find a set of parameters that satisfies the system
|
|
|
-for any input $\boldsymbol{x}$. In terms of the context of PINN's this is the
|
|
|
-inverse problem, where we have a set of training data from measurements, for
|
|
|
-example, is available along with the respective differential equations but
|
|
|
-information about the parameters of the equations is lacking. To address this
|
|
|
-challenge, we set these parameters as distinct learnable parameters within the
|
|
|
-neural network. This enables the network to utilize a specific value, that
|
|
|
-actively influences the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$.
|
|
|
-During the training phase the optimizer aims to minimize the physics loss,
|
|
|
-which should ultimately yield an approximation of the true value.\\
|
|
|
+for any input $\boldsymbol{x}$. In the context of PINNs, this is an inverse
|
|
|
+problem. We have training data from measurements and the corresponding
|
|
|
+differential equations, but the parameters of these equations are unknown. To
|
|
|
+address this challenge, we implement these parameters as distinct learnable
|
|
|
+parameters within the neural network. This enables the network to utilize a
|
|
|
+specific value that actively influences the physics loss
|
|
|
+$\mathcal{L}_{\text{physics}}(\boldsymbol{x},\hat{\boldsymbol{y}})$. During the
|
|
|
+training phase, the optimizer aims to minimize the physics loss, which should
|
|
|
+ultimately yield an approximation of the true parameter value fitting the
|
|
|
+observations.\\
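+
+A sketch of this construction in PyTorch: an assumed unknown equation
+parameter (here named \texttt{mu}) is registered as a learnable parameter
+next to the network weights $\theta$, so the optimizer updates both jointly:
+\begin{verbatim}
+import torch
+
+class InversePINN(torch.nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.net = torch.nn.Sequential(torch.nn.Linear(1, 32),
+                                       torch.nn.Tanh(),
+                                       torch.nn.Linear(32, 1))
+        # Unknown equation parameter, learnable alongside the weights;
+        # its value enters the physics loss through the residual.
+        self.mu = torch.nn.Parameter(torch.tensor(1.0))
+
+    def forward(self, x):
+        return self.net(x)
+\end{verbatim}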
|
|
|
|
|
|
\begin{figure}[h]
|
|
|
\centering
|
|
@@ -615,11 +628,11 @@ which should ultimately yield an approximation of the true value.\\
|
|
|
underdamped case. With $m=1kg$, $\mu=4\frac{Ns}{m}$ and $k=200\frac{N}{m}$.}
|
|
|
\label{fig:spring}
|
|
|
\end{figure}
|
|
|
-One illustrative example of a potential application for PINN's is the
|
|
|
-\emph{damped harmonic oscillator}~\cite{Demtroeder2021}. In this problem, we
|
|
|
+In order to illustrate how a PINN works, we use the example of a
|
|
|
+\emph{damped harmonic oscillator} taken from~\cite{Moseley}. In this problem, we
|
|
|
displace a body, which is attached to a spring, from its resting position. The
|
|
|
body is subject to three forces: firstly, the inertia exerted by the
|
|
|
-displacement $u$, which points in the direction the displacement $u$; secondly
|
|
|
+displacement $u$, which points in the direction of the displacement; secondly,
|
|
|
the restoring force of the spring, which attempts to return the body to its
|
|
|
original position; and thirdly, the friction force, which points in the opposite
|
|
|
direction of the movement. In accordance with Newton's second law and the
|
|
@@ -631,7 +644,7 @@ stiffness of the spring. The residual of the differential equation,
|
|
|
    m\frac{d^2u}{dt^2}+\mu\frac{du}{dt}+ku=0,
|
|
|
\end{equation}
|
|
|
shows the relation of these parameters in reference to the problem at hand. As
|
|
|
-Tenenbaum and Morris provide, there are three potential solutions to this
|
|
|
+Tenenbaum and Pollard~\cite{Tenenbaum1985} provide, there are three potential solutions to this
|
|
|
issue. However, only the \emph{underdamped case} results in an oscillating
|
|
|
movement of the body, as illustrated in~\Cref{fig:spring}. In order to apply a
|
|
|
PINN to this problem, we require a set of training data $x$. This consists of
|
|
@@ -641,22 +654,29 @@ we know the mass $m=1kg$, and the spring constant $k=200\frac{N}{m}$ and the
|
|
|
initial displacement $u^{(1)} = 1$ and $\frac{du(0)}{dt} = 0$. However, we do
|
|
|
not know the value of the friction $\mu$. In this case, the loss function,
|
|
|
\begin{equation}
|
|
|
- \mathcal{L}_{osc}(\boldsymbol{x}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = (u^{(1)}-1)+\frac{du(0)}{dt}+||m\frac{d^2u}{dx^2}+\mu\frac{du}{dx}+ku||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{u}}^{(i)}-\boldsymbol{u}^{(i)}||^2,
|
|
|
+ \begin{split}
|
|
|
+    \mathcal{L}_{\text{osc}}(\boldsymbol{x}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = & (\hat{u}^{(1)}-1)^2+\left(\frac{d\hat{u}(0)}{dt}\right)^2+ \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{u}}^{(i)}-\boldsymbol{u}^{(i)}||^2 \\
|
|
|
+    + & ||m\frac{d^2\hat{u}}{dt^2}+\mu\frac{d\hat{u}}{dt}+k\hat{u}||^2,
|
|
|
+ \end{split}
|
|
|
\end{equation}
|
|
|
includes the border conditions, the residual, in which $\mu$ is a learnable
|
|
|
-parameter and the observation loss.
|
|
|
+parameter, and the data loss. This demonstrates how PINNs are capable
|
|
|
+of learning the parameters of physical systems, such as the damped harmonic oscillator.
|
|
|
+In the following section, we present the approach of Shaier \etal~\cite{Shaier2021}
|
|
|
+to find the transmission rate and recovery rate of the SIR model using PINNs.
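+
+A sketch of the corresponding physics residual in PyTorch, with $m$ and $k$
+fixed to the known values and the friction $\mu$ learnable (its initial value
+is an assumption):
+\begin{verbatim}
+import torch
+
+m, k = 1.0, 200.0                           # known mass and stiffness
+mu = torch.nn.Parameter(torch.tensor(1.0))  # unknown friction, learnable
+net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
+                          torch.nn.Linear(32, 1))
+
+def oscillator_residual(t):
+    # Residual m*u'' + mu*u' + k*u, evaluated on the prediction u_hat(t).
+    t = t.requires_grad_(True)
+    u = net(t)
+    du = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
+    d2u = torch.autograd.grad(du, t, torch.ones_like(du),
+                              create_graph=True)[0]
+    return m * d2u + mu * du + k * u
+
+# During training, pass [mu] together with net.parameters() to the
+# optimizer so that the friction estimate is updated as well.
+\end{verbatim}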
|
|
|
|
|
|
% -------------------------------------------------------------------
|
|
|
|
|
|
\subsection{Disease-Informed Neural Networks}
|
|
|
\label{sec:pinn:dinn}
|
|
|
-In this section, we describe the capability of MLP's to solve systems of
|
|
|
-differential equations. In~\Cref{sec:pandemicModel:sir}, we describe the SIR
|
|
|
-model, which models the relations of susceptible, infectious and removed
|
|
|
-individuals and simulates the progress of a disease in a population with a
|
|
|
-constant size. A system of differential equations models these relations. Shaier
|
|
|
-\etal~\cite{Shaier2021} propose a method to solve the equations of the SIR model
|
|
|
-using a PINN, which they call a \emph{disease-informed neural network} (DINN).\\
|
|
|
+In the preceding section, we present a methodology, based on the work of Lagaris
|
|
|
+\etal~\cite{Lagaris1997} and Raissi \etal~\cite{Raissi2019}, for solving systems of differential equations by employing
|
|
|
+PINNs. In~\Cref{sec:pandemicModel:sir}, we describe the SIR model, which models
|
|
|
+the relations of susceptible, infectious and removed individuals and simulates
|
|
|
+the progress of a disease in a population with a constant size. A system of
|
|
|
+differential equations models these relations. Shaier \etal~\cite{Shaier2021}
|
|
|
+propose a method to solve the equations of the SIR model using a PINN, which
|
|
|
+they call a \emph{Disease-Informed Neural Network} (DINN).\\
|
|
|
|
|
|
To solve~\Cref{eq:sir} we need to find the transmission rate $\beta$ and the
|
|
|
recovery rate $\alpha$. As Shaier \etal~\cite{Shaier2021} point out, there are
|
|
@@ -672,7 +692,7 @@ could be defined using the amount of days a person between the point of
|
|
|
infection and the start of isolation $d$, $\alpha = \frac{1}{d}$. The analytical
|
|
|
solutions to the SIR model often use heuristic methods and require knowledge
|
|
|
such as the initial sizes $S_0$ and $I_0$. A data-driven approach such as the one that
|
|
|
-Shaier \etal~\cite{Shaier2021} propose does not have these problems. Since the
|
|
|
+Shaier \etal~\cite{Shaier2021} propose does not suffer from these problems, since the
|
|
|
model learns the parameters $\beta$ and $\alpha$ while fitting the training
|
|
|
data consisting of the time points $\boldsymbol{t}$ and the corresponding
|
|
|
measured sizes of the groups $\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}$.
|
|
@@ -684,11 +704,10 @@ and $r_R=\frac{d \hat{\boldsymbol{R}}}{dt} - \alpha \hat{\boldsymbol{I}}$ the
|
|
|
residuals of each differential equation using the model predictions. Then,
|
|
|
\begin{equation}
|
|
|
\begin{split}
|
|
|
- \mathcal{L}_{SIR}() = ||r_S||^2 + ||r_I||^2 + ||r_R||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}||^2 &+\\
|
|
|
- ||\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}||^2 &+\\
|
|
|
- ||\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}||^2 &,
|
|
|
+    \mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\,||r_S||^2 + ||r_I||^2 + ||r_R||^2\\
|
|
|
+    + &\frac{1}{N_t}\sum_{i=1}^{N_t} \left(||\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}||^2 + ||\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}||^2 + ||\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}||^2\right),
|
|
|
\end{split}
|
|
|
\end{equation}
|
|
|
-is the loss function of a DINN, with $\alpha$ and $beta$ being learnable
|
|
|
-parameters.
|
|
|
+is the loss function of a DINN, with $\alpha$ and $\beta$ being learnable
|
|
|
+parameters that enter the loss through the residuals of the ODEs.
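+
+A condensed PyTorch sketch of this loss (the architecture and initial values
+are placeholder assumptions; the contact term uses the normalization
+of~\Cref{eq:sir}):
+\begin{verbatim}
+import torch
+
+class DINN(torch.nn.Module):
+    def __init__(self, N):
+        super().__init__()
+        self.net = torch.nn.Sequential(torch.nn.Linear(1, 64),
+                                       torch.nn.Tanh(),
+                                       torch.nn.Linear(64, 3))  # S, I, R
+        self.beta = torch.nn.Parameter(torch.tensor(0.5))   # learnable
+        self.alpha = torch.nn.Parameter(torch.tensor(0.5))  # learnable
+        self.N = N
+
+    def loss(self, t, S, I, R):
+        t = t.requires_grad_(True)
+        S_hat, I_hat, R_hat = self.net(t).unbind(dim=1)
+        d_dt = lambda y: torch.autograd.grad(
+            y, t, torch.ones_like(y), create_graph=True)[0].squeeze(1)
+        # Residuals of the three SIR equations.
+        r_S = d_dt(S_hat) + self.beta * S_hat * I_hat / self.N
+        r_I = d_dt(I_hat) - self.beta * S_hat * I_hat / self.N \
+              + self.alpha * I_hat
+        r_R = d_dt(R_hat) - self.alpha * I_hat
+        physics = (r_S ** 2 + r_I ** 2 + r_R ** 2).mean()
+        data = ((S_hat - S) ** 2 + (I_hat - I) ** 2
+                + (R_hat - R) ** 2).mean()
+        return physics + data
+\end{verbatim}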
|
|
|
% -------------------------------------------------------------------
|