@@ -462,7 +462,7 @@ calculation enables MLP's to approximate any function. As Hornik
three groups $S$, $I$ and $R$.}
\label{fig:mlp_example}
\end{figure}
-
+\todo{caption}
The term \emph{training} describes the process of optimizing the parameters
$\theta$. In order to undertake training, it is necessary to have a set of
\emph{training data}, which is a set of pairs (also called training points) of
@@ -491,9 +491,17 @@ signifies ascent and a negative gradient indicates descent, we must move the
variable by a constant \emph{learning rate} (step size) in the opposite
direction to that of the gradient. The calculation of the derivatives with respect
to the parameters is a complex task, since our function is a composition of
-many functions (one for each layer). The algorithm of \emph{back propagation} \todo{Insert source}
-takes the advantage of~\Cref{eq:mlp_char} and addresses this issue by employing
-the chain rule of calculus.\\
+many functions (one for each layer). We can address this issue by taking
+advantage of~\Cref{eq:mlp_char} and employing the chain rule of calculus. Let
+$\hat{\boldsymbol{y}} = f(w; \theta)$ be the model prediction with
+$w = f^{(2)}(z; \theta_2)$ and $z = f^{(1)}(\boldsymbol{x}; \theta_1)$, where
+$\boldsymbol{x}$ is the input vector and $\theta_1, \theta_2 \subset \theta$ are
+the parameters of the respective layers. Then,
+\begin{equation}
+ \nabla_{\theta_1} \Loss{ } = \frac{d\mathcal{L}}{d\hat{\boldsymbol{y}}}\frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\theta_1}f^{(1)},
+\end{equation}
+is the gradient of $\Loss{ }$ with respect to the parameters $\theta_1$. The name
+of this method in the context of neural networks is \emph{back propagation}. \todo{Insert source}\\
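+Denoting the learning rate (step size) by $\eta > 0$, one gradient descent step
+therefore updates the parameters as
+\begin{equation}
+	\theta \leftarrow \theta - \eta \nabla_{\theta} \Loss{ }.
+\end{equation}
+As an illustrative sketch, one such step for a small MLP could look as follows
+in the PyTorch library, where back propagation is carried out by automatic
+differentiation (the model, the data and the learning rate are arbitrary
+placeholders):
+\begin{verbatim}
+import torch
+
+# small two-layer MLP and synthetic training data (illustration only)
+model = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(),
+                            torch.nn.Linear(16, 1))
+x = torch.randn(32, 3)            # training inputs
+y = torch.randn(32, 1)            # training targets
+eta = 1e-2                        # learning rate (step size)
+
+loss = torch.mean((model(x) - y) ** 2)   # MSE loss
+loss.backward()                   # back propagation via the chain rule
+with torch.no_grad():
+    for p in model.parameters():
+        p -= eta * p.grad         # step against the gradient
+        p.grad = None
+\end{verbatim}
+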
In practical applications, an optimizer often accomplishes the optimization task
by executing gradient descent in the background. Furthermore, modifying the
@@ -512,9 +520,25 @@ systems.
\section{Physics Informed Neural Networks}
\label{sec:pinn}
-In~\Cref{sec:mlp} we described the structure and training of MLP's, which are
-recognized tools for approximating any kind of function. In this section we want
-to make use of this ability and us neural networks as approximators for ODE's.
+
+In~\Cref{sec:mlp}, we describe the structure and training of MLPs, which are
+recognized tools for approximating any kind of function. In this section, we
+show that this capability can be used to build a solver for ODEs and PDEs, as
+Lagaris \etal~\cite{Lagaris1997} describe in their paper. In this method, the
+model learns to approximate a function from the given data points and
+additionally employs knowledge that is available about the problem, such as a
+system of differential equations. The physics-informed neural network (PINN)
+learns the system of differential equations during training, as it tries to
+optimize its output to fit the equations.\\
+
+In contrast to standard MLPs, PINNs have a modified loss term that incorporates
+the above-mentioned prior knowledge about the problem. The loss still contains
+the term that seeks to minimize the distance between the model predictions and
+the known solution values, namely the observation loss $\Loss{obs} =
+	\Loss{MSE}$. In addition, a PINN includes a term built from the residuals of
+the differential equations, namely the physics loss
+$\mathcal{L}_{physics}(\boldsymbol{x}, \hat{\boldsymbol{y}})$, which drives the
+prediction to fit the differential equations.
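+The two loss terms can be combined, for instance, as a weighted sum
+\begin{equation}
+	\Loss{ } = \Loss{obs} + \lambda \, \mathcal{L}_{physics}(\boldsymbol{x}, \hat{\boldsymbol{y}}),
+\end{equation}
+where $\lambda > 0$ is a weighting factor balancing the two objectives. As an
+illustrative sketch, such a loss could be computed as follows in PyTorch, using
+an arbitrarily chosen example ODE $u'(t) = -u(t)$ (the model and the weighting
+factor \texttt{lam} are placeholders):
+\begin{verbatim}
+import torch
+
+def pinn_loss(model, t_obs, u_obs, t_phys, lam=1.0):
+    # observation loss: MSE between predictions and known solution values
+    loss_obs = torch.mean((model(t_obs) - u_obs) ** 2)
+
+    # physics loss: mean squared residual of the ODE u'(t) = -u(t)
+    t = t_phys.clone().requires_grad_(True)
+    u = model(t)
+    du_dt = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u),
+                                create_graph=True)[0]
+    residual = du_dt + u          # residual of u'(t) + u(t) = 0
+    loss_phys = torch.mean(residual ** 2)
+
+    return loss_obs + lam * loss_phys
+\end{verbatim}
+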
% -------------------------------------------------------------------