FlipediFlop 11 months ago
parent
commit
80040a445b
5 changed files with 122 additions and 19 deletions
  1. 106 19
      chapters/chap02/chap02.tex
  2. BIN
      images/oscilator.pdf
  3. 6 0
      thesis.bbl
  4. 10 0
      thesis.bib
  5. BIN
      thesis.pdf

+ 106 - 19
chapters/chap02/chap02.tex

@@ -474,7 +474,7 @@ metric for evaluating the extent to which the model deviates from the correct
 answer. One of the most common loss function is the \emph{mean square error}
 (MSE) loss function. Let $N$ be the number of points in the set of training
 data. Then,
-\begin{equation}
+\begin{equation} \label{eq:mse}
   \Loss{MSE} = \frac{1}{N}\sum_{i=1}^{N} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
 \end{equation}
 calculates the squared difference between each model prediction and true value
@@ -497,7 +497,7 @@ $\hat{\boldsymbol{y}} = f(w; \theta)$ be the model prediction with
 $w = f^{(2)}(z; \theta_2)$ and $z = f^{(1)}(\boldsymbol{x}; \theta_1)$.
 $\boldsymbol{x}$ is the input vector and $\theta_1, \theta_2\subset\theta$.
 Then,
-\begin{equation}
+\begin{equation}\label{eq:backprop}
   \nabla_{\theta_1} \Loss{ } = \frac{d\mathcal{L}}{d\hat{\boldsymbol{y}}}\frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\theta_1}f^{(1)},
 \end{equation}
 is the gradient of $\Loss{ }$ in respect of the parameters $\theta_1$. The name
@@ -522,23 +522,110 @@ systems.
 \label{sec:pinn}
 
 In~\Cref{sec:mlp}, we describe the structure and training of MLP's, which are
-recognized tools for approximating any kind of function. This section, we
-show that this capability can be applied to create a solver for ODE's and PDE's
-as Legaris \etal~\cite{Lagaris1997} describe in their paper. In this method, the
-model learns to approximate a function using the given data points and employs
-knowledge that is available about the problem such as a system of differential
-system. The physics-informed neural network (PINN) learns system of differential
-equations during training, as it tries to optimize its output to fit the
-equations.\\
-
-In contrast to standard MLP's PINN's have a modified Loss term. Ultimately, the
-loss includes the above-mentioned prior knowledge to the problem. While still
-containing the loss term, that seeks to minimize the distance between the model
-predictions and the solutions, which is the observation loss $\Loss{obs} =
-  \Loss{MSE}$, a PINN adds a term that includes the residuals of the differential
-equations, which is the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},
-  \hat{\boldsymbol{y}})$ of the PINN and tries to optimize the prediction to fit
-the differential equations.
+widely recognized tools for approximating arbitrary functions. In this
+section, we apply this capability to create a solver for ODEs and PDEs,
+as Lagaris \etal~\cite{Lagaris1997} describe in their paper. In this approach,
+the model learns to approximate a function using provided data points while
+leveraging the available knowledge about the problem in the form of a system of
+differential equations. The \emph{physics-informed neural network} (PINN)
+learns the system of differential equations during training, as it optimizes
+its output to align with the equations.\\
+
+In contrast to standard MLPs, the loss term of a PINN comprises two
+components. The first term incorporates the aforementioned prior knowledge
+pertinent to the problem. As Raissi \etal~\cite{Raissi2017} propose, the
+residual of each differential equation in the system must be minimized in
+order for the model to optimize its output in accordance with the theory. We
+obtain the residual $r_i$, with $i\in\{1, ..., N_d\}$, by rearranging the
+differential equation and calculating the difference between its left-hand
+side and its right-hand side, where $N_d$ is the number of differential
+equations in the system. As Raissi \etal~\cite{Raissi2017} propose, the
+\emph{physics loss} of a PINN,\todo{check source again}
+\begin{equation}
+  \mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
+\end{equation}
+takes the input data and the model prediction to calculate the mean square
+error of the residuals. The second term, the \emph{observation loss}
+$\Loss{obs}$, employs the mean square error of the distances between the
+predicted and the true values for each training point. Additionally, the
+observation loss may incorporate extra terms for initial and boundary
+conditions. Let $N_t$ denote the number of training points. Then,
+\begin{equation}
+  \mathcal{L}_{PINN}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
+\end{equation}
+represents the complete loss function of a physics-informed neural network.\\
+
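+As a brief illustration, the following sketch shows how such a combined loss
+could be implemented in PyTorch. The network \texttt{model} and the residual
+function \texttt{residual} are assumptions for this example, not part of the
+method described above.
+\begin{verbatim}
+import torch
+
+def pinn_loss(model, residual, x_phys, x_obs, y_obs):
+    # physics loss: mean squared residual of the differential
+    # equations, evaluated at the collocation points x_phys
+    y_phys = model(x_phys)
+    r = residual(x_phys, y_phys)  # assumed to use autograd internally
+    loss_physics = torch.mean(r ** 2)
+
+    # observation loss: mean squared error on the training points
+    y_pred = model(x_obs)
+    loss_obs = torch.mean((y_pred - y_obs) ** 2)
+
+    return loss_physics + loss_obs
+\end{verbatim}
+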
+\todo{check for correctness}
+Given the nature of the residuals, calculating the loss term
+$\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
+derivative of the network output with respect to its input. As we outline
+in~\Cref{sec:mlp}, during back-propagation we calculate the gradients of the
+loss term with respect to a layer-specific set of parameters denoted by
+$\theta_l$, where $l$ represents the index of the
+\todo{check for consistency} respective layer. By employing the chain rule of
+calculus, the algorithm progresses from the output layer through each hidden
+layer, ultimately reaching the first layer, in order to compute the
+respective gradients. The term,
+\begin{equation}
+  \nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} = \frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\boldsymbol{x}}f^{(1)},
+\end{equation}
+illustrates that, in contrast to the procedure described in~\cref{eq:backprop},
+\emph{automatic differentiation} goes one step further and calculates the
+gradient of the output with respect to the input $\boldsymbol{x}$. In order
+to calculate the second derivative
+$\frac{d^2\hat{\boldsymbol{y}}}{d\boldsymbol{x}^2}=\nabla_{\boldsymbol{x}} (\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} ),$
+this procedure must be repeated.\\
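+
+As a minimal sketch of this procedure in PyTorch, assuming a network
+\texttt{model} with one-dimensional input and output (defined elsewhere), the
+first and second derivatives can be obtained as follows.
+\begin{verbatim}
+import torch
+
+x = torch.linspace(0.0, 1.0, 100, requires_grad=True).reshape(-1, 1)
+y = model(x)  # model: an MLP mapping x to y, assumed defined elsewhere
+
+# gradient of the output with respect to the input
+dy_dx, = torch.autograd.grad(y, x, torch.ones_like(y),
+                             create_graph=True)
+
+# second derivative: repeat the procedure on dy_dx
+d2y_dx2, = torch.autograd.grad(dy_dx, x, torch.ones_like(dy_dx),
+                               create_graph=True)
+\end{verbatim}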
+
+Above, we present a method for approximating functions with the help of
+systems of differential equations. As previously stated, we want to find a
+solver for systems of differential equations. In problems where we must solve
+an ODE or PDE, we have to find a set of parameters that satisfies the system
+for any input $\boldsymbol{x}$. In the context of PINNs, this is the
+\emph{inverse problem}: a set of training data, for example from
+measurements, is available along with the respective differential equations,
+but information about the parameters of the equations is lacking. To address
+this challenge, we register these parameters as distinct learnable parameters
+within the neural network. This enables the network to utilize a specific
+value that actively influences the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$.
+During the training phase, the optimizer aims to minimize the physics loss,
+which should ultimately yield an approximation of the true value.\\
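+
+One possible way to register such a learnable equation parameter is sketched
+below in PyTorch; the architecture and the initial value of the parameter are
+arbitrary assumptions.
+\begin{verbatim}
+import torch
+import torch.nn as nn
+
+class InversePINN(nn.Module):
+    def __init__(self):
+        super().__init__()
+        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
+                                 nn.Linear(32, 32), nn.Tanh(),
+                                 nn.Linear(32, 1))
+        # unknown equation parameter, registered as learnable, so the
+        # optimizer updates it alongside the network weights
+        self.mu = nn.Parameter(torch.tensor(1.0))
+
+    def forward(self, t):
+        return self.net(t)
+\end{verbatim}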
+
+\begin{figure}[h]
+  \centering
+  \includegraphics[scale=0.87]{oscilator.pdf}
+  \caption{Illustration of the movement of an oscillating body in the
+    underdamped case, with $m=1\,kg$, $\mu=4\,\frac{Ns}{m}$, and $k=200\,\frac{N}{m}$.}
+  \label{fig:spring}
+\end{figure}
+One illustrative example of a potential application for PINNs is the
+\emph{damped harmonic oscillator}~\cite{Tenenbaum1985}. In this problem, we \todo{check source for wording}
+displace a body, which is attached to a spring, from its resting position.
+The body is subject to three forces: firstly, the inertial force, which is
+proportional to the acceleration of the displacement $u$; secondly, the
+restoring force of the spring, which attempts to return the body to its
+original position; and thirdly, the friction force, which points in the
+direction opposite to the movement. In accordance with Newton's second law
+and the combined influence of these forces, the body exhibits oscillatory
+motion around its position of rest. The system is characterized by the mass
+of the body $m$, the coefficient of friction $\mu$, and the spring constant
+$k$, which indicates the stiffness of the spring. The differential equation, \todo{check in book}
+\begin{equation}
+  m\frac{d^2u}{dt^2}+\mu\frac{du}{dt}+ku=0,
+\end{equation}
+shows the relation between these parameters; its left-hand side serves as the
+residual in the physics loss. As Tenenbaum and Pollard show, there are three
+potential solutions to this problem. However, only the \emph{underdamped
+  case} results in an oscillating movement of the body, as illustrated
+in~\Cref{fig:spring}. In order to apply a PINN to this problem, we require a
+set of training data. This consists of pairs of time points and corresponding
+displacement measurements $(t^{(i)}, u^{(i)})$, where $i\in\{1, ..., N_t\}$.
+In this hypothetical case, we know the mass $m=1\,kg$, the spring constant
+$k=200\,\frac{N}{m}$, and the initial conditions $u(0) = 1$ and
+$\frac{du}{dt}(0) = 0$. However, we do not know the value of the friction
+coefficient $\mu$. In this case, the loss function,
+\begin{equation}
+  \mathcal{L}_{osc}(\boldsymbol{t}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = (\hat{u}(0)-1)^2+\left(\frac{d\hat{u}}{dt}(0)\right)^2+||m\frac{d^2\hat{u}}{dt^2}+\mu\frac{d\hat{u}}{dt}+k\hat{u}||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{u}^{(i)}-u^{(i)}||^2,
+\end{equation}
+includes the initial conditions, the residual, in which $\mu$ is a learnable
+parameter, and the observation loss.
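+
+A sketch of this loss in PyTorch, building on the fragments above (in
+particular the learnable parameter \texttt{model.mu}); the values of $m$ and
+$k$ follow the example, everything else is an illustrative assumption.
+\begin{verbatim}
+import torch
+
+def oscillator_loss(model, t_obs, u_obs, t_phys, m=1.0, k=200.0):
+    # residual of m*u'' + mu*u' + k*u = 0 with learnable model.mu
+    t_phys = t_phys.requires_grad_(True)
+    u = model(t_phys)
+    du, = torch.autograd.grad(u, t_phys, torch.ones_like(u),
+                              create_graph=True)
+    d2u, = torch.autograd.grad(du, t_phys, torch.ones_like(du),
+                               create_graph=True)
+    loss_physics = torch.mean((m * d2u + model.mu * du + k * u) ** 2)
+
+    # initial conditions u(0) = 1 and du/dt(0) = 0
+    t0 = torch.zeros(1, 1, requires_grad=True)
+    u0 = model(t0)
+    du0, = torch.autograd.grad(u0, t0, torch.ones_like(u0),
+                               create_graph=True)
+    loss_ic = ((u0 - 1.0) ** 2 + du0 ** 2).squeeze()
+
+    # observation loss on the measured displacements
+    loss_obs = torch.mean((model(t_obs) - u_obs) ** 2)
+    return loss_physics + loss_ic + loss_obs
+\end{verbatim}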
 
 % -------------------------------------------------------------------
 

BIN
images/oscilator.pdf


+ 6 - 0
thesis.bbl

@@ -77,6 +77,12 @@
 \newblock DOI 10.1037/h0042519. --
 \newblock ISSN 0033--295X
 
+\bibitem[RPK17]{Raissi2017}
+\textsc{Raissi}, Maziar ; \textsc{Perdikaris}, Paris  ; \textsc{Karniadakis},
+  George~E.:
+\newblock \emph{Physics Informed Deep Learning (Part I): Data-driven Solutions
+  of Nonlinear Partial Differential Equations}
+
 \bibitem[Rud07]{Rudin2007}
 \textsc{Rudin}, Walter:
 \newblock \emph{Analysis}.

+ 10 - 0
thesis.bib

@@ -158,4 +158,14 @@
   subtitle  = {An introduction to computational geometry},
 }
 
+@Misc{Raissi2017,
+  author    = {Raissi, Maziar and Perdikaris, Paris and Karniadakis, George Em},
+  title     = {Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations},
+  year      = {2017},
+  copyright = {arXiv.org perpetual, non-exclusive license},
+  doi       = {10.48550/ARXIV.1711.10561},
+  keywords  = {Artificial Intelligence (cs.AI), Machine Learning (cs.LG), Numerical Analysis (math.NA), Dynamical Systems (math.DS), Machine Learning (stat.ML), FOS: Computer and information sciences, FOS: Computer and information sciences, FOS: Mathematics, FOS: Mathematics},
+  publisher = {arXiv},
+}
+
 @Comment{jabref-meta: databaseType:bibtex;}

BIN
thesis.pdf