@@ -474,7 +474,7 @@ metric for evaluating the extent to which the model deviates from the correct
answer. One of the most common loss functions is the \emph{mean square error}
(MSE) loss function. Let $N$ be the number of points in the set of training
data. Then,
-\begin{equation}
+\begin{equation} \label{eq:mse}
\Loss{MSE} = \frac{1}{N}\sum_{i=1}^{N} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
\end{equation}
calculates the squared difference between each model prediction and true value
@@ -497,7 +497,7 @@ $\hat{\boldsymbol{y}} = f(w; \theta)$ be the model prediction with
$w = f^{(2)}(z; \theta_2)$ and $z = f^{(1)}(\boldsymbol{x}; \theta_1)$.
$\boldsymbol{x}$ is the input vector and $\theta_1, \theta_2\subset\theta$.
Then,
-\begin{equation}
+\begin{equation}\label{eq:backprop}
\nabla_{\theta_1} \Loss{ } = \frac{d\mathcal{L}}{d\hat{\boldsymbol{y}}}\frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\theta_1}f^{(1)},
\end{equation}
is the gradient of $\Loss{ }$ with respect to the parameters $\theta_1$. The name
@@ -522,23 +522,110 @@ systems.
\label{sec:pinn}

In~\Cref{sec:mlp}, we describe the structure and training of MLPs, which are
-recognized tools for approximating any kind of function. This section, we
-show that this capability can be applied to create a solver for ODE's and PDE's
-as Legaris \etal~\cite{Lagaris1997} describe in their paper. In this method, the
-model learns to approximate a function using the given data points and employs
-knowledge that is available about the problem such as a system of differential
-system. The physics-informed neural network (PINN) learns system of differential
-equations during training, as it tries to optimize its output to fit the
-equations.\\
-
-In contrast to standard MLP's PINN's have a modified Loss term. Ultimately, the
-loss includes the above-mentioned prior knowledge to the problem. While still
-containing the loss term, that seeks to minimize the distance between the model
-predictions and the solutions, which is the observation loss $\Loss{obs} =
- \Loss{MSE}$, a PINN adds a term that includes the residuals of the differential
-equations, which is the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},
- \hat{\boldsymbol{y}})$ of the PINN and tries to optimize the prediction to fit
-the differential equations.
+widely recognized tools for approximating a wide range of functions. In this
+section, we apply this capability to create a solver for ODEs and PDEs,
+as Lagaris \etal~\cite{Lagaris1997} describe in their paper. In this approach,
+the model learns to approximate a function using provided data points while
+leveraging the available knowledge about the problem in the form of a system of
+differential equations. The \emph{physics-informed neural network} (PINN)
+learns the system of differential equations during training, as it optimizes
+its output to align with the equations.\\
+
+In contrast to standard MLPs, the loss term of a PINN comprises two
+components. The first term incorporates the aforementioned prior knowledge
+pertinent to the problem. As Raissi
+\etal~\cite{Raissi2017} propose, the residual of each differential equation in
+the system must be minimized in order for the model to optimize its output in
+accordance with the theory.
+We obtain the residual $r_i$, with $i\in\{1, ..., N_d\}$, by rearranging the
+differential equation and
+calculating the difference between the left-hand side and the right-hand side
+of the equation. $N_d$ is the number of differential equations in the system.
+Following Raissi \etal~\cite{Raissi2017}, the \emph{physics
+  loss} of a PINN,\todo{check source again}
+\begin{equation}
+  \mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2,
+\end{equation}
+takes the input data and the model prediction to calculate the mean square
+error of the residuals. The second term, the \emph{observation loss}
+$\Loss{obs}$, employs the mean square error of the distances between the
+predicted and the true values for each training point. Additionally, the
+observation loss may incorporate extra terms for initial and boundary
+conditions. Let $N_t$ denote the number of training points. Then,
+\begin{equation}
+  \mathcal{L}_{PINN}(\boldsymbol{x}, \boldsymbol{y},\hat{\boldsymbol{y}}) = \frac{1}{N_d}\sum_{i=1}^{N_d} ||r_i(\boldsymbol{x},\hat{\boldsymbol{y}})||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{y}}^{(i)}-\boldsymbol{y}^{(i)}||^2,
+\end{equation}
+represents the comprehensive loss function of a physics-informed neural network.\\
+
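+To illustrate how such a residual is obtained, consider as a purely
+illustrative example (not taken from~\cite{Raissi2017}) the single equation
+$\frac{du}{dt} = -\lambda u$ with a constant $\lambda$. Rearranging all terms
+to one side yields the residual
+\begin{equation}
+  r(t, \hat{u}) = \frac{d\hat{u}}{dt} + \lambda \hat{u},
+\end{equation}
+which vanishes exactly when the prediction $\hat{u}$ satisfies the
+differential equation.\\
+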
+\todo{check for correctness}
+Given the nature of residuals, calculating the loss term
+$\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$ requires the
+calculation of the derivative of the output with respect to the input of
+the neural network. As we outline in~\Cref{sec:mlp}, during the process of
+back-propagation we calculate the gradients of the loss term with respect to a
+layer-specific set of parameters denoted by $\theta_l$, where $l$ represents
+the index of the \todo{check for consistency} respective layer. By employing
+the chain rule of calculus, the algorithm progresses from the output layer
+through each hidden layer, ultimately reaching the first layer in order to
+compute the respective gradients. The term,
+\begin{equation}
+  \nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} = \frac{d\hat{\boldsymbol{y}}}{df^{(2)}}\frac{df^{(2)}}{df^{(1)}}\nabla_{\boldsymbol{x}}f^{(1)},
+\end{equation}
+illustrates that, in contrast to the procedure described in~\cref{eq:backprop},
+\emph{automatic differentiation} goes one step further and calculates the
+gradient of the output with respect to the input $\boldsymbol{x}$. In order to
+calculate the second derivative
+$\frac{d^2\hat{\boldsymbol{y}}}{d\boldsymbol{x}^2}=\nabla_{\boldsymbol{x}} (\nabla_{\boldsymbol{x}} \hat{\boldsymbol{y}} ),$
+this procedure must be repeated.\\
+
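+The following minimal sketch illustrates this procedure in code. It is not
+part of the cited works and assumes a hypothetical PyTorch implementation with
+a scalar-output network; the architecture is chosen purely for illustration.
+\begin{verbatim}
+import torch
+
+# Illustrative network mapping a one-dimensional input to a scalar output.
+model = torch.nn.Sequential(
+    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
+
+x = torch.linspace(0.0, 1.0, 100).reshape(-1, 1)
+x.requires_grad_(True)          # track operations on the input
+u = model(x)                    # model prediction u_hat(x)
+
+# First derivative du/dx; create_graph=True keeps the computational graph
+# so that the result can be differentiated again.
+du_dx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u),
+                            create_graph=True)[0]
+
+# Second derivative d^2u/dx^2: the same procedure repeated on du/dx.
+d2u_dx2 = torch.autograd.grad(du_dx, x, grad_outputs=torch.ones_like(du_dx),
+                              create_graph=True)[0]
+\end{verbatim}
+Both derivatives can then be inserted into the residuals of the physics loss.\\
+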
+Above, we present a method for approximating functions through the use of
+systems of differential equations. As previously stated, we want to find a
+solver for systems of differential equations. In problems where we must solve
+an ODE or PDE, we have to find a set of parameters that satisfies the system
+for any input $\boldsymbol{x}$. In the context of PINNs, this is the
+inverse problem, where a set of training data, for example from measurements,
+is available along with the respective differential equations, but
+information about the parameters of the equations is lacking. To address this
+challenge, we set these parameters as distinct learnable parameters within the
+neural network. This enables the network to utilize a specific value that
+actively influences the physics loss $\mathcal{L}_{physics}(\boldsymbol{x},\hat{\boldsymbol{y}})$.
+During the training phase, the optimizer aims to minimize the physics loss,
+which should ultimately yield an approximation of the true parameter value.\\
+
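+A minimal sketch of this idea, again assuming a hypothetical PyTorch
+implementation rather than the method of the cited works, registers the
+unknown coefficient as an additional learnable parameter next to the network
+weights:
+\begin{verbatim}
+import torch
+
+# Hypothetical unknown coefficient of the differential equation; as an
+# nn.Parameter it is updated by the optimizer alongside the network weights.
+mu = torch.nn.Parameter(torch.tensor(1.0))
+
+model = torch.nn.Sequential(
+    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
+optimizer = torch.optim.Adam(list(model.parameters()) + [mu], lr=1e-3)
+\end{verbatim}
+Because \texttt{mu} enters only the physics loss, minimizing that loss is what
+drives it towards the true value of the coefficient.\\
+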
+\begin{figure}[h]
+  \centering
+  \includegraphics[scale=0.87]{oscilator.pdf}
+  \caption{Illustration of the movement of an oscillating body in the
+    underdamped case, with $m=1kg$, $\mu=4\frac{Ns}{m}$ and $k=200\frac{N}{m}$.}
+  \label{fig:spring}
+\end{figure}
+One illustrative example of a potential application for PINNs is the
+\emph{damped harmonic oscillator}~\cite{Tenenbaum1985}. In this problem, we \todo{check source for wording}
+displace a body, which is attached to a spring, from its resting position. The
+body is subject to three forces: firstly, the inertia exerted by the
+displacement $u$, which points in the direction of the displacement $u$; secondly,
+the restoring force of the spring, which attempts to return the body to its
+original position; and thirdly, the friction force, which points in the opposite
+direction of the movement. In accordance with Newton's second law and the
+combined influence of these forces, the body exhibits oscillatory motion around
+its position of rest. The system is influenced by $m$, the mass of the body,
+$\mu$, the coefficient of friction, and $k$, the spring constant, indicating the
+stiffness of the spring. The differential equation, \todo{check in book}
+\begin{equation}
+  m\frac{d^2u}{dt^2}+\mu\frac{du}{dt}+ku=0,
+\end{equation}
+shows the relation of these parameters in the problem at hand. As
+Tenenbaum and Pollard show, there are three potential solutions to this
+equation. However, only the \emph{underdamped case} results in an oscillating
+movement of the body, as illustrated in~\Cref{fig:spring}. In order to apply a
+PINN to this problem, we require a set of training data. This consists of
+pairs of timepoints and corresponding displacement measurements
+$(t^{(i)}, u^{(i)})$, where $i\in\{1, ..., N_t\}$. In this hypothetical case,
+we know the mass $m=1kg$, the spring constant $k=200\frac{N}{m}$, and the
+initial conditions $u(0) = 1$ and $\frac{du(0)}{dt} = 0$. However, we do
+not know the value of the friction coefficient $\mu$. In this case, the loss function,
+\begin{equation}
+  \mathcal{L}_{osc}(\boldsymbol{t}, \boldsymbol{u}, \hat{\boldsymbol{u}}) = (\hat{u}(0)-1)^2+\left(\frac{d\hat{u}(0)}{dt}\right)^2+||m\frac{d^2\hat{u}}{dt^2}+\mu\frac{d\hat{u}}{dt}+k\hat{u}||^2 + \frac{1}{N_t}\sum_{i=1}^{N_t} ||\hat{\boldsymbol{u}}^{(i)}-\boldsymbol{u}^{(i)}||^2,
+\end{equation}
+includes the initial conditions, the residual, in which $\mu$ is a learnable
+parameter, and the observation loss.

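+The sketch below outlines how this loss could be assembled in code. It is a
+hypothetical PyTorch implementation, not taken from the cited works, and it
+reuses the automatic differentiation and learnable parameter ideas from the
+sketches above; \texttt{t\_obs} and \texttt{u\_obs} denote the measured
+training pairs and \texttt{t\_col} the points at which the residual is
+evaluated.
+\begin{verbatim}
+import torch
+
+model = torch.nn.Sequential(
+    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
+mu = torch.nn.Parameter(torch.tensor(1.0))  # unknown friction coefficient
+m, k = 1.0, 200.0                           # known mass and spring constant
+
+def oscillator_loss(t_obs, u_obs, t_col):
+    # Observation loss: MSE between predictions and measured displacements.
+    loss_obs = torch.mean((model(t_obs) - u_obs) ** 2)
+
+    # Initial conditions u(0) = 1 and du/dt(0) = 0.
+    t0 = torch.zeros(1, 1, requires_grad=True)
+    u0 = model(t0)
+    du0 = torch.autograd.grad(u0, t0, create_graph=True)[0]
+    loss_ic = (u0 - 1.0) ** 2 + du0 ** 2
+
+    # Physics loss: residual of m*u'' + mu*u' + k*u = 0 at t_col.
+    t = t_col.clone().requires_grad_(True)
+    u = model(t)
+    du = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u),
+                             create_graph=True)[0]
+    d2u = torch.autograd.grad(du, t, grad_outputs=torch.ones_like(du),
+                              create_graph=True)[0]
+    residual = m * d2u + mu * du + k * u
+    loss_phys = torch.mean(residual ** 2)
+
+    return loss_obs + loss_ic.squeeze() + loss_phys
+\end{verbatim}
+Minimizing this loss with the optimizer then adjusts the network weights and
+the learnable parameter \texttt{mu} simultaneously.\\
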
% -------------------------------------------------------------------
|