  1. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  2. % Author: Phillip Rothenbeck
  3. % Title: Investigating the Evolution of the COVID-19 Pandemic in Germany Using Physics-Informed Neural Networks
  4. % File: chap03/chap03.tex
  5. % Part: Methods
  6. % Description:
  7. % summary of the content in this chapter
  8. % Version: 20.08.2024
  9. % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  10. \chapter{Methods 8}
  11. \label{chap:methods}
This chapter presents the methods that we employ to address the problem
introduced in~\Cref{chap:introduction}. \Cref{sec:preprocessing} outlines
our approach to preprocessing the available data and consists of two
subsections. The first subsection describes the publicly available data provided by
the \emph{Robert Koch Institute} (RKI)\footnote[1]{\url{https://www.rki.de/EN/Home/homepage_node.html}}.
The second subsection outlines the techniques we use to process this data to fit
our project's requirements. Subsequently, we give a theoretical overview of the
PINNs that we employ. These latter sections establish the foundation for the
implementations described in~\Cref{sec:sir:setup} and~\Cref{sec:rsir:setup}.
  21. % -------------------------------------------------------------------
  22. \section{Epidemiological Data 3}
  23. \label{sec:preprocessing}
For the PINNs to work with the data available to us, the data must be in the
format required by the epidemiological models that the PINNs solve. Let $N_t$ be
the number of training points, and let $i\in\{1, \dots, N_t\}$ index the
training points. The data
required by the PINN for solving the SIR model (see~\Cref{sec:pinn:dinn})
consists of pairs $(\boldsymbol{t}^{(i)}, (\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)}))$.
Since the system of differential equations representing the reduced SIR
model (see~\Cref{sec:pandemicModel:rsir}) consists of a single differential
equation for $I$, it requires only pairs of the form
$(\boldsymbol{t}^{(i)}, \boldsymbol{I}^{(i)})$. This section focuses on the
structure of the available data and the methods we employ to transform it into
the required format.
  36. % -------------------------------------------------------------------
  37. \subsection{RKI Data 2}
  38. \label{sec:preprocessing:rki}
The Robert Koch Institute is responsible for the monitoring and prevention of
diseases. As the central institution of the German government in the field of
biomedicine, one of its tasks during the COVID-19 pandemic was to track the
number of infections and deaths in Germany. The data was collected by
university hospitals, research facilities, and laboratories through
testing. Each new case must be reported to the respective state authority
within 24 hours. Each state authority collects
the cases for a day and must report them to the RKI by the following working
day. The RKI then refines the data, releases statistics, and updates its
public repositories. For the purposes
of this thesis, we concentrate on two of these repositories.\\
  50. The first repository is called \emph{COVID-19-Todesfälle in Deutschland}\footnote{\url{https://github.com/robert-koch-institut/COVID-19-Todesfaelle_in_Deutschland.git}}.
  51. The dataset comprises discrete data points, each with a date indicating the
  52. point in time at which the respective data was collected. The dates span from
  53. March 9, 2020, to the present day. For each date, the dataset provides the total
  54. number of infection and death cases, the number of new deaths, and the
  55. case-fatality ratio. The total number of infection and death cases represents
  56. the sum of all cases reported up to that date, including the newly reported
data. The repository also includes two additional datasets that contain the death case
information, organized by age group or by the individual states within Germany, on
a weekly basis.\\
  60. \begin{figure}[h]
  61. \centering
  62. \includegraphics[width=\textwidth]{dataset_visualization.pdf}
\caption{A visualization of the total death case and infection case data for
each day from the dataset \emph{COVID-19-Todesfälle in Deutschland}. Status
as of 20 August 2024.}
  66. \label{fig:rki_data}
  67. \end{figure}
  68. The second repository is entitled \emph{SARS-CoV-2 Infektionen in Deutschland}.
  69. This dataset contains comprehensive data regarding the infections of each county
  70. on a daily basis. The counties are encoded using the \emph{Community Identification Number}\footnote{\url{https://www.destatis.de/DE/Themen/Laender-Regionen/Regionales/Gemeindeverzeichnis/_inhalt.html}},
  71. wherein the first two digits denote the state, the third digit represents the
government district, and the last two digits indicate the county. Each data
point lists the gender, the age group, the number of death, infection, and recovery
cases, and the reference and report dates. The reference date marks the onset of
illness in the individual. In the absence of this information, the reference
date is equivalent to the report date.\\
The RKI assumes that the illness lasts 14 days under normal conditions
and 28 days in severe cases. The recovery cases
in the dataset are calculated using these assumptions, by adding the duration to
the reference date if it is given. As stated in the repository's README, the recovery data
should be used with caution. Since we require the recovery data for further
calculations, the following section presents the solutions we employ to address
this issue.
  84. % -------------------------------------------------------------------
  85. \subsection{Data Preprocessing 1}
  86. \label{sec:preprocessing:rq}
At the beginning of this section, we established the data format that is
necessary for training the PINNs. In this subsection, we present the method that we
employ to preprocess and transform the RKI data (see~\Cref{sec:preprocessing:rki})
into the training data.\\
To obtain the SIR data, we require the size of each SIR compartment for
each time point. The infection case data for the German states is available on
a daily basis. To obtain the daily cases for the entire country, we take the
difference between the cumulative case counts of consecutive days. We set the
population size $N$ to the respective population at the beginning of 2020. Using the starting conditions
of~\Cref{eq:startCond}, we iterate through each day and update the sizes of the
compartments consecutively. In each iteration, we subtract the new infection
cases from $\boldsymbol{S}^{(i-1)}$ to obtain $\boldsymbol{S}^{(i)}$; for
$\boldsymbol{I}^{(i)}$, we add the new cases and subtract deaths and recoveries;
and we obtain $\boldsymbol{R}^{(i)}$ by adding the new deaths and
recoveries as they occur.\\
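The iteration described above can be written as a set of update rules. The following is a minimal sketch; the symbols $C^{(i)}$ for the cumulative number of reported cases on day $i$, and $c^{(i)}$, $d^{(i)}$, and $r^{(i)}$ for the new infections, deaths, and recoveries on day $i$, are introduced here for illustration only.
\begin{equation}
\begin{split}
c^{(i)} &= C^{(i)} - C^{(i-1)},\\
\boldsymbol{S}^{(i)} &= \boldsymbol{S}^{(i-1)} - c^{(i)},\\
\boldsymbol{I}^{(i)} &= \boldsymbol{I}^{(i-1)} + c^{(i)} - d^{(i)} - r^{(i)},\\
\boldsymbol{R}^{(i)} &= \boldsymbol{R}^{(i-1)} + d^{(i)} + r^{(i)}.
\end{split}
\end{equation}
Summing the three compartment updates shows that $\boldsymbol{S}^{(i)} + \boldsymbol{I}^{(i)} + \boldsymbol{R}^{(i)}$ keeps the same value on every day, which can serve as a simple consistency check for the generated data.\\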
As previously stated in~\Cref{sec:preprocessing:rki}, the data on recoveries may
be unreliable or entirely absent. To address this, we propose a method
for computing the number of recovered individuals per day. Under the assumption
that recovery takes $D$ days, we present the recovery queue, a data structure
that takes in the number of infections for a given day, retains them for $D$ days,
and then releases them into the removed group.\\
  108. \begin{figure}[h]
  109. \centering
  110. \includegraphics[width=\textwidth]{recovery_queue.pdf}
\caption{The recovery queue takes in the infected individuals for the $k$-th
day and releases them $D$ days later into the removed group.}
  113. \label{fig:recovery_queue}
  114. \end{figure}
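The behaviour of the recovery queue can be sketched as a simple release rule. For illustration only, we assume that $c^{(i)}$ denotes the number of new infections entering the queue on day $i$ and $r^{(i)}$ the number of individuals released on day $i$; both symbols are assumptions of this sketch.
\begin{equation}
r^{(i)} =
\begin{cases}
c^{(i-D)}, & \text{if } i > D,\\
0, & \text{otherwise.}
\end{cases}
\end{equation}
For example, with $D = 3$ and new infections $(c^{(1)}, \dots, c^{(5)}) = (5, 2, 0, 4, 1)$, the queue releases $(r^{(1)}, \dots, r^{(5)}) = (0, 0, 0, 5, 2)$.\\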
To obtain the data for the reduced SIR model, we employ an algorithm similar to the one
used for the SIR model. However, instead of the recovery queue, we use
the fixed recovery rate $\alpha$: on the $i$-th day, a portion $\alpha\boldsymbol{I}^{(i)}$
of the infected individuals recovers and moves into the
$\boldsymbol{R}^{(i)}$ compartment, which is irrelevant for our purposes. \\
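One possible reading of this step, written as an update rule, is sketched below. The symbol $c^{(i)}$ for the new infections on day $i$ and the choice to apply $\alpha$ to the value of the previous day are assumptions of this sketch.
\begin{equation}
\boldsymbol{I}^{(i)} = \boldsymbol{I}^{(i-1)} + c^{(i)} - \alpha\boldsymbol{I}^{(i-1)} = (1-\alpha)\,\boldsymbol{I}^{(i-1)} + c^{(i)}.
\end{equation}
For instance, with $\alpha = 0.1$, $\boldsymbol{I}^{(i-1)} = 1000$, and $c^{(i)} = 50$, the update yields $\boldsymbol{I}^{(i)} = 950$.\\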
  120. The transformed data for both the SIR model and the reduced SIR model are then
  121. employed by the PINN models, which we describe in the subsequent section.
  122. % -------------------------------------------------------------------
  123. \section{Estimating Epidemiological Parameters using PINNs 3}
  124. \label{sec:pinn:sir}
In the preceding section, we presented the methods we employ to preprocess and
format the data from the RKI in accordance with the requirements of
this thesis. In this section, we present the method we employ
to identify the time-independent SIR parameters $\beta$ and $\alpha$ for the
data. As a foundation, we draw upon the work of Shaier
\etal~\cite{Shaier2021} to solve the SIR system of ODEs using PINNs.\\
To analyze a pandemic, we need a quantifiable measure
that indicates whether the disease in question can spread rapidly through a
population or fails to infect a significant number of
individuals. We employ the SIR model as an abstraction of the complex
relations inherent to real-world pandemics. The SIR model divides the population into three
compartments and is accompanied by a system of ODEs that describes the
flows between these compartments (see~\Cref{eq:sir}).
The transmission rate $\beta$ and the recovery rate $\alpha$ serve as the
aforementioned measures. The data obtained from the preprocessing stage
provides insight into the progression of the COVID-19 pandemic in Germany.
The objective is to identify a function that solves the system of differential
equations of the SIR model by returning the size of each compartment at a
specific point in time. This function should reconstruct the
training data and is determined by the values of the transition rates $\beta$ and
$\alpha$. From both a mathematical and a semantic perspective, it is therefore essential to
determine the values of these parameters.\\
  147. In order to ascertain the transmission rate $\beta$ and the recovery rate $\alpha$
  148. from the preprocessed RKI data of $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
  149. for a given set of time points, it is necessary to employ a data-driven approach that outputs
  150. a model prediction of $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
  151. for a set of time points, with the aim of minimizing the term,
  152. \begin{equation}\label{eq:SIR_obs_term}
  153. \Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2,
  154. \end{equation}
for each data point in the training dataset of cardinality $N_t$, with
$i\in\{1, \dots, N_t\}$. Moreover, the aforementioned parameters must satisfy the system
of differential equations that govern the SIR model. For this reason, Shaier
\etal~\cite{Shaier2021} utilize a PINN framework to satisfy both requirements.
Their approach, which they refer to as the \emph{disease-informed neural network}
(see~\Cref{sec:pinn:dinn}), takes epidemiological data as input and returns
the two transition rates $\alpha$ and $\beta$. It
achieves this by finding an approximate solution to the inverse problem of
physics-informed neural networks (see~\Cref{sec:pinn}). For
the SIR model, a PINN addresses the inverse problem in two ways. First, it minimizes~\Cref{eq:SIR_obs_term}
by bringing the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$
closer to the measured values $(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R})$
for each time point. Second, it reduces the residuals of the ODEs that
constitute the SIR model. While the forward problem concludes at this point, the
inverse problem presupposes that the parameters are unknown. Thus, we designate the parameters
$\beta$ and $\alpha$ as free, learnable parameters, $\widehat{\beta}$ and
$\widehat{\alpha}$. These separate trainable parameters are
optimized during the training process and must fit the equations of the system of
ODEs. Furthermore, we know that the transition rates
do not exceed the value of $1$. Consequently, we constrain both rates to the
open interval $(-1, 1)$. To this end, we regularize the parameters using the
hyperbolic tangent. This results in the terms,
  177. \begin{equation}
  178. \widehat{\beta} = \tanh(\tilde{\beta}),\quad \widehat{\alpha} = \tanh(\tilde{\alpha}),
  179. \end{equation}
  180. where $\tilde{\beta}$ and $\tilde{\alpha}$ are the predicted values of the model
  181. and $\widehat{\beta}$ and $\widehat{\alpha}$ are regularized model predictions.\\
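A short worked example may illustrate the effect of this regularization; the numerical value of $\tilde{\beta}$ is chosen arbitrarily for illustration. Since $\tanh$ maps every real number into the open interval $(-1, 1)$, an unconstrained value such as $\tilde{\beta} = 0.5$ yields
\begin{equation}
\widehat{\beta} = \tanh(0.5) \approx 0.462.
\end{equation}
For any loss $\mathcal{L}$ that depends on $\widehat{\beta}$, the gradient with respect to the unconstrained value follows from the chain rule,
\begin{equation}
\frac{\partial \mathcal{L}}{\partial \tilde{\beta}} = \frac{\partial \mathcal{L}}{\partial \widehat{\beta}}\,\big(1 - \widehat{\beta}^{2}\big),
\end{equation}
so the updates shrink as $\widehat{\beta}$ approaches $\pm 1$. The same holds for $\widehat{\alpha}$.\\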
The input data must include the time point $\boldsymbol{t}^{(i)}$ and the
corresponding measured values $(\boldsymbol{S}^{(i)}, \boldsymbol{I}^{(i)}, \boldsymbol{R}^{(i)})$.
In its forward pass, the PINN receives the time point $\boldsymbol{t}^{(i)}$ as input, from which it
calculates its model prediction $(\hat{\boldsymbol{S}}^{(i)}, \hat{\boldsymbol{I}}^{(i)}, \hat{\boldsymbol{R}}^{(i)})$
based on its model parameters $\theta$. Subsequently, the model computes the loss function. It calculates the observation loss by
averaging~\Cref{eq:SIR_obs_term} over all $N_t$ training samples.
Therefore, the observation loss is
\begin{equation}
\mathcal{L}_{\text{obs}}(\boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = \frac{1}{N_t}\sum_{i=1}^{N_t} \bigg(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\bigg).
\end{equation}
Because they perform better in practice
than the ODEs of~\Cref{eq:sir}, we use the ODEs of~\Cref{eq:modSIR}
in our physics loss. For the model to learn the system of differential equations,
we need the residual of each ODE. The mean squared error of the residuals constitutes
the physics loss $\mathcal{L}_{\text{physics}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$.
The residuals are calculated using the model predictions $(\hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}})$ and the regularized model predictions of the parameters $\widehat{\beta}$ and $\widehat{\alpha}$. The residuals are given by,
\begin{equation}
0=\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}, \quad 0=\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}, \quad 0=\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}.
\end{equation}
Thus, the total loss for our approach is
\begin{equation}
\begin{split}
\mathcal{L}_{\text{SIR}}(\boldsymbol{t}, \boldsymbol{S}, \boldsymbol{I}, \boldsymbol{R}, \hat{\boldsymbol{S}}, \hat{\boldsymbol{I}}, \hat{\boldsymbol{R}}) = &\bigg\|\frac{d\hat{\boldsymbol{S}}}{d\boldsymbol{t}}+ \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{I}}}{d\boldsymbol{t}} - \widehat{\beta}\frac{\hat{\boldsymbol{S}}\hat{\boldsymbol{I}}}{N} + \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\ + &\bigg\|\frac{d\hat{\boldsymbol{R}}}{d\boldsymbol{t}} - \widehat{\alpha}\hat{\boldsymbol{I}}\bigg\|^2\\
+ &\frac{1}{N_t}\sum_{i=1}^{N_t} \bigg(\Big\|\hat{\boldsymbol{S}}^{(i)}-\boldsymbol{S}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{I}}^{(i)}-\boldsymbol{I}^{(i)}\Big\|^2 + \Big\|\hat{\boldsymbol{R}}^{(i)}-\boldsymbol{R}^{(i)}\Big\|^2\bigg).
\end{split}
\end{equation}
This loss value is then
back-propagated through our network, and the estimates
$\widehat{\beta}$ and $\widehat{\alpha}$ are optimized using the same loss.\\
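The joint optimization can be sketched as one gradient step per training iteration. Plain gradient descent and the learning rate $\eta > 0$ are assumptions of this sketch; any gradient-based optimizer may take their place.
\begin{equation}
\theta \leftarrow \theta - \eta\,\nabla_{\theta}\mathcal{L}_{\text{SIR}}, \quad
\tilde{\beta} \leftarrow \tilde{\beta} - \eta\,\frac{\partial \mathcal{L}_{\text{SIR}}}{\partial \tilde{\beta}}, \quad
\tilde{\alpha} \leftarrow \tilde{\alpha} - \eta\,\frac{\partial \mathcal{L}_{\text{SIR}}}{\partial \tilde{\alpha}}.
\end{equation}
In this sketch, the network parameters $\theta$ and the two unconstrained rate values $\tilde{\beta}$ and $\tilde{\alpha}$ are updated from the same loss value in every step.\\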
While this section concentrates on finding the time-constant parameters
$\beta$ and $\alpha$, the next section presents our approach to estimating the
reproduction number $\Rt$ from the German data of the RKI.
  214. % -------------------------------------------------------------------
  215. \section{PINN for the reduced SIR Model 2}
  216. \label{sec:pinn:rsir}
  217. % -------------------------------------------------------------------