Solving ill-posed Helmholtz problems
with physics-informed neural networks\(^\ast \)

Mihai Nechita\(^{\ast \S \bullet }\)

Received: February 21, 2023; accepted: June 29, 2023; published online: July 5, 2023.

We consider the unique continuation (data assimilation) problem for the Helmholtz equation and study its numerical approximation based on physics-informed neural networks (PINNs). Exploiting the conditional stability of the problem, we first give a bound on the generalization error of PINNs. We then present numerical experiments in 2d for different frequencies and for geometric configurations with different stability bounds for the continuation problem. The results show that vanilla PINNs provide good approximations even for noisy data in configurations with robust stability (both low and moderate frequencies), but may struggle otherwise. This indicates that more sophisticated techniques are needed to obtain PINNs that are frequency-robust for inverse problems subject to the Helmholtz equation.

MSC. 65N12, 65N20, 68T07.

Keywords. ill-posed problems; inverse problems; unique continuation; data assimilation; Helmholtz equation; physics-informed neural networks.

\(^\ast \)This work was supported by the project "The Development of Advanced and Applicative Research Competencies in the Logic of STEAM + Health” /POCU/993/6/13/153310, project co-financed by the European Social Fund through The Romanian Operational Programme Human Capital 2014-2020.

\(^\S \)Tiberiu Popoviciu Institute of Numerical Analysis, Romanian Academy, Cluj-Napoca, Romania, e-mail: mihai.nechita@ictp.acad.ro, orcid.org/0000-0002-6448-912X

\(^\bullet \)Department of Mathematics, Babeș-Bolyai University, Cluj-Napoca, Romania

1 Introduction

Physics-informed neural networks [ 16 ] have recently emerged as an effective way of approximating ill-posed/inverse problems for partial differential equations (PDEs), which are challenging to solve numerically due to their inherent instability [ 11 ] . One of their main advantages is the versatility with which they can solve both forward and inverse problems. However, as reported in [ 9 ] for well-posed boundary value problems, PINNs are not able to outperform finite element methods for low-dimensional forward problems (2d and 3d).

In this paper we explore PINNs for an important class of inverse problems, namely the unique continuation (data assimilation) problem in which measurements are available in a subset of the domain and no boundary conditions are given. The goal is to find a solution that extends the data subject to the PDE considered. Numerical methods for such problems typically consider regularization at the continuous level (Tikhonov or quasi-reversibility). More recently, other kinds of methods have been proposed that make use of regularization at the discrete level in the framework of stabilized finite element methods (FEM). In this latter approach, conditional stability estimates can be used to prove error bounds and convergence in terms of the degree of ill-posedness and the approximation order, see e.g. [ 4 , 5 , 15 ] . Conditional stability estimates have also been employed in [ 14 ] to obtain bounds for the generalization error of PINNs solving unique continuation problems for Poisson, Stokes, heat and wave equations.

We are interested in the ill-posed unique continuation problem for the Helmholtz equation, whose stability properties (in terms of the frequency) depend on the geometry of the data set relative to the geometry of the target set where the solution is reconstructed [ 6 ] . We consider PINNs for approximating this problem and first give a bound in section 3 on the generalization error that takes into account the frequency dependence. We then explore numerically in section 4 how different frequencies and geometric configurations impact the effectiveness of PINNs. We see that PINNs give good approximations when the frequency is low and the target set is inside the convex hull of the data set (robust stability in terms of frequency, as discussed in section 2.1). This includes results where data perturbations are present. When the target set is outside the convex hull of the data set (stability highly sensitive to frequency) or when the frequency is large, we observe poor results with vanilla PINNs. This indicates that more sophisticated techniques are needed to obtain PINNs that are robust in frequency for inverse problems subject to the Helmholtz equation.

2 Unique continuation for the Helmholtz equation

Let \(\Omega \subset \mathbb R^2\) be a bounded domain (open and connected), and let \(\omega \subset \Omega \) be an open and connected subset. We consider the ill-posed unique continuation problem for the Helmholtz equation: find \(u\in H^1(\Omega )\) such that

\begin{equation} \left\{ \begin{aligned} \label{eq:helmholtz_uc} -\Delta u - k^2u & = f & \text{in } \Omega ,\\ u & = g & \text{in } \omega , \end{aligned} \right. \end{equation}
1

with wave number \(k{\gt}0\), source term \(f\in L^2(\Omega )\) and datum \(g\in H^1(\omega )\). In other words, partial measurements of the solution of the Helmholtz equation are given in a subset \(\omega \subset \Omega \) of the domain, while no boundary conditions are prescribed. If problem (1) has a solution, then its uniqueness is guaranteed by the unique continuation principle for elliptic operators. We will assume that the function \(g\) is the restriction to \(\omega \) of a solution to the Helmholtz equation with source term \(f\).

Problem (1) is ill-posed in the sense of Hadamard: there is no uniform stability with respect to the data \(f\) and \(g\), see e.g. [ 1 ] or [ 11 ] . A small perturbation in the data can lead to a large change in the solution. Nonetheless, assuming an additional a priori bound, the solution can be bounded by the data in the following way.

2.1 Conditional stability

We denote by \(B\subset \Omega \) a target set containing \(\omega \) such that \(\overline{B\setminus \omega } \subset \Omega \), i.e. \(B\setminus \omega \) does not touch the boundary of \(\Omega \). A standard result for quantitative unique continuation of elliptic operators, see e.g. [ 1 ] , states that there exist constants \(C_{st}(k){\gt}0\) and \(\alpha \in (0,1)\) such that the following Hölder stability estimate holds

\begin{equation} \label{eq:stability} \left\| u \right\| _{H^1(B)} \le C_{st}(k) \Big(\left\| f \right\| _{L^2(\Omega )} + \left\| g \right\| _{H^1(\omega )}\Big)^\alpha \left\| u \right\| _{H^1(\Omega )}^{1-\alpha }, \end{equation}
2

for any \(u\in H^1(\Omega )\) satisfying (1), where the stability constant \(C_{st}(k)\) depends on the frequency \(k\). The exponent \(\alpha \in (0,1)\) encodes the degree of ill-posedness for the continuation problem: as \(\alpha \) decreases the Hölder stability deteriorates, while \(\alpha =1\) would give a well-posed problem with Lipschitz stability. Both \(C_{st}\) and \(\alpha \) depend on the geometric configuration in a nontrivial way, and we give details below regarding \(C_{st}\).

\includegraphics[scale=0.4]{geom_nonconv.png}

Figure 1 \(C_{st}\) sensitive to \(k\).

\includegraphics[scale=0.4]{geom_conv.png}

Figure 2 \(C_{st}\) robust in \(k\).

Frequency dependence for the stability constant. Data set \(\omega \) (dark grey) and target set \(B\) (light grey). \(\Omega \) is the whole square.

An important aspect for the stability of this ill-posed Helmholtz problem is the dependence of the stability constant \(C_{st}(k)\) on the frequency \(k\). For example, when there is a straight line that intersects \(B\) but not \(\bar{\omega }\), as in figure 1, it was proven in [ 6 , Example 4 ] that for any \(N\in \mathbb N\), \(C_{st}(k) \le k^N\) cannot hold uniformly in \(k\); this means that the stability constant in (2) grows superpolynomially in the frequency. Also, for three-ball inequalities (where \(\omega , B, \Omega \) are concentric balls) it was recently shown in [ 2 ] that in the maximum norm \(C_{st}(k)\) grows exponentially in \(k\) and this dependence is optimal.

Bounds that have a different behavior with respect to the frequency can be obtained under a convexity condition on the target domain \(B\) relative to the data set \(\omega \), essentially that \(B\) is included in the convex hull of \(\omega \), as for example in figure 2. Such a condition was first considered in [ 10 ] , where it was shown that the stability of the solution in the \(L^2\)-norm can actually improve in a certain sense as the frequency increases. In this vein, similar results that make use of a convexity condition were proven in [ 6 ] for a particular geometric setting prototypical for continuation inside the convex hull of \(\omega \). It was shown in [ 6 , Corollary 2 ] that there exist constants \(C{\gt}0\) and \(\alpha \in (0,1)\) such that

\begin{equation} \label{eq:stability_convex} \left\| u \right\| _{H^1(B)} \le C \big( \left\| f \right\| _{L^{2}(\Omega )} + \left\| g \right\| _{H^1(\omega )} \big)^\alpha \left\| u \right\| _{H^1(\Omega )}^{1-\alpha }, \end{equation}
3

for any \(u\in H^1(\Omega )\) satisfying (1). Moreover, the norms in which the data are measured can be weakened [ 6 , Corollary 3 and Lemma 2 ] : there exist constants \(C{\gt}0\) and \(\alpha \in (0,1]\) such that

\begin{align} \label{eq:stability_convex_shifted} \left\| \nabla u \right\| _{L^2(B)} + k\left\| u \right\| _{L^2(B)} & \le C k \Big( \left\| f \right\| _{H^{-1}(\Omega )} + \left\| g \right\| _{L^2(\omega )} \Big)^\alpha \Big( \left\| f \right\| _{H^{-1}(\Omega )} + \left\| u \right\| _{L^2(\Omega )} \Big)^{1-\alpha } \\ & \le C k \Big( \left\| f \right\| _{L^2(\Omega )} + \left\| g \right\| _{L^2(\omega )} \Big)^\alpha \Big( \left\| f \right\| _{L^2(\Omega )} + \left\| u \right\| _{L^2(\Omega )} \Big)^{1-\alpha }, \nonumber \end{align}
4

for any \(u\in H^1(\Omega )\) satisfying (1). Note that the bound is robust in the \(L^2\)-norm, while the frequency dependence is linear for the \(H^1\)-seminorm. Apart from the good dependence on the frequency, this kind of estimate is particularly suitable for numerical analysis since it can be directly applied to the error equation.

Remark 1

If the target set is the whole domain \(\Omega \), then a global stability estimate holds with the modulus of continuity being logarithmic \(|\log (\cdot )|^{-\alpha }\) instead of Hölder-type \(|\cdot |^\alpha \). □

3 Physics-informed neural networks (PINNs)

Let us first recall the setup of a feed-forward neural network, which is based on successively applying linear and nonlinear transformations to the inputs. Let \(\mathcal{N}^L(\mathbf{x}): \mathbb {R}^{d_{\text{in}}} \to \mathbb {R}^{d_{\text{out}}}\) be an \(L\)-layer neural network with \((L-1)\) hidden layers and \(N_\ell \) neurons in the \(\ell \)-th layer (\(N_0 = d_{\text{in}}\), \(N_L = d_{\text{out}}\)). Let us denote the weight matrix and bias vector in the \(\ell \)-th layer by \(\mathbf{W}^\ell \in \mathbb {R}^{N_\ell \times N_{\ell -1}}\) and \(\mathbf{b}^\ell \in \mathbb {R}^{N_\ell }\), respectively. For a nonlinear activation function \(\sigma \) applied element-wise, the feed-forward neural network is given by:

\begin{align*} \text{input layer:} & \quad \mathcal{N}^0(\textbf{x}) := \textbf{x} \in \mathbb {R}^{d_{\text{in}}}, \\ \text{hidden layers:} & \quad \mathcal{N}^\ell (\textbf{x}) := \sigma (\mathbf{W}^{\ell }\mathcal{N}^{\ell -1}(\textbf{x}) + \mathbf{b}^{\ell }) \in \mathbb {R}^{N_\ell }, \quad \text{for} \quad 1 \le \ell \le L-1, \\ \text{output layer:} & \quad \mathcal{N}^{L}(\textbf{x}) := \mathbf{W}^{L}\mathcal{N}^{L-1}(\textbf{x}) + \mathbf{b}^{L} \in \mathbb {R}^{d_{\text{out}}}. \end{align*}

We denote the parameters of the network (weights and biases) by

\[ \theta := \big\{ (\mathbf{W}^1, \mathbf{b}^1), \cdots , (\mathbf{W}^L, \mathbf{b}^L)\big\} , \]

and the parameter set by \(\Theta \), hence

\[ \theta \in \Theta \subset \mathbb {R}^M,\quad M= \sum _{l = 1}^{L} N_{l} (N_{l-1} + 1). \]

The output of a feed-forward neural network,

\[ u_\theta (\textbf{x}) : = \mathcal{N}^{L}(\textbf{x}), \]

depends on the tuning parameter \(\theta \). Training the network means using an optimization algorithm to find the parameters \(\theta \in \Theta \) that minimize a certain loss function \(\mathcal{L}_\theta \).
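For illustration, the forward pass above can be written in a few lines of NumPy; this is only a sketch (the initialization and activation below are placeholder choices, and the experiments in section 4 rely on the DeepXDE library instead).

\begin{verbatim}
import numpy as np

def init_params(layer_sizes, seed=0):
    # layer_sizes = [N_0, ..., N_L], e.g. [2, 24, 24, 24, 24, 1]; random weights, zero biases
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) / np.sqrt(n), np.zeros(m))   # (W^l, b^l)
            for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(theta, x, sigma=np.tanh):
    # x: (batch, d_in) array; returns N^L(x) of shape (batch, d_out)
    a = x
    for W, b in theta[:-1]:      # hidden layers: sigma(W a + b)
        a = sigma(a @ W.T + b)
    W, b = theta[-1]             # output layer: affine map, no activation
    return a @ W.T + b

theta = init_params([2, 24, 24, 24, 24, 1])     # architecture used in section 4
u = forward(theta, np.random.rand(5, 2))        # u_theta evaluated at 5 points in R^2
\end{verbatim}

For the architecture used in section 4 (\(d_{\text{in}}=2\), four hidden layers of width 24, \(d_{\text{out}}=1\)), the formula above gives \(M = 24\cdot 3 + 3\cdot 24\cdot 25 + 1\cdot 25 = 1897\) trainable parameters.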

Let us now briefly describe physics-informed neural networks (PINNs) as introduced in [ 16 ] , based on feed-forward neural networks. The idea is to consider a loss function with two components: the residual of the differential equation and the data residual. For typical boundary value problems, the data residual contains boundary conditions, initial conditions, and any other available measurements of the solution. In our case, since no boundary conditions are given, the data residual will only involve the interior partial measurements.

To solve the ill-posed unique continuation problem (1) with PINNs we consider the residual of the Helmholtz equation

\[ \mathring {\mathcal{R}}_\theta := - \Delta u_\theta - k^2 u_\theta - f \text{ in } \Omega \]

and the data residual

\[ \mathring {\mathcal{R}}_{\theta ,d} := u_\theta - g \text{ in } \omega , \]

and aim to minimize the two residuals over the admissible set of tuning parameters \(\Theta \) by considering the loss function

\[ \mathring {\mathcal{L}}_\theta := \| \mathring {\mathcal{R}}_\theta \| _{L^2(\Omega )} + \| \mathring {\mathcal{R}}_{\theta ,d}\| _{L^2(\omega )}. \]

To approximate the integrals in the setting of a neural network, we introduce the following collocation points: \(\mathcal{T}_\Omega := \{ \textbf{x}_i^\Omega \} _{i=1}^{N_\Omega }\) for the whole domain \(\Omega \) and \(\mathcal{T}_\omega := \{ \textbf{x}_i^\omega \} _{i=1}^{N_\omega }\) for the data region \(\omega \subset \Omega \), in which observations of the solution are given as \(\{ g(\textbf{x}_i^\omega )\} _{i=1}^{N_\omega }\), where \(N_\Omega := |\mathcal{T}_\Omega |\) and \(N_\omega := |\mathcal{T}_\omega |\). The loss function for the PINN is hence defined as

\begin{equation} \label{eq:weightedCompositeLoss} \mathcal{L}_\theta := \mathcal{R}_\theta + \mathcal{R}_{\theta ,d}, \end{equation}
5

with the PDE residual

\begin{equation} \label{eq:residual_pde} \mathcal{R}_\theta := \tfrac {1}{N_\Omega }\sum _{\textbf{x} \in \mathcal{T}_\Omega } \Big{|}(\Delta u_\theta + k^2 u_\theta + f)(\textbf{x})\Big{|}^2, \end{equation}
6

and the data residual

\begin{equation} \mathcal{R}_{\theta ,d} := \tfrac {1}{N_\omega }\sum _{\textbf{x} \in \mathcal{T}_\omega } \big{|}(u_\theta - g)(\textbf{x})\big{|}^2. \end{equation}
7

By training the neural network we aim to

\begin{equation} \label{eq:pinn_minimization} \text{find } \theta ^\ast \in \Theta \text{ such that } \theta ^\ast = \arg \min _{\theta \in \Theta } \mathcal{L}_\theta . \end{equation}
8

We will denote by

\[ u^{\ast } = u_{\theta ^{\ast }} \]

the PINN solution obtained by this algorithm; see figure 3 for a sketch.

As the loss function is highly non-linear and non-convex, we will use the standard approach of minimizing it with gradient-based methods such as ADAM [ 12 ] or L-BFGS [ 7 ] ; more details are given in section 4.
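To make the discrete loss concrete, the following is a minimal sketch of (5)-(7) in which the Laplacian is computed by automatic differentiation (here with PyTorch); it is an illustration only, not the implementation used in section 4 (which relies on DeepXDE), and the network, collocation points and data values below are placeholders.

\begin{verbatim}
import torch

def pinn_loss(u_theta, x_Omega, x_omega, g_omega, k, f=None):
    # Composite loss (5): mean-square PDE residual (6) + mean-square data residual (7)
    x = x_Omega.clone().requires_grad_(True)
    u = u_theta(x)
    grad_u = torch.autograd.grad(u.sum(), x, create_graph=True)[0]      # (du/dx, du/dy)
    u_xx = torch.autograd.grad(grad_u[:, 0].sum(), x, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(grad_u[:, 1].sum(), x, create_graph=True)[0][:, 1]
    f_vals = torch.zeros_like(u_xx) if f is None else f(x).squeeze(-1)
    r_pde = -(u_xx + u_yy) - k**2 * u.squeeze(-1) - f_vals     # residual of -Laplacian(u) - k^2 u = f
    r_data = u_theta(x_omega).squeeze(-1) - g_omega.squeeze(-1)  # residual of u = g in omega
    return (r_pde**2).mean() + (r_data**2).mean()

# Toy usage with a small fully connected network and placeholder points/data
net = torch.nn.Sequential(torch.nn.Linear(2, 24), torch.nn.Tanh(), torch.nn.Linear(24, 1))
loss = pinn_loss(net, torch.rand(100, 2), torch.rand(50, 2), torch.zeros(50, 1), k=1.0)
loss.backward()   # gradients w.r.t. the network parameters, used by ADAM / L-BFGS
\end{verbatim}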

\includegraphics[scale=0.75]{pinn.png}
Figure 3 Representation of a PINN for solving (1), adapted from [ 8 ] .

3.1 Estimating the generalization error

Such PINNs for unique continuation problems have been considered in [ 14 ] where the authors prove for the first time estimates for the generalization error by employing conditional stability estimates; details of the analysis and numerical examples were given for the Poisson, Stokes, heat and wave equations. In brief, one uses conditional stability estimates to bound the error in a target domain (generalization error) by the residuals, which are then bounded in terms of the training error and the number of training samples. We now adapt the analysis in [ 14 , Theorem 2.4 ] to the case of the Helmholtz equation by using the stability results discussed in section 2.1.

Theorem 2

Let \(f \in C^{n-2}(\Omega )\) and \(g \in C^n(\omega )\), with continuous extensions of the functions and derivatives up to the boundaries, with \(n \geq 2\). Let \(u \in H^1(\Omega )\) be the solution of the unique continuation problem (1). Let \(u^{\ast } = u_{\theta ^{\ast }} \in C^n(\Omega )\) be a PINN solution generated by training (8). Consider a target set \(B \subset \Omega \) containing \(\omega \) and satisfying the convexity condition in [ 6 , Corollary 2 ] . Then there exist constants \(C_{st}(k){\gt}0\) and \(\alpha \in (0,1)\) such that the generalization error

\begin{equation} \label{eq:error_generalization} \EuScript {E}_G(B) := \| \nabla (u - u^{\ast })\| _{L^2(B)} + k \| u - u^{\ast }\| _{L^2(B)} \end{equation}
9

is bounded by

\begin{equation} \label{eq:error_estimate} \begin{aligned} \EuScript {E}_G(B) \leq & \, C_{st}(k) \left(\| u\| ^{1-\alpha }_{L^2(\Omega )} + \| u^{\ast }\| ^{1-\alpha }_{L^2(\Omega )} + \EuScript {E}_{\Omega ,T}^{1-\alpha } + C_{q}^{\frac{1-\alpha }{2}}N_\Omega ^{-\frac{\tau (1-\alpha )}{2}} \right) \\ & \, \times \left(\EuScript {E}_{\Omega ,T}^{\alpha } + \EuScript {E}_{\omega ,T}^{\alpha } + C_{q}^{\frac{\alpha }{2}}N_{\Omega }^{-\frac{\tau \alpha }{2}} + C_{qd}^{\frac{\alpha }{2}}N_\omega ^{-\frac{\tau _d \alpha }{2}}\right), \end{aligned} \end{equation}
10

where \(C_{st}(k)\) grows polynomially in \(k\), the training errors are

\begin{equation} \label{eq:error_training} \EuScript {E}_{\Omega ,T} = \left(\sum \limits _{j=1}^{N_\Omega } w_j |\mathring {\mathcal{R}}_{\theta ^{\ast }}(\textbf{x}_j^\Omega )|^2\right)^{\frac{1}{2}}, \quad \EuScript {E}_{\omega ,T} = \left(\sum \limits _{i=1}^{N_{\omega }} w_i|\mathring {\mathcal{R}}_{\theta ^{\ast },d}(\textbf{x}_i^\omega )|^2\right)^{\frac{1}{2}}, \end{equation}
11

with constants \(C_q\) and \(C_{qd}\) given by the quadrature bounds, and some \(\tau , \tau _d {\gt} 0\).

Proof.
Consider the error \(\hat{u} = u^{\ast } - u \in H^1(\Omega )\) which satisfies
\begin{equation*} \begin{aligned} -\Delta \hat{u} - k^2 \hat{u} & = \mathring {\mathcal{R}}_{\theta ^{\ast }}, \quad & \text{in } \Omega , \\ \hat{u} & = \mathring {\mathcal{R}}_{\theta ^{\ast },d}, \quad & \text{in } \omega , \end{aligned}\end{equation*}

in a weak sense. Applying the conditional stability estimate (4) we obtain

\begin{align*} \EuScript {E}_G(B) & \leq C_{st}(k) \left(\| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{L^2(\Omega )} + \| \mathring {\mathcal{R}}_{\theta ^{\ast },d}\| _{L^2(\omega )} \right)^{\alpha } \left( \| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{L^2(\Omega )} + \| \hat{u}\| _{L^2(\Omega )}\right)^{1-\alpha } \\ & \leq C_{st}(k) \left(\| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{L^2(\Omega )} + \| \mathring {\mathcal{R}}_{\theta ^{\ast },d}\| _{L^2(\omega )} \right)^{\alpha } \left( \| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{L^2(\Omega )} + \| u\| _{L^2(\Omega )} + \| u^\ast \| _{L^2(\Omega )}\right)^{1-\alpha } . \end{align*}

The conclusion follows by assuming quadrature rules with approximation bounds as in [ 14 , Eqs (2.9) and (2.11) ] , where for some \(\tau , \tau _d {\gt} 0\) one has that

\begin{equation} \label{eq:pl2} \begin{aligned} \| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{L^2(\Omega )} & \leq \EuScript {E}_{\Omega ,T} + C_{q}^{\frac{1}{2}}N_{\Omega }^{-\frac{\tau }{2}}, \\ \| \mathring {\mathcal{R}}_{\theta ^{\ast },d}\| _{L^2(\omega )} & \leq \EuScript {E}_{\omega ,T} + C_{qd}^{\frac{1}{2}}N_\omega ^{-\frac{\tau _d}{2}}, \end{aligned} \end{equation}
12

with constants \(C_q = C_q\left(\| \mathring {\mathcal{R}}_{\theta ^{\ast }}\| _{C^{n-2}(\Omega )}\right)\) and \(C_{qd} = C_{qd}\left(\| \mathring {\mathcal{R}}_{\theta ^{\ast },d}\| _{C^{n}(\omega )}\right)\).

Remark 3

As discussed in section 2.1, if the target domain \(B\) is outside the convex hull of the measurement domain \(\omega \), the constant \(C_{st}(k)\) in the conditional stability estimate might depend exponentially on the wave number \(k\). In that case, the same behaviour will appear in the bound of the generalization error in theorem 2, which indicates that obtaining good approximations for this ill-posed problem with PINNs can be very challenging for high frequencies. □

Remark 4

If the target set is the whole domain \(\Omega \), then one can obtain a global bound on the generalization error by using a global logarithmic stability estimate with an implicit dependence on the wave number. □

4 Numerical experiments

We present numerical experiments for the Helmholtz unique continuation problem (1) solved with the physics-informed neural networks (PINNs) described in section 3 and implemented using the open-source library DeepXDE [ 13 ] . The test case we focus on has been used as a benchmark for this problem solved with primal-dual stabilized finite element methods: with a conforming discretization in [ 6 ] and with a high-order hybridized discontinuous Galerkin method in [ 3 ] . It represents the Helmholtz version of the classical Hadamard example for ill-posed elliptic equations.

Let \(n \in \mathbb {N}\) and consider the Cauchy problem

\begin{equation} \label{eq:hadamard_example} \left\{ \begin{aligned} \Delta u + k^2 u & = 0 & & \text{in } \Omega :=(0,1)\times (0,1), \\ u(x,0) & = 0 & & \text{for } x\in [0,1], \\ u_y (x,0) & = \sin (nx) & & \text{for } x\in [0,1], \end{aligned} \right. \end{equation}
13

whose solution for \(n{\gt}k\) is given by

\begin{equation} \label{eq:hadamard_solution} u(x,y) = \frac{1}{\sqrt{n^2-k^2}} \sin (nx) \sinh (\sqrt{n^2-k^2} y), \end{equation}
14

for \(n=k\) by \(u(x,y) = \sin (kx) y\), and for \(n{\lt}k\) by

\[ u(x,y) = \frac{1}{\sqrt{k^2-n^2}} \sin (nx) \sin (\sqrt{k^2-n^2}y). \]

For such Hadamard-type solutions, we consider the interior datum \(g=u|_\omega \) and study two geometric configurations of \(\omega \) and \(B\): one with frequency-robust stability bounds and one exponentially sensitive to the frequency, as discussed in section 2.1, namely

\begin{equation} \label{eq:hadamard_conv} \omega = \Omega \setminus [0,0.865] \times [0.125,0.875],\quad B = \Omega \setminus [0,0.125] \times [0.125,0.875], \end{equation}
15

similar to figure 2, and

\begin{equation} \label{eq:hadamard_nonconv} \omega = (0.25,0.75) \times (0,0.5),\quad B = (0.125,0.875) \times (0,0.875), \end{equation}
16

sketched in figure 1.

To assess the effect of increasing the frequency, we will take exact solutions (14) with \(n=5,k=1\) and \(n=7,k=5\), both having \(\sqrt{n^2-k^2} = \sqrt{24}\).
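For reference, a small NumPy sketch of the exact solutions above (the function name is ours), which can be used to generate the interior datum \(g=u|_\omega \):

\begin{verbatim}
import numpy as np

def hadamard_solution(x, y, n, k):
    # Exact solution of (13) in the three regimes n > k, n = k, n < k
    if n > k:
        s = np.sqrt(n**2 - k**2)
        return np.sin(n * x) * np.sinh(s * y) / s
    if n == k:
        return np.sin(k * x) * y
    s = np.sqrt(k**2 - n**2)
    return np.sin(n * x) * np.sin(s * y) / s

# The two cases used below: n=5, k=1 and n=7, k=5, both with sqrt(n^2 - k^2) = sqrt(24)
u1 = hadamard_solution(0.3, 0.7, n=5, k=1)
u2 = hadamard_solution(0.3, 0.7, n=7, k=5)
\end{verbatim}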

Hyper-parameters. The PINNs described in section 3 need to be configured with the following hyper-parameters: number of hidden layers \(L-1\) (depth), number of neurons \(N_\ell \) in each hidden layer (width), learning rate \(\lambda \), activation function \(\sigma \). In order to find good configurations for these, we draw on previous numerical experiments for forward and inverse problems presented in the DeepXDE tutorials [ 13 ] and the numerical experiments for unique continuation subject to the Poisson equation in [ 14 ] . We also validate the choices by comparing them with the results given by the hyper-parameter optimization in [ 8 ] , which uses Gaussian process-based Bayesian optimization. For this we consider the search space \(L-1\in [4,50],N_\ell \in [20,150],\lambda \in [10^{-4}, 10^{-2}],\sigma \in \{ \sin ,\tanh \} \). In this way we obtain optimal or near-optimal configurations, to which the results presented below correspond. Across the numerical experiments, a good choice turned out to be the activation function \(\sigma = \sin \) and the learning rate \(\lambda = 10^{-3}\), which are fixed from this point onwards.

Training. We sample \(N=N_\Omega =N_\omega \) points on Cartesian grids from the PDE domain \(\Omega \) and the measurement domain \(\omega \). We train the model for 50000 iterations with the ADAM optimizer [ 12 ] and then train again with L-BFGS [ 7 ] , following [ 13 ] . Since PINN approximations are obtained by non-convex minimization, which in general does not have a unique solution, the method might converge to different solutions depending on the network’s initial values. We use the standard strategy of training the PINNs with different random starting values for the optimizer and averaging the errors over 30 retrainings. We note that the errors for the network with the smallest training error are considerably smaller.
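A minimal sketch of one such training run with DeepXDE, for the configuration (15) with \(n=5\), \(k=1\) and \(N=400\) grid points, is given below; it is an illustration under stated assumptions rather than the exact script used for the experiments (the keyword iterations assumes a recent DeepXDE version, and the optimizer options are left at their defaults). Averaging over retrainings then amounts to repeating this run with different random initializations.

\begin{verbatim}
import numpy as np
import deepxde as dde

k, n = 1.0, 5.0
s = np.sqrt(n**2 - k**2)                              # sqrt(24), as in (14)
u_exact = lambda X: np.sin(n * X[:, 0:1]) * np.sinh(s * X[:, 1:2]) / s

# Geometry (15): Omega is the unit square, omega = Omega \ [0,0.865]x[0.125,0.875]
Omega = dde.geometry.Rectangle([0.0, 0.0], [1.0, 1.0])
def in_omega(X):
    return ~((X[:, 0] <= 0.865) & (X[:, 1] >= 0.125) & (X[:, 1] <= 0.875))

# N = 400 collocation points on a Cartesian grid; measurement points are those in omega
xx, yy = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
grid = np.stack([xx.ravel(), yy.ravel()], axis=-1)
x_omega = grid[in_omega(grid)]

def pde_residual(x, u):
    # Residual of the Helmholtz equation (13): Laplacian(u) + k^2 u (source term f = 0)
    u_xx = dde.grad.hessian(u, x, i=0, j=0)
    u_yy = dde.grad.hessian(u, x, i=1, j=1)
    return u_xx + u_yy + k**2 * u

data_bc = dde.icbc.PointSetBC(x_omega, u_exact(x_omega))          # data residual (7)
data = dde.data.PDE(Omega, pde_residual, [data_bc], num_domain=0, anchors=grid)
net = dde.nn.FNN([2] + 4 * [24] + [1], "sin", "Glorot uniform")   # 4 hidden layers of width 24

model = dde.Model(data, net)
model.compile("adam", lr=1e-3)
model.train(iterations=50000)          # ADAM phase
model.compile("L-BFGS")
model.train()                          # L-BFGS refinement

u_pred = model.predict(grid)           # PINN solution on the grid
err = np.linalg.norm(u_pred - u_exact(grid)) / np.linalg.norm(u_exact(grid))
print(f"relative l2 error on the grid: {100 * err:.2f}%")
\end{verbatim}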

\includegraphics[scale=0.2]{had_conv_pinn.png}
Figure 4 PINN solution and exact solution (14) with \(n=7,k=5\).

Results. We first show in figure 4 the exact solution \(u\) given by the Hadamard-type function (14) with \(n=7,k=5\), and the PINN solution \(u^\ast \) obtained with a network with \(N=400\) training points, \(L-1=4\) hidden layers and \(N_\ell =24\) neurons in each layer. As discussed above, the activation function is \(\sigma = \sin \) and the learning rate \(\lambda = 10^{-3}\).

\begin{tabular}{ccccc}
\(N\) & \(L-1\) & \(N_\ell \) & \(\left\| u-u^\ast \right\| _{L^2(B)}\) & \(\left\| u-u^\ast \right\| _{H^1(B)}\) \\
400 & 4 & 24 & 3.35\% & 3.77\% \\
1600 & 4 & 24 & 1.87\% & 2.70\% \\
6400 & 4 & 24 & 1.45\% & 2.31\% \\
\end{tabular}

Table 1 Geometry with good stability (15). Relative percentage generalization errors in \(B\), \(n=5\) and \(k=1\).

We study the efficiency of PINNs by considering the \(L^2\) and \(H^1\) relative percentage errors for the number of training points \(N=20^2,40^2,80^2\) in the geometric configuration (15) with robust bounds in theorem 2. table 1 shows the errors in the target domain \(B\), while table 2 shows the global errors. In both tables we observe similar behaviours for the error. Moreover, we notice that even for very few training points the generalization errors are small (around 3%). Note that the reported errors correspond to averages over 30 retrainings. The generalization errors corresponding to the network with the smallest training error are typically smaller by a factor of \(3\) to \(5\). The training error in these examples is between \(10^{-3}\) and \(10^{-2}\). The training of such a network is very fast: on a single Quadro GPU it takes about 1-2 minutes.
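For completeness, here is a small NumPy sketch of how such relative percentage \(L^2\) and \(H^1\) errors can be computed from nodal values of \(u\) and \(u^\ast \) on a uniform grid, with the gradient approximated by finite differences; the arrays below are toy placeholders (in practice \(u^\ast \) would be evaluated on the grid with model.predict).

\begin{verbatim}
import numpy as np

def relative_errors(u_ex, u_nn, h):
    # Relative percentage L2 and H1 errors for nodal values on a uniform grid with spacing h
    # (the quadrature weights h^2 cancel in the ratios)
    e = u_nn - u_ex
    err_l2 = 100 * np.sqrt(np.sum(e**2) / np.sum(u_ex**2))
    ge, gex = np.gradient(e, h), np.gradient(u_ex, h)      # finite-difference gradients
    num = np.sum(e**2) + sum(np.sum(g**2) for g in ge)
    den = np.sum(u_ex**2) + sum(np.sum(g**2) for g in gex)
    return err_l2, 100 * np.sqrt(num / den)

# Placeholder grids over the unit square; u_nn would come from model.predict on the same grid
h = 1.0 / 79
X, Y = np.meshgrid(np.linspace(0, 1, 80), np.linspace(0, 1, 80), indexing="ij")
u_ex = np.sin(5 * X) * np.sinh(np.sqrt(24) * Y) / np.sqrt(24)
u_nn = u_ex + 0.01 * np.random.default_rng(0).standard_normal(u_ex.shape)
print(relative_errors(u_ex, u_nn, h))
\end{verbatim}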

\begin{tabular}{ccccc}
\(N\) & \(L-1\) & \(N_\ell \) & \(\left\| u-u^\ast \right\| _{L^2(\Omega )}\) & \(\left\| u-u^\ast \right\| _{H^1(\Omega )}\) \\
400 & 4 & 24 & 4.58\% & 5.15\% \\
1600 & 4 & 24 & 2.89\% & 3.61\% \\
6400 & 4 & 24 & 2.22\% & 3.01\% \\
\end{tabular}

Table 2 Geometry with good stability (15). Relative percentage generalization errors in \(\Omega \), \(n=5\) and \(k=1\).

\begin{tabular}{ccccc}
\(N\) & \(L-1\) & \(N_\ell \) & \(\left\| u-u^\ast \right\| _{L^2(\Omega )}\) & \(\left\| u-u^\ast \right\| _{H^1(\Omega )}\) \\
400 & 4 & 24 & 4.73\% & 5.17\% \\
1600 & 4 & 24 & 3.34\% & 4.06\% \\
6400 & 4 & 24 & 2.38\% & 3.19\% \\
\end{tabular}

Table 3 Geometry with good stability (15). Relative percentage generalization errors in \(\Omega \), \(n=5\) and \(k=1\). Perturbed data.

We also test the performance of the method when the measurements are perturbed with random values sampled from the normal distribution with zero mean and standard deviation 0.01 (representing approximately 2.5% of the average of the exact solution (14) with \(n=5,k=1\) considered in that example). The results given in table 3 show that the numerical approximation is robust to such perturbations in data.
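A minimal sketch of how such a perturbation can be generated, assuming the measurement values are stored in an array g_vals as in the training sketch above:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
g_vals = np.zeros((400, 1))                 # placeholder for the exact measurements u|_omega
g_noisy = g_vals + rng.normal(0.0, 0.01, size=g_vals.shape)   # N(0, 0.01^2) perturbation
# the PointSetBC in the training sketch is then built from g_noisy instead of g_vals
\end{verbatim}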

In table 4 we keep the same (convex) geometric configuration but increase the frequency to \(k=5\), with \(n=7\) such that \(\sqrt{n^2-k^2} = \sqrt{24}\) as before. We observe that the approximation improves as the frequency increases: the errors decrease compared to table 2. This is probably an artefact of the particular conditions of the experiment, but we note that such a surprising phenomenon for inverse Helmholtz problems has been previously noticed and theoretically discussed in [ 10 ] . However, for wave numbers \(k{\gt}10\) we report that the PINNs no longer provide a good approximation to the solution.

\begin{tabular}{ccccc}
\(N\) & \(L-1\) & \(N_\ell \) & \(\left\| u-u^\ast \right\| _{L^2(\Omega )}\) & \(\left\| u-u^\ast \right\| _{H^1(\Omega )}\) \\
400 & 4 & 24 & 2.03\% & 3.31\% \\
1600 & 4 & 24 & 1.48\% & 2.04\% \\
6400 & 4 & 24 & 1.27\% & 1.77\% \\
\end{tabular}

Table 4 Geometry with good stability (15). Relative percentage generalization errors in \(\Omega \), \(n=7\) and \(k=5\).

We now consider continuation outside the convex hull of the data set, i.e. the configuration (16). We observe in table 5 that even though the wave number is small (\(k=1\)), the PINNs fail to find a good approximation of the solution. This could be caused by a bad search space for hyper-parameter optimization or it could be an inherent limitation of vanilla PINNs.

\begin{tabular}{ccccc}
\(N\) & \(L-1\) & \(N_\ell \) & \(\left\| u-u^\ast \right\| _{L^2(\Omega )}\) & \(\left\| u-u^\ast \right\| _{H^1(\Omega )}\) \\
400 & 4 & 24 & 46.1\% & 71.4\% \\
1600 & 4 & 24 & 41.8\% & 66.6\% \\
25600 & 4 & 24 & 39.4\% & 62.5\% \\
\end{tabular}

Table 5 Geometry with bad stability (16). Relative percentage generalization errors in \(\Omega \), \(n=5\) and \(k=1\).

Bibliography

1

Giovanni Alessandrini, Luca Rondi, Edi Rosset, and Sergio Vessella. The stability for the Cauchy problem for elliptic equations. Inverse Problems, 25:123004, 2009.

2

S. M. Berge and E. Malinnikova. On the three ball theorem for solutions of the Helmholtz equation. Complex Anal. Synerg., 7(2):1–10, 2021.

3

E. Burman, G. Delay, and A. Ern. A hybridized high-order method for unique continuation subject to the Helmholtz equation. SIAM J. Numer. Anal., 59(5):2368–2392, 2021.

4

Erik Burman. Stabilized finite element methods for nonsymmetric, noncoercive, and ill-posed problems. Part I: elliptic equations. SIAM J. Sci. Comput., 35(6):A2752–A2780, 2013.

5

Erik Burman, Peter Hansbo, and Mats G. Larson. Solving ill-posed control problems by stabilized finite element methods: an alternative to Tikhonov regularization. Inverse Problems, 34:035004, 2018.

6

Erik Burman, Mihai Nechita, and Lauri Oksanen. Unique continuation for the Helmholtz equation using stabilized finite element methods. J. Math. Pures Appl., 129:1–22, 2019.

7

Richard H Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput., 16(5):1190–1208, 1995.

8

Paul Escapil-Inchauspé and Gonzalo A. Ruz. Hyper-parameter tuning of physics-informed neural networks: application to Helmholtz problems. arXiv preprint arXiv:2205.06704, 2022.

9

Tamara G Grossmann, Urszula Julia Komorowska, Jonas Latz, and Carola-Bibiane Schönlieb. Can physics-informed neural networks beat the finite element method? arXiv preprint arXiv:2302.04107, 2023.

10

Tomasz Hrycak and Victor Isakov. Increased stability in the continuation of solutions to the Helmholtz equation. Inverse Problems, 20(3):697–712, 2004.

11

Victor Isakov. Inverse problems for partial differential equations, volume 127 of Applied Mathematical Sciences. Springer, 3rd edition, 2017.

12

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

13

Lu Lu, Xuhui Meng, Zhiping Mao, and George Em Karniadakis. DeepXDE: a deep learning library for solving differential equations. SIAM Rev., 63(1):208–228, 2021.

14

Siddhartha Mishra and Roberto Molinaro. Estimates on the generalization error of physics-informed neural networks for approximating a class of inverse problems for PDEs. IMA J. Numer. Anal., 42(2):981–1022, 2022.

15

M. Nechita. Unique continuation problems and stabilised finite element methods. PhD thesis, University College London, 2020.

16

Maziar Raissi, Paris Perdikaris, and George E Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comp. Phys., 378:686–707, 2019.