A comparative study of Filon-type rules
for oscillatory integrals
Abstract.
Our aim is to answer the following question: “Among the Filon-type methods for computing oscillatory integrals, which one is the most efficient in practice?”. We first discuss why we should seek the answer among the family of Filon-Clenshaw-Curtis rules. A theoretical analysis accompanied by a set of numerical experiments reveals that the plain Filon-Clenshaw-Curtis rules reach a given accuracy faster than the (adaptive) extended Filon-Clenshaw-Curtis rules. The comparison is based on the CPU runtime for certain wave numbers (medium and large).
Key words and phrases:
oscillatory integral, Filon-Clenshaw-Curtis rule, extended FCC rule, adaptive FCC+ rule.
2005 Mathematics Subject Classification:
65D32, 65Y20
1. Introduction
Consider the oscillatory integral
(1)  $${I}_{k}(f):={\int}_{-1}^{1}f(x){\mathrm{e}}^{\mathrm{i}kx}\,dx,$$ 
where $f$ is a given smooth function on $[-1,1]$, and the wave number $k>0$ is rather large. A Filon-type method, in brief, reads as ${I}_{k}(p)$, where $p$ is a polynomial approximation of $f$ on $[-1,1]$. In detail, the interpolation polynomial $p$ is generally expanded as
(2)  $$p(x)=\sum _{j}{\alpha}_{j}{q}_{j}(x),\qquad x\in [-1,1],$$ 
where ${q}_{j}$ is a polynomial of order $j$ for each $j$. The scalars ${\alpha}_{j}$, for all $j$, are computed in the construction phase of $p$. Thus, the Filon-type method is expanded as
(3)  $${I}_{k}(p)=\sum _{j}{\alpha}_{j}{m}_{j},$$ 
where ${m}_{j}:={I}_{k}({q}_{j})$ are the moments to be evaluated.
Certainly, the accuracy of a Filon-type method is proportional to the accuracy of the corresponding polynomial approximation because
(4)  $$\left|{I}_{k}(f)-{I}_{k}(p)\right|\le {\int}_{-1}^{1}\left|f(x)-p(x)\right|\,dx\le 2{\Vert f-p\Vert}_{\mathrm{\infty}}.$$ 
In practice, what is the best choice of $p$ of a given degree? Even if we restrict our investigation to interpolation, there is a tremendous number of interpolation polynomials that approximate a smooth function satisfactorily, e.g. splines, Hermite-Birkhoff interpolation, and those based on orthogonal polynomials and their roots or extrema. The main question motivating this research is: which interpolation is the best choice for Filon-type methods? Besides the accuracy of the interpolation polynomial $p$ on $[-1,1]$, three other aspects mainly affect the efficiency of the Filon-type method ${I}_{k}(p)$ in practice: (i) the cost of constructing $p$, (ii) the cost of computing the moments, and (iii) the order of interpolation at the endpoints $\pm 1$.
In general, Lagrange interpolation at arbitrary points is rather costly. However, in [6], a rapid and stable construction method is proposed based on the fast multipole method (FMM) of [8]. Instead of an arbitrary set of interpolation points, however, one prefers the roots or extrema of orthogonal polynomials and utilizes the FFT for better performance. Among the several kinds of orthogonal polynomials, Chebyshev polynomials are often used, mainly because their roots and extrema have the simplest explicit forms.
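As an illustration of how the FFT enters, the following Python sketch (the paper's own experiments are in Matlab) computes the Chebyshev coefficients of the interpolant at the points $\cos(n\pi/N)$ by one FFT of length $2N$; the function name `cheb_coeffs` is ours, not from the cited works.

```python
import numpy as np

def cheb_coeffs(f, N):
    """Chebyshev coefficients of the degree-N interpolant of f at the points
    t_n = cos(n*pi/N), n = 0..N, computed via one FFT of length 2N, at a
    cost of O(N log N) flops."""
    t = np.cos(np.pi * np.arange(N + 1) / N)
    v = f(t)
    # Even extension of the samples turns the cosine transform into an FFT.
    V = np.concatenate([v, v[-2:0:-1]])
    U = np.real(np.fft.fft(V)) / N
    a = U[:N + 1].copy()
    a[0] /= 2.0
    a[N] /= 2.0
    return a  # p(x) = sum_j a[j] * T_j(x)

# The interpolant reproduces a smooth function to near machine precision.
a = cheb_coeffs(np.exp, 16)
x = np.linspace(-1.0, 1.0, 101)
err = np.max(np.abs(np.polynomial.chebyshev.chebval(x, a) - np.exp(x)))
```

The even extension of the sample vector is the standard device that reduces the discrete cosine transform to a plain FFT.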
For fast computation of the moments, one prefers polynomials that satisfy a recurrence relation; the moments can then be computed recursively at little computational cost. Orthogonal polynomials not only can be constructed efficiently by the FFT, but they also satisfy three-term recurrence relations. Assume that $({q}_{j})$ in (2) satisfies the three-term recurrence relation
(5)  $${q}_{j+2}={a}_{j}{q}_{j+1}+{b}_{j}{q}_{j},j=0,1,\mathrm{\dots},$$ 
where ${a}_{j}$ and ${b}_{j}$ are known scalars with ${a}_{j}+{b}_{j}\ne 0$ for all $j$. The first two moments can easily be computed analytically; the others are then computed recursively by (5):
(6)  $${m}_{j+2}={a}_{j}{m}_{j+1}+{b}_{j}{m}_{j},j=0,1,\mathrm{\dots}.$$ 
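The recurrences (5)–(6) are stated schematically. As a concrete illustration of the same principle, the moments of the monomials satisfy a short recurrence obtained by integration by parts (a two-term one, rather than the three-term form (6)); the sketch below, with our own function name `monomial_moments`, is not taken from the cited algorithms.

```python
import numpy as np

def monomial_moments(k, J):
    """Moments m_j = int_{-1}^{1} x^j e^{ikx} dx for j = 0..J, computed by
    the integration-by-parts recurrence
        m_j = (e^{ik} - (-1)^j e^{-ik})/(ik) - (j/(ik)) m_{j-1},
    so each additional moment costs only a handful of flops."""
    ik = 1j * k
    m = np.empty(J + 1, dtype=complex)
    m[0] = (np.exp(ik) - np.exp(-ik)) / ik      # = 2 sin(k)/k
    for j in range(1, J + 1):
        boundary = (np.exp(ik) - (-1) ** j * np.exp(-ik)) / ik
        m[j] = boundary - (j / ik) * m[j - 1]
    return m

m = monomial_moments(20.0, 8)
```

The forward recurrence is well behaved as long as $j$ stays well below $k$; this is the same stability caveat that motivates the careful moment algorithms in the FCC literature.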
On the other hand, the asymptotic expansion of the oscillatory integral ${I}_{k}(f)$ reveals that the value of ${I}_{k}(f)$ is mostly determined by the amplitude $f$ and its derivatives up to some order at the endpoints $\pm 1$. More precisely, if $f$ is smooth enough on $[-1,1]$, and
(7)  $$\left({f}^{(m)}-{p}^{(m)}\right)(\pm 1)=0,\qquad m=0,\mathrm{\dots},s,$$ 
for some integer $s\ge 0$, then
(8)  $$\left|{I}_{k}(f)-{I}_{k}(p)\right|=\mathcal{O}({k}^{-s-2}),\qquad \text{as } k\to \mathrm{\infty}$$ 
(see, e.g., [11]). Therefore, not only the accuracy of the interpolation, i.e. ${\Vert f-p\Vert}_{\mathrm{\infty}}$, is important for the total accuracy of a Filon-type rule ${I}_{k}(p)$, but also its order $s$ at the endpoints.
Conditions (7) describe an instance of Hermite-Birkhoff interpolation, whose construction rarely admits fast algorithms like the FFT or FMM. Interpolation at the Clenshaw-Curtis points ${t}_{n,N}:=\mathrm{cos}(n\pi /N)$, $n=0,\mathrm{\dots},N$, has all the above advantages. Because of the distribution of the Clenshaw-Curtis points in $[-1,1]$, interpolation at them is highly accurate and numerically stable. Moreover, thanks to the three-term recurrence relation for the Chebyshev polynomials, the interpolant can be constructed stably by the FFT at a cost of $\mathcal{O}(N\mathrm{log}N)$ flops (see [5]). In addition, the cost of computing ${t}_{n,N}$, $n=0,\mathrm{\dots},N$, is negligible. Finally, from the Lagrange interpolant at the Clenshaw-Curtis points, one can efficiently construct the corresponding Hermite-Birkhoff interpolant $p$ that also satisfies (7) (see [15]).
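Putting the two ingredients together, an FCC rule is the sum (3) of Chebyshev coefficients times Chebyshev moments. The following Python sketch (ours, not the algorithm of [5]) assembles such a rule naively, obtaining the moments by brute-force quadrature; a practical implementation would use the stable moment recurrences of [5].

```python
import numpy as np

def _trap(y, h):
    """Trapezoidal rule on uniformly spaced samples."""
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

def fcc_rule(f, k, N):
    """A naive (N+1)-point FCC rule for int_{-1}^{1} f(x) e^{ikx} dx."""
    # 1. Chebyshev coefficients of the interpolant at the Clenshaw-Curtis
    #    points, via one FFT of length 2N.
    t = np.cos(np.pi * np.arange(N + 1) / N)
    v = f(t)
    U = np.real(np.fft.fft(np.concatenate([v, v[-2:0:-1]]))) / N
    a = U[:N + 1].copy()
    a[0] /= 2.0
    a[N] /= 2.0
    # 2. Moments m_j = int_{-1}^{1} T_j(x) e^{ikx} dx by dense quadrature
    #    (for illustration only; see [5] for the stable recursive approach).
    x = np.linspace(-1.0, 1.0, 200001)
    h = x[1] - x[0]
    e = np.exp(1j * k * x)
    m = np.array([_trap(np.cos(j * np.arccos(x)) * e, h) for j in range(N + 1)])
    # 3. The rule (3): I_k(p) = sum_j a_j m_j.
    return np.dot(a, m)

# f(x) = 1/(1+x^2) with k = 50; N = 32 already suffices for high accuracy.
approx = fcc_rule(lambda x: 1.0 / (1.0 + x ** 2), 50.0, 32)
```

Note that the number of interpolation points needed depends on $f$ alone, not on $k$: this $k$-independence is the main attraction of Filon-type rules over standard quadrature.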
Therefore, without doubt, Filon-Clenshaw-Curtis (FCC) rules and their extensions are the most efficient Filon-type rules for computing the oscillatory integral (1). Extended FCC (FCC+) rules and their adaptive versions can be constructed efficiently by the algorithms of [15] and [14], respectively. To the best of our knowledge, these are the most efficient algorithms for constructing the FCC+ and adaptive FCC+ rules so far.
In comparison with the algorithm of [5] for constructing the FCC rules, the algorithms of [15] and [14] are more complex. This difference in computational cost may be negligible when the maximum derivative order $s$ in the FCC+ rules is rather small. On the other hand, the asymptotic order of an (adaptive) FCC+ rule is better than that of the corresponding FCC rule. Hence, one cannot readily say which algorithm is more efficient, and this question naturally calls for a comparative study. This paper fills that gap in the literature and helps in the comparison between Filon-type methods.
In this paper, we carry out a set of numerical experiments to compare the efficiency of the FCC, FCC+, and adaptive FCC+ rules. Specifically, we are interested in the CPU runtime needed to achieve a given accuracy. We apply these three classes of rules to a set of oscillatory integrals with different regularity characteristics. Since FCC is less complex than the other rules, one can use more interpolation points for the same computational effort as (adaptive) FCC+.
The rest of the paper is organized as follows. In Section 2, we carry out a set of numerical experiments in order to compare the efficiency of the FCC and FCC+ rules. In Section 3, the same is done for the FCC and adaptive FCC+ rules. We also modify Theorem 2.2 of [14] and give a more practical error bound for the adaptive FCC+ rules that is explicit in $k$ and the rule parameters. In Section 4, we recount several other advantages of the FCC rules over the (adaptive) FCC+ ones. Finally, we draw some conclusions in Section 5.
2. FCC vs FCC+
In this section, we compare the efficiency of the FCC and FCC+ rules. In contrast to the common ideas in the literature, we see that the FCC+ rules have no advantage over the plain FCC rules in practice. First, we give a brief review of the FCC+ rules (see [7, 15]).
2.1. A review of the FCC+ rules
The $(N+1)$-point FCC+ rule of order $s\ge 1$ is defined by
(9)  $${I}_{k,N,s}^{\mathrm{FCC}+}(f):={I}_{k}({p}_{N+2s}),$$ 
where ${p}_{N+2s}$ is the ${(N+2s)}^{\mathrm{th}}$ degree Hermite interpolation polynomial satisfying the following interpolation conditions (see [15]):
(10a)  $${p}_{N+2s}({t}_{n,N})=f({t}_{n,N}),\qquad n=0,\mathrm{\dots},N,$$ 
(10b)  $${p}_{N+2s}^{(m)}(\pm 1)={f}^{(m)}(\pm 1),\qquad m=1,\mathrm{\dots},s.$$ 
Then, the asymptotic error of the rule will be
(11)  $$\left|{I}_{k}(f)-{I}_{k,N,s}^{\mathrm{FCC}+}(f)\right|=\mathcal{O}\left({k}^{-s-2}\right),$$ 
see, e.g., [11].
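The interpolant behind (9)–(10) can be made concrete by a small dense solve. The sketch below (ours; the efficient construction is the algorithm of [15], which proceeds quite differently) builds the polynomial in the monomial basis from a confluent Vandermonde system, which is adequate only for small $N$ and $s$ since such systems are notoriously ill-conditioned. Here `dfs[m-1]` is assumed to evaluate $f^{(m)}$.

```python
import numpy as np
from math import factorial

def fcc_plus_interpolant(f, dfs, N, s):
    """Monomial coefficients of the degree-(N+2s) polynomial satisfying the
    interpolation conditions (10): values at the N+1 Clenshaw-Curtis points
    plus the first s derivatives of f at +1 and -1.  Dense-solve sketch."""
    deg = N + 2 * s
    t = np.cos(np.pi * np.arange(N + 1) / N)      # Clenshaw-Curtis points
    rows, rhs = [], []
    for tn in t:                                   # value conditions (10a)
        rows.append(tn ** np.arange(deg + 1))
        rhs.append(f(tn))
    for m in range(1, s + 1):                      # derivative conditions (10b)
        for c in (1.0, -1.0):
            row = np.zeros(deg + 1)
            for i in range(m, deg + 1):
                # m-th derivative of x^i evaluated at c.
                row[i] = factorial(i) / factorial(i - m) * c ** (i - m)
            rows.append(row)
            rhs.append(dfs[m - 1](c))
    return np.linalg.solve(np.array(rows), np.array(rhs))

# For f = exp, every derivative is exp itself.
coef = fcc_plus_interpolant(np.exp, [np.exp, np.exp], 8, 2)
```

The system is square, with $(N+1)+2s = N+2s+1$ conditions matching the $N+2s+1$ coefficients.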
An important advantage of the FCC rules over the FCC+ ones is that error estimates exist for the FCC rules that apply even when $f$ has limited regularity; furthermore, they are explicit in both $k$ and $N$ (see [4, 5, 20, 21]). No such error bounds exist for the FCC+ rules so far (see [15] for details).
In [7], an efficient algorithm is proposed for constructing the FCC+ rule (9) that can be implemented in $\mathcal{O}\left(N\mathrm{log}N+{s}^{3}\right)$ flops. A more efficient algorithm was later proposed in [15]. One advantage of the latter is that its computational cost can be determined precisely. By the algorithm of [15], one needs $\mathcal{O}({s}^{3}+Ns+{s}^{2})$ flops to extend the $(N+1)$-point FCC rule to the corresponding FCC+ rule of order $s$ in a natural way. Thus, besides its simplicity and naturalness, the algorithm of [15] enables one to compare the computational costs of the two rules easily. The algorithm for constructing the $(N+1)$-point FCC+ rule of order $s>1$ leads to two linear systems of order $s$. However, the FCC+ rule of order 1 can be stated explicitly, and one only needs about $3N+14$ extra flops to build it from the corresponding $(N+1)$-point FCC rule.
2.2. The race for convergence speed
Error estimate (11) shows that for a fixed $N$, the asymptotic decay rate of the error of ${I}_{k,N,s}^{\mathrm{FCC}+}(f)$ is higher than that of ${I}_{k,N}^{\mathrm{FCC}}(f)$, and it improves as $s$ grows. On the other hand, for rather large $N$, the complexity of the FCC rule, $\mathcal{O}\left(N\mathrm{log}N\right)$, is comparable to that of the FCC+ rule, $\mathcal{O}\left(N\mathrm{log}N+{s}^{3}\right)$, because $s$ is usually taken rather small in practice. Thus, some authors conclude that the FCC+ rules always win the competition with the FCC rules. According to [3], “the generalization of FCC … enjoys all the advantages of ‘plain’ FCC … in cost and simplicity, while arbitrarily increasing the asymptotic rate of decay of the error” (p. 66).
In this section, we show that such claims are doubtful in practice. Although some applications require computing ${I}_{k}(f)$ for several $k$ and $f$, each time we face the problem of computing one integral with a fixed parameter $k$ and a fixed amplitude function $f$. The FCC+ rule ${I}_{k,{N}_{1},s}^{\mathrm{FCC}+}(f)$ approximates the integral to a specific accuracy for some rather large integers ${N}_{1}$ and $s$. As mentioned above, the FCC rule ${I}_{k,{N}_{1}}^{\mathrm{FCC}}(f)$ with the same ${N}_{1}$ generally achieves lower accuracy. However, we may choose a larger integer ${N}_{2}>{N}_{1}$ and achieve the same accuracy or better while the computational cost does not exceed that of ${I}_{k,{N}_{1},s}^{\mathrm{FCC}+}(f)$. Recall from [15] that the FCC+ rule ${I}_{k,N,s}^{\mathrm{FCC}+}(f)$ is more complex than its corresponding FCC rule ${I}_{k,N}^{\mathrm{FCC}}(f)$ by at least $\mathcal{O}({s}^{3})+N(8s-5)+s(17s+9)+1$ flops. Now assume that ${C}_{k}(N)$ is the exact number of flops for constructing ${I}_{k,N}^{\mathrm{FCC}}(f)$ by the algorithm proposed in [5]. Thus, in order to compare the simplicity of the two rules ${I}_{k,{N}_{1},s}^{\mathrm{FCC}+}(f)$ and ${I}_{k,{N}_{2}}^{\mathrm{FCC}}(f)$, we should compare the two numbers ${C}_{k}({N}_{2})$ and
$${C}_{k}({N}_{1})+\mathcal{O}({s}^{3})+{N}_{1}(8s-5)+s(17s+9)+1.$$ 
Even if we counted the number of flops for constructing the two rules exactly, the error estimates (especially for the FCC+ rules) are not sharp enough that the accuracy of the rules can be estimated a priori for a given $N$ and $s$. To examine the efficiency of an approximation method (or rule) in practice, one can measure the runtime that the method consumes on a specific machine to reach a given accuracy.
In the following, we apply FCC and FCC+ to a set of oscillatory integrals (1), covering wave numbers from moderate to quite large values. We try to have at least one sample amplitude function in each of the classes ${C}^{\mathrm{\infty}}([-1,1])$ and ${C}^{m}([-1,1])$ for some integer $m>0$, as well as elementary and nonelementary functions. In each method, we choose $N$ as the minimum number of interpolation points needed to achieve a given accuracy. As discussed above, this number is generally larger for FCC than for FCC+.
Experiment 1.
Let $d:=-{\mathrm{log}}_{10}E$, where $E$ is the relative error of an approximation. Indeed, $d$ is roughly the number of correct significant digits in the approximation. In Fig. 1, we plot $d$ versus the machine runtime (in milliseconds) for the FCC rules and the FCC+ rules (9) of orders $s=1,2,3$ when applied to ${I}_{k}(f)$ with $f(x)=(1+x)/(1+{x}^{2})$ and wave numbers $k>0$ ranging from small to quite large values. Note that each time an algorithm is run on a specific machine, the runtime may vary a little, mainly because of invisible services running in the background of the operating system (OS). For that reason, in all the numerical experiments concerning runtimes in the sequel, we run the corresponding Matlab algorithm ten times and take the minimum of the runtimes. For $k\ge 70$, where the plots have enough space, we label each marker with the corresponding $N$. All the Matlab codes have been optimized to use less memory and execute faster.
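The minimum-of-ten-runs protocol just described can be expressed compactly; the sketch below is a Python analogue (the paper's measurements were made in Matlab), with our own helper name `min_runtime_ms`.

```python
import timeit

def min_runtime_ms(func, runs=10):
    """Run func() `runs` times and keep the minimum runtime in milliseconds.
    Taking the minimum filters out the slowdowns caused by background OS
    activity, which only ever inflate a measurement, never deflate it."""
    return 1000.0 * min(timeit.repeat(func, repeat=runs, number=1))

t_ms = min_runtime_ms(lambda: sum(i * i for i in range(100000)))
```

The minimum, rather than the mean, is the natural statistic here because timing noise from background services is strictly additive.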
Although the derivatives of $f$ have been precalculated manually to the advantage of FCC+, our numerical experiments show that FCC is still more efficient than FCC+, especially for moderate and larger $k$. Also, among the FCC+ rules, those with smaller $s$ are more efficient. This observation contradicts some authors' claims that the FCC+ rules are ‘the winner’ just because the asymptotic decay rates of their errors are higher than those of the FCC rules. As can be seen, for a specific $N$, the accuracy of the FCC+ rules is higher than that of the FCC rule, and this accuracy improves as $s$ grows. For example, in the last plot, corresponding to $k=1000$, the 31-point FCC rule, the 26-point FCC+ rule of order 1, the 19-point FCC+ rule of order 2, and the 12-point FCC+ rule of order 3 all reach approximately 13 correct significant digits. However, the FCC rules ultimately reach a given accuracy faster than all the FCC+ rules.
Experiment 2.
The amplitude function $f$ in the previous experiment is infinitely differentiable. Here, we consider $f(x)={x}^{3/2}$, $x\in [0,1]$, for which only the first derivative exists on $[0,1]$. Thus, the parameter $s$ in the FCC+ rules cannot exceed 1. In Fig. 2, we plot the accuracy $d$ versus the machine runtime (in milliseconds) for the FCC rules and the FCC+ rules of order 1. Because the degree of regularity is low in this example, the rules reach a given accuracy only for larger $N$.
Experiment 3.
In order to see if the above experimental results hold for other amplitude functions $f$, here we choose a set of elementary and nonelementary functions of various types on $[0,1]$:
$$\begin{array}{cc}{x}^{\alpha},& \alpha =2.5,3.5,4.5,\hfill \\ {x}^{n}\mathrm{log}(x),& n=2,3,4,\hfill \\ {x}^{2}{H}_{0}^{(1)}(x),& x{H}_{1}^{(1)}(x),\hfill \\ {I}_{\nu}(x),& \nu =0,1,\hfill \\ {x}^{2}{K}_{0}(x),& x{K}_{1}(x).\hfill \end{array}$$ 
Here, ${H}_{\nu}^{(1)}$, ${I}_{\nu}$, and ${K}_{\nu}$ are the Hankel function of the first kind, the modified Bessel function of the first kind, and the modified Bessel function of the second kind of order $\nu $, respectively. ${I}_{\nu}(x)$ is smooth on $[0,1]$ while each of the other functions has one and only one singularity at zero. The singularity is bounded and of algebraic or logarithmic type.
3. FCC vs Adaptive FCC+
In this section, we compare the practical efficiency of the FCC and the adaptive FCC+ rules of [14]. An adaptive FCC+ rule and its corresponding FCC+ rule always have the same asymptotic order, while the former is derivative-free and less complex. We will see, however, that in the race to achieve a given accuracy, the adaptive FCC+ rules are still overtaken by the corresponding FCC rules. First, we give a brief review of [14], where the concept of adaptive FCC+ rules was introduced for the first time.
3.1. A review of the adaptive FCC+ rules
The $(N+2s+1)$-point adaptive FCC+ rule of order $s$ is defined by
(12)  $${I}_{k,N,s}^{\mathrm{AFCC}+}(f):={I}_{k}({p}_{N+2s}),$$ 
where ${p}_{N+2s}$ is the Lagrange interpolant of $f$ at ${t}_{n,N}$, for $n=0,\mathrm{\dots},N$, and $2s$ extra points ${t}_{m}^{\prime}(k)$, ${t}_{m}^{\prime \prime}(k)$, $m=1,\mathrm{\dots},s$, that should be near the endpoints $\pm 1$ and are defined as follows:
(13)  $${t}_{m}^{\prime}(k):=\mathrm{cos}({\eta}_{m}(k))\phantom{\rule{1em}{0ex}}\text{with}\phantom{\rule{1em}{0ex}}{\eta}_{m}(k):=\sqrt{\frac{m}{k+1}},\qquad {t}_{m}^{\prime \prime}(k):=-{t}_{m}^{\prime}(k).$$ 
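These extra points are cheap to generate; the Python sketch below (our helper name `adaptive_points`) evaluates (13) and makes the clustering explicit: since $1-{t}_{m}^{\prime}(k)\approx {\eta}_{m}^{2}/2=m/(2(k+1))$, the points approach the endpoints $\pm 1$ at the rate $\mathcal{O}(1/k)$, with the second family mirrored to the other endpoint.

```python
import numpy as np

def adaptive_points(k, s):
    """The 2s extra interpolation points of the adaptive FCC+ rule:
    t'_m = cos(eta_m(k)) with eta_m(k) = sqrt(m/(k+1)), m = 1..s, near the
    endpoint +1, and their mirror images near -1."""
    eta = np.sqrt(np.arange(1, s + 1) / (k + 1.0))
    tp = np.cos(eta)
    return tp, -tp

tp, tpp = adaptive_points(1000.0, 3)
# For k = 1000 the points lie within about m/(2(k+1)) of the endpoints.
```

This $k$-dependence is what makes the rule "adaptive": the interpolation points themselves move with the wave number.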
As mentioned above, an adaptive FCC+ rule preserves the asymptotic order of the corresponding FCC+ rule. More precisely, we have the following result proved in [14]:
Theorem 1.
Assume that $f\in {C}^{N+2s+1}([-1,1])$. Then,
(14)  $$\left|{I}_{k}(f)-{I}_{k,N,s}^{\mathrm{AFCC}+}(f)\right|=\mathcal{O}\left({k}^{-s-2}\right),$$ 
as $k\to \mathrm{\infty}$.
In contrast to the FCC+ rules, there are error bounds for the adaptive FCC+ rules that are explicit in all the three parameters $k$, $N$, and $s$.
Corollary 2.
Assume that $f\in {C}^{N+2s+1}([-1,1])$, and $k+1\ge 2s$. Then,
(15)  $$\left|{I}_{k,N,s}^{\mathrm{AFCC}+}(f)-{I}_{k}(f)\right|\le {k}^{-2}\,{2}^{-N}\,\frac{17}{(N+2s-4)!}\,{\Vert {f}^{(N+2s+1)}\Vert}_{\mathrm{\infty}},$$ 
for all $N>1$.
Proof.
For $N>1$,
$$\frac{2N\,s!\,{k}^{-s}}{(N+2s+1)!}+\frac{{\gamma}_{N,s}}{(N+2s-1)!}\le \frac{17}{(N+2s-4)!},$$ 
where ${\gamma}_{N,s}:=s(4N+2s-1)+N(2{N}^{2}+1)/3$. Now, the result follows from Theorem 2.2 of [14]. $\mathrm{\u25a0}$
In [14], it was shown numerically that adaptive FCC+ is always more efficient than FCC+. However, our numerical experiments in the next section show that it is still overtaken by FCC in the race to achieve a given accuracy.
3.2. The race for convergence speed
In this section, we repeat the numerical experiments of Section 2.2, now with the adaptive FCC+ rules instead of the FCC+ rules. The same discussion applies here, too. In Figs. 5–8, we compare the convergence rates of the adaptive FCC+ rules and the corresponding FCC rules. The results are the same as in Figs. 1–4, and FCC is always the winner.
4. Other advantages of FCC rules
The FCC rules have further advantages over the FCC+ and adaptive FCC+ rules. In this section, we recount some of them.

(1) Composite FCC rules can readily be developed [4], while the composite versions of the (adaptive) FCC+ rules require many data within the integration interval. In the case of FCC+ rules, one should compute derivatives of the amplitude function at some (or many) internal points in addition to the endpoints. This can increase the computational cost considerably.

(2) One can apply composite FCC rules with suitably graded meshes to treat oscillatory integrals with algebraic or logarithmic endpoint singularities [4]. Trying to apply this idea to the FCC+ rules, one should compute derivatives of the amplitude function at some (or many) points clustered near the singular endpoint. In the case of adaptive FCC+ rules, we would have a bunch of points clustered near the endpoints of each subinterval. Thus, in the vicinity of the singular endpoint, the density of the points may become so high that rounding error pollutes the total accuracy.
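To make the graded-mesh idea concrete, the usual algebraic grading takes mesh points ${x}_{j}=a+(b-a){(j/M)}^{q}$ toward the singular endpoint $a$. The following sketch (our own helper `graded_mesh`; [4] should be consulted for how the grading exponent is tied to the singularity) generates such a mesh.

```python
import numpy as np

def graded_mesh(M, q, a=0.0, b=1.0):
    """M+1 mesh points on [a, b] graded toward the left endpoint a with
    grading exponent q >= 1: x_j = a + (b - a) * (j/M)**q.  With q = 1 the
    mesh is uniform; larger q clusters the points near a, where the
    integrand is singular."""
    return a + (b - a) * (np.arange(M + 1) / M) ** q

x = graded_mesh(8, 3.0)
```

A composite FCC rule then applies the plain rule on each subinterval $[{x}_{j},{x}_{j+1}]$, with no derivative data required anywhere.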

(3) When estimating the computational cost of the FCC+ rules in [15], it is assumed that the derivatives of the amplitude function have been precalculated. This is also the case in the numerical experiments of Section 2.2. However, in many situations it is impossible or awkward to differentiate manually [10]. It is well known that computing derivatives (especially of high order) by finite differences is subject to cancellation. For that reason, many special numerical differentiation methods have been developed, e.g. automatic differentiation [9, 18, 19], complex and multicomplex step differentiation [12, 16, 17], differentiation rules based on Cauchy integrals [1, 2], etc. Thus, computing the derivatives is not necessarily a trivial task and imposes an extra computational cost.
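For instance, the complex-step formula of [16] delivers first derivatives without cancellation; a minimal Python sketch (our helper name `complex_step_derivative`):

```python
import numpy as np

def complex_step_derivative(f, x, h=1e-30):
    """f'(x) ~ Im(f(x + ih)) / h for real-analytic f.  Unlike a finite
    difference, no subtraction of nearly equal numbers occurs, so h can be
    taken extremely small and the result is accurate to machine precision."""
    return np.imag(f(x + 1j * h)) / h

d = complex_step_derivative(np.exp, 0.7)
```

Even so, the method requires complex-capable evaluation of $f$ and handles only first derivatives directly; higher orders need the multicomplex extensions of [12, 17], which is precisely the extra machinery the derivative-free FCC rules avoid.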

(4) One can study the stability of the FCC rules [13], but such a study can be sophisticated for the (adaptive) FCC+ rules.
5. Conclusions
In this work, we carried out a benchmark of numerical experiments and observed that the (adaptive) FCC+ rules are most often overtaken by the plain FCC rules in the race to achieve a given accuracy. The availability of error estimates when the amplitude function has limited regularity, the capability to develop efficient composite rules, the availability of a stability analysis, and being derivative-free are further features that make the FCC rules the most efficient choice among the Filon-type methods for oscillatory integrals (1).
References
 [1] (2013) Optimal contours for highorder derivatives. IMA J. Numer. Anal. 33, pp. 403–412.
 [2] (2011) Accuracy and stability of computing highorder derivatives of analytic functions by Cauchy integrals. Found. Comput. Math. 11, pp. 1–63.
 [3] (2017) Computing highly oscillatory integrals. SIAM.
 [4] (2013) FilonClenshawCurtis rules for highly oscillatory integrals with algebraic singularities and stationary points. SIAM J. Numer. Anal. 51, pp. 1542–1566.
 [5] (2011) Stability and error estimates for FilonClenshawCurtis rules for highly oscillatory integrals. IMA J. Numer. Anal. 31, pp. 1253–1280.
 [6] (1996) Fast algorithms for polynomial interpolation, integration, and differentiation. SIAM J. Numer. Anal. 33, pp. 1689–1711.
 [7] (2017) A generalization of FilonClenshawCurtis quadrature for highly oscillatory integrals. BIT Numer. Math. 57, pp. 943–961.
 [8] (1987) A fast algorithm for particle simulations. J. Comput. Phys. 73, pp. 325–348.
 [9] (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM.
 [10] (2004) On quadrature methods for highly oscillatory integrals and their implementation. BIT Numer. Math. 44, pp. 755–772.
 [11] (2005) Efficient quadrature of highly oscillatory integrals using derivatives. Proc. R. Soc. Lond. A 461, pp. 1383–1399.
 [12] (2012) Using multicomplex variables for automatic computation of highorder derivatives. ACM Trans. Math. Softw. 38, pp. 1–21.
 [13] (2022) On the stability of FilonClenshawCurtis rules. Bull. Iran. Math. Soc. 48, pp. 2943–2964.
 [14] (2023) Adaptive FCC+ rules for oscillatory integrals. J. Comput. Appl. Math. 424, 115012.
 [15] (2023) Efficient construction of FCC+ rules. J. Comput. Appl. Math. 417, 114592.
 [16] (2003) The complexstep derivative approximation. ACM Trans. Math. Softw. 29, pp. 245–262.
 [17] (2014) Multicomplex Taylor series expansion for computing high order derivatives. Int. J. Appl. Math. 27, pp. 311–334.
 [18] (2010) Introduction to automatic differentiation and MATLAB objectoriented programming. SIAM Rev. 52, pp. 545–563.
 [19] (2013) An efficient overloaded method for computing derivatives of mathematical functions in MATLAB. ACM Trans. Math. Softw. 39, pp. 1–36.
 [20] (2010) Error bounds for approximation in Chebyshev points. Numer. Math. 116 (3), pp. 463–491.
 [21] (2015) On error bounds of FilonClenshawCurtis quadrature for highly oscillatory integrals. Adv. Comput. Math. 41, pp. 573–597.