
Multivariate error function based neural network approximations

George A. Anastassiou

May 1st, 2014.

Department of Mathematical Sciences, University of Memphis, Memphis, TN 38152, U.S.A., e-mail: ganastss@memphis.edu.

Here we present multivariate quantitative approximations of real and complex valued continuous multivariate functions on a box or on $\mathbb{R}^N$, $N\in\mathbb{N}$, by the multivariate quasi-interpolation, Baskakov type and quadrature type neural network operators. We also treat the case of approximation by iterates of these operators. These approximations are derived by establishing multidimensional Jackson type inequalities involving the multivariate modulus of continuity of the engaged function or of its high order partial derivatives. Our multivariate operators are defined by using a multidimensional density function induced by the Gaussian error special function. The approximations are pointwise and uniform. The related feed-forward neural network has one hidden layer.

MSC. 41A17, 41A25, 41A30, 41A36.

Keywords. error function, multivariate neural network approximation, quasi-interpolation operator, Baskakov type operator, quadrature type operator, multivariate modulus of continuity, complex approximation, iterated approximation.

1 Introduction

The author in [ 2 ] and [ 3 ] , see chapters 2-5, was the first to establish neural network approximations to continuous functions with rates, by very specifically defined neural network operators of Cardaliaguet-Euvrard and "Squashing" types, by employing the modulus of continuity of the engaged function or of its high order derivative, and producing very tight Jackson type inequalities. He treats there both the univariate and multivariate cases. The "bell-shaped" and "squashing" functions defining these operators are assumed to be of compact support. Also in [ 3 ] he gives the $N$th order asymptotic expansion for the error of weak approximation of these two operators to a special natural class of smooth functions, see chapters 4-5 there.

For this article the author is motivated by the article [ 12 ] of Z. Chen and F. Cao, also by [ 4 ] , [ 5 ] , [ 6 ] , [ 7 ] , [ 8 ] , [ 9 ] , [ 10 ] , [ 13 ] , [ 14 ] .

The author here performs multivariate error function based neural network approximations to continuous functions over boxes or over the whole $\mathbb{R}^N$, $N\in\mathbb{N}$, and then extends his results to complex valued multivariate functions. He also treats iterated approximation. All convergences here are with rates expressed via the multivariate modulus of continuity of the involved function or of its high order partial derivative, and are given by very tight multidimensional Jackson type inequalities.

The author here comes up with the "right" precisely defined multivariate quasi-interpolation neural network operators related to boxes or to $\mathbb{R}^N$, as well as Baskakov type and quadrature type related operators on $\mathbb{R}^N$. Our boxes are not necessarily symmetric to the origin. In preparation for proving our results we establish important properties of the basic multivariate density function induced by the error function and defining our operators.

Feed-forward neural networks (FNNs) with one hidden layer, the only type of networks we deal with in this article, are mathematically expressed as

$N_n(x)=\sum_{j=0}^{n}c_j\,\sigma(a_j\cdot x+b_j), \qquad x\in\mathbb{R}^s,\ s\in\mathbb{N},$

where for $0\le j\le n$, $b_j\in\mathbb{R}$ are the thresholds, $a_j\in\mathbb{R}^s$ are the connection weights, $c_j\in\mathbb{R}$ are the coefficients, $a_j\cdot x$ is the inner product of $a_j$ and $x$, and $\sigma$ is the activation function of the network. In many fundamental network models, the activation function is the error function. About neural networks read [ 15 ] , [ 16 ] , [ 17 ] .
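For readers who prefer a computational view, the following minimal Python sketch evaluates a network of the above form: one hidden layer with erf activation. The weights, thresholds and coefficients below are illustrative assumptions, not values prescribed by the paper.

```python
import numpy as np
from scipy.special import erf

def network_output(x, a, b, c):
    """Evaluate N_n(x) = sum_j c_j * erf(<a_j, x> + b_j).

    x : (s,) input vector; a : (n+1, s) connection weights;
    b : (n+1,) thresholds;  c : (n+1,) coefficients.
    """
    return np.sum(c * erf(a @ x + b))

# Illustrative (made-up) parameters for s = 2, n = 3.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 2))
b = rng.normal(size=4)
c = rng.normal(size=4)
print(network_output(np.array([0.5, -1.0]), a, b, c))
```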

2 Basics

We consider here the (Gauss) error special function ( [ 1 ] , [ 11 ] )

$\mathrm{erf}(x)=\frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt, \quad x\in\mathbb{R},$
1

which is a sigmoidal type function and a strictly increasing function.

It has the basic properties

$\mathrm{erf}(0)=0, \quad \mathrm{erf}(-x)=-\mathrm{erf}(x), \quad \mathrm{erf}(+\infty)=1, \quad \mathrm{erf}(-\infty)=-1.$

We consider the activation function ( [ 10 ] )

$\chi(x)=\frac{1}{4}\left(\mathrm{erf}(x+1)-\mathrm{erf}(x-1)\right)>0, \quad x\in\mathbb{R},$
2

which is an even function.

Next we follow [ 10 ] on $\chi$. We have $\chi(0)\simeq 0.4215$, and that $\chi$ is strictly decreasing on $[0,\infty)$ and strictly increasing on $(-\infty,0]$, and the $x$-axis is the horizontal asymptote of $\chi$, i.e. $\chi$ is a bell symmetric function.

Theorem 1

[ 10 ] We have that

$\sum_{i=-\infty}^{\infty}\chi(x-i)=1, \quad \forall x\in\mathbb{R},$
3

$\sum_{i=-\infty}^{\infty}\chi(nx-i)=1, \quad \forall x\in\mathbb{R},\ \forall n\in\mathbb{N},$
4

and

$\int_{-\infty}^{\infty}\chi(x)\,dx=1,$
5

that is, $\chi(x)$ is a density function on $\mathbb{R}$.
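A quick numerical sanity check of (2)-(5), offered as a sketch only and not as part of the proof: truncating the series at $|i|\le 50$ already reproduces the partition of unity and the value $\chi(0)\simeq 0.4215$ to high accuracy.

```python
import numpy as np
from scipy.special import erf
from scipy.integrate import quad

def chi(x):
    """Activation chi(x) = (erf(x+1) - erf(x-1)) / 4, cf. (2)."""
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

print(chi(0.0))                        # ~0.4215
x = 0.37                               # arbitrary test point (an assumption)
i = np.arange(-50, 51)
print(np.sum(chi(x - i)))              # ~1.0, cf. (3)
print(quad(chi, -np.inf, np.inf)[0])   # ~1.0, cf. (5)
```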

We need the important

Theorem 2

[ 10 ] Let $0<\alpha<1$, and $n\in\mathbb{N}$ with $n^{1-\alpha}\ge 3$. It holds

$\sum_{\substack{k=-\infty \\ |nx-k|\ge n^{1-\alpha}}}^{\infty}\chi(nx-k)<\frac{1}{2\sqrt{\pi}\,(n^{1-\alpha}-2)\,e^{(n^{1-\alpha}-2)^2}}.$
6

Denote by $\lfloor\cdot\rfloor$ the integral part of the number and by $\lceil\cdot\rceil$ the ceiling of the number.

Theorem 3

[ 10 ] Let $x\in[a,b]\subset\mathbb{R}$ and $n\in\mathbb{N}$ so that $\lceil na\rceil\le\lfloor nb\rfloor$. It holds

$\frac{1}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\chi(nx-k)}<\frac{1}{\chi(1)}\simeq 4.019, \quad \forall x\in[a,b].$
7

Also from [ 10 ] we get

$\lim_{n\to\infty}\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\chi(nx-k)\ne 1,$
8

for at least some $x\in[a,b]$.

For large enough $n$ we always obtain $\lceil na\rceil\le\lfloor nb\rfloor$. Also $a\le\frac{k}{n}\le b$, iff $\lceil na\rceil\le k\le\lfloor nb\rfloor$. In general it holds, by (4), that

$\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\chi(nx-k)\le 1.$
9

We introduce

$Z(x_1,\dots,x_N):=Z(x):=\prod_{i=1}^{N}\chi(x_i), \quad x=(x_1,\dots,x_N)\in\mathbb{R}^N,\ N\in\mathbb{N}.$
10

It has the properties:

  1. $Z(x)>0$, $\forall x\in\mathbb{R}^N$,

  2. $\sum_{k=-\infty}^{\infty}Z(x-k):=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\dots\sum_{k_N=-\infty}^{\infty}Z(x_1-k_1,\dots,x_N-k_N)=1,$
    11

    where $k:=(k_1,\dots,k_N)\in\mathbb{Z}^N$, $\forall x\in\mathbb{R}^N$,

    hence

  3. $\sum_{k=-\infty}^{\infty}Z(nx-k)=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\dots\sum_{k_N=-\infty}^{\infty}Z(nx_1-k_1,\dots,nx_N-k_N)=1, \quad \forall x\in\mathbb{R}^N;\ n\in\mathbb{N},$
    12

    and

  4. $\int_{\mathbb{R}^N}Z(x)\,dx=1,$
    13

    that is $Z$ is a multivariate density function.

Here $\|x\|_\infty:=\max\{|x_1|,\dots,|x_N|\}$, $x\in\mathbb{R}^N$, also set $\infty:=(\infty,\dots,\infty)$, $-\infty:=(-\infty,\dots,-\infty)$ upon the multivariate context, and

$\lceil na\rceil:=(\lceil na_1\rceil,\dots,\lceil na_N\rceil), \quad \lfloor nb\rfloor:=(\lfloor nb_1\rfloor,\dots,\lfloor nb_N\rfloor),$

where $a:=(a_1,\dots,a_N)$, $b:=(b_1,\dots,b_N)$.
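As a purely numerical illustration (not needed for the theory), the multivariate density $Z$ of (10) can be evaluated as a product of univariate $\chi$ factors, and a truncated lattice sum approximates the normalization (11):

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

def Z(x):
    """Multivariate density Z(x) = prod_i chi(x_i), cf. (10)."""
    return np.prod(chi(np.asarray(x, dtype=float)))

# Truncated check of sum_k Z(x - k) = 1 for N = 2, cf. (11); x is arbitrary.
x = np.array([0.25, -1.3])
ks = np.arange(-30, 31)
total = sum(Z(x - np.array([k1, k2])) for k1 in ks for k2 in ks)
print(total)   # ~1.0
```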

We obviously see that

$\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)=\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right)=\sum_{k_1=\lceil na_1\rceil}^{\lfloor nb_1\rfloor}\dots\sum_{k_N=\lceil na_N\rceil}^{\lfloor nb_N\rfloor}\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right)=\prod_{i=1}^{N}\left(\sum_{k_i=\lceil na_i\rceil}^{\lfloor nb_i\rfloor}\chi(nx_i-k_i)\right).$

For $0<\beta<1$ and $n\in\mathbb{N}$, for a fixed $x\in\mathbb{R}^N$, we have that

$\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)=\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)+\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k).$

In the last two sums the counting is over disjoint vector sets of $k$'s, because the condition $\left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}$ implies that there exists at least one $\left|\frac{k_r}{n}-x_r\right|>\frac{1}{n^\beta}$, where $r\in\{1,\dots,N\}$.

We treat

$\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)=\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right)\le\prod_{\substack{i=1 \\ i\ne r}}^{N}\left(\sum_{k_i=-\infty}^{\infty}\chi(nx_i-k_i)\right)\cdot\sum_{\substack{k_r=\lceil na_r\rceil \\ \left|\frac{k_r}{n}-x_r\right|>\frac{1}{n^\beta}}}^{\lfloor nb_r\rfloor}\chi(nx_r-k_r)\le\sum_{\substack{k_r=-\infty \\ |nx_r-k_r|>n^{1-\beta}}}^{\infty}\chi(nx_r-k_r)<\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$

when $n^{1-\beta}\ge 3$, by Theorem 2.

We have proved that

  1. $\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)\le\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$
    20

    $0<\beta<1$, $n\in\mathbb{N}:n^{1-\beta}\ge 3$, $x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

By Theorem 3 clearly we obtain

$0<\frac{1}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)}=\frac{1}{\prod_{i=1}^{N}\left(\sum_{k_i=\lceil na_i\rceil}^{\lfloor nb_i\rfloor}\chi(nx_i-k_i)\right)}<\frac{1}{(\chi(1))^N}\simeq(4.019)^N.$

That is,

  1. it holds

    $0<\frac{1}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)}<\frac{1}{(\chi(1))^N}\simeq(4.019)^N, \quad \forall x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right),\ n\in\mathbb{N}.$
    22

It is also clear that

  1. $\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}Z(nx-k)\le\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$
    23

    $0<\beta<1$, $n\in\mathbb{N}:n^{1-\beta}\ge 3$, $x\in\mathbb{R}^N$.

Also we get that

$\lim_{n\to\infty}\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)\ne 1,$
24

for at least some $x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

Let $f\in C\left(\prod_{i=1}^{N}[a_i,b_i]\right)$ and $n\in\mathbb{N}$ such that $\lceil na_i\rceil\le\lfloor nb_i\rfloor$, $i=1,\dots,N$.

We introduce and define the multivariate positive linear neural network operator ($x:=(x_1,\dots,x_N)\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$)

$A_n(f,x_1,\dots,x_N):=A_n(f,x):=\frac{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}f\left(\frac{k}{n}\right)Z(nx-k)}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)}:=\frac{\sum_{k_1=\lceil na_1\rceil}^{\lfloor nb_1\rfloor}\sum_{k_2=\lceil na_2\rceil}^{\lfloor nb_2\rfloor}\dots\sum_{k_N=\lceil na_N\rceil}^{\lfloor nb_N\rfloor}f\left(\frac{k_1}{n},\dots,\frac{k_N}{n}\right)\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right)}{\prod_{i=1}^{N}\left(\sum_{k_i=\lceil na_i\rceil}^{\lfloor nb_i\rfloor}\chi(nx_i-k_i)\right)}.$
25

For large enough $n$ we always obtain $\lceil na_i\rceil\le\lfloor nb_i\rfloor$, $i=1,\dots,N$. Also $a_i\le\frac{k_i}{n}\le b_i$, iff $\lceil na_i\rceil\le k_i\le\lfloor nb_i\rfloor$, $i=1,\dots,N$.

For convenience we call

$A_n^{*}(f,x):=\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}f\left(\frac{k}{n}\right)Z(nx-k):=\sum_{k_1=\lceil na_1\rceil}^{\lfloor nb_1\rfloor}\sum_{k_2=\lceil na_2\rceil}^{\lfloor nb_2\rfloor}\dots\sum_{k_N=\lceil na_N\rceil}^{\lfloor nb_N\rfloor}f\left(\frac{k_1}{n},\dots,\frac{k_N}{n}\right)\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right),$
26

$x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

That is

$A_n(f,x):=\frac{A_n^{*}(f,x)}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)},$
27

$x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $n\in\mathbb{N}$.
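The next sketch evaluates the normalized operator $A_n$ of (25) in one variable ($N=1$); the choice of $f$, $[a,b]$ and $n$ is purely illustrative.

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

def A_n(f, x, n, a, b):
    """A_n(f, x) = sum_k f(k/n) chi(nx - k) / sum_k chi(nx - k),
    with k running from ceil(na) to floor(nb), cf. (25) for N = 1."""
    k = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    w = chi(n * x - k)
    return np.dot(f(k / n), w) / np.sum(w)

f = lambda t: np.sin(3 * t)          # illustrative continuous f on [a, b]
print(A_n(f, 0.4, 100, -1.0, 1.0), f(0.4))
```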

Hence

$A_n(f,x)-f(x)=\frac{A_n^{*}(f,x)-f(x)\left(\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)\right)}{\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)}.$
28

Consequently we derive

$|A_n(f,x)-f(x)|\le(4.019)^N\left|A_n^{*}(f,x)-f(x)\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)\right|,$
29

$x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

We will estimate the right hand side of (29).

For the last we need, for $f\in C\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, the first multivariate modulus of continuity

$\omega_1(f,h):=\sup_{\substack{x,y\in\prod_{i=1}^{N}[a_i,b_i] \\ \|x-y\|_\infty\le h}}|f(x)-f(y)|, \quad h>0.$
30

It holds that

$\lim_{h\to 0}\omega_1(f,h)=0.$
31

Similarly it is defined for $f\in C_B(\mathbb{R}^N)$ (continuous and bounded functions on $\mathbb{R}^N$) the $\omega_1(f,h)$, and it has the property (31), given that $f\in C_U(\mathbb{R}^N)$ (uniformly continuous functions on $\mathbb{R}^N$).

When $f\in C_B(\mathbb{R}^N)$ we define

$B_n(f,x):=B_n(f,x_1,\dots,x_N):=\sum_{k=-\infty}^{\infty}f\left(\frac{k}{n}\right)Z(nx-k):=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\dots\sum_{k_N=-\infty}^{\infty}f\left(\frac{k_1}{n},\frac{k_2}{n},\dots,\frac{k_N}{n}\right)\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right),$
32

$n\in\mathbb{N}$, $\forall x\in\mathbb{R}^N$, $N\in\mathbb{N}$, the multivariate quasi-interpolation neural network operator.

Also for $f\in C_B(\mathbb{R}^N)$ we define the multivariate Kantorovich type neural network operator

$C_n(f,x):=C_n(f,x_1,\dots,x_N):=\sum_{k=-\infty}^{\infty}\left(n^N\int_{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\,dt\right)Z(nx-k):=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\dots\sum_{k_N=-\infty}^{\infty}\left(n^N\int_{\frac{k_1}{n}}^{\frac{k_1+1}{n}}\int_{\frac{k_2}{n}}^{\frac{k_2+1}{n}}\dots\int_{\frac{k_N}{n}}^{\frac{k_N+1}{n}}f(t_1,\dots,t_N)\,dt_1\dots dt_N\right)\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right),$
33

$n\in\mathbb{N}$, $\forall x\in\mathbb{R}^N$.

Again for $f\in C_B(\mathbb{R}^N)$, $N\in\mathbb{N}$, we define the multivariate neural network operator of quadrature type $D_n(f,x)$, $n\in\mathbb{N}$, as follows. Let $\theta=(\theta_1,\dots,\theta_N)\in\mathbb{N}^N$, $r=(r_1,\dots,r_N)\in\mathbb{Z}_+^N$, $w_r=w_{r_1,r_2,\dots,r_N}\ge 0$, such that $\sum_{r=0}^{\theta}w_r=\sum_{r_1=0}^{\theta_1}\sum_{r_2=0}^{\theta_2}\dots\sum_{r_N=0}^{\theta_N}w_{r_1,r_2,\dots,r_N}=1$; $k\in\mathbb{Z}^N$ and

$\delta_{nk}(f):=\delta_{n,k_1,k_2,\dots,k_N}(f):=\sum_{r=0}^{\theta}w_r f\left(\frac{k}{n}+\frac{r}{n\theta}\right):=\sum_{r_1=0}^{\theta_1}\sum_{r_2=0}^{\theta_2}\dots\sum_{r_N=0}^{\theta_N}w_{r_1,r_2,\dots,r_N}f\left(\frac{k_1}{n}+\frac{r_1}{n\theta_1},\frac{k_2}{n}+\frac{r_2}{n\theta_2},\dots,\frac{k_N}{n}+\frac{r_N}{n\theta_N}\right),$
34

where $\frac{r}{\theta}:=\left(\frac{r_1}{\theta_1},\frac{r_2}{\theta_2},\dots,\frac{r_N}{\theta_N}\right)$.

We put

$D_n(f,x):=D_n(f,x_1,\dots,x_N):=\sum_{k=-\infty}^{\infty}\delta_{nk}(f)Z(nx-k):=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}\dots\sum_{k_N=-\infty}^{\infty}\delta_{n,k_1,k_2,\dots,k_N}(f)\left(\prod_{i=1}^{N}\chi(nx_i-k_i)\right),$
35

$\forall x\in\mathbb{R}^N$.
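To make the quadrature operator concrete, here is a one-dimensional sketch of $\delta_{nk}$ and $D_n$. The weights $w_r=\frac{1}{\theta+1}$ below are one simple admissible choice, and the series over $k$ is truncated; both are assumptions made only for the illustration. Replacing $\delta_{nk}(f)$ by $f\left(\frac{k}{n}\right)$ in the same sum gives $B_n$.

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

def delta_nk(f, n, k, theta, w):
    """delta_nk(f) = sum_{r=0}^{theta} w_r f(k/n + r/(n*theta)), cf. (34) with N = 1."""
    r = np.arange(theta + 1)
    return np.dot(w, f(k / n + r / (n * theta)))

def D_n(f, x, n, theta, K=200):
    """Truncated D_n(f, x) = sum_k delta_nk(f) chi(nx - k), keeping |k - nx| <= K."""
    ks = np.arange(int(np.floor(n * x)) - K, int(np.floor(n * x)) + K + 1)
    w = np.full(theta + 1, 1.0 / (theta + 1))   # admissible weights summing to 1 (assumption)
    return sum(delta_nk(f, n, k, theta, w) * chi(n * x - k) for k in ks)

f = lambda t: np.exp(-t ** 2)                   # illustrative bounded continuous f
print(D_n(f, 0.7, 50, theta=3), f(0.7))
```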

Let $j\in\mathbb{N}$ be fixed, $0<\beta<1$, and $A,B>0$. For large enough $n\in\mathbb{N}:n^{1-\beta}\ge 3$, in the linear combination $\left(\frac{A}{n^{\beta j}}+\frac{B}{(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right)$ the dominant rate of convergence, as $n\to\infty$, is $n^{-\beta j}$. The closer $\beta$ is to $1$, the faster and better the rate of convergence to zero.

Let $f\in C^m\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $m,N\in\mathbb{N}$. Here $f_\alpha$ denotes a partial derivative of $f$, $\alpha:=(\alpha_1,\dots,\alpha_N)$, $\alpha_i\in\mathbb{Z}_+$, $i=1,\dots,N$, and $|\alpha|:=\sum_{i=1}^{N}\alpha_i=l$, where $l=0,1,\dots,m$. We write also $f_\alpha:=\frac{\partial^\alpha f}{\partial x^\alpha}$ and we say it is of order $l$.

We denote

$\omega_{1,m}^{\max}(f_\alpha,h):=\max_{\alpha:|\alpha|=m}\omega_1(f_\alpha,h).$
36

Call also

$\|f_\alpha\|_{\infty,m}^{\max}:=\max_{|\alpha|=m}\left\{\|f_\alpha\|_\infty\right\},$
37

where $\|\cdot\|_\infty$ is the supremum norm.

In this article we study the basic approximation properties of the $A_n,B_n,C_n,D_n$ neural network operators, as well as of their iterates. That is, we study the quantitative pointwise and uniform convergence of these operators to the unit operator $I$. We also study the related approximation of complex valued functions.

3 Multidimensional Real Neural Network Approximations

Here we present a series of neural network approximations to a function given with rates.

We give

Theorem 4

Let $f\in C\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $0<\beta<1$, $x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $N,n\in\mathbb{N}$ with $n^{1-\beta}\ge 3$. Then

1)

$|A_n(f,x)-f(x)|\le(4.019)^N\left[\omega_1\left(f,\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]=:\lambda_1,$
38

and

2)

$\|A_n(f)-f\|_\infty\le\lambda_1.$
39

We notice that $\lim_{n\to\infty}A_n(f)=f$, pointwise and uniformly.
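A small numerical experiment, offered only as an illustration (the function and interval are arbitrary choices), is consistent with Theorem 4: the uniform error of $A_n$ shrinks as $n$ grows.

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

def A_n(f, xs, n, a, b):
    """Vectorized A_n(f, x) over a grid of points xs, N = 1, cf. (25)."""
    k = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    w = chi(n * xs[:, None] - k[None, :])
    return (w @ f(k / n)) / w.sum(axis=1)

f = lambda t: np.abs(np.sin(4 * t))        # continuous but not smooth
a, b = -1.0, 1.0
xs = np.linspace(a, b, 401)
for n in (10, 50, 250, 1250):
    err = np.max(np.abs(A_n(f, xs, n, a, b) - f(xs)))
    print(n, err)                          # errors decrease with n, as Theorem 4 predicts
```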

Proof. We observe that

$\Delta(x):=A_n^{*}(f,x)-f(x)\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)=\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}f\left(\frac{k}{n}\right)Z(nx-k)-\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}f(x)Z(nx-k)=\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\left(f\left(\frac{k}{n}\right)-f(x)\right)Z(nx-k).$

Thus

$|\Delta(x)|\le\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)=\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)+\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)\le\omega_1\left(f,\frac{1}{n^\beta}\right)+2\|f\|_\infty\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)\overset{(20)}{\le}\omega_1\left(f,\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}.$

So that

$|\Delta(x)|\le\omega_1\left(f,\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}.$

Now using (29) we finish the proof.

□

We continue with

Theorem 5

Let $f\in C_B(\mathbb{R}^N)$, $0<\beta<1$, $x\in\mathbb{R}^N$, $N,n\in\mathbb{N}$ with $n^{1-\beta}\ge 3$. Then

1)

$|B_n(f,x)-f(x)|\le\omega_1\left(f,\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}=:\lambda_2,$
42

2)

$\|B_n(f)-f\|_\infty\le\lambda_2.$
43

Given that $f\in\left(C_U(\mathbb{R}^N)\cap C_B(\mathbb{R}^N)\right)$, we obtain $\lim_{n\to\infty}B_n(f)=f$, uniformly.

Proof. We have that

$B_n(f,x)-f(x)\overset{(12)}{=}\sum_{k=-\infty}^{\infty}f\left(\frac{k}{n}\right)Z(nx-k)-f(x)\sum_{k=-\infty}^{\infty}Z(nx-k)=\sum_{k=-\infty}^{\infty}\left(f\left(\frac{k}{n}\right)-f(x)\right)Z(nx-k).$

Hence

$|B_n(f,x)-f(x)|\le\sum_{k=-\infty}^{\infty}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)=\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\infty}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)+\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}\left|f\left(\frac{k}{n}\right)-f(x)\right|Z(nx-k)\le\omega_1\left(f,\frac{1}{n^\beta}\right)+2\|f\|_\infty\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}Z(nx-k)\overset{(23)}{\le}\omega_1\left(f,\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$

proving the claim.

□

We give

Theorem 6

Let $f\in C_B(\mathbb{R}^N)$, $0<\beta<1$, $x\in\mathbb{R}^N$, $N,n\in\mathbb{N}$ with $n^{1-\beta}\ge 3$. Then

1)

$|C_n(f,x)-f(x)|\le\omega_1\left(f,\frac{1}{n}+\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}=:\lambda_3,$
46

2)

$\|C_n(f)-f\|_\infty\le\lambda_3.$
47

Given that $f\in\left(C_U(\mathbb{R}^N)\cap C_B(\mathbb{R}^N)\right)$, we obtain $\lim_{n\to\infty}C_n(f)=f$, uniformly.

Proof. We notice that

$\int_{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\,dt=\int_{\frac{k_1}{n}}^{\frac{k_1+1}{n}}\int_{\frac{k_2}{n}}^{\frac{k_2+1}{n}}\dots\int_{\frac{k_N}{n}}^{\frac{k_N+1}{n}}f(t_1,t_2,\dots,t_N)\,dt_1dt_2\dots dt_N=\int_{0}^{\frac{1}{n}}\int_{0}^{\frac{1}{n}}\dots\int_{0}^{\frac{1}{n}}f\left(t_1+\frac{k_1}{n},t_2+\frac{k_2}{n},\dots,t_N+\frac{k_N}{n}\right)dt_1\dots dt_N=\int_{0}^{\frac{1}{n}}f\left(t+\frac{k}{n}\right)dt.$
48

Thus it holds

$C_n(f,x)=\sum_{k=-\infty}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}f\left(t+\frac{k}{n}\right)dt\right)Z(nx-k).$
49

We observe that

$|C_n(f,x)-f(x)|=\left|\sum_{k=-\infty}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}f\left(t+\frac{k}{n}\right)dt\right)Z(nx-k)-\sum_{k=-\infty}^{\infty}f(x)Z(nx-k)\right|=\left|\sum_{k=-\infty}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}\left(f\left(t+\frac{k}{n}\right)-f(x)\right)dt\right)Z(nx-k)\right|\le\sum_{k=-\infty}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}\left|f\left(t+\frac{k}{n}\right)-f(x)\right|dt\right)Z(nx-k)=\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}\left|f\left(t+\frac{k}{n}\right)-f(x)\right|dt\right)Z(nx-k)+\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}\left|f\left(t+\frac{k}{n}\right)-f(x)\right|dt\right)Z(nx-k)\le\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\infty}\left(n^N\int_{0}^{\frac{1}{n}}\omega_1\left(f,\|t\|_\infty+\left\|\frac{k}{n}-x\right\|_\infty\right)dt\right)Z(nx-k)+2\|f\|_\infty\left(\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}Z(nx-k)\right)\le\omega_1\left(f,\frac{1}{n}+\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$

proving the claim.

□

We also present

Theorem 7

Let $f\in C_B(\mathbb{R}^N)$, $0<\beta<1$, $x\in\mathbb{R}^N$, $N,n\in\mathbb{N}$ with $n^{1-\beta}\ge 3$. Then

1)

$|D_n(f,x)-f(x)|\le\omega_1\left(f,\frac{1}{n}+\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}=\lambda_3,$
52

2)

$\|D_n(f)-f\|_\infty\le\lambda_3.$
53

Given that $f\in\left(C_U(\mathbb{R}^N)\cap C_B(\mathbb{R}^N)\right)$, we obtain $\lim_{n\to\infty}D_n(f)=f$, uniformly.

Proof. We have that

$|D_n(f,x)-f(x)|=\left|\sum_{k=-\infty}^{\infty}\delta_{nk}(f)Z(nx-k)-\sum_{k=-\infty}^{\infty}f(x)Z(nx-k)\right|=\left|\sum_{k=-\infty}^{\infty}\left(\delta_{nk}(f)-f(x)\right)Z(nx-k)\right|=\left|\sum_{k=-\infty}^{\infty}\left(\sum_{r=0}^{\theta}w_r\left(f\left(\frac{k}{n}+\frac{r}{n\theta}\right)-f(x)\right)\right)Z(nx-k)\right|\le\sum_{k=-\infty}^{\infty}\left(\sum_{r=0}^{\theta}w_r\left|f\left(\frac{k}{n}+\frac{r}{n\theta}\right)-f(x)\right|\right)Z(nx-k)=\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\infty}\left(\sum_{r=0}^{\theta}w_r\left|f\left(\frac{k}{n}+\frac{r}{n\theta}\right)-f(x)\right|\right)Z(nx-k)+\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}\left(\sum_{r=0}^{\theta}w_r\left|f\left(\frac{k}{n}+\frac{r}{n\theta}\right)-f(x)\right|\right)Z(nx-k)\le\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\infty}\left(\sum_{r=0}^{\theta}w_r\,\omega_1\left(f,\left\|\frac{k}{n}-x\right\|_\infty+\left\|\frac{r}{n\theta}\right\|_\infty\right)\right)Z(nx-k)+2\|f\|_\infty\left(\sum_{\substack{k=-\infty \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\infty}Z(nx-k)\right)\le\omega_1\left(f,\frac{1}{n}+\frac{1}{n^\beta}\right)+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$

proving the claim.

□

In the next we discuss high order of approximation by using the smoothness of f.

We give

Theorem 8

Let $f\in C^m\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $0<\beta<1$, $n,m,N\in\mathbb{N}$, $n^{1-\beta}\ge 3$, $x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$. Then

i)

$\left|A_n(f,x)-f(x)-\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)A_n\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right)\right|\le(4.019)^N\left\{\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\left(\frac{\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

ii)

$|A_n(f,x)-f(x)|\le(4.019)^N\left\{\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{|f_\alpha(x)|}{\prod_{i=1}^{N}\alpha_i!}\right)\left[\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]\right)+\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\left(\frac{\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

iii)

$\|A_n(f)-f\|_\infty\le(4.019)^N\left\{\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{\|f_\alpha\|_\infty}{\prod_{i=1}^{N}\alpha_i!}\right)\left[\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]\right)+\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\left(\frac{\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\}=:K_n,$

iv) Assume $f_\alpha(x_0)=0$, for all $\alpha:|\alpha|=1,\dots,m$; $x_0\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$. Then

$|A_n(f,x_0)-f(x_0)|\le(4.019)^N\left\{\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\left(\frac{\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

notice in the last the extremely high rate of convergence, $n^{-\beta(m+1)}$.

Proof. Consider $g_z(t):=f(x_0+t(z-x_0))$, $t\ge 0$; $x_0,z\in\prod_{i=1}^{N}[a_i,b_i]$.

Then

$g_z^{(j)}(t)=\left[\left(\sum_{i=1}^{N}(z_i-x_{0i})\frac{\partial}{\partial x_i}\right)^{j}f\right](x_{01}+t(z_1-x_{01}),\dots,x_{0N}+t(z_N-x_{0N})),$
60

for all $j=0,1,\dots,m$.

We have the multivariate Taylor’s formula

$f(z_1,\dots,z_N)=g_z(1)=\sum_{j=0}^{m}\frac{g_z^{(j)}(0)}{j!}+\frac{1}{(m-1)!}\int_0^1(1-\theta)^{m-1}\left(g_z^{(m)}(\theta)-g_z^{(m)}(0)\right)d\theta.$
61

Notice $g_z(0)=f(x_0)$. Also for $j=0,1,\dots,m$, we have

$g_z^{(j)}(0)=\sum_{\substack{\alpha:=(\alpha_1,\dots,\alpha_N),\ \alpha_i\in\mathbb{Z}_+, \\ i=1,\dots,N,\ |\alpha|:=\sum_{i=1}^{N}\alpha_i=j}}\left(\frac{j!}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}(z_i-x_{0i})^{\alpha_i}\right)f_\alpha(x_0).$
62

Furthermore

$g_z^{(m)}(\theta)=\sum_{\substack{\alpha:=(\alpha_1,\dots,\alpha_N),\ \alpha_i\in\mathbb{Z}_+, \\ i=1,\dots,N,\ |\alpha|:=\sum_{i=1}^{N}\alpha_i=m}}\left(\frac{m!}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}(z_i-x_{0i})^{\alpha_i}\right)f_\alpha(x_0+\theta(z-x_0)),$
63

$0\le\theta\le 1$.

So we treat $f\in C^m\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

Thus, we have for $\frac{k}{n},x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$ that

$f\left(\frac{k_1}{n},\dots,\frac{k_N}{n}\right)=g_{\frac{k}{n}}(1)=\sum_{j=0}^{m}\left(\sum_{\alpha:|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)\prod_{i=1}^{N}\left(\frac{k_i}{n}-x_i\right)^{\alpha_i}\right)+R,$

where

$R:=m\int_0^1(1-\theta)^{m-1}\sum_{\substack{\alpha:=(\alpha_1,\dots,\alpha_N),\ \alpha_i\in\mathbb{Z}_+, \\ i=1,\dots,N,\ |\alpha|:=\sum_{i=1}^{N}\alpha_i=m}}\left(\frac{1}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}\left(\frac{k_i}{n}-x_i\right)^{\alpha_i}\right)\left[f_\alpha\left(x+\theta\left(\frac{k}{n}-x\right)\right)-f_\alpha(x)\right]d\theta.$
66

We see that

$|R|\le m\int_0^1(1-\theta)^{m-1}\sum_{|\alpha|=m}\left(\frac{1}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}\left|\frac{k_i}{n}-x_i\right|^{\alpha_i}\right)\left|f_\alpha\left(x+\theta\left(\frac{k}{n}-x\right)\right)-f_\alpha(x)\right|d\theta\le m\int_0^1(1-\theta)^{m-1}\left(\sum_{|\alpha|=m}\left(\frac{1}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}\left|\frac{k_i}{n}-x_i\right|^{\alpha_i}\right)\omega_1\left(f_\alpha,\theta\left\|\frac{k}{n}-x\right\|_\infty\right)\right)d\theta=:(*).$

Notice here that

$\left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}\ \Rightarrow\ \left|\frac{k_i}{n}-x_i\right|\le\frac{1}{n^\beta}, \quad i=1,\dots,N.$
68

We further see that

$(*)\le m\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)\int_0^1(1-\theta)^{m-1}\left(\sum_{|\alpha|=m}\left(\frac{1}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}\left(\frac{1}{n^\beta}\right)^{\alpha_i}\right)\right)d\theta=\left(\frac{\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)}{m!\,n^{m\beta}}\right)\left(\sum_{|\alpha|=m}\frac{m!}{\prod_{i=1}^{N}\alpha_i!}\right)=\left(\frac{\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)}{m!\,n^{m\beta}}\right)N^m.$

Conclusion: When $\left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}$, we proved that

$|R|\le\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right).$
70

In general we notice that

$|R|\le m\int_0^1(1-\theta)^{m-1}\left(\sum_{|\alpha|=m}\left(\frac{1}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)2\|f_\alpha\|_\infty\right)d\theta=2\sum_{|\alpha|=m}\frac{1}{\prod_{i=1}^{N}\alpha_i!}\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\|f_\alpha\|_\infty\le\left(\frac{2\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}}{m!}\right)\left(\sum_{|\alpha|=m}\frac{m!}{\prod_{i=1}^{N}\alpha_i!}\right)=\frac{2\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}.$

We proved in general that

$|R|\le\frac{2\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}=:\rho.$
72

Next we see that

$U_n:=\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)R=\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)R+\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)R.$

Consequently

$|U_n|\le\left(\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}Z(nx-k)\right)\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\rho\,\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\le\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\rho\,\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}.$

We have established that

$|U_n|\le\frac{N^m}{m!\,n^{m\beta}}\,\omega_{1,m}^{\max}\left(f_\alpha,\frac{1}{n^\beta}\right)+\left(\frac{\|b-a\|_\infty^m\,\|f_\alpha\|_{\infty,m}^{\max}\,N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}.$

We observe that

$\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}f\left(\frac{k}{n}\right)Z(nx-k)-f(x)\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)=\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)\left(\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)\left(\prod_{i=1}^{N}\left(\frac{k_i}{n}-x_i\right)^{\alpha_i}\right)\right)\right)+\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)R.$

The last says

$A_n^{*}(f,x)-f(x)\left(\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)\right)-\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)A_n^{*}\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right)=U_n.$

Clearly An is a positive linear operator.

Thus (here $\alpha_i\in\mathbb{Z}_+$: $|\alpha|=\sum_{i=1}^{N}\alpha_i=j$)

$\left|A_n^{*}\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right|\le A_n^{*}\left(\prod_{i=1}^{N}|\cdot_i-x_i|^{\alpha_i},x\right)=\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty\le\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}\left(\prod_{i=1}^{N}\left|\frac{k_i}{n}-x_i\right|^{\alpha_i}\right)Z(nx-k)+\sum_{\substack{k=\lceil na\rceil \\ \left\|\frac{k}{n}-x\right\|_\infty>\frac{1}{n^\beta}}}^{\lfloor nb\rfloor}\left(\prod_{i=1}^{N}\left|\frac{k_i}{n}-x_i\right|^{\alpha_i}\right)Z(nx-k)\le\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}.$

So we have proved that

$\left|A_n^{*}\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right|\le\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}},$
81

for all $j=1,\dots,m$.

At last we observe

$\left|A_n(f,x)-f(x)-\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)A_n\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right)\right|\le(4.019)^N\left|A_n^{*}(f,x)-f(x)\sum_{k=\lceil na\rceil}^{\lfloor nb\rfloor}Z(nx-k)-\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)A_n^{*}\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right)\right|.$

Putting all of the above together we prove the theorem.

□

We make

Definition 9

Let $f\in C_B(\mathbb{R}^N)$, $N\in\mathbb{N}$. We define the general neural network operator

$F_n(f,x):=\sum_{k=-\infty}^{\infty}l_{nk}(f)\,Z(nx-k)=\begin{cases}B_n(f,x), & \text{if } l_{nk}(f)=f\left(\frac{k}{n}\right),\\ C_n(f,x), & \text{if } l_{nk}(f)=n^N\int_{\frac{k}{n}}^{\frac{k+1}{n}}f(t)\,dt,\\ D_n(f,x), & \text{if } l_{nk}(f)=\delta_{nk}(f).\end{cases}$

Clearly $l_{nk}(f)$ is a positive linear functional such that $|l_{nk}(f)|\le\|f\|_\infty$.

Hence $F_n(f)$ is a positive linear operator with $\|F_n(f)\|_\infty\le\|f\|_\infty$, a continuous bounded linear operator.

We need

Theorem 10

Let $f\in C_B(\mathbb{R}^N)$, $N\ge 1$. Then $F_n(f)\in C_B(\mathbb{R}^N)$.

Proof. Clearly $F_n(f)$ is a bounded function.

Next we prove the continuity of $F_n(f)$. Notice that for $N=1$, $Z=\chi$ by (10).

We will use the Weierstrass $M$ test: If a sequence of positive constants $M_1,M_2,M_3,\dots$ can be found such that in some interval

(a) $|u_n(x)|\le M_n$, $n=1,2,3,\dots$

(b) $\sum M_n$ converges,

then $\sum u_n(x)$ is uniformly and absolutely convergent in the interval.

Also we will use:

If $\{u_n(x)\}$, $n=1,2,3,\dots$ are continuous in $[a,b]$ and if $\sum u_n(x)$ converges uniformly to the sum $S(x)$ in $[a,b]$, then $S(x)$ is continuous in $[a,b]$. I.e. a uniformly convergent series of continuous functions is a continuous function. First we prove the claim for $N=1$.

We will prove that $\sum_{k=-\infty}^{\infty}l_{nk}(f)\chi(nx-k)$ is continuous in $x\in\mathbb{R}$.

There always exists $\lambda\in\mathbb{N}$ such that $nx\in[-\lambda,\lambda]$.

Since $nx\le\lambda$, then $-nx\ge-\lambda$ and $k-nx\ge k-\lambda\ge 0$, when $k\ge\lambda$. Therefore

$\sum_{k=\lambda}^{\infty}\chi(nx-k)=\sum_{k=\lambda}^{\infty}\chi(k-nx)\le\sum_{k=\lambda}^{\infty}\chi(k-\lambda)=\sum_{k'=0}^{\infty}\chi(k')\le 1.$
87

So for $k\ge\lambda$ we get

$|l_{nk}(f)|\chi(nx-k)\le\|f\|_\infty\,\chi(k-\lambda),$

and

$\|f\|_\infty\sum_{k=\lambda}^{\infty}\chi(k-\lambda)\le\|f\|_\infty.$

Hence by the Weierstrass $M$ test we obtain that $\sum_{k=\lambda}^{\infty}l_{nk}(f)\chi(nx-k)$ is uniformly and absolutely convergent on $\left[-\frac{\lambda}{n},\frac{\lambda}{n}\right]$.

Since $l_{nk}(f)\chi(nx-k)$ is continuous in $x$, then $\sum_{k=\lambda}^{\infty}l_{nk}(f)\chi(nx-k)$ is continuous on $\left[-\frac{\lambda}{n},\frac{\lambda}{n}\right]$.

Because $nx\ge-\lambda$, then $-nx\le\lambda$, and $k-nx\le k+\lambda\le 0$, when $k\le-\lambda$. Therefore

$\sum_{k=-\infty}^{-\lambda}\chi(nx-k)=\sum_{k=-\infty}^{-\lambda}\chi(k-nx)\le\sum_{k=-\infty}^{-\lambda}\chi(k+\lambda)=\sum_{k'=0}^{\infty}\chi(k')\le 1.$

So for $k\le-\lambda$ we get

$|l_{nk}(f)|\chi(nx-k)\le\|f\|_\infty\,\chi(k+\lambda),$
88

and

$\|f\|_\infty\sum_{k=-\infty}^{-\lambda}\chi(k+\lambda)\le\|f\|_\infty.$

Hence by the Weierstrass $M$ test we obtain that $\sum_{k=-\infty}^{-\lambda}l_{nk}(f)\chi(nx-k)$ is uniformly and absolutely convergent on $\left[-\frac{\lambda}{n},\frac{\lambda}{n}\right]$.

Since $l_{nk}(f)\chi(nx-k)$ is continuous in $x$, then $\sum_{k=-\infty}^{-\lambda}l_{nk}(f)\chi(nx-k)$ is continuous on $\left[-\frac{\lambda}{n},\frac{\lambda}{n}\right]$.

So we proved that $\sum_{k=\lambda}^{\infty}l_{nk}(f)\chi(nx-k)$ and $\sum_{k=-\infty}^{-\lambda}l_{nk}(f)\chi(nx-k)$ are continuous on $\mathbb{R}$. Since $\sum_{k=-\lambda+1}^{\lambda-1}l_{nk}(f)\chi(nx-k)$ is a finite sum of continuous functions on $\mathbb{R}$, it is also a continuous function on $\mathbb{R}$.

Writing

$\sum_{k=-\infty}^{\infty}l_{nk}(f)\chi(nx-k)=\sum_{k=-\infty}^{-\lambda}l_{nk}(f)\chi(nx-k)+\sum_{k=-\lambda+1}^{\lambda-1}l_{nk}(f)\chi(nx-k)+\sum_{k=\lambda}^{\infty}l_{nk}(f)\chi(nx-k),$

we have it as a continuous function on $\mathbb{R}$. Therefore $F_n(f)$, when $N=1$, is a continuous function on $\mathbb{R}$.

When N=2 we have

$F_n(f,x_1,x_2)=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{\infty}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)=\sum_{k_1=-\infty}^{\infty}\chi(nx_1-k_1)\left(\sum_{k_2=-\infty}^{\infty}l_{nk}(f)\chi(nx_2-k_2)\right)$

(there always exist $\lambda_1,\lambda_2\in\mathbb{N}$ such that $nx_1\in[-\lambda_1,\lambda_1]$ and $nx_2\in[-\lambda_2,\lambda_2]$)

$=\sum_{k_1=-\infty}^{\infty}\chi(nx_1-k_1)\left[\sum_{k_2=-\infty}^{-\lambda_2}l_{nk}(f)\chi(nx_2-k_2)+\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}l_{nk}(f)\chi(nx_2-k_2)+\sum_{k_2=\lambda_2}^{\infty}l_{nk}(f)\chi(nx_2-k_2)\right]=\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\infty}^{-\lambda_2}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)+\sum_{k_1=-\infty}^{\infty}\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)+\sum_{k_1=-\infty}^{\infty}\sum_{k_2=\lambda_2}^{\infty}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)=:(*).$


(For convenience call

$F(k_1,k_2,x_1,x_2):=l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)$.)

Thus

$(*)=\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=\lambda_2}^{\infty}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}\sum_{k_2=\lambda_2}^{\infty}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\infty}^{-\lambda_1}\sum_{k_2=\lambda_2}^{\infty}F(k_1,k_2,x_1,x_2)+\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\infty}^{-\lambda_1}\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}F(k_1,k_2,x_1,x_2)+\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=-\infty}^{-\lambda_2}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}\sum_{k_2=-\infty}^{-\lambda_2}F(k_1,k_2,x_1,x_2)+\sum_{k_1=-\infty}^{-\lambda_1}\sum_{k_2=-\infty}^{-\lambda_2}F(k_1,k_2,x_1,x_2).$

Notice that the finite sum of continuous functions F(k1,k2,x1,x2),

$\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}\sum_{k_2=-\lambda_2+1}^{\lambda_2-1}F(k_1,k_2,x_1,x_2)$

is a continuous function.

The rest of the summands of Fn(f,x1,x2) are treated all the same way and similarly to the case of N=1. The method is demonstrated as follows.

We will prove that

$\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=-\infty}^{-\lambda_2}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)$

is continuous in $(x_1,x_2)\in\mathbb{R}^2$.

Each summand is a continuous function, and

$|l_{nk}(f)|\chi(nx_1-k_1)\chi(nx_2-k_2)\le\|f\|_\infty\,\chi(k_1-\lambda_1)\chi(k_2+\lambda_2),$

and

$\|f\|_\infty\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=-\infty}^{-\lambda_2}\chi(k_1-\lambda_1)\chi(k_2+\lambda_2)=\|f\|_\infty\left(\sum_{k_1=\lambda_1}^{\infty}\chi(k_1-\lambda_1)\right)\left(\sum_{k_2=-\infty}^{-\lambda_2}\chi(k_2+\lambda_2)\right)\le\|f\|_\infty\left(\sum_{k_1=0}^{\infty}\chi(k_1)\right)\left(\sum_{k_2=0}^{\infty}\chi(k_2)\right)\le\|f\|_\infty.$

So by the Weierstrass M test we get that

$\sum_{k_1=\lambda_1}^{\infty}\sum_{k_2=-\infty}^{-\lambda_2}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2)$

is uniformly and absolutely convergent. Therefore it is continuous on $\mathbb{R}^2$.

Next we prove continuity on $\mathbb{R}^2$ of

$\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}\sum_{k_2=-\infty}^{-\lambda_2}l_{nk}(f)\chi(nx_1-k_1)\chi(nx_2-k_2).$

Notice here that

$|l_{nk}(f)|\chi(nx_1-k_1)\chi(nx_2-k_2)\le\|f\|_\infty\,\chi(nx_1-k_1)\chi(k_2+\lambda_2)\le\|f\|_\infty\,\chi(0)\chi(k_2+\lambda_2)\simeq 0.4215\,\|f\|_\infty\,\chi(k_2+\lambda_2),$

and

$0.4215\,\|f\|_\infty\left(\sum_{k_1=-\lambda_1+1}^{\lambda_1-1}1\right)\left(\sum_{k_2=-\infty}^{-\lambda_2}\chi(k_2+\lambda_2)\right)=0.4215\,\|f\|_\infty\,(2\lambda_1-1)\left(\sum_{k_2=0}^{\infty}\chi(k_2)\right)\le 0.4215\,(2\lambda_1-1)\,\|f\|_\infty.$

So the double series under consideration is uniformly convergent and continuous. Clearly $F_n(f,x_1,x_2)$ is proved to be continuous on $\mathbb{R}^2$.

By similar reasoning one can now easily prove, with more tedious work, that $F_n(f,x_1,\dots,x_N)$ is continuous on $\mathbb{R}^N$, for any $N\ge 1$. We choose to omit this similar extra work.

□

Remark 1

By (25) it is obvious that $\|A_n(f)\|_\infty<\infty$, and $A_n(f)\in C\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, given that $f\in C\left(\prod_{i=1}^{N}[a_i,b_i]\right)$.

Call $L_n$ any of the operators $A_n,B_n,C_n,D_n$.

Clearly then

$\|L_n^2(f)\|_\infty=\|L_n(L_n(f))\|_\infty\le\|L_n(f)\|_\infty\le\|f\|_\infty,$
92

etc.

Therefore we get

$\|L_n^k(f)\|_\infty\le\|f\|_\infty, \quad \forall k\in\mathbb{N},$
93

the contraction property.

Also we see that

$\|L_n^k(f)\|_\infty\le\|L_n^{k-1}(f)\|_\infty\le\dots\le\|L_n(f)\|_\infty\le\|f\|_\infty.$
94

Also $L_n(1)=1$, $L_n^k(1)=1$, $\forall k\in\mathbb{N}$.

Here $L_n^k$ are positive linear operators. □
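The contraction property (92)-(94) is easy to observe numerically. The sketch below iterates a discretized stand-in for $A_n$ on its own nodes (a row-stochastic matrix; this matrix form is an illustration only, not the paper's construction) and monitors the sup norm.

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

n, a, b = 40, -1.0, 1.0
k = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
nodes = k / n

# Row-stochastic matrix M with (M v)_i = A_n(v, nodes_i): a discrete stand-in for A_n.
W = chi(n * nodes[:, None] - k[None, :])
M = W / W.sum(axis=1, keepdims=True)

v = np.sin(5 * nodes)                        # illustrative f sampled at the nodes
norm_f = np.max(np.abs(v))
for r in range(1, 6):
    v = M @ v                                # stands in for L_n^r(f) at the nodes
    print(r, np.max(np.abs(v)), "<=", norm_f)   # sup norms never exceed ||f||, cf. (93)
```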

Notation 11

Here $N\in\mathbb{N}$, $0<\beta<1$. Denote by

$c_N:=\begin{cases}(4.019)^N, & \text{if } L_n=A_n,\\ 1, & \text{if } L_n=B_n,C_n,D_n,\end{cases} \qquad \varphi(n):=\begin{cases}\frac{1}{n^\beta}, & \text{if } L_n=A_n,B_n,\\ \frac{1}{n}+\frac{1}{n^\beta}, & \text{if } L_n=C_n,D_n,\end{cases} \qquad \Omega:=\begin{cases}C\left(\prod_{i=1}^{N}[a_i,b_i]\right), & \text{if } L_n=A_n,\\ C_B(\mathbb{R}^N), & \text{if } L_n=B_n,C_n,D_n,\end{cases}$

and

$Y:=\begin{cases}\prod_{i=1}^{N}[a_i,b_i], & \text{if } L_n=A_n,\\ \mathbb{R}^N, & \text{if } L_n=B_n,C_n,D_n.\end{cases}$
104

We give the condensed

Theorem 12

Let $f\in\Omega$, $0<\beta<1$, $x\in Y$; $n,N\in\mathbb{N}$ with $n^{1-\beta}\ge 3$. Then

(i)

$|L_n(f,x)-f(x)|\le c_N\left[\omega_1(f,\varphi(n))+\frac{\|f\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]=:\tau,$
107

(ii)

$\|L_n(f)-f\|_\infty\le\tau.$
108

For $f$ uniformly continuous and in $\Omega$ we obtain

$\lim_{n\to\infty}L_n(f)=f,$

pointwise and uniformly.

Proof. By Theorems 4-7. □

Next we do iterated neural network approximation (see also [ 9 ] ).

We make

Remark 2

Let $r\in\mathbb{N}$ and $L_n$ as above. We observe that

$L_n^r f-f=\left(L_n^r f-L_n^{r-1}f\right)+\left(L_n^{r-1}f-L_n^{r-2}f\right)+\left(L_n^{r-2}f-L_n^{r-3}f\right)+\dots+\left(L_n^2 f-L_n f\right)+\left(L_n f-f\right).$

Then

$\|L_n^r f-f\|_\infty\le\|L_n^r f-L_n^{r-1}f\|_\infty+\|L_n^{r-1}f-L_n^{r-2}f\|_\infty+\|L_n^{r-2}f-L_n^{r-3}f\|_\infty+\dots+\|L_n^2 f-L_n f\|_\infty+\|L_n f-f\|_\infty=\|L_n^{r-1}(L_n f-f)\|_\infty+\|L_n^{r-2}(L_n f-f)\|_\infty+\|L_n^{r-3}(L_n f-f)\|_\infty+\dots+\|L_n(L_n f-f)\|_\infty+\|L_n f-f\|_\infty\le r\,\|L_n f-f\|_\infty.$

That is

$\|L_n^r f-f\|_\infty\le r\,\|L_n f-f\|_\infty.$
110

We give

Theorem 13

All here as in Theorem 12 and $r\in\mathbb{N}$, $\tau$ as in (107). Then

$\|L_n^r f-f\|_\infty\le r\,\tau.$
111

So that the speed of convergence to the unit operator of $L_n^r$ is not worse than that of $L_n$.

Proof. By (110) and (108). □

We make

Remark 3

Let $m_1,\dots,m_r\in\mathbb{N}:m_1\le m_2\le\dots\le m_r$, $0<\beta<1$, $f\in\Omega$. Then $\varphi(m_1)\ge\varphi(m_2)\ge\dots\ge\varphi(m_r)$, with $\varphi$ as in Notation 11.

Therefore

$\omega_1(f,\varphi(m_1))\ge\omega_1(f,\varphi(m_2))\ge\dots\ge\omega_1(f,\varphi(m_r)).$
112

Assume further that $m_i^{1-\beta}\ge 3$, $i=1,\dots,r$. Then

$\frac{1}{(m_1^{1-\beta}-2)\,e^{(m_1^{1-\beta}-2)^2}}\ge\frac{1}{(m_2^{1-\beta}-2)\,e^{(m_2^{1-\beta}-2)^2}}\ge\dots\ge\frac{1}{(m_r^{1-\beta}-2)\,e^{(m_r^{1-\beta}-2)^2}}.$
113

Let $L_{m_i}$ be as above, $i=1,\dots,r$, all of the same kind.

We write

$L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))-f=L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))-L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}f))+L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}f))-L_{m_r}(L_{m_{r-1}}(\dots L_{m_3}f))+L_{m_r}(L_{m_{r-1}}(\dots L_{m_3}f))-L_{m_r}(L_{m_{r-1}}(\dots L_{m_4}f))+\dots+L_{m_r}(L_{m_{r-1}}f)-L_{m_r}f+L_{m_r}f-f=L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}))(L_{m_1}f-f)+L_{m_r}(L_{m_{r-1}}(\dots L_{m_3}))(L_{m_2}f-f)+L_{m_r}(L_{m_{r-1}}(\dots L_{m_4}))(L_{m_3}f-f)+\dots+L_{m_r}(L_{m_{r-1}}f-f)+L_{m_r}f-f.$

Hence by the triangle inequality property of $\|\cdot\|_\infty$ we get

$\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))-f\|_\infty\le\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}))(L_{m_1}f-f)\|_\infty+\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_3}))(L_{m_2}f-f)\|_\infty+\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_4}))(L_{m_3}f-f)\|_\infty+\dots+\|L_{m_r}(L_{m_{r-1}}f-f)\|_\infty+\|L_{m_r}f-f\|_\infty$

(repeatedly applying (92))

$\le\|L_{m_1}f-f\|_\infty+\|L_{m_2}f-f\|_\infty+\|L_{m_3}f-f\|_\infty+\dots+\|L_{m_{r-1}}f-f\|_\infty+\|L_{m_r}f-f\|_\infty=\sum_{i=1}^{r}\|L_{m_i}f-f\|_\infty.$

That is, we proved

$\left\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))-f\right\|_\infty\le\sum_{i=1}^{r}\|L_{m_i}f-f\|_\infty.$
116

We give

Theorem 14

Let $f\in\Omega$; $N\in\mathbb{N}$, $m_1,m_2,\dots,m_r\in\mathbb{N}:m_1\le m_2\le\dots\le m_r$, $0<\beta<1$; $m_i^{1-\beta}\ge 3$, $i=1,\dots,r$, $x\in Y$, and let $(L_{m_1},\dots,L_{m_r})$ be as $(A_{m_1},\dots,A_{m_r})$ or $(B_{m_1},\dots,B_{m_r})$ or $(C_{m_1},\dots,C_{m_r})$ or $(D_{m_1},\dots,D_{m_r})$. Then

$|L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))(x)-f(x)|\le\|L_{m_r}(L_{m_{r-1}}(\dots L_{m_2}(L_{m_1}f)))-f\|_\infty\le\sum_{i=1}^{r}\|L_{m_i}f-f\|_\infty\le c_N\sum_{i=1}^{r}\left[\omega_1(f,\varphi(m_i))+\frac{\|f\|_\infty}{\sqrt{\pi}\,(m_i^{1-\beta}-2)\,e^{(m_i^{1-\beta}-2)^2}}\right]\le r\,c_N\left[\omega_1(f,\varphi(m_1))+\frac{\|f\|_\infty}{\sqrt{\pi}\,(m_1^{1-\beta}-2)\,e^{(m_1^{1-\beta}-2)^2}}\right].$

Clearly, we notice that the speed of convergence to the unit operator of the multiply iterated operator is not worse than the speed of $L_{m_1}$.

Proof. Using (116), (112), (113) and (107), (108). □

We continue with

Theorem 15

Let all be as in Theorem 8, and $r\in\mathbb{N}$. Here $K_n$ is as in (58). Then

$\|A_n^r f-f\|_\infty\le r\,\|A_n f-f\|_\infty\le r\,K_n.$
118

Proof. By (110) and (58). □

4 Complex Multivariate Neural Network Approximations

We make

Remark 4

Let $Y=\prod_{i=1}^{N}[a_i,b_i]$ or $\mathbb{R}^N$, and let $f:Y\to\mathbb{C}$ with real and imaginary parts $f_1,f_2$: $f=f_1+if_2$, $i=\sqrt{-1}$. Clearly $f$ is continuous iff $f_1$ and $f_2$ are continuous.

Given that $f_1,f_2\in C^m(Y)$, $m\in\mathbb{N}$, it holds

$f_\alpha(x)=f_{1,\alpha}(x)+if_{2,\alpha}(x),$
119

where $\alpha$ indicates a partial derivative of any order and arrangement.

We denote by $C_B(\mathbb{R}^N,\mathbb{C})$ the space of continuous and bounded functions $f:\mathbb{R}^N\to\mathbb{C}$. Clearly $f$ is bounded, iff both $f_1,f_2$ are bounded from $\mathbb{R}^N$ into $\mathbb{R}$, where $f=f_1+if_2$.

Here $L_n$ is any of $A_n,B_n,C_n,D_n$, $n\in\mathbb{N}$.

We define

$L_n(f,x):=L_n(f_1,x)+iL_n(f_2,x), \quad \forall x\in Y.$
120

We observe that

$|L_n(f,x)-f(x)|\le|L_n(f_1,x)-f_1(x)|+|L_n(f_2,x)-f_2(x)|,$
121

and

$\|L_n(f)-f\|_\infty\le\|L_n(f_1)-f_1\|_\infty+\|L_n(f_2)-f_2\|_\infty.$
122
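In code, the complex extension (120) amounts to applying the real operator to the real and imaginary parts separately. The sketch below reuses the $A_n$ of the earlier snippets; the complex $f$ is an arbitrary illustration.

```python
import numpy as np
from scipy.special import erf

def chi(x):
    return (erf(x + 1.0) - erf(x - 1.0)) / 4.0

def A_n(f, x, n, a, b):
    k = np.arange(np.ceil(n * a), np.floor(n * b) + 1)
    w = chi(n * x - k)
    return np.dot(f(k / n), w) / np.sum(w)

def L_n_complex(f, x, n, a, b):
    """L_n(f, x) := L_n(f1, x) + i L_n(f2, x), cf. (120), here with L_n = A_n."""
    f1 = lambda t: np.real(f(t))
    f2 = lambda t: np.imag(f(t))
    return A_n(f1, x, n, a, b) + 1j * A_n(f2, x, n, a, b)

f = lambda t: np.exp(1j * 2 * np.pi * t)    # illustrative complex-valued f
print(L_n_complex(f, 0.3, 200, -1.0, 1.0), f(0.3))
```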

We present

Theorem 16

Let $f\in C(Y,\mathbb{C})$ which is bounded, $f=f_1+if_2$, $0<\beta<1$, $n,N\in\mathbb{N}:n^{1-\beta}\ge 3$, $x\in Y$. Then

i)

$|L_n(f,x)-f(x)|\le c_N\left[\omega_1(f_1,\varphi(n))+\omega_1(f_2,\varphi(n))+\frac{\|f_1\|_\infty+\|f_2\|_\infty}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]=:\varepsilon,$
123

ii)

$\|L_n(f)-f\|_\infty\le\varepsilon.$
124

Proof. Use of (107). □

In the next we discuss high order of complex approximation by using the smoothness of f.

We give

Theorem 17

Let $f:\prod_{i=1}^{N}[a_i,b_i]\to\mathbb{C}$, such that $f=f_1+if_2$. Assume $f_1,f_2\in C^m\left(\prod_{i=1}^{N}[a_i,b_i]\right)$, $0<\beta<1$, $n,m,N\in\mathbb{N}$, $n^{1-\beta}\ge 3$,
$x\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$. Then

i)

$\left|A_n(f,x)-f(x)-\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{f_\alpha(x)}{\prod_{i=1}^{N}\alpha_i!}\right)A_n\left(\prod_{i=1}^{N}(\cdot_i-x_i)^{\alpha_i},x\right)\right)\right|\le(4.019)^N\left\{\frac{N^m}{m!\,n^{m\beta}}\left(\omega_{1,m}^{\max}\left(f_{1,\alpha},\frac{1}{n^\beta}\right)+\omega_{1,m}^{\max}\left(f_{2,\alpha},\frac{1}{n^\beta}\right)\right)+\left(\frac{\|b-a\|_\infty^m\left(\|f_{1,\alpha}\|_{\infty,m}^{\max}+\|f_{2,\alpha}\|_{\infty,m}^{\max}\right)N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

ii)

$|A_n(f,x)-f(x)|\le(4.019)^N\left\{\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{|f_{1,\alpha}(x)|+|f_{2,\alpha}(x)|}{\prod_{i=1}^{N}\alpha_i!}\right)\left[\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]\right)+\frac{N^m}{m!\,n^{m\beta}}\left(\omega_{1,m}^{\max}\left(f_{1,\alpha},\frac{1}{n^\beta}\right)+\omega_{1,m}^{\max}\left(f_{2,\alpha},\frac{1}{n^\beta}\right)\right)+\left(\frac{\|b-a\|_\infty^m\left(\|f_{1,\alpha}\|_{\infty,m}^{\max}+\|f_{2,\alpha}\|_{\infty,m}^{\max}\right)N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

iii)

$\|A_n(f)-f\|_\infty\le(4.019)^N\left\{\sum_{j=1}^{m}\left(\sum_{|\alpha|=j}\left(\frac{\|f_{1,\alpha}\|_\infty+\|f_{2,\alpha}\|_\infty}{\prod_{i=1}^{N}\alpha_i!}\right)\left[\frac{1}{n^{\beta j}}+\left(\prod_{i=1}^{N}(b_i-a_i)^{\alpha_i}\right)\frac{1}{2\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right]\right)+\frac{N^m}{m!\,n^{m\beta}}\left(\omega_{1,m}^{\max}\left(f_{1,\alpha},\frac{1}{n^\beta}\right)+\omega_{1,m}^{\max}\left(f_{2,\alpha},\frac{1}{n^\beta}\right)\right)+\left(\frac{\|b-a\|_\infty^m\left(\|f_{1,\alpha}\|_{\infty,m}^{\max}+\|f_{2,\alpha}\|_{\infty,m}^{\max}\right)N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

iv) Assume $f_\alpha(x_0)=0$, for all $\alpha:|\alpha|=1,\dots,m$; $x_0\in\left(\prod_{i=1}^{N}[a_i,b_i]\right)$. Then

$|A_n(f,x_0)-f(x_0)|\le(4.019)^N\left\{\frac{N^m}{m!\,n^{m\beta}}\left(\omega_{1,m}^{\max}\left(f_{1,\alpha},\frac{1}{n^\beta}\right)+\omega_{1,m}^{\max}\left(f_{2,\alpha},\frac{1}{n^\beta}\right)\right)+\left(\frac{\|b-a\|_\infty^m\left(\|f_{1,\alpha}\|_{\infty,m}^{\max}+\|f_{2,\alpha}\|_{\infty,m}^{\max}\right)N^m}{m!}\right)\frac{1}{\sqrt{\pi}\,(n^{1-\beta}-2)\,e^{(n^{1-\beta}-2)^2}}\right\},$

notice in the last the extremely high rate of convergence, $n^{-\beta(m+1)}$.

Proof. By Theorem 8 and Remark 4. □

Bibliography

1

M. Abramowitz and I. A. Stegun, eds., Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, New York, Dover Publications, 1972.

2

G. A. Anastassiou, Rate of convergence of some neural network operators to the unit-univariate case, J. Math. Anal. Appl., 212 (1997), pp. 237-262.

3

G. A. Anastassiou, Quantitative Approximations, Chapman & Hall/CRC, Boca Raton, New York, 2001.

4

G. A. Anastassiou, Intelligent Systems: Approximation by Artificial Neural Networks, Intelligent Systems Reference Library, vol. 19, Springer, Heidelberg, 2011.

5

G. A. Anastassiou, Univariate hyperbolic tangent neural network approximation, Mathematical and Computer Modelling, 53 (2011), pp. 1111-1132.

6

G. A. Anastassiou, Multivariate hyperbolic tangent neural network approximation, Computers and Mathematics with Applications, 61 (2011), pp. 809-821.

7

G. A. Anastassiou, Multivariate sigmoidal neural network approximation, Neural Networks 24 (2011), pp. 378–386.

8

G. A. Anastassiou, Univariate sigmoidal neural network approximation, J. of Computational Analysis and Applications, vol. 14 (2012), no. 4, pp. 659–690.

9

G. A. Anastassiou, Approximation by neural networks iterates, Advances in Applied Mathematics and Approximation Theory, pp. 1–20, Springer Proceedings in Math. & Stat., Springer, New York, 2013, Eds. G. Anastassiou, O. Duman.

10

G. A. Anastassiou, Univariate error function based neural network approximation, submitted, 2014.

11

L. C. Andrews, Special Functions of Mathematics for Engineers, Second edition, McGraw-Hill, New York, 1992.

12

Z. Chen and F. Cao, The approximation operators with sigmoidal functions, Computers and Mathematics with Applications, 58 (2009), pp. 758–765.

13

D. Costarelli and R. Spigler, Approximation results for neural network operators activated by sigmoidal functions, Neural Networks 44 (2013), pp. 101–106.

14

D. Costarelli and R. Spigler, Multivariate neural network operators with sigmoidal activation functions, Neural Networks 48 (2013), pp. 72–77.

15

S. Haykin, Neural Networks: A Comprehensive Foundation (2nd ed.), Prentice Hall, New York, 1998.

16

W. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biophysics, 7 (1943), pp. 115–133.

17

T. M. Mitchell, Machine Learning, WCB-McGraw-Hill, New York, 1997.