On the Best Approximation of Continuous Functions by Polynomials.
Five lessons held at the Faculty of Science from Cluj during the academic year 1933-1934

Dr. Tiberiu Popoviciu

(Institutul de Arte Grafice Ardealul, Cluj, 1937; English translation 2016)

Chapter 1 First Lesson. The existence and uniqueness of the best approximation polynomials.

1.1 Bounded functions. The oscillation of a function.

We will consider real valued functions $f\left(x\right)$ of a real variable $x,$ defined on the bounded and closed interval $\left[a,b\right],a<b$.

Such a function $f\left(x\right)$ is upper bounded if there exists a real number $A$ such that all values taken by the function are less than $A$. Otherwise, the function is not upper bounded. Let us denote by $M\left(f\right)$ the upper bound or the maximum of $f\left(x\right)$. Let us recall the definition of this number $M\left(f\right)$: if $f\left(x\right)$ is not upper bounded, $M\left(f\right)$ equals $+\infty$, and if $f\left(x\right)$ is upper bounded, $M\left(f\right)$ is defined by the property that for any positive number $\varepsilon$ there exists at least one point $x$ such that

f\left(x\right)>M\left(f\right)-\varepsilon

and also for any $x$ we have

f\left(x\right)\leq M\left(f\right).

It is now fairly clear what is meant by a lower bounded function, as well as by a function which is not lower bounded. The definition of the lower bound or of the minimum $m\left(f\right)$ of the function $f\left(x\right)$ is perfectly analogous to that of $M\left(f\right)$. A function which is simultaneously upper and lower bounded is simply called a bounded function. The difference $M\left(f\right)-m\left(f\right)$ is called the oscillation of $f\left(x\right)$ in the interval $\left[a,b\right]$.
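As a rough numerical illustration, the two bounds and the oscillation can be estimated by sampling the function on a finite grid. The helper names `sample_points` and `oscillation` and the grid size are assumptions made for this sketch only; sampling can only underestimate $M\left(f\right)$ and overestimate $m\left(f\right)$.

```python
import math

def sample_points(a, b, count=1001):
    """Return `count` equally spaced points of the closed interval [a, b]."""
    step = (b - a) / (count - 1)
    return [a + i * step for i in range(count)]

def oscillation(f, a, b, count=1001):
    """Estimate (M(f), m(f), M(f) - m(f)) from sampled values of f on [a, b]."""
    values = [f(x) for x in sample_points(a, b, count)]
    upper, lower = max(values), min(values)
    return upper, lower, upper - lower

# Example: sin on [0, pi] has upper bound 1, lower bound 0, oscillation 1.
upper, lower, osc = oscillation(math.sin, 0.0, math.pi)
```

For a continuous function the sampled values approach the true bounds as the grid is refined; for a discontinuous or unbounded function no such guarantee holds.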

1.2 Continuous functions.

The meaning of the continuity of a function in an interval $\left[a,b\right]$ is well known. A continuous function in such an interval is uniformly continuous in that interval. This means that for any positive number $\varepsilon$ one can determine another positive number $\delta$ such that

\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|<\varepsilon

for any $x^{\prime},\ x^{\prime\prime}$ verifying the condition

\left|x^{\prime}-x^{\prime\prime}\right|<\delta.

A continuous function attains its maximum $M\left(f\right)$ and its minimum $m\left(f\right)$. Consequently, there exists at least one point $x^{\prime}$ such that $f\left(x^{\prime}\right)=M\left(f\right)$ and at least one point $x^{\prime\prime}$ such that $f\left(x^{\prime\prime}\right)=m\left(f\right)$. Moreover, we can state that $M\left(f\right)$ is at the same time the upper bound of $f\left(x\right)$. In other words, $M\left(f\right)$ enjoys the property that for any positive number $\varepsilon$ there exist infinitely many points $x$ such that

f\left(x\right)>M\left(f\right)-\varepsilon

and at most a finite number of points $x$ such that

f\left(x\right)>M\left(f\right)+\varepsilon.

In the same way the minimum $m\left(f\right)$ coincides with the lower bound of $f\left(x\right)$, this lower bound being defined analogously to the upper bound. All these definitions extend to functions of several variables defined in closed and bounded domains. Throughout these lectures we will need some other properties, which will be recalled at the appropriate moments.

1.3 The distance between two functions.

Given two functions $f_{1}\left(x\right)$ and $f_{2}\left(x\right)$, the number $M\left(\left|f_{1}-f_{2}\right|\right)$ will be called their distance. If one of these functions is bounded and the other one is unbounded, their distance equals infinity. If both functions are unbounded, their distance can be finite. If one of the functions is bounded and their distance is finite, then the other function must be bounded. The distance enjoys the following properties, which can be easily proved:

1°. $M\left(\left|f_{1}-f_{2}\right|\right)$ is a positive or null number;
2°. $M\left(\left|f_{1}-f_{2}\right|\right)=0$ implies $f_{1}\left(x\right)=f_{2}\left(x\right)$;
3°. $M\left(\left|Cf\right|\right)=CM\left(\left|f\right|\right)$, $C$ being a positive constant;
4°. $M\left(\left|f_{1}-f_{2}\right|\right)\leq M\left(\left|f_{1}-f_{3}\right|\right)+M\left(\left|f_{2}-f_{3}\right|\right)$.

The problem of the best approximation which follows below depends on this definition of distance.
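The properties of the distance listed above can be checked numerically on sampled values. The sketch below is only an illustration: the function name `distance`, the three sample functions and the grid on $\left[-1,1\right]$ are assumptions, and a sampled maximum is merely an estimate of $M$.

```python
def distance(f1, f2, points):
    """Sampled estimate of M(|f1 - f2|) over the given points."""
    return max(abs(f1(x) - f2(x)) for x in points)

points = [i / 500 - 1.0 for i in range(1001)]  # grid on [-1, 1]

f1 = lambda x: x * x
f2 = lambda x: abs(x)
f3 = lambda x: 0.5 * x + 0.25
zero = lambda x: 0.0

d12 = distance(f1, f2, points)
d13 = distance(f1, f3, points)
d23 = distance(f2, f3, points)

# Property 3 with C = 3: M(|3 f1|) compared with 3 M(|f1|).
scaled = distance(lambda x: 3.0 * f1(x), zero, points)
unscaled = distance(f1, zero, points)
```

On the grid, the triangle inequality of property 4 reads `d12 <= d13 + d23`, and property 3 reads `scaled == 3 * unscaled` up to rounding.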

1.4 The problem of the best approximation using polynomials.

Let us see how this problem is formulated. Let us consider the family or the set of polynomials

P\left(x\right)=a_{0}x^{n}+a_{1}x^{n-1}+\cdots+a_{n}

of degree $n$ . A polynomial from this set is completely determined by the coefficients $a_{0},a_{1},\ldots,a_{n}$ which are real positive, negative or null numbers. It means that any polynomial of degree $n$ is at the same time a polynomial of degree $m$ with $m>n$ . In other words, the set of polynomials of degree $n$ contains the set of all polynomials of any degree less than $n$ .

For an arbitrary function $f\left(x\right)$ we say, by definition, that the distance $M\left(\left|f-P\right|\right)$ between this function and a polynomial $P\left(x\right)$ is the error or the approximation of $f\left(x\right)$ provided by the polynomial $P\left(x\right)$ .
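The error just defined can be estimated for a concrete polynomial by sampling. This is a minimal sketch under stated assumptions: the names `poly_value` and `approximation_error` are hypothetical, coefficients are listed as $a_{0},a_{1},\ldots,a_{n}$ (highest power first, as in the text), and the sampled maximum only approximates $M\left(\left|f-P\right|\right)$.

```python
def poly_value(coeffs, x):
    """Evaluate a_0 x^n + a_1 x^(n-1) + ... + a_n by Horner's rule."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

def approximation_error(f, coeffs, a, b, count=2001):
    """Sampled estimate of M(|f - P|) on [a, b] for P given by `coeffs`."""
    step = (b - a) / (count - 1)
    xs = [a + i * step for i in range(count)]
    return max(abs(f(x) - poly_value(coeffs, x)) for x in xs)

# Example: error of the degree-1 polynomial P(x) = 0*x + 0.5 for f(x) = x^2 on [-1, 1].
err = approximation_error(lambda x: x * x, [0.0, 0.5], -1.0, 1.0)
```

In this example the largest deviation is $1/2$, reached at $x=-1$, $x=0$ and $x=1$.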

For all polynomials $P\left(x\right)$ of degree $n$, $M\left(\left|f-P\right|\right)$ has a lower bound denoted $\mu_{n}\left(f\right)$ or, more simply, $\mu_{n}$. By definition, $\mu_{n}$ is the best approximation of $f\left(x\right)$ by polynomials of degree $n$.

The problem of the best approximation using polynomials will be formulated in the following way:

Given a function $f\left(x\right)$ , one has to determine the set of polynomials $P\left(x\right)$ of degree $n$ such that $M\left(\left|f-P\right|\right)$ attains its lower bound $\mu_{n}$ and then to study the number $\mu_{n}$ .

A polynomial $P\left(x\right)$ of degree $n$ for which $\mu_{n}$ is attained will be called a best approximation polynomial of degree $n$ of the function $f\left(x\right)$. Briefly, we say that such a polynomial is a $T_{n}$ polynomial, and it will be denoted by $T_{n}\left(x;f\right)$, $T_{n}\left(x\right)$ or simply $T_{n}$.

The problem of the polynomials of the best approximation was first formulated by the Russian mathematician P. L. Chebyshev (Tchebychef).

1.5 The determination of $T_{n}$ in simple cases.

The problem of the best approximation can not be formulated for unbounded functions because in this situation $M\left(\left|f-P\right|\right)$ equals $+\infty$ , a polynomial being a bounded function (in the interval $\left[a,b\right]$ ).

If $f\left(x\right)$ is a polynomial of degree $n$, the best approximation $\mu_{n}$ equals zero because in this case the function itself is a polynomial $T_{n}$. The converse statement is also true, as follows from Section 1.8 below.

If we know the polynomials $T_{n}$ for the function $f\left(x\right)$ we also know the polynomials $T_{n}$ for $f\left(x\right)+Q\left(x\right)$ and $Cf\left(x\right)$, where $Q\left(x\right)$ is a polynomial of degree $n$ and $C$ is a nonzero constant. Indeed, if $P\left(x\right)$ is a $T_{n}$ polynomial of $f\left(x\right)$, we have

M\left(\left|f-P\right|\right)=M\left(\left|f+Q-\left(P+Q\right)\right|\right)=\mu_{n}\left(f\right)

and if $R\left(x\right)$ is a polynomial of degree $n$ , we additionally have

M\left(\left|\left(f+Q\right)-R\right|\right)=M\left(\left|f-\left(R-Q\right)\right|\right)\geq\mu_{n}\left(f\right).

It follows that $P\left(x\right)+Q\left(x\right)$ is a polynomial $T_{n}$ for the function $f\left(x\right)+Q\left(x\right)$ and any polynomial $T_{n}$ corresponding to this function has the form $P\left(x\right)+Q\left(x\right)$ . Actually we have

\mu_{n}\left(f+Q\right)=\mu_{n}\left(f\right).

We also have the relations

\left|C\right|M\left(\left|f-P\right|\right)=M\left(\left|Cf-CP\right|\right)=\left|C\right|\mu_{n}\left(f\right),

M\left(\left|Cf-R\right|\right)=M\left(\left|Cf-C\frac{R}{C}\right|\right)=\left|C\right|M\left(\left|f-\frac{R}{C}\right|\right)\geq\left|C\right|\mu_{n}\left(f\right).

It follows that $CP\left(x\right)$ is a $T_{n}$ polynomial for the function $Cf\left(x\right)$ and any polynomial $T_{n}$ corresponding to this function has the form $CP\left(x\right)$ . Consequently, we get

\mu_{n}\left(Cf\right)=\left|C\right|\mu_{n}\left(f\right).

1.6 A preliminary Lemma.

Let us suppose that for some polynomial $P\left(x\right)$ of degree $n$ we have

\left|P\left(x\right)\right|<A\quad\text{in }\left(a,b\right).

(1.1)

We intend to show that the coefficients $a_{r}$ are bounded. To this goal we take $n+1$ distinct points $x_{1},x_{2},\ldots,x_{n+1}$ , in the interval $\left[a,b\right]$ , and consider the system

a_{0}x_{r}^{n}+a_{1}x_{r}^{n-1}+\cdots+a_{n}=P\left(x_{r}\right),\ \ \ r=1,2,\ldots,n+1.

The determinant of this system does not vanish because it is the Vandermonde determinant of the numbers $x_{1},x_{2},\ldots,x_{n+1}$. Using Cramer’s rule we can solve for $a_{0},a_{1},\ldots,a_{n}$ and, taking into account the inequality (1.1), we find the preliminary Lemma:

If a polynomial $P\left(x\right)$ of degree $n$ is bounded by $A$ in the interval $\left[a,b\right]$ , then the coefficients $a_{0},a_{1},\ldots,a_{n}$ remain bounded by $\lambda A$ , where $\lambda$ depends only on $n$ and the interval $\left[a,b\right]$ .

The value of $\lambda$ can be determined. What matters most is that this number does not depend on the polynomial $P\left(x\right)$. Of course, the property remains valid whenever the polynomials are considered only on a linear and bounded set containing at least $n+1$ distinct points.
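One admissible value of $\lambda$ for a given choice of nodes can be computed explicitly. The sketch below does not use Cramer's rule; it swaps in the Lagrange basis, which solves the same linear system: writing $P\left(x\right)=\sum_{j}P\left(x_{j}\right)l_{j}\left(x\right)$, each coefficient of $P$ is a combination of the values $P\left(x_{j}\right)$, so $\left|a_{r}\right|\leq A\sum_{j}\left|c_{j,r}\right|$, where $c_{j,r}$ is the coefficient of $x^{n-r}$ in $l_{j}$. All function names are hypothetical, and the nodes $-1,0,1$ are an arbitrary example.

```python
def poly_mul_linear(coeffs, root):
    """Multiply a polynomial (highest power first) by (x - root)."""
    result = coeffs + [0.0]
    for i, c in enumerate(coeffs):
        result[i + 1] -= root * c
    return result

def lagrange_basis_coeffs(nodes, j):
    """Coefficients (highest power first) of l_j, where l_j(x_j) = 1 and l_j(x_k) = 0 for k != j."""
    coeffs = [1.0]
    denom = 1.0
    for k, x_k in enumerate(nodes):
        if k != j:
            coeffs = poly_mul_linear(coeffs, x_k)
            denom *= nodes[j] - x_k
    return [c / denom for c in coeffs]

def coefficient_bound(nodes):
    """Return lambda such that |P(x_k)| <= A at every node implies |a_r| <= lambda * A."""
    size = len(nodes)
    basis = [lagrange_basis_coeffs(nodes, j) for j in range(size)]
    return max(sum(abs(basis[j][r]) for j in range(size)) for r in range(size))

# Example: n = 2 with the three nodes -1, 0, 1 of the interval [-1, 1].
lam = coefficient_bound([-1.0, 0.0, 1.0])
```

For these nodes the value obtained is $\lambda=2$; the polynomial $2x^{2}-1$, which is bounded by $1$ on $\left[-1,1\right]$ and has leading coefficient $2$, shows that this value cannot be lowered for $n=2$ on this interval.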

1.7 The continuity of $M\left(\left|f-P\right|\right)$ .

The maximum $M\left(\left|f-P\right|\right)$ is not necessarily attained unless the function $f\left(x\right)$ is continuous.

Let $\varepsilon$ be an arbitrary positive number and let us define

A=M\left(\left|x\right|^{n}+\left|x\right|^{n-1}+\cdots+1\right).

Let’s suppose that

\left|a_{r}-a_{r}^{\prime}\right|<\frac{\varepsilon}{A},\ r=0,1,\ldots,n.

Defining

P\left(x\right)=a_{0}x^{n}+a_{1}x^{n-1}+\cdots+a_{n},

P_{1}\left(x\right)=a_{0}^{\prime}x^{n}+a_{1}^{\prime}x^{n-1}+\cdots+a_{n}^{\prime},

we have

M\left(\left|P-P_{1}\right|\right)\leq\left[\max\left(\left|a_{r}-a_{r}^{\prime}\right|\right)\right]M\left(\left|x\right|^{n}+\left|x\right|^{n-1}+\cdots+1\right)

where, as usual, we denote by $\max\left(c_{1},c_{2},\ldots,c_{k}\right)$, or $\max_{r=1,2,\ldots,k}\left(c_{r}\right)$, or in the simplest way $\max\left(c_{r}\right)$, the largest number from the set $c_{1},c_{2},\ldots,c_{k}$. An analogous notation will be used for the smallest number from the same set $c_{r}$. Consequently, we can write

M\left(\left|P-P_{1}\right|\right)<\varepsilon.

It follows that

M\left(\left|f-P\right|\right)\leq M\left(\left|f-P_{1}\right|\right)+M\left(\left|P-P_{1}\right|\right)<M\left(\left|f-P_{1}\right|\right)+\varepsilon,

M\left(\left|f-P_{1}\right|\right)\leq M\left(\left|f-P\right|\right)+M\left(\left|P-P_{1}\right|\right)<M\left(\left|f-P\right|\right)+\varepsilon

and consequently

\left|M\left(\left|f-P\right|\right)-M\left(\left|f-P_{1}\right|\right)\right|<\varepsilon

which means:

$f\left(x\right)$ being a continuous function, $M\left(\left|f-P\right|\right)$ is also continuous with respect to the coefficients $a_{0},a_{1},\ldots,a_{n}$.

Thus the lower bound $\mu_{n}$ coincides with the inferior limit of the numbers $M\left(\left|f-P\right|\right)$ .
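The estimate behind this continuity property can be observed numerically. In the sketch below the function $f=\exp$, the two coefficient vectors and the grid on $\left[-1,1\right]$ are arbitrary choices made for illustration; the code compares the change in the sampled error with the bound $\max\left(\left|a_{r}-a_{r}^{\prime}\right|\right)\cdot A$ used in the argument above.

```python
import math

def poly_value(coeffs, x):
    """Evaluate a_0 x^n + ... + a_n (highest power first) by Horner's rule."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

grid = [i / 1000 - 1.0 for i in range(2001)]  # sample of [-1, 1]

def sup_norm(g):
    """Sampled estimate of M(|g|) on the grid."""
    return max(abs(g(x)) for x in grid)

f = math.exp
P = [0.5, 1.0, 1.0]                            # 0.5 x^2 + x + 1
P1 = [0.5 + 1e-3, 1.0 - 2e-3, 1.0 + 1e-3]      # slightly perturbed coefficients
n = len(P) - 1

# A = M(|x|^n + |x|^(n-1) + ... + 1), as defined in the text.
A = sup_norm(lambda x: sum(abs(x) ** k for k in range(n + 1)))

gap = abs(sup_norm(lambda x: f(x) - poly_value(P, x))
          - sup_norm(lambda x: f(x) - poly_value(P1, x)))
bound = max(abs(p - q) for p, q in zip(P, P1)) * A
```

Here `gap` never exceeds `bound`, which mirrors the inequality $\left|M\left(\left|f-P\right|\right)-M\left(\left|f-P_{1}\right|\right)\right|\leq\max\left(\left|a_{r}-a_{r}^{\prime}\right|\right)\cdot A$.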

1.8 The existence of the polynomials of the best approximation.

We intend to examine the existence of the polynomials $T_{n}$ . From the previous section we observe that there exists an infinite sequence of polynomials of degree $n$ .

P_{1}\left(x\right),P_{2}\left(x\right),\ldots,P_{m}\left(x\right),\ldots

(1.2)

such that

M\left(\left|f-P_{m}\right|\right)\rightarrow\mu_{n}\quad\text{for}\quad m\rightarrow\infty,

but this does not imply the existence of a polynomial such that the quantity $\mu_{n}$ is attained or in other words the existence of a polynomial $P\left(x\right)$ such that $M\left(\left|f-P\right|\right)=\mu_{n}$ .

This is not a surprising fact. It is true that $M\left(\left|f-P\right|\right)$ is continuous with respect to the coefficients of $P,$ but the range of variation of these coefficients is open and unbounded. Let’s suppose that $M\left(\left|f\right|\right)>\mu_{n}$; otherwise $M\left(\left|f\right|\right)=\mu_{n}$ and the null polynomial is already a $T_{n}$ polynomial. It is then enough to consider only polynomials $P$ such that

M\left(\left|f-P\right|\right)<M\left(\left|f\right|\right).

From the last result of the previous Section it is known that there exist infinitely many such polynomials of degree $n$. But

M\left(\left|P\right|\right)\leq M\left(\left|f-P\right|\right)+M\left(\left|f\right|\right).

and thus

M\left(\left|P\right|\right)<2M\left(\left|f\right|\right).

(1.3)

In other words, we can assume that the polynomials (1.2) are chosen such that they satisfy (1.3). If we put

P_{m}=a_{0}^{\left(m\right)}x^{n}+a_{1}^{\left(m\right)}x^{n-1}+\cdots+a_{n}^{\left(m\right)},\ m=1,2,\ldots

from the preliminary Lemma of Sect. 1.6 we know that there exists a number $B$, which depends only on $M\left(\left|f\right|\right)$ $\left[B=2\lambda M\left(\left|f\right|\right)\right]$, such that

\left|a_{r}^{\left(m\right)}\right|<B,\ r=0,1,\ldots,n;m=1,2,\ldots.

From the bounded sequence

a_{0}^{\left(1\right)},a_{0}^{\left(2\right)},\ldots,a_{0}^{\left(m\right)},\ldots.

we can extract a subsequence convergent to a limit, say $a_{0}^{\ast}$,

a_{0}^{\left(k_{1}\right)},a_{0}^{\left(k_{12}\right)},a_{0}^{\left(k_{13}\right)},\ldots,a_{0}^{\left(k_{1m}\right)},\ldots\rightarrow a_{0}^{\ast}.

(1.4)

Let’s consider now the sequence

a_{1}^{\left(k_{1}\right)},a_{1}^{\left(k_{12}\right)},a_{1}^{\left(k_{13}\right)},\ldots,a_{1}^{\left(k_{1m}\right)},\ldots

From this sequence we can extract a subsequence convergent to a limit, say $a_{1}^{\ast}$,

a_{1}^{\left(k_{1}\right)},a_{1}^{\left(k_{2}\right)},a_{1}^{\left(k_{23}\right)},\ldots,a_{1}^{\left(k_{2m}\right)},\ldots\rightarrow a_{1}^{\ast}

We additionally have

a_{0}^{\left(k_{1}\right)},a_{0}^{\left(k_{2}\right)},a_{0}^{\left(k_{23}\right)},\ldots,a_{0}^{\left(k_{2m}\right)},\ldots\rightarrow a_{0}^{\ast}

because this sequence is extracted from (1.4). If we repeat this procedure $n+1$ times, we eventually see that from the sequence of polynomials (1.2) we can extract the subsequence

P_{k_{1}},P_{k_{2}},\ldots,P_{k_{m}},\ldots

such that

a_{r}^{\left(k_{1}\right)},a_{r}^{\left(k_{2}\right)},\ldots,a_{r}^{\left(k_{m}\right)},\ldots\rightarrow a_{r}^{\ast},\ r=0,1,\ldots,n

where $a_{r}^{\ast}$ are some finite numbers.

If we define now

P^{\ast}\left(x\right)=a_{0}^{\ast}x^{n}+a_{1}^{\ast}x^{n-1}+\cdots+a_{n}^{\ast},

we see, by the continuity property of Sect. 1.7 and the convergence of the coefficients, that

M\left(\left|f-P^{\ast}\right|\right)=\mu_{n}.

(1.5)

Thus the polynomial $P^{\ast}\left(x\right)$, which satisfies the equality (1.5), is a polynomial of the best approximation of degree $n$ for the function $f\left(x\right)$. We can now state the following property: For any bounded function $f\left(x\right)$ there exists at least one polynomial of the best approximation of degree $n$.

Along with the results of Sect. 1.5 we can now state:

The lower bound $\mu_{n}$ vanishes if and only if $f\left(x\right)$ reduces to a polynomial of degree $n$ .

We have seen that this condition is sufficient. Its necessity comes from the existence of a polynomial $P\left(x\right)$ such that $M\left(\left|f-P\right|\right)=0$, whence $f\left(x\right)\equiv P\left(x\right)$. Whenever $f\left(x\right)$ is not a polynomial of degree $n$, $\mu_{n}$ is a positive number.

1.9 The Chebyshev’s polynomials for a continuous function.

We will suppose now that the function $f\left(x\right)$ is continuous and let $T_{n}\left(x\right)$ be a polynomial of the best approximation of degree $n$. The difference $f\left(x\right)-T_{n}\left(x\right)$ attains at least one of the values $\pm\mu_{n}$. We intend to make precise the number of points at which these values are attained. Let’s suppose that

f\left(x_{r}\right)-T_{n}\left(x_{r}\right)=\pm\mu_{n},\ r=1,2,\ldots,m

where $x_{1},x_{2},\ldots,x_{m}$ are $m$ distinct points such that $m\leq n+1$. In all the other points of the interval $\left[a,b\right]$ we have $\left|f-T_{n}\right|<\mu_{n}$. Let $Q\left(x\right)$ be the Lagrange polynomial determined by the conditions

Q\left(x_{r}\right)=f\left(x_{r}\right)-T_{n}\left(x_{r}\right),\ r=1,2,\ldots,m.

The Lagrange polynomial provided by the Lagrange interpolation formula is the polynomial of the lowest degree which takes on the values $A_{1},A_{2},\ldots,A_{k}$ at the points $x_{1},x_{2},\ldots,x_{k}$. This polynomial is unique and of degree at most $k-1$.
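The Lagrange interpolation formula mentioned here can be sketched in a few lines. The function name `lagrange_value` and the sample data are assumptions made for this illustration; the code evaluates the interpolating polynomial at one point without forming its coefficients.

```python
def lagrange_value(nodes, values, x):
    """Evaluate at x the polynomial of degree <= k-1 taking values[r] at nodes[r]."""
    total = 0.0
    for r, (x_r, y_r) in enumerate(zip(nodes, values)):
        term = y_r
        for s, x_s in enumerate(nodes):
            if s != r:
                # Factor of the r-th basis polynomial: equals 1 at x_r, 0 at x_s.
                term *= (x - x_s) / (x_r - x_s)
        total += term
    return total

# Example: three samples of x^2 + x + 1 determine that polynomial uniquely.
nodes = [0.0, 1.0, 2.0]
values = [1.0, 3.0, 7.0]
q3 = lagrange_value(nodes, values, 3.0)
```

Since the data come from a polynomial of degree $2$ and three nodes are used, the interpolant reproduces $x^{2}+x+1$ everywhere, in particular at $x=3$.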

The polynomial $Q\left(x\right)$ is at most of degree $n$ . Let’s introduce an interval $I_{r}$ centered at $x_{r}$ and of length $\delta_{r}$ . Given a positive number $\varepsilon$ such that $\varepsilon<\mu_{n}$ , we can choose a positive number $\delta$ and the lengths $\delta_{r}$ such that:

1°. taking $\delta_{r}\leq\delta$ the intervals $I_{1},I_{2},\ldots,I_{m}$ have no common points;
2°. the oscillation of the functions $f\left(x\right)-T_{n}\left(x\right)$ and $Q\left(x\right)$ is less than $\varepsilon$ in any interval of length $\leq\delta$.

It follows immediately that in an interval $I_{r}$ the functions $f-T_{n}$ and $Q$ do not vanish and keep a constant sign (more exactly the same sign). Let’s suppose that $x_{r}$ is a point at which $f\left(x_{r}\right)-T_{n}\left(x_{r}\right)=Q\left(x_{r}\right)=\mu_{n}$ , then on the interval $I_{r}$ we have

\mu_{n}-\varepsilon<f-T_{n}\leq\mu_{n},\quad\mu_{n}-\varepsilon<Q<\mu_{n}+\varepsilon.

Let’s choose a positive $\lambda$ such that

\lambda<\frac{\mu_{n}-\varepsilon}{\mu_{n}+\varepsilon}.

(1.6)

Then in the interval $I_{r}$ we have

0<\mu_{n}-\varepsilon-\lambda\left(\mu_{n}+\varepsilon\right)<f-T_{n}-\lambda Q<\mu_{n}-\lambda\left(\mu_{n}-\varepsilon\right).

In a point $x_{r}$ where $f\left(x_{r}\right)-T_{n}\left(x_{r}\right)=Q\left(x_{r}\right)=-\mu_{n}$, we have $-\mu_{n}\leq f-T_{n}<-\mu_{n}+\varepsilon,\ -\mu_{n}-\varepsilon<Q<-\mu_{n}+\varepsilon$ and along with (1.6) we get $-\mu_{n}+\lambda\left(\mu_{n}-\varepsilon\right)<f-T_{n}-\lambda Q<-\mu_{n}+\varepsilon+\lambda\left(\mu_{n}+\varepsilon\right)<0$. It means that in every interval $I_{r}$

\left|f-T_{n}-\lambda Q\right|<\mu_{n}-\lambda\left(\mu_{n}-\varepsilon\right)<\mu_{n}.

From our initial hypothesis, it follows that in all points of the closed domain obtained taking out the intervals $I_{r}$ from $\left[a,b\right]$ , we have

\left|f-T_{n}\right|\leq\mu^{\prime}<\mu_{n},

where $\mu^{\prime}$ is a fixed number. If we take $\lambda$ small enough such that

\lambda<\frac{\mu_{n}-\mu^{\prime}}{2M\left(\left|Q\right|\right)}

(1.7)

we will additionally have

\left|\lambda Q\right|<\frac{\mu_{n}-\mu^{\prime}}{2},

\left|f-T_{n}-\lambda Q\right|\leq\left|f-T_{n}\right|+\left|\lambda Q\right|<\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}

outside the intervals $I_{r}$. It means that everywhere in the interval $\left[a,b\right]$ we have

\left|f-T_{n}-\lambda Q\right|<\mu_{n}.

Thus, if $\lambda$ verifies the inequalities (1.6) and (1.7) the polynomial $T_{n}+\lambda Q$ provides a better approximation which is contrary to the hypothesis. The following property follows:

The difference $f\left(x\right)-T_{n}\left(x\right)$ attains the values $\pm\mu_{n}$ in at least $n+2$ points.

1.10 The previous result revisited.

We can supplement the previous result. The difference $f\left(x\right)-T_{n}\left(x\right)$ must attain both values $+\mu_{n}$ and $-\mu_{n}$. If we suppose, for instance, that $+\mu_{n}$ is not attained, then we would have everywhere

-\mu_{n}\leq f-T_{n}\leq\mu^{\prime}<\mu_{n},

$\mu^{\prime}$ being a fixed number. Taking a positive constant $\lambda$ we can write

-\mu_{n}+\lambda\leq f-T_{n}+\lambda\leq\mu^{\prime}+\lambda.

Thus, if we take $\lambda<\mu_{n}-\mu^{\prime}$ , then everywhere we have

\left|f-T_{n}+\lambda\right|<\mu_{n}.

It means that the polynomial $T_{n}-\lambda$ provides a better approximation, which is a contradiction. Moreover, we can precisely estimate the number of points where $\mu_{n}$ and respectively $-\mu_{n}$ are actually attained. Let’s suppose, for instance, that

f\left(x_{r}\right)-T_{n}\left(x_{r}\right)=\mu_{n},\ r=1,2,\ldots,m,

and in all the other points the following double inequality is valid

-\mu_{n}\leq f-T_{n}<\mu_{n}.

Consider again intervals $I_{r}$ centered at $x_{r}$ and of length $\delta_{r}$ such that the intervals $I_{r}$ are disjoint. Let $x_{r}^{\prime},x_{r}^{\prime\prime}$ be the end points of the interval $I_{r}$ and let us define the polynomial

Q\left(x\right)=\left(x-x_{1}^{\prime}\right)\left(x-x_{1}^{\prime\prime}\right)\left(x-x_{2}^{\prime}\right)\left(x-x_{2}^{\prime\prime}\right)\ldots\left(x-x_{m}^{\prime}\right)\left(x-x_{m}^{\prime\prime}\right).

We have $Q\left(x\right)<0$ in the open interval $I_{r}$ and $Q\left(x\right)>0$ outside the closed intervals $I_{r}$ . We can take $\delta$ small enough such that for $\delta_{r}\leq\delta$ , in the intervals $I_{r}$

\mu^{\prime}\leq f-T_{n}\leq\mu_{n},

$\mu^{\prime}$ being a positive number such that $\mu^{\prime}<\mu_{n}$ . If the positive number $\lambda$ verifies the inequality

\lambda<\frac{\mu^{\prime}}{M\left(\left|Q\right|\right)},

(1.8)

we have in the intervals $I_{r}$

0<\mu^{\prime}+\lambda Q\leq f-T_{n}+\lambda Q\leq\mu_{n}+\lambda Q<\mu_{n}.

The last inequality is justified because equality could hold only in a point where we would have simultaneously $f-T_{n}=\mu_{n}$ and $Q=0$. But by construction such points do not exist. Everywhere in the closed domain $\left[a,b\right]$ minus the intervals $I_{r}$, we have

-\mu_{n}\leq f-T_{n}\leq\mu^{\prime\prime}<\mu_{n},

$\mu^{\prime\prime}$ being fixed. Taking $\lambda$ such that

\lambda<\frac{\mu_{n}-\mu^{\prime\prime}}{2M\left(\left|Q\right|\right)},

(1.9)

we have in this domain

-\mu_{n}<-\mu_{n}+\lambda Q\leq f-T_{n}+\lambda Q<\mu^{\prime\prime}+\frac{\mu_{n}-\mu^{\prime\prime}}{2}=\frac{\mu_{n}+\mu^{\prime\prime}}{2}<\mu_{n}.

We can justify the first inequality as above.

For $\lambda$ obeying the inequalities (1.8) and (1.9) we have everywhere in the interval $\left[a,b\right]$

\left|f-T_{n}+\lambda Q\right|<\mu_{n}

and we see that the polynomial $T_{n}-\lambda Q$ provides a better approximation than $\mu_{n}.$ The polynomial $Q\left(x\right)$ has degree $2m$ and we come to a contradiction if $2m\leq n$. If the $x_{r}$ were points where $-\mu_{n}$ is attained, we could make absolutely analogous considerations, so after all we can state the following property: The difference $f\left(x\right)-T_{n}\left(x\right)$ attains the value $\mu_{n}$ in at least $\left[\frac{n+2}{2}\right]$ points and the value $-\mu_{n}$ in at least $\left[\frac{n+2}{2}\right]$ points, where $\left[\alpha\right]$ signifies the largest integer less than or equal to $\alpha$. The properties analyzed in Sections 1.9 and 1.10 have been elegantly improved by E. Borel, as we will see below.

1.11 The set of polynomials $T_{n}$ .

Let’s suppose that the function $f\left(x\right)$ admits two distinct polynomials $T_{n}$ . If $P,P_{1}$ are these two polynomials we have

M\left(\left|f-P\right|\right)=M\left(\left|f-P_{1}\right|\right)=\mu_{n}.

If $\alpha,\beta$ are two positive numbers we can write

\mu_{n}\leq M\left(\left|f-\frac{\alpha P+\beta P_{1}}{\alpha+\beta}\right|\right)=M\left(\left|\frac{\alpha\left(f-P\right)}{\alpha+\beta}+\frac{\beta\left(f-P_{1}\right)}{\alpha+\beta}\right|\right)\leq\frac{\alpha M\left(\left|f-P\right|\right)+\beta M\left(\left|f-P_{1}\right|\right)}{\alpha+\beta}=\mu_{n}

(1.10)

\therefore M\left(\left|f-\frac{\alpha P+\beta P_{1}}{\alpha+\beta}\right|\right)=\mu_{n}.

It means that the polynomial $\frac{\alpha P+\beta P_{1}}{\alpha+\beta}$ is another $T_{n}$ polynomial and consequently we can state:

If a bounded function admits two distinct polynomials $T_{n}$, it admits infinitely (uncountably) many such polynomials.

To each polynomial $P\left(x\right)=a_{0}x^{n}+a_{1}x^{n-1}+\cdots+a_{n}$ we can assign a point $A$ of coordinates $a_{0},a_{1},\ldots,a_{n}$ from the $n+1$ dimensional Euclidean space. As a consequence of our previous results we can state the following:

The points $A$ corresponding to the polynomials $T_{n}$ attached to a bounded function form a convex, bounded and closed domain.

If the polynomial $T_{n}$ is unique this domain reduces to a single point. If the interval $\left[a,b\right]$ is symmetric with respect to the origin, i.e., $a=-b$, and if the function $f\left(x\right)$ is even, i.e., $f\left(-x\right)=f\left(x\right)$, then there exists an even polynomial $T_{n}$. Indeed, it is easy to observe that $T_{n}\left(-x\right)$ is also a $T_{n}$ polynomial. Consequently the polynomial $\frac{T_{n}\left(x\right)+T_{n}\left(-x\right)}{2}$, which is even, is also a $T_{n}$ polynomial. In this situation $\mu_{2n+1}\left(f\right)=\mu_{2n}\left(f\right)$. If the function is odd, i.e., $f\left(-x\right)=-f\left(x\right)$, there exists an odd polynomial $T_{n}$. In this case $\mu_{2n}\left(f\right)=\mu_{2n-1}\left(f\right)$.

1.12 The uniqueness of Chebyshev’s polynomials.

The previous discussion enables us to state the following important conclusion. If $P,P_{1}$ are two distinct polynomials $T_{n}$, the polynomial $P_{2}=\frac{P+P_{1}}{2}$ is also a $T_{n}$ polynomial. The inequality (1.10) shows that in a point $x^{\prime}$ where we have $f\left(x^{\prime}\right)-P_{2}\left(x^{\prime}\right)=\pm\mu_{n}$, we must also have

f\left(x^{\prime}\right)-P\left(x^{\prime}\right)=f\left(x^{\prime}\right)-P_{1}\left(x^{\prime}\right)=\pm\mu_{n}

\therefore P\left(x^{\prime}\right)=P_{1}\left(x^{\prime}\right).

According to the above properties the polynomials $P,P_{1}$ coincide in at least $n+2$ points. It means that they are identical. The following property is now fairly clear:

A continuous function $f\left(x\right)$ admits a unique polynomial of the best approximation of degree $n$ .

Actually the uniqueness follows from the property proved at Sect. 1.9. More exactly, this uniqueness follows solely from the fact that $|f-T_{n}|$ attains its maximum in at least $n+1$ points. Indeed, two polynomials of degree $n$ which coincide in $n+1$ points are identical.

If the interval $\left[a,b\right]$ is symmetric with respect to the origin and $f\left(x\right)$ is an even function then $T_{n}\left(x\right)$ is also even and $T_{2n+1}\equiv T_{2n}$ . If the function is odd the polynomial $T_{n}\left(x\right)$ has the same property and $T_{2n}\equiv T_{2n-1}$ .

If a function is not continuous, the polynomial $T_{n}$ generally is not unique. We observe that $T_{0}$ is always unique and equals $\frac{M\left(f\right)+m\left(f\right)}{2}$. Let’s introduce the function

f\left(x\right)=\left\{\begin{array}[c]{c}-1,\ -1\leq x<0\\ \ \ \ 1,\ \ \ \ \ \ 0\leq x\leq 1\end{array}\right.

We must have $\mu_{n}\geq 1$. But the null polynomial provides the approximation $1$, so that $\mu_{n}=1$ for every $n$. All polynomials $T_{n}$ must vanish at the origin. The polynomials $Cx$, where $C$ is a constant, are $T_{n}$ polynomials for $0\leq C\leq 2$ and for any $n>0$.
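The non-uniqueness in this example can be observed numerically. The sketch below samples the step function on a grid of $\left[-1,1\right]$ (the grid and the names `step` and `error_of_line` are assumptions made for illustration) and estimates the error of the polynomials $Cx$ for several values of $C$.

```python
def step(x):
    """The discontinuous function of the example: -1 on [-1, 0), +1 on [0, 1]."""
    return -1.0 if x < 0 else 1.0

grid = [i / 1000 - 1.0 for i in range(2001)]  # sample of [-1, 1], contains 0

def error_of_line(C):
    """Sampled estimate of M(|f - Cx|) for the step function f."""
    return max(abs(step(x) - C * x) for x in grid)

# Values of C inside and outside the range 0 <= C <= 2 stated in the text.
errors = {C: error_of_line(C) for C in (0.0, 0.5, 1.0, 2.0, 2.5)}
```

For $C$ between $0$ and $2$ the sampled error equals $1=\mu_{n}$, while for $C=2.5$ it is larger, in agreement with the statement above.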

Chapter 2 Second lesson. The results of E. Borel.

2.1 The difference $f\left(x\right)-P\left(x\right)$ .

We will suppose that the function $f\left(x\right)$ is continuous and we will take a polynomial $P\left(x\right)$ of degree $n$. Let’s consider the difference $\phi\left(x\right)=f\left(x\right)-P\left(x\right)$, which is also a continuous function.

We will say that a point of the interval $\left[a,b\right]$ is an $x^{\prime}$ point if $\phi\left(x^{\prime}\right)=M\left(\left|\phi\right|\right)$ and an $x^{\prime\prime}$ point if $\phi\left(x^{\prime\prime}\right)=-M\left(\left|\phi\right|\right)$ .

Let now $\varepsilon<\frac{M\left(\left|\phi\right|\right)}{2}$ be a positive number and $\delta^{\prime}>0$ another number such that the oscillation of $\phi\left(x\right)$ in any interval shorter than $\delta^{\prime}$ is less than $\varepsilon$. Let’s divide the interval $\left[a,b\right]$ into $r$ subintervals

I_{1},I_{2},\ldots,I_{r}

(2.1)

of the same length $\delta$, which is smaller than $\delta^{\prime}$. An interval $I_{s}$ of (2.1) may or may not contain points $x^{\prime},x^{\prime\prime},$ but it cannot contain points of both kinds.

Let $I_{s_{1}}$ be the first interval in the sequence (2.1) which contains an $x^{\prime}$ or an $x^{\prime\prime}$ point. To fix ideas, let’s suppose that it contains one or more $x^{\prime}$ points. Let then $I_{s_{2}}$ be the first interval following $I_{s_{1}}$ which contains $x^{\prime\prime}$ points. Between $I_{s_{1}}$ and $I_{s_{2}}$ there exist at least three consecutive intervals which contain neither $x^{\prime}$ points nor $x^{\prime\prime}$ points. If we denote by $\xi_{1}$ the midpoint of the interval $I_{s_{2}-2}$, there are no $x^{\prime}$ or $x^{\prime\prime}$ points in an interval of length $3\delta$ centered at $\xi_{1}$. Let $I_{s_{3}}$ be the first interval following $I_{s_{2}}$ which contains $x^{\prime}$ points. Let $\xi_{2}$ be the midpoint of $I_{s_{3}-2}$. The point $\xi_{2}$ enjoys the same properties as $\xi_{1}$. Working analogously along all the intervals of (2.1) we find the sequence $a,\xi_{1},\xi_{2},\ldots,\xi_{m-1},b,$ which determines a sequence of $m$ successive closed intervals

L_{1},L_{2},\ldots,L_{m}.

(2.2)

These intervals enjoy the following properties:

1°. There exists at least one interval $L_{s}$.
2°. The division points $\xi_{s}$ are separated from the points $x^{\prime}$ and $x^{\prime\prime}$ by segments of length $\frac{3}{2}\delta$.
3°. Each interval $L_{s}$ contains $x^{\prime}$ or $x^{\prime\prime}$ points. If $L_{s}$ contains $x^{\prime}$ points, then the intervals $L_{s-1},L_{s+1}$ contain $x^{\prime\prime}$ points.

In an interval $L_{s}$ which contains $x^{\prime}$ points, $m\left(\phi\right)$ cannot equal $-M\left(\left|\phi\right|\right)$; and in an interval containing $x^{\prime\prime}$ points, $M\left(\phi\right)$ cannot equal $M\left(\left|\phi\right|\right)$. We deduce that there exists a positive $\eta$ such that in every interval $L_{s}$ we have

\phi>-M\left(\left|\phi\right|\right)+\eta\quad\text{or}\quad\phi<M\left(\left|\phi\right|\right)-\eta

according to the fact that $L_{s}$ contains $x^{\prime}$ or $x^{\prime\prime}$ points.

2.2 The fundamental property of $T_{n}$ polynomials.

If we take $\phi\left(x\right)=f\left(x\right)-T_{n}\left(x\right)$, then $M\left(\left|\phi\right|\right)=\mu_{n}$. Let’s suppose that the number $m$ of intervals in (2.2) is less than $n+2$, i.e., $m\leq n+1$. Under these conditions the polynomial

Q\left(x\right)=\left(x-\xi_{1}\right)\left(x-\xi_{2}\right)\ldots\left(x-\xi_{m-1}\right)

has degree $m-1$, hence at most $n$. Let’s determine the constant $\lambda$ such that $\lambda Q>0$ in the interior of the intervals $L_{s}$ which contain $x^{\prime}$ points (and hence $\lambda Q<0$ in the interior of those which contain $x^{\prime\prime}$ points) and

\left|\lambda\right|<\frac{\eta}{M\left(\left|Q\right|\right)}

where $\eta$ is the number found at the end of the previous Section. In every point of an interval $L_{s}$ which contains $x^{\prime}$ points, we have

-\mu_{n}+\eta-\eta<f-T_{n}-\lambda Q<\mu_{n},

and if $L_{s}$ contains $x^{\prime\prime}$ points

-\mu_{n}<f-T_{n}-\lambda Q<\mu_{n}-\eta+\eta.

Thus, in the whole interval $\left[a,b\right]$ we have

\left|f-T_{n}-\lambda Q\right|<\mu_{n},

which contradicts the fact that $T_{n}$ is a best approximation polynomial of degree $n$ . Consequently,

If $T_{n}$ is the best approximation polynomial of degree $n$ for the continuous function $f\left(x\right)$ , the difference $f\left(x\right)-T_{n}\left(x\right)$ attains the values $\pm\mu_{n}$ in at least $n+2$ consecutive points with alternating signs.
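The alternation property can be checked numerically on a small case. The sketch below assumes the classical example $f\left(x\right)=x^{2}$ on $\left[-1,1\right]$ with $n=1$ and $T_{1}\left(x\right)=\frac{1}{2}$ (the value that follows from Sect. 2.9); the grid size and the variable names are illustrative choices only.

```python
# Error phi = f - T_1 for f(x) = x^2 on [-1, 1], assuming T_1(x) = 1/2.
def phi(x):
    return x * x - 0.5

n = 1
grid = [-1 + 2 * i / 1000 for i in range(1001)]
mu = max(abs(phi(x)) for x in grid)          # grid estimate of mu_1
alternation_points = [-1.0, 0.0, 1.0]        # n + 2 = 3 consecutive points
alternation_values = [phi(x) for x in alternation_points]

print(mu)                  # largest deviation found on the grid
print(alternation_values)  # values of f - T_1 at the three points
```

The three values are $+\mu_{1},-\mu_{1},+\mu_{1}$ , so the difference attains $\pm\mu_{1}$ at $n+2$ consecutive points with alternating signs.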

2.3 The first Borel’s theorem.

Let $P$ be a polynomial of degree $n$ , distinct from $T_{n}$ , and let’s suppose that the difference $f\left(x\right)-P\left(x\right)$ attains the values $\pm M\left(\left|f-P\right|\right)$ in at least $n+2$ consecutive points with alternating signs. Let $x_{1}^{\prime}<x_{1}^{\prime\prime}<x_{2}^{\prime}<x_{2}^{\prime\prime}<\ldots$ be $n+2$ points where $\pm M\left(\left|f-P\right|\right)$ is alternately attained, $x_{r}^{\prime}$ being $x^{\prime}$ points and $x_{r}^{\prime\prime}$ being $x^{\prime\prime}$ points (the sequence could also start with an $x^{\prime\prime}$ point). We have

M\left(\left|f-T_{n}\right|\right)<M\left(\left|f-P\right|\right).

If we introduce the function

\psi\left(x\right)=\left(f\left(x\right)-P\left(x\right)\right)-\left(f\left(x\right)-T_{n}\left(x\right)\right)=T_{n}\left(x\right)-P\left(x\right)

it follows

\psi\left(x_{1}^{\prime}\right)>0,\ \psi\left(x_{1}^{\prime\prime}\right)<0,\ \psi\left(x_{2}^{\prime}\right)>0,\ \psi\left(x_{2}^{\prime\prime}\right)<0,\ldots

and thus the function $\psi\left(x\right)$ vanishes at least $n+1$ times. But this function is a polynomial of degree $n$ , so it vanishes identically and we get $T_{n}\equiv P$ , contrary to our assumption. With these results we can state the following theorem which will be called the first Borel’s theorem:

A polynomial $P$ is a $T_{n}$ polynomial of the best approximation for a continuous function $f\left(x\right)$ , if and only if the difference $f-P$ attains its maximum absolute value in at least $n+2$ consecutive points with alternating signs. This property can be formulated alternatively in the following way:

Let $x_{r}$ be the points where the difference $\left|f-P\right|$ attains its maximum value. The polynomial $P$ is a $T_{n}$ polynomial of the best approximation for the continuous function $f\left(x\right)$ , if and only if there does not exist a polynomial $Q\left(x\right)$ of degree $n$ which, at the points $x_{r}$ , takes nonvanishing values of the same sign as $f\left(x_{r}\right)-P\left(x_{r}\right)$ .

The condition is sufficient. If $P\left(x\right)$ is not a $T_{n}$ polynomial of the best approximation, we can write

\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|>\left|f\left(x_{r}\right)-T_{n}\left(x_{r}\right)\right|

and so

sg\left(T_{n}\left(x_{r}\right)-P\left(x_{r}\right)\right)=sg\left(f\left(x_{r}\right)-P\left(x_{r}\right)\right)

because

T_{n}-P=\left(f-P\right)-\left(f-T_{n}\right).

It follows that

Q\left(x\right)=T_{n}\left(x\right)-P\left(x\right)

contradicts our hypothesis.

The condition is necessary. Among the points $x_{r}$ we can choose $n+2$ consecutive points $x_{1}<x_{2}<\ldots<x_{n+2}$ , where $\pm\mu_{n}$ is alternately attained by the difference $f-T_{n}$ .

Let $Q\left(x\right)=a_{0}x^{n}+a_{1}x^{n-1}+\cdots+a_{n}$ be a polynomial such that

sg\,Q\left(x_{r}\right)=sg\left(f\left(x_{r}\right)-T_{n}\left(x_{r}\right)\right),\ \ \ \ r=1,2,\ldots,n+2.

We must have

a_{0}x_{r}^{n}+a_{1}x_{r}^{n-1}+\cdots+a_{n}=Q\left(x_{r}\right),\ \ \ \ r=1,2,\ldots,n+2.

(2.3)

Actually we have a system of $n+2$ equations involving only $n+1$ unknowns $a_{0},a_{1},\ldots,a_{n}$ . Its compatibility implies the fact that its characteristic determinant vanishes. Let’s denote by

V\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\right)=\left|1\ \ \alpha_{r}\ \ \alpha_{r}^{2}\ \ldots\ \alpha_{r}^{k-1}\right|,\ \ \ \ r=1,2,\ldots,k

(2.4)

the Vandermonde determinant of the numbers $\alpha_{1},\alpha_{2},\ldots,\alpha_{k}.$

If $\alpha_{1}<\alpha_{2}<\cdots<\alpha_{k}$ then $V\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\right)>0$ . The characteristic determinant of system (2.3) equals, possibly with the exception of a sign, the sum

\sum_{r=1}^{n+2}\left|Q\left(x_{r}\right)\right|V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)

and does not vanish. Thus, the system (2.3) is incompatible and the theorem is proved. It is possible to show that the first Borel’s theorem follows from the above property, which means that both statements are equivalent. From the previous theorem it follows that if the number $m$ of intervals (2.2) is larger than $n+2$ we have

T_{n}\equiv T_{n+1}\equiv\cdots\equiv T_{m-2}.

2.4 On the distribution of zeros of $T_{n}-T_{n-1}$ polynomials.

From the previous results another interesting property follows. Let’s suppose that the polynomials $T_{n-1},T_{n}$ are not identical, so that $\mu_{n-1}>\mu_{n}$ . Let $x_{1}^{\prime}<x_{1}^{\prime\prime}<x_{2}^{\prime}<\cdots$ be the $n+1$ points where $f-T_{n-1}$ alternately attains $\pm\mu_{n-1}$ . If we define

\psi\left(x\right)=\left(f-T_{n-1}\right)-\left(f-T_{n}\right)=T_{n}-T_{n-1},

we get

\psi\left(x_{1}^{\prime}\right)>0,\ \psi\left(x_{1}^{\prime\prime}\right)<0,\ \psi\left(x_{2}^{\prime}\right)>0,\ldots

and consequently $\psi\left(x\right)$ vanishes in at least $n$ distinct points in $\left[a,b\right]$ . Thus we have the following property:

If $T_{n-1},T_{n}$ are two consecutive distinct polynomials of the best approximation of a continuous function, the equation $T_{n}-T_{n-1}=0$ has only real and distinct roots, all lying in $\left(a,b\right)$ .

2.5 The $T_{n}$ polynomials for functions of order $n$ .

Let’s go back to the notation (2.4) for the Vandermonde determinant. Let’s denote by $U\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k};f\right)$ the determinant obtained from $V\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\right)$ by replacing the entries in the last column respectively with $f\left(\alpha_{1}\right),f\left(\alpha_{2}\right),\ldots,f\left(\alpha_{k}\right),$ and thus

U\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k};f\right)=\left|1\ \ \alpha_{r}\ \ \alpha_{r}^{2}\ \ldots\ \alpha_{r}^{k-2}\ \ f\left(\alpha_{r}\right)\right|,\ \ \ \ r=1,2,\ldots,k

(2.5)

The ratio

\left[\alpha_{1},\alpha_{2},\ldots,\alpha_{k};f\right]=\frac{U\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k};f\right)}{V\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\right)}

is called the divided difference of order $k-1$ of the function $f\left(x\right)$ on the points $\alpha_{1},\alpha_{2},\ldots,\alpha_{k}$ . It is clear that this divided difference is symmetric with respect to the points $\alpha_{1},\alpha_{2},\ldots,\alpha_{k}$ .
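The ratio $U/V$ can be evaluated directly for small examples. The following is a minimal sketch in exact rational arithmetic; the helper names `det`, `V`, `U`, `divided_difference` and `recursive_dd` are illustrative and not part of the text, and the recursive formula is the usual textbook definition, included only as a cross-check.

```python
from fractions import Fraction

def det(rows):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return sign * result

def V(points):
    """Vandermonde determinant (2.4): row r is 1, a_r, ..., a_r^(k-1)."""
    k = len(points)
    return det([[p ** j for j in range(k)] for p in points])

def U(points, f):
    """Determinant (2.5): last column of V replaced by the values f(a_r)."""
    k = len(points)
    return det([[p ** j for j in range(k - 1)] + [f(p)] for p in points])

def divided_difference(points, f):
    return U(points, f) / V(points)

def recursive_dd(points, f):
    """Classical recursive definition, used here only for comparison."""
    if len(points) == 1:
        return Fraction(f(points[0]))
    return (recursive_dd(points[1:], f) - recursive_dd(points[:-1], f)) / (points[-1] - points[0])

cube = lambda x: x ** 3
pts = [0, 1, 3, 4]
print(divided_difference(pts, cube), recursive_dd(pts, cube))
print(divided_difference([3, 0, 4, 1], cube))   # same points, permuted
```

Permuting the points changes the sign of $U$ and of $V$ in the same way, which is why the ratio is symmetric.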

If the divided difference $\left[x_{1},x_{2},\ldots,x_{n+2};f\right]$ of the function $f\left(x\right)$ does not change sign for any $n+2$ distinct points $x_{1},x_{2},\ldots,x_{n+2}$ from $\left[a,b\right]$ we will say that the function is of order $n$ in this interval. More exactly, the function $f\left(x\right)$ is convex, nonconcave, polynomial, nonconvex or concave of order $n$ in $\left(a,b\right)$ if we have

\left[x_{1},x_{2},\ldots,x_{n+2};f\right]>,\ \geq,\ =,\ \leq\ \text{or}\ <0

in this interval. (For the properties of these functions one can see Tiberiu POPOVICIU, ”Sur quelques propriétés des fonctions d’une ou de deux variables réelles”, Thèse, Paris (June 1933), or Mathematica vol. VIII, pp. 1-86.) The polynomial function of order $n$ is a polynomial of degree $n$ . Conversely, any polynomial of degree $n$ is a (polynomial) function of order $n$ . The convexity character of order $n$ of a function does not change by adding a polynomial of degree $n$ . The functions defined in this way have the following property: A function of order $n$ cannot take nonvanishing values with alternating signs in more than $n+2$ consecutive points. The proof is based on the formula

U\left(\alpha_{1},\alpha_{2},\ldots,\alpha_{k};f\right)=\sum_{j=1}^{k}\left(-1\right)^{k-j}f\left(\alpha_{j}\right)V\left(\alpha_{1},\ldots,\alpha_{j-1},\alpha_{j+1},\ldots,\alpha_{k}\right)

(2.6)

which will be useful later. If the property were not true there would exist $n+3$ points $x_{1}<x_{2}<\ldots<x_{n+3}$ where $f\left(x\right)$ would take nonvanishing values with alternating signs. Thus, we would have

\left[sgf\left(x_{r}\right)\right]\cdot\left[sgf\left(x_{r+1}\right)\right]=-1,\ r=1,2,\ldots,n+2.

But using the formula (2.6) we have

sg\left[x_{1},x_{2},\ldots,x_{n+2};f\right]=sg\ f\left(x_{n+2}\right),\ \ \ \ sg\left[x_{2},x_{3},\ldots,x_{n+3};f\right]=sg\ f\left(x_{n+3}\right)

which contradicts the convexity property. Our statement is thus proved. We have made the restrictive hypothesis that the function $f\left(x\right)$ does not vanish in the considered points. One can easily find how this statement can be modified when this hypothesis is dropped. We need this property only in a formal way. The previous property applies also to the function $f-P$ , where $P$ is a polynomial of degree $n$ . In particular, we will apply the above property to the function $f-T_{n}$ at the points where this difference attains the values $\pm\mu_{n}$ .

If $T_{n}$ is the best approximation polynomial of degree $n$ of the continuous function $f\left(x\right)$ of order $n$ (which is not a polynomial one) then there exist $n+2$ and only $n+2$ consecutive points where the difference $f-T_{n}$ attains the values $\pm\mu_{n}$ with alternating signs.

In other words, we can say that: if the continuous function $f\left(x\right)$ is of order $n$ (and it is not a polynomial one) the polynomials $T_{n},\ T_{n+1}$ are for sure distinct. In this case $T_{n+1}$ effectively has degree $n+1$ and $\mu_{n+1}<\mu_{n}$ .

2.6 The second Borel’s Theorem.

E. Borel showed that the correspondence between a continuous function and its best approximation polynomial is continuous. Let $f$ and $f^{\ast}$ be two continuous functions and $T_{n},T_{n}^{\ast}$ their best approximation polynomials of degree $n$ . Let $x_{1}^{\prime}<x_{1}^{\prime\prime}<x_{2}^{\prime}<x_{2}^{\prime\prime}<\ldots$ be $n+2$ points where $f-T_{n}$ takes alternately the values $\pm\mu_{n}$ . We can write

M\left(\left|f^{\ast}-T_{n}^{\ast}\right|\right)\leq M\left(\left|f^{\ast}-T_{n}\right|\right)\leq M\left(\left|f-T_{n}\right|\right)+M\left(\left|f-f^{\ast}\right|\right)

\therefore M\left(\left|f^{\ast}-T_{n}^{\ast}\right|\right)\leq\mu_{n}+\eta,

where we have defined for simplicity $\eta=M\left(\left|f-f^{\ast}\right|\right)$ . We have

T_{n}-T_{n}^{\ast}=f^{\ast}-T_{n}^{\ast}-\left(f-T_{n}\right)+\left(f-f^{\ast}\right),

in a point $x_{r}^{\prime}$

T_{n}-T_{n}^{\ast}\leq\mu_{n}+\eta-\mu_{n}+\eta=2\eta,

and in a point $x_{r}^{\prime\prime}$

T_{n}-T_{n}^{\ast}\geq-\mu_{n}-\eta+\mu_{n}-\eta=-2\eta.

We intend to show that at least in one of the intervals $\left(x_{1}^{\prime},x_{1}^{\prime\prime}\right),\left(x_{1}^{\prime\prime},x_{2}^{\prime}\right),\left(x_{2}^{\prime},x_{2}^{\prime\prime}\right),\ldots$ we can write the inequality

\left|T_{n}-T_{n}^{\ast}\right|\leq 2\eta.

(2.7)

Let’s suppose the contrary. Then there exist points $x_{1},x_{2},\ldots,x_{n+1}$ , one in each of these $n+1$ intervals, such that

\left|T_{n}\left(x_{r}\right)-T_{n}^{\ast}\left(x_{r}\right)\right|>2\eta,\ r=1,2,\ldots,n+1.

It follows that

T_{n}\left(x_{r}^{\prime}\right)-T_{n}^{\ast}\left(x_{r}^{\prime}\right)\leq 2\eta,\ T_{n}\left(x_{r+1}^{\prime}\right)-T_{n}^{\ast}\left(x_{r+1}^{\prime}\right)\leq 2\eta.

Two possibilities can occur:

1 ${}^{0}.$: We can have at least one of the inequalities

T_{n}\left(x_{2r-1}\right)-T_{n}^{\ast}\left(x_{2r-1}\right)>2\eta,\ T_{n}\left(x_{2r}\right)-T_{n}^{\ast}\left(x_{2r}\right)>2\eta.

2 ${}^{0}.$: Or both inequalities

T_{n}\left(x_{2r-1}\right)-T_{n}^{\ast}\left(x_{2r-1}\right)<-2\eta,\ T_{n}\left(x_{2r}\right)-T_{n}^{\ast}\left(x_{2r}\right)<-2\eta

are satisfied.

The points $x_{2r-1},x_{2r}$ belong to the intervals $\left(x_{r}^{\prime},x_{r+1}^{\prime}\right)$ . One can see that in the case 1⁰ the difference $T_{n}-T_{n}^{\ast}$ has a (relative) maximum in this interval. In case 2⁰ we additionally take into account the relation

T_{n}\left(x_{r}^{\prime\prime}\right)-T_{n}^{\ast}\left(x_{r}^{\prime\prime}\right)\geq-2\eta

and because the point $x_{r}^{\prime\prime}$ belongs to the interval $\left(x_{r}^{\prime},x_{r+1}^{\prime}\right)$ we remark again that the polynomial $T_{n}-T_{n}^{\ast}$ has at least a maximum in this interval.

Thus the polynomial $T_{n}-T_{n}^{\ast}$ has at least one (relative) maximum in each of the intervals $\left(x_{1}^{\prime},x_{2}^{\prime}\right),\ \left(x_{2}^{\prime},x_{3}^{\prime}\right)\ldots$ . In the same way we can prove that this polynomial has at least one (relative) minimum in each of the intervals $\left(x_{1}^{\prime\prime},x_{2}^{\prime\prime}\right)$ , $\left(x_{2}^{\prime\prime},x_{3}^{\prime\prime}\right)\ldots$ . Our polynomial, which is by hypothesis of degree $n$ and not identically zero, has at least $n$ maxima and minima, which is impossible.

It is now proved that the inequality (2.7) is true in at least one of the $n+1$ intervals considered. Taking $n+1$ distinct points in such an interval and working as in Sect. 1.7 we will see that the coefficients of the polynomial $T_{n}-T_{n}^{\ast}$ are in absolute value less than a number $2\eta\lambda$ , where $\lambda$ is a fixed number. It follows that

M\left(\left|T_{n}-T_{n}^{\ast}\right|\right)<2\eta\lambda A,

where

A=M\left(\left|x\right|^{n}+\left|x\right|^{n-1}+\cdots+1\right).

If we take

\eta<\frac{\varepsilon}{2\lambda A},

we get

M\left(\left|T_{n}-T_{n}^{\ast}\right|\right)<\varepsilon.

We can now state a result which will be called the second Borel’s Theorem:

For any positive number $\varepsilon$ we can find another positive number $\delta$ such that the inequality

M\left(\left|f-f^{\ast}\right|\right)<\delta

implies

M\left(\left|T_{n}-T_{n}^{\ast}\right|\right)<\varepsilon.

2.7 A consequence of the previous Theorem.

From the previous theorem an important consequence follows. Let’s suppose that a sequence of continuous functions

f_{1}\left(x\right),f_{2}\left(x\right),\ldots,f_{m}\left(x\right),\ldots

(2.8)

converges uniformly to a continuous function $f\left(x\right)$ in the whole interval $\left[a,b\right]$ . The second Borel’s Theorem states that for a given positive $\varepsilon$ , there exists a positive $\delta$ such that

M\left(\left|f-f^{\ast}\right|\right)<\delta

implies

M\left(\left|T_{n}\left(x;f\right)-T_{n}\left(x;f^{\ast}\right)\right|\right)<\varepsilon.

But due to the uniform convergence there exists a number $A$ such that for $m>A$ we have

M\left(\left|f-f_{m}\right|\right)<\delta

and thus for $m>A$ we also have

M\left(\left|T_{n}\left(x;f\right)-T_{n}\left(x;f_{m}\right)\right|\right)<\varepsilon.

Consequently we can formulate the following result: If the sequence (2.8) of continuous functions converges uniformly to the continuous function $f\left(x\right)$ , then the sequence of polynomials $T_{n}\left(x;f_{1}\right),\ T_{n}\left(x;f_{2}\right),\ldots,T_{n}\left(x;f_{m}\right),\ldots$ converges to the best approximation polynomial of degree $n$ of the function $f\left(x\right)$ .

Of course the above sequence of polynomials is uniformly convergent. As a matter of fact, any convergent sequence of polynomials of the same degree is also uniformly convergent in the whole interval $\left[a,b\right]$ .

2.8 The computation of $T_{n}$ polynomial.

The previous results enable us to compute the polynomial $T_{n}$ with a desired approximation. If $f$ is a polynomial, the computation of $T_{n}$ is a purely algebraic problem. Indeed, if in an interior point of the interval $\left[a,b\right]$ we have $\left|f-T_{n}\right|=\mu_{n}$ , the derivative of the polynomial $f-T_{n}$ vanishes in this point. Let’s remark that the equality $\left|f-T_{n}\right|=\mu_{n}$ can also take place at an endpoint $a$ or $b$ , or even at both ends of the interval $\left[a,b\right]$ . The polynomial $T_{n}$ and the quantity $\mu_{n}$ will be determined from the system

\left\{\begin{array}[c]{c}f\left(x_{r}\right)-T_{n}\left(x_{r};f\right)=\left(-1\right)^{r}\rho\ \ \ \ r=1,2,\ldots,n+2\\ f^{\prime}\left(x_{r}\right)-T_{n}^{\prime}\left(x_{r};f\right)=0\ \ \ \ x_{1}<x_{2}<\cdots<x_{n+2}\end{array}\right.

(2.9)

or from one of the systems obtained by supposing that one or both of the equalities $x_{1}=a,x_{n+2}=b$ hold and suppressing from the second group of equations in (2.9) those corresponding to these indices. The system (2.9), along with the other three obtained from it, determines the coefficients of $T_{n}$ , the value of $\rho$ , and the points $x_{r}$ . Each system is well defined, i.e., the number of unknowns equals the number of equations. These systems admit a certain number of solutions which can be found algebraically. Among these solutions, a specific one provides the polynomial $T_{n}$ and the best approximation $\mu_{n}$ . It will result from the following considerations that the solution for which $\left|\rho\right|$ has the maximum value is the one which provides the polynomial $T_{n}$ .

We will prove below that for any continuous function $f\left(x\right)$ and any positive $\delta$ , we can find a polynomial $P\left(x\right)$ such that

M\left(\left|f-P\right|\right)<\delta.

In particular, for a positive $\varepsilon$ given a priori, we can find a polynomial $P$ such that

M\left(\left|T_{n}\left(x;f\right)-T_{n}\left(x;P\right)\right|\right)<\varepsilon.

Consequently, we can compute with a desired approximation the polynomials of the best approximation for a continuous function.

2.9 The best approximation of $x^{n+1}$ .

As an example, let’s compute the polynomial $T_{n}\left(x;x^{n+1}\right)$ in the interval $\left(-1,1\right)$ . We immediately observe that $\mu_{n}$ is attained at both ends $-1$ and $1$ , because the derivative of $f-T_{n}$ is a polynomial of degree $n$ , which cannot vanish in more than $n$ points, whereas $\left|f-T_{n}\right|=\mu_{n}$ holds in at least $n+2$ points. We must have

P^{2}-\mu_{n}^{2}=Q^{2}\left(x^{2}-1\right),

where by definition $P=x^{n+1}-T_{n}\left(x;x^{n+1}\right)$ and $Q$ is a polynomial of degree $n$ . The last equation expresses the fundamental property of $T_{n}$ polynomials. Differentiating this equality we get

P\cdot P^{\prime}=Q\cdot Q^{\prime}\left(x^{2}-1\right)+xQ^{2}=Q\left[Q^{\prime}\left(x^{2}-1\right)+xQ\right].

But $P$ is relatively prime to $Q$ , thus we can write $P^{\prime}=\lambda Q$ , where $\lambda$ is a constant (in fact $\lambda=\pm\left(n+1\right)$ ). Thus we have

\pm\lambda\sqrt{\mu_{n}^{2}-P^{2}}=P^{\prime}\sqrt{1-x^{2}}

\frac{dP}{\sqrt{\mu_{n}^{2}-P^{2}}}=\frac{\pm\lambda dx}{\sqrt{1-x^{2}}}

and now one can see that $P$ has the form

P=\pm\mu_{n}\cos\left(\lambda\arccos\ x+\alpha\right)

with $\alpha$ a constant. $P$ must be a polynomial of degree $n+1$ with the first term $x^{n+1}$ and thus $\alpha=0$ , and $\lambda=n+1$ , i.e.,

x^{n+1}-T_{n}\left(x;x^{n+1}\right)=\frac{1}{2^{n}}\cos\left(\left(n+1\right)\arccos\ x\right)

and

\mu_{n}\left(x^{n+1}\right)=\frac{1}{2^{n}}.

The polynomial $T_{n}\left(x;x^{n+1}\right)$ corresponding to an arbitrary interval $\left(a,b\right)$ can be obtained using a linear transformation and thus we find

x^{n+1}-T_{n}\left(x;x^{n+1}\right)=\frac{\left(b-a\right)^{n+1}}{2^{2n+1}}\cos\left(\left(n+1\right)\arccos\frac{2x-a-b}{b-a}\right)

(2.10)

and

\mu_{n}\left(x^{n+1}\right)=\frac{\left(b-a\right)^{n+1}}{2^{2n+1}}.

The polynomial (2.10) is the one which deviates least from zero among all the polynomials of degree $n+1$ which have the first term $x^{n+1}$ . This polynomial was determined for the first time by Chebyshev.
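The result on $\left(-1,1\right)$ can be verified numerically for a small degree. The sketch below builds the coefficients of $\cos\left(m\arccos x\right)$ from the standard three-term recurrence (a known identity, not derived in the text), divides by the leading coefficient to get the monic polynomial $x^{n+1}-T_{n}\left(x;x^{n+1}\right)$ , and evaluates it at the points $\cos\frac{k\pi}{n+1}$ ; the function names and the choice $n=3$ are illustrative.

```python
import math

def chebyshev_coeffs(m):
    """Coefficients (lowest degree first) of cos(m*arccos x),
    built with the recurrence C_{k+1} = 2x*C_k - C_{k-1}."""
    prev, cur = [1], [0, 1]
    if m == 0:
        return prev
    for _ in range(m - 1):
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def horner(coeffs, x):
    """Evaluate a polynomial given by coefficients, lowest degree first."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value

n = 3
t = chebyshev_coeffs(n + 1)      # cos(4 arccos x) = 8x^4 - 8x^2 + 1
lead = t[-1]                     # leading coefficient, expected 2^n
points = [math.cos(k * math.pi / (n + 1)) for k in range(n + 2)]
errors = [horner(t, x) / lead for x in points]   # x^{n+1} - T_n at n+2 points
grid_max = max(abs(horner(t, -1 + 2 * i / 2000) / lead) for i in range(2001))

print(t, lead)
print(errors)      # alternates between +1/2^n and -1/2^n
print(grid_max)    # does not exceed 1/2^n up to rounding
```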

Chapter 3 The third lesson. The results of Ch. de la Vallée Poussin.

3.1 The best approximation on $n+2$ points.

Let’s consider now a single-valued function $f\left(x\right)$ defined only on $n+2$ points, which form a set denoted by (E), namely

x_{1}<x_{2}<\cdots<x_{n+2}.

(3.1)

The maximum $M\left(\left|f-P\right|\right)$ will be defined by formula

M\left(\left|f-P\right|\right)=\max_{r=1,2,\ldots,n+2}\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|.

For all polynomials $P\left(x\right)$ of degree $n$ the expression $M\left(\left|f-P\right|\right)$ has a minimum denoted by $\rho_{n}\left(f\right)$ or simply $\rho_{n}$ . It is easy to show that this minimum is attained by at least a polynomial of degree $n$ . We will denote with $E_{n}\left(x\right)$ , or simply with $E_{n}$ , such a polynomial. $E_{n}$ is a polynomial of the best approximation of degree $n$ for the function $f$ on the $n+2$ points of (E) and $\rho_{n}$ is the best approximation for $f$ using polynomials of degree $n$ on these $n+2$ points.

Let $\xi_{0}$ be a point to the left of $x_{1}$ , $\xi_{n+2}$ a point to the right of $x_{n+2}$ , and $\xi_{r}$ the midpoint of the interval $\left(x_{r},x_{r+1}\right),\ r=1,2,\ldots,n+1$ .

From the sequence of points $\xi_{0},\xi_{1},\ldots,\xi_{n+1},\xi_{n+2}$ we can choose another sequence $\xi_{0},\xi_{j_{1}},\ldots,\xi_{j_{m-1}},\xi_{n+2}$ which determines $m$ consecutive intervals

L_{1},L_{2},\ldots,L_{m}

with the following properties:

1 ${}^{0}.$: There exists at least one interval $L_{s}$ .
2 ${}^{0}.$: Each interval contains points $x_{r}$ where $f-E_{n}=\rho_{n}$ or points $x_{r}$ where $f-E_{n}=-\rho_{n}$ but exclusively points of the same kind. If $L_{s}$ contains one type of points then $L_{s-1}$ and $L_{s+1}$ contain points of the opposite type. To fix the ideas let’s suppose that $L_{1}$ contains points where $f-E_{n}=\rho_{n}$ .

It follows immediately that in the intervals $L_{1},L_{3},L_{5},\ldots,$ we have

-\rho_{n}<f-E_{n}\leq\rho_{n},

and in the intervals $L_{2},L_{4},L_{6},\ldots,$ we have

-\rho_{n}\leq f-E_{n}<\rho_{n}.

Let’s suppose that $m\leq n+1$ and take the polynomial of degree $m-1\leq n$

Q\left(x\right)=\left(x-\xi_{j_{1}}\right)\left(x-\xi_{j_{2}}\right)\ldots\left(x-\xi_{j_{m-1}}\right)\ \ \ \ \left(m\leq n+1\right)

and we will determine the sign of the constant $\lambda$ such that $\lambda Q>0$ in the interval $L_{1}$ . The points (E) being a finite set we immediately observe that we can take $\lambda$ small enough in absolute value such that

-\rho_{n}<f-E_{n}-\lambda Q<\rho_{n},

at all points of (E), which contradicts the definition of $\rho_{n}$ . Hence $m=n+2$ , and this implies the theorem:

If $E_{n}$ is a best approximation polynomial of degree $n$ for the function $f$ on the $n+2$ points of $(E)$ , the difference $f-E_{n}$ takes equal and of contrary sign values in two consecutive points of $(E)$ .

We neglect here and in the subsequent part the case $\rho_{n}=0$ . In this case there exists a polynomial of degree $n$ which takes on the values $f\left(x_{r}\right)$ in the points $x_{r}$ .

3.2 The determination of the polynomial $E_{n}$ .

The property proved above shows immediately that the polynomial $E_{n}$ is uniquely determined. The computation of $E_{n}$ , along with the best approximation $\rho_{n}$ , is carried out solving the system

E_{n}\left(x_{r}\right)=f\left(x_{r}\right)+\left(-1\right)^{r}\rho_{n}^{\prime}\ \ \ \left(\rho_{n}=\left|\rho_{n}^{\prime}\right|\right),\ \ \ \ r=1,2,\ldots,n+2

which must be compatible. In order to find explicitly $\rho_{n}$ and $E_{n}$ we will use the notations (2.4) and (2.5) as well as formula (2.6). With these notations we have

\rho_{n}^{\prime}=\frac{\left(-1\right)^{n+1}U\left(x_{1},x_{2},\ldots,x_{n+2};f\right)}{\sum\limits_{r=1}^{n+2}V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)}

(3.2)

The polynomial $E_{n}$ will be determined using Lagrange’s interpolation formula

E_{n}=\sum_{r=1}^{n+2}\left[f\left(x_{r}\right)+\left(-1\right)^{r}\rho_{n}^{\prime}\right]\frac{G\left(x\right)}{\left(x-x_{r}\right)G^{\prime}\left(x_{r}\right)}

where

G\left(x\right)=\left(x-x_{1}\right)\left(x-x_{2}\right)\ldots\left(x-x_{n+2}\right).

We also have

G^{\prime}\left(x_{r}\right)=\frac{\left(-1\right)^{n+2-r}V\left(x_{1},x_{2},\ldots,x_{n+2}\right)}{V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)}

and thus we can write

E_{n}=\frac{\left(-1\right)^{n+2}}{V\left(x_{1},x_{2},\ldots,x_{n+2}\right)}\sum_{r=1}^{n+2}\left[\rho_{n}^{\prime}+\left(-1\right)^{r}f\left(x_{r}\right)\right]V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)\frac{G\left(x\right)}{x-x_{r}}.

Apparently, this polynomial is of degree $n+1$ but using (3.2) we see that the coefficient of $x^{n+1}$ vanishes. The best approximation $\rho_{n}$ equals

\rho_{n}=\frac{\left|U\left(x_{1},x_{2},\ldots,x_{n+2};f\right)\right|}{\sum\limits_{r=1}^{n+2}V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)}

(3.3)
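Formulas (3.2) and (3.3) are easy to evaluate exactly for a small set of points. The sketch below uses the product form of the Vandermonde determinant and the expansion (2.6); the function names and the example $f\left(x\right)=x^{2}$ on the points $-1,0,1$ (so $n=1$ ) are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations

def V(points):
    """Vandermonde determinant as the product of (x_j - x_i) over i < j."""
    result = Fraction(1)
    for xi, xj in combinations(points, 2):
        result *= Fraction(xj) - Fraction(xi)
    return result

def U(points, values):
    """Formula (2.6): expansion of U along its last column."""
    k = len(points)
    return sum((-1) ** (k - j) * Fraction(values[j - 1]) * V(points[:j - 1] + points[j:])
               for j in range(1, k + 1))

def rho_signed(points, values):
    """Formula (3.2); points must be increasing, len(points) = n + 2."""
    n = len(points) - 2
    denom = sum(V(points[:r - 1] + points[r:]) for r in range(1, n + 3))
    return (-1) ** (n + 1) * U(points, values) / denom

points = [-1, 0, 1]
values = [x * x for x in points]             # f(x) = x^2, n = 1
rho_p = rho_signed(points, values)           # signed quantity rho'_n
rho = abs(rho_p)                             # formula (3.3)
# Values E_n(x_r) = f(x_r) + (-1)^r rho'_n, r = 1, ..., n + 2
targets = [values[r - 1] + (-1) ** r * rho_p for r in range(1, len(points) + 1)]

print(rho, targets)
print(U(points, targets))   # vanishes when the targets lie on a polynomial of degree n
```

Here the three prescribed values coincide, so $E_{1}$ is a constant, in agreement with the remark that the coefficient of $x^{n+1}$ vanishes.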

3.3 The first theorem of Ch. de la Vallée Poussin.

Let’s suppose now that a polynomial $P\left(x\right)$ of degree $n$ is such that the numbers $f\left(x_{r}\right)-P\left(x_{r}\right)$ , $r=1,2,\ldots,n+2$ alternate in sign. We observe that the best approximation of $f$ equals that of $f-P$ , and thus formula (3.3) becomes

\rho_{n}=\frac{\sum\limits_{r=1}^{n+2}\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)}{\sum\limits_{r=1}^{n+2}V\left(x_{1},\ldots,x_{r-1},x_{r+1},\ldots,x_{n+2}\right)}

which is a mean value of the numbers $\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|$ and thus we can state the result which will be called the first Theorem of Ch. de la Vallée Poussin:

If a polynomial $P$ of degree $n$ is such that $f-P$ takes on values of contrary sign in two consecutive points of (E) then we have

\min_{r=1,2,\ldots,n+2}\left(\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|\right)<\rho_{n}<\max_{r=1,2,\ldots,n+2}\left(\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|\right)

supposing that the numbers $\left|f\left(x_{r}\right)-P\left(x_{r}\right)\right|$ are not all equal.

This property will be useful in concluding on the best approximation on a whole interval.
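A small exact computation illustrates the weighted-mean formula and the resulting two-sided bound. The sketch assumes $f\left(x\right)=x^{2}$ on the points $-1,0,1$ ( $n=1$ ) and a hypothetical trial polynomial $P\left(x\right)=\frac{x}{4}+\frac{1}{2}$ , chosen only because $f-P$ alternates in sign on these points.

```python
from fractions import Fraction
from itertools import combinations

def V(points):
    """Vandermonde determinant as the product of (x_j - x_i) over i < j."""
    result = Fraction(1)
    for xi, xj in combinations(points, 2):
        result *= Fraction(xj) - Fraction(xi)
    return result

points = [-1, 0, 1]                               # the set (E), n = 1
f = lambda x: Fraction(x) ** 2
P = lambda x: Fraction(x, 4) + Fraction(1, 2)     # hypothetical trial polynomial
errors = [f(x) - P(x) for x in points]            # signs +, -, +
# Weight of x_r: Vandermonde determinant of the remaining points
weights = [V(points[:r] + points[r + 1:]) for r in range(len(points))]
rho = sum(abs(e) * w for e, w in zip(errors, weights)) / sum(weights)

print(errors)
print(min(abs(e) for e in errors), rho, max(abs(e) for e in errors))
```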

3.4 The second theorem of Ch. de la Vallée Poussin.

Let’s consider the function $f$ defined and continuous on the interval $\left[a,b\right]$ . The first Borel’s theorem assures the existence of a set $\left(E^{\ast}\right)$ of $n+2$ points $x_{1}<x_{2}<\ldots<x_{n+2}$ such that the best approximation $\rho_{n}^{\ast}$ on these points equals the best approximation $\mu_{n}$ of $f$ on $\left[a,b\right]$ . Let $E_{n}$ be the polynomial of the best approximation of degree $n$ on $\left(E^{\ast}\right)$ . If $\left|f-E_{n}\right|\leq\rho_{n}^{\ast}$ in $\left[a,b\right]$ then the best approximation $\rho_{n}$ on any set (E) of $n+2$ points is at most equal to $\rho_{n}^{\ast}$ . Let’s suppose, on the contrary, that there exists a point $x^{\prime}$ such that $\left|f\left(x^{\prime}\right)-E_{n}\left(x^{\prime}\right)\right|>\rho_{n}^{\ast}$ . If $x^{\prime}$ lies between $x_{r}$ and $x_{r+1}$ , the difference $f-E_{n}$ has at $x^{\prime}$ the same sign as at $x_{r}$ or at $x_{r+1}$ . Using the results from the previous section, the best approximation is larger than $\rho_{n}^{\ast}$ on at least one of the sets of points

\left\{\begin{array}[c]{c}x_{1},\ldots,x_{r},x^{\prime},x_{r+2},\ldots,x_{n+2}\\ x_{1},\ldots,x_{r-1},x^{\prime},x_{r+1},\ldots,x_{n+2}.\end{array}\right.

(3.4)

The same thing happens if $x^{\prime}$ is placed outside the interval $\left(x_{1},x_{n+2}\right)$ .

On the other hand, formula (3.3) shows that $\rho_{n}$ is a continuous function of $x_{1},x_{2},\ldots,x_{n+2}$ and must attain a maximum for at least one set $(E)$ .

Taking again into account the first Borel’s theorem we can state the following property:

The best approximation $\mu_{n}$ of a continuous function $f\left(x\right)$ on an interval $\left[a,b\right]$ equals the best approximation on $n+2$ points belonging to this interval, these points being chosen such that $\rho_{n}$ has the largest possible value. In other words

\mu_{n}\left(f\right)=\max\rho_{n}\left(f\right).

This theorem is true even when the function $f\left(x\right)$ is defined on a finite number of points or on a finite and closed arbitrary set.
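The equality $\mu_{n}=\max\rho_{n}$ can be illustrated by brute force on a finite grid. The sketch below assumes $f\left(x\right)=x^{2}$ on $\left[-1,1\right]$ with $n=1$ , for which $U=V$ in formula (3.3), and searches all triples of a grid of step $\frac{1}{4}$ ; the grid and the function name are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

def rho_1_for_square(x1, x2, x3):
    """Formula (3.3) for f(x) = x^2 and n = 1, where U coincides with V."""
    v = (x2 - x1) * (x3 - x1) * (x3 - x2)
    minors = (x3 - x2) + (x3 - x1) + (x2 - x1)
    return v / minors

grid = [Fraction(i, 4) for i in range(-4, 5)]          # -1, -3/4, ..., 1
best_set = max(combinations(grid, 3), key=lambda t: rho_1_for_square(*t))
best_rho = rho_1_for_square(*best_set)

print(best_set, best_rho)
```

On this grid the maximum is reached on the points $-1,0,1$ and equals $\frac{1}{2}$ , the value of $\mu_{1}\left(x^{2}\right)$ given in Sect. 2.9.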

3.5 Applications to functions with bounded differences.

In some cases formula (3.3) provides estimates of the best approximation. We will say that the function $f\left(x\right)$ has the $n$th divided difference bounded in the interval $\left[a,b\right],$ if the quantity

\left[x_{1},x_{2},\ldots,x_{n+1};f\right],

defined in Sect. 2.5, remains bounded whenever $x_{1},x_{2},\ldots,x_{n+1}$ are $n+1$ arbitrary points in $\left[a,b\right]$ . The number

\Delta_{n}\left[f\right]=\max_{\left(a,b\right)}\left|\left[x_{1},x_{2},\ldots,x_{n+1};f\right]\right|

is called the $n$th bound or the bound of order $n$ of $f$ in the interval $\left[a,b\right]$ . Supposing we have $x_{1}<x_{2}<\ldots<x_{n+2}$ , formula (3.3) can be written as

\rho_{n}=\frac{V\left(x_{1},x_{2},\ldots,x_{n+2}\right)}{\sum\limits_{i=1}^{n+2}V\left(x_{1},\ldots,x_{i-1},x_{i+1},\ldots,x_{n+2}\right)}\left|\left[x_{1},x_{2},\ldots,x_{n+2};f\right]\right|.

But

\max_{\left[a,b\right]}\frac{V\left(x_{1},x_{2},\ldots,x_{n+2}\right)}{\sum\limits_{i=1}^{n+2}V\left(x_{1},\ldots,x_{i-1},x_{i+1},\ldots,x_{n+2}\right)}

equals the best approximation of $x^{n+1}$ using polynomials of degree $n$ . This maximum equals (Sect. 2.9)

\frac{\left(b-a\right)^{n+1}}{2^{2n+1}}

and thus:

If the function $f\left(x\right)$ has the $(n+1)$th divided difference bounded in the interval $\left[a,b\right]$ we have

\mu_{n}\left(f\right)\leq\frac{\left(b-a\right)^{n+1}}{2^{2n+1}}\Delta_{n+1}\left[f\right].

In particular, if $f$ admits a bounded derivative of order $n+1$ and if we denote by $\Delta_{0}\left[f^{\left(n+1\right)}\right]$ the maximum or the upper bound of $\left|f^{\left(n+1\right)}\right|$ in the interval $\left(a,b\right)$ we have

\Delta_{0}\left[f^{\left(n+1\right)}\right]=\left(n+1\right)!\Delta_{n+1}\left[f\right]

and consequently

\mu_{n}\left(f\right)\leq\frac{\left(b-a\right)^{n+1}}{2^{2n+1}\left(n+1\right)!}\Delta_{0}\left[f^{\left(n+1\right)}\right].
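As a numerical sanity check of this bound, one can evaluate $\rho_{n}$ on one particular set of $n+2$ points and compare it with the right-hand side. The sketch below assumes $f\left(x\right)=e^{x}$ on $\left[0,1\right]$ with $n=2$ , so that $\Delta_{0}\left[f^{\left(n+1\right)}\right]=e$ , and uses the points where the polynomial (2.10) attains its extreme values; the helper names and the recursive divided difference are illustrative.

```python
import math

def divided_difference(points, f):
    """Classical recursive divided difference (floating point)."""
    if len(points) == 1:
        return f(points[0])
    return (divided_difference(points[1:], f)
            - divided_difference(points[:-1], f)) / (points[-1] - points[0])

def V(points):
    """Vandermonde determinant as the product of (x_j - x_i) over i < j."""
    k = len(points)
    return math.prod(points[j] - points[i] for i in range(k) for j in range(i + 1, k))

a, b, n = 0.0, 1.0, 2
# Extreme points of the polynomial (2.10) on [a, b], in increasing order
points = [(a + b) / 2 - (b - a) / 2 * math.cos(k * math.pi / (n + 1)) for k in range(n + 2)]
ratio = V(points) / sum(V(points[:r] + points[r + 1:]) for r in range(n + 2))
rho = ratio * abs(divided_difference(points, math.exp))      # rho_n on these points
# Right-hand side of the bound, with max |f'''| = e on [0, 1]
bound = (b - a) ** (n + 1) / (2 ** (2 * n + 1) * math.factorial(n + 1)) * math.exp(b)

print(ratio, rho, bound)
```

On these points the ratio of determinants equals $\frac{\left(b-a\right)^{n+1}}{2^{2n+1}}$ up to rounding, and $\rho_{n}$ stays below the bound, as it must.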

3.6 Oscillation modulus of a function.

In order to estimate $\mu_{n}$ , as well as for the problem treated in the next lesson, we have to introduce the oscillation modulus $\omega\left(\delta\right)$ of a function $f\left(x\right)$ . This modulus is a function of $\delta,$ and is defined by

\omega\left(\delta\right)=\max\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|

where $x^{\prime},x^{\prime\prime}$ are any two points in the interval $\left(a,b\right)$ such that $\left|x^{\prime}-x^{\prime\prime}\right|\leq\delta$ .

$\omega\left(\delta\right)$ is a function defined for $\delta$ in the interval $0<\delta\leq b-a,$ nondecreasing and nonnegative. The following inequality is obvious

\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|\leq\omega\left(\left|x^{\prime}-x^{\prime\prime}\right|\right).

(3.5)

The function $\omega\left(\delta\right)$ enjoys some properties which will be recalled below. Given an $\varepsilon>0$ , there exists a pair of points $x^{\prime}<x^{\prime\prime}$ such that we have $\left|x^{\prime}-x^{\prime\prime}\right|\leq\delta$ and

\omega\left(\delta\right)-\varepsilon<\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|.

Let’s divide the interval $\left(x^{\prime},x^{\prime\prime}\right)$ in $k$ subintervals of equal length using the nodes $x^{\prime}=x_{0},x_{1},\ldots,x_{k-1},x_{k}=x^{\prime\prime}$ and we will have

f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)=\sum_{i=0}^{k-1}\left[f\left(x_{i}\right)-f\left(x_{i+1}\right)\right].

From this equality, we get

\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|\leq k\omega\left(\frac{\delta}{k}\right)

and thus

\omega\left(\delta\right)<k\omega\left(\frac{\delta}{k}\right)+\varepsilon

for any $\varepsilon$ . $k$ being is a positive integer and letting $k\delta$ instead of $\delta$ we eventually get

\omega\left(k\delta\right)\leq k\omega\left(\delta\right).

If $k$ is a positive number and $k^{\prime}$ is the largest integer less than or equal to $k$ , we can write

\omega\left(k\delta\right)\leq\omega\left(\left(k^{\prime}+1\right)\delta\right)\leq\left(k^{\prime}+1\right)\omega\left(\delta\right).

It follows that

\omega\left(k\delta\right)<\left(k+1\right)\omega\left(\delta\right)

for any positive $k$ (of course $\delta$ and $k\delta$ must not exceed $b-a$ ). Thus for $\delta\leq b-a$ we can write

\left|f\left(x^{\prime}\right)-f\left(x^{\prime\prime}\right)\right|<\left[\frac{\left|x^{\prime}-x^{\prime\prime}\right|}{\delta}+1\right]\omega\left(\delta\right).

(3.6)

Finally, the necessary and sufficient condition for the continuity of $f$ is $\omega\left(\delta\right)\rightarrow 0$ , for $\delta\rightarrow 0$ .
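
As an illustration only (it is not part of the original lesson), the inequality $\omega\left(k\delta\right)\leq k\omega\left(\delta\right)$ can be checked numerically. The sketch below computes a discrete analogue of the oscillation modulus on an equidistant grid; the helper name `modulus_on_grid`, the sample function $\sqrt{x}$ on $[0,1]$ and the grid size are assumptions made for the example.

```python
import numpy as np

def modulus_on_grid(values, max_gap):
    """Discrete analogue of omega(delta): largest |v[i] - v[j]| over
    pairs of grid samples at most max_gap grid steps apart."""
    best = 0.0
    for gap in range(1, max_gap + 1):
        best = max(best, float(np.max(np.abs(values[gap:] - values[:-gap]))))
    return best

a, b, steps = 0.0, 1.0, 400
x = np.linspace(a, b, steps + 1)
values = np.sqrt(x)              # sample function (arbitrary choice)

m = 10                           # delta corresponds to m grid steps
omega_delta = modulus_on_grid(values, m)
checks = []
for k in (2, 3, 5):
    omega_k_delta = modulus_on_grid(values, k * m)
    checks.append((k, omega_k_delta, k * omega_delta))
    print(f"k={k}: omega(k*delta)={omega_k_delta:.4f}, k*omega(delta)={k * omega_delta:.4f}")
```

For this particular sample function the modulus is known in closed form, $\omega\left(\delta\right)=\sqrt{\delta}$ , so the printed values can be compared with $\sqrt{k\delta}$ and $k\sqrt{\delta}$ .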

29. - The upper limit of $\mu_{n}$ . In the next lesson we will indicate an upper bound of $\mu_{n}$ . We want to indicate here a direct path which, if it could be followed to the end, could eventually give us the solution of this problem. The denominator of expression (22) can be written in the form

2\mathrm{D}\left(x_{1},x_{2},\ldots,x_{n+2}\right)

where

\mathrm{D}\left(x_{1},x_{2},\ldots,x_{n+2}\right)=\left|\begin{array}{ccccc}x_{2}-x_{1}&x_{2}^{2}-x_{1}^{2}&\ldots&x_{2}^{n}-x_{1}^{n}&(-1)^{n}\\ x_{3}-x_{2}&x_{3}^{2}-x_{2}^{2}&\ldots&x_{3}^{n}-x_{2}^{n}&(-1)^{n-1}\\ \ldots&\ldots&\ldots&\ldots&\ldots\\ x_{n+1}-x_{n}&x_{n+1}^{2}-x_{n}^{2}&\ldots&x_{n+1}^{n}-x_{n}^{n}&-1\\ x_{n+2}-x_{n+1}&x_{n+2}^{2}-x_{n+1}^{2}&\ldots&x_{n+2}^{n}-x_{n+1}^{n}&1\end{array}\right|.

It is worth noting that the minors of the last column are positive, because we assume here too $x_{1}<x_{2}<\cdots<x_{n+2}$ .

Indeed, we have

\begin{gathered}\left|\begin{array}{cccc}x_{2}-x_{1}&x_{2}^{2}-x_{1}^{2}&\cdots&x_{2}^{n}-x_{1}^{n}\\ x_{3}-x_{2}&x_{3}^{2}-x_{2}^{2}&\cdots&x_{3}^{n}-x_{2}^{n}\\ \cdots&\cdots&\cdots&\cdots\\ x_{i}-x_{i-1}&x_{i}^{2}-x_{i-1}^{2}&\cdots&x_{i}^{n}-x_{i-1}^{n}\\ x_{i+2}-x_{i+1}&x_{i+2}^{2}-x_{i+1}^{2}&\cdots&x_{i+2}^{n}-x_{i+1}^{n}\\ \cdots&\cdots&\cdots&\cdots\\ x_{n+2}-x_{n+1}&x_{n+2}^{2}-x_{n+1}^{2}&\cdots&x_{n+2}^{n}-x_{n+1}^{n}\end{array}\right|=\\ =n!\int_{x_{1}}^{x_{2}}\cdots\int_{x_{i-1}}^{x_{i}}\int_{x_{i+1}}^{x_{i+2}}\cdots\int_{x_{n+1}}^{x_{n+2}}V\left(t_{1},t_{2},\ldots,t_{n}\right)dt_{1}dt_{2}\ldots dt_{n}>0.\end{gathered}

If we subtract each line from the next and take into account (24), we deduce

\left|\mathrm{U}\left(x_{1},x_{2},\ldots,x_{n+2};f\right)\right|\leq\left|\begin{array}{ccccc}x_{2}-x_{1}&x_{2}^{2}-x_{1}^{2}&\cdots&x_{2}^{n}-x_{1}^{n}&(-1)^{n}\omega\left(x_{2}-x_{1}\right)\\ x_{3}-x_{2}&x_{3}^{2}-x_{2}^{2}&\cdots&x_{3}^{n}-x_{2}^{n}&(-1)^{n-1}\omega\left(x_{3}-x_{2}\right)\\ \cdots&\cdots&\cdots&\cdots&\cdots\\ x_{n+2}-x_{n+1}&x_{n+2}^{2}-x_{n+1}^{2}&\cdots&x_{n+2}^{n}-x_{n+1}^{n}&\omega\left(x_{n+2}-x_{n+1}\right)\end{array}\right|.

Taking into account (25) and the previous observation, we deduce

\left|\mathrm{U}\left(x_{1},x_{2},\ldots,x_{n+2};f\right)\right|<\left[\frac{\mathrm{D}_{1}\left(x_{1},x_{2},\ldots,x_{n+2}\right)}{\delta}+\mathrm{D}\left(x_{1},x_{2},\ldots,x_{n+2}\right)\right]\omega(\delta).

We have here

\begin{gathered}\mathrm{D}_{1}\left(x_{1},x_{2},\ldots,x_{n+2}\right)=\\ =\left|\begin{array}{ccccc}x_{2}-x_{1}&x_{2}^{2}-x_{1}^{2}&\cdots&x_{2}^{n}-x_{1}^{n}&(-1)^{n}\left(x_{2}-x_{1}\right)\\ x_{3}-x_{2}&x_{3}^{2}-x_{2}^{2}&\cdots&x_{3}^{n}-x_{2}^{n}&(-1)^{n-1}\left(x_{3}-x_{2}\right)\\ \cdots&\cdots&\cdots&\cdots&\cdots\\ x_{n+2}-x_{n+1}&x_{n+2}^{2}-x_{n+1}^{2}&\cdots&x_{n+2}^{n}-x_{n+1}^{n}&x_{n+2}-x_{n+1}\end{array}\right|=\\ =2(n!)\int_{x_{1}}^{x_{2}}\int_{x_{2}}^{x_{3}}\cdots\int_{x_{n+1}}^{x_{n+2}}\mathrm{D}\left(t_{1},t_{2},\ldots,t_{n+1}\right)dt_{1}dt_{2}\ldots dt_{n+1}.\end{gathered}

So if we denote by $\theta_{n}$ the maximum of the quotient

\frac{\mathrm{D}_{1}\left(x_{1},x_{2},\ldots,x_{n+2}\right)}{\mathrm{D}\left(x_{1},x_{2},\ldots,x_{n+2}\right)},

(26)

when the points $x_{1}<x_{2}<\cdots<x_{n+2}$ range over the interval $(a,b)$ , we have

\mu_{n}(f)<\left[\frac{\theta_{n}}{2\delta}+\frac{1}{2}\right]\omega(\delta).

It can easily be seen that $\theta_{n}<b-a$ , therefore taking $\delta=\theta_{n}$ , we find

\mu_{n}(f)<\omega\left(\theta_{n}\right).

Unfortunately, the determination of $\theta_{n}$ seems to be a complicated problem. It is likely that for $n\rightarrow\infty$ this number is of the order of $\frac{1}{n}$ . It would be interesting to demonstrate, as a first result, that $\theta_{n}\rightarrow 0$ for $n\rightarrow\infty$ .

It can easily be shown that if two or more points $x_{i}$ tend to coincide, expression (26) tends to 0. The ratio (26) is a homogeneous function of degree 1 with respect to $x_{1},x_{2},\ldots,x_{n+2}$ and it depends only on the differences $x_{i}-x_{j}$ . It follows that the maximum can only be reached for $x_{1}=a,x_{n+2}=b$ .

Let, in particular, $n=2$ . We have the points $x_{1}=a,x_{2}=y,x_{3}=x,x_{4}=b$ and the ratio (26) is written

\frac{2(y-a)(b-x)(x-y)(b-a+x-y)}{(x-a)(b-y)(b-a+y-x)}.

To calculate the maximum, differential calculus can be applied. Setting the logarithmic partial derivatives with respect to $x$ and $y$ equal to zero, we find

\begin{array}[]{r}-\frac{1}{b-x}+\frac{1}{x-y}+\frac{1}{b-a+x-y}-\frac{1}{x-a}+\frac{1}{b-a+y-x}=0\\ \frac{1}{y-a}-\frac{1}{x-y}-\frac{1}{b-a+x-y}+\frac{1}{b-y}-\frac{1}{b-a+y-x}=0\end{array}

Adding these two equations together, we find

\frac{b-a}{(y-a)(b-y)}=\frac{b-a}{(x-a)(b-x)}

that is,

(x-y)(x+y-a-b)=0.

The maximum is therefore obtained for $x+y=a+b$ , that is, for $x$ and $y$ symmetric about the middle of the interval ( $a,b$ ). It is then found that $x$ must be the root contained between $\frac{a+b}{2}$ and $b$ of the equation

\left(x-\frac{a+b}{2}\right)^{2}+(b-a)\left(x-\frac{a+b}{2}\right)-\frac{(b-a)^{2}}{4}=0,

that is,

x=\frac{a+b}{2}+\frac{b-a}{2}(\sqrt{2}-1)

and

\theta_{2}=2(b-a)(\sqrt{2}-1)^{2}.
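
The value of $\theta_{2}$ can be cross-checked numerically. The following sketch, which is only an illustration under stated assumptions (the interval $a=0$ , $b=1$ , the grid resolution and the function name `ratio_26` are choices made here, not taken from the text), maximizes the ratio (26) for $n=2$ by brute force over a grid of pairs $a<y<x<b$ and compares the result with the closed form above.

```python
import numpy as np

def ratio_26(x, y, a, b):
    """Ratio (26) for n = 2 with x1 = a, x2 = y, x3 = x, x4 = b."""
    num = 2.0 * (y - a) * (b - x) * (x - y) * (b - a + x - y)
    den = (x - a) * (b - y) * (b - a + y - x)
    return num / den

a, b = 0.0, 1.0
grid = np.linspace(a, b, 801)[1:-1]          # interior grid points only
X, Y = np.meshgrid(grid, grid, indexing="ij")
vals = np.where(Y < X, ratio_26(X, Y, a, b), -np.inf)   # keep a < y < x < b

i, j = np.unravel_index(np.argmax(vals), vals.shape)
theta2_grid = vals[i, j]
theta2_formula = 2.0 * (b - a) * (np.sqrt(2.0) - 1.0) ** 2
print(theta2_grid, theta2_formula, X[i, j] + Y[i, j])
```

The grid maximum should lie slightly below the closed-form value, and the maximizing pair should satisfy $x+y\approx a+b$ up to the grid spacing.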

It is important to note that the points $x,y$ do not divide the interval ( $a,b$ ) rationally. Furthermore, the coefficients of the polynomial

\left(z-x_{1}\right)\left(z-x_{2}\right)\left(z-x_{3}\right)\left(z-x_{4}\right),

where $x_{1},x_{2},x_{3},x_{4}$ are the points for which the maximum is reached, are not rational with respect to $a$ and $b$ . This fact, which probably occurs for any $n$ , is the main cause of the difficulty in determining the maximum $\theta_{n}$ .

LESSON IV

Weierstrass's theorem

30. - Weierstrass's theorem. K. Weierstrass proved the following theorem (⁸):

Any continuous function on the interval ( $a,b$ ) is the limit of a sequence of polynomials, uniformly convergent in this interval.

The proof is not based on the theory of the polynomials $\mathrm{T}_{n}$ . However, it follows from this theorem that

\mu_{n}(f)\rightarrow 0\text{ for }n\rightarrow\infty

(27)

if the function is continuous.
It is obvious, moreover, that for any function $f$ we have

\mu_{0}\geq\mu_{1}\geq\cdots\geq\mu_{n}\geq\cdots

so the limit

\lim\mu_{n}(f)=\mu\quad\text{for}\quad n\rightarrow\infty

exists and is $\geq 0$ . If $\mu=0$ , the sequence of polynomials $\mathrm{T}_{n}$ converges absolutely and uniformly to $f$ in ( $a,b$ ). It follows that for a discontinuous function we must have $\mu\neq 0$ . Weierstrass's theorem tells us that for a continuous function we certainly have $\mu=0$ .

The important problem would be to prove the relation (27) directly, relying only on the properties of polynomials $\mathrm{T}_{n}$ . If for example it could be shown that the number $\theta_{n}$ defined in No. 29 tends to zero for $n\rightarrow\infty$ , the problem would be solved.

Before proving Weierstrass's theorem we will show a result of Mr. L. Tonelli in connection with such a direct proof.

31. - Mr. L. Tonelli's theorem. Suppose that the sequence of polynomials

\mathrm{T}_{0}(x;f),\mathrm{T}_{1}(x;f),\ldots,\mathrm{T}_{n}(x;f),\ldots

(28)

converges uniformly to a continuous function $F(x)$ and that we have $\mu>0$ . Then

\mathrm{M}(|f-\mathrm{F}|)\leq\mathrm{M}\left(\left|f-\mathrm{T}_{n}\right|\right)+\mathrm{M}\left(\left|\mathrm{~F}-\mathrm{T}_{n}\right|\right)\leq\mu_{n}+\mathrm{M}\left(\left|\mathrm{~F}-\mathrm{T}_{n}\right|\right).

It is easily deduced that

\mathrm{M}(|f-\mathrm{F}|)\leq\mu.

$f-\mathrm{F}$ being a continuous function, we can determine a $\delta>0$ such that, in any interval of length $\leq\delta$ , the oscillation of this function is smaller than $\mu$ . On the other hand, we can find a number $n>\frac{b-a}{\delta}$ such that we have

\mathrm{M}\left(\left|\mathrm{~F}-\mathrm{T}_{n}\right|\right)<\varepsilon<\frac{\mu}{2}.

We know that there are at least $n+2$ points at which $\pm\mu_{n}$ is attained alternately and, from the way $n$ was chosen $\left[n>\frac{b-a}{\delta}\right]$ , it follows that among these $n+2$ points there are at least two, $x^{\prime},x^{\prime\prime}$ , such that

\begin{gathered}\left|x^{\prime}-x^{\prime\prime}\right|<\delta,\\ f\left(x^{\prime}\right)-\mathrm{T}_{n}\left(x^{\prime}\right)=\mu_{n},\quad f\left(x^{\prime\prime}\right)-\mathrm{T}_{n}\left(x^{\prime\prime}\right)=-\mu_{n},\end{gathered}

whence

\begin{aligned}f\left(x^{\prime}\right)-\mathrm{F}\left(x^{\prime}\right)&=\left[f\left(x^{\prime}\right)-\mathrm{T}_{n}\left(x^{\prime}\right)\right]+\left[\mathrm{T}_{n}\left(x^{\prime}\right)-\mathrm{F}\left(x^{\prime}\right)\right]>\mu_{n}-\varepsilon\geq\mu-\varepsilon>\frac{\mu}{2};\\ f\left(x^{\prime\prime}\right)-\mathrm{F}\left(x^{\prime\prime}\right)&=\left[f\left(x^{\prime\prime}\right)-\mathrm{T}_{n}\left(x^{\prime\prime}\right)\right]+\left[\mathrm{T}_{n}\left(x^{\prime\prime}\right)-\mathrm{F}\left(x^{\prime\prime}\right)\right]<-\mu_{n}+\varepsilon\leq-\mu+\varepsilon<-\frac{\mu}{2}.\end{aligned}

It follows that the oscillation of the function $f-\mathrm{F}$ in the interval ( $x^{\prime},x^{\prime\prime}$ ) is greater than $\mu$ , which is impossible. The hypothesis $\mu>0$ is therefore untenable, and we must have $\mu=0$ . We have the following theorem of Mr. Tonelli:

If the sequence of polynomials (28) converges absolutely and uniformly to a function (necessarily continuous), this function coincides with $f(x)$ .

32. - Mr. S. Bernstein's polynomials. We will prove Weierstrass's theorem with the help of Mr. S. Bernstein's polynomials. We must therefore, first of all, give the definition of these polynomials.

Let us divide the interval ( $a,b$ ) into $n$ equal parts and let

a_{i}=a+i\frac{b-a}{n},\quad i=0,1,\ldots,n\quad\left(a_{0}=a,a_{n}=b\right)

be the points of division.
A polynomial of degree $n$ whose coefficients depend linearly and homogeneously on the $n+1$ values $f\left(a_{i}\right),i=0,1,\ldots,n$ , is called an interpolation polynomial of degree $n$ of the function $f(x)$ . We will study, in particular, the interpolation polynomial introduced by Mr. S. Bernstein (⁹)

P_{n}(x;f)=\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}f\left(a_{i}\right)(x-a)^{i}(b-x)^{n-i}

It is interesting to note how this polynomial can be obtained in a somewhat geometric way.

Let $\mathrm{A}_{0},\mathrm{A}_{1},\ldots,\mathrm{A}_{n}$ be the points representing the function $f(x)$ for $x=a_{0},a_{1},\ldots,a_{n}$ , that is, the points with coordinates $a_{i},f\left(a_{i}\right)$ . Let us construct the polygonal line $\mathrm{A}_{0}\mathrm{A}_{1}\ldots\mathrm{A}_{n}$ .

On the sides $\mathrm{A}_{0}\mathrm{A}_{1},\mathrm{A}_{1}\mathrm{A}_{2},\ldots,\mathrm{A}_{n-1}\mathrm{A}_{n}$ of the polygonal line let us take the points $\mathrm{A}_{0}^{\prime},\mathrm{A}_{1}^{\prime},\ldots,\mathrm{A}_{n-1}^{\prime}$ which divide these sides in the same sense and in the same ratio. We choose this ratio so that

\mathrm{A}_{0}\mathrm{~A}_{0}^{\prime}=\mathrm{A}_{1}\mathrm{~A}_{1}^{\prime}=\cdots=\mathrm{A}_{n-1}\mathrm{~A}_{n-1}^{\prime}=\frac{s}{n}\cdot\frac{b-a}{n},

$s$ being an integer, $0\leq s\leq n$ . In the polygonal line $\mathrm{A}_{0}^{\prime}\mathrm{A}_{1}^{\prime}\ldots\mathrm{A}_{n-1}^{\prime}$ we inscribe in the same way the polygonal line $\mathrm{A}_{0}^{\prime\prime}\mathrm{A}_{1}^{\prime\prime}\ldots\mathrm{A}_{n-2}^{\prime\prime}$ , preserving the sense and the ratio in which the sides are divided; therefore we again have

\mathrm{A}_{0}^{\prime}\mathrm{A}_{0}^{\prime\prime}=\mathrm{A}_{1}^{\prime}\mathrm{A}_{1}^{\prime\prime}=\cdots=\mathrm{A}_{n-2}^{\prime}\mathrm{A}_{n-2}^{\prime\prime}=\frac{s}{n}\cdot\frac{b-a}{n}.

Continuing this process, we successively inscribe the polygonal lines $\mathrm{A}_{0}^{(k)}\mathrm{A}_{1}^{(k)}\ldots\mathrm{A}_{n-k}^{(k)},k=3,4,\ldots,n$ . The last one reduces to a single point $\mathrm{A}_{0}^{(n)}$ . We have

\mathrm{A}_{0}\mathrm{~A}_{0}^{\prime}=\mathrm{A}_{0}^{\prime}\mathrm{A}_{0}^{\prime\prime}=\cdots=\mathrm{A}_{0}^{(n-1)}\mathrm{A}_{0}^{(n)}=\frac{s}{n}\cdot\frac{b-a}{n}

so the abscissa of $\mathrm{A}_{0}^{(n)}$ is precisely

a+s\frac{b-a}{n}=a_{s}.

Let us denote this point $\mathrm{A}_{0}^{(n)}$ by $\mathrm{A}_{s}^{*}$ , to highlight the number $s$ , and let us calculate its ordinate. For $s=0$ and $s=n$ the point $\mathrm{A}_{s}^{*}$ coincides with $\mathrm{A}_{0}$ and $\mathrm{A}_{n}$ respectively. In general, let us denote by $b_{s}$ the ordinate of $\mathrm{A}_{s}$ , by $b_{r}^{(k)}$ the ordinate of $\mathrm{A}_{r}^{(k)}$ and by $b_{s}^{*}$ the ordinate of $\mathrm{A}_{s}^{*}$ . We have

b_{r}^{(k)}=\frac{(n-s)b_{r}^{(k-1)}+sb_{r+1}^{(k-1)}}{n},r=0,1,\ldots,n-k,k=1,2,\ldots,n-1

(29)

and

b_{s}^{*}=\frac{(n-s)b_{0}^{(n-1)}+sb_{1}^{(n-1)}}{n}.

(30)

From (29) we successively deduce

		$\displaystyle b_{r}^{(1)}=\frac{(n-s)b_{r}+sb_{r+1}}{n}$
		$\displaystyle b_{r}^{(2)}=\frac{(n-s)^{2}b_{r}+2s(n-s)b_{r+1}+s^{2}b_{r+2}}{n^{2}}$

and in general

b_{r}^{(k)}=\frac{1}{n^{k}}\sum_{i=0}^{k}\binom{k}{i}s^{i}(n-s)^{k-i}b_{r+i},\quad r=0,1,\ldots,n-k.

Formula (30) therefore gives us

b_{s}^{*}=\frac{1}{n^{n}}\sum_{i=0}^{n}\binom{n}{i}s^{i}(n-s)^{n-i}b_{i}.

Returning now to the polynomial $\mathrm{P}_{n}(x;f)$ we notice that we have:

P_{n}\left[a+s\frac{b-a}{n};f\right]=\frac{1}{n^{n}}\sum_{i=0}^{n}\binom{n}{i}s^{i}(n-s)^{n-i}f\left(a_{i}\right).

It follows that Mr. Bernstein's polynomial $\mathrm{P}_{n}(x;f)$ is the Lagrange polynomial that takes the values $b_{s}^{*}$ for the points $a_{s}$ .
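
The geometric construction can be replayed in exact rational arithmetic. The sketch below is only an illustration: the function names `bernstein_value` and `inscribed_ordinate`, the interval $[0,2]$ , the degree $n=5$ and the sample function $x^{3}-x$ are assumptions made for the example. It applies the recurrence (29), followed by (30), and compares the resulting ordinate $b_{s}^{*}$ with the value of $\mathrm{P}_{n}$ at $a_{s}$ computed from the defining sum.

```python
from fractions import Fraction
from math import comb

def bernstein_value(f_values, a, b, x):
    """P_n(x; f) from the values f(a_0), ..., f(a_n) at the equidistant nodes."""
    n = len(f_values) - 1
    total = sum(comb(n, i) * f_values[i] * (x - a) ** i * (b - x) ** (n - i)
                for i in range(n + 1))
    return total / (b - a) ** n

def inscribed_ordinate(f_values, s):
    """Ordinate b_s^* obtained by n successive inscriptions, as in (29) and (30)."""
    n = len(f_values) - 1
    level = list(f_values)
    for _ in range(n):
        level = [((n - s) * level[r] + s * level[r + 1]) / n
                 for r in range(len(level) - 1)]
    return level[0]

a, b, n = Fraction(0), Fraction(2), 5
nodes = [a + i * (b - a) / n for i in range(n + 1)]
f_values = [t ** 3 - t for t in nodes]       # sample function (arbitrary choice)

agree = [inscribed_ordinate(f_values, s) == bernstein_value(f_values, a, b, nodes[s])
         for s in range(n + 1)]
print(agree)
```

Because `Fraction` arithmetic is exact, the comparison is an equality test rather than a tolerance check.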

33. - Determining an upper limit for $\left|f(x)-\mathrm{P}_{n}(x;f)\right|$ . Let us determine an upper limit for $\left|f-\mathrm{P}_{n}(x;f)\right|$ . We note that $\mathrm{P}_{n}(x;1)\equiv 1$ , from which we deduce, using the oscillation modulus $\omega(\delta)$ defined in No. 27,

\begin{aligned}\left|f-\mathrm{P}_{n}(x;f)\right|&=\left|\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\left[f(x)-f\left(a_{i}\right)\right](x-a)^{i}(b-x)^{n-i}\right|\leq\\ &\leq\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\omega\left(\left|x-a_{i}\right|\right)(x-a)^{i}(b-x)^{n-i}<\\ &<\left\{\frac{1}{\delta}\cdot\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-i}+1\right\}\omega(\delta).\end{aligned}

Let's put

\psi(x)=\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-i}

(31)

and

\mathrm{N}_{n}=\max_{\text{in }(a,b)}\psi(x).

Finally, taking $\delta=2\mathrm{N}_{n}$ , we deduce (it will be seen that indeed $\delta\leq b-a$ )

\left|f-\mathrm{P}_{n}(x;f)\right|<\frac{3}{2}\omega\left(2\mathrm{~N}_{n}\right)

(32)

34. - The approximation given by the polynomial $\mathrm{P}_{n}(x;f)$ . We can now calculate the approximation given by the polynomials $\mathrm{P}_{n}(x;f)$ . Let us first calculate the function $\psi(x)$ . We have in the interval ( $a_{j},a_{j+1}$ ),

\begin{aligned}\psi(x)&=\frac{1}{(b-a)^{n}}\sum_{i=0}^{j}\binom{n}{i}\left(x-a_{i}\right)(x-a)^{i}(b-x)^{n-i}+\\ &+\frac{1}{(b-a)^{n}}\sum_{i=j+1}^{n}\binom{n}{i}\left(a_{i}-x\right)(x-a)^{i}(b-x)^{n-i}=\\ &=\frac{2}{(b-a)^{n}}\sum_{i=0}^{j}\binom{n}{i}\left(x-a_{i}\right)(x-a)^{i}(b-x)^{n-i}\end{aligned}

because it is easy to prove that

\sum_{i=0}^{n}\binom{n}{i}\left(a_{i}-x\right)(x-a)^{i}(b-x)^{n-i}=0

Doing the calculations, we find

\psi(x)=\frac{2}{(b-a)^{n}}\binom{n-1}{j}(x-a)^{j+1}(b-x)^{n-j}.

The maximum of this polynomial in the interval $\left(a_{j},a_{j+1}\right)$ is reached for

x=\frac{(j+1)b+(n-j)a}{n+1}

and has the value

2(b-a)\binom{n-1}{j}\frac{(j+1)^{j+1}(n-j)^{n-j}}{(n+1)^{n+1}}=2(b-a)\lambda_{j}

(33)

The function $\left(\frac{x+1}{x}\right)^{x+1}$ is decreasing as $x\geq 1$ increases. It follows that we have

\left(\frac{j+2}{j+1}\right)^{j+2}>\left(\frac{n-j}{n-j-1}\right)^{n-j}\quad\text{ for }\quad j+1<\frac{n}{2}

and therefore, in this case,

\lambda_{j+1}>\lambda_{j}.

We deduce from this that (33) reaches its maximum for $j=\frac{n}{2}$ or $j=\frac{n-1}{2}$ according as $n$ is even or odd.

We therefore have

\begin{array}{ll}\mathrm{N}_{n}=2(b-a)\binom{n-1}{\frac{n}{2}}\frac{\left(\frac{n}{2}+1\right)^{\frac{n}{2}+1}\left(\frac{n}{2}\right)^{\frac{n}{2}}}{(n+1)^{n+1}}&\text{ for }n\text{ even, }\\ \mathrm{N}_{n}=\frac{b-a}{2^{n}}\binom{n-1}{\frac{n-1}{2}}&\text{ for }n\text{ odd. }\end{array}

It is immediately demonstrated that

\begin{gathered}\sqrt{2n-1}\mathrm{~N}_{2n-1}>\sqrt{2n+1}\mathrm{~N}_{2n+1}\\ \mathrm{~N}_{1}=\frac{b-a}{2},\quad\mathrm{~N}_{3}=\frac{b-a}{4}\end{gathered}

whence

\begin{array}{ll}\mathrm{N}_{2n+1}\leq\frac{b-a}{2\sqrt{2n+1}}&\\ \mathrm{N}_{2n+1}\leq\frac{\sqrt{3}(b-a)}{4\sqrt{2n+1}}&\text{ for }n\geq 1.\end{array}

For $n$ even we have

\begin{aligned}\mathrm{N}_{2n}&=\mathrm{N}_{2n+1}\frac{(n+1)^{n+1}n^{n}}{(2n+1)^{2n+1}}2^{2n+1}<\mathrm{N}_{2n+1}\cdot\frac{2^{2n+1}(n+1)}{(2n+1)^{2n+1}}\left(\frac{2n+1}{2}\right)^{2n}=\\ &=\mathrm{N}_{2n+1}\frac{2(n+1)}{2n+1}\leq\frac{\sqrt{3}(b-a)}{4\sqrt{2n+1}}\cdot\frac{2(n+1)}{2n+1}=\frac{1}{2}\cdot\frac{\sqrt{3}(n+1)(b-a)}{(2n+1)\sqrt{2n+1}}<\frac{b-a}{2\sqrt{2n}}\end{aligned}

so in general

\mathrm{N}_{n}\leq\frac{b-a}{2\sqrt{n}}
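
The closed expressions for $\mathrm{N}_{n}$ and the bound $\frac{b-a}{2\sqrt{n}}$ can be compared with a direct numerical maximization of $\psi(x)$ . The sketch below is illustrative only; it assumes the interval $[0,1]$ , a grid of 2001 points, and the hypothetical helper names `psi` and `N_closed`.

```python
from math import comb, sqrt

def psi(x, n, a=0.0, b=1.0):
    """psi(x) from formula (31), evaluated directly from its definition."""
    total = 0.0
    for i in range(n + 1):
        node = a + i * (b - a) / n
        total += comb(n, i) * abs(x - node) * (x - a) ** i * (b - x) ** (n - i)
    return total / (b - a) ** n

def N_closed(n, a=0.0, b=1.0):
    """Closed form of N_n = max psi, separately for n odd and n even."""
    if n % 2 == 1:
        return (b - a) / 2 ** n * comb(n - 1, (n - 1) // 2)
    h = n // 2
    return 2 * (b - a) * comb(n - 1, h) * (h + 1) ** (h + 1) * h ** h / (n + 1) ** (n + 1)

rows = []
for n in range(1, 13):
    grid_max = max(psi(k / 2000, n) for k in range(2001))
    rows.append((n, grid_max, N_closed(n), 1 / (2 * sqrt(n))))
    print(n, round(grid_max, 6), round(N_closed(n), 6), round(1 / (2 * sqrt(n)), 6))
```

In each printed row the grid maximum should agree with the closed form up to the grid resolution, and both should stay below the last column.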

Formula (32) therefore becomes

\left|f-\mathrm{P}_{n}(x;f)\right|<\frac{3}{2}\omega\left(\frac{b-a}{\sqrt{n}}\right)

If the function $f$ is continuous $\omega\left(\frac{b-a}{\sqrt{n}}\right)\rightarrow 0$ for $n\rightarrow\infty$ and Weierstrass' theorem is proved. Furthermore, it is seen that the best approximation of a continuous function by polynomials of degree $n$ , that is, the number $\mu_{n}$ , is at least of the order of $\omega\left(\frac{b-a}{\sqrt{n}}\right)$ .
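
The bound $\frac{3}{2}\omega\left(\frac{b-a}{\sqrt{n}}\right)$ can be observed on an example. The sketch below is an illustration under assumptions chosen here: the function $\sqrt{x}$ on $[0,1]$ , for which $\omega(\delta)=\sqrt{\delta}$ , a grid of 2001 evaluation points, and the helper name `bernstein`.

```python
import numpy as np
from math import comb

def bernstein(f, n, x, a=0.0, b=1.0):
    """Evaluate P_n(x; f) on an array x directly from the defining sum."""
    nodes = a + (b - a) * np.arange(n + 1) / n
    total = np.zeros_like(x)
    for i in range(n + 1):
        total += comb(n, i) * f(nodes[i]) * (x - a) ** i * (b - x) ** (n - i)
    return total / (b - a) ** n

x = np.linspace(0.0, 1.0, 2001)
report = []
for n in (4, 16, 64):
    err = float(np.max(np.abs(np.sqrt(x) - bernstein(np.sqrt, n, x))))
    # (3/2) * omega((b - a) / sqrt(n)) with omega(d) = sqrt(d) and b - a = 1
    bound = 1.5 * float(np.sqrt(1.0 / np.sqrt(n)))
    report.append((n, err, bound))
    print(n, err, bound)
```

The observed error is computed on a grid only, so it is a lower estimate of the true maximum; it is nevertheless expected to stay well below the bound, which is not sharp for this function.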

The approximation given by Mr. S. Bernstein's polynomials cannot be improved in general. For example, consider the function

f_{2}(x)=\left|x-\frac{a+b}{2}\right|

We have in this case $\omega(\delta)=\delta$ for $\delta\leq\frac{b-a}{2}$ . A simple calculation shows that

\frac{d^{2}\mathrm{P}_{n}\left(x;f_{2}\right)}{dx^{2}}=\frac{n(n-1)}{(b-a)^{n}}\sum_{i=0}^{n-2}\binom{n-2}{i}\left[f_{2}\left(a_{i}\right)-2f_{2}\left(a_{i+1}\right)+f_{2}\left(a_{i+2}\right)\right](x-a)^{i}(b-x)^{n-2-i}

whence

\frac{d^{2}\mathrm{P}_{2n}\left(x;f_{2}\right)}{dx^{2}}=\frac{d^{2}\mathrm{P}_{2n+1}\left(x;f_{2}\right)}{dx^{2}}=\frac{n}{(b-a)^{2n-1}}\binom{2n}{n}[(x-a)(b-x)]^{n-1}.

From this it follows that $\mathrm{P}_{2n}\left(x;f_{2}\right)\equiv\mathrm{P}_{2n+1}\left(x;f_{2}\right)$ and that $\mathrm{P}_{2n}\left(x;f_{2}\right)$ is a convex function (in the usual sense) in the interval ( $a,b$ ). We therefore have

\begin{gathered}\max_{\text{in }(a,b)}\left|f_{2}-\mathrm{P}_{2n}\left(x;f_{2}\right)\right|=\mathrm{P}_{2n}\left(\frac{a+b}{2};f_{2}\right)-f_{2}\left(\frac{a+b}{2}\right)=\\ =\frac{1}{2^{2n}}\sum_{i=0}^{2n}\binom{2n}{i}\left|a_{i}-\frac{a+b}{2}\right|=\frac{b-a}{2^{2n+1}}\binom{2n}{n}\end{gathered}

We now have

\frac{1}{2^{2n+1}}\binom{2n}{n}\sqrt{2n}>\frac{1}{2^{2n-1}}\binom{2n-2}{n-1}\sqrt{2n-2}

\frac{1}{2^{2n+1}}\binom{2n}{n}>\frac{1}{2\sqrt{2}}\frac{1}{\sqrt{2n}}

whence

\max_{\text{in }(a,b)}\left|f_{2}-\mathrm{P}_{2n}\left(x;f_{2}\right)\right|>\frac{1}{2\sqrt{2}}\cdot\frac{b-a}{\sqrt{2n}}=\frac{1}{2\sqrt{2}}\omega\left(\frac{b-a}{\sqrt{2n}}\right),

which proves our statement.
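
The exact value of the error for $f_{2}$ lends itself to a check in rational arithmetic. The sketch below is illustrative: it assumes the interval $[0,1]$ , so that $f_{2}(x)=\left|x-\frac{1}{2}\right|$ , a coarse grid of 41 rational points containing the midpoint, and the helper name `bernstein_abs`.

```python
from fractions import Fraction
from math import comb, sqrt

def bernstein_abs(n2, x):
    """P_{2n}(x; f_2) on [0, 1] for f_2(x) = |x - 1/2|, where n2 = 2n."""
    return sum(comb(n2, i) * abs(Fraction(i, n2) - Fraction(1, 2))
               * x ** i * (1 - x) ** (n2 - i) for i in range(n2 + 1))

grid = [Fraction(k, 40) for k in range(41)]        # contains the midpoint 1/2
results = []
for n in range(1, 7):
    errors = [abs(bernstein_abs(2 * n, x) - abs(x - Fraction(1, 2))) for x in grid]
    closed_form = Fraction(comb(2 * n, n), 2 ** (2 * n + 1))   # (b - a) = 1
    lower = 1 / (2 * sqrt(2) * sqrt(2 * n))
    results.append((n, max(errors), closed_form, lower))
    print(n, max(errors), closed_form, lower)
```

On this grid the largest error is attained at the midpoint and equals $\frac{1}{2^{2n+1}}\binom{2n}{n}$ exactly; for $n=1$ the lower bound $\frac{1}{2\sqrt{2}}\cdot\frac{1}{\sqrt{2n}}$ is met with equality, and for larger $n$ it is strict.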

35. - Approximation of convex functions of higher order. Mr. Bernstein's polynomials also allow us to establish some interesting results on the approximation of convex functions of higher order (¹⁰). Using the notations from No. 17, let us put

\Delta_{k}^{i}=\left[a_{i},a_{i+1},\ldots,a_{i+k};f\right],\quad i=0,1,\ldots,n-k,\quad k=1,2,\ldots

A simple calculation shows us that

\frac{d\mathrm{P}_{n}(x;f)}{dx}=\frac{1}{(b-a)^{n-1}}\sum_{i=0}^{n-1}\binom{n-1}{i}\Delta_{1}^{i}(x-a)^{i}(b-x)^{n-1-i}

and in general

\begin{aligned}\frac{d^{k}\mathrm{P}_{n}(x;f)}{dx^{k}}&=k!\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\\ &\cdots\left(1-\frac{k-1}{n}\right)\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}\Delta_{k}^{i}(x-a)^{i}(b-x)^{n-k-i}\end{aligned}

(34)
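
The general derivative formula can be verified numerically. The following sketch is an illustration under assumptions made here: the function $e^{x}$ on $[0,2]$ , $n=8$ , $k=3$ , and the helper names `bernstein_polynomial`, `divided_differences` and `derivative_formula`. It builds $\mathrm{P}_{n}$ as a `numpy.polynomial.Polynomial`, differentiates it $k$ times, and compares the result with the sum over the divided differences $\Delta_{k}^{i}$ .

```python
import numpy as np
from math import comb, factorial
from numpy.polynomial import Polynomial

def bernstein_polynomial(f, n, a, b):
    """P_n(x; f) as a numpy Polynomial in the power basis."""
    nodes = a + (b - a) * np.arange(n + 1) / n
    p = Polynomial([0.0])
    for i in range(n + 1):
        term = Polynomial([-a, 1.0]) ** i * Polynomial([b, -1.0]) ** (n - i)
        p = p + float(comb(n, i) * f(nodes[i])) * term
    return p / (b - a) ** n

def divided_differences(nodes, values, k):
    """Array of [a_i, ..., a_{i+k}; f] for i = 0, ..., n-k."""
    table = np.array(values, dtype=float)
    for order in range(1, k + 1):
        table = (table[1:] - table[:-1]) / (nodes[order:] - nodes[:-order])
    return table

def derivative_formula(f, n, k, a, b, x):
    """Right-hand side of the general derivative formula, evaluated at x."""
    nodes = a + (b - a) * np.arange(n + 1) / n
    delta = divided_differences(nodes, f(nodes), k)
    factor = factorial(k) * np.prod([1.0 - j / n for j in range(1, k)])
    total = sum(comb(n - k, i) * delta[i] * (x - a) ** i * (b - x) ** (n - k - i)
                for i in range(n - k + 1))
    return factor * total / (b - a) ** (n - k)

a, b, n, k = 0.0, 2.0, 8, 3
x = np.linspace(a, b, 9)
lhs = bernstein_polynomial(np.exp, n, a, b).deriv(k)(x)
rhs = derivative_formula(np.exp, n, k, a, b, x)
print(np.max(np.abs(lhs - rhs)))
```

The two evaluations should agree up to floating-point rounding; the power-basis representation is used only for convenience and becomes ill-conditioned for large $n$ .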

It is immediately seen that if the function $f(x)$ enjoys a definite convexity property, the polynomials of Mr. S. Bernstein enjoy the same convexity property. We assume here that convexity proper and polynomiality are particular cases of non-concavity. The property follows from the definition of higher-order convex functions and from the fact that, if a function is differentiable, the necessary and sufficient condition for it to be non-concave of order $n$ is that its derivative of order $n+1$ does not become negative, etc.

We can therefore state the property:
A continuous function $f$ which enjoys certain convexity properties is the limit of a sequence of polynomials, uniformly convergent in the interval ( $a,b$ ), which enjoy the same convexity properties.

36. - Approximation of functions with bounded divided differences. We can also obtain some results on functions with bounded divided differences. Let us consider the relation

k!\Delta_{k}[f]=\Delta_{0}\left[f^{(k)}\right]

established in No. 26. For the polynomial $\mathrm{P}_{n}(x;f)$ we will have

k!\Delta_{k}\left[\mathrm{P}_{n}\right]=\Delta_{0}\left[\mathrm{P}_{n}^{(k)}\right]

whence, taking into account (34),

\Delta_{k}\left[\mathrm{P}_{n}\right]\leq\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\ldots\left(1-\frac{k-1}{n}\right)\Delta_{k}[f].

This can also be written

\Delta_{k}\left[\mathrm{P}_{n}\right]\leq\Delta_{k}[f],\quad k=0,1;\quad\Delta_{k}\left[\mathrm{P}_{n}\right]<\Delta_{k}[f],\quad k>1

We have the property:
A continuous function $f$ with bounded divided differences is the limit of a sequence of polynomials, uniformly convergent in the interval ( $a,b$ ), whose bounds of order 0 and 1 are at most equal to those of the function and whose bounds of order $>1$ are smaller than those of the function.

37. - Approximation of functions with bounded variation. Let

x_{1}<x_{2}<\cdots<x_{m}\quad(m\geq n+1)

be a sequence of points in the interval $(a,b)$ . The number

v_{m}=\sum_{i=1}^{m-n-1}\left|\left[x_{i+1},x_{i+2},\ldots,x_{i+n+1};f\right]-\left[x_{i},x_{i+1},\ldots,x_{i+n};f\right]\right|

is called the $n$ -th variation of $f(x)$ on the points $x_{i}$ considered.
If we put

\max_{\text{in }(a,b)}v_{m}=V_{n}[f]

the maximum being taken when both the points $x_{i}$ and their number vary, the number $\mathrm{V}_{n}[f]$ is called the $n$ -th total variation of $f(x)$ in the interval ( $a,b$ ). If $\mathrm{V}_{n}[f]$ is a finite number, the function is said to be of bounded $n$ -th variation.

Here too we have the relation

k!\mathrm{V}_{k}[f]=\mathrm{V}_{0}\left[f^{(k)}\right]

as well as the formula

V_{0}[f]=\int_{a}^{b}\left|f^{\prime}\right|dx

well known from the theory of functions with bounded variation (of order 0).
For the polynomials $\mathrm{P}_{n}(x;f)$ we have

\mathrm{V}_{k}\left[\mathrm{P}_{n}\right]=\frac{1}{k!}\int_{a}^{b}\left|\mathrm{P}_{n}^{(k+1)}\right|dx

Taking into account formula (34), we deduce

\begin{gathered}\mathrm{V}_{k}\left[\mathrm{P}_{n}\right]\leq\frac{k+1}{(b-a)^{n-k-1}}\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k}{n}\right)\cdot\\ \cdot\sum_{i=0}^{n-k-1}\binom{n-k-1}{i}\left|\Delta_{k+1}^{i}\right|\int_{a}^{b}(x-a)^{i}(b-x)^{n-k-1-i}dx\end{gathered}

But

\int_{a}^{b}(x-a)^{i}(b-x)^{n-k-1-i}dx=(b-a)^{n-k}\frac{i!(n-i-k-1)!}{(n-k)!}=\frac{(b-a)^{n-k}}{(n-k)\binom{n-k-1}{i}}

so

\mathrm{V}_{k}\left[\mathrm{P}_{n}\right]\leq\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)\frac{(k+1)(b-a)}{n}\sum_{i=0}^{n-k-1}\left|\Delta_{k+1}^{i}\right|.

But we also have the relationship

\frac{(k+1)(b-a)}{n}\Delta_{k+1}^{i}=\Delta_{k}^{i+1}-\Delta_{k}^{i}

therefore

\frac{(k+1)(b-a)}{n}\sum_{i=0}^{n-k-1}\left|\Delta_{k+1}^{i}\right|=\sum_{i=0}^{n-k-1}\left|\Delta_{k}^{i+1}-\Delta_{k}^{i}\right|\leq\mathrm{V}_{k}[f]

We therefore deduce that

\mathrm{V}_{k}\left[\mathrm{P}_{n}\right]\leq\mathrm{V}_{k}[f],\quad k=0,1;\quad\mathrm{V}_{k}\left[\mathrm{P}_{n}\right]<\mathrm{V}_{k}[f],\quad k>1.

We have the property:
A continuous function $f$ of bounded $n$ -th variation is the limit of a sequence of polynomials, uniformly convergent in the interval ( $a,b$ ), whose total variations of order 0 and 1 are at most equal to those of the function and whose total variations of order $>1$ are smaller than those of the function.

38. - Approximation of differentiable functions. Let us finally see what results Mr. Bernstein's polynomials lead to for differentiable functions. Let us therefore assume that the function $f(x)$ has a continuous derivative of order $k$ , and let $\omega_{k}(\delta)$ be the oscillation modulus of this derivative. We know that we have the generalized mean value formula

k!\Delta_{k}^{i}=f^{(k)}\left(a+\frac{b-a}{n}(i+\theta k)\right),\quad 0<\theta<1,

using the notations above.
We deduce from this that

\begin{aligned}\left|k!\Delta_{k}^{i}-f^{(k)}(x)\right|&\leq\omega_{k}\left(\left|x-a-\frac{b-a}{n}(i+\theta k)\right|\right)\leq\\ &\leq\omega_{k}\left(\max\left(\left|x-a_{i}\right|,\left|x-a_{i+1}\right|,\ldots,\left|x-a_{i+k}\right|\right)\right)\end{aligned}

Now consider the polynomial

\begin{aligned}Q_{n,k}(x;f)&=\frac{\mathrm{P}_{n}^{(k)}(x;f)}{\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)}=\\ &=\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}k!\Delta_{k}^{i}(x-a)^{i}(b-x)^{n-k-i}.\end{aligned}

We have

\begin{gathered}\left|f^{(k)}-Q_{n,k}(x;f)\right|<\biggl\{\frac{1}{\delta}\biggl(\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{s}\binom{n-k}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-k-i}+\\ +\frac{1}{(b-a)^{n-k}}\sum_{i=s+1}^{n-k}\binom{n-k}{i}\left|x-a_{i+k}\right|(x-a)^{i}(b-x)^{n-k-i}\biggr)+1\biggr\}\omega_{k}(\delta)\end{gathered}

where $s$ is determined in the following way:

\begin{array}{lll}s=j-\frac{k}{2}&\text{ if }k\text{ is even and }&a_{j}\leq x\leq a_{j+1}\\ s=j-\frac{k+1}{2}&\text{ if }k\text{ is odd and }&a_{j}\leq x\leq\frac{a_{j}+a_{j+1}}{2}\\ s=j-\frac{k-1}{2}&\text{ if }k\text{ is odd and }&\frac{a_{j}+a_{j+1}}{2}\leq x\leq a_{j+1}.\end{array}

Of course, if in these formulas we have $s<0$ or $s\geq n-k$ , the first or second term in the second parenthesis disappears.

Noting that

\left|x-a_{i+k}\right|\leq\left|x-a_{i}\right|+\left|a_{i+k}-a_{i}\right|=\left|x-a_{i}\right|+\frac{k(b-a)}{n}

we can also write

\left|f^{(k)}-Q_{n,k}(x;f)\right|<\left\{\frac{1}{\delta}\left(\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-k-i}+\psi_{1}(x)\right)+1\right\}\omega_{k}(\delta)

where

\psi_{1}(x)=\frac{k}{n}\cdot\frac{1}{(b-a)^{n-k-1}}\sum_{i=s+1}^{n-k}\binom{n-k}{i}(x-a)^{i}(b-x)^{n-k-i}

and we have to take $\psi_{1}(x)\equiv 0$ if $s\geq n-k$ .
We now have

\begin{aligned}&\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-k-i}\leq\\ &\leq\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}\left|x-a-i\frac{b-a}{n-k}\right|(x-a)^{i}(b-x)^{n-k-i}+\\ &+\frac{k}{n(n-k)(b-a)^{n-k-1}}\sum_{i=0}^{n-k}\binom{n-k}{i}i(x-a)^{i}(b-x)^{n-k-i}\end{aligned}

(35)

which immediately results from the relationship

x-a_{i}=x-a-i\frac{b-a}{n-k}+i\frac{k(b-a)}{n(n-k)}.

But we know from No. 34 that

\frac{1}{(b-a)^{n-k}}\sum_{i=0}^{n-k}\binom{n-k}{i}\left|x-a-i\frac{b-a}{n-k}\right|(x-a)^{i}(b-x)^{n-k-i}\leq\frac{b-a}{2\sqrt{n-k}}.

On the other hand

\frac{k}{n(n-k)(b-a)^{n-k-1}}\sum_{i=0}^{n-k}\binom{n-k}{i}i(x-a)^{i}(b-x)^{n-k-i}=\frac{k(x-a)}{n}

and we see that the first member of relation (35) is

\leqslant\frac{b-a}{2\sqrt{n-k}}+\frac{k(b-a)}{n}.

We obviously also have

\psi_{1}(x)\leq\frac{k}{n}\cdot\frac{1}{(b-a)^{n-k-1}}\sum_{i=0}^{n-k}\binom{n-k}{i}(x-a)^{i}(b-x)^{n-k-i}=\frac{k(b-a)}{n}

from which it follows that

\left|f^{(k)}-Q_{n,k}(x;f)\right|<\left\{\frac{1}{\delta}\left(\frac{b-a}{2\sqrt{n-k}}+\frac{2k(b-a)}{n}\right)+1\right\}\omega_{k}(\delta)

or, putting $\delta=\frac{b-a}{\sqrt{n-k}}$ ,

\begin{aligned}\left|f^{(k)}-Q_{n,k}(x;f)\right|&<\left(\frac{3}{2}+2\frac{k\sqrt{n-k}}{n}\right)\omega_{k}\left(\frac{b-a}{\sqrt{n-k}}\right)\leq\\ &\leq\frac{3+2\sqrt{k}}{2}\omega_{k}\left(\frac{b-a}{\sqrt{n-k}}\right)\quad(n\geq k+1).\end{aligned}

(36)

39. - Convergence of derivatives of Mr. Bernstein's polynomials. The derivative of order $k$ of the function $f$ being assumed continuous, the upper bound $\Delta_{0}\left[f^{(k)}\right]$ is finite. We have

\begin{aligned}f^{(k)}-\mathrm{P}_{n}^{(k)}(x;f)&=f^{(k)}-\mathrm{Q}_{n,k}(x;f)+\\ &+\left[1-\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)\right]\mathrm{Q}_{n,k}(x;f)\end{aligned}

The results from No. 36 show us that

\left|Q_{n,k}(x;f)\right|\leq\Delta_{0}\left[f^{(k)}\right]

and on the other hand we have the inequality

1-\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)\leq\frac{k(k-1)}{2n}

Taking into account formula (36), we deduce

\left|f^{(k)}-\mathrm{P}_{n}^{(k)}(x;f)\right|<\frac{3+2\sqrt{k}}{2}\omega_{k}\left(\frac{b-a}{\sqrt{n-k}}\right)+\frac{k(k-1)}{2n}\Delta_{0}\left[f^{(k)}\right]

(37)

which shows us that:
If the function $f(x)$ , defined in the interval ( $a,b$ ), is continuous together with its first $k$ derivatives, the sequences of polynomials $\mathrm{P}_{n}(x;f)$ , $\mathrm{P}_{n}^{\prime}(x;f),\ldots,\mathrm{P}_{n}^{(k)}(x;f)$ tend absolutely and uniformly towards $f(x),f^{\prime}(x),\ldots,f^{(k)}(x)$ respectively, throughout the interval ( $a,b$ ).

Mr. E. Borel first posed the problem of finding a sequence of polynomials uniformly convergent to a continuous function $f(x)$ , such that the sequence formed by the derivatives of a given order $k$ of these polynomials is uniformly convergent to the derivative $f^{(k)}(x)$ , assumed continuous, of the function $f(x)$ . As can be seen, Mr. Bernstein's polynomials solve this problem in an elegant way. This qualitative result is due to Mr. S. Wigert (¹¹).

In particular, for the first-order derivative, the second term in the second member of inequality (37) vanishes so that

\left|f^{\prime}-\mathrm{P}_{n}^{\prime}(x;f)\right|<\frac{5}{2}\omega_{1}\left(\frac{b-a}{\sqrt{n-1}}\right)
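
The first-derivative estimate can be observed on an example. The sketch below is an illustration under assumptions chosen here: $f(x)=\sin x$ on $[0,\pi]$ , whose derivative $\cos x$ has oscillation modulus $\omega_{1}(\delta)=2\sin\frac{\delta}{2}$ for $\delta\leq\pi$ , a grid of 1001 evaluation points, and the helper name `bernstein_derivative`, which uses the first-difference form of $\mathrm{P}_{n}^{\prime}$ .

```python
import numpy as np
from math import comb

def bernstein_derivative(f, n, x, a, b):
    """P_n'(x; f) computed from the first differences of f at the nodes."""
    nodes = a + (b - a) * np.arange(n + 1) / n
    diffs = np.diff(f(nodes))
    t = (x - a) / (b - a)
    total = np.zeros_like(x)
    for i in range(n):
        total += comb(n - 1, i) * diffs[i] * t ** i * (1.0 - t) ** (n - 1 - i)
    return n / (b - a) * total

a, b = 0.0, np.pi
x = np.linspace(a, b, 1001)
table = []
for n in (5, 20, 60):
    err = float(np.max(np.abs(np.cos(x) - bernstein_derivative(np.sin, n, x, a, b))))
    delta = (b - a) / np.sqrt(n - 1)
    omega1 = 2.0 * np.sin(min(delta, np.pi) / 2.0)   # modulus of cos on [0, pi]
    table.append((n, err, 2.5 * omega1))
    print(n, err, 2.5 * omega1)
```

For a function this smooth the observed error is expected to be much smaller than the bound, which only uses the continuity of $f^{\prime}$ .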

We can also observe that if $f^{(k)}(x)$ satisfies an ordinary Lipschitz condition, the approximation of $f^{(k)}$ by $\mathrm{P}_{n}^{(k)}(x;f)$ is of the order of $\frac{1}{\sqrt{n}}$ , therefore of the same order as the approximation by $\mathrm{P}_{n}\left(x;f^{(k)}\right)$ .

Mr. S. Bernstein's polynomials also enjoy numerous properties which have been studied mainly by Mr. Bernstein himself as well as by his students.

40. - The upper limit of $\mu_{n}$ . We saw in No. 28 that the best approximation by polynomials of degree $n$ of a continuous function $f(x)$ is, in general, at least of the order of $\omega\left(\frac{b-a}{n}\right)$ , where $\omega(\delta)$ is the oscillation modulus of $f(x)$ . Mr. D. Jackson demonstrated for the first time that $\mu_{n}$ is indeed of the order of $\omega\left(\frac{b-a}{n}\right)$ (¹²). Various proofs of this result are known. We will not insist here on these proofs, however. One can usefully consult the cited book by Mr. Ch. de la Vallée Poussin (¹³). It would be interesting to see whether the number $\theta_{n}$ defined in No. 29 is not really of the order of $\frac{1}{n}$ . In this case the polynomials $\mathrm{T}_{n}$ would be sufficient for the proof, both qualitative and quantitative, of Weierstrass's theorem.

LECTURE V

The case of functions of two independent variables

41. - The problem of the best approximation for a function of two real variables. The preceding results can be extended, to a large extent, to functions of more than one variable, and in particular to those of two real variables. We will briefly examine this generalization. It is important to note that uniqueness no longer holds in general, even if the function is continuous.

Let us therefore take a real function $f(x,y)$ of two real variables $x$ and $y$, single-valued and defined in a certain bounded and closed domain (D). To keep things simple, we will assume that this domain is bounded by a simple closed curve. The domain (D) can be, for example, a rectangle

a\leq x\leq b,\quad c\leq y\leq d.

(38)

The function $f(x,y)$ will be assumed continuous in (D).
The problem is posed as for the case of functions of a single variable.

We consider the set of polynomials

\mathrm{P}(x,y)=a_{00}+a_{10}x+a_{01}y+\cdots+a_{n0}x^{n}+a_{n-1,1}x^{n-1}y+\cdots+a_{0n}y^{n}

of degree $n$ in the two variables $x$ and $y$. A polynomial of the set is completely determined by its coefficients $a_{ij}$.

We again denote by $\mathrm{M}(f)$ the maximum, or upper bound, of the function $f(x,y)$ in the domain (D). The error, or approximation, with which the polynomial $\mathrm{P}(x,y)$ represents the function $f(x,y)$ is equal, by definition, to $\mathrm{M}(|f-\mathrm{P}|)$. The best approximation of the function $f(x,y)$ by polynomials of degree $n$ is equal, by definition, to the lower bound $\mu_{n}(f)$, or more simply $\mu_{n}$, of $\mathrm{M}(|f-\mathrm{P}|)$ when $\mathrm{P}(x,y)$ ranges over the set of polynomials of degree $n$.

The problem to be examined is posed as for functions of a single variable:

Given the function $f(x,y)$, to determine the polynomials of degree n for which $\mathrm{M}(|f-\mathrm{P}|)$ attains its lower bound $\mu_{n}$, and to study this number $\mu_{n}$.

The problem of existence and of uniqueness, and the main properties of the best-approximation polynomials, were examined by Mr. L. Tonelli ⁽¹⁴⁾.

A polynomial for which the minimum $\mu_{n}$ is attained may here too be called a best-approximation polynomial of degree $n$ of the function $f$, and may be denoted by $\mathrm{T}_{n}(x,y;f)$ or, more simply, by $\mathrm{T}_{n}$. Here too we will say that such a polynomial is a polynomial $\mathrm{T}_{n}$.

As for the number $\mu_{n}$, it is positive or zero and, moreover, it can vanish only if $f(x,y)$ coincides with a polynomial of degree $n$. In what follows we will assume that we are in the case $\mu_{n}>0$.

If $\mathrm{P}(x,y)$ is a polynomial $\mathrm{T}_{n}$ of the function $f(x,y)$, the polynomial $\mathrm{P}(x,y)+\mathrm{Q}(x,y)$, where $\mathrm{Q}(x,y)$ is a polynomial of degree $n$, is a polynomial $\mathrm{T}_{n}$ of the function $f(x,y)+\mathrm{Q}(x,y)$. Conversely, any polynomial $\mathrm{T}_{n}$ of the function $f(x,y)+\mathrm{Q}(x,y)$ is of the form $\mathrm{P}(x,y)+\mathrm{Q}(x,y)$. We have

\mu_{n}(f+Q)=\mu_{n}(f)

Likewise, C being a constant, $\mathrm{CP}(x,y)$ is a polynomial $\mathrm{T}_{n}$ of the function $\mathrm{C}f(x,y)$ and, conversely, any polynomial $\mathrm{T}_{n}$ of $\mathrm{C}f(x,y)$ is of the form $\mathrm{CP}(x,y)$. We have

\mu_{n}(\mathrm{C}f)=|\mathrm{C}|\mu_{n}(f).

42. - The existence of best-approximation polynomials. The preliminary lemma of No. 6 extends immediately:

If a polynomial $\mathrm{P}(x,y)$ of degree n remains bounded by a number A in the domain (D), the coefficients $a_{ij}$ remain bounded by a number $\lambda\mathrm{A}$, where $\lambda$ depends only on n and on the domain (D).

The proof is carried out in the same way. We take $\mathrm{N}=\binom{n+2}{2}$ points $\mathrm{M}_{r}\left(x_{r},y_{r}\right)$, $r=1,2,\ldots,\mathrm{N}$, in (D) such that the determinant

\left|1\quad x_{r}\quad y_{r}\quad\ldots\quad x_{r}^{n}\quad x_{r}^{n-1}y_{r}\quad\ldots\quad y_{r}^{n}\right|,\quad r=1,2,\ldots,\mathrm{N}

(39)

is different from zero. We then solve the system

\begin{gathered}a_{00}+a_{10}x_{r}+a_{01}y_{r}+\ldots+a_{n0}x_{r}^{n}+a_{n-1,1}x_{r}^{n-1}y_{r}+\ldots+a_{0n}y_{r}^{n}=\mathrm{P}\left(x_{r},y_{r}\right)\\ r=1,2,\ldots,\mathrm{N}\end{gathered}

with respect to the coefficients $a_{ij}$ by means of Cramer's rule, taking into account that

\left|\mathrm{P}\left(x_{r},y_{r}\right)\right|<\mathrm{A},\quad r=1,2,\ldots,\mathrm{N}.

The points $\mathrm{M}_{r}$ can easily be chosen so that the determinant (39) is different from zero. It suffices to take N distinct points forming a triangular lattice, as follows:

\left(x_{r},y_{s}\right),\quad r=1,2,\ldots,n+1,\quad s=r,r+1,\ldots,n+1.

The determinant of the system is then equal, up to sign, to

\begin{gathered}\mathrm{V}\left(x_{1},x_{2}\right)\mathrm{V}\left(x_{1},x_{2},x_{3}\right)\ldots\mathrm{V}\left(x_{1},x_{2},\ldots,x_{n+1}\right)\mathrm{V}\left(y_{1},y_{2},\ldots,y_{n+1}\right)\cdot\\ \cdot\mathrm{V}\left(y_{2},y_{3},\ldots,y_{n+1}\right)\ldots\mathrm{V}\left(y_{n-1},y_{n},y_{n+1}\right)\mathrm{V}\left(y_{n},y_{n+1}\right),\end{gathered}

using the notation already introduced for the Vandermonde determinant.
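The choice of a triangular lattice can be checked numerically for small $n$. The following is only an illustrative sketch in exact rational arithmetic, not part of the original lecture: the function names (`monomials`, `triangular_lattice`, `lattice_determinant`, `vandermonde_product`) and the sample abscissae and ordinates are assumptions made for the example. It builds the matrix of the system on the lattice $(x_{r},y_{s})$, $s\geq r$, and compares the absolute value of its determinant with the product of Vandermonde determinants written above.

```python
from fractions import Fraction

def monomials(n):
    """Exponent pairs (i, j) with i + j <= n: 1, x, y, x^2, xy, y^2, ..."""
    return [(k - j, j) for k in range(n + 1) for j in range(k + 1)]

def triangular_lattice(xs, ys):
    """Points (x_r, y_s) with s >= r (hypothetical helper for the lattice)."""
    size = len(xs)
    return [(xs[r], ys[s]) for r in range(size) for s in range(r, size)]

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in rows]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det

def lattice_determinant(xs, ys):
    """Determinant of the N x N system on the triangular lattice."""
    n = len(xs) - 1
    points = triangular_lattice(xs, ys)
    return determinant([[x**i * y**j for (i, j) in monomials(n)]
                        for (x, y) in points])

def vandermonde(values):
    """V(v_1, ..., v_k) as the product of the differences v_j - v_i, i < j."""
    prod = Fraction(1)
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            prod *= values[j] - values[i]
    return prod

def vandermonde_product(xs, ys):
    """V(x1,x2)...V(x1..x_{n+1}) V(y1..y_{n+1}) V(y2..y_{n+1})...V(y_n,y_{n+1})."""
    n = len(xs) - 1
    prod = Fraction(1)
    for k in range(2, n + 2):
        prod *= vandermonde(xs[:k])
    for k in range(n):
        prod *= vandermonde(ys[k:])
    return prod

if __name__ == "__main__":
    xs, ys = [0, 1, 3], [0, 2, 5]  # sample values chosen for the example (n = 2)
    print(lattice_determinant(xs, ys), vandermonde_product(xs, ys))
```

The comparison is made on absolute values because the sign depends on the order chosen for the points and for the monomials.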

The results of No. 7 are applicable. $\mathrm{M}(|f-\mathrm{P}|)$ is a continuous function of the coefficients $a_{ij}$. It follows that the lower bound $\mu_{n}$ of the numbers $\mathrm{M}(|f-\mathrm{P}|)$ coincides with their lower limit.

Repeating now the reasoning of No. 8 we can state the property:

Whatever the continuous function $f(x,y)$, there exists at least one best-approximation polynomial of degree n.

It should be noted that this result remains true even for an arbitrary bounded function.
43. - First property of the best-approximation polynomials. If $\mathrm{P}(x,y)$ is a best-approximation polynomial of degree $n$, there exists at least one point ( $x,y$ ) where we have

|f(x,y)-\mathrm{P}(x,y)|=\mu_{n}

(40)

A first, more precise statement about the number of these points is given by the following property:

If $\mathrm{P}(x,y)$ is a best-approximation polynomial of degree n, there exist at least $n+2$ points where equality (40) holds.

To prove this property let us first consider $n+1$ distinct points $\mathrm{M}_{r}\left(x_{r},y_{r}\right),r=1,2,\ldots,n+1$, and the array

\left\|1\quad x_{r}\quad y_{r}\quad\ldots\quad x_{r}^{n}\quad x_{r}^{n-1}y_{r}\quad\ldots\quad y_{r}^{n}\right\|,\quad r=1,2,\ldots,n+1

(41)

with $\mathrm{N}=\binom{n+2}{2}$ columns and $n+1$ rows. Let us multiply this array by the following one

\left\|\begin{array}[]{cccccccccccccccc}1&0&0&0&0&0&0&0&0&0&\ldots&\ldots&\ldots&\ldots&\ldots&0\\ 0&1&i&0&0&0&0&0&0&0&\ldots&\ldots&\ldots&\ldots&\ldots&0\\ 0&0&0&1&2i&-1&0&0&0&0&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots\\ 0&0&0&0&0&0&1&3i&-3&-i&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots\\ \ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots&\ldots\\ 0&0&0&0&0&0&0&0&0&0&0&\ldots&1&\binom{n}{1}i&\binom{n}{2}i^{2}&\ldots\end{array}\right\|

where $i=\sqrt{-1}$.
If we put $z_{r}=x_{r}+iy_{r}$, we see that the product of the two arrays is equal to the determinant $\mathrm{V}\left(z_{1},z_{2},\ldots,z_{n+1}\right)$ and is therefore different from zero. The well-known formula of Cauchy then shows that the array (41) contains at least one determinant of order $n+1$ different from zero.
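The product of the two arrays can be made concrete with a short computation. The sketch below is an illustration under assumptions, not the author's procedure: it reads the "product" as the row-by-row product of the two rectangular arrays (the reading under which the Cauchy-Binet formula applies), and the helper names `monomial_row`, `binomial_row` and `row_product` are hypothetical. It checks that the entry obtained from the row of the point $(x_{r},y_{r})$ and the row of index $k$ is $z_{r}^{k}$, so that the product is indeed the Vandermonde array of the $z_{r}$.

```python
from math import comb

def monomial_row(x, y, n):
    """Row of array (41) for the point (x, y): 1, x, y, x^2, xy, y^2, ..., y^n."""
    return [x**(k - j) * y**j for k in range(n + 1) for j in range(k + 1)]

def binomial_row(k, n):
    """Row k of the second array: coefficients of (x + i y)^k on the same monomials."""
    row = []
    for deg in range(n + 1):
        for j in range(deg + 1):
            # only the monomials of total degree k receive a nonzero coefficient
            row.append(comb(k, j) * 1j**j if deg == k else 0)
    return row

def row_product(points, n):
    """Matrix whose entry (r, k) is the row-by-row product; it should equal z_r^k."""
    return [[sum(a * b for a, b in zip(monomial_row(x, y, n), binomial_row(k, n)))
             for k in range(n + 1)]
            for (x, y) in points]

if __name__ == "__main__":
    points = [(0, 0), (1, 2), (-1, 3)]  # sample points chosen for the example, n = 2
    for (x, y), row in zip(points, row_product(points, 2)):
        print(complex(x, y), row)
```

Since the points are distinct, the numbers $z_{r}$ are distinct and the Vandermonde determinant of the product does not vanish; by the Cauchy-Binet expansion at least one minor of order $n+1$ of the first array must then be nonzero.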

Let us now suppose that equality (40) holds at only $m\leq n+1$ points $\mathrm{M}_{r}\left(x_{r},y_{r}\right),r=1,2,\ldots,m$. From the property of the array (41) just proved it follows that we can find, in its first $m$ rows, a determinant of order $m$ different from zero. To fix ideas, let

\left|1\quad x_{r}\quad x_{r}^{2}\quad\ldots\quad x_{r}^{m-2}\quad x_{r}y_{r}\right|,\quad r=1,2,\ldots,m,

be such a determinant.
Let us construct the polynomial $\mathrm{Q}(x,y)$ of degree $n$, of the form

\mathrm{Q}(x,y)=b_{0}+b_{1}x+\cdots+b_{m-2}x^{m-2}+bxy

satisfying the conditions

\mathrm{Q}\left(x_{r},y_{r}\right)=f\left(x_{r},y_{r}\right)-\mathrm{P}\left(x_{r},y_{r}\right),\quad r=1,2,\ldots,m,

which is possible.
Let us consider the closed circles ( $\mathrm{C}_{r}$ ) with centre $\mathrm{M}_{r}$ and radius equal to a positive number $\delta$. We choose this number $\delta$ so that
$1^{0}$. the circles ( $\mathrm{C}_{r}$ ) do not intersect;
$2^{0}$. $f(x,y)-\mathrm{P}(x,y)$ and $\mathrm{Q}(x,y)$ do not vanish in these circles.

It follows that in each circle the functions $f(x,y)-\mathrm{P}(x,y)$ and $\mathrm{Q}(x,y)$ keep the same sign, since they are continuous, do not vanish there, and coincide at the centre $\mathrm{M}_{r}$.

Let ( $\mathrm{D}^{\prime}$ ) be the closed domain obtained from (D) by removing the interiors of the circles ( $\mathrm{C}_{r}$ ). In this domain ( $\mathrm{D}^{\prime}$ ) we have

|f(x,y)-\mathrm{P}(x,y)|\leq\mu^{\prime}<\mu_{n}

Let now $\lambda$ be a positive number chosen so that

\lambda<\frac{\mu_{n}-\mu^{\prime}}{2M(|Q|)}\left(<\frac{\mu_{n}}{M(|Q|)}\right).

We then have

|f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)|<\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}

throughout the domain ( $\mathrm{D}^{\prime}$ ).
In a circle ( $\mathrm{C}_{r}$ ) we have

|f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)|<\mu_{n}.

Indeed, if for example

f\left(x_{r},y_{r}\right)-\mathrm{P}\left(x_{r},y_{r}\right)=\mu_{n},

then in ( $\mathrm{C}_{r}$ )

-\mu_{n}<-\lambda Q(x,y)\leq f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)\leq f(x,y)-\mathrm{P}(x,y)\leq\mu_{n},

where equality cannot hold in the second and third inequalities unless $f(x,y)-\mathrm{P}(x,y)=0$ or $\mathrm{Q}(x,y)=0$, which we have seen to be impossible.

It follows that, in the entire domain (D), we have

|f(x,y)-\mathrm{P}(x,y)-\lambda\mathrm{Q}(x,y)|<\mu_{n}

so the polynomial $\mathrm{P}(x,y)+\lambda\mathrm{Q}(x,y)$ gives a better approximation. This contradicts the hypothesis that $\mathrm{P}(x,y)$ is a polynomial $\mathrm{T}_{n}$; thus the property is proven.

44. - Completion of the previous result. The previous property can be made more precise as follows:

If $\mathrm{P}(x,y)$ is a polynomial $\mathrm{T}_{n}$, there are at least $\left[\frac{n+2}{2}\right]$ points $\mathrm{M}_{r}\left(x_{r},y_{r}\right)$ where

f\left(x_{r},y_{r}\right)-\mathrm{P}\left(x_{r},y_{r}\right)=\mu_{n}

and at least $\left[\frac{n+2}{2}\right]$ points $\mathrm{M}_{r}^{\prime}\left(x_{r}^{\prime},y_{r}^{\prime}\right)$ where

f\left(x_{r}^{\prime},y_{r}^{\prime}\right)-\mathrm{P}\left(x_{r}^{\prime},y_{r}^{\prime}\right)=-\mu_{n},

$[\alpha]$ denoting the largest integer contained in $\alpha$.
Let us prove, for example, the first part of the statement.

It cannot happen that there is no point $\mathrm{M}_{r}$, for otherwise we would have

-\mu_{n}\leq f(x,y)-\mathrm{P}(x,y)\leq\mu^{\prime}<\mu_{n}\quad,\quad\text{ in (D) }

-\mu_{n}<-\frac{\mu_{n}+\mu^{\prime}}{2}\leq f(x,y)-\mathrm{P}(x,y)+\frac{\mu_{n}-\mu^{\prime}}{2}\leq\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}

and the polynomial $\mathrm{P}(x,y)-\frac{\mu_{n}-\mu^{\prime}}{2}$ would give a better approximation.
Let us therefore assume that there are only $m<\left[\frac{n+2}{2}\right]$ points $\mathrm{M}_{r}$, $r=1,2,\ldots,m$. We again consider the closed circles ( $\mathrm{C}_{r}$ ) defined in the previous No. We again take their common radius $\delta$ small enough that the circles do not intersect and that $f(x,y)-\mathrm{P}(x,y)$ remains positive in these circles. In each circle ( $\mathrm{C}_{r}$ ) we take a point $\mathrm{M}_{r}^{*}\left(\xi_{r},\eta_{r}\right)$ at a distance $\frac{1}{4}\delta$ from $\mathrm{M}_{r}$, and let ( $\mathrm{C}_{r}^{*}$ ) be the closed circle with centre $\mathrm{M}_{r}^{*}$ and radius equal to $\frac{3}{4}\delta=\delta^{*}$.
Consider now the polynomial, of degree $2m\leq n$,

\begin{gathered}Q(x,y)=\left[\left(x-\xi_{1}\right)^{2}+\left(y-\eta_{1}\right)^{2}-\delta^{*2}\right]\left[\left(x-\xi_{2}\right)^{2}+\left(y-\eta_{2}\right)^{2}-\delta^{*2}\right]\ldots\\ \ldots\left[\left(x-\xi_{m}\right)^{2}+\left(y-\eta_{m}\right)^{2}-\delta^{*2}\right].\end{gathered}

This polynomial vanishes only on the contours of the circles ( $\mathrm{C}_{r}^{*}$ ). We have $Q(x,y)<0$ inside these circles and $Q(x,y)>0$ in the open domain ( $\mathrm{D}^{\prime}$ ), which is obtained from (D) by removing the circles ( $\mathrm{C}_{r}^{*}$ ).

In ( $\mathrm{D}^{\prime}$ ) we have

-\mu_{n}\leq f(x,y)-P(x,y)\leq\mu^{\prime}<\mu_{n}.

Let's take $\lambda$ positive so that

\lambda<\frac{\mu_{n}-\mu^{\prime}}{2M(|Q|)}\left(<\frac{\mu_{n}}{M(|Q|)}\right).

We have in the domain ( $\mathrm{D}^{\prime}$ )

f(x,y)-\mathrm{P}(x,y)+\lambda\mathrm{Q}(x,y)<\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}

and

-\mu_{n}<f(x,y)-\mathrm{P}(x,y)+\lambda Q(x,y).

For this last inequality, note that a priori we only have the sign $\leq$. Equality could occur only if $Q(x,y)=0$, but then $f(x,y)-\mathrm{P}(x,y)>0$. We therefore have

|f(x,y)-\mathrm{P}(x,y)+\lambda Q(x,y)|<\mu_{n}

in ( $\mathrm{D}^{\prime}$ ).
In the circles ( $\mathrm{C}_{r}^{*}$ ) we have

f(x,y)-\mathrm{P}(x,y)+\lambda Q(x,y)<\mu_{n}

-\mu_{n}<\lambda Q(x,y)<f(x,y)-\mathrm{P}(x,y)+\lambda Q(x,y).

It follows that we have

|f(x,y)-\mathrm{P}(x,y)+\lambda Q(x,y)|<\mu_{n}

everywhere in (D). The polynomial $\mathrm{P}(x,y)-\lambda\mathrm{Q}(x,y)$ therefore gives a better approximation, contrary to the hypothesis. This contradiction proves the property.

The proof is carried out in the same way for the points $\mathrm{M}_{r}^{\prime}$.
45. - Theorem of Mr. L. Tonelli. Let $\mathrm{P}(x,y)$ be a polynomial of degree n and E the set of points ( $x^{\prime},y^{\prime}$ ) at which $\mathrm{M}(|f-\mathrm{P}|)$ is attained. The set E may be finite or an arbitrary closed set. Mr. L. Tonelli gave the following theorem, which is somewhat analogous to the first theorem of Mr. E. Borel (No. 15):

The necessary and sufficient condition for $\mathrm{P}(x,y)$ to be a polynomial $\mathrm{T}_{n}$ is that no polynomial $Q(x,y)$ of degree n can be found satisfying the conditions
$1^{0}$. $\operatorname{sg}\mathrm{Q}\left(x^{\prime},y^{\prime}\right)=\operatorname{sg}\left(f\left(x^{\prime},y^{\prime}\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)\right)$,
$2^{0}$. $\Gamma>\left|Q\left(x^{\prime},y^{\prime}\right)\right|>\gamma>0$,
at all points of the set E.
To show that this condition is sufficient, it suffices to show that if $\mathrm{P}(x,y)$ is not a polynomial $\mathrm{T}_{n}$ we can find such a polynomial $Q(x,y)$. Let us therefore suppose that

\mathrm{M}(|f-\mathrm{P}|)=\mu^{\prime}>\mu_{n}

From the relation

\mathrm{T}_{n}(x,y;f)-\mathrm{P}(x,y)=f(x,y)-\mathrm{P}(x,y)-\left(f(x,y)-\mathrm{T}_{n}(x,y;f)\right)

it follows that

\begin{gathered}\operatorname{sg}\left(\mathrm{T}_{n}\left(x^{\prime},y^{\prime};f\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)\right)=\operatorname{sg}\left(f\left(x^{\prime},y^{\prime}\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)\right)\\ 0<\mu^{\prime}-\mu_{n}\leq\left|\mathrm{T}_{n}\left(x^{\prime},y^{\prime};f\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)\right|\leq\mu^{\prime}+\mu_{n}.\end{gathered}

So we can take $Q(x,y)=T_{n}(x,y;f)-\mathrm{P}(x,y)$ .
Let us now show that this condition is also necessary. Let us assume that

\left|f(x,y)-\mathrm{P}(x,y)\right|\leq\mu^{\prime}<\mu_{n}

Let now $\lambda$ be a positive number chosen so that

\lambda<\min\left(\frac{\mu_{n}-\mu^{\prime}}{2\mathrm{M}(|Q|)},\frac{\mu_{n}-\varepsilon}{\Gamma+\varepsilon}\right)

We then have

|f(x,y)-P(x,y)-\lambda Q(x,y)|<\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}

in ( $\mathrm{D}^{\prime}$ ).
In a circle ( $\mathrm{C}^{\prime}$ ) where

f\left(x^{\prime},y^{\prime}\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)=\mu_{n}

we have

\begin{array}[]{r}0<\mu_{n}-\varepsilon<f(x,y)-\mathrm{P}(x,y)\leq\mu_{n}\\ -\lambda(\Gamma+\varepsilon)<-\lambda Q(x,y)<-\lambda(\gamma-\varepsilon)\end{array}

so

0<\mu_{n}-\varepsilon-\lambda(\Gamma+\varepsilon)<f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)<\mu_{n}-\lambda(\gamma-\varepsilon)<\mu_{n}.

Similarly, we observe that in a circle ( $\mathrm{C}^{\prime}$ ) where

f\left(x^{\prime},y^{\prime}\right)-\mathrm{P}\left(x^{\prime},y^{\prime}\right)=-\mu_{n}

we have

-\mu_{n}<-\mu_{n}+\lambda(\gamma-\varepsilon)<f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)<-\mu_{n}+\varepsilon+\lambda(\Gamma+\varepsilon)<0.

It follows therefore that in the circles ( $\mathrm{C}^{\prime}$ ),

|f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)|<\mu_{n}-\lambda(\gamma-\varepsilon)<\mu_{n}.

We thus see that, for $\lambda$ sufficiently small, we have

|f(x,y)-\mathrm{P}(x,y)-\lambda Q(x,y)|<\mu_{n}

in the whole domain (D). This inequality contains the contradiction that proves the theorem.
46. - Multiplicity of polynomials $\mathrm{T}_{n}$. We will now show, by an example, that the polynomial $\mathrm{T}_{n}(x,y;f)$ may fail to be unique.

Let $f(x)$ be a continuous function of one variable defined in the interval $(a,b)$ and $\mathrm{T}_{n}(x)$ its best-approximation polynomial of degree n. Let us denote by $\mu_{n}^{*}$ the approximation given by $\mathrm{T}_{n}(x)$.

Let us now consider the function

f(x,y)=\frac{1}{d-c}\left[(y-c)\mathrm{T}_{n}(x)-(y-d)f(x)\right]

defined in the rectangle (38). Let $\mu_{n}$ be the best approximation of $f(x,y)$ by polynomials of degree n. We have

f(x,y)-\mathrm{T}_{n}(x)=\frac{d-y}{d-c}\left(f(x)-\mathrm{T}_{n}(x)\right)

which shows that

\left|f(x,y)-\mathrm{T}_{n}(x)\right|\leq\mu_{n}^{*}

where equality is possible only for $y=c$ and for certain values of $x$. So we certainly have $\mu_{n}\leq\mu_{n}^{*}$.

We now have

f(x,c)=f(x)

and it follows that if $\mathrm{P}(x,y)$ is a polynomial $\mathrm{T}_{n}$ of $f(x,y)$ we must have

\mathrm{P}(x,c)\equiv\mathrm{T}_{n}(x).

Otherwise there would be at least one value of $x$ for which

|f(x,c)-\mathrm{P}(x,c)|>\mu_{n}^{*}.

We therefore have

\mu_{n}=\mu_{n}^{*}.

Consider now the polynomial

\mathrm{P}(x,y)=\mathrm{T}_{n}(x)+\lambda\mu_{n}^{*}\frac{y-c}{d-c}

with $|\lambda|\leq 1$. We have

\begin{gathered}|f(x,y)-\mathrm{P}(x,y)|=\frac{1}{d-c}\left|(d-y)\left(f(x)-\mathrm{T}_{n}(x)\right)-\lambda\mu_{n}^{*}(y-c)\right|\leq\\ \leq\frac{\mu_{n}^{*}}{d-c}[(d-y)+|\lambda|(y-c)]\leq\mu_{n}^{*}=\mu_{n}\end{gathered}

so all these polynomials are polynomials $\mathrm{T}_{n}$. Without dwelling further on this, we only point out that Mr. L. Tonelli has also established various other properties of the polynomials $\mathrm{T}_{n}$. One may consult the cited article by Mr. Tonelli.
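The example can be tried on a concrete case. The sketch below is only an illustration under stated assumptions: it takes $n=1$, the interval $[-1,1]$, the function $f(x)=x^{2}$ (whose best approximation of degree 1 is the constant $\frac{1}{2}$, with $\mu_{1}^{*}=\frac{1}{2}$), and the rectangle $-1\leq x\leq 1$, $0\leq y\leq 1$; the names `f1`, `t1`, `f2`, `p_lambda` and `max_error` are hypothetical. The maximum error is evaluated on a finite grid in exact rational arithmetic, which illustrates the bound but is not a proof.

```python
from fractions import Fraction

# Rectangle (38) chosen for the example: a <= x <= b, c <= y <= d.
A, B, C, D = Fraction(-1), Fraction(1), Fraction(0), Fraction(1)
MU_STAR = Fraction(1, 2)  # best approximation of x^2 by degree-1 polynomials on [-1, 1]

def f1(x):
    """One-variable function f(x) = x^2 (assumption of the example)."""
    return x * x

def t1(x):
    """Its best-approximation polynomial of degree 1 on [-1, 1]: the constant 1/2."""
    return Fraction(1, 2)

def f2(x, y):
    """Two-variable function of the text: [(y-c) T_n(x) - (y-d) f(x)] / (d-c)."""
    return ((y - C) * t1(x) - (y - D) * f1(x)) / (D - C)

def p_lambda(x, y, lam):
    """Candidate polynomial T_n(x) + lam * mu* * (y-c)/(d-c)."""
    return t1(x) + lam * MU_STAR * (y - C) / (D - C)

def max_error(lam, steps=20):
    """Largest |f2 - p_lambda| over a (steps+1) x (steps+1) grid of the rectangle."""
    xs = [A + (B - A) * Fraction(i, steps) for i in range(steps + 1)]
    ys = [C + (D - C) * Fraction(j, steps) for j in range(steps + 1)]
    return max(abs(f2(x, y) - p_lambda(x, y, lam)) for x in xs for y in ys)

if __name__ == "__main__":
    for lam in (Fraction(-1), Fraction(0), Fraction(1, 3), Fraction(1), Fraction(3, 2)):
        print(lam, max_error(lam))
```

For every sampled $\lambda$ with $|\lambda|\leq 1$ the grid maximum equals $\mu_{1}^{*}=\frac{1}{2}$, while a value such as $\lambda=\frac{3}{2}$ gives a larger error, in agreement with the inequality above.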
47. - Weierstrass's theorem. Weierstrass's theorem, stated in No. 30 for continuous functions of one variable, remains true. This theorem tells us that if the function is continuous we have

\mu_{n}(f)\rightarrow 0\quad\text{ for }\quad n\rightarrow\infty.

For simplicity, let us assume that (D) is the rectangle (38). In very general cases we can reduce to this case by suitably extending the function $f(x,y)$. We can prove Weierstrass's theorem with the help of Mr. S. Bernstein's polynomials of two variables. Let

\begin{gathered}\mathrm{P}_{m,n}(x,y;f)=\frac{1}{(b-a)^{m}(d-c)^{n}}\sum_{i=0}^{m}\sum_{j=0}^{n}\binom{m}{i}\binom{n}{j}f\left(a_{i},c_{j}\right)(x-a)^{i}(b-x)^{m-i}\cdot\\ \cdot(y-c)^{j}(d-y)^{n-j}\end{gathered}

where

	$\displaystyle a_{i}=a+i\frac{b-a}{m},$	$\displaystyle i=0,1,\ldots,m$
	$\displaystyle c_{j}=c+j\frac{d-c}{n},$	$\displaystyle j=0,1,\ldots,n$

be these polynomials.
To bound the approximation given by this polynomial we define the oscillation modulus $\omega(\delta)$ of $f(x,y)$ in the following way

\omega(\delta)=\max\left|f\left(x^{\prime},y^{\prime}\right)-f\left(x^{\prime\prime},y^{\prime\prime}\right)\right|

when ( $x^{\prime},y^{\prime}$ ), ( $x^{\prime\prime},y^{\prime\prime}$ ) are two points in (D) such that

\left|x^{\prime}-x^{\prime\prime}\right|+\left|y^{\prime}-y^{\prime\prime}\right|\leq\delta.

The function $\omega(\delta)$ enjoys properties analogous to those of the case of functions of one variable. These properties are proved in the same way. Let us recall them here for the case of two variables.
$\omega(\delta)$ is a function defined for $\delta\leq b-a+d-c$, non-decreasing and never negative. We have

\left|f\left(x^{\prime},y^{\prime}\right)-f\left(x^{\prime\prime},y^{\prime\prime}\right)\right|\leq\omega\left(\left|x^{\prime}-x^{\prime\prime}\right|+\left|y^{\prime}-y^{\prime\prime}\right|\right)

and

\omega(k\delta)<(k+1)\omega(\delta)

for any positive number $k$ such that $k\delta$ and $\delta$ are $\leq b-a+d-c$.
The necessary and sufficient condition for the function $f(x,y)$ to be continuous in (D) is that $\omega(\delta)\rightarrow 0$ for $\delta\rightarrow 0$.

Returning now to our problem, we can write, taking into account the properties of the oscillation modulus,

\begin{gathered}\left|f(x,y)-\mathrm{P}_{m,n}(x,y;f)\right|\leq\frac{1}{(b-a)^{m}(d-c)^{n}}\sum_{i=0}^{m}\sum_{j=0}^{n}\binom{m}{i}\binom{n}{j}.\\ \cdot\left|f(x,y)-f\left(a_{i},c_{j}\right)\right|(x-a)^{i}(b-x)^{m-i}(y-c)^{j}(d-y)^{n-j}\end{gathered}

and

\left|f(x,y)-f\left(a_{i},c_{j}\right)\right|<\left[\frac{\left|x-a_{i}\right|+\left|y-c_{j}\right|}{\delta}+1\right]\omega(\delta)

Doing the calculations, it is found that

\begin{gathered}\left|f(x,y)-\mathrm{P}_{m,n}(x,y;f)\right|\leq\left\{\frac{1}{\delta}\left[\frac{1}{(b-a)^{m}}\sum_{i=0}^{m}\binom{m}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{m-i}+\right.\right.\\ \left.\left.\quad+\frac{1}{(d-c)^{n}}\sum_{j=0}^{n}\binom{n}{j}\left|y-c_{j}\right|(y-c)^{j}(d-y)^{n-j}\right]+1\right\}\omega(\delta).\end{gathered}

However, we showed in No. 34 that

\frac{1}{(b-a)^{m}}\sum_{i=0}^{m}\binom{m}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{m-i}\leq\frac{b-a}{2\sqrt{m}}

\frac{1}{(d-c)^{n}}\sum_{j=0}^{n}\binom{n}{j}\left|y-c_{j}\right|(y-c)^{j}(d-y)^{n-j}\leq\frac{d-c}{2\sqrt{n}}

So if we take

\delta=\frac{b-a}{\sqrt{m}}+\frac{d-c}{\sqrt{n}}

we find

\left|f(x,y)-\mathrm{P}_{m,n}(x,y;f)\right|<\frac{3}{2}\omega\left(\frac{b-a}{\sqrt{m}}+\frac{d-c}{\sqrt{n}}\right).

Letting $m\rightarrow\infty,n\rightarrow\infty$ we recover Weierstrass's theorem.
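The bound $\frac{3}{2}\omega(\delta)$ can be observed numerically. The following sketch is an illustration under assumptions and not part of the original text: it works on the unit square $a=c=0$, $b=d=1$, with the test function $f(x,y)=|x-\frac{1}{2}|+|y-\frac{1}{2}|$, whose oscillation modulus for the distance $|x^{\prime}-x^{\prime\prime}|+|y^{\prime}-y^{\prime\prime}|$ is $\omega(\delta)=\min(\delta,1)$; the names `bernstein2`, `f`, `max_error` and `bound` are hypothetical, and the error is sampled on a finite grid only.

```python
from math import comb, sqrt

def bernstein2(f, m, n, x, y, a=0.0, b=1.0, c=0.0, d=1.0):
    """Value at (x, y) of the two-variable Bernstein polynomial P_{m,n}(x, y; f)."""
    u = (x - a) / (b - a)  # normalised abscissa, so that (x-a)^i (b-x)^(m-i) / (b-a)^m = u^i (1-u)^(m-i)
    v = (y - c) / (d - c)  # normalised ordinate
    total = 0.0
    for i in range(m + 1):
        wi = comb(m, i) * u**i * (1 - u)**(m - i)
        ai = a + i * (b - a) / m
        for j in range(n + 1):
            wj = comb(n, j) * v**j * (1 - v)**(n - j)
            cj = c + j * (d - c) / n
            total += wi * wj * f(ai, cj)
    return total

def f(x, y):
    """Test function chosen for the example; continuous but not differentiable."""
    return abs(x - 0.5) + abs(y - 0.5)

def max_error(m, n, grid=20):
    """Largest |f - P_{m,n}| over a (grid+1) x (grid+1) sample of the unit square."""
    return max(abs(f(i / grid, j / grid) - bernstein2(f, m, n, i / grid, j / grid))
               for i in range(grid + 1) for j in range(grid + 1))

def bound(m, n):
    """Right-hand side (3/2) * omega(delta) with delta = 1/sqrt(m) + 1/sqrt(n)."""
    delta = 1 / sqrt(m) + 1 / sqrt(n)
    omega = min(delta, 1.0)  # oscillation modulus of this particular f on the unit square
    return 1.5 * omega

if __name__ == "__main__":
    for m in (4, 16, 64):
        print(m, max_error(m, m), bound(m, m))
```

The sampled error stays below the bound and decreases as $m$ and $n$ grow, although for this function the bound is far from sharp.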
48. - The problem of the best approximation for a function of a complex variable. So far we have studied the case of functions of real variables. Let us briefly examine the case of functions of a complex variable. A function $f(x,y)$ of two real variables that takes real or complex values can also be called a function of the complex variable $z=x+iy$ $(i=\sqrt{-1})$. Such a function is of the form $f_{1}(x,y)+if_{2}(x,y)$, where $f_{1}$ and $f_{2}$ are real functions. The necessary and sufficient condition for the function $f(x,y)$ to be continuous is that the functions $f_{1}$ and $f_{2}$ be continuous.

For brevity the function $f(x,y)$ is also denoted by $f(z)$. We will assume, as above, that $f(z)$ is defined and continuous in the domain (D).

Let us now consider the set of analytic polynomials of degree n

\mathrm{P}(z)=a_{0}z^{n}+a_{1}z^{n-1}+\cdots+a_{n}

A polynomial of the set is completely determined by its coefficients $a_{0},a_{1},\ldots,a_{n}$ real or complex.

The modulus of a function is a real function, so $\mathrm{M}(|f-\mathrm{P}|)$ has a well-defined meaning here as well and represents, by definition, the error or approximation with which the polynomial $\mathrm{P}(z)$ represents the function $f(z)$ in the domain (D). The best approximation $\mu_{n}(f)$, or more briefly $\mu_{n}$, is here too, by definition, the lower bound of the numbers $\mathrm{M}(|f-\mathrm{P}|)$ when $\mathrm{P}$ ranges over the polynomials of degree n.

The problem that interests us is posed as before:
Given a function $f(z)$, to determine the polynomials of degree n for which $\mathrm{M}(|f-\mathrm{P}|)$ attains its lower bound $\mu_{n}$, and to study this number $\mu_{n}$.

The definition of a best-approximation polynomial is self-explanatory. We will denote such a polynomial by $\mathrm{T}_{n}(z;f)$ and we will say that it is a polynomial $\mathrm{T}_{n}$.

It is proven, exactly as above, that:
Any continuous function $f(z)$ admits at least one polynomial of best approximation of degree n.

This result remains true for any bounded function.

The number $\mu_{n}$ is positive or zero and can vanish only if $f(z)$ reduces to an analytic polynomial of degree $n$. We will assume, in what follows, that $\mu_{n}>0$.

This best approximation problem was also studied by Mr. L. Tonelli in the cited work.
49. - Fundamental property of polynomials $\mathrm{T}_{n}$. The first property of best-approximation polynomials is the following:

If $\mathrm{P}(z)$ is a best-approximation polynomial of degree n, there exist at least $n+2$ points where

|f(z)-\mathrm{P}(z)|=\mu_{n}

(42)

Let us assume the contrary and let $z_{1},z_{2},\ldots,z_{m}$ be the points, only $m\leq n+1$ in number, where equality (42) holds. Let

f\left(z_{r}\right)-\mathrm{P}\left(z_{r}\right)=\mu_{n}e^{i\alpha_{r}},\quad r=1,2,\ldots,m.

Lagrange's interpolation formula allows us to determine a polynomial $Q(z)$ of degree $n$ such that

Q\left(z_{r}\right)=\mu_{n}e^{i\alpha_{r}},\quad r=1,2,\ldots,m.

Let us put

f(z)-\mathrm{P}(z)=\mu e^{i\alpha},\quad Q(z)=\nu e^{i\beta}

where, of course, $\mu,\nu,\alpha,\beta$ depend on the point $z$. Let us now consider the closed circles ( $\mathrm{C}_{r}$ ) with centre $z_{r}$ and radius $\delta$. We take $\delta$ small enough that
$1^{0}$. the circles ( $\mathrm{C}_{r}$ ) do not intersect;
$2^{0}$. $f(z)-\mathrm{P}(z)$ and $Q(z)$ do not vanish in the circles ( $\mathrm{C}_{r}$ ). There will then be a positive number $\gamma$ such that $\mu\geq\gamma$, $\nu\geq\gamma$ in the circles ( $\mathrm{C}_{r}$ );
$3^{0}$. we have

\left|\alpha-\alpha_{r}\right|\leq\alpha^{\prime}<\frac{\pi}{4},\left|\beta-\alpha_{r}\right|\leq\alpha^{\prime}<\frac{\pi}{4}

in the circles ( $\mathrm{C}_{r}$ ).
All these circumstances can be achieved by virtue of the continuity of functions $f(z)-\mathrm{P}(z),\mathrm{Q}(z)$ .

In the whole domain ( $\mathrm{D}^{\prime}$ ), obtained from (D) by removing the interiors of the circles ( $\mathrm{C}_{r}$ ), we have

|f(z)-\mathrm{P}(z)|\leq\mu^{\prime}<\mu_{n}

$\mu^{\prime}$ being a fixed number.
Let us now take a positive number $\lambda$ so that

\lambda<\min\left(\frac{\mu_{n}-\mu^{\prime}}{2\mathrm{M}(|Q|)},\frac{2\gamma\cos 2\alpha^{\prime}}{\mathrm{M}(|Q|)}\right).

We have

|f(z)-\mathrm{P}(z)-\lambda Q(z)|^{2}=\left|\mu e^{i\alpha}-\lambda\nu e^{i\beta}\right|^{2}=\mu^{2}+\lambda^{2}\nu^{2}-2\mu\lambda\nu\cos(\alpha-\beta).

But in the circles ( $\mathrm{C}_{r}$ )

\cos(\alpha-\beta)\geq\cos 2\alpha^{\prime}>0

and

\lambda^{2}\nu^{2}-2\lambda\mu\nu\cos(\alpha-\beta)\leq\lambda\nu\left(\lambda M(|Q|)-2\gamma\cos 2\alpha^{\prime}\right)<0.

We therefore have, in the circles ( $\mathrm{C}_{r}$ ),

|f(z)-P(z)-\lambda Q(z)|\leq\mu^{\prime\prime}<\mu_{n}

$\mu^{\prime\prime}$ being a fixed number.
On the other hand, in the domain ( $\mathrm{D}^{\prime}$ ) we have

|f(z)-\mathrm{P}(z)-\lambda\mathrm{Q}(z)|\leq\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}.

It follows that in the entire domain (D)

|f(z)-P(z)-\lambda Q(z)|\leq\max\left(\mu^{\prime\prime},\frac{\mu_{n}+\mu^{\prime}}{2}\right)<\mu_{n}.

which contradicts the hypothesis that $\mathrm{P}(z)$ is a polynomial $\mathrm{T}_{n}$ . The stated property is therefore proven.
50. - The uniqueness of the polynomial $\mathrm{T}_{n}$. From the previous property it immediately follows that:

A continuous function $f(z)$ admits a single polynomial of best approximation of degree n.

Let us assume the contrary and let $\mathrm{P}(z),\mathrm{P}_{1}(z)$ be two distinct polynomials $\mathrm{T}_{n}$. The polynomial $\mathrm{P}_{2}=\frac{\mathrm{P}+\mathrm{P}_{1}}{2}$ is also a polynomial $\mathrm{T}_{n}$, because

M\left(\left|f-P_{2}\right|\right)\leq\frac{1}{2}\left\{M(|f-P|)+M\left(\left|f-P_{1}\right|\right)\right\}\leq\mu_{n}.

Let $z^{\prime}$ be a point where

\left|f\left(z^{\prime}\right)-P_{2}\left(z^{\prime}\right)\right|=\mu_{n}.

We have

\left|f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)\right|\leq\mu_{n},\quad\left|f\left(z^{\prime}\right)-\mathrm{P}_{1}\left(z^{\prime}\right)\right|\leq\mu_{n}

and

\mu_{n}=\left|\frac{\left(f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)\right)+\left(f\left(z^{\prime}\right)-\mathrm{P}_{1}\left(z^{\prime}\right)\right)}{2}\right|\leq\frac{\left|f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)\right|+\left|f\left(z^{\prime}\right)-\mathrm{P}_{1}\left(z^{\prime}\right)\right|}{2}\leq\mu_{n}.

It follows that the sign = holds everywhere. Then, first, $f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)$ and $f\left(z^{\prime}\right)-\mathrm{P}_{1}\left(z^{\prime}\right)$ must have the same modulus $\mu_{n}$ and then, the modulus of the sum being equal to the sum of the moduli, they must have the same argument. We therefore have

f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)=f\left(z^{\prime}\right)-\mathrm{P}_{1}\left(z^{\prime}\right),

\mathrm{P}\left(z^{\prime}\right)=\mathrm{P}_{1}\left(z^{\prime}\right).

Now, we saw in the previous No. that there are at least $n+1$ (and even at least $n+2$) such points $z^{\prime}$. The polynomials $\mathrm{P}(z)$ and $\mathrm{P}_{1}(z)$, of degree $n$, coincide in at least $n+1$ points and are therefore identical, contrary to the hypothesis. The theorem is proven.
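A simple case in which the polynomial $\mathrm{T}_{n}$ is known explicitly may help to fix ideas. The sketch below is an added illustration under assumptions, not an example from the lecture: (D) is taken to be the closed unit disc and $f(z)=z^{n+1}$, for which the zero polynomial is the best approximation of degree $n$ and $\mu_{n}=1$, the modulus $|f-\mathrm{P}|$ being equal to $\mu_{n}$ on the whole unit circle. The function name `max_modulus_error` and the sample coefficients are hypothetical, and the maximum is only sampled on equally spaced points of the circle.

```python
import cmath

def max_modulus_error(coeffs, n, samples=360):
    """Largest |z^(n+1) - P(z)| over equally spaced points of the unit circle,
    where P(z) = coeffs[0] + coeffs[1] z + ... + coeffs[n] z^n."""
    worst = 0.0
    for s in range(samples):
        z = cmath.exp(2j * cmath.pi * s / samples)
        p = sum(ck * z**k for k, ck in enumerate(coeffs))
        worst = max(worst, abs(z**(n + 1) - p))
    return worst

if __name__ == "__main__":
    n = 2
    print(max_modulus_error([0, 0, 0], n))            # the zero polynomial
    print(max_modulus_error([0.1, -0.2j, 0.05], n))   # a perturbed polynomial
```

Any nonzero polynomial of degree $n$ gives a strictly larger maximum, since the mean value of $|z^{n+1}-\mathrm{P}(z)|^{2}$ on the circle is $1+\sum|a_{k}|^{2}$; this is consistent with the uniqueness just proved.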
51. - Mr. L. Tonelli's theorem. Mr. Tonelli found a theorem here too, analogous to Mr. Borel's first theorem.

Let E be the set of points $z^{\prime}$ where $\mathrm{M}(|f-\mathrm{P}|)$ is attained. We have the property:

The necessary and sufficient condition for $\mathrm{P}(z)$ to be a polynomial $\mathrm{T}_{n}$ is that no polynomial $\mathrm{Q}(z)$ of degree n can be found such that
$1^{0}$. $c^{\prime}>\left|Q\left(z^{\prime}\right)\right|>c>0$,
$2^{0}$. $\left|\arg\mathrm{Q}\left(z^{\prime}\right)-\arg\left(f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)\right)\right|<\alpha^{\prime}<90^{\circ}$
at all points $z^{\prime}$ of E.
To show that this condition is sufficient, it suffices to show that, if $\mathrm{P}(z)$ is not a polynomial $\mathrm{T}_{n}$, we can construct the polynomial $Q(z)$.

Let us therefore suppose that

\mathrm{M}(|f-\mathrm{P}|)=\mu^{\prime}>\mu_{n},\quad\left(\mathrm{P}(z)\not\equiv\mathrm{T}_{n}(z;f)\right).

Let $\mathrm{A},\mathrm{A}_{1},\mathrm{A}_{2}$ be the points that represent $f\left(z^{\prime}\right),\mathrm{P}\left(z^{\prime}\right),\mathrm{T}_{n}\left(z^{\prime};f\right)$. The point $\mathrm{A}_{2}$ lies in the circle with centre A and radius equal to $\mu_{n}$.

We have

\begin{gathered}\mu^{\prime}-\mu_{n}\leq\left|\mathrm{T}_{n}\left(z^{\prime};f\right)-\mathrm{P}\left(z^{\prime}\right)\right|\leq\mu^{\prime}+\mu_{n}\\ \left|\arg\left(\mathrm{T}_{n}\left(z^{\prime};f\right)-\mathrm{P}\left(z^{\prime}\right)\right)-\arg\left(f\left(z^{\prime}\right)-\mathrm{P}\left(z^{\prime}\right)\right)\right|\leq\operatorname{Arcsin}\frac{\mu_{n}}{\mu^{\prime}}<90^{\circ}.\end{gathered}

So we can take $Q(z)=\mathrm{T}_{n}(z;f)-\mathrm{P}(z)$ .
It remains to show the necessity of the condition.
Suppose there were a polynomial $Q(z)$ satisfying the stated properties, and let $\mathrm{T}_{n}(z;f)$ be the best-approximation polynomial. We can assume $c<\mu_{n}$. To a given positive number $\varepsilon<\mu_{n}$ there corresponds another positive number $\delta$ such that the oscillation of each of the functions $f(z)-\mathrm{T}_{n}(z;f)$, $Q(z)$ is smaller than $\varepsilon$ in any circle of radius $\leq\delta$. Let then $\varphi(z)=f(z)-\mathrm{T}_{n}(z;f)-\lambda Q(z)$, $\lambda$ being a positive number.

Let us now take $\varepsilon$ small enough that

\varepsilon<\min\left(c\sin\frac{90^{\circ}-\alpha^{\prime}}{2},\mu_{n}\sin 15^{\circ}\right)

And let us denote by E the projection of M on OD.
If we take

\lambda<\frac{\mu_{n}\cos(\mathrm{MOD})}{\varepsilon+\gamma\overline{c^{2}-\varepsilon^{2}}}

domain (B) is completely inside the triangle MOE. On the other hand

f(z)-\mathrm{T}_{n}(z;f)-\lambda Q(z)\mid<\mu_{n}

in the circles C . In the closed domain obtained by taking out the interior of the circles C we have

\left|f(z)-T_{n}(z;f)-\lambda Q(z)\right|\leq\mu^{\prime}<\mu_{n},

and therefore

\left|f(z)-T_{n}(z;f)-\lambda Q(z)\right|<\mu_{n}

everywhere, which contradicts the fact that $\mathrm{T}_{n}(z;f)$ is the polynomial of best approximation.

The property is therefore completely demonstrated.

\begin{aligned}\left|C\right|M\left(\left|f-P\right|\right)&=M\left(\left|Cf-CP\right|\right)=\left|C\right|\mu_{n}\left(f\right),\\ M\left(\left|Cf-R\right|\right)&=M\left(\left|Cf-C\frac{R}{C}\right|\right)=\left|C\right|M\left(\left|f-\frac{R}{C}\right|\right)\geq\left|C\right|\mu_{n}\left(f\right).\end{aligned}

\begin{aligned}M\left(\left|f-P\right|\right)&\leq M\left(\left|f-P_{1}\right|\right)+M\left(\left|P-P_{1}\right|\right)<M\left(\left|f-P_{1}\right|\right)+\varepsilon,\\ M\left(\left|f-P_{1}\right|\right)&\leq M\left(\left|f-P\right|\right)+M\left(\left|P-P_{1}\right|\right)<M\left(\left|f-P\right|\right)+\varepsilon.\end{aligned}

\begin{aligned}\left|\lambda Q\right|&<\frac{\mu_{n}-\mu^{\prime}}{2},\\ \left|f-T_{n}-\lambda Q\right|&\leq\left|f-T_{n}\right|+\left|\lambda Q\right|<\mu^{\prime}+\frac{\mu_{n}-\mu^{\prime}}{2}=\frac{\mu_{n}+\mu^{\prime}}{2}<\mu_{n}.\end{aligned}

\begin{aligned}\mu_{n}&\leq M\left(\left|f-\frac{\alpha P+\beta P_{1}}{\alpha+\beta}\right|\right)=M\left(\left|\frac{\alpha\left(f-P\right)}{\alpha+\beta}+\frac{\beta\left(f-P_{1}\right)}{\alpha+\beta}\right|\right)\\ &\leq\frac{\alpha M\left(\left|f-P\right|\right)+\beta M\left(\left|f-P_{1}\right|\right)}{\alpha+\beta}=\mu_{n},\\ &\therefore\ M\left(\left|f-\frac{\alpha P+\beta P_{1}}{\alpha+\beta}\right|\right)=\mu_{n}.\end{aligned}\qquad(1.10)

\begin{aligned}\left|f-P_{n}(x;f)\right|&=\left|\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\left[f(x)-f\left(a_{i}\right)\right](x-a)^{i}(b-x)^{n-i}\right|\\ &\leq\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\omega\left(\left|x-a_{i}\right|\right)(x-a)^{i}(b-x)^{n-i}\\ &<\left\{\frac{1}{\delta}\cdot\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}\left|x-a_{i}\right|(x-a)^{i}(b-x)^{n-i}+1\right\}\omega(\delta)\end{aligned}
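The last estimate bounds the error of the polynomial $P_{n}(x;f)$ by means of the oscillation modulus $\omega$. A minimal numerical sketch of it follows, under two assumptions that are read off from the formula rather than stated in this passage: the nodes are equally spaced, $a_{i}=a+i(b-a)/n$, and $P_{n}(x;f)=\frac{1}{(b-a)^{n}}\sum_{i=0}^{n}\binom{n}{i}f(a_{i})(x-a)^{i}(b-x)^{n-i}$. The helper names `weights`, `bernstein`, `max_error` and `modulus_bound` are hypothetical.

```python
import math


def weights(a, b, n, x):
    """Weights C(n,i) (x-a)^i (b-x)^(n-i) / (b-a)^n for i = 0..n; they sum to 1."""
    scale = (b - a) ** n
    return [math.comb(n, i) * (x - a) ** i * (b - x) ** (n - i) / scale
            for i in range(n + 1)]


def nodes(a, b, n):
    """Assumed equally spaced nodes a_i = a + i (b - a) / n."""
    return [a + i * (b - a) / n for i in range(n + 1)]


def bernstein(f, a, b, n, x):
    """Value of P_n(x; f) on [a, b] under the assumptions above."""
    return sum(w * f(t) for w, t in zip(weights(a, b, n, x), nodes(a, b, n)))


def max_error(f, a, b, n, samples=401):
    """Largest |f(x) - P_n(x; f)| over an equally spaced sample of [a, b]."""
    xs = [a + (b - a) * k / (samples - 1) for k in range(samples)]
    return max(abs(f(x) - bernstein(f, a, b, n, x)) for x in xs)


def modulus_bound(omega, delta, a, b, n, x):
    """Right-hand side of the estimate: (S / delta + 1) * omega(delta),
    where S is the weighted mean of |x - a_i|."""
    s = sum(w * abs(x - t) for w, t in zip(weights(a, b, n, x), nodes(a, b, n)))
    return (s / delta + 1.0) * omega(delta)


# f(x) = |x| on [-1, 1] has oscillation modulus omega(d) = d for d <= 2.
for n in (10, 40):
    print(n, max_error(abs, -1.0, 1.0, n))
```

For $f(x)=|x|$ the sampled error shrinks as $n$ grows, in agreement with the uniform convergence that the estimate is used to establish.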
