Calculus in Banach spaces: Gateaux derivative and consequences of the mean value theorem

So far, all the examples of (Fréchet) differentiable functions we have presented are classical in some sense, and even though it was necessary to introduce some new lemmas to compute the derivatives explicitly, the computations have been reasonable. However, in many practical examples it is not possible to compute the Fréchet derivative in one step.

Definition 1 (Gateaux Derivative) Let {f:G\subset X \rightarrow Y} and let {x \in G} and {h\in X}.

We say that {f} is differentiable at {x} in the direction {h} if there exists {D_hf(x)\in Y} such that:

\displaystyle || f(x+th)-f(x)-tD_hf(x)||_Y=o(t), \ \ \ \ \ (1)

or equivalently:

\displaystyle D_hf(x):=\lim\limits_{t\rightarrow 0}\frac{f(x+th)-f(x)}{t} \ \ \ \ \ (2)

We say that {f} is Gateaux differentiable at {x} if {D_hf(x)} exists for every {h\in X}.

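Remark 1 By definition (2), the uniqueness of the directional derivative is automatic, and so is the linearity of {f\mapsto D_hf(x)}. Note that the map {h\mapsto D_hf(x)} is homogeneous, {D_{\lambda h}f(x)=\lambda D_hf(x)}, but with this definition it need not be additive.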

It is clear that Gateaux differentiability is equivalent to (Fréchet) differentiability in the one-dimensional case, i.e., when the domain of the function is contained in a one-dimensional space. In general, however, Gateaux differentiability is a much weaker requirement: there are functions that are Gateaux differentiable at a point and not even continuous there. To make the situation more dramatic, the following example is finite dimensional:

Let {f:{\mathbb R}^2\rightarrow{\mathbb R}} be defined as follows:

\displaystyle f(x,y):= \begin{cases} \frac{y^3}{x} & \text{if } x\neq 0,\\ 0 & \text{if } x=0. \end{cases} \ \ \ \ \ (3)

It is a straightforward exercise to show that {D_hf(0,0)=0} for all {h\in {\mathbb R}^2} but {f} is not continuous at the origin.
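As a quick numerical sanity check of this example (not a proof), here is a minimal sketch in plain Python. The helper names `difference_quotient`, `directions` and `steps` are ours and purely illustrative. It evaluates the quotient from (2) at the origin along a few fixed directions, and then evaluates {f} along the curve {x=y^3}, where it is identically 1.

```python
def f(x, y):
    """The function from (3): y^3 / x off the line x = 0, and 0 on it."""
    return y**3 / x if x != 0 else 0.0


def difference_quotient(h, t):
    """(f(t*h) - f(0, 0)) / t for a direction h = (a, b) and a step t != 0."""
    a, b = h
    return (f(t * a, t * b) - f(0.0, 0.0)) / t


# A few sample directions (illustrative choice) and shrinking steps t.
directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-2.0, 3.0)]
steps = [10.0 ** (-k) for k in range(1, 7)]

# Along a fixed direction (a, b) with a != 0 the quotient equals t * b^3 / a,
# and it is 0 when a == 0, so it tends to 0 as t -> 0.
quotients = {h: [difference_quotient(h, t) for t in steps] for h in directions}

# Along the curve x = y^3 the function equals 1, although (y^3, y) -> (0, 0).
curve_values = [f(y**3, y) for y in steps]

for h in directions:
    print(h, quotients[h][-1])
print(curve_values)
```

The quotients shrink linearly in {t} for every direction, while the values along the curve stay at 1, which is incompatible with continuity at the origin since {f(0,0)=0}.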

Some important properties of the Gateaux derivative are summarized in the following lemma:

Lemma 2 Let {f:G\subset X\rightarrow Y} and let {x\in G}:

  1. (Fréchet {\implies} Gateaux). If {f} is (Fréchet) differentiable at {x}, then {f} is Gateaux differentiable at {x}. Moreover:

    \displaystyle D_hf(x)=Df(x)h \ \ \ \ \ (4)

  2. (Chain rule). Let {U} be an open subset of {Y} such that {f(x)\in U} and suppose that {\gamma:U\subset Y\rightarrow Z} is (Fréchet) differentiable at {f(x)}. If {f} is Gateaux differentiable at {x}, then {\gamma\circ f} is Gateaux differentiable at {x} and:

    \displaystyle D_h (\gamma\circ f)(x)=D\gamma (f(x))\,D_hf(x) \ \ \ \ \ (5)

  3. Suppose that {f} is Gateaux differentiable at each point in {G} and let {x\in G} be such that the line segment {\{x+th|t\in[0,1] \}\subset G} for some {h\in X}. Then the function:

    \displaystyle \psi:[0,1]\rightarrow Y, \qquad t\mapsto f(x+th),

is continuous on {[0,1]} and differentiable on {(0,1)}.


Proof:

  1. Let us fix {h\in X\setminus\{0\}} (the case {h=0} is trivial) and take {\delta>0} such that {x+th\in G} for every {|t|<\delta}. Then, for {0<|t|<\delta}:

\displaystyle \frac{|| f(x+th)-f(x)-tDf(x)h ||_Y}{|t|}\ \ \ \ \

\displaystyle =|| h||_X\frac{|| f(x+th)-f(x)-Df(x)(th) ||_Y}{|| th||_X}\rightarrow 0. \ \ \ \ \ (6)

as {t \rightarrow 0}.

2. The proof proceeds as the proof of the chain rule for the (Fréchet) derivative and is left as an exercise.

3. Since {G} is open, there exists {\varepsilon>0} such that we can extend the definition of {\psi} to the interval {(-\varepsilon,1+\varepsilon)}.

Let us fix {t_0\in(-\varepsilon,1+\varepsilon)}. Then:

\displaystyle \lim\limits_{s\rightarrow 0}\frac{\psi(t_0+s)-\psi(t_0)}{s} \ \ \ \ \

\displaystyle =\lim\limits_{s\rightarrow 0}\frac{f((x+t_0h)+sh)-f(x+t_0h)}{s}=D_hf(x+t_0h). \ \ \ \ \ (7)

Therefore {\psi'(t_0)=D_hf(x+t_0h)}. Hence {\psi} is differentiable on {(-\varepsilon,1+\varepsilon)} and, in particular, continuous on {[0,1]} and differentiable on {(0,1)}. \Box


Remark 2 Even though the general condition ensuring the Gateaux differentiability of a composition {\gamma \circ f} requires the (Fréchet) differentiability of {\gamma} and the Gateaux differentiability of {f}, there are some cases (like the one considered in the third item of the last lemma) where we can switch the roles of {\gamma} and {f}. However, in these cases it is necessary to impose strong conditions on the inner function, in this case {f}. For instance, if {f} is a continuous affine function and {\gamma} is Gateaux differentiable, then the chain rule is still valid (exercise!).

The Gateaux derivative is also a useful tool to rule out the (Fréchet) differentiability of some functions, as we can see in the following example.

Example 1 Let {X\neq\{0\}} be a NLS and consider the function {f(x):=|| x||_X}. Let us prove that {f} is not differentiable at the origin.

Let {h\in X\setminus\{0\}} and consider:

\displaystyle \lim\limits_{t\rightarrow 0^+} \frac{f(th)-f(0)}{t}=\lim\limits_{t\rightarrow 0^+} \frac{|t|\,|| h||_X}{t}=|| h||_X \ \ \ \ \ (8)

On the other hand, we have:

\displaystyle \lim\limits_{t\rightarrow 0^-} \frac{f(th)-f(0)}{t}=\lim\limits_{t\rightarrow 0^-} \frac{|t|\,|| h||_X}{t}=-|| h||_X \ \ \ \ \ (9)

Since {|| h||_X\neq 0}, the two one-sided limits differ, so {D_hf(0)} does not exist. By the first item of Lemma 2, {f} is not (Fréchet) differentiable at the origin. Therefore, no norm is differentiable at the origin.
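The one-sided limits (8) and (9) can be observed numerically for any concrete norm. The following minimal sketch uses the sup norm on {{\mathbb R}^3} as an illustrative choice; the names `sup_norm` and `one_sided_quotient` are ours.

```python
def sup_norm(x):
    """Sup norm on R^n, used here only as a concrete example of a norm."""
    return max(abs(xi) for xi in x)


def one_sided_quotient(norm, h, t):
    """(norm(t*h) - norm(0)) / t, the quotient appearing in (8) and (9)."""
    zero = [0.0] * len(h)
    return (norm([t * hi for hi in h]) - norm(zero)) / t


h = [1.0, -3.0, 2.0]  # any nonzero direction; here sup_norm(h) == 3

# t -> 0 from the right: the quotients stay at +||h||.
right = [one_sided_quotient(sup_norm, h, 10.0 ** (-k)) for k in range(1, 5)]

# t -> 0 from the left: the quotients stay at -||h||.
left = [one_sided_quotient(sup_norm, h, -(10.0 ** (-k))) for k in range(1, 5)]

print(right)
print(left)
```

Up to rounding, the right-hand quotients equal {|| h||} and the left-hand ones equal {-|| h||}, so the two-sided limit in (2) cannot exist.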

The last result is not a surprise, because any Minkowski functional behaves like a cone near the origin:


Figure 1: No differentiability of a norm near the origin

However, a norm can sometimes be differentiable at every point other than the origin, like the Euclidean norm.


Figure 2: Differentiability in points far from the origin.

As an application of the chain rule, we can see that for Inner Product Spaces (IPS) norms are differentiable away from the origin.

Example 2 Let {(H,\langle .,.\rangle_H)} be an IPS. Clearly {f(x):=|| x||_H^2} is differentiable, since it is defined by a continuous symmetric bilinear form. Therefore, since the square root function is differentiable on the positive real numbers, by the chain rule the restricted function {|| .||_H:H\setminus\{0\}\rightarrow {\mathbb R}}, seen as the composition {|| .||_H=\sqrt{f}}, belongs to {C^{\infty}(H\setminus\{0\})} and its derivative is given by:

\displaystyle D|| x||_H(h)=\langle \frac{x}{|| x||_H},h\rangle_H. \ \ \ \ \ (10)
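For completeness, here is one way to spell out the chain-rule computation behind (10). This is a sketch assuming that {H} is a real IPS and {x\neq 0}; the name {\gamma} for the square root is our notation, chosen to match the chain rule in Lemma 2.

```latex
% Assumption: H is a real inner product space and x \neq 0.
\begin{aligned}
f(x+h)-f(x) &= 2\langle x,h\rangle_H + \|h\|_H^2
  &&\Longrightarrow\quad Df(x)h = 2\langle x,h\rangle_H, \\
\gamma(s) &= \sqrt{s},\quad s>0
  &&\Longrightarrow\quad D\gamma(s)r = \frac{r}{2\sqrt{s}}, \\
D\|x\|_H(h) &= D\gamma(f(x))\bigl(Df(x)h\bigr)
  = \frac{2\langle x,h\rangle_H}{2\|x\|_H}
  = \Bigl\langle \frac{x}{\|x\|_H},h\Bigr\rangle_H .
\end{aligned}
```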

We can now state and prove a generalization of the classical Mean Value Theorem that also generalizes the result for the Fréchet derivative. This result can be shown using the compactness of the unit interval in {{\mathbb R}} and the definition of derivative. However, as a matter of personal taste, we are going to use an elegant argument that involves the Hahn-Banach theorem.

Theorem 3 (Mean Value Theorem (Gateaux case))

Suppose that {f:G\subset X\rightarrow Y} is Gateaux differentiable at each point in {G} and let {x\in G} be such that the line segment {\{x+th|t\in[0,1] \}\subset G} for some {h\in X}. Then there exists {t_0\in (0,1)} such that:

\displaystyle || f(x+h)-f(x)||_Y\leq|| D_hf(x+t_0h)||_Y \ \ \ \ \ (11)

Proof: Let {\psi:[0,1]\rightarrow Y} be defined as in the last item of Lemma 2 and let {\phi \in Y^*} with {|| \phi ||_{Y^*}=1}. Since the composition of continuous functions is continuous, {\phi\circ \psi:[0,1]\rightarrow {\mathbb R}} is continuous on {[0,1]}. On the other hand, by the chain rule, {\phi\circ \psi} is differentiable on {(0,1)} and its derivative is given by {(\phi\circ\psi)'(t)=\phi (D_hf(x+th))}. Applying the mean value theorem for real-valued functions defined on an interval, there exists {t_0\in(0,1)} such that:

\displaystyle |\phi(f(x+h)-f(x))|=|\phi\circ\psi(1)-\phi\circ\psi(0)| \ \ \ \ \

\displaystyle =|\phi (D_hf(x+t_0h))|\leq || D_hf(x+t_0h)||_Y \ \ \ \ \ (12)

Finally, if {f(x+h)=f(x)} there is nothing to prove. Otherwise, by virtue of the Hahn-Banach theorem we can take {\phi\in Y^*} with {|| \phi ||_{Y^*}=1} such that {\phi(f(x+h)-f(x))=|| f(x+h)-f(x)||_Y}, and applying (12) to this {\phi} proves the theorem. \Box

Furnished with this result we can give sufficient conditions to pass from Gateaux differentiability to (Fréchet) differentiability.

Theorem 4 Suppose that {f:G\subset X\rightarrow Y} is Gateaux differentiable at each {x\in G} and that for each {x\in G} there exists an operator {A(x)\in B(X,Y)} such that {D_hf(x)=A(x)h} for all {h\in X}.

If the function:

\displaystyle A:G\rightarrow B(X,Y), \qquad x\mapsto A(x)

is continuous, then {f\in C^1(G)} and {Df(x)(h)=A(x)h} for all {x\in G} and for all {h\in X}.

Proof: Let us fix {x\in G}. Since {G} is open, there exists {\delta>0} such that {B_{\delta}(x)\subset G}. Let us define:

\displaystyle g:B_{\delta}(0)\rightarrow Y, \qquad h\mapsto f(x+h)-f(x)-A(x)h.

Clearly {g} is Gateaux differentiable, with Gateaux derivative at {h} in the direction {k} given by {D_kg(h)=A(x+h)k-A(x)k}. On the other hand, by the convexity of the ball, for any {h\in B_{\delta}(0)} the line segment {\{th|t\in[0,1] \}\subset B_{\delta}(0)}. Therefore we can apply Theorem 3 to {g} (with base point {0} and direction {h\neq 0}), getting:

\displaystyle \frac{|| f(x+h)-f(x)-A(x)h||_Y}{|| h||_X}=\frac{|| g(h)-g(0)||_Y}{|| h||_X}\leq \frac{|| (A(x+t_0h)-A(x))h||_Y}{|| h||_X}, \ \ \ \ \ (13)

where {t_0\in (0,1)} depends on {h}. Finally, since {|| t_0h||_X\leq|| h||_X}, the continuity of {A} at {x} gives:

\displaystyle \frac{|| f(x+h)-f(x)-A(x)h||_Y}{|| h||_X}\leq || A(x+t_0h)-A(x)||_{B(X,Y)}\rightarrow 0 \ \ \ \ \ (14)

as {|| h||_X\rightarrow 0}.

Hence {f} is (Fréchet) differentiable at {x} with {Df(x)=A(x)}. Since {x\in G} was arbitrary and {A} is continuous on {G}, it follows that {f\in C^1(G)}. \Box

This result reduces the complex task of finding the derivative of a function in a general NLS to computing limits in the real numbers. We will present a few examples of this situation later in these notes.
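To illustrate the recipe suggested by Theorem 4, here is a minimal numerical sketch in plain Python. The functional `f` (the sum of the cubes of the coordinates on {{\mathbb R}^3}), the candidate `candidate_A` and the sample points are hypothetical choices of ours, and a finite-dimensional space is used only as a stand-in for a general NLS. The sketch computes the real difference quotients from (2) and compares them with the candidate {A(x)h}.

```python
def f(x):
    """Hypothetical example functional on R^n: sum of the cubes of the coordinates."""
    return sum(xi**3 for xi in x)


def candidate_A(x, h):
    """Candidate derivative A(x)h = sum_i 3 x_i^2 h_i (linear in h, continuous in x)."""
    return sum(3.0 * xi**2 * hi for xi, hi in zip(x, h))


def gateaux_quotient(func, x, h, t):
    """Difference quotient (func(x + t h) - func(x)) / t from definition (2)."""
    shifted = [xi + t * hi for xi, hi in zip(x, h)]
    return (func(shifted) - func(x)) / t


x = [1.0, -2.0, 0.5]   # sample base point (illustrative)
h = [0.3, 1.0, -4.0]   # sample direction (illustrative)

# Gap between the quotient and the candidate for t = 1e-1, ..., 1e-5.
errors = [
    abs(gateaux_quotient(f, x, h, 10.0 ** (-k)) - candidate_A(x, h))
    for k in range(1, 6)
]
print(errors)
```

The gap shrinks roughly linearly in {t}, which is consistent with {D_hf(x)=A(x)h}. Of course, a numerical check of this kind only suggests the candidate; the hypotheses of Theorem 4 still have to be verified by hand.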

In the rest of this section we are going to present more applications and corollaries of the mean value theorem. A classical reference for more complex applications of this theorem (like conditions to exchange derivatives and limits) is Cartan’s book Differential Calculus.

Since in the following we are going to use (mostly) the mean value theorem for (Fréchet) differentiable functions, we present the statement as an independent theorem.

Theorem 5 [Mean Value Theorem] Suppose that {f:G\subset X\rightarrow Y} is (Fréchet) differentiable in {G} and let {x\in G} be such that the line segment {\{x+th|t\in[0,1] \}\subset G} for some {h\in X}. Then there exists {t_0\in (0,1)} such that:

\displaystyle || f(x+h)-f(x)||_Y\leq|| Df(x+t_0h)||_{B(X,Y)}|| h||_X. \ \ \ \ \ (15)

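Proof: By the first item of Lemma 2, {D_hf(x+t_0h)=Df(x+t_0h)h}, so the result follows from the Gateaux case (Theorem 3) and the bound {|| Df(x+t_0h)h||_Y\leq || Df(x+t_0h)||_{B(X,Y)}|| h||_X}. \Box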

Remark 3 In general we cannot expect equality in (11). As an example, consider the function:

\displaystyle f:[0,1]\rightarrow {\mathbb R}^2, \qquad t\mapsto (\cos(2\pi t),\sin(2\pi t)).

This function satisfies the hypotheses of the mean value theorem, but {f(1)-f(0)=0} while {| Df(t)|=2\pi} for all {t\in (0,1)}, so the inequality is strict.

Nevertheless, if the codomain is a one-dimensional space, then by imitating the proof of the mean value theorem it can be shown that (11) holds with equality (exercise!).
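The circle example of Remark 3 is easy to check numerically. The following minimal sketch (with our own helper names `f`, `df`, `increment` and `speeds`) evaluates the increment {f(1)-f(0)} and the Euclidean norm of the derivative at a few interior points.

```python
import math


def f(t):
    """The curve from Remark 3: one full turn around the unit circle."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


def df(t):
    """Derivative of f at t, computed by hand."""
    return (-2 * math.pi * math.sin(2 * math.pi * t),
            2 * math.pi * math.cos(2 * math.pi * t))


# Euclidean norm of f(1) - f(0): zero up to floating-point rounding.
increment = math.hypot(f(1.0)[0] - f(0.0)[0], f(1.0)[1] - f(0.0)[1])

# Euclidean norm of the derivative at t = 0.1, ..., 0.9: always 2*pi.
speeds = [math.hypot(*df(k / 10)) for k in range(1, 10)]

print(increment)
print(speeds)
```

So the left-hand side of the mean value inequality vanishes (up to rounding), while the right-hand side equals {2\pi} for every admissible {t_0}.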

Corollary 6 Suppose that {f} is (Fréchet) differentiable in {G} with {G} convex, and suppose that there exists {K\geq 0} such that {|| Df(x) ||_{B(X,Y)}\leq K} for each {x\in G}. Then, for all {x,y\in G}:

\displaystyle || f(y)-f(x)||_Y\leq K|| y-x||_X \ \ \ \ \ (16)

Proof: By the convexity of {G}, the line segment from {x} to {y} is contained in {G}, so the result follows from taking {h=y-x} in (15). \Box
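As an illustration of the Lipschitz bound given by Corollary 6, here is a minimal sketch with a hypothetical map of our choosing, {(x,y)\mapsto(\sin x,\cos y)} on {{\mathbb R}^2} with the Euclidean norm. Its derivative is the diagonal matrix with entries {\cos x} and {-\sin y}, whose operator norm is at most 1, so we take {K=1}. The random sample, the seed and the small tolerance are also assumptions made only for this check.

```python
import math
import random


def f(p):
    """Hypothetical example map on R^2: (x, y) -> (sin x, cos y)."""
    x, y = p
    return (math.sin(x), math.cos(y))


def euclidean_distance(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Df(x, y) = diag(cos x, -sin y); its operator norm with respect to the
# Euclidean norm is max(|cos x|, |sin y|) <= 1, so K = 1 is a valid bound.
K = 1.0

rng = random.Random(0)  # fixed seed so that the sample is reproducible
pairs = [
    ((rng.uniform(-5, 5), rng.uniform(-5, 5)),
     (rng.uniform(-5, 5), rng.uniform(-5, 5)))
    for _ in range(1000)
]

# Check ||f(q) - f(p)|| <= K ||q - p|| on the sample, allowing a tiny
# tolerance for floating-point rounding.
lipschitz_holds = all(
    euclidean_distance(f(p), f(q)) <= K * euclidean_distance(p, q) + 1e-12
    for p, q in pairs
)
print(lipschitz_holds)
```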

Corollary 7 Suppose that {f:G\subset X\rightarrow Y} is (Fréchet) differentiable in {G} and that {Df(x)=0} for each {x\in G}. If {G} is connected, then {f} is constant on {G}.

Proof: Let us fix {x_0\in G}. Since {f} is differentiable in {G}, it is also continuous, and therefore the set {F:=\{x\in G|f(x)=f(x_0)\}} is closed in {G}.

On the other hand, let us take any {x\in F}. Since {G} is open, there exists {\delta>0} such that {B_{\delta}(x)\subset G}. Since {B_{\delta}(x)} is convex, we can use the previous result on this ball with {K=0}, which implies that {B_{\delta}(x)\subset F}, and hence {F} is open in {G}.

Finally, since {F} is nonempty, open and closed in {G}, the connectedness of {G} gives {F=G}. \Box

