
A Some Multivariable Calculus
That is, for each $i = 1, \dots, m$, $\nabla f_i(u)$ can be obtained by recursively computing the vector-Jacobian product given by
$$v_\ell^\top := v_{\ell-1}^\top J_{h_{L-\ell+1}}\left(g_{L-\ell}(u)\right),$$
for $\ell = 1, \dots, L$, starting with $v_0 = e_i$ and $g_0(u) = u$.
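The recursion can be sketched in a few lines of NumPy. This is only an illustration under assumptions not stated in this passage: we take $f = h_L \circ \cdots \circ h_1$ with $g_\ell = h_\ell \circ \cdots \circ h_1$, and the three layers `h1`, `h2`, `h3` (with matrices `A` and `B`) are hypothetical examples, each returning its value together with its Jacobian.

```python
import numpy as np

# Hypothetical layers (L = 3), chosen only for illustration.
A = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])  # h_1: R^2 -> R^3 (linear)
B = np.array([[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]])    # h_3: R^3 -> R^2 (linear)

def h1(x):
    return A @ x, A                      # value and Jacobian

def h2(x):
    t = np.tanh(x)
    return t, np.diag(1.0 - t ** 2)      # elementwise tanh has a diagonal Jacobian

def h3(x):
    return B @ x, B

layers = [h1, h2, h3]

def f(u):
    """Evaluate the assumed composition f = h_3 o h_2 o h_1 at u."""
    g = u
    for h in layers:
        g, _ = h(g)
    return g

def gradient_of_component(layers, u, i):
    """Return the gradient of f_i at u via repeated vector-Jacobian products."""
    # Forward pass: store J_{h_l}(g_{l-1}(u)) for l = 1, ..., L, with g_0(u) = u.
    jacobians = []
    g = u
    for h in layers:
        g, J = h(g)
        jacobians.append(J)
    # Backward pass: v_0 = e_i, then v_l^T = v_{l-1}^T J_{h_{L-l+1}}(g_{L-l}(u)).
    v = np.zeros(g.shape[0])
    v[i] = 1.0
    for J in reversed(jacobians):
        v = v @ J
    return v

u = np.array([0.3, -0.2])
print(gradient_of_component(layers, u, 0))
```

Each backward step multiplies a row vector by a matrix, so the full Jacobian of $f$ is never formed explicitly.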
A.4 Taylor’s Theorem
Once again consider a multivariate real-valued function $f : \mathbb{R}^n \to \mathbb{R}$. If all the $k$-th order derivatives of $f$ are continuous at a point $u \in \mathbb{R}^n$, then Taylor's theorem offers an approximation for $f$ within a neighborhood of $u$ in terms of these derivatives. We are particularly interested in the cases $k = 1$ and $k = 2$, as they are crucial in the implementation of first-order and second-order optimization methods, respectively. The theorem is easiest to understand when $f$ is univariate, so we start with the univariate case and then move to the general multivariate case. We omit the proof of Taylor's theorem, as it is a well-known result that can be found in any standard multivariate calculus textbook.
Univariate Case
Suppose that $n = 1$, that is, $f$ is a univariate real-valued function. We say that $f$ is $k$-times continuously differentiable on an open interval $U \subseteq \mathbb{R}$ if $f$ is $k$ times differentiable at every point of $U$ (i.e., the $k$-th order derivative $\frac{d^k f(u)}{du^k}$ exists for all $u \in U$) and $\frac{d^k f(u)}{du^k}$ is continuous on $U$. If $k = 0$, we interpret $\frac{d^k f(u)}{du^k}$ simply as $f(u)$.
Theorem A.1 (Taylor's Theorem in $\mathbb{R}$). Let $f : \mathbb{R} \to \mathbb{R}$ be $k$-times continuously differentiable on an open interval $U \subseteq \mathbb{R}$. Then, for any $u, v \in U$,
$$f(u) = \sum_{i=0}^{k} \frac{(u - v)^i}{i!} \frac{d^i f(v)}{du^i} + O\left(|u - v|^{k+1}\right). \tag{A.19}$$
The polynomial
$$P_k(u) = \sum_{i=0}^{k} \frac{(u - v)^i}{i!} \frac{d^i f(v)}{du^i},$$
which appears in (A.19), is called the $k$-th order Taylor polynomial. Since the remainder
$$R_k(u) = f(u) - P_k(u) \longrightarrow 0 \quad \text{as } u \to v,$$
$f(u)$ is approximately equal to $P_k(u)$ for $u$ within a small neighborhood of $v$. In particular, for a point $u$ near $v$, $P_1(u)$ is a linear approximation of $f(u)$ and $P_2(u)$ is a quadratic approximation of $f(u)$.
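As a small numerical illustration, the sketch below evaluates the linear and quadratic Taylor polynomials and compares them with the function itself. The test function $f(u) = e^u$, the expansion point $v = 0$, and the helper name `taylor_polynomial` are our own choices for illustration and do not come from the text.

```python
import math

def taylor_polynomial(derivs_at_v, v, u):
    """Evaluate P_k(u) = sum_{i=0}^{k} (u - v)^i / i! * f^(i)(v).

    derivs_at_v lists f(v), f'(v), ..., f^(k)(v), so k = len(derivs_at_v) - 1.
    """
    return sum((u - v) ** i / math.factorial(i) * d
               for i, d in enumerate(derivs_at_v))

# Assumed example: f(u) = exp(u) expanded around v = 0,
# where every derivative at v equals exp(0) = 1.
v = 0.0
for u in (0.5, 0.1, 0.01):
    p1 = taylor_polynomial([1.0, 1.0], v, u)        # linear approximation
    p2 = taylor_polynomial([1.0, 1.0, 1.0], v, u)   # quadratic approximation
    print(u, abs(math.exp(u) - p1), abs(math.exp(u) - p2))
```

Running this shows both errors shrinking as $u$ approaches $v$, with the quadratic approximation closer to $f(u)$ than the linear one at each of the points tried.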
Multivariate Case
Now consider the multivariate case, that is, $f$ is a multivariate real-valued function. In order to state Taylor's theorem for this case, we need a new notion that is relevant only here.