$$\mathrm{tr}[R] = L\, r_{xx}(0) = \sum_{l=0}^{L-1} \lambda_l, \qquad (8.3.2)$$
where tr[R] denotes the trace of matrix R. It follows that
$$\lambda_{\max} \le \sum_{l=0}^{L-1} \lambda_l = L\, r_{xx}(0) = L P_x, \qquad (8.3.3)$$
where
$$P_x = r_{xx}(0) = E[x^2(n)] \qquad (8.3.4)$$
denotes the power of x(n). Therefore setting
$$0 < \mu < \frac{2}{L P_x} \qquad (8.3.5)$$
assures that (8.3.1) is satisfied.
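The chain of relations (8.3.2) to (8.3.5) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a first-order autoregressive test signal, a biased sample estimate of the autocorrelation, and a plain power iteration to estimate the largest eigenvalue. The names autocorr and largest_eigenvalue are hypothetical helpers introduced only for this illustration.

```python
import random

def autocorr(x, max_lag):
    """Biased sample autocorrelation r_xx(k) for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def largest_eigenvalue(R, n_steps=200):
    """Rayleigh-quotient estimate of the largest eigenvalue of a symmetric
    matrix via power iteration (in general a lower bound on lambda_max)."""
    L = len(R)
    v = [1.0] * L
    lam = 0.0
    for _ in range(n_steps):
        Rv = [sum(R[i][j] * v[j] for j in range(L)) for i in range(L)]
        lam = sum(a * b for a, b in zip(v, Rv)) / sum(a * a for a in v)
        norm = sum(c * c for c in Rv) ** 0.5
        if norm == 0.0:
            break
        v = [c / norm for c in Rv]
    return lam

# Assumed test input: correlated signal x(n) = 0.8 x(n-1) + white noise.
rng = random.Random(1)
x, prev = [], 0.0
for _ in range(2000):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

L = 8                                   # filter length
r = autocorr(x, L - 1)
R = [[r[abs(i - j)] for j in range(L)] for i in range(L)]  # Toeplitz matrix
P_x = r[0]                              # input power, as in (8.3.4)
trace_R = sum(R[i][i] for i in range(L))  # equals L * r_xx(0), as in (8.3.2)
lam_max = largest_eigenvalue(R)
mu_exact = 2.0 / lam_max                # bound based on lambda_max
mu_safe = 2.0 / (L * P_x)               # bound of (8.3.5)

print("tr[R] =", trace_R, " L*P_x =", L * P_x)
print("lambda_max estimate =", lam_max)
print("2/lambda_max =", mu_exact, " 2/(L*P_x) =", mu_safe)
```

The bound in (8.3.5) is the more conservative of the two, but it needs only an estimate of the input power rather than an eigenvalue decomposition of R.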
Equation (8.3.5) provides some important information on how to select $\mu$, which is
summarized as follows:
1. Since the upper bound on $\mu$ is inversely proportional to L, a small $\mu$ is used for
large-order filters.
2. Since $\mu$ is made inversely proportional to the input signal power, weaker signals use
a larger $\mu$ and stronger signals use a smaller $\mu$. One useful approach is to normalize
$\mu$ with respect to the input signal power $P_x$. The resulting algorithm is called the
normalized LMS algorithm, which will be discussed in Section 8.4.
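The two observations above can be illustrated with a few numbers. This is only a sketch: the function name max_step_size and the example values of L and P_x are assumptions chosen for illustration, not values from the text.

```python
def max_step_size(L, P_x):
    """Upper bound on the step size from (8.3.5): 2 / (L * P_x)."""
    return 2.0 / (L * P_x)

# (filter length L, input power P_x): longer filters and stronger
# signals both reduce the largest admissible step size.
for L, P_x in [(8, 1.0), (64, 1.0), (8, 4.0)]:
    print(f"L = {L:3d}, P_x = {P_x:.1f} -> bound = {max_step_size(L, P_x)}")
```

Increasing L from 8 to 64 at unit power lowers the bound from 0.25 to 0.03125, and raising the power from 1 to 4 at L = 8 lowers it to 0.0625.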
8.3.2 Convergence Speed
In the previous section, we saw that w(n) converges to $w^o$ if the selection of $\mu$ satisfies
(8.3.1). Convergence of the weight vector w(n) from w(0) to $w^o$ corresponds to the
convergence of the MSE from $\xi(0)$ to $\xi_{\min}$. Therefore convergence of the MSE toward
its minimum value is a commonly used performance measurement in adaptive systems
because of its simplicity. During adaptation, the squared error $e^2(n)$ is non-stationary as
the weight vector w(n) adapts toward $w^o$. The corresponding MSE can thus be defined
only based on ensemble averages. A plot of the MSE versus time n is referred to as the
learning curve for a given adaptive algorithm. Since the MSE is the performance
criterion of LMS algorithms, the learning curve is a natural way to describe the transient
behavior.
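A learning curve of this kind can be estimated by averaging the squared error over many independent runs. The sketch below is an illustration under stated assumptions, not the text's own program: it assumes a system-identification setup with a hypothetical unknown FIR system h, unit-power white Gaussian input, a small additive measurement noise, and the LMS update w(n+1) = w(n) + mu x(n) e(n). The function name lms_learning_curve and all parameter values are chosen here for illustration.

```python
import random

def lms_learning_curve(h, mu, n_iter, n_trials, noise_std=0.01, seed=0):
    """Ensemble average of e^2(n) over n_trials independent LMS runs
    that identify the (assumed) unknown FIR system h."""
    rng = random.Random(seed)
    L = len(h)
    mse = [0.0] * n_iter
    for _ in range(n_trials):
        w = [0.0] * L        # initial weight vector w(0) = 0
        xbuf = [0.0] * L     # [x(n), x(n-1), ..., x(n-L+1)]
        for n in range(n_iter):
            xbuf = [rng.gauss(0.0, 1.0)] + xbuf[:-1]
            d = sum(hi * xi for hi, xi in zip(h, xbuf)) + rng.gauss(0.0, noise_std)
            y = sum(wi * xi for wi, xi in zip(w, xbuf))
            e = d - y                                   # error signal e(n)
            w = [wi + mu * e * xi for wi, xi in zip(w, xbuf)]
            mse[n] += e * e
    return [v / n_trials for v in mse]

h = [0.5, -0.3, 0.2, 0.1]   # hypothetical system to identify, L = 4
mu = 0.02                   # well below 2 / (L * P_x) = 0.5 for P_x = 1
curve = lms_learning_curve(h, mu, n_iter=400, n_trials=100)

for n in (0, 50, 100, 200, 399):
    print(f"n = {n:3d}, averaged e^2(n) = {curve[n]:.6f}")
```

A single run of $e^2(n)$ is noisy, so the averaging over trials is what makes the decay toward the minimum MSE visible.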
Each adaptive mode has its own time constant, which is determined by the overall
adaptation constant $\mu$ and the eigenvalue $\lambda_l$ associated with that mode. Overall
convergence is clearly limited by the slowest mode. Thus the overall MSE time constant can
be approximated as