Part B
14. Write a C or MATLAB program to compute the fast convolution of a long sequence with a
short sequence employing the overlap-save method introduced in Section 7.3.4. Compare the
results with those of the MATLAB function fftfilt, which uses the overlap-add method.
15. Experiment with the capabilities of the MATLAB function psd. Use a sinusoid
embedded in white noise as the test signal.
16. Use the MATLAB function specgram to display the spectrogram of the speech file
timit1.asc included in the software package.
Part C
17. The radix-2 FFT code used in the experiments was written to minimize code size. An
alternative FFT implementation can execute faster at the expense of using more program
memory. For example, the twiddle factors used by the first stage and by the first group
of the other stages are constants, $W_N^0 = 1$. Therefore the multiplication operations
in these stages can be simplified. Modify the assembly FFT routine given in Table 7.5 to
incorporate this observation. Profile the run-time clock cycles and record the memory
usage. Compare the results with those obtained in Experiment 7C.
18. The radix-2 FFT is the most widely used algorithm for FFT computation. When the number
of data samples is a power of 4 (i.e., $N = 2^{2n} = 4^n$), we can further improve the
run-time efficiency by employing the radix-4 FFT algorithm. Modify the assembly FFT routine
given in Table 7.5 for the radix-4 FFT algorithm. Profile the run-time clock cycles, and
record the memory usage for a 1024-point radix-4 FFT ($2^{10} = 4^5 = 1024$). Compare the
radix-4 FFT results with the results of the 1024-point radix-2 FFT computed by the assembly
routine.
19. Take advantage of the twiddle factor $W_N^0 = 1$ to further improve the run-time
efficiency of the radix-4 FFT algorithm. Compare the results of the 1024-point FFT
implementations using the different approaches.
20. Most DSP applications have real input samples, and our complex FFT implementation zeros
out the imaginary components of the complex buffer (see exp7c.c). This approach is simple,
but it is not efficient in terms of execution speed. For real input, we can split the
even and odd samples into two sequences and transform both sequences at once with a single
complex FFT. This approach reduces the execution time by approximately 50 percent. Given a
real-valued input $x(n)$ of $2N$ samples, we can define $c(n) = a(n) + j\,b(n)$, where the
two inputs $a(n) = x(2n)$ and $b(n) = x(2n+1)$, $n = 0, 1, \ldots, N-1$, are real sequences.
We can represent these sequences as $a(n) = [c(n) + c^*(n)]/2$ and
$b(n) = -j\,[c(n) - c^*(n)]/2$, so their DFTs can be written as
$A(k) = [C(k) + C^*(N-k)]/2$ and $B(k) = -j\,[C(k) - C^*(N-k)]/2$, where the index $N-k$ is
taken modulo $N$. Finally, the real input FFT can be obtained by
$X(k) = A(k) + W_{2N}^{k} B(k)$ and $X(k+N) = A(k) - W_{2N}^{k} B(k)$, where
$k = 0, 1, \ldots, N-1$. Modify the complex radix-2 FFT assembly routine to efficiently
compute the FFT of $2N$ real input samples.