where the vectors Ai and Xi are one-dimensional arrays of length L. There are many
different ways to access the elements of the arrays, such as direct, indirect, and
absolute addressing modes. In the following experiment, we will write a subroutine
int exp2b_3(int *Ai, int *Xi) to perform the dot product using indirect
addressing mode, and store the returned value in the variable result in data
memory. The code example is given as follows:
; Assume AR0 and AR1 are pointing to Ai and Xi
mpym *AR0+, *AR1+, AC0 ; Multiply Ai[0] and Xi[0]
mpym *AR0+, *AR1+, AC1 ; Multiply Ai[1] and Xi[1]
add  AC1, AC0          ; Accumulate the partial result
mpym *AR0+, *AR1+, AC1 ; Multiply Ai[2] and Xi[2]
add  AC1, AC0          ; Accumulate the partial result
(more instructions . . . )
mov  AC0, T0           ; Pass the result back in T0
4.
In the program, arrays Ai and Xi are defined as global arrays in exp2.c. The
Ai and Xi arrays have the same data values as given previously. The return value is
passed to the calling function by T0.
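To check the value that appears in T0, it can help to model the same computation on a host machine. The following C function is only a sketch under stated assumptions: the name dot_product_indirect, the constant VECTOR_LENGTH, and the array length of 8 are hypothetical and are not part of the experiment files. It walks both arrays through pointers that advance after each access, which is the C counterpart of indirect addressing through AR0 and AR1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECTOR_LENGTH 8  /* assumed array length L (hypothetical value) */

/* Host-side model of a dot product computed through two data pointers.
 * Ai and Xi play the role of AR0 and AR1; ac0 and ac1 stand in for the
 * accumulators AC0 and AC1. This is an illustration, not TI library code. */
int16_t dot_product_indirect(const int16_t *Ai, const int16_t *Xi)
{
    assert(Ai != NULL && Xi != NULL);

    int32_t ac0 = 0;  /* running sum of products */
    for (int i = 0; i < VECTOR_LENGTH; i++) {
        /* Multiply the two pointed-to elements, then advance both pointers. */
        int32_t ac1 = (int32_t)(*Ai++) * (int32_t)(*Xi++);
        ac0 += ac1;   /* accumulate the partial result */
    }

    /* Only a 16-bit value is handed back, as with a 16-bit return register.
     * The sketch assumes the sum fits in 16 bits. */
    return (int16_t)ac0;
}
```

For the illustrative inputs {1, 2, 3, 4, 5, 6, 7, 8} and {8, 7, 6, 5, 4, 3, 2, 1} the function returns 120. These values are made up for the example and are not the data used in the experiment.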
5. Write an assembly routine int exp2b_4(int *Ai, int *Xi) using the indirect
addressing mode in conjunction with parallel instructions and repeat instructions
to improve the code density and efficiency. The following is an example of the
code:
mpym *AR0+, *AR1+, AC0 ; Multiply Ai[0] and Xi[0]
|| rpt #6              ; Multiply and accumulate the rest
macm *AR0+, *AR1+, AC0
The auxiliary registers, AR0 and AR1, are used as data pointers to array Ai and
array Xi, respectively. The instruction macm performs a multiply-and-accumulate
operation. The parallel bars || indicate the parallel operation of two instructions.
The repeat instruction, rpt #K, repeats the following instruction K+1 times, so
rpt #6 executes the macm instruction seven times.
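The repeat count is easy to get wrong by one, so the structure of this routine can be sketched in C as a single multiply followed by a repeated multiply-and-accumulate. This is a minimal host-side model, not the assembly routine itself: the names repeat_mac, dot_product_repeat, and REPEAT_COUNT are hypothetical, and the helper simply assumes that a repeat count of K means K+1 executions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPEAT_COUNT 6u  /* the operand of the repeat instruction in the example */

/* Hypothetical helper: models a repeat count k applied to one
 * multiply-and-accumulate, which therefore runs k + 1 times.
 * Both pointers are advanced after every access. */
static int32_t repeat_mac(int32_t acc, const int16_t **a, const int16_t **x,
                          unsigned k)
{
    for (unsigned n = 0; n < k + 1u; n++) {
        acc += (int32_t)(*(*a)++) * (int32_t)(*(*x)++);
    }
    return acc;
}

/* Model of the routine: the first product initializes the accumulator,
 * and the repeated multiply-and-accumulate adds the remaining products. */
int16_t dot_product_repeat(const int16_t *Ai, const int16_t *Xi)
{
    assert(Ai != NULL && Xi != NULL);

    int32_t ac0 = (int32_t)(*Ai++) * (int32_t)(*Xi++);  /* Ai[0] * Xi[0] */
    ac0 = repeat_mac(ac0, &Ai, &Xi, REPEAT_COUNT);      /* seven more products */

    return (int16_t)ac0;  /* assumes the sum fits in 16 bits */
}
```

With a repeat count of 6, the model consumes 1 + 7 = 8 elements from each array, which is why the count is one less than the number of remaining products.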
6. Create a project called exp2b and save it in A:\Experiment2.
7. Use exp2.cmd, exp2b.c, exp2b_1.asm, exp2b_2.asm, exp2b_3.asm, and
exp2b_4.asm to build the project.
8. Open the memory watch window to watch how the arrays Ai and Xi are initialized
in data memory by the assembly routines exp2b_1.asm and exp2b_2.asm.
9. Open the CPU registers window to see how the dot product is computed by
exp2b_3.asm and exp2b_4.asm.
10. Use the profiling capability learned from the experiments given in Chapter 1 to
measure the run-time of the sum-of-products operations and compare the cycle
counts of the routines exp2b_3.asm and exp2b_4.asm.
INTRODUCTION TO TMS320C55X DIGITAL SIGNAL PROCESSOR