Motorola DSP96002 User Manual
_mid
move #3,n0 ;new offset
move n0,n4 ;copy
move (r4)+ ;point to second butterfly
do #n/8,_laststage ;do last stage, 4 bflys at a time
move x:(r0)+,d0.s y:,d4.s ;upper x,y #1
move x:(r0)-,d1.s y:,d5.s ;lower x,y #1
faddsub.s d0,d1 x:(r4)+,d2.s y:,d6.s ;upper x,y #2
faddsub.s d4,d5 x:(r4)-,d3.s y:,d7.s ;lower x,y #2
faddsub.s d2,d3 d1.s,x:(r0)+ d5.s,y: ;save upper x,y #1
faddsub.s d6,d7 d0.s,x:(r0)+n0 d4.s,y: ;save lower x,y #1
move d3.s,x:(r4)+ d7.s,y: ;save upper x,y #2
move d2.s,x:(r4)+n4 d6.s,y: ;save lower x,y #2
_laststage
end
If the results are wanted in a single memory, the last pass of the above algorithm can be modified to
merge the data from X memory and Y memory back into X memory as the butterflies are performed.
Each butterfly is read from its own memory space (X or Y), but all outputs are written to a single
memory space. The last stage performs 4 butterflies per loop and the loop takes 12 cycles, for an
average of 3 cycles per butterfly on the final stage.
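The cycle figures can be checked with a small calculation. The following Python sketch is not part of the original listing. It assumes one instruction cycle per instruction in the loop body and ignores loop setup overhead; the function name and parameters are hypothetical. The instruction counts (8 for the two-memory loop, 12 for the merged loop) are taken by counting the loop bodies in this section.

```python
def last_stage_cycles(n, instructions_per_loop, butterflies_per_loop=4):
    """Estimate last-stage cost for an n-point transform.

    Assumption: every instruction in the loop body takes one instruction
    cycle, and DO-loop setup overhead is ignored.
    """
    loops = n // 8                       # matches "do #n/8,_laststage"
    total_cycles = loops * instructions_per_loop
    cycles_per_butterfly = instructions_per_loop / butterflies_per_loop
    return total_cycles, cycles_per_butterfly


# Merged (single-memory) last stage: 12 instructions per loop.
print(last_stage_cycles(1024, 12))
# Two-memory last stage: 8 instructions per loop.
print(last_stage_cycles(1024, 8))
```

Under these assumptions the merged version costs one extra cycle per butterfly on the final stage only, in exchange for leaving all results in X memory.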
move #data+n/2,r5 ;pointer to move back to X
move #3,n0 ;new offset
move n0,n4 ;copy
move (r4)+ ;point to second butterfly
do #n/8,_laststage ;do last stage, 4 bflys at a time
move x:(r0)+,d0.s y:,d4.s ;upper x,y #1
move x:(r0)-,d1.s y:,d5.s ;lower x,y #1
faddsub.s d0,d1 x:(r4)+,d2.s y:,d6.s ;upper x,y #2
faddsub.s d4,d5 x:(r4)-,d3.s y:,d7.s ;lower x,y #2
faddsub.s d2,d3 d1.s,x:(r0)+ ;save upper x #1
move d5.s,x:(r5)+ ;move upper #1 back to X
faddsub.s d6,d7 d0.s,x:(r0)+n0 ;save lower x #1
move d4.s,x:(r5)+ ;move lower #1 back
move d3.s,x:(r4)+ ;save upper x #2
move d7.s,x:(r5)+ ;move upper #2 back
move d2.s,x:(r4)+n4 ;save lower x #2
move d6.s,x:(r5)+ ;move lower #2 back
_laststage
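The data movement of the merged last stage can be illustrated with a reference model. The Python sketch below is an illustration only, not a translation of the assembly. It assumes that X and Y memory each hold n/2 values starting at address 0, that each butterfly writes the sum to its upper output and (upper minus lower) to its lower output, and that r5 starts at n/2 as in the listing. The function name `merged_last_stage` is hypothetical.

```python
def merged_last_stage(x, y):
    """Model the merged last stage on plain Python lists.

    x, y : contents of X and Y memory (n/2 values each).
    Returns the modelled X memory (n values) after the stage.
    Assumption: upper output = a + b, lower output = a - b.
    """
    half = len(x)
    assert len(y) == half and half % 4 == 0
    out = list(x) + [0.0] * half     # X memory, extended to hold Y results
    r5 = half                        # "move #data+n/2,r5" with data = 0
    for r0 in range(0, half, 4):     # one pass = 4 butterflies
        r4 = r0 + 2                  # second butterfly of the pair
        for base in (r0, r4):
            a, b = x[base], x[base + 1]
            out[base], out[base + 1] = a + b, a - b   # X results stay in place
            c, d = y[base], y[base + 1]
            out[r5], out[r5 + 1] = c + d, c - d       # Y results move to X
            r5 += 2
    return out


print(merged_last_stage([1, 2, 3, 4], [5, 6, 7, 8]))
```

In this model the first half of the returned list holds the butterflies computed from X memory and the second half holds those computed from Y memory, which mirrors how r5 walks through the upper half of the X buffer in the listing.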