Motorola DSP96002 User Manual

;
; Faster FFT using Programming Tricks found in Typical FORTRAN Libraries
;

;   First two passes combined as a four butterfly loop since
;       multiplies are trivial.
;       2.25 cycles internal (4 cycles external) per Radix 2
;       butterfly.
;   Middle passes performed with traditional, triple-nested DO loop.
;       4 cycles internal (8 cycles external) per Radix 2 butterfly
;       plus overhead. Note that a new pipelining technique is
;       being used to minimize overhead.
;   Next to last pass performed with double butterfly loop.
;       4.5 cycles internal (8.5 cycles external) per Radix 2
;       butterfly.
;   Last pass has separate single butterfly loop.
;       5 cycles internal (9 cycles external) per Radix 2
;       butterfly.

;
; For 1024 complex points, average Radix 2 butterfly = 3.8 cycles
; internal and 7.35 cycles external, assuming a single external
; data bus.
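;
; Worked check of the averages, assuming 1024 = 2^10 points give
; 10 passes of 512 butterflies each, so that passes 3 through 8
; are the six middle passes, and ignoring loop overhead:
;   internal: (2*2.25 + 6*4 + 4.5 + 5) / 10 = 38.0 / 10 = 3.8
;   external: (2*4    + 6*8 + 8.5 + 9) / 10 = 73.5 / 10 = 7.35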

;
; Because of separate passes, minimum of 32 points using these
; optimizations. Approximately 150 program words required.
; Uses internal X and Y Data ROMs for twiddle factor coefficients
; for any size FFT up to 1024 complex points.

;
; Assuming internal program and internal data memory (or two
; external data buses), 1024 point complex FFT is 1.57 msec at
; 75 nsec instruction rate. Assuming internal program and
; external data memory, 1024 point complex FFT is 2.94 msec
; at 75 nsec instruction rate.
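;
; Worked check of the timings, assuming 10 passes of 512
; butterflies (5120 in all) and one cycle per 75 nsec
; instruction time:
;   internal: 5120 * 3.8  = 19456 cycles * 75 nsec = 1.46 msec
;   external: 5120 * 7.35 = 37632 cycles * 75 nsec = 2.82 msec
; The quoted 1.57 and 2.94 msec are roughly 0.11 to 0.12 msec
; higher, presumably the loop and pass overhead that the
; per-butterfly averages leave out.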

;
; First two passes
;
; 9 cycles internal, 1.77X faster than 4 cycle Radix 2 bfy
; 16 cycles external, 2.0X faster than 4 cycle Radix 2 bfy
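;
; Worked check of the ratios, assuming each loop iteration
; completes four Radix 2 butterflies (the four butterfly loop):
;   internal:  9 / 4 = 2.25 cycles per bfy;  4 / 2.25 = 1.78
;              (quoted above as 1.77X)
;   external: 16 / 4 = 4 cycles per bfy;     8 / 4    = 2.0
; The external ratio is taken against the 8 cycle external cost
; of the 4 cycle butterfly.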

;
; r0 = a pointer in and out
; r6 = a pointer in