Texas Instruments TMS320C64X User Manual

DSP_fft16x16r

C64x+ DSPLIB Reference

DSP_fft16x16r(N,   &x[0],       &w[0],       brev, y, N/4, 0,     N)
DSP_fft16x16r(N/4, &x[0],       &w[2*3*N/4], brev, y, rad, 0,     N)
DSP_fft16x16r(N/4, &x[2*N/4],   &w[2*3*N/4], brev, y, rad, N/4,   N)
DSP_fft16x16r(N/4, &x[2*N/2],   &w[2*3*N/4], brev, y, rad, N/2,   N)
DSP_fft16x16r(N/4, &x[2*3*N/4], &w[2*3*N/4], brev, y, rad, 3*N/4, N)

As discussed previously, N can be either a power of 4 or a power of 2. If N is
a power of 4, then rad = 4; if N is a power of 2 but not a power of 4, then
rad = 2. The “rad” argument controls how many stages of decomposition are
performed. It also determines whether a radix-4 or radix-2 decomposition is
performed at the last stage. Hence, when “rad” is set to N/4, only the first
stage of the transform is performed and the code exits. To complete the FFT,
four further calls are required, each performing an FFT of size N/4. The order
of these four FFTs among themselves does not matter, so from a cache
perspective it helps to perform them in the reverse of the order shown above.
This is illustrated as follows:

DSP_fft16x16r(N,   &x[0],       &w[0],       brev, y, N/4, 0,     N)
DSP_fft16x16r(N/4, &x[2*3*N/4], &w[2*3*N/4], brev, y, rad, 3*N/4, N)
DSP_fft16x16r(N/4, &x[2*N/2],   &w[2*3*N/4], brev, y, rad, N/2,   N)
DSP_fft16x16r(N/4, &x[2*N/4],   &w[2*3*N/4], brev, y, rad, N/4,   N)
DSP_fft16x16r(N/4, &x[0],       &w[2*3*N/4], brev, y, rad, 0,     N)
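
The argument pattern in the five calls above can be generated rather than written out by hand. The following is a minimal sketch, not DSPLIB code: the fft_call record and the plan_split_fft function are hypothetical names introduced here for illustration. It only computes the arguments (array indices, radix, offset) for the first-stage call followed by the four N/4 calls in reverse order; passing them to DSP_fft16x16r is left to the caller.

```c
/* Hypothetical record of the arguments for one DSP_fft16x16r call.
 * This type is an illustration only; it is not part of DSPLIB. */
typedef struct {
    int n;        /* transform size (first argument)             */
    int x_index;  /* index into x[] where the input starts       */
    int w_index;  /* index into w[] where the twiddles start     */
    int radix;    /* value passed as the radix argument          */
    int offset;   /* offset argument, in complex samples         */
    int nmax;     /* size of the full FFT (last argument)        */
} fft_call;

/* Fill plan[0..4] with the first-stage call followed by the four
 * N/4-size calls, visited from the last quarter back to the first. */
void plan_split_fft(int n, int rad, fft_call plan[5])
{
    int quarter = n / 4;
    int w_rest = 2 * 3 * quarter;   /* twiddles after the first stage */
    int i, k;

    /* First stage only: radix argument is N/4. */
    plan[0] = (fft_call){ n, 0, 0, quarter, 0, n };

    /* Remaining four sub-FFTs, in reverse order (k = 3, 2, 1, 0). */
    for (i = 1, k = 3; k >= 0; ++i, --k) {
        plan[i].n = quarter;
        plan[i].x_index = 2 * k * quarter;  /* interleaved re/im shorts */
        plan[i].w_index = w_rest;
        plan[i].radix = rad;
        plan[i].offset = k * quarter;
        plan[i].nmax = n;
    }
}
```

Each entry would then be issued as DSP_fft16x16r(p.n, &x[p.x_index], &w[p.w_index], brev, y, p.radix, p.offset, p.nmax), assuming the argument order of the prototype given under Algorithm.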

In addition, this function can be used to minimize call overhead by completing
the FFT with a single function call, as shown below:

DSP_fft16x16r(N, &x[0], &w[0], brev, y, rad, 0, N)
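
The calls above take rad as given. As a rough sketch of the rule stated earlier (rad = 4 when N is a power of 4, rad = 2 when N is a power of 2 but not of 4), the value could be derived from N as follows. The function name fft_last_radix is hypothetical and not part of DSPLIB, and the sketch does not check any size limits the library itself may impose on N.

```c
/* Hypothetical helper, not part of DSPLIB.
 * Returns 4 if n is a power of 4, 2 if n is a power of 2 but not
 * a power of 4, and 0 if n is not a power of 2 at all. */
int fft_last_radix(unsigned int n)
{
    /* A power of 2 has exactly one bit set. */
    if (n == 0 || (n & (n - 1)) != 0)
        return 0;

    /* Powers of 4 have that single bit at an even bit position,
     * which is what the 0x55555555 mask selects. */
    return (n & 0x55555555u) ? 4 : 2;
}
```

A caller would typically reject a return value of 0 before calling the FFT, and otherwise pass the result as the rad argument.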

Algorithm

This is the C equivalent of the assembly code, without its restrictions. Note
that the assembly code is hand-optimized and restrictions may apply.

void fft16x16r
(
    int n,
    short *ptr_x,
    short *ptr_w,
    unsigned char *brev,
    short *y,
    int radix,
    int offset,
    int nmax
)