Figure 3: IIR Optimization Example, AN253 – Cirrus Logic AN253 User Manual

Figure 3: IIR Optimization Example
To accomplish this optimization (see Figure 3):
1. Use the additional registers in the MaverickCrunch coprocessor to load the filter coefficients and state variables once, before the inner loop.
2. Shuffle the state variables between registers during the inner loop.
3. Store the state variables back to memory after the inner loop.
This removes nine loads and four stores from the inner loop. Also note that the copy instructions used to shuffle the state
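// Floating-point Biquad IIR
// (Basic: Non-optimized)
main_loop
    cfldrd  temp2, [fcoef]              // coefficient 0
    cfldrd  bqd1k_s0, [bdq1k]           // state 0
    cfmuld  acc, temp2, bqd1k_s0
    cfnegd  acc, acc                    // acc = -(coef0 * s0)
    cfldrd  temp2, [fcoef, 8]           // coefficient 1
    cfldrd  bqd1k_s1, [bdq1k, 8]        // state 1
    cfmuld  temp, temp2, bqd1k_s1
    cfsubd  acc, acc, temp              // acc -= coef1 * s1
    cfldrd  temp2, [fcoef, 16]          // coefficient 2
    cfldrd  temp4, [data]               // input sample
    cfmuld  temp, temp2, temp4
    cfaddd  acc, acc, temp              // acc += coef2 * input
    cfldrd  temp2, [fcoef, 24]          // coefficient 3
    cfldrd  bqd1k_s2, [bdq1k, 16]       // state 2
    cfmuld  temp, temp2, bqd1k_s2
    cfaddd  acc, acc, temp              // acc += coef3 * s2
    cfldrd  temp2, [fcoef, 32]          // coefficient 4
    cfldrd  bqd1k_s3, [bdq1k, 24]       // state 3
    cfmuld  temp, temp2, bqd1k_s3
    cfaddd  acc, acc, temp              // acc += coef4 * s3
    cfstrd  acc, [data], 8              // write output in place, advance data pointer
    cfstrd  acc, [bdq1k]                // new s0 = output
    cfstrd  bqd1k_s0, [bdq1k, 8]        // new s1 = old s0
    cfstrd  temp4, [bdq1k, 16]          // new s2 = input
    cfstrd  bqd1k_s2, [bdq1k, 24]       // new s3 = old s2
    subs    nn, nn, 1
    bgt     main_loop

// Floating-point Biquad IIR
// (Optimized)
    // (1) Load state variables and coefficients once, before the inner loop
    cfldrd  bqd1k_s0, [bdq1k]
    cfldrd  bqd1k_s1, [bdq1k, 8]
    cfldrd  bqd1k_s2, [bdq1k, 16]
    cfldrd  bqd1k_s3, [bdq1k, 24]
    cfldrd  outp, [fcoef]
    cfldrd  temp1, [fcoef, 8]
    cfldrd  temp2, [fcoef, 16]
    cfldrd  temp3, [fcoef, 24]
    cfldrd  temp4, [fcoef, 32]
    cfnegd  outp, outp                  // pre-negate coefficient 0
    cfnegd  temp1, temp1                // pre-negate coefficient 1
main_loop
    cfldrd  inp, [data]                 // input sample: the only load left in the loop
    cfmuld  acc, outp, bqd1k_s0         // acc = -coef0 * s0
    cfmuld  temp, temp1, bqd1k_s1       // temp = -coef1 * s1
    cfcpyd  bqd1k_s1, bqd1k_s0          // (2) shuffle: s1 = s0
    cfaddd  acc, acc, temp
    cfmuld  temp, temp2, inp            // coef2 * input
    cfaddd  acc, acc, temp
    cfmuld  temp, temp3, bqd1k_s2       // coef3 * s2
    cfaddd  acc, acc, temp
    cfmuld  temp, temp4, bqd1k_s3       // coef4 * s3
    cfcpyd  bqd1k_s3, bqd1k_s2          // (2) shuffle: s3 = s2
    cfcpyd  bqd1k_s2, inp               // (2) shuffle: s2 = input
    cfaddd  acc, acc, temp
    cfcpyd  bqd1k_s0, acc               // (2) shuffle: s0 = output
    // Assumed: the next store is absent from the extracted listing. It is added
    // because the text leaves one store in the loop and data must advance.
    cfstrd  acc, [data], 8              // write output in place, advance data pointer
    subs    nn, nn, 1
    bgt     main_loop
    // (3) Store state variables once, after the inner loop
    ldr     temp1, =bqd1k_dCrstates     // address of the state array
    cfstrd  bqd1k_s0, [temp1]
    cfstrd  bqd1k_s1, [temp1, 8]
    cfstrd  bqd1k_s2, [temp1, 16]
    cfstrd  bqd1k_s3, [temp1, 24]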
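The three steps can also be modeled in C, as a rough illustration of what the two loops in Figure 3 compute. This sketch is not taken from AN253. The function names `biquad_basic` and `biquad_optimized`, the coefficient order {a1, a2, b0, b1, b2}, and the state order {y[n-1], y[n-2], x[n-1], x[n-2]} are assumptions inferred from the listing. Whether a given compiler actually keeps the local variables in MaverickCrunch registers is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical C model of the two loops in Figure 3 (not from AN253).
 * Assumed layout, inferred from the listing:
 *   coef[0..4]  = { a1, a2, b0, b1, b2 }
 *   state[0..3] = { y[n-1], y[n-2], x[n-1], x[n-2] }
 * Each sample computes
 *   y = -a1*y[n-1] - a2*y[n-2] + b0*x + b1*x[n-1] + b2*x[n-2]
 * and overwrites data[i] with y.
 */

/* Basic form: coefficients and state are read from memory, and the
 * state is written back to memory, once per sample. */
void biquad_basic(double *data, size_t n, const double *coef, double *state)
{
    assert(data && coef && state);
    for (size_t i = 0; i < n; i++) {
        double s0 = state[0];
        double x  = data[i];
        double s2 = state[2];
        double acc = -(coef[0] * s0);
        acc -= coef[1] * state[1];
        acc += coef[2] * x;
        acc += coef[3] * s2;
        acc += coef[4] * state[3];
        data[i]  = acc;
        state[0] = acc;   /* y[n-1] <- y[n]   */
        state[1] = s0;    /* y[n-2] <- y[n-1] */
        state[2] = x;     /* x[n-1] <- x[n]   */
        state[3] = s2;    /* x[n-2] <- x[n-1] */
    }
}

/* Optimized form: (1) load once, (2) shuffle in locals, (3) store once. */
void biquad_optimized(double *data, size_t n, const double *coef, double *state)
{
    assert(data && coef && state);

    /* (1) Load state and coefficients before the loop; pre-negate a1, a2. */
    double s0 = state[0], s1 = state[1], s2 = state[2], s3 = state[3];
    double neg_a1 = -coef[0], neg_a2 = -coef[1];
    double b0 = coef[2], b1 = coef[3], b2 = coef[4];

    for (size_t i = 0; i < n; i++) {
        double x = data[i];          /* only load in the loop */
        double acc = neg_a1 * s0;
        acc += neg_a2 * s1;
        s1 = s0;                     /* (2) shuffle state in locals */
        acc += b0 * x;
        acc += b1 * s2;
        acc += b2 * s3;
        s3 = s2;
        s2 = x;
        s0 = acc;
        data[i] = acc;               /* only store in the loop */
    }

    /* (3) Store the state back to memory after the loop. */
    state[0] = s0;
    state[1] = s1;
    state[2] = s2;
    state[3] = s3;
}
```

Calling both functions on copies of the same buffer, with the same coefficients and initial state, should give matching outputs and final state. Only the number of memory accesses inside the loop differs.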