SPRA921

TMS320C6713 Digital Signal Processor Optimized for High Performance Multichannel Audio Systems

Table 1. C6713 Benchmark Performance

Algorithm                    Description                            Parameter Values    Cycles  Time
Biquad filter                nx input/output cycles                 nx = 60             316     1.4 µs
(IIR filter direct form II)                                         nx = 90             436     1.9 µs
Real FIR filter              nh coefficients                        nh = 24, nr = 64    802     3.6 µs
                             nr output samples                      nh = 30, nr = 50    795     3.5 µs
IIR filter                   nr number of output samples            nr = 64             443     2.0 µs
IIR lattice filter           nr number of samples                   nk = 10, nr = 100   4125    18.3 µs
                             nk number of reflection coefficients
Dotproduct                   nx number of values                    nx = 512            281     1.2 µs
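The Time column in Table 1 appears to follow from the cycle counts at the device's 225 MHz maximum clock. The sketch below redoes that arithmetic. It assumes every benchmark was timed at 225 MHz, which the table itself does not state; the function and dictionary names are illustrative only.

```python
CPU_CLOCK_HZ = 225e6  # assumed: the 225 MHz maximum device frequency


def cycles_to_microseconds(cycles, clock_hz=CPU_CLOCK_HZ):
    """Convert a CPU cycle count to execution time in microseconds."""
    return cycles / clock_hz * 1e6


# Cycle counts copied from Table 1.
benchmarks = {
    "Biquad, nx = 60": 316,
    "Biquad, nx = 90": 436,
    "Real FIR, nh = 24, nr = 64": 802,
    "Real FIR, nh = 30, nr = 50": 795,
    "IIR, nr = 64": 443,
    "IIR lattice, nk = 10, nr = 100": 4125,
    "Dotproduct, nx = 512": 281,
}

for name, cycles in benchmarks.items():
    time_us = cycles_to_microseconds(cycles)
    print(f"{name:32s} {cycles:5d} cycles  {time_us:5.1f} us")
```

Under that assumption, each computed time matches the Time column once rounded to one decimal place.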

3 Two-Level Cache

3.1 Cache Overview

The TMS320C6713 device utilizes a highly efficient two-level real-time cache for internal
program and data storage. The cache delivers high performance without the cost of large arrays
of on-chip memory. The efficiency of the cache makes low-cost, high-density external memory,
such as SDRAM, as effective as on-chip memory.

The first level of the memory architecture has dedicated 4K-byte instruction and data caches,
L1I and L1D respectively. The L1I is direct-mapped, whereas the L1D provides 2-way
associativity to handle multiple types of data. The second level (L2) consists of a total of 256K
bytes of memory. 64K bytes of this can be configured in one of five ways:

• 64K 4-way associative cache
• 48K 3-way associative cache, 16K mapped RAM
• 32K 2-way associative cache, 32K mapped RAM
• 16K direct-mapped cache, 48K mapped RAM
• 64K mapped RAM
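The five configurations follow one pattern: each step converts another 16K of the configurable 64K from mapped RAM into one cache way. The sketch below tabulates that arithmetic. The 16K way size is inferred from the list, and treating the remaining 192K of L2 as always-mapped RAM is an assumption; `l2_layout` is a hypothetical helper, not a device API.

```python
L2_TOTAL_KB = 256        # total L2 memory, from the text
CONFIGURABLE_KB = 64     # portion that can be split between cache and RAM
WAY_SIZE_KB = 16         # inferred: each configuration step adds one 16K way


def l2_layout(cache_ways):
    """Return (cache_kb, configurable_ram_kb, total_ram_kb) for 0 to 4 ways."""
    max_ways = CONFIGURABLE_KB // WAY_SIZE_KB
    if not 0 <= cache_ways <= max_ways:
        raise ValueError(f"cache_ways must be between 0 and {max_ways}")
    cache_kb = cache_ways * WAY_SIZE_KB
    # Whatever part of the configurable 64K is not cache stays mapped RAM.
    configurable_ram_kb = CONFIGURABLE_KB - cache_kb
    # Assumption: the other 192K of L2 is always addressable as mapped RAM.
    total_ram_kb = L2_TOTAL_KB - cache_kb
    return cache_kb, configurable_ram_kb, total_ram_kb


for ways in range(4, -1, -1):
    cache_kb, ram_kb, total_kb = l2_layout(ways)
    print(f"{ways}-way: {cache_kb:2d}K cache, {ram_kb:2d}K mapped RAM "
          f"of the configurable 64K, {total_kb}K mapped RAM in total")
```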

Dedicated L1 caches eliminate conflicts for the memory resources between the program and
data busses. A unified L2 memory provides flexible memory allocation between program and
data for accesses that do not reside in L1.
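The difference between the direct-mapped L1I and the 2-way L1D can be illustrated with a toy cache model. The sketch below is not a model of the actual hardware: the 32-byte line size and the LRU replacement policy are assumptions, and the class and function names are hypothetical. It shows that two addresses 4K apart evict each other in a 4K direct-mapped cache but coexist in a 4K 2-way cache.

```python
class SetAssociativeCache:
    """Toy cache model with LRU replacement; ways=1 is a direct-mapped cache."""

    def __init__(self, size_bytes, ways, line_bytes):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (ways * line_bytes)
        # Each set holds up to `ways` tags, ordered least to most recently used.
        self.sets = [[] for _ in range(self.num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        tags = self.sets[index]
        hit = tag in tags
        if hit:
            tags.remove(tag)      # re-appended below as most recently used
        elif len(tags) == self.ways:
            tags.pop(0)           # set is full: evict the least recently used
        tags.append(tag)
        return hit


def count_misses(cache, addresses):
    """Run an address trace through the cache and count the misses."""
    return sum(not cache.access(address) for address in addresses)


LINE_BYTES = 32                    # assumed line size, for illustration only
trace = [0x0000, 0x1000] * 8       # alternate between two addresses 4K apart

direct_mapped = SetAssociativeCache(4096, ways=1, line_bytes=LINE_BYTES)
two_way = SetAssociativeCache(4096, ways=2, line_bytes=LINE_BYTES)

print("direct-mapped misses:", count_misses(direct_mapped, trace))
print("2-way misses:        ", count_misses(two_way, trace))
```

In this trace the direct-mapped cache misses on every access, because both addresses map to the same line, while the 2-way cache misses only on the first access to each address.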

3.2 Cache Hides Off-Chip Latency

The external memories that interface to the TMS320C6713 may operate at a maximum of
100 MHz, while the device operates at a 225 MHz maximum frequency. All external memory
devices also have significant start-up latencies. For example, SDRAMs typically
have a read latency of 2-4 bus cycles. The lower frequency and additional latency of
external memories would normally degrade processor performance significantly. Retrieving data
from on-chip L2 memory takes far less time than retrieving it from external memory, and the
intermediate L2 cache hides the external latency from the user. Using the fast L2
memories to cache the slower external memories reduces the latency of external accesses by a
factor of five.
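The clock figures above can be turned into a rough latency estimate. The sketch below converts the SDRAM read latency from external bus cycles into CPU cycles, using the 225 MHz and 100 MHz maximum frequencies, and then applies a simple average-latency model. The 90% hit rate is purely hypothetical, and the relative latencies (external = 5, L2 = 1) only restate the factor-of-five claim; neither is a measured figure.

```python
CPU_CLOCK_MHZ = 225   # maximum device frequency, from the text
EXT_BUS_MHZ = 100     # maximum external memory frequency, from the text


def bus_to_cpu_cycles(bus_cycles):
    """Express a latency given in external bus cycles as CPU cycles."""
    return bus_cycles * CPU_CLOCK_MHZ / EXT_BUS_MHZ


def average_latency(hit_rate, hit_latency, miss_latency):
    """Average access latency for a cache with the given hit rate."""
    return hit_rate * hit_latency + (1.0 - hit_rate) * miss_latency


# SDRAM read latency of 2 to 4 bus cycles, seen from the CPU clock domain.
for bus_cycles in (2, 4):
    cpu_cycles = bus_to_cpu_cycles(bus_cycles)
    print(f"{bus_cycles} bus cycles = {cpu_cycles} CPU cycles")

# Hypothetical 90% L2 hit rate; latencies in relative units (L2 = 1, external = 5).
relative = average_latency(hit_rate=0.9, hit_latency=1.0, miss_latency=5.0)
print(f"average relative latency: {relative:.2f}")
```

A read latency of 2-4 bus cycles therefore costs the CPU 4.5 to 9 of its own cycles at the maximum clock rates, before any data transfer time is counted.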