Telos Zephyr Xstream User Manual

Section 6: AUDIO CODING REFERENCE

codec?  Until recently, the answer was no, but new developments in codecs have changed the
picture.  One of the main objectives in audio coding is to provide the best tradeoff between
quality and bit rate.  In general, this goal can only be achieved at the cost of a certain coding
delay.  Codecs for voice telephone applications have used ADPCM and CELP because they have
much lower delay than perceptual codecs.  These are optimized for voice and can have
reasonably good performance.

Zephyr users have known for years that Layer‐3 offers all the fidelity needed for most broadcast
situations. However, they also know that the delay of Layer‐3 can be frustrating, particularly if
high fidelity is needed in both directions and parties at the two ends must carry on a
conversation.

The folks at Fraunhofer were aware of these factors, and have developed an extension to AAC
called "AAC Low Delay," or "AAC‐LD" for short. AAC‐LD offers quality equivalent to Layer‐3 with
less than 25% of the delay!

AAC‐LD combines the advantages of perceptual coders (such as Layer‐3) with certain principles
of speech coders.  Unlike speech coders, AAC‐LD handles both speech and music with good
quality, and its audio quality scales up with bit rate, so transparent quality can be
achieved.  AAC‐LD’s coding power is roughly the same as that of Layer‐3, meaning that mono
high fidelity 15 kHz audio may be sent via one ISDN channel. With ISDN’s two channels, you
can achieve near CD quality stereo.

Delay in perceptual codecs is dependent on several parameters:

• Frame length. Time is required to collect all the samples for a frame. The longer the
frame, the more the delay.

• Filter bank delay. This causes an additional delay equivalent in time to the frame delay.

• Look‐ahead delay for block switching. Layer‐3 and AAC use filter banks with high
frequency resolution. For signals with high tonality, efficiency is high.
But when there are transients, a dynamic switching process changes to a
filter bank with lower frequency resolution and better time resolution. In
order to correctly decide when to make this change, a look ahead process is
required, adding delay.

• Bit reservoir. The length of this buffer determines how much delay this process
contributes.

The overall delay is the sum of all of these components, counted in samples, divided by the
sampling rate. For a given configuration, the delay is therefore inversely proportional to
the sampling frequency: doubling the sampling rate halves the delay.
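
The relationship above can be illustrated with a short calculation. This is only a sketch: the function name and the sample counts used below are hypothetical values chosen for illustration, not figures for any particular codec, and the bit reservoir is assumed to be expressed as an equivalent number of samples.

```python
def codec_delay_ms(frame_len, lookahead, bit_reservoir, sample_rate_hz):
    """Estimate codec delay in milliseconds from per-component sample counts.

    All arguments except sample_rate_hz are delays expressed in samples.
    The names and the example values are illustrative assumptions.
    """
    # Per the text, the filter bank adds a delay equal to the frame delay.
    filter_bank = frame_len
    total_samples = frame_len + filter_bank + lookahead + bit_reservoir
    # Dividing by the sampling rate converts samples to seconds.
    return 1000.0 * total_samples / sample_rate_hz


# Hypothetical configuration: 1024-sample frame, 512-sample look-ahead,
# no bit reservoir, compared at two sampling rates.
for rate in (32000, 48000):
    print(rate, "Hz:", round(codec_delay_ms(1024, 512, 0, rate), 2), "ms")
```

With the same sample counts, the higher sampling rate gives the shorter delay, and shrinking the frame length reduces two of the terms at once because the filter bank delay follows it.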

How AAC‐LD Gets its Low Delay

AAC‐LD is based on the core AAC work, so much is similar, but each of the contributors to the
delay had to be addressed and modified:

• The frame length is reduced to 512 or 480 samples, with the same number of spectral
components at the filter bank output.