Difference between revisions of "User talk:SebastianG"

From Hydrogenaudio Knowledgebase
Jump to: navigation, search
(Removing all content from page)
 
Line 1: Line 1:
= Introduction =
 
  
Noise shaping is a technique that is often used in combination with
 
[[Dithering|dithering]] in the context of [[Quantize|quantization]]. The purpose of noise shaping
 
is to alter the spectral ''shape'' of the error that is introduced due
 
to [[Dithering|dithering]] and [[Quantize|quantization]]. Sometimes it may be desirable to
 
concentrate most of the noise power in a certain frequency band so
 
that the overall noise becomes less perceivable.
 
 
= Applications =
 
 
In the field of digital image processing a popular noise shaping
 
technique is known as "Floyd Steinberg dithering". This algorithm is
 
used to reduce the noise power at lower spatial frequencies which
 
results in an improved perceptual quality.
 
 
For word length reductions of audio signals (i.e conversion from a 24bit
 
signal to a 16bit signal) noise shaping can be used to push quantization
 
noise out of perceptually important frequency bands to perceptually less
 
important frequency bands (>20 kHz) in order to reduce the perceived
 
loudness of the quantization noise.
 
 
Another example is the
 
[[Super Audio Compact Disc|Super Audio CD]]. The SACD format represents
 
the most extreme trade-off between word length (1bit/sample) and
 
sampling frequency (2.8 Mhz). Due to the low word length high power
 
noise is introduced. With the help of noise shaping this noise is
 
usually rejected by about 120 dB in the audible band (0-20 kHz) at the
 
cost of a higher noise power in the ultrasonic range (+3.5dB for
 
frequencies above 100 kHz).
 
 
= Limits & filter design restrictions =
 
 
Noise Shaping without [[Dithering|''dithered'']] quantization is also possible but
 
less useful because the spectral characteristics of ''undithered''
 
quantization noise is not easily predictable and depends heavily on the
 
digital signal that is to be quantized.
 
 
It's not possible to lower the noise power for ''all'' frequencies
 
via noise shaping because this would be in violation of the
 
rate/distortion principles. But it ''is'' possible to lower the noise
 
in certain frequency bands at the cost of increased noise power in other
 
frequency bands. This property is also known as the
 
''noise shaping theorem''. It states that the average logarithmic magnitude response of
 
the corresponding noise shaping ''filter'' over the whole linear
 
frequency axis cannot drop below 0dB.
 
 
The noise shaping application poses some restrictions on the fiter
 
design. The first sample of the filter's impulse response always equals
 
one. Also, interesting filters for noise shaping are so-called
 
''minimum phase filters''. This type of filter performs optimal in the sense of
 
the noise shaping theorem because the average logarithmig magnitude
 
response ''is'' 0dB.
 
 
For the special [[Super Audio Compact Disc|SACD]] case an additional
 
constraint comes into play: The magnitude response should not exceed
 
3.5dB. This is enforced to prevent so-called limit cycles.
 
 
The design of these kinds of filters is however not trivial. This is not
 
covered by this article.
 
 
= Digital filter reminder =
 
 
A common representation of digital filters is their ''transfer function''
 
in form of a rational polynomial in z^-1. These polynomials are defined
 
by their coefficients b0,b1,b2,...,a1,a2,...:
 
 
<math>N(z) = \frac{B(z)}{A(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \ldots + b_m z^{-m}}
 
{1 + a_1 z^{-1} + a_2 z^{-2} + \ldots + a_m z^{-m}} </math>
 
 
where 'N' is the transfer function and 'm' the filter's ''order''.
 
 
== FIR filters ==
 
 
In case a1=a2=a3=...=0 the filter's impulse response is finite which is
 
why these filters are called FIR ([[Finite Impulse Response Filter|finite impulse response]])
 
filters as opposed to IIR
 
[[Infinite Impulse Response Filter|infinite impulse response]] filters. Note that in case of
 
FIR filters the coefficients b0, b1, ... correspond to the impulse
 
response. The impulse response is another way a filter can be
 
characterized and describes what the filter does to a unit pulse. Let's
 
assume for the moment that a1=a2=a3=...=0. The the impulse response
 
would be:
 
 
x:  0  0  1  0  0  0  0  ... 0  0  0 ... (input signal)
 
y:  0  0  b0 b1 b2 b3 b4 ... 0  0  0 ... (output signal)
 
 
It helps to think of negative powers of z in transfer functions as a
 
delay: z^-1 = delay of one sample. z^-2 = delay of two samples etc.
 
Since a digital filter is a linear and time-invariant system (LTI) it is
 
possible to determine the filter's output by a superposition of scaled
 
and shifted impulse responses. This leads to the following relationship
 
between input and output in the case of FIR filters:
 
 
time domain                                    Z-transform
 
-----------                                    -----------
 
y[k] = b0 x[k] + b1 x[k-1] + b2 x[k-2] + ...    Y(z) = X(z) * B(z)
 
 
where x[] is the input signal, y[] the output signal and k the current
 
sample index. This is a simple convolution directly computed using the
 
given impulse response. The convolution in the time domain corresponds
 
to a multiplication in the Z-domain.
 
 
== IIR filters ==
 
 
IIR filters are now a simple extension to the FIR case:
 
 
time domain                                Z-transform
 
-----------                                -----------
 
      y[k] + a1 y[k-1] + a2 y[k-2] + ...    Y(z) * A(Z) = X(Z) * B(Z)
 
= b0 x[k] + b1 x[k-1] + b2 x[k-2] + ...
 
 
Solving for Y we get
 
 
time domain                                    Z-transform
 
-----------                                    -----------
 
y[k] = b0 x[k] + b1 x[k-1] + b2 x[k-2] + ...    Y(Z) = X(Z) * N(z)
 
                - a1 y[k-1] - a2 y[k-2] + ...
 
 
which is the well known ''difference formula''. Its signal flow graph
 
representation is known as
 
[http://ccrma.stanford.edu/~jos/filters/Direct_Form_I.html direct form I].
 
Now, let's see how we can apply this kind of filter to quantization
 
errors...
 
 
= Noise Shaper Implementation =
 
 
Usually a filter's transfer function is called 'H'. In this article 'N'
 
has been used because this filter is also known as noise transfer
 
function (NTF) in the context of noise shaping. Later, another filter
 
'H' is introduced that has a special relation to 'N'.
 
 
== Direct Form I ==
 
 
For convenience let us substitute the actual difference in the
 
difference formula for 'u'. Also, b0 must be equal to one for noise
 
shaping filters as mentioned above. This leads to
 
 
y[k] = x[k] + u[k]  with u[k] = b1 x[k-1] + b2 x[k-2] + ...
 
                                - a1 y[k-1] - a2 y[k-2] + ...
 
 
Note that since the filter is supposed to be applied on the quantization
 
noise y[] corresponds to the filtered noise and x[] to the unfiltered
 
noise. Suppose s[] is the signal we're trying to quantize. The following
 
pseudo code snippet shows the inner loop over k of ''one possible''
 
noise shaping implementation:
 
 
(1) compute u[k]
 
(2) wanted[k] = s[k] + u[k]
 
(3) quantized[k] = quantize( wanted[k] + dither() )
 
(4) y[k] = quantized[k] - s[k]
 
(5) x[k] = y[k] - u[k]
 
 
That's about it. It may not be obvious at first, though. Take a closer
 
look at what happens at step 3. Note that the following is true
 
 
time domain                          Z-transform
 
-----------                          -----------
 
quantized[k] = s[k] + u[k] + x[k]    Q(z) = S(z) + X(z) * N(z)
 
              = s[k] + y[k]                = S(z) + Y(z)
 
 
u[k], wanted[k] are temporary variables. The values are not needed
 
outside this iteration, so you don't need to remember these. But we
 
''do'' need to remember the past filtered and unfiltered error samples
 
y[] & x[] upto some amount of samples determined by the filter order.
 
 
== Decoupling filtering and quantization ==
 
 
[[Image:Noiseshaper3.png|thumb|right|200px]]
 
As mentioned, the previous example is ''one possible'' implementation.
 
It's not the smartest one. But it's hopefully easy to understand.
 
There's another common formulation. We now require some more transfer
 
function fiddeling. First, we add one and subtract one again:
 
 
<math> N(z) = \frac{B(z)}{A(z)} = 1 - \frac{A(z)}{A(z)} + \frac{B(z)}{A(z)}</math>
 
 
Since now the two fractions have a common denominator we can combine
 
them easily to one fraction. Also, let's factor z^-1 out of the
 
fraction:
 
 
<math>N(z) = 1 - \frac{A(z)-B(z)}{A(z)} = 1 - z^{-1} \frac{z ( A(z)-B(z) )}{A(z)} =: 1 - z^{-1} H(z)</math>
 
 
The last thing we did is writing H(z) for z*(A(Z)-B(Z)). Note that with
 
b0=1 it follows that z*(1-b0)=0. So, H(z) can be written as
 
 
<math>H(z) = \frac{c0 + c1 z^{-1} + c2 z^{-2} + \ldots}{1 + a1 z^{-1} + a2 z^{-2} + \ldots} </math>
 
 
with <math>c_{k-1} = a_k - b_k</math> for <math>k = 1 \ldots m</math>. H(z)
 
is some ''arbitrary'' filter. The big advantage now is that it is
 
possible to plug nearly any digital filter implementation for H(z) into
 
the noise shaping loop. One obvious approach is the difference formula
 
from above but there are other alternatives with desirable properties.
 
However, the purpose of this article is not to discuss digital filters
 
in general but their application as noise shaping filters. So,
 
throughout the rest of this article H(z) is simply treated as a black
 
box. This black box has a certain state and gives us an output sample
 
for each input sample. The modified pseudo code loop looks like this:
 
 
(1) h_out[k] = H.filter(last_error)
 
(2) wanted[k] = s[k] - h_out[k]
 
(3) quantized[k] = quantize( wanted[k] + dither() )
 
(4) last_error = quantized[k] - wanted[k]
 
 
which is equivalent to the shown signal flow graph where the unit
 
sample delay z^-1 is implemented by the state variable 'last_error':
 
Remember that this black box filter H(z) doesn't correspond the the
 
actual noise shaping filter N(z). Depending on the context published
 
coefficients for noise shaping correspond either to N(z), H(z), or
 
-H(z).
 
 
= Example filters =
 
 
The following two filters have been designed by Sebastian Gesemann
 
via non-linear optimization to match a slightly modified version of
 
the LAME MP3 encoder's ATH (absolute threshold of hearing) curve. They
 
were first published in the
 
[http://www.hydrogenaudio.org/forums/index.php?showtopic=47980&pid=551558&st=0 Hydrogenaudio forum]
 
including the noise transfer function's second order factorizations.
 
 
== Sampling rate of 44100 Hz, derived from LAME's ATH formula ==
 
 
The noise transfer function is given by
 
<math>N(z) = \frac{1 -1.1474 z^{-1} +0.5383 z^{-2} -0.3520 z^{-3} +0.3475 z^{-4}}
 
{1 +1.0587 z^{-1} +0.0676 z^{-2} -0.6054 z^{-3} -0.2738 z^{-4}}</math>
 
 
The corresponding filter H is given by
 
<math>H(z) = \frac{2.2061 -0.4706 z^{-1} -0.2534 z^{-2} -0.6214 z^{-3}}
 
{1 +1.0587 z^{-1} +0.0676 z^{-2} -0.6054 z^{-3} -0.2738 z^{-4}}</math>
 
 
== Sampling rate of 48000 Hz, derived from LAME's ATH formula ==
 
 
The noise transfer function is given by
 
<math>N(z) = \frac{1 -1.3344 z^{-1} +0.7455 z^{-2} -0.4602 z^{-3} +0.3463 z^{-4}}
 
{1 +0.9030 z^{-1} +0.0116 z^{-2} -0.5853 z^{-3} -0.2571 z^{-4}}</math>
 
 
The corresponding filter H is given by
 
<math>H(z) = \frac{2.2374 -0.7339 z^{-1} -0.1251 z^{-2} -0.6033 z^{-3}}
 
{1 +0.9030 z^{-1} +0.0116 z^{-2} -0.5853 z^{-3} -0.2571 z^{-4}}</math>
 

Latest revision as of 10:10, 14 March 2008