LAME Y switch

From Hydrogenaudio Knowledgebase
Revision as of 10:04, 1 April 2010 by JAZ (talk | contribs) (Orthographic revision, extension on some points.)

This article describes the function of the -Y switch on the LAME encoder command line.

The short definition

  • The -Y switch tells LAME not to encode the highest frequencies accurately if doing so would cause a disproportionate increase in bitrate.


Other ways to say it include:

  • The -Y switch tells LAME to use a coarser representation for the highest frequencies in those parts where representing them accurately would cause all the other bands to be over-encoded.
  • The -Y switch tells LAME not to be so strict with the highest frequencies if they are going to cause an increase in bitrate.


The -Y switch is not a lowpass filter.
It allows high frequencies (>= 16 kHz) to exist; it only alters their accuracy. If their values are very small, it can quantize them to zero (although the psychoacoustic analyzer will probably decide to simply remove them instead).


The technical definition

How is audio stored in MP3

  • MP3 audio is stored in the frequency domain (values for frequencies) instead of the time domain (values for samples).
  • Frequencies are analyzed and stored in groups, known as bands.
  • Bands are quantized to make them compress better.
  • The scale factor determines how much quantization (loss of precision) is applied to each band. Heavier quantization gives greater compression, but leaves fewer distinct values between the minimum and the maximum (less resolution).
  • Each band has its own scale factor, so that its quantization can be adjusted independently from the others.
  • The exception is scalefactor band 21 (sfb21), which does not have a scale factor. This band stores frequencies of 16 kHz and above.
  • Global gain is an extra quantizer that affects all bands simultaneously.

(See the notes on scale factors and global gain under Notes and references.)
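
As a rough illustration of the points above, the following Python sketch models a band quantizer whose step size is set by a global gain and reduced by a per-band scale factor. The function names, the units of the scale factor and the uniform rounding are assumptions made for this example; the real MP3 quantizer is nonlinear and uses different exponents.

```python
# Toy model only: names and units are illustrative, not taken from LAME.

def step_size(global_gain, scale_factor=0):
    """Step size: the global gain sets it, a scale factor shrinks it."""
    return 2.0 ** ((global_gain - scale_factor) / 4.0)

def quantize(values, step):
    """Map each value to the nearest multiple of the step (as an integer)."""
    return [int(round(v / step)) for v in values]

def dequantize(codes, step):
    """Reconstruct approximate values from the integer codes."""
    return [c * step for c in codes]

band = [9.5, -3.4, 1.2]                           # hypothetical frequency values
coarse = step_size(global_gain=8)                 # no scale factor applied
fine = step_size(global_gain=8, scale_factor=4)   # this band gets a finer step

for step in (coarse, fine):
    codes = quantize(band, step)
    print(step, codes, dequantize(codes, step))
```

With the finer step, the reconstructed values land closer to the originals, at the price of larger integer codes that cost more bits to store.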

What is the scalefactor band 21 (sfb21) defect

  • If the encoder determines that sfb21 needs more resolution, it has no way to decrease the scale factor of sfb21 alone, since there is no such scale factor.
  • The only way to increase the resolution on sfb21 is therefore to reduce the global gain quantization, since global gain applies to all bands.
  • The encoder can reduce the global gain as long as it is above zero.
  • If global gain is zero, resolution will need to be increased (and quantization lowered) on every other scale factor band.
  • The result is that unnecessary resolution is applied to every other band, so the bits used in all the other bands will increase, causing the bitrate to rise.
  • The encoder is forced to excessively increase the bitrate of the file just so that the frequencies >= 16 kHz will be adequately quantized.
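
The effect can be sketched with a toy bit-cost model. The cost function (log2 of the number of quantization levels), the band amplitudes and the gain values below are all assumptions chosen for illustration; they do not reproduce LAME's actual bit allocation.

```python
import math

def step_size(global_gain, scale_factor=0):
    """Toy step size: set by the global gain, reduced by a scale factor."""
    return 2.0 ** ((global_gain - scale_factor) / 4.0)

def estimated_bits(amplitude, step):
    """Crude proxy: more quantization levels need more bits."""
    return math.log2(1.0 + amplitude / step)

# Hypothetical peak amplitudes; the last entry stands in for sfb21.
amplitudes = [100.0, 80.0, 60.0, 5.0]

def bits_per_band(global_gain):
    # No scale factors applied: every band follows the global gain.
    step = step_size(global_gain)
    return [estimated_bits(a, step) for a in amplitudes]

before = bits_per_band(16)  # gain that suits the lower bands
after = bits_per_band(8)    # gain lowered only because sfb21 needs it

extra_other_bands = sum(after[:-1]) - sum(before[:-1])
extra_sfb21 = after[-1] - before[-1]
print(round(extra_other_bands, 2), round(extra_sfb21, 2))
```

In this model, most of the additional bits go to bands that did not ask for more resolution, which is the waste the sfb21 defect refers to.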

The -Y switch and the sfb21

LAME implements the -Y switch as a way to activate the alternate logic that CBR uses with respect to quantization noise in the sfb21 band.

  • The encoder determines the desired quantization noise within each sfb. The scale factors are chosen according to these values.
  • If the -Y switch is not in effect (neither implicitly nor explicitly), sfb21 is evaluated and the global gain is set accordingly.
  • Adding -Y lets the encoder ignore whatever quantization noise ends up in sfb21.

The result is that all frequencies of 16 kHz and above still get encoded.

Those that would normally have needed higher resolution to satisfy the criteria of the psy-model do not receive that treatment, while those that would not need higher resolution are unaffected by the -Y switch. The -Y switch prevents the global gain quantization from being decreased solely to accommodate the needs of sfb21.
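
The decision described above can be sketched as a small search for the global gain. This is a simplified model under stated assumptions, not LAME's quantization loop: noise is approximated as half the step size, every band except sfb21 may use a hypothetical maximum scale factor, and all names and numbers are invented for the example.

```python
def step_size(global_gain, scale_factor=0):
    """Toy step size: set by the global gain, reduced by a scale factor."""
    return 2.0 ** ((global_gain - scale_factor) / 4.0)

def choose_global_gain(allowed_noise, max_scale_factor, ignore_sfb21):
    """Return the largest global gain (0..255) that keeps noise within limits.

    allowed_noise lists the tolerated noise per band; the last entry is sfb21.
    Noise is approximated as half the step size (worst-case rounding error).
    """
    last = len(allowed_noise) - 1
    for global_gain in range(255, -1, -1):
        acceptable = True
        for band, limit in enumerate(allowed_noise):
            if band == last:
                if ignore_sfb21:
                    continue          # -Y: sfb21 noise is not evaluated
                scale_factor = 0      # sfb21 has no scale factor to help
            else:
                scale_factor = max_scale_factor
            if step_size(global_gain, scale_factor) / 2.0 > limit:
                acceptable = False
                break
        if acceptable:
            return global_gain
    return 0

limits = [8.0, 8.0, 0.5]   # hypothetical: sfb21 (last) is the most demanding
without_y = choose_global_gain(limits, max_scale_factor=8, ignore_sfb21=False)
with_y = choose_global_gain(limits, max_scale_factor=8, ignore_sfb21=True)
print(without_y, with_y)
```

When sfb21 is ignored, the search settles on a much higher global gain, so the other bands keep only the resolution they need.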


The -Y switch and CBR/ABR

The -Y switch can only be activated in VBR mode. By default, -V 3 to -V 9 use -Y, while -V 0, -V 1, and -V 2 do not. Consequently, adding -Y is only useful for the three highest-quality VBR settings (-V 0 to -V 2).

This is because in CBR and ABR modes, the encoder uses -Y implicitly. Specifically, LAME targets a given bitrate, and adjusts the quantization steps until that target is reached.

Since sfb21 has no scale factor of its own, its quantization noise is not evaluated.

This is the same treatment as using -Y in VBR mode.
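
The rules in this section can be condensed into a small helper. It encodes only the defaults described in this revision of the article, which other LAME versions may not share, and the function and parameter names are hypothetical.

```python
def y_in_effect(mode, vbr_quality=None, explicit_y=False):
    """Return True if sfb21 noise is ignored, per this article's description."""
    if mode in ("cbr", "abr"):
        return True                        # implicit in CBR and ABR
    if mode != "vbr":
        raise ValueError("mode must be 'cbr', 'abr' or 'vbr'")
    if vbr_quality is None:
        raise ValueError("vbr_quality is required in VBR mode")
    return explicit_y or vbr_quality >= 3  # -V 3 to -V 9 use -Y by default

print(y_in_effect("vbr", vbr_quality=2))                   # -V 2 alone
print(y_in_effect("vbr", vbr_quality=2, explicit_y=True))  # -V 2 -Y
```

Only the combination of VBR mode and a quality setting of 0 to 2 leaves the choice to the user.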


Motivation behind this article

The article tries to clarify what the switch does and what it does not do. Like joint stereo, it is frequently misunderstood, and it is often mistaken for a filter.

By explaining what the switch does, in both simple and technical terms, the article should give the reader a better understanding of the motivation behind the switch and of its usage.


See also

Description of the MPEG layer 3 format

Hydrogenaudio thread discussing this article


Notes and references

In MPEG-1 (32, 44.1 and 48 kHz), the last scalefactor band is sfb21. In MPEG-2 (16, 22.05 and 24 kHz), it is sfb12. The frequency at which it starts also depends on the sampling rate. The value of ~16 kHz is for 44.1 kHz material.

Global gain and scale factors are not independent. The latter are expressed as a reduction relative to the former.

  • The global gain is the global quantization step size, with a value range between 0 and 255.
  • The scale factor per band is the amount by which the global quantization step size is reduced for that band. The range of this value depends on the band.

Consequently, only a limited number of values is available.

This article was written in part from comments by Aleron Ives, robert and benski.