Vorbis

From Hydrogenaudio Knowledgebase
Revision as of 17:55, 3 October 2006 by Elliottmobile (Talk | contribs)

Featured article

Vorbis (commonly used inside the Ogg container) is a fully open, non-proprietary, patent-free (subject to speculation), and royalty-free general-purpose compressed audio format for mid- to high-quality (8 kHz to 48 kHz, 16+ bit, multichannel) audio and music at fixed and variable bitrates from 16 to >256 kbps/channel. This places Vorbis in the same competitive class as audio representations such as MPEG-4 (AAC), and makes it similar to, but higher-performing than, MP3, TwinVQ (VQF), WMA and PAC. Vorbis is the first of a planned family of Ogg multimedia coding formats being developed as part of Xiph.org's Ogg multimedia project.

Informal listening tests suggest that Vorbis is comparable to MPEG-4 AAC at most bitrates and to MPC at 128 kbps. Transparency is generally reached at about 150-170 kbps (-q 5), with some exceptions. The encoder is relatively young and unoptimized, so further improvements can be expected.

Unfortunately, Xiph.org has failed to improve Vorbis at a steady rate since its initial 1.0 release in July 2002 (due to other development projects and time constraints). Since then, development has been led by other coders such as Garf and Aoyumi. Aoyumi's aoTuV series of encoders was incorporated into the September 2004 release of 1.1, which brought the first across-the-board quality improvements in two years. Aoyumi is currently working on aoTuV Beta 5 and future releases. The latest version is aoTuV Release 1, which is the re-branded Beta 4.51 (released in December 2005). Unfortunately, the improvements of aoTuV Release 1 have not yet been incorporated into the 'official' Vorbis line.

Vorbis has had success in video games, with many recent titles employing it instead of MP3 (Epic Games' Unreal Tournament 2003 and Unreal Tournament 2004, the PC port of Microsoft's Halo, and Uru being notable examples). Ogg Vorbis is also an official part of the OpenAL API extension library, used in many popular computer games. On April 10, 2006, RAD Game Tools integrated Ogg Vorbis support into their Miles Sound System (MSS), which has been used in over 3,200 games worldwide. This ensures that future games using MSS will be able to play Ogg Vorbis files. See the Xiph wiki for a full list of games confirmed to use Ogg Vorbis.

Before encoding files with Ogg Vorbis, consult the Recommended Ogg Vorbis page to determine which encoder to use and which settings Hydrogenaudio recommends.

Pros

  • The Ogg Vorbis specification is in the public domain; it is free for commercial or noncommercial use, under both LGPL and BSD licenses
  • Easy to use high-level API (Application Programming Interface)
  • Good all-round performance above 48 kbps (a leading codec at 128 kbps)
  • Well written specs
  • Supported by most portable DAPs
  • Suitable for internet-streaming (via Icecast and other methods)
  • Fully gapless playback
  • High potential for further tuning
  • Structured to allow the design of a hybrid filterbank

Cons

  • Limited official development (third-party development is always encouraged)
  • Current implementations are more computationally intensive to decode than MP3
  • Multichannel input mappings for 5.1, Ambisonic-B, and other configurations have no channel coupling and are not tuned (expect sub-optimal results until the code is improved)


Technical Information

  • Multiple block sizes for window switching including overlap (powers of two only) (128/1024, 256/2048, 512/4096)
  • A custom-designed window function, similar to the sine window, is applied; it has good sidelobe rejection
w_k = \sin\left(\frac{\pi}{2} \cdot \sin^2\left[\frac{\pi}{2n} \cdot \left(k + 0.5\right)\right]\right)
  • Psychoacoustic masking is exploited via an ATH (absolute threshold of hearing) model
  • Masking curves are computed from an empirically adjusted set of Ehmer curves
  • Modified Discrete Cosine Transform (MDCT) is used for noise analysis
  • Fast Fourier Transform (FFT) is used for tonal analysis
  • The global masking curve is a mixture of the calculated FFT and MDCT curves with the ATH curves overlaid
  • Floor 1, the noise floor (envelope), is calculated from the global masking curve using a piecewise linear approximation; the spectrum is then divided by the floor to generate the residue (fine detail). The Levinson-Durbin LPC model of Floor 0 is no longer used, although the code still exists
  • Noise normalization is applied to compensate for energy lost in certain frequency bands due to quantization (rounding).
  • The channels are coupled strictly in the residue, using point/phase stereo and lossless coupling
  • Multistage vector quantization with trained codebooks is used for coding the noise floor and the residue backend
  • Huffman coding is used to minimize vector codeword redundancy
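
The window formula above can be checked numerically. The following is a minimal Python sketch, not taken from any Vorbis implementation; it assumes that k runs from 0 to 2n-1, so that the window spans 2n samples, and the function names are invented for illustration. It evaluates the window and tests the power-complementary property (w_k^2 + w_{k+n}^2 = 1) that lets overlapping MDCT blocks sum back to the original signal.

```python
import math


def vorbis_window(n):
    """Evaluate w_k for k = 0 .. 2n-1 (assumed index range, see lead-in)."""
    return [
        math.sin(math.pi / 2 * math.sin(math.pi / (2 * n) * (k + 0.5)) ** 2)
        for k in range(2 * n)
    ]


def is_power_complementary(window, tol=1e-12):
    """Check that w_k^2 + w_{k+n}^2 == 1 for every k in the first half."""
    n = len(window) // 2
    return all(
        abs(window[k] ** 2 + window[k + n] ** 2 - 1.0) < tol
        for k in range(n)
    )


if __name__ == "__main__":
    w = vorbis_window(128)  # a 256-sample window
    print(len(w), is_power_complementary(w))
```

The property holds because shifting k by n turns the inner sin^2 term into cos^2, so the second half of the window is the cosine counterpart of the first half.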

Software

Encoders

  • Oggenc official command-line encoder (Win32/Posix)
  • OggdropXPd advanced drag-and-drop encoder by John33 (Win32)
  • Lancer SSE-optimized Vorbis encoder utility and libraries by BlackSword (Win32/Posix)
  • foo_vorbisenc Vorbis encoder library for Foobar2000 (Win32)
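
As a rough illustration of driving the official command-line encoder from a script, the Python sketch below builds an oggenc invocation at the -q 5 setting mentioned in the introduction. It assumes oggenc is installed and on the PATH; the helper names and file names are hypothetical, and the accepted quality range of -1 to 10 is an assumption based on the official encoder (third-party builds may differ).

```python
import subprocess


def build_oggenc_command(input_wav, output_ogg, quality=5):
    """Return the argument list for an oggenc run at the given quality."""
    # Assumed range for the official encoder's -q option.
    if not -1 <= quality <= 10:
        raise ValueError("quality must be between -1 and 10")
    return ["oggenc", "-q", str(quality), "-o", output_ogg, input_wav]


def encode(input_wav, output_ogg, quality=5):
    """Run oggenc; raises CalledProcessError if the encoder fails."""
    subprocess.run(build_oggenc_command(input_wav, output_ogg, quality), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown without running the encoder.
    print(" ".join(build_oggenc_command("track01.wav", "track01.ogg")))
```

Keeping the command construction separate from the subprocess call makes the chosen settings easy to inspect or log before any file is encoded.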

Decoders

ReplayGain

Splitters

The following utilities are used to splice Vorbis streams without decoding/re-encoding.

Taggers

Most taggers supporting Ogg Vorbis are listed on the download page.

Supported Digital Audio Players

The following list contains some players that support Vorbis playback.

A longer list can be found on the Xiph wiki.

Important note: There may be players out there that support Ogg Vorbis, although they are not marketed as such.


External links

The following links lead to information about the Ogg Vorbis codec on Hydrogenaudio and elsewhere on the web.

Hydrogenaudio Wiki

Websites

Scientific/R&D