Editing Opus

Latest revision Your text
Line 5: Line 5:
 
| caption = Opus Interactive Audio Codec
 
| caption = Opus Interactive Audio Codec
 
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
 
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.3.1
+
| stable_release = 1.1.3
 +
| preview_release = 1.2.0 alpha
 
| operating_system = Windows, Mac OS/X, Linux/BSD
 
| operating_system = Windows, Mac OS/X, Linux/BSD
 
| use = Encoder/Decoder
 
| use = Encoder/Decoder
Line 12: Line 13:
 
}}
 
}}
  
'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}}  Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).
+
'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}}  Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).
  
Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typ 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}
+
Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0-8kHz spectrum and the CELT layer encodes only the frequencies above 8kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typ 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}
  
 
Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''
 
Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''
  
 
==Characteristics==
 
==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.
+
Opus supports bitrates from 6kbps to 510kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30kbps (mono) and 40-100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.
  
Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.
+
Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4kHz, 6kHz, 8kHz, 12kHz, 20kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.
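
As a rough illustration of the audio bandwidths listed above, the following Python sketch maps each bandwidth to the name used in the tables of this article and to a sampling rate sufficient to carry it. The sampling rates (8, 12, 16, 24 and 48 kHz) are those commonly associated with Opus and are not stated in the text above; the names <code>OPUS_BANDWIDTHS</code> and <code>narrowest_bandwidth</code> are hypothetical and are not part of libopus.

```python
# Audio bandwidths named in this article, as (bandwidth in kHz, sampling rate in kHz).
# Assumption: the sampling rates are the ones commonly associated with Opus;
# they are not given in the surrounding text.
OPUS_BANDWIDTHS = {
    "narrowband": (4, 8),
    "medium-band": (6, 12),
    "wideband": (8, 16),
    "super-wideband": (12, 24),
    "full-band": (20, 48),
}


def narrowest_bandwidth(highest_frequency_khz: float) -> str:
    """Return the narrowest listed bandwidth that still covers the given frequency."""
    # Dictionaries preserve insertion order, so this walks from narrow to wide.
    for name, (bandwidth_khz, _sample_rate_khz) in OPUS_BANDWIDTHS.items():
        if highest_frequency_khz <= bandwidth_khz:
            return name
    raise ValueError("frequency is above the 20 kHz full-band limit")


if __name__ == "__main__":
    for frequency in (3.4, 7.0, 16.0):
        print(frequency, "kHz ->", narrowest_bandwidth(frequency))
```

A real encoder chooses the bandwidth from the bitrate and the content rather than from a single frequency, so this lookup only shows how the five bandwidths relate to each other.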
  
 
Of importance mainly to interactive uses, but potentially also useful in time-delayed audio streaming, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

Of importance mainly to interactive uses, but potentially also useful in time-delayed audio streaming, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.
  
For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.
+
For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.
  
 
Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.
 
Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.
  
 
==Bitrate performance==
 
==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.
+
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 32 kbps. Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.
  
For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.
+
For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12kHz (comparable with compact cassette) then to the full 20kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12kHz to 20kHz.
 
+
Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64&nbsp;kbps and 96&nbsp;kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96&nbsp;kbps also to 128&nbsp;kbps MP3 encoded using LAME <code>-V 5</code>.
+
  
 
==Indicative bitrate and quality==
 
==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.
+
The table below gives illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.
  
In encoder version 1.1, automatic speech/music detection and bandwidth detection were introduced to improve mode decisions, and VBR became less constrained, all with the aim of maximizing the quality/bitrate tradeoff; these improvements are further enhanced in versions 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.
+
In the experimental libopus version 1.1-alpha, automatic detection of speech/music and bandwidth detection have been introduced to improve mode decisions, and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff. Thus changes are likely, and this table is likely to require small updates as the encoder is improved.
  
 
===Speech encoding quality===
 
===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode deliberately modifies the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode does not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.
+
This table assumes a '''monophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40kbps+. The default 20ms frame size (22.5ms latency) is assumed.
  
 
{| class="wikitable" style="text-align:center"
 
{| class="wikitable" style="text-align:center"
Line 53: Line 52:
 
|-
 
|-
 
!Less than 5 kbps
 
!Less than 5 kbps
| &mdash;
+
| -
| &mdash;
+
| -
 
| Bitrates lower than 6 kbps not supported by Opus
 
| Bitrates lower than 6 kbps not supported by Opus
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.7–3.2&nbsp;kbps mono speech
+
| Try [http://codec2.org/ codec2] for 1.2-2.4 kbps speech
 
|-
 
|-
 
!6 kbps
 
!6 kbps
|6 kHz medium-band
+
|4 kHz
 
|SILK
 
|SILK
 
|Fair, intelligible
 
|Fair, intelligible
Line 65: Line 64:
 
|-
 
|-
 
!8 kbps
 
!8 kbps
|6 kHz medium-band
+
|4 kHz narrowband
 
|SILK
 
|SILK
 
|Close to telephone quality
 
|Close to telephone quality
Line 71: Line 70:
 
|-
 
|-
 
!12 kbps
 
!12 kbps
|12 kHz super-wideband
+
|6 kHz medium-band
|hybrid
+
|SILK
 
|Medium bandwidth, better than telephone quality
 
|Medium bandwidth, better than telephone quality
 
|Similar quality to AMR-WB
 
|Similar quality to AMR-WB
 
|-
 
|-
 
!16 kbps
 
!16 kbps
|20 kHz
+
|8 kHz wideband
|hybrid/CELT
+
|SILK
 
|Wideband speech quality
 
|Wideband speech quality
 
|Similar to/better than AMR-WB
 
|Similar to/better than AMR-WB
 
|-
 
|-
 
!24 kbps
 
!24 kbps
|20 kHz
+
|12 kHz super-wideband
|hybrid/CELT
+
|hybrid
 
|Near transparent speech
 
|Near transparent speech
 
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
 
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
Line 90: Line 89:
 
!32 kbps
 
!32 kbps
 
|20 kHz
 
|20 kHz
|CELT
+
|hybrid / possibly CELT
|Essentially transparent speech plus moderately good stereo music
+
|Essentially transparent speech plus moderately good mono music
 
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
 
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
 
|-
 
|-
Line 109: Line 108:
  
 
===Music encoding quality===
 
===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48&nbsp;kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.
+
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used - content dependent even when mono is specified as the typical stereo mode in the table below.
  
 
{| class="wikitable" style="text-align:center"
 
{| class="wikitable" style="text-align:center"
Line 122: Line 121:
 
!6 kbps
 
!6 kbps
 
|mono
 
|mono
|6 kHz
+
|4 kHz
 
|SILK
 
|SILK
 
|Poor, muffled sound but intelligible lyrics.
 
|Poor, muffled sound but intelligible lyrics.
| &mdash;
+
| -
 
|-
 
|-
 
!8 kbps
 
!8 kbps
 
|mono
 
|mono
|6 kHz
+
|4 kHz
 
|SILK
 
|SILK
 
|Poor, muffled but OK for bitrate
 
|Poor, muffled but OK for bitrate
| &mdash;
+
| -
 
|-
 
|-
 
!14 to 16 kbps
 
!14 to 16 kbps
 
|mono
 
|mono
|20 kHz
+
|6 kHz
|hybrid/CELT
+
|SILK
|Fairly poor but OK for bitrate
+
|Fairly Poor but OK for bitrate
 
|Perhaps acceptable for incidental music
 
|Perhaps acceptable for incidental music
 
|-
 
|-
 
!22 to 24 kbps
 
!22 to 24 kbps
 
|mono
 
|mono
|20 kHz
+
|8 kHz
|hybrid/CELT
+
|SILK
 
|Fair but OK for bitrate
 
|Fair but OK for bitrate
 
|OK for incidental music
 
|OK for incidental music
 
|-
 
|-
!32 to 40 kbps
+
!32 kbps
 +
|mono
 +
|12 kHz
 +
|hybrid
 +
|Moderately good mono, reasonably bright treble (c.f. mono cassette)
 +
|Good for podcasts, audiobooks, CELT-only poss for music. Competitor HE-AAC@32kbps is stereo full-band but with annoying artifacts.
 +
|-
 +
!36 to 40 kbps
 
|stereo
 
|stereo
|20 kHz
+
|12 kHz
|CELT
+
|hybrid/CELT
|Moderately good stereo, some artifacts, rarely nasty
+
|Moderately good stereo, reasonably bright treble (c.f. stereo cassette)
 
|Stereo podcasts, audiobooks, very low bitrate music
 
|Stereo podcasts, audiobooks, very low bitrate music
 
|-
 
|-
Line 159: Line 165:
 
|20 kHz
 
|20 kHz
 
|CELT
 
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
+
|Full bandwidth stereo music, some artifacts, rarely nasty
 
|Stereo podcasts, audiobooks, low bitrate music
 
|Stereo podcasts, audiobooks, low bitrate music
 
|-
 
|-
Line 167: Line 173:
 
|CELT
 
|CELT
 
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
 
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
+
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [http://people.xiph.org/~greg/opus/ha2011/ listening test]
 
|-
 
|-
 
!96 kbps
 
!96 kbps
Line 190: Line 196:
 
|Music storage & streaming. Future download music sales.
 
|Music storage & streaming. Future download music sales.
 
|-
 
|-
!160 to 192 kbps
+
!192 kbps
 
|stereo
 
|stereo
 
|20 kHz
 
|20 kHz
Line 202: Line 208:
 
|CELT
 
|CELT
 
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
 
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
+
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy --blocksize=256 may be competitive with minimum latency mode also.
 
|-
 
|-
 
!>510 kbps
 
!>510 kbps
| &mdash;
+
| -
| &mdash;
+
| -
| &mdash;
+
| -
 
|Above Opus bitrate range allowed for stereo sources
 
|Above Opus bitrate range allowed for stereo sources
|Settle for 510&nbsp;kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
+
|Settle for 510kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
 
|-
 
|-
 
|}
 
|}
Line 215: Line 221:
 
===Lower latency versus quality/bitrate trade-off===
 
===Lower latency versus quality/bitrate trade-off===
 
====Packet overhead in interactive applications====
 
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20&nbsp;ms frames, supports 60&nbsp;ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10&nbsp;ms SILK frames to reduce latency somewhat at the expense of packet overhead.
+
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20.0ms frames, supports 60.0ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10.0ms SILK frames to reduce latency somewhat at the expense of packet overhead.
  
In the CELT layer, which tends to operate at higher bitrates than SILK, 20&nbsp;ms frames are the default, but frames of 10&nbsp;ms, 5&nbsp;ms and 2.5&nbsp;ms are also possible. Shorter frames achieve lower latency but directly increase the packet overhead, because more packets are transmitted per second. In addition, as discussed below, they worsen the quality/bitrate tradeoff of the CELT layer itself.
+
In the CELT layer, which tends to operate at higher bitrates than SILK, 20.0ms frames are the default, but frames of 10.0ms, 5.0ms and 2.5ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.
  
 
None of the bitrates mentioned in this article account for the packet overhead.
 
None of the bitrates mentioned in this article account for the packet overhead.
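
The effect of frame size on packet overhead can be estimated with a short calculation. The Python sketch below is only illustrative: it assumes one Opus frame per packet and a 40-byte header per packet (a minimal IPv4 + UDP + RTP header without options or extensions), neither of which is specified in this article, and the function names are hypothetical.

```python
# Assumption: 20 bytes IPv4 + 8 bytes UDP + 12 bytes RTP, no options or extensions.
ASSUMED_HEADER_BYTES = 40


def packets_per_second(frame_ms: float) -> float:
    """Packets sent each second when every packet carries one frame."""
    return 1000.0 / frame_ms


def overhead_kbps(frame_ms: float, header_bytes: int = ASSUMED_HEADER_BYTES) -> float:
    """Bitrate consumed by packet headers alone, in kbps."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps: float, frame_ms: float,
               header_bytes: int = ASSUMED_HEADER_BYTES) -> float:
    """Codec bitrate plus header overhead, in kbps."""
    return codec_kbps + overhead_kbps(frame_ms, header_bytes)


if __name__ == "__main__":
    # Frame sizes mentioned in the text, applied to a 12 kbps speech stream.
    for frame_ms in (60.0, 20.0, 10.0, 5.0, 2.5):
        print(f"{frame_ms:4.1f} ms frames: "
              f"{overhead_kbps(frame_ms):6.1f} kbps overhead, "
              f"{total_kbps(12.0, frame_ms):6.1f} kbps total")
```

Under these assumptions the headers of a low-bitrate speech stream with 20 ms frames cost more than the audio itself, which is why 60 ms frames can be attractive when latency permits.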
  
 
====CELT layer latency versus quality/bitrate trade-off====
 
====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10&nbsp;ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.
+
Unlike the SILK layer, which works on fixed 10.0ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.
  
When the CELT layer uses 10&nbsp;ms, 5&nbsp;ms and 2.5&nbsp;ms frames instead of the default 20&nbsp;ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).
+
When the CELT layer uses 10.0ms, 5.0ms and 2.5ms frames instead of the default 20.0ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).
  
 
These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.
 
These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.
  
In all modes, the algorithmic delay consists of the frame size plus an additional 2.5&nbsp;ms delay. The CELT layer requires 2.5&nbsp;ms for MDCT window overlap.
+
In all modes, the algorithmic delay consists of the frame size plus an additional 2.5ms delay. The CELT layer requires 2.5ms for MDCT window overlap.
  
 
Xiph.org used matched PEAQ scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.
 
Xiph.org used matched PEAQ scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.
Line 239: Line 245:
 
!fractional bitrate increase
 
!fractional bitrate increase
 
|-
 
|-
!20 ms
+
!20.0 ms
 
|22.5 ms
 
|22.5 ms
 
|64.0 kbps
 
|64.0 kbps
| +0.0 %
+
|0.0 %
 
|-
 
|-
!10 ms
+
!10.0 ms
 
|12.5 ms
 
|12.5 ms
 
|70.4 kbps
 
|70.4 kbps
| +10.0 %
+
|10.0 %
 
|-
 
|-
!5 ms
+
!5.0 ms
 
|7.5 ms
 
|7.5 ms
 
|84.8 kbps
 
|84.8 kbps
| +32.5 %
+
|32.5 %
 
|-
 
|-
 
!2.5 ms
 
!2.5 ms
 
|5.0 ms
 
|5.0 ms
 
|112.0 kbps
 
|112.0 kbps
| +75.0 %
+
|75.0 %
 
|-
 
|-
 
|}
 
|}
  
N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20&nbsp;ms frame size is preferable.
+
N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20.0ms frame size is preferable.
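
The two derived columns of the table above follow from simple arithmetic, which the Python sketch below reproduces. It takes the 2.5 ms overlap and the PEAQ-matched bitrates directly from this section; the constant and function names are hypothetical and the code is not part of any Opus tool.

```python
OVERLAP_MS = 2.5        # additional delay stated in the text for MDCT window overlap
REFERENCE_KBPS = 64.0   # bitrate at the default 20 ms frame size

# Frame size in ms -> approximately equivalent bitrate in kbps, copied from the table.
EQUIVALENT_KBPS = {20.0: 64.0, 10.0: 70.4, 5.0: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Algorithmic delay as described in the text: frame size plus the overlap."""
    return frame_ms + OVERLAP_MS


def bitrate_increase_percent(frame_ms: float) -> float:
    """Extra bitrate relative to the 20 ms reference, in percent."""
    return (EQUIVALENT_KBPS[frame_ms] / REFERENCE_KBPS - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in EQUIVALENT_KBPS:
        print(f"{frame_ms:4.1f} ms frame: "
              f"{algorithmic_delay_ms(frame_ms):4.1f} ms delay, "
              f"+{bitrate_increase_percent(frame_ms):.1f} % bitrate")
```

The percentages are rounded for display because the division is done in floating point.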
  
 
== Hardware & Software Support ==
 
== Hardware & Software Support ==
Line 270: Line 276:
  
 
=== Commandline binaries & libopus versions ===
 
=== Commandline binaries & libopus versions ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.
+
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder ''opusenc'', decoder ''opusdec'', and with a different license, the ''opusinfo'' opus stream & metadata analyzer.
  
 
The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.
 
The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.
Line 277: Line 283:
 
 Released 11 Sep 2012, when RFC 6716 was standardized, although development was mostly complete by late 2011.
 
 Released 11 Sep 2012, when RFC 6716 was standardized, although development was mostly complete by late 2011.
  
'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.
+
'''Stable''', '''well-tuned''' ''opusenc'' reference encoder as included in RFC documentation.
  
 
CELT layer closely related to CELT 0.10 implements Constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.
 
CELT layer closely related to CELT 0.10 implements Constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.
Line 289: Line 295:
 
There are many minor improvements to '''speech quality''' in both SILK and CELT layers.
 
There are many minor improvements to '''speech quality''' in both SILK and CELT layers.
  
*'''DC-rejection''' below 3 Hz also aids quality when an inaudible DC offset is present, without affecting deep bass notes.
+
'''DC-rejection''' below 3 Hz also aids quality when an inaudible DC offset is present, without affecting deep bass notes.
  
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40&nbsp;kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range, SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection works without look-ahead and is not perfect, but it is usually undecided only in audio where either mode will work well.
+
'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24~40kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range, SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection works without look-ahead and is not perfect, but it is usually undecided only in audio where either mode will work well.
  
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
+
'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
  
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.
+
'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.
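The bitrate ranges mentioned for speech/music detection can be summarised in a toy Python sketch. This is not the encoder's real decision logic, which is internal to libopus and looks at more than bitrate; the 24 and 40 kbps thresholds are only the presumed range quoted above, and the function name is hypothetical.

```python
# Toy illustration only: NOT the libopus mode decision. The thresholds
# are the "presumably around 24-40 kbps" range from the text, and
# candidate_modes() is a hypothetical helper.

LOW_KBPS = 24   # assumed lower edge of the content-dependent range
HIGH_KBPS = 40  # assumed upper edge of the content-dependent range


def candidate_modes(bitrate_kbps):
    """Return the coding modes that may perform best at this bitrate."""
    if bitrate_kbps <= 0:
        raise ValueError("bitrate must be positive")
    if bitrate_kbps < LOW_KBPS:
        # Below the range, SILK is described as best for speech and music.
        return ("SILK",)
    if bitrate_kbps > HIGH_KBPS:
        # Above the range, CELT is described as best for speech and music.
        return ("CELT",)
    # Inside the range the best mode depends on content type, which is
    # where automatic speech/music detection matters.
    return ("SILK", "hybrid", "CELT")


if __name__ == "__main__":
    for kbps in (12, 32, 64):
        print(kbps, "kbps ->", ", ".join(candidate_modes(kbps)))
```

Only the middle range returns more than one candidate, which matches the point that detection mostly matters near that range.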
  
 
==== libopus v1.1.3 ====  
 
==== libopus v1.1.3 ====  
 
Released July 15th, 2016. This version contains:
 
Released July 15th, 2016. This version contains:
  
*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
+
-Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
 
+
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
+
 
+
*Fixes to comfort noise generation (CNG)
+
 
+
*Documenting that PLC packets can also be 2 bytes
+
 
+
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)
+
 
+
==== libopus v1.2.1 ====
+
Released June 26th, 2017. This version contains:
+
 
+
*Speech quality improvements especially in the 12–20&nbsp;kbit/s range
+
 
+
*Improved VBR encoding for hybrid mode
+
 
+
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14&nbsp;kbit/s
+
 
+
*Music quality improvements in the 32–48&nbsp;kbit/s range
+
 
+
*Generic and SSE CELT optimizations
+
 
+
*Support for directly encoding packets up to 120&nbsp;ms
+
 
+
*DTX support for CELT mode
+
 
+
*SILK CBR improvements
+
 
+
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
+
 
+
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)
+
 
+
==== libopus v1.3 ====
+
Released on October 18th, 2018. This version contains:
+
 
+
* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
+
 
+
* Support for ambisonics coding using channel mapping families 2 and 3
+
 
+
* Improvements to stereo speech coding at low bitrate
+
 
+
* Using wideband encoding down to 9&nbsp;kb/s
+
 
+
* Making it possible to use SILK down to bitrates around 5&nbsp;kb/s
+
 
+
* Minor quality improvement on tones
+
 
+
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
+
 
+
* Security/hardening improvements
+
 
+
* Fixes to the CELT PLC
+
 
+
* Bandwidth detection fixes
+
  
==== libopus v1.3.1 ====
+
-Fixes some issues with 16-bit platforms (e.g. TI C55x)
Released on April 12th, 2019. This version contains:
+
  
* Fixes to x87 builds
+
-Fixes to comfort noise generation (CNG)
  
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
+
-Documenting that PLC packets can also be 2 bytes
  
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)
+
-Includes experimental ambisonics work (--enable-ambisonics)
  
 
=== Ports ===
 
=== Ports ===
Line 382: Line 333:
 
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
 
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
 
* TrueConf video conferencing solutions support Opus.
 
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
+
* Opus support is planned for Jitsi 2.0, together with VP8 video
 
* Empathy may use any format supported in GStreamer, including Opus.
 
* Empathy may use any format supported in GStreamer, including Opus.
 
 * Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
 
 * Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
 
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
 
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
+
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.
+
  
 
=== Web frameworks and browsers ===
 
=== Web frameworks and browsers ===
Line 394: Line 344:
 
 * Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
 
 * Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
 
* Chromium and Google Chrome have audio support as of version 33.
 
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
 
 
* Maxthon Cloud Browser
 
* Maxthon Cloud Browser
  
Line 415: Line 364:
  
 
* Windows/Mac/Linux (Cross-Platform)
 
* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
+
*# [[VLC]] media player supports Opus as of version 2.0.4 
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
+
*#[[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
 
*# Clementine has Opus support
 
*# Clementine has Opus support
 
*# Audacious player
 
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
 
  
  
Line 428: Line 376:
 
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
 
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
 
*# MPC-HC
 
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
 
  
  
 
* iOS/Android (Cross-Platform)
 
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
+
*#Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
+
  
  
 
* Android Exclusive
 
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
 
 
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
 
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
 
*# [http://neutronmp.com/ Neutron Music Player]
 
*# [http://neutronmp.com/ Neutron Music Player]
Line 446: Line 391:
 
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
 
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
 
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
 
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
 
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]
 
  
 
=== Other software ===
 
=== Other software ===
Line 454: Line 397:
 
* Report-IT
 
* Report-IT
 
* [[MP3tag|MP3tag]]
 
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
 
 
* [http://www.xdlab.ru/en/ TagScanner]
 
* [http://www.xdlab.ru/en/ TagScanner]
 
* [http://www.xmedia-recode.de/ XMedia Recode]
 
* [http://www.xmedia-recode.de/ XMedia Recode]
