Hydrogenaudio Knowledgebase - User contributions [en]

Cue sheet

2024-10-08T09:03:14Z

Artoria2e5: /* Most often used */ WAVE is a later extension. There is a strong consensus for FLAC to work, but Opus? Fb2k works, VLC doesn't.

A '''cue sheet''' (or '''CUE file''', '''.cue''', '''CUE sheet''', etc.) is a formatted text file which provides index and other supplemental information for one or more audio files. A cue sheet is generally used in conjunction with either extracting from or burning to [[Compact Disc|CD]]. For example, when a CD's complete audio content has been ripped to a single file, a cue sheet contains information about the track boundaries, and CD-R burning software can use it to make a copy of the original CD with the same track layout as the original. Cue sheets can also be used when writing data CDs.

Increasingly, cue sheets are being used as playlists: you load the cue sheet in a media player, and it can play an "image" (single-file) rip as if it were separate files, one for each track. Cue sheets can be used for file-per-track rips as well, but many such rips require that the cue sheet not adhere strictly to the original specification's rules.

==Cue sheet contents==
All cue sheets contain the following info:
* The name & type of at least one file being indexed (an audio file, normally);
* A numbered list of tracks each file corresponds to or contains;
* The start point (index 01) for each track, time-wise (MM:SS:FF format).

Cue sheets may contain the following additional info:
* CD-Text [[metadata]] such as performer, title, songwriter for the disc and/or each track;
* ISRCs (sound recording IDs to burn)
* Special flags for CD burning (e.g. for pre-emphasis)
* Gap info (how much silence to insert before or after each track)
* Comments (which are used by some programs to store nonstandard metadata like genre, freeDB disc ID, etc.)

A cue sheet isn't necessary to make an exact copy of the audio portion of a CD; ripping & burning software will get you the audio wave data and can figure out where each track starts. However, a cue sheet ''can'' be used to specify the location of the first track (if it deviates from the standard), as well as certain subcode information, such as non-01 index points, CD-TEXT (which may not exist on the original CD), UPC/ISRC data, and [[pre-emphasis]] information.

A cue sheet ''is'' required to burn "hidden track one audio" ([[HTOA]]), which is audio that can only be played after scanning backwards from the beginning of track 1. A cue sheet may be needed when silent frames have been omitted from the beginning or end of files to be burned; the cue sheet can be used to reconstruct the pauses by telling the burner or player where to insert silence. A cue sheet may also be needed when there is a mix of audio and data tracks to be burned (unless the burning software is told which tracks are which).

== History ==
The cue sheet format was invented by Jeff Arnold of [http://web.archive.org/web/20070217191217/http://www.goldenhawk.com/ GoldenHawk Technology] for use with his [[DAO]] ('''D'''isc '''A'''t '''O'''nce) and [http://web.archive.org/web/20070217191217/http://www.goldenhawk.com/ CDRWIN] applications. The format has since been adopted as the ''de facto'' standard, and is used by various other applications, including the audio player [[foobar2000]]. The official cue sheet specification is widely accepted to be Appendix A of the CDRWIN User's Guide.

The name is taken from the '''SEND CUE SHEET''' command (as defined in the ''SCSI-3 Multimedia Commands'' specification), used for sending a binary-format cue sheet describing the disc layout to the drive before writing starts in SAO (Session-At-Once) write mode. The drive writes to the disc, using the cue sheet information to generate the P and Q subchannel data, and to retrieve the format and block size of the data transferred with the '''WRITE''' command.<ref>Text adapted from [http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=42485&view=findpost&p=374579 a post by Martin H].</ref>

The DAO and CDRWIN software was developed for use on MS-DOS and early Windows systems, when it was common to refer to types of files by their file name extensions, in all-caps: TXT for text, DOC for Word document, and so on. Early references to cue sheets likewise referred to ''CUE files''. This convention continues to the present day, but the ''cue'' in the term ''cue sheet'' is not an acronym and need not be capitalized.

== Cue sheet commands ==
The following commands are detailed in the Appendix A of the [http://web.archive.org/web/20070221154246/http://www.goldenhawk.com/download/cdrwin.pdf CDRWIN User's Guide]:
* CATALOG – A 13-digit UPC/EAN code, also referred to as the Media Catalog Number (MCN). 12-digit UPC codes should be prefixed with a "0".
* CDTEXTFILE – A path to a file containing CD-Text info.
* FILE – A path to a file containing audio data, and to which subsequent commands apply.
* FLAGS – Per-track subcode flag(s):
** DCP - Digital copy permitted.
** 4CH - Four channel audio.
** PRE - Pre-emphasis enabled (audio tracks only).
** SCMS - Serial Copy Management System (not supported by all recorders).
* INDEX – Per-track index(es).
* ISRC – Per-track ISRC(s).
* PERFORMER – Per-disc or per-track performer name for CD-Text data.
* POSTGAP – Amount of post-track silence to add.
* PREGAP – Amount of pre-track silence to add.
* REM – A remark/comment to be ignored.
* SONGWRITER – Per-disc or per-track songwriter name for CD-Text data.
* TITLE – Per-disc or per-track title for CD-Text data.
* TRACK – Type of track to create, and to which subsequent commands apply.
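The padding rule given for CATALOG can be sketched in Python. This is only an illustration: <code>normalize_catalog</code> is a hypothetical helper name, the UPC used is an arbitrary sample value, and the function checks nothing beyond length and digits.

```python
def normalize_catalog(code):
    """Return a 13-digit CATALOG value; 12-digit UPC codes get a leading "0".

    Hypothetical helper for illustration; it does not verify the check digit.
    """
    digits = code.strip()
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("CATALOG must contain digits only")
    if len(digits) == 12:
        digits = "0" + digits  # 12-digit UPC -> 13-digit form
    if len(digits) != 13:
        raise ValueError("CATALOG must be 12 or 13 digits long")
    return digits


print("CATALOG", normalize_catalog("036000291452"))
```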

=== Most often used ===
;FILE
:The FILE command specifies the file that the cue sheet is currently referencing. Valid file types are WAVE, MP3, AIFF, BINARY and MOTOROLA.
:As a commonly accepted extension, many lossless formats, such as [[WavPack]] or [[FLAC]], can also be used under the WAVE file type. Lossy formats like [[Opus]] may also work, but support for this is less consistent across cue-aware players.
;INDEX
:A number between 00 and 99. Index points are specified in MM:SS:FF format, and are relative to the start of the file currently referenced. MM is the number of minutes, SS the number of seconds, and FF the number of frames (there are 75 frames to one second). INDEX 01 commands specify the beginning of a new track. INDEX 00 commands specify the pre-gap of a track; you may notice your [[Compact Disc Digital Audio|Audio CD]] player count up from a negative value before beginning a new track - this is the period between INDEX 00 and INDEX 01.
;PERFORMER
:At top-level this will specify the CD artist, while at track-level it specifies the track artist.
;PREGAP
:Used to specify the length of a track pre-gap, in MM:SS:FF format. Although the SCSI specs reserve the term ''pre-gap'' for the pause before a data track, in a cue sheet the PREGAP command can be used to create a pause before any kind of track, data or audio.
;REM
:Used to record comments in a cue sheet. This command is often used to store metadata beyond TITLE and PERFORMER, e.g. the date or genre of the disc.
:The following REM comments can be written to a disc's CD-Text section and read by an application such as [[cdrdao]] or ImgBurn:
:;REM UPC
::The "UPC" is not necessarily the same as "CATALOG", and can be 12 or 13 digits in length. See [[Wikipedia:Universal Product Code]].
:;REM DISCID
::Although programs such as [[Exact Audio Copy]] use this to store the disc's CDDB1 value, other programs can extract the disc's true Disc ID, which is usually the disc's label-specific catalog number (see the example TOC file on the [[cdrdao]] page and the "DISC_ID" field for an example).
;TITLE
:At top-level this will specify the album name, while at track-level it specifies the track name.
;TRACK
:A number between 01 and 99, indicating the track number.
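The MM:SS:FF format described under INDEX can be converted to and from a plain frame count. The following is a minimal Python sketch; the function names are illustrative and not part of any cue sheet tool.

```python
FRAMES_PER_SECOND = 75  # one CD frame is 1/75 of a second


def msf_to_frames(msf):
    """Convert an MM:SS:FF string to a total number of frames."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74: " + msf)
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total_frames):
    """Convert a total number of frames back to an MM:SS:FF string."""
    total_seconds, frames = divmod(total_frames, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


def msf_to_seconds(msf):
    """Convert an MM:SS:FF string to seconds as a float."""
    return msf_to_frames(msf) / FRAMES_PER_SECOND


print(msf_to_frames("01:00:00"))
print(frames_to_msf(msf_to_frames("54:12:40")))
```

Working in whole frames avoids rounding errors that would creep in if positions were stored as fractional seconds.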

=== Quotation marks ===
The use of quotation marks around strings for PERFORMER, TITLE, etc., is standard practice; however, some programs, such as ImgBurn, do not require them.<ref>[https://forum.imgburn.com/index.php?/topic/23743-double-quoatation-marks-in-track-title/ double-quotation marks in track title]</ref>

Omitting the surrounding quotation marks allows quotation marks to be used within the string itself. For example:

<pre>
TRACK 01 AUDIO
TITLE Theme Of "Rome"
PERFORMER Danger Mouse & Daniele Luppi
ISRC GBAYE1001378
INDEX 01 00:00:00
</pre>

This does not, however, work for strings that need to display quotation marks at the beginning of the string, as ImgBurn only parses the text contained ''within'' the quotation marks.

=== Whitespace ===
Line breaks must be used between commands. Spaces or tabs can be used to indent; they're ignored but can make the file easier to understand when viewing or manually editing. Customarily, for audio CDs, all the commands which apply to a particular file are indented under the FILE command, and those which apply to a specific track are further indented under the TRACK command.
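The line-based layout described above makes a basic reader straightforward to sketch. The Python below is a simplified illustration, not a complete or specification-grade parser: <code>parse_cue</code> and <code>unquote</code> are hypothetical names, only a handful of commands are handled, and indentation is discarded as whitespace.

```python
def unquote(value):
    """Strip one pair of surrounding double quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def parse_cue(text):
    """Return (disc, tracks) from cue sheet text; handles a few commands only."""
    disc, tracks, current_file = {}, [], None
    for raw_line in text.splitlines():  # one command per line
        line = raw_line.strip()         # spaces/tabs are only indentation
        if not line:
            continue
        parts = line.split(None, 1)
        command = parts[0].upper()
        rest = parts[1] if len(parts) > 1 else ""
        if command == "FILE":
            name, _file_type = rest.rsplit(None, 1)
            current_file = unquote(name)
        elif command == "TRACK":
            number, track_type = rest.split()
            tracks.append({"number": int(number), "type": track_type,
                           "file": current_file, "indexes": {}})
        elif command == "INDEX":
            number, position = rest.split()
            tracks[-1]["indexes"][int(number)] = position
        elif command in ("TITLE", "PERFORMER"):
            # Before the first TRACK these apply to the disc, later to the track.
            target = tracks[-1] if tracks else disc
            target[command.lower()] = unquote(rest)
    return disc, tracks


SAMPLE = """\
PERFORMER "My Bloody Valentine"
TITLE "Loveless"
FILE "My Bloody Valentine - Loveless.wav" WAVE
  TRACK 01 AUDIO
    TITLE "Only Shallow"
    INDEX 01 00:00:00
  TRACK 02 AUDIO
\t\tTITLE "Loomer"
\t\tINDEX 01 04:17:52
"""

disc, tracks = parse_cue(SAMPLE)
for track in tracks:
    print(track["number"], track["title"], track["indexes"][1])
```

Because <code>unquote</code> only removes quotes that wrap the whole value, an unquoted title containing inner quotation marks is kept unchanged. All other commands, including REM, are silently skipped in this sketch.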

=== Examples ===
==== A standard single-file cue sheet ====
<pre>
REM GENRE Alternative
REM DATE 1991
REM DISCID 860B640B
REM COMMENT "ExactAudioCopy v0.95b4"
PERFORMER "My Bloody Valentine"
TITLE "Loveless"
FILE "My Bloody Valentine - Loveless.wav" WAVE
TRACK 01 AUDIO
TITLE "Only Shallow"
PERFORMER "My Bloody Valentine"
INDEX 01 00:00:00
TRACK 02 AUDIO
TITLE "Loomer"
PERFORMER "My Bloody Valentine"
INDEX 01 04:17:52
</pre>
The cue sheet above, created by [[EAC]], shows the first two tracks of a standard single file cue sheet. Note the use of REM commands to record additional [[metadata]], in the format '''REM <TAG> "<value>"'''. The '''PERFORMER''' and '''TITLE''' commands at the top of the cue sheet detail the [[Compact Disc Digital Audio|CD]] artist and album name respectively. The '''PERFORMER''' and '''TITLE''' commands at track-level specify the track artist and title.

TRACK 02's INDEX 01 entry does not state that the track is 4m 17.693s long, but that the beginning of the track is 4m 17.693s into the file (so TRACK 01 was in fact 4m 17.693s long). If TRACK 02 was 3m long exactly, TRACK 03's INDEX 01 value would be 07:17:52.
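The arithmetic in the paragraph above can be checked with a short Python sketch. The helper names are illustrative; the only fact relied on is that there are 75 frames per second.

```python
FRAMES_PER_SECOND = 75


def to_frames(msf):
    """MM:SS:FF string -> total frames."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def to_msf(total_frames):
    """Total frames -> MM:SS:FF string."""
    total_seconds, frames = divmod(total_frames, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


track1_start = to_frames("00:00:00")
track2_start = to_frames("04:17:52")

# Length of TRACK 01 is the distance between the two INDEX 01 points.
track1_length = track2_start - track1_start
print(round(track1_length / FRAMES_PER_SECOND, 3), "seconds")

# If TRACK 02 lasts exactly three minutes, TRACK 03 starts here:
track3_start = track2_start + to_frames("03:00:00")
print(to_msf(track3_start))  # 07:17:52
```

The length of the last track cannot be derived this way, since the cue sheet does not record where the file ends.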

Also note the file reference specifying a relative path to the file (references can also be absolute) and the file type: [[WAV|WAVE]].

==== A single-file cue sheet with a TRACK 01 INDEX 00 hidden track ====
<pre>
PERFORMER "Bloc Party"
TITLE "Silent Alarm"
FILE "Bloc Party - Silent Alarm.flac" WAVE
TRACK 01 AUDIO
TITLE "Like Eating Glass"
PERFORMER "Bloc Party"
INDEX 00 00:00:00
INDEX 01 03:22:70
TRACK 02 AUDIO
TITLE "Helicopter"
PERFORMER "Bloc Party"
INDEX 00 07:42:69
INDEX 01 07:44:69
</pre>

The cue sheet above shows the first two tracks of a single file cue sheet for a disc with a hidden track at the start. Note that TRACK 01 INDEX 01 starts at 03:22:70 (3m 22.933s) instead of 00:00:00 as in the first example, and most cue sheets. The INDEX 00 index on TRACK 02 displays the more usual behaviour, being two seconds before INDEX 01.

As the INDEX 00 is on TRACK 01 you will not normally see the usual countdown from a negative value that you might see from an INDEX 00 command on a subsequent track. To listen to this track on a [[Compact Disc Digital Audio|Audio CD]] player you will need to start the disc playing and press rewind, to rewind, essentially, from 3m 22s into the disc back to the true beginning.

Also note that the file referenced is [[FLAC]], but the [[WAV|WAVE]] file type is used. For [[MP3]] files the file type "'''[[MP3]]'''" should be used, for [[AIFF]] you should use "'''[[AIFF]]'''", but for all other types "'''[[WAV|WAVE]]'''" is used.

==== Multiple files with corrected gaps ====
<pre>
FILE "The Specials - Singles - 01 - Gangsters.wav" WAVE
TRACK 01 AUDIO
TITLE "Gangsters"
PERFORMER "The Specials"
INDEX 01 00:00:00
FILE "The Specials - Singles - 02 - Rudi, A Message To You.wav" WAVE
TRACK 02 AUDIO
TITLE "Rudi, A Message To You"
PERFORMER "The Specials"
INDEX 00 00:00:00
INDEX 01 00:00:28
</pre>
This multiple file cue sheet, created by [[EAC]], has gaps prepended to the next track. This method allows users to retain gaps, but by prepending the gap to the next track each track may begin with silence, which makes playback less satisfactory. This is a very uncommon way to rip CDs, even though it is more in line with the disc's actual track layout.

==== Multiple files with gaps left out ====
<pre>
FILE "The Specials - Singles - 01 - Gangsters.wav" WAVE
TRACK 01 AUDIO
TITLE "Gangsters"
PERFORMER "The Specials"
INDEX 01 00:00:00
FILE "The Specials - Singles - 02 - Rudi, A Message To You.wav" WAVE
TRACK 02 AUDIO
TITLE "Rudi, A Message To You"
PERFORMER "The Specials"
PREGAP 00:00:28
INDEX 01 00:00:00
</pre>
This multiple file cue sheet, created by [[EAC]], has removed the gaps, but artificially recreates silence between tracks using the PREGAP command. This is fine if the gap was silence, but unsatisfactory if it contained audio.

==== Multiple files with gaps (Noncompliant) ====
<pre>
FILE "The Specials - Singles - 01 - Gangsters.wav" WAVE
TRACK 01 AUDIO
TITLE "Gangsters"
PERFORMER "The Specials"
INDEX 01 00:00:00
TRACK 02 AUDIO
TITLE "Rudi, A Message To You"
PERFORMER "The Specials"
INDEX 00 02:47:74
FILE "The Specials - Singles - 02 - Rudi, A Message To You.wav" WAVE
INDEX 01 00:00:00
</pre>
This multiple-file cue sheet, created by [[EAC]], has gaps appended to the previous track, and is a favourite among users who rip to track files but wish to retain gap information. This format allows the user to retain gaps, but in a position in the track file that does not hinder playback. Unfortunately, this format is non-compliant; this type of rip, despite its popularity, was not supported by the original DAO and CDRWIN software for which cue sheets were designed. Applications that adhere to the cue sheet specification, like [[foobar2000]], will not be able to read it. Of course, [[EAC]] will read these cue sheets, as will the [[Compact Disc|CD]] burning application [[Burrrn]].

Note that INDEX 00 of TRACK 02 is set while still referencing the first FILE.

==== Single file version of the cue sheet used above ====
<pre>
FILE "The Specials - Singles.wav" WAVE
TRACK 01 AUDIO
TITLE "Gangsters"
PERFORMER "The Specials"
INDEX 01 00:00:00
TRACK 02 AUDIO
TITLE "Rudi, A Message To You"
PERFORMER "The Specials"
INDEX 00 02:47:74
INDEX 01 02:48:27
</pre>
For reference, this is the single-file equivalent of the cue sheets used in the multiple-file examples above.
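The gap before TRACK 02 is the distance between its INDEX 00 and INDEX 01 points, which is the same value that appears as PREGAP 00:00:28 in the "gaps left out" example. A minimal Python sketch of that calculation, using a hypothetical <code>gap_to_pregap</code> helper:

```python
FRAMES_PER_SECOND = 75


def to_frames(msf):
    """MM:SS:FF string -> total frames."""
    minutes, seconds, frames = (int(part) for part in msf.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def gap_to_pregap(index00, index01):
    """Return the gap between INDEX 00 and INDEX 01 as an MM:SS:FF string."""
    gap = to_frames(index01) - to_frames(index00)
    if gap < 0:
        raise ValueError("INDEX 01 must not precede INDEX 00")
    total_seconds, frames = divmod(gap, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


print("PREGAP", gap_to_pregap("02:47:74", "02:48:27"))  # PREGAP 00:00:28
```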

== Example cue sheet ==
<pre>
REM GENRE Ska
REM DATE 1991
REM DISCID D00DA810
REM COMMENT "ExactAudioCopy v0.95b4"
PERFORMER "The Specials"
TITLE "Singles"
FILE "The Specials - Singles.wav" WAVE
TRACK 01 AUDIO
TITLE "Gangsters"
PERFORMER "The Specials"
INDEX 01 00:00:00
TRACK 02 AUDIO
TITLE "Rudi, A Message To You"
PERFORMER "The Specials"
INDEX 00 02:47:74
INDEX 01 02:48:27
TRACK 03 AUDIO
TITLE "Nite Klub"
PERFORMER "The Specials"
INDEX 00 05:41:50
INDEX 01 05:42:27
TRACK 04 AUDIO
TITLE "Too Much Too Young"
PERFORMER "The Specials"
INDEX 00 08:53:47
INDEX 01 08:54:37
TRACK 05 AUDIO
TITLE "Guns Of Navarone"
PERFORMER "The Specials"
INDEX 00 10:59:20
INDEX 01 11:00:17
TRACK 06 AUDIO
TITLE "Rat Race"
PERFORMER "The Specials"
INDEX 00 13:20:55
INDEX 01 13:20:67
TRACK 07 AUDIO
TITLE "Stereotype"
PERFORMER "The Specials"
INDEX 00 16:29:67
INDEX 01 16:30:30
TRACK 08 AUDIO
TITLE "International Jet Set"
PERFORMER "The Specials"
INDEX 00 20:19:27
INDEX 01 20:20:20
TRACK 09 AUDIO
TITLE "Do Nothing"
PERFORMER "The Specials"
INDEX 00 24:30:70
INDEX 01 24:32:27
TRACK 10 AUDIO
TITLE "Ghost Town"
PERFORMER "The Specials"
INDEX 00 28:23:30
INDEX 01 28:23:42
TRACK 11 AUDIO
TITLE "Why?"
PERFORMER "The Specials"
INDEX 00 34:21:37
INDEX 01 34:21:47
TRACK 12 AUDIO
TITLE "Friday Night, Saturday Morning"
PERFORMER "The Specials"
INDEX 00 38:16:50
INDEX 01 38:16:55
TRACK 13 AUDIO
TITLE "War Crimes"
PERFORMER "The Specials"
INDEX 00 41:50:07
INDEX 01 41:51:00
TRACK 14 AUDIO
TITLE "Racist Friend"
PERFORMER "The Specials"
INDEX 00 45:50:55
INDEX 01 45:51:72
TRACK 15 AUDIO
TITLE "Nelson Mandela"
PERFORMER "The Specials"
INDEX 00 49:35:55
INDEX 01 49:38:22
TRACK 16 AUDIO
TITLE "(What I Like Most About You Is Your) Girlfriend"
PERFORMER "The Specials"
INDEX 00 54:11:00
INDEX 01 54:12:40
</pre>

== Useful applications ==
=== Playing ===
* [[foobar2000]]

=== Splitting ===
* [[ACDIR]]: http://nyaochi.sakura.ne.jp/xoops/modules/mysoftwares/tc_2.html
* CUE Splitter: http://www.enfis.it/downloads.php?cat_id=1
* CueProc: http://nyaochi.sakura.ne.jp/xoops/modules/mysoftwares/tc_6.html (that domain appears to be lost, but https://github.com/rinrinne/cueproc-alternative looks like a derivative)
* [[CueTools]]: http://www.hydrogenaudio.org/forums/index.php?showtopic=41476
* [[foobar2000]]: http://www.foobar2000.org/
* mp3DirectCut: https://mpesch3.de/
* mp3splt: https://github.com/mp3splt/mp3splt
* pcutmp3: http://www.hydrogenaudio.org/forums/index.php?showtopic=35654
* [[shntool]]: http://shnutils.freeshell.org/shntool/
* WavSplit: http://tangerine.uw.hu/prog/

=== Joining ===
* CueMake: http://www.synthetic-soul.co.uk/files/cuemake/
* [[CueTools]]: http://www.hydrogenaudio.org/forums/index.php?showtopic=41476
* [[foobar2000]]: http://www.foobar2000.org/
* [[shntool]]: http://shnutils.freeshell.org/shntool/
* [[XRECODE]]: https://xrecode.com/

=== Creating ===
* CD Wave: http://www.milosoftware.com/cdwave/ - only reads .wav, .flac, .ape, .w64
* CUEgenerator: http://cuegenerator.net/ (online web app)
* CueMaster: http://cuemaster.org/
* [[CueTools]]: http://www.hydrogenaudio.org/forums/index.php?showtopic=41476
* [[foobar2000]]: http://www.foobar2000.org/
* [[Goldwave]]: http://www.goldwave.com/
* imgburn: https://www.imgburn.com/ - (Tools menu -> Create CUE File)
* [[shntool]]: http://shnutils.freeshell.org/shntool/
* Wave Repair: https://www.delback.co.uk/wavrep/ - only reads in .WAV files

== References ==
<references/>

== See also ==
* [[Gap settings]]
* [[EAC CUE Sheets]]

== External links ==
* [https://github.com/libyal/libodraw/blob/main/documentation/CUE%20sheet%20format.asciidoc libodraw cue sheet documentation]
* [http://web.archive.org/web/20070614044112/http://www.goldenhawk.com/download/cdrwin.pdf CDRWIN 3.8 Users Manual.book - cdrwin.pdf] via archive.org - Cue sheet commands are listed under Appendix A.
* [http://web.archive.org/web/20070217191217/http://www.goldenhawk.com/ goldenhawk.com] via archive.org
* {{wikipedia|Cue sheet (computing)}}

[[Category:CD ripping]]

Other hardware

2024-08-06T03:57:18Z

Artoria2e5: /* Headphones */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity (and, to an extent, bitrate) determines battery life, but because the codec does less work, it is not unusual to see bitrate/quality trade-offs far worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LLDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrate used for CD-rate (16-bit, 44.1 kHz) audio is 229 kbps for middle quality and 328 kbps for high quality.
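The two figures can be reproduced from the frame-length and bit-rate formulas in the A2DP specification. The Python sketch below assumes the specification's recommended joint-stereo settings for 44.1 kHz (8 subbands, 16 blocks, bitpool 35 for middle quality and 53 for high quality); these parameter values are not stated on this page, and the function names are illustrative.

```python
import math


def sbc_frame_length(subbands, blocks, bitpool, mode):
    """Frame length in bytes; mode is "mono", "dual", "stereo" or "joint"."""
    channels = 1 if mode == "mono" else 2
    header_bytes = 4 + (4 * subbands * channels) // 8  # header + scale factors
    if mode in ("mono", "dual"):
        payload_bits = blocks * channels * bitpool
    else:
        join_bits = subbands if mode == "joint" else 0  # joint-stereo flags
        payload_bits = join_bits + blocks * bitpool
    return header_bytes + math.ceil(payload_bits / 8)


def sbc_bitrate(sample_rate, subbands, blocks, bitpool, mode):
    """Bit rate in bits per second for the given SBC settings."""
    frame_length = sbc_frame_length(subbands, blocks, bitpool, mode)
    return 8 * frame_length * sample_rate / (subbands * blocks)


for label, bitpool in (("middle", 35), ("high", 53)):
    rate = sbc_bitrate(44100, 8, 16, bitpool, "joint")
    print(label, round(rate / 1000), "kbps")
```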

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally 16 kHz SBC as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

=== Operating systems ===

Bluetooth codec support is not great on Windows. Windows 10 has SBC and AptX. Windows 11 introduces AAC as a better-than-SBC fallback on non-Qualcomm devices; it also introduces LE Audio support. There is no support for any codec variant such as AptX-HD. You can use a [https://www.bluetoothgoodies.com/a2dp/ third-party driver] ($5.99 trialware) that does A2DP its own way to get the better codecs.

With PipeWire and BlueZ 5, Linux has support for many codecs: AptX (including LL and HD), LDAC, FastStream, both versions of Opus, LC3plus, and SBC XQ. The modern, PipeWire-based Linux audio stack can have very low software latency when combined with linux-rt or rtkit. Combined with a low-latency codec, one ''should'' receive a relatively low-latency experience.

macOS and iOS are known to support AAC. SBC support is presumed to exist. There [https://gist.github.com/dvf/3771e58085568559c429d05ccc339219 used to be] support for AptX.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sits in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Supra-aural and circumaural headphones have the sound reshaped by your auricle. The other two styles feed sound more or less directly into the ear canal. This difference is important for deciding on what kind of HRTF to use.

Terminology used for comparing headphones
* Noise Cancelling
** active: the headphone samples outside noise in real time and emits the opposite soundwave (for some frequencies).
** passive: would be more "blocking" than "cancelling" -- think about an earmuff.
* Frequency response - The range of frequencies the headphones can reproduce
* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.
* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.
* Neodymium - The magnet of choice for high end headphones.

Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Other hardware

2024-08-06T03:48:43Z

Artoria2e5: /* Operating systems */ ~

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity (and, to an extent, bitrate) determines battery life, but because the codec does less work, it is not unusual to see bitrate/quality trade-offs far worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LLDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrate used for CD-rate (16-bit, 44.1 kHz) audio is 229 kbps for middle quality and 328 kbps for high quality.

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally 16 kHz SBC as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

=== Operating systems ===

Bluetooth codec support is not great on Windows. Windows 10 has SBC and AptX. Windows 11 introduces AAC as a better-than-SBC fallback on non-Qualcomm devices; it also introduces LE Audio support. There is no support for any codec variant such as AptX-HD. You can use a [https://www.bluetoothgoodies.com/a2dp/ third-party driver] ($5.99 trialware) that does A2DP its own way to get the better codecs.

With PipeWire and BlueZ 5, Linux has support for many codecs: AptX (including LL and HD), LDAC, FastStream, both versions of Opus, LC3plus, and SBC XQ. The modern, PipeWire-based Linux audio stack can have very low software latency when combined with linux-rt or rtkit. Combined with a low-latency codec, one ''should'' receive a relatively low-latency experience.

macOS and iOS are known to support AAC. SBC support is presumed to exist. There [https://gist.github.com/dvf/3771e58085568559c429d05ccc339219 used to be] support for AptX.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sits in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Other hardware

2024-08-06T03:48:05Z

Artoria2e5: /* Operating systems */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity (and, to an extent, bitrate) determines battery life, but because the codec does less work, it is not unusual to see bitrate/quality trade-offs far worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LLDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum limit of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrates used for CD-rate (16-bit, 44.1 kHz) audio are 229 kbps for middle quality and 328 kbps for high quality.
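
As a rough check on these figures, the sketch below applies the SBC frame-length and bit-rate formulas given in the A2DP specification. The parameters used in the test values (8 subbands, 16 blocks, joint stereo, bitpool 35 and 53) are assumptions taken from the specification's recommended middle- and high-quality settings for 44.1 kHz, not from this page, and the function names are made up for illustration.

```python
import math


def sbc_frame_length(subbands, blocks, bitpool, channel_mode):
    """Return the SBC frame length in bytes.

    channel_mode is one of "mono", "dual", "stereo" or "joint"
    (labels chosen for this sketch).
    """
    channels = 1 if channel_mode == "mono" else 2
    # 4 header bytes plus 4 bits of scale factor per subband per channel.
    header_bytes = 4 + (4 * subbands * channels) // 8
    if channel_mode in ("mono", "dual"):
        payload_bits = blocks * channels * bitpool
    else:
        # Joint stereo spends one extra bit per subband on the join flags.
        join = 1 if channel_mode == "joint" else 0
        payload_bits = join * subbands + blocks * bitpool
    return header_bytes + math.ceil(payload_bits / 8)


def sbc_bitrate(sample_rate, subbands, blocks, bitpool, channel_mode):
    """Return the SBC bit rate in bits per second."""
    frame_bytes = sbc_frame_length(subbands, blocks, bitpool, channel_mode)
    # Each frame carries subbands * blocks samples per channel.
    return 8 * frame_bytes * sample_rate / (subbands * blocks)
```

With those assumed settings, `sbc_bitrate(44100, 8, 16, 35, "joint")` and `sbc_bitrate(44100, 8, 16, 53, "joint")` round to the 229 kbps and 328 kbps quoted above.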

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally SBC at 16 kHz as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose Bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

=== Operating systems ===

Bluetooth codec support is not great on Windows. Windows 10 has SBC and AptX. Windows 11 introduces AAC as a better-than-SBC fallback on non-Qualcomm devices; it also introduces LE Audio support. There is no support for any codec variant such as AptX-HD. You can use a [https://www.bluetoothgoodies.com/a2dp/ third-party driver] ($5.99 trialware) that does A2DP its own way to get the better codecs.

With PipeWire and BlueZ 5, Linux has support for many codecs: AptX (including LL and HD), LDAC, FastStream, both versions of Opus, LC3plus, and SBC XQ. The modern, PipeWire-based Linux audio stack can have very low software latency when combined with RTKit. Combined with a low-latency codec, one ''should'' receive a relatively low-latency experience.

macOS and iOS are known to support AAC. SBC support is presumed to exist. There [https://gist.github.com/dvf/3771e58085568559c429d05ccc339219 used to be] support for AptX.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sit in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Other hardware

2024-08-06T03:46:53Z

Artoria2e5: /* Bluetooth */ add /* Operating systems */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity (bitrate too, to an extent) decides battery life, but with the codec doing less work, it's not unusual to see bitrate/quality trade-offs way worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at Bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LHDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum limit of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrates used for CD-rate (16-bit, 44.1 kHz) audio are 229 kbps for middle quality and 328 kbps for high quality.

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally SBC at 16 kHz as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose Bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

=== Operating systems ===

Bluetooth codec support is not great on Windows. Windows 10 has SBC and AptX. Windows 11 introduces AAC as a better-than-SBC fallback on non-Qualcomm devices; it also introduces LE Audio support. There is no support for any codec variant such as AptX-HD. You can use a [https://www.bluetoothgoodies.com/a2dp/ third-party driver] that does A2DP its own way to get the better codecs.

With PipeWire and BlueZ 5, Linux has support for many codecs: AptX (including LL and HD), LDAC, FastStream, both versions of Opus, LC3plus, and SBC XQ. The modern, PipeWire-based Linux audio stack can have very low software latency when combined with RTKit. Combined with a low-latency codec, one ''should'' receive a relatively low-latency experience.

macOS and iOS are known to support AAC. SBC support is presumed to exist. There [https://gist.github.com/dvf/3771e58085568559c429d05ccc339219 used to be] support for AptX.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sit in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

BS.1387

2023-08-25T00:07:08Z

Artoria2e5: /* Other objective metrics */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both an FFT and a filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by the ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.
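
The conversion described above can be sketched in a few lines of Python. The function name is hypothetical, and clamping the ODG to the nominal range of -4 (very annoying) to 0 (imperceptible) is an assumption of this sketch rather than something EAQUAL does itself.

```python
def odg_to_score(odg):
    """Approximate a 1-5 subjective score from a PEAQ ODG value.

    Assumes the reference sample is scored as a perfect 5, so the
    approximate score is simply 5 + ODG.
    """
    # Assumption: clamp to the nominal ODG range of -4.0 .. 0.0 so the
    # result always stays within the 1-5 scale.
    clamped = max(-4.0, min(0.0, odg))
    return 5.0 + clamped
```

For instance, an ODG of -1.5 maps to an approximate score of 3.5.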

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, since the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] - Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] - zip archive of the utility used to perform EAQUAL tests, provided by Rarewares
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] - GitHub fork that adds macOS support

== GstPEAQ ==
[https://github.com/HSU-ANT/gstpeaq GstPEAQ] is an implementation of PEAQ, ''both'' basic and advanced, in GStreamer. In addition to the ODG, it also outputs the distortion index (DI), which is not clipped at extremes and not fitted to score anchors. On the HA multiformat dataset:
* The advanced model gives a correlation improvement of ~0.2 over basic;
* DI is slightly better at predicting subjective scores than ODG, with a correlation improvement of ~0.03.

== Comparison with subjective listening tests ==
* [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 EAqual results for the AAC@128v2 listening test] - fair Pearson correlation (0.699) among higher-quality samples: all AAC
* [https://hydrogenaud.io/index.php/topic,124607.msg1031323.html GstPEAQ: PEAQ done right, allegedly || Multiformat correlation] - great Pearson correlation (0.924, DI Adv): samples of three quality groups

HA comparisons between PEAQ and human raters remain inconclusive. PEAQ is considered useful for an approximation of human senses in codec development and research, but concrete results still need human participation.
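
For readers who want to reproduce this kind of comparison, the Pearson correlation between a list of objective scores and a list of listening-test scores can be computed as in the sketch below. The function name is made up, and this is not the script used in the linked threads.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold one metric value per sample (for example ODG or DI) and `ys` the matching mean listener scores. A value near 1 means the metric ranks samples much as listeners do.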

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so there's gonna be academic code smell.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works at 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.
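
The clip-splitting idea mentioned for VISQOL could start from something like the following sketch, which cuts a WAV file into fixed-length clips using Python's standard `wave` module. It only does the splitting; it does not invoke VISQOL. The function name, the output naming scheme and the choice of clip length are all assumptions.

```python
import wave


def split_wav(path, clip_seconds, out_prefix):
    """Split a PCM WAV file into consecutive clips and return their paths.

    The last clip may be shorter than clip_seconds.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_clip = int(clip_seconds * src.getframerate())
        if frames_per_clip <= 0:
            raise ValueError("clip_seconds is too short for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_clip)
            if not frames:
                break
            # Hypothetical naming scheme: prefix_0000.wav, prefix_0001.wav, ...
            out_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy the source format; the frame count in the header is
                # corrected automatically when the clip is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

Each resulting clip, paired with the matching clip cut from the reference file, could then be scored separately to see how quality varies over time.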

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs (they make more sense for analogue systems like DACs):
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

Other hardware

2023-08-25T00:04:24Z

Artoria2e5: /* Bluetooth */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity (bitrate too, to an extent) decides battery life, but with the codec doing less work, it's not unusual to see bitrate/quality trade-offs way worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at Bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LHDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum limit of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrates used for CD-rate (16-bit, 44.1 kHz) audio are 229 kbps for middle quality and 328 kbps for high quality.

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally SBC at 16 kHz as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose Bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sit in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Other hardware

2023-08-25T00:03:32Z

Artoria2e5: /* LC3 */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity decides battery life, but with the codec doing less work, it's not unusual to see bitrate/quality trade-offs way worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at Bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LHDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum limit of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrates used for CD-rate (16-bit, 44.1 kHz) audio are 229 kbps for middle quality and 328 kbps for high quality.

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally SBC at 16 kHz as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose Bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. When given more bits, such as with the 96*2 and 124*2 standard profiles, it can be very acceptable. Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sit in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Other hardware

2023-08-25T00:01:14Z

Artoria2e5: /* LC3 */

{{stub}}

== Bluetooth ==

Bluetooth has two unusual requirements: codec complexity and latency. Complexity decides battery life, but with the codec doing less work, it's not unusual to see bitrate/quality trade-offs way worse than those of conventional codecs.

=== A2DP ===
Advanced Audio Distribution Profile (A2DP) is the traditional music profile.
*[http://www.hydrogenaudio.org/forums/index.php?showtopic=32634 forums]
*[http://en.wikipedia.org/wiki/A2DP#Bluetooth_profiles wikipedia]
* [https://habr.com/en/articles/456182/ HABR]

Standard baseline codecs:
* Sub Band Codec (SBC) – okay, not great

Standard optional codecs:
* MPEG-1,2 Audio: = MP3, almost never used
* AAC: relatively good decode support, transparency possible at Bluetooth bitrates, but latency can be an issue
* ATRAC: never used

Vendor extension codecs:
* AptX, LDAC, LHDC: in general lower complexity but higher bitrate (still within BT limits) than AAC, transparency possible; AptX and LDAC have wide support
* FastStream: Qualcomm's bidirectional SBC variant, providing a mono back channel
* Opus 0.5: PipeWire's [https://gitlab.freedesktop.org/pipewire/pipewire/-/blob/434cc6a90b6cdbaa31f013fea95786e1f5bf6d88/spa/plugins/bluez5/README-OPUS-A2DP.md little experiment] with nearly unlimited channels in either direction. Zero hardware support.
* Opus Android: The [https://hydrogenaud.io/index.php/topic,114566.msg1023511.html#msg1023511 Android 13] approach to adding Opus support. Stereo only, could be bidirectional? (check a2dp_vendor_opus_constants.h!) Supported by Pixel headphones.

Bitrate is limited by the Bluetooth link. For all relevant codecs, there is a maximum limit of 2 channels: this is a headphone protocol, not a speaker protocol. A microphone back channel is not available except in FastStream and AptX LL duplex.

==== Sub Band Codec ====
* Sampling Frequency - 16000, 32000, 44100, 48000 (some are optional, see references)
* Channel Mode - mono, dual channel, stereo, joint stereo

The bitrates used for CD-rate (16-bit, 44.1 kHz) audio are 229 kbps for middle quality and 328 kbps for high quality.

Reference: [https://www.bluetooth.org/foundry/adopters/document/A2DP_Spec_V1_0/en/ BLUETOOTH Advanced Audio Distribution Profile 1.0]

=== HSP/HFP ===
This is the mode with mono input and output, both at voice sample rates. Only very crude encodings (CVSD, PCM, and optionally SBC at 16 kHz as the higher-quality option) are supported, so do not expect good audio quality.

=== LE Audio ===
This is the new, rewritten general-purpose Bluetooth audio stack. This mode allows bidirectional audio of various channel configurations. Battery use is allegedly lower than A2DP.

==== LC3 ====
: ''See also: [[Wikipedia:LC3 (codec)]]''
[https://www.bluetooth.org/DocMan/handlers/DownloadDoc.ashx?doc_id=502107&vId=542963 LC3] is the baseline LE Audio codec. It uses some of the same ideas as Opus CELT, but comes with some new patented techniques and runs at extremely low complexity.

* It's [https://www.etsi.org/deliver/etsi_tr/103500_103599/103590/01.01.01_60/tr_103590v010101p.pdf allegedly] "better than Opus" with respect to error tolerance in speech environments. Except the test is done with speech samples in the non-speech, non-FEC mode of Opus (CELT), at minimum complexity (effort), and an old version at that.
* No [[joint stereo]] is available. The rationale given is that this allows True Wireless Stereo (TWS) to work with the audio source transmitting direct to each headphone, without needing an intermediate decode.
* LC3plus, LC3's "high-res" cousin available over A2DP, is [https://hydrogenaud.io/index.php/topic,121850.0.html worse than Apple and Android (FDK) AAC at 144 kbps]. The Japan Audio Society nevertheless thinks that's enough for a "Hi-Res AUDIO WIRELESS" sticker.

In any case, it's much better than SBC and retains the low complexity character. It is not the best use of the bandwidth available, but a lot of things can be excused with "low complexity". Google's open source [https://github.com/google/liblc3 liblc3] implementation is written to the word of the BT spec.

== Headphones ==
Some Styles (smallest to largest)
* In the ear - outputs directly into the ear canal
* Earbuds - sit in the outer ear
* Supra-aural - sits on the ear
* Circumaural - completely covers the ear

Terminology used for comparing headphones
* Noise Cancelling
** active
** passive

* Frequency response - The range of frequencies the headphones can reproduce

* Impedance - Doesn't mean much by itself, but in general, impedance should be matched across a system.

* SPL@1kHz, 1V rms - [http://en.wikipedia.org/wiki/Sound_pressure_level#SPL_in_audio_equipment Sound Pressure Level]. How efficiently the unit converts electrical energy to sound energy.

* Neodymium - The magnet of choice for high end headphones.
Popular Headphones
* Sub $30
* Sub $100
* $100 - $300
* $300+

Opus

2023-08-12T09:09:12Z

Artoria2e5: /* "Equivalent bitrate" */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) and designed for interactive real-time applications over the Internet,{{ref|homepage|a}} covering music as well as speech. It is also very competitive as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} with a high-quality reference implementation provided under the 3-clause BSD license{{ref|homepage|a}} that compiles and runs on the vast majority of general-purpose and embedded (fixed-point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more detail on the history and potential applications of Opus is included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''.

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially also useful in time-delayed audio streaming, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1, automatic speech/music detection and bandwidth detection were introduced to improve mode decisions, and VBR became less constrained, all with the aim of improving the quality/bitrate trade-off. These improvements are further enhanced in versions 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!Typical SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!4 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!9 kbps
|mono
|8 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may also be competitive with minimum latency mode.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency but directly increase the frame overhead, because more packets are transmitted per second. In addition, as described below, they worsen the quality/bitrate trade-off of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
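
As a rough illustration of why shorter frames cost more on a packet network, the sketch below estimates the header bitrate for each frame duration mentioned above. It assumes one Opus frame per packet and a 40-byte header per packet (20-byte IPv4 + 8-byte UDP + 12-byte RTP, without options or extensions). Both assumptions, and the function names, are illustrative only and are not taken from the Opus documentation; real transports may differ.

```python
# Assumed per-packet header size: IPv4 (20) + UDP (8) + RTP (12) bytes.
# This figure is an assumption for illustration, not something Opus defines.
ASSUMED_HEADER_BYTES = 40


def packets_per_second(frame_ms: float) -> float:
    """Packets sent per second, assuming one Opus frame per packet."""
    return 1000.0 / frame_ms


def header_overhead_kbps(frame_ms: float,
                         header_bytes: int = ASSUMED_HEADER_BYTES) -> float:
    """Bitrate spent on packet headers alone, in kbps."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000.0


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: "
              f"{header_overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions the header cost doubles each time the frame duration is halved, which is why the codec bitrates quoted in this article understate the total bandwidth used by low-latency configurations.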

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
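
The arithmetic behind the table can be checked with a short sketch. It takes the fractional increases from the table as given, adds the 2.5 ms overlap to each frame size to obtain the algorithmic delay, and scales the 64 kbps reference accordingly. The names used here are hypothetical, and the figures apply only to the 64 kbps reference point of the CELT demo; nothing here implies that the same percentages hold at other bitrates.

```python
BASE_KBPS = 64.0        # reference bitrate used in the table
CELT_OVERLAP_MS = 2.5   # MDCT window overlap added to every frame size

# Frame size in ms -> fractional bitrate increase, copied from the table.
INCREASE = {20.0: 0.0, 10.0: 0.10, 5.0: 0.325, 2.5: 0.75}


def table_row(frame_ms: float) -> tuple[float, float]:
    """Return (algorithmic delay in ms, matched bitrate in kbps)."""
    delay_ms = frame_ms + CELT_OVERLAP_MS
    matched_kbps = BASE_KBPS * (1.0 + INCREASE[frame_ms])
    return delay_ms, matched_kbps


if __name__ == "__main__":
    for frame_ms in INCREASE:
        delay_ms, kbps = table_row(frame_ms)
        print(f"{frame_ms:4.1f} ms frame -> {delay_ms:4.1f} ms delay, "
              f"{kbps:5.1f} kbps")
```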

==== "Equivalent bitrate" ====
The Opus code includes a [https://github.com/xiph/opus/blob/9fc8fc4cf432640f284113ba502ee027268b0d9f/src/opus_encoder.c#L806 {{code|compute_equiv_rate()}}] function. Given the bitrate, frame size, CBR decision, and complexity setting, it converts the bitrate to its equivalent under a standard configuration (VBR, 20 ms frame, complexity 10), which is then used for bandwidth, layer, and stereo decisions. The interesting bits are:

* CBR requires 8% more bitrate for the same quality.
* Frame overhead is fixed and modelled as <code>(40*channels+20)*(frame_rate - 50)</code> for any frame_rate larger than 50. (frame_rate is the number of frames per second, so <code>1000/frame_size_ms</code>). There's no modelling for reduction in overhead from larger-than-standard frames: you'd imagine the expression runs in the opposite direction as well.
* Complexity tuning results in a bitrate requirement up to 30% higher.
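
The frame-overhead term in the list above can be sketched as follows. This is a Python illustration of that one expression only, not a port of the C function: the CBR and complexity adjustments are left out, the function names are hypothetical, and it is assumed that the overhead is expressed in bits per second and subtracted from the requested bitrate.

```python
def frame_overhead_bps(channels: int, frame_size_ms: float) -> float:
    """Overhead modelled as (40*channels+20)*(frame_rate - 50).

    Applies only when there are more than 50 frames per second,
    i.e. for frames shorter than 20 ms; otherwise no overhead is modelled.
    """
    frame_rate = 1000.0 / frame_size_ms  # frames per second
    if frame_rate <= 50:
        return 0.0
    return (40 * channels + 20) * (frame_rate - 50)


def rate_after_frame_overhead(bitrate_bps: float, channels: int,
                              frame_size_ms: float) -> float:
    """Assumption: the overhead is subtracted from the requested bitrate."""
    return bitrate_bps - frame_overhead_bps(channels, frame_size_ms)


if __name__ == "__main__":
    for ms in (60, 20, 10, 5, 2.5):
        print(f"stereo, {ms} ms frames: "
              f"{frame_overhead_bps(2, ms):.0f} bps modelled overhead")
```

For example, under these assumptions a stereo stream with 10 ms frames is modelled as losing 5000 bps, so a 64 kbps request would be treated like roughly 59 kbps at the standard 20 ms frame size before the other adjustments are applied.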

This layer of conversion is why Opus runs wideband speech at 9 kbps VBR and CVBR, but with CBR it takes 10 kbps (now we know it's exactly 9.75 kbps) to use WB.

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012 when RFC 6716 was standardized, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 December 2012 for testing and user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, when it was considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music and speech, and above it CELT performs best for both. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of reference opus in Javascript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]
* [[loudgain]]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T09:08:22Z

Artoria2e5: /* "Equivalent bitrate" */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a High Pass Filter and emphasis of formants and harmonics to improve intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}
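
The speech table can be read as a simple lookup from target bitrate to the bandwidth and layer typically chosen. The following is a minimal Python sketch that only transcribes the rows above; the function name is hypothetical, and treating each row's bitrate as a lower bound is an assumption made for illustration. The real mode decision in libopus depends on content, encoder version and settings.

```python
# Rows transcribed from the speech table above: (minimum kbps, bandwidth, typical mode).
# Treating each bitrate as a lower bound is an assumption for illustration only.
# The 9 kbps row applies to VBR/CVBR; the table lists 10 kbps for CBR.
SPEECH_ROWS = [
    (6, "4 kHz narrow-band", "SILK"),
    (9, "8 kHz wide-band", "SILK"),
    (12, "12 kHz super-wideband", "hybrid"),
    (16, "20 kHz", "hybrid/CELT"),
    (24, "20 kHz", "hybrid/CELT"),
    (32, "20 kHz", "CELT"),
]

def indicative_speech_mode(kbps):
    """Return (bandwidth, mode) for the highest row not above kbps, or None below 6 kbps."""
    match = None
    for min_kbps, bandwidth, mode in SPEECH_ROWS:
        if kbps >= min_kbps:
            match = (bandwidth, mode)
    return match
```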

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate (great news for communication links), any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|4 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!9 kbps
|mono
|8 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless, minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
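
To get a feel for the scale of this overhead, the following Python sketch computes the header bitrate added on top of the codec bitrate. It assumes 40 bytes of headers per packet (20-byte IPv4 + 8-byte UDP + 12-byte RTP, without options or extensions) and one Opus frame per packet. Both are assumptions for illustration, and real deployments may differ.

```python
# Assumption: IPv4 (20) + UDP (8) + RTP (12) header bytes, one Opus frame per packet.
HEADER_BYTES = 40

def overhead_kbps(frame_ms, header_bytes=HEADER_BYTES):
    """Header bitrate in kbps for a given frame duration in milliseconds."""
    packets_per_second = 1000 / frame_ms
    return header_bytes * 8 * packets_per_second / 1000

for frame_ms in (60, 20, 10, 5, 2.5):
    print(f"{frame_ms} ms frames: {overhead_kbps(frame_ms):.1f} kbps of headers")
```

Under these assumptions, the default 20 ms frames carry 16 kbps of headers, which exceeds the payload of a low-bitrate SILK stream, while 60 ms frames bring this down to about 5.3 kbps.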

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
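
The figures in the table follow from two simple relations: the algorithmic delay is the frame size plus 2.5 ms, and the fractional increase is the matched bitrate divided by the 64 kbps baseline, minus one. A short Python check, using the matched bitrates from the table as input data (the names are illustrative only):

```python
OVERLAP_MS = 2.5       # additional delay stated in the text above
BASELINE_KBPS = 64.0   # bitrate at the default 20 ms frame size

# Frame size (ms) -> matched bitrate (kbps), copied from the table above.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}

def table_row(frame_ms):
    """Return (algorithmic delay in ms, bitrate increase in percent) for a frame size."""
    delay_ms = frame_ms + OVERLAP_MS
    increase_pct = (MATCHED_KBPS[frame_ms] / BASELINE_KBPS - 1) * 100
    return delay_ms, round(increase_pct, 1)
```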

==== "Equivalent bitrate" ====
Opus code includes a [https://github.com/xiph/opus/blob/9fc8fc4cf432640f284113ba502ee027268b0d9f/src/opus_encoder.c#L806 {{code|compute_equiv_rate()}}] function. Given the bitrate, frame size, CBR decision, and complexity setting, it converts the bitrate to a standard-configuration (VBR, 20 ms frame, complexity 10) equivalent to be used for bandwidth, layer, and stereo decisions. The interesting bits are:

* CBR requires 8% more bitrate for the same quality.
* Frame overhead is fixed and modelled as <code>(40*channels+20)*(frame_rate - 50);</code> for any frame_rate larger than 50. (frame_rate is the number of frames per second, so <code>1000/frame_size_ms</code>). There's no modelling for larger-than-standard frames.
* Lowering the complexity setting results in up to 30% more bitrate requirement.

This layer of conversion is why Opus runs wideband speech at 9 kbps VBR and CVBR, but with CBR it takes 10 kbps (now we know it's exactly 9.75 kbps) to use WB.
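
As a rough illustration of the first two adjustments, here is a simplified Python model. It is not a port of <code>compute_equiv_rate()</code>: the function name is hypothetical, the complexity adjustment is omitted, the roughly 8% CBR penalty is modelled as one twelfth of the rate (an assumption), and the order of the two steps is also assumed.

```python
def equivalent_rate_bps(bitrate_bps, channels, frame_ms, cbr):
    """Simplified sketch: convert a bitrate to a VBR, 20 ms frame equivalent."""
    frame_rate = 1000 / frame_ms  # frames per second
    equiv = float(bitrate_bps)
    # Frame overhead, applied only above the standard 50 frames per second.
    if frame_rate > 50:
        equiv -= (40 * channels + 20) * (frame_rate - 50)
    # CBR penalty of roughly 8%, modelled here as 1/12 (assumption).
    if cbr:
        equiv -= equiv / 12
    return equiv
```

Under this model, a 64 kbps stereo VBR stream with 10 ms frames is treated like a 59 kbps stream at the standard frame size.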

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, though development was mostly complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

CELT layer closely related to CELT 0.10 implements Constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, without look-ahead, is not perfect but is usually undecided in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 96 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
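
The double-extension workaround for Android can be scripted. The following is a minimal Python sketch, assuming the files already contain Ogg-encapsulated Opus and that copies (rather than renames) are wanted. The function names are hypothetical.

```python
from pathlib import Path
import shutil

def android_name(path):
    """Return the .opus.ogg name for a .opus file, or the path unchanged otherwise."""
    path = Path(path)
    if path.suffix.lower() == ".opus":
        return path.with_name(path.name + ".ogg")
    return path

def copy_for_android(directory):
    """Copy every *.opus file in directory to a sibling *.opus.ogg file."""
    copied = []
    for source in sorted(Path(directory).glob("*.opus")):
        target = android_name(source)
        shutil.copyfile(source, target)
        copied.append(target)
    return copied
```

Copying keeps the original <code>.opus</code> files in place for players that do recognize the extension.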

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]
* [[loudgain]]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T09:07:54Z

Artoria2e5: /* Lower latency versus quality/bitrate trade-off */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more detail on the history and potential applications of Opus is included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''.

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1, automatic speech/music detection and bandwidth detection were introduced to improve mode decisions, and VBR became less constrained, all with the aim of maximizing the quality/bitrate tradeoff. These improvements are further enhanced in versions 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!Typical SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!4 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!9 kbps
|mono
|8 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
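
As a rough illustration of how packet headers scale with frame size, the sketch below estimates header bitrate when each Opus frame is sent in its own packet. The 40-byte header figure (12 bytes RTP + 8 bytes UDP + 20 bytes IPv4) is an assumption for illustration, not something defined by Opus; real links (IPv6, SRTP, tunnels) will differ.

```python
# Assumption: one Opus frame per packet, carried over RTP/UDP/IPv4
# with 12 + 8 + 20 = 40 bytes of headers. Adjust for your transport.
HEADER_BYTES = 40


def packet_overhead_kbps(frame_ms: float, header_bytes: int = HEADER_BYTES) -> float:
    """Return the header bitrate in kbps for a given frame duration."""
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: "
              f"{packet_overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions, halving the frame size doubles the header bitrate, which is why 60 ms frames are attractive for low-bitrate SILK speech.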

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
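
The arithmetic behind the table can be reproduced with a short sketch: the delay is the frame size plus the 2.5 ms overlap, and the fractional increase is relative to 64 kbps. The matched bitrates are copied from the table above; the function and constant names are only illustrative.

```python
CELT_OVERLAP_MS = 2.5   # MDCT window overlap, as stated in the text
REFERENCE_KBPS = 64.0   # bitrate at the default 20 ms frame size

# Frame size (ms) -> bitrate (kbps) judged equivalent by matched PEAQ scores,
# copied from the table above.
MATCHED_KBPS = {20.0: 64.0, 10.0: 70.4, 5.0: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Frame size plus the fixed overlap delay."""
    return frame_ms + CELT_OVERLAP_MS


def fractional_increase_percent(frame_ms: float) -> float:
    """Extra bitrate, in percent, relative to the 20 ms reference."""
    return (MATCHED_KBPS[frame_ms] / REFERENCE_KBPS - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in MATCHED_KBPS:
        print(f"{frame_ms:>4} ms frame: "
              f"delay {algorithmic_delay_ms(frame_ms):4.1f} ms, "
              f"+{fractional_increase_percent(frame_ms):.1f} % bitrate")
```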

==== "Equivalent bitrate" ====
The Opus code includes a [https://github.com/xiph/opus/blob/9fc8fc4cf432640f284113ba502ee027268b0d9f/src/opus_encoder.c#L806 {{code|compute_equiv_rate()}}] function. Given the bitrate, frame size, CBR decision, and complexity setting, it converts the bitrate to a standard-configuration equivalent (VBR, 20 ms frame, complexity 10) to be used for bandwidth, layer, and stereo decisions. The interesting bits are:

* CBR requires 8% more bitrate for the same quality.
* Frame overhead is fixed and modelled as <code>(40*channels+20)*(frame_rate - 50);</code> for any frame_rate larger than 50. (frame_rate is the number of frames per second, so <code>1000/frame_size_ms</code>).
* Complexity tuning results in up to 30% more bitrate requirement.

This layer of conversion is why Opus runs wideband speech at 9 kbps VBR and CVBR, but with CBR it takes 10 kbps (now we know it's exactly 9.75 kbps) to use WB.
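
The frame-overhead term quoted above can be tried out in isolation. This is a minimal sketch of that one formula only; it does not reproduce the CBR or complexity adjustments, and the Python function name is hypothetical rather than taken from libopus.

```python
def frame_overhead_bps(channels: int, frame_size_ms: float) -> int:
    """Bits per second treated as overhead for frames shorter than 20 ms.

    Implements (40*channels+20)*(frame_rate - 50) for frame_rate > 50,
    where frame_rate is the number of frames per second.
    """
    frame_rate = round(1000 / frame_size_ms)
    if frame_rate <= 50:
        return 0  # 20 ms frames and longer carry no modelled penalty
    return (40 * channels + 20) * (frame_rate - 50)


if __name__ == "__main__":
    for frame_size_ms in (20, 10, 5, 2.5):
        print(f"stereo, {frame_size_ms} ms: "
              f"{frame_overhead_bps(2, frame_size_ms)} bps")
```

For a stereo stream this gives 5000 bps at 10 ms frames and 35000 bps at 2.5 ms frames, which is the amount the encoder would discount from the requested bitrate before making its mode decisions under this model.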

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in C programming language and can be compiled for hardware architectures with or without floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.
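
As a hedged illustration of driving the reference encoder from a script, the sketch below only builds an <code>opusenc</code> command line for the VBR, constrained VBR and CBR modes described earlier. The flag names (<code>--bitrate</code>, <code>--vbr</code>, <code>--cvbr</code>, <code>--hard-cbr</code>) are assumed from opus-tools and should be checked against <code>opusenc --help</code> for the installed version; the helper name and file names are hypothetical.

```python
import shlex

# Assumed opus-tools flags for the three rate-control modes.
RATE_MODE_FLAGS = {"vbr": "--vbr", "cvbr": "--cvbr", "cbr": "--hard-cbr"}


def opusenc_command(src: str, dst: str, bitrate_kbps: float,
                    rate_mode: str = "vbr") -> list[str]:
    """Build (but do not run) an opusenc invocation as an argument list."""
    if rate_mode not in RATE_MODE_FLAGS:
        raise ValueError(f"unknown rate mode: {rate_mode!r}")
    return ["opusenc", "--bitrate", str(bitrate_kbps),
            RATE_MODE_FLAGS[rate_mode], src, dst]


if __name__ == "__main__":
    cmd = opusenc_command("input.wav", "output.opus", 96)
    print(shlex.join(cmd))
    # To actually encode, pass cmd to subprocess.run(cmd, check=True).
```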

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, though development was mostly complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The project specifically targets cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus GitHub] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their previous codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but the .opus filename extension is not recognized by Android, so the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]
* [[loudgain]]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Fraunhofer FDK AAC

2023-08-12T07:11:10Z

Artoria2e5: /* FDK License */

[[category:Encoder/Decoder]]
{{aac-encoders}}
The '''Fraunhofer FDK AAC''' is a high-quality open-source [[AAC]] [[codec|encoder]] library developed by [[Fraunhofer|Fraunhofer IIS]]. It was officially released for Android, but has been ported to other platforms.

The licensed Fraunhofer AAC codec included in Winamp (often called [[Fraunhofer#Fraunhofer IIS Codecs|FhG AAC]]) is not the same as the FDK AAC codec. While they use the same approach, they are developed by different teams, and target different platforms. The FDK library is built around fixed-point math and originally targeted low-delay communication on mobile devices.

FDK AAC is considered a favorable alternative to the [[Nero AAC]] codec, which is no longer developed.

== Software Versions ==

{|class="wikitable"
! Package/Component !! Version !! Developer/Maintainer !! License !! Description
|-
| FDK Encoder
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc_lib.cpp]
|rowspan=4| [[Fraunhofer|Fraunhofer IIS]]
|rowspan=4| [[#FDK License|FDK License]]
|rowspan=4| The FDK AAC library included in Android.
|-
| FDK SBR/PS Encoder (for HE & HEv2)
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libSBRenc/src/sbr_encoder.cpp]
|-
| FDK Decoder
| 3.0.0 [https://android.googlesource.com/platform/external/aac/+/master/libAACdec/src/aacdecoder_lib.cpp]
|-
| FDK SBR/PS Decoder
| 2.2.6 [https://android.googlesource.com/platform/external/aac/+/lollipop-release/libSBRdec/src/sbrdecoder.cpp]
|-
| [[#(lib)fdk-aac|fdk-aac]]
| 2.0.0 (2018-11-22) ([https://github.com/mstorsjo/fdk-aac/releases watch]) based on FDK AAC (4.0.0/3.0.0) shared library version 2.0.0 
| Martin Storsjö/Opencore AMR project
|rowspan=2|[[#FDK License|FDK License]] with additions under [http://www.apache.org/licenses/LICENSE-2.0 Apache 2.0 license]
| The FDK AAC encoder and decoder as a portable library separate from Android.
|-
| fdk-aac (debian)
| 0.1.6-1 [https://tracker.debian.org/pkg/fdk-aac] libfdk-aac0 (shared library version 1.1.0)
|
| The Debian source package for fdk-aac. Includes libfdk-aac* and the [[#aac-enc|aac-enc]] encoding front end.
|-
| [[#fdkaac|fdkaac]]
| 1.0.0 (using libfdk-aac 0.1.6)
| nu774
| zlib
| An advanced front-end to the FDK AAC encoder using libfdk-aac.
|-
| [[#FFmpeg|FFmpeg]]/[[#Libav/avconv|Libav]] support
| Libav: [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c encode], [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c decode] FFmpeg: [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c encode], [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c decode]
| Martin Storsjö
| [http://www.isc.org/downloads/software-support-policy/isc-license/ ISC license]
| A wrapper for libfdk-aac that adds support to FFmpeg and Libav/avconv. It is included in both projects, and is the recommended AAC encoder for FFmpeg.
|}

== FDK License ==
The license included by Fraunhofer in the FDK source code specifically allows distribution in source or binary forms, but does not license patented technologies described by the source code. It goes on to say "you may ''use'' this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses".<ref>[https://android.googlesource.com/platform/external/aac/+/master/NOTICE NOTICE file], fdkaac</ref> As this governs ''use'', it should not have anything to do with distribution. The Free Software Foundation (FSF) considers it fishy to have an invitation to purchase patent licenses in the text, but concedes that "any program is potentially threatened by patents". Considering [[AAC#Patent situation|the patent terms]], all AAC software is indeed equally affected.

This license puts a limitation on charging for software that includes the library, leading Debian to consider it non-free. Debian does not comment on the patent situation.

The position of FFmpeg is that although the license is GPL-incompatible (and therefore nondistributable with GPL parts), it is acceptable to distribute the library with LGPL parts.<ref><code>ffmpeg -license</code> command output</ref> FFmpeg [https://ffmpeg.org/legal.html does not care about patents]. (The AAC patent license covers both encoder ''and'' decoder, so using fdk_aac does not add patent violations to FFmpeg.)

=== Free Software? ===
{| class=wikitable
|-
! Party !! Classification !! Note
|-
| Debian || {{no|Non-free}} || [https://tracker.debian.org/pkg/fdk-aac][https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=694257] (fee clause)
|-
| Fedora/Red Hat || {{maybe|Free but not Allowed}} || [https://fedoraproject.org/wiki/Licensing/FDK-AAC]: Fedora has since adopted a more defensive posture to patent language, making it not Allowed
|-
| FSF || {{maybe|Free (but warns about patents)}} || [https://www.gnu.org/licenses/license-list.html#fdk]
|}

== Afterburner ==
''Afterburner'' is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature.

== Audio Object Types ==
The library supports the following MPEG-2/4 AOTs:
{| class="wikitable"
! Object Type ID !! Audio Object Type !! Description
|-
|2 || AAC-LC || "AAC Profile" MPEG-2 Low-complexity (LC) combined with MPEG-4 Perceptual Noise Substitution (PNS)
|-
|5 || HE-AAC || AAC LC + SBR (Spectral Band Replication)
|-
|29 || HE-AAC v2 || AAC LC + SBR + PS (Parametric Stereo)
|-
|23 || AAC-LD || "Low Delay Profile" used for real-time communication
|-
|39 || AAC-ELD || Enhanced Low Delay
|-
|129|| MPEG-2 AAC LC ||
|-
|132|| MPEG-2 HE-AAC (SBR) ||
|-
|156|| MPEG-2 HE-AAC v2 (SBR+PS) ||
|}

== Bitrate Modes ==

{|class="wikitable"
! AACENC_BITRATEMODE !! Mode !! Stream Bitrate
|-
| 0 || [[Constant Bitrate]] (CBR) || As specified by AACENC_BITRATE
|-
| 1-5 || [[Variable Bitrate]] (VBR) || Calculated based on channel layout (See table below)
|-
|colspan=3|
|-
| 6 ||colspan=2| Fixed frame mode.
|-
| 7 ||colspan=2| Superframe mode.
|-
| 8 ||colspan=2| LD/ELD full bitreservoir for packet based transmission
|}

The table below gives the bitrate limit for each variable bitrate mode. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc.cpp] HE and HEv2 will often end up with actual bitrates far below these limits.
{|class="wikitable"
!rowspan=2|AACENC_BITRATEMODE (VBR Modes) !!rowspan=2|Mode !!colspan=2|Bitrate per channel (LC) !!rowspan=2|AOTs
|-
! Mono !! Stereo <sup>a</sup>
|-
| 1 || VBR || 32 kbps || 20 kbps || LC, HE, HEv2
|-
| 2 || VBR || 40 kbps || 32 kbps || LC, HE, HEv2
|-
| 3 || VBR || 56 kbps || 48 kbps || LC, HE, HEv2
|-
| 4 || VBR || 72 kbps || 64 kbps || LC
|-
| 5 || VBR || 112 kbps || 96 kbps || LC
|}

<sup>a</sup> Note that a "stereo" channel is any that is [[Joint stereo|bonded with another channel]], as noted with a plus sign in the [[#Channel Layouts|channel layouts]] table.

====Example Bitrate Calculations====

{|class=wikitable
! Profile !! VBR Mode !! Channel layout !! Expected stream bitrate
|-
| LC || 3 || L+R || 2 "stereo" channels at 48kbps = 96kbps
|-
| LC || 3 || C, L+R || 1 "mono" center channel at 56 kbps and 2 "stereo" channels at 48kbps = 152kbps
|-
| LC || 4 || C, L+R, LS+RS, LFE || 1 "mono" center channel and 1 mono LFE channel each at 72kbps, and 4 "stereo" channels (2 sets of 2) each at 64kbps = 400kbps
|}
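The calculations in the table above can be sketched in Python. This is only an illustration: the per-channel limits are copied from the VBR table on this page, while the function name <code>vbr_bitrate_limit</code> and the way the layout string is parsed are assumptions made for this example and are not part of the FDK API.

```python
# Per-channel VBR bitrate limits in kbps for AAC-LC, copied from the table above.
# Each entry is (mono, stereo); "stereo" means a channel bonded with another one.
VBR_LIMITS_KBPS = {
    1: (32, 20),
    2: (40, 32),
    3: (56, 48),
    4: (72, 64),
    5: (112, 96),
}


def vbr_bitrate_limit(vbr_mode, layout):
    """Return the expected stream bitrate limit in kbps.

    `layout` uses this page's notation, e.g. "C, L+R, LS+RS, LFE".
    Hypothetical helper for illustration; not part of libfdk-aac.
    """
    mono_kbps, stereo_kbps = VBR_LIMITS_KBPS[vbr_mode]
    total = 0
    for element in layout.split(","):
        channels = [name.strip() for name in element.split("+")]
        if len(channels) == 2:
            # A bonded pair counts as two "stereo" channels.
            total += 2 * stereo_kbps
        else:
            total += mono_kbps
    return total


print(vbr_bitrate_limit(3, "L+R"))                 # 96
print(vbr_bitrate_limit(3, "C, L+R"))              # 152
print(vbr_bitrate_limit(4, "C, L+R, LS+RS, LFE"))  # 400
```

The three calls reproduce the three rows of the table. As noted above, HE and HEv2 streams will often come out far below these limits.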

==Bandwidth==
[[Image:FDK filter.png|400px|thumb|A spectrogram showing the effect of the FDK AAC low-pass filter.]]
The default bandwidth (or low-pass filter cutoff) for each [[#Bitrate Modes|bitrate mode]] is the lower of the appropriate value in the tables below and half the [[#Sample Rates|sample rate]]. This can be overridden, but the maximum value is 20000 Hz. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/bandwidth.cpp]

The fdk-aac parameter is AACENC_BANDWIDTH. More information can be found in the official documentation, section 3.1 ''Bandwidth''.

=== HE-AAC/SBR ===

The HE-AAC and HE-AACv2 profiles encode audio using AAC-LC at one half the sample rate, relying on [[Spectral Band Replication]] (SBR) to attempt reconstruction of the missing higher frequencies. The end result is an apparent full bandwidth transmission (as if no low-pass filter was applied), even though the actual AAC-LC encoded audio is only storing frequencies up to 1/4 the original sample rate.
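The arithmetic in the paragraph above can be written out as a small Python sketch. It assumes the dual-rate arrangement described there (AAC-LC core at half the input sample rate); the function name <code>he_aac_core_rates</code> is made up for this example.

```python
def he_aac_core_rates(sample_rate_hz):
    """Return (core sample rate, highest frequency the core can store) in Hz.

    Assumes the AAC-LC core runs at half the input sample rate, as described
    above. Hypothetical helper; not part of the FDK API.
    """
    core_rate = sample_rate_hz // 2   # AAC-LC core sample rate
    core_bandwidth = core_rate // 2   # Nyquist limit of the core = 1/4 of the input rate
    return core_rate, core_bandwidth


print(he_aac_core_rates(44100))  # (22050, 11025)
print(he_aac_core_rates(48000))  # (24000, 12000)
```

Everything above the second value has to be reconstructed by SBR at the decoder.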


=== VBR Modes ===
{|class="wikitable"
! AACENC_BITRATEMODE !! Mono !! Two or More Channels
|-
| 1 ||colspan=2| 13050 Hz
|-
| 2 ||colspan=2| 13050 Hz
|-
| 3 ||colspan=2| 14260 Hz
|-
| 4 ||colspan=2| 15500 Hz
|-
| 5 ||colspan=2| Full range, no filter
|}
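The rule for the default cutoff (table value, capped at half the sample rate) can be sketched for the VBR modes as follows. The values come from the table above; the function name <code>default_vbr_bandwidth</code> is hypothetical, and treating mode 5 ("no filter") as half the sample rate is an assumption of this example.

```python
# Default low-pass cutoff in Hz for each VBR mode, copied from the table above.
# None stands for "full range, no filter".
VBR_BANDWIDTH_HZ = {1: 13050, 2: 13050, 3: 14260, 4: 15500, 5: None}


def default_vbr_bandwidth(vbr_mode, sample_rate_hz):
    """Return the default cutoff in Hz for a VBR mode and input sample rate.

    Hypothetical helper for illustration; not part of libfdk-aac.
    """
    nyquist = sample_rate_hz // 2
    table_value = VBR_BANDWIDTH_HZ[vbr_mode]
    if table_value is None:
        # Assumption: "no filter" is reported as half the sample rate.
        return nyquist
    return min(table_value, nyquist)


print(default_vbr_bandwidth(3, 44100))  # 14260
print(default_vbr_bandwidth(4, 22050))  # 11025
```

The second call shows the cap taking effect: at a 22050 Hz sample rate the table value of 15500 Hz cannot be reached.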

=== CBR Mode ===
{|class="wikitable"
! AOT/Sample Rates !! Bitrate per channel !! Mono !! Two or More Channels
|-
|rowspan=8| LC / Any
| Below 12kbps || 3700 Hz || 5000 Hz
|-
| 12-20 kbps || 5000 Hz || 6400 Hz
|-
| 20-28 kbps || 6900 Hz || 9640 Hz
|-
| 28-40 kbps || 9600 Hz || 13050 Hz
|-
| 40-56 kbps || 12060 Hz || 14260 Hz
|-
| 56-72 kbps || 13950 Hz || 15500 Hz
|-
| 72-96 kbps || 14200 Hz || 16120 Hz
|-
| 96kbps and above ||colspan=2| 17000 Hz
|-
|colspan=4|...
|-
|rowspan=2| LD / 44100 Hz
| 56kbps || 11000 Hz || 12900 Hz
|-
| 64kbps || 14400 Hz || 15500 Hz
|-
|colspan=4|...
|}

== Sample Format ==

The FDK library is based on fixed-point math and only supports 16-bit integer PCM input.

== Sample Rates ==

The FDK library officially supports input sample rates of 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, and 96000 Hz.

See [[#GetInvInt table limit|Issues/GetInvInt table limit]] if experiencing crashes with high sample rates and VBR.

Also see [[#Recommended Sampling Rate and Bitrate Combinations|Recommended Sampling Rate and Bitrate Combinations]].

== Channel Layouts ==

{| class="wikitable"
! Channels !! Layout !! Mode !! Description
|-
| 1 || C || MODE_1 || Mono
|-
| 2 || L+R || MODE_2 || Stereo
|-
| 3 || C, L+R || MODE_1_2 ||
|-
| 4 || C, L+R, Rear || MODE_1_2_1 || fdkaac calls it "C L R Cs"
|-
| 5 || C, L+R, LS+RS || MODE_1_2_2 ||
|-
| 5.1 || C, L+R, LS+RS, LFE || MODE_1_2_2_1 ||
|-
| 7.1 || C, LC+RC, L+R, LS+RS, LFE || MODE_1_2_2_2_1, MODE_7_1_FRONT_CENTER ||
|-
| 7.1 (Rear) || C, L+R, LS+RS, Lrear+Rrear, LFE || MODE_7_1_REAR_SURROUND ||
|}

The plus sign (+) denotes "stereo" channels.

== Issues ==
=== GetInvInt table limit ===
As of FDK version 3.4.12, not all combinations of audio object types, bitrate modes, channel layouts, and sample rates can be used together, due to a limited table of pre-computed values used by the encoder.

For example, using 96kHz stereo input with the AAC-LC audio object type and bitrate mode 5 (VBR 96-112kbps/channel) will result in catastrophic failure: [https://github.com/mstorsjo/fdk-aac/issues/17]
./libFDK/include/fixpoint_math.h:459: FIXP_DBL GetInvInt(int): Assertion `(intValue > 0) && (intValue < 50)' failed.
Aborted (core dumped)

A recent (August 2014) patch to libfdk-aac fixes most of the previously unsupported combinations [https://github.com/mstorsjo/fdk-aac/commit/9a3234055adb1e18f80571925779503c8dec5251], and is expected to be included in the next official version of the FDK AAC library.

See [[#Libav/avconv|Libav/avconv]] for a workaround.

== Recommended Sampling Rate and Bitrate Combinations ==

This table is from the documentation included in the FDK library source code. (PDF section 2.12 or source code: [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/include/aacenc_lib.h])

The following table provides an overview of recommended encoder configuration parameters which [Fraunhofer] determined by virtue of numerous listening tests.

{|class="wikitable"
! [[#Audio Object Types|Audio Object Type]] !! Bit Rate Range [bit/s] !! Supported [[#Sample Rates|Sampling Rates]] [kHz] !! Recommended Sampling Rate [kHz] !! Number of [[#Channel Layouts|Channels]]
|-
|rowspan="4"| [29] HE-AAC v2 (AAC LC + SBR + PS)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 2
|-
| 12000 - 17999 || 32.00 || 32.00 || 2
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="7"| [5] HE-AAC (AAC LC + SBR)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 1
|-
| 12000 - 17999 || 32.00 || 32.00 || 1
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 1
|-
| 16000 - 27999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 28000 - 63999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 64000 - 128000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="4"| [5] HE-AAC (AAC LC + SBR)
| 64000 - 69999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 70000 - 159999 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|-
| 160000 - 245999 || 32.00, 44.10, 48.00 || 48.00 || 5
|-
| 160000 - 265999 || 32.00, 44.10, 48.00 || 48.00 || 5.1
|-
|rowspan="6"| [2] AAC LC
| 8000 - 15999 || 11.025, 12.00, 16.00 || 12.00 || 1
|-
| 16000 - 23999 || 16.00 || 16.00 || 1
|-
| 24000 - 31999 || 16.00, 22.05, 24.00 || 24.00 || 1
|-
| 32000 - 55999 || 32.00 || 32.00 || 1
|-
| 56000 - 160000 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 160001 - 288000 || 48.00 || 48.00 || 1
|-
|rowspan="7"| [2] AAC LC
| 16000 - 23999 || 11.025, 12.00, 16.00 || 12.00 || 2
|-
| 24000 - 31999 || 16.00 || 16.00 || 2
|-
| 32000 - 39999 || 16.00, 22.05, 24.00 || 22.05 || 2
|-
| 40000 - 95999 || 32.00 || 32.00 || 2
|-
| 96000 - 111999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 112000 - 320001 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 320002 - 576000 || 48.00 || 48.00 || 2
|-
|rowspan="3"| [2] AAC LC
| 160000 - 239999 || 32.00 || 32.00 || 5, 5.1
|-
| 240000 - 279999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 280000 - 800000 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|}
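A table like this is easy to turn into a lookup. The Python sketch below covers only the two-channel AAC LC rows, copied from the table above; the names <code>AAC_LC_STEREO_ROWS</code> and <code>recommended_rate_khz</code> are invented for this example, and the other object types and channel counts would need their own row lists.

```python
# (lowest bit rate, highest bit rate, recommended sampling rate in kHz)
# Rows for [2] AAC LC with 2 channels, copied from the table above.
AAC_LC_STEREO_ROWS = [
    (16000, 23999, 12.00),
    (24000, 31999, 16.00),
    (32000, 39999, 22.05),
    (40000, 95999, 32.00),
    (96000, 111999, 32.00),
    (112000, 320001, 44.10),
    (320002, 576000, 48.00),
]


def recommended_rate_khz(bitrate_bps, rows=AAC_LC_STEREO_ROWS):
    """Return the recommended sampling rate in kHz, or None if out of range.

    Hypothetical helper for illustration; not part of libfdk-aac.
    """
    for low, high, rate_khz in rows:
        if low <= bitrate_bps <= high:
            return rate_khz
    return None


print(recommended_rate_khz(128000))  # 44.1
print(recommended_rate_khz(64000))   # 32.0
print(recommended_rate_khz(8000))    # None
```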

== (lib)fdk-aac ==

Martin Storsjö (as the opencore-amr project) maintains a source code distribution of the Fraunhofer library as fdk-aac. It is distributed in a binary form in Debian (and Debian derivatives like Ubuntu) as the package fdk-aac, which includes the libfdk-aac* and [[#aac-enc|aac-enc]] binaries.

See [[#Software Versions|Software Versions]] for latest release information.

=== Links ===
* [https://github.com/mstorsjo/fdk-aac Source] at Github
* [https://tracker.debian.org/pkg/fdk-aac fdk-aac] at Debian package tracker. Package includes libfdk-aac* and the aac-enc binary.

== aac-enc ==

fdk-aac includes a very basic command-line encoding utility, called aac-enc, that can encode to AAC from WAV.

=== Usage ===
aac-enc [-r bitrate] [-t aot] [-a afterburner] [-s sbr] [-v vbr] in.wav out.aac

;-r <bitrate>:Bitrate in bits per second (for CBR). Default is 64000.
;-t <aot>:The [[#Audio Object Types|Audio Object Type]]. Default is 2 (AAC-LC).
;-a <0,1>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-s <-1,0,1>:Spectral Band Replication (ELD AOT only). -1=Use ELD SBR auto configurator (default, recommended), 0=Disabled, 1=Enabled. Default is -1.
;-v <0-5>:[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR @ value given in -r. Default is 0.

== fdkaac ==

fdkaac is a command-line interface encoding and metadata utility. It is maintained by nu774 and is licensed under the zlib license. It employs libfdk-aac for encoding.

See [[#Software Versions|Software Versions]] for latest release information.

=== Examples ===

# Convert a FLAC file to m4a using fdkaac configured for AAC-LC at about 50kbps/channel (100kbps for stereo).
flac -s -d -c song.flac | fdkaac --ignorelength --profile 2 --bitrate-mode 3 -o song.m4a -
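When encoding many files from a script, it can help to build the two command lines programmatically. The Python sketch below only assembles and prints the commands from the example above; it does not run them. The function name <code>fdkaac_pipeline</code> and its defaults are assumptions of this example, and only options already shown on this page are used.

```python
import shlex


def fdkaac_pipeline(flac_path, m4a_path, profile=2, bitrate_mode=3):
    """Build the decoder and encoder argument lists for a FLAC to M4A conversion.

    Hypothetical helper for illustration; it mirrors the shell example above.
    """
    # flac decodes silently (-s -d) and writes WAV to stdout (-c).
    decode = ["flac", "-s", "-d", "-c", flac_path]
    # fdkaac reads WAV from stdin ("-"); --ignorelength is needed because the
    # length in a piped WAV header cannot be relied on.
    encode = [
        "fdkaac", "--ignorelength",
        "--profile", str(profile),
        "--bitrate-mode", str(bitrate_mode),
        "-o", m4a_path, "-",
    ]
    return decode, encode


decode, encode = fdkaac_pipeline("song.flac", "song.m4a")
print(shlex.join(decode) + " | " + shlex.join(encode))
```

To actually run the pipeline, the two lists could be passed to <code>subprocess.Popen</code>, connecting the standard output of the first process to the standard input of the second.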

=== Usage ===

fdkaac [options] input_file

;-p, --profile <n> :The [[#Audio Object Types|Audio Object Type]].
;-b, --bitrate <n> :Bitrate in bits per second (for CBR)
;-m, --bitrate-mode <n> :[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR.
;-w, --bandwidth <n> :Frequency [[#Bandwidth|bandwidth]] in Hz (AAC LC only)
;-a, --afterburner <n>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-L, --lowdelay-sbr <-1,0,1>:Configure SBR activity on AAC ELD
:{| class=wikitable
| -1 || Use ELD SBR auto configurator
|-
| 0 || Disable SBR on ELD (default)
|-
| 1 || Enable SBR on ELD
|}
; -s, --sbr-ratio <0,1,2> :Controls activation of downsampled SBR
:{| class=wikitable
| 0 || Use lib default (default)
|-
| 1 || Downsampled SBR (default for ELD+SBR)
|-
| 2 || Dual-rate SBR (default for HE-AAC)
|}
;-f, --transport-format <n> :Transport format
:{| class=wikitable
| 0 || RAW (default, muxed into M4A)
|-
| 1 || ADIF
|-
| 2 || ADTS
|-
| 6 || LATM MCP=1
|-
| 7 || LATM MCP=0
|-
|10 || LOAS/LATM (LATM within LOAS)
|}
;-C, --adts-crc-check : Add CRC protection on ADTS header
;-h, --header-period <n> : StreamMuxConfig/PCE repetition period in transport layer
;-o <filename> : Output filename
;-G, --gapless-mode <n> : Encoder delay signaling for gapless playback
:{| class=wikitable
| 0 || iTunSMPB (default)
|-
| 1 || ISO standard (edts + sgpd)
|-
| 2 || Both
|}
;--include-sbr-delay : Count SBR decoder delay in encoder delay. This is not iTunes compatible, but is default behavior of FDK library.
;-I, --ignorelength : Ignore length of WAV header
;-S, --silent : Don't print progress messages
;--moov-before-mdat : Place moov box before mdat box on m4a output

Options for raw (headerless) input:
;-R, --raw: Treat input as raw (by default WAV is assumed)
;--raw-channels <n> : Number of channels (default: 2)
;--raw-rate <n> : Sample rate (default: 44100)
;--raw-format <spec> : Sample format, default is "S16L". Spec is as follows:
:{|
| 1st char || S(igned), U(nsigned), or F(loat)
|-
| 2nd part || bits per channel
|-
| Last char || L(ittle) or B(ig)
|}
:Last char can be omitted, in which case L is assumed. Spec is case insensitive, therefore "u16b" is same as "U16B".
:Up to 32-bit integer or 64-bit floating point format is supported as input. The FDK library, however, is [[#Sample Format|implemented based on fixed point math and only supports 16-bit integer PCM]]. Therefore, be wary of clipping. You might want to dither/noise shape beforehand when your input has higher resolution.
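The <code>--raw-format</code> spec described above is simple enough to parse with a regular expression. The following Python sketch illustrates how the three parts are read; it is not fdkaac's own parser, the name <code>parse_raw_format</code> is invented for this example, and it does not check the bit-depth limits mentioned above.

```python
import re

# 1st char: S, U or F; 2nd part: bits per channel; optional last char: L or B.
_RAW_FORMAT = re.compile(r"([SUF])(\d+)([LB]?)", re.IGNORECASE)

_TYPES = {"S": "signed", "U": "unsigned", "F": "float"}


def parse_raw_format(spec):
    """Split a spec such as "S16L" into type, bit depth and endianness.

    Hypothetical helper for illustration; not part of fdkaac.
    """
    match = _RAW_FORMAT.fullmatch(spec)
    if match is None:
        raise ValueError("not a valid raw format spec: %r" % spec)
    kind, bits, endian = match.groups()
    return {
        "type": _TYPES[kind.upper()],
        "bits": int(bits),
        # The last char may be omitted, in which case little-endian is assumed.
        "endian": "big" if endian.upper() == "B" else "little",
    }


print(parse_raw_format("S16L"))  # {'type': 'signed', 'bits': 16, 'endian': 'little'}
print(parse_raw_format("u16b"))  # {'type': 'unsigned', 'bits': 16, 'endian': 'big'}
print(parse_raw_format("F32"))   # {'type': 'float', 'bits': 32, 'endian': 'little'}
```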

Tagging options:
;--tag <fcc><nowiki>:</nowiki><value> : Set iTunes predefined tag with four char code. See [https://code.google.com/p/mp4v2/wiki/iTunesMetadata iTunes Metadata].
;--tag-from-file <fcc><nowiki>:</nowiki><filename> : Same as above, but value is read from file.
;--long-tag <name><nowiki>:</nowiki><value> : Set arbitrary tag as iTunes custom metadata.
;--tag-from-json <filename[?dot_notation]>
: Read tags from JSON. By default, tags are assumed to be direct children of the root object(dictionary). Optionally, position of the dictionary that contains tags can be specified with dotted notation.
{|class="wikitable sortable"
! Option/Usage !! MP4 Block Modified !!class="unsortable"| Comment
|-
| --title <string> || ©nam
|-
| --artist <string> || ©ART
|-
| --album <string> || ©alb
|-
| --genre <string> || ©gen || Appears to always store the string in the "user-defined" '''©gen''' block, even if there is an ID3 genre ID that could be used with the '''gnre''' block.
|-
| --date <string> || ©day || YYYY[-MM[-DD]] format
|-
| --composer <string> || ©wrt
|-
| --grouping <string> || ©grp
|-
| --comment <string> || ©cmt
|-
| --album-artist <string> || aART
|-
| --track <number[/total]> || trkn || Block stores both track and totaltracks in one binary value
|-
| --disk <number[/total]> || disk || Block stores both disc and totaldiscs in one binary value
|-
| --tempo <n> || tmpo || Beats per minute, stored as a 16-bit integer
|}
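A script that fills in these options may want to check values before passing them on. The Python sketch below checks the <code>YYYY[-MM[-DD]]</code> shape used by <code>--date</code> and splits the <code>number[/total]</code> form used by <code>--track</code> and <code>--disk</code>. Both function names are made up for this example, and the checks are purely syntactic (a month of 13 would pass); how fdkaac itself validates these values is not covered here.

```python
import re

# YYYY, YYYY-MM or YYYY-MM-DD
_DATE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def is_valid_date(value):
    """Return True if value has the YYYY[-MM[-DD]] shape (syntax check only)."""
    return _DATE.fullmatch(value) is not None


def parse_number_total(value):
    """Split "number[/total]" into (number, total); total is None if omitted."""
    number, _, total = value.partition("/")
    return int(number), (int(total) if total else None)


print(is_valid_date("2014-08"))    # True
print(is_valid_date("2014-8"))     # False
print(parse_number_total("3/12"))  # (3, 12)
print(parse_number_total("3"))     # (3, None)
```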

=== Links ===
*[https://github.com/nu774/fdkaac Source code]
*[https://launchpad.net/~mc3man/+archive/ubuntu/fdkaac-encoder Ubuntu PPA]

== FFmpeg ==
libfdk-aac can be used with FFmpeg, but requires a custom build of FFmpeg. FFmpeg provides significant [https://trac.ffmpeg.org/wiki/Encode/AAC#fdk_aac documentation for using libfdk_aac] in the FFmpeg wiki.

=== Usage/Examples ===

CBR mode:
ffmpeg -i <input> -c:a libfdk_aac -b:a 128k <output>

VBR mode:
ffmpeg -i <input> -c:a libfdk_aac -vbr 3 <output>
;-afterburner:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-profile<nowiki>:</nowiki>a:The [[#Audio Object Types|Audio Object Type]]. Value is one of LC, HE-AAC, HE-AACv2, LD, or ELD. Default is LC.
;-b<nowiki>:</nowiki>a:CBR bitrate
;-vbr:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].
;-cutoff:The low-pass filter cut-off in Hz. See [[#Bandwidth|Bandwidth]] for default values. FFmpeg maximum value is 20000.
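The CBR and VBR command lines above differ only in one option pair, which makes them easy to generate. The Python sketch below builds the argument list without running FFmpeg; the function name <code>ffmpeg_fdk_args</code> and its parameters are assumptions of this example, and it presumes an FFmpeg build with libfdk_aac enabled.

```python
def ffmpeg_fdk_args(src, dst, vbr=None, bitrate=None, afterburner=1):
    """Build an FFmpeg argument list for encoding with libfdk_aac.

    Give either `vbr` (1-5) or `bitrate` (e.g. "128k"), not both.
    Hypothetical helper for illustration only.
    """
    if (vbr is None) == (bitrate is None):
        raise ValueError("give exactly one of vbr or bitrate")
    args = ["ffmpeg", "-i", src, "-c:a", "libfdk_aac"]
    if vbr is not None:
        if vbr not in range(1, 6):
            raise ValueError("vbr must be between 1 and 5")
        args += ["-vbr", str(vbr)]
    else:
        args += ["-b:a", str(bitrate)]
    args += ["-afterburner", str(afterburner), dst]
    return args


print(ffmpeg_fdk_args("in.flac", "out.m4a", vbr=3))
print(ffmpeg_fdk_args("in.flac", "out.m4a", bitrate="128k"))
```

The resulting list can be handed to <code>subprocess.run</code> as is, which avoids shell quoting problems with unusual file names.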

=== Links ===

* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in FFmpeg source tree
* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in FFmpeg source tree

== Libav/avconv ==
libfdk-aac can be used with Libav's avconv, but requires a custom build of avconv with "--enable-libfdk-aac" passed to configure. See [https://wiki.libav.org/Encoding/aac Libav AAC encoding].

=== Usage ===

CBR mode:
avconv -i <input> -c:a libfdk_aac -b:a <bitrate> -afterburner 1 <output>

VBR mode:
avconv -i <input> -c:a libfdk_aac -flags +qscale -global_quality [1-5] -afterburner 1 <output>

;-afterburner:See ''[[#Afterburner|afterburner]]''.
;-global_quality:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].

=== FLAC to M4A example with quirks ===

This example uses a FLAC file with 24-bit/96kHz 5.1 channel audio and embedded album art to demonstrate workarounds for some quirks/bugs. The [http://www.diatonis.com/downloads/diatonis_dark-edges_02_rock_flac_6-chan_9624.zip sample used] is from the [http://www.diatonis.com/surround_sound_music.html Diatonis Free Surround Sound Music] page. The track used is titled "Rock".

avconv -i diatonis-rock.flac -vn -sample_fmt s16 -ar 48000 -c:a libfdk_aac -flags +qscale -global_quality 5 diatonis-rock.m4a

;-global_quality 5:Use [[#Bitrate Modes|VBR Mode 5]].
;-vn:Means drop all video. The FLAC source has embedded album art that can't be handled by avconv in this case. Libav apparently doesn't know how to embed cover art in M4A. It tries to use it as an MP4 video stream. Using -c:v mjpeg, as can be done with MP3, doesn't work either. See [[Nero AAC#NeroAacTag|NeroAacTag]] for a tool that can easily add M4A album art.
;-sample_fmt s16 -ar 48000:The FLAC source's 96kHz sample rate combined with VBR mode 5 triggers the [[#GetInvInt_table_limit|GetInvInt table limit]] bug in libfdk_aac 0.1.3 and earlier. These options resample the audio before sending it to the FDK encoder, to avoid the crash.

=== Links ===
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in Libav source tree
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in Libav source tree

== References ==
<references />

== Links ==
* [http://www.iis.fraunhofer.de/en/ff/amm/impl/fdkaaccodec.html Official web page]
* [https://en.wikipedia.org/wiki/Fraunhofer_FDK_AAC Fraunhofer FDK AAC] at Wikipedia
* [http://www.hydrogenaud.io/forums/index.php?showtopic=95989 Release information HydrogenAudio forums]
* [https://android.googlesource.com/platform/external/aac/+/master/ FDK in Android source code]
* [https://github.com/mstorsjo/fdk-aac fdk-aac source] (github)
* [http://sourceforge.net/p/opencore-amr/fdk-aac/ci/master/tree/ fdk-aac source code] (sourceforge)

Fraunhofer FDK AAC

2023-08-12T06:30:49Z

Artoria2e5: /* FDK License */

[[category:Encoder/Decoder]]
{{aac-encoders}}
The '''Fraunhofer FDK AAC''' is a high-quality open-source [[AAC]] [[codec|encoder]] library developed by [[Fraunhofer|Fraunhofer IIS]]. It was officially released for Android, but has been ported to other platforms.

The licensed Fraunhofer AAC codec included in Winamp (often called [[Fraunhofer#Fraunhofer IIS Codecs|FhG AAC]]) is not the same as the FDK AAC codec. While they use the same approach, they are developed by different teams, and target different platforms. The FDK library is built around fixed-point math and originally targeted low-delay communication on mobile devices.

FDK AAC is considered a favorable alternative to the [[Nero AAC]] codec, which is no longer developed.

== Software Versions ==

{|class="wikitable"
! Package/Component !! Version !! Developer/Maintainer !! License !! Description
|-
| FDK Encoder
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc_lib.cpp]
|rowspan=4| [[Fraunhofer|Fraunhofer IIS]]
|rowspan=4| [[#FDK License|FDK License]]
|rowspan=4| The FDK AAC library included in Android.
|-
| FDK SBR/PS Encoder (for HE & HEv2)
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libSBRenc/src/sbr_encoder.cpp]
|-
| FDK Decoder
| 3.0.0 [https://android.googlesource.com/platform/external/aac/+/master/libAACdec/src/aacdecoder_lib.cpp]
|-
| FDK SBR/PS Decoder
| 2.2.6 [https://android.googlesource.com/platform/external/aac/+/lollipop-release/libSBRdec/src/sbrdecoder.cpp]
|-
| [[#(lib)fdk-aac|fdk-aac]]
| 2.0.0 (2018-11-22) ([https://github.com/mstorsjo/fdk-aac/releases watch]) based on FDK AAC (4.0.0/3.0.0) shared library version 2.0.0 
| Martin Storsjö/Opencore AMR project
|rowspan=2|[[#FDK License|FDK License]] with additions under [http://www.apache.org/licenses/LICENSE-2.0 Apache 2.0 license]
| The FDK AAC encoder and decoder as a portable library separate from Android.
|-
| fdk-aac (debian)
| 0.1.6-1 [https://tracker.debian.org/pkg/fdk-aac] libfdk-aac0 (shared library version 1.1.0)
|
| The Debian source package for fdk-aac. Includes libfdk-aac* and the [[#aac-enc|aac-enc]] encoding front end.
|-
| [[#fdkaac|fdkaac]]
| 1.0.0 (using libfdk-aac 0.1.6)
| nu774
| zlib
| An advanced front-end to the FDK AAC encoder using libfdk-aac.
|-
| [[#FFmpeg|FFmpeg]]/[[#Libav/avconv|Libav]] support
| Libav: [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c encode], [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c decode] FFmpeg: [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c encode], [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c decode]
| Martin Storsjö
| [http://www.isc.org/downloads/software-support-policy/isc-license/ ISC license]
| A wrapper for libfdk-aac that adds support to FFmpeg and Libav/avconv. It is included in both projects, and is the recommended AAC encoder for FFmpeg.
|}

== FDK License ==
The license included by Fraunhofer in the FDK source code specifically allows distribution in source or binary forms, but does not license patented technologies described by the source code. It goes on to say "you may ''use'' this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses".<ref>[https://android.googlesource.com/platform/external/aac/+/master/NOTICE NOTICE file], fdkaac</ref> As this governs ''use'', it should not have anything to do with distribution. The Free Software Foundation (FSF) considers it fishy to have an invitation to purchase patent licenses in the text, but concedes that "any program is potentially threatened by patents". Considering [[AAC#Patent situation|the patent terms]], all AAC software is indeed equally affected.

This license puts a limitation on charging for software that includes the library, leading Debian to consider it non-free. Debian does not comment on the patent situation.

The position of FFmpeg is that although the license is GPL-incompatible (and therefore nondistributable with GPL parts), it is acceptable to distribute the library with LGPL parts.<ref><code>ffmpeg -license</code> command output</ref> FFmpeg [https://ffmpeg.org/legal.html does not care about patents]. (The AAC patent license covers both encoder ''and'' decoder, so using fdk_aac does not add patent violations to FFmpeg.)

=== Free Software? ===
{| class=wikitable
|-
! Party !! Classification !! Note
|-
| Debian || {{no|Non-free}} || [https://tracker.debian.org/pkg/fdk-aac][https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=694257]
|-
| Fedora/Red Hat || {{yes|Free}} || [https://fedoraproject.org/wiki/Licensing/FDK-AAC]
|-
| FSF || {{maybe|Free (but warns about patents)}} || [https://www.gnu.org/licenses/license-list.html#fdk]
|}

== Afterburner ==
''Afterburner'' is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature.

== Audio Object Types ==
The library supports the following MPEG-2/4 AOTs:
{| class="wikitable"
! Object Type ID !! Audio Object Type !! Description
|-
|2 || AAC-LC || "AAC Profile" MPEG-2 Low-complexity (LC) combined with MPEG-4 Perceptual Noise Substitution (PNS)
|-
|5 || HE-AAC || AAC LC + SBR (Spectral Band Replication)
|-
|29 || HE-AAC v2 || AAC LC + SBR + PS (Parametric Stereo)
|-
|23 || AAC-LD || "Low Delay Profile" used for real-time communication
|-
|39 || AAC-ELD || Enhanced Low Delay
|-
|129|| MPEG-2 AAC LC ||
|-
|132|| MPEG-2 HE-AAC (SBR) ||
|-
|156|| MPEG-2 HE-AAC v2 (SBR+PS) ||
|}

== Bitrate Modes ==

{|class="wikitable"
! AACENC_BITRATEMODE !! Mode !! Stream Bitrate
|-
| 0 || [[Constant Bitrate]] (CBR) || As specified by AACENC_BITRATE
|-
| 1-5 || [[Variable Bitrate]] (VBR) || Calculated based on channel layout (See table below)
|-
|colspan=3|
|-
| 6 ||colspan=2| Fixed frame mode.
|-
| 7 ||colspan=2| Superframe mode.
|-
| 8 ||colspan=2| LD/ELD full bitreservoir for packet based transmission
|}

The table below gives the bitrate limit for each variable bitrate mode. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc.cpp] HE and HEv2 will often end up with actual bitrates far below these limits.
{|class="wikitable"
!rowspan=2|AACENC_BITRATEMODE (VBR Modes) !!rowspan=2|Mode !!colspan=2|Bitrate per channel (LC) !!rowspan=2|AOTs
|-
! Mono !! Stereo <sup>a</sup>
|-
| 1 || VBR || 32 kbps || 20 kbps || LC, HE, HEv2
|-
| 2 || VBR || 40 kbps || 32 kbps || LC, HE, HEv2
|-
| 3 || VBR || 56 kbps || 48 kbps || LC, HE, HEv2
|-
| 4 || VBR || 72 kbps || 64 kbps || LC
|-
| 5 || VBR || 112 kbps || 96 kbps || LC
|}

<sup>a</sup> Note that a "stereo" channel is any that is [[Joint stereo|bonded with another channel]], as noted with a plus sign in the [[#Channel Layouts|channel layouts]] table.

====Example Bitrate Calculations====

{|class=wikitable
! Profile !! VBR Mode !! Channel layout !! Expected stream bitrate
|-
| LC || 3 || L+R || 2 "stereo" channels at 48kbps = 96kbps
|-
| LC || 3 || C, L+R || 1 "mono" center channel at 56 kbps and 2 "stereo" channels at 48kbps = 152kbps
|-
| LC || 4 || C, L+R, LS+RS, LFE || 1 "mono" center channel and 1 mono LFE channel each at 72kbps, and 4 "stereo" channels (2 sets of 2) each at 64kbps = 400kbps
|}
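
The calculations in the table above can be reproduced with a short script. This is only an illustrative sketch: the function name <code>expected_vbr_bitrate</code> and the layout-string parsing are assumptions made for this example and are not part of the FDK library. The per-channel limits are copied from the VBR table in this section.

```python
# Per-channel VBR bitrate limits in kbps, copied from the table above:
# mode -> (mono limit, "stereo" limit)
VBR_LIMITS_KBPS = {
    1: (32, 20),
    2: (40, 32),
    3: (56, 48),
    4: (72, 64),
    5: (112, 96),
}


def expected_vbr_bitrate(mode, layout):
    """Return the expected stream bitrate limit in kbps (hypothetical helper).

    `layout` uses the notation of the channel layouts table,
    e.g. "C, L+R, LS+RS, LFE". Channels joined by "+" are counted as
    "stereo" channels, all others as "mono" channels.
    """
    mono_rate, stereo_rate = VBR_LIMITS_KBPS[mode]
    total = 0
    for element in layout.split(","):
        channels = [name.strip() for name in element.split("+")]
        rate = stereo_rate if len(channels) > 1 else mono_rate
        total += rate * len(channels)
    return total
```

For instance, <code>expected_vbr_bitrate(4, "C, L+R, LS+RS, LFE")</code> adds two mono channels at 72 kbps and four "stereo" channels at 64 kbps, matching the 400 kbps figure in the last row.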

==Bandwidth==
[[Image:FDK filter.png|400px|thumb|A spectrogram showing the effect of the FDK AAC low-pass filter.]]
The default bandwidth (or low-pass filter cutoff) for each [[#Bitrate Modes|bitrate mode]] will be the minimum of the appropriate value in the tables below or half the [[#Sample Rates|sample rate]]. This can be overridden, but the maximum value is 20000 Hz. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/bandwidth.cpp]

The fdk-aac parameter is AACENC_BANDWIDTH. More information can be found in the official documentation, section 3.1 ''Bandwidth''.

=== HE-AAC/SBR ===

The HE-AAC and HE-AACv2 profiles encode audio using AAC-LC at one half the sample rate, relying on [[Spectral Band Replication]] (SBR) to attempt reconstruction of the missing higher frequencies. The end result is an apparent full bandwidth transmission (as if no low-pass filter was applied), even though the actual AAC-LC encoded audio is only storing frequencies up to 1/4 the original sample rate.
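
The arithmetic behind this can be written out as a minimal sketch. The function name is hypothetical, and the sketch assumes the dual-rate case described above, where the AAC-LC core runs at half the input sample rate.

```python
def sbr_core_parameters(sample_rate):
    """Return (core sample rate, core bandwidth) in Hz for dual-rate SBR.

    The AAC-LC core is assumed to run at half the input sample rate, so it
    can only carry frequencies up to half of that again (one quarter of the
    input sample rate). Everything above is left to SBR reconstruction.
    """
    core_rate = sample_rate // 2
    core_bandwidth = core_rate // 2
    return core_rate, core_bandwidth
```

With 44100 Hz input, the core would run at 22050 Hz and store frequencies up to 11025 Hz.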


=== VBR Modes ===
{|class="wikitable"
! AACENC_BITRATEMODE !! Mono !! Two or More Channels
|-
| 1 ||colspan=2| 13050 Hz
|-
| 2 ||colspan=2| 13050 Hz
|-
| 3 ||colspan=2| 14260 Hz
|-
| 4 ||colspan=2| 15500 Hz
|-
| 5 ||colspan=2| Full range, no filter
|}
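
The rule for the default bandwidth (the table value or half the sample rate, whichever is lower) can be sketched as follows. The function name is an assumption for this example, and treating mode 5 ("full range, no filter") as half the sample rate is an interpretation of the table, not something taken from the library source.

```python
# Default low-pass cutoff in Hz per VBR mode, copied from the table above.
# None stands for "full range, no filter".
VBR_BANDWIDTH_HZ = {1: 13050, 2: 13050, 3: 14260, 4: 15500, 5: None}


def default_vbr_bandwidth(mode, sample_rate):
    """Return the default bandwidth in Hz for a VBR mode (hypothetical helper)."""
    nyquist = sample_rate // 2
    table_value = VBR_BANDWIDTH_HZ[mode]
    if table_value is None:
        # Assumption: "no filter" means everything up to half the sample rate.
        return nyquist
    return min(table_value, nyquist)
```

For 22050 Hz input in mode 4, the table value of 15500 Hz exceeds half the sample rate, so the sketch returns 11025 Hz.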

=== CBR Mode ===
{|class="wikitable"
! AOT/Sample Rates !! Bitrate per channel !! Mono !! Two or More Channels
|-
|rowspan=8| LC / Any
| Below 12kbps || 3700 Hz || 5000 Hz
|-
| 12-20 kbps || 5000 Hz || 6400 Hz
|-
| 20-28 kbps || 6900 Hz || 9640 Hz
|-
| 28-40 kbps || 9600 Hz || 13050 Hz
|-
| 40-56 kbps || 12060 Hz || 14260 Hz
|-
| 56-72 kbps || 13950 Hz || 15500 Hz
|-
| 72-96 kbps || 14200 Hz || 16120 Hz
|-
| 96kbps and above ||colspan=2| 17000 Hz
|-
|colspan=4|...
|-
|rowspan=2| LD / 44100 Hz
| 56kbps || 11000 Hz || 12900 Hz
|-
| 64kbps || 14400 Hz || 15500 Hz
|-
|colspan=4|...
|}
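
A lookup over the AAC-LC rows of the table above might look like the sketch below. The names are hypothetical, and the handling of range boundaries (lower bound inclusive, upper bound exclusive) is an assumption, since the table does not say which row a value such as exactly 20 kbps falls into.

```python
import bisect

# Upper edges (kbps per channel, treated as exclusive) of the LC rows above.
_LC_EDGES_KBPS = [12, 20, 28, 40, 56, 72, 96]

# (mono, two-or-more-channels) cutoffs in Hz, one pair per row.
# The last pair is the "96kbps and above" row.
_LC_CUTOFFS_HZ = [
    (3700, 5000),
    (5000, 6400),
    (6900, 9640),
    (9600, 13050),
    (12060, 14260),
    (13950, 15500),
    (14200, 16120),
    (17000, 17000),
]


def default_cbr_lc_bandwidth(kbps_per_channel, channels, sample_rate):
    """Return the default AAC-LC CBR bandwidth in Hz (hypothetical helper)."""
    row = bisect.bisect_right(_LC_EDGES_KBPS, kbps_per_channel)
    mono, multi = _LC_CUTOFFS_HZ[row]
    cutoff = mono if channels == 1 else multi
    # The default never exceeds half the sample rate.
    return min(cutoff, sample_rate // 2)
```

For example, a 128 kbps stereo stream (64 kbps per channel) at 44100 Hz lands in the 56-72 kbps row and gets a 15500 Hz cutoff.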

== Sample Format ==

The FDK library is based on fixed-point math and only supports 16-bit integer PCM input.

== Sample Rates ==

FDK library officially supports sample rates for input of 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, and 96000 Hz.

See [[#GetInvInt table limit|Issues/GetInvInt table limit]] if experiencing crashes with high sample rates and VBR.

Also see [[#Recommended Sampling Rate and Bitrate Combinations|Recommended Sampling Rate and Bitrate Combinations]].

== Channel Layouts ==

{| class="wikitable"
! Channels !! Layout !! Mode !! Description
|-
| 1 || C || MODE_1 || Mono
|-
| 2 || L+R || MODE_2 || Stereo
|-
| 3 || C, L+R || MODE_1_2 ||
|-
| 4 || C, L+R, Rear || MODE_1_2_1 || fdkaac calls it "C L R Cs"
|-
| 5 || C, L+R, LS+RS || MODE_1_2_2 ||
|-
| 5.1 || C, L+R, LS+RS, LFE || MODE_1_2_2_1 ||
|-
| 7.1 || C, LC+RC, L+R, LS+RS, LFE || MODE_1_2_2_2_1 MODE_7_1_FRONT_CENTER ||
|-
| 7.1 (Rear) || C, L+R, LS+RS, Lrear+Rrear, LFE || MODE_7_1_REAR_SURROUND ||
|}

The plus sign (+) denotes "stereo" channels.

== Issues ==
=== GetInvInt table limit ===
As of FDK version 3.4.12, not all combinations of audio object types, bitrate modes, channel layouts, and sample rates can be used together, due to a limited table of pre-computed values used by the encoder.

For example, using 96kHz stereo input with the AAC-LC audio object type and bitrate mode 5 (VBR 96-112kbps/channel) will result in catastrophic failure: [https://github.com/mstorsjo/fdk-aac/issues/17]
 ./libFDK/include/fixpoint_math.h:459: FIXP_DBL GetInvInt(int): Assertion `(intValue > 0) && (intValue < 50)' failed.
 Aborted (core dumped)

A recent (August 2014) patch to libfdk-aac fixes most of the previously unsupported combinations [https://github.com/mstorsjo/fdk-aac/commit/9a3234055adb1e18f80571925779503c8dec5251], and is expected to be included in the next official version of the FDK AAC library.

See [[#Libav/avconv|Libav/avconv]] for a workaround.

== Recommended Sampling Rate and Bitrate Combinations ==

This table is from the documentation included in the FDK library source code. (PDF section 2.12 or source code: [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/include/aacenc_lib.h])

The following table provides an overview of recommended encoder configuration parameters which [Fraunhofer] determined by virtue of numerous listening tests.

{|class="wikitable"
! [[#Audio Object Types|Audio Object Type]] !! Bit Rate Range [bit/s] !! Supported [[#Sample Rates|Sampling Rates]] [kHz] !! Recommended Sampling Rate [kHz] !! Number of [[#Channel Layouts|Channels]]
|-
|rowspan="4"| [29] HE-AAC v2 (AAC LC + SBR + PS)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 2
|-
| 12000 - 17999 || 32.00 || 32.00 || 2
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="7"| [5] HE-AAC (AAC LC + SBR)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 1
|-
| 12000 - 17999 || 32.00 || 32.00 || 1
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 1
|-
| 16000 - 27999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 28000 - 63999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 64000 - 128000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="4"| [5] HE-AAC (AAC LC + SBR)
| 64000 - 69999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 70000 - 159999 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|-
| 160000 - 245999 || 32.00, 44.10, 48.00 || 48.00 || 5
|-
| 160000 - 265999 || 32.00, 44.10, 48.00 || 48.00 || 5.1
|-
|rowspan="6"| [2] AAC LC
| 8000 - 15999 || 11.025, 12.00, 16.00 || 12.00 || 1
|-
| 16000 - 23999 || 16.00 || 16.00 || 1
|-
| 24000 - 31999 || 16.00, 22.05, 24.00 || 24.00 || 1
|-
| 32000 - 55999 || 32.00 || 32.00 || 1
|-
| 56000 - 160000 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 160001 - 288000 || 48.00 || 48.00 || 1
|-
|rowspan="7"| [2] AAC LC
| 16000 - 23999 || 11.025, 12.00, 16.00 || 12.00 || 2
|-
| 24000 - 31999 || 16.00 || 16.00 || 2
|-
| 32000 - 39999 || 16.00, 22.05, 24.00 || 22.05 || 2
|-
| 40000 - 95999 || 32.00 || 32.00 || 2
|-
| 96000 - 111999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 112000 - 320001 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 320002 - 576000 || 48.00 || 48.00 || 2
|-
|rowspan="3"| [2] AAC LC
| 160000 - 239999 || 32.00 || 32.00 || 5, 5.1
|-
| 240000 - 279999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 280000 - 800000 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|}
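
As an illustration of how the table can be used, the sketch below looks up the recommended sampling rate for one group of rows only (AAC LC with 2 channels). The data is copied from the table above; the function and variable names are hypothetical.

```python
# (lowest bit rate, highest bit rate, recommended sampling rate in kHz)
# for AAC LC with 2 channels, copied from the table above.
AAC_LC_STEREO = [
    (16000, 23999, 12.00),
    (24000, 31999, 16.00),
    (32000, 39999, 22.05),
    (40000, 95999, 32.00),
    (96000, 111999, 32.00),
    (112000, 320001, 44.10),
    (320002, 576000, 48.00),
]


def recommended_rate_khz(bitrate):
    """Return the recommended sampling rate in kHz, or None if out of range."""
    for low, high, rate in AAC_LC_STEREO:
        if low <= bitrate <= high:
            return rate
    return None
```

A 128000 bit/s AAC LC stereo stream falls in the 112000 - 320001 row, for which 44.10 kHz is recommended.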

== (lib)fdk-aac ==

Martin Storsjö (as the opencore-amr project) maintains a source code distribution of the Fraunhofer library as fdk-aac. It is distributed in a binary form in Debian (and Debian derivatives like Ubuntu) as the package fdk-aac, which includes the libfdk-aac* and [[#aac-enc|aac-enc]] binaries.

See [[#Software Versions|Software Versions]] for latest release information.

=== Links ===
* [https://github.com/mstorsjo/fdk-aac Source] at Github
* [https://tracker.debian.org/pkg/fdk-aac fdk-aac] at Debian package tracker. Package includes libfdk-aac* and the aac-enc binary.

== aac-enc ==

fdk-aac includes a very, very basic command-line interface encoding utility, called aac-enc, that can encode to AAC from WAV.

=== Usage ===
aac-enc [-r bitrate] [-t aot] [-a afterburner] [-s sbr] [-v vbr] in.wav out.aac

;-r <bitrate>:Bitrate in bits per second (for CBR). Default is 64000.
;-t <aot>:The [[#Audio Object Types|Audio Object Type]]. Default is 2 (AAC-LC).
;-a <0,1>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-s <-1,0,1>:Spectral Band Replication (ELD AOT only). -1=Use ELD SBR auto configurator (default, recommended), 0=Disabled, 1=Enabled. Default is -1.
;-v <0-5>:[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR @ value given in -r. Default is 0.

== fdkaac ==

fdkaac is a command-line interface encoding and metadata utility. It is maintained by nu774 and is licensed under the zlib license. It employs libfdk-aac for encoding.

See [[#Software Versions|Software Versions]] for latest release information.

=== Examples ===

 # Convert a FLAC file to m4a using fdkaac configured for AAC-LC at about 50kbps/channel (100kbps for stereo).
 flac -s -d -c song.flac | fdkaac --ignorelength --profile 2 --bitrate-mode 3 -o song.m4a -
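
The same pipeline can be driven from a script. The following is a minimal sketch using Python's standard <code>subprocess</code> module; the function names are made up for this example, and it assumes that both <code>flac</code> and <code>fdkaac</code> are installed and on the search path. Only options that appear in the command above are used.

```python
import subprocess


def build_pipeline(flac_path, m4a_path, profile=2, bitrate_mode=3):
    """Return the two argument lists for the decode | encode pipeline."""
    decode = ["flac", "-s", "-d", "-c", flac_path]
    encode = [
        "fdkaac", "--ignorelength",
        "--profile", str(profile),
        "--bitrate-mode", str(bitrate_mode),
        "-o", m4a_path,
        "-",  # read the decoded WAV stream from standard input
    ]
    return decode, encode


def convert(flac_path, m4a_path):
    """Pipe the output of flac into fdkaac and return fdkaac's exit code."""
    decode, encode = build_pipeline(flac_path, m4a_path)
    with subprocess.Popen(decode, stdout=subprocess.PIPE) as decoder:
        result = subprocess.run(encode, stdin=decoder.stdout)
    return result.returncode
```

Calling <code>convert("song.flac", "song.m4a")</code> would run the equivalent of the shell command above.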

=== Usage ===

fdkaac [options] input_file

;-p, --profile <n> :The [[#Audio Object Types|Audio Object Type]].
;-b, --bitrate <n> :Bitrate in bits per second (for CBR)
;-m, --bitrate-mode <n> :[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR.
;-w, --bandwidth <n> :Frequency [[#Bandwidth|bandwidth]] in Hz (AAC LC only)
;-a, --afterburner <n>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-L, --lowdelay-sbr <-1,0,1>:Configure SBR activity on AAC ELD
:{| class=wikitable
| -1 || Use ELD SBR auto configurator
|-
| 0 || Disable SBR on ELD (default)
|-
| 1 || Enable SBR on ELD
|}
; -s, --sbr-ratio <0,1,2> :Controls activation of downsampled SBR
:{| class=wikitable
| 0 || Use lib default (default)
|-
| 1 || Downsampled SBR (default for ELD+SBR)
|-
| 2 || Dual-rate SBR (default for HE-AAC)
|}
;-f, --transport-format <n> :Transport format
:{| class=wikitable
| 0 || RAW (default, muxed into M4A)
|-
| 1 || ADIF
|-
| 2 || ADTS
|-
| 6 || LATM MCP=1
|-
| 7 || LATM MCP=0
|-
|10 || LOAS/LATM (LATM within LOAS)
|}
;-C, --adts-crc-check : Add CRC protection on ADTS header
;-h, --header-period <n> : StreamMuxConfig/PCE repetition period in transport layer
;-o <filename> : Output filename
;-G, --gapless-mode <n> : Encoder delay signaling for gapless playback
:{| class=wikitable
| 0 || iTunSMPB (default)
|-
| 1 || ISO standard (edts + sgpd)
|-
| 2 || Both
|}
;--include-sbr-delay : Count SBR decoder delay in encoder delay. This is not iTunes compatible, but is default behavior of FDK library.
;-I, --ignorelength : Ignore length of WAV header
;-S, --silent : Don't print progress messages
;--moov-before-mdat : Place moov box before mdat box on m4a output

Options for raw (headerless) input:
;-R, --raw: Treat input as raw (by default WAV is assumed)
;--raw-channels <n> : Number of channels (default: 2)
;--raw-rate <n> : Sample rate (default: 44100)
;--raw-format <spec> : Sample format, default is "S16L". Spec is as follows:
:{|
| 1st char || S(igned), U(nsigned), or F(loat)
|-
| 2nd part || bits per channel
|-
| Last char || L(ittle) or B(ig)
|}
:Last char can be omitted, in which case L is assumed. Spec is case insensitive, therefore "u16b" is same as "U16B".
:Up to 32-bit integer or 64-bit floating point format is supported as input. The FDK library, however, is [[#Sample Format|implemented based on fixed point math and only supports 16-bit integer PCM]]. Therefore, be wary of clipping. You might want to dither/noise shape beforehand when your input has higher resolution.
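
To make the format spec concrete, here is a small parser written only from the description above. It is an illustrative sketch, not fdkaac's own parsing code, and the function name is hypothetical.

```python
import re

# 1st char: S, U or F; 2nd part: bits per channel; optional last char: L or B.
_RAW_FORMAT = re.compile(r"([SUF])(\d+)([LB]?)", re.IGNORECASE)


def parse_raw_format(spec):
    """Split a spec such as "S16L" into (sample type, bits, endianness)."""
    match = _RAW_FORMAT.fullmatch(spec.strip())
    if match is None:
        raise ValueError("not a valid raw format spec: %r" % spec)
    kind, bits, endian = match.groups()
    # The endianness character may be omitted, in which case L is assumed.
    return kind.upper(), int(bits), endian.upper() or "L"
```

Because the spec is case insensitive, <code>parse_raw_format("u16b")</code> and <code>parse_raw_format("U16B")</code> give the same result.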

Tagging options:
;--tag <fcc><nowiki>:</nowiki><value> : Set iTunes predefined tag with four char code. See [https://code.google.com/p/mp4v2/wiki/iTunesMetadata iTunes Metadata].
;--tag-from-file <fcc><nowiki>:</nowiki><filename> : Same as above, but value is read from file.
;--long-tag <name><nowiki>:</nowiki><value> : Set arbitrary tag as iTunes custom metadata.
;--tag-from-json <filename[?dot_notation]>
: Read tags from JSON. By default, tags are assumed to be direct children of the root object (dictionary). Optionally, the position of the dictionary that contains the tags can be specified with dotted notation.
{|class="wikitable sortable"
! Option/Usage !! MP4 Block Modified !! class="unsortable"| Comment
|-
| --title <string> || ©nam
|-
| --artist <string> || ©ART
|-
| --album <string> || ©alb
|-
| --genre <string> || ©gen || Appears to always store the string in the "user-defined" '''©gen''' block, even if there is an ID3 genre ID that could be used with the '''gnre''' block.
|-
| --date <string> || ©day || YYYY[-MM[-DD]] format
|-
| --composer <string> || ©wrt
|-
| --grouping <string> || ©grp
|-
| --comment <string> || ©cmt
|-
| --album-artist <string> || aART
|-
| --track <number[/total]> || trkn || Block stores both track and totaltracks in one binary value
|-
| --disk <number[/total]> || disk || Block stores both disc and totaldiscs in one binary value
|-
| --tempo <n> || tmpo || Beats per minute, stored as a 16-bit integer
|}

=== Links ===
*[https://github.com/nu774/fdkaac Source code]
*[https://launchpad.net/~mc3man/+archive/ubuntu/fdkaac-encoder Ubuntu PPA]

== FFmpeg ==
libfdk-aac can be used with FFmpeg, but requires a custom build of FFmpeg. FFmpeg provides significant [https://trac.ffmpeg.org/wiki/Encode/AAC#fdk_aac documentation for using libfdk_aac] in the FFmpeg wiki.

=== Usage/Examples ===

CBR mode:
ffmpeg -i <input> -c:a libfdk_aac -b:a 128k <output>

VBR mode:
ffmpeg -i <input> -c:a libfdk_aac -vbr 3 <output>
;-afterburner:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-profile<nowiki>:</nowiki>a:The [[#Audio Object Types|Audio Object Type]]. Value is one of aac_low (LC), aac_he (HE-AAC), aac_he_v2 (HE-AACv2), aac_ld (LD), or aac_eld (ELD). Default is LC.
;-b<nowiki>:</nowiki>a:CBR bitrate
;-vbr:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].
;-cutoff:The low-pass filter cut-off in Hz. See [[#Bandwidth|Bandwidth]] for default values. FFmpeg maximum value is 20000.

=== Links ===

* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in FFmpeg source tree
* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in FFmpeg source tree

== Libav/avconv ==
libfdk-aac can be used with Libav's avconv, but requires a custom build of avconv with "--enable-libfdk-aac" passed to configure. See [https://wiki.libav.org/Encoding/aac Libav AAC encoding].

=== Usage ===

CBR mode:
avconv -i <input> -c:a libfdk_aac -b:a <bitrate> -afterburner 1 <output>

VBR mode:
avconv -i <input> -c:a libfdk_aac -flags +qscale -global_quality [1-5] -afterburner 1 <output>

;-afterburner:See ''[[#Afterburner|afterburner]]''.
;-global_quality:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].

=== FLAC to M4A example with quirks ===

This example uses a FLAC file with 24-bit/96kHz 5.1-channel audio and embedded album art to demonstrate workarounds for some quirks/bugs. The [http://www.diatonis.com/downloads/diatonis_dark-edges_02_rock_flac_6-chan_9624.zip sample used] is from the [http://www.diatonis.com/surround_sound_music.html Diatonis Free Surround Sound Music] page. The track used is titled "Rock".

avconv -i diatonis-rock.flac -vn -sample_fmt s16 -ar 48000 -c:a libfdk_aac -flags +qscale -global_quality 5 diatonis-rock.m4a

;-global_quality 5:Use [[#Bitrate Modes|VBR Mode 5]].
;-vn:Means drop all video. The FLAC source has embedded album art that can't be handled by avconv in this case. Libav apparently doesn't know how to embed cover art in M4A. It tries to use it as an MP4 video stream. Using -c:v mjpeg, as can be done with MP3, doesn't work either. See [[Nero AAC#NeroAacTag|NeroAacTag]] for a tool that can easily add M4A album art.
;-sample_fmt s16 -ar 48000:The FLAC source's 96kHz sample rate combined with VBR mode 5 triggers the [[#GetInvInt_table_limit|GetInvInt table limit]] bug in libfdk_aac 0.1.3 and earlier. These options resample the audio before sending it to the FDK encoder, to avoid the crash.

=== Links ===
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in Libav source tree
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in Libav source tree

== References ==
<references />

== Links ==
* [http://www.iis.fraunhofer.de/en/ff/amm/impl/fdkaaccodec.html Official web page]
* [https://en.wikipedia.org/wiki/Fraunhofer_FDK_AAC Fraunhofer FDK AAC] at Wikipedia
* [http://www.hydrogenaud.io/forums/index.php?showtopic=95989 Release information HydrogenAudio forums]
* [https://android.googlesource.com/platform/external/aac/+/master/ FDK in Android source code]
* [https://github.com/mstorsjo/fdk-aac fdk-aac source] (github)
* [http://sourceforge.net/p/opencore-amr/fdk-aac/ci/master/tree/ fdk-aac source code] (sourceforge)

Fraunhofer FDK AAC

2023-08-12T06:30:13Z

Artoria2e5: /* FDK License */

[[category:Encoder/Decoder]]
{{aac-encoders}}
The '''Fraunhofer FDK AAC''' is a high-quality open-source [[AAC]] [[codec|encoder]] library developed by [[Fraunhofer|Fraunhofer IIS]]. It was officially released for Android, but has been ported to other platforms.

The licensed Fraunhofer AAC codec included in Winamp (often called [[Fraunhofer#Fraunhofer IIS Codecs|FhG AAC]]) is not the same as the FDK AAC codec. While they use the same approach, they are developed by different teams, and target different platforms. The FDK library is built around fixed-point math and originally targeted low-delay communication on mobile devices.

FDK AAC is considered a favorable alternative to the [[Nero AAC]] codec, which is no longer developed.

== Software Versions ==

{|class="wikitable"
! Package/Component !! Version !! Developer/Maintainer !! License !! Description
|-
| FDK Encoder
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc_lib.cpp]
|rowspan=4| [[Fraunhofer|Fraunhofer IIS]]
|rowspan=4| [[#FDK License|FDK License]]
|rowspan=4| The FDK AAC library included in Android.
|-
| FDK SBR/PS Encoder (for HE & HEv2)
| 4.0.0  [https://android.googlesource.com/platform/external/aac/+/master/libSBRenc/src/sbr_encoder.cpp]
|-
| FDK Decoder
| 3.0.0 [https://android.googlesource.com/platform/external/aac/+/master/libAACdec/src/aacdecoder_lib.cpp]
|-
| FDK SBR/PS Decoder
| 2.2.6 [https://android.googlesource.com/platform/external/aac/+/lollipop-release/libSBRdec/src/sbrdecoder.cpp]
|-
| [[#(lib)fdk-aac|fdk-aac]]
| 2.0.0 (2018-11-22) ([https://github.com/mstorsjo/fdk-aac/releases watch]) based on FDK AAC (4.0.0/3.0.0) shared library version 2.0.0 
| Martin Storsjö/Opencore AMR project
|rowspan=2|[[#FDK License|FDK License]] with additions under [http://www.apache.org/licenses/LICENSE-2.0 Apache 2.0 license]
| The FDK AAC encoder and decoder as a portable library separate from Android.
|-
| fdk-aac (debian)
| 0.1.6-1 [https://tracker.debian.org/pkg/fdk-aac] libfdk-aac0 (shared library version 1.1.0)
|
| The Debian source package for fdk-aac. Includes libfdk-aac* and the [[#aac-enc|aac-enc]] encoding front end.
|-
| [[#fdkaac|fdkaac]]
| 1.0.0 (using libfdk-aac 0.1.6)
| nu774
| zlib
| An advanced front-end to the FDK AAC encoder using libfdk-aac.
|-
| [[#FFmpeg|FFmpeg]]/[[#Libav/avconv|Libav]] support
| Libav: [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c encode], [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c decode] FFmpeg: [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c encode], [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c decode]
| Martin Storsjö
| [http://www.isc.org/downloads/software-support-policy/isc-license/ ISC license]
| A wrapper for libfdk-aac that adds support to FFmpeg and Libav/avconv. It is included in both projects, and is the recommended AAC encoder for FFmpeg.
|}

== FDK License ==
The license included by Fraunhofer in the FDK source code specifically allows distribution in source or binary forms, but does not license patented technologies described by the source code. It goes on to say "you may ''use'' this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses".<ref>[https://android.googlesource.com/platform/external/aac/+/master/NOTICE NOTICE file], fdkaac</ref> As this governs ''use'', it should not have anything to do with distribution. The FSF considers it fishy to have an invitation to purchase patent licenses in the text, but concedes that "any program is potentially threatened by patents". Considering [[AAC#Patent situation|the patent terms]], all AAC software is indeed equally affected.

This license puts a limitation on charging for software that includes the library, leading Debian to consider it non-free. Debian does not comment on the patent situation.

The position of FFmpeg is that although the license is GPL-incompatible (and therefore nondistributable with GPL parts), it is acceptable to distribute the library with LGPL parts.<ref><code>ffmpeg -license</code> command output</ref> FFmpeg [https://ffmpeg.org/legal.html does not care about patents]. (The AAC patent license covers both encoder ''and'' decoder, so using fdk_aac does not add patent violations to FFmpeg.)

=== Free Software? ===
{| class=wikitable
|-
! Party !! Classification !! Note
|-
| Debian || {{no|Non-free}} || [https://tracker.debian.org/pkg/fdk-aac][https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=694257]
|-
| Fedora/Red Hat || {{yes|Free}} || [https://fedoraproject.org/wiki/Licensing/FDK-AAC]
|-
| FSF || {{maybe|Free (but warns about patents)}} || [https://www.gnu.org/licenses/license-list.html#fdk]
|}

== Afterburner ==
''Afterburner'' is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature.

== Audio Object Types ==
The library supports the following MPEG-2/4 AOTs:
{| class="wikitable"
! Object Type ID !! Audio Object Type !! Description
|-
|2 || AAC-LC || "AAC Profile" MPEG-2 Low-complexity (LC) combined with MPEG-4 Perceptual Noise Substitution (PNS)
|-
|5 || HE-AAC || AAC LC + SBR (Spectral Band Replication)
|-
|29 || HE-AAC v2 || AAC LC + SBR + PS (Parametric Stereo)
|-
|23 || AAC-LD || "Low Delay Profile" used for real-time communication
|-
|39 || AAC-ELD || Enhanced Low Delay
|-
|129|| MPEG-2 AAC LC ||
|-
|132|| MPEG-2 HE-AAC (SBR) ||
|-
|156|| MPEG-2 HE-AAC v2 (SBR+PS) ||
|}

== Bitrate Modes ==

{|class="wikitable"
! AACENC_BITRATEMODE !! Mode !! Stream Bitrate
|-
| 0 || [[Constant Bitrate]] (CBR) || As specified by AACENC_BITRATE
|-
| 1-5 || [[Variable Bitrate]] (VBR) || Calculated based on channel layout (See table below)
|-
|colspan=3|
|-
| 6 ||colspan=2| Fixed frame mode.
|-
| 7 ||colspan=2| Superframe mode.
|-
| 8 ||colspan=2| LD/ELD full bitreservoir for packet based transmission
|}

The table below gives the bitrate limit for each variable bitrate mode. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/aacenc.cpp] HE and HEv2 will often end up with actual bitrates far below these limits.
{|class="wikitable"
!rowspan=2|AACENC_BITRATEMODE (VBR Modes) !!rowspan=2|Mode !!colspan=2|Bitrate per channel (LC) !!rowspan=2|AOTs
|-
! Mono !! Stereo <sup>a</sup>
|-
| 1 || VBR || 32 kbps || 20 kbps || LC, HE, HEv2
|-
| 2 || VBR || 40 kbps || 32 kbps || LC, HE, HEv2
|-
| 3 || VBR || 56 kbps || 48 kbps || LC, HE, HEv2
|-
| 4 || VBR || 72 kbps || 64 kbps || LC
|-
| 5 || VBR || 112 kbps || 96 kbps || LC
|}

<sup>a</sup> Note that a "stereo" channel is any that is [[Joint stereo|bonded with another channel]], as noted with a plus sign in the [[#Channel Layouts|channel layouts]] table.

====Example Bitrate Calculations====

{|class=wikitable
! Profile !! VBR Mode !! Channel layout !! Expected stream bitrate
|-
| LC || 3 || L+R || 2 "stereo" channels at 48kbps = 96kbps
|-
| LC || 3 || C, L+R || 1 "mono" center channel at 56 kbps and 2 "stereo" channels at 48kbps = 152kbps
|-
| LC || 4 || C, L+R, LS+RS, LFE || 1 "mono" center channel and 1 mono LFE channel each at 72kbps, and 4 "stereo" channels (2 sets of 2) each at 64kbps = 400kbps
|}

==Bandwidth==
[[Image:FDK filter.png|400px|thumb|A spectrogram showing the effect of the FDK AAC low-pass filter.]]
The default bandwidth (or low-pass filter cutoff) for each [[#Bitrate Modes|bitrate mode]] will be the minimum of the appropriate value in the tables below or half the [[#Sample Rates|sample rate]]. This can be overridden, but the maximum value is 20000 Hz. [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/src/bandwidth.cpp]

The fdk-aac parameter is AACENC_BANDWIDTH. More information can be found in the official documentation, section 3.1 ''Bandwidth''.

=== HE-AAC/SBR ===

The HE-AAC and HE-AACv2 profiles encode audio using AAC-LC at one half the sample rate, relying on [[Spectral Band Replication]] (SBR) to attempt reconstruction of the missing higher frequencies. The end result is an apparent full bandwidth transmission (as if no low-pass filter was applied), even though the actual AAC-LC encoded audio is only storing frequencies up to 1/4 the original sample rate.


=== VBR Modes ===
{|class="wikitable"
! AACENC_BITRATEMODE !! Mono !! Two or More Channels
|-
| 1 ||colspan=2| 13050 Hz
|-
| 2 ||colspan=2| 13050 Hz
|-
| 3 ||colspan=2| 14260 Hz
|-
| 4 ||colspan=2| 15500 Hz
|-
| 5 ||colspan=2| Full range, no filter
|}

=== CBR Mode ===
{|class="wikitable"
! AOT/Sample Rates !! Bitrate per channel !! Mono !! Two or More Channels
|-
|rowspan=8| LC / Any
| Below 12kbps || 3700 Hz || 5000 Hz
|-
| 12-20 kbps || 5000 Hz || 6400 Hz
|-
| 20-28 kbps || 6900 Hz || 9640 Hz
|-
| 28-40 kbps || 9600 Hz || 13050 Hz
|-
| 40-56 kbps || 12060 Hz || 14260 Hz
|-
| 56-72 kbps || 13950 Hz || 15500 Hz
|-
| 72-96 kbps || 14200 Hz || 16120 Hz
|-
| 96kbps and above ||colspan=2| 17000 Hz
|-
|colspan=4|...
|-
|rowspan=2| LD / 44100 Hz
| 56kbps || 11000 Hz || 12900 Hz
|-
| 64kbps || 14400 Hz || 15500 Hz
|-
|colspan=4|...
|}

== Sample Format ==

The FDK library is based on fixed-point math and only supports 16-bit integer PCM input.

== Sample Rates ==

FDK library officially supports sample rates for input of 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, and 96000 Hz.

See [[#GetInvInt table limit|Issues/GetInvInt table limit]] if experiencing crashes with high sample rates and VBR.

Also see [[#Recommended Sampling Rate and Bitrate Combinations|Recommended Sampling Rate and Bitrate Combinations]].

== Channel Layouts ==

{| class="wikitable"
! Channels !! Layout !! Mode !! Description
|-
| 1 || C || MODE_1 || Mono
|-
| 2 || L+R || MODE_2 || Stereo
|-
| 3 || C, L+R || MODE_1_2 ||
|-
| 4 || C, L+R, Rear || MODE_1_2_1 || fdkaac calls it "C L R Cs"
|-
| 5 || C, L+R, LS+RS || MODE_1_2_2 ||
|-
| 5.1 || C, L+R, LS+RS, LFE || MODE_1_2_2_1 ||
|-
| 7.1 || C, LC+RC, L+R, LS+RS, LFE || MODE_1_2_2_2_1 MODE_7_1_FRONT_CENTER ||
|-
| 7.1 (Rear) || C, L+R, LS+RS, Lrear+Rrear, LFE || MODE_7_1_REAR_SURROUND ||
|}

The plus sign (+) denotes "stereo" channels.

== Issues ==
=== GetInvInt table limit ===
As of FDK version 3.4.12, not all combinations of audio object types, bitrate modes, channel layouts, and sample rates can be used together, due to a limited table of pre-computed values used by the encoder.

For example, using 96kHz stereo input with the AAC-LC audio object type and bitrate mode 5 (VBR 96-112kbps/channel) will result in catastrophic failure: [https://github.com/mstorsjo/fdk-aac/issues/17]
 ./libFDK/include/fixpoint_math.h:459: FIXP_DBL GetInvInt(int): Assertion `(intValue > 0) && (intValue < 50)' failed.
 Aborted (core dumped)

A recent (August 2014) patch to libfdk-aac fixes most of the previously unsupported combinations [https://github.com/mstorsjo/fdk-aac/commit/9a3234055adb1e18f80571925779503c8dec5251], and is expected to be included in the next official version of the FDK AAC library.

See [[#Libav/avconv|Libav/avconv]] for a workaround.

== Recommended Sampling Rate and Bitrate Combinations ==

This table is from the documentation included in the FDK library source code. (PDF section 2.12 or source code: [https://android.googlesource.com/platform/external/aac/+/master/libAACenc/include/aacenc_lib.h])

The following table provides an overview of recommended encoder configuration parameters which [Fraunhofer] determined by virtue of numerous listening tests.

{|class="wikitable"
! [[#Audio Object Types|Audio Object Type]] !! Bit Rate Range [bit/s] !! Supported [[#Sample Rates|Sampling Rates]] [kHz] !! Recommended Sampling Rate [kHz] !! Number of [[#Channel Layouts|Channels]]
|-
|rowspan="4"| [29] HE-AAC v2 (AAC LC + SBR + PS)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 2
|-
| 12000 - 17999 || 32.00 || 32.00 || 2
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="7"| [5] HE-AAC (AAC LC + SBR)
| 8000 - 11999 || 22.05, 24.00 || 24.00 || 1
|-
| 12000 - 17999 || 32.00 || 32.00 || 1
|-
| 18000 - 39999 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 40000 - 56000 || 32.00, 44.10, 48.00 || 48.00 || 1
|-
| 16000 - 27999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 28000 - 63999 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 64000 - 128000 || 32.00, 44.10, 48.00 || 48.00 || 2
|-
|rowspan="4"| [5] HE-AAC (AAC LC + SBR)
| 64000 - 69999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 70000 - 159999 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|-
| 160000 - 245999 || 32.00, 44.10, 48.00 || 48.00 || 5
|-
| 160000 - 265999 || 32.00, 44.10, 48.00 || 48.00 || 5.1
|-
|rowspan="6"| [2] AAC LC
| 8000 - 15999 || 11.025, 12.00, 16.00 || 12.00 || 1
|-
| 16000 - 23999 || 16.00 || 16.00 || 1
|-
| 24000 - 31999 || 16.00, 22.05, 24.00 || 24.00 || 1
|-
| 32000 - 55999 || 32.00 || 32.00 || 1
|-
| 56000 - 160000 || 32.00, 44.10, 48.00 || 44.10 || 1
|-
| 160001 - 288000 || 48.00 || 48.00 || 1
|-
|rowspan="7"| [2] AAC LC
| 16000 - 23999 || 11.025, 12.00, 16.00 || 12.00 || 2
|-
| 24000 - 31999 || 16.00 || 16.00 || 2
|-
| 32000 - 39999 || 16.00, 22.05, 24.00 || 22.05 || 2
|-
| 40000 - 95999 || 32.00 || 32.00 || 2
|-
| 96000 - 111999 || 32.00, 44.10, 48.00 || 32.00 || 2
|-
| 112000 - 320001 || 32.00, 44.10, 48.00 || 44.10 || 2
|-
| 320002 - 576000 || 48.00 || 48.00 || 2
|-
|rowspan="3"| [2] AAC LC
| 160000 - 239999 || 32.00 || 32.00 || 5, 5.1
|-
| 240000 - 279999 || 32.00, 44.10, 48.00 || 32.00 || 5, 5.1
|-
| 280000 - 800000 || 32.00, 44.10, 48.00 || 44.10 || 5, 5.1
|}

== (lib)fdk-aac ==

Martin Storsjö (as the opencore-amr project) maintains a source code distribution of the Fraunhofer library as fdk-aac. It is distributed in a binary form in Debian (and Debian derivatives like Ubuntu) as the package fdk-aac, which includes the libfdk-aac* and [[#aac-enc|aac-enc]] binaries.

See [[#Software Versions|Software Versions]] for latest release information.

=== Links ===
* [https://github.com/mstorsjo/fdk-aac Source] at GitHub
* [https://tracker.debian.org/pkg/fdk-aac fdk-aac] at Debian package tracker. Package includes libfdk-aac* and the aac-enc binary.

== aac-enc ==

fdk-aac includes a very, very basic command-line interface encoding utility, called aac-enc, that can encode to AAC from WAV.

=== Usage ===
aac-enc [-r bitrate] [-t aot] [-a afterburner] [-s sbr] [-v vbr] in.wav out.aac

;-r <bitrate>:Bitrate in bits per second (for CBR). Default is 64000.
;-t <aot>:The [[#Audio Object Types|Audio Object Type]]. Default is 2 (AAC-LC).
;-a <0,1>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-s <-1,0,1>:Spectral Band Replication (ELD AOT only). -1=Use ELD SBR auto configurator (recommended), 0=Disabled, 1=Enabled. Default is -1.
;-v <0-5>:[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR @ value given in -r. Default is 0.

== fdkaac ==

fdkaac is a command-line interface encoding and metadata utility. It is maintained by nu774 and is licensed under the zlib license. It employs libfdk-aac for encoding.

See [[#Software Versions|Software Versions]] for latest release information.

=== Examples ===

# Convert a FLAC file to m4a using fdkaac configured for AAC-LC at about 50kbps/channel (100kbps for stereo).
flac -s -d -c song.flac | fdkaac --ignorelength --profile 2 --bitrate-mode 3 -o song.m4a -

=== Usage ===

fdkaac [options] input_file

;-p, --profile <n> :The [[#Audio Object Types|Audio Object Type]].
;-b, --bitrate <n> :Bitrate in bits per second (for CBR)
;-m, --bitrate-mode <n> :[[#Bitrate Modes|Bitrate mode]]. Only 0-5 used. 0=CBR.
;-w, --bandwidth <n> :Frequency [[#Bandwidth|bandwidth]] in Hz (AAC LC only)
;-a, --afterburner <n>:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-L, --lowdelay-sbr <-1,0,1>:Configure SBR activity on AAC ELD
:{| class=wikitable
| -1 || Use ELD SBR auto configurator
|-
| 0 || Disable SBR on ELD (default)
|-
| 1 || Enable SBR on ELD
|}
; -s, --sbr-ratio <0,1,2> :Controls activation of downsampled SBR
:{| class=wikitable
| 0 || Use lib default (default)
|-
| 1 || Downsampled SBR (default for ELD+SBR)
|-
| 2 || Dual-rate SBR (default for HE-AAC)
|}
;-f, --transport-format <n> :Transport format
:{| class=wikitable
| 0 || RAW (default, muxed into M4A)
|-
| 1 || ADIF
|-
| 2 || ADTS
|-
| 6 || LATM MCP=1
|-
| 7 || LATM MCP=0
|-
|10 || LOAS/LATM (LATM within LOAS)
|}
;-C, --adts-crc-check : Add CRC protection on ADTS header
;-h, --header-period <n> : StreamMuxConfig/PCE repetition period in transport layer
;-o <filename> : Output filename
;-G, --gapless-mode <n> : Encoder delay signaling for gapless playback
:{| class=wikitable
| 0 || iTunSMPB (default)
|-
| 1 || ISO standard (edts + sgpd)
|-
| 2 || Both
|}
;--include-sbr-delay : Count SBR decoder delay in encoder delay. This is not iTunes compatible, but is default behavior of FDK library.
;-I, --ignorelength : Ignore length of WAV header
;-S, --silent : Don't print progress messages
;--moov-before-mdat : Place moov box before mdat box on m4a output

Options for raw (headerless) input:
;-R, --raw: Treat input as raw (by default WAV is assumed)
;--raw-channels <n> : Number of channels (default: 2)
;--raw-rate <n> : Sample rate (default: 44100)
;--raw-format <spec> : Sample format, default is "S16L". Spec is as follows:
:{|
| 1st char || S(igned), U(nsigned), or F(loat)
|-
| 2nd part || bits per channel
|-
| Last char || L(ittle) or B(ig)
|}
:Last char can be omitted, in which case L is assumed. Spec is case insensitive, therefore "u16b" is same as "U16B".
:Up to 32-bit integer or 64-bit floating point format is supported as input. The FDK library, however, is [[#Sample Format|implemented based on fixed point math and only supports 16-bit integer PCM]]. Therefore, be wary of clipping. You might want to dither/noise shape beforehand when your input has higher resolution.
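
The raw format spec described above can be illustrated with a small parser. This is only a sketch of the rules as stated in this section, not fdkaac's own parsing code; the names <code>RawFormat</code> and <code>parse_raw_format</code> are hypothetical.

```python
import re
from typing import NamedTuple


class RawFormat(NamedTuple):
    kind: str    # "S" (signed), "U" (unsigned) or "F" (float)
    bits: int    # bits per channel
    endian: str  # "L" (little) or "B" (big)


# Hypothetical helper: mirrors the spec text above, not fdkaac internals.
# 1st char S/U/F, then the bit depth, then an optional L/B suffix.
_SPEC = re.compile(r"^([SUF])(\d+)([LB]?)$", re.IGNORECASE)


def parse_raw_format(spec: str) -> RawFormat:
    """Parse a spec such as "S16L"; the spec is case insensitive."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"not a valid raw format spec: {spec!r}")
    kind, bits_text, endian = match.groups()
    kind = kind.upper()
    bits = int(bits_text)
    # An omitted last char means little endian.
    endian = endian.upper() or "L"
    # Limits as stated above: up to 32-bit integer or 64-bit float input.
    limit = 64 if kind == "F" else 32
    if bits == 0 or bits > limit:
        raise ValueError(f"unsupported bit depth in {spec!r}")
    return RawFormat(kind, bits, endian)


if __name__ == "__main__":
    for example in ("S16L", "u16b", "F32"):
        print(example, "->", parse_raw_format(example))
```

Because the suffix is optional and case is ignored, "s16" and "S16L" describe the same format under these rules.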

Tagging options:
;--tag <fcc><nowiki>:</nowiki><value> : Set iTunes predefined tag with four char code. See [https://code.google.com/p/mp4v2/wiki/iTunesMetadata iTunes Metadata].
;--tag-from-file <fcc><nowiki>:</nowiki><filename> : Same as above, but value is read from file.
;--long-tag <name><nowiki>:</nowiki><value> : Set arbitrary tag as iTunes custom metadata.
;--tag-from-json <filename[?dot_notation]>
: Read tags from JSON. By default, tags are assumed to be direct children of the root object (dictionary). Optionally, the position of the dictionary that contains the tags can be specified with dotted notation.
{|class="wikitable sortable"
! Option/Usage !! MP4 Block Modified !!class="unsortable"| Comment
|-
| --title <string> || ©nam
|-
| --artist <string> || ©ART
|-
| --album <string> || ©alb
|-
| --genre <string> || ©gen || Appears to always store the string in the "user-defined" '''©gen''' block, even if there is an ID3 genre id that could be used with the '''gnre''' block.
|-
| --date <string> || ©day || YYYY[-MM[-DD]] format
|-
| --composer <string> || ©wrt
|-
| --grouping <string> || ©grp
|-
| --comment <string> || ©cmt
|-
| --album-artist <string> || aART
|-
| --track <number[/total]> || trkn || Block stores both track and totaltracks in one binary value
|-
| --disk <number[/total]> || disk || Block stores both disc and totaldiscs in one binary value
|-
| --tempo <n> || tmpo || Beats per minute, stored as a 16-bit integer
|}
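
When calling fdkaac from a script, the tagging options in the table above can be assembled into an argument list. The following is a minimal sketch: the function name <code>fdkaac_args</code> and its defaults are assumptions made for illustration, and only options listed in this section are used. It builds the argument list without running anything.

```python
from typing import Dict, List

# Long option names copied from the table above.
TAG_OPTIONS = (
    "title", "artist", "album", "genre", "date", "composer",
    "grouping", "comment", "album-artist", "track", "disk", "tempo",
)


def fdkaac_args(output: str, tags: Dict[str, str],
                bitrate_mode: int = 3, profile: int = 2) -> List[str]:
    """Build an fdkaac command line that reads WAV data from stdin ("-").

    Hypothetical helper: the defaults (AAC-LC, bitrate mode 3) follow the
    FLAC example earlier in this section.
    """
    if not 0 <= bitrate_mode <= 5:
        raise ValueError("bitrate mode must be in the range 0-5")
    args = ["fdkaac", "--profile", str(profile),
            "--bitrate-mode", str(bitrate_mode)]
    for name, value in tags.items():
        if name not in TAG_OPTIONS:
            raise ValueError(f"unknown tag option: {name}")
        args += [f"--{name}", str(value)]
    args += ["-o", output, "-"]
    return args


if __name__ == "__main__":
    print(fdkaac_args("song.m4a", {"title": "Rock", "track": "2/10"}))
```

The resulting list can be passed to <code>subprocess.run</code> with the decoded audio piped to standard input, for instance from <code>flac -s -d -c</code> as in the example above.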

=== Links ===
*[https://github.com/nu774/fdkaac Source code]
*[https://launchpad.net/~mc3man/+archive/ubuntu/fdkaac-encoder Ubuntu PPA]

== FFmpeg ==
libfdk-aac can be used with FFmpeg, but requires a custom build of FFmpeg. FFmpeg provides significant [https://trac.ffmpeg.org/wiki/Encode/AAC#fdk_aac documentation for using libfdk_aac] in the FFmpeg wiki.

=== Usage/Examples ===

CBR mode:
ffmpeg -i <input> -c:a libfdk_aac -b:a 128k <output>

VBR mode:
ffmpeg -i <input> -c:a libfdk_aac -vbr 3 <output>
;-afterburner:Enable [[#Afterburner|''Afterburner'']]. 0=Disabled, 1=Enabled (recommended). Default is 1.
;-profile<nowiki>:</nowiki>a:The [[#Audio Object Types|Audio Object Type]]. Value is one of LC, HE-AAC, HE-AACv2, LD, or ELD. Default is LC.
;-b<nowiki>:</nowiki>a:CBR bitrate
;-vbr:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].
;-cutoff:The low-pass filter cut-off in Hz. See [[#Bandwidth|Bandwidth]] for default values. FFmpeg maximum value is 20000.
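
The CBR and VBR invocations above differ only in whether <code>-b:a</code> or <code>-vbr</code> is given. A minimal sketch of a wrapper that builds either command line follows; the function name <code>ffmpeg_fdk_args</code> is hypothetical, and the sketch assumes an FFmpeg build with libfdk_aac enabled. It builds the argument list without running FFmpeg.

```python
from typing import List, Optional


def ffmpeg_fdk_args(src: str, dst: str, *,
                    bitrate: Optional[str] = None,
                    vbr: Optional[int] = None,
                    afterburner: bool = True) -> List[str]:
    """Build an ffmpeg command line for libfdk_aac in CBR or VBR mode.

    Hypothetical helper: pass exactly one of bitrate (e.g. "128k") or
    vbr (1-5), matching the two usage lines above.
    """
    if (bitrate is None) == (vbr is None):
        raise ValueError("give exactly one of bitrate or vbr")
    args = ["ffmpeg", "-i", src, "-c:a", "libfdk_aac"]
    if vbr is not None:
        if not 1 <= vbr <= 5:
            raise ValueError("vbr must be in the range 1-5")
        args += ["-vbr", str(vbr)]
    else:
        args += ["-b:a", str(bitrate)]
    args += ["-afterburner", "1" if afterburner else "0", dst]
    return args


if __name__ == "__main__":
    print(ffmpeg_fdk_args("in.flac", "out.m4a", bitrate="128k"))
    print(ffmpeg_fdk_args("in.flac", "out.m4a", vbr=3))
```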

=== Links ===

* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in FFmpeg source tree
* [https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in FFmpeg source tree

== Libav/avconv ==
libfdk-aac can be used with Libav's avconv, but requires a custom build of avconv with "--enable-libfdk-aac" passed to configure. See [https://wiki.libav.org/Encoding/aac Libav AAC encoding].

=== Usage ===

CBR mode:
avconv -i <input> -c:a libfdk_aac -b:a <bitrate> -afterburner 1 <output>

VBR mode:
avconv -i <input> -c:a libfdk_aac -flags +qscale -global_quality [1-5] -afterburner 1 <output>

;-afterburner:See ''[[#Afterburner|afterburner]]''.
;-global_quality:Values 1-5. See [[#Bitrate Modes|Bitrate mode]].

=== FLAC to M4A example with quirks ===

This example uses a FLAC file with 24-bit/96 kHz 5.1-channel audio and embedded album art to demonstrate workarounds for some quirks and bugs. The [http://www.diatonis.com/downloads/diatonis_dark-edges_02_rock_flac_6-chan_9624.zip sample used] is from the [http://www.diatonis.com/surround_sound_music.html Diatonis Free Surround Sound Music] page. The track used is titled "Rock".

avconv -i diatonis-rock.flac -vn -sample_fmt s16 -ar 48000 -c:a libfdk_aac -flags +qscale -global_quality 5 diatonis-rock.m4a

;-global_quality 5:Use [[#Bitrate Modes|VBR Mode 5]].
;-vn:Means drop all video. The FLAC source has embedded album art that can't be handled by avconv in this case. Libav apparently doesn't know how to embed cover art in M4A. It tries to use it as an MP4 video stream. Using -c:v mjpeg, as can be done with MP3, doesn't work either. See [[Nero AAC#NeroAacTag|NeroAacTag]] for a tool that can easily add M4A album art.
;-sample_fmt s16 -ar 48000:The FLAC source's 96kHz sample rate combined with VBR mode 5 triggers the [[#GetInvInt_table_limit|GetInvInt table limit]] bug in libfdk_aac 0.1.3 and earlier. These options resample the audio before sending it to the FDK encoder, to avoid the crash.

=== Links ===
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacenc.c libfdk-aacenc.c] in Libav source tree
* [https://git.libav.org/?p=libav.git;a=blob;f=libavcodec/libfdk-aacdec.c libfdk-aacdec.c] in Libav source tree

== References ==
<references />

== Links ==
* [http://www.iis.fraunhofer.de/en/ff/amm/impl/fdkaaccodec.html Official web page]
* [https://en.wikipedia.org/wiki/Fraunhofer_FDK_AAC Fraunhofer FDK AAC] at Wikipedia
* [http://www.hydrogenaud.io/forums/index.php?showtopic=95989 Release information HydrogenAudio forums]
* [https://android.googlesource.com/platform/external/aac/+/master/ FDK in Android source code]
* [https://github.com/mstorsjo/fdk-aac fdk-aac source] (github)
* [http://sourceforge.net/p/opencore-amr/fdk-aac/ci/master/tree/ fdk-aac source code] (sourceforge)

Opus

2023-08-12T05:29:37Z

Artoria2e5: /* Music encoding quality */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) and designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs: the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

Encoder version 1.1 introduced automatic speech/music detection and bandwidth detection to improve mode decisions, and made VBR less constrained, all with the aim of improving the quality/bitrate tradeoff. Versions 1.2 and 1.3 enhance these improvements further. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40kbps+. The default 20ms frame size (22.5ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a High Pass Filter and emphasis of formants and harmonics to improve intelligibility of speech especially in noisy environments much as telephones do. ''Auto'' mode will not modify the sound prior to encoding so is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!4 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!9 kbps
|mono
|8 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency, but directly increase the frame overhead because more packets are transmitted per second. In addition, as described below, they also worsen the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
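
The cost of shorter frames can be estimated with simple arithmetic. The sketch below assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 bytes + UDP 8 bytes + RTP 12 bytes); both are illustrative assumptions, as real deployments may use other transports, header compression or several frames per packet. The function name is hypothetical.

```python
def header_overhead_bps(frame_ms: float, header_bytes: int = 40) -> float:
    """Bits per second spent on packet headers alone.

    Assumes one frame per packet; header_bytes defaults to an assumed
    IPv4 + UDP + RTP stack (20 + 8 + 12 bytes).
    """
    if frame_ms <= 0:
        raise ValueError("frame duration must be positive")
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8


if __name__ == "__main__":
    # Frame sizes mentioned in this section, longest to shortest.
    for frame_ms in (60, 20, 10, 5, 2.5):
        kbps = header_overhead_bps(frame_ms) / 1000
        print(f"{frame_ms:>4} ms frames: {kbps:6.1f} kbps of header overhead")
```

Under these assumptions the default 20 ms frames cost 16 kbps in headers, which exceeds the payload bitrate of the low-bitrate SILK modes, and 2.5 ms frames cost 128 kbps.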

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
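
The two derived columns of the table follow directly from the frame size and the matched bitrate. The sketch below recomputes them; the function names are hypothetical, and the matched bitrates are simply copied from the table rather than measured.

```python
# Extra delay added to the frame size, as stated in this section.
EXTRA_DELAY_MS = 2.5

# Frame size (ms) -> bitrate (kbps) matching 64 kbps at 20 ms frames,
# copied from the table above.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Frame size plus the additional 2.5 ms delay."""
    return frame_ms + EXTRA_DELAY_MS


def fractional_increase_percent(bitrate_kbps: float,
                                reference_kbps: float = 64.0) -> float:
    """Bitrate increase relative to the 20 ms reference, in percent."""
    return (bitrate_kbps / reference_kbps - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms, kbps in MATCHED_KBPS.items():
        delay = algorithmic_delay_ms(frame_ms)
        increase = fractional_increase_percent(kbps)
        print(f"{frame_ms:>4} ms frame: {delay:4.1f} ms delay, "
              f"{kbps:5.1f} kbps ({increase:+.1f} %)")
```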

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

CELT layer closely related to CELT 0.10 implements Constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
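
The double-extension workaround for Android mentioned above can be applied in bulk. The following is a minimal Python sketch, not an official tool: the function names <code>android_friendly_name</code> and <code>rename_for_android</code> are hypothetical, and it assumes the files are Ogg-encapsulated Opus whose names end in <code>.opus</code>.

```python
from pathlib import Path

def android_friendly_name(path):
    """Return the path with '.ogg' appended if it ends in '.opus' (case-insensitive)."""
    path = Path(path)
    if path.suffix.lower() == ".opus":
        return path.with_name(path.name + ".ogg")
    return path

def rename_for_android(directory):
    """Rename every *.opus file directly inside directory; returns the new paths.

    Hypothetical helper: existing targets are never overwritten.
    """
    renamed = []
    for source in sorted(Path(directory).iterdir()):
        target = android_friendly_name(source)
        if source.is_file() and target != source and not target.exists():
            source.rename(target)
            renamed.append(target)
    return renamed
```

Files that already end in <code>.opus.ogg</code>, or that use another extension, are left untouched.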

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]
* [[loudgain]]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:28:34Z

Artoria2e5: /* Other software */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) and designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typ 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40kbps+. The default 20ms frame size (22.5ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a High Pass Filter and emphasis of formants and harmonics to improve intelligibility of speech especially in noisy environments much as telephones do. ''Auto'' mode will not modify the sound prior to encoding so is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}
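
The rows of the table above can be read as a simple lookup. The Python sketch below is only an illustration of that reading, not the encoder's actual mode-decision logic: it assumes each row acts as a lower threshold (so an intermediate target such as 10 kbps takes the row below it), uses the VBR figure of 9 kbps for the wide-band row, and the name <code>typical_speech_mode</code> is hypothetical.

```python
from bisect import bisect_right

# Rows copied from the speech table above: (minimum kbps, bandwidth, typical mode).
# Treating each row as a lower threshold is an assumption made for illustration.
SPEECH_ROWS = [
    (6, "4 kHz narrow-band", "SILK"),
    (9, "8 kHz wide-band", "SILK"),
    (12, "12 kHz super-wideband", "hybrid"),
    (16, "20 kHz", "hybrid/CELT"),
    (32, "20 kHz", "CELT"),
]

def typical_speech_mode(kbps):
    """Return (bandwidth, mode) for a mono speech bitrate target, or None below 6 kbps."""
    thresholds = [row[0] for row in SPEECH_ROWS]
    index = bisect_right(thresholds, kbps) - 1
    if index < 0:
        return None  # below the range the table lists as supported
    _, bandwidth, mode = SPEECH_ROWS[index]
    return bandwidth, mode

if __name__ == "__main__":
    for target in (5, 6, 10, 24, 64):
        print(target, "kbps:", typical_speech_mode(target))
```

Actual behaviour depends on the encoder version, the signal content and the application mode, so the output should be taken as indicative only.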

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
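
To give a feel for the scale of this overhead, the Python sketch below estimates the header bitrate for each frame size. It rests on assumptions not stated in this article: one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12). IPv6, SRTP or header compression would change the figures, and the function name is purely illustrative.

```python
# Assumption: one Opus frame per packet, 40 header bytes per packet
# (IPv4 20 + UDP 8 + RTP 12). Illustration only; not part of libopus.
HEADER_BYTES = 40

def overhead_kbps(frame_ms, header_bytes=HEADER_BYTES):
    """Header bitrate in kbps when one packet is sent every frame_ms milliseconds."""
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0

if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: {overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions the default 20 ms frame size costs 16 kbps in headers, more than the payload of a low-bitrate SILK stream, which is why 60 ms frames can be attractive for speech.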

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
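
The delay and percentage columns of the table follow directly from the frame size and the matched bitrates. The Python sketch below reproduces that arithmetic; the matched bitrates are copied from the table, while the constant and function names are hypothetical and chosen only for this illustration.

```python
# Values copied from the table above (PEAQ-matched settings for CELT 0.10).
LOOKAHEAD_MS = 2.5      # MDCT window overlap added to every frame size
REFERENCE_KBPS = 64.0   # bitrate at the default 20 ms frame size

MATCHED_KBPS = {20.0: 64.0, 10.0: 70.4, 5.0: 84.8, 2.5: 112.0}

def algorithmic_delay_ms(frame_ms):
    """Algorithmic delay: frame size plus the 2.5 ms overlap."""
    return frame_ms + LOOKAHEAD_MS

def bitrate_increase_percent(frame_ms):
    """Extra bitrate, relative to 64 kbps at 20 ms, needed to match quality."""
    return (MATCHED_KBPS[frame_ms] / REFERENCE_KBPS - 1.0) * 100.0

if __name__ == "__main__":
    for frame_ms in sorted(MATCHED_KBPS, reverse=True):
        print(f"{frame_ms:4.1f} ms frame: "
              f"{algorithmic_delay_ms(frame_ms):4.1f} ms delay, "
              f"+{bitrate_increase_percent(frame_ms):.1f} % bitrate")
```

The percentages apply only to the 64 kbps stereo reference point measured by Xiph.org and should not be extrapolated to other bitrates without further testing.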

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released on 11 Sep 2012, when RFC 6716 was standardized, although development was mostly complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

CELT layer closely related to CELT 0.10 implements Constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version, considered well tested enough for general release, was released on 5 December 2013.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both speech & music, and above it CELT performs best for both. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast. (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but the .opus filename extension is not recognized by Android, so the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
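
The double-extension workaround for Android can be applied in bulk. The following is only a minimal sketch: the function names are hypothetical, it handles a single folder without recursion, and it skips any file whose target name already exists.

```python
from pathlib import Path


def android_friendly_name(path: Path) -> Path:
    """Return the path with '.ogg' appended if it ends in '.opus', else unchanged."""
    if path.suffix.lower() == ".opus":
        return path.with_name(path.name + ".ogg")
    return path


def rename_for_android(folder: Path) -> list[Path]:
    """Rename every *.opus file directly inside folder; return the new paths."""
    renamed = []
    for src in sorted(folder.iterdir()):
        dst = android_friendly_name(src)
        # Only touch regular files that need renaming and would not overwrite anything.
        if src.is_file() and dst != src and not dst.exists():
            src.rename(dst)
            renamed.append(dst)
    return renamed
```

Files that already carry the <code>.opus.ogg</code> double extension, or any other extension, are left untouched.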

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]
* [[loudgain]]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:26:08Z

Artoria2e5: /* Channel count vs bitrate */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF), designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech. It is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
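
To give a feel for how much the packet overhead matters, the sketch below estimates the header bitrate for several frame sizes. It assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 bytes + UDP 8 bytes + RTP 12 bytes, no header compression). That transport is an assumption made for illustration, not something this article specifies, and the function names are hypothetical.

```python
# Assumed per-packet header cost: IPv4 (20) + UDP (8) + RTP (12) bytes,
# with no header compression. Other transports will give other figures.
HEADER_BYTES = 20 + 8 + 12


def packets_per_second(frame_ms: float) -> float:
    """One Opus frame per packet, so the packet rate is 1000 / frame duration."""
    return 1000.0 / frame_ms


def overhead_kbps(frame_ms: float, header_bytes: int = HEADER_BYTES) -> float:
    """Header bitrate in kbps that is added on top of the codec bitrate."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps: float, frame_ms: float) -> float:
    """Codec bitrate plus the estimated header overhead."""
    return codec_kbps + overhead_kbps(frame_ms)


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: +{overhead_kbps(frame_ms):.1f} kbps of headers")
```

Under these assumptions the default 20 ms frames add 16 kbps of headers, which is more than a 12 kbps speech stream itself, while 60 ms frames cut that to about 5.3 kbps.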

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org compared settings at matched [[PEAQ]] scores (an approximate perceptual quality assessment made in software) for the CELT 0.10 codec, which was used as the basis of the CELT layer in the Opus reference release. The results indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
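
The delay and percentage columns of the table follow from simple arithmetic, sketched below. The 2.5 ms overlap and the matched bitrates are taken from the text and table above; the function and variable names are hypothetical.

```python
OVERLAP_MS = 2.5  # MDCT window overlap stated in the text

# Matched bitrates in kbps from the table above, keyed by frame size in ms.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Algorithmic delay as described above: frame size plus the 2.5 ms overlap."""
    return frame_ms + OVERLAP_MS


def bitrate_increase_percent(frame_ms: float, baseline_ms: float = 20) -> float:
    """Extra bitrate, relative to the baseline frame size, for matched quality."""
    base = MATCHED_KBPS[baseline_ms]
    return (MATCHED_KBPS[frame_ms] / base - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in (20, 10, 5, 2.5):
        print(
            f"{frame_ms:>4} ms frame: {algorithmic_delay_ms(frame_ms):.1f} ms delay, "
            f"+{bitrate_increase_percent(frame_ms):.1f} % bitrate"
        )
```

The figures are PEAQ-based estimates for CELT 0.10, so they only indicate the general trend for current libopus releases.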

=== Channel count vs bitrate ===

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided on audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but the .opus filename extension is not recognized by Android, so using the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
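
The double-extension workaround for Android can be applied in bulk with a small script. The following is a minimal Python sketch; the function name <code>android_friendly_name</code> is hypothetical, and the sketch only computes the new name, leaving the actual rename to the caller.

```python
from pathlib import Path


def android_friendly_name(path: Path) -> Path:
    """Return the path with '.ogg' appended if it ends in '.opus'.

    Files that already carry another extension (including '.opus.ogg')
    are returned unchanged.
    """
    if path.suffix.lower() == ".opus":
        return path.with_name(path.name + ".ogg")
    return path


if __name__ == "__main__":
    # Print the proposed renames for the current directory without touching files.
    for source in sorted(Path(".").glob("*.opus")):
        print(source, "->", android_friendly_name(source))
```

To apply a rename, the caller could pass the result to <code>Path.rename</code>, for example <code>source.rename(android_friendly_name(source))</code>.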

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:26:00Z

Artoria2e5: /* Indicative bitrate and quality */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially also useful in time-delayed audio streaming, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1, automatic speech/music detection and bandwidth detection were introduced to improve mode decisions, and VBR became less constrained, all with the aim of maximizing the quality/bitrate trade-off. These improvements are further enhanced in versions 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode does not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR, 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!Typical SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers are transmitted every second, the greater the overhead. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency but directly increase the overhead, because more packets are transmitted per second. In addition, as described below, they also worsen the quality/bitrate trade-off of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
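
As a rough illustration of the overhead discussed above, the following Python sketch estimates the extra bitrate consumed by packet headers at each frame duration. It assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12, without options or extensions). Both are assumptions made for illustration rather than figures from this article, and real links may differ (IPv6, encryption, header compression). The function names are hypothetical.

```python
# Assumed per-packet header size: IPv4 (20) + UDP (8) + RTP (12) bytes.
HEADER_BYTES = 40


def packet_overhead_kbps(frame_ms: float, header_bytes: int = HEADER_BYTES) -> float:
    """Header bitrate in kbps when exactly one frame is sent per packet."""
    if frame_ms <= 0:
        raise ValueError("frame duration must be positive")
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps: float, frame_ms: float,
               header_bytes: int = HEADER_BYTES) -> float:
    """Codec bitrate plus header overhead, in kbps."""
    return codec_kbps + packet_overhead_kbps(frame_ms, header_bytes)


if __name__ == "__main__":
    # Frame durations mentioned in this section, longest first.
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: "
              f"{packet_overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions, the headers alone cost as much as a 16 kbps speech stream at the default 20 ms frame size, which is why longer frames are attractive for low-bitrate SILK.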

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.
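
The delay rule above is simple enough to express directly. This Python sketch follows the article's statement (frame size plus 2.5 ms) and accepts only the frame sizes named in this article; it does not model the roughly 4 ms synchronization delay mentioned earlier for configurations that keep SILK enabled. The names are hypothetical.

```python
# MDCT window overlap required by the CELT layer, as stated above.
CELT_OVERLAP_MS = 2.5

# Frame sizes named in this article: SILK 10/20/60 ms, CELT 2.5/5/10/20 ms.
FRAME_SIZES_MS = (2.5, 5, 10, 20, 60)


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Algorithmic delay = frame size + 2.5 ms overlap (article's rule)."""
    if frame_ms not in FRAME_SIZES_MS:
        raise ValueError(f"unsupported frame size: {frame_ms} ms")
    return frame_ms + CELT_OVERLAP_MS


if __name__ == "__main__":
    for frame_ms in FRAME_SIZES_MS:
        print(f"{frame_ms} ms frame -> {algorithmic_delay_ms(frame_ms)} ms delay")
```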

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
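
The percentage column in the table above follows directly from the matched bitrates. The Python sketch below recomputes it. The helper <code>scaled_bitrate</code> additionally applies the same ratio to a different baseline bitrate, which is an extrapolation assumed here for illustration; the source only gives matched figures for a 64 kbps baseline.

```python
# Matched bitrates (kbps) from the table above, keyed by frame size in ms.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}
BASELINE_MS = 20


def fractional_increase(frame_ms: float) -> float:
    """Bitrate increase relative to 20 ms frames, as a fraction (0.10 = +10 %)."""
    return MATCHED_KBPS[frame_ms] / MATCHED_KBPS[BASELINE_MS] - 1.0


def scaled_bitrate(baseline_kbps: float, frame_ms: float) -> float:
    """Scale another baseline by the same ratio (assumption, not from the source)."""
    return baseline_kbps * MATCHED_KBPS[frame_ms] / MATCHED_KBPS[BASELINE_MS]


if __name__ == "__main__":
    for frame_ms in MATCHED_KBPS:
        print(f"{frame_ms} ms: {fractional_increase(frame_ms) * 100:+.1f} %")
```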

== Channel count vs bitrate ==

For surround sound bitrates, use [[Bitrate#Equivalent bitrate estimates for multichannel audio]].

For ambisonics, see [https://www.mdpi.com/2076-3417/10/9/3188 AMBIQUAL listening test], paper figures 11 and 12.

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, though development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing and user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music and speech, and above it CELT performs best for both. The detection, which works without look-ahead, is not perfect, but it is usually undecided in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an <code>OPUS_SET_INBAND_FEC(2)</code> option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus GitHub] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten Emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus.
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] with 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but the .opus filename extension is not recognized by Android, so the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:23:43Z

Artoria2e5: /* Speech encoding quality */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate -- great news for communication links -- any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK. There is a noticeable quality difference at the NB/WB switch at 9 kbps VBR / 10 kbps CBR.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum-latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
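As a rough illustration of how frame duration drives packet overhead, the Python sketch below assumes one Opus frame per packet and a 40-byte header per packet (20 bytes IPv4 + 8 bytes UDP + 12 bytes RTP). Both are assumptions chosen for illustration, not figures from this article; other transports will give different numbers. The function name <code>overhead_kbps</code> is hypothetical.

```python
HEADER_BYTES = 40  # assumption: IPv4 (20) + UDP (8) + RTP (12), no extensions


def overhead_kbps(frame_ms: float, header_bytes: int = HEADER_BYTES) -> float:
    """Header overhead in kbps when each packet carries one frame of frame_ms."""
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


# Frame durations mentioned in this section, longest to shortest.
for frame_ms in (60, 20, 10, 5, 2.5):
    print(f"{frame_ms:>4} ms frames: {overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions, 20 ms frames add 16 kbps of headers, which is more than the payload of a 12 kbps speech stream, while 60 ms frames reduce the header cost to about 5.3 kbps.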

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2, 4 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.
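To make the loss of frequency resolution concrete, the Python sketch below assumes a 48 kHz sampling rate and a single MDCT spanning each frame, so that a frame of N samples yields N coefficients spread over 0–24 kHz. This is a simplified model rather than a description of libopus internals: a real encoder also switches to shorter blocks within a frame for transients, so the figures are the best-case resolution for each frame size. The function name <code>mdct_bin_spacing_hz</code> is hypothetical.

```python
SAMPLE_RATE_HZ = 48_000  # assumption: full-band operation at 48 kHz


def mdct_bin_spacing_hz(frame_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Spacing between MDCT coefficients if one transform covers the whole frame."""
    coefficients = sample_rate_hz * frame_ms / 1000.0  # one coefficient per new sample
    nyquist_hz = sample_rate_hz / 2.0
    return nyquist_hz / coefficients


for frame_ms in (20, 10, 5, 2.5):
    print(f"{frame_ms:>4} ms frame: {mdct_bin_spacing_hz(frame_ms):5.1f} Hz per coefficient")
```

In this model, each halving of the frame size doubles the spacing between coefficients, from 25 Hz at 20 ms to 200 Hz at 2.5 ms, which is why closely spaced tonal components become more expensive to code.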

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
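The two derived columns of the table can be recomputed from the frame sizes and matched bitrates. The Python sketch below does only that arithmetic; the matched bitrates are copied from the table, the 2.5 ms look-ahead is the MDCT overlap stated above, and the function names are hypothetical.

```python
LOOKAHEAD_MS = 2.5    # MDCT window overlap, as stated in this section
BASELINE_KBPS = 64.0  # reference bitrate at the default 20 ms frame size

# Frame size (ms) -> matched bitrate (kbps), copied from the table above.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Algorithmic delay: one frame plus the fixed look-ahead."""
    return frame_ms + LOOKAHEAD_MS


def bitrate_increase_percent(kbps: float) -> float:
    """Extra bitrate relative to the 64 kbps baseline, in percent."""
    return (kbps / BASELINE_KBPS - 1.0) * 100.0


for frame_ms, kbps in MATCHED_KBPS.items():
    print(f"{frame_ms:>4} ms frame: {algorithmic_delay_ms(frame_ms):4.1f} ms delay, "
          f"{kbps:5.1f} kbps ({bitrate_increase_percent(kbps):+.1f} %)")
```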

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool <code>opusinfo</code> reports detailed technical information about Opus files, including the standard compliance of the bitstream format. It is based on <code>ogginfo</code> from vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released on 11 September 2012, when RFC 6716 was standardized, although the codec was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with the bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one JavaScript port of the reference Opus implementation has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
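
The double-extension workaround described in the Android entry above can be applied in bulk. The following is a minimal Python sketch, assuming the files sit in a local directory tree. The function names are invented for this example, and the renaming step defaults to a dry run that only reports what would change.

```python
from pathlib import Path


def android_friendly_name(path):
    """Return the path with ".ogg" appended if it ends in ".opus"."""
    p = Path(path)
    if p.suffix.lower() == ".opus":
        return p.with_name(p.name + ".ogg")
    # Already renamed (e.g. "x.opus.ogg") or not an Opus file: leave as is.
    return p


def rename_for_android(directory, dry_run=True):
    """Rename every *.opus file under directory to *.opus.ogg.

    Returns the list of (old, new) pairs. Nothing is renamed on disk
    unless dry_run is False.
    """
    changes = []
    # sorted() collects all matches before any file is renamed.
    for src in sorted(Path(directory).rglob("*.opus")):
        if not src.is_file():
            continue
        dst = android_friendly_name(src)
        changes.append((src, dst))
        if not dry_run:
            src.rename(dst)
    return changes


if __name__ == "__main__":
    for old, new in rename_for_android(".", dry_run=True):
        print(old, "->", new)
```

Only the file name changes; the Ogg-encapsulated Opus stream inside the file is left untouched.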

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:22:18Z

Artoria2e5: /* Speech encoding quality */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typ 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more detail on the history and potential applications of Opus is included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''.

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that the selection of ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}
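
For quick reference, the speech table above can be read as a step function of the bitrate target. The Python sketch below copies the bitrate, bandwidth and mode columns from the table and returns the row with the highest bitrate target that does not exceed the requested one. Treating bitrates between two rows as belonging to the lower row is an assumption of this example, not the encoder's actual decision logic, and the function name is invented here.

```python
from bisect import bisect_right

# (bitrate target in kbps, bandwidth, typical mode), copied from the table.
# The 9 kbps row applies to VBR/CVBR; the table lists 10 kbps for CBR.
SPEECH_ROWS = [
    (6, "4 kHz narrow-band", "SILK"),
    (9, "8 kHz wide-band", "SILK"),
    (12, "12 kHz super-wideband", "hybrid"),
    (16, "20 kHz", "hybrid/CELT"),
    (24, "20 kHz", "hybrid/CELT"),
    (32, "20 kHz", "CELT"),
    (40, "20 kHz", "CELT"),
    (48, "20 kHz", "CELT"),
]
_TARGETS = [row[0] for row in SPEECH_ROWS]


def indicative_speech_row(bitrate_kbps):
    """Return the table row at or just below bitrate_kbps, or None below 6 kbps."""
    index = bisect_right(_TARGETS, bitrate_kbps)
    if index == 0:
        return None  # the table lists no supported mode below 6 kbps
    return SPEECH_ROWS[index - 1]


if __name__ == "__main__":
    for kbps in (5, 6, 10, 20, 64):
        print(kbps, "kbps ->", indicative_speech_row(kbps))
```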

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate (great news for communication links), any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: such links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires forcing the C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!Typical SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
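
To give a feel for the size of this overhead, the Python sketch below computes packets per second and header bitrate for the frame sizes mentioned above. It assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12). Both are assumptions of this example; other transports will differ.

```python
def packets_per_second(frame_ms):
    """Packets sent per second when each packet carries one frame."""
    return 1000.0 / frame_ms


def overhead_kbps(frame_ms, header_bytes=40):
    """Header bitrate in kbps; 40 bytes = IPv4 + UDP + RTP (assumed)."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps, frame_ms, header_bytes=40):
    """Codec bitrate plus header overhead, in kbps."""
    return codec_kbps + overhead_kbps(frame_ms, header_bytes)


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: "
              f"{packets_per_second(frame_ms):6.1f} packets/s, "
              f"{overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions a 12 kbps speech stream with 20 ms frames occupies about 28 kbps on the wire, while 60 ms frames bring the header share down to roughly 5.3 kbps.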

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).
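
The loss of frequency resolution can be estimated with simple arithmetic. An MDCT that advances by N samples per block yields N coefficients spread between 0 Hz and half the sampling rate, so the spacing between coefficients is the Nyquist frequency divided by N. The Python sketch below applies this to the frame sizes above, assuming a 48 kHz sampling rate and one long transform per frame. It is an idealized illustration and ignores the codec's band layout and block switching.

```python
def samples_per_frame(frame_ms, sample_rate_hz=48000):
    """Number of new samples covered by one frame."""
    return int(round(sample_rate_hz * frame_ms / 1000.0))


def mdct_bin_spacing_hz(frame_ms, sample_rate_hz=48000):
    """Spacing between MDCT coefficients for one transform per frame."""
    n = samples_per_frame(frame_ms, sample_rate_hz)
    return (sample_rate_hz / 2.0) / n


if __name__ == "__main__":
    for frame_ms in (20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms: {samples_per_frame(frame_ms):4d} samples, "
              f"{mdct_bin_spacing_hz(frame_ms):6.1f} Hz per coefficient")
```

Each halving of the frame size doubles the coefficient spacing, from 25 Hz at 20 ms to 200 Hz at 2.5 ms.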

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
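
The two derived columns of the table follow directly from the frame size and the matched bitrate. The Python sketch below recomputes them from the figures given above: the delay is the frame size plus 2.5 ms, and the fractional increase is relative to the 64 kbps reference. The constant and function names are chosen for this example only.

```python
LOOKAHEAD_MS = 2.5       # additional delay stated in the text
REFERENCE_KBPS = 64.0    # bitrate at the default 20 ms frame size

# Frame size in ms -> matched bitrate in kbps, copied from the table.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms):
    """Frame size plus the fixed 2.5 ms."""
    return frame_ms + LOOKAHEAD_MS


def bitrate_increase_percent(frame_ms):
    """Extra bitrate relative to 20 ms frames, in percent."""
    return (MATCHED_KBPS[frame_ms] / REFERENCE_KBPS - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in (20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frame: delay {algorithmic_delay_ms(frame_ms):4.1f} ms, "
              f"{MATCHED_KBPS[frame_ms]:5.1f} kbps "
              f"(+{bitrate_increase_percent(frame_ms):.1f} %)")
```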

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released on 11 Sep 2012, when RFC 6716 was published, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with the bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version, considered well tested enough for general use, was released on 5 December 2013.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially in the bitrate range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for both. The detection works without look-ahead and is not perfect, but it is usually undecided only on audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of reference opus in Javascript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] with 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if it is encapsulated in the Ogg container, but the .opus filename extension is not recognized by Android, so using the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
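
The double-extension workaround for Android can be scripted. The following is only a minimal sketch: the helper name <code>android_friendly_name</code> is hypothetical, and it merely computes the new file name rather than renaming anything on disk.

```python
from pathlib import Path


def android_friendly_name(path: Path) -> Path:
    """Return the path with '.ogg' appended if it ends in '.opus'.

    Hypothetical helper: files that do not end in '.opus' (including
    ones already named '*.opus.ogg') are returned unchanged.
    """
    if path.suffix.lower() == ".opus":
        return path.with_name(path.name + ".ogg")
    return path


if __name__ == "__main__":
    for name in ("album/track01.opus", "album/track02.opus.ogg", "album/cover.jpg"):
        print(name, "->", android_friendly_name(Path(name)))
```

The result could then be passed to <code>Path.rename</code> for an actual rename; that step is left out here so the sketch has no side effects.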

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-12T05:21:17Z

Artoria2e5: /* Speech encoding quality */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also, unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
! Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|4 kHz narrow-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!9 kbps VBR/CVBR, 10 kbps CBR
|8 kHz wide-band
|SILK
|Telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}
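
As a rough illustration, the rows of the speech table can be treated as a lookup from a bitrate target to the bandwidth and mode the encoder typically ends up using. This is only a sketch: the function name <code>typical_speech_mode</code> is hypothetical, rounding a bitrate down to the nearest table row is an assumption made here, and the real encoder decides per frame based on content.

```python
from bisect import bisect_right

# (minimum bitrate in kbps, audio bandwidth, typical mode), copied from the
# speech table above. Values between rows are NOT specified by the table;
# rounding down to the nearest row is an assumption of this sketch.
SPEECH_ROWS = [
    (6, "4 kHz narrow-band", "SILK"),
    (9, "8 kHz wide-band", "SILK"),
    (12, "12 kHz super-wideband", "hybrid"),
    (16, "20 kHz", "hybrid/CELT"),
    (24, "20 kHz", "hybrid/CELT"),
    (32, "20 kHz", "CELT"),
    (40, "20 kHz", "CELT"),
    (48, "20 kHz", "CELT"),
]
_THRESHOLDS = [row[0] for row in SPEECH_ROWS]


def typical_speech_mode(bitrate_kbps):
    """Return (bandwidth, mode) for the table row at or below the bitrate.

    Returns None below 6 kbps, which the table lists as unsupported.
    """
    idx = bisect_right(_THRESHOLDS, bitrate_kbps) - 1
    if idx < 0:
        return None
    _, bandwidth, mode = SPEECH_ROWS[idx]
    return bandwidth, mode


if __name__ == "__main__":
    for kbps in (5, 6, 10, 12, 20, 64):
        print(kbps, "kbps ->", typical_speech_mode(kbps))
```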

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate (great news for communication links), any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: radio links require a predictable buffer amount, which is only possible with CBR when SILK is used, but use of CBR in turn hurts SILK.

Opus 1.3+ allows forced use of SILK down to 5 kbps VBR (NB) and 6 kbps VBR (WB, requires using the raw C API with <code>OPUS_SET_BANDWIDTH</code>). However, quality is in no way guaranteed -- it's just possible.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum-latency high-quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency but directly increase the packet overhead, because more packets are transmitted per second. In addition, as we'll see below, they also worsen the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
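
To give a feel for the size of the packet overhead, the sketch below estimates it for the frame sizes mentioned above. It assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12, without options or extensions); these assumptions are not from this article, and other transports will differ.

```python
# Assumed per-packet header sizes for RTP over UDP over IPv4 (no options or
# extensions). These are assumptions of this sketch, not values from the article.
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12
HEADER_BYTES = IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES


def packets_per_second(frame_ms: float) -> float:
    """One Opus frame per packet is assumed."""
    return 1000.0 / frame_ms


def overhead_kbps(frame_ms: float, header_bytes: int = HEADER_BYTES) -> float:
    """Header bitrate in kbps for a given frame duration."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps: float, frame_ms: float) -> float:
    """Codec bitrate plus header overhead."""
    return codec_kbps + overhead_kbps(frame_ms)


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: {overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions, the headers of a 12 kbps speech stream sent in 20 ms frames use more bandwidth than the audio itself, which is why 60 ms frames can be attractive at low bitrates.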

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
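
The delay and percentage columns of the table follow from simple arithmetic, sketched below. The bitrates are copied from the table; the function names are illustrative only.

```python
# Extra delay added on top of the frame size in all modes (per the text above).
MDCT_OVERLAP_MS = 2.5

# frame size (ms) -> bitrate (kbps) listed in the table as matching
# 64 kbps at the default 20 ms frame size
MATCHED_BITRATE_KBPS = {20.0: 64.0, 10.0: 70.4, 5.0: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Frame size plus the fixed 2.5 ms overlap."""
    return frame_ms + MDCT_OVERLAP_MS


def bitrate_increase_percent(frame_ms: float, baseline_ms: float = 20.0) -> float:
    """Extra bitrate relative to the baseline frame size, in percent."""
    baseline = MATCHED_BITRATE_KBPS[baseline_ms]
    return (MATCHED_BITRATE_KBPS[frame_ms] / baseline - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in MATCHED_BITRATE_KBPS:
        print(
            f"{frame_ms:>4} ms frame: delay {algorithmic_delay_ms(frame_ms):.1f} ms, "
            f"+{bitrate_increase_percent(frame_ms):.1f} % bitrate"
        )
```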

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released on 11 September 2012, when RFC 6716 was standardized, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with a bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 December 2012 for testing and user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, when it was considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music and speech, and above it CELT performs best for both. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64, 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
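
The double-extension workaround mentioned for Android can be applied mechanically. The following is a minimal sketch of the naming rule only; the helper name <code>android_friendly_name</code> is hypothetical, and the function just computes the new file name without touching any files.

```python
from pathlib import PurePath


def android_friendly_name(filename):
    """Return the name with '.ogg' appended if the file ends in '.opus'.

    Hypothetical helper: it only computes the new name and does not rename
    anything on disk. Names that do not end in '.opus' are returned unchanged.
    """
    path = PurePath(filename)
    if path.suffix.lower() == ".opus":
        return str(path.with_name(path.name + ".ogg"))
    return str(path)
```

A caller could pass the result to a rename routine of its choice; files already named <code>.opus.ogg</code> are left alone because their final suffix is <code>.ogg</code>.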

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Joint stereo

2023-08-12T05:12:14Z

Artoria2e5: /* Opus */

'''Joint stereo''' refers to any stereo-encoding method that goes beyond simple encoding as two independent channels ("simple" or "L/R" stereo or DualMono). These methods exploit the similarities between channels and typically allow for more bits to be effectively used, increasing audio quality for a given bitrate. They are, however, not guaranteed to be perfect and could instead cause audible artifacts (mostly on older encoders).

Some file formats, such as MP3, can switch among these formats on the fly on a frame or sub-frame basis, for the sake of efficiency or quality. For example, a high-[[bitrate]] "joint stereo" [[MP3]] file may contain a mixture of SS and MS frames, or it may contain all SS frames or all MS frames. Due to some historical accident, the term as applied in MP3 refers to a mixture of coding formats. In other words, a non-"joint stereo" MP3 will never contain a mixture of frame types.

==Stereo coding methods or "modes"==

===Left-Right (L/R) or "Simple" Stereo (SS)===
Simple stereo is the most straightforward method of coding a stereo signal: each channel is treated as a completely separate entity. This can be inefficient and may adversely impact quality (as compared to other modes) when both channels contain nearly identical signals (i.e., are mono or nearly so).

===Mid-side Stereo (MS)===
Mid-side stereo coding calculates a "mid"-channel by addition of left and right channel, and a "side"-channel by subtraction, i.e.:

;Encoding
:''M'' = (''L'' + ''R'') / 2, ''S'' = (''L'' - ''R'') / 2
;Decoding
:''L'' = ''M'' + ''S'', ''R'' = ''M'' - ''S''

Whenever a signal is concentrated in the middle of the stereo image (i.e. more mono-like), mid-side stereo can achieve a significant saving in bitrate, since one can use fewer bits to encode the side-channel. Even more important is the fact that by applying the inverse matrix in the decoder, the quantization noise becomes correlated and falls in the middle of the stereo image, where it is masked by the signal.
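
The encoding and decoding formulas above can be checked with a short round trip. This is a minimal sketch operating on plain lists of samples; the function names are illustrative, and no quantization is applied, so it only demonstrates the lossless matrixing step.

```python
def ms_encode(left, right):
    """Mid/side matrixing: M = (L + R) / 2, S = (L - R) / 2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse matrixing: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # A nearly mono signal: the side channel is almost all zeros.
    left = [0.5, -0.25, 1.0]
    right = [0.5, -0.25, 0.75]
    mid, side = ms_encode(left, right)
    print(mid, side)
    print(ms_decode(mid, side))
```

For a mono-like input the side channel is close to zero, which is where the bitrate saving comes from once the channels are quantized and entropy-coded.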

Unlike [[Joint stereo#intensity stereo|intensity stereo]], which destroys phase information, mid-side coding is mathematically lossless (although subsequent lossy compression may cause phase degradation). Correctly implemented mid-side stereo does very little or no damage to the stereo image and increases compression efficiency either by reducing size or increasing overall quality. Mid-side is also simple enough to be implemented in FM radio and stereophonic vinyl.

Mid-side stereo can use coefficients other than 1 in encoding and decoding. Allowing different contributions from each channel allows the codec to adapt to off-balance sources and retain the bitrate savings. This extension is found in Opus, where an angle can be encoded.<ref name=opus/>
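
One way to picture weighted mid-side coding is as a rotation of the (''L'', ''R'') plane by an angle chosen to match the balance of the source. The sketch below is an illustration under that assumption and is not the algorithm Opus uses in its bitstream; the function names and the principal-axis choice of angle are hypothetical. An angle of 45° corresponds to ordinary mid-side coding up to a scale factor.

```python
import math


def principal_angle(left, right):
    """Angle of the dominant direction of the (L, R) samples, in radians."""
    ll = sum(l * l for l in left)
    rr = sum(r * r for r in right)
    lr = sum(l * r for l, r in zip(left, right))
    return 0.5 * math.atan2(2.0 * lr, ll - rr)


def rotate_encode(left, right, theta):
    """Rotate (L, R) so that 'mid' points along the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    mid = [c * l + s * r for l, r in zip(left, right)]
    side = [-s * l + c * r for l, r in zip(left, right)]
    return mid, side


def rotate_decode(mid, side, theta):
    """Inverse rotation, recovering (L, R) from (mid, side)."""
    c, s = math.cos(theta), math.sin(theta)
    left = [c * m - s * d for m, d in zip(mid, side)]
    right = [s * m + c * d for m, d in zip(mid, side)]
    return left, right
```

For an off-balance source in which the right channel is a quieter copy of the left, the chosen angle moves away from 45° and the side channel still collapses to (nearly) zero, which a fixed ''M'' = (''L'' + ''R'') / 2 matrix would not achieve.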

===Intensity stereo===

Intensity stereo coding is a method that achieves a saving in bitrate by replacing the left and the right signal by a single representing signal plus directional information (in the form of amplitude ratios for each frequency range). This replacement is psychoacoustically justified in the higher [[frequency]] range since the human auditory system is insensitive to the signal phase at frequencies above approximately 2 kHz.<ref>http://www.hydrogenaudio.org/forums/index.php?showtopic=1491&view=findpost&p=14091</ref> To maintain the justification, a codec may only apply intensity stereo to higher-frequency parameters.<ref name=opus>See e.g. https://web.archive.org/web/20180714000735/http://jmvalin.ca/papers/aes135_opus_celt.pdf, sections 4.5 [IS frequency], 4.5.1 [M/S angle]</ref>
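
The idea of "a single representing signal plus amplitude ratios" can be sketched for one frequency band as follows. This is a simplified illustration, assuming the band is given as a list of coefficients per channel and that the gains are chosen to preserve each channel's energy in the band; real codecs differ in how they form the downmix and quantize the ratios, and the function names are hypothetical.

```python
import math


def intensity_encode(left_band, right_band):
    """Replace one band of L and R by a mono band plus two gains."""
    mono = [(l + r) / 2 for l, r in zip(left_band, right_band)]
    e_left = sum(l * l for l in left_band)
    e_right = sum(r * r for r in right_band)
    e_mono = sum(m * m for m in mono)
    if e_mono == 0.0:
        # Nothing to scale (silent band or perfectly out-of-phase channels).
        return mono, 0.0, 0.0
    # Gains chosen so each decoded channel keeps its original band energy.
    return mono, math.sqrt(e_left / e_mono), math.sqrt(e_right / e_mono)


def intensity_decode(mono, gain_left, gain_right):
    """Rebuild both channels from the shared band; phase differences are lost."""
    left = [m * gain_left for m in mono]
    right = [m * gain_right for m in mono]
    return left, right
```

If the two channels differ only in level within the band, this reconstruction is exact; any phase difference between them is discarded, which is why the technique is restricted to higher frequencies.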

Intensity stereo is by definition a [[lossy]] coding method, so it is primarily useful at low bitrates. For coding at higher bitrates only mid-side stereo should be used.

===Parametric stereo===
Parametric stereo, found in HE-AAC, is similar to intensity stereo, except that the directional information also includes phase and correlation. The phase information makes this algorithm also capable of keeping low frequency location cues (by inter-aural time differences), while the (de-)correlation information helps add ambience by synthesizing some difference between channels.<ref name=LC-M4>Purnhagen, Heiko (October 5–8, 2004). "[http://dafx.de/paper-archive/2004/P_163.PDF Low Complexity Parametric Stereo Coding in MPEG-4]" (PDF). 7th International Conference on Digital Audio Effects: 163–168.</ref>

PS replaces a whole channel with only 2–3 kbit/s of side information. As a result, the remaining channel gets almost double the bitrate to use, so the quality gain can more than make up for the lossiness of the process. It is not useful at high bitrates.
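
The "almost double" claim is simple arithmetic. The sketch below assumes a hypothetical total budget of 24 kbit/s and the upper figure of 3 kbit/s of side information from the paragraph above; the function names are illustrative.

```python
def dual_channel_budget_kbps(total_kbps):
    """Bitrate available to each channel when both are coded separately."""
    return total_kbps / 2


def ps_core_budget_kbps(total_kbps, side_info_kbps=3.0):
    """Bitrate left for the single coded channel after PS side information."""
    return total_kbps - side_info_kbps


if __name__ == "__main__":
    total = 24.0  # hypothetical total bitrate in kbit/s
    gain = ps_core_budget_kbps(total) / dual_channel_budget_kbps(total)
    print(gain)
```

The fixed cost of the side information matters less as the total grows, but at high bitrates the coder no longer needs the saving, which fits the remark that PS is not useful there.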

The phase aspect is covered by a few patents applied for between 1997 and 2000 (EP1107232A3, EP0797324A2), which should have expired. The ambience part (EP1927266B1) will expire in 2026, so do not expect any new experimental codec to use it yet.

== More channels ==
The general idea of exploiting the redundancy among channels is called ''channel coupling''.

=== Surround ===

Surround is structured like stereo in some ways, except now there are many more pairs that can be coupled together. The basic approach is to code together the corresponding pairs of left and right using ordinary joint stereo techniques.

In MPEG Surround, a process similar to parametric stereo is used to merge three streams into two, or two streams into one, plus a small stream of side information. A stream created by merging can itself be merged, creating a hierarchy of merges. For example, a 5.1 stream can be encoded as merges of C/LFE, L/Ls, R/Rs, then these three streams can be mixed down if needed.

=== Ambisonics ===

Ambisonics represents an entire sound field. In the raw representation, everything is based on spherical harmonics.

* Multi-mono lossy encoding is unacceptably bad for ambisonics. Each stream does its own thing with the phase, resulting in an incoherent sound image.<ref>Phase/ambisonic issue discussed in: Mahé, Pierre; Ragot, Stéphane; Marchand, Sylvain (2 September 2019). ''[https://hal.science/hal-02289558 First-Order Ambisonic Coding with PCA Matrixing and Quaternion-Based Interpolation]''. 22nd International Conference on Digital Audio Effects (DAFx-19), Birmingham, UK. p. 284.</ref>
* A fixed encoding matrix, such as the one in Opus, is passable. Sources in a fixed direction get much better quality (because each only goes into one stream, so there is no chance of phase inconsistencies), and if the underlying codec is given enough bitrate to not mess with phase too much, the rest can be okay too.
* MPEG-H 3D Audio isolates each source in space from the input, storing a representation based on objects. This should not have any preferred direction.

== By format ==

=== MP3 ===

MP3 supports dual-mono, M/S, and intensity methods. LAME does not support intensity stereo.

Some early MP3 encoders didn't make ideal decisions about what mode to use from frame to frame in joint stereo files, or how much bandwidth to allocate to encoding the side channel. This led to a widespread but mistaken belief that an abundance of M/S frames, or the use of joint stereo in general, always negatively impacts channel separation and other measures of audio quality. This is not an issue with modern encoders. Modern, optimized encoders will switch between mid-side coding or simple stereo coding as necessary, depending on the correlation between the left and right channels, and will allocate channel bandwidth appropriately to ensure the best mode is used for each frame.

LAME M/S is known to better preserve stereo image than dual-mono in most circumstances, given the same bitrate budget. See [[Lossy]].

=== Vorbis ===
[[Vorbis]] treats stereo information with '''square polar mapping''', which is beneficial when the correlation between the left and right channels is strong (this can also be extended to multichannel coupling). In Vorbis, the spectrum of each channel is normalized against a floor function, which is a rough envelope of the actual spectrum. In the square polar mapping, the (stereo) phase is roughly defined as the difference between the normalized left and right amplitude of a given frequency component. If the original left and right channel are the same within a certain frequency band, apart from an overall scaling factor, then the normalized frequency spectrum is the same left and right and the stereo phase is zero over the whole frequency band. Note that in the context of polar mapping, the term 'phase' (here: 'stereo phase') has a very different meaning from the phase of a periodic wave. Unlike in the Fourier Transform, the Cosine Transform used in Vorbis and other encoders only provides amplitudes and no phases of the latter type.

Once the stereo information is represented in polar mapping as a magnitude and stereo phase, Vorbis can use three coupling methods:<ref>[http://www.xiph.org/vorbis/doc/stereo.html Ogg Vorbis stereo-specific channel coupling] at xiph.org.</ref>
* '''Lossless coupling''' is mathematically equivalent to independent encoding of the two channels ('dual mono' in MP3), but with the benefit of additional space-saving. It does polar mapping/channel interleaving using the residue vectors.
* In '''phase stereo''', the stereo phase is quantized, i.e. stored at a lower resolution. Especially above 4 kHz, the ear is not very sensitive to phase information. Phase stereo is '''not''' currently implemented in the reference encoder due to complexity, but will be re-added later on. Note that phase stereo should not be compared to intensity stereo in MP3 coding.
* In [[point stereo]], the stereo phase is discarded completely. All the stereo information comes from the difference in the spectral floors for the left and right channels.

Ogg Vorbis uses lossless/point stereo coupling below ''-q 6''. Lossless channel coupling is used for high bitrates entirely (''-q 6 and up''). This can be adjusted via an advanced-encode switch, but is not done for simplicity's sake.

=== Opus ===

Opus is capable of multi-mono, M/S with a tunable weight factor, and intensity stereo. It avoids multi-mono unless explicitly asked for, and decides between M/S and intensity based on the bitrate available and the audio content. It also calculates the stereo width to decide the total amount of bitrate needed.

With surround input, Opus can only couple to pairs of joint-stereo. It does take advantage of surround masking.

With ambisonic input, Opus can use a fixed matrix, or do multi-mono.

==External Links==
* [http://en.wikipedia.org/wiki/Joint_stereo joint stereo at Wikipedia]
* [http://www.codingtechnologies.com/products/paraSter.htm Parametric Stereo at Coding Technologies]

==References==
<references/>

[[Category:Technical]]
[[Category:Algorithms]]

Joint stereo

2023-08-12T05:12:07Z

Artoria2e5: /* Opus */

'''Joint stereo''' refers to any stereo-encoding method that goes beyond simple encoding as two independent channels ("simple" or "L/R" stereo or DualMono). These methods exploit the similarities between channels and typically allow for more bits to be effectively used, increasing audio quality for a given bitrate. They are, however, not guaranteed to be perfect and could instead cause audible artifacts (mostly on older encoders).

Some file formats, such as MP3, can switch among these formats on the fly on a frame or sub-frame basis, for the sake of efficiency or quality. For example, a high-[[bitrate]] "joint stereo" [[MP3]] file may contain a mixture of SS and MS frames, or it may contain all SS frames or all MS frames. Due to some historical accident, the term as applied in MP3 refers to a mixture of coding formats. In other words, a non-"joint stereo" MP3 will never contain a mixture of frame types.

==Stereo coding methods or "modes"==

===Left-Right (L/R) or "Simple" Stereo (SS)===
Simple stereo is the most straightforward method of coding a stereo signal: each channel is treated as a completely separate entity. This can be inefficient and may adversely impact quality (as compared to other modes) when both channels contain nearly identical signals (i.e., are mono or nearly so).

===Mid-side Stereo (MS)===
Mid-side stereo coding calculates a "mid"-channel by addition of left and right channel, and a "side"-channel by subtraction, i.e.:

;Encoding
:''M'' = (''L'' + ''R'') / 2, ''S'' = (''L'' - ''R'') / 2
;Decoding
:''L'' = ''M'' + ''S'', ''R'' = ''M'' - ''S''

Whenever a signal is concentrated in the middle of the stereo image (i.e. more mono-like), mid-side stereo can achieve a significant saving in bitrate, since one can use fewer bits to encode the side-channel. Even more important is the fact that by applying the inverse matrix in the decoder, the quantization noise becomes correlated and falls in the middle of the stereo image, where it is masked by the signal.

Unlike [[Joint stereo#intensity stereo|intensity stereo]], which destroys phase information, mid-side coding is mathematically lossless (although subsequent lossy compression may cause phase degradation). Correctly implemented mid-side stereo does very little or no damage to the stereo image and increases compression efficiency either by reducing size or increasing overall quality. Mid-side is also simple enough to be implemented in FM radio and stereophonic vinyl.

Mid-side stereo can use coefficients other than 1 in encoding and decoding. Allowing different contributions from each channel allows the codec to adapt to off-balance sources and retain the bitrate savings. This extension is found in Opus, where an angle can be encoded.<ref name=opus/>

===Intensity stereo===

Intensity stereo coding is a method that achieves a saving in bitrate by replacing the left and the right signal by a single representing signal plus directional information (in the form of amplitude ratios for each frequency range). This replacement is psychoacoustically justified in the higher [[frequency]] range since the human auditory system is insensitive to the signal phase at frequencies above approximately 2 kHz.<ref>http://www.hydrogenaudio.org/forums/index.php?showtopic=1491&view=findpost&p=14091</ref> To maintain the justification, a codec may only apply intensity stereo to higher-frequency parameters.<ref name=opus>See e.g. https://web.archive.org/web/20180714000735/http://jmvalin.ca/papers/aes135_opus_celt.pdf, sections 4.5 [IS frequency], 4.5.1 [M/S angle]</ref>

Intensity stereo is by definition a [[lossy]] coding method, so it is primarily useful at low bitrates. For coding at higher bitrates only mid-side stereo should be used.

===Parametric stereo===
Parametric stereo, found in HE-AAC, is similar to intensity stereo, except that the directional information also includes phase and correlation. The phase information makes this algorithm also capable of keeping low frequency location cues (by inter-aural time differences), while the (de-)correlation information helps add ambience by synthesizing some difference between channels.<ref name=LC-M4>Purnhagen, Heiko (October 5–8, 2004). "[http://dafx.de/paper-archive/2004/P_163.PDF Low Complexity Parametric Stereo Coding in MPEG-4]" (PDF). 7th International Conference on Digital Audio Effects: 163–168.</ref>

PS replaces a whole channel with only 2–3 kbit/s of side information. As a result, the remaining channel gets almost double the bitrate to use, so the quality gain can more than make up for the lossiness of the process. It is not useful at high bitrates.

The phase aspect is covered by a few patents applied for between 1997 and 2000 (EP1107232A3, EP0797324A2), which should have expired. The ambience part (EP1927266B1) will expire in 2026, so do not expect any new experimental codec to use it yet.

== More channels ==
The general idea of exploiting the redundancy among channels is called ''channel coupling''.

=== Surround ===

Surround is structured like stereo in some ways, except now there are many more pairs that can be coupled together. The basic approach is to code together the corresponding pairs of left and right using ordinary joint stereo techniques.

In MPEG Surround, a process similar to parametric stereo is used to merge three streams into two, or two streams into one, plus a small stream of side information. A stream created by merging can itself be merged, creating a hierarchy of merges. For example, a 5.1 stream can be encoded as merges of C/LFE, L/Ls, R/Rs, then these three streams can be mixed down if needed.

=== Ambisonics ===

Ambisonics represents an entire sound field. In the raw representation, everything is based on spherical harmonics.

* Multi-mono lossy encoding is unacceptably bad for ambisonics. Each stream does its own thing with the phase, resulting in an incoherent sound image.<ref>Phase/ambisonic issue discussed in: Mahé, Pierre; Ragot, Stéphane; Marchand, Sylvain (2 September 2019). ''[https://hal.science/hal-02289558 First-Order Ambisonic Coding with PCA Matrixing and Quaternion-Based Interpolation]''. 22nd International Conference on Digital Audio Effects (DAFx-19), Birmingham, UK. p. 284.</ref>
* A fixed encoding matrix, such as the one in Opus, is passable. Sources in a fixed direction get much better quality (because each only goes into one stream, so there is no chance of phase inconsistencies), and if the underlying codec is given enough bitrate to not mess with phase too much, the rest can be okay too.
* MPEG-H 3D Audio isolates each source in space from the input, storing a representation based on objects. This should not have any preferred direction.

== By format ==

=== MP3 ===

MP3 supports dual-mono, M/S, and intensity methods. LAME does not support intensity stereo.

Some early MP3 encoders didn't make ideal decisions about what mode to use from frame to frame in joint stereo files, or how much bandwidth to allocate to encoding the side channel. This led to a widespread but mistaken belief that an abundance of M/S frames, or the use of joint stereo in general, always negatively impacts channel separation and other measures of audio quality. This is not an issue with modern encoders. Modern, optimized encoders will switch between mid-side coding or simple stereo coding as necessary, depending on the correlation between the left and right channels, and will allocate channel bandwidth appropriately to ensure the best mode is used for each frame.

LAME M/S is known to better preserve stereo image than dual-mono in most circumstances, given the same bitrate budget. See [[Lossy]].

=== Vorbis ===
[[Vorbis]] treats stereo information with '''square polar mapping''', which is beneficial when the correlation between the left and right channels is strong (this can also be extended to multichannel coupling). In Vorbis, the spectrum of each channel is normalized against a floor function, which is a rough envelope of the actual spectrum. In the square polar mapping, the (stereo) phase is roughly defined as the difference between the normalized left and right amplitude of a given frequency component. If the original left and right channel are the same within a certain frequency band, apart from an overall scaling factor, then the normalized frequency spectrum is the same left and right and the stereo phase is zero over the whole frequency band. Note that in the context of polar mapping, the term 'phase' (here: 'stereo phase') has a very different meaning from the phase of a periodic wave. Unlike in the Fourier Transform, the Cosine Transform used in Vorbis and other encoders only provides amplitudes and no phases of the latter type.

Once the stereo information is represented in polar mapping as a magnitude and stereo phase, Vorbis can use three coupling methods:<ref>[http://www.xiph.org/vorbis/doc/stereo.html Ogg Vorbis stereo-specific channel coupling] at xiph.org.</ref>
* '''Lossless coupling''' is mathematically equivalent to independent encoding of the two channels ('dual mono' in MP3), but with the benefit of additional space-saving. It does polar mapping/channel interleaving using the residue vectors.
* In '''phase stereo''', the stereo phase is quantized, i.e. stored at a lower resolution. Especially above 4 kHz, the ear is not very sensitive to phase information. Phase stereo is '''not''' currently implemented in the reference encoder due to complexity, but is planned to be re-added later. Note that phase stereo should not be compared to intensity stereo in MP3 coding.
* In [[point stereo]], the stereo phase is discarded completely. All the stereo information comes from the difference in the spectral floors for the left and right channels.

Ogg Vorbis uses a mix of lossless and point stereo coupling below ''-q 6''. At high bitrates (''-q 6'' and up), lossless channel coupling is used exclusively. This can be adjusted via an advanced-encode switch, but is not done for simplicity's sake.
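
As a rough illustration of lossless coupling, the sketch below maps a pair of integer residue values to a (magnitude, angle) pair and back. The decoding branches are modelled on the inverse coupling step described in the Vorbis I specification; the encoding side and the function names are illustrative assumptions and are not taken from the reference encoder.

```python
def couple(left, right):
    """Map an integer (left, right) pair to (magnitude, angle).

    The magnitude is whichever input has the larger absolute value;
    the angle records the difference, with its sign indicating which
    channel supplied the magnitude.
    """
    if abs(left) >= abs(right):
        mag = left
        ang = left - right if left > 0 else right - left
    else:
        mag = right
        ang = left - right if right > 0 else right - left
    return mag, ang


def decouple(mag, ang):
    """Recover (left, right) from (magnitude, angle)."""
    if mag > 0:
        if ang > 0:
            return mag, mag - ang
        return mag + ang, mag
    if ang > 0:
        return mag, mag + ang
    return mag - ang, mag
```

When both channels hold the same value the angle is zero, which mirrors the zero stereo phase described above; runs of zero angles are cheap to entropy-code, which is where the space saving over independent channel coding comes from.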

=== Opus ===

Opus is capable of multi-mono, M/S with a tunable weight factor, and intensity stereo. It avoids multi-mono unless explicitly asked for, and decides between M/S and intensity stereo based on the available bitrate and the audio content. It also calculates the stereo width to decide the total amount of bitrate needed.

With surround input, Opus can only couple channels in joint-stereo pairs. It does take advantage of stereo masking.

With ambisonic input, Opus can use a fixed matrix, or do multi-mono.

==External Links==
* [http://en.wikipedia.org/wiki/Joint_stereo joint stereo at Wikipedia]
* [http://www.codingtechnologies.com/products/paraSter.htm Parametric Stereo at Coding Technologies]

==References==
<references/>

[[Category:Technical]]
[[Category:Algorithms]]

Bitrate

2023-08-12T05:10:55Z

Artoria2e5:

'''Bitrate''' means the data rate (i.e. how many bits get transferred in a certain amount of time), usually expressed in bits per second.

The common units of bit rate are kilobits per second (kbps) and megabits per second (Mbps). In data rates, the multipliers "k", "M", etc. stand for powers of 1000, not powers of 1024.

The term is also commonly used when discussing digital sampling and sample rates. For example, the MP3 audio compression algorithm is often set to output files with a bitrate of 128 kbps. This means that the file contains an average of 128 kilobits for each second of audio (960 kB per minute). This contrasts with CD audio, which is encoded as 44100 16-bit stereo samples per second: 1411.2 kbps (16 bits × 44100 Hz × 2 channels).
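
The arithmetic above can be checked with a short sketch. The function names are illustrative; "k" means 1000 and a byte is 8 bits, as stated in this article.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def kilobytes_per_minute(bitrate_kbps):
    """Data volume per minute of audio, in kilobytes (1 kB = 1000 bytes)."""
    return bitrate_kbps * 60 / 8


cd_audio = pcm_bitrate_kbps(44100, 16, 2)   # CD audio: 1411.2 kbps
mp3_minute = kilobytes_per_minute(128)      # 128 kbps MP3: 960 kB per minute
```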

Often, upper-case units and multipliers are used for bytes (like "KB" for kilobytes) and lowercase multipliers are bits (like "kb" for kilobits). All modern computers use 8-bit bytes.

==MP3 bitrates==

MP3 bitrates can be deceptive. For example, a 128 kbps "constant bitrate" ([[CBR]]) MP3 will use ''about'' 128 kilobits for each second of audio that is encoded (so the file size, in bits, divided by the audio's duration, comes out to about 128,000), and its frame headers will occur at regular intervals, but internally, from frame to frame it may encode audio at a bitrate higher or lower than 128 kbps through the use of the [[bit reservoir]] (the ability of a frame to use spare bits from the preceding frame). However, the size of this reservoir, and thus the amount of variability, is limited, so 128 kbps will be very close to the effective bitrate across the whole file.

As another example, a "128 kbps VBR MP3" is usually a misnomer, since the point of [[VBR]] is to allow each of the MP3's internal frames to have its own bitrate. When people refer to the bitrate of a VBR MP3, they are usually referring to the actual average bitrate of its frames. If the duration of the encoded audio is known, then "bitrate" might be the size of the file data divided by its duration, which will be pretty close to the same number. However, the duration of a VBR MP3 cannot be accurately determined without scanning all the frames.
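
The two notions of "average bitrate" mentioned above can be sketched as follows. The function names and sample figures are illustrative assumptions; the frame-based mean relies on every frame covering the same amount of audio, which holds for an MP3 stream with a fixed sample rate.

```python
def average_bitrate_kbps(audio_data_bytes, duration_seconds):
    """Average bitrate from the size of the audio data and its duration."""
    return audio_data_bytes * 8 / duration_seconds / 1000


def mean_frame_bitrate_kbps(frame_bitrates_kbps):
    """Average of the per-frame bitrates found by scanning every frame.

    Assumes all frames have equal duration, so a plain mean is enough.
    """
    return sum(frame_bitrates_kbps) / len(frame_bitrates_kbps)


# A hypothetical 4-minute track holding 3,840,000 bytes of audio data.
by_size = average_bitrate_kbps(3_840_000, 240)
# A hypothetical (very short) list of scanned frame bitrates.
by_frames = mean_frame_bitrate_kbps([96, 128, 160, 128])
```

Both values come out at 128 kbps for these made-up inputs; with a real file they differ slightly because the file also contains tags and other non-audio data.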

== Equivalent bitrate estimates for multichannel audio==

C.R.Helmrich [https://hydrogenaud.io/index.php/topic,120007.msg997612.html#msg997612 writes]:
<blockquote>
The following function gives you a weight w for each ''cc'', where cc is the channel configuration written as a decimal number (1.0 for mono, 2.0 for stereo, 5.1 for surround, etc.):

w(''cc'') = ''cc''<sup>0.75</sup>

If you want to know the quality equivalent mono bit-rate of, e.g., 128 kbps stereo, you simply calculate 128 * w(1.0)/w(2.0) = 128 * (1.0/2.0)^0.75 = 76 kbps. That function also tells you that, with 5.1 surround, you need roughly twice the stereo bit-rate for the same level of audio quality. Based on my experience that's quite reasonable, at least with modern codecs like Opus, (x)HE-AAC, and MPEG-H Audio.
</blockquote>

This very simple definition does two things:
* The LFE channel (the ".1" in the channel configuration) is treated as one-tenth of a "real" channel. This is reasonable, because low-frequency effects (LFE) channels carry little information.
* The power function makes it such that each additional channel adds less bitrate than the previous one. This is also reasonable, considering [[joint stereo]] and analogous methods used to extract redundancy from multichannel audio.
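
The weighting function quoted above is simple enough to try out directly. The sketch below uses illustrative function names and reproduces the two figures from the quote.

```python
def channel_weight(cc):
    """Weight w(cc) = cc ** 0.75 for a channel configuration such as 1.0, 2.0 or 5.1."""
    return cc ** 0.75


def equivalent_bitrate_kbps(bitrate_kbps, from_cc, to_cc):
    """Bitrate at configuration to_cc with roughly the same quality as
    bitrate_kbps at configuration from_cc, according to the weights."""
    return bitrate_kbps * channel_weight(to_cc) / channel_weight(from_cc)


mono_equivalent = equivalent_bitrate_kbps(128, 2.0, 1.0)   # about 76 kbps
surround_factor = channel_weight(5.1) / channel_weight(2.0)  # about 2
```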

[[Category:Technical]]

Opus

2023-08-11T06:23:45Z

Artoria2e5: /* References & Notes */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
!Less than 5 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|6 kHz medium-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!8 kbps
|6 kHz medium-band
|SILK
|Close to telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrate is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality significantly degrades from bit-shaving. As a result, even though constrained VBR is designed such that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate (great news for communication links), any use of SILK, even in hybrid mode, has the potential of breaking this intention. This makes Opus suboptimal for low-rate radio links: even though it works acceptably at low bitrate, radio links are fixed-rate and require a predictable bit rate, which Opus speech cannot adequately provide.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency but directly increase the overhead by transmitting more packets per second. In addition, as we'll see below, they also worsen the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
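
To get a feel for the size of this overhead, the sketch below assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12 bytes). That transport stack is an assumption made for illustration; this article does not specify one, and other transports have different header sizes.

```python
# Assumed per-packet header size: IPv4 (20) + UDP (8) + RTP (12) bytes.
HEADER_BYTES = 20 + 8 + 12


def packets_per_second(frame_ms):
    """Packets sent per second when each packet carries one frame."""
    return 1000 / frame_ms


def overhead_kbps(frame_ms, header_bytes=HEADER_BYTES):
    """Bitrate consumed by packet headers alone, in kbps."""
    return packets_per_second(frame_ms) * header_bytes * 8 / 1000


for frame_ms in (60, 20, 10, 5, 2.5):
    print(f"{frame_ms:>4} ms frames: {overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions the default 20 ms frames cost 16 kbps in headers, which is more than a low-bitrate SILK stream itself, while 60 ms frames cut that to about a third.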

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
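
The table can be reproduced from two ingredients given in this section: the delay rule (frame size plus 2.5 ms) and the fractional bitrate increases read off the table. The sketch below uses illustrative names; the percentages are copied from the table and not derived from a model.

```python
OVERLAP_MS = 2.5  # additional delay on top of the frame size

# Fractional bitrate increase per frame size (ms), copied from the table above.
INCREASE_PERCENT = {20: 0.0, 10: 10.0, 5: 32.5, 2.5: 75.0}


def algorithmic_delay_ms(frame_ms):
    """Algorithmic delay: frame size plus the 2.5 ms overlap."""
    return frame_ms + OVERLAP_MS


def matched_bitrate_kbps(frame_ms, base_kbps=64.0):
    """Bitrate needed to match base_kbps at the default 20 ms frame size."""
    return base_kbps * (1 + INCREASE_PERCENT[frame_ms] / 100)


for frame_ms in (20, 10, 5, 2.5):
    print(frame_ms, algorithmic_delay_ms(frame_ms), round(matched_bitrate_kbps(frame_ms), 1))
```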

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.
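
As a small illustration of driving the encoder from a script, the sketch below only builds an <code>opusenc</code> command line using its <code>--bitrate</code> and <code>--framesize</code> options; it does not run anything. The helper name, default values and file names are assumptions made for this example.

```python
def opusenc_command(src, dst, bitrate_kbps=96, frame_ms=20):
    """Build an argument list for opusenc (hypothetical helper).

    --bitrate takes the target bitrate in kbps and --framesize the
    frame size in milliseconds. The list can be handed to
    subprocess.run() on a system where opusenc is installed.
    """
    return [
        "opusenc",
        "--bitrate", str(bitrate_kbps),
        "--framesize", str(frame_ms),
        src,
        dst,
    ]


# Example: a 64 kbps encode with the default frame size.
command = opusenc_command("input.wav", "output.opus", bitrate_kbps=64)
```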

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, though it was mostly fully developed by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with the bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing and user feedback. Following a beta release and further testing, the stable 1.1 version, considered well tested enough for general release, was released on 5 December 2013.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one port of the reference Opus implementation to JavaScript has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus.
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] with 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 can transcode to Opus if FFmpeg is compiled with support for the libopus library, and can play Opus-encoded files if Amarok is compiled against TagLib (newer than version 1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]
<references/>
[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-11T06:23:31Z

Artoria2e5: SILK CVBR discussion

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) and designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech. It is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. Opus is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high-quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general-purpose and embedded (fixed-point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also, unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.
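
The mode, audio bandwidth, frame size and channel count can change from packet to packet because each Opus packet announces them in its first byte (the "TOC" byte). The following is a minimal sketch of decoding that byte, assuming the configuration table given in RFC 6716, section 3.1; the names <code>parse_toc</code> and <code>CONFIGS</code> are hypothetical and are not part of libopus.

```python
# Sketch only: decodes the first ("TOC") byte of an Opus packet, assuming the
# 32-entry configuration table from RFC 6716, section 3.1.
# parse_toc and CONFIGS are illustrative names, not libopus API.

# Audio bandwidth (kHz) for each bandwidth abbreviation used by Opus.
BANDWIDTH_KHZ = {"NB": 4, "MB": 6, "WB": 8, "SWB": 12, "FB": 20}


def _build_config_table():
    table = []
    # Configs 0-11: SILK-only, narrowband/medium-band/wideband.
    for bw in ("NB", "MB", "WB"):
        for frame_ms in (10, 20, 40, 60):
            table.append(("SILK", bw, frame_ms))
    # Configs 12-15: hybrid, super-wideband/fullband.
    for bw in ("SWB", "FB"):
        for frame_ms in (10, 20):
            table.append(("hybrid", bw, frame_ms))
    # Configs 16-31: CELT-only (no medium-band in CELT).
    for bw in ("NB", "WB", "SWB", "FB"):
        for frame_ms in (2.5, 5, 10, 20):
            table.append(("CELT", bw, frame_ms))
    return table


CONFIGS = _build_config_table()


def parse_toc(toc_byte):
    """Return the operating point signalled by one TOC byte (0..255)."""
    if not 0 <= toc_byte <= 255:
        raise ValueError("TOC must be a single byte value")
    config = toc_byte >> 3            # top five bits: configuration number
    stereo = bool(toc_byte & 0x04)    # next bit: 0 = mono, 1 = stereo
    frame_code = toc_byte & 0x03      # low two bits: frame-count code
    mode, bw, frame_ms = CONFIGS[config]
    return {
        "config": config,
        "mode": mode,
        "bandwidth": bw,
        "bandwidth_khz": BANDWIDTH_KHZ[bw],
        "frame_ms": frame_ms,
        "channels": 2 if stereo else 1,
        "frame_code": frame_code,
    }


if __name__ == "__main__":
    for toc in (0x08, 0x78, 0xFC):
        print(hex(toc), parse_toc(toc))
```

Because all of this is carried in-band, a decoder needs no out-of-band signalling to follow a switch between SILK, hybrid and CELT operation.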

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.
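
In Ogg Opus files, the values that make gapless playback and playback gain possible are stored in the stream's identification header. The following is a minimal sketch of reading them, assuming the "OpusHead" packet layout defined in RFC 7845 (a layout not described in this article); the function name <code>parse_opus_head</code> is hypothetical.

```python
# Sketch only: reads the fixed part of an Ogg Opus identification header,
# assuming the "OpusHead" layout of RFC 7845. parse_opus_head is an
# illustrative name, not part of any library.
import struct


def parse_opus_head(packet):
    """Parse the first 19 bytes of an OpusHead packet."""
    if len(packet) < 19 or packet[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    # All multi-byte fields are little-endian; output gain is signed Q7.8 dB.
    version, channels, pre_skip, input_rate, gain_q8, mapping = struct.unpack_from(
        "<BBHIhB", packet, 8
    )
    return {
        "version": version,
        "channels": channels,
        # Samples (at 48 kHz) to discard at the start of decoding.
        "pre_skip_samples": pre_skip,
        "pre_skip_ms": pre_skip / 48.0,
        # Informational only: the sample rate of the original input.
        "input_sample_rate": input_rate,
        # Gain a player is expected to apply to the decoded output.
        "output_gain_db": gain_q8 / 256.0,
        "channel_mapping_family": mapping,
    }


if __name__ == "__main__":
    # A hand-built example header: stereo, 312 samples pre-skip, -2 dB gain.
    example = b"OpusHead" + struct.pack("<BBHIhB", 1, 2, 312, 44100, -512, 0)
    print(parse_opus_head(example))
```

A player that honours the pre-skip value and applies the output gain can start playback without an audible gap and at the intended volume.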

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz, and CELT then takes over. Assuming the source is stereo, the switch from mono to stereo typically happens around the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically at a 48 kHz sampling rate) but mentions stereo compatibility at 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode does not modify the sound prior to encoding, so it is usually better for high-quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
!Less than 5 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|6 kHz medium-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!8 kbps
|6 kHz medium-band
|SILK
|Close to telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

One major limitation of Opus at low bitrates is that SILK is inherently VBR: it accepts no constraints in CVBR, and if forced to do CBR the quality degrades significantly from bit-shaving. Constrained VBR is designed so that a fixed-rate data link requires at most one frame of buffer to handle the variation in bit rate, which is great news for communication links, but any use of SILK, even in hybrid mode, can break this guarantee. This makes Opus suboptimal for low-rate radio links: even though it works acceptably at low bitrates, radio links are fixed-rate and require a predictable bit rate, which Opus speech coding cannot adequately provide.

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum-latency high-quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may also be competitive in minimum-latency mode.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames achieve lower latency but directly increase the packet overhead, because more packets are transmitted per second. In addition, as we'll see below, they also worsen the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
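
The effect of frame size on packet overhead can be estimated with simple arithmetic. The sketch below assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 bytes + UDP 8 bytes + RTP 12 bytes); both are assumptions for illustration, since real deployments may use IPv6, encryption, header compression or several frames per packet. The function names are hypothetical.

```python
# Sketch only: estimates header overhead for Opus sent over a packet network.
# Assumes one Opus frame per packet and IPv4 + UDP + RTP headers; other
# transports will give different numbers.

HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, without options or extensions


def overhead_kbps(frame_ms, header_bytes=HEADER_BYTES):
    """Header bitrate in kbps when one packet is sent every frame_ms."""
    packets_per_second = 1000.0 / frame_ms
    return header_bytes * 8 * packets_per_second / 1000.0


def total_kbps(codec_kbps, frame_ms, header_bytes=HEADER_BYTES):
    """Codec bitrate plus header overhead, in kbps."""
    return codec_kbps + overhead_kbps(frame_ms, header_bytes)


if __name__ == "__main__":
    # Frame sizes mentioned in the text: 60 ms down to 2.5 ms.
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: {overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions the headers alone cost 16 kbps at the default 20 ms frame size, which exceeds the codec bitrate of low-bitrate speech; this is why 60 ms frames can be attractive for SILK despite the added latency.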

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64kbps@22.5ms delay
!fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
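
The figures in the table above can be reproduced from the stated rule (algorithmic delay = frame size + 2.5 ms) and the PEAQ-matched bitrates. A minimal sketch follows; the bitrates are copied from the table, and the function names are hypothetical.

```python
# Sketch only: reproduces the delay and bitrate-increase columns of the table
# above. The matched bitrates are taken from that table, not computed.

OVERLAP_MS = 2.5  # additional delay on top of the frame size

# Frame size (ms) -> bitrate (kbps) matching 64 kbps at 20 ms frames.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms):
    """Frame size plus the fixed 2.5 ms overlap."""
    return frame_ms + OVERLAP_MS


def bitrate_increase_percent(frame_ms, reference_ms=20):
    """Extra bitrate relative to the reference frame size, in percent."""
    ratio = MATCHED_KBPS[frame_ms] / MATCHED_KBPS[reference_ms]
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in (20, 10, 5, 2.5):
        print(
            f"{frame_ms:>4} ms frame: {algorithmic_delay_ms(frame_ms):4.1f} ms delay, "
            f"{MATCHED_KBPS[frame_ms]:5.1f} kbps, "
            f"+{bitrate_increase_percent(frame_ms):.1f} %"
        )
```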

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released on 11 Sep 2012, when RFC 6716 was published, although development was mostly complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version, considered well tested enough for general release, was released on 5 December 2013.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but it is usually undecided only in audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

At least one JavaScript port of the reference Opus implementation has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten Emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] with 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if it is encapsulated in the Ogg container. However, the .opus filename extension is not recognized by Android, so the double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
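
The double-extension workaround in the last item can be sketched in Python. The helper name <code>android_friendly_name</code> is hypothetical, and the sketch only computes the new file name; it assumes the file already holds Opus audio in an Ogg container and leaves the actual renaming to the caller.

```python
from pathlib import Path


def android_friendly_name(path: str) -> str:
    """Return the name with '.ogg' appended if the file ends in '.opus'.

    Hypothetical helper: it does not inspect or rename the file itself.
    """
    p = Path(path)
    if p.suffix.lower() == ".opus":
        # "song.opus" becomes "song.opus.ogg"
        return str(p.with_name(p.name + ".ogg"))
    # Already "*.opus.ogg" or some other format: leave unchanged
    return str(p)
```

Because only the last suffix is checked, a file that already ends in <code>.opus.ogg</code> is returned unchanged.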

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]

[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Opus

2023-08-11T06:08:29Z

Artoria2e5: /* libopus v1.1 */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) and designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. As an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} a high quality reference implementation is provided under the 3-clause BSD license{{ref|homepage|a}} which compiles and runs on the vast majority of general purpose and embedded (fixed point) processors. Many software patents which cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis |Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it also preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more details of the history and potential applications for Opus are included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.
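
As a minimal sketch of how these rate-control choices map onto the reference encoder, the following Python helper builds an <code>opusenc</code> command line. The function name <code>opusenc_command</code> and the mode labels are hypothetical; the sketch assumes an <code>opusenc</code> build from opus-tools that accepts <code>--bitrate</code>, <code>--vbr</code>, <code>--cvbr</code> and <code>--hard-cbr</code>, and it applies the 6 to 510 kbps range quoted above for typical stereo sources.

```python
def opusenc_command(infile, outfile, bitrate_kbps, rate_mode="vbr"):
    """Build an argument list for the opusenc reference encoder.

    Hypothetical helper; the bitrate limits follow the stereo range
    quoted in the article (6 to 510 kbps).
    """
    if not 6 <= bitrate_kbps <= 510:
        raise ValueError("bitrate must be between 6 and 510 kbps")
    # Map a mode label (our own naming) to the opusenc switch
    mode_flags = {"vbr": "--vbr", "cvbr": "--cvbr", "cbr": "--hard-cbr"}
    if rate_mode not in mode_flags:
        raise ValueError("rate_mode must be 'vbr', 'cvbr' or 'cbr'")
    return ["opusenc", "--bitrate", str(bitrate_kbps),
            mode_flags[rate_mode], infile, outfile]
```

The returned list can be passed to <code>subprocess.run</code>. With the default VBR mode, the bitrate is a calibrated target rather than a hard limit, as described above.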

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.
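
The in-band signalling that makes this switching possible can be illustrated with a sketch of a decoder for the first byte of an Opus packet (the TOC byte). The layout used here follows our reading of RFC 6716 section 3.1, the function name <code>decode_toc</code> is hypothetical, and bandwidths are given as the audio bandwidths in kHz used in this article.

```python
def decode_toc(toc: int) -> dict:
    """Decode the TOC byte of an Opus packet (sketch after RFC 6716, 3.1)."""
    config = (toc >> 3) & 0x1F        # top 5 bits: mode, bandwidth, frame size
    stereo = bool((toc >> 2) & 0x01)  # 1 bit: mono/stereo
    code = toc & 0x03                 # 2 bits: frame-count code
    if config < 12:
        mode = "SILK"
        bandwidth_khz = (4, 6, 8)[config // 4]
        frame_ms = (10, 20, 40, 60)[config % 4]
    elif config < 16:
        mode = "hybrid"
        bandwidth_khz = (12, 20)[(config - 12) // 2]
        frame_ms = (10, 20)[config % 2]
    else:
        mode = "CELT"
        bandwidth_khz = (4, 8, 12, 20)[(config - 16) // 4]
        frame_ms = (2.5, 5, 10, 20)[config % 4]
    return {"mode": mode, "bandwidth_khz": bandwidth_khz,
            "frame_ms": frame_ms, "stereo": stereo,
            "frame_count_code": code}
```

Because every packet carries this byte, a decoder can follow changes of mode, bandwidth, frame size and channel count from one packet to the next without any out-of-band signalling.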

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes. In the speech-oriented modes where the SILK layer is active, it also supports Forward Error Correction (FEC): the expected rate of packet loss can be indicated to the encoder by the user or by application software, and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.
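
In Ogg Opus files, both features rest on two fields of the identification header: the pre-skip (samples to discard at the start of decoding) and the output gain. The following sketch parses them in Python, assuming the <code>OpusHead</code> layout described in RFC 7845; the function name <code>parse_opus_head</code> and the returned keys are hypothetical.

```python
import struct


def parse_opus_head(packet: bytes) -> dict:
    """Parse the fixed part of an OpusHead packet (sketch after RFC 7845)."""
    # magic(8) version(1) channels(1) pre-skip(2) input rate(4)
    # output gain(2, signed Q7.8 dB) mapping family(1), little-endian
    (magic, version, channels, pre_skip,
     input_rate, gain_q8, mapping) = struct.unpack_from("<8sBBHIhB", packet)
    if magic != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    gain_db = gain_q8 / 256.0
    return {
        "channels": channels,
        "pre_skip_samples": pre_skip,      # counted at 48 kHz
        "pre_skip_ms": pre_skip / 48.0,
        "output_gain_db": gain_db,
        "linear_gain": 10 ** (gain_db / 20.0),
    }
```

A player that discards the pre-skip samples and multiplies the decoded signal by <code>linear_gain</code> gets a sample-accurate start and header-based volume adjustment.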

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

In encoder version 1.1 automatic detection of speech/music and bandwidth detection were introduced to improve mode decisions and VBR is less constrained, all with the aim of maximizing the quality/bitrate tradeoff, and these improvements are further enhanced in version 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode does not modify the sound prior to encoding, so it is usually better for high quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
!Less than 5 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|6 kHz medium-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!8 kbps
|6 kHz medium-band
|SILK
|Close to telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum latency high quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth used will be subject to packet overhead. The more packet headers that are transmitted every second, the greater will be the overhead that is required. For this reason, Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames at the expense of greater latency, which may still be acceptable for speech, and also supports 10 ms SILK frames to reduce latency somewhat at the expense of packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible, which directly increases the frame overhead by transmitting more packets per second to achieve lower latency. In addition, as we'll see below it also reduces the quality/bitrate tradeoff of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
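
To give a feel for the size of this overhead, here is a rough Python sketch. It assumes one Opus frame per packet and 40 bytes of headers per packet (IPv4 20 bytes, UDP 8 bytes, RTP 12 bytes); both are assumptions for illustration, and other transports will differ.

```python
def packet_overhead_kbps(frame_ms: float, header_bytes: int = 40) -> float:
    """Header overhead in kbps when each packet carries one frame.

    header_bytes defaults to an assumed IPv4 + UDP + RTP header of 40 bytes.
    """
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: "
              f"{packet_overhead_kbps(frame_ms):6.1f} kbps of headers")
```

Under these assumptions, the default 20 ms frames cost 16 kbps of headers, which is more than a low-bitrate SILK stream itself, while 60 ms frames cut this to about 5.3 kbps and 2.5 ms frames raise it to 128 kbps.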

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).
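
The loss of frequency resolution can be estimated with a short Python sketch. It assumes a 48 kHz sampling rate and a single transform spanning the whole frame, so that a frame of N samples yields N MDCT coefficients covering 0 Hz to half the sampling rate; this is a simplification, since the encoder may also split a frame into shorter blocks for transients.

```python
def mdct_bin_spacing_hz(frame_ms: float, sample_rate_hz: int = 48000) -> float:
    """Approximate spacing between MDCT coefficients for one long transform.

    Simplified model: N samples per frame give N coefficients that span
    0 Hz to sample_rate_hz / 2.
    """
    coefficients = sample_rate_hz * frame_ms / 1000.0
    return (sample_rate_hz / 2.0) / coefficients


if __name__ == "__main__":
    for frame_ms in (20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frame: about "
              f"{mdct_bin_spacing_hz(frame_ms):.0f} Hz per coefficient")
```

In this model the spacing grows from 25 Hz at 20 ms to 200 Hz at 2.5 ms, which is why closely spaced tonal components become harder to code efficiently with short frames.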

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (approximate perceptual quality assessment made in software) for the CELT0.10 codec that was used as the basis of the CELT layer in the Opus reference release, which indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
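
The two derived columns of the table can be reproduced with a few lines of Python. The matched bitrates are copied from the table above, and the function names are hypothetical.

```python
OVERLAP_MS = 2.5  # extra delay added to the frame size, as stated above

# Matched bitrates (kbps) per frame size (ms), copied from the table
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms: float) -> float:
    """Algorithmic delay: frame size plus the 2.5 ms overlap."""
    return frame_ms + OVERLAP_MS


def bitrate_increase_percent(frame_ms: float, reference_ms: float = 20) -> float:
    """Extra bitrate relative to the reference frame size, in percent."""
    ratio = MATCHED_KBPS[frame_ms] / MATCHED_KBPS[reference_ms]
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in MATCHED_KBPS:
        print(f"{frame_ms:>4} ms: delay {algorithmic_delay_ms(frame_ms):4.1f} ms, "
              f"+{bitrate_increase_percent(frame_ms):.1f} %")
```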

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was standardized, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with the bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 Dec 2012 for testing & user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, once it was considered well tested enough for general release.<ref>https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml</ref>

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if inaudible DC offset is present with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially near the bitrate target range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for speech & music. The detection, which works without look-ahead, is not perfect, but when it is undecided the audio is usually such that either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

A new '''temporal VBR''' feature is added. For reasons not explained by classic psychoacoustics, it appears that giving more bits to loud frames (stealing from quiet frames) makes the result substantially better on listening tests. This feature is not tunable: it always affects VBR calculation at low bitrates, gradually becoming weaker at higher bitrates, until it turns off completely at 68 kbps.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added an OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has been ported to both '''C#''' and '''Java''' as part of a project called '''Concentus'''. The project specifically targets cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus GitHub] and distributed via standard package managers.

==== Emscripten ports ====

At least one JavaScript port of the reference Opus implementation has been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten Emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.
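
The choice between the two encoders is made on the FFmpeg command line. The following Python sketch builds such a command; the function name <code>ffmpeg_opus_command</code> is hypothetical, and the <code>-strict experimental</code> switch for the native encoder is an assumption based on it being flagged as experimental in FFmpeg builds we are aware of, so check the documentation of the build in use.

```python
def ffmpeg_opus_command(infile, outfile, bitrate_kbps=96, use_libopus=True):
    """Build an ffmpeg argument list for Opus encoding (hypothetical helper)."""
    cmd = ["ffmpeg", "-i", infile]
    if use_libopus:
        # Reference encoder through the external libopus library
        cmd += ["-c:a", "libopus"]
    else:
        # FFmpeg's native CELT-only encoder; assumed to need the
        # experimental flag
        cmd += ["-c:a", "opus", "-strict", "experimental"]
    cmd += ["-b:a", f"{bitrate_kbps}k", outfile]
    return cmd
```

For general use the <code>libopus</code> path is the safer default, given the quality difference noted above.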

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC is on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] with 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively if encapsulated in the Ogg container, but .opus filename extension is not recognized by Android, so the use of double filename extension .opus.ogg is recommended as a workaround to allow apps to recognize files as playable audio.
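
The double-extension workaround for Android mentioned above can be applied in bulk. The following is only a minimal sketch: the function names are hypothetical, and it assumes the files sit in a single directory and that appending <code>.ogg</code> is all the target app needs.

```python
from pathlib import Path


def android_friendly_name(filename):
    """Append ".ogg" to names ending in ".opus"; leave other names unchanged."""
    if filename.lower().endswith(".opus"):
        return filename + ".ogg"
    return filename


def rename_for_android(directory):
    """Rename every *.opus file in `directory` to *.opus.ogg; return the new paths."""
    renamed = []
    for path in sorted(Path(directory).glob("*.opus")):
        target = path.with_name(android_friendly_name(path.name))
        path.rename(target)
        renamed.append(target)
    return renamed
```

Files that already carry the <code>.opus.ogg</code> extension are not matched by the glob pattern and are left untouched.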

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]

[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

Theora

2023-08-10T15:12:54Z

Artoria2e5:

'''Theora''' is a video [[codec]] being developed by the Xiph.org Foundation as part of their [[Ogg]] project. Based upon the VP3 codec from [http://www.on2.com/ On2 Technologies], and christened by On2 as the successor in VP3's lineage, [http://www.theora.org/ Theora] is targeted at competing with [[MPEG-4]] video (e.g., [[XviD]] and [[DivX]]), [[RealVideo]], [[Windows Media Video]], and similar lower-bitrate video compression schemes. Theora is an 8x8 DCT-II transform codec like competing MPEG-4 video schemes, but differs in that it only uses I (intra) and P (inter) frames, with no corresponding B (bi-predictive) frames.

The VPx lineage has seen many, many new codecs since Theora. VP8 and VP9 are, since 2015, widely supported in browsers. Their successor AV1 has also become mature.

Theora is still in developmental stages with Xiph.org having made five alpha releases thus far.
* Alpha One was released on September 25, 2002
* Alpha Two was released half on December 16 and half on December 27, 2002
* Alpha Three was released on March 20, 2004
* Alpha Four was released on December 15, 2004.
* Alpha Five was released on August 20, 2005.

The first beta release Beta-1 is expected later in 2006. Theora is released under the terms of a BSD-style license.

While VP3 ''is'' patented technology, On2 has irrevocably given royalty-free license of the VP3 patents to all of mankind, enabling the public to utilize Theora and other VP3-derived codecs for any imaginable purpose.

Ralph Giles heads up the Theora project.

In the Ogg multimedia framework, Theora provides a video layer, while [[Vorbis]] acts as the audio layer.

Theora is named for Theora Jones, Edison Carter's Controller on the Max Headroom television program.

[[Category:Codecs]]

BS.1387

2023-08-10T15:07:15Z

Artoria2e5: /* GstPEAQ */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is an open-source software that implements PEAQ's basic model ''only''.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.
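
For batch comparisons, the command line above can be wrapped in a script. This is a minimal sketch, assuming an <code>eaqual</code> executable is on the search path; the function names are hypothetical, only the <code>-fref</code> and <code>-ftest</code> arguments shown above are used, and the program's output is returned unparsed because its exact format is not described here.

```python
import subprocess


def build_eaqual_command(ref_path, test_path, exe="eaqual"):
    """Return the argument list for comparing test_path against ref_path."""
    return [exe, "-fref", ref_path, "-ftest", test_path]


def run_eaqual(ref_path, test_path, exe="eaqual"):
    """Run EAQUAL and return its raw text output (the format is not parsed here)."""
    result = subprocess.run(
        build_eaqual_command(ref_path, test_path, exe),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if eaqual exits with an error
    )
    return result.stdout
```

Calling <code>run_eaqual("ref.wav", "test.wav")</code> is then equivalent to typing the command shown above.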

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. The ODG (Objective Difference Grade) is designed by the ITU to match the SDG (Subjective Difference Grade), which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.
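
The conversion described above is a single addition. A minimal sketch follows; the function name is hypothetical, and clamping the result to the 1–5 scale is an added assumption, not something the text above specifies.

```python
def odg_to_subjective(odg, clamp=True):
    """Approximate a 1-5 subjective score from an ODG, assuming the reference scores 5."""
    score = 5.0 + odg
    if clamp:
        # Assumption: keep the estimate on the 1-5 scale used by listening tests.
        score = max(1.0, min(5.0, score))
    return score
```

For example, an ODG of -1.5 maps to an approximate subjective score of 3.5.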

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== GstPEAQ ==
[https://github.com/HSU-ANT/gstpeaq GstPEAQ] is an implementation of PEAQ, ''both'' basic and advanced, in GStreamer. In addition to the ODG, it also outputs the distortion index (DI), which is not clipped at extremes and not fitted to score anchors. On the HA multiformat dataset:
* The advanced model gives a correlation improvement of ~0.2 over basic;
* DI is slightly better at predicting subjective scores than ODG, with a correlation improvement of ~0.03.

== Comparison with subjective listening tests ==
* [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 EAqual results for the AAC@128v2 listening test] - fair Pearson correlation (0.699) among higher-quality samples: all AAC
* [https://hydrogenaud.io/index.php/topic,124607.msg1031323.html GstPEAQ: PEAQ done right, allegedly || Multiformat correlation] - great Pearson correlation (0.924, DI Adv): samples of three quality groups

HA comparisons between PEAQ and human raters remain inconclusive. PEAQ is considered useful for an approximation of human senses in codec development and research, but concrete results still need human participation.
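
The Pearson correlation figures quoted above measure how linearly the objective scores track the human ratings. A minimal sketch of the computation follows; the function name is hypothetical and the sample numbers are made up purely for illustration, not taken from any listening test.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative data: mean listener scores and ODG values per sample.
human_scores = [4.8, 4.1, 3.2, 2.5]
odg_values = [-0.3, -1.0, -1.9, -2.4]
print(pearson(human_scores, odg_values))
```

A value near 1 means the metric ranks and spaces the samples much as the listeners did; it does not by itself show that the metric can replace a listening test.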

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so there's gonna be academic code smell.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.
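
The clip-splitting tool suggested for VISQOL above could start from something like the following. This is only a sketch under stated assumptions: the function name and the output file naming are hypothetical, the input is assumed to be an uncompressed PCM WAV file, and scoring each clip with a metric is left out entirely.

```python
import os
import wave


def split_wav(path, clip_seconds, out_dir):
    """Split a PCM WAV file into consecutive clips; return the clip paths."""
    os.makedirs(out_dir, exist_ok=True)
    clip_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_clip = int(params.framerate * clip_seconds)
        if frames_per_clip <= 0:
            raise ValueError("clip_seconds is too short for this sample rate")
        index = 0
        while True:
            frames = src.readframes(frames_per_clip)
            if not frames:
                break  # end of input reached
            clip_path = os.path.join(out_dir, f"clip_{index:04d}.wav")
            with wave.open(clip_path, "wb") as dst:
                # Copy channel count, sample width and rate; the frame count
                # in the header is corrected automatically when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            clip_paths.append(clip_path)
            index += 1
    return clip_paths
```

The last clip may be shorter than the others. The reference and the test file would need to be split with the same clip length so that the segments line up.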

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T15:06:38Z

Artoria2e5:

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is an open-source software that implements PEAQ's basic model ''only''.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. The ODG (Objective Difference Grade) is designed by the ITU to match the SDG (Subjective Difference Grade), which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== GstPEAQ ==
[https://github.com/HSU-ANT/gstpeaq GstPEAQ] is an implementation of PEAQ, ''both'' basic and advanced, in GStreamer. In addition to the ODG, it also outputs the distortion index (DI), which is not clipped at extremes and not fitted to score anchors. On the HA multiformat dataset:
* The advanced model gives a correlation improvement of ~0.2 over basic;
* DI is slightly better at predicting subjective scores than ODG, with a correlation improvement of ~0.04.

== Comparison with subjective listening tests ==
* [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 EAqual results for the AAC@128v2 listening test] - fair Pearson correlation (0.699) among higher-quality samples: all AAC
* [https://hydrogenaud.io/index.php/topic,124607.msg1031323.html GstPEAQ: PEAQ done right, allegedly || Multiformat correlation] - great Pearson correlation (0.924, DI Adv): samples of three quality groups

HA comparisons between PEAQ and human raters remain inconclusive. PEAQ is considered useful for an approximation of human senses in codec development and research, but concrete results still need human participation.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so there's gonna be academic code smell.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T15:03:25Z

Artoria2e5: /* Comparison with subjective listening tests */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is an open-source software that implements PEAQ's basic model ''only''.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. The ODG (Objective Difference Grade) is designed by the ITU to match the SDG (Subjective Difference Grade), which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== Comparison with subjective listening tests ==
* [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 EAqual results for the AAC@128v2 listening test] - fair Pearson correlation (0.699) among higher-quality samples: all AAC
* [https://hydrogenaud.io/index.php/topic,124607.msg1031323.html GstPEAQ: PEAQ done right, allegedly || Multiformat correlation] - great Pearson correlation (0.924, DI Adv): samples of three quality groups

HA comparisons between PEAQ and human raters remain inconclusive. PEAQ is considered useful for an approximation of human senses in codec development and research, but concrete results still need human participation.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so there's gonna be academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff. If any future experimentation on PEAQ is to be done, GstPEAQ should be used.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T15:01:45Z

Artoria2e5:

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is an open-source software that implements PEAQ's basic model ''only''.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. The ODG (Objective Difference Grade) is designed by the ITU to match the SDG (Subjective Difference Grade), which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== Comparison with subjective listening tests ==
* [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 EAqual results for the AAC@128v2 listening test] - fair Pearson correlation (0.699) among higher-quality samples: all AAC
* [https://hydrogenaud.io/index.php/topic,124607.msg1031323.html GstPEAQ: PEAQ done right, allegedly || Multiformat correlation] - great Pearson correlation (0.924, DI Adv): samples of three quality groups

It remains inconclusive whether EAQUAL can replace human raters at all. Codec developers and researchers do find PEAQ useful.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so there's gonna be academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff. If any future experimentation on PEAQ is to be done, GstPEAQ should be used.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T06:08:50Z

Artoria2e5: /* Interpreting EAQUAL output */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is an open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing remain inconclusive, however, and these methods are mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. The ODG (Objective Difference Grade) is designed by the ITU to match the SDG (Subjective Difference Grade), which is the difference between the subjective (1–5) scores of the two input samples. Assuming the HydrogenAudio subjective scoring system, where the reference sample is always scored as a perfect 5, adding 5 points to the ODG should produce an approximation of the subjective score.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn].
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of MATLAB implementations for researchers. Being MATLAB code, they tend to carry an academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff. If any future experimentation on PEAQ is to be done, GstPEAQ should be used.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T06:03:12Z

Artoria2e5: /* Other objective metrics */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results, however, are still inconclusive, and objective testing methodologies are mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by the ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn].
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of MATLAB implementations for researchers. Being MATLAB code, they tend to carry an academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff. If any future experimentation on PEAQ is to be done, GstPEAQ should be used.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

There are also much more primitive methods that don't attempt anything perceptual, preferred by peddlers of Bluetooth codecs:
* SNR
* THD+N

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T05:56:00Z

Artoria2e5: /* Other implementations */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results, however, are still inconclusive, and objective testing methodologies are mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by the ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn].
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of MATLAB implementations for researchers. Being MATLAB code, they tend to carry an academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff. If any future experimentation on PEAQ is to be done, GstPEAQ should be used.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T05:55:11Z

Artoria2e5: /* Other implementations */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results, however, are still inconclusive, and objective testing methodologies are mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by the ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn].
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support.

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of MATLAB implementations for researchers. Being MATLAB code, they tend to carry an academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It deviates from the standard less than the previous ones do, and most importantly has the advanced stuff.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

Opus

2023-08-10T05:54:40Z

Artoria2e5: /* CELT layer latency versus quality/bitrate trade-off */

{{Software Infobox
| name = Opus
| logo = [[Image:opus-logo.png|250px|Official Opus logo]]
| screenshot =
| caption = Opus Interactive Audio Codec
| maintainer = [http://xiph.org/ Xiph.Org Foundation]
| stable_release = 1.4
| operating_system = Windows, Mac OS/X, Linux/BSD
| use = Encoder/Decoder
| license = 3-clause BSD license
| website = [http://www.opus-codec.org/ opus-codec.org]
}}

'''Opus''' is a [[lossy]] audio compression format developed by the Internet Engineering Task Force (IETF) designed to be suitable for interactive real-time applications over the Internet,{{ref|homepage|a}} including music as well as speech, yet it is also very competitive for use as a storage and playback format, being a [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ class leader at around 64 kbps] and [http://listening-test.coresv.net/results.htm also at 96 kbps]. It is an open format standardised through [http://tools.ietf.org/html/rfc6716 Request for Comments (RFC) 6716],{{ref|RFC|c}} and a high-quality reference implementation is provided under the 3-clause BSD license;{{ref|homepage|a}} it compiles and runs on the vast majority of general-purpose and embedded (fixed-point) processors. Many software patents that cover Opus are licensed under royalty-free terms.{{ref|FAQ|b}} Opus is also a Mandatory To Implement (MTI) codec for the upcoming WebRTC (Web Real Time Communication) specification of the World Wide Web Consortium (W3C).

Opus incorporates technology from two codecs, the speech-oriented SILK codec developed by Skype and the multi-purpose low-latency CELT codec developed by Xiph.org, with significant changes to each to ensure they can work together.{{ref|RFC|c}} Opus can seamlessly transition among high and low bitrates, using a linear prediction codec (the SILK layer) at lower bitrates and a lapped transform codec (the CELT layer) at higher bitrates, as well as a hybrid of the two for a short overlap in which SILK encodes the 0–8 kHz spectrum and the CELT layer encodes only the frequencies above 8 kHz.{{ref|RFC|c}} Opus has very low algorithmic delay (typically 22.5 ms) compared to popular music formats such as [[MP3]], [[Vorbis | Ogg Vorbis]], [[AAC | LC-AAC and HE-AAC]] (all over 100 ms), yet performs very competitively with them in terms of quality per bitrate, making it comparably viable as a storage & playback format. Also, unlike Vorbis, Opus does not require the definition of large codebooks for each individual file, making it preferable for short clips of audio, such as those often used by game developers, a field where patent-free Vorbis is commonly used.{{ref|RFC|c}}

Considerably more detail on the history and potential applications of Opus is included in the ''Wikipedia'' page for '''[http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Opus (audio format)]'''.

==Characteristics==
Opus supports bitrates from 6 kbps to 510 kbps for typical stereo audio sources (and a maximum of around 255 kbps per channel for multichannel audio), with the 'sweet spot' for music and general audio around 30 kbps (mono) and 40–100 kbps (stereo). It is intrinsically [[VBR | variable bitrate]], though constrained VBR and [[CBR | constant bitrate]] modes are possible where required. In the case of the reference release, libopus, the target bitrate is calibrated against the internal constant quality targets so that over a typical music collection, something very close to the target bitrate will be achieved. This bitrate-calibrated approach differs from most VBR encoders (e.g. LAME, helix mp3, qaac, Nero aacenc, Ogg Vorbis, Musepack) where a setting on some 'constant quality' scale (which differs between encoders) is used and the bitrate will fall where it may. Improved future versions can be expected to offer improved quality at the same setting. Independent implementations may adopt a different approach.

Opus is able to seamlessly adapt its mode of operation without glitches or sound interruption (an illustrative demonstration of [http://opus-codec.org/examples/#gauge bitrate scalability] is on the Opus Examples page), which can be particularly useful for mixed-content audio or varying network conditions, making the unified Opus codec superior to a suite of different codecs that might otherwise cover the same range of bitrate and quality settings and would require out-of-band signalling to instigate codec switching. The switching includes the choice of mono, stereo and other channel mappings, the use of the speech-oriented SILK layer, the general-purpose CELT layer or the hybrid of both, and the use of different audio bandwidths (4, 6, 8, 12, or 20 kHz) as well as the quality adjustments within the same operating mode that are available in most VBR-capable codecs.

Of importance mainly to interactive uses, but potentially useful in time-delayed audio streaming also, Opus includes packet loss concealment (PLC) in all modes and, in the speech-oriented modes where the SILK layer is active it also supports Forward Error Correction (FEC) where the expected rate of packet loss can be indicated to the encoder by the user or by application software and critical frames (e.g. consonant sounds) can be retransmitted at low bitrate to preserve intelligibility.

For music and general audio, the CELT layer of Opus builds on knowledge gained during xiph.org's Vorbis development and ensures as a primary goal that the total energy in each spectral band is preserved while requiring only a modest bitrate overhead to achieve this, thereby eliminating a lot of bitrate-starvation artifacts such as 'birdies' that are common in low-bitrate MP3, especially during transients, applause and cymbal sounds. This technique likewise increases coding efficiency at bitrates targeting transparent music reproduction. Short blocks (2.5 ms) are also possible for efficient transient handling. Short blocks can also be used exclusively, if very low algorithmic delay (5.0 ms) is required to enable very low-latency interactive audio (e.g. live networked music performances such as remote jam sessions), though greater bitrate is then required to maintain the same quality (illustrated in [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo Monty's CELT demo page] under Constant PEAQ value, varying latency). CELT uses a number of additional techniques and provides additional advanced tools to enable encoder tuning.

Opus natively supports [[gapless playback]] (though [[Gapless_playback#Poorly_designed_playback_systems | poor player design]] might itself induce interruptions during playback). Playback gain is also required, making some form of [[ReplayGain]] or [[ReplayGain_2.0_specification | similar]] volume control possible in any compliant player.

==Bitrate performance==
For mono speech, Opus ranges from intelligible narrowband speech reproduction starting at 6 kbps to medium-band, wideband and superwideband speech, reaching full-band speech by around 14 kbps in encoder version 1.2 (was 21 kbps in v1.1, 29 kbps in v1.0). Above about 32 kbps, the SILK layer is no longer used at all, as CELT alone gives superior quality.

For music, the SILK modes are quite tolerable and better than CELT at very low bitrates. The hybrid mode is adopted as bitrate increases, extending bandwidth first to 12 kHz (comparable with compact cassette) then to the full 20 kHz and CELT then takes over. Assuming the source is stereo, the transition from mono to stereo typically happens between the transition from 12 kHz to 20 kHz. Encoder version 1.2 includes great improvements to music encoding in the 32–64 kbps range, allowing full-band stereo at 32 kbps and providing acceptable quality at 48 kbps where artifacts are audible but rarely annoying. Version 1.3 is expected to further improve quality in this range.

Multi-format stereo music listening tests have demonstrated the superiority of Opus at 64 kbps and 96 kbps compared to the best AAC-LC, HE-AAC and Ogg Vorbis encoders, and at 96 kbps also to 128 kbps MP3 encoded using LAME <code>-V 5</code>.

==Indicative bitrate and quality==
The tables below give illustrative, indicative quality guidance based on typical modes used internally by Opus and a range of listening tests.

Encoder version 1.1 introduced automatic speech/music detection and bandwidth detection to improve mode decisions, and made VBR less constrained, all with the aim of improving the quality/bitrate trade-off. These improvements are further enhanced in versions 1.2 and 1.3. These tables are likely to require updates as the encoder is improved, especially in low-bitrate regions.

===Speech encoding quality===
This table assumes a '''monophonic''' source sampled at CD quality or above (typically 48 kHz sampling rate) but mentions stereo compatibility for 40 kbps and above. The default 20 ms frame size (22.5 ms latency) is assumed. Note that selecting ''VOIP'' mode will deliberately modify the sound with a high-pass filter and emphasis of formants and harmonics to improve the intelligibility of speech, especially in noisy environments, much as telephones do. ''Auto'' mode will not modify the sound prior to encoding, so it is usually better for high-quality speech recordings or mixed speech and music.

{| class="wikitable" style="text-align:center"
|-
!Bitrate Target
!Bandwidth
!Typical Mode Used
!Speech Quality
!Use Cases / Competitive Codecs
|-
!Less than 6 kbps
| —
| —
| Bitrates lower than 6 kbps not supported by Opus (SILK disabled if forced to encode, which results in terrible speech quality)
| Try [https://en.wikipedia.org/wiki/Codec_2 Codec 2] for 0.45–3.2 kbps mono speech or [[Wikipedia:Lyra (codec)|Lyra]] for 3.2 kbps mono speech
|-
!6 kbps
|6 kHz medium-band
|SILK
|Fair, intelligible
|AMR-NB may be a little better, but higher latency & proprietary, [[Speex]] also competitive
|-
!8 kbps
|6 kHz medium-band
|SILK
|Close to telephone quality
|AMR-NB & AMR-WB similar quality, but higher latency & proprietary. [[Speex]] competitive.
|-
!12 kbps
|12 kHz super-wideband
|hybrid
|Medium bandwidth, better than telephone quality
|Similar quality to AMR-WB
|-
!16 kbps
|20 kHz
|hybrid/CELT
|Wideband speech quality
|Similar to/better than AMR-WB
|-
!24 kbps
|20 kHz
|hybrid/CELT
|Near transparent speech
|Better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!32 kbps
|20 kHz
|CELT
|Essentially transparent speech plus moderately good stereo music
|Much better than AMR-WB. Podcasts/audiobooks/talk-radio.
|-
!40 kbps
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, fairly good stereo music
|Stereo podcasts/audiobooks/talk radio with some music
|-
!48 kbps or more
|20 kHz
|CELT
|Essentially transparent mono or stereo speech, reasonable music
|Flexible general purpose modes to suit mixed music and speech
|-
|}

===Music encoding quality===
This table assumes a '''stereophonic''' source sampled at CD quality or above (typ 48 kHz sampling rate). Opus will automatically use mono at very low bitrates, though a certain amount of stereo encoding can still be used (content dependent) even when mono is specified as the typical stereo mode in the table below.

{| class="wikitable" style="text-align:center"
|-
!Bitrate target
!Stereo mode
!Bandwidth
!typ SILK/CELT use
!Music quality notes
!Use cases/notes/competitive codecs
|-
!6 kbps
|mono
|6 kHz
|SILK
|Poor, muffled sound but intelligible lyrics.
| —
|-
!8 kbps
|mono
|6 kHz
|SILK
|Poor, muffled but OK for bitrate
| —
|-
!14 to 16 kbps
|mono
|20 kHz
|hybrid/CELT
|Fairly poor but OK for bitrate
|Perhaps acceptable for incidental music
|-
!22 to 24 kbps
|mono
|20 kHz
|hybrid/CELT
|Fair but OK for bitrate
|OK for incidental music
|-
!32 to 40 kbps
|stereo
|20 kHz
|CELT
|Moderately good stereo, some artifacts, rarely nasty
|Stereo podcasts, audiobooks, very low bitrate music
|-
!48 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, may have problems with cymbals
|Stereo podcasts, audiobooks, low bitrate music
|-
!64 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, nice sound, detectable differences to original (mostly 'not annoying')
|Music storage & streaming. Beat HE-AAC, Vorbis, MP3 in [https://web.archive.org/web/20200130083553/http://people.xiph.org/~greg/opus/ha2011/ listening test]
|-
!96 kbps
|stereo
|20 kHz
|CELT
|Full bandwidth stereo music, good quality approaching transparency
|Music storage & high quality streaming. Beat LC-AAC, Vorbis, MP3 in [http://listening-test.coresv.net/results.htm listening test]
|-
!112 kbps
|stereo
|20 kHz
|CELT
|Fairly close to transparency (needs more testing)
|Music storage & high quality streaming. Very low-latency stereo networked music performance/jam sessions at OK quality (see below table)
|-
!128 kbps
|stereo
|20 kHz
|CELT
|Very close to transparency (needs more testing). Most modern codecs competitive (AAC-LC, Vorbis, MP3)
|Music storage & streaming. Future download music sales.
|-
!160 to 192 kbps
|stereo
|20 kHz
|CELT
|Transparent with very low chance of artifacts (a few killer samples still detectable). Most old & new lossy codecs competitive.
|Music storage & streaming, dedicated limited-bandwidth audio links (e.g. wireless, [http://en.wikipedia.org/wiki/Bluetooth_profile#Advanced_Audio_Distribution_Profile_.28A2DP.29 A2DP-bluetooth] type links).
|-
!510 kbps
|stereo
|20 kHz
|CELT
|Maximum possible stereo bitrate target (actual rate often less than 510 for default frame size). Most old and new lossy codecs competitive, plus near-lossless [[lossyWAV]] and [[WavPack | WavPack lossy]]
|Music storage, dedicated limited-bitrate audio links (e.g. wireless), minimum-latency high-quality audio. LossyWAV and WavPack lossy are very competitive for storage, and WavPack lossy <code>--blocksize=256</code> may be competitive with minimum latency mode also.
|-
!>510 kbps
| —
| —
| —
|Above Opus bitrate range allowed for stereo sources
|Settle for 510 kbps or use [[lossless]], [[lossyWAV]], [[WavPack | WavPack lossy]] or lossy transform/subband codecs like [[Vorbis]], [[Musepack]] at very high settings.
|-
|}

===Lower latency versus quality/bitrate trade-off===
====Packet overhead in interactive applications====
For interactive use on the Internet or other packet-based networks, total bandwidth is subject to packet overhead: the more packets are sent per second, the more bandwidth goes to packet headers. For this reason Opus, while defaulting to 20 ms frames, supports 60 ms frames to reduce overhead when transporting low-bitrate SILK frames, at the expense of greater latency that may still be acceptable for speech. It also supports 10 ms SILK frames, which reduce latency somewhat at the expense of more packet overhead.

In the CELT layer, which tends to operate at higher bitrates than SILK, 20 ms frames are the default, but frames of 10 ms, 5 ms and 2.5 ms are also possible. Shorter frames lower latency but directly increase overhead, because more packets are transmitted per second. In addition, as discussed in the next section, shorter frames worsen the quality/bitrate trade-off of the CELT layer itself.

You probably do not want to use a frame size lower than 10 ms in applications containing speech, as doing so turns off SILK. The "lowdelay" application switch (available in FFmpeg and the raw library) turns off SILK to cut out 4 ms of synchronization delay, but a frame size of 10 ms achieves more delay reduction compared to default without sacrificing SILK.

None of the bitrates mentioned in this article account for the packet overhead.
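
As a rough illustration of how much overhead the frame size can add, the sketch below assumes one Opus frame per packet and 40 bytes of IPv4, UDP and RTP headers per packet (20 + 8 + 12 bytes). Both are assumptions made for this example, not figures from this article, and real links may add further framing on top.

```python
# Assumed per-packet header sizes in bytes: IPv4 (20) + UDP (8) + RTP (12).
HEADER_BYTES = 20 + 8 + 12


def overhead_kbps(frame_ms, header_bytes=HEADER_BYTES):
    """Header overhead in kbps when exactly one frame is sent per packet."""
    packets_per_second = 1000.0 / frame_ms
    return packets_per_second * header_bytes * 8 / 1000.0


def total_kbps(codec_kbps, frame_ms, header_bytes=HEADER_BYTES):
    """Codec bitrate plus header overhead, in kbps."""
    return codec_kbps + overhead_kbps(frame_ms, header_bytes)


if __name__ == "__main__":
    for frame_ms in (60, 20, 10, 5, 2.5):
        print(f"{frame_ms:>4} ms frames: +{overhead_kbps(frame_ms):.1f} kbps of headers")
```

Under these assumptions the header cost is independent of the codec bitrate, which is why it weighs most heavily on low-bitrate speech and on the shortest frame sizes.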

====CELT layer latency versus quality/bitrate trade-off====
Unlike the SILK layer, which works on fixed 10 ms blocks, 1, 2 or 6 of which can be combined into an Opus frame, the CELT layer is able to modify the encoding block lengths available to enable its use with shorter frames.

When the CELT layer uses 10 ms, 5 ms and 2.5 ms frames instead of the default 20 ms, it must use smaller transform block sizes to achieve this, thereby reducing frequency resolution in the MDCT compared to the default transform window, thus reducing encoding efficiency for tonal signals. To obtain the same frequency precision for a sound divided into shorter transform windows, improved amplitude precision is necessary, resulting in increased bitrate to obtain the same perceptual quality (or conversely lower quality at the same bitrate).

These reduced-latency modes remain efficient for transient signals, which use short blocks anyway.

In all modes, the algorithmic delay consists of the frame size plus an additional 2.5 ms delay. The CELT layer requires 2.5 ms for MDCT window overlap.

Xiph.org used matched [[PEAQ]] scores (an approximate perceptual quality assessment made in software) for the CELT 0.10 codec, which was used as the basis of the CELT layer in the Opus reference release. These scores indicate the following [http://people.xiph.org/~xiphmont/demo/celt/demo.html#demo approximate equivalent settings] for stereo music.

{| class="wikitable" style="text-align:center"
|-
!Frame size
!Algorithmic delay
!Bitrate to match 64 kbps at 22.5 ms delay
!Fractional bitrate increase
|-
!20 ms
|22.5 ms
|64.0 kbps
| +0.0 %
|-
!10 ms
|12.5 ms
|70.4 kbps
| +10.0 %
|-
!5 ms
|7.5 ms
|84.8 kbps
| +32.5 %
|-
!2.5 ms
|5.0 ms
|112.0 kbps
| +75.0 %
|-
|}

N.B. This table is useful for interactive streaming only. For music storage & delayed playback or non-interactive streaming, latency reduction is not important and the default 20 ms frame size is preferable.
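
The two derived columns of the table can be reproduced with a few lines of arithmetic. This is only a sketch: the matched bitrates are copied from the table above, the 2.5 ms overlap comes from the text, and the function names are made up for this example.

```python
OVERLAP_MS = 2.5        # MDCT window overlap, as stated in the text
REFERENCE_KBPS = 64.0   # reference bitrate at the default 20 ms frame size

# Frame size (ms) -> bitrate (kbps) giving a matched PEAQ score, from the table.
MATCHED_KBPS = {20: 64.0, 10: 70.4, 5: 84.8, 2.5: 112.0}


def algorithmic_delay_ms(frame_ms):
    """Algorithmic delay: one frame plus the MDCT overlap."""
    return frame_ms + OVERLAP_MS


def bitrate_increase_percent(frame_ms):
    """Extra bitrate needed relative to the 20 ms reference, in percent."""
    return (MATCHED_KBPS[frame_ms] / REFERENCE_KBPS - 1.0) * 100.0


if __name__ == "__main__":
    for frame_ms in MATCHED_KBPS:
        print(f"{frame_ms} ms frame: {algorithmic_delay_ms(frame_ms)} ms delay, "
              f"+{bitrate_increase_percent(frame_ms):.1f} % bitrate")
```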

== Implementations ==

The format and algorithms are openly documented and the reference implementation is published as free software. The reference implementation (Opus Audio Tools, opus-tools), consisting of separate encoders and decoders, is published under the terms of a BSD-like license. It is written in the C programming language and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from vorbis-tools and is therefore, unlike the encoder and decoder, available under the terms of version 2 of the GPL.

=== Reference implementation (libopus + binaries) ===
The commandline tools of the reference version are available pre-compiled for the most popular operating systems at [http://opus-codec.org/downloads opus-codec.org] and [https://ftp.mozilla.org/pub/mozilla.org/opus/ Mozilla's ftp server], plus in the foobar2000 free encoders pack and some alternative compiles through the hydrogenaud.io opus forum. The libopus commandline tools include encoder <code>opusenc</code>, decoder <code>opusdec</code>, and with a different license, the <code>opusinfo</code> opus stream & metadata analyzer.
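
A minimal sketch of driving <code>opusenc</code> from a script is shown below. It assumes the <code>--bitrate</code>, <code>--framesize</code>, <code>--vbr</code>, <code>--cvbr</code> and <code>--hard-cbr</code> options of current opus-tools builds; check <code>opusenc --help</code> for the installed version. The helper name and the file names are placeholders, and the script only prints the command rather than running it.

```python
import shlex


def opusenc_command(src, dst, bitrate_kbps=96, frame_ms=20, mode="vbr"):
    """Build an opusenc argument list; `mode` is 'vbr', 'cvbr' or 'hard-cbr'."""
    if mode not in ("vbr", "cvbr", "hard-cbr"):
        raise ValueError(f"unknown rate mode: {mode}")
    return ["opusenc",
            "--bitrate", str(bitrate_kbps),
            "--framesize", str(frame_ms),
            f"--{mode}",
            src, dst]


if __name__ == "__main__":
    # Print the command instead of running it; the file names are placeholders.
    print(shlex.join(opusenc_command("input.wav", "output.opus")))
```

The returned list can be passed to <code>subprocess.run</code> once the encoder is known to be on the search path.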

The '''latest stable release''' is recommended for general use and as of mid 2014 is considered competitive with or superior to the best alternative speech or general music encoders at most supported bitrates.

==== libopus v1.0 ====
Released 11 Sep 2012, when RFC 6716 was published, although development was largely complete by late 2011.

'''Stable, well-tuned''' <code>opusenc</code> reference encoder as included in RFC documentation.

The CELT layer, closely related to CELT 0.10, implements constrained VBR mode by default (with the bitrate boost used mainly for transients), plus true CBR.

==== libopus v1.1 ====

The alpha source code was released on 21 December 2012 for testing and user feedback. Following a beta release and further testing, the stable 1.1 version was released on 5 December 2013, when it was considered well tested enough for general release.

CELT layer [http://jmspeex.livejournal.com/11737.html quality improvements] introduced to provide '''unconstrained VBR''' include a rate boost not just for transients but now for highly tonal signals too and rate reduction when stereo image is narrow. There's also a rewrite of its '''transient detection''' code and '''time-frequency analysis''' code, and rewritten '''dynamic allocation''' code (HF/LF tilt and Band Boost) to allow more aggressive changes from the typical static allocation when warranted.

There are many minor improvements to '''speech quality''' in both SILK and CELT layers.

*'''DC-rejection''' below 3 Hz also aids quality if an inaudible DC offset is present, with no effect on deep bass notes.
*'''Automatic speech/music detection''' is introduced to optimize encoding mode choices, especially within the bitrate range (presumably around 24–40 kbps) where the encoder may perform best with SILK, hybrid or CELT depending on content type. Below that range SILK performs best for both music & speech, and above it CELT performs best for both. The detection, which works without look-ahead, is not perfect, but it is usually undecided only on audio where either mode will work well.
*'''Automatic bandwidth detection''' is also introduced to save wasted bits allocated to absent frequencies.
*'''Surround sound improvements''' were introduced since the beta release with considerable advances in coding efficiency, bitrate allocation and quality.

==== libopus v1.1.3 ====
Released July 15th, 2016. This version contains:

*Neon optimizations improving performance on ARMv7 and ARMv8 by up to 15%
*Fixes some issues with 16-bit platforms (e.g. TI C55x)
*Fixes to comfort noise generation (CNG)
*Documenting that PLC packets can also be 2 bytes
*Includes experimental ambisonics work (<code>--enable-ambisonics</code>)

==== libopus v1.2.1 ====
Released June 26th, 2017. This version contains:

*Speech quality improvements especially in the 12–20 kbit/s range
*Improved VBR encoding for hybrid mode
*More aggressive use of wider speech bandwidth, including fullband speech starting at 14 kbit/s
*Music quality improvements in the 32–48 kbit/s range
*Generic and SSE CELT optimizations
*Support for directly encoding packets up to 120 ms
*DTX support for CELT mode
*SILK CBR improvements
*Support for all of the fixes in draft-ietf-codec-opus-update-06 (the mono downmix and the folding fixes need <code>--enable-update-draft</code>)
*Many bug fixes, including integer wrap-arounds discovered through fuzzing (no security implications)

==== libopus v1.3 ====
Released on October 18th, 2018. This version contains:

* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband encoding down to 9 kb/s
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in <nowiki>RFC 8251</nowiki> by default
* Security/hardening improvements
* Fixes to the CELT PLC
* Bandwidth detection fixes

==== libopus v1.3.1 ====
Released on April 12th, 2019. This version contains:

* Fixes to x87 builds
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)

==== libopus v1.4 ====
Released on April 20th, 2023. This version contains:

* Improved tuning of the Opus in-band FEC (LBRR). See the issue for details
* Added a OPUS_SET_INBAND_FEC(2) option that turns on FEC, but does not force SILK mode (FEC will be disabled in CELT mode)
* Improved tuning and various fixes to DTX
* Added Meson support, improved CMake support

In addition to the improvements above, this release includes many minor bug fixes.

=== Other implementations ===

==== Concentus ====

The libopus reference library (fixed-point variant) has successfully been ported to both '''C#''' and '''Java''', as part of a project called '''Concentus'''. The aim of the project is specifically to target cross-platform applications where native C interop is relatively difficult. The code is available on [https://github.com/lostromb/concentus Github] and distributed via standard package managers.

==== Emscripten ports ====

Several ports of the reference Opus implementation to JavaScript have been made using the automated tool [https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Emscripten Emscripten]. See [https://blog.rillke.com/opusenc.js/ here], [https://github.com/kazuki/opus.js-sample here] and [https://github.com/audiocogs/opus.js here].

==== ffmpeg ====
FFmpeg has a native [https://ffmpeg.org/ffmpeg-codecs.html#opus "opus"] codec. It is of lower quality than the reference libopus and only does CELT coding. However, it is still good for the ecosystem to have a completely independent implementation.

== Hardware & Software Support ==

Much of this section is based heavily on the Jan 12th 2013 version of the '''Support''' section of the [http://en.wikipedia.org/wiki/Opus_%28audio_format%29 Wikipedia article], which is more likely to be kept updated and to provide links to further information about the supporting platforms.

=== VoIP software ===
* The open source virtual PBX Freeswitch supports Opus transcoding.
* The voice-chat software Mumble supports Opus as its main codec.
* SIP softphones Phoner and PhonerLite support Opus
* The SIP and IAX2 client SFLphone is being fitted with Opus support.
* Integration of Opus into the Skype client is finished, although no version with Opus support has yet been published.
* TrueConf video conferencing solutions support Opus.
* Opus support is planned for Jitsi 2.0, together with VP8 video.
* Empathy may use any format supported in GStreamer, including Opus.
* Line2 has replaced their current codec with Opus. Their iOS app will be the first to be released with Opus. The Android app will follow later.
* CSipSimple supports Opus, Codec2, G.726 and G.722.1 with an additional plug-in.
* The voice-chat software TeamSpeak 3 supports Opus for voice and music in pre-release server 3.0.7-pre2 and beta client version 3.0.10.
* The proprietary instant messenger service Discord uses Opus audio for all voice calls and video calls, regardless of platform.

=== Web frameworks and browsers ===
* Opus support is mandatory for WebRTC implementations.
* Mozilla supports Opus beginning with version 15 of Firefox and Thunderbird, plus Seamonkey, which uses a shared codebase.
* Depending on the backend in use, Opera supports inline playback of embedded Opus files. Official support for Opus and WebRTC are on the development roadmap.
* Chromium and Google Chrome have audio support as of version 33.
* Apple's Safari browser now supports Opus as of iOS 11 and macOS 10.13 High Sierra.
* Maxthon Cloud Browser

=== Streaming audio ===
* Icecast (examples: [http://dir.xiph.org/by_format/Opus Stream directory by format Opus], [http://smj.delfa.net/opus_64.m3u 64k]/[http://smj.delfa.net/opus_256.m3u 256k] [http://smj.delfa.net/ Smooth Jazz Opus Stream], [http://www.absoluteradio.co.uk/listen/labs.html Absolute Radio Opus Trial] 7 stations at 24, 64 and 96 kbps, [http://icecast.ofdoom.com:8000/burst-opus.ogg Icecast Of Doom 96k])
* Krad Radio
* Liquidsoap

=== Operating systems and desktop multimedia frameworks ===
* In Debian GNU/Linux the Opus development tools and supporting libraries can be installed from the preconfigured repositories in the next stable version ("wheezy") that is expected to be released in early 2013.
* For Microsoft Windows, there are DirectShow filters supporting Opus, including DC-Bass Source Mod and the LAV Filters.
* In GStreamer the integration of Opus support is complete.
* FFmpeg supports decoding and encoding Opus via the external library libopus.
* Android 5.0 and above supports Opus natively when it is encapsulated in the Ogg container. However, Android does not recognize the .opus filename extension, so the double extension .opus.ogg is recommended as a workaround that lets apps recognize the files as playable audio.
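
The double-extension workaround for Android can be scripted. The following is a minimal sketch with a hypothetical helper name; it only computes the new file name and leaves the actual renaming to the caller.

```python
from pathlib import Path


def android_friendly_name(path):
    """Return `name.opus.ogg` for `name.opus`; leave any other path unchanged."""
    p = Path(path)
    if p.suffix.lower() == ".opus":
        return p.with_name(p.name + ".ogg")
    return p


if __name__ == "__main__":
    for name in ("album/track01.opus", "album/track02.opus.ogg", "cover.jpg"):
        print(name, "->", android_friendly_name(name))
```

Files that already end in <code>.opus.ogg</code> are left alone, so the helper can be applied repeatedly to the same folder.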

=== Hardware support ===
* Support in [[Rockbox]] is available. This means hardware support for a series of portable media players (including some products from the iPod series by Apple and Sansa, iriver and Archos devices) and with "Rockbox as an Application" (RaaA) also on Android devices.

=== Player software ===

* Windows/Mac/Linux (Cross-Platform)
*# [[VLC]] media player supports Opus as of version 2.0.4
*# [[Amarok]] 2.8 has transcoding support for Opus codec if ffmpeg is compiled with support for the libopus library & support for playback of Opus encoded files if Amarok is compiled against TagLib (newer than V1.8)
*# Clementine has Opus support
*# Audacious player
*# [[MPD]] as of version 0.18 if compiled against libopus (supports both encoding for http streams and decoding)
* Windows Exclusive
*# AIMP supports Opus natively as of version 3.20 build 1125 beta 1
*# [[foobar2000]] supports Opus natively as of v1.1.14 beta 1
*# Mpxplay supports Opus (using a decoder DLL) as of v1.60 alpha 2
*# [[Winamp]] supports Opus using a [http://forums.winamp.com/showthread.php?p=2925154#post2925154 3rd party plug-in]
*# MPC-HC
*# Resonic Player/Pro supports Opus natively as of version 0.2.2
* iOS/Android (Cross-Platform)
*# Capriccio [https://itunes.apple.com/us/app/capriccio-free-ultimate-music/id434829018?mt=8 iOS]/[https://play.google.com/store/apps/details?id=me.ideariboso.capriccio Android]
*# foobar2000 [https://itunes.apple.com/us/app/foobar2000/id1072807669?mt=8 iOS]/[https://play.google.com/store/apps/details?id=com.foobar2000.foobar2000&hl=en Android]
* Android Exclusive
*# [https://play.google.com/store/apps/details?id=in.krosbits.musicolet Musicolet Music Player]
*# [http://gonemadmusicplayer.blogspot.com/ GoneMAD Music Player]
*# [http://neutronmp.com/ Neutron Music Player]
*# [http://www.videolan.org/vlc/download-android.html VLC Media Player for Android]
*# [https://play.google.com/store/apps/details?id=ru.recoilme.freeamp FreeMP]
*# [https://play.google.com/store/apps/details?id=net.mderezynski.youki3 Youki]
*# [https://play.google.com/store/apps/details?id=com.aimp.player AIMP for Android]
*# [https://play.google.com/store/apps/details?id=com.acmeandroid.listen Listen Audiobook Player]
*# [https://play.google.com/store/apps/details?id=com.mxtech.videoplayer.ad MX Player]
*# [https://play.google.com/store/apps/details?id=org.tomahawk.tomahawk_android Tomahawk Player Beta]
*# [https://play.google.com/store/apps/details?id=com.maxmpz.audioplayer&hl=en Poweramp Music Player]

=== Other software ===
* CDBurnerXP
* MediaCoder
* Report-IT
* [[MP3tag|MP3tag]]
* [https://moisescardona.me/opus-gui/ Opus GUI]
* [http://www.xdlab.ru/en/ TagScanner]
* [http://www.xmedia-recode.de/ XMedia Recode]

== References & Notes ==

*{{note|homepage|a}}[http://opus-codec.org/ opus-codec.org homepage]
*{{note|FAQ|b}}[http://wiki.xiph.org/OpusFAQ Opus FAQ]
*{{note|RFC|c}}[http://tools.ietf.org/html/rfc6716 IETF RFC 6716]

[[Category:Codecs]]
[[Category:Lossy]]
[[Category:Encoder/Decoder]]

PEAQ

2023-08-10T05:54:36Z

Artoria2e5: Redirected page to BS.1387

#REDIRECT [[BS.1387]]

BS.1387

2023-08-10T05:42:25Z

Artoria2e5: /* Other objective metrics */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It contrasts with the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] tests, which Hydrogenaudio frequently prefers. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing, however, are still inconclusive, and the method is mostly used only by codec developers and researchers.
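
For readers unfamiliar with the statistic, here is a minimal sketch of a Pearson correlation between listener ratings and model ratings. The function is a plain textbook implementation, and the numbers in the usage example are invented purely to show the call; they are not results from any listening test.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented example values: one rating per test sample.
    human_ratings = [-0.5, -1.2, -2.0, -3.1]
    model_ratings = [-0.7, -1.0, -2.4, -2.9]
    print(f"r = {pearson(human_ratings, model_ratings):.3f}")
```

A coefficient near 1 means the model ranks and spaces the samples much as listeners do; it says nothing about whether the absolute grades agree.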

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.
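
The same invocation can be wrapped in a script. This sketch uses only the two flags quoted above and assumes an <code>eaqual</code> executable on the search path; the helper names are made up, and the output is returned as raw text because its exact format is not described here.

```python
import shutil
import subprocess


def eaqual_command(reference_wav, test_wav):
    """Argument list for the comparison shown in the text."""
    return ["eaqual", "-fref", reference_wav, "-ftest", test_wav]


def run_eaqual(reference_wav, test_wav):
    """Run EAQUAL and return its raw text output, or None if it is not installed."""
    if shutil.which("eaqual") is None:
        return None
    result = subprocess.run(eaqual_command(reference_wav, test_wav),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print(" ".join(eaqual_command("ref.wav", "test.wav")))
```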

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.
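
The arithmetic behind a difference grade is simple enough to sketch. The example assumes the usual five-grade impairment scale (5 = imperceptible down to 1 = very annoying) and the sign convention "test minus reference", so that a degraded sample yields a negative grade; the function name is made up for illustration.

```python
# Assumed five-grade impairment scale used for the subjective scores.
GRADES = {
    5: "imperceptible",
    4: "perceptible but not annoying",
    3: "slightly annoying",
    2: "annoying",
    1: "very annoying",
}


def subjective_difference_grade(test_score, reference_score):
    """SDG = score of the test sample minus score of the reference."""
    for score in (test_score, reference_score):
        if not 1.0 <= score <= 5.0:
            raise ValueError("scores must lie on the 1-5 scale")
    return test_score - reference_score


if __name__ == "__main__":
    # A listener rates the hidden reference 5.0 and the encoded sample 3.0.
    print(subjective_difference_grade(3.0, 5.0))
```

With this convention a grade of 0 means the sample was judged indistinguishable from the reference, and more negative values mean a more annoying difference.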

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so expect some academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It very slightly deviates from the standard.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but the neural network (don't worry, it runs fast enough on a CPU) is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.
* CDPAM is allegedly the model that comes closest to human datasets. Unfortunately the only pre-trained model works on 22050 Hz. And it's academic neural-network code -- not something you can expect to run on first try.
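
The clip-splitting idea mentioned for VISQOL could start from something like the sketch below, which cuts a WAV file into fixed-length pieces with Python's standard <code>wave</code> module. The 5-second default and the function names are assumptions for illustration only, and the sketch does not call VISQOL itself; each clip would still have to be scored separately.

```python
import wave


def clip_ranges(total_frames, sample_rate, clip_seconds=5.0):
    """Return (start, end) frame ranges covering the file in fixed-length clips."""
    if total_frames < 0 or sample_rate <= 0 or clip_seconds <= 0:
        raise ValueError("frame count must be >= 0; rate and clip length must be > 0")
    clip_frames = max(1, int(round(clip_seconds * sample_rate)))
    return [(start, min(start + clip_frames, total_frames))
            for start in range(0, total_frames, clip_frames)]


def split_wav(src_path, dst_pattern, clip_seconds=5.0):
    """Write each clip to dst_pattern.format(index); return the written paths."""
    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        ranges = clip_ranges(src.getnframes(), src.getframerate(), clip_seconds)
        for index, (start, end) in enumerate(ranges):
            src.setpos(start)
            frames = src.readframes(end - start)
            path = dst_pattern.format(index)
            with wave.open(path, "wb") as dst:
                dst.setparams(params)   # frame count is corrected on close
                dst.writeframes(frames)
            paths.append(path)
    return paths
```

The reference and the decoded file would need to be split with the same clip length so that the segments line up; the last clip is simply shorter when the file length is not a multiple of the clip length.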

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T05:36:24Z

Artoria2e5: /* Other objective metrics */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It contrasts with the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] tests, which Hydrogenaudio frequently prefers. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing, however, are still inconclusive, and the method is mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores between the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so expect some academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It very slightly deviates from the standard.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both designed for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It works for both speech and music, but is trained for short clips only. Maybe someone can write a tool to split one file into many clips and see individual segment scores.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T05:34:56Z

Artoria2e5: /* Other implementations */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It contrasts with the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] tests, which Hydrogenaudio frequently prefers. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], to determine via a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing, however, are still inconclusive, and the method is mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (ex: ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores between the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering that the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode] linux archive of c code used to implement EAQUAL provided by Gabriel Bouvigne, mirrored on github by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools] zip compression archive of the utility used to perform EAQUAL tests provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs] github fork, adds macOS support

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so expect some academic code smell.

There is one advanced-version implementation: [https://github.com/HSU-ANT/gstpeaq GstPEAQ], which builds on GStreamer. It very slightly deviates from the standard.

== Other objective metrics ==
PEAQ is not the end. There are other metrics:<ref>https://github.com/jonnor/machinehearing/blob/09b5060bd03b8a49fc1d0afd8eedba4babca83ca/audio-quality/README.md</ref>

* ITU also has PESQ and POLQA, both optimized for speech.
* [https://github.com/google/visqol VISQOL] is Google's open-source metric. It's a neural-network model for both speech and music, but is optimized for short clips only.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

Category:Quality measurement

2023-08-10T05:28:48Z

Artoria2e5: Created page with "Category:Technical"

[[Category:Technical]]

Category:Listening Tests

2023-08-10T05:28:36Z

Artoria2e5:

[[Category:Quality measurement]]

BS.1387

2023-08-10T05:28:28Z

Artoria2e5: /* External links */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' measurement technique used to measure the quality of encoded/decoded audio files. It contrasts with the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] tests, which Hydrogenaudio frequently prefers. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], using a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] to determine the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing are still inconclusive, however, and the methodology is mostly used only by codec developers and researchers.
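
As a rough illustration of the correlation step mentioned above, the following Python sketch computes a Pearson correlation coefficient between two lists of scores. The function name <code>pearson</code> and the sample numbers are hypothetical and are not taken from any actual listening test; this is only a minimal sketch of the statistic, not the procedure used in the linked thread.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, made-up scores for four samples (NOT real test data).
human_ratings = [-0.5, -1.2, -2.0, -3.1]
eaqual_ratings = [-0.7, -1.0, -2.4, -2.9]
r = pearson(human_ratings, eaqual_ratings)
```

A value of <code>r</code> close to 1 would indicate that the objective scores rise and fall linearly with the human ratings; it says nothing about whether the two scales agree in absolute terms.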

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (e.g. ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.
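
For batch use, the same command line can be assembled from a script. The sketch below is one possible Python wrapper; it uses only the ''-fref'' and ''-ftest'' arguments shown above, and it assumes an executable named <code>eaqual</code> is available on the PATH. The helper names are hypothetical, and the program's output is returned as raw text because its exact format is not described here.

```python
import shutil
import subprocess


def build_eaqual_command(ref_path, test_path, exe="eaqual"):
    """Build the argument list for: eaqual -fref REF -ftest TEST."""
    return [exe, "-fref", ref_path, "-ftest", test_path]


def run_eaqual(ref_path, test_path, exe="eaqual"):
    """Run EAQUAL and return whatever it printed to standard output.

    Assumes the executable is installed; the output is not parsed.
    """
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe!r} was not found on PATH")
    result = subprocess.run(
        build_eaqual_command(ref_path, test_path, exe),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.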

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.
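
The arithmetic behind the SDG can be sketched as follows. This Python example assumes the usual sign convention, in which the reference's grade is subtracted from the test signal's grade so that a degraded test signal yields a negative value; the function name and the sample grades are hypothetical.

```python
def subjective_difference_grade(test_grade, reference_grade):
    """Return the SDG: test grade minus reference grade (assumed convention)."""
    for grade in (test_grade, reference_grade):
        if not 1.0 <= grade <= 5.0:
            raise ValueError("grades must lie on the 1-5 scale")
    return test_grade - reference_grade


# Hypothetical listener grades: reference rated 5.0, encoded sample rated 3.0.
sdg = subjective_difference_grade(3.0, 5.0)
```

With grades confined to 1–5 and a reference graded at 5, the result falls between -4 and 0, which is the range an ODG is meant to approximate.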

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so expect some academic code smell.

Unfortunately, we have not been able to locate an open-source PEAQ-Advanced implementation yet.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measurement]]

BS.1387

2023-08-10T05:28:18Z

Artoria2e5: /* External links */

'''ITU-R recommendation BS.1387''' is the document that defines '''Perceptual Evaluation of Audio Quality''' (PEAQ), an ''objective'' technique for measuring the quality of encoded/decoded audio files. It stands in contrast to the more commonplace ''subjective'' testing methodology based on [[ABX]] and [[ABC/HR]] reference testing, which is frequently preferred by Hydrogenaudio. PEAQ returns an "ODG" rating, which is intended to match the difference in subjective (1–5) scores between the two input samples.

== Structure ==
PEAQ has two versions: basic and advanced. The basic version only uses an FFT-based ear model and is easier to compute. The advanced version uses both FFT and filter bank and is expected to be more accurate.

== History ==

BS.1387 was initially published in 1998. It was updated to BS.1387-1 in 2001 and BS.1387-2 in 2023.

* BS.1387-1 includes important technical corrections -- ones that are important to reach the standard's own conformance criteria.<ref>https://www.opticom.de/download/CorrectionstoBS1387.pdf</ref>
* BS.1387-2 seems to have no real change, except for removal of references to BS.1115, addition of a table of contents, and extensive reformatting.

== EAQUAL ==
'''EAQUAL''' (''Evaluation Of Audio Quality'') is open-source software that implements PEAQ's basic model ''only''. Several tests have been performed using EAQUAL, most notably on numerous [http://www.hydrogenaudio.org/forums/index.php?showtopic=20264 AAC encoders], using a [http://en.wikipedia.org/wiki/Pearson_correlation Pearson correlation] to determine the linear relationship between human ratings and EAQUAL ratings on a given set of test samples. The results of such objective testing are still inconclusive, however, and the methodology is mostly used only by codec developers and researchers.

=== Invoking EAQUAL ===
As of version 0.1.3alpha, the ''-h'' argument can be used to find out how to use eaqual (e.g. ''eaqual -h'').

To compare a test wave file to a reference wave file, one can use for example: ''eaqual -fref ref.wav -ftest test.wav''.

=== Interpreting EAQUAL output ===
EAQUAL outputs one score, the PEAQ "ODG" rating. This ODG (Objective Difference Grade) rating is designed by ITU to match an SDG (Subjective Difference Grade) rating, which is the difference between the subjective (1–5) scores of the two input samples.

=== Status of the project ===
Development of EAQUAL was halted in 2002 due to patent concerns. This is not a problem for PEAQ compliance, however, considering the 2001 BS.1387-1 does not differ substantially from the 2023 version.

The ITU patent declaration system does not list any specific PEAQ patent by number. However, no new patents have been added since 1998, so any patent should have expired by 2018.<ref>https://www.itu.int/en/ITU-R/study-groups/Pages/itu-r-patent-information.aspx</ref>

Versions of EAQUAL include:
* [http://www.mp3-tech.org/programmer/sources/eaqual.tgz EAQUAL Sourcecode]: Linux archive of the C code used to implement EAQUAL, provided by Gabriel Bouvigne and mirrored on GitHub by [https://github.com/spxnn/eaqual spxnn]
* [http://www.rarewares.org/others.html EAQUAL Tools]: zip archive of the utility used to perform EAQUAL tests, provided by Rarewares.
* [https://github.com/ivan-codelegs/eaqual ivan-codelegs]: GitHub fork that adds macOS support

== Other implementations ==
PEAQ-Basic is simple enough to have many implementations.

* [https://sourceforge.net/projects/peaqb/ peaqb] is another implementation of PEAQ. Last updated 2003.
* There are a good number of Matlab implementations for researchers. But it's Matlab, so expect some academic code smell.

Unfortunately, we have not been able to locate an open-source PEAQ-Advanced implementation yet.

== External links==
* [[Wikipedia:Perceptual Evaluation of Audio Quality]]
* [https://www.itu.int/rec/R-REC-BS.1387 ITU BS.1387] download -- free full text of the standard, straight from the official site.
<references />

[[Category:Software]][[Category:Quality measures]]

BS.1387

2023-08-10T05:24:58Z

Artoria2e5: /* Structure */

BS.1387

2023-08-10T05:24:42Z

Artoria2e5: /* Other implementations */

BS.1387

2023-08-10T05:22:26Z

Artoria2e5: /* Status of the project */

BS.1387

2023-08-10T05:18:38Z

Artoria2e5: /* External links */