Difference between revisions of "Lossless"

From Hydrogenaudio Knowledgebase
(move to --> Category:Codecs)
m (Couple of words fixed.)
(22 intermediate revisions by 15 users not shown)
Line 1: Line 1:
'''Lossless compression''' is a compression methodology in which the result of the compression can be restored faithfully, i.e. bit-by-bit identical with the uncompressed data.
+
Compression is '''lossless''' when decoding the compressed data gives a result that is bit-by-bit identical to the uncompressed original. A format that stores data ''uncompressed'' is likewise lossless if the original can be recovered from it bit-by-bit.
  
In a nutshell, it is somewhat like compressing a Waveform file with ZIP or RAR.
+
Lossless compression has long been used in various applications, for example in ''generic file compressors'' like ZIP or RAR, or in the Windows NTFS file compression feature; this article is about lossless compression of audio signals.
  
The difference from 'mere' ZIP/RAR is that lossless audio compression algorithms are specially designed and tuned for the characteristics of Waveform data, thus achieving far greater compression than generic compression utilities can.
+
== Lossless audio compression and formats ==
  
As lossless compression preserves all information of the original Waveform file, audio compressed with lossless compression will unavoidably be larger than audio compressed with [[lossy]] compression. However, this disadvantage is more than offset by the ability of lossless files to be [[transcoding|transcoded]] to other lossless formats <u>without</u> any quality degradation.
+
Compressing audio with generic file compressors to e.g. .7z or .rar is not efficient for typical audio signals: file sizes end up fairly close to the uncompressed original. Lossless audio formats, which exploit knowledge about real-world ''audio'' data, might come closer to half the size of the original uncompressed [[PCM|linear PCM]] (.wav or .aiff) file. File sizes will still be larger than audio compressed with any (reasonable) [[lossy]] encoder, as lossy compression aims at saving space by replacing the original signal with an approximation which is ''perceptually'' "close" but easier to compress.
  
 +
Lossless audio file formats ''typically'' have features that generic file compressors lack (but most lossy audio formats possess): for playback they can be read block by block rather than having to unpack the whole file first, and a decoder can pick up the audio mid-stream and play from there (like tuning in to a radio channel). Furthermore, they can be tagged with metadata like artist, album, title, track number etc. Because tags are designed to be altered by users at their discretion, a lossless ''audio'' format need only preserve the audio bit-by-bit, not the metadata, although certain lossless codecs can also store the original file's metadata in a separate chunk so that it can be recreated.
  
=Popular lossless formats=
+
Just as two .zip-compressed copies of the same file might differ, for example because of the effort spent on finding a smaller file with the same information (try 7-zip with different compression options), the same original audio file might encode to different sizes depending on both the codec format and the settings used when encoding. The compressor's internal choices could possibly even depend on the CPU, so that the same command given on two different computers produces different files.
* [[Apple Lossless]] ([[ALAC]])
+
  
* [[FLAC]]
+
The term "lossless" is not restricted to ''files''; it also applies to data streams (like the lossless audio in a video file) and to data not stored in files (an audio [[CD]] has no files), and furthermore to the ''process'' that generates a signal. E.g. reducing a 16-bit signal to 8 bits is not a "lossless" operation, and it does not become lossless even if the output signal is stored in a "lossless" format like FLAC (or even uncompressed .wav or .aiff). [https://en.wikipedia.org/wiki/Master_Quality_Authenticated MQA] is lossy processing even if delivered with a codec that ''could'' deliver the lossless signal.
  
* [[Lossless Audio]] ([[LA]])
+
=== Notable lossless codecs in current use ===
  
* [[LPAC]]
+
Different [[codec|codecs]] (i.e. formats and their encoders/decoders) have been developed with different priorities in mind, trading off compressed file size against encoding CPU load (time taken to encode) and decoding CPU load (to play, or to decompress for e.g. creating lossy files for portable use). They also differ in features and in OS/third-party support. Thus there is no single 'superior for all' format. To compare features and performance, see the HA Wiki's [[Lossless comparison]], though arguably performance was more of an issue with the storage/CPU costs of the early 2000s, when most popular lossless formats were launched and when the first versions of the comparison and this article were written.
  
* [[MLP]]
+
Some formats in current use - some widespread and available from online music stores, others arguably restricted to the enthusiast user segments - in alphabetical order:
 +
* [[Apple Lossless|Apple Lossless (ALAC)]].  Has support in the Apple ecosystem.
 +
* [[Free Lossless Audio Codec|Free Lossless Audio Codec (FLAC)]]. Probably the most common and well-supported lossless codec.  Highly optimized for light decoding CPU usage.
 +
* [[Monkey's Audio|Monkey's Audio (APE)]]. Launched in 2000, it was at the time ''the'' compressor for users who prioritized size over speed. Still actively maintained.
 +
* [[OptimFROG]]. Even higher compression (and slower speed) than Monkey's.  Optional [[hybrid codec|lossless/lossy hybrid]] encoding.
 +
* [[TAK|Tom's verlustfreier Audiokompressor (TAK)]]. More recently launched (2006), it has attracted attention for accomplishing both high speed and high compression levels.
 +
* [[TTA|The True Audio (TTA)]]. A single-setting compressor performing on the fast side (close to WavPack/FLAC).
 +
* [[WavPack]]. Developed since the 1990s into arguably the most feature-rich lossless codec. Optional [[hybrid codec|lossless/lossy hybrid]] encoding.
  
* [[Monkey's Audio]] ([[APE]])
+
Blu-ray and DVD discs are also widespread, and carry a variety of audio formats, of which the lossless compressed formats are [[Meridian Lossless Packing]] (MLP), Dolby TrueHD (uses the MLP algorithm) and [[DTS-HD|DTS-HD MA]] (hybrid). [https://en.wikipedia.org/wiki/FFmpeg FFmpeg] has support for these.
  
* [[OptimFROG]]
+
=== Other (once) notable formats ===
 +
These formats were at some stage widely used or otherwise notable, though end-users would hardly encode to them anymore (as of 2022):
 +
* [[Shorten]] (SHN): The major lossless compressor of the 1990s.
 +
* [[Windows Media Audio|WMA lossless]]: Once aggressively pushed by Microsoft, support for the WMA formats has waned to the point where certain Windows 10 releases could not handle WMA lossless(ly).  Not recommended.
 +
* [[ATRAC]] Advanced Lossless: a lossless [[hybrid codec|hybrid]] extension of Sony's ATRAC format (MiniDisc etc.). Like WMA, a once-corporate-backed format now considered legacy.
 +
* [[mp3HD]]: A short-lived similar extension of MP3, [[hybrid codec|hybrid]] with a lossless correction stream. 
 +
* [[Real Lossless]]. Before the Windows Media suite, Real Networks had theirs, and it was expanded with a lossless audio format and a freeware encoder.  Real would later support the development of MPEG-4 ALS.
 +
* [[MPEG-4 ALS]]. Despite being an ISO standard, with an open-source encoder/decoder available, the format scarcely caught on.  Its predecessors [[MPEG-4 ALS | LPAC/LTAC]] once enjoyed some popularity in competition with Shorten.
 +
* [[MPEG-4 SLS]]. Also ISO-standardized, but hardly in use, and obviously not intended for end-users, witnessed by the pricing of the only known encoder.
 +
* [[Lossless Audio]] (La).  Notable for its very high compression levels, and would therefore appear in comparison tests.  Unmaintained since 2004.
 +
* [[Sac]].  Only semi-notable for its even higher compression levels, not for ever being practically useful other than for benchmarking.
 +
* [[RK Audio]] (RKAU) and the later general-purpose compressor WinRK. RKAU offered good compression by year-2000 standards.
 +
* [http://www.logarithmic.net/pfh/bonk Bonk]. Also came with a lossy compressor; both were abandoned around 2002. More notable for the project evolving into the BonkEnc CD ripper, which later changed its name to [[fre:ac]]. Bonk itself was redeveloped into a lossy/lossless codec called ''sonic'', which has ffmpeg support.
 +
* aptX Lossless is a codec to be used in Bluetooth streaming. Hardware support [https://www.qualcomm.com/news/releases/2021/09/01/qualcomm-adds-bluetooth-lossless-audio-technology-snapdragon-sound announced September 2021], future popularity unknown at time of writing.
  
* [[RKAU]]
+
Several audio editing applications also have (or had) their own formats, some of which are still in use.
  
* [[Shorten]] ([[SHN]])
+
=== Oddball legacy formats ===
 +
There are several old lossless formats that never made it to a significant userbase. Most of those would have disappeared by now, but several are being preserved for posterity at [[User:Rjamorim|rjamorim]]'s  Rarewares/[https://www.rarewares.org/rrw/programs.php ReallyRareWares] website.
  
* [[TTA]]
+
* a-Pac (by sound card manufacturer MARIAN)
 +
* Advanced Digital Audio (ADA)
 +
* AudioZip
 +
* Dakx WAV
 +
* Entis Lab MIO
 +
* Kexis
 +
* LiteWave
 +
* mkw
 +
* [https://en.wikipedia.org/wiki/Ogg_Squish OggSquish] (Xiph, discontinued in favour of FLAC).
 +
* Pegasus SPS
 +
* Split2000
 +
* Sonarc ([https://codecs.multimedia.cx/2020/09/a-brief-look-at-sonarc/ possibly the first known lossless audio compressor], apparently predating both Shorten and VocPack)
 +
* VocPack
 +
* WavArc
 +
* WaveZip/MUSICompress
  
* [[WavPack]]
+
And finally, HA will sometimes see codecs created more for educational purposes than with the intention of acquiring a userbase, like [https://hydrogenaud.io/index.php?topic=117335 SLAC] by the WavPack developer and [https://hydrogenaud.io/index.php?topic=119030 SELA] (HA threads).
  
* [[WMA | WMA lossless]]
+
== Further reading ==
 +
* [https://www.rarewares.org/rrw/programs.php ReallyRareWares] has preserved older codecs
 +
* [http://fileformats.archiveteam.org/wiki/Audio_and_Music#Audio_recording_and_sound_waves fileformats.archiveteam.org] has a section describing audio file formats.
 +
* [[Lossless_comparison| HA Wiki's Lossless Codec Comparison]] originally by [[User:Rjamorim|Rjamorim]]  
  
 
+
[[Category:Codecs|*]]
=Oddball Formats=
+
There are several old lossless formats that don't really deserve an article of their own. Reasons are: lack of widespread support, lack of features, bad efficiency and, most importantly, it seems no one is really interested in them.
+
 
+
Most of those would have disappeared by now, but they are being preserved for posterity at [[User:Rjamorim|rjamorim]]'s [http://www.rjamorim.com/rrw/ ReallyRareWares]
+
 
+
; Advanced Digital Audio (ADA)
+
 
+
* http://www.rjamorim.com/rrw/ada.html
+
 
+
; Marian's a-Pac
+
 
+
* http://www.marian.de/en/downloads#APAC
+
* http://www.rjamorim.com/rrw/apac.html
+
 
+
; AudioZip
+
 
+
* http://www.rjamorim.com/rrw/audiozip.html
+
 
+
; Dakx WAV
+
 
+
* http://www.dakx.com/
+
* http://www.rjamorim.com/rrw/daxwav.html
+
 
+
; Entis Lab MIO
+
 
+
* http://www.entis.gr.jp/eri/frame.html
+
* http://www.rjamorim.com/rrw/mio.html
+
 
+
; LiteWave
+
 
+
* http://www.clearjump.com/products/LiteWave.html
+
* http://www.rjamorim.com/rrw/litewave.html
+
 
+
; Pegasus SPS
+
 
+
* http://www.krishnasoft.com/sps.htm
+
* http://www.rjamorim.com/rrw/pegasussps.html
+
 
+
; RKaudio
+
 
+
* http://www.msoftware.co.nz/downloads_page.php
+
* http://rksoft.virtualave.net/rkau.html
+
 
+
; Split2000
+
 
+
* http://www.rjamorim.com/rrw/split2000.html
+
 
+
; Sonarc
+
 
+
* http://www.rjamorim.com/rrw/sonarc.html
+
 
+
; VocPack
+
 
+
* http://www.rjamorim.com/rrw/vocpack.html
+
 
+
; WavArc
+
 
+
* http://www.rjamorim.com/rrw/wavarc.html
+
 
+
; WaveZip/MUSICompress
+
 
+
* http://members.aol.com/_ht_a/sndspace/
+
* http://www.rjamorim.com/rrw/wavezip.html
+
 
+
 
+
Note that currently '''no single format can be considered best for all applications'''. Rather, the best format depends on the ''intended use'', as well as a number of other factors (such as licensing and file structure). For example, Shorten and FLAC are widely used for sharing live music because of their cross-platform support and speed. Monkey's Audio is popular among Windows users for its superior compression ratio.
+
 
+
=Comparisons=
+
''Note the specific assumptions and limitations of each comparison; in particular, results are sensitive to the music selected.''
+
 
+
; http://web.inter.nl.net/users/hvdh/lossless/lossless.htm : Includes an interesting graph of encode/decode speeds vs. file size on the All Albums page
+
 
+
; [[Lossless comparison]] : A comparison focusing more on codec features and less on absolute encoding efficiency. Also features a table comparing the most popular codecs based on their features.
+
 
+
; http://members.home.nl/w.speek/comparison.htm : Performance Comparison of Lossless Audio Compressors - Compares file size, encode speed, decode speed for [[APE]], [[FLAC]], [[LPAC]], [[WavPack]], Shorten ([[SHN]]), [[RKAU]], [[OptimFROG]], [[LA]], [[WMA | WMA Lossless]]. Updated 5-2003
+
 
+
; http://www.bobulous.org.uk/misc/lossless_audio_2006.html : Lossless audio formats - A comparison of the rip-and-encode speed and album file size of six different lossless formats: [[WAV|uncompressed Wave]], [[FLAC]], [[WavPack]], [[SHN|Shorten]], [[APE|Monkey's Audio]], and [[OptimFROG]]. First published on 22nd May 2006.
+
 
+
 
+
[[Category:Codecs]]
+

Revision as of 22:31, 15 May 2022

Compression is lossless when decoding the compressed data gives a result that is bit-by-bit identical to the uncompressed original. A format that stores data uncompressed is likewise lossless if the original can be recovered from it bit-by-bit.

Lossless compression has long been used in various applications, for example in generic file compressors like ZIP or RAR, or in the Windows NTFS file compression feature; this article is about lossless compression of audio signals.
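
As a minimal illustration of the definition, the following Python sketch uses the standard-library zlib module (DEFLATE, the method commonly used inside ZIP files) as a stand-in for a generic lossless compressor. The input bytes are made up for the example; nothing here is specific to audio.

```python
import zlib

# Made-up "original" data: any byte string will do.
original = b"lossless means bit-by-bit identical " * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless: decoding returns exactly the bytes that went in.
is_lossless = restored == original
print(len(original), len(compressed), is_lossless)
```

The comparison is on the decoded bytes, not on the compressed file: two different compressed files can both be lossless representations of the same original.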

Lossless audio compression and formats

Compressing audio with generic file compressors to e.g. .7z or .rar is not efficient for typical audio signals: file sizes end up fairly close to the uncompressed original. Lossless audio formats, which exploit knowledge about real-world audio data, might come closer to half the size of the original uncompressed linear PCM (.wav or .aiff) file. File sizes will still be larger than audio compressed with any (reasonable) lossy encoder, as lossy compression aims at saving space by replacing the original signal with an approximation which is perceptually "close" but easier to compress.
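
One way to see what "knowledge about audio data" can mean is to predict each sample from the previous one and compress only the prediction errors. The sketch below is a toy, not how any particular codec works: real codecs such as FLAC use more elaborate linear prediction and dedicated entropy coding. The test signal (a 440 Hz tone with a little noise), the helper names and the use of zlib as the back-end compressor are all assumptions made for the illustration.

```python
import math
import random
import struct
import zlib

def pack16(samples):
    """Pack integers as little-endian signed 16-bit PCM."""
    return struct.pack("<%dh" % len(samples), *samples)

def delta_encode(samples):
    """Keep the first sample, then store each sample minus its predecessor."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(residuals):
    """Invert delta_encode with a running sum."""
    out = [residuals[0]]
    for r in residuals[1:]:
        out.append(out[-1] + r)
    return out

random.seed(0)
# Synthetic stand-in for one second of mono audio at 44.1 kHz.
samples = [int(8000 * math.sin(2 * math.pi * 440 * n / 44100)) + random.randint(-4, 4)
           for n in range(44100)]

raw = pack16(samples)
residuals = delta_encode(samples)

generic_only = zlib.compress(raw, 9)
with_prediction = zlib.compress(pack16(residuals), 9)

# The prediction step is itself lossless: decoding restores every sample.
assert delta_decode(residuals) == samples
print(len(raw), len(generic_only), len(with_prediction))
```

On smooth signals like this one the residuals are small numbers and usually compress better than the raw samples, but the actual figures depend entirely on the input.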

Lossless audio file formats typically have features that generic file compressors lack (but most lossy audio formats possess): for playback they can be read block by block rather than having to unpack the whole file first, and a decoder can pick up the audio mid-stream and play from there (like tuning in to a radio channel). Furthermore, they can be tagged with metadata like artist, album, title, track number etc. Because tags are designed to be altered by users at their discretion, a lossless audio format need only preserve the audio bit-by-bit, not the metadata, although certain lossless codecs can also store the original file's metadata in a separate chunk so that it can be recreated.

Just as two .zip-compressed copies of the same file might differ, for example because of the effort spent on finding a smaller file with the same information (try 7-zip with different compression options), the same original audio file might encode to different sizes depending on both the codec format and the settings used when encoding. The compressor's internal choices could possibly even depend on the CPU, so that the same command given on two different computers produces different files.
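
The same point can be checked with a generic compressor. In this sketch, zlib's compression levels 1 and 9 stand in for a codec's "fast" and "high compression" settings, and the payload is arbitrary made-up data; the two compressed results may differ in size and content, yet both decode to the same original.

```python
import zlib

# Arbitrary made-up payload.
data = bytes(range(256)) * 64 + b"some less regular tail data"

fast = zlib.compress(data, 1)   # least effort
small = zlib.compress(data, 9)  # most effort

# Whatever the setting, decoding must give back the identical payload.
same_payload = zlib.decompress(fast) == zlib.decompress(small) == data
print(len(fast), len(small), fast == small, same_payload)
```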

The term "lossless" is not restricted to files; it also applies to data streams (like the lossless audio in a video file) and to data not stored in files (an audio CD has no files), and furthermore to the process that generates a signal. E.g. reducing a 16-bit signal to 8 bits is not a "lossless" operation, and it does not become lossless even if the output signal is stored in a "lossless" format like FLAC (or even uncompressed .wav or .aiff). MQA is lossy processing even if delivered with a codec that could deliver the lossless signal.
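
The 16-bit to 8-bit example can be sketched in a few lines. The bit-depth reduction here is plain truncation of the low byte (real converters would normally add dither, which is left out for simplicity), the sample values are made up, and zlib again stands in for a lossless storage format.

```python
import struct
import zlib

def to_8_bit(sample16):
    """Reduce a signed 16-bit sample to 8 bits by dropping the low byte."""
    return sample16 >> 8

def back_to_16_bit(sample8):
    """Scale an 8-bit sample back up; the dropped low byte cannot be recovered."""
    return sample8 << 8

original = [0, 1, 255, 256, 1000, -1, -32768, 32767]
reduced = [to_8_bit(s) for s in original]
restored = [back_to_16_bit(s) for s in reduced]

# Storing the reduced signal losslessly preserves it exactly...
packed = struct.pack("<%dh" % len(restored), *restored)
stored_ok = zlib.decompress(zlib.compress(packed)) == packed

# ...but that does not bring back what the reduction threw away.
process_was_lossless = restored == original
print(reduced, restored, stored_ok, process_was_lossless)
```

The storage step is lossless and the processing step is not, which is why the output of a lossy process stays lossy whatever container it is delivered in.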

Notable lossless codecs in current use

Different codecs (i.e. formats and their encoders/decoders) have been developed with different priorities in mind, trading off compressed file size against encoding CPU load (time taken to encode) and decoding CPU load (to play, or to decompress for e.g. creating lossy files for portable use). They also differ in features and in OS/third-party support. Thus there is no single 'superior for all' format. To compare features and performance, see the HA Wiki's Lossless comparison, though arguably performance was more of an issue with the storage/CPU costs of the early 2000s, when most popular lossless formats were launched and when the first versions of the comparison and this article were written.

Some formats in current use - some widespread and available from online music stores, others arguably restricted to the enthusiast user segments - in alphabetical order:

  • Apple Lossless (ALAC). Has support in the Apple ecosystem.
  • Free Lossless Audio Codec (FLAC). Probably the most common and well-supported lossless codec. Highly optimized for light decoding CPU usage.
  • Monkey's Audio (APE). Launched in 2000, it was at the time the compressor for users who prioritized size over speed. Still actively maintained.
  • OptimFROG. Even higher compression (and slower speed) than Monkey's. Optional lossless/lossy hybrid encoding.
  • Tom's verlustfreier Audiokompressor (TAK). More recently launched (2006), it has attracted attention for accomplishing both high speed and high compression levels.
  • The True Audio (TTA). A single-setting compressor performing on the fast side (close to WavPack/FLAC).
  • WavPack. Developed since the 1990s into arguably the most feature-rich lossless codec. Optional lossless/lossy hybrid encoding.

Blu-ray and DVD discs are also widespread, and carry a variety of audio formats, of which the lossless compressed formats are Meridian Lossless Packing (MLP), Dolby TrueHD (uses the MLP algorithm) and DTS-HD MA (hybrid). FFmpeg has support for these.

Other (once) notable formats

These formats were at some stage widely used or otherwise notable, though end-users would hardly encode to them anymore (as of 2022):

  • Shorten (SHN): The major lossless compressor of the 1990s.
  • WMA lossless: Once aggressively pushed by Microsoft, support for the WMA formats has waned to the point where certain Windows 10 releases could not handle WMA lossless(ly). Not recommended.
  • ATRAC Advanced Lossless: a lossless hybrid extension of Sony's ATRAC format (MiniDisc etc.). Like WMA, a once-corporate-backed format now considered legacy.
  • mp3HD: A short-lived similar extension of MP3, hybrid with a lossless correction stream.
  • Real Lossless. Before the Windows Media suite, Real Networks had theirs, and it was expanded with a lossless audio format and a freeware encoder. Real would later support the development of MPEG-4 ALS.
  • MPEG-4 ALS. Despite being an ISO standard, with an open-source encoder/decoder available, the format scarcely caught on. Its predecessors LPAC/LTAC once enjoyed some popularity in competition with Shorten.
  • MPEG-4 SLS. Also ISO-standardized, but hardly in use, and obviously not intended for end-users, witnessed by the pricing of the only known encoder.
  • Lossless Audio (La). Notable for its very high compression levels, and would therefore appear in comparison tests. Unmaintained since 2004.
  • Sac. Only semi-notable for its even higher compression levels, not for ever being practically useful other than for benchmarking.
  • RK Audio (RKAU) and the later general-purpose compressor WinRK. RKAU offered good compression by year-2000 standards.
  • Bonk. Also came with a lossy compressor; both were abandoned around 2002. More notable for the project evolving into the BonkEnc CD ripper, which later changed its name to fre:ac. Bonk itself was redeveloped into a lossy/lossless codec called sonic, which has ffmpeg support.
  • aptX Lossless is a codec to be used in Bluetooth streaming. Hardware support announced September 2021, future popularity unknown at time of writing.

Several audio editing applications also have (or had) their own formats, some of which are still in use.

Oddball legacy formats

There are several old lossless formats that never made it to a significant userbase. Most of those would have disappeared by now, but several are being preserved for posterity at rjamorim's Rarewares/ReallyRareWares website.

  • a-Pac (by sound card manufacturer MARIAN)
  • Advanced Digital Audio (ADA)
  • AudioZip
  • Dakx WAV
  • Entis Lab MIO
  • Kexis
  • LiteWave
  • mkw
  • OggSquish (Xiph, discontinued in favour of FLAC).
  • Pegasus SPS
  • Split2000
  • Sonarc (possibly the first known lossless audio compressor, apparently predating both Shorten and VocPack)
  • VocPack
  • WavArc
  • WaveZip/MUSICompress

And finally, HA will sometimes see codecs created more for educational purposes than with the intention of acquiring a userbase, like SLAC by the WavPack developer and SELA (HA threads).

Further reading