MPC Encoder Functions

From Hydrogenaudio Knowledgebase

Revision as of 16:21, 13 July 2005 by Rjamorim (Talk | contribs)

Quality oriented encoder functions

--ms x (x can be 0, 1, 2)
Sets the mid/side stereo mode (channel coupling): 0 (off), 1 (on) or 2 (enhanced). With 0 there is no channel coupling. With 1, channel coupling is less cautious, which may result in joint stereo artifacts. With 2, channel coupling is more cautious.
M/S coding calculates a "mid" channel from the sum of the left and right channels, (l+r)/2, and a "side" channel from their difference, (l-r)/2. With more mono-like signals, fewer bits are needed to encode the side channel, so the overall bitrate is lower than when encoding the left and right channels separately. If the psychoacoustics work well, there is no audible difference between M/S-coded and L/R-coded files. Mid/side coding in MPC is subband selective: the full 0-22 kHz band is divided into 32 subbands, and the psychoacoustic model decides for each subband whether mid/side coding should be used. This differs from MP3 encoding, where the full frame is either M/S coded or L/R (true stereo) coded, so MP3 mid/side coding is more likely to cause audible artifacts unless tweaked to be very cautious (like nssafejoint in the LAME encoder).
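
The mid/side formulas above can be illustrated with a short Python sketch. It only shows the arithmetic on a few sample values; the function names are made up for this example, and it does not model the per-subband decision or anything else the encoder does internally.

```python
def ms_encode(left, right):
    """Return (mid, side) with mid = (l+r)/2 and side = (l-r)/2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: l = mid + side, r = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


# A nearly mono signal: the channels differ in one sample only.
left = [0.5, -0.25, 0.75, -0.5]
right = [0.5, -0.25, 0.5, -0.5]

mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
```

For this mono-like input the side channel carries far less energy than either original channel, which is why it can be encoded with fewer bits.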

--cvd x (x can be 0, 1)
Sets clear voice detection (CVD) either off (0) or on (1, default).
CVD detects voice-like signals in order to give higher quality with voices or sounds with harmonic spectra. It uses a special analysis to detect harmonics with a varying base frequency. The "normal" psychoacoustics are not able to detect such signals and would add audible noise to them.

--bw x (x can be 0 to 22500)
Defines the maximum frequency bandwidth that can be encoded (the actual frequency response also depends on ATM/ATH). Basically acts like a lowpass filter.

--ltq x (x can be iso, ank, fil)
Selects the level threshold in quiet (LTQ, also called ATH), a hearing threshold curve. It is the sound pressure level (SPL, in dB) below which most people are unable to perceive a sine tone.

--ltq_max x (x can be -99 to 99; recommended: 60 to 99)
Maximum level for ltq, in dB. Default is 83.

--ltq_gain x (x can be -99 to 99; recommended: -12 to 5)
Adds an offset of x dB to the chosen ltq. A negative number makes the hearing curve more sensitive (for more sensitive hearing), but it increases bitrate. With a positive number, fewer bits are needed.
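
As a rough illustration of how an offset and a maximum level could act on a threshold-in-quiet curve, here is a Python sketch. It uses Terhardt's well-known approximation of the absolute threshold of hearing as a stand-in curve. This is an assumption for the example only: the curves actually selected by iso, ank or fil are not reproduced here, and treating the maximum as a simple cap is likewise assumed.

```python
import math


def ath_db(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL.

    Stand-in curve for illustration, not one of the encoder's own ltq curves.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def adjusted_ltq(freq_hz, ltq_gain=0.0, ltq_max=83.0):
    """Shift the curve by ltq_gain dB, then cap it at ltq_max dB (assumed behaviour)."""
    return min(ath_db(freq_hz) + ltq_gain, ltq_max)


base_1k = adjusted_ltq(1000.0)                   # unmodified threshold at 1 kHz
strict_1k = adjusted_ltq(1000.0, ltq_gain=-6.0)  # more sensitive hearing assumed
capped_20k = adjusted_ltq(20000.0)               # steep high-frequency rise hits the cap
```

A negative gain lowers the threshold everywhere, so more of the signal counts as audible and has to be encoded, which matches the bitrate increase described above.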

--ltq_var x (x can be 0, 1)
Adaptive threshold in quiet. Default is 1.

--minSMR x (x can be 0 to 3; recommended: 0)
Sets the minimum SMR (signal-to-mask ratio) over the full bandwidth. The higher the SMR, the higher the quality and bitrate. Setting --minSMR above 0 dB results in full-bandwidth encoding, as in the insane profile.

--tmpmask x (x can be 0, 1)
Sets temporal post-masking on or off. Post-masking saves a few kbit/s: human hearing has to "relax" after a sound event, so the encoder can allow slightly more distortion in the signal during this time.

--nmt x (x can be 0 to 99; recommended: 6 to 16)
Sets the minimum SMR (signal-to-mask ratio) for purely noisy sound. The MPC encoder calculates a masking threshold, and noisy sound has a high masking ability. A subband coder like MPC works by adding noise (quantization error) to each subband. Raising the SMR increases the quantization resolution (less quantization noise, higher bitrate).
This noise should of course stay below the masking threshold so that it is inaudible. Sometimes, however, the quantization noise is audible, because the tonality estimation (which calculates the tonality and "noisiness" of the sound) may conclude that a sound is noisier than it really is. The masking threshold is then higher than it should be: the encoder concludes that more noise can be masked than really can, and the result is audible noise (distortion). This happens because the quantization resolution (bitrate) is lower than it needs to be for the noise to be inaudible. You can compensate for this by raising the SMR for purely noisy sound (nmt), which increases the quantization resolution for noisy sound.

--tmn x (x can be 0 to 99; recommended: 22 to 32)
Sets the minimum SMR for purely sinusoidal sound. Sinusoidal sound is very tonal (not noisy), which means that it does not have much masking capability. The quantization resolution (bitrate) must be high enough for tonal sound to be encoded without audible noise. Normal music of course contains both noisy and tonal sound, so the masking threshold is calculated accordingly. Different quantization resolutions can also be assigned to different frequency regions.
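
The relationship between nmt, tmn and the tonality estimate can be sketched in Python as follows. The linear blend between the two limits is an assumption modelled on common psychoacoustic models, not the confirmed formula used by the encoder, and the function name and the tonality scale from 0 (pure noise) to 1 (pure sinusoid) are hypothetical. The example values 6 and 22 are simply taken from the recommended ranges.

```python
def required_smr_db(tonality, nmt=6.0, tmn=22.0):
    """Blend the noise limit (nmt) and the tone limit (tmn), both in dB.

    tonality: 0.0 for purely noisy sound, 1.0 for purely sinusoidal sound.
    The linear interpolation is an illustrative assumption.
    """
    if not 0.0 <= tonality <= 1.0:
        raise ValueError("tonality must be between 0 and 1")
    return tonality * tmn + (1.0 - tonality) * nmt


# A sound that is really half tonal but is estimated as mostly noise.
true_need = required_smr_db(0.5)               # what the sound would need
underestimated = required_smr_db(0.25)         # what the encoder would use
compensated = required_smr_db(0.25, nmt=12.0)  # same estimate, higher nmt
```

When the sound is judged noisier than it is, the blended SMR comes out too low. Raising nmt lifts the result for noise-like estimates while leaving purely tonal sound unaffected.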

--ans x (x can be 0 to 5; recommended: 5)
Adaptive noise shaping order. 0 means off, 1 to 5 means on. Default is 5.

--shortthr x (x can be for example
Short FFT threshold. Default is 5.

--transdet x (x can be for example
Slew rate for transient detection. Default is 100.

--minval x (x can be 0, 1, 2)
Method for calculating minval. 0: old method, 1: Buschmann method, 2: Klemm method.

Use alternative filterbank clipping solving strategy. Not thoroughly tested.

--scale x (x can be 0.00 to 1.00; recommended: 0.850 to 1.00)
Method to overcome clipping of the encoded file. If clipping occurs, the encoder (mppenc) displays a warning message and recommends a numerical value with which to scale the audio so that the clipping ceases. The value is the factor applied to the audio (1.0 = 100%, 0.85 = 85%). "--scale 1.00" means that no scaling is applied, which is the desired situation.
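
To make the meaning of the scale factor concrete, here is a minimal Python sketch that derives a factor from the peak of a signal and applies it. It is only an illustration: how mppenc arrives at its recommended value is not described here, and the function names and the full-scale level of 1.0 are assumptions for the example.

```python
def suggested_scale(samples, full_scale=1.0):
    """Return a factor <= 1.0 that brings the peak down to full scale."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak <= full_scale:
        return 1.0  # nothing clips, so no scaling is needed
    return full_scale / peak


def apply_scale(samples, scale):
    """Multiply every sample by the scale factor."""
    return [x * scale for x in samples]


clipping = [0.5, -1.25, 1.0]  # one sample exceeds full scale
scale = suggested_scale(clipping)
scaled = apply_scale(clipping, scale)
```

With a peak of 1.25 the sketch yields a factor of 0.8, meaning the audio is reduced to 80% of its original level. A signal that never exceeds full scale yields 1.0.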

--fadeshape x (x can be 1 to 99; recommended: 1 to 10)
Sets the fading scheme used. "Small values are first fast fading then slow fading. Large values are the opposite." - Frank Klemm

--fadein x / --fadeout x (x can be 0 to 99; recommended: 0 to 5)
The encoder (mppenc) offers simple audio processing via fading. x is the number of seconds over which the encoder fades the music in or out. Useful for encoding live recordings when the whole recording will not be used (e.g. tracklisting, playlists, seamless listening, shuffled music, etc.).
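
What a fade of x seconds means in terms of samples can be sketched in Python as below. The sketch uses a plain linear ramp, which is an assumption: the actual curve depends on --fadeshape and is not modelled here. The function name and parameters are hypothetical.

```python
def apply_fades(samples, sample_rate, fadein_s=0.0, fadeout_s=0.0):
    """Apply a linear fade-in and fade-out, each given in seconds."""
    n = len(samples)
    n_in = min(int(fadein_s * sample_rate), n)    # samples covered by the fade-in
    n_out = min(int(fadeout_s * sample_rate), n)  # samples covered by the fade-out
    out = list(samples)
    for i in range(n_in):
        out[i] *= i / n_in                        # gain rises from 0 towards 1
    for i in range(n_out):
        out[n - 1 - i] *= i / n_out               # gain falls to 0 at the last sample
    return out


# Two seconds of constant signal at a toy sample rate of 4 Hz.
faded = apply_fades([1.0] * 8, sample_rate=4, fadein_s=1.0, fadeout_s=1.0)
```

The fade length in samples is simply the duration in seconds multiplied by the sample rate, so a 5 second fade at 44100 Hz spans 220500 samples.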

--start x / --skip x (x can be 0 to 99; recommended: 0 to 5)
Sets the number of seconds at the start of the source file that the encoder will not process (e.g. "--start 4" skips the first 4 seconds of the source file, so encoding begins after the fourth second). "--start" and "--skip" do the same thing.