There are two stereo modes, "stereo" and "jstereo". "stereo" is just the normal independent coding of the left & right channels. "jstereo" means the frames may use normal stereo, mid/side stereo or intensity stereo. FhG only seems to use intensity stereo at bitrates of 80 kbps and lower. LAME does not have intensity stereo capability. In jstereo mode, the encoder has to decide for each frame whether it should be encoded as normal stereo or mid/side stereo.
Mid/side stereo encodes the mid and side channels instead of left and right. It allocates more bits to the mid channel than the side channel. For signals without a lot of stereo separation, there will be very little information in the side channel, so the bits saved there can be spent on the mid channel. If the left & right channels differ by a lot, then the side channel will contain a lot of information. Errors in encoding this information will show up as noise in *both* the left and right channels after decoding.
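As a rough illustration of the last point, here is a minimal Python sketch of a mid/side transform and its inverse. The 1/sqrt(2) scaling and the function names ms_encode/ms_decode are assumptions made for this example, not taken from the LAME source. It perturbs only the side channel and shows that the error lands in both decoded channels.

```python
import math

SQRT2 = math.sqrt(2.0)

def ms_encode(left, right):
    """Left/right -> mid/side. The 1/sqrt(2) scaling is an assumption."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Mid/side -> left/right (inverse of ms_encode)."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

# Nearly identical channels: the side channel is almost empty.
left = [0.50, 0.25, -0.75, 1.00]
right = [0.50, 0.25, -0.75, 0.90]
mid, side = ms_encode(left, right)

# Inject an error into the side channel only, then decode.
side_err = 0.1
noisy_side = [s + side_err for s in side]
dec_left, dec_right = ms_decode(mid, noisy_side)

# The error appears in BOTH channels, with opposite sign.
left_errors = [d - l for d, l in zip(dec_left, left)]
right_errors = [d - r for d, r in zip(dec_right, right)]
```

With this scaling, an error e in the side channel becomes +e/sqrt(2) in the left channel and -e/sqrt(2) in the right channel.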
The LAME mid/side switching criterion and mid/side masking thresholds are taken from:
Johnston and Ferreira, Sum-Difference Stereo Transform Coding, Proc. IEEE ICASSP (1992) p 569-571.
The MPEG AAC standard claims to use mid/side encoding based on this
paper.
I believe the idea behind this is the following: If one channel has
much less noise masking in a certain band, then masked noise in one channel
that is spread to the other channel (by mid/side stereo) may no longer
be masked. If both channels have the same masking, then the noise
spread between both channels will be equally well masked.
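A small numeric sketch of this reasoning, in Python. It assumes a mid/side transform scaled by 1/sqrt(2), so that noise energy in the side channel splits evenly between left and right. The function name and the threshold values are made up for illustration. This is not the switching criterion from the paper or from LAME.

```python
def spread_noise_is_masked(side_noise_energy, thr_left, thr_right):
    """Check whether side-channel noise stays masked in each channel.

    Assumes an orthonormal mid/side transform, so half of the side
    noise energy ends up in each of the left and right channels.
    Returns a (left_masked, right_masked) pair of booleans.
    """
    per_channel = side_noise_energy / 2.0
    return per_channel <= thr_left, per_channel <= thr_right

# Equal masking in both channels: the spread noise is masked in both.
equal_case = spread_noise_is_masked(0.1, thr_left=1.0, thr_right=1.0)

# Right channel has much less masking in this band: the same noise
# is still masked on the left but becomes audible on the right.
unequal_case = spread_noise_is_masked(0.1, thr_left=1.0, thr_right=0.01)
```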
regular stereo frames:

Fools.wav: (1180 frames)
    FhG:       frames 794-805,903
    new LAME:  frames 794-804,870,903,967,1018
    old LAME:  over 500 frames used regular stereo

IfYouCould.wav: (80 frames)
    FhG:       44,52,61
    new LAME:  43,44,52,60,61 (like FhG, 2 extra)
    old LAME:  34,63,66,67 (completely unlike FhG)

mstest.wav: (156 frames) (from Scott Miller)
    FhG:       138 frames use regular stereo
    new LAME:  139 frames use regular stereo
    old LAME:  8 frames use regular stereo

t1.wav: (160 frames) (from Nils Faerber)
    FhG:       40-43, 81-84, 122-125, 145-151
    new LAME:  39-42, 81-84, 121-125, 149
    old LAME:  constant inappropriate toggling of ms_stereo

Castanets.wav: (253 frames)
    All encoders use ms_stereo for all frames

else3.wav: (217 frames)
    All encoders use ms_stereo for all frames

What's done right now? Without the -h option, LAME jstereo only
computes L & R masking thresholds. If it is encoding a non-ms_stereo
frame, no problem. If it is encoding Mid & Side channels, then
we have to be a little careful. We are quantizing the Mid/Side channels,
but the masking (allowed distortion) is given for the L & R channels. Thus
the audible distortion also has to be computed on the L & R channels.
This just involves reconstructing the L/R MDCT coefficients and the
de-quantized L/R coefficients from their Mid/Side counterparts, and is
done in calc_noise2.
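
The idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual calc_noise2 code: the uniform quantizer, the 1/sqrt(2) scaling, the function names and the threshold values are all hypothetical. It quantizes Mid/Side coefficients, converts both the original and the de-quantized coefficients back to L/R, and measures the noise there, where the masking thresholds are defined.

```python
import math

SQRT2 = math.sqrt(2.0)

def ms_to_lr(mid, side):
    """Mid/side -> left/right, assuming 1/sqrt(2) scaling."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

def quantize(coeffs, step):
    """Toy uniform quantize + de-quantize (NOT LAME's quantizer)."""
    return [round(c / step) * step for c in coeffs]

def lr_noise_energy(mid, side, step_mid, step_side):
    """Quantization noise energy measured on the L and R channels."""
    left, right = ms_to_lr(mid, side)
    q_left, q_right = ms_to_lr(quantize(mid, step_mid),
                               quantize(side, step_side))
    noise_left = sum((a - b) ** 2 for a, b in zip(left, q_left))
    noise_right = sum((a - b) ** 2 for a, b in zip(right, q_right))
    return noise_left, noise_right

# Hypothetical MDCT coefficients for one band, in Mid/Side form.
mid = [1.30, -0.72, 0.41, 0.04]
side = [0.11, -0.03, 0.02, 0.00]
noise_left, noise_right = lr_noise_energy(mid, side,
                                          step_mid=0.1, step_side=0.1)

# Compare against (made-up) L/R masking thresholds for the band.
thr_left, thr_right = 0.01, 0.01
audible = (noise_left > thr_left, noise_right > thr_right)
```

Because the distortion is measured after converting back to L/R, it can be compared directly with the L & R masking thresholds, even though the quantization itself happened on the Mid/Side channels.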