[FFmpeg-devel] [PATCH] lavc/ac3dec: disable DRC by default

Wed Jan 29 15:48:36 EET 2020

This issue has been argued before, with the status quo being preserved under
the logic that the spec says this parameter is supposed to default to 1,
and that we should follow the spec. The spec was misguided, and thus so was
blindly following it.

The (E-)AC3 DRC architecture is a clever optimization: the (relatively) heavy
lifting of analyzing the audio data and calculating the per-block gain needed
is offloaded to the encoder, so that the player only needs to apply the gain
coefficients (at the desired scale) to get the DRC effect. In theory, this
seems like an efficient way to reduce the computational complexity of playback.
In practice, this isn't actually the case.

Actual (E-)AC3 is not guaranteed to have useful DRC metadata at all. For
instance, ffmpeg's encoder has never supported generating it. This means that
a user who wants DRC can't merely use the internal filter for (E-)AC3 and an
external one for other codecs; they have to use an external filter at al
times! They could also apply the internal filter, but it's hard to see what
benefit this would give.

I've also seen it argued that (E-)AC3 DRC does a better job of reproducing the
intent of the audio mastering engineer than playback-time DRC can. I don't
believe this argument is credible either. The official encoder does support
passing in custom gain values, but it doesn't appear that this capability is
frequently used. I'm not aware of any tooling that actually passes custom data
into the encoder's loudness parameters, and Dolby's first-party GUI tools don't
appear to support doing so. Dolby's own tools expose the parameters available in
the encoder itself, which comprise a set of 5 presets with no customizability:
"Film Standard", "Film Light", "Music Standard", "Music Light", and "Speech",
along with the null profile. These profiles are documented (minus their time
constants) in this PDF on the Dolby website:
https://www.dolby.com/us/en/technologies/a-guide-to-dolby-metadata.pdf
These are not advanced filters; they don't provide any complex control to the
content producer, who can't even tune the details or _know_ the time constants.
Even if the mastering engineer did want to customize the DRC with some custom
tool, in most cases I don't think they'd have the ability to do so because they
don't have access to the point in the pipeline where audio encoding occurs.
In TV, audio is often encoded live rather than streaming a pre-compressed
bitstream. In web streaming, masters are usually delivered with uncompressed
PCM audio and no loudness metadata, and encoded using stock configurations.
Film for DVD or Blu-Ray is the only plausible case, but again, I'm not aware
of any widely-used tooling for this with any more configurability than Dolby's.

Additional context: the AudioToolbox (E-)AC3 decoder (which is Dolby's) seems
not to apply DRC; you can see this by comparing the output of ffmpeg's decoder
with drc_scale set to 0 and 1 with the output of the (e)ac3_at decoder.

So to summarize:
- (E-)AC3 DRC is designed as an optimization for the decoder
- There's no inherent reason (E-)AC3 _requires_ DRC for playback more than any
  other codec
- Due to inconsistent encoding, it can't be relied upon
- There's no reason to believe it's better than decode-side DRC when it works
- It's not universally applied by other software

I think this makes a compelling case to stop applying (E-)AC3 DRC by default.
The user can enable it on their own if they prefer, but I believe having it off
by default will improve the apparent quality of (E-)AC3 playback for the vast
majority of ffmpeg users, and improve the consistency between (E-)AC3 playback
and playback of any other codec. Thus, this patch.
---
 libavcodec/ac3dec_fixed.c | 2 +-
 libavcodec/ac3dec_float.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/ac3dec_fixed.c b/libavcodec/ac3dec_fixed.c
index bd66175d50..312616edfd 100644
--- a/libavcodec/ac3dec_fixed.c
+++ b/libavcodec/ac3dec_fixed.c
@@ -169,7 +169,7 @@ static void ac3_downmix_c_fixed16(int16_t **samples, int16_t **matrix,
 
 static const AVOption options[] = {
     { "cons_noisegen", "enable consistent noise generation", OFFSET(consistent_noise_generation), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, PAR },
-    { "drc_scale", "percentage of dynamic range compression to apply", OFFSET(drc_scale), AV_OPT_TYPE_FLOAT, {.dbl = 1.0}, 0.0, 6.0, PAR },
+    { "drc_scale", "percentage of dynamic range compression to apply", OFFSET(drc_scale), AV_OPT_TYPE_FLOAT, {.dbl = 0.0}, 0.0, 6.0, PAR },
     { "heavy_compr", "enable heavy dynamic range compression", OFFSET(heavy_compression), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, PAR },
     { NULL},
 };
diff --git a/libavcodec/ac3dec_float.c b/libavcodec/ac3dec_float.c
index b85a4ce336..69bd47aa6a 100644
--- a/libavcodec/ac3dec_float.c
+++ b/libavcodec/ac3dec_float.c
@@ -33,7 +33,7 @@
 
 static const AVOption options[] = {
     { "cons_noisegen", "enable consistent noise generation", OFFSET(consistent_noise_generation), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, PAR },
-    { "drc_scale", "percentage of dynamic range compression to apply", OFFSET(drc_scale), AV_OPT_TYPE_FLOAT, {.dbl = 1.0}, 0.0, 6.0, PAR },
+    { "drc_scale", "percentage of dynamic range compression to apply", OFFSET(drc_scale), AV_OPT_TYPE_FLOAT, {.dbl = 0.0}, 0.0, 6.0, PAR },
     { "heavy_compr", "enable heavy dynamic range compression", OFFSET(heavy_compression), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, PAR },
     { "target_level", "target level in -dBFS (0 not applied)", OFFSET(target_level), AV_OPT_TYPE_INT, {.i64 = 0 }, -31, 0, PAR },
 
-- 
2.24.1