[FFmpeg-devel] One pass volume normalization (ebur128)
Paul B Mahol
onemda at gmail.com
Sun Jul 14 00:39:29 CEST 2013
On 7/13/13, Jan Ehrhardt <phpdev at ehrhardt.nl> wrote:
> Nicolas George in gmane.comp.video.ffmpeg.devel (Sat, 13 Jul 2013
> 21:41:52 +0200):
>>On quintidi 25 Messidor, year CCXXI, Jan Ehrhardt wrote:
>>> Subject: [FFmpeg-devel] One pass volume normalization (ebur128)
>>
>>Single-pass volume normalization is not possible, please do not call the
>>feature that way.
>
> Call it what you like. I am using it in a single pass transcode. Just
> like the -af volnorm filter in MEncoder.
>
>>r128.I is not a good choice, but there is nothing better yet.
>
> You can use any of the r128 variables that are inserted into the metadata.
>
>>Missing documentation update.
>
> I know.
>
>>> @@ -51,18 +51,24 @@ static const AVOption volume_options[] = {
>>> { "fixed", "select 8-bit fixed-point", 0,
>>> AV_OPT_TYPE_CONST, { .i64 = PRECISION_FIXED }, INT_MIN, INT_MAX, A|F,
>>> "precision" },
>>> { "float", "select 32-bit floating-point", 0,
>>> AV_OPT_TYPE_CONST, { .i64 = PRECISION_FLOAT }, INT_MIN, INT_MAX, A|F,
>>> "precision" },
>>> { "double", "select 64-bit floating-point", 0,
>>> AV_OPT_TYPE_CONST, { .i64 = PRECISION_DOUBLE }, INT_MIN, INT_MAX, A|F,
>>> "precision" },
>>
>>> + { "metadata", "set the metadata key for loudness normalization",
>>> OFFSET(metadata), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = A|F },
>>
>>Inconsistent indentation.
>
> Not really. If you look at the original you will see that fixed, float
> and double are values for the precision option.
>
>>> + if (vol->metadata) {
>>> + double loudness, new_volume, timestamp, mx;
>>> + AVDictionaryEntry *e;
>>> + mx = 20;
>>> + timestamp = (float)(1.0 * buf->pts / outlink->sample_rate);
>>> + mx = fmin(mx, timestamp);
>>> + e = av_dict_get(buf->metadata, vol->metadata, NULL, 0);
>>> + if (e) {
>>> + loudness = av_strtod(e->value, NULL);
>>> + if (loudness > -69) {
>>> + new_volume = fmax(-mx,fmin(mx,(-23 - loudness)));
>>> + av_log(NULL, AV_LOG_VERBOSE, "loudness=%f => %f => volume=%f\n",
>>> + loudness, new_volume, pow(10, new_volume / 20));
>>> + set_fixed_volume(vol, pow(10, new_volume / 20));
>>> + }
>>
>>This paragraph has several problems. First, it is missing spaces around
>>words, that is easy to fix.
>
> ACK.
>
>>Second, it has a duplicated mathematical formula, which is pretty much a
>>recipe for inconsistency. That is easy to fix too.
>
> ACK.
>
>>Third, it has several hardcoded values, and that is not good design.
>
> Two of the three hardcoded values should be hardcoded. The -23 is part
> of the EBU R128 specs: http://tech.ebu.ch/loudness
>
> The -69 was suggested by Clement. If there is no sound at all, the volume
> level seems to be reported as -71 or something like that. -69 means
> there is sound (with a very low volume).
>
> The 20 is indeed an arbitrary choice, to limit the volume adjustment
> during the first 20 seconds of a video.
>
>>It seems to me that using an expression, evaluated each time the metadata
>>value changes and with that value available as a variable would be a much
>>nicer design.
>
> I agree, but this is a little above my head.
>
>>AFAIK, this is unneeded since the "evil plan".
>
> I do not even know what the "evil plan" is...
It means that the code is not needed any more. Just remove the whole function.
>
>>> diff --git a/libavfilter/f_ebur128.c b/libavfilter/f_ebur128.c
>>> index 88d37e8..f4ce6d9 100644
>>> --- a/libavfilter/f_ebur128.c
>>> +++ b/libavfilter/f_ebur128.c
>>
>>Unrelated.
>
> Not quite either. f_ebur128.c hardcodes the log level to verbose if the
> metadata are set. You do not want to see the intermediate metadata if
> you do a 'one pass' transcode. If needed you can always set the
> loglevel to view them.
>
> Jan
>