[FFmpeg-devel] [PATCH 2/3] avfilter/af_volumedetect.c: Add 32bit float audio support
Paul B Mahol
onemda at gmail.com
Tue Jun 18 10:48:51 EEST 2024
On Tue, Jun 18, 2024 at 8:56 AM Rémi Denis-Courmont <remi at remlab.net> wrote:
>
>
> Le 17 juin 2024 19:52:10 GMT+02:00, Paul B Mahol <onemda at gmail.com> a
> écrit :
> >On Mon, Jun 17, 2024 at 4:52 PM Rémi Denis-Courmont <remi at remlab.net>
> wrote:
> >
> >>
> >>
> >> Le 17 juin 2024 13:18:11 GMT+02:00, Yigithan Yigit <
> >> yigithanyigitdevel at gmail.com> a écrit :
> >> >---
> >> > libavfilter/af_volumedetect.c | 159 ++++++++++++++++++++++++++++------
> >> > 1 file changed, 133 insertions(+), 26 deletions(-)
> >> >
> >> >diff --git a/libavfilter/af_volumedetect.c
> b/libavfilter/af_volumedetect.c
> >> >index 327801a7f9..dbbcd037a5 100644
> >> >--- a/libavfilter/af_volumedetect.c
> >> >+++ b/libavfilter/af_volumedetect.c
> >> >@@ -20,27 +20,51 @@
> >> >
> >> > #include "libavutil/channel_layout.h"
> >> > #include "libavutil/avassert.h"
> >> >+#include "libavutil/mem.h"
> >> > #include "audio.h"
> >> > #include "avfilter.h"
> >> > #include "internal.h"
> >> >
> >> >+#define MAX_DB_FLT 1024
> >> > #define MAX_DB 91
> >> >+#define HISTOGRAM_SIZE 0x10000
> >> >+#define HISTOGRAM_SIZE_FLT (MAX_DB_FLT*2)
> >> >
> >> > typedef struct VolDetectContext {
> >> >- /**
> >> >- * Number of samples at each PCM value.
> >> >- * histogram[0x8000 + i] is the number of samples at value i.
> >> >- * The extra element is there for symmetry.
> >> >- */
> >> >- uint64_t histogram[0x10001];
> >> >+ uint64_t* histogram; ///< for integer number of samples at each
> PCM
> >> value, for float number of samples at each dB
> >> >+ uint64_t nb_samples; ///< number of samples
> >> >+ double sum2; ///< sum of the squares of the samples
> >> >+ double max; ///< maximum sample value
> >> >+ int is_float; ///< true if the input is in floating point
> >> > } VolDetectContext;
> >> >
> >> >-static inline double logdb(uint64_t v)
> >> >+static inline double logdb(double v, enum AVSampleFormat sample_fmt)
> >> > {
> >> >- double d = v / (double)(0x8000 * 0x8000);
> >> >- if (!v)
> >> >- return MAX_DB;
> >> >- return -log10(d) * 10;
> >> >+ if (sample_fmt == AV_SAMPLE_FMT_FLT) {
> >> >+ if (!v)
> >> >+ return MAX_DB_FLT;
> >> >+ return -log10(v) * 10;
> >> >+ } else {
> >> >+ double d = v / (double)(0x8000 * 0x8000);
> >> >+ if (!v)
> >> >+ return MAX_DB;
> >> >+ return -log10(d) * 10;
> >> >+ }
> >> >+}
> >> >+
> >> >+static void update_float_stats(VolDetectContext *vd, float
> *audio_data)
> >> >+{
> >> >+ double sample;
> >> >+ int idx;
> >> >+ if(!isnormal(*audio_data))
> >> >+ return;
> >>
> >> Do we really need to classify floats here? That's probably going to hurt
> >> perfs badly, and makes an otherwise very vectorisable function not so
> >> easily vectored.
> >>
> >
> >This is fast, it should translate to checking few bits of memory.
>
> Sure but the branch is what irks me here, not the classification per se.
> And I don't get why it's needed here, where most of the code base seems to
> assume that floats are always numeric. It's also not clear why subnormals
> are disallowed here.
>
HUGE floats get out of range easily, there is probably nicer way to add
them to some kind of "non-uniform non-linear" histogram.
>
> IMO all that needs justification in the commit message which I find
> lacking. Or if it's unjustified then it shouldn't be there.
>
> >> >+ sample = fabsf(*audio_data);
> >> >+ if (sample > vd->max)
> >> >+ vd->max = sample;
> >> >+ vd->sum2 += sample * sample;
> >> >+ idx = lrintf(floorf(logdb(sample * sample, AV_SAMPLE_FMT_FLT))) +
> >> MAX_DB_FLT;
> >>
> >> You're recomputing the same value again, and you seem to be rounding
> twice
> >> in a row?
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>
More information about the ffmpeg-devel
mailing list