[FFmpeg-devel] [PATCH v2 1/2] avfilter/vf_blackdetect: add AVX2 SIMD version

Kacper Michajlow kasper93 at gmail.com
Fri Jul 18 17:16:24 EEST 2025


On Fri, 18 Jul 2025 at 15:33, Kieran Kunhya via ffmpeg-devel
<ffmpeg-devel at ffmpeg.org> wrote:
>
> On Fri, Jul 18, 2025 at 2:22 PM Kacper Michajlow <kasper93 at gmail.com> wrote:
> >
> > On Fri, 18 Jul 2025 at 14:46, Kieran Kunhya via ffmpeg-devel
> > <ffmpeg-devel at ffmpeg.org> wrote:
> > >
> > > On Fri, Jul 18, 2025 at 1:41 PM Kacper Michajlow <kasper93 at gmail.com> wrote:
> > > >
> > > > On Fri, 18 Jul 2025 at 14:14, Kieran Kunhya via ffmpeg-devel
> > > > <ffmpeg-devel at ffmpeg.org> wrote:
> > > > >
> > > > > > blackdetect8_c:                                        820.8 ( 1.00x)
> > > > > > blackdetect8_avx2:                                     219.2 ( 3.74x)
> > > > > > blackdetect16_c:                                       372.8 ( 1.00x)
> > > > > > blackdetect16_avx2:                                    201.4 ( 1.85x)
> > > > > >
> > > > > > Again, sorry for being pedantic here, but it gives the wrong
> > > > > > impression especially if you look at this from outside.
> > > > >
> > > > > Also misleading as far as I understand because GCC doesn't have
> > > > > runtime detection like FFmpeg.
> > > >
> > > > Speaking of which... GCC actually does have runtime detection. All you
> > > > have to do is mark the function with `target_clones`, listing the
> > > > requested architectures, and it will automatically dispatch to the
> > > > best version at runtime.
> > > >
> > > > See for more information:
> > > > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-target_005fclones-function-attribute
> > >
> > > It's not as sophisticated as our runtime detection (e.g. avx512 vs
> > > avx512icl, which we support).
> > > Comparing C vs autovectorised code that works only on some platforms
> > > with forced compilation settings is also unfair.
> >
> > In my original message clang build was completely default, no forced options.
> >
> > Handwritten avx512 also works on this specific platform. So comparing
> > this to autovectorized code (that works on exactly the same platform)
> > as a baseline makes sense. Furthermore autovectorized code can scale
> > onto more platforms than handwritten avx512. IMHO comparing things in
> > the same domain makes more sense.
> >
> > The point of my message was that we should have defined a baseline
> > target, if it is GCC without autovectorization, so be it. But it
> > should be specified and not implied in the commit description that the
> > compared result is autovectorized.
> >
> > To be honest, I agree with you. It's misleading and unfair, so we
> > shouldn't make any comparisons. This is not only limited to
> > autovectorization, scalar code generation also differs. It just
> > happens to give the biggest difference.
> >
> > Context matters; saying "C code performance" is vague. I'm not saying
> > one way is better than the other, but it doesn't cost anything to
> > specify it better to avoid miscommunication.
>
> It's not fair to compare autovectorised output that's AVX512 that will
> be called *on any system with AVX512 support including ones that
> downclock heavily* with AVX512(ICL) checked properly in FFmpeg to run
> on only non-downclocking systems.

How to compile FFmpeg for best performance on their target platform is
the customer's/user's decision. Also note that you brought up avx512;
while I agree on the issues with it, I'm commenting on the AVX2 patch.
I wanted to make a general comment about the performance metric we
share; diving into avx512 issues is kind of a separate topic.

I guess C code performance can vary a lot between compilers,
optimization flags, and platforms. We should be specific about what
our "x figure" means, or else it's just a number in a void. There are
cases where "fully optimized" generated code is terrible, as in some
recent cases (not this one, though), and then it's cool to point this
out. But if you put different constraints on the compiler-generated
code, it makes the comparison unnecessarily confusing. Whatever that
means, but I think you know what I'm trying to say.
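
For reference, a minimal sketch of the `target_clones` dispatch
mentioned earlier in the thread. The function name `count_dark_pixels`,
the `MULTIVERSION` macro, and the loop itself are hypothetical
illustrations, not code from the patch or from vf_blackdetect. The
attribute is only applied where ifunc-based dispatch is assumed to be
available (GCC-compatible compiler, x86-64, glibc) and expands to
nothing elsewhere:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: function multiversioning needs ifunc support, so only
 * enable it on x86-64 with glibc and a GCC-compatible compiler.
 * Elsewhere the macro is empty and a single plain version is built. */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical example function: counts samples at or below a
 * threshold. With target_clones the compiler builds one copy per
 * listed target and dispatches between them based on the CPU. */
MULTIVERSION
unsigned count_dark_pixels(const uint8_t *src, size_t len, unsigned threshold)
{
    unsigned count = 0;
    for (size_t i = 0; i < len; i++)
        count += src[i] <= threshold;
    return count;
}
```

Callers just call `count_dark_pixels()` as usual. Whether the AVX2
clone actually ends up vectorized still depends on the compiler and
the optimization level, which is the same "what is the baseline"
question as above.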

- Kacper

