[FFmpeg-devel] [PATCH v2 00/15] avfilter/vf_bwdif: Add aarch64 neon functions

John Cox jc at kynesim.co.uk
Mon Jul 3 11:44:36 EEST 2023


On Mon, 3 Jul 2023 00:09:52 +0300 (EEST), you wrote:

>On Sun, 2 Jul 2023, John Cox wrote:
>
>> Also adds a filter_line3 method which on aarch64 neon yields approx 30%
>> speedup over 2xfilter_line and a memcpy
>>
>> Differences from v1:
>> .align 16 corrected to .balign 16
>> SXTW tolower
>> Mac ABI (hopefully) fixed
>> V register pop/push macroed & prettified
>>
>> John Cox (15):
>>  avfilter/vf_bwdif: Add outline for aarch neon functions
>>  avfilter/vf_bwdif: Add common macros and consts for aarch64 neon
>>  avfilter/vf_bwdif: Export C filter_intra
>>  avfilter/vf_bwdif: Add neon for filter_intra
>>  tests/checkasm: Add test for vf_bwdif filter_intra
>>  avfilter/vf_bwdif: Add clip and spatial macros for aarch64 neon
>>  avfilter/vf_bwdif: Export C filter_edge
>>  avfilter/vf_bwdif: Add neon for filter_edge
>>  tests/checkasm: Add test for vf_bwdif filter_edge
>>  avfilter/vf_bwdif: Export C filter_line
>>  avfilter/vf_bwdif: Add neon for filter_line
>>  avfilter/vf_bwdif: Add a filter_line3 method for optimisation
>>  avfilter/vf_bwdif: Add neon for filter_line3
>>  tests/checkasm: Add test for vf_bwdif filter_line3
>>  avfilter/vf_bwdif: Block filter slices into a multiple of 4 lines
>
>Overall, I'd suggest squashing/reordering the patches like this:
>
>- tests/checkasm: Add test for vf_bwdif filter_intra
>- avfilter/vf_bwdif: Add neon for filter_intra
>   (With the preceding patches squashed. For extra common macros, only add
>   the ones you use in this patch here.)
>- tests/checkasm: Add test for vf_bwdif filter_edge
>- avfilter/vf_bwdif: Add neon for filter_edge (with other dependencies
>   squashed)
>- avfilter/vf_bwdif: Add neon for filter_line
>- avfilter/vf_bwdif: Add a filter_line3 method for optimisation
>   + checkasm test squashed
>- avfilter/vf_bwdif: Add neon for filter_line3

I'm happy with that if everyone else is; it is easier to merge patches
than to take them apart.
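
For anyone reading the thread without the patches to hand, the filter_line3
idea from the cover letter (one call replacing 2xfilter_line and a memcpy)
can be sketched roughly as below. This is only an illustration under stated
assumptions: the toy_* names are hypothetical, the "filter" is a plain
average of the rows above and below rather than the real bwdif kernel, and
the comment about shared row loads is one plausible source of the gain, not
a description of the actual neon code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for filter_line (hypothetical, NOT the bwdif kernel): rebuild one
 * missing row as the rounded average of the source rows above and below. */
static void toy_filter_line(uint8_t *dst, const uint8_t *cur,
                            ptrdiff_t stride, int w)
{
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((cur[x - stride] + cur[x + stride] + 1) >> 1);
}

/* Baseline pattern: three output rows from two filter calls and a memcpy. */
static void toy_rows_separate(uint8_t *dst, ptrdiff_t d_stride,
                              const uint8_t *cur, ptrdiff_t s_stride, int w)
{
    toy_filter_line(dst, cur, s_stride, w);                 /* row 0: filtered */
    memcpy(dst + d_stride, cur + s_stride, (size_t)w);      /* row 1: copied   */
    toy_filter_line(dst + 2 * d_stride, cur + 2 * s_stride, /* row 2: filtered */
                    s_stride, w);
}

/* Fused pattern: the same three rows in one pass. Each source row is read
 * once per column and reused, instead of being re-read by separate calls. */
static void toy_filter_line3(uint8_t *dst, ptrdiff_t d_stride,
                             const uint8_t *cur, ptrdiff_t s_stride, int w)
{
    const uint8_t *r0 = cur - s_stride;     /* row above output row 0 */
    const uint8_t *r1 = cur + s_stride;     /* row copied to output row 1 */
    const uint8_t *r2 = cur + 3 * s_stride; /* row below output row 2 */

    for (int x = 0; x < w; x++) {
        const int a = r0[x], b = r1[x], c = r2[x];
        dst[x]                = (uint8_t)((a + b + 1) >> 1);
        dst[d_stride + x]     = (uint8_t)b;
        dst[2 * d_stride + x] = (uint8_t)((b + c + 1) >> 1);
    }
}
```

Both toy functions produce identical output for the same input, which is the
property a checkasm-style test would compare between a C reference and an
optimised version.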

JC

>// Martin

