[FFmpeg-devel] [PATCH v7] libavfilter/x86/vf_convolution: add sobel filter optimization and unit test with intel AVX512 VNNI

Wang, Bin bin.wang at intel.com
Mon Nov 14 15:30:24 EET 2022


> By using xmm# you're not taking into account any x86inc SWAPing, so this is
> using xmm0 and xmm1 where the single scalar float input arguments reside (at
> least on unix64), instead of xm0 and xm1 (xmm16 and xmm17) where the
> broadcasted scalars were stored.
> This, again, only worked by chance on unix64 because you're using scalar fmadd,
> and shouldn't work at all on win64.
> 
> Also, all these as is are being encoded as VEX, not EVEX, but it should be fine
> leaving them untouched instead of using xm#, since they will be shorter (five
> bytes instead of six for some) by using the lower, non callee-saved regs.

Thanks for the help. I'm not familiar with WIN64 asm, so I want to confirm: what I need to do is change the WIN64 swap from:
SWAP xmm0, xmm2
SWAP xmm1, xmm3
To:
VBROADCASTSS m0, xmm2
VBROADCASTSS m1, xmm3

Is that correct?


