[FFmpeg-devel] [FFFjo] [FFmpeg/FFmpeg] swscale: Implement neon assembly for yuv2nv12cx and yuv2planeX_10 (PR #20028)
dashsantosh-mcw
code at ffmpeg.org
Thu Jul 24 10:13:23 EEST 2025
Checkasm benchmark results (speedup relative to the C implementation in parentheses):
yuv2nv12cX_2_512_accurate_c: 3496.2 ( 1.00x)
yuv2nv12cX_2_512_accurate_neon: 409.5 ( 8.54x)
yuv2nv12cX_2_512_approximate_c: 3495.1 ( 1.00x)
yuv2nv12cX_2_512_approximate_neon: 409.4 ( 8.54x)
yuv2nv12cX_4_512_accurate_c: 4676.5 ( 1.00x)
yuv2nv12cX_4_512_accurate_neon: 613.1 ( 7.63x)
yuv2nv12cX_4_512_approximate_c: 4677.8 ( 1.00x)
yuv2nv12cX_4_512_approximate_neon: 607.8 ( 7.70x)
yuv2nv12cX_8_512_accurate_c: 7221.6 ( 1.00x)
yuv2nv12cX_8_512_accurate_neon: 1003.8 ( 7.19x)
yuv2nv12cX_8_512_approximate_c: 7221.2 ( 1.00x)
yuv2nv12cX_8_512_approximate_neon: 1016.4 ( 7.11x)
yuv2nv12cX_16_512_accurate_c: 13731.1 ( 1.00x)
yuv2nv12cX_16_512_accurate_neon: 1757.2 ( 7.81x)
yuv2nv12cX_16_512_approximate_c: 13740.7 ( 1.00x)
yuv2nv12cX_16_512_approximate_neon: 1757.3 ( 7.82x)
yuv2yuvX_10_LE_16_0_512_accurate_c: 7836.9 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_accurate_neon: 840.4 ( 9.33x)
yuv2yuvX_10_LE_16_0_512_approximate_c: 7930.8 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_approximate_neon: 838.5 ( 9.46x)
yuv2yuvX_10_LE_16_16_512_accurate_c: 7594.3 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_accurate_neon: 815.2 ( 9.32x)
yuv2yuvX_10_LE_16_16_512_approximate_c: 7687.0 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_approximate_neon: 811.9 ( 9.47x)
yuv2yuvX_10_LE_16_32_512_accurate_c: 7366.4 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_accurate_neon: 785.8 ( 9.37x)
yuv2yuvX_10_LE_16_32_512_approximate_c: 7426.5 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_approximate_neon: 786.4 ( 9.44x)
yuv2yuvX_10_LE_16_48_512_accurate_c: 7123.1 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_accurate_neon: 761.7 ( 9.35x)
yuv2yuvX_10_LE_16_48_512_approximate_c: 7182.7 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_approximate_neon: 763.0 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_accurate_c: 8092.6 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_accurate_neon: 860.2 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_approximate_c: 8183.5 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_approximate_neon: 861.4 ( 9.50x)
yuv2yuvX_10_BE_16_16_512_accurate_c: 7837.4 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_accurate_neon: 834.0 ( 9.40x)
yuv2yuvX_10_BE_16_16_512_approximate_c: 7927.9 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_approximate_neon: 834.6 ( 9.50x)
yuv2yuvX_10_BE_16_32_512_accurate_c: 7605.1 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_accurate_neon: 807.5 ( 9.42x)
yuv2yuvX_10_BE_16_32_512_approximate_c: 7691.4 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_approximate_neon: 807.3 ( 9.53x)
yuv2yuvX_10_BE_16_48_512_accurate_c: 7344.3 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_accurate_neon: 782.7 ( 9.38x)
yuv2yuvX_10_BE_16_48_512_approximate_c: 7440.1 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_approximate_neon: 781.9 ( 9.51x)
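
For anyone who wants to re-check the speedup column, here is a minimal sketch of how the ratio in parentheses can be recomputed from the raw timings above. It assumes each line has the form `<name>_<impl>: <time> ( <ratio>x)` with `<impl>` being `c` or `neon`. The helper names `parse_results` and `speedups` are hypothetical and not part of checkasm or FFmpeg.

```python
import re

# Assumed line layout: "<name>_<impl>: <time> ( <ratio>x)", impl is "c" or "neon".
LINE_RE = re.compile(
    r"^(?P<name>\S+)_(?P<impl>c|neon):\s+(?P<time>[\d.]+)\s+\(\s*[\d.]+x\)$"
)

def parse_results(text):
    """Collect timings per benchmark name, keyed by implementation (hypothetical helper)."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            results.setdefault(m["name"], {})[m["impl"]] = float(m["time"])
    return results

def speedups(results):
    """Speedup of NEON over C: C time divided by NEON time, rounded to two decimals."""
    return {
        name: round(t["c"] / t["neon"], 2)
        for name, t in results.items()
        if "c" in t and "neon" in t
    }

# Two pairs copied from the results above.
sample = """\
yuv2nv12cX_2_512_accurate_c: 3496.2 ( 1.00x)
yuv2nv12cX_2_512_accurate_neon: 409.5 ( 8.54x)
yuv2yuvX_10_LE_16_0_512_accurate_c: 7836.9 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_accurate_neon: 840.4 ( 9.33x)
"""

if __name__ == "__main__":
    for name, ratio in speedups(parse_results(sample)).items():
        print(f"{name}: {ratio:.2f}x")
```

Feeding the full list through the same helpers should reproduce the reported ratios, assuming the column is simply C time over NEON time rounded to two decimals.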
---
View it on FFmpeg Forgejo ( https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20028 ) or reply to this email directly.