[FFmpeg-devel] [PATCH v2 1/1] swscale/aarch64/output: Implement neon assembly for yuv2planeX_10_c_template()

Logaprakash Ramajayam logaprakash.ramajayam at multicorewareinc.com
Wed Jul 2 10:31:33 EEST 2025


Addressed all review comments and updated checkasm for yuv2planeX_10_c().

Checkasm Benchmark results:

yuv2yuvX_10_LE_16_0_512_accurate_c:                   7836.9 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_accurate_neon:                 840.4 ( 9.33x)
yuv2yuvX_10_LE_16_0_512_approximate_c:                7930.8 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_approximate_neon:              838.5 ( 9.46x)
yuv2yuvX_10_LE_16_16_512_accurate_c:                  7594.3 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_accurate_neon:                815.2 ( 9.32x)
yuv2yuvX_10_LE_16_16_512_approximate_c:               7687.0 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_approximate_neon:             811.9 ( 9.47x)
yuv2yuvX_10_LE_16_32_512_accurate_c:                  7366.4 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_accurate_neon:                785.8 ( 9.37x)
yuv2yuvX_10_LE_16_32_512_approximate_c:               7426.5 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_approximate_neon:             786.4 ( 9.44x)
yuv2yuvX_10_LE_16_48_512_accurate_c:                  7123.1 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_accurate_neon:                761.7 ( 9.35x)
yuv2yuvX_10_LE_16_48_512_approximate_c:               7182.7 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_approximate_neon:             763.0 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_accurate_c:                   8092.6 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_accurate_neon:                 860.2 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_approximate_c:                8183.5 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_approximate_neon:              861.4 ( 9.50x)
yuv2yuvX_10_BE_16_16_512_accurate_c:                  7837.4 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_accurate_neon:                834.0 ( 9.40x)
yuv2yuvX_10_BE_16_16_512_approximate_c:               7927.9 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_approximate_neon:             834.6 ( 9.50x)
yuv2yuvX_10_BE_16_32_512_accurate_c:                  7605.1 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_accurate_neon:                807.5 ( 9.42x)
yuv2yuvX_10_BE_16_32_512_approximate_c:               7691.4 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_approximate_neon:             807.3 ( 9.53x)
yuv2yuvX_10_BE_16_48_512_accurate_c:                  7344.3 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_accurate_neon:                782.7 ( 9.38x)
yuv2yuvX_10_BE_16_48_512_approximate_c:               7440.1 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_approximate_neon:             781.9 ( 9.51x)
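
For context on what the benchmarked routine computes, here is a minimal scalar sketch of 10-bit vertical scaling. It is modeled on the general shape of the C template in libswscale and is not the code from the attached patch. The names yuv2planeX_10_sketch and clip_uintp2 are hypothetical, and the shift constant (11 + 16 - output_bits) and rounding term are assumptions about the reference behaviour.

```c
#include <stdint.h>

/* Clamp v to the unsigned range [0, 2^bits - 1]. */
static int clip_uintp2(int v, int bits)
{
    const int max = (1 << bits) - 1;
    if (v < 0)
        return 0;
    if (v > max)
        return max;
    return v;
}

/*
 * Hypothetical scalar reference: for each output pixel, accumulate
 * filter_size source rows weighted by the filter coefficients, round,
 * shift down to 10 bits, clamp, and store as 16-bit LE or BE.
 * Assumes >> on a negative int is an arithmetic shift, which holds on
 * the compilers FFmpeg targets but is implementation-defined in C.
 */
static void yuv2planeX_10_sketch(const int16_t *filter, int filter_size,
                                 const int16_t **src, uint8_t *dest,
                                 int dst_w, int big_endian)
{
    const int output_bits = 10;
    const int shift = 11 + 16 - output_bits;    /* assumed: 17 */

    for (int i = 0; i < dst_w; i++) {
        int val = 1 << (shift - 1);             /* rounding term */

        for (int j = 0; j < filter_size; j++)
            val += src[j][i] * filter[j];

        uint16_t px = (uint16_t)clip_uintp2(val >> shift, output_bits);

        if (big_endian) {
            dest[2 * i]     = (uint8_t)(px >> 8);
            dest[2 * i + 1] = (uint8_t)(px & 0xff);
        } else {
            dest[2 * i]     = (uint8_t)(px & 0xff);
            dest[2 * i + 1] = (uint8_t)(px >> 8);
        }
    }
}
```

The inner loop over filter taps and the per-pixel clamp are the parts a NEON version would typically vectorize. The LE and BE variants differ only in the final store, which would be consistent with the near-identical speedups in the two halves of the table above.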
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Swscale-Aarch64-Implement-neon-assembly-yuv2planeX_10_c_template.patch
Type: application/octet-stream
Size: 25716 bytes
Desc: Swscale-Aarch64-Implement-neon-assembly-yuv2planeX_10_c_template.patch
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250702/65722fa1/attachment.obj>

