[FFmpeg-devel] [PATCH v2 1/1] swscale/aarch64/output: Implement neon assembly for yuv2planeX_10_c_template()
Logaprakash Ramajayam
logaprakash.ramajayam at multicorewareinc.com
Wed Jul 2 12:10:44 EEST 2025
Attaching the patch with the assembly implementation of yuv2planeX_10_c() in text format.
________________________________
From: Logaprakash Ramajayam
Sent: Wednesday, July 2, 2025 1:01 PM
To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
Subject: [FFmpeg-devel] [PATCH v2 1/1] swscale/aarch64/output: Implement neon assembly for yuv2planeX_10_c_template()
Addressed all the review comments and updated checkasm for yuv2planeX_10_c().
Checkasm Benchmark results:
yuv2yuvX_10_LE_16_0_512_accurate_c: 7836.9 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_accurate_neon: 840.4 ( 9.33x)
yuv2yuvX_10_LE_16_0_512_approximate_c: 7930.8 ( 1.00x)
yuv2yuvX_10_LE_16_0_512_approximate_neon: 838.5 ( 9.46x)
yuv2yuvX_10_LE_16_16_512_accurate_c: 7594.3 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_accurate_neon: 815.2 ( 9.32x)
yuv2yuvX_10_LE_16_16_512_approximate_c: 7687.0 ( 1.00x)
yuv2yuvX_10_LE_16_16_512_approximate_neon: 811.9 ( 9.47x)
yuv2yuvX_10_LE_16_32_512_accurate_c: 7366.4 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_accurate_neon: 785.8 ( 9.37x)
yuv2yuvX_10_LE_16_32_512_approximate_c: 7426.5 ( 1.00x)
yuv2yuvX_10_LE_16_32_512_approximate_neon: 786.4 ( 9.44x)
yuv2yuvX_10_LE_16_48_512_accurate_c: 7123.1 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_accurate_neon: 761.7 ( 9.35x)
yuv2yuvX_10_LE_16_48_512_approximate_c: 7182.7 ( 1.00x)
yuv2yuvX_10_LE_16_48_512_approximate_neon: 763.0 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_accurate_c: 8092.6 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_accurate_neon: 860.2 ( 9.41x)
yuv2yuvX_10_BE_16_0_512_approximate_c: 8183.5 ( 1.00x)
yuv2yuvX_10_BE_16_0_512_approximate_neon: 861.4 ( 9.50x)
yuv2yuvX_10_BE_16_16_512_accurate_c: 7837.4 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_accurate_neon: 834.0 ( 9.40x)
yuv2yuvX_10_BE_16_16_512_approximate_c: 7927.9 ( 1.00x)
yuv2yuvX_10_BE_16_16_512_approximate_neon: 834.6 ( 9.50x)
yuv2yuvX_10_BE_16_32_512_accurate_c: 7605.1 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_accurate_neon: 807.5 ( 9.42x)
yuv2yuvX_10_BE_16_32_512_approximate_c: 7691.4 ( 1.00x)
yuv2yuvX_10_BE_16_32_512_approximate_neon: 807.3 ( 9.53x)
yuv2yuvX_10_BE_16_48_512_accurate_c: 7344.3 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_accurate_neon: 782.7 ( 9.38x)
yuv2yuvX_10_BE_16_48_512_approximate_c: 7440.1 ( 1.00x)
yuv2yuvX_10_BE_16_48_512_approximate_neon: 781.9 ( 9.51x)
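
For readers without the tree at hand, here is a rough scalar sketch of the kind of vertical filter the _c rows above measure. It is written from memory of libswscale/output.c and is not the patch's code: the names yuv2planeX_sketch, clip_uintp2 and store16 are hypothetical stand-ins, and the shift of 11 + 16 - output_bits assumes 15-bit intermediate samples and filter coefficients that sum to 4096.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for av_clip_uintp2(): clip v to [0, 2^bits - 1]. */
static int clip_uintp2(int v, int bits)
{
    const int max = (1 << bits) - 1;
    if (v < 0)
        return 0;
    if (v > max)
        return max;
    return v;
}

/* Store a 16-bit sample with an explicit byte order. */
static void store16(uint8_t *p, unsigned v, int big_endian)
{
    if (big_endian) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xff);
    } else {
        p[0] = (uint8_t)(v & 0xff);
        p[1] = (uint8_t)(v >> 8);
    }
}

/*
 * Scalar sketch: for every output pixel, accumulate filterSize taps taken
 * from filterSize source rows, round, shift down and clip to output_bits.
 * Assumption: src rows hold 15-bit intermediates and the filter sums to 4096.
 */
static void yuv2planeX_sketch(const int16_t *filter, int filterSize,
                              const int16_t **src, uint8_t *dest, int dstW,
                              int big_endian, int output_bits)
{
    const int shift = 11 + 16 - output_bits;

    assert(output_bits > 0 && output_bits <= 16);

    for (int i = 0; i < dstW; i++) {
        int val = 1 << (shift - 1);          /* rounding offset */

        for (int j = 0; j < filterSize; j++)
            val += src[j][i] * filter[j];    /* one tap per source row */

        store16(dest + 2 * i, (unsigned)clip_uintp2(val >> shift, output_bits),
                big_endian);
    }
}
```

With a single tap of 4096 and output_bits = 10, an input of 16384 (512 << 5) comes out as 512. The inner loop over j is independent for each i, which is what makes this routine a natural fit for SIMD.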
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Swscale-Aarch64-Implement-neon-assembly-yuv2planeX_10_c_template.patch
Type: application/octet-stream
Size: 25716 bytes
Desc: Swscale-Aarch64-Implement-neon-assembly-yuv2planeX_10_c_template.patch
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250702/334b1f46/attachment.obj>