[FFmpeg-devel] [PATCH] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_w_pixels, qpel_uni_w_h, qpel_uni_w_v, qpel_uni_w_hv and qpel_h
Martin Storsjö
martin at martin.st
Thu Jun 1 14:23:28 EEST 2023
On Sun, 28 May 2023, Logan.Lyu wrote:
>
> On 2023/5/28 12:36, Jean-Baptiste Kempf wrote:
>> Hello,
>>
>> The last iteration still has the wrong name in the patchset.
> Thanks for the reminder. I have corrected the name in git.
Thanks, most of the issues in the patch seem to have been fixed. However,
there's one big breakage here. Also, even if this is accepted, we'll have
to wait for the dependency patches to be merged before these can go in.
For restoring the saved registers on the stack, you currently have this:

        ldp x19, x30, [sp]
        ldp x26, x27, [sp, #16]
        ldp x24, x25, [sp, #32]
        ldp x22, x23, [sp, #48]
        ldp x20, x21, [sp, #64]
        add sp, sp, #80

You can avoid the extra add at the end by reordering them like this:

        ldp x26, x27, [sp, #16]
        ldp x24, x25, [sp, #32]
        ldp x22, x23, [sp, #48]
        ldp x20, x21, [sp, #64]
        ldp x19, x30, [sp], #80
But the order/layout of the registers in this restore sequence doesn't
match how they are backed up in the prologue. So when you run checkasm,
you'll get these errors:

I8MM:
 - hevc_pel.qpel             [OK]
   put_hevc_qpel_uni_w_hv4_8_i8mm (failed to preserve register)
   put_hevc_qpel_uni_w_hv8_8_i8mm (failed to preserve register)
   put_hevc_qpel_uni_w_hv16_8_i8mm (failed to preserve register)
   put_hevc_qpel_uni_w_hv32_8_i8mm (failed to preserve register)
   put_hevc_qpel_uni_w_hv64_8_i8mm (failed to preserve register)
 - hevc_pel.qpel_uni_w       [FAILED]
checkasm: 5 of 1136 tests have failed

It's easiest to make the epilogue a mirror copy of the prologue.
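
As a rough sketch of what such a mirrored pair could look like: the store
order and stack offsets below are assumed for illustration only and are not
taken from the patch, and example_func is a hypothetical label. The point is
just that every ldp uses the same register pair and offset as its matching
stp, with the pre-indexed store and the post-indexed load handling the sp
adjustment.

```
// Hypothetical layout, for illustration only (not the layout in the patch).
example_func:
        // Prologue: allocate 80 bytes and back up the callee-saved registers.
        stp x19, x30, [sp, #-80]!       // pre-index: sp -= 80, then store at [sp]
        stp x20, x21, [sp, #16]
        stp x22, x23, [sp, #32]
        stp x24, x25, [sp, #48]
        stp x26, x27, [sp, #64]

        // ... function body, free to clobber x19-x27 and x30 ...

        // Epilogue: same pairs, same offsets as the prologue.
        ldp x20, x21, [sp, #16]
        ldp x22, x23, [sp, #32]
        ldp x24, x25, [sp, #48]
        ldp x26, x27, [sp, #64]
        ldp x19, x30, [sp], #80         // post-index: load from [sp], then sp += 80
        ret
```

With the pairs and offsets kept identical on both sides, a mismatch like the
one checkasm flags above becomes easy to spot by comparing the two blocks
line by line.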
Please rerun checkasm on a system that does support i8mm when posting
updated patches.
// Martin