[FFmpeg-devel] Re: [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow

Wed May 29 22:44:00 EEST 2024

Ronald S. Bultje:
> From: Ronald S. Bultje <rsbultje at gmail.com>
> Sent: May 29, 2024 10:51
> To: FFmpeg development discussions and patches
> Cc: James Almer; Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow
> 
> Hi,
> 
> On Wed, May 29, 2024 at 11:38 AM <toqsxw at outlook.com> wrote:
> +%else
> +    vpunpcklqdq      m11, m2, m2
> +    vpunpckhqdq      m12, m2, m2
> +    vpunpcklwd       m11, m11, m14
> +    vpunpcklwd       m12, m12, m14
> +    paddd             m0, m11
> +    paddd             m1, m12
> +    packssdw          m0, m0, m1
> +%endif
> 
> punpcklqdq a, src, src
> punpckhqdq b, src, src
> punpcklwd a, a, zero
> punpcklwd b, b, zero
> 
> is the same as
> 
> punpcklwd a, src, zero
> punpckhwd b, src, zero

Thank you for pointing this out. This simplification is really helpful for improving the patch!
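
For reference, the equivalence can be checked with a small scalar model. This is only a sketch: the `vec128` type and the helper functions are hypothetical stand-ins that model one 128-bit lane as eight 16-bit words; they are not FFmpeg or x86inc code. With wider registers the same instructions operate on each 128-bit lane separately, so the argument carries over lane by lane.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of one 128-bit lane: eight 16-bit words. */
typedef struct { uint16_t w[8]; } vec128;

/* punpcklqdq: low 64 bits of a, followed by low 64 bits of b. */
vec128 punpcklqdq(vec128 a, vec128 b)
{
    vec128 r;
    memcpy(&r.w[0], &a.w[0], 4 * sizeof(uint16_t));
    memcpy(&r.w[4], &b.w[0], 4 * sizeof(uint16_t));
    return r;
}

/* punpckhqdq: high 64 bits of a, followed by high 64 bits of b. */
vec128 punpckhqdq(vec128 a, vec128 b)
{
    vec128 r;
    memcpy(&r.w[0], &a.w[4], 4 * sizeof(uint16_t));
    memcpy(&r.w[4], &b.w[4], 4 * sizeof(uint16_t));
    return r;
}

/* punpcklwd: interleave the low four words of a and b. */
vec128 punpcklwd(vec128 a, vec128 b)
{
    vec128 r;
    for (int i = 0; i < 4; i++) {
        r.w[2 * i]     = a.w[i];
        r.w[2 * i + 1] = b.w[i];
    }
    return r;
}

/* punpckhwd: interleave the high four words of a and b. */
vec128 punpckhwd(vec128 a, vec128 b)
{
    vec128 r;
    for (int i = 0; i < 4; i++) {
        r.w[2 * i]     = a.w[4 + i];
        r.w[2 * i + 1] = b.w[4 + i];
    }
    return r;
}

/* Returns 1 if the four-instruction and two-instruction sequences agree. */
int unpack_sequences_match(vec128 src)
{
    vec128 zero = {{0}};

    /* Original sequence: duplicate each half, then unpack the low words. */
    vec128 a1 = punpcklwd(punpcklqdq(src, src), zero);
    vec128 b1 = punpcklwd(punpckhqdq(src, src), zero);

    /* Suggested sequence: unpack low and high words directly. */
    vec128 a2 = punpcklwd(src, zero);
    vec128 b2 = punpckhwd(src, zero);

    return !memcmp(&a1, &a2, sizeof(a1)) && !memcmp(&b1, &b2, sizeof(b1));
}
```

Calling `unpack_sequences_match()` on any input returns 1, because both sequences zero-extend words 0..3 into `a` and words 4..7 into `b`.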

Andreas:
> Can this happen with real inputs (like when called from the decoder)? If
> not, then the test needs to be made more realistic.
> Anyway, what is the performance impact of this?

I don't have a unit test for this, but the average FPS appears unchanged.

Ronald:
> Also, the whole thing just emulates a saturated add. Can't you use paddsw instead of paddw and be done with it? To add to Andreas' question: is saturating here normatively required?

We don't have any sample that fails because of this issue; the only failure is the checksum with specific seeds. I think we can leave it unchanged until a real sample shows a problem.
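
To make the paddw/paddsw question concrete, here is a scalar sketch of what each instruction does to a single 16-bit lane. The function names are hypothetical and only model the per-lane arithmetic; the sketch does not say which behaviour the VVC specification requires for ALF.

```c
#include <stdint.h>

/*
 * One lane of paddw: 16-bit addition that wraps modulo 2^16.
 * The conversion back to int16_t assumes the usual two's-complement
 * behaviour of the compiler.
 */
int16_t add_wrap16(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/*
 * One lane of paddsw: signed addition that saturates to
 * [INT16_MIN, INT16_MAX] instead of wrapping.
 */
int16_t add_sat16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;

    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```

The two functions agree whenever the exact sum fits in 16 bits and differ only on overflow, which is why real samples can decode identically either way while a randomized test with specific seeds does not.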

@Nuomi, could you provide more details?