[FFmpeg-devel] [PATCH v3] checkasm/h264dsp: Fix stack overflow in check_idct_dequant

Tue Jun 17 05:01:40 EEST 2025

> On Jun 17, 2025, at 02:29, Andreas Rheinhardt <andreas.rheinhardt at outlook.com> wrote:
> 
> Zhao Zhili:
>> 
>> 
>>> On Jun 16, 2025, at 19:03, Andreas Rheinhardt <andreas.rheinhardt at outlook.com> wrote:
>>> 
>>> Zhao Zhili:
>>>> 
>>>> 
>>>>> On Jun 16, 2025, at 17:46, Andreas Rheinhardt <andreas.rheinhardt at outlook.com> wrote:
>>>>> 
>>>>> Zhao Zhili:
>>>>>> From: Zhao Zhili <zhilizhao at tencent.com>
>>>>>> 
>>>>>> ---
>>>>>> tests/checkasm/h264dsp.c | 20 +++++++++++++++-----
>>>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
>>>>>> index f5f9650224..a0f8fd858a 100644
>>>>>> --- a/tests/checkasm/h264dsp.c
>>>>>> +++ b/tests/checkasm/h264dsp.c
>>>>>> @@ -328,25 +328,35 @@ static void check_idct_multiple(void)
>>>>>> static void check_idct_dequant(void)
>>>>>> {
>>>>>>   static const int depths[5] = { 8, 9, 10, 12, 14 };
>>>>>> -    LOCAL_ALIGNED_16(int16_t, src, [16]);
>>>>>> -    /* Ensure dst buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>>> +    /* Ensure buffers are large enough to hold dctcoefs of all bit-depths. */
>>>>>> +    LOCAL_ALIGNED_16(uint8_t, src_buf, [16 * sizeof(int32_t)]);
>>>>>>   LOCAL_ALIGNED_16(uint8_t, dst0, [16 * 16 * sizeof(int32_t)]);
>>>>>>   LOCAL_ALIGNED_16(uint8_t, dst1, [16 * 16 * sizeof(int32_t)]);
>>>>>> +    int16_t *src = (int16_t *)src_buf;
>>>>>>   int16_t *dst_ref = (int16_t *)dst0;
>>>>>>   int16_t *dst_new = (int16_t *)dst1;
>>>>>>   H264DSPContext h;
>>>>>>   int bit_depth, i, qmul;
>>>>>>   declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t *output, int16_t *input, int qmul);
>>>>>> 
>>>>>> -    for (int j = 0; j < 16; j++)
>>>>>> -        src[j] = (rnd() % 512) - 256;
>>>>>> -
>>>>>>   qmul = rnd() % 4096;
>>>>>> 
>>>>>>   for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
>>>>>>       bit_depth = depths[i];
>>>>>>       ff_h264dsp_init(&h, bit_depth, 1);
>>>>>> 
>>>>>> +        if (bit_depth == 8) {
>>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>>> +                int16_t r = (rnd() % 512) - 256;
>>>>>> +                AV_WN16A(&src_buf[j << 1], r);
>>>>>> +            }
>>>>>> +        } else {
>>>>>> +            for (size_t j = 0; j < 16; j++) {
>>>>>> +                int32_t r = (rnd() % (1 << (bit_depth + 1))) - (1 << bit_depth);
>>>>>> +                AV_WN32A(&src_buf[j << 2], r);
>>>>>> +            }
>>>>>> +        }
>>>>>> +
>>>>>>       memset(dst0, 0, 16 * 16 * SIZEOF_COEF);
>>>>>>       memset(dst1, 0, 16 * 16 * SIZEOF_COEF);
>>>>>> 
>>>>> 
>>>>> This still has an effective-type violation: src_buf is of type uint8_t,
>>>>> yet the ff_h264_luma_dc_dequant_idct functions will read it as
>>>>> int16_t/int32_t. It also still has the downside that buffer overflows
>>>>> for the 8bit case can go undetected.
>>>> 
>>>> A bunch of templates have casts like
>>>> 
>>>>   pixel *dst = (pixel *)_dst;
>>>>   const pixel *src = (const pixel *)_src;
>>>> 
>>>> then read and write as int16_t.
>>>> 
>>>> And a bunch of checkasm tests use uint8_t[] arrays on the stack as src and dst,
>>>> which leads to UB.
>>>> 
>>>> This patch isn’t special in that regard, and it adds zero UB (the UB was there before
>>>> the patch: both src and dst are accessed as int32_t/int16_t while they are declared as int16_t and uint8_t).
>>>> 
>>> 
>>> This patch adds UB: src was int16_t before, so that the accesses in the
>>> eight bit function were fine, but are not with this patch. Anyway, it is
>>> irrelevant now.
>> 
>> Why does access through a properly aligned uint8_t * suddenly become a big problem?
>> 
>> I don’t mind discussing the rules regarding these strict-aliasing violations,
>> especially in checkasm. But why does this suddenly become a rule blocking a patch
>> that tries to fix a FATE failure?
>> 
>> I don’t buy the reason "the accesses in the eight bit function were fine".
>> 
> 
> The effective type violation goes hand in hand with using a too big
> buffer for the smaller type, making the test less strict. This is an
> issue that checkasm should worry about (the effective type violation
> itself is not that important).

It’s the same buffer size as inside libavcodec/h264, so the test is as strict as the real use case. As long as the output is correct, over-reading a few bytes inside the input buffer doesn’t matter.

And there are tools to detect reads of uninitialized values. Without tools, a stack overflow cannot be detected either.
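
For reference, here is a minimal standalone sketch of the sizing idea in the patch quoted above: one byte buffer large enough for sixteen 32-bit coefficients, filled with sixteen int16_t values for 8-bit and sixteen int32_t values otherwise. It is not the checkasm code. coef_size(), fill_coefs() and toy_rnd() are hypothetical names, toy_rnd() is only a stand-in for checkasm's rnd(), and memcpy is used in place of AV_WN16A/AV_WN32A so that the sketch builds without FFmpeg headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_COEFS 16

/* Hypothetical stand-in for checkasm's rnd(): a small LCG, for illustration only. */
static uint32_t toy_state = 1;
static uint32_t toy_rnd(void)
{
    toy_state = toy_state * 1664525u + 1013904223u;
    return toy_state >> 8;
}

/* Bytes per coefficient: int16_t for 8-bit, int32_t for higher bit depths. */
static size_t coef_size(int bit_depth)
{
    return bit_depth == 8 ? sizeof(int16_t) : sizeof(int32_t);
}

/*
 * Fill buf with NUM_COEFS random coefficients in
 * [-(1 << bit_depth), (1 << bit_depth) - 1], stored at the width that
 * matches bit_depth. Returns the number of bytes written, or 0 if the
 * buffer is too small for that bit depth.
 */
static size_t fill_coefs(uint8_t *buf, size_t buf_size, int bit_depth)
{
    size_t csize = coef_size(bit_depth);

    assert(buf != NULL);
    if (buf_size < NUM_COEFS * csize)
        return 0;

    for (size_t j = 0; j < NUM_COEFS; j++) {
        int32_t r = (int32_t)(toy_rnd() % (1u << (bit_depth + 1))) - (1 << bit_depth);

        if (csize == sizeof(int16_t)) {
            int16_t r16 = (int16_t)r;   /* fits: range is [-256, 255] for 8-bit */
            memcpy(buf + j * csize, &r16, sizeof(r16));
        } else {
            memcpy(buf + j * csize, &r, sizeof(r));
        }
    }
    return NUM_COEFS * csize;
}
```

With a buffer of 16 * sizeof(int32_t) bytes, the 8-bit case uses only the first 32 bytes and the other depths use all 64, which is the asymmetry the discussion above is about.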

There is a v5. No more comments.

> Anyway, have you seen my patch?
> 
> - Andreas