[FFmpeg-devel] [PATCH 3/4] avutil/cuda_check: propagate AVERROR_UNRECOVERABLE when needed

Soft Works softworkz at hotmail.com
Tue Nov 22 20:08:35 EET 2022



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> James Almer
> Sent: Tuesday, November 22, 2022 3:41 PM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 3/4] avutil/cuda_check: propagate
> AVERROR_UNRECOVERABLE when needed
> 
> On 11/22/2022 11:33 AM, Soft Works wrote:
> >
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> >> James Almer
> >> Sent: Tuesday, November 22, 2022 2:31 PM
> >> To: ffmpeg-devel at ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH 3/4] avutil/cuda_check:
> >> propagate AVERROR_UNRECOVERABLE when needed
> >>
> >> On 11/22/2022 10:21 AM, Timo Rothenpieler wrote:
> >>> On 22/11/2022 14:07, James Almer wrote:
> >>>> Based on a patch by Soft Works.
> >>>>
> >>>> Signed-off-by: James Almer <jamrial at gmail.com>
> >>>> ---
> >>>>    libavutil/cuda_check.h | 4 ++++
> >>>>    1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/libavutil/cuda_check.h b/libavutil/cuda_check.h
> >>>> index f5a9234eaf..33aaf9c098 100644
> >>>> --- a/libavutil/cuda_check.h
> >>>> +++ b/libavutil/cuda_check.h
> >>>> @@ -49,6 +49,10 @@ static inline int ff_cuda_check(void *avctx,
> >>>>            av_log(avctx, AV_LOG_ERROR, " -> %s: %s", err_name,
> >>>> err_string);
> >>>>        av_log(avctx, AV_LOG_ERROR, "\n");
> >>>> +    // Not recoverable
> >>>> +    if (err == CUDA_ERROR_UNKNOWN)
> >>>> +        return AVERROR_UNRECOVERABLE;
> >>>
> >>> Why does specifically CUDA_ERROR_UNKNOWN get mapped to
> >> unrecoverable?
> >>
> >> It's the code that Soft Works found out was returned repeatedly no
> >> matter how many packets you fed to the encoder, which meant it was
> >> stuck in an unrecoverable state. See
> >> http://ffmpeg.org/pipermail/ffmpeg-devel/2021-October/287153.html
> >>
> >> If you know of cases where this error could be returned in valid
> >> recoverable scenarios that are not already handled in some other
> >> way, what do you suggest could be done?
> >
> > Thanks James, for picking this up!
> >
> > All I can say is that my original patch is deployed to quite a
> > number of systems and there hasn't been any case where this
> > would have had an adverse effect.
> >
> > I hadn't reported this to Nvidia because a solution was needed
> > and it was an erroneous file, so the best they could
> > have probably done was to return a different error code ;-)
> >
> > softworkz
> 
> Can you be more specific about what kind of erroneous file it was?
> Are we talking about a completely broken stream where no packet was
> valid and even the software decoder would fail, or something that had
> one invalid packet that somehow choked the nvdec...

I was able to find the conversations where this had been reported.
There were three cases; two were investigated, and both turned out
to be quite similar.

The first case was about an mpegts "recording" from some online
stream where the "recorder" was simply reconnecting on connection
failure and then continued writing to the same mpegts file.
It seems the server disconnected after 30 min and the
streams changed from then on:

11:35:35.096 frame=107726 fps=371 q=29.0 size=  588032kB time=00:29:57.59 bitrate=2682.9kbits/s throttle=off speed=6.18x    
11:35:35.596 frame=107907 fps=371 q=28.0 size=  589312kB time=00:30:00.62 bitrate=2684.2kbits/s throttle=off speed=6.18x    
11:35:35.995 [mpeg2_cuvid @ 0x699a40] AVHWFramesContext is already initialized with incompatible parameters
11:35:35.995 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
11:35:35.995 Error while decoding stream #0:0: Generic error in an external library
11:35:35.998 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
11:35:35.998 Error while decoding stream #0:0: Generic error in an external library
11:35:36.003 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error

We can't know what "incompatible parameters" actually means. It could
be the frame size, but it could also be a different codec (like H264
instead of MPEG2) or both, or interlaced/non-interlaced.


The other case was similar. The user had eventually admitted:

"I used ffmpeg and a bash script to concat the 3x videos into a single episode"

and that the codecs might have been different. Here it fails right from
the start, as "-ss 00:07:57.000" probably jumps right into the second
segment, which differs from the probe results (total length: 22 min).

I remember now that I had constructed test files like this, but with
much shorter "bad parts". The ffmpeg parser could read over them (at
least somewhat) and eventually recover, while the cuvid parser never
came back. But that was just to find out whether the cuvid error state
is terminal or not. The ability to recover doesn't help when a stream
change is permanent (= not an erroneous incident lasting a few seconds).


As such, the requirement was simply: when that happens, ffmpeg should exit.
(instead of feeding the cuvid zombie to infinity)
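
To illustrate that requirement, here is a minimal standalone sketch in C.
It is not FFmpeg code: the decoder is a mock, and all names in it
(MockDecoder, mock_decode, run_decode_loop, the SKETCH_* codes) are
hypothetical and exist only for this example. It assumes the decoder
reports a distinct error code once it has entered a terminal state, and
shows a feed loop that skips over transient errors but stops as soon as
it sees that code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical error codes for this sketch; not FFmpeg's real values. */
enum {
    SKETCH_OK                = 0,
    SKETCH_ERR_EXTERNAL      = -1, /* transient error: skip the packet */
    SKETCH_ERR_UNRECOVERABLE = -2  /* decoder is in a terminal state   */
};

/* Mock hardware decoder. Once 'stuck' is set, every later call fails. */
typedef struct MockDecoder {
    int stuck;
    int glitch_at; /* packet index with a one-off bad packet, -1 for none */
    int change_at; /* packet index where the stream changes, -1 for none  */
} MockDecoder;

typedef struct LoopResult {
    int fed;     /* packets submitted to the decoder        */
    int skipped; /* packets dropped due to transient errors */
    int status;  /* SKETCH_OK or SKETCH_ERR_UNRECOVERABLE   */
} LoopResult;

static int mock_decode(MockDecoder *dec, int packet_index)
{
    if (packet_index == dec->change_at)
        dec->stuck = 1; /* permanent stream change: never recovers */
    if (dec->stuck)
        return SKETCH_ERR_UNRECOVERABLE;
    if (packet_index == dec->glitch_at)
        return SKETCH_ERR_EXTERNAL;
    return SKETCH_OK;
}

/* Feed packets; continue past transient errors, stop on a terminal one. */
static LoopResult run_decode_loop(MockDecoder *dec, int total_packets)
{
    LoopResult res = { 0, 0, SKETCH_OK };

    assert(dec != NULL);
    for (int i = 0; i < total_packets; i++) {
        int ret = mock_decode(dec, i);
        res.fed++;
        if (ret == SKETCH_ERR_UNRECOVERABLE) {
            fprintf(stderr, "packet %d: unrecoverable error, exiting\n", i);
            res.status = ret;
            break; /* do not keep feeding a decoder that cannot recover */
        }
        if (ret == SKETCH_ERR_EXTERNAL)
            res.skipped++; /* one bad packet: log-and-continue is fine */
    }
    return res;
}
```

With a glitch at packet 2 and a permanent change at packet 5, the loop
skips one packet and stops after the sixth, however many packets remain.
Without the distinct code, it would feed all remaining packets and fail
on every one of them.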

Best regards,
softworkz







