[FFmpeg-devel] [PATCH v3 3/6] lavc/qsvdec: Replace current parser with MFXVideoDECODE_DecodeHeader()
Rogozhkin, Dmitry V
dmitry.v.rogozhkin at intel.com
Sat Mar 16 03:18:22 EET 2019
On Mon, 2019-03-11 at 17:23 +0800, Li, Zhong wrote:
> > -----Original Message-----
> > From: Rogozhkin, Dmitry V
> > Sent: Saturday, March 9, 2019 8:48 AM
> > To: ffmpeg-devel at ffmpeg.org
> > Cc: Li, Zhong <zhong.li at intel.com>
> > Subject: Re: [FFmpeg-devel] [PATCH v3 3/6] lavc/qsvdec: Replace current parser with MFXVideoDECODE_DecodeHeader()
> >
> > On Fri, 2019-03-08 at 15:40 +0800, Zhong Li wrote:
> > > Using the MSDK parser can improve the qsv decoder pass rate in some
> > > cases (e.g. an SPS that declares a wrong level_idc, smaller than it
> > > should be).
> > > It is also necessary for adding new qsv decoders such as MJPEG and
> > > VP9, since the current parser can't provide enough information.
> > > Actually, using MFXVideoDECODE_DecodeHeader() was discussed at
> > > https://ffmpeg.org/pipermail/ffmpeg-devel/2015-July/175734.html and
> > > merged as commit 1acb19d, but was overwritten when the libav patches
> > > were merged (commit 1f26a23) without any explanation.
> > >
> > > v2: split decode header from decode_init, and call it for every
> > > frame to detect format/resolution changes. It can fix some
> > > regression issues such as hevc 10bit decoding.
> > >
> > > Signed-off-by: Zhong Li <zhong.li at intel.com>
> > > ---
> > > libavcodec/qsvdec.c | 172 ++++++++++++++++++++------------------
> > > libavcodec/qsvdec.h | 2 +
> > > libavcodec/qsvdec_h2645.c | 1 +
> > > libavcodec/qsvdec_other.c | 1 +
> > > 4 files changed, 93 insertions(+), 83 deletions(-)
> > >
> > > diff --git a/libavcodec/qsvdec.c b/libavcodec/qsvdec.c
> > > index 4a0be811fb..b78026e14d 100644
> > > --- a/libavcodec/qsvdec.c
> > > +++ b/libavcodec/qsvdec.c
> > > @@ -120,19 +120,17 @@ static inline unsigned int qsv_fifo_size(const AVFifoBuffer* fifo)
> > > return av_fifo_size(fifo) / qsv_fifo_item_size();
> > > }
> > >
> > > -static int qsv_decode_init(AVCodecContext *avctx, QSVContext *q)
> > > +static int qsv_decode_preinit(AVCodecContext *avctx, QSVContext *q, enum AVPixelFormat *pix_fmts, mfxVideoParam *param)
> > > {
> > > - const AVPixFmtDescriptor *desc;
> > > mfxSession session = NULL;
> > > int iopattern = 0;
> > > - mfxVideoParam param = { 0 };
> > > - int frame_width = avctx->coded_width;
> > > - int frame_height = avctx->coded_height;
> > > int ret;
> > >
> > > - desc = av_pix_fmt_desc_get(avctx->sw_pix_fmt);
> > > - if (!desc)
> > > - return AVERROR_BUG;
> > > + ret = ff_get_format(avctx, pix_fmts);
> > > + if (ret < 0) {
> > > + q->orig_pix_fmt = avctx->pix_fmt = AV_PIX_FMT_NONE;
> > > + return ret;
> > > + }
> > >
> > > if (!q->async_fifo) {
> > > q->async_fifo = av_fifo_alloc(q->async_depth * qsv_fifo_item_size());
> > > @@ -170,48 +168,72 @@ static int qsv_decode_init(AVCodecContext *avctx, QSVContext *q)
> > > return ret;
> > > }
> > >
> > > - ret = ff_qsv_codec_id_to_mfx(avctx->codec_id);
> > > - if (ret < 0)
> > > - return ret;
> > > + param->IOPattern = q->iopattern;
> > > + param->AsyncDepth = q->async_depth;
> > > + param->ExtParam = q->ext_buffers;
> > > + param->NumExtParam = q->nb_ext_buffers;
> > >
> > > - param.mfx.CodecId = ret;
> > > - param.mfx.CodecProfile = ff_qsv_profile_to_mfx(avctx->codec_id, avctx->profile);
> > > - param.mfx.CodecLevel = avctx->level == FF_LEVEL_UNKNOWN ? MFX_LEVEL_UNKNOWN : avctx->level;
> > > -
> > > - param.mfx.FrameInfo.BitDepthLuma = desc->comp[0].depth;
> > > - param.mfx.FrameInfo.BitDepthChroma = desc->comp[0].depth;
> > > - param.mfx.FrameInfo.Shift = desc->comp[0].depth > 8;
> > > - param.mfx.FrameInfo.FourCC = q->fourcc;
> > > - param.mfx.FrameInfo.Width = frame_width;
> > > - param.mfx.FrameInfo.Height = frame_height;
> > > - param.mfx.FrameInfo.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
> > > -
> > > - switch (avctx->field_order) {
> > > - case AV_FIELD_PROGRESSIVE:
> > > - param.mfx.FrameInfo.PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
> > > - break;
> > > - case AV_FIELD_TT:
> > > - param.mfx.FrameInfo.PicStruct = MFX_PICSTRUCT_FIELD_TFF;
> > > - break;
> > > - case AV_FIELD_BB:
> > > - param.mfx.FrameInfo.PicStruct = MFX_PICSTRUCT_FIELD_BFF;
> > > - break;
> > > - default:
> > > - param.mfx.FrameInfo.PicStruct = MFX_PICSTRUCT_UNKNOWN;
> > > - break;
> > > - }
> > > + return 0;
> > > + }
> > >
> > > - param.IOPattern = q->iopattern;
> > > - param.AsyncDepth = q->async_depth;
> > > - param.ExtParam = q->ext_buffers;
> > > - param.NumExtParam = q->nb_ext_buffers;
> > > +static int qsv_decode_init(AVCodecContext *avctx, QSVContext *q, mfxVideoParam *param)
> > > +{
> > > + int ret;
> > >
> > > - ret = MFXVideoDECODE_Init(q->session, &param);
> > > + avctx->width = param->mfx.FrameInfo.CropW;
> > > + avctx->height = param->mfx.FrameInfo.CropH;
> > > + avctx->coded_width = param->mfx.FrameInfo.Width;
> > > + avctx->coded_height = param->mfx.FrameInfo.Height;
> > > + avctx->level = param->mfx.CodecLevel;
> > > + avctx->profile = param->mfx.CodecProfile;
> > > + avctx->field_order = ff_qsv_map_picstruct(param->mfx.FrameInfo.PicStruct);
> > > + avctx->pix_fmt = ff_qsv_map_fourcc(param->mfx.FrameInfo.FourCC);
> > > +
> > > + ret = MFXVideoDECODE_Init(q->session, param);
> > > if (ret < 0)
> > > return ff_qsv_print_error(avctx, ret,
> > > "Error initializing the MFX video decoder");
> > >
> > > - q->frame_info = param.mfx.FrameInfo;
> > > + q->frame_info = param->mfx.FrameInfo;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int qsv_decode_header(AVCodecContext *avctx, QSVContext *q, AVPacket *avpkt, enum AVPixelFormat *pix_fmts, mfxVideoParam *param)
> > > +{
> > > + int ret;
> > > +
> > > + mfxBitstream bs = { { { 0 } } };
> > > +
> > > + if (avpkt->size) {
> > > + bs.Data = avpkt->data;
> > > + bs.DataLength = avpkt->size;
> > > + bs.MaxLength = bs.DataLength;
> > > + bs.TimeStamp = avpkt->pts;
> > > + if (avctx->field_order == AV_FIELD_PROGRESSIVE)
> > > + bs.DataFlag |= MFX_BITSTREAM_COMPLETE_FRAME;
> > > + } else
> > > + return AVERROR_INVALIDDATA;
> > > +
> > > +
> > > + if(!q->session) {
> > > + ret = qsv_decode_preinit(avctx, q, pix_fmts, param);
> > > + if (ret < 0)
> > > + return ret;
> > > + }
> > > +
> > > + ret = ff_qsv_codec_id_to_mfx(avctx->codec_id);
> > > + if (ret < 0)
> > > + return ret;
> > > +
> > > + param->mfx.CodecId = ret;
> > > + ret = MFXVideoDECODE_DecodeHeader(q->session, &bs, param);
> > > + if (MFX_ERR_MORE_DATA == ret) {
> > > + return AVERROR(EAGAIN);
> > > + }
> > > + if (ret < 0)
> > > + return ff_qsv_print_error(avctx, ret,
> > > + "Error decoding stream header");
> > >
> > > return 0;
> > > }
> > > @@ -494,7 +516,10 @@ int ff_qsv_process_data(AVCodecContext *avctx, QSVContext *q,
> > > uint8_t *dummy_data;
> > > int dummy_size;
> > > int ret;
> > > - const AVPixFmtDescriptor *desc;
> > > + mfxVideoParam param = { 0 };
> > > + enum AVPixelFormat pix_fmts[3] = { AV_PIX_FMT_QSV,
> > > + AV_PIX_FMT_NV12,
> > > + AV_PIX_FMT_NONE };
> > >
> > > if (!q->avctx_internal) {
> > > q->avctx_internal = avcodec_alloc_context3(NULL);
> > > @@ -508,7 +533,6 @@ int ff_qsv_process_data(AVCodecContext *avctx, QSVContext *q,
> > > return AVERROR(ENOMEM);
> > >
> > > q->parser->flags |= PARSER_FLAG_COMPLETE_FRAMES;
> > > - q->orig_pix_fmt = AV_PIX_FMT_NONE;
> > > }
> > >
> > > if (!pkt->size)
> > > @@ -521,15 +545,17 @@ int ff_qsv_process_data(AVCodecContext *avctx, QSVContext *q,
> > > pkt->data, pkt->size, pkt->pts, pkt->dts, pkt->pos);
> > >
> > > - avctx->field_order = q->parser->field_order;
> > > /* TODO: flush delayed frames on reinit */
> > > - if (q->parser->format != q->orig_pix_fmt ||
> > > - FFALIGN(q->parser->coded_width, 16) != FFALIGN(avctx->coded_width, 16) ||
> > > - FFALIGN(q->parser->coded_height, 16) != FFALIGN(avctx->coded_height, 16)) {
> > > - enum AVPixelFormat pix_fmts[3] = { AV_PIX_FMT_QSV,
> > > - AV_PIX_FMT_NONE,
> > > - AV_PIX_FMT_NONE };
> > > - enum AVPixelFormat qsv_format;
> > > +
> > > + // sw_pix_fmt was initialized as NV12, will be updated after header decoded if not true.
> > > + if (q->orig_pix_fmt != AV_PIX_FMT_NONE)
> > > + pix_fmts[1] = q->orig_pix_fmt;
> >
> > Well, it still seems non self-explanatory to me why we have
> > pix_fmts[1] here.
>
> I think it has been explained and described in my comment: it means
> sw_pix_fmt.
> For more detail, you may want to check the implementation of
> ff_get_format().
>
> > > +
> > > + ret = qsv_decode_header(avctx, q, pkt, pix_fmts, &param);
> >
> > Parsing can be expensive to perform on each frame. See more below.
>
> 1. Parsing just reads the header bits; I don't think it is a CPU-heavy
> task compared with slice-level decoding.
> 2. Mostly only key frames have a complete header. For non-key frames,
> DecodeHeader just returns MFX_ERR_MORE_DATA instead of doing real
> parsing work.
>
> > > +
> > > + if (ret >= 0 && (q->orig_pix_fmt != ff_qsv_map_fourcc(param.mfx.FrameInfo.FourCC) ||
> > > + avctx->coded_width != param.mfx.FrameInfo.Width ||
> > > + avctx->coded_height != param.mfx.FrameInfo.Height)) {
> > > AVPacket zero_pkt = {0};
> >
> > You are actually trying to perform a task which is already done by
> > mediasdk.
> > Moreover, you don't detect all the cases in which the mediasdk
> > decoder will require a reset. For example, you may have 2
> > concatenated streams with exactly the same resolution where a decoder
> > reset is still required because of profile/level/GOP structure
> > changes; memory reallocation (or, better said, additional allocation)
> > could also be required because mediasdk may need a different (bigger
> > or smaller) number of surfaces to properly handle the new bitstream.
> > See more below.
>
> Which case of GOP structure change needs a reset? Could you please
> give an example?
Dmitry Gurulev, can you please help answer these questions?
>
> > >
> > > if (q->buffered_count) {
> > > @@ -538,45 +564,24 @@ int ff_qsv_process_data(AVCodecContext *avctx, QSVContext *q,
> > > q->buffered_count--;
> > > return qsv_decode(avctx, q, frame, got_frame, &zero_pkt);
> > > }
> > > -
> > > q->reinit_flag = 0;
> > >
> > > - qsv_format = ff_qsv_map_pixfmt(q->parser->format, &q->fourcc);
> > > - if (qsv_format < 0) {
> > > - av_log(avctx, AV_LOG_ERROR,
> > > - "Decoding pixel format '%s' is not supported\n",
> > > - av_get_pix_fmt_name(q->parser->format));
> > > - ret = AVERROR(ENOSYS);
> > > - goto reinit_fail;
> > > - }
> > > + q->orig_pix_fmt = avctx->pix_fmt = pix_fmts[1] = ff_qsv_map_fourcc(param.mfx.FrameInfo.FourCC);
> > >
> > > - q->orig_pix_fmt = q->parser->format;
> > > - avctx->pix_fmt = pix_fmts[1] = qsv_format;
> > > - avctx->width = q->parser->width;
> > > - avctx->height = q->parser->height;
> > > - avctx->coded_width = FFALIGN(q->parser->coded_width, 16);
> > > - avctx->coded_height = FFALIGN(q->parser->coded_height, 16);
> > > - avctx->level = q->avctx_internal->level;
> > > - avctx->profile = q->avctx_internal->profile;
> > > + avctx->coded_width = param.mfx.FrameInfo.Width;
> > > + avctx->coded_height = param.mfx.FrameInfo.Height;
> > >
> > > - ret = ff_get_format(avctx, pix_fmts);
> > > + ret = qsv_decode_preinit(avctx, q, pix_fmts, &param);
> > > if (ret < 0)
> > > goto reinit_fail;
> > > + q->initialized = 0;
> > > + }
> > >
> > > - avctx->pix_fmt = ret;
> > > -
> > > - desc = av_pix_fmt_desc_get(avctx->pix_fmt);
> > > - if (!desc)
> > > - goto reinit_fail;
> > > -
> > > - if (desc->comp[0].depth > 8) {
> > > - avctx->coded_width = FFALIGN(q->parser->coded_width, 32);
> > > - avctx->coded_height = FFALIGN(q->parser->coded_height, 32);
> > > - }
> > > -
> > > - ret = qsv_decode_init(avctx, q);
> > > + if (!q->initialized) {
> > > + ret = qsv_decode_init(avctx, q, &param);
> > > if (ret < 0)
> > > goto reinit_fail;
> > > + q->initialized = 1;
> > > }
> > >
> > > return qsv_decode(avctx, q, frame, got_frame, pkt);
> >
> > So, from my perspective you should not attempt to call decode header
> > on each incoming data portion and try to interpret the results.
> > Instead you should respect what mediasdk DecodeFrameAsync is
> > returning. Basically it will return MFX_ERR_INCOMPATIBLE_VIDEO_PARAM
> > if a decoder reset is required. This gives you 2 situations in which
> > you should call DecodeHeader:
> > 1. You either call it if the decoder is not initialized at all
> > 2. Or you call it upon receiving MFX_ERR_INCOMPATIBLE_VIDEO_PARAM
> >
> > Once you have received MFX_ERR_INCOMPATIBLE_VIDEO_PARAM, the steps to
> > perform are:
> > 1. Flush cached frames from the mediasdk decoder by passing a nullptr
> > bitstream to DecodeFrameAsync
> > 2. Run decode header and get the updated parameters
> > 3. Check (via QueryIOSurf) and potentially reallocate surfaces to
> > satisfy the new bitstream requirements
> > 4. Reset the decoder with the new parameters
>
> 1. This patch just replaces the original parser with the MSDK parser.
> I don't want to mix other things in here (which would probably
> introduce new bugs; full round testing has been done for this patch).
> If it is really necessary, it should be done in an additional patch.
> 2. Unfortunately, if we don't reset and set the correct
> resolution/pix_fmt, FFmpeg-qsv already fails during MSDK
> initialization (e.g. passing NV12 format when it is actually P010,
> see: https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/qsv.c#L424).
> There is no opportunity to get the status of DecodeFrameAsync in this
> case. I haven't compared the MSDK sample decoder with current FFmpeg
> (will take a look later), but IIRC the MSDK sample decoder isn't doing
> a better job than ffmpeg-qsv for the reinit case.
> (One example: it hangs when running sample_decode h264 -i
> /fate-suite/h264/reinit-large_420_8-to-small_420_8.h264 -r, and this
> clip changes from a larger resolution to a smaller one. I wonder what
> will happen when changing resolution from smaller to larger, e.g.
> 720x480 to 1920x1080.)
> 3. IMHO, reinit work is better done before decoding, instead of
> decoding and recovering after a failure.
> 4. This is a common function for all decoders; I would like to see it
> under the control of ffmpeg. MSDK may have no unified implementation
> across all codecs (e.g.
> https://github.com/Intel-Media-SDK/MediaSDK/issues/333).
> Your suggestion may be addressed when I see new issues or bugs, but
> currently I can't see the necessity.
>
> > Application may attempt to optimize the procedure in step 3 above
> > and avoid surface reallocation if possible. For example, if you have
> > allocated 1920x1080 surfaces and the resolution changed to 720x480,
> > you may actually stay on 1920x1080 if you are ok with using more
> > memory, hoping to receive a 1920x1080 stream again in the future.
> > Essentially this would work if you have a sufficient number of
> > 1920x1080 surfaces.
>
> Hm, this is a simpler topic; https://patchwork.ffmpeg.org/patch/12257/
> is meant to prepare for this, though the decoder should be updated
> too. But as discussed previously, this should be a separate patch and
> needs pre-testing.