[FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to match documentation
Guo, Yejun
yejun.guo at intel.com
Tue Mar 5 14:46:10 EET 2019
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Tuesday, March 05, 2019 8:52 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to
> match documentation
>
> On 28/02/2019 06:38, Guo, Yejun wrote:
> >> -----Original Message-----
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
> Behalf
> >> Of mypopy at gmail.com
> >> Sent: Thursday, February 28, 2019 11:26 AM
> >> To: FFmpeg development discussions and patches <ffmpeg-
> >> devel at ffmpeg.org>
> >> Subject: Re: [FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour
> to
> >> match documentation
> >>
> >> On Thu, Feb 28, 2019 at 10:53 AM Guo, Yejun <yejun.guo at intel.com>
> wrote:
> >>>> -----Original Message-----
> >>>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
> >> Behalf
> >>>> Of Mark Thompson
> >>>> Sent: Thursday, February 28, 2019 6:00 AM
> >>>> To: ffmpeg-devel at ffmpeg.org
> >>>> Subject: [FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to
> >>>> match documentation
> >>>>
> >>>> Fix the quantisation offset - use the whole range, and don't change the
> >>>> offset size based on bit depth.
> >>>>
> >>>> Use correct bottom/right edge locations (they are offsets from
> >> bottom/right,
> >>>> not from top/left).
> >>>>
> >>>> Iterate the list in reverse order. The regions are in order of
> >> decreasing
> >>>> importance, so the most important must be applied last.
> >>>> ---
> >>>> libavcodec/libx264.c | 50 ++++++++++++++++++++++++-----------------
> ---
> >>>> 1 file changed, 27 insertions(+), 23 deletions(-)
> >>>>
> >>>> diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
> >>>> index a3493f393d..475719021e 100644
> >>>> --- a/libavcodec/libx264.c
> >>>> +++ b/libavcodec/libx264.c
> >>>> @@ -285,16 +285,18 @@ static int X264_frame(AVCodecContext *ctx,
> >>>> AVPacket *pkt, const AVFrame *frame,
> >>>> int nnal, i, ret;
> >>>> x264_picture_t pic_out = {0};
> >>>> int pict_type;
> >>>> + int bit_depth;
> >>>> int64_t *out_opaque;
> >>>> AVFrameSideData *sd;
> >>>>
> >>>> x264_picture_init( &x4->pic );
> >>>> x4->pic.img.i_csp = x4->params.i_csp;
> >>>> #if X264_BUILD >= 153
> >>>> - if (x4->params.i_bitdepth > 8)
> >>>> + bit_depth = x4->params.i_bitdepth;
> >>>> #else
> >>>> - if (x264_bit_depth > 8)
> >>>> + bit_depth = x264_bit_depth;
> >>>> #endif
> >>>> + if (bit_depth > 8)
> >>>> x4->pic.img.i_csp |= X264_CSP_HIGH_DEPTH;
> >>>> x4->pic.img.i_plane = avfmt2_num_planes(ctx->pix_fmt);
> >>>>
> >>>> @@ -359,45 +361,47 @@ static int X264_frame(AVCodecContext *ctx,
> >>>> AVPacket *pkt, const AVFrame *frame,
> >>>> if (frame->interlaced_frame == 0) {
> >>>> int mbx = (frame->width + MB_SIZE - 1) / MB_SIZE;
> >>>> int mby = (frame->height + MB_SIZE - 1) / MB_SIZE;
> >>>> + int qp_range = 51 + 6 * (bit_depth - 8);
> >>>
> >>> just found following from "$ ./x264 --fullhelp", not sure what 81 means
> >> here. Shall we change our qoffset formula accordingly?
> >>> --qpmin <integer> Set min QP [0]
> >>> --qpmax <integer> Set max QP [81]
>
> libx264 applies a fixed offset to the QP range to make it nonnegative (by
> adding QpBdOffsetY). The maximum of 81 therefore corresponds to a bit
> depth of (81-51)/6+8 = 13 bits, the higher values only being valid at the higher
> depths.
>
> I guess that's set for a higher depth than libx264 actually supports to avoid
> any future problems with range (e.g. 12-bit support would not be an
> unreasonable feature).
>
got it, thanks.
> >>>> int nb_rois;
> >>>> - AVRegionOfInterest* roi;
> >>>> - float* qoffsets;
> >>>> + const AVRegionOfInterest *roi;
> >>>> + float *qoffsets;
> >>>> qoffsets = av_mallocz_array(mbx * mby,
> >> sizeof(*qoffsets));
> >>>> if (!qoffsets)
> >>>> return AVERROR(ENOMEM);
> >>>>
> >>>> - nb_rois = sd->size / sizeof(AVRegionOfInterest);
> >>>> - roi = (AVRegionOfInterest*)sd->data;
> >>>> - for (int count = 0; count < nb_rois; count++) {
> >>>> - int starty = FFMIN(mby, roi->top / MB_SIZE);
> >>>> - int endy = FFMIN(mby, (roi->bottom + MB_SIZE
> >> - 1)/ MB_SIZE);
> >>>> - int startx = FFMIN(mbx, roi->left / MB_SIZE);
> >>>> - int endx = FFMIN(mbx, (roi->right + MB_SIZE
> >> - 1)/ MB_SIZE);
> >>>> + roi = (const AVRegionOfInterest*)sd->data;
> >>>> + if (!roi->self_size || sd->size % roi->self_size
> >> != 0) {
> >>>> + av_log(ctx, AV_LOG_ERROR, "Invalid
> >>>> AVRegionOfInterest.self_size.\n");
> >>>> + return AVERROR(EINVAL);
> >>>> + }
> >>>> + nb_rois = sd->size / roi->self_size;
> >>>> +
> >>>> + // This list must be iterated in reverse because
> >> regions are
> >>>> + // defined in order of decreasing importance.
> >>>
> >>> Nit: the reason may be more straight forward.
> >>> This list must be iterated in reverse because: when overlapping regions
> >> are defined, the first region containing a given area of the frame applies.
>
> Right, yes. I've updated the comment appropriately.
>
> >>>> + for (int i = nb_rois - 1; i >= 0; i--) {
> >>>> + int startx, endx, starty, endy;
> >>>> float qoffset;
> >>>>
> >>>> + roi = (const AVRegionOfInterest*)(sd->data +
> >> roi->self_size * i);
> >>>> +
> >>>> + starty = av_clip(roi->top / MB_SIZE, 0, mby);
> >>>> + endy = av_clip((frame->height - roi->bottom
> >> + MB_SIZE - 1) /
> >>>> MB_SIZE, 0, mby);
> >>>> + startx = av_clip(roi->left / MB_SIZE, 0, mbx);
> >>>> + endx = av_clip((frame->width - roi->right +
> >> MB_SIZE - 1) /
> >>>> MB_SIZE, 0, mbx);
> >>>
> >>> not quite understand why endx/endy is calculated so.
> >>> For example, for a 1920x1080 frame, and roi->top is 0, and roi->bottom is
> >> 1080, then,
> >>> starty is 0, endy is 0, which make the following loop does not work.
> >>
> >> I think Mark use the (left/top) and (right/bottom) as the offset, like this:
> >> in the fig, (left/top) == (s1x, s1y), (right/bottom) ==(s2x,s2y)
> >>
> >>
> >> +-----------------------+------> w (x)
> >> | ^ |
> >> | | s1y |
> >> | V |
> >> | +***********+ |
> >> | s1x * * s2x|
> >> |<--> * ROI *<-->|
> >> | * * |
> >> | +***********+ |
> >> | ^ |
> >> | | s2y |
> >> | V |
> >> |-----------------------+
> >> |
> >> V
> >>
> >> h (y)
> >>
> >
> > thanks, I guess so. But, I'm not quite understand why use this style.
>
> This definition is what is in the current documentation:
> <http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavutil/frame.h;h=8aa3e8
> 8367ad3605033dbfe92a431d97e0834023;hb=HEAD#l215>. I followed that
> when writing the VAAPI support, and only afterwards realised that it didn't
> match what libx264 was doing.
my fault, I just copied the comment, I min-understood it. I would prefer to
correct the documentation typo here.
>
> I wouldn't mind a different way to define the rectangle if everyone agrees.
> This method matches cropping in AVFrame and various codecs like H.264.
> The current libx26[45] code instead defines the coordinates of the top left
> and bottom right of the region. A third alternative would be the coordinates
> of the top left combined with a width and height, which would match some
> other filters as suggested by Moritz in the addroi thread (that was primarily
> talking about the filter option interface, but I don't think it's unreasonable to
> want the API to match).
I'm also ok to switch to top/left/width/height.
>
> Thanks
>
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
More information about the ffmpeg-devel
mailing list