[FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to match documentation
Mark Thompson
sw at jkqxz.net
Tue Mar 5 02:52:19 EET 2019
On 28/02/2019 06:38, Guo, Yejun wrote:
>> -----Original Message-----
>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf
>> Of mypopy at gmail.com
>> Sent: Thursday, February 28, 2019 11:26 AM
>> To: FFmpeg development discussions and patches <ffmpeg-
>> devel at ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to
>> match documentation
>>
>> On Thu, Feb 28, 2019 at 10:53 AM Guo, Yejun <yejun.guo at intel.com> wrote:
>>>> -----Original Message-----
>>>> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On
>> Behalf
>>>> Of Mark Thompson
>>>> Sent: Thursday, February 28, 2019 6:00 AM
>>>> To: ffmpeg-devel at ffmpeg.org
>>>> Subject: [FFmpeg-devel] [PATCH 2/5] libx264: Update ROI behaviour to
>>>> match documentation
>>>>
>>>> Fix the quantisation offset - use the whole range, and don't change the
>>>> offset size based on bit depth.
>>>>
>>>> Use correct bottom/right edge locations (they are offsets from
>> bottom/right,
>>>> not from top/left).
>>>>
>>>> Iterate the list in reverse order. The regions are in order of
>> decreasing
>>>> importance, so the most important must be applied last.
>>>> ---
>>>> libavcodec/libx264.c | 50 ++++++++++++++++++++++++--------------------
>>>> 1 file changed, 27 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
>>>> index a3493f393d..475719021e 100644
>>>> --- a/libavcodec/libx264.c
>>>> +++ b/libavcodec/libx264.c
>>>> @@ -285,16 +285,18 @@ static int X264_frame(AVCodecContext *ctx,
>>>> AVPacket *pkt, const AVFrame *frame,
>>>> int nnal, i, ret;
>>>> x264_picture_t pic_out = {0};
>>>> int pict_type;
>>>> + int bit_depth;
>>>> int64_t *out_opaque;
>>>> AVFrameSideData *sd;
>>>>
>>>> x264_picture_init( &x4->pic );
>>>> x4->pic.img.i_csp = x4->params.i_csp;
>>>> #if X264_BUILD >= 153
>>>> - if (x4->params.i_bitdepth > 8)
>>>> + bit_depth = x4->params.i_bitdepth;
>>>> #else
>>>> - if (x264_bit_depth > 8)
>>>> + bit_depth = x264_bit_depth;
>>>> #endif
>>>> + if (bit_depth > 8)
>>>> x4->pic.img.i_csp |= X264_CSP_HIGH_DEPTH;
>>>> x4->pic.img.i_plane = avfmt2_num_planes(ctx->pix_fmt);
>>>>
>>>> @@ -359,45 +361,47 @@ static int X264_frame(AVCodecContext *ctx,
>>>> AVPacket *pkt, const AVFrame *frame,
>>>> if (frame->interlaced_frame == 0) {
>>>> int mbx = (frame->width + MB_SIZE - 1) / MB_SIZE;
>>>> int mby = (frame->height + MB_SIZE - 1) / MB_SIZE;
>>>> + int qp_range = 51 + 6 * (bit_depth - 8);
>>>
>>> just found following from "$ ./x264 --fullhelp", not sure what 81 means
>> here. Shall we change our qoffset formula accordingly?
>>> --qpmin <integer> Set min QP [0]
>>> --qpmax <integer> Set max QP [81]
libx264 applies a fixed offset to the QP range to make it nonnegative (by adding QpBdOffsetY). The maximum of 81 therefore corresponds to a bit depth of (81-51)/6+8 = 13 bits, the higher values only being valid at the higher depths.
I guess that's set for a higher depth than libx264 actually supports to avoid any future problems with range (e.g. 12-bit support would not be an unreasonable feature).
>>>> int nb_rois;
>>>> - AVRegionOfInterest* roi;
>>>> - float* qoffsets;
>>>> + const AVRegionOfInterest *roi;
>>>> + float *qoffsets;
>>>> qoffsets = av_mallocz_array(mbx * mby,
>> sizeof(*qoffsets));
>>>> if (!qoffsets)
>>>> return AVERROR(ENOMEM);
>>>>
>>>> - nb_rois = sd->size / sizeof(AVRegionOfInterest);
>>>> - roi = (AVRegionOfInterest*)sd->data;
>>>> - for (int count = 0; count < nb_rois; count++) {
>>>> - int starty = FFMIN(mby, roi->top / MB_SIZE);
>>>> - int endy = FFMIN(mby, (roi->bottom + MB_SIZE
>> - 1)/ MB_SIZE);
>>>> - int startx = FFMIN(mbx, roi->left / MB_SIZE);
>>>> - int endx = FFMIN(mbx, (roi->right + MB_SIZE
>> - 1)/ MB_SIZE);
>>>> + roi = (const AVRegionOfInterest*)sd->data;
>>>> + if (!roi->self_size || sd->size % roi->self_size
>> != 0) {
>>>> + av_log(ctx, AV_LOG_ERROR, "Invalid
>>>> AVRegionOfInterest.self_size.\n");
>>>> + return AVERROR(EINVAL);
>>>> + }
>>>> + nb_rois = sd->size / roi->self_size;
>>>> +
>>>> + // This list must be iterated in reverse because
>> regions are
>>>> + // defined in order of decreasing importance.
>>>
>>> Nit: the reason may be more straight forward.
>>> This list must be iterated in reverse because: when overlapping regions
>> are defined, the first region containing a given area of the frame applies.
Right, yes. I've updated the comment appropriately.
>>>> + for (int i = nb_rois - 1; i >= 0; i--) {
>>>> + int startx, endx, starty, endy;
>>>> float qoffset;
>>>>
>>>> + roi = (const AVRegionOfInterest*)(sd->data +
>> roi->self_size * i);
>>>> +
>>>> + starty = av_clip(roi->top / MB_SIZE, 0, mby);
>>>> + endy = av_clip((frame->height - roi->bottom
>> + MB_SIZE - 1) /
>>>> MB_SIZE, 0, mby);
>>>> + startx = av_clip(roi->left / MB_SIZE, 0, mbx);
>>>> + endx = av_clip((frame->width - roi->right +
>> MB_SIZE - 1) /
>>>> MB_SIZE, 0, mbx);
>>>
>>> not quite understand why endx/endy is calculated so.
>>> For example, for a 1920x1080 frame, and roi->top is 0, and roi->bottom is
>> 1080, then,
>>> starty is 0, endy is 0, which make the following loop does not work.
>>
>> I think Mark use the (left/top) and (right/bottom) as the offset, like this:
>> in the fig, (left/top) == (s1x, s1y), (right/bottom) ==(s2x,s2y)
>>
>>
>> +-----------------------+------> w (x)
>> | ^ |
>> | | s1y |
>> | V |
>> | +***********+ |
>> | s1x * * s2x|
>> |<--> * ROI *<-->|
>> | * * |
>> | +***********+ |
>> | ^ |
>> | | s2y |
>> | V |
>> |-----------------------+
>> |
>> V
>>
>> h (y)
>>
>
> thanks, I guess so. But, I'm not quite understand why use this style.
This definition is what is in the current documentation: <http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavutil/frame.h;h=8aa3e88367ad3605033dbfe92a431d97e0834023;hb=HEAD#l215>. I followed that when writing the VAAPI support, and only afterwards realised that it didn't match what libx264 was doing.
I wouldn't mind a different way to define the rectangle if everyone agrees. This method matches cropping in AVFrame and various codecs like H.264. The current libx26[45] code instead defines the coordinates of the top left and bottom right of the region. A third alternative would be the coordinates of the top left combined with a width and height, which would match some other filters as suggested by Moritz in the addroi thread (that was primarily talking about the filter option interface, but I don't think it's unreasonable to want the API to match).
Thanks
- Mark
More information about the ffmpeg-devel
mailing list