[FFmpeg-devel] [PATCH] H264 DXVA2 implementation
Laurent Aimar
fenrir
Thu Jan 7 23:19:26 CET 2010
Hi,
On Thu, Jan 07, 2010, Reimar D?ffinger wrote:
> On Thu, Jan 07, 2010 at 08:36:11PM +0100, Laurent Aimar wrote:
> > + if (size <= dxva_size) {
> > + memcpy(dxva_data, data, size);
> > +
> > + memset(dsc, 0, sizeof(*dsc));
> > + dsc->CompressedBufferType = type;
> > + dsc->DataSize = size;
> > + dsc->NumMBsInBuffer = mb_count;
> > + }
> > + if (FAILED(IDirectXVideoDecoder_ReleaseBuffer(ctx->decoder, type))) {
> > + av_log(avctx, AV_LOG_DEBUG, "Failed to release buffer type %d\n", type);
> > + return -1;
> > + }
> > + if (dxva_size < size) {
> > + av_log(avctx, AV_LOG_DEBUG, "Buffer for type %d was too small\n", type);
> > + return -1;
> > + }
>
> This is the opposite of condition above, so use an else above (yes, I know why
> you did it like this, but I am sure this can be done nicer.
Then it becomes:
if (dxva_size < size) {
if (FAILED(IDirectXVideoDecoder_ReleaseBuffer(ctx->decoder, type)))
av_log(avctx, AV_LOG_DEBUG, "Failed to release buffer type %d\n", type);
av_log(avctx, AV_LOG_DEBUG, "Buffer for type %d was too small\n", type);
return -1;
}
memcpy(dxva_data, data, size);
memset(dsc, 0, sizeof(*dsc));
dsc->CompressedBufferType = type;
dsc->DataSize = size;
dsc->NumMBsInBuffer = mb_count;
if (FAILED(IDirectXVideoDecoder_ReleaseBuffer(ctx->decoder, type))) {
av_log(avctx, AV_LOG_DEBUG, "Failed to release buffer type %d\n", type);
return -1;
}
return 0;
or
if (dxva_size < size) {
av_log(avctx, AV_LOG_DEBUG, "Buffer for type %d was too small\n", type);
goto release;
}
memcpy(dxva_data, data, size);
memset(dsc, 0, sizeof(*dsc));
dsc->CompressedBufferType = type;
dsc->DataSize = size;
dsc->NumMBsInBuffer = mb_count;
release:
if (FAILED(IDirectXVideoDecoder_ReleaseBuffer(ctx->decoder, type))) {
av_log(avctx, AV_LOG_DEBUG, "Failed to release buffer type %d\n", type);
return -1;
}
return size <= dxva_size ? 0 : -1; // or -(dxva_size < size) for a non readable way
But I don't really find them nicer.
>
> > + pp->residual_colour_transform_flag=
> > + (h->sps.profile_idc >= 100 &&
> > + h->sps.chroma_format_idc == 3) ? h->sps.residual_color_transform_flag : 0;
>
> Any reason why the sps reading code shouldn't just apply this condition?
Well, the DXVA2 specs says that all non bitstream defined values must be set
to 0 (with a few exceptions). It is unrelated to what does h264.c (it could put
non 0 value if it simplify its job). (Referenced as [1].)
>
> > + if (h->sps.profile_idc >= 100) {
> > + pp->bit_depth_luma_minus8 = h->sps.bit_depth_luma - 8;
> > + pp->bit_depth_chroma_minus8 = h->sps.bit_depth_chroma - 8;
> > + } else {
> > + pp->bit_depth_luma_minus8 = 0;
> > + pp->bit_depth_chroma_minus8 = 0;
> > + }
> Again, why not in sps reading code (the condition, not the - 8)?
For that one (where a sensible value stored by h264.c would also match the
requirement of DXVA2), I would say that it doesn't seem the spirit of h264.c
to initialized non bitstream defined value with a valid one.
>
> > + for (i = 0; i < 6; i++)
> > + for (j = 0; j < 16; j++)
> > + qm->bScalingLists4x4[i][j] = h->pps.scaling_matrix4[i][zigzag_scan[j]];
> > +
> > + for (i = 0; i < 2; i++)
> > + for (j = 0; j < 64; j++)
> > + qm->bScalingLists8x8[i][j] = h->pps.scaling_matrix8[i][ff_zigzag_direct[j]];
>
> Swap the loop, and in the second case unroll it.
> I doubt speed matters, but then you can do the zigzag lookup in the outer loop
> which is more readable IMO.
for (j = 0; j < 16; j++) {
const int zj = zigzag_scan[j];
for (i = 0; i < 6; i++)
qm->bScalingLists4x4[i][j] = h->pps.scaling_matrix4[i][zj];
}
for (j = 0; j < 64; j++) {
const int zj = ff_zigzag_direct[j];
qm->bScalingLists8x8[i][j] = h->pps.scaling_matrix8[0][zj];
qm->bScalingLists8x8[i][j] = h->pps.scaling_matrix8[1][zj];
}
Like that ?
Not sure it helps/improves (and no, speed does not matter at all here).
>
> > + switch (h->slice_type) {
> > + case FF_P_TYPE: slice->slice_type = 0; break;
> > + case FF_B_TYPE: slice->slice_type = 1; break;
> > + case FF_I_TYPE: slice->slice_type = 2; break;
> > + case FF_SP_TYPE: slice->slice_type = 3; break;
> > + case FF_SI_TYPE: slice->slice_type = 4; break;
>
> There are no constants defined for those values?
They come from the specs (H264 one).
> > + for (list = 0; list < 2; list++) {
> > + unsigned i;
> > + for (i = 0; i < FF_ARRAY_ELEMS(slice->RefPicList[list]); i++) {
> > + if (list < h->list_count && i < h->ref_count[list]) {
> > + const Picture *r = &h->ref_list[list][i];
> > + unsigned plane;
> > + slice->RefPicList[list][i].Index7Bits = get_surface_index(ctx, r);
> > + slice->RefPicList[list][i].AssociatedFlag = r->reference == PICT_BOTTOM_FIELD;
> > + if (h->luma_weight_flag[list]) {
> > + slice->Weights[list][i][0][0] = h->luma_weight[list][i];
> > + slice->Weights[list][i][0][1] = h->luma_offset[list][i];
> > + } else {
> > + slice->Weights[list][i][0][0] = 1 << h->luma_log2_weight_denom;
> > + slice->Weights[list][i][0][1] = 0;
> > + }
> > + for (plane = 1; plane < 3; plane++) {
> > + int w, o;
> > + if (h->chroma_weight_flag[list]) {
> > + w = h->chroma_weight[list][i][plane-1];
> > + o = h->chroma_offset[list][i][plane-1];
> > + } else {
> > + w = 1 << h->chroma_log2_weight_denom;
> > + o = 0;
> > + }
> > + slice->Weights[list][i][plane][0] = w;
> > + slice->Weights[list][i][plane][1] = o;
> > + }
> > + } else {
> > + unsigned plane;
> > + slice->RefPicList[list][i].bPicEntry = 0xff;
> > + for (plane = 0; plane < 3; plane++) {
> > + slice->Weights[list][i][plane][0] = 0;
> > + slice->Weights[list][i][plane][1] = 0;
> > + }
> > + }
> > + }
> > + }
>
> Maybe it would be better to split this and the other one like it into
> two parts, one initialization looping from 0 to FF_ARRAY_ELEMS..
> and one setup lopping from 0 till h->ref_count
I don't like computing and storing values that will be overwritten
just after. It offuscates what is really needed. But if you want it,
I could do it.
> > + if (h->pps.redundant_pic_cnt_present)
> > + slice->redundant_pic_cnt = h->redundant_pic_count;
>
> Why/how can h->redundant_pic_count have the wrong value if h->pps.redundant_pic_cnt_present is 0?
Because then, h264.c does not set it.
> > + slice->cabac_init_idc = h->pps.cabac ? h->cabac_init_idc : 0;
>
> Same.
Same than [1]
> > + if (h->deblocking_filter < 2)
> > + slice->disable_deblocking_filter_idc = 1 - h->deblocking_filter;
>
> Does this intentionally cover values < 0 or wouldn't maybe !h->deblocking_filter be clearer?
It cannot be < 0 (deblocking_filter is 0, 1 or 2), and we want to
exchange the value 0 and 1 while keeping 2 (to revert the clever trick that
h264.c does).
> > +/* */
> > +static int start_frame(AVCodecContext *avctx,
> > + av_unused const uint8_t *buffer,
> > + av_unused uint32_t size)
>
> Not a helpful comment
Missed it, will remove.
>
> > + /* Create an annexe B bitstream buffer with only slice NAL and finalize slice */
>
> A e to much in annex
Thanks.
> > +#define OFFSET_OF(t,f) ((intptr_t)&((t*)NULL)->f)
> > + assert(OFFSET_OF(DXVA_Slice_H264_Short, BSNALunitDataLocation) ==
> > + OFFSET_OF(DXVA_Slice_H264_Long, BSNALunitDataLocation));
> > + assert(OFFSET_OF(DXVA_Slice_H264_Short, SliceBytesInBuffer) ==
> > + OFFSET_OF(DXVA_Slice_H264_Long, SliceBytesInBuffer));
> > +#undef OFFSET_OF
>
> There is offsetof which probably works at least as well as this hack which
> invokes undefined behaviour.
Why does it invoke undefined behaviour ?
(Btw, the same trick is used by the linux kernel header).
Anyway, I will use offsetof().
>
> > + if (i == ctx_pic->slice_count - 1)
> > + padding = 128 - ((¤t[start_code_size + size] - dxva_data) % 128);
>
> Why not & 127?
> If you use % I think a lot of people familiar with FFmpeg will think you need it for
> its different behaviour with negative values.
I use & and >> tricks only when speed matters.
I will use &.
> > + if (¤t[start_code_size + size] > &dxva_data[dxva_size]) {
>
> Pointlessly using & and [] instead of just adding pointers?
> Also, sure this does not invoke undefined behaviour (i.e. is
> current + start_code_size + size at most one after the last element of the allocated
> memory?)
I can do
if (current - dxva_data + start_code_size + size > dxva_size)
to be sure that no undefined behaviour happen.
> > + if (!FAILED(IDirectXVideoDecoder_ReleaseBuffer(ctx->decoder,
> > + DXVA2_BitStreamDateBufferType))) {
> > + DXVA2_DecodeBufferDesc *dsc = &buffer[buffer_count++];
> > + memset(dsc, 0, sizeof(*dsc));
> > + dsc->CompressedBufferType = DXVA2_BitStreamDateBufferType;
> > + dsc->DataSize = current - dxva_data;
> > + dsc->NumMBsInBuffer = mb_count;
> > + }
> > + else
> > + av_log(avctx, AV_LOG_DEBUG, "Failed to add BS\n");
>
> And I don't think it is a good idea to just silently swallow all those errors.
>
> > + memset(&exe, 0, sizeof(exe));
>
> exe IMO is bad name.
Why ? The structure type is DXVA2_DecodeExecuteParams. I could call it
cfg, but it don't think it would be better.
Regards,
--
fenrir
More information about the ffmpeg-devel
mailing list