[FFmpeg-devel] [PATCH] h264 parallelized, (was: Parallelized h264 proof-of-concept)
Michael Niedermayer
michaelni
Sat Jun 16 18:14:07 CEST 2007
Hi
On Fri, Jun 15, 2007 at 10:10:54PM +0200, Andreas Öman wrote:
> Andreas Öman wrote:
> >Hi
> >
> >Michael Niedermayer wrote:
> >>Hi
> >>
> >>
> >>av_free() + av_malloc() or pass an argument to MPV_common_init()
> >
> >I'll extend MPV_common_init() with an additional argument then.
> >
> >>> static void filter_mb_fast( H264Context *h, int mb_x, int mb_y, uint8_t
> >>> *img_y, uint8_t *img_cb, uint8_t *img_cr, unsigned int linesize,
> >>> unsigned int uvlinesize);
> >>>+static void execute_decode_slices(H264Context *h, int reset);
> >>can't you order the new functions so as to avoid that?
> >
> >Hm, not really. decode_slice_header() needs to be able to
> >fire off any pending slices in case the deblocking-type changes
> >within a frame. (AFAIK this is valid according to the specs,
> >perhaps I'm wrong?)
> >
>
> Okay, here are the finalized patches.
>
> #1 - Extend MPV_common_init() with an additional arg for context size
> when doing multi threading.
hmm, seeing the patch, i think i would prefer some simpler solution,
maybe adding the size to MpegEncContext? or even better adding
thread_context[] to H264Context, this would also avoid the casts to
H264Context
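roughly what i mean, as an untested sketch: the struct names, the field layout and SKETCH_MAX_THREADS are made up for illustration and are not the real MpegEncContext/H264Context definitions. the point is only that a typed array in the h264 context hands out H264Context pointers without any cast.

```c
#include <stddef.h>

#define SKETCH_MAX_THREADS 16   /* hypothetical limit, not from the patch */

/* Stand-in for MpegEncContext; the real struct is far larger. */
typedef struct SketchMpegEncContext {
    int mb_x, mb_y;
} SketchMpegEncContext;

/* Stand-in for H264Context. */
typedef struct SketchH264Context {
    SketchMpegEncContext s;   /* shared MPEG state */
    int max_contexts;         /* number of valid thread_context[] entries */
    /* Typed per-thread contexts: callers get the h264 type directly. */
    struct SketchH264Context *thread_context[SKETCH_MAX_THREADS];
} SketchH264Context;

/* No cast needed; returns NULL for an out-of-range index. */
static SketchH264Context *sketch_get_context(SketchH264Context *h, int i)
{
    if (i < 0 || i >= h->max_contexts)
        return NULL;
    return h->thread_context[i];
}
```

with this layout MPV_common_init() would not need to know the size of the codec specific context at all.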
>
> #2 - Factor out init_scan_tables()
looks ok (and can be applied)
>
> #3 - Decouple bit context from h264 context in decode_ref_pic_marking()
looks ok (and can be applied)
>
> #4 - Slice level parallelism for deblocking type 0 and 2
>
> regression tests passes
regression tests ? there's no h.264 regression test ...
i assume you mean you tested this on several h.264 streams and the
output is binary identical ...
[...]
> @@ -3022,8 +3067,18 @@
> MpegEncContext * const s = &h->s;
> int temp8, i;
> uint64_t temp64;
> - int deblock_left = (s->mb_x > 0);
> - int deblock_top = (s->mb_y > 0);
> + int deblock_left;
> + int deblock_top;
> + int mb_xy;
> +
> + if(h->deblocking_filter == 2) {
> + mb_xy = s->mb_x + s->mb_y*s->mb_stride;
> + deblock_left = h->slice_table[mb_xy] == h->slice_table[mb_xy - 1];
> + deblock_top = h->slice_table[mb_xy] == h->slice_table[h->top_mb_xy];
> + } else {
> + deblock_left = (s->mb_x > 0);
> + deblock_top = (s->mb_y > 0);
> + }
is this multithreading specific? or a deblocking_filter == 2 fix? in the latter
case it should be in a separate patch
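for reference, my reading of what the hunk computes, as a standalone sketch: with deblocking_filter == 2 an edge is filtered only if the neighbouring macroblock belongs to the same slice. the function name and the unpadded slice table are assumptions of the sketch. unlike the hunk it also keeps the picture border checks in mode 2, because it has no padding entries to read from.

```c
/* Decide whether the left/top macroblock edges may be deblocked.
 * slice_table holds one slice number per macroblock, row-major,
 * mb_width entries per row (no padding; an assumption of this sketch). */
static void sketch_deblock_edges(const int *slice_table, int mb_width,
                                 int mb_x, int mb_y, int filter_mode,
                                 int *deblock_left, int *deblock_top)
{
    int mb_xy = mb_x + mb_y * mb_width;

    /* At the picture border there is no neighbour to filter against. */
    *deblock_left = mb_x > 0;
    *deblock_top  = mb_y > 0;

    if (filter_mode == 2) {
        /* Mode 2: additionally skip edges that cross a slice boundary. */
        if (*deblock_left)
            *deblock_left = slice_table[mb_xy] == slice_table[mb_xy - 1];
        if (*deblock_top)
            *deblock_top = slice_table[mb_xy] == slice_table[mb_xy - mb_width];
    }
}
```

nothing in this depends on threads, which is why the question of splitting it out comes up.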
[...]
> if(!FRAME_MBAFF){
> int qp_thresh = 15 - h->slice_alpha_c0_offset - FFMAX(0, h->pps.chroma_qp_index_offset);
> int qp = s->current_picture.qscale_table[mb_xy];
> - if(qp <= qp_thresh
> - && (mb_x == 0 || ((qp + s->current_picture.qscale_table[mb_xy-1] + 1)>>1) <= qp_thresh)
> - && (mb_y == 0 || ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) <= qp_thresh)){
> - return;
> +
> + if(qp <= qp_thresh) {
> + if(h->deblocking_filter == 1) {
> + if((mb_x == 0 || ((qp + s->current_picture.qscale_table[mb_xy-1] + 1)>>1) <= qp_thresh) &&
> + (mb_y == 0 || ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) <= qp_thresh))
> + return;
> + } else {
> + int mb_slice = h->slice_table[mb_xy];
> + int left_qp, top_qp;
> +
> + left_qp = h->slice_table[mb_xy - 1] == mb_slice ? ((qp + s->current_picture.qscale_table[mb_xy-1] + 1)>>1) : 0;
> + top_qp = h->slice_table[h->top_mb_xy] == mb_slice ? ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) : 0;
> + if(left_qp <= qp_thresh &&
> + top_qp <= qp_thresh)
> + return;
> + }
> }
> }
is this specific to threads?
[...]
> +
> + h->max_contexts = avctx->thread_count > 0 ? avctx->thread_count : 1;
i think thread_count must be >0
[...]
> + hx = (H264Context *)s->thread_context[h->current_context] ? (H264Context *)s->thread_context[h->current_context] : h;
what about s->thread_context[0] == h, i think that would avoid this check?
or does that cause other parts of the code to become more complex?
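a minimal sketch of the idea, with made-up names (SketchCtx, sketch_init_contexts, sketch_current are not from the patch): if slot 0 always points at the master context, the lookup needs no NULL fallback. the assert also encodes the assumption that thread_count is at least 1.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CONTEXTS 16   /* hypothetical limit */

typedef struct SketchCtx {
    int current_context;
    int max_contexts;
    struct SketchCtx *thread_context[SKETCH_MAX_CONTEXTS];
} SketchCtx;

/* workers must provide at least thread_count - 1 entries. */
static void sketch_init_contexts(SketchCtx *h, SketchCtx *workers,
                                 int thread_count)
{
    int i;

    assert(thread_count > 0 && thread_count <= SKETCH_MAX_CONTEXTS);
    h->max_contexts     = thread_count;
    h->current_context  = 0;
    h->thread_context[0] = h;            /* master doubles as context 0 */
    for (i = 1; i < thread_count; i++)
        h->thread_context[i] = &workers[i - 1];
}

/* Single-threaded and multi-threaded paths share this lookup. */
static SketchCtx *sketch_current(SketchCtx *h)
{
    return h->thread_context[h->current_context];
}
```

whether this makes the cleanup code more complex (the master must not be freed through slot 0) is the open part of the question.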
[...]
> case NAL_DPA:
> - init_get_bits(&s->gb, ptr, bit_length);
> - h->intra_gb_ptr=
> - h->inter_gb_ptr= NULL;
> - s->data_partitioning = 1;
> + init_get_bits(&hx->s.gb, ptr, bit_length);
> + hx->intra_gb_ptr=
> + hx->inter_gb_ptr= NULL;
> + hx->s.data_partitioning = 1;
>
> - if(decode_slice_header(h) < 0){
> + if(decode_slice_header(hx, h) < 0){
> av_log(h->s.avctx, AV_LOG_ERROR, "decode_slice_header error\n");
> }
indentation is wrong here
[...]
> Index: libavcodec/h264.h
> ===================================================================
> --- libavcodec/h264.h (revision 9281)
> +++ libavcodec/h264.h (working copy)
> @@ -381,6 +381,16 @@
> const uint8_t *field_scan8x8_cavlc_q0;
>
> int x264_build;
> +
> + /* Slice-based multi threading members.
> + * These are only used in the "master" context */
doxygen supports comments on groups of variables, this should be used
here
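something along these lines, as a sketch: the struct is a cut-down stand-in and only current_context and max_contexts are taken from the hunks above. the @name command plus the /**@{*/ ... /**@}*/ markers form a doxygen member group, so the description applies to every member between the markers.

```c
/* Cut-down stand-in for the tail of H264Context. */
typedef struct SketchDocContext {
    int x264_build;

    /**
     * @name Members for slice based multithreading
     * These are only used in the "master" context.
     */
    /**@{*/
    int current_context;   /**< index of the context in use */
    int max_contexts;      /**< number of contexts available */
    /**@}*/
} SketchDocContext;
```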
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even know they had done so. -- Xenophanes