[FFmpeg-devel] [PATCH] h264 parallelized

Mon Sep 3 10:32:43 CEST 2007

Hello,

Michael Niedermayer wrote:
> Hi
> 

>> +    if(h->deblocking_filter == 1 && h0->max_contexts > 1) {
>> +        h0->max_contexts = 1;
>> +        if(h != h0)
>> +            execute_decode_slices(h, 0); // deblocking switched inside frame, process pending slices
>> +
>> +        if(!h0->single_decode_warning) {
>> +            av_log(s->avctx, AV_LOG_INFO, "Cannot parallelize deblocking type 1, decoding such frames in sequential order\n");
>> +            h0->single_decode_warning = 1;
>> +        }
>> +    }
> 
> have you tested the code above?

The code that forces sequential decoding:
Yes, md5 matches with unmodified svn.

The case where deblocking switches inside of a frame:
No, i don't have such content available.

Do you think there is a bug lurking here?

> [...]
>> +    if(h->current_context == 1) {
>> +        decode_slice(avctx, h);
>> +    } else {
>> +        for(i = 1; i < h->current_context; i++) {
>> +            hx = h->thread_context[i];
> 
>> +            hx->s.error_resilience = 1;
> 
> why?

Otherwise ff_er_add_slice() wont decrease error_count in the
slave contexts.

But, i guess it should replaced with

hx->s.error_resilience = avctx->error_resilience;

instead, right?

>> +            hx->s.error_count = 0;
>> +        }
>> +
>> +        avctx->execute(avctx, (void *)decode_slice, (void **)h->thread_context + h->first_context,
>> +                       NULL, h->current_context - h->first_context);
>> +
>> +        /* pull back stuff from slices to master context */
>> +        hx = h->thread_context[h->current_context - 1];
>> +        s->mb_x = hx->s.mb_x;
>> +        s->mb_y = hx->s.mb_y;
>> +        for(i = 1; i < h->current_context; i++)
>> +            h->s.error_count += h->thread_context[i]->s.error_count;
>> +    }
>> +    if(reset) {
>> +        h->current_context = 0;
>> +        h->first_context = 0;
>> +    } else {
>> +        /* if currently decoding a slice header / NAL unit, we cannot just set h->current_context to 0.
> 
> why not? this increases complexity alot ...
> wouldnt copying the current context to 0 work too?

Yes, this might be feasible, lemme check that ..

> 
> 
> [...]
>>          case NAL_IDR_SLICE:
>> +            if(h != hx && (h->long_ref_count || h->short_ref_count))
>> +                execute_decode_slices(h, 0); // idr() below will reap frames, execute all we've done so far
>>              idr(h); //FIXME ensure we don't loose some frames if there is reordering
> 
> why is this needed?

The initial thought behind it was:

* idr() destroys pictures in h->long_ref and h->short_ref.
* These may already have been copied with clone_slice()

However, that would only be the case if a picture contains
mixed idr and non-idr slices. And that is not valid, right?

Perhaps it's better to add a check for this, and bail out?