[FFmpeg-devel] [PATCH v9 3/6] libavcodec/webp: add support for animated WebP

Tomas Härdin git at haerdin.se
Sun Dec 31 17:17:01 EET 2023


On Sun, 2023-12-31 at 15:54 +0100, Thilo Borgmann via ffmpeg-devel wrote:
> 
> On 31.12.23 at 13:56, Tomas Härdin wrote:
> > > +    for (int y = 0; y < height; y++) {
> > > +        const uint8_t *src1 = src1_data[0] + y * src1_linesize[0];
> > > +        const uint8_t *src2 = src2_data[0] + (y + pos_y) * src2_linesize[0] + pos_x * src2_step[0];
> > > +        uint8_t       *dest = dest_data[0] + (y + pos_y) * dest_linesize[0] + pos_x * sizeof(uint32_t);
> > > +        for (int x = 0; x < width; x++) {
> > > +            int src1_alpha = src1[0];
> > > +            int src2_alpha = src2[0];
> > > +
> > > +            if (src1_alpha == 255) {
> > > +                memcpy(dest, src1, sizeof(uint32_t));
> > > +            } else if (src1_alpha + src2_alpha == 0) {
> > > +                memset(dest, 0, sizeof(uint32_t));
> > > +            } else {
> > > +                int tmp_alpha = src2_alpha - ROUNDED_DIV(src1_alpha * src2_alpha, 255);
> > > +                int blend_alpha = src1_alpha + tmp_alpha;
> > > +
> > > +                dest[0] = blend_alpha;
> > > +                dest[1] = ROUNDED_DIV(src1[1] * src1_alpha + src2[1] * tmp_alpha, blend_alpha);
> > > +                dest[2] = ROUNDED_DIV(src1[2] * src1_alpha + src2[2] * tmp_alpha, blend_alpha);
> > > +                dest[3] = ROUNDED_DIV(src1[3] * src1_alpha + src2[3] * tmp_alpha, blend_alpha);
> > > +            }
> > 
> > Is branching and a bunch of function calls (which I hope get
> > optimized out) really faster than just always doing the blending?
> 
> If I trust my START_TIMER/STOP_TIMER interpretation, I'd say so:
> 
> With branches:
> 253315 UNITS in blend_alpha_yuva,     128 runs,      0 skips
> 
> Always blending:
> 351104 UNITS in blend_alpha_yuva,     128 runs,      0 skips

Alright. I'm still curious whether it can be sped up by checking
multiple pixels at a time, but that can be done later.
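
As a rough illustration of what "checking multiple pixels at a time"
could look like, here is a minimal sketch, not taken from the patch. It
assumes packed 4-byte pixels with alpha in byte 0, as in the quoted
loop. The names blend_pixel() and blend_row() are hypothetical, and
ROUNDED_DIV is redefined locally as a simplified stand-in for the FFmpeg
macro (non-negative operands only). Groups of four pixels whose source
alphas are all 255 are copied with one memcpy(); everything else falls
back to the per-pixel path.

```c
#include <stdint.h>
#include <string.h>

/* Simplified local stand-in, valid for non-negative operands only. */
#define ROUNDED_DIV(a, b) (((a) + ((b) >> 1)) / (b))

/* Per-pixel blend, same logic as the quoted loop body (alpha in byte 0). */
static void blend_pixel(uint8_t *dest, const uint8_t *src1, const uint8_t *src2)
{
    int src1_alpha = src1[0];
    int src2_alpha = src2[0];

    if (src1_alpha == 255) {
        memcpy(dest, src1, 4);
    } else if (src1_alpha + src2_alpha == 0) {
        memset(dest, 0, 4);
    } else {
        int tmp_alpha   = src2_alpha - ROUNDED_DIV(src1_alpha * src2_alpha, 255);
        int blend_alpha = src1_alpha + tmp_alpha;

        dest[0] = blend_alpha;
        for (int c = 1; c < 4; c++)
            dest[c] = ROUNDED_DIV(src1[c] * src1_alpha + src2[c] * tmp_alpha,
                                  blend_alpha);
    }
}

/* Hypothetical row helper: test four source alphas with a single branch. */
static void blend_row(uint8_t *dest, const uint8_t *src1,
                      const uint8_t *src2, int width)
{
    int x = 0;

    for (; x + 4 <= width; x += 4) {
        const uint8_t *s1 = src1 + 4 * x;
        /* The AND of four bytes is 255 only if every one of them is 255. */
        uint8_t all_alpha = s1[0] & s1[4] & s1[8] & s1[12];

        if (all_alpha == 255) {
            memcpy(dest + 4 * x, s1, 16);           /* four opaque pixels */
        } else {
            for (int i = 0; i < 4; i++)
                blend_pixel(dest + 4 * (x + i), s1 + 4 * i, src2 + 4 * (x + i));
        }
    }
    for (; x < width; x++)                          /* leftover pixels */
        blend_pixel(dest + 4 * x, src1 + 4 * x, src2 + 4 * x);
}
```

Whether this is actually faster would have to be measured, e.g. with
the same START_TIMER/STOP_TIMER setup. It should mainly help for frames
with long opaque runs.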

> > > +static int blend_frame_into_canvas(WebPContext *s)
> > > +{
> > > +    AVFrame *canvas = s->canvas_frame.f;
> > > +    AVFrame *frame  = s->frame;
> > > +    int width, height;
> > > +    int pos_x, pos_y;
> > > +
> > > +    if ((s->anmf_flags & ANMF_BLENDING_METHOD) == ANMF_BLENDING_METHOD_OVERWRITE
> > > +        || frame->format == AV_PIX_FMT_YUV420P) {
> > > +        // do not blend, overwrite
> > > +
> > > +        if (canvas->format == AV_PIX_FMT_ARGB) {
> > > +            width  = s->width;
> > > +            height = s->height;
> > > +            pos_x  = s->pos_x;
> > > +            pos_y  = s->pos_y;
> > > +
> > > +            for (int y = 0; y < height; y++) {
> > > +                const uint32_t *src = (uint32_t *) (frame->data[0] + y * frame->linesize[0]);
> > > +                uint32_t *dst = (uint32_t *) (canvas->data[0] + (y + pos_y) * canvas->linesize[0]) + pos_x;
> > > +                memcpy(dst, src, width * sizeof(uint32_t));
> > > +            }
> > 
> > This could be reduced to a single memcpy() when linesizes are equal.
> > Same for the other memcpy()s
> 
> It's a subimage copied into a canvas (see pos_x and pos_y), so it
> has to be copied line by line.

Ah, I missed that
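
For reference, a minimal sketch of why the offset forces a per-row
copy. It is not the patch code: copy_subimage() and BYTES_PER_PIXEL are
hypothetical names, and 4-byte packed pixels are assumed. Each
destination row starts pos_x pixels into a canvas row, so consecutive
destination rows are not contiguous in memory even when both linesizes
happen to match.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_PIXEL 4   /* assumption: packed 32-bit pixels */

/* Hypothetical helper: copy a width x height frame into a larger canvas
 * at pixel offset (pos_x, pos_y). Linesizes are in bytes. */
static void copy_subimage(uint8_t *canvas, ptrdiff_t canvas_linesize,
                          const uint8_t *frame, ptrdiff_t frame_linesize,
                          int pos_x, int pos_y, int width, int height)
{
    for (int y = 0; y < height; y++) {
        const uint8_t *src = frame + y * frame_linesize;
        /* The destination row begins pos_x pixels into the canvas row. */
        uint8_t *dst = canvas + (y + pos_y) * canvas_linesize
                              + (ptrdiff_t)pos_x * BYTES_PER_PIXEL;

        memcpy(dst, src, (size_t)width * BYTES_PER_PIXEL);
    }
}
```

A single memcpy() would only be valid in the special case pos_x == 0
with the frame as wide as the canvas and identical linesizes.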

/Tomas

