[FFmpeg-devel] [PATCHv2] add signature filter for MPEG7 video signature

Wed Mar 30 22:57:47 CEST 2016

On Wednesday, 30 March 2016 15:29:27 CEST Michael Niedermayer wrote:
> On Wed, Mar 30, 2016 at 01:57:24PM +0200, Gerion Entrup wrote:
> > Attached improved version of patch.
> > 
> > Differences to last time:
> > - reduce the number of errors in the signature (the last patch included some int
> > foo = a/b). This version replaces these with rationals.
> > - implement binary output.
> > - fixes in configure, some typos
> > 
> > I have found the conformance test files [1]. Both the binary and the XML output
> > pass the conformance test but are not bit-exact. I wrote a Python script
> > to prove this (see attachment). I don't see why this happens. If someone wants
> > to help, the corresponding reference code is in the file
> > "ExtractionUtilities/VideoSignatureExtraction.cpp", beginning at line 1615,
> > which can be found here [2].
> > 
> > Then a few questions:
> > - The timebase of the test files is 90000. In the binary output there is
> > unfortunately only room for a 16-bit number, so this doesn't fit. Currently the code
> > simply crops the remaining bits. Is there a better solution (divide by some number, etc.)?
> > 
> > - I tried to use put_bits32 where possible, because I thought it is faster. Then
> > I saw that it internally uses put_bits as well. Does it have a performance impact to
> > replace it with put_bits(..., 8, ...) (this would simplify the code a lot)?
> > 
> > Gerion
> > 
> > [1] http://standards.iso.org/ittf/PubliclyAvailableStandards/c057047_ISO_IEC_15938-7_2003_Amd_6_2011_Conformance_Testing.zip
> > [2] http://standards.iso.org/ittf/PubliclyAvailableStandards/c056735_ISO_IEC_15938-6_2003_Amd_4_2011_Electronic_inserts.zip
> 
> >  Changelog                      |    1 
> >  configure                      |    1 
> >  doc/filters.texi               |   70 +++
> >  libavfilter/Makefile           |    1 
> >  libavfilter/allfilters.c       |    1 
> >  libavfilter/signature.h        |  574 ++++++++++++++++++++++++++++++
> >  libavfilter/signature_lookup.c |  527 +++++++++++++++++++++++++++
> >  libavfilter/version.h          |    4 
> >  libavfilter/vf_signature.c     |  774 +++++++++++++++++++++++++++++++++++++++++
> >  9 files changed, 1951 insertions(+), 2 deletions(-)
> > 18a73574782a4e5e576bed3857fd283a009ff532  0001-add-signature-filter-for-MPEG7-video-signature.patch
> > From c81db6a999694f01335ee0d88483f276f2d10d3f Mon Sep 17 00:00:00 2001
> > From: Gerion Entrup <gerion.entrup at flump.de>
> > Date: Sun, 20 Mar 2016 11:10:31 +0100
> > Subject: [PATCH] add signature filter for MPEG7 video signature
> > 
> > This filter does not implement all features of MPEG7. Missing features:
> > - compression of signature files
> > - work only on (cropped) parts of the video
> > ---
> >  Changelog                      |   1 +
> >  configure                      |   1 +
> >  doc/filters.texi               |  70 ++++
> >  libavfilter/Makefile           |   1 +
> >  libavfilter/allfilters.c       |   1 +
> >  libavfilter/signature.h        | 574 ++++++++++++++++++++++++++++++
> >  libavfilter/signature_lookup.c | 527 ++++++++++++++++++++++++++++
> >  libavfilter/version.h          |   4 +-
> >  libavfilter/vf_signature.c     | 774 +++++++++++++++++++++++++++++++++++++++++
> >  9 files changed, 1951 insertions(+), 2 deletions(-)
> >  create mode 100644 libavfilter/signature.h
> >  create mode 100644 libavfilter/signature_lookup.c
> >  create mode 100644 libavfilter/vf_signature.c
> > 
> > diff --git a/Changelog b/Changelog
> > index 1f57f5e..5b76607 100644
> > --- a/Changelog
> > +++ b/Changelog
> > @@ -12,6 +12,7 @@ version <next>:
> >  - ciescope filter
> >  - protocol blacklisting API
> >  - MediaCodec H264 decoding
> > +- MPEG-7 Video Signature filter
> >  
> >  
> >  version 3.0:
> [...]
> 
> > +typedef struct {
> > +    int x;
> > +    int y;
> > +} Point;
> > +
> > +typedef struct {
> > +    Point up;
> > +    Point to;
> > +} Block;
> 
> these are used for tables of small values, int which is 32bit
> would waste quite some space, can uint8_t be used too ?
Yes, all Point coordinates are < 32, so uint8_t is sufficient.
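
A rough sketch of the 8-bit variant, just to illustrate the size difference.
Point and Block are the names from the patch; example_block and
block_in_range() are hypothetical and only exist for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as in the patch, but with 8-bit coordinates. */
typedef struct {
    uint8_t x;
    uint8_t y;
} Point;

typedef struct {
    Point up;
    Point to;
} Block;

/* Hypothetical example entry, not taken from the real tables. */
static const Block example_block = { { 0, 0 }, { 31, 31 } };

/* Hypothetical sanity check for the "all coordinates are < 32" claim. */
static int block_in_range(const Block *b)
{
    assert(b);
    return b->up.x < 32 && b->up.y < 32 && b->to.x < 32 && b->to.y < 32;
}
```

On common ABIs a Block then takes 4 bytes instead of 16 with int members.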

> 
> 
> [...]
> > +/* bitcount[index] = amount of ones in (binary) index */
> > +static const int bitcount[256] =
> > +{
> > +  0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
> > +  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> > +  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> > +  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> > +  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
> > +  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> > +  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
> > +  4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
> > +};
> 
> av_popcount()
> that also does 4 bytes at a time
>  
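
As I understand the suggestion, the table lookup per byte would be replaced
by a population count over 32-bit words. A minimal standalone sketch:
popcount32() is only a portable stand-in for av_popcount() from
libavutil/common.h, and count_ones() is a hypothetical helper name, not a
function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for av_popcount(): number of set bits in x. */
static int popcount32(uint32_t x)
{
    x -= (x >> 1) & 0x55555555u;
    x  = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x  = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (int)((x * 0x01010101u) >> 24);
}

/* Count the set bits in a byte buffer, 4 bytes per step. */
static int count_ones(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    int ones = 0;

    assert(buf);
    for (; i + 4 <= len; i += 4) {
        uint32_t word;
        memcpy(&word, buf + i, 4); /* avoids unaligned loads */
        ones += popcount32(word);
    }
    for (; i < len; i++)           /* remaining 0..3 bytes */
        ones += popcount32(buf[i]);
    return ones;
}
```

Inside the filter the loop body would simply call av_popcount() on the
loaded word.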
> 
> [...]
> > +static int get_l1dist(AVFilterContext *ctx, SignatureContext *sc, uint8_t *first, uint8_t *second)
> > +{
> > +    unsigned int i;
> > +    int dist = 0;
> > +    int f,s;
> > +
> > +    for(i=0; i < SIGELEM_SIZE/5; i++){
> > +        if(first[i] != second[i]){
> > +            f = first[i];
> > +            s = second[i];
> > +            do {
> > +                dist += FFABS((f % 3) - (s % 3));
> > +                f/=3;
> > +                s/=3;
> > +            } while(f > 0 || s > 0);
> 
> division and modulo are slow
> if this is speed relevant then please use a LUT
The already existing LUT was meant for this. I improved the code so that it actually uses it.
BTW, is division by 2 optimized out, or is it better to use >> 1?
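
To illustrate the idea (this is a sketch, not necessarily the layout of the
LUT in the patch): each byte packs five ternary digits, so it takes one of
3^5 = 243 values, and the L1 distance for every pair of bytes can be
precomputed once. The names l1_lut, l1_lut_init(), l1_slow() and l1_dist()
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TERNARY_WORDS 243 /* 3^5 possible values per packed byte */

static uint8_t l1_lut[TERNARY_WORDS][TERNARY_WORDS];

/* Reference: the division/modulo loop from the patch for one byte pair. */
static int l1_slow(int f, int s)
{
    int dist = 0;
    do {
        dist += abs((f % 3) - (s % 3));
        f /= 3;
        s /= 3;
    } while (f > 0 || s > 0);
    return dist;
}

/* Fill the table once, e.g. at filter init. */
static void l1_lut_init(void)
{
    for (int f = 0; f < TERNARY_WORDS; f++)
        for (int s = 0; s < TERNARY_WORDS; s++)
            l1_lut[f][s] = (uint8_t)l1_slow(f, s);
}

/* L1 distance of two packed ternary vectors via table lookup only. */
static int l1_dist(const uint8_t *first, const uint8_t *second, int len)
{
    int dist = 0;
    for (int i = 0; i < len; i++) {
        assert(first[i] < TERNARY_WORDS && second[i] < TERNARY_WORDS);
        dist += l1_lut[first[i]][second[i]];
    }
    return dist;
}
```

The table costs 243 * 243 bytes (about 58 KiB); a smaller per-digit table
would trade memory for more lookups.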

> 
> 
> [...]
> 
> > +static int filter_frame(AVFilterLink *inlink, AVFrame *picref)
> > +{
> > +    AVFilterContext *ctx = inlink->dst;
> > +    SignatureContext *sic = ctx->priv;
> > +    StreamContext *sc = &(sic->streamcontexts[FF_INLINK_IDX(inlink)]);
> > +    FineSignature* fs;
> > +
> > +
> > +
> > +    unsigned int pot3[5] = { 3*3*3*3, 3*3*3, 3*3, 3, 1 };
> > +    /* indexes of words : 210,217,219,274,334  44,175,233,270,273  57,70,103,237,269  100,285,295,337,354  101,102,111,275,296
> > +    s2usw = sorted to unsorted wordvec: 44 is at index 5, 57 at index 10...
> > +    */
> > +    unsigned int wordvec[25] = {44,57,70,100,101,102,103,111,175,210,217,219,233,237,269,270,273,274,275,285,295,296,334,337,354};
> 
> > +    unsigned int s2usw[25]   = { 5,10,11, 15, 20, 21, 12, 22,  6,  0,  1,  2,  7, 13, 14,  8,  9,  3, 23, 16, 17, 24,  4, 18, 19};
> 
> static const uint8_t
> 
> 
> > +
> > +    uint8_t wordt2b[5] = { 0, 0, 0, 0, 0 }; /* word ternary to binary */
> > +    uint64_t intpic[32][32];
> > +    uint64_t rowcount;
> > +    uint8_t *p = picref->data[0];
> > +    int inti, intj;
> > +    int *intjlut;
> > +
> > +    double conflist[DIFFELEM_SIZE];
> > +    int f = 0, g = 0, w = 0;
> > +    int dh1 = 1, dh2 = 1, dw1 = 1, dw2 = 1, denum, a, b;
> > +    int i,j,k,ternary;
> > +    uint64_t blocksum;
> > +    int blocksize;
> > +    double th; /* threshold */
> > +    double sum;
> > +
> > +    /* initialize fs */
> > +    if(sc->curfinesig){
> > +        fs = av_mallocz(sizeof(FineSignature));
> > +        sc->curfinesig->next = fs;
> > +        fs->prev = sc->curfinesig;
> > +        sc->curfinesig = fs;
> > +    }else{
> > +        fs = sc->curfinesig = sc->finesiglist;
> > +        sc->curcoursesig1->first = fs;
> > +    }
> > +
> > +    fs->pts = picref->pts;
> > +    fs->index = sc->lastindex++;
> > +
> 
> > +    for (i=0; i<32; i++){
> > +        for(j=0; j<32; j++){
> > +            intpic[i][j]=0;
> > +        }
> > +    }
> 
> memset
> 
> 
> > +    intjlut = av_malloc(inlink->w * sizeof(int));
> > +    for (i=0; i < inlink->w; i++){
> > +        intjlut[i] = (i<<5)/inlink->w;
> > +    }
> > +
> > +    for (i=0; i < inlink->h; i++){
> > +        inti = (i<<5)/inlink->h;
> > +        for (j=0; j< inlink->w; j++){
> > +            intj = intjlut[j];
> > +            intpic[inti][intj] += p[j];
> > +        }
> > +        p += picref->linesize[0];
> > +    }
> > +    av_free(intjlut);
> > +
> > +    /* The following calculate a summed area table (intpic) and brings the numbers
> > +     * in intpic to to the same denuminator.
> > +     * So you only have to handle the numinator in the following sections.
> > +     */
> > +    dh1 = inlink->h/32;
> > +    if (inlink->h%32)
> > +        dh2 = dh1 + 1;
> > +    dw1 = inlink->w/32;
> > +    if (inlink->w%32)
> > +        dw2 = dw1 + 1;
> > +    denum = dh1 * dh2 * dw1 * dw2;
> > +
> > +    for (i=0; i<32; i++){
> > +        rowcount = 0;
> > +        a = 1;
> > +        if (dh2 > 1) {
> > +            a = ((inlink->h*(i+1))%32 == 0) ? (inlink->h*(i+1))/32 - 1 : (inlink->h*(i+1))/32;
> > +            a -= ((inlink->h*i)%32 == 0) ? (inlink->h*i)/32 - 1 : (inlink->h*i)/32;
> > +            a = (a == dh1)? dh2 : dh1;
> > +        }
> > +        for (j=0; j<32; j++){
> > +            b = 1;
> > +            if (dw2 > 1) {
> > +                b = ((inlink->w*(j+1))%32 == 0) ? (inlink->w*(j+1))/32 - 1 : (inlink->w*(j+1))/32;
> > +                b -= ((inlink->w*j)%32 == 0) ? (inlink->w*j)/32 - 1 : (inlink->w*j)/32;
> > +                b = (b == dw1)? dw2 : dw1;
> > +            }
> > +            rowcount += intpic[i][j] *= a * b;
> > +            if(i>0){
> > +                intpic[i][j] = intpic[i-1][j] + rowcount;
> > +            } else {
> > +                intpic[i][j] = rowcount;
> > +            }
> > +        }
> > +    }
> > +
> 
> > +    for (i=0; i< ELEMENT_COUNT; i++){
> > +        const ElemCat* elemcat = elements[i];
> > +        double* elemsignature = av_malloc(sizeof(double) * elemcat->elem_count);
> > +        double* sortsignature = av_malloc(sizeof(double) * elemcat->elem_count);
> 
> missing alloc failure checks
How do I handle this in export() and lookup_signatures()? Both are called from uninit,
which returns void, so returning AVERROR does not work.
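
For filter_frame() itself, which can return an error code, the check could
look roughly like the sketch below. It does not answer the uninit question.
alloc_work_buffers() is a hypothetical helper; plain libc is used to keep
the example standalone, whereas in the filter this would be
av_malloc_array()/av_free() with AVERROR(ENOMEM) as the return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Allocate both work buffers or neither; returns 0 or -ENOMEM. */
static int alloc_work_buffers(size_t elem_count,
                              double **elemsignature,
                              double **sortsignature)
{
    assert(elemsignature && sortsignature);

    *elemsignature = malloc(elem_count * sizeof(**elemsignature));
    *sortsignature = malloc(elem_count * sizeof(**sortsignature));
    if (!*elemsignature || !*sortsignature) {
        /* free(NULL) is a no-op, so this covers a partial failure too */
        free(*elemsignature);
        free(*sortsignature);
        *elemsignature = *sortsignature = NULL;
        return -ENOMEM;
    }
    return 0;
}
```

The caller checks the return value and passes a negative one straight on to
its own caller.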
> 
> 
> [...]
> > +static int request_frame(AVFilterLink *outlink)
> > +{
> > +    AVFilterContext *ctx = outlink->src;
> > +    SignatureContext *sc = ctx->priv;
> > +    int i, ret;
> > +
> 
> > +    for (i = 0; i < sc->nb_inputs; i++)
> > +        ret = ff_request_frame(ctx->inputs[i]);
> 
> ignoring the return code for all but the last call
I hope I fixed it.
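
One possible shape of such a fix, as a standalone sketch and not necessarily
what the attached patch does: stop at the first input that reports an error
and return that code. request_fn stands in for ff_request_frame(), and
request_all_inputs(), always_ok() and fail_on_second() are hypothetical
names used only here.

```c
#include <assert.h>

/* Stand-in for ff_request_frame(): returns 0 or a negative error code. */
typedef int (*request_fn)(int input_index);

/* Request a frame on every input; propagate the first failure. */
static int request_all_inputs(int nb_inputs, request_fn request)
{
    assert(request);
    for (int i = 0; i < nb_inputs; i++) {
        int ret = request(i);
        if (ret < 0)
            return ret; /* do not let a later call overwrite the error */
    }
    return 0;
}

/* Demo stubs: one that always succeeds, one that fails on input 1. */
static int always_ok(int input_index)
{
    (void)input_index;
    return 0;
}

static int fail_on_second(int input_index)
{
    return input_index == 1 ? -5 : 0;
}
```

Whether an EOF on one input should stop the remaining requests is a separate
question that this sketch does not address.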
> 
> [...]
> 

Improved patch attached.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-add-signature-filter-for-MPEG7-video-signature.patch
Type: text/x-patch
Size: 78449 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160330/4c5c93dc/attachment.bin>