[FFmpeg-devel] [OPW] OPW Project Proposal

Michael Niedermayer michael at niedermayer.cc
Fri Oct 28 02:08:24 EEST 2016


On Thu, Oct 27, 2016 at 11:38:27PM +0530, Pallavi Kumari wrote:
> Hi Michael,
> 
> I have attached a patch with the mail.
> 
> With `avcodec_get_frame_defaults (&frame)` in the function readAudio
> program was working fine with old version in my system. But this function
> is not in the latest git repo so I used `av_frame_unref(&frame)` instead
> and the program segfaults. I am not sure why or how to fix this.

theres "AVFrame frame;"

thats bad, as it depends on sizef(AVFrame) to allocat frame on the
stack.
you should always allocate/clone/... one if one is not already
available, each of these would be setup correctly and not need
avcodec_get_frame_defaults()


> 
> Also, I haven't properly understood the input and output flow of
> libavfilters (how to write input and output links? and overall flow). Could
> you explain it to me?

see existing filters, af_*.c
maybe libavfilter/af_astats.c has a similar structure to the filter
here

theres also
doc/filter_design.txt


> 
> 
> 
> On Thu, Oct 27, 2016 at 3:24 AM, Michael Niedermayer <michael at niedermayer.cc
> > wrote:
> 
> > On Thu, Oct 27, 2016 at 12:40:53AM +0530, Pallavi Kumari wrote:
> > > I mean deciding a timeline for the opw project. Which is to be mentioned
> > in
> > > application
> >
> > its your application, you can choose whatever timeline you feel makes
> > sense.
> > As far as iam concered whats important is to finish the qualification
> > task well before we have to choose the applicant(s) we accept this
> > year.
> >
> > The timeline for the main project in dec-mar can be segmented as you
> > like but leave plenty of time toward the end, problems always occur
> > and things get delayed.
> > Also both for qualification and the main project submit code early
> > if you have questions ask early,
> > if you dont get satisfactory reply from me on something ask again and
> > louder/clearer ... i do sometimes also forget to reply until i then
> > remember again days later ...
> >
> > also i think the architecture wasnt discussed much yet
> > do you have some plans about it ?
> >
> > I mean for example there could be a avfilter that passes audio through
> > and adds fingerint(s) as metadata into thf AVFrame
> > (as with avpriv_frame_get_metadatap in vf_idet.c) and also print the
> > fingerprints via av_log()
> >
> > a 2nd filter could then look these fingerprints from the metadata
> > up in some file/database
> >
> > a 3rd one could add the fingerprints to the file/database
> >
> > or these also could all be in one filter of course
> >
> > this can be extended in the future to for example have a filter
> > delay the audio until lookup succeeds and then once identified
> > lookup lyrics online and add them or such (this of course would be a
> > future thing after outreachy/opw)
> >
> > but maybe you have a different plan on how the components would
> > interact ?
> >
> > thx
> >
> > >
> > > On Wed, Oct 26, 2016 at 6:01 PM, Michael Niedermayer
> > <michael at niedermayer.cc
> > > > wrote:
> > >
> > > > Hi
> > > >
> > > > On Mon, Oct 17, 2016 at 02:06:26PM +0530, Pallavi Kumari wrote:
> > > > > Hi Michael,
> > > > >
> > > > > I figured out the use of fft. Help me with the time line setting.
> > Thanks
> > > >
> > > > I dont understand the question,
> > > > if its about AVFILTER_FLAG_SUPPORT_TIMELINE*, please ignore this for
> > > > now, its not needed
> > > >
> > > > [...]
> > > > --
> > > > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC7
> > 87040B0FAB
> > > >
> > > > Rewriting code that is poorly written but fully understood is good.
> > > > Rewriting code that one doesnt understand is a sign that one is less
> > smart
> > > > then the original author, trying to rewrite it will not make it better.
> > > >
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel at ffmpeg.org
> > > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > --
> > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > He who knows, does not speak. He who speaks, does not know. -- Lao Tsu
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> >

>  Makefile        |    1 
>  af_peakpoints.c |  263 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  allfilters.c    |    1 
>  version.h       |    2 
>  4 files changed, 266 insertions(+), 1 deletion(-)
> 5cc989fc1ba9466043f833059ee82efcaa3cf898  0001-avfilter-add-peakpoints-filter.patch
> From 60429fb7d83d11d6dd733c3477950a940bd0d84a Mon Sep 17 00:00:00 2001
> From: Atana <atana at openmailbox.org>
> Date: Mon, 24 Oct 2016 17:16:09 +0530
> Subject: [PATCH] avfilter: add peakpoints filter
> 
> ---
>  libavfilter/Makefile        |   1 +
>  libavfilter/af_peakpoints.c | 263 ++++++++++++++++++++++++++++++++++++++++++++
>  libavfilter/allfilters.c    |   1 +
>  libavfilter/version.h       |   2 +-
>  4 files changed, 266 insertions(+), 1 deletion(-)
>  create mode 100644 libavfilter/af_peakpoints.c
> 
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 7ed4696..1a18902 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -96,6 +96,7 @@ OBJS-$(CONFIG_LADSPA_FILTER)                 += af_ladspa.o
>  OBJS-$(CONFIG_LOUDNORM_FILTER)               += af_loudnorm.o
>  OBJS-$(CONFIG_LOWPASS_FILTER)                += af_biquads.o
>  OBJS-$(CONFIG_PAN_FILTER)                    += af_pan.o
> +OBJS-$(CONFIG_PEAKPOINTS_FILTER)             += af_peakpoints.o
>  OBJS-$(CONFIG_REPLAYGAIN_FILTER)             += af_replaygain.o
>  OBJS-$(CONFIG_RESAMPLE_FILTER)               += af_resample.o
>  OBJS-$(CONFIG_RUBBERBAND_FILTER)             += af_rubberband.o
> diff --git a/libavfilter/af_peakpoints.c b/libavfilter/af_peakpoints.c
> new file mode 100644
> index 0000000..84b51da
> --- /dev/null
> +++ b/libavfilter/af_peakpoints.c
> @@ -0,0 +1,263 @@
> +/*
> + * Copyright (c) 2016 Pallavi Kumari
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavcodec/avcodec.h"
> +#include "libavcodec/avfft.h"
> +#include "libavformat/avformat.h"
> +#include "libswscale/swscale.h"
> +#include "avfilter.h"
> +#include "libavutil/opt.h"
> +
> +
> +/* Structure to contain peak points context */
> +typedef struct {
> +    const AVClass *class;
> +    double *data;
> +    int nsamples;
> +    double *peaks;
> +    int size; // number of peaks
> +    int windowSize;
> +    char *inputFile;
> +} PeakPointsContext;
> +
> +/* returns maximum value from an array conditioned on start and end index */
> +static double getMax(double *res_arr, int startIndex, int endIndex) {
> +    int i;
> +    double max = res_arr[startIndex];
> +    for (i = startIndex; i <= endIndex; i++) {
> +	    if (res_arr[i] > max) {
> +	        max = res_arr[i];
> +	    }
> +    }
> +    return max;
> +}
> +

> +/* Stores peak frequency for each window(of chunkSize) in peaks array */
> +static void getPeakPointInChunk(int chunkSize, double *res_arr, int size, double *peaks) {
> +    int endIndex, i = 0, peakIndex = 0;
> +    int startIndex = 0;
> +    double max;
> +    // get a chunk and find max value in it
> +    while (i < size) {

> +	    if (i % chunkSize-1 == 0) {

% is slow (compared to &)
also doing a i++ loop instead of i+= chunkSize-1 is slow too
does lots of unneeed operations, which the compiler might or might
not optimize out


> +            max = getMax(res_arr, startIndex, i);
> +	        peaks[peakIndex++] = max;
> +	        startIndex = startIndex + chunkSize;
> +	    }

tabs are also forbidden in ffmpeg git


> +        i += 1;
> +    }
> +}
> +

> +/* read audio stream from input file */
> +static void readAudio (int *sampleRate, PeakPointsContext *ppc) {   
> +    // Initialize FFmpeg
> +    av_register_all ();
> +
> +    AVFormatContext *formatContext = NULL;
> +    AVCodecContext *codecContext = NULL;
> +    AVCodec *codec = NULL;
> +    AVPacket packet;
> +    AVFrame frame;
> +
> +    ssize_t nSamples = 0;
> +
> +    // open the file and autodetect the format
> +    if (avformat_open_input (&formatContext, ppc->inputFile, NULL, NULL) < 0)
> +        printf("Can not open input file %s", ppc->inputFile);
> +
> +    if (avformat_find_stream_info (formatContext, NULL) < 0)
> +        printf("Can not find stream information");
> +
> +    // find audio stream (between video/audio/subtitles/.. streams)
> +        int audioStreamId = av_find_best_stream (formatContext, AVMEDIA_TYPE_AUDIO,
> +                                            -1, -1, &codec, 0);
> +    if (audioStreamId < 0)
> +        printf("Can not find audio stream in the input file");
> +
> +    codecContext = formatContext->streams[audioStreamId]->codec;
> +
> +    // init the audio decoder
> +    if (avcodec_open2 (codecContext, codec, NULL) < 0)
> +        printf("Can not open audio decoder");
> +
> +    // read all packets
> +    int i = 0;
> +    while (av_read_frame (formatContext, &packet) == 0)  {

A libavfilter would generally analyze its audio input stream instead
of opening a audio file on its own



> +        if (packet.stream_index == audioStreamId)  {
> +            //avcodec_get_frame_defaults (&frame);
> +            av_frame_unref (&frame);
> +            int gotFrame = 0;
> +            if (avcodec_decode_audio4 (codecContext,&frame,&gotFrame,&packet) < 0){
> +                printf("Error decoding audio");
> +            }
> +            if (gotFrame)  {
> +                // audio frame has been decoded
> +                int size = av_samples_get_buffer_size (NULL,
> +                                                   codecContext->channels,
> +                                                   frame.nb_samples,
> +                                                   codecContext->sample_fmt,
> +                                                   1);
> +                if (size < 0)  {
> +                    printf("av_samples_get_buffer_size invalid value");
> +                }
> +
> +	            // printf("%d\n", *frame.data[0]);
> +                ppc->data[i] = (double)*frame.data[0];
> +                i += 1;
> +                nSamples += frame.nb_samples;
> +            }
> +        }
> +        av_free_packet (&packet);
> +    }
> +    ppc->nsamples = i;
> +
> +    if (nSamples < 1)
> +        printf("Decoded audio data is empty");
> +
> +    int sampleSize = av_get_bytes_per_sample (codecContext->sample_fmt);
> +    if (sampleSize < 1)
> +        printf("Invalid sample format");
> +
> +    // optional, return value with sample rate
> +    if (sampleRate) *sampleRate = codecContext->sample_rate;
> +
> +    if (codecContext) avcodec_close (codecContext);
> +
> +    avformat_close_input (&formatContext);
> +}
> +
> +/* Get peaks points from windowed frequency domain data*/
> +static void getPeakPoints(PeakPointsContext *ppc) {
> +    int i, k, size, chunkSize, chunkSampleSize, resSize, sample_rate;
> +    double *fft_res;
> +    int m;
> +
> +    ppc->data = malloc(sizeof(double)*10000);
> +    readAudio (&sample_rate, ppc);
> +
> +    size = ppc->nsamples;
> +    m = log2(ppc->windowSize);
> +    chunkSize = ppc->windowSize;
> +    chunkSampleSize = size/chunkSize;
> +    resSize = chunkSize * chunkSampleSize;
> +    fft_res = malloc(sizeof(double) * resSize);
> +
> +    RDFTContext *rdftC = av_rdft_init(m, DFT_R2C);
> +    FFTSample *data;

> +    data = malloc(sizeof(FFTSample)*chunkSize);
> +    // FFT transform for windowed time domain data
> +    // window is of size chunkSize
> +    k = 0;
> +    while (k < resSize) {
> +        //copy data
> +        for (i = 0; i < chunkSize; i++) {
> +            data[i] = ppc->data[i+k];

missing malloc failure check


> +        }
> +        //calculate FFT
> +        av_rdft_calc(rdftC, data);
> +        for (i = 0; i < chunkSize; i++) {
> +	    fft_res[i+k] = data[i];
> +        }
> +        k = k + chunkSize;
> +    }
> +
> +    av_rdft_end(rdftC);
> +
> +    int pSize = resSize/chunkSize;
> +    ppc->size = pSize;
> +    ppc->peaks = malloc(sizeof(double)*pSize);
> +    getPeakPointInChunk(chunkSize, fft_res, resSize, ppc->peaks);
> +}
> +
> +
> +#define OFFSET(x) offsetof(PeakPointsContext, x)
> +
> +static const AVOption peakpoints_options[] = {
> +    { "input",  "set input audio file", OFFSET(inputFile), AV_OPT_TYPE_STRING, {.str="sample.mp3"}, CHAR_MIN, CHAR_MAX },
> +    { "wsize",  "set window size", OFFSET(windowSize),  AV_OPT_TYPE_INT,    {.i64=16},    0, INT_MAX},
> +    { NULL },
> +};
> +
> +static const char *peakpoints_get_name(void *ctx) {
> +    return "peakpoints";
> +}
> +
> +static const AVClass peakpoints_class = {
> +    .class_name = "PeakPointsContext",
> +    .item_name  = peakpoints_get_name,
> +    .option     = peakpoints_options,
> +};
> +
> +
> +AVFILTER_DEFINE_CLASS(peakpoints);
> +
> +static av_cold int init(AVFilterContext *ctx)
> +{
> +    PeakPointsContext *p = ctx->priv;
> +
> +    if (!(p->inputFile)) {
> +        av_log(ctx, AV_LOG_ERROR, "Input audio file must be passed\n");
> +        return AVERROR(EINVAL);
> +    }
> +
> +    if (p->windowSize < 16) {
> +	     av_log(ctx, AV_LOG_ERROR, "window size must be greater than or equal to 16");
> +        return AVERROR(EINVAL);
> +    }
> +    //printf("Getting peak points..");
> +    // get peaks
> +    getPeakPoints(p);
> +    //printf("Peak points collected.");
> +    // print peaks
> +    int i;
> +    for (i = 0; i < p->size; i++) {
> +        av_log(ctx, AV_LOG_DEBUG, "%f", p->peaks[i]);
> +    }
> +    return 0;
> +}
> +

> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    PeakPointsContext *p = ctx->priv;
> +    int i;
> +

> +    // free memory allocated for audio data
> +    for (i = 0; p->nsamples; i++) {
> +	free(&p->data[i]);
> +    }
> +
> +    // free memory allocated for peaks
> +    for (i = 0; p->size; i++) {
> +	free(&p->data[i]);
> +    }

looks like a double free

also please use av_freep() and av_malloc*()

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even known they had done so. -- Xenophanes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20161028/d73d8882/attachment.sig>


More information about the ffmpeg-devel mailing list