[FFmpeg-devel] [PATCH] OSQ lossless audio format support

Fri Aug 25 01:06:05 EEST 2023

On Thu, Aug 24, 2023 at 11:51 PM James Almer <jamrial at gmail.com> wrote:

> On 8/24/2023 6:11 PM, Paul B Mahol wrote:
> > On Thu, Aug 24, 2023 at 11:00 PM James Almer <jamrial at gmail.com> wrote:
> >
> >> On 8/24/2023 6:52 AM, Paul B Mahol wrote:
> >>> +static int osq_receive_frame(AVCodecContext *avctx, AVFrame *frame)
> >>> +{
> >>> +    OSQContext *s = avctx->priv_data;
> >>> +    GetBitContext *gb = &s->gb;
> >>> +    int ret, n;
> >>> +
> >>> +    while (s->bitstream_size < s->max_framesize) {
> >>> +        int size;
> >>> +
> >>> +        if (!s->pkt->data) {
> >>> +            ret = ff_decode_get_packet(avctx, s->pkt);
> >>> +            if (ret == AVERROR_EOF && s->bitstream_size > 0)
> >>> +                break;
> >>> +            if (ret < 0)
> >>> +                return ret;
> >>> +        }
> >>> +
> >>> +        size = FFMIN(s->pkt->size - s->pkt_offset, s->max_framesize -
> >> s->bitstream_size);
> >>> +        memcpy(s->bitstream + s->bitstream_size, s->pkt->data +
> >> s->pkt_offset, size);
> >>> +        s->bitstream_size += size;
> >>> +        s->pkt_offset += size;
> >>> +
> >>> +        if (s->pkt_offset == s->pkt->size) {
> >>> +            av_packet_unref(s->pkt);
> >>> +            s->pkt_offset = 0;
> >>> +        }
> >>
> >> This looks like you're assembling a packet of max_framesize bytes. You
> >> should instead do that in a parser, and ensure here that the packets fed
> >> to this decoder are <= max_framesize.
> >>
> >>> +    }
> >>> +
> >>> +    frame->nb_samples = FFMIN(s->frame_samples, s->nb_samples);
> >>> +    if (frame->nb_samples <= 0)
> >>> +        return AVERROR_EOF;
> >>> +
> >>> +    if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
> >>> +        goto fail;
> >>> +
> >>> +    if ((ret = init_get_bits8(gb, s->bitstream, s->bitstream_size)) <
> 0)
> >>> +        goto fail;
> >>> +
> >>> +    if ((ret = osq_decode_block(avctx, frame)) < 0)
> >>> +        goto fail;
> >>> +
> >>> +    s->nb_samples -= frame->nb_samples;
> >>> +
> >>> +    n = get_bits_count(gb) / 8;
> >>> +    if (n > s->bitstream_size) {
> >>> +        ret = AVERROR_INVALIDDATA;
> >>> +        goto fail;
> >>> +    }
> >>> +
> >>> +    memmove(s->bitstream, &s->bitstream[n], s->bitstream_size - n);
> >>> +    s->bitstream_size -= n;
> >>> +
> >>> +    return 0;
> >>> +
> >>> +fail:
> >>> +    s->bitstream_size = 0;
> >>> +    s->pkt_offset = 0;
> >>> +    av_packet_unref(s->pkt);
> >>> +
> >>> +    return ret;
> >>> +}
> >>> +
> >>> +const AVInputFormat ff_osq_demuxer = {
> >>> +    .name           = "osq",
> >>> +    .long_name      = NULL_IF_CONFIG_SMALL("raw OSQ"),
> >>> +    .read_probe     = osq_probe,
> >>> +    .read_header    = osq_read_header,
> >>> +    .read_packet    = ff_raw_read_partial_packet,
> >>
> >> Instead of sending an arbitrarily sized packet (1024 bytes as of now),
> >> you should set codecpar->frame_size and propagate packets with that
> >> amount of bytes instead.
> >> A parser is still needed, though, for non seekable input (a pipe). And
> >> in case the decoder is fed with non lavf input.
> >>
> >
> > Format is not seekable, packet sizes are nowhere stored in .osq files.
>
> Same case as raw formats like H.264. No packet sizes, no seekability
> without having parsed the entire sequence to build a list of seek
> points, etc. A parser has to assemble access units.
>

But there is no any info how to assemble anything here, unless
you propose decoding inside parser?

>
> >
> > Think of this format like APAC/BONK/WAVARC but just no need to keep
> unused
> > bits from previous decoded data/packet.
> > With this fact, parser makes no sense to do.
>
> You know that frame_size is (frame_samples_per_channel * 16 + 1024) *
> channels. A parser can work with that (Look at gsm parser for example.
> There may be others too).
>

There is reason why variable is called max_framesize.

You should not propagate truncated data in 1024 byte chunks, and the
> decoder should not expect that either.
>

Nothing gets truncated. That is just worst case scenario size for packet.
In another words, size of packet can be anything between 33 and
max_framesize.

>
> >
> >
> >>> +    .extensions     = "osq",
> >>> +    .flags          = AVFMT_NOBINSEARCH | AVFMT_NOGENSEARCH |
> >> AVFMT_NO_BYTE_SEEK | AVFMT_NOTIMESTAMPS,
> >>> +    .raw_codec_id   = AV_CODEC_ID_OSQ,
> >>> +    .priv_data_size = sizeof(FFRawDemuxerContext),
> >>> +    .priv_class     = &ff_raw_demuxer_class,
> >>> +};
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel at ffmpeg.org
> >> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >>
> >> To unsubscribe, visit link above, or email
> >> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
> >>
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>