[FFmpeg-devel] [PATCH] OSQ lossless audio format support
James Almer
jamrial at gmail.com
Fri Aug 25 19:42:21 EEST 2023
On 8/25/2023 1:28 PM, Paul B Mahol wrote:
> On Fri, Aug 25, 2023 at 5:57 PM James Almer <jamrial at gmail.com> wrote:
>
>> On 8/24/2023 7:06 PM, Paul B Mahol wrote:
>>> On Thu, Aug 24, 2023 at 11:51 PM James Almer <jamrial at gmail.com> wrote:
>>>
>>>> On 8/24/2023 6:11 PM, Paul B Mahol wrote:
>>>>> On Thu, Aug 24, 2023 at 11:00 PM James Almer <jamrial at gmail.com>
>> wrote:
>>>>>
>>>>>> On 8/24/2023 6:52 AM, Paul B Mahol wrote:
>>>>>>> +static int osq_receive_frame(AVCodecContext *avctx, AVFrame *frame)
>>>>>>> +{
>>>>>>> + OSQContext *s = avctx->priv_data;
>>>>>>> + GetBitContext *gb = &s->gb;
>>>>>>> + int ret, n;
>>>>>>> +
>>>>>>> + while (s->bitstream_size < s->max_framesize) {
>>>>>>> + int size;
>>>>>>> +
>>>>>>> + if (!s->pkt->data) {
>>>>>>> + ret = ff_decode_get_packet(avctx, s->pkt);
>>>>>>> + if (ret == AVERROR_EOF && s->bitstream_size > 0)
>>>>>>> + break;
>>>>>>> + if (ret < 0)
>>>>>>> + return ret;
>>>>>>> + }
>>>>>>> +
>>>>>>> + size = FFMIN(s->pkt->size - s->pkt_offset, s->max_framesize
>> -
>>>>>> s->bitstream_size);
>>>>>>> + memcpy(s->bitstream + s->bitstream_size, s->pkt->data +
>>>>>> s->pkt_offset, size);
>>>>>>> + s->bitstream_size += size;
>>>>>>> + s->pkt_offset += size;
>>>>>>> +
>>>>>>> + if (s->pkt_offset == s->pkt->size) {
>>>>>>> + av_packet_unref(s->pkt);
>>>>>>> + s->pkt_offset = 0;
>>>>>>> + }
>>>>>>
>>>>>> This looks like you're assembling a packet of max_framesize bytes. You
>>>>>> should instead do that in a parser, and ensure here that the packets
>> fed
>>>>>> to this decoder are <= max_framesize.
>>>>>>
>>>>>>> + }
>>>>>>> +
>>>>>>> + frame->nb_samples = FFMIN(s->frame_samples, s->nb_samples);
>>>>>>> + if (frame->nb_samples <= 0)
>>>>>>> + return AVERROR_EOF;
>>>>>>> +
>>>>>>> + if ((ret = ff_get_buffer(avctx, frame, 0)) < 0)
>>>>>>> + goto fail;
>>>>>>> +
>>>>>>> + if ((ret = init_get_bits8(gb, s->bitstream, s->bitstream_size))
>> <
>>>> 0)
>>>>>>> + goto fail;
>>>>>>> +
>>>>>>> + if ((ret = osq_decode_block(avctx, frame)) < 0)
>>>>>>> + goto fail;
>>>>>>> +
>>>>>>> + s->nb_samples -= frame->nb_samples;
>>>>>>> +
>>>>>>> + n = get_bits_count(gb) / 8;
>>>>>>> + if (n > s->bitstream_size) {
>>>>>>> + ret = AVERROR_INVALIDDATA;
>>>>>>> + goto fail;
>>>>>>> + }
>>>>>>> +
>>>>>>> + memmove(s->bitstream, &s->bitstream[n], s->bitstream_size - n);
>>>>>>> + s->bitstream_size -= n;
>>>>>>> +
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> +fail:
>>>>>>> + s->bitstream_size = 0;
>>>>>>> + s->pkt_offset = 0;
>>>>>>> + av_packet_unref(s->pkt);
>>>>>>> +
>>>>>>> + return ret;
>>>>>>> +}
>>>>>>> +
>>>>>>> +const AVInputFormat ff_osq_demuxer = {
>>>>>>> + .name = "osq",
>>>>>>> + .long_name = NULL_IF_CONFIG_SMALL("raw OSQ"),
>>>>>>> + .read_probe = osq_probe,
>>>>>>> + .read_header = osq_read_header,
>>>>>>> + .read_packet = ff_raw_read_partial_packet,
>>>>>>
>>>>>> Instead of sending an arbitrarily sized packet (1024 bytes as of now),
>>>>>> you should set codecpar->frame_size and propagate packets with that
>>>>>> amount of bytes instead.
>>>>>> A parser is still needed, though, for non seekable input (a pipe). And
>>>>>> in case the decoder is fed with non lavf input.
>>>>>>
>>>>>
>>>>> Format is not seekable, packet sizes are nowhere stored in .osq files.
>>>>
>>>> Same case as raw formats like H.264. No packet sizes, no seekability
>>>> without having parsed the entire sequence to build a list of seek
>>>> points, etc. A parser has to assemble access units.
>>>>
>>>
>>> But there is no any info how to assemble anything here, unless
>>> you propose decoding inside parser?
>>>
>>>
>>>
>>>>
>>>>>
>>>>> Think of this format like APAC/BONK/WAVARC but just no need to keep
>>>> unused
>>>>> bits from previous decoded data/packet.
>>>>> With this fact, parser makes no sense to do.
>>>>
>>>> You know that frame_size is (frame_samples_per_channel * 16 + 1024) *
>>>> channels. A parser can work with that (Look at gsm parser for example.
>>>> There may be others too).
>>>>
>>>
>>> There is reason why variable is called max_framesize.
>>>
>>> You should not propagate truncated data in 1024 byte chunks, and the
>>>> decoder should not expect that either.
>>>>
>>>
>>> Nothing gets truncated. That is just worst case scenario size for packet.
>>> In another words, size of packet can be anything between 33 and
>>> max_framesize.
>>
>> By truncated i mean the demuxer propagates arbitrary 1024 byte sized
>> packets. And in the decoder you're assembling a buffer of max_framesize
>> in osq_receive_frame() by requesting packets until you reach that size
>> or EOF, whatever comes first, before calling osq_decode_block(). So
>> unless you hit EOF, it will always be max_framesize. That's what i say
>> should be done in a simple, small parser.
>>
>
> I fully understand you, but your reasoning is critically flawed.
> And I have no time to educated an uneducated.
Not insulting people will help your arguments.
>
> For The Last Time:
>
> Even if parser does that what you propose, decoder would need to keep own
> internal
> buffer again and than memmove() bytes around.
>
> And even if parser is implemented that would effectively break other
> possible containers different
> than .osq - such containers would properly split data into packets and my
> decoder would still work in such scenario but
> your proposed way would only force such containers to blacklist your
> proposed parser.
You know a parser is doable and the proper approach. With frame_samples
and the first part of osq_decode_block() you can get it going. But do
whatever you want.
More information about the ffmpeg-devel
mailing list