[FFmpeg-devel] [PATCH] speex in ogg muxer
Justin Ruggles
justin.ruggles
Sun Sep 6 02:20:03 CEST 2009
Justin Ruggles wrote:
> Justin Ruggles wrote:
>
>> Justin Ruggles wrote:
>>
>>> Justin Ruggles wrote:
>>>
>>>> Justin Ruggles wrote:
>>>>
>>>>> Baptiste Coudurier wrote:
>>>>>> Justin Ruggles wrote:
>>>>>>> Baptiste Coudurier wrote:
>>>>>>>> Hi Justin,
>>>>>>>>
>>>>>>>> Justin Ruggles wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> This patch adds speex support to the ogg muxer. It basically does the
>>>>>>>>> same thing as Ogg/FLAC, in that the 1st packet is a global header from
>>>>>>>>> extradata and the 2nd packet is vorbiscomment metadata.
>>>>>>>>>
>>>>>>>>> This seems to work just fine for speex-to-speex stream copy, but
>>>>>>>>> probably would not work for flv-to-speex because flv doesn't seem to have any
>>>>>>>>> speex extradata from what I can tell. I guess a header could be
>>>>>>>>> constructed, but that would be a separate patch to the flv demuxer.
>>>>>>>>>
>>>>>>>>> This patch is a precursor to libspeex encoding support, which I'll be
>>>>>>>>> sending shortly.
>>>>>>>>>
>>>>>>>>> -Justin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Index: libavformat/oggenc.c
>>>>>>>>> ===================================================================
>>>>>>>>> --- libavformat/oggenc.c (revision 19244)
>>>>>>>>> +++ libavformat/oggenc.c (working copy)
>>>>>>>>> @@ -104,17 +125,39 @@
>>>>>>>>> bytestream_put_byte(&p, 0x00); // streaminfo
>>>>>>>>> bytestream_put_be24(&p, 34);
>>>>>>>>> bytestream_put_buffer(&p, streaminfo, FLAC_STREAMINFO_SIZE);
>>>>>>>>> - oggstream->header_len[1] = 1+3+4+strlen(vendor)+4;
>>>>>>>>> - oggstream->header[1] = av_mallocz(oggstream->header_len[1]);
>>>>>>>>> - p = oggstream->header[1];
>>>>>>>>> + p = ogg_write_vorbiscomment(4, bitexact, &oggstream->header_len[1]);
>>>>>>>>> + if (!p)
>>>>>>>>> + return -1;
>>>>>>>> AVERROR(ENOMEM)
>>>>>>> fixed.
>>>>>>>
>>>>>>>>> @@ -144,6 +188,12 @@
>>>>>>>>> av_log(s, AV_LOG_ERROR, "Extradata corrupted\n");
>>>>>>>>> av_freep(&st->priv_data);
>>>>>>>>> }
>>>>>>>>> + } else if (st->codec->codec_id == CODEC_ID_SPEEX) {
>>>>>>>>> + if (ogg_build_speex_headers(st->codec, oggstream,
>>>>>>>>> + st->codec->flags & CODEC_FLAG_BITEXACT) < 0) {
>>>>>>>>> + av_log(s, AV_LOG_ERROR, "error writing Speex headers\n");
>>>>>>>>> + av_freep(&st->priv_data);
>>>>>>>>> + }
>>>>>>>> return error here with the return code of the func :>
>>>>>>>> Yes, it seems flac misses it too; this needs a fix.
>>>>>>>>
>>>>>>>> patch fine otherwise, maybe a micro bump for avformat would be nice.
>>>>>>> fixed. new patch attached. the new patch also differs in that it
>>>>>>> overrides the extra_headers field in the Speex header to be 0 since only
>>>>>>> the 2 required headers are written.
>>>>>>>
>>>>>> patch ok if it works :>
>>>> Ok, back to square one.
>>>>
>>>>> Hmm... I've done several more tests and it does not quite work as-is for
>>>>> all samples. Here is what I have run into. The tests so far are for
>>>>> ogg-to-ogg stream copy.
>>>>>
>>>>> - When the source has more than 1 frame per packet, the resulting copy
>>>>> plays fine with ffmpeg/ffplay but is quick and choppy with speexdec. I
>>>>> was able to fix this by modifying the ogg/speex demuxer to set
>>>>> avctx->frame_size to the number of samples in a packet instead of in a
>>>>> frame. I also had to update the libspeex decoder accordingly. Maybe
>>>>> this is the wrong way to go about it though. I'm guessing it is a
>>>>> timestamp/granulepos issue, but I don't know enough about Ogg to tell
>>>>> more than that.
>>>> This is now corrected after much discussion. :)
>>>>
>>>>> - Even with the fix and even with 1 frame per packet, 2 short samples
>>>>> I've tested so far have a single soft pop when the stream-copied file is
>>>>> decoded with speexdec, but it's fine with ffmpeg/ffplay.
>>>>>
>>>>> Maybe someone else might have an idea of what could be going wrong?
>>>> Now I think I know what is going wrong, and there is nothing we can do
>>>> about it I think. speexenc does some weird things with granule
>>>> positions. It starts out for a long time with granulepos=0 even though
>>>> it is encoding audio, then when it starts writing granule positions it
>>>> is not always in sync with the start of the stream. Below is a little
>>>> snippet from a comparison of an original spx file to a copied spx file.
>>>> Each packet should be 320 samples.
>>>>
>>>> [...]
>>>>
>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 57
>>>> +00:00:01.120: serialno 0000000000, granulepos 17920, packetno 57
>>>>
>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 58
>>>> +00:00:01.140: serialno 0000000000, granulepos 18240, packetno 58
>>>>
>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 59
>>>> +00:00:01.160: serialno 0000000000, granulepos 18560, packetno 59
>>>>
>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>>>
>>>> -00:00:01.191: serialno 1626088319, calc. gpos 19057, packetno 61
>>>> +00:00:01.191: serialno 0000000000, granulepos 19057, packetno 61
>>>>
>>>> -00:00:01.211: serialno 1626088319, calc. gpos 19377, packetno 62
>>>> +00:00:01.211: serialno 0000000000, granulepos 19377, packetno 62
>>> So... I figured it out, but you may not want to know the answer. ;)
>>>
>>> The first packet is supposed to be interpreted as shorter than the
>>> full frame size. The delay is found by calculating what the granulepos
>>> of the first page would normally be, then subtracting from that what
>>> it really is.
>>>
>>> From above, this is the last packet in the first page. There are 59
>>> packets per page in this stream (the first 2 packets are headers, hence
>>> the packetno of 60).
>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>> speexdec interprets the first packet as having a delay of
>>> 18880-18737=143 samples. So the first packet should be 320-143=177
>>> samples long, and the decoder discards the first 143 samples of the
>>> first frame.
>>>
>>> None of this is documented except for in the speexenc and speexdec
>>> source code. From analyzing a Speex-in-FLV sample, it appears that the
>>> way Adobe handles this in Flash Media Server is to do what our ogg
>>> demuxer does and interpret the first page as if each frame is 320
>>> samples, then resync timestamps with the source after the first page,
>>> causing a skip in timestamps after the first page instead of at the
>>> beginning of the stream.
>>>
>>> I'm still not sure what to do about this though...
>> This patch makes it so that all the pts and durations are correct for
>> Ogg/Speex. It basically just changes the durations of the first and
>> last packets.
>
> nevermind. this doesn't quite work. i'm still working on it. damn ogg
> and its craziness!
Ok, now this patch should work correctly.
-Justin
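
For reference, the delay arithmetic quoted above (18880-18737=143, then
320-143=177) can be written out as a small C sketch. This is not the code
from the attached patch and not libspeex API: the struct and function
names are hypothetical, and it assumes every packet in the first page
nominally carries the same number of samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct SpeexPageTiming {
    int64_t nominal_granulepos;   /* granulepos if every packet were full length */
    int64_t delay;                /* samples to discard from the first packet */
    int64_t first_packet_samples; /* effective duration of the first packet */
} SpeexPageTiming;

/*
 * Hypothetical helper: derive the initial delay from the granulepos that
 * the encoder actually wrote on the first data page.
 *
 * page_granulepos    granulepos stored on the first data page
 * packets_in_page    number of audio packets in that page (headers excluded)
 * samples_per_packet nominal samples per packet (320 in the quoted stream)
 */
static SpeexPageTiming speex_first_page_timing(int64_t page_granulepos,
                                               int packets_in_page,
                                               int samples_per_packet)
{
    SpeexPageTiming t;

    assert(packets_in_page > 0 && samples_per_packet > 0);

    /* What the granulepos would be with no delay at all. */
    t.nominal_granulepos = (int64_t)packets_in_page * samples_per_packet;

    /* The shortfall is the number of samples the decoder drops. */
    t.delay = t.nominal_granulepos - page_granulepos;

    /* The first packet is shortened by exactly that amount. */
    t.first_packet_samples = samples_per_packet - t.delay;

    return t;
}
```

Calling speex_first_page_timing(18737, 59, 320) with the values from the
quoted page gives a nominal granulepos of 18880, a delay of 143 samples
and a first packet of 177 samples, matching the numbers above.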
-------------- next part --------------
A non-text attachment was scrubbed...
Name: speex_granulepos_delay_2.patch
Type: text/x-diff
Size: 3767 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090905/cceec834/attachment.patch>