[Ffmpeg-devel] RTP patches & RFC

Fri Oct 27 18:56:42 CEST 2006

On Oct 27, 2006, at 5:53 AM, Michael Niedermayer wrote:

>>
>> Once those are applied, I'll add the rtp modification patch.
>>
>> Next the h264 patch (Makefile, rtp modification, rtp_h264.c,  
>> rtp_h264.h)
>>
>> Next I'll fix the AAC frequency/stereo initialization bug in a patch.
>>
>> Then the RTCP statistics patch (maintaining the stats)
>>
>> Then I'll integrate the RTCP sender stuff.
>>
>>> [...]
>>>>>>
>>>>>> /**
>>>>>>   RTP/H264 specific private data.
>>>>>> */
>>>>>> typedef struct h264_rtp_extra_data {
>>>>>>   unsigned long cookie;       ///< sanity check, to make sure we get the pointer we're expecting.
>>>>>>
>>>>>>   struct packet_queue network_packets;        ///< network packets are in this list...
>>>>>>   struct packet_queue frame_packets;  ///< linked list of all the packets with the same timestamp.
>>>>>>   struct packet_queue out_of_band_packets;    ///< pps and sps... (transmitted via sdp)
>>>>>>   struct packet_queue partial_packets;        ///< fragmentation unit packets.
>>>>>>   struct packet_queue packet_pool;    ///< preallocated packet pool; we get them from here first if we need them...
>>>>>
>>>>> one thing ive been curious about is why does h.264 need this mess
>>>>> while the other codecs dont? what is the problem with just removing
>>>>> the extra headers adding the 001 startcode prefixes and then passing
>>>>> the packets through the AVParser? are the packets out of order in
>>>>> some way or not? maybe my question is stupid but i plain dont
>>>>> understand why this complex buffering system is needed for h.264 ...
>>>>
>>>> They aren't out of order with this packetization mode, but if the
>>>> other mode was implemented, it has Decoding Order Numbers in it,
>>>> which would require out of order reordering.
>>>
>>> hmm, is this other mode used by anyone? is it mandatory in the sense
>>> that if the decoder doesnt support it its out of luck instead of the
>>> server just switching to the normal mode?
>>
>> The packetization modes are specified by the server, and I don't
>> think you can request a different one.  0 is simply NALs coming
>> across unaltered.  1 adds fragmenting large nals and conglomerating
>> small nals.  2 adds decoding order & out of order stuff.  So yes, I
>> think you are out of luck.
>
> hmm, rfc says:
> "
>    When SDP Offer/Answer model or any other capability exchange
>    procedure is used in session setup, the properties of the received
>    stream SHOULD be such that the receiver capabilities are not
>    exceeded.  In the SDP Offer/Answer model, the receiver can indicate
>    its capabilities to allocate a deinterleaving buffer with the
>    deint-buf-cap MIME parameter.  The sender indicates the requirement
>    for the deinterleaving buffer size with the sprop-deint-buf-req MIME
>    parameter.  It is therefore RECOMMENDED to set the deinterleaving
>    buffer size, in terms of number of bytes, equal to or greater than
>    the value of sprop-deint-buf-req MIME parameter.  See section 8.1
>    for further information on deint-buf-cap and sprop-deint-buf-req
>    MIME parameters and section 8.2.2 for further information on their
>    use in SDP Offer/Answer model.
> "
>
> "
>        deint-buf-cap:   This parameter signals the capabilities of a
>                         receiver implementation and indicates the
>                         amount of deinterleaving buffer space in units
>                         of bytes that the receiver has available for
>                         reconstructing the NAL unit decoding order.  A
>                         receiver is able to handle any stream for
>                         which the value of the sprop-deint-buf-req
>                         parameter is smaller than or equal to this
>                         parameter.
>
>                         If the parameter is not present, then a value
>                         of 0 MUST be used for deint-buf-cap.  The
>                         value of deint-buf-cap MUST be an integer in
>                         the range of 0 to 4294967295, inclusive.
>
>                             Informative note: deint-buf-cap indicates
>                             the maximum possible size of the
>                             deinterleaving buffer of the receiver
>                             only.
>
>                             When network jitter can occur, an
>                             appropriately sized jitter buffer has to
>                             be provisioned for as well.
> "
>
> "
>       As specified above, an offerer has to include the size of the
>       deinterleaving buffer in the offer for an interleaved H.264
>       stream.  To enable the offerer and answerer to inform each other
>       about their capabilities for deinterleaving buffering, both
>       parties are RECOMMENDED to include "deint-buf-cap".  This
>       information MAY be used when the value for "sprop-deint-buf-req"
>       is selected in a second round of offer and answer.  For
>       interleaved streams, it is also RECOMMENDED to consider offering
>       multiple payload types with different buffering requirements
>       when the capabilities of the receiver are unknown.
> "
>
> this doesnt sound like the reordering is required for receivers ...
> or did i misunderstand the rfc?

Well, that's a different property than the one I was looking at.
I am looking at the packetization-mode parameter, and if it is equal
to 2, then it is interleaved mode:
            2: Interleaved Mode: 25 (STAP-B), 26 (MTAP16), 27 (MTAP24),
               28 (FU-A), and 29 (FU-B) are allowed.

I don't have the docs here; I will have to double-check when I get to
work.
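
To make the mode discussion concrete, here is a rough sketch of how a
receiver could check whether a payload type is legal for the negotiated
packetization-mode.  The function name is hypothetical and is not in the
patch; the type ranges follow my reading of the table in RFC 3984, so
treat them as something to verify against the document.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of the patch): returns true if an RTP
 * payload with the given NAL unit type may appear in a stream using the
 * given packetization-mode, per the payload type table in RFC 3984. */
static bool h264_rtp_type_allowed(int packetization_mode, int nal_type)
{
    assert(nal_type >= 0 && nal_type <= 31);   /* type is a 5-bit field */

    switch (packetization_mode) {
    case 0:  /* single NAL unit mode: plain NAL units only */
        return nal_type >= 1 && nal_type <= 23;
    case 1:  /* non-interleaved: plain NAL units, STAP-A (24), FU-A (28) */
        return (nal_type >= 1 && nal_type <= 23) ||
               nal_type == 24 || nal_type == 28;
    case 2:  /* interleaved: STAP-B, MTAP16, MTAP24, FU-A, FU-B (25..29) */
        return nal_type >= 25 && nal_type <= 29;
    default: /* unknown mode */
        return false;
    }
}
```

With something like this, a depacketizer that only implements modes 0
and 1 could reject a mode 2 stream up front instead of failing on the
first STAP-B.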

>>
>>>>
>>>> The other part of the issue is that fragmentation packets should be
>>>> reassembled before handing them to the AVParser, so that if there is
>>>> a sequence issue or a missing packet, the entire packet can be
>>>> dropped without going to the codec to corrupt the stream.  There is
>>>> no way (IMHO) of doing this, if I just passed the packets up the
>>>> chain.  I know the parser is resilient, but basically a packet could
>>>> be broken anywhere.  It seems like a lot of strain to put on the
>>>> decoder's error detection/correction, when at my level I KNOW if
>>>> parts were dropped.
>>>
>>> well, but the decoder should be able to decode the part of a NAL unit
>>> until the missing part, so dropping the whole just isnt correct
>>> but note, i dont know how well this currently works with h264.c,
>>> its just supposed to work and does work with the mpeg/h263 decoders ...
>>
>> We can do either.  Here are my thoughts on dropping the entire packet:
>> 	Pros:
>> 		I _know_ that it was not complete.
>> 		That's what the RFC says to do.
>> 		Prevents weird invalid data stream issues in the decoder (it
>> 		might not like it if it doesn't get a certain byte..).
>
> invalid data must be handled sanely anyway when the source is the  
> internet

Another issue I just thought of is that a fragmented packet is
supposed to have the timestamp of the first packet received, not the
last packet received.  I don't know whether the server always sends
the fragments with the same timestamp or not, but if I did away with
the packet queues in the background, this would not be correct (the
parser would then think the fragments were for different frames).
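
As an illustration of what I mean, here is a minimal FU-A reassembly
sketch.  It is not the code from the patch: the struct, the function
name and the 64K cap are made up for the example, and it assumes the
caller has already stripped the RTP header and passes in the sequence
number and timestamp.  It keeps the timestamp of the first fragment and
drops the whole NAL on a sequence gap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FU_MAX_NAL 65536  /* arbitrary cap chosen for this sketch */

/* Hypothetical reassembly state; not the struct from the patch. */
typedef struct fu_state {
    uint8_t  buf[FU_MAX_NAL];
    size_t   len;        /* bytes collected so far */
    uint32_t timestamp;  /* RTP timestamp of the first fragment */
    uint16_t next_seq;   /* sequence number expected next */
    int      active;     /* nonzero while a NAL unit is in progress */
} fu_state;

/* Feed one FU-A payload (FU indicator, FU header, fragment bytes).
 * Returns 1 when a complete NAL unit is in st->buf, 0 when more
 * fragments are needed, and -1 when the fragment was dropped. */
static int fu_a_push(fu_state *st, const uint8_t *p, size_t n,
                     uint16_t seq, uint32_t ts)
{
    assert(st != NULL && p != NULL);

    if (n < 2 || (p[0] & 0x1F) != 28)      /* not an FU-A payload */
        return -1;

    int start = p[1] & 0x80;               /* S bit */
    int end   = p[1] & 0x40;               /* E bit */

    if (start) {
        /* Rebuild the NAL header: F and NRI come from the FU
         * indicator, the NAL type from the FU header. */
        st->buf[0]    = (uint8_t)((p[0] & 0xE0) | (p[1] & 0x1F));
        st->len       = 1;
        st->timestamp = ts;                /* keep the first timestamp */
        st->active    = 1;
    } else if (!st->active || seq != st->next_seq) {
        st->active = 0;                    /* gap: drop the whole NAL */
        return -1;
    }

    if (st->len + (n - 2) > FU_MAX_NAL) {  /* would overflow the buffer */
        st->active = 0;
        return -1;
    }
    memcpy(st->buf + st->len, p + 2, n - 2);
    st->len     += n - 2;
    st->next_seq = (uint16_t)(seq + 1);    /* wraps at 65535 */

    if (end) {
        st->active = 0;
        return 1;
    }
    return 0;
}
```

Only when this returns 1 would the NAL in st->buf (stamped with
st->timestamp) go up to the parser, so a partially received unit never
reaches the decoder.  Passing the partial data up instead, as suggested,
would mean handing over st->buf at the point where -1 is returned.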

>>
>>> [...]
>>>>> base64_decode() should be in a separate patch
>>>>
>>>> Okay, but where should it be?  It's currently only used by the h264
>>>> stuff, so I have it in my h264 code.  I did see that there was a
>>>
>>> base64.c base64.h in libavformat, we can always move or rename it
>>> later
>>>
>>>
>>>> base64 encode in the source somewhere, but it's static.
>>>
>>> that could also be moved into base64.c/h ...
>>
>> base64.[ch].  I don't like the decode routine, i swiped it from
>> elsewhere (and fixed it), I'm sure it can be improved on (I think you
>> even sent a suggestion before)
>
> i did and would be happy if you could try it instead of the large
> thing below, if it doesnt work tell me and ill look at it

See other thread.

Thanks!
-Ryan