In the RV30 format, another encapsulated data layer for the video stream has been introduced. In lack of a name I'll call them sub packets, the normal packets which exist in RV10 I'll call - well - packets. The stream of bytes which is put in one memory block is named block. The format extension has been * by wild guessing and comparing the received data between the original viewer and the demuxer. Every packet may contain one or more sub packets, and one block may be contained inside one or more adjacent (sub) packets. A sub packet (fragment) starts with a small header (two letters are one byte): aa(bb)[cccc{gggg}dddd{eeee}ff] aa: indicates the type of header the 2MSB indicate the header type (mask C0) C0: the [] part exists, but not () -> whole packet (not fragmented) 00, 80: the data block is encapsulated inside multiple packets. bb contains the sequence number, beginning with 1. 80 indicates the last sub packet containing data for the current block. no C0 or 40 sub packet follows until the block has been finished with a 80 sub packet. No packet with another stream than the current video stream is inside the sub packet chain, at least I haven't seen such case - the demuxer will shout in this case. 40: [] does not exist, the meaning of bb is unknown to me data follows to the end of the packet mask 3F: unknown bb: sub-seq - mask 0x7f: the sequence number of the _fragment_ mask 0x80: unknown, seems to alternating between 00 and 80 cccc: mask 3FFF: _total_ length of the packet mask C000: unknown, seems to be always 4000 when it's all 0000, it indicates value >=0x4000, you should read {gggg} and use all 16 bits of it. dunno what happens when length>=65536... dddd: when it's 0, {} exists (only in this case) mask 3FFF: offset of fragment inside the whole packet. it's counted from the beginning of the packet, except when hdr&0x80, hten it's relative to the _end_ of the packet, so it's equal to fragment size! mask C000: like cccc, except the first case (always 4000) when it's all 0000, it indicates value >=0x4000, you should read {eeee} and use all 16 bits of it. dunno what happens when length>=65536... ff: sequence number of the assembled (defragmented) packets, beginning with 0 NOTE: the demuxer should build a table of the subpacket offsets relative to the start of whole packet, it's required by the rv20/rv30 video decoders. packet header: ushort unknown ulong blocksize ushort streamid uchar reserved uchar flags 1=reliable, 2=keyframe