[MPlayer-dev-eng] [RFC] libnut demuxer API

Oded Shimon ods15 at ods15.dyndns.org
Wed Sep 7 07:45:59 CEST 2005


On Wed, Sep 07, 2005 at 12:12:43AM -0400, Rich Felker wrote:
> On Tue, Sep 06, 2005 at 09:42:53PM +0300, Oded Shimon wrote:
> > My demuxer API is horrible, I can feel it...
> > 
> > I need comments, lots of them. What should I do. I certainely can't use 
> > MPlayer as an example design :) LAVF is too complicated for me...
> > 
> > My muxer API probably sucks too.
> > 
> > Here's what I have so far. Please tell me how much it sucks.
> 
> Not too bad..
> 
> > typedef struct {
> > 	void * priv;
> > 	int (*seek)(void * priv, long pos, int dummy);
> > 	size_t (*read)(uint8_t * buf, size_t dummy, size_t len, void * priv);
> > 	off_t filesize;
> > } InputStreamContext;
> 
> Don't try to keep compat with stdio functions; instead just use sane
> API and call fread/fseek if the function pointers are NULL or
> something, or make wrappers.

I liked the NULL idea, I'll see what I can do...

> > typedef struct {
> > 	uint8_t * data;
> > 	int len;
> > } Data;
> 
> Useless struct.

Yeah, I dunno what I was doing :P

> > typedef struct {
> > 	int width;
> > 	int height;
> > 	int sample_width;
> > 	int sample_height;
> > 	int colorspace_type;
> > } VideoData;
> > 
> > typedef struct {
> > 	int samplerate_nom;
> > 	int samplerate_denom;
> > 	int channel_count;
> > } AudioData;
> > 
> > typedef struct {
> > 	int type;
> > 	Data fourcc;
> > 	int time_base_nom;
> > 	int time_base_denom;
> > 	int fixed_fps;
> > 	Data codec_specific;
> > 	union {
> > 		VideoData vd;
> > 		AudioData ad;
> > 	} u;
> > } StreamData;
> 
> Personally I'd do this without a union, putting the generic stream
> header at the beginning of the specific-type headers and casting
> pointers. I just dislike unions tho. :)

I think I'll just put all data in a single struct.
I can't do the pointer casting trick because this struct is used as an 
array.

> > typedef struct {
> > 	int stream_count;
> > 	FILE * out;
> > 	StreamData * sd;
> > } InputData;
> 
> Output cannot depend on FILE* and stdio. What if the user wants to
> write to a memory buffer, an unbuffered socket, etc. or even use the
> lib on an embedded device with no stdio?? IMO all io should be thru
> callbacks.

Yeah, muxer output will evantually also change to using callbacks.. i want 
to get a working muxer as soon as possible.

> > typedef struct {
> > 	int id;
> > 	Data type;
> > 	Data name;
> 
> These can just be null terminated strings in the API.
> 
> > 	union {
> > 		int i;
> > 		Data s;
> > 	} val;
> > } InfoData;
> 
> Again I don't see much advantage to union...
> 
> > typedef struct {
> > 	int stream;
> > 	int pts;
> > 	int is_key;
> > 	Data f;
> > } FrameData;
> 
> I just thought of it at this point, but almost all your struct names
> end in "Data". This is really rather silly. Personally I'd prefer them
> to have names like:
> 
> struct nut_frame;
> struct nut_info_packet;
> struct nut_stream_header;
> ...

the reason for the data is before i had an api this all used to be a single 
file, and i had 'Context' stuff and 'Data' stuff :) I guess now it can be 
changed.

> > typedef struct {
> > 	enum { e_headers, e_unkown, e_frame, e_info } type;
> > 	FrameData fd; // the 'data' part is unused.
> > } PacketData;
> 
> ??

It's what 'nut_read_next_packet' gives.

> > struct NUTContext_s;
> > typedef struct NUTContext_s NUTContext;
> > 
> > // Muxer
> > 
> > NUTContext * nut_muxer_init(InputData * id);
> 
> This struct is very much nonsense. Putting all the args to a function
> in one struct when they're not related is bad API design.

Well, almost everything in InputData has to do with the actual audio video 
file. the only thing in it that should be left out is the 'FILE * out', and 
I think I'll make a nut_muxer_options struct (should it write index, header 
repetition, etc.)

> > void nut_muxer_uninit(NUTContext * nut);
> > 
> > void nut_write_frame(NUTContext * nut, const FrameData * fd);
> > void nut_write_info(NUTContext * nut, const InfoData info []);
> 
> IMO the API should work something like this:
> 
> 1. Constuct a NUT muxer context, and fill it with all the global
>    header, stream header, and info header data. This can be done
>    either with direct access to structs and manual allocations and
>    field fill-in by the caller, or with helper functions that allocate
>    everything for you and through which you're required to access the
>    structs. Some global muxer settings like the degree of header
>    repetition, etc. should also be set at this point.

I sent a 'random.c' to the ML a few days ago, see how i used the muxer API 
there.

> 2. Write a sequence of frames. The first frame write should trigger
>    all the headers to be written out to the output stream, if the
>    caller hasn't already forced them to be written with another
>    function. Frame writes can also trigger headers to be written again
>    later when a header repetition point is reached, if header
>    repetition is enabled. After the first event which triggers headers
>    to be written, it's illegal to modify anything in the headers. If
>    the header structs are encapsulated with access functions, the API
>    should prevent further modification.

I already have an implementation of this!
nut_write_frame() has A LOT of black magic! It writes syncpoints, it 
repeats headers at positions decided by magic numbers.
BTW, the current black magic numbers are, 2^23, and 2^n (n >= 25)
this makes for header reptition at:
8MB, 32MB, 64MB, 128MB, 256MB, 512MB ...

> 3. Call a close function to finalize the output, writing indices if
>    the caller wants the file to have an index.

Yeah, there's a comment in my muxer_unint implementation:
put_index(); // TODO make this optional

> > // Demuxer
> > 
> > /** Just inits stuff, can never fail. */
> > NUTContext * nut_demuxer_init(InputStreamContext isc);
> 
> I assume this mallocs something? If so, "can never fail" is
> misleading. :)

ahem, yes, malloc :)
even if you run out of ram, in linux, malloc STILL doesn't fail :P

> Actually I'd prefer an API that does no mallocs, but either uses an
> allocation callback or requires the caller to pre-allocate things.

that's slightly insane?...
basically this lib should depend on NOTHING?.. (hey, i need memcpy for some 
cases.. :)

> > /** Just frees stuff. */
> > void nut_demuxer_uninit(NUTContext * nut);
> > 
> > /** All these functions return:
> > 0 success
> > 1 unexpected or expected EOF.
> > 2 EAGAIN.
> > negative number: error, detailed error in nut_error(-1 * ret);
> > 
> > fatality:
> > EAGAIN is never fatal
> > EOF is always fatal
> > negative error is non-fatal unless it was given by nut_read_next_packet.
> > 
> > free-ing data malloced by the funtions is only ever needed after a
> > successful call. - to protect this, things which should've been malloced will
> > turn to NULL.
> > 
> > nut_read_next_packet should be called after a negative error.
> 
> This is very low-level API. I'm not sure it's a bad idea to have such
> a thing, but if you do you also need a sane high-level API where the
> caller can just..

we can have both :) i need at the very least to make the low level first!

> 1. Have the library find and load all necessary headers. Export this
>    information to the caller in a sane way.
> 2. Optionally have the library check for and load an index and/or
>    final pts in the file.
> 3. Seek to a particular frame pts in a particular stream, either
>    next/prev keyframe or next/prev frame of any type.
> 4. Peek at next frame to determine stream/pts/etc.
> 5. Demux next frame, returning all known data about the frame. Or skip
>    next frame if it's not wanted.

All of this is almost extremely similar to what i have now, only difference 
is the headers..
if i'm not mistaken, you can have this api just by making a tiny 
'demuxer_init' function which just - calls nut_demuxer_init, calls 
nut_read_next_packet, and calls nut_read_headers.

> ALL of these operations should be able to return an "EAGAIN"-like code
> in the case where they run out of data trying to perform their
> operation, in which case it should be possible to retry again once the
> caller determines that more data is available. This may be somewhat
> nontrivial for error recovery and seeking, with multiple seeks and
> reads, but it's totally possible.

btw, i also have a little annoyance regarding implementation.
get_v can either return the v it got, or some kind of error..

so:

bla1 = get_v(nut);
bla2 = get_v(nut);
bla3 = get_v(nut);
bla4 = get_v(nut);

having to check for every damn one for error is annoying. :/
so, is making a GET_V macro sane?..

#define GET_V(var, nut) if ((var = get_v(nut)) < 0) return -var;
GET_V(bla1, nut);

(as -1 would mean EOF, -2 would mean EAGAIN.. EAGAIN might need to be 
treated specially to "reset" the buffer...)

> Also, please ensure that frames are demuxed with "zero copy". It may
> be preferable to actually have the demuxer return a pointer to the
> internal read buffer, if there's a sane way to lock it to ensure that
> it remains valid as long as the caller needs it. For example if the
> input stream is an mmapped .nut file, there should be some way to
> directly use the mmap for output without ever copying.. But this may
> be difficult..?? :(

since nut_read_frame is literally nothing more than a read(), i guess it 
wouldn't be a big problem to make it behave like read()...
to keep the api simple and consistent though, and to also be able to tell 
the difference between EOF and EAGAIN, i want to make a int pointer 'len' 
as an argument...

> > nut_read_next_packet also does syncpoint handling.
> > It'll start hunting syncpoints if no syncpoint has appeared for a
> > long time or one of the functions just return a negative error.
> 
> Should be better explained.. There have been LONG discussions on how
> error recovery should work but I'm too lazy to dig them up at the
> moment.

Well, I think if i hit any error reading the vlc's, i might as well start 
looking for syncpoints, cause i know i just hit a damaged stream (or 
abuse the index?)...
although if just say a single frame is damaged, you might not want to start 
searching for next syncpoint..
bah, error recovery is tough, and a far future TODO...

> > Other black magic:
> > When nut_read_headers is called, and the stream is seekable, it
> > searches for an index
> 
> Should be optional, or optionally deferred until the first seek is
> requested. It sucks a lot when streaming over http, as ivan keeps
> pointing out in his flames.. :)

i guess i'll make a nut demuxer options struct too :)

> > nut_skip_packet works differently according to packet type:
> > it either reads forward_ptr and skips by that, or it skips by
> > frame data. It also mallocs and frees memory to discard the
> > packet if non-seekable or method 0 was chosen.
> 
> Use alloca or vla. Also there should be a way for the callback to
> implement discarding reads, since they're trivial on some input
> streams even where seeking isn't possible..

Hmm, what inputs are that?
I just realized, EAGAIN is slightly trickier in nut_skip_packet, it might 
have to do a little black magic and store the length it needs to read...
Maybe PacketData should have a 'len' field, so nut_read_next_packet will 
read the forward_ptr...

- ods15




More information about the MPlayer-dev-eng mailing list