[NUT-devel] [nut]: r160 - trunk/docs/nut-english.txt

Sat Oct 28 19:35:28 CEST 2006

Author: ods15
Date: Sat Oct 28 19:35:27 2006
New Revision: 160

Added:
   trunk/docs/nut-english.txt

Log:
add nut-english.txt
as stated by Rich - a very quick draft, too over-propagated and needs a
major revision :)


Added: trunk/docs/nut-english.txt
==============================================================================

--- (empty file)
+++ trunk/docs/nut-english.txt	Sat Oct 28 19:35:27 2006
@@ -0,0 +1,196 @@
+
+
+!!! DRAFT DRAFT DRAFT !!!
+
+DRAFT USAGE / SEMANTICS / RATIONALE SECTIONS FOR NUT SPEC
+
+
+Overview of NUT
+
+Unlike many popular containers, a NUT file can largely be viewed as a
+byte stream, as opposed to having a global block structure. NUT files
+consist of a sequence of packets, which can contain global headers,
+file metadata, stream headers for the individual media streams,
+optional index data to accelerate seeking, and, of course, the actual
+encoded media frames. Aside from frames, all packets begin with a
+64-bit startcode, the first byte of which is 0x4E, the ASCII character
+'N'. In addition to identifying the type of packet to follow, these
+startcodes (combined with CRC) allow for reliable resynchronization
+when reading damaged or incomplete files. Packets have a common
+structure that enables a process reading the file both to verify
+packet contents and to bypass uninteresting packets without having to
+be aware of the specific packet type.
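
Editorial sketch (not part of the committed file): the resynchronization
idea above can be illustrated in Python. This only locates candidate
positions, i.e. any offset where an 8-byte window starts with 0x4E. The
name candidate_startcodes is hypothetical, and reading the value as
big-endian is an assumption. A real reader would also have to match the
full 64-bit value against the startcodes defined by the specification
and verify the packet CRC, neither of which is shown here.

```python
STARTCODE_FIRST_BYTE = 0x4E  # ASCII 'N', as stated in the overview
STARTCODE_SIZE = 8           # startcodes are 64 bits wide


def candidate_startcodes(data: bytes):
    """Yield (offset, value) for each position that could hold a startcode.

    Hypothetical helper: it checks only the first byte and that a full
    8-byte window fits, so every hit still needs real validation.
    """
    marker = bytes([STARTCODE_FIRST_BYTE])
    pos = data.find(marker)
    while pos != -1 and pos + STARTCODE_SIZE <= len(data):
        # Byte order is an assumption made for this illustration.
        value = int.from_bytes(data[pos:pos + STARTCODE_SIZE], "big")
        yield pos, value
        pos = data.find(marker, pos + 1)
```

Scanning forward from the point of damage and validating each candidate
in turn is one plausible way to resume reading a damaged file.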
+
+In order to facilitate identification and playback of NUT files,
+strict rules are imposed on the location and order of packets and
+streams. Streams can be of class video, audio, subtitle, or
+user-defined data. Additional classes may be added in a later version
+of the NUT specification. Streams must be numbered consecutively
+beginning from 0. This allows simple and compact reference to streams
+in packet types where overhead must be kept to a minimum.
+
+Header Structure
+
+A NUT file must begin with a magic identification string, followed by
+the main header and a stream header for each stream, ordered by stream
+id. No other packets may intervene between these header packets. For
+robustness, a NUT file needs to include backup copies of the headers.
+In the absence of valid headers at the beginning of the file, a
+process attempting to read a NUT file should search for backup
+headers beginning at each power-of-two byte offset in the file.
+Simple stop conditions are provided to ensure that this search
+algorithm is bounded logarithmically in file length.
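
Editorial sketch (not part of the committed file): the list of offsets
such a search would visit can be enumerated as below. The function name
backup_header_offsets is hypothetical, starting at offset 1 is an
assumption, and the stop conditions mentioned above are not modelled
because this draft does not spell them out.

```python
def backup_header_offsets(file_length: int):
    """Return every power-of-two byte offset that lies inside the file.

    The result has about log2(file_length) entries, which is why the
    search stays logarithmic in the file length.
    """
    offsets = []
    offset = 1  # assumed starting point for this illustration
    while offset < file_length:
        offsets.append(offset)
        offset *= 2
    return offsets
```

For a 1000-byte file this yields the ten offsets 1, 2, 4, ... 512. A
reader would start looking for header packets at each of them in turn.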
+
+Metadata - Info Packets
+
+The NUT main header and stream headers may be followed by metadata
+"info" packets, which contain (mostly textual, but other formats are
+possible) information on the file, on particular streams, or on
+particular time intervals ("chapters") of the file, such as: title,
+author, language, etc. One should note that info packets may occur at
+other locations in a file, particularly in a file that is being
+generated/transmitted in real time; however, a process interpreting a
+NUT file should not make any attempt to search for info packets except
+in their usual location, i.e. following the headers. It is intended
+that processes presenting the contents of a NUT file will make
+automated responses to information stored in these packets, e.g.
+selecting a subtitle language based on the user's preferred list of
+languages, or providing a visual list of chapters to the user.
+Therefore, the format of info packets and the data they are to contain
+has been carefully specified and is aligned with International
+Standards for language codes and so forth. For this reason it is also
+important that info packets be stored in the correct locations, so
+that processes making automated responses to these packets can operate
+correctly.
+
+Index
+
+An index packet to facilitate O(1) seek-to-time operations may follow
+the headers. If an index packet does exist here, it should be placed
+after info packets, rather than before. Since the contents of the
+index depend on knowing the complete contents of the file, most
+processes generating NUT files are not expected to store an index with
+the headers. This option is merely provided for applications where it
+makes sense, to allow the index to be read without any seek operations
+on the underlying media when it is available.
+
+On the other hand, all NUT files except live streams (which have no
+concept of "end of file") must include an index at the end of the
+file, followed by a fixed-size 32-bit integer that is an offset
+backwards from end-of-file at which the final index packet begins.
+This is the only fixed-size field specified by NUT, and makes it
+possible to locate an index stored at the end of the file without
+resorting to unreliable heuristics.
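
Editorial sketch (not part of the committed file): locating the final
index from the trailing fixed-size field might look like this. The name
index_start is hypothetical. Two details are assumptions, since the
text above does not state them: the field is read as big-endian, and
the backward offset is counted from the very end of the file, trailing
field included.

```python
import io

INDEX_PTR_SIZE = 4  # the fixed-size 32-bit field described above


def index_start(f) -> int:
    """Return the byte position where the final index packet begins.

    `f` is any seekable binary file object.
    """
    f.seek(0, io.SEEK_END)
    file_length = f.tell()
    if file_length < INDEX_PTR_SIZE:
        raise ValueError("file too short to hold an index pointer")

    f.seek(file_length - INDEX_PTR_SIZE)
    back = int.from_bytes(f.read(INDEX_PTR_SIZE), "big")  # assumed byte order

    start = file_length - back  # assumed: offset measured from end of file
    if back < INDEX_PTR_SIZE or start < 0:
        raise ValueError("implausible index pointer")
    return start
```

A reader would then seek to the returned position and validate the
packet found there like any other packet before trusting its contents.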
+
+Streams
+
+A NUT file consists of one or more streams, intended to be presented
+simultaneously in synchronization with one another. Use of streams as
+independent entities is discouraged, and the nature of NUT's ordering
+requirements on frames makes it highly disadvantageous to store
+anything except the audio/video/subtitle/etc. components of a single
+presentation together in a single NUT file. Nonlinear playback order,
+scripting, and such are topics outside the scope of NUT, and should be
+handled at a higher protocol layer should they be desired (for
+example, using several NUT files with an external script file to
+control their playback in combination).
+
+With each stream, a single media encoding format is associated. The
+stream headers convey properties of the encoding, such as video frame
+dimensions, sample rates, and the compression standard ("codec") used
+(if any). Stream headers may also carry with them an opaque, binary
+object in a codec-specific format, containing global parameters for
+the stream such as codebooks. Both the compression format and whatever
+parameters are stored in the stream header (including NUT fields and
+the opaque global header object) are constant for the duration of the
+stream.
+
+Frames
+
+NUT is built on the model that video, audio, and subtitle streams all
+consist of a sequence of "frames", where the specific definition of
+frame is left partly to the codec, but should be roughly interpreted
+as the smallest unit of data which can be decoded (not necessarily
+independently; it may depend on previously-decoded frames) to a
+complete presentation unit occupying an interval of time. In
+particular, video frames correspond to the usual idea of a frame as a
+picture that is displayed beginning at its assigned timestamp until it
+is replaced by a subsequent picture with a later timestamp. Subtitle
+frames should be thought of as individual subtitles in the case of
+simple text-only streams, or as events that alter the presentation in
+the case of more advanced subtitle formats. Audio frames are merely
+intervals of samples; their length is determined by the compression
+format used.
+
+Frames need not be decoded in their presentation order. NUT allows for
+arbitrary out-of-order frame systems, from classic MPEG-1-style B
+frames to H.264 B pyramid and beyond, using a simple notion of "delay"
+and an implicitly-determined "decode timestamp" (dts). Out-of-order
+decoding is not limited to video streams; it is available to audio
+streams as well, and, given the right conditions, even subtitle
+streams, should a subtitle format choose to make use of such a
+capability.
+
+Central to NUT is the notion that EVERY frame has a timestamp. This
+differs from other major container formats which allow timestamps to
+be omitted for some or even most frames. The decision to explicitly
+timestamp each frame allows for powerful high-level seeking and
+editing in applications without any interaction with the codec level.
+This makes it possible to develop applications which are completely
+unaware of the codecs used, and allows applications which do need to
+perform decoding to be more properly factored.
+
+Keyframes
+
+NUT defines a "key frame" as any frame such that the frame itself and
+all subsequent (with regard to presentation time) frames of the stream
+can be decoded successfully without reference to prior (with regard to
+storage/decoding order) frames in the stream. This definition may
+sometimes be bent on a per-codec basis, particularly with audio
+formats where there is MDCT window overlap or similar.
+
+The concept of key frames is central to seeking, and key frames will
+be the targets of the seek-to-time operation.
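
Editorial sketch (not part of the committed file): for a single stream,
choosing a key frame as the seek target can be illustrated with a
binary search. The name seek_target is hypothetical, and picking the
latest key frame at or before the requested time is an assumed policy.
Multi-stream seeking is not covered here.

```python
from bisect import bisect_right


def seek_target(keyframe_times, target):
    """Return the latest key frame timestamp <= target, or None.

    `keyframe_times` must be sorted in ascending order and expressed in
    the same units as `target`.
    """
    i = bisect_right(keyframe_times, target)
    return keyframe_times[i - 1] if i else None
```

With key frames at timestamps 0, 50 and 100, a request for time 75
resolves to 50, so decoding can start there and run forward to 75.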
+
+Representation of Time
+
+NUT represents all timestamps as exact integer multiples of a rational
+number "time base". Files can have multiple time bases in order to
+accurately represent the time units of each stream. The set of
+available time bases is defined in the main header, while each stream
+header indicates which time base the corresponding stream will use.
+
+Effective use of time bases both allows for compact representation of
+timestamps, minimizing overhead, and enriches the information
+contained in the file. For example, a process interpreting a NUT file
+with a video time base of 1/25 second knows it can convert the video
+to fixed-framerate 25 fps content or present it faithfully on a PAL
+display.
+
+The scope of the media contained in a NUT file is a single contiguous
+interval of time. Timestamps need not begin at zero, but they may not
+jump backwards. Any large forward jump in timestamps must be
+interpreted as a frame with a large presentation interval, not as a
+discontinuity in the presentation. Without conditions such as these,
+NUT could not guarantee correct seeking in efficient time bounds.
+
+Aside from provisions made for out-of-order decoding, all frames in a
+NUT file must be strictly ordered by timestamp. For the purpose of
+sorting frames, all timestamps are treated as rational numbers derived
+from a coded integer timestamp and the associated time base, and
+compared under the standard ordering on the rational numbers.
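
Editorial sketch (not part of the committed file): comparing timestamps
from streams with different time bases can be done exactly with
rational arithmetic. The 1/25 time base comes from the example above.
The 1/48000 audio time base, the helper name presentation_time and the
tuple layout are assumptions made for illustration.

```python
from fractions import Fraction


def presentation_time(coded_ts: int, time_base: Fraction) -> Fraction:
    """Convert a coded integer timestamp to an exact time in seconds."""
    return coded_ts * time_base


video_tb = Fraction(1, 25)      # from the example in the text
audio_tb = Fraction(1, 48000)   # assumed audio time base

# (stream name, coded timestamp, time base); hypothetical frame records
frames = [
    ("audio", 96000, audio_tb),
    ("video", 49, video_tb),
    ("audio", 48000, audio_tb),
]

# Order frames by exact rational time, never by the raw coded integers.
ordered = sorted(frames, key=lambda f: presentation_time(f[1], f[2]))
```

Sorting by the raw integers would put the video frame (49) first, even
though 49/25 seconds falls between the two audio frames.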
+
+Frame Coding
+
+Each frame begins with a "framecode", a single byte which indexes a
+table in the main header. This table can associate properties such as
+stream id, size, relative timestamp, keyframe flag, etc. with the
+frame that follows, or allow the values to be explicitly coded
+following the framecode byte. By careful construction of the framecode
+table in the main header, an average overhead of significantly less
+than 2 bytes per frame can be achieved for single-stream files at low
+bitrates.
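
Editorial sketch (not part of the committed file): the framecode lookup
can be pictured as indexing a 256-entry table. The field names, the use
of None for "coded explicitly after the framecode byte", and the
example entry are all hypothetical. The actual table layout is defined
by the main header syntax, which this draft does not reproduce.

```python
from typing import NamedTuple, Optional


class FrameCodeEntry(NamedTuple):
    stream_id: int
    size: Optional[int]        # None: size is coded explicitly in the frame
    pts_delta: Optional[int]   # None: timestamp is coded explicitly
    keyframe: bool


def lookup(table, data: bytes, pos: int = 0):
    """Read the framecode byte at `pos` and return (entry, next_pos)."""
    framecode = data[pos]
    entry = table[framecode]
    if entry is None:
        raise ValueError(f"framecode {framecode:#04x} is unassigned in this table")
    return entry, pos + 1


# Hypothetical table with one fully implicit entry: such a frame costs
# only the single framecode byte of overhead.
table = [None] * 256
table[0x01] = FrameCodeEntry(stream_id=0, size=417, pts_delta=1, keyframe=False)
```

When every property comes from the table, the per-frame overhead is the
one framecode byte, which is how the low figure quoted above becomes
reachable.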
+
+Syncpoints
+
+...