[FFmpeg-devel] [PATCH v2] avformat/avcodec: Add DTS-UHD demuxer and parser, movenc support.

Sun Apr 16 22:55:46 EEST 2023

Hi

On Sat, Apr 15, 2023 at 01:04:42PM -0700, Roy Funderburk wrote:
> 
> Parsing and demuxing of DTS-UHD input files per ETSI TS 102 114 is added
> as demuxer "dtsuhd".  movenc supports DTS-UHD audio track.
> 
> Signed-off-by: Roy Funderburk <Roy.Funderburk at xperi.com>
> ---
>  Changelog                  |   1 +
>  configure                  |   1 +
>  doc/general_contents.texi  |   1 +
>  libavcodec/Makefile        |   1 +
>  libavcodec/codec_desc.c    |   7 +
>  libavcodec/codec_id.h      |   1 +
>  libavcodec/dtsuhd_common.c | 991 +++++++++++++++++++++++++++++++++++++
>  libavcodec/dtsuhd_common.h |  84 ++++
>  libavcodec/dtsuhd_parser.c | 141 ++++++
>  libavcodec/parsers.c       |   1 +
>  libavformat/Makefile       |   1 +
>  libavformat/allformats.c   |   1 +
>  libavformat/dtshddec.c     |   2 +-
>  libavformat/dtsuhddec.c    | 214 ++++++++
>  libavformat/movenc.c       |  32 ++
>  libavformat/version.h      |   2 +-
>  16 files changed, 1479 insertions(+), 2 deletions(-)
>  create mode 100644 libavcodec/dtsuhd_common.c
>  create mode 100644 libavcodec/dtsuhd_common.h
>  create mode 100644 libavcodec/dtsuhd_parser.c
>  create mode 100644 libavformat/dtsuhddec.c
> 
[...]

> +/* In the specification, the pseudo code defaults the 'add' parameter to true.
> +   Table 7-30 shows passing an explicit false, most other calls do not
> +   pass the extractAndAdd parameter.
> +
> +   Function based on code in Table 5-2
> +*/
> +static int get_bits_var(GetBitContext *gb, const uint8_t table[], int add)
> +{
> +    static const int bits_used[8] = { 1, 1, 1, 1, 2, 2, 3, 3 };
> +    static const int index_table[8] = { 0, 0, 0, 0, 1, 1, 2, 3 };
> +    int code = show_bits(gb, 3); /* value range is [0, 7] */
> +    int i;
> +    int index = index_table[code];
> +    int value = 0;
> +
> +    skip_bits(gb, bits_used[code]);
> +    if (table[index] > 0) {
> +        if (add) {
> +            for (i = 0; i < index; i++)
> +                value += 1 << table[i];
> +        }
> +        value += get_bits_long(gb, table[index]);
> +    }

If the speed of this matters,
you could remove the indirection through index_table and remove the add code
by storing the bit count and the precomputed offset for each 3-bit code;
that would add 12 entries to some of these tables

something like:

int code = show_bits(gb, 3);
skip_bits(gb, bits_used[code]);
if (table[code][0] == 0)
    return 0;
return get_bits_long(gb, table[code][0]) + table[code][1];
    
OTOH if speed doesn't matter then this can probably be left as is
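
A minimal sketch of how such a flattened table could be generated from the
existing 4-entry width tables. VarEntry and build_var_table() are hypothetical
names used only for illustration, and the widths in the usage note are made-up
values, not taken from the specification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flattened entry, one per 3-bit prefix code. */
typedef struct VarEntry {
    uint8_t  bits;    /* payload bits to read; 0 means the value is 0 */
    uint32_t offset;  /* precomputed sum added to the payload */
} VarEntry;

/* Expand a 4-entry width table into 8 entries that can be indexed directly
 * by the 3-bit code, so the decoder needs neither index_table nor the
 * "add" loop at run time. */
static void build_var_table(VarEntry flat[8], const uint8_t table[4], int add)
{
    static const uint8_t index_table[8] = { 0, 0, 0, 0, 1, 1, 2, 3 };
    int code, i;

    for (code = 0; code < 8; code++) {
        int index = index_table[code];
        uint32_t offset = 0;

        if (add && table[index] > 0) {
            for (i = 0; i < index; i++) {
                assert(table[i] < 32);  /* keep the shift well defined */
                offset += UINT32_C(1) << table[i];
            }
        }
        flat[code].bits   = table[index];
        flat[code].offset = offset;
    }
}
```

With illustrative widths { 2, 4, 6, 8 } and add enabled, codes 0-3 map to
{2, 0}, codes 4-5 to {4, 4}, code 6 to {6, 20} and code 7 to {8, 84}. In the
patch the tables would more likely be written out as static const data than
built at run time.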


[...]
> +
> +/* Table 6-9 p 38 */
> +static int check_crc(DTSUHD *h, int bit, int bytes)
> +{
> +    GetBitContext gb;
> +    int i;
> +    static const uint16_t lookup[16] = {
> +        0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
> +        0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF
> +    };
> +    uint16_t crc = 0xFFFF;
> +
> +    init_get_bits(&gb, h->data, h->data_bytes * 8);
> +    skip_bits(&gb, bit);
> +    for (i = -bytes; i < bytes; i++)
> +        crc = (crc << 4) ^ lookup[(crc >> 12) ^ get_bits(&gb, 4)];
> +
> +    return crc != 0;
> +}

likely should use libavutil/crc.h
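
For context, the 16-entry table above is the nibble-wise form of a 16-bit CRC
with polynomial 0x1021 processed MSB first. libavutil/crc.h provides
av_crc_get_table() and av_crc(), and has an AV_CRC_16_CCITT table id for that
polynomial. The sketch below does not call libavutil; it only checks, in a
self-contained way, that the nibble table matches a plain bit-by-bit CRC.
Whether av_crc() can replace the loop directly is an assumption that depends on
the checked region starting on a byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit reference: polynomial 0x1021, MSB first, caller supplies init. */
static uint16_t crc16_bitwise(uint16_t crc, const uint8_t *buf, size_t len)
{
    size_t i;
    int b;

    assert(buf || !len);
    for (i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Nibble-at-a-time variant using the same 16-entry table as the patch. */
static uint16_t crc16_nibble(uint16_t crc, const uint8_t *buf, size_t len)
{
    static const uint16_t lookup[16] = {
        0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
        0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF
    };
    size_t i;

    assert(buf || !len);
    for (i = 0; i < len; i++) {
        crc = (uint16_t)((crc << 4) ^ lookup[(crc >> 12) ^ (buf[i] >> 4)]);
        crc = (uint16_t)((crc << 4) ^ lookup[(crc >> 12) ^ (buf[i] & 0xF)]);
    }
    return crc;
}
```

Both functions agree on any input when started from the same value, and a
buffer followed by its own CRC (high byte first) yields 0, which is the
property check_crc() relies on.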


[...]

> +/* Table 7-26 */
> +static void parse_ch_mask_params(DTSUHD *h, MD01 *md01, MDObject *object)
> +{
> +    GetBitContext *gb = &h->gb;
> +    const int ch_index = object->rep_type == REP_TYPE_BINAURAL ? 1 : get_bits(gb, 4);
> +    static const int mask_table[14] = { /* Table 7-27 */
> +        0x000001, 0x000002, 0x000006, 0x00000F, 0x00001F, 0x00084B, 0x00002F,
> +        0x00802F, 0x00486B, 0x00886B, 0x03FBFB, 0x000003, 0x000007, 0x000843,
> +    };
> +
> +    if (ch_index == 14)
> +        object->ch_activity_mask = get_bits(gb, 16);
> +    else if (ch_index == 15)

> +        object->ch_activity_mask = get_bits(gb, 32);

this needs get_bits_long(), get_bits() only reads up to 25 bits
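
To illustrate where such a limit comes from, here is a toy MSB-first bit
reader. It is not FFmpeg's implementation; ToyBitReader, toy_get_bits() and
toy_get_bits_long() are hypothetical names. The short read loads a 32-bit
window starting at the current byte, so up to 7 bits of it may already be
consumed and only 25 bits are guaranteed; the long read splits wide requests
into two short reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyBitReader {
    const uint8_t *buf;
    size_t size;  /* buffer size in bytes */
    size_t pos;   /* current position in bits */
} ToyBitReader;

/* Read 1..25 bits, MSB first. Bytes past the end are treated as zero. */
static uint32_t toy_get_bits(ToyBitReader *r, int n)
{
    size_t byte = r->pos >> 3;
    uint32_t cache = 0;
    int i;

    assert(n >= 1 && n <= 25);
    for (i = 0; i < 4; i++) {
        size_t idx = byte + (size_t)i;
        cache = (cache << 8) | (idx < r->size ? r->buf[idx] : 0u);
    }
    cache <<= r->pos & 7;  /* drop bits that were already consumed */
    r->pos += (size_t)n;
    return cache >> (32 - n);
}

/* Read 0..32 bits by combining two short reads when needed. */
static uint32_t toy_get_bits_long(ToyBitReader *r, int n)
{
    uint32_t hi;

    assert(n >= 0 && n <= 32);
    if (n == 0)
        return 0;
    if (n <= 25)
        return toy_get_bits(r, n);
    hi = toy_get_bits(r, 16);
    return (hi << (n - 16)) | toy_get_bits(r, n - 16);
}
```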

[...]

> +/** Allocate parsing handle.  The parsing handle should be used to parse
> +    one DTS:X Profile 2 Audio stream, then freed by calling DTSUHD_destroy().
> +    Do not use the same parsing handle to parse multiple audio streams.
> +
> +  @return Parsing handle for use with other functions, or NULL on failure.
> +*/
> +DTSUHD *dtsuhd_create(void)

stuff shared between libraries needs av / avpriv prefixes; other symbols aren't
exported and will break the build depending on build options

also the minor libavcodec version needs to be +1 when adding av* symbols,
and the libavcodec and libavformat changes should be in 2 separate patches


[...]
> +    if (fi) {
> +        fi->sync = h->is_sync_frame;
> +        fi->frame_bytes = h->frame_bytes;
> +        fi->sample_rate = h->sample_rate;
> +        fi->sample_count = (h->frame_duration * fi->sample_rate) / (h->clock_rate * fraction);


> +        fi->duration = (double)fi->sample_count / fi->sample_rate;

it feels as if double is not needed here
Either AVRational or a simple integer type (int / int64_t) in samples instead
of seconds seems better, as it would be exact and no odd platform rounding
differences could happen
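
A small sketch of the exact alternative: keep the duration as a sample count,
and only form a reduced fraction of seconds if one is needed. ToyRational is a
self-contained stand-in used here for illustration; in FFmpeg itself
AVRational from libavutil/rational.h would be the natural type.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a rational type (hypothetical, not AVRational). */
typedef struct ToyRational {
    int64_t num;
    int64_t den;
} ToyRational;

static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Duration in seconds as the exact reduced fraction sample_count / sample_rate. */
static ToyRational duration_seconds(int64_t sample_count, int sample_rate)
{
    ToyRational r;
    int64_t g;

    assert(sample_count >= 0 && sample_rate > 0);
    g = gcd64(sample_count, sample_rate);  /* nonzero because sample_rate > 0 */
    r.num = sample_count / g;
    r.den = sample_rate / g;
    return r;
}
```

For example, 1024 samples at 48000 Hz reduce to 8/375 of a second, with no
floating point involved.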


[...]
> +
> +/** Return the offset of the first UHD audio frame.
> +    When supplied a buffer containing DTSHDHDR file content, the DTSHD
> +    headers are skipped and the offset to the first byte of the STRMDATA
> +    chunk is returned, along with the size of that chunk.
> +
> +  @param[in] dataStart DTS:X Profile 2 file content to parse
> +  @param[in] dataSize Number of valid bytes in 'dataStart'
> +  @param[out] Number of leading DTS:X Profile 2 audio frames to discard,
> +              may be NULL
> +  @param[out] Size of STRMDATA payload, may be NULL
> +  @return STRMDATA payload offset or 0 if not a valid DTS:X Profile 2 file
> +*/
> +int dtsuhd_strmdata_payload(const uint8_t *data_start, int data_size, size_t *strmdata_size)
> +{
> +    const uint8_t *data = data_start;
> +    const uint8_t *data_end = data + data_size;
> +    uint64_t chunk_size = 0;
> +
> +    if (data + DTSUHD_CHUNK_HEADER >= data_end || memcmp(data, "DTSHDHDR", 8))
> +        return 0;
> +

> +    for (; data + DTSUHD_CHUNK_HEADER + 4 <= data_end; data += chunk_size + DTSUHD_CHUNK_HEADER) {
> +        chunk_size = AV_RB64(data + 8);
> +
> +        if (!memcmp(data, "STRMDATA", 8)) {
> +            if (strmdata_size)
> +                *strmdata_size = chunk_size;
> +            return (int)(data - data_start) + DTSUHD_CHUNK_HEADER;
> +        }
> +    }

this can infinite loop (chunk_size + DTSUHD_CHUNK_HEADER can wrap around to 0)
undefined behavior for the out of array pointers that can happen with the
"right" chunk_size
also data can decrease if one ignores that this is already undefined before
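
One possible shape for a bounded version of this loop, as a sketch only: it
walks the buffer with an offset instead of a pointer and rejects any chunk
size that does not fit in the remaining data, so the offset always advances
and never leaves the buffer. CHUNK_HEADER is assumed to be 16 (8-byte tag plus
8-byte big-endian size), and find_strmdata() and read_be64() are hypothetical
names standing in for the patch's function and AV_RB64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_HEADER 16  /* assumed: 8-byte tag + 8-byte big-endian size */

static uint64_t read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Return the offset of the STRMDATA payload, or 0 if it is not found. */
static size_t find_strmdata(const uint8_t *buf, size_t size,
                            uint64_t *payload_size)
{
    size_t pos = 0;

    assert(buf || !size);
    if (size < CHUNK_HEADER || memcmp(buf, "DTSHDHDR", 8))
        return 0;

    /* Invariant: pos <= size, so size - pos cannot underflow. */
    while (size - pos >= CHUNK_HEADER) {
        uint64_t chunk_size = read_be64(buf + pos + 8);

        if (!memcmp(buf + pos, "STRMDATA", 8)) {
            if (payload_size)
                *payload_size = chunk_size;
            return pos + CHUNK_HEADER;
        }
        /* Reject chunks that claim more data than the buffer holds. */
        if (chunk_size > size - pos - CHUNK_HEADER)
            return 0;
        pos += CHUNK_HEADER + (size_t)chunk_size;
    }
    return 0;
}
```

Like the original, this returns as soon as the STRMDATA header is seen and
does not require the whole payload to be inside the probe buffer.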


[...]
> +
> +    ffstream(st)->need_parsing = AVSTREAM_PARSE_FULL_RAW;
> +    st->codecpar->codec_type = AVMEDIA_TYPE_AUDIO;
> +    st->codecpar->codec_id = s->iformat->raw_codec_id;
> +    st->codecpar->ch_layout.order = AV_CHANNEL_ORDER_NATIVE;
> +    st->codecpar->ch_layout.nb_channels = di.channel_count;
> +    st->codecpar->ch_layout.u.mask = di.ffmpeg_channel_mask;
> +    st->codecpar->codec_tag = AV_RL32(di.coding_name);
> +    st->codecpar->frame_size = 512 << di.frame_duration_code;
> +    st->codecpar->sample_rate = di.sample_rate;

you could align all the "=" below each other, that would make this look
prettier

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact