[FFmpeg-devel] [PATCH 0/7] RFC: complete rework of s337m support

Thu Dec 5 16:15:16 EET 2024

On 04/12/2024 23:14, ffnicolasg at sfr.fr wrote:
> From: Nicolas Gaullier <nicolas.gaullier at cji.paris>
>
> I would like to have some feedback on this work before going on...
> I send this as an RFC, and so, I have not considered versioning,
> changelog, doc, deprecation schemes. More testing and testing by
> other people is also obviously required.
>
> The fate samples are available here: https://0x0.st/X7mV.zip
>
> To begin, I'd like to raise the issues in current code:
> - current code requires a constant 'offset' between each frame, but
> this requirement is not standard and prevents conformant files to be
> decoded. It also prevents support for unexpected forms of Dolby E
> (Dolby E D2 is not supported because its 'offset' is not recognised).
> - current code does not allow 'missing frames' with silence fallback.
> However, this is very commonplace: for example, someone wants to add
> a still picture with credits and no audio, so inserts some null bytes
> = pcm/silence. The standard recommands to be able to switch between pcm
> and Dolby E frame by frame, so it 'should' be supported.
> - the Dolby E sample rate is not always an integer, and its rounded
> value makes its sync slightly drift.
> - spdif and s337m are implemented separately, despite it's almost the
> same thing; the spdif syncword is even the same as the 16-bit s337m
> syncword!
>
> At the end, it really seems unreasonable to handle raw dolby_e streams
> without the upper s337m layer. The native dolby_e decoder output with
> its weird sample rate (specially @29.97) is not very convenient, too.
>
> Here is what I designed:
> - the mpegts/wav/mxf(stereo) support is handled via stream probing which
> is a proven feature for years, with very little code addition and low
> dependance to the demuxers. It is the spirit of the standard: every
> uncompressed stereo audio can support s337m.
> - a s337m parser: full parsing is required even for frame-wrapped mxf
> as broken files that do not have a valid s337m phase results in audio
> frames crossing the edit unit boundaries.
> - a s337m decoder: it includes a resampler: output and input sample_rate
> are the same, sync is always correct.

I strongly object to using a resampler here. We *never* covert audio or 
video

in decoders from the native codec samplerate or pixel format.

This includes Opus. Opus carries the original samplerate of the audio 
encoded

into its extradata. We ignore it, whilst libopusfile does not. I've been 
on #opus

for a long time, and the only thing I've ever heard about libopusfile's 
behaviour

has been confusion and criticism from its users.

You should expose this, and let users make an informed decision. Most users

will have a resampler anyway, and they may be *particularly sensitive* about

any potential latency you introduce, so applying mandatory resampler here is

both heavy-handed and problematic.