[FFmpeg-devel] Handling dual language mono audio encoded as stereo

Sat Jul 17 01:35:53 CEST 2010

Michael Niedermayer wrote on Saturday, 20 February 2010 03:00:20:
> On Tue, Feb 16, 2010 at 12:35:02AM +0200, Anssi Hannula wrote:
> > > On Sun, Feb 14, 2010 at 07:32:08PM +0200, Anssi Hannula wrote:
> > > > Hi all!
> > > > 
> > > > Some Nordic DVB channels encode e.g. four mono tracks with different
> > > > languages into two stereo tracks (MPEG layer 2). The ISO639 language
> > > > descriptor then has both language codes, separated by a null byte.
> > > > This is probably a remnant from the pre-DVB era, but we should somehow
> > > > handle it nevertheless.
> > > > 
> > it ["null" byte above] is the audio type code of the first track,
> > which is always present but is ignored by ffmpeg even without this
> > patch. There's another audio type code for the second channel after the
> > second ISO639 language code.
> > 
> > The possible values are:
> > 00 Undefined
> > 01 Clean effects
> > 02 Hearing impaired
> > 03 Visual impaired commentary
> > 04-FF Reserved
> > 
> > I guess we should somehow add this in metadata. Then of course there's
> > also the case where audio type differs between channels :)
> 
> this really is a mess :(
> why do all these committees always come up with doing everything in the most
> painful and backwardly hacked-in way?
> I guess simply setting the AVStream metadata to things like
> Channel0/Language="eng"
> Channel1/Language="jpn"
> Channel0/Disposition="comment"
> Channel0/TargetAudience="visually impaired"
> 
> is the least annoying way to handle this

Attached is a patchset that sets
channel0/language="eng"
channel1/language="jpn"
channel1/audio_type="visual impaired"
language="eng+jpn"
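
As a rough illustration of where these values come from (this is not the 
patch code), the sketch below parses a descriptor payload, assuming it is the 
sequence of 4-byte entries described in the quoted text: a 3-byte language 
code followed by a 1-byte audio type. The names DualMonoInfo, 
audio_type_name() and parse_iso639_payload() are hypothetical and exist only 
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result container; not an FFmpeg structure. */
typedef struct DualMonoInfo {
    char    lang[2][4];     /* per-channel language code, NUL-terminated */
    uint8_t audio_type[2];  /* per-channel audio type code */
    int     nb_entries;     /* number of entries parsed (0, 1 or 2) */
    char    combined[16];   /* stream-level value, e.g. "eng+jpn" */
} DualMonoInfo;

/* Map an audio type code to the value names used in this mail
 * ("commentary" dropped). Returns NULL for undefined/reserved codes. */
static const char *audio_type_name(uint8_t type)
{
    switch (type) {
    case 0x01: return "clean effects";
    case 0x02: return "hearing impaired";
    case 0x03: return "visual impaired";
    default:   return NULL;
    }
}

/* Parse the descriptor payload (the bytes after tag and length).
 * Assumed layout: repeated { 3-byte language code, 1-byte audio type }.
 * Only the first two entries are used. Returns the number of entries. */
static int parse_iso639_payload(const uint8_t *buf, size_t len,
                                DualMonoInfo *out)
{
    assert(buf && out);
    memset(out, 0, sizeof(*out));

    while (len >= 4 && out->nb_entries < 2) {
        int i = out->nb_entries++;
        memcpy(out->lang[i], buf, 3);   /* lang[i][3] stays 0 from memset */
        out->audio_type[i] = buf[3];
        buf += 4;
        len -= 4;
    }

    if (out->nb_entries == 2)
        snprintf(out->combined, sizeof(out->combined), "%s+%s",
                 out->lang[0], out->lang[1]);
    else if (out->nb_entries == 1)
        snprintf(out->combined, sizeof(out->combined), "%s", out->lang[0]);

    return out->nb_entries;
}
```

Feeding it the bytes "eng", 0x00, "jpn", 0x03 yields lang[0]="eng", 
lang[1]="jpn", audio type 0x03 on the second channel only, and the combined 
value "eng+jpn", i.e. the four keys listed above.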

This time mpegtsenc gets support as well. There is also a patch for ffplay 
that adds an '-ach' command-line option and a 'c' key binding to switch the 
audible audio channel, and a patch for ffmpeg that adds '-achannel' for 
selecting a single channel.
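
For readers unfamiliar with what "selecting a single channel" means at the 
sample level, here is a minimal sketch. It assumes interleaved signed 16-bit 
stereo samples (L0 R0 L1 R1 ...); the helper names select_channel_s16() and 
duplicate_channel_s16() are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extract one channel of interleaved stereo into a mono buffer.
 * 'stereo' holds 2 * nb_frames samples, 'mono' receives nb_frames samples. */
static void select_channel_s16(const int16_t *stereo, int16_t *mono,
                               size_t nb_frames, int channel)
{
    assert(stereo && mono);
    assert(channel == 0 || channel == 1);

    for (size_t i = 0; i < nb_frames; i++)
        mono[i] = stereo[2 * i + channel];
}

/* Player-style variant: keep the stereo layout but make the chosen
 * channel audible on both sides by copying it over the other one. */
static void duplicate_channel_s16(int16_t *stereo, size_t nb_frames,
                                  int channel)
{
    assert(stereo);
    assert(channel == 0 || channel == 1);

    for (size_t i = 0; i < nb_frames; i++)
        stereo[2 * i + (1 - channel)] = stereo[2 * i + channel];
}
```

The first form corresponds to producing a mono stream from one language, 
the second to switching the audible channel during playback without changing 
the output channel count.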

When re-encoding e.g. channel 1 of the above example, the metadata is 
mangled accordingly:
language="jpn"
audio_type="visual impaired"
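
The mangling rule can be sketched as follows, under my reading of the 
example: tags of the selected channel are promoted to stream level, tags of 
other channels are dropped, and a stream-level tag is dropped when the 
selected channel overrides it. The Tag type and the functions below are 
hypothetical stand-ins using plain arrays, not the libavformat metadata API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair, standing in for a metadata entry. */
typedef struct Tag {
    char key[32];
    char value[32];
} Tag;

/* True if key looks like "channelN/..." with a single digit N. */
static int is_channel_key(const char *key)
{
    return strncmp(key, "channel", 7) == 0 &&
           key[7] >= '0' && key[7] <= '9' && key[8] == '/';
}

/* True if a tag named prefix+name exists in tags[]. */
static int has_key(const Tag *tags, int nb, const char *prefix,
                   const char *name)
{
    char full[64];
    snprintf(full, sizeof(full), "%s%s", prefix, name);
    for (int i = 0; i < nb; i++)
        if (strcmp(tags[i].key, full) == 0)
            return 1;
    return 0;
}

/* Build the metadata for a stream made from one channel only.
 * 'out' must have room for nb_in tags and must not overlap 'in'.
 * Returns the number of tags written. */
static int mangle_for_channel(const Tag *in, int nb_in, int channel,
                              Tag *out)
{
    char prefix[24];
    int nb_out = 0;

    assert(in && out && channel >= 0 && channel <= 9);
    snprintf(prefix, sizeof(prefix), "channel%d/", channel);
    size_t plen = strlen(prefix);

    for (int i = 0; i < nb_in; i++) {
        const char *key = in[i].key;

        if (strncmp(key, prefix, plen) == 0) {
            /* Selected channel: strip the prefix, keep the value. */
            out[nb_out] = in[i];
            strcpy(out[nb_out].key, key + plen);
            nb_out++;
        } else if (is_channel_key(key)) {
            /* Tag of a channel that is not in the output: drop it. */
            continue;
        } else if (!has_key(in, nb_in, prefix, key)) {
            /* Stream-level tag not overridden by the selected channel. */
            out[nb_out++] = in[i];
        }
    }
    return nb_out;
}
```

With the four tags from the earlier example and channel 1 selected, this 
produces exactly the two tags shown above; with channel 0 selected, only 
language="eng" remains.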

Now, about the metadata format... I named the key "audio_type" as per the 
specs, since avformat.h currently says "exported exactly as stored in the 
container", though I'm not really sure whether that rule applies here.
Do you think we should instead map it directly to "target_audience" (and make 
that a generic name listed in avformat.h?) with the value "visually impaired"?
If so, would "clean effects" translate to "disposition=clean effects"?

I dropped the word "commentary", as there is no non-commentary "visual 
impaired" value, so broadcasters use the same value whether or not the track 
is commentary.

-- 
Anssi Hannula