[FFmpeg-devel] [PATCH]Add data_types for EAC3 and TrueHD to spdif

Sun Jul 25 22:35:52 CEST 2010

Carl Eugen Hoyos wrote on Wednesday, 16 June 2010 17:10:47:
> Måns Rullgård <mans <at> mansr.com> writes:
> > > These are the result of a bit of trial-and-error, I will apply if
> > > nobody objects.
> > 
> > Tested on a real receiver?
> 
> Yes.

Interesting, I couldn't get my receiver to recognize TrueHD.

I tried the calculated (see below where I do it for E-AC-3) offset of 2560, 
but it didn't work. Specification says 15360 IEC60958 frames, which is 61440 
bytes. However, if I've calculated correctly, that would mean that one would 
need to have 960 PCM frames per burst (when rate is 48kHz), while one gets 40 
PCM frames from one TrueHD frame. I tried to send 24 frames per burst, but 
that didn't work either (and the specification explicitly says that one burst 
has to contain one complete frame). I defined burst-size in bytes as per 
specification.
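
As a sanity check of the figures above, here is a minimal Python sketch of the 
arithmetic. It assumes the link runs at 16 times the base sample rate; that 
multiplier is not stated above, it is only the value that makes the 2560 and 
960 figures come out. The function and constant names are made up for 
illustration.

```python
# Sanity check of the TrueHD burst arithmetic.
# ASSUMPTION: the IEC 60958 link runs at 16x the base sample rate.
# That multiplier is inferred from the numbers in the post, not quoted from it.

BYTES_PER_IEC60958_FRAME = 2 * 16 // 8   # 2 channels x 16 bits = 4 bytes
LINK_RATE_MULTIPLIER = 16                # assumed, see above
PCM_FRAMES_PER_TRUEHD_FRAME = 40         # from the post
BURST_IEC60958_FRAMES = 15360            # burst spacing from the specification


def truehd_burst_figures(sample_rate=48000):
    """Return the derived figures for one TrueHD burst (hypothetical helper)."""
    link_bytes_per_second = (
        LINK_RATE_MULTIPLIER * sample_rate * BYTES_PER_IEC60958_FRAME
    )
    # Link bytes per second divided by TrueHD frames per second,
    # reordered so the calculation stays in integers.
    per_frame_offset = (
        link_bytes_per_second * PCM_FRAMES_PER_TRUEHD_FRAME // sample_rate
    )
    burst_bytes = BURST_IEC60958_FRAMES * BYTES_PER_IEC60958_FRAME
    pcm_frames_per_burst = BURST_IEC60958_FRAMES // LINK_RATE_MULTIPLIER
    truehd_frames_per_burst = pcm_frames_per_burst // PCM_FRAMES_PER_TRUEHD_FRAME
    return {
        "per_frame_offset": per_frame_offset,
        "burst_bytes": burst_bytes,
        "pcm_frames_per_burst": pcm_frames_per_burst,
        "truehd_frames_per_burst": truehd_frames_per_burst,
    }


if __name__ == "__main__":
    for name, value in truehd_burst_figures().items():
        print(name, value)
```

Under that assumption the per-frame offset, the burst size in bytes, the PCM 
frames per burst and the TrueHD frames per burst all agree with the numbers 
quoted above, so the mismatch with the receiver is probably not an arithmetic 
slip.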

After that I looked at the HDMI specification, and according to it one has to 
send IEC 61937 encapsulated compressed audio with frame rates above 192 kHz 
(like TrueHD with 8 * 192 kHz) in a HBR (High-Bitrate) Audio Stream Packet 
instead of Audio Sample Packet. The difference between them is only a few 
flipped bits, but it has to be supported by the driver/hardware anyway, and 
AFAICS no ALSA driver currently does (I guess some sound hardware could 
silently detect the case itself and use HBR packets, but I doubt it), which 
might be because no one has needed that yet.

I guess there's also a (small) chance that (some) receivers may decode a 
TrueHD stream from the regular Audio Sample Packets, but mine didn't seem to 
(or maybe I still didn't get the channel order right; multichannel audio needs 
conversion between HDMI<->ALSA order, which afaiu needs to be reversed to 
preserve the bitstream, and for extra fun the driver seems to do the mapping 
wrongly for me).

> What else could "trial-and-error" mean in this context?
> 
> For E-AC-3, I currently get noise with some similarity to the actual audio
> - any suggestions for pkt_offset (and bitstream_mode)?

Since E-AC-3 is passed via spdif with 4x sample rate [1], the link bandwidth 
in bytes per second is:
4 * sample_rate * 2 (channels per iec60958 frame) * 16 (bits per sample in 
iec60958) / 8 (bits per byte) = 16 * sample_rate

One (E-)AC-3 frame is decoded to 256 * hdr.frame_size PCM samples per channel. 
Therefore one needs sample_rate / (256 * hdr.frame_size) (E-)AC-3 frames per 
second.

In order to get the frames timed correctly, we need to send one E-AC-3 frame 
in this many bytes:
bandwidth per second / number of frames needed per second
= 16 * sample_rate / (sample_rate / (256 * hdr.frame_size))
= 4096 * hdr.frame_size

So the correct offset value *seems* to be 4096 * hdr.frame_size.
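
The derivation above can be sketched in a few lines of Python. This is only an 
illustration of the arithmetic, not the muxer code; the function names are 
hypothetical, and frame_size stands for the value called hdr.frame_size above.

```python
# Sketch of the E-AC-3 pkt_offset derivation.
# Function names are hypothetical; "frame_size" is the value the post
# calls hdr.frame_size.

def link_bytes_per_second(sample_rate, rate_multiplier=4):
    """Link bandwidth: multiplier * rate * 2 channels * 16 bits / 8 bits per byte."""
    return rate_multiplier * sample_rate * 2 * 16 // 8


def eac3_pkt_offset(frame_size, sample_rate=48000):
    """Bytes of link bandwidth that one (E-)AC-3 frame has to occupy."""
    pcm_samples_per_frame = 256 * frame_size
    # bandwidth per second / frames needed per second, reordered so the
    # calculation stays in integers:
    #   16 * rate / (rate / samples_per_frame) = 16 * samples_per_frame
    return link_bytes_per_second(sample_rate) * pcm_samples_per_frame // sample_rate


if __name__ == "__main__":
    for frame_size in (1, 2, 3, 6):
        print(frame_size, eac3_pkt_offset(frame_size))
```

The sample rate cancels out, so the result depends only on frame_size: a value 
of 6 gives 24576 and a value of 1 gives 4096, whatever the rate.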

Indeed I tried this sample
http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_5.1_256_spx.eac3
encoding it with offset value 4096 * 6 = 24576, and it worked fine in my 
receiver. However,
http://samples.mplayerhq.hu/A-codecs/AC3/eac3/serenity_english_5.1_1536.eac3
with offset value 4096 * 1 = 4096 did not play properly. So I guess we're 
still missing something.

I also guess that pkt_size may be defined in bytes instead of bits for this 
codec to avoid overflow (I know this is the case for TrueHD/MAT). However, my 
receiver doesn't seem to care, it (the first sample) works with either.
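
A rough Python sketch of the overflow concern follows. It assumes the length 
field of the burst preamble is 16 bits wide; that width is an assumption here, 
not something established above, and the function names are hypothetical.

```python
# Why a length given in bits can overflow where a length in bytes does not.
# ASSUMPTION: the burst length field is 16 bits wide.

MAX_LENGTH_CODE = 0xFFFF  # largest value a 16-bit field can hold


def length_code(payload_bytes, in_bits):
    """Return the value to store in the length field, or raise on overflow."""
    value = payload_bytes * 8 if in_bits else payload_bytes
    if value > MAX_LENGTH_CODE:
        raise OverflowError(
            "length %d does not fit in a 16-bit field" % value
        )
    return value


def max_payload_bytes(in_bits):
    """Largest payload, in bytes, that the field can describe."""
    return MAX_LENGTH_CODE // 8 if in_bits else MAX_LENGTH_CODE


if __name__ == "__main__":
    print("max payload when counting bits: ", max_payload_bytes(True))
    print("max payload when counting bytes:", max_payload_bytes(False))
```

With a bit count the field tops out at 8191 bytes of payload, while a byte 
count reaches 65535 bytes. That would explain counting in bytes for data types 
whose bursts can get large.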

Also, no data-type dependent info seems to be required; however, it is possible 
that the same bitstream_mode as for AC-3 is applicable. For E-AC-3 it is in a 
different position (see ff_eac3_parse_header()), though, so I tested without 
it. According to the AC-3 specification non-zero bitstream_modes are for 
"music+effects", "visually impaired", "hearing impaired", "dialogue", etc, so 
it would be zero most of the time.

[1] http://msdn.microsoft.com/en-us/library/dd316761(VS.85).aspx

-- 
Anssi Hannula