[FFmpeg-devel] [PATCH] avfilter: add ambisonic decoder filter

James Almer jamrial at gmail.com
Sun Oct 30 21:41:44 EET 2022


On 10/30/2022 4:19 PM, Paul B Mahol wrote:
> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>> On 10/30/2022 3:58 PM, Paul B Mahol wrote:
>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>> On 10/30/2022 3:29 PM, Paul B Mahol wrote:
>>>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 10/30/2022 3:19 PM, Paul B Mahol wrote:
>>>>>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>>>>>> On 10/30/2022 12:34 PM, Paul B Mahol wrote:
>>>>>>>>> +static const struct {
>>>>>>>>> +    const int              order;
>>>>>>>>> +    const int              inputs;
>>>>>>>>> +    const int              speakers;
>>>>>>>>> +    const int              near_field;
>>>>>>>>> +    const int              type;
>>>>>>>>> +    const double           xover;
>>>>>>>>> +    const AVChannelLayout  outlayout;
>>>>>>>>> +    const double          *speakers_azimuth;
>>>>>>>>> +    const double          *speakers_elevation;
>>>>>>>>> +    const double          *speakers_distance;
>>>>>>>>> +} ambisonic_tab[] = {
>>>>>>>>> +    [MONO] = {
>>>>>>>>> +        .order = 0,
>>>>>>>>> +        .inputs = 1,
>>>>>>>>> +        .speakers = 1,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO,
>>>>>>>>> +        .speakers_azimuth = (const double[1]){ 0. },
>>>>>>>>> +        .speakers_distance = (const double[1]){ 1. },
>>>>>>>>> +    },
>>>>>>>>> +    [STEREO] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 2,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO,
>>>>>>>>> +        .speakers_azimuth = (const double[2]){ -30, 30},
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [STEREO_DOWNMIX] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 2,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO_DOWNMIX,
>>>>>>>>> +        .speakers_azimuth = (const double[2]){ -90, 90 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [SURROUND] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 3,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_SURROUND,
>>>>>>>>> +        .speakers_azimuth = (const double[3]){ -45, 45, 0 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L2_1] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 3,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_2_1,
>>>>>>>>> +        .speakers_azimuth = (const double[3]){ -45, 45, 180 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [TRIANGLE] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 3,
>>>>>>>>> +        .type = 1,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_SURROUND,
>>>>>>>>> +        .speakers_azimuth = (const double[3]){ -120, 120, 0 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [QUAD] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 4,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD,
>>>>>>>>> +        .speakers_azimuth = (const double[4]){ -45, 45, -135, 135 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [SQUARE] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 4,
>>>>>>>>> +        .type = 1,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_4POINT0,
>>>>>>>>> +        .speakers_azimuth = (const double[4]){ 0, -90, 180, 90 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L4_0] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 4,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_4POINT0,
>>>>>>>>> +        .speakers_azimuth = (const double[4]){ -30, 30, 0, 180 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L5_0] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 5,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT0_BACK,
>>>>>>>>> +        .speakers_azimuth = (const double[5]){ -30, 30, 0, -145, 145 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L5_0_SIDE] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 5,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT0,
>>>>>>>>> +        .speakers_azimuth = (const double[5]){ -30, 30, 0, -110, 110 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L6_0] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 6,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_6POINT0,
>>>>>>>>> +        .speakers_azimuth = (const double[6]){ -30, 30, 0, 180, -110, 110 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [L7_0] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 7,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_7POINT0,
>>>>>>>>> +        .speakers_azimuth = (const double[7]){ -30, 30, 0, -145, 145, -110, 110 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [TETRA] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 4,
>>>>>>>>> +        .type = 2,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD,
>>>>>>>>> +        .speakers_azimuth = (const double[4]){ -90, 90, 0, 180 },
>>>>>>>>> +        .speakers_elevation = (const double[4]){ -35.3, -35.3, 35.3, 35.3 },
>>>>>>>>> +        .speakers_distance = same_distance,
>>>>>>>>> +    },
>>>>>>>>> +    [CUBE] = {
>>>>>>>>> +        .order = 1,
>>>>>>>>> +        .inputs = 4,
>>>>>>>>> +        .speakers = 8,
>>>>>>>>> +        .type = 2,
>>>>>>>>> +        .near_field = NF_NONE,
>>>>>>>>> +        .xover = 0.,
>>>>>>>>> +        .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_7POINT1,
>>>>>>>>
>>>>>>>> 7.1 defines an LFE channel, which is clearly not intended here, so it
>>>>>>>> should be either:
>>>>>>>>
>>>>>>>> .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MASK(8,
>>>>>>>>                       AV_CH_LAYOUT_QUAD     |
>>>>>>>>                       AV_CH_TOP_FRONT_LEFT  |
>>>>>>>>                       AV_CH_TOP_FRONT_RIGHT |
>>>>>>>>                       AV_CH_TOP_BACK_LEFT   |
>>>>>>>>                       AV_CH_TOP_BACK_RIGHT),
>>>>>>>>
>>>>>>>> Or the AV_CHANNEL_LAYOUT_CUBE layout (using the exact same bitmask as
>>>>>>>> above) after the patch I sent just now is committed.
>>>>>>>
>>>>>>> CUBE is a real cube in 3D space. No current layout in the API can
>>>>>>> describe it correctly.
>>>>>>
>>>>>> The TOP_* channels are in a different height layer than the other
>>>>>> channels, namely above them. The result for this bitmask is a 3D cube
>>>>>> layout (Left and right speakers both front and back, in two different
>>>>>> height layers).
>>>>>
>>>>> https://en.wikipedia.org/wiki/Ambisonic_reproduction_systems#Cube
>>>>
>>>> "If all speakers are placed in room corners", So a 7.1 layout, besides
>>>> defining one channel as LFE, is also not ideal given that it defines two
>>>> speaker positions in places other than the room corners, and by no means
>>>> describes a 3D cube since all speakers are at the same height.
>>>> The TOP channel ids let you assign four channels to speakers placed in
>>>> the top room corners.
>>>
>>> 7.1 is there just because it has 8 channels. No other reason.
>>
>> If you want to just say "there are eight channels", then make the
>> AVChannelLayout define 8 channels of order UNSPEC. Let the user then
>> figure out what to do with them. But if you want to actually give each
>> channel a position, then you need to use the proper IDs to describe the
>> intended layout.
>>
>> Using 7.1 here is lying to the user, as it's telling them they need to
>> place two speakers at the front, two at the back, two to the sides, and
>> all at the same height, plus one channel wrongly feeding the subwoofer.
>> Not to mention what would happen if you try to downmix the stream. swr
>> would just generate awful results.
>> Meanwhile, using cube you'll tell the user to place one speaker on every
>> corner in the room, or a downmixer how to properly handle the channels
>> to get good results (e.g. on headphones).
> 
> I never claimed it is a perfect solution.
> 
> Is there a way to let it use unspec but still not confuse swr into
> aborting and printing pointless messages?

swr will reject any layout order other than native if the process
requires rematrixing (it should print a "Rematrix is needed" message and
then error out), because it has no information about how to handle
the channels.
ffmpeg.c, however, might try to guess the layout if it gets a stream of
unspec order, and for an 8-channel stream it will just set it to 7.1,
which again is not ideal here and will make swr do bad things.

Just use the CUBE layout. I pushed it just now. swr will downmix it in a
better way than if you tell it it's 7.1.
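
To make the difference between the two masks concrete, here is a minimal
standalone sketch. It does not include any FFmpeg header: the CH_* and
LAYOUT_* names are local stand-ins, with bit positions copied from the
AV_CH_* masks in libavutil/channel_layout.h on the assumption that those
values are unchanged. The helper functions are hypothetical and exist only
for this illustration.

```c
#include <stdint.h>

/* Local stand-ins for the AV_CH_* masks in libavutil/channel_layout.h.
 * Assumption: bit positions match the library; redefined here so the
 * sketch builds without FFmpeg headers. */
#define CH_FRONT_LEFT      (UINT64_C(1) << 0)
#define CH_FRONT_RIGHT     (UINT64_C(1) << 1)
#define CH_FRONT_CENTER    (UINT64_C(1) << 2)
#define CH_LOW_FREQUENCY   (UINT64_C(1) << 3)
#define CH_BACK_LEFT       (UINT64_C(1) << 4)
#define CH_BACK_RIGHT      (UINT64_C(1) << 5)
#define CH_SIDE_LEFT       (UINT64_C(1) << 9)
#define CH_SIDE_RIGHT      (UINT64_C(1) << 10)
#define CH_TOP_FRONT_LEFT  (UINT64_C(1) << 12)
#define CH_TOP_FRONT_RIGHT (UINT64_C(1) << 14)
#define CH_TOP_BACK_LEFT   (UINT64_C(1) << 15)
#define CH_TOP_BACK_RIGHT  (UINT64_C(1) << 17)

/* The four upper-layer corner channels. */
#define CH_TOP_CORNERS (CH_TOP_FRONT_LEFT | CH_TOP_FRONT_RIGHT | \
                        CH_TOP_BACK_LEFT  | CH_TOP_BACK_RIGHT)

#define LAYOUT_QUAD    (CH_FRONT_LEFT | CH_FRONT_RIGHT | \
                        CH_BACK_LEFT  | CH_BACK_RIGHT)
/* 7.1: quad plus center, LFE and the two side channels, all in one plane. */
#define LAYOUT_7POINT1 (LAYOUT_QUAD | CH_FRONT_CENTER | CH_LOW_FREQUENCY | \
                        CH_SIDE_LEFT | CH_SIDE_RIGHT)
/* Cube: quad in the lower layer plus the same four corners in the top layer. */
#define LAYOUT_CUBE    (LAYOUT_QUAD | CH_TOP_CORNERS)

/* Number of channels described by a native-order mask (population count). */
int count_channels(uint64_t mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* 1 if the mask routes a channel to the LFE / subwoofer, 0 otherwise. */
int has_lfe(uint64_t mask)
{
    return (mask & CH_LOW_FREQUENCY) != 0;
}

/* Number of channels placed in the upper height layer. */
int top_channels(uint64_t mask)
{
    return count_channels(mask & CH_TOP_CORNERS);
}
```

Both masks describe eight channels, but only the 7.1 one contains the LFE
bit, and only the cube one has any channel in the top layer
(top_channels(LAYOUT_CUBE) is 4, top_channels(LAYOUT_7POINT1) is 0). That
is the information a downmixer needs and that a bare channel count cannot
carry.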

