[FFmpeg-devel] [PATCH] avfilter: add ambisonic decoder filter
James Almer
jamrial at gmail.com
Sun Oct 30 21:41:44 EET 2022
On 10/30/2022 4:19 PM, Paul B Mahol wrote:
> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>> On 10/30/2022 3:58 PM, Paul B Mahol wrote:
>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>> On 10/30/2022 3:29 PM, Paul B Mahol wrote:
>>>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 10/30/2022 3:19 PM, Paul B Mahol wrote:
>>>>>>> On 10/30/22, James Almer <jamrial at gmail.com> wrote:
>>>>>>>> On 10/30/2022 12:34 PM, Paul B Mahol wrote:
>>>>>>>>> +static const struct {
>>>>>>>>> + const int order;
>>>>>>>>> + const int inputs;
>>>>>>>>> + const int speakers;
>>>>>>>>> + const int near_field;
>>>>>>>>> + const int type;
>>>>>>>>> + const double xover;
>>>>>>>>> + const AVChannelLayout outlayout;
>>>>>>>>> + const double *speakers_azimuth;
>>>>>>>>> + const double *speakers_elevation;
>>>>>>>>> + const double *speakers_distance;
>>>>>>>>> +} ambisonic_tab[] = {
>>>>>>>>> + [MONO] = {
>>>>>>>>> + .order = 0,
>>>>>>>>> + .inputs = 1,
>>>>>>>>> + .speakers = 1,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MONO,
>>>>>>>>> + .speakers_azimuth = (const double[1]){ 0. },
>>>>>>>>> + .speakers_distance = (const double[1]){ 1. },
>>>>>>>>> + },
>>>>>>>>> + [STEREO] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 2,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO,
>>>>>>>>> + .speakers_azimuth = (const double[2]){ -30, 30},
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [STEREO_DOWNMIX] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 2,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_STEREO_DOWNMIX,
>>>>>>>>> + .speakers_azimuth = (const double[2]){ -90, 90 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [SURROUND] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 3,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_SURROUND,
>>>>>>>>> + .speakers_azimuth = (const double[3]){ -45, 45, 0 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L2_1] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 3,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_2_1,
>>>>>>>>> + .speakers_azimuth = (const double[3]){ -45, 45, 180 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [TRIANGLE] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 3,
>>>>>>>>> + .type = 1,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_SURROUND,
>>>>>>>>> + .speakers_azimuth = (const double[3]){ -120, 120, 0 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [QUAD] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 4,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD,
>>>>>>>>> + .speakers_azimuth = (const double[4]){ -45, 45, -135, 135 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [SQUARE] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 4,
>>>>>>>>> + .type = 1,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_4POINT0,
>>>>>>>>> + .speakers_azimuth = (const double[4]){ 0, -90, 180, 90 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L4_0] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 4,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_4POINT0,
>>>>>>>>> + .speakers_azimuth = (const double[4]){ -30, 30, 0, 180 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L5_0] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 5,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT0_BACK,
>>>>>>>>> + .speakers_azimuth = (const double[5]){ -30, 30, 0, -145, 145 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L5_0_SIDE] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 5,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_5POINT0,
>>>>>>>>> + .speakers_azimuth = (const double[5]){ -30, 30, 0, -110, 110 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L6_0] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 6,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_6POINT0,
>>>>>>>>> + .speakers_azimuth = (const double[6]){ -30, 30, 0, 180, -110, 110 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [L7_0] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 7,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_7POINT0,
>>>>>>>>> + .speakers_azimuth = (const double[7]){ -30, 30, 0, -145, 145, -110, 110 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [TETRA] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 4,
>>>>>>>>> + .type = 2,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_QUAD,
>>>>>>>>> + .speakers_azimuth = (const double[4]){ -90, 90, 0, 180 },
>>>>>>>>> + .speakers_elevation = (const double[4]){ -35.3, -35.3, 35.3, 35.3 },
>>>>>>>>> + .speakers_distance = same_distance,
>>>>>>>>> + },
>>>>>>>>> + [CUBE] = {
>>>>>>>>> + .order = 1,
>>>>>>>>> + .inputs = 4,
>>>>>>>>> + .speakers = 8,
>>>>>>>>> + .type = 2,
>>>>>>>>> + .near_field = NF_NONE,
>>>>>>>>> + .xover = 0.,
>>>>>>>>> + .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_7POINT1,
>>>>>>>>
>>>>>>>> 7.1 defines an LFE channel, which is clearly not intended here, so it
>>>>>>>> should be either:
>>>>>>>>
>>>>>>>> .outlayout = (AVChannelLayout)AV_CHANNEL_LAYOUT_MASK(8,
>>>>>>>> AV_CH_LAYOUT_QUAD |
>>>>>>>> AV_CH_TOP_FRONT_LEFT |
>>>>>>>> AV_CH_TOP_FRONT_RIGHT |
>>>>>>>> AV_CH_TOP_BACK_LEFT |
>>>>>>>> AV_CH_TOP_BACK_RIGHT),
>>>>>>>>
>>>>>>>> Or the AV_CHANNEL_LAYOUT_CUBE layout (using the exact same bitmask as
>>>>>>>> above) after the patch I sent just now is committed.
>>>>>>>
>>>>>>> CUBE is a real cube in 3D space. No current layout in the API can
>>>>>>> describe it correctly.
>>>>>>
>>>>>> The TOP_* channels are in a different height layer than the other
>>>>>> channels, namely above them. The result for this bitmask is a 3D cube
>>>>>> layout (Left and right speakers both front and back, in two different
>>>>>> height layers).
>>>>>
>>>>> https://en.wikipedia.org/wiki/Ambisonic_reproduction_systems#Cube
>>>>
>>>> "If all speakers are placed in room corners". So a 7.1 layout, besides
>>>> defining one channel as LFE, is also not ideal given that it defines two
>>>> speaker positions in places other than the room corners, and by no means
>>>> describes a 3D cube since all speakers are at the same height.
>>>> The TOP channel ids let you assign four channels to speakers placed in
>>>> the top room corners.
>>>
>>> 7.1 is there just because it has 8 channels. No other reasons.
>>
>> If you want to just say "there are eight channels", then make the
>> AVChannelLayout define 8 channels of order UNSPEC. Let the user then
>> figure out what to do with them. But if you want to actually give each
>> channel a position, then you need to use the proper IDs to describe the
>> intended layout.
>>
>> Using 7.1 here is lying to the user, as it's telling them they need to
>> place two speakers at the front, two at the back, two to the sides, and
>> all at the same height, plus one channel wrongly feeding the subwoofer.
>> Not to mention what would happen if you try to downmix the stream. swr
>> would just generate awful results.
>> Meanwhile, using cube you'll tell the user to place one speaker on every
>> corner in the room, or a downmixer how to properly handle the channels
>> to get good results (e.g. on headphones).
>
> I never claimed it is a perfect solution.
>
> Is there a way to use unspec without making swr abort and print
> pointless messages?

swr will reject any layout order other than native if the process
requires a rematrix (it should print a "Rematrix is needed" message and
then error out), because it has no information about how to handle
the channels.

ffmpeg.c, however, might try to guess the layout if it gets a stream of
unspec order, and for an 8-channel stream it will just set it to 7.1,
which again is not ideal here and will make swr do bad things.

Just use the CUBE layout. I pushed it just now. swr will downmix it
better than if you tell it the layout is 7.1.
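
To make the difference concrete, here is a minimal standalone sketch that
compares the two 8-channel masks. It does not include libavutil: the CH_*
and MASK_* names and the helper functions are hypothetical local stand-ins,
and the bit positions are assumed to match the AV_CH_* values in
libavutil/channel_layout.h. Both masks have eight channels, but only 7.1
carries an LFE bit, and only the cube mask has channels in the top layer.

```c
#include <stdint.h>

/* Local copies of channel bits (hypothetical names). The positions are
 * assumed to mirror the AV_CH_* constants in libavutil/channel_layout.h. */
#define CH_FRONT_LEFT       (1ULL << 0)
#define CH_FRONT_RIGHT      (1ULL << 1)
#define CH_FRONT_CENTER     (1ULL << 2)
#define CH_LOW_FREQUENCY    (1ULL << 3)
#define CH_BACK_LEFT        (1ULL << 4)
#define CH_BACK_RIGHT       (1ULL << 5)
#define CH_SIDE_LEFT        (1ULL << 9)
#define CH_SIDE_RIGHT       (1ULL << 10)
#define CH_TOP_FRONT_LEFT   (1ULL << 12)
#define CH_TOP_FRONT_RIGHT  (1ULL << 14)
#define CH_TOP_BACK_LEFT    (1ULL << 15)
#define CH_TOP_BACK_RIGHT   (1ULL << 17)

/* Four speakers in the lower room corners. */
#define MASK_QUAD      (CH_FRONT_LEFT | CH_FRONT_RIGHT | \
                        CH_BACK_LEFT  | CH_BACK_RIGHT)

/* Four speakers in the upper room corners. */
#define MASK_TOP_LAYER (CH_TOP_FRONT_LEFT | CH_TOP_FRONT_RIGHT | \
                        CH_TOP_BACK_LEFT  | CH_TOP_BACK_RIGHT)

/* Cube: lower corners plus upper corners, as in the bitmask quoted above. */
#define MASK_CUBE      (MASK_QUAD | MASK_TOP_LAYER)

/* 7.1: front L/R/C, LFE, back L/R, side L/R, all in one height layer. */
#define MASK_7POINT1   (CH_FRONT_LEFT | CH_FRONT_RIGHT | CH_FRONT_CENTER | \
                        CH_LOW_FREQUENCY | CH_BACK_LEFT | CH_BACK_RIGHT | \
                        CH_SIDE_LEFT | CH_SIDE_RIGHT)

/* Number of channels described by a mask (count of set bits). */
static int count_channels(uint64_t mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Nonzero if the mask routes a channel to the subwoofer. */
static int has_lfe(uint64_t mask)
{
    return (mask & CH_LOW_FREQUENCY) != 0;
}

/* Number of channels placed in the top height layer. */
static int count_top_channels(uint64_t mask)
{
    return count_channels(mask & MASK_TOP_LAYER);
}
```

Calling count_channels() on either mask gives the same channel count, so the
count alone cannot tell the layouts apart; has_lfe() and count_top_channels()
are what distinguish them, and that positional information is what a
downmixer relies on.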