[FFmpeg-devel] [PATCH 5/9] sbc: implement SBC encoder (low-complexity subband codec)
Rostislav Pehlivanov
atomnuker at gmail.com
Sat Feb 24 23:31:30 EET 2018
On 24 February 2018 at 12:01, Aurelien Jacobs <aurel at gnuage.org> wrote:
> On Thu, Feb 22, 2018 at 06:18:48PM +0000, Rostislav Pehlivanov wrote:
> > On 21 February 2018 at 22:37, Aurelien Jacobs <aurel at gnuage.org> wrote:
> >
> > > This was originally based on libsbc, and was fully integrated into
> ffmpeg.
> > > ---
> > > doc/general.texi | 2 +-
> > > libavcodec/Makefile | 1 +
> > > libavcodec/allcodecs.c | 1 +
> > > libavcodec/sbcdsp.c | 382 ++++++++++++++++++++++++++++++
> > > +++++++++++++
> > > libavcodec/sbcdsp.h | 83 ++++++++++
> > > libavcodec/sbcdsp_data.c | 329 +++++++++++++++++++++++++++++++++++++
> > > libavcodec/sbcdsp_data.h | 55 +++++++
> > > libavcodec/sbcenc.c | 411 ++++++++++++++++++++++++++++++
> > > +++++++++++++++++
> > > 8 files changed, 1263 insertions(+), 1 deletion(-)
> > > create mode 100644 libavcodec/sbcdsp.c
> > > create mode 100644 libavcodec/sbcdsp.h
> > > create mode 100644 libavcodec/sbcdsp_data.c
> > > create mode 100644 libavcodec/sbcdsp_data.h
> > > create mode 100644 libavcodec/sbcenc.c
> > >
> > > +
> > > +#define OFFSET(x) offsetof(SBCEncContext, x)
> > > +#define AE AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM
> > > +static const AVOption options[] = {
> > > + { "joint_stereo", "use joint stereo",
> > > + OFFSET(joint_stereo), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1,
> AE },
> > >
> > + { "dual_channel", "use dual channel",
> > > + OFFSET(dual_channel), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1,
> AE },
> > >
> >
> > Erm those 2 things should be decided by the encoder, not by exposing them
> > to the user. The encoder should decide which mode has lower distortion
> for
> > a given signal.
>
> See bellow.
>
> > > + { "subbands", "number of subbands (4 or 8)",
> > > + OFFSET(subbands), AV_OPT_TYPE_INT, { .i64 = 8 }, 4, 8,
> AE },
> > >
> >
> > The encoder doesn't check if the value isn't 4 or 8 so 5, 6 and 7 are all
> > accepted. Similar issue to the previous option too.
>
> OK, fixed.
>
> > > + { "bitpool", "bitpool value",
> > > + OFFSET(bitpool), AV_OPT_TYPE_INT, { .i64 = 32 }, 0, 255,
> AE },
> > >
> >
> > This should be controlled by the bitrate setting. Either have a function
> to
> > translate bitrate to bitpool value or a table which approximately maps
> > bitrate values supplied to bitpools. You could expose it directly as well
> > as mapping it to a bitrate value by using the global_quality setting so
> it
> > shouldn't be a custom encoder option.
>
> Indeed, this is better this way, thanks for the suggestion.
>
> > > + { "blocks", "number of blocks (4, 8, 12 or 16)",
> > > + OFFSET(blocks), AV_OPT_TYPE_INT, { .i64 = 16 }, 4, 16,
> AE },
> > > + { "snr", "use SNR mode (instead of loudness)",
> > > + OFFSET(allocation), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1,
> AE },
> > >
> >
> > SNR mode too needs to be decided by the encoder rather than exposing it
> as
> > a setting.
>
> See bellow.
>
> > > + { "msbc", "use mSBC mode (wideband speech mono SBC)",
> > >
> >
> > Add a profile fallback setting for this as well, like in aac where
> -aac_ltp
> > turns LTP mode on and -profile:a aac_ltp does the same.
>
> Not sure of the benefits of having 2 redundant way to do this, but OK.
>
> > You don't have to make the encoder decide which stereo coupling mode or
> > snr/loudness setting to use, you can implement that with a later patch.
> > I think you should remove the "blocks" and "subbands" settings as well
> and
> > instead replace those with a single "latency" setting like the native
> Opus
> > encoder in milliseconds which would adjust both of them on init to set
> the
> > frame size. This would also allow the encoder to change them. Again, you
> > don't have to do this now, you can send a patch which adds a "latency"
> > option later.
>
> I can see the value in what you propose, and I think that indeed, it
> would be great for the encoder to choose by itself the most appropriate
> parameters by default.
> But on the other hand, we are talking about a codec targetting some
> hardware decoders (blutetooth speakers, headsets...), and all those
> parameters have to be negotiated between the encoder and the hardware
> decoder. So I think it is mandatory to keep all those parameters
> accessible to be able to setup the encoder according to the target
> hardware capabilities.
>
>
Hardware isn't as limited as you might think it is, and there's also no
negotiation happening between encoders and decoders IRL (except for modern
streaming protocols which suck). While its true that we've had some issues
with hardware decoders not supporting rarely used features we wait for
users to report them and don't assume that all hardware violates the specs.
And looking at the code, only 2 settings strike out as being able to cause
issues: the frame size, controlled by subbands and blocks and the msbc
mode. There's no way in hell that dual stereo and joint coding are
unsupported (except for mono streams obviously but we can detect that) and
like I said, you can add a setting to control the frame size later on, and
the SNR setting is far too trivial to not be implemented.
Hence I think that only the msbc setting and the latency need to be exposed.
If someone reports that a device doesn't support e.g. non-SNR mode we'll
add an override. We can easily add options, but removing them requires a 2
year deprecation period. In all cases, the encoder should decide the
parameters unless told otherwise, rather than having the user do so.
More information about the ffmpeg-devel
mailing list