[FFmpeg-devel] [PATCH] libswresample: avoid s16p internal processing format

Sun Jan 8 17:18:44 EET 2023

On 1/8/23, Michael Niedermayer <michael at niedermayer.cc> wrote:
> On Fri, Jan 06, 2023 at 07:01:06PM +0100, Paul B Mahol wrote:
>> On Fri, Jan 6, 2023 at 6:25 PM Michael Niedermayer
>> <michael at niedermayer.cc>
>> wrote:
>>
>> > On Thu, Jan 05, 2023 at 11:08:25PM +0100, Paul B Mahol wrote:
>> > > On Thu, Jan 5, 2023 at 9:53 PM Michael Niedermayer <
>> > michael at niedermayer.cc>
>> > > wrote:
>> > >
>> > > > On Thu, Jan 05, 2023 at 01:44:10PM +0100, Paul B Mahol wrote:
>> > > > > Patch attached.
>> > > >
>> > > > >  swresample.c |    3 ++-
>> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > > eee7a0685b44aa867562138a2e2437ecb8844612
>> > > > 0001-libswresample-swresample-avoid-s16p-internal-transfe.patch
>> > > > > From 9c4cd60e2dd41cf98d693c8251f4cfade0807073 Mon Sep 17 00:00:00
>> > 2001
>> > > > > From: Paul B Mahol <onemda at gmail.com>
>> > > > > Date: Thu, 5 Jan 2023 13:40:12 +0100
>> > > > > Subject: [PATCH] libswresample/swresample: avoid s16p internal
>> > transfer
>> > > > format
>> > > > >
>> > > > > Instead use float one by default for sample rate conversions.
>> > > > > The s16p internal transfer format produces visible and hearable
>> > > > > quantization artifacts.
>> > > >
>> > > > When does this occur and why?
>> > > >
>> > >
>> > > It occurs always. Just compare output with 16bit and
>> > > int32/float/double.
>> > > Look at other people report on internet.
>> > > Look at src.infinitewave.ca
>> >
>> > src.infinitewave.ca uses 32bit none of what it shows should touch the
>> > codepath
>> > you change.
>> >
>> > if we look at src.infinitewave.ca for swr we see 2 types of artifacts
>> > 1. Aliassing which is at maybe -120db with the actual signal at 0db
>> >    i would like to see some evidence that a human can hear this
>> >
>>
>> For s16p<->s16p it is much lower, around -78dB thus this patch.
>
> ok but you pointed to the website that apparently uses >=32bit if i trust
> what they write.
> And even if they test this i cannot use that website to replicate the issue
> and the fix

If one use pure 16bit processing sweep results are even worse.

Just resample using fltp/dblp/s32p and s16p and compare (it does not
matter what, just not simple sine constant frequencies waves)

The s16p result is much worse and contains huge quantization noises.

They are not that obviously easy to hear, but are there, and
difference is > -80dB for dithered 16bit input.

You can generate and display sweep with/out resampling all with
ffmpeg/ffplay.ffplay -f lavfi -i
aevalsrc="sin(PI*t*t*t*100):s=96000",aresample=44100:tsf=s16p,showspectrum=scale=log:color=cool:overlap=0:fps=25:drange=96:legend=1

Play with tsf=values and drange=[96-150]
So, for 16bit input, drange=96 and tsf=s16p looks fine, but that web
pages shows bad results,
thus I think that they use >16bit and tsf=s16p.

But anyway the filter kernel is quantized to 16bit for s16p and if
16bit input is dithered results are also worse for s16p, because if
you reformat s16p to flt/s32p and resample and then convert back to
s16p errors are bigger than 1/2 bits range when compared to direct s16
resampling.

>
> I just wanted to know how i can test this. You are testing it too ...
> so i can see what you see
> Id like to make sure this is the correct fix for the problem and
> Id like to make sure its used when it makes sense and not when not.
>
>
>>
>> Also for others and reports for swr its is lower than exact -120dB
>
> The 120 was by "eye" from teh chart on  the web i didnt meassure it
>
>
>>
>>
>> 2. Reflection and attenuation at the transition frequency
>> > With linear filters there is a tradeof between attenuation of the
>> > passband, reflection of frequencies beyond, latency and so on
>> > You can have a perfect sharp cutoff with no attenuation and no
>> > refelection
>> > that requires a infinitly long filter. And while this looks best in
>> > this
>> > frequency plot, does it actually sound best ? If you can hear -120db
>> > signals you surely would then also hear the ringing long before a
>> > gunshot
>> > from such long filter.
>> >
>> > also what actually is the optimal frequency response of this filter ?
>> > with a 22khz cutoff, a 22.1khz sine should be silence is that
>> > really subjectively better than a 21.9khz sine ?
>> > Iam not sure about this. Has someone done actual hearing tests with
>> > actual real audio? the sinc filter originates from the idea of lossless
>> > reconstruction of frequencies below nyquist if iam not mistaken, but
>> > humans
>> > are not trying to losslessly restore a block of frequencies. A human
>> > listener
>> > generally wants to enjoy listening to some media. Has someone looked
>> > into
>> > what is actually best for that real use case ?
>> > This question matters because with it we can tune the filter parameters
>> > to
>> > target humans.
>> >
>> > But lets push the doubts about choosing resampling purely based on
>> > frequency
>> > analysis away.
>> > swresample has several parameters with which you can tune this:
>> > we have a filter_size, if thats bigger you should get closer to the
>> > ideal
>> > sinc. Theres phase_shift which may reduce the (i assume) unhearable
>> > aliasing.
>> > And cutoff which should allow to tune the (i assume) hearable
>> > reflection/attenuation tradeoff also theres filter_type to allow you to
>> > tune the
>> > window function.
>> >
>> > If there are issues reported by people using their ears, please provide
>> > more
>> > details, iam interrested in these cases.
>> >
>> >
>> > >
>> > >
>> > > > This change should be limited to the case that benefits, this would
>> > force
>> > > > this
>> > > > even without resampling in some cases.
>> > > >
>> > >
>> > > It is forced only if sample rates between input and output differs.
>> >
>> > If iam not mistaken it affects rematrixing without resampling too
>> >
>>
>> How so?
>> I really doubt that this patch do that.
>
> I could be missing something but
> int_sample_fmt is set to before 16bit and afterwards 32bit
> and alot of things are using this:
>     set_audiodata_fmt(&s->postin, s->int_sample_fmt);
>     set_audiodata_fmt(&s->midbuf, s->int_sample_fmt);
>     set_audiodata_fmt(&s->preout, s->int_sample_fmt);
>
> rematrix seems using these
>             swri_rematrix(s, preout, midbuf, out_count, preout==out);
> ...
>             swri_rematrix(s, midbuf, postin, in_count, midbuf==out);
>
> so i assumed that this patch makes a difference for it. Again i could be
> missing
> something

Yes, but only if sample rates differ the fltp is used instead of s16p
for resampling.

>
> thx
>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> The educated differ from the uneducated as much as the living from the
> dead. -- Aristotle
>