[FFmpeg-devel] [PATCH 1/3] 4xm: prevent overflow during bit rate calculation

Fri Dec 16 01:00:24 EET 2016

On Thu, Dec 15, 2016 at 03:57:57PM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Thu, Dec 15, 2016 at 9:28 AM, Michael Niedermayer <michael at niedermayer.cc> wrote:
> 
> > On Thu, Dec 15, 2016 at 08:02:52AM -0500, Ronald S. Bultje wrote:
> > > Hi,
> > >
> > > On Wed, Dec 14, 2016 at 7:11 PM, Andreas Cadhalpun <andreas.cadhalpun at googlemail.com> wrote:
> > >
> > > > On 14.12.2016 02:46, Ronald S. Bultje wrote:
> > > > > Not wanting to discourage you, but I wonder if there's really a point to
> > > > > this...?
> > > >
> > > > These patches are prerequisites for enforcing the validity of codec
> > > > parameters [1].
> > > >
> > > > > I don't see how the user experience changes.
> > > >
> > > > Users won't see ffmpeg claiming nonsense bit rates like -1184293205235990 kb/s
> > > > anymore.
> > >
> > >
> > > I don't think you understand my question.
> > >
> > > - does this belong in 4xm.c? (Or in generic code? Or in the app?)
> > > - should it return an error? (Or just clip parameters? Or ignore the
> > > invalid value?)
> > >
> > > > > This isn't specifically intended at this patch, but rather at the sort of
> > > > > rabbit hole this change might lead to,
> > > >
> > > > I have a pretty good map of this rabbit hole (i.e. lots of samples triggering
> > > > UBSan errors) and one day I might try to dig it up, but for now I'm limiting
> > > > myself to the codec parameters.
> > >
> > >
> > > I'm not saying mplayer was great, but one of the principles I believe we
> > > always copied from them was to try to play files to the best of our
> > > abilities and not error out at the first convenience. This isn't just a
> > > theoretical thing, a lot of people credited mplayer with playing utterly
> > > broken AVI files that probably even ffmpeg rejects. What ffmpeg added on
> > > top of that is to make a project maintainable by not being an utter
> > > crapshoot.
> > >
> > > This isn't 4xm.c-specific, this is a general philosophical question:
> > > - should we error out?
> > > - should this be in generic code if we're likely to repeat such checks all
> > > over the place?
> > >
> > > > > which would cause the code to be uber-full of such checks, none of which
> > > > > really have any significance. But maybe others disagree...
> > > >
> > > > Not relying on undefined behavior is a significant improvement. And doing
> > > > these checks consistently where the values are set makes it possible
> > > > for other code to rely on their validity without further checks.
> > >
> > >
> > > Unless "UB" leads to actual bad behaviour, I personally don't necessarily
> > > care. I'm very scared that when you go beyond codec parameters, and you
> > > want to check for overflows all over the place, we'll never see the end of
> > > it...
> > >
> >
> > > I'd appreciate if others could chime in here, I don't want to carry this
> > > argument all by myself if nobody cares.
> >
> > As you are asking for others' opinions:
> > valid C code must not trigger undefined behavior.
> 
> 
> So, I asked on IRC, here's 3 suggestions from Roger Combs:
> 
> - in case we want to be pedantic (I personally don't care, but I understand
> other people are), it might make sense to just make these members
> (channels, block_align, bitrate) unsigned. That removes the UB of the
> overflow, and it fixes the negative number reporting in client apps for
> bitrate, but you can still have positive crazy numbers and it doesn't
> return an error.
> - if for whatever reason some things cannot be done in generic code or by
> changing the type (this should really cover most cases), and we want
> specific overflow checks, then maybe we want to have some generic helper
> macros that make them one-liners in decoders. This would return an error
> along with fixing the UB.
> - have overflow-safe multiplication functions instead of checking each
> argument before doing a multiply, like __builtin_mul_overflow, and then
> return INT64_MAX if it would overflow inside that function. This fixes
> display of numbers in client applications and the UB, but without returning
> an error.
> 
> What I want most - and this comment applies to all patches of this sort -
> is to have something that we can all agree is OK for pretty much any
> decoder out there without significant overhead in code (source - not
> binary) size. This way, you have a template to work from and fix specific
> instances without us having to argue over every single time you do a next
> run with ubsan. I personally like suggestion (1), unsigned is a pretty good
> type for things that shouldn't have negative values. WDYT?

Unsigned is not unproblematic.
If you use unsigned values, all computations they touch become unsigned
(or use a larger data type). If unsigned, they cannot be negative. This
requires great care: buffer end, index and start computations can become
invalid very easily when the signedness of a type involved changes.
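
To illustrate the kind of breakage meant here, a minimal sketch follows.
The function names and the buffer scenario are hypothetical and are not
taken from any FFmpeg decoder; the same "is there enough room" check is
written once with int and once with unsigned.

```c
/* Hypothetical helpers, for illustration only. */

/* Signed variant: a header larger than the buffer yields a negative
 * remainder, so the comparison correctly reports "no room". */
static int has_room_signed(int buf_size, int header_len, int needed)
{
    int remaining = buf_size - header_len;
    return remaining >= needed;
}

/* Unsigned variant: the same subtraction wraps around to a huge
 * positive value, so the comparison wrongly reports "room available". */
static int has_room_unsigned(unsigned buf_size, unsigned header_len,
                             unsigned needed)
{
    unsigned remaining = buf_size - header_len;
    return remaining >= needed;
}
```

With buf_size = 4, header_len = 8 and needed = 16, the signed check
fails as intended, while the unsigned one passes because 4u - 8u wraps.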

Also, unsigned makes detecting overflows impossible with the existing
tools I know of. Right now, if a decoder simply gives bad output with
no clear hint as to why, one can use ubsan, asan, valgrind, ...
and it will show integer overflows if that's the cause. But with
unsigned values nothing would be shown in a case where an affected
computation overflowed.

IMO, if you compute the bitrate out of channels, bits per sample and
sample rate,
it's correct to check that the result is representable in the bitrate
field if prior checks on the inputs are not sufficient.

If computations involve, for example, array indexing or division,
checks can't be skipped, as they would crash. Should we treat integer
overflow differently?
IMO the professional thing to do is to set the bitrate correctly or,
if that is not possible, to (not set it / set it to "unknown" / set it
by other means).
Setting it to some overflowed value, even if it doesn't matter for
the functioning of the demuxer, is at least ugly.
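
The "set it to unknown" variant could look roughly like the sketch
below. It uses __builtin_mul_overflow, which was mentioned above and is
a GCC/Clang builtin, so this is not portable C. Treating 0 as "unknown"
is an assumption made for this example, and the names are hypothetical.

```c
#include <stdint.h>

/* Assumption for this sketch: 0 stands for "bit rate unknown". */
#define BIT_RATE_UNKNOWN 0

/* Hypothetical helper: returns the product of the three inputs, or
 * BIT_RATE_UNKNOWN if an input is not positive or the product would
 * overflow int64_t. No error is returned, the value is just left unset. */
static int64_t bit_rate_or_unknown(int64_t channels,
                                   int64_t bits_per_sample,
                                   int64_t sample_rate)
{
    int64_t result;

    if (channels <= 0 || bits_per_sample <= 0 || sample_rate <= 0)
        return BIT_RATE_UNKNOWN;

    /* __builtin_mul_overflow returns nonzero when the exact product
     * does not fit in the type of the result. */
    if (__builtin_mul_overflow(channels, bits_per_sample, &result) ||
        __builtin_mul_overflow(result, sample_rate, &result))
        return BIT_RATE_UNKNOWN;

    return result;
}
```

Unlike a check that returns an error, this keeps the file playable and
only drops the one value that could not be computed.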

Anyway, what I think is really important is to understand the
extent of this. Are we dealing with 100, 1000, 10000 individual issues?
I think if it's 100, or rather if 100 checks would fix all issues, then
simply adding checks should not really be a problem.

Considering that I have been fixing these kinds of issues for a while and
people do not report astronomical numbers of such issues to me, it may
be that this whole problem is not of a magnitude that requires
extreme measures.
But I may be wrong; I have not tried to analyze myself how many such
issues exist.

What I do know, though, is that there are some of these integer overflows
in DSP code with fuzzed samples, and I do not know of a solution
for the DSP case that I like. It's almost as if leaving the DSP code
as is, with the overflows, is the least bad option.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is what and why we do it that matters, not just one of them.