[FFmpeg-devel] [PATCH 03/14] pthread_frame: merge the functionality for normal decoder init and init_thread_copy

Anton Khirnov anton at khirnov.net
Fri Apr 3 09:32:26 EEST 2020


Quoting David Bryant (2020-04-03 07:14:21)
> On 4/2/20 2:13 AM, Anton Khirnov wrote:
> > Quoting David Bryant (2020-04-01 23:35:13)
> >> On 3/31/20 2:47 AM, Anton Khirnov wrote:
> >>>>> Looking at wavpack, the code looks suspicious to me. You allocate one
> >>>>> dsdctx per channel at init, but AFAIU the number of channels can change
> >>>>> at each frame.
> >>>>>
> >>>> Multichannel WavPack consists of sequences of stereo or mono WavPack blocks that carry flags indicating whether they are the
> >>>> beginning or end of a "frame" or "packet". This allows the number of channels to be essentially unlimited; however, the
> >>>> total number of channels in a stream may not change. When decoding, if a sequence of blocks produces more or fewer channels
> >>>> than expected, this is flagged as an error and overrun is prevented (see libavcodec/wavpack.c, line 1499 and line 1621).
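
[Aside, for readers following along: a minimal, self-contained sketch of the
kind of accumulation/overrun check being described. The flag names, types and
values below are invented for illustration and are not the actual wavpack.c
code.]

#include <stdio.h>

#define BLK_FLAG_MONO   0x01  /* sub-block carries 1 channel instead of 2 */
#define BLK_FLAG_FINAL  0x02  /* last sub-block of the current frame      */

struct block { unsigned flags; };

/* Accumulate channels over the sub-blocks of one frame; return 0 if they
 * add up to exactly expected_channels, -1 on overrun or underrun. */
static int check_frame_channels(const struct block *b, int nb_blocks,
                                int expected_channels)
{
    int got = 0;

    for (int i = 0; i < nb_blocks; i++) {
        int ch = (b[i].flags & BLK_FLAG_MONO) ? 1 : 2;

        if (got + ch > expected_channels)
            return -1;                          /* overrun */
        got += ch;

        if (b[i].flags & BLK_FLAG_FINAL)
            return got == expected_channels ? 0 : -1;
    }
    return -1;                                  /* no final block seen */
}

int main(void)
{
    struct block ok[]  = { {0}, {BLK_FLAG_MONO}, {BLK_FLAG_MONO | BLK_FLAG_FINAL} };
    struct block bad[] = { {0}, {0}, {BLK_FLAG_FINAL} };

    printf("ok:  %d\n", check_frame_channels(ok,  3, 4));  /* 2+1+1 == 4 -> 0  */
    printf("bad: %d\n", check_frame_channels(bad, 3, 4));  /* 2+2+2  > 4 -> -1 */
    return 0;
}
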
> >>> If my reading of the code is correct, line 1499 only checks for overflow
> >>> of the channel count read for the current sequence. That channel count
> >>> is exported to AVCodecContext in the block starting on line 1464 and
> >>> I don't see any checks for whether it is equal to the initial one.
> >>>
> >> Full disclosure: I did not write the WavPack decoder. I understand _how_ it works, but I don't necessarily understand _why_ it
> >> was written the way it was due to gaps in my knowledge of the FFmpeg internals, and there are certainly things in there that
> >> don't look quite right to me.
> >>
> >> That said, I have tested it thoroughly and it handles everything I throw at it, and I have extended it to handle the DSD mode
> >> without anything unexpected happening. And of course it seems to handle extensive fuzzing, and the recent error was related to
> >> my DSD change, suggesting that the core is pretty robust. My inclination has been not to "fix" anything that wasn't broken.
> >>
> >> Yes, that channel count stuff is a little difficult to follow. I could experiment with streams that abruptly change channel
> >> configuration, but again I'm hesitant to change code that's worked so well for so long.
> > I can easily get the decoder to crash even in single-threaded mode. Just
> > make the demuxer always say the stream is mono, then try decoding your
> > DSD sample.
> 
> Okay, I reproduced that. However, I'm not sure it's really that serious because the demuxer and the decoder use basically the
> same logic to determine the number of channels, so it would be tricky to craft a file that causes the demuxer to calculate 1
> channel and the decoder to reach a different result, which is probably why the fuzzers have not been able to do so.

I disagree - this is just as serious as any other crash on invalid data.

The decoder must never crash, no matter what input data it is given. It
cannot assume that it will always be provided valid input or that it is
always used with some specific demuxer. The caller might not be using
libavformat at all; libavcodec should work with any source of data.

Besides, I can reproduce the crash with no code modification necessary
by muxing your sample into Matroska and then hex-editing the file to say
it has only one channel.

> 
> That said, it would not be hard to have the decoder save away the initial channel count from avctx, never overwrite it, and
> verify that all superblocks match that count. I don't know offhand whether that would break anything, though. Is it guaranteed
> that avctx->channels is correct during the call to decode_init(), even if it's being demuxed from Matroska, for example? If so,
> why is the WavPack decoder _ever_ writing to the avctx channel count?

The question is what you define as correct. The initial value provided
by the caller is supposed to be the caller's best guess at the correct
(initial) value. It may come from a demuxer, or from the Delphic Oracle.
The decoder doesn't know.

For some codecs, the relevant parameters are not specified in the
bitstream, so they have no choice but to use the caller's value. That is
not the case for WavPack, since the channel count IS written in the
bitstream. So I think it's best to ignore the caller-provided value and
always use the one we get from the data.
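
To make the point concrete, here is a minimal sketch (plain C, invented
names and bounds, not a patch against wavpack.c) of what I mean by
validating and using the bitstream value instead of the caller's:

#include <stdio.h>

#define MAX_CHANNELS 4096              /* arbitrary sanity bound for the sketch */

struct dec_ctx {
    int channels;                      /* value we actually decode and export */
};

/* header_channels is whatever was parsed from the current block header;
 * the caller-provided value is only a starting point and is never
 * trusted once real data arrives. */
static int set_channels_from_bitstream(struct dec_ctx *c, int header_channels)
{
    if (header_channels < 1 || header_channels > MAX_CHANNELS)
        return -1;                     /* invalid data, refuse to decode */

    c->channels = header_channels;     /* export what the data says */
    return 0;
}

int main(void)
{
    struct dec_ctx c = { .channels = 2 };   /* caller's best guess at init */

    printf("%d, channels=%d\n", set_channels_from_bitstream(&c, 6), c.channels);
    printf("%d, channels=%d\n", set_channels_from_bitstream(&c, 0), c.channels);
    return 0;
}

The catch, as noted at the start of this thread, is that any per-channel
state allocated at init then has to follow this value whenever it changes.
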

> 
> >
> > Also, I looked at the specification PDF you linked in the previous email
> > and I see nothing in it that explicitly forbids the channel count from
> > changing from one "superblock" to the next. It seems to me like you
> > could just concatenate wavpack files with different channel counts and
> > get a valid wavpack file.
> >
> You're right, I could not find that being explicitly forbidden. Nevertheless, it _is_ forbidden for any stream parameters to
> change (e.g. data format, sampling rate, channel count or mapping), and both libwavpack and the FFmpeg demuxer enforce this (see
> libavformat/wvdec.c starting at line 211), or at least gracefully fail without crashing.

Forbidden by whom? From what I can see:
- sample format: the wv demuxer does not set it at all, the decoder sets
  it in wavpack_decode_frame() from the block header; also, Michael sent
  a patch recently to deal with a wavpack decoder crash related to
  format changes
- sample rate: set by the wv demuxer and then checked so that it does
  not change; but the caller is not necessarily using the wv demuxer,
  and the decoder will happily change it the same way it does with the
  channel count

> 
> The reason it's never mentioned is probably just because it never occurred to me. None of the source file formats for WavPack
> (e.g., wav, caf, dsdiff) allow format changes, and neither do any of the other lossless compressors I am aware of.

Correct, I cannot see any other lossless codecs that would allow this.
But then again, all these codecs have global headers. So it's not that
they artificially forbid format changes; rather, the bitstream format is
not able to represent them.

That is different for WavPack - it goes out of its way to make its
blocks independent with no global headers. All other codecs that work
like this allow format changes, so it seems natural that WavPack should
as well.

As I said before - it seems to me that you could simply concatenate two
WavPack files and get a valid WavPack file.

> I can't imagine how that would work,

Why not? We have plenty of decoders where the format or channel count
can change at any point in the stream.
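
For illustration, a rough sketch (invented names, not actual FFmpeg code)
of how per-channel state can follow such a change, so that init-time
allocations never outlive the data; the real wavpack DSD contexts would
additionally need their own reset logic on top of this:

#include <stdlib.h>
#include <string.h>

struct chan_state { int dummy; };      /* stand-in for per-channel (DSD) state */

struct dec {
    struct chan_state *ch;
    int nb_ch;
};

/* (Re)allocate per-channel state whenever the count read from the frame
 * differs from what is currently allocated. Returns 0 on success, -1 on
 * allocation failure (old state stays valid in that case). A real decoder
 * would also bound new_nb_ch and guard the multiplication before this. */
static int ensure_channel_state(struct dec *d, int new_nb_ch)
{
    struct chan_state *tmp;

    if (new_nb_ch == d->nb_ch)
        return 0;

    tmp = realloc(d->ch, new_nb_ch * sizeof(*tmp));
    if (!tmp)
        return -1;

    d->ch = tmp;
    if (new_nb_ch > d->nb_ch)          /* zero-init only the newly added channels */
        memset(d->ch + d->nb_ch, 0, (new_nb_ch - d->nb_ch) * sizeof(*tmp));
    d->nb_ch = new_nb_ch;
    return 0;
}

int main(void)
{
    struct dec d = { 0 };
    ensure_channel_state(&d, 2);       /* e.g. stereo at init */
    ensure_channel_state(&d, 6);       /* stream switches to 5.1 mid-way */
    free(d.ch);
    return 0;
}
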

> and extensive fuzzing has ensured that pathological files like you
> suggest are unlikely to be the source of exploits.

See the crash above. I think fuzzing is rather overhyped these days. It's
a useful tool for sure, but one should not rely on it too heavily. It may
not cover important parts of the input space because of the way it is set
up. Or there could be exploitable codepaths that require inputs too
specific for a random process to find, but which a determined attacker
could construct by reading the code.

-- 
Anton Khirnov

