[FFmpeg-devel] [PATCH 03/14] pthread_frame: merge the functionality for normal decoder init and init_thread_copy

Mon Apr 6 07:11:17 EEST 2020

On 4/2/20 11:32 PM, Anton Khirnov wrote:
> Quoting David Bryant (2020-04-03 07:14:21)
>> On 4/2/20 2:13 AM, Anton Khirnov wrote:
>>> Quoting David Bryant (2020-04-01 23:35:13)
>>>> On 3/31/20 2:47 AM, Anton Khirnov wrote:
>>>>>>> Looking at wavpack, the code looks suspicious to me. You allocate one
>>>>>>> dsdctx per channel at init, but AFAIU the number of channels can change
>>>>>>> at each frame.
>>>>>>>
>>>>>> Multichannel WavPack consists of sequences of stereo or mono WavPack blocks that include flags indicating whether they are the
>>>>>> beginning or end of a "frame" or "packet". This allows the number of channels available to be essentially unlimited, however the
>>>>>> total number of channels in a stream may not change. When decoding, if a sequence of blocks generates more or less than the
>>>>>> correct number of channels, this is flagged as an error and overrun is prevented (see libavcodec/wavpack.c, line 1499 and line
>>>>>> 1621).
>>>>> If my reading of the code is correct, line 1499 only checks for overflow
>>>>> of the channel count read for the current sequence. That channel count
>>>>> is exported to AVCodeContext in the block starting on line 1464 and
>>>>> I don't see any checks for whether it is equal to the initial one.
>>>>>
>>>> Full disclosure: I did not write the WavPack decoder. I understand _how_ it works, but I don't necessarily understand _why_ it
>>>> was written the way it was due to gaps in my knowledge of the FFmpeg internals, and there are certainly things in there that
>>>> don't look quite right to me.
>>>>
>>>> That said, I have tested it thoroughly and it handles everything I throw at it, and I have extended it to handle the DSD mode
>>>> without anything unexpected happening. And of course it seems to handle extensive fuzzing, and the recent error was related to
>>>> my DSD change, suggesting that the core it pretty robust. My inclination has been to not "fix" anything that wasn't broken.
>>>>
>>>> Yes, that channel count stuff is a little difficult to follow. I could experiment with streams that abruptly change channel
>>>> configuration, but again I'm hesitant to change code that's worked so well so long.
>>> I can easily get the decoder to crash even in single-threaded mode. Just
>>> make the demuxer always say the stream is mono, then try decoding your
>>> DSD sample.
>> Okay, I reproduced that. However, I'm not sure it's really that serious because the demuxer and the decoder use basically the
>> same logic to determine that number of channels, so it would be tricky to craft a file that caused the demuxer to calculate 1
>> channel and the decoder to reach a different result, which is probably why the fuzzers have not been able to do so.
> I disagree - this is just as serious as any other crash on invalid data.
>
> The decoder must not ever crash for arbitrary input data. It cannot
> assume that it's always going to be provided valid input or that it's
> always used with some specific demuxer. The caller might not be using
> libavformat at all, libavcodec should work with any source of data.
>
> Besides, I can reproduce the crash with no code modification necessary
> by muxing your sample into Matroska and then hex-editing the file to say
> it has only one channel.

Yes, I have reproduced this crash even without Matroska. I was working on a solution but you beat me to it. Thanks!

>
>> That said, it would not be hard to have the decoder save away the initial channel count from avctx, never overwrite it, and
>> verify that all superblocks match that count. I don't know off hand whether that would break anything though. Is it guaranteed
>> that avctx->channels is correct during the call to decode_init(), even if it's being demuxed from Matroska, for example? If so,
>> why is the WavPack decoder _ever_ writing to the avctx channel count?
> The question is what you define as correct. The initial value provided
> by the caller is supposed to be the caller's best guess at the correct
> (initial) value. It may come from a demuxer, or from the Delphic Oracle.
> The decoder doesn't know.
>
> For some codecs, the relevant parameters are not specified in the
> bistream, so they have no choice but to use the caller's value. That is
> not the case for WavPack, since the channel count IS written in the
> bistream. So I think it's best to ignore the caller-provided value and
> always use the one we get from the data.

Agreed.

>
>>> Also, I looked at the specification PDF you linked in the previous email
>>> and I see nothing in it that explicitly forbids the channel count from
>>> changing from one "superblock" to the next. It seems to me like you
>>> could just concatenate wavpack files with different channel counts and
>>> get a valid wavpack file.
>>>
>> You're right, I could not find that being explicitly forbidden. Nevertheless, it _is_ forbidden for any stream parameters to
>> change (e.g. data format, sampling rate, channel count or mapping) and both libwavpack and the FFmpeg demuxer enforce this (see
>> libavformat/wvdec.c starting at line 211), or at least gracefully fail without crashing.
> Forbidden by whom? From what I can see:
> - sample format: the wv demuxer does not set it at all, the decoder sets
>   it in wavpack_decode_frame() from the block header; also, Michael sent
>   a patch recently to deal with a wavpack decoder crash related to
>   format changes
> - sample rate: set by wv demuxer and then checked that it does not
>   change; but the caller is not necessarily using the wv demuxer and the
>   decoder will happily change it the same way it does with channel count
>
>> The reason it's never mentioned is probably just because it never occurred to me. None of the source file formats for WavPack
>> (e.g., wav, caf, dsdiff) allow format changes and neither do any of the other lossless compressors I am aware of.
> Correct, I cannot see any other lossless codecs that would allow this.
> But then again all these codecs have global headers. So it's not that
> they artificially forbid format changes, but the bistream format is not
> able to represent them.

I disagree with that. It would be trivial to add a new STREAMINFO block to a FLAC file and change the format after that. This is
not specifically prohibited in the FLAC specs, but it's sort of implied when it says that the initial STREAMINFO block has
"information about the whole stream".

WAV files could certainly have multiple <fmt> and <data> pairs in a file, and I could not find any mention of this being
prohibited, but you would be hard pressed to find a program that would deal with that correctly.

And WavPack actually does have a global header of sorts. The first block that contains audio determines the format of the entire
file. I could not find that explicitly stated anywhere, but it is how it works, and changing the format in the middle of the
file is undefined behavior (although it should not crash, obviously).

>
> That is different for WavPack - it goes out of its way to make its
> blocks independent with no global headers. All other codecs that work
> like this allow format changes, so it seems natural that WavPack should
> as well.
>
> As I said before - it seems to me that you could simply concatenate two
> WavPack files and get a valid WavPack file.

Well, you actually can't produce a valid file that way because the WavPack blocks include information required for seeking. But
if you got around that it would still not be a valid file, by convention.

>
>> I can't imagine how that would work,
> Why not? We have plenty of decoders where the format or channel count
> can change whenever.

Well, there are all sorts of reasons why this would not be appropriate for native WavPack files, and what they are used for
today. For example, if the sample rate changed it would be impossible to determine the duration without scanning the whole file.
Also, libwavpack does not have the ability to generate such files nor the APIs that would allow decoding them in any reasonable way.

That said, I am only referring to the native WavPack container file format (.wv and .wvc). If WavPack streams are embedded in
another container (e.g., Matroska) then this is fine (assuming the container format is happy with it) because, as you say, each
WavPack audio block is independent.

>
>> and extensive fuzzing has ensured that pathological files like you
>> suggest are unlikely to be the source of exploits.
> See the crash above. I think fuzzing is very overhyped these days. It's
> a useful tool for sure, but one should not rely purely on fuzzing too
> much. It may not cover important parts of the input space because of the
> way it is set up. Or there could be exploitable codepaths that require
> too specific inputs for a random process to find, but a determined
> attacker could construct them from reading the code.
>
Of course fuzzing cannot find everything, but my experience has been that fuzzing finds things that I would _never_ find by just
looking at the code.

And I was certainly not suggesting that one should just write sloppy code knowing that all the bugs will be found with fuzzing.

-David