[FFmpeg-devel] [PATCH 1/2] avcodec/vp9: Check in decode_tiles() if there is data remaining
Michael Niedermayer
michael at niedermayer.cc
Sat Aug 11 13:55:25 EEST 2018
On Fri, Aug 10, 2018 at 10:41:21PM -0400, Ronald S. Bultje wrote:
> Hi,
>
> On Fri, Aug 10, 2018 at 6:54 PM, Michael Niedermayer <michael at niedermayer.cc
> > wrote:
>
> > On Fri, Aug 10, 2018 at 02:16:56AM -0400, Ronald S. Bultje wrote:
> > > My objection:
> > > if a file has exactly symbols of 8 bits in arithdata, then after all
> > this,
> > > the arithcoder will signal empty and EOF, even though no error occured.
> > > Imagine a bitcoder (non-arith) of this situation.
> > [..]
> > > After get_bits(gb, 8),
> > > the data pointer will have reached the end, and the bits_left is 0, but
> > > that does not indicate an error, quite the contrary. It just means that
> > the
> > > byte boundary happened to align with the exact end of the file. This can
> > > happen.
> >
> > Yes but thats not something we do with bitcoders.
> > bits_left being 0 does indicate an error when the next
> > step unconditionally reads 1 or more bits.
> > The same was the intend of the patch here, check if the end was reached
> > before more reads.
> > vp56_rac_renorm() always reads data if(bits >= 0 && c->buffer < c->end)
> >
> > I had thought that will get executed in all cases, i may have missed
> > something and the check should be moved by a few lines
>
>
> For example, you're checking the arithcoder in pass=2 also, but no
> bitstream reading occurs in that pass...
this is what i meant by "should be moved by a few lines"
that is as in here:
@@ -1306,6 +1306,9 @@ static int decode_tiles(AVCodecContext *avctx,
decode_sb_mem(td, row, col, lflvl_ptr,
yoff2, uvoff2, BL_64X64);
} else {
+ if (td->c->end <= td->c->buffer && td->c->bits >= 0) {
+ return AVERROR_INVALIDDATA;
+ }
decode_sb(td, row, col, lflvl_ptr,
yoff2, uvoff2, BL_64X64);
>
> > My suggestion:
> > > add an eof flag to the arithcoder. When we have reached the above
> > condition
> > > where new data is needed but not present, simply set the EOF flag, and
> > > check that for errors. If it's set, you can error out.
> >
> > This is possible but it will make the code likely slower as the reading
> > happens in a av_always_inline function which itself is used by several
> > av_always_inline functions. So this else context->flag = 1;
> > could end up being added to many points in the binary.
> >
> > I can do this of course if you prefer it just seems sub-optimal to me
> > unless you have some idea on how to do this without increasing the
> > complexity of the rac functions
>
>
> But it's an error branch, it is not normally executed. It just makes the
> binary a few bytes larger.
The condition has to be checked even if there is no error.
Currently the existing branch is a read vs no read check that combines
both the need to read and the end of buffer checks.
with the code to set the flag there would be 3 instead of 1
if (need to read)
if(space left)
do read
else
set flag
vs
if (need to read and space left)
do read
That means the condition is split in 2 branches instead of 1 and
theres a else
and this is one of the most inner and often executed functions
>
> Here's another way of looking at this, which isn't so much about exact
> bytes and instructions (which will all change in the noise range), but
> about code maintainability: you're likely going to want to do fixes like
> this for vp8, vp9, vp56, etc., and similar for users of other arithcoders
> like cabac or the multisymbol one in daala/av1 and so on. Each of these
> decoders are in and by themselves pretty complex creatures, and the people
> understanding what's going on under the hood are few and far between, and
> half of them are no longer active. I'm not saying you're not intelligent,
> but I do think bugs like the one above can creep in because you're not
> intimately familiar with all bells and whistles in each decoder codebase.
> That being true, a case could be made that noise-range changes in
> instruction count or byte size is less important than ease of maintenance.
Of course iam not intimately familiar with every decoder. Noone is or could
be. But still ive found this pass=2 bug you talk about before you mentioned
"pass 2" the first time. So either you saw it before and did not mentioning
it directly (which sort of makes your reviews not that usefull)
or your review is just making us find bugs by randomly poking each other.
And Of course changing the code slightly leads to noise in benchmarks.
But the point is that adding operations in inner loops makes them more
complex thus slower on average. turning a single if() into a nested
if() if/else will make the code slower except by chance of the noise.
Also not only is there more code in the path exeucted. There is more
pressure on the code cache (which is small) as it needs to hold also
some of the not executed instructions that are branched over. These
instructions increase as there are more branches and a flag.
This is a very often executed code.
And the various branch prediction caches and buffers also will see
additional pressure for each of the branches added, there are 2 and
as this is a av_always_inline function used by av_always_inline functions
this amplifies the branches by a bunch so there could be quite a few
that get added
> An EOF flag is easy to maintain and hard to misread, whereas the example
> above demonstrates that the inferred checks are brittle and will be much
> harder to debug if they do make it into our codebase because someone forgot
> to review them.
In the previous mail ive suggested using a vp8_rac_is_end() function for
teh test. This resolves the issue you describe here.
You know, you have snipped the suggested solution and now present the
problem it is intended to solve. Iam not sure what to make of this
>
> So it's a balance between priorities. Does every cycle count? Or is
> maintenance more important? Each of us is correct in our point, nobody is
> wrong, but we need to balance which one is more important...
See above, the concern of maintainability is solved by using a vp8_rac_is_end()
function instead of the litteral harder to maintain check. In fact such
function is cleaner than directly accessing a flag inside the arith coder
context fron outside the arithcoder code.
And at the same time this avoids the concern about speed loss.
So there seems no need to compromise between priorities. Its easy to have both
fast code and maintainable code at the same time.
But if you still want a flag, i can add one for vp9.
I thought you really cared about your vp9 decoders speed though ...
Thanks
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
No great genius has ever existed without some touch of madness. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180811/696125a6/attachment.sig>
More information about the ffmpeg-devel
mailing list