[FFmpeg-devel] [PATCH v22 02/10] avcodec/evc_parser: Added parser implementation for EVC format

Fri May 19 13:31:44 EEST 2023

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of James
> Almer
> Sent: środa, 10 maja 2023 22:21
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v22 02/10] avcodec/evc_parser: Added
> parser implementation for EVC format
> 
> On 4/27/2023 9:02 AM, Dawid Kozinski wrote:
> > - Added constants definitions for EVC parser
> > - Provided NAL units parsing following ISO_IEC_23094-1
> > - EVC parser registration
> >
> > Signed-off-by: Dawid Kozinski <d.kozinski at samsung.com>
> > ---
> >   libavcodec/Makefile     |    1 +
> >   libavcodec/evc.h        |  155 ++++
> >   libavcodec/evc_parser.c | 1497
> +++++++++++++++++++++++++++++++++++++++
> >   libavcodec/parsers.c    |    1 +
> >   4 files changed, 1654 insertions(+)
> >   create mode 100644 libavcodec/evc.h
> >   create mode 100644 libavcodec/evc_parser.c
> 
> There seems to have been a regression in this version compared to v20.
> Try to compile with the libxevd decoder disabled and open a raw file (not
mp4).
> 
> Whereas with v20 i was getting
> 
> > Input #0, evc, from 'akiyo_cif.evc':
> >   Duration: N/A, bitrate: N/A
> >   Stream #0:0: Video: evc (Baseline), none, 352x288, 25 fps, 25 tbr,
> > 1200k tbn
> 
> I'm now getting
> 
> > Input #0, evc, from 'akiyo_cif.evc':
> >   Duration: N/A, bitrate: N/A
> >   Stream #0:0: Video: evc (Baseline), none, 555902582x0, 25 fps, 25
> > tbr, 1200k tbn
> 
> It seems that the problem showed up because you moved the parameter sets
to
> stack to skip allocating them, and you no longer check if they exist or
were
> parsed by looking at slice_pic_parameter_set_id and such.
> 
> That aside, i looked at the EVC spec and noticed that the "raw" format in
Annex-
> B is unlike that of H.26{4,5,6}: There's no start code, and instead
there's a NAL
> size prefix, which sounds like the isobmff way of encapsulating NALUs.
> I also noticed that the syntax for nal_unit() contains an
> emulation_prevention_three_byte element, but there's no explanation for it
in
> the semantics section. Its existence in H.26* is to prevent parsers from
> misinterpreting a sequence of two or more zeroes as a potential start
code, but
> that's clearly not the case here, so why is it defined and present at all?
> 
> What this means is that the parser can't be made to assemble an AU. If you
feed
> it data starting from the middle of a NAL, it will not be able to sync to
the start
> of the next NAL because all it looks for is a four byte size value and
will accept
> the very first four bytes its fed as one.
> Since i doubt the spec can be changed at this point to amend this
oversight, the
> AU assembling will have to occur in the evc demuxer, much like we do with
AV1
> (both raw obu and Annex-B formats as defined in the corresponding spec),
and
> the parser be limited to parse fully assembled NALs with
parse_nal_units().

According to your last EVC review:
We have re-implemented the EVC demuxer to assemble Access Units (AUs) and
pass them further, while the EVC parser is limited to parsing complete NAL
units provided in consecutive AUs.

However, before we create a new patchset, we would like to discuss some
things because we believe this solution is not optimal.

According to the EVC documentation, "Each access unit contains a set of VCL
NAL units that together compose a primary coded picture."
This means that to find the boundaries of an AU, we need to identify all the
NAL units that contain data for a single encoded picture.
In our case, to determine whether the next VCL NAL unit contains data for
the same picture as the previous VCL NAL unit or for a different encoded
picture, we need information contained in the NAL units. We need to extract
certain data from NAL units like SPS and PPS, as well as from the Slice
Headers of VCL NAL units.
In other words, this implies that parsing NAL units is required for this
purpose.

It may not be as extensive parsing as in the EVC parser, but still, there is
a significant amount of parsing involved, which adds some overhead.
Another question is whether the demuxer is the appropriate place for parsing
NAL units? I guess it's not.

Perhaps it would be better if the demuxer prepared complete NAL units
instead of complete AUs. The EVC parser would then receive complete NAL
units, and since it already parses them, we have the necessary information
for preparing complete AUs at no extra cost.

Preparing complete NAL units is definitely simpler than preparing complete
AUs. It only requires finding the length of the NAL unit and putting that
amount of data from the input into the AVPacket.

Please let me know your thoughts on this.

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://protect2.fireeye.com/v1/url?k=ecf2ce2f-8d79db0f-ecf34560-
> 74fe485fb347-2d1651bc168e08a9&q=1&e=5d9c25e1-dfe9-4e94-af2d-
> a9f88fe66d4f&u=https%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpe
> g-devel
> 
> To unsubscribe, visit link above, or email ffmpeg-devel-request at ffmpeg.org
> with subject "unsubscribe".