[FFmpeg-devel] [PATCH v11 1/3] libavcodec/dnxucdec: DNxUncompressed decoder
Martin Schitter
ms+git at mur.at
Fri Oct 11 04:08:23 EEST 2024
On 10.10.24 14:13, Anton Khirnov wrote:
> Quoting Martin Schitter (2024-10-10 04:58:40)
>> +static int pass_though(AVCodecContext *avctx, AVFrame *frame, const AVPacket *avpkt)
>> +{
>> + /* there is no need to copy as the data already match
>> + * a known pixel format */
>> +
>> + frame->buf[0] = av_buffer_ref(avpkt->buf);
>> +
>> + if (!frame->buf[0]) {
>> + return AVERROR(ENOMEM);
>> + }
>> +
>> + return av_image_fill_arrays(frame->data, frame->linesize, avpkt->data,
>> + avctx->pix_fmt, avctx->width, avctx->height, 1);
>> +}
>
> I've already commented on this in a previous version - this should be
> directly exported as rawvideo by the demuxer rather than requiring a
> special encoder.
hmm -- you touched this topic once in an annotation one month ago:
On 09.09.24 12:56, Anton Khirnov wrote:
> Quoting Martin Schitter (2024-09-08 20:41:38)
>> diff --git a/libavcodec/dnxucdec.c b/libavcodec/dnxucdec.c
>> new file mode 100644
>> index 0000000..455c374
>> --- /dev/null
>> +++ b/libavcodec/dnxucdec.c
>> @@ -0,0 +1,495 @@
>> +/*
>> + * Avid DNxUncomressed / SMPTE RDD 50 demuxer
>
> This says it's a demuxer, while it's implemented as a decoder.
>
> I'm also wondering if this shouldn't be handled as demuxer exporting
> raw video, at least in some of cases if not all.
And I quickly responded:
On 10.09.24 13:58, martin schitter wrote:
> When I started the work, I also thought, it will not require much
> more than minimal modifications of MXF related raw data transport
> details, but during reverse engineering the actually used dense
> bitpacking turned out to be more complicated.
>
> The codec section should fit rather well for this component,
> although large parts of it could be also categorized just as
> bitstream filter.
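To illustrate the kind of dense bitpacking meant above, here is a
minimal, self-contained sketch (not the actual dnxucdec code) that
unpacks 10-bit samples stored back-to-back, MSB first, into 16-bit
values -- the sort of repacking an RDD 50 payload can require before
the data matches any stock pixel format:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch only -- not the dnxucdec implementation.
 * Unpack 'count' 10-bit samples packed back-to-back (MSB first)
 * into 16-bit destination values. */
static void unpack_10bit(const uint8_t *src, uint16_t *dst, size_t count)
{
    uint32_t bits  = 0; /* bit accumulator */
    int      nbits = 0; /* number of valid bits in the accumulator */

    for (size_t i = 0; i < count; i++) {
        while (nbits < 10) {          /* refill until one sample fits */
            bits = (bits << 8) | *src++;
            nbits += 8;
        }
        nbits -= 10;
        dst[i] = (bits >> nbits) & 0x3FF; /* take the top 10 bits */
    }
}
```

A plain rawvideo demuxer path cannot express this layout, which is the
core of the argument for keeping a decoder-side component.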
Although I accepted all your other suggestions for improvement and
rewrote lots of code, I didn't change my mind about this particular
topic. And I also didn't get any response or further defense of your
point of view on this issue until now.
I really think you are wrong in this judgment.
There is a good reason why these new standards for uncompressed
payloads were published recently. The embedding conditions they
describe are far more complex than just adding simple raw image data
streams.
Sure, there is no fancy compression or other kind of rocket science
involved, but the specified range of possible configuration variants,
defined pixel formats and image layer combinations goes far beyond any
simplicity expected at first sight.
I'm really curious whether the parallel effort implementing the
uncompressed MP4 payload counterpart will come to a different
conclusion in this regard and get rid of any codec-like,
file-format-specific intermediate layer by just extending the [de]muxer.
But for DNxUncompressed I really think the current implementation
structure makes sense -- although so far it only supports the most
elementary core features and not all those other rarely used
capabilities, which would require even more substantial adaptation effort.
>> +static int float2planes(AVCodecContext *avctx, AVFrame *frame, const AVPacket *pkt)
>> +{
>> + int lw;
>> + const size_t sof = 4;
>> +
>> + lw = frame->width;
>> +
>> + for(int y = 0; y < frame->height; y++){
>> + for(int x = 0; x < frame->width; x++){
>> + memcpy(&frame->data[2][sof*(lw*y + x)], &pkt->data[sof* 3*(lw*y + x)], sof);
>> + memcpy(&frame->data[0][sof*(lw*y + x)], &pkt->data[sof*(3*(lw*y + x) + 1)], sof);
>> + memcpy(&frame->data[1][sof*(lw*y + x)], &pkt->data[sof*(3*(lw*y + x) + 2)], sof);
>> + }
>> + }
>
> Same here, deinterleaving packed to planar is a job for swscale.
Some of these rather inefficient interleaved<->planar conversions were
simply necessary in practice, because swscale wasn't able to figure out
a working pipeline construction otherwise, although in theory the pixel
format closer to the input data should be supported and work just as
well! :(
In the end I just looked for something that simply works in the real world.
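For reference, the memcpy-per-component loop quoted above can be
written more directly with typed float pointers. This is an
illustrative sketch only (not the patch code, and without the AVFrame
plumbing); it assumes tightly packed 32-bit float RGB input and the
same GBR plane order the patch targets (plane 0 = G, 1 = B, 2 = R, as
in FFmpeg's AV_PIX_FMT_GBRPF32):

```c
#include <stddef.h>

/* Illustrative sketch: deinterleave packed RGB float samples into
 * separate planes in GBR order. Assumes no padding between lines
 * in either source or destination. */
static void rgb_packed_to_gbr_planes(const float *src, float *g, float *b,
                                     float *r, int width, int height)
{
    size_t n = (size_t)width * height;

    for (size_t i = 0; i < n; i++) {
        r[i] = src[3 * i + 0]; /* component 0 -> R plane */
        g[i] = src[3 * i + 1]; /* component 1 -> G plane */
        b[i] = src[3 * i + 2]; /* component 2 -> B plane */
    }
}
```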
>> + return pkt->size;
>> +}
>> +
>> +static int half_add_alpha(AVCodecContext *avctx, AVFrame *frame, const AVPacket *pkt)
>> +{
>> + /* ffmpeg doesn't provide any Float16 RGB pixel format without alpha channel
>> + * right now. As workaround we simply add an opaque alpha layer. */
>
> So why not add the format then?
No! -- I definitely don't want to add yet another ad-hoc provisional
solution to swscale! :(
That's something that someone with more experience and familiarity
with the surrounding code base should create and maintain.
I don't think we need an endless number of different pixel formats.
A useful subset that really works and provides efficient exchange with
hardware acceleration solutions is enough. More exotic redundant
variants are better translated immediately on import/export (e.g.
in codecs ;)).
But at least the minimal subset required to represent all common data
resolutions (including floating point formats) without lossy
conversion should be available out of the box.
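The opaque-alpha workaround described above can be sketched like this
(illustrative only, not the patch code, which writes into AVFrame
planes): packed RGB binary16 samples are expanded to RGBA, with the
alpha channel set to 0x3C00, the IEEE 754 binary16 bit pattern for 1.0:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch: expand packed RGB half-float (binary16) pixels
 * to RGBA by appending an opaque alpha channel. 0x3C00 is 1.0 in
 * IEEE 754 binary16. */
static void rgb_half_add_alpha(const uint16_t *src, uint16_t *dst,
                               size_t npixels)
{
    for (size_t i = 0; i < npixels; i++) {
        dst[4 * i + 0] = src[3 * i + 0]; /* R */
        dst[4 * i + 1] = src[3 * i + 1]; /* G */
        dst[4 * i + 2] = src[3 * i + 2]; /* B */
        dst[4 * i + 3] = 0x3C00;         /* A = 1.0 */
    }
}
```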
Martin