[FFmpeg-user] AVFrame, AV_NUM_DATA_POINTERS

Mark Filipak (ffmpeg) markfilipak at bog.us
Mon Sep 28 23:54:42 EEST 2020


On 09/28/2020 03:49 PM, James Darnley wrote:
> On 28/09/2020, Mark Filipak (ffmpeg) <markfilipak at bog.us> wrote:
>> On 09/27/2020 03:31 PM, James Darnley wrote:
>>> On 27/09/2020, Mark Filipak (ffmpeg) <markfilipak at bog.us> wrote:
>>>> 2, Are the width & height indexes in bytes or samples? If bytes,
>>>> how are 8-bit v. 10-bit v. 12-bit pixel formats handled at the
>>>> index generation end?
>>>
>>> Width and height are given in pixels.  How that relates to bytes in
>>> memory depends on the pixel format.  Different planes can have
>>> different sizes like in the extremely common yuv420p.
>>
>> Ah-ha #1. I think you've answered another question. The planes that
>> ffmpeg refers to are the Y, Cb, and Cr samples, is that correct?
> 
> If the pixel format is a YCbCr format, such as yuv420p, then yes.  If
> it matters to you: I am not sure of the exact order of the planes.  It
> is probably documented in the pixel format header.

Yes, I'm familiar with pixfmt.h.
I find this surprising. But then, ffmpeg is full of surprises, eh?

I anticipated there would be a single ffmpeg video processing/pipeline format that decoders would 
supply. Having many differing pixel formats seems a point of complexity that promotes error.

Regarding the order of the planes, I suspect there is none. I've not examined the source code, but I 
suspect that 3 unique buffer pointers are supplied to the decoder. Also surprising is that the word 
"plane" is apparently used for both video and audio.

> RGB is also available and maybe some other more niche ones.  Oh, alpha
> channels too.  Again see the pixel format.
> 
>> So, I'm going to make some statements that can be confirmed or
>> refuted -- making statements rather than asking questions is just
>> part of my training, not arrogance. Statements are usually clearer.
>> I'm trying to nail down the structures for integration into my
>> glossary.
>>
>> For YCbCr420, 8-bit, 720x576 (for example), the planes are separate and the
>> structures are:
>> Y: 2-dimensional, 720x576 byte array.
>> Cb: 2-dimensional, 180x144 byte array.
>> Cr: 2-dimensional, 180x144 byte array.
> 
> What do you mean by 2 dimensional?

Width x Height.

>  IMO you should think of the planes
> as a single block of memory each.  The first pixel will be the first
> byte.  In your example the first plane in a yuv420p picture will be at
> least 720*576 bytes long.  The two chroma planes will have 360x288
> samples each with their own linesize.  I'm not sure how you got
> 180x144.  The subsampling is only a factor of 2 for 4:2:0.

I don't know what you mean. In 4:2:0 format, there are 1 each of Cb & Cr for every 4 Y.
180x144 = (720/2)x(576/2). ...Argh! Wrong! ...Duh?

Of course I should have written 360x288 -- my bad. 8-] ...brain fart! (How embarrassing.)
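
For the record, the corrected arithmetic can be sketched in C. This is only an illustration, not ffmpeg code: ceil_rshift() and plane_size() are names made up here, and the subsampling shifts (1 and 1 for 4:2:0) are passed in by hand rather than read from a pixel format description.

```c
#include <assert.h>

/* Hypothetical helper (not an ffmpeg function): right shift that rounds
   up, so an odd dimension does not lose its last chroma sample. */
static int ceil_rshift(int value, int shift)
{
    assert(value >= 0 && shift >= 0 && shift < 8);
    return (value + (1 << shift) - 1) >> shift;
}

/* Hypothetical container for one plane's dimensions, in samples. */
struct plane_dims {
    int width;   /* samples per row */
    int height;  /* number of rows  */
};

/* Plane 0 is taken to be Y; planes 1 and 2 are the chroma planes.
   log2_cw and log2_ch are the horizontal and vertical subsampling
   shifts, assumed here to be 1 and 1 for 4:2:0. */
static struct plane_dims plane_size(int plane, int w, int h,
                                    int log2_cw, int log2_ch)
{
    struct plane_dims d;
    assert(plane >= 0 && plane <= 2);
    if (plane == 0) {
        d.width  = w;
        d.height = h;
    } else {
        d.width  = ceil_rshift(w, log2_cw);
        d.height = ceil_rshift(h, log2_ch);
    }
    return d;
}
```

With w = 720, h = 576 and both shifts set to 1, plane 0 comes out as 720x576 and planes 1 and 2 as 360x288. The rounding up only matters for odd dimensions.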

> The linesize can make it larger than that.  The linesize also says how
> many bytes are between the start of a row and the start of the next.
> 
> The same color space and subsampling could be expressed in a few
> different ways.  Again it is the pixel format which says how the data
> is laid out in memory.  You will probably have yuv420p.
> 
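
A minimal sketch of what linesize means for addressing, assuming 8-bit samples and a positive linesize. sample_at() and plane_bytes() are hypothetical helpers, not ffmpeg functions; in an AVFrame the per-plane pointer and stride would come from its data[] and linesize[] arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pointer to the 8-bit sample at column x, row y.
   linesize is the distance in bytes from the start of one row to the
   start of the next, and may exceed the visible width. */
static uint8_t *sample_at(uint8_t *plane, int linesize, int x, int y)
{
    assert(plane != NULL && linesize > 0);
    assert(x >= 0 && x < linesize && y >= 0);
    return plane + (size_t)y * (size_t)linesize + (size_t)x;
}

/* Hypothetical helper: bytes spanned by `height` rows of one plane. */
static size_t plane_bytes(int linesize, int height)
{
    assert(linesize > 0 && height >= 0);
    return (size_t)linesize * (size_t)height;
}
```

For example, a 720-sample-wide Y plane stored with a linesize of 768 (a padding value picked only for illustration) spans 768*576 = 442368 bytes rather than 720*576 = 414720, and sample (5, 2) sits at byte offset 2*768 + 5.
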
>> Specifically, the decoder's output is not in macroblock format,
>> correct? The reason I ask for confirmation is that H.262 implies
>> that even raw pictures are in macroblock format, improbable as that
>> may seem.
> 
> An AVFrame might not come from a source that has macroblocks.  I have
> no idea what H.262 says.

Okay, some architecture, okay? I'm interested in how ffmpeg programmatically represents frames 
during processing. (In MPEG-PSs, a frame is represented as (W/16)*(H/16) macroblocks.)
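
The macroblock count can be sketched as follows. macroblock_count() is a made-up helper, and rounding each dimension up to a whole 16x16 macroblock is an assumption for sizes that are not multiples of 16; interlaced coding is ignored entirely.

```c
#include <assert.h>

#define MB_SIZE 16  /* luma samples per macroblock edge */

/* Hypothetical helper: number of 16x16 macroblocks needed to cover a
   width x height picture, rounding partial macroblocks up (assumed). */
static int macroblock_count(int width, int height)
{
    int mb_w, mb_h;
    assert(width > 0 && height > 0);
    mb_w = (width  + MB_SIZE - 1) / MB_SIZE;
    mb_h = (height + MB_SIZE - 1) / MB_SIZE;
    return mb_w * mb_h;
}
```

For 720x576 this gives 45 * 36 = 1620 macroblocks.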

>>>   Byte order
>>> (endianness) of larger samples depends on the pixel format (but it is
>>> usually native).  The number of bytes used for a sample is given in
>>> the pixel format.  The bits are in the low N bits.
>>
>> Ah-ha #2. I think you've answered yet another question: The arrays
>> are bytes, not bits, correct? So, going from 8-bit samples to 10-bit
>> samples doubles the sizes of the arrays, correct?
> 
> You cannot easily address bits in C and ffmpeg doesn't bother with bit
> fields.  yuv420p10 will use 16-bit words with the samples in the low
> 10 bits and the high 6 are zero.  This does have the effect of
> doubling the size of the memory buffers.
> 
> P.S.   When I say pixel format I mean the specific ffmpeg feature.

Understood.
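
To make the size doubling concrete, here is a small sketch. bytes_per_sample(), min_plane_bytes() and high_bits_clear() are invented for illustration; the only facts taken from the description above are that samples wider than 8 bits sit in 16-bit words with the value in the low bits and zeros above. Row padding (linesize) is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: whole bytes used to store one sample. */
static int bytes_per_sample(int bit_depth)
{
    assert(bit_depth >= 1 && bit_depth <= 16);
    return (bit_depth + 7) / 8;   /* 8 -> 1, 10 -> 2, 12 -> 2 */
}

/* Hypothetical helper: minimum bytes for a width x height plane,
   with no padding between rows. */
static size_t min_plane_bytes(int width, int height, int bit_depth)
{
    assert(width >= 0 && height >= 0);
    return (size_t)width * (size_t)height
         * (size_t)bytes_per_sample(bit_depth);
}

/* Hypothetical check: nonzero if every bit above bit_depth is zero,
   i.e. the sample really does live in the low bits of the word. */
static int high_bits_clear(uint16_t word, int bit_depth)
{
    assert(bit_depth >= 1 && bit_depth <= 16);
    return (word >> bit_depth) == 0;
}
```

A 720x576 Y plane needs at least 414720 bytes at 8 bits and 829440 bytes at 10 bits. The largest 10-bit value, 1023, passes high_bits_clear(); 1024 does not.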

Thanks again, James. I'm going to assume that Y, Cb, and Cr are buffered separately, i.e. that 
there's no frame struct per se.

I think that wraps it up vis-a-vis ffmpeg internal representation of video.

-- 
The U.S. political problem? Amateurs are doing the street fighting.
The Princeps Senatus and the Tribunus Plebis need their own armies.

