[FFmpeg-user] should I shoot the dog?

Tue Sep 29 20:44:45 EEST 2020

On 09/29/2020 12:57 PM, Devin Heitmueller wrote:
> On Tue, Sep 29, 2020 at 12:28 PM Mark Filipak (ffmpeg)
> <markfilipak at bog.us> wrote:
-snip-
> I would encourage you to stop trying to invent new terminology ...
-snip-

With due respect to you, I'm not trying to invent new terminology. I'm trying to create extended 
terminology that builds on the existing terminology. But we shall see, eh? If what I do is crap, 
then I'll be the first to throw it away. I've thrown away weeks of work in the past.

>> YCbCr420 sampleset:
>>     A sampleset with sample-quads:
>>     .---.---.
>>     ¦ S ¦ S ¦
>>     :---:---:
>>     ¦ S ¦ S ¦
>>     '---'---', reduced to 1/4 chrominance resolution:
>>     .---.---. .-------. .-------.
>>     ¦ Y ¦ Y ¦ ¦       ¦ ¦       ¦
>>     :---:---: ¦Cb     ¦ ¦Cr     ¦
>>     ¦ Y ¦ Y ¦ ¦       ¦ ¦       ¦
>>     '---'---' '-------' '-------', distinguished by binary metadata:
>>     'chroma_format' = 01. (See "Cb420 & Cr420 macroblocks", "Y macroblock".)
>>
>> YCbCr422 sampleset:
>>     A sampleset with sample-quads:
>>     .---.---.
>>     ¦ S ¦ S ¦
>>     :---:---:
>>     ¦ S ¦ S ¦
>>     '---'---', reduced to 1/2 chrominance resolution:
>>     .---.---. .-------. .-------.
>>     ¦ Y ¦ Y ¦ ¦Cb     ¦ ¦Cr     ¦
>>     :---:---: :-------: :-------:
>>     ¦ Y ¦ Y ¦ ¦Cb     ¦ ¦Cr     ¦
>>     '---'---' '-------' '-------', distinguished by binary metadata:
>>     'chroma_format' = 10. (See "Cb422 & Cr422 macroblocks", "Y macroblock".)
>>
>> YCbCr444 sampleset:
>>     A sampleset with sample-quads:
>>     .---.---.
>>     ¦ S ¦ S ¦
>>     :---:---:
>>     ¦ S ¦ S ¦
>>     '---'---', having full chrominance resolution:
>>     .---.---. .---.---. .---.---.
>>     ¦ Y ¦ Y ¦ ¦Cb ¦Cb ¦ ¦Cr ¦Cr ¦
>>     :---:---: :---:---: :---:---:
>>     ¦ Y ¦ Y ¦ ¦Cb ¦Cb ¦ ¦Cr ¦Cr ¦
>>     '---'---' '---'---' '---'---', distinguished by binary metadata:
>>     'chroma_format' = 11. (See "Cb444 & Cr444 macroblocks", "Y macroblock".)
> 
> The diagrams are probably fine, but probably not how I would draw them
> given they blur the relationship between packed and planar.  Either
> it's packed, in which case you should probably show 4:2:2 as YCbYCr,
> or it's planar, in which case the Cb/Cr samples should not be adjacent
> per line (i.e. have all the Y lines followed by all the Cr/Cb lines).
> You may wish to take into account your newfound understanding of
> packed vs. planar to redo these diagrams representing the data as
> either one or the other.

Thank you, Devin. Yes, the diagrams are incomplete. And, yes, I will do diagrams that take planar v.
packed into account. I will post them when completed. May I also say that I appreciate your
attitude: that seekers are neither stupid nor trolls.

Regarding "adjacent per line", the references to "Cb444 & Cr444 macroblocks", "Y macroblock" make 
that clear, but I will revise the note to better indicate that the chroma subsamples are not adjacent.
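
To make "not adjacent" concrete, here is a minimal sketch of a planar layout. It assumes 8-bit
samples, even frame dimensions, and planes stored in Y, Cb, Cr order (other plane orders exist).
The name planar_layout and the lookup table are hypothetical, for illustration only; the
'chroma_format' codes are the ones quoted in the glossary entries above.

```python
# 'chroma_format' code -> (name, horizontal divisor, vertical divisor) of each chroma plane.
CHROMA_FORMAT = {
    0b01: ("4:2:0", 2, 2),
    0b10: ("4:2:2", 2, 1),
    0b11: ("4:4:4", 1, 1),
}

def planar_layout(width, height, chroma_format):
    """Return (name, planes, total), where planes lists (plane, offset, width, height).

    Assumes one sample per array element and even width/height.
    """
    name, hdiv, vdiv = CHROMA_FORMAT[chroma_format]
    cw, ch = width // hdiv, height // vdiv
    planes = []
    offset = 0
    # All Y lines come first, then all Cb lines, then all Cr lines.
    for plane, w, h in (("Y", width, height), ("Cb", cw, ch), ("Cr", cw, ch)):
        planes.append((plane, offset, w, h))
        offset += w * h
    return name, planes, offset

for code in (0b01, 0b10, 0b11):
    print(planar_layout(4, 4, code))
```

For a 4x4 frame this gives 24, 32 and 48 samples in total for 4:2:0, 4:2:2 and 4:4:4, with each
chroma plane starting only after the whole of the preceding plane.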

Regarding "4:2:2 as YCbYCr" packed, I can't fully visualize it because, I think, there should be 4 Y
samples, not 2. But don't explain it. Not yet. Wait until I post a diagram of it and then
let me know what you think and how that diagram is wrong. :-)

I don't want to exploit your generosity. I'll do the grunt work.
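
For reference while drawing that diagram, here is a minimal sketch of one packed 4:2:2 ordering.
It assumes the Y Cb Y Cr order that Devin's "YCbYCr" suggests (other packed orders exist) and
uses sample labels instead of numeric values. The function name pack_yuyv is hypothetical.

```python
def pack_yuyv(y_rows, cb_rows, cr_rows):
    """Interleave planar 4:2:2 rows into packed Y0 Cb Y1 Cr order, one list per line.

    Each luma row must have an even length; each chroma row has half as many samples.
    """
    packed = []
    for y, cb, cr in zip(y_rows, cb_rows, cr_rows):
        line = []
        for i in range(0, len(y), 2):
            # Two horizontally adjacent luma samples share one Cb and one Cr.
            line += [y[i], cb[i // 2], y[i + 1], cr[i // 2]]
        packed.append(line)
    return packed

# One 2x2 sample-quad: two lines, each carrying two Y samples plus its own Cb and Cr.
quad = pack_yuyv(
    [["Y00", "Y01"], ["Y10", "Y11"]],
    [["Cb0"], ["Cb1"]],
    [["Cr0"], ["Cr1"]],
)
for line in quad:
    print(" ".join(line))
```

In this sketch each line holds only 2 Y samples per Cb/Cr pair; the 4 Y samples of a quad appear
once both of its lines are taken together.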

> I would probably also refrain from using the term "macroblock" to
> describe the raw decoded video, as macroblocks are all about how the
> pixels are organized in the compressed domain.  Once they are decoded
> there is no notion of macroblocks in the resulting video frames.

Got it. Regarding "compressed domain" (in which macroblocks are sparse), that's what I initially 
thought, but H.262 pretty strongly implies that macroblocks also apply to raw video. That seems 
logical to me (as datasets prior to compression).

Unrelated: In the glossary, I seek to always have "distinguished by" clauses so that readers can be 
sure about when and where a particular definition applies.

>>> ... If the video frame is interlaced
>>> however, the first chroma sample corresponds to the first two luma
>>> samples on line 1 and the first two luma samples on line 3.  The first
>>> chroma sample on the second line of chroma corresponds with the first
>>> two luma samples on line 2 and the first two luma samples on line 4.
>>
>> I have pictures of those, too. What do you think of the above pictures? Do you a, like them, or b,
>> loathe them, or c, find them unnecessary?
> 
> I would probably see if you can find drawings already out there.  For
> example the Wikipedia article on YUV has some pretty good
> representations for pixel arrangement in various pixel formats.  So
> does the LinuxTV documentation.

Thanks for the tips.

>>> This is known as "interlaced chroma" and a Google search will reveal
>>> lots of cases where it's done wrong and what the effects are.  This is
>>> the article I usually refer people to:
>>>
>>> https://hometheaterhifi.com/technical/technical-reviews/the-chroma-upsampling-error-and-the-420-interlaced-chroma-problem/
>>>
>>> The above article does a really good job explaining the behavior (far
>>> better than I could do in the one paragraph above).
>>
>> I've seen that produce mild combing. I'll read your reference.
> 
> Yes, it often manifests visually as a combing effect, but it's not as
> pronounced as a typical deinterlacing artifact since it's only the
> chroma lines that are rendered in the wrong order.  Hence it is most
> obvious with certain content types like animation (where you are more
> likely to have sharp transitions between certain colors).  Also,
> deinterlacing artifacts are most visible with high motion, whereas
> interlaced chroma artifacts can be visible even with static non-moving
> video.

Indeed. I have cited exactly what you cite in the various definitions, re: moving object edges v.
stationary object edges, and also the effects on verticals v. diagonals.
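
The line correspondence Devin describes for 4:2:0 can be sketched as follows. Line numbers are
1-based to match his wording. Extending the pattern past the first two chroma lines (repeating
every four luma lines) is an assumption on my part, as is the function name.

```python
def luma_lines_for_chroma_line(chroma_line, interlaced):
    """Return the two 1-based luma lines that share the given 1-based chroma line (4:2:0)."""
    c = chroma_line - 1  # work 0-based internally
    if interlaced:
        # Chroma lines alternate between fields; each one spans two luma
        # lines of the SAME field, which are two frame lines apart.
        field = c % 2
        first = 4 * (c // 2) + field
        second = first + 2
    else:
        # Progressive: each chroma line spans two adjacent luma lines.
        first = 2 * c
        second = first + 1
    return first + 1, second + 1

for c in (1, 2, 3, 4):
    print(c, luma_lines_for_chroma_line(c, True), luma_lines_for_chroma_line(c, False))
```

Treating interlaced chroma as progressive (or the reverse) pairs chroma lines with the wrong luma
lines, which is the mild combing discussed above.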

-- 
The U.S. political problem? Amateurs are doing the street fighting.
The Princeps Senatus and the Tribunus Plebis need their own armies.