[FFmpeg-devel] Question regarding ogg cover art implementation

Mon Aug 23 06:14:05 EEST 2021

Jesse Obic:
> Hi,
> 
> First time using a mailing list and (properly) working with C, so please
> forgive me if I've done something wrong.
> 
> I'm looking at implementing https://trac.ffmpeg.org/ticket/4448
> <https://trac.ffmpeg.org/ticket/4448>, and I understand what I need to
> do but I'm on the fence about how I should go about doing it.
> 
> For the OGG container, cover art is embedded into the file by adding a
> METADATA_BLOCK_PICTURE tag in the vorbis comment section. The data of
> this tag is a base64 representation of the FLAC embedded image data
> structure, which includes the data of the image itself.
> 
> I looked at how vorbis comments are currently written, and it's lead me
> to `ff_vorbiscomment_write`. It takes in an AVDictionary for metadata,
> and an array of AVChapters for chapter writing. This is where I get a
> bit concerned.
> 
> I see two options as to how I could handle this:
> 
> 
> 1. Place the METADATA_BLOCK_PICTURE tag in the metadata AVDictionary.
> 
> This would be the easiest functionality wise, however it feels like a
> bad idea in my gut because the dictionary would be housing a value that
> is 1.33 * the size of the image(s) being embedded. So for example, if
> you wished to embed a 12 MB image it would require a 16MB malloc which
> doesn't feel right.
> 
> The ogg specification allows for multiple embedded images by specifying
> the same METADATA_BLOCK_PICTURE tag multiple times, which would require
> me to use AV_DICT_MULTIKEY when adding values to the dictionary. I
> couldn't find any examples of using it within the source code for either
> writing or reading (which I will have to handle in oggdec.c as well), so

Isn't reading already supported? See ff_vorbis_comment in
libavformat/oggparsevorbis.c. (These functions use one buffer too much
which could be easily avoided.)

> I'm not sure if other parts of the codebase will handle it well. This is
> probably more paranoia than anything.

Embedded cover arts are exported as streams with the disposition
AV_DISPOSITION_ATTACHED_PIC set; they are not exported as dictionary.

> 
> 2. Change ff_vorbiscomment_write to take in an array of AVStream /
> AVPacket to write it directly to the output IO stream
> 
> As far as I know this would classify as an API/ABI change. However it
> would mean that I don't have to allocate such a huge array upfront, and
> can encode smaller chunks at a time as I write them to the output stream.
> 
> To mitigate it, it could be created as a new function with the
> additional parameters, and the old function pointing towards the new one
> instead. I'm not entirely sure how APIs / ABIs are exposed and consumed
> in C, so I'm not sure if this will fix the issue.
> 

All functions prefixed with "ff_" are internal functions; changing them
does not cause any ABI/API problems.

- Andreas