[FFmpeg-devel] [PATCH v1 5/8] avformat/mov_muxer: Extended MOV muxer to handle APV video content
Dawid Kozinski/Multimedia (PLT) /SRPOL/Staff Engineer/Samsung Electronics
d.kozinski at samsung.com
Thu Apr 24 11:59:16 EEST 2025
> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of James
> Almer
> Sent: Wednesday, April 23, 2025 16:44
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v1 5/8] avformat/mov_muxer: Extended
> MOV muxer to handle APV video content
>
> On 4/23/2025 11:13 AM, Dawid Kozinski wrote:
> > @@ -2757,6 +2789,8 @@ static int mov_write_video_tag(AVFormatContext *s, AVIOContext *pb, MOVMuxContex
> > }
> > else if (track->par->codec_id ==AV_CODEC_ID_EVC) {
> > mov_write_evcc_tag(pb, track);
> > + } else if (track->par->codec_id ==AV_CODEC_ID_APV) {
> > + mov_write_apvc_tag(pb, track);
> > } else if (track->par->codec_id == AV_CODEC_ID_VP9) {
> > mov_write_vpcc_tag(mov->fc, pb, track);
> > } else if (track->par->codec_id == AV_CODEC_ID_AV1) {
> > @@ -6713,6 +6747,18 @@ int ff_mov_write_packet(AVFormatContext *s, AVPacket *pkt)
> > memset(trk->vos_data + size, 0, AV_INPUT_BUFFER_PADDING_SIZE);
> > }
> >
> > + if (par->codec_id == AV_CODEC_ID_APV && !trk->vos_len) {
> > + ret = ff_isom_create_apv_dconf_record(&trk->vos_data, &trk->vos_len);
> > + if (!trk->vos_data) {
> > + ret = AVERROR(ENOMEM);
> > + goto err;
> > + }
> > + }
> > +
> > + if (par->codec_id == AV_CODEC_ID_APV && trk->vos_len) {
> > + ret = ff_isom_fill_apv_dconf_record(trk->vos_data, pkt->data, size);
> > + }
> > +
> > if (par->codec_id == AV_CODEC_ID_AAC && pkt->size > 2 &&
> > (AV_RB16(pkt->data) & 0xfff0) == 0xfff0) {
> > if (!trk->st->nb_frames) {
>
> Instead of this, add APV to the list in
> https://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/movenc.c;h=4bc8bd1b2ab765c2b9f5f5dfc2dcb77361f2b944;hb=HEAD#l6697
> so the first packet is always copied to trk->vos_data in case
> par->extradata is not set.
>
> After that, ff_isom_write_apvc() can either write the extradata as is if it's
> already a configuration record, or generate it if it's just a packet of PBUs (See
> ff_isom_write_hvcc()).
Previously, I had this implemented as follows:

    if ((par->codec_id == AV_CODEC_ID_DNXHD ||
         par->codec_id == AV_CODEC_ID_H264 ||
         par->codec_id == AV_CODEC_ID_HEVC ||
         par->codec_id == AV_CODEC_ID_VVC ||
         par->codec_id == AV_CODEC_ID_VP9 ||
         par->codec_id == AV_CODEC_ID_EVC ||
         par->codec_id == AV_CODEC_ID_APV ||
         par->codec_id == AV_CODEC_ID_TRUEHD) && !trk->vos_len &&

AV_CODEC_ID_APV was included in the list at
https://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/movenc.c;h=4bc8bd1b2ab765c2b9f5f5dfc2dcb77361f2b944;hb=HEAD#l6697.
It worked exactly as you described: the first packet was copied to trk->vos_data when par->extradata was not set.
However, I had to change this because I could not fill in the APVDecoderConfigurationRecord structure in ff_isom_write_apvc() from trk->vos_data alone, since it holds the data of a single packet, that is, only the first AU. The structure is defined here:
https://github.com/AcademySoftwareFoundation/openapv/blob/main/readme/apv_isobmff.md.
The structure contains the fields number_of_configuration_entry and number_of_frame_info[i]. It seems to me that correctly filling in the APVDecoderConfigurationRecord structure requires analyzing the entire stream.
number_of_configuration_entry - the number of distinct frame PBU types found across all AUs (PBU types for frames: 1 - primary frame; 2 - non-primary frame; 25 - preview frame; 26 - depth frame; 27 - alpha frame).
number_of_frame_info[i] - the number of distinct variations of the frame header information for configuration i; each configuration can have one or more variations of frame_info.
Examples:
--------------
Case 1:
======
If we have a stream containing only primary frames and if each primary frame has the same data in frame_info, then we have:
number_of_configuration_entry = 1
number_of_frame_info[0] = 1 (if all primary frames have the same frame_info)
Case 2:
======
If we have a stream containing both primary frames and non-primary frames, and if each primary and non-primary frame has the same data in frame_info, then we have:
number_of_configuration_entry = 2
number_of_frame_info[0] = 1 (if all primary frames have the same frame_info)
number_of_frame_info[1] = 1 (if all non-primary frames (type: 2) have the same frame_info)
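Case 3:
======
If we have a stream containing only primary frames, and the primary frames carry two different kinds of frame_info, then we have:
number_of_configuration_entry = 1
number_of_frame_info[0] = 2 (if primary frames have 2 kinds of frame_info)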
To fill in number_of_configuration_entry and number_of_frame_info[i] correctly, we need to check all AUs one by one. number_of_configuration_entry equals the number of distinct PBU types in the stream, counting only those that carry frame_info (1, 2, 25, 26, 27). So if the stream contains PBU types 1 and 2, then number_of_configuration_entry = 2.
Each of these types can have a different number_of_frame_info. If the stream has a run of primary frame PBUs at Full HD resolution followed by a run of primary frame PBUs at UHD resolution, then the configuration for the primary frame has number_of_frame_info[i] = 2.
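
The counting described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the types ConfigRecord, ConfigEntry and FrameInfo, the limits MAX_ENTRIES and MAX_FRAME_INFOS, and the function record_add_frame() are all hypothetical names, and frame_info is reduced to width and height, whereas the real frame_info carries more fields. The caller is assumed to have already parsed each frame PBU out of every AU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits for this sketch; not taken from FFmpeg or the APV spec. */
#define MAX_ENTRIES     5   /* one per frame PBU type: 1, 2, 25, 26, 27 */
#define MAX_FRAME_INFOS 8   /* arbitrary cap on variations per PBU type */

/* Simplified stand-in for frame_info (assumption: only the resolution). */
typedef struct FrameInfo {
    uint32_t width;
    uint32_t height;
} FrameInfo;

/* One configuration entry: a PBU type and its distinct frame_info variations. */
typedef struct ConfigEntry {
    uint8_t   pbu_type;
    uint8_t   number_of_frame_info;
    FrameInfo frame_info[MAX_FRAME_INFOS];
} ConfigEntry;

typedef struct ConfigRecord {
    uint8_t     number_of_configuration_entry;
    ConfigEntry entry[MAX_ENTRIES];
} ConfigRecord;

/* PBU types that carry frame_info, as listed in the text above. */
static int is_frame_pbu(uint8_t pbu_type)
{
    return pbu_type == 1 || pbu_type == 2 ||
           pbu_type == 25 || pbu_type == 26 || pbu_type == 27;
}

/*
 * Account for one PBU seen in the stream. Call this for every frame PBU of
 * every AU. Returns 0 on success (including when the PBU is ignored or adds
 * nothing new) and -1 when a table in this sketch is full.
 */
static int record_add_frame(ConfigRecord *rec, uint8_t pbu_type,
                            const FrameInfo *fi)
{
    ConfigEntry *e = NULL;

    if (!is_frame_pbu(pbu_type))
        return 0; /* PBU types without frame_info do not affect the record */

    /* Look for an existing configuration entry for this PBU type. */
    for (int i = 0; i < rec->number_of_configuration_entry; i++) {
        if (rec->entry[i].pbu_type == pbu_type) {
            e = &rec->entry[i];
            break;
        }
    }

    /* First time this PBU type is seen: open a new configuration entry. */
    if (!e) {
        if (rec->number_of_configuration_entry >= MAX_ENTRIES)
            return -1;
        e = &rec->entry[rec->number_of_configuration_entry++];
        e->pbu_type             = pbu_type;
        e->number_of_frame_info = 0;
    }

    /* An identical frame_info is already recorded: nothing to add. */
    for (int j = 0; j < e->number_of_frame_info; j++) {
        if (e->frame_info[j].width  == fi->width &&
            e->frame_info[j].height == fi->height)
            return 0;
    }

    /* New variation of frame_info for this PBU type. */
    if (e->number_of_frame_info >= MAX_FRAME_INFOS)
        return -1;
    e->frame_info[e->number_of_frame_info++] = *fi;
    return 0;
}
```

Starting from a zero-initialized ConfigRecord and calling record_add_frame() for each frame PBU, the counts are only final after the last AU, which is why a single packet in trk->vos_data does not seem sufficient.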
Please correct me if I'm wrong.