[FFmpeg-devel] BUG in use of extradata and extradata_size with dvb subtitles and teletext

Sat Jan 25 07:58:40 CET 2014



>________________________________
> From: Andriy Lysnevych <andriy.lysnevych at gmail.com>
>To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org> 
>Sent: Friday, 24 January 2014, 21:47
>Subject: Re: [FFmpeg-devel] BUG in use of extradata and extradata_size with dvb subtitles and teletext
> 
>
>On Fri, Jan 24, 2014 at 12:52 AM, Marton Balint <cus at passwd.hu> wrote:
>>
>> On Thu, 23 Jan 2014, Michael Niedermayer wrote:
>>
>>> On Wed, Jan 22, 2014 at 05:07:37PM +0200, Andriy Lysnevych wrote:
>>>>
>>>> Please review the updated patch with fixes based on your comments.
>>>>
>>>> Regards,
>>>> Andriy Lysnevych
>>>
>>>
>>>>  mpegts.c    |   40 ++++++++++++++++++++++++---
>>>>  mpegtsenc.c |   86
>>>> ++++++++++++++++++++++++++++++++++++++++++++++--------------
>>>>  2 files changed, 102 insertions(+), 24 deletions(-)
>>>> 2d733a7a4b4575cc0d4b1832bf4cce2a6821379c
>>>> 0001-correct-DVB-teletext-processing.patch
>>>> From c899d53980a23c132c46e778ac9f15d3897b914b Mon Sep 17 00:00:00 2001
>>>> From: Serhii Marchuk <serhii.marchuk at gmail.com>
>>>> Date: Wed, 22 Jan 2014 10:11:56 +0200
>>>> Subject: mpegts muxer and demuxer: correct DVB teletext processing
>>>>
>>>> * Using extradata by TS demuxer to store values from PMT
>>>> * Using extradata by TS muxer to correctly restore PMT table
>>>> * PES_header_data_length should be always 0x24 for DVB teletext,
>>>>   according to DVB standard
>>>> * Support of multiple languages in one DVB teletext stream:
>>>>   comma separated language codes in metadata "language" field
>>>>
>>>> The patch fixes #2223 ticket.
>>>> ---
>>>>  libavformat/mpegts.c    | 40 ++++++++++++++++++++---
>>>>  libavformat/mpegtsenc.c | 86
>>>> ++++++++++++++++++++++++++++++++++++++-----------
>>>>  2 files changed, 102 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
>>>> index db380ca..3d32f9f 100644
>>>> --- a/libavformat/mpegts.c
>>>> +++ b/libavformat/mpegts.c
>>>> @@ -1470,11 +1470,41 @@ int ff_parse_mpeg2_descriptor(AVFormatContext
>>>> *fc, AVStream *st, int stream_type
>>>>          }
>>>>          break;
>>>>      case 0x56: /* DVB teletext descriptor */
>>>> -        language[0] = get8(pp, desc_end);
>>>> -        language[1] = get8(pp, desc_end);
>>>> -        language[2] = get8(pp, desc_end);
>>>> -        language[3] = 0;
>>>> -        av_dict_set(&st->metadata, "language", language, 0);
>>>> +        {
>>>> +            uint8_t *extradata = NULL;
>>>> +            int language_count = desc_len / 5;
>>>> +
>>>> +            if (desc_len > 0 && desc_len % 5 != 0)
>>>> +                return AVERROR_INVALIDDATA;
>>>> +
>>>> +            if (language_count > 0) {
>>>> +                if (st->codec->extradata == NULL) {
>>>> +                    if (ff_alloc_extradata(st->codec, language_count *
>>>> 2)) {
>>>> +                        return AVERROR(ENOMEM);
>>>> +                    }
>>>> +                }
>>>> +
>>>> +               if (st->codec->extradata_size < language_count * 2)
>>>> +                   return AVERROR_INVALIDDATA;
>>>> +
>>>> +               extradata = st->codec->extradata;
>>>> +
>>>> +                for (i = 0; i < language_count; i++) {
>>>> +                    language[i * 4 + 0] = get8(pp, desc_end);
>>>> +                    language[i * 4 + 1] = get8(pp, desc_end);
>>>> +                    language[i * 4 + 2] = get8(pp, desc_end);
>>>> +                    language[i * 4 + 3] = ',';
>>>> +
>>>> +                    memcpy(extradata, *pp, 2);
>>>> +                    extradata += 2;
>>>> +
>>>> +                    *pp += 2;
>>>> +                }
>>>> +
>>>> +                language[i * 4 - 1] = 0;
>>>> +                av_dict_set(&st->metadata, "language", language, 0);
>>>> +            }
>>>> +        }
>>>>          break;
>>>>      case 0x59: /* subtitling descriptor */
>>>>          language[0] = get8(pp, desc_end);
>>>
>>>
>>> just wanted to apply but
>>>
>>> this produces:
>>> Stream #0:17[0x899](ger,ger): Subtitle: dvb_teletext ([6][0][0][0] /
>>> 0x0006)
>>> that is the same language twice
>>> is that intended ?
>>
>>
>> I think it is, since the teletext descriptor can contain a language multiple
>> times (e.g. in the first entry the initial teletext page is defined for a
>> certain language, in the next the subtitling page)
>>
>
>That's right. Most likely your TS stream contains DVB teletext that
>consists of teletext (first ger) and subtitles (second ger)
>substreams.
>
>>
>>>
>>> also please add a check that the language array is big enough
>>> ATM it is but thats just because its litterally hardcoded length is
>>> sufficient for the current maximal language_count and max number of
>>> bytes written.
>>>
>>> also please split the mpegts and mpegtsenc parts into seperate patches
>>> (i can split it too when applying if you prefer)
>>>
>
>We did the check of language array length and split the patch. Please review.
>
>Regards,
>Andriy Lysnevych
>
>_______________________________________________
>ffmpeg-devel mailing list
>ffmpeg-devel at ffmpeg.org
>http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>

Well in my patch that i forgot to repost i decoded the page information so you ended up with the language being

ger,I100;ger,S888

And with this when producing the ts I can reproduce the SI correctly with all languages

joolz