[FFmpeg-devel] [PATCH 2/2] lavc: support subtitles charset conversion.
Clément Bœsch
ubitux at gmail.com
Fri Jan 11 20:18:04 CET 2013
On Tue, Jan 08, 2013 at 12:29:36PM +0100, Nicolas George wrote:
[...]
> > + /**
> > + * Character encoding of the input subtitles file.
> > + * - decoding: set by user, libavformat or libavcodec
> > + * - encoding: unused
> > + */
> > + char *sub_charenc;
>
> Needs a not about not accessing it directly from outside lavc (and possibly
> an accessor).
>
Just kept "set by user". On top of the struct: "Please use AVOptions
(av_opt* / av_set/get*()) to access these fields from user applications."
This is consistent with the other options.
> > +
> > + /**
> > + * Subtitles character encoding mode.
>
> > + * - decoding: set by libavformat or libavcodec, not intended to be used by user apps
>
> Do we consider lavf mandatory for subtitles demuxing? IMHO, from lavc's
> point of view, lavf is a "user app".
>
I kept this lavc internal for now.
> > + * - encoding: unused
> > + */
> > + int sub_charenc_mode;
>
> > +#define FF_SUB_CHARENC_MODE_DO_NOTHING -1 ///< do nothing (demuxer outputs a stream supposed to be already in UTF-8, or the codec is bitmap for instance)
> > +#define FF_SUB_CHARENC_MODE_AUTOMATIC 0 ///< libavcodec will select the mode itself
> > +#define FF_SUB_CHARENC_MODE_DECODER_PRE 2 ///< the AVPacket data needs to be recoded to UTF-8 before being fed to the decoder, requires iconv
> > +//#define FF_SUB_CHARENC_MODE_DECODER_POST 3 ///< the AVSubitle data needs to be recoded to UTF-8 after the decoder pass, requires iconv
>
> Nit: enum?
>
I prefer define in these case, at least consistent with the other fields.
[...]
> > diff --git a/libavformat/assdec.c b/libavformat/assdec.c
> > index 35fcb51..710a171 100644
> > --- a/libavformat/assdec.c
> > +++ b/libavformat/assdec.c
> > @@ -93,6 +93,7 @@ static int ass_read_header(AVFormatContext *s)
> > avpriv_set_pts_info(st, 64, 1, 100);
> > st->codec->codec_type = AVMEDIA_TYPE_SUBTITLE;
>
> > st->codec->codec_id= AV_CODEC_ID_SSA;
> > + st->codec->sub_charenc_mode = FF_SUB_CHARENC_MODE_DO_NOTHING;
>
> lavf should not access the final AVCodecContext fields directly. Also,
> usually, FF_* macros are for intra-use only (but there are exceptions).
>
> Another problem: at this point, sub_charenc is still NULL, and therefore
> DO_NOTHING is redundant. If this code path leads to sub_charenc being
> non-NULL, that means it has been set by the user: ignoring it seems wrong.
> IMHO, the demuxer should only set the mode to DO_NOTHING if it actually does
> something with the encoding.
>
> (ASS files in something else than UTF-8 exist; I am sure malformed webftt
> files exist too somewhere.)
>
For this one and the following, sub_charenc_mode setting removed for now.
[...]
I won't be available for about one month, but since Alexander raised some
interest to be working on this, here is my current work:
The two current patches for character encoding support, rebased on
master:
https://github.com/ubitux/FFmpeg/compare/master...sub-recode-nofilter
The additional lavfi patches, rebased on sub-recode-nofilter:
https://github.com/ubitux/FFmpeg/compare/master...sub-recode
These patches are ready from my PoV (just be careful about the TODO
version bump).
Of course, this may be discussed because of the UTF-16 insanity. My
opinion is that for standalone text-subtitles formats we should just make
ff_subtitles_read_chunk() and ff_get_line() support an utf16-to-utf8
convert and be it. Yes we would loose the utf-16 encoding when
transcoding, but I think that's really not important.
Another thing is that we don't support at all utf-16 currently in these
demuxers, and I'm relatively against reworking them to change their
c-string logic into a binary one.
After the sub-recode-nofilter patches, utf-16 is supported for binary
demuxers.
Anyone is welcome to work on this, and/or push when ready.
See you in one month,
--
Clément B.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 490 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20130111/e6fcb2f4/attachment.asc>
More information about the ffmpeg-devel
mailing list