[FFmpeg-devel] [PATCH 1/2] avutil/avstring: do not loose ascii characters when decoding non utf-8 with av_utf8_decode()
Michael Niedermayer
michaelni at gmx.at
Sun Apr 13 03:26:04 CEST 2014
On Sun, Apr 13, 2014 at 12:10:59AM +0200, Nicolas George wrote:
> On Tridi 23 Germinal, year CCXXII, Michael Niedermayer wrote:
> > Subject: [FFmpeg-devel] [PATCH 1/2] avutil/avstring: do not loose ascii
> > characters when decoding non utf-8 with av_utf8_decode()
>
> Spelling mistake: "to loose" means "to set free", as in "the Forsaken are
> loose" (sorry, re-reading WoT). The correct spelling would be "lose". This
> applies to the next patch too, of course.
>
> >
> > Fixes Ticket3363
> >
> > Signed-off-by: Michael Niedermayer <michaelni at gmx.at>
> > ---
> > libavutil/avstring.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/libavutil/avstring.c b/libavutil/avstring.c
> > index f4374fd..e75cdc6 100644
> > --- a/libavutil/avstring.c
> > +++ b/libavutil/avstring.c
> > @@ -331,15 +331,15 @@ int av_utf8_decode(int32_t *codep, const uint8_t **bufp, const uint8_t *buf_end,
> > while (code & top) {
> > int tmp;
> > if (p >= buf_end) {
> > - ret = AVERROR(EILSEQ); /* incomplete sequence */
> > - goto end;
> > + (*bufp) ++;
> > + return AVERROR(EILSEQ); /* incomplete sequence */
> > }
> >
> > /* we assume the byte to be in the form 10xx-xxxx */
> > tmp = *p++ - 128; /* strip leading 1 */
> > if (tmp>>6) {
> > - ret = AVERROR(EILSEQ);
> > - goto end;
> > + (*bufp) ++;
> > + return AVERROR(EILSEQ);
>
> With this form, each byte of an invalid sequence will trigger its own
> EILSEQ, instead of the whole sequence being treated as invalid. I do not
> know whether this is better or not.
The problem is that while bytes n..m might form an invalid sequence,
the bytes starting at n+1 (where n+1 <= m) might very well form a valid
sequence. So skipping the whole sequence is problematic, as it could easily
lose the start of the next valid sequence.
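
To illustrate the point, here is a minimal sketch in C of a decoder that
advances by a single byte on error. It is not the avstring.c code:
decode_one() and decode_all() are hypothetical names, and the sketch leaves
out the overlong, surrogate and range checks that a real decoder needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified decoder, NOT FFmpeg's av_utf8_decode().
 * Decodes one code point from [*pp, end). On an invalid or truncated
 * sequence it returns -1 and advances *pp by exactly one byte, so the
 * caller resynchronizes on the very next byte. */
static int decode_one(const uint8_t **pp, const uint8_t *end, uint32_t *out)
{
    const uint8_t *p = *pp;
    uint32_t code;
    int extra, i;

    assert(p < end);
    code = *p++;
    if (code < 0x80) {                      /* plain ASCII */
        *pp  = p;
        *out = code;
        return 0;
    }
    if ((code & 0xE0) == 0xC0) {            /* 110x-xxxx: 1 continuation */
        extra = 1; code &= 0x1F;
    } else if ((code & 0xF0) == 0xE0) {     /* 1110-xxxx: 2 continuations */
        extra = 2; code &= 0x0F;
    } else if ((code & 0xF8) == 0xF0) {     /* 1111-0xxx: 3 continuations */
        extra = 3; code &= 0x07;
    } else {                                /* stray continuation or bad lead */
        (*pp)++;
        return -1;
    }

    for (i = 0; i < extra; i++) {
        /* truncated input, or byte not of the form 10xx-xxxx */
        if (p >= end || (*p & 0xC0) != 0x80) {
            (*pp)++;                        /* skip only the lead byte */
            return -1;
        }
        code = (code << 6) | (*p++ & 0x3F);
    }
    *pp  = p;
    *out = code;
    return 0;
}

/* Decode a whole buffer. Valid code points are stored in out (up to cap
 * entries); the number of rejected bytes is stored in *errors.
 * Returns the number of code points stored. */
static size_t decode_all(const uint8_t *buf, size_t len,
                         uint32_t *out, size_t cap, size_t *errors)
{
    const uint8_t *p = buf, *end = buf + len;
    size_t n = 0;

    *errors = 0;
    while (p < end) {
        uint32_t cp;
        if (decode_one(&p, end, &cp) < 0)
            (*errors)++;
        else if (n < cap)
            out[n++] = cp;
    }
    return n;
}
```

With the input bytes E2 'A' 'B', the lead byte E2 announces two
continuation bytes that never come. Advancing by one byte reports a single
error and still yields 'A' and 'B', whereas skipping the presumed 3-byte
sequence would swallow both ASCII characters.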
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The real ebay dictionary, page 2
"100% positive feedback" - "All either got their money back or didnt complain"
"Best seller ever, very honest" - "Seller refunded buyer after failed scam"