[FFmpeg-devel] [PATCH] avcodec/utils/avpriv_find_start_code: optimization. If HAVE_FAST_UNALIGNED is true, handle "1 + sizeof(long)" bytes per step.

Michael Niedermayer michaelni at gmx.at
Thu Jan 1 20:18:37 CET 2015


On Fri, Jan 02, 2015 at 01:27:51AM +0800, zhaoxiu.zeng wrote:
> On 2015/1/1 11:49, Michael Niedermayer wrote:
> > On Thu, Jan 01, 2015 at 10:13:58AM +0800, zhaoxiu.zeng wrote:
> >>  libavcodec/utils.c | 68 ++++++++++++++++++++++++++++++++++++++++++++----------
> >>  1 file changed, 56 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/libavcodec/utils.c b/libavcodec/utils.c
> >> index 1ec5cae..14a43e2 100644
> >> --- a/libavcodec/utils.c
> >> +++ b/libavcodec/utils.c
> >> @@ -3772,30 +3772,74 @@ const uint8_t *avpriv_find_start_code(const uint8_t *av_restrict p,
> >>                                        uint32_t *av_restrict state)
> >>  {
> >>      int i;
> >> +    uint32_t stat;
> >>  
> >>      av_assert0(p <= end);
> >>      if (p >= end)
> >>          return end;
> >>  
> >> +    stat = *state;
> >>      for (i = 0; i < 3; i++) {
> >> -        uint32_t tmp = *state << 8;
> >> -        *state = tmp + *(p++);
> >> -        if (tmp == 0x100 || p == end)
> >> +        uint32_t tmp = stat << 8;
> >> +        stat = tmp + *(p++);
> >> +        if (tmp == 0x100 || p == end) {
> >> +            *state = stat;
> >>              return p;
> >> +        }
> >>      }
> >>  
> >> -    while (p < end) {
> >> -        if      (p[-1] > 1      ) p += 3;
> >> -        else if (p[-2]          ) p += 2;
> >> -        else if (p[-3]|(p[-1]-1)) p++;
> >> -        else {
> >> +#if HAVE_FAST_UNALIGNED
> >> +#if HAVE_FAST_64BIT
> >> +    for (; p + 6 <= end; p += 9) {
> >> +        uint64_t t = AV_RN64A(p - 2);
> >> +        if (!((t - 0x0100010001000101ULL) & ~(t | 0x7fff7fff7fff7f7fULL)))
> >> +            continue;
> >> +#else
> >> +    for (; p + 2 <= end; p += 5) {
> >> +        uint32_t t = AV_RN32A(p - 2);
> >> +        if (!((t - 0x01000101U) & ~(t | 0x7fff7f7fU)))
> >> +            continue;
> >> +#endif
> >> +        /* find the first zero byte in t */
> >> +#if HAVE_BIGENDIAN
> >> +        while (t >> (sizeof(t) * 8 - 8)) {
> >> +            t <<= 8;
> >> +            p++;
> >> +        }
> >> +#else
> >> +        while (t & 0xff) {
> >> +            t >>= 8;
> >> +            p++;
> >> +        }
> >> +#endif
> > 
> > this can maybe be simplified by using ff_startcode_find_candidate_c()
> > 
> There is a small difference: ff_startcode_find_candidate_c() finds the first "0x00", but here we only care about "0x00 0x00".
> Using 0x0100010001000101ULL instead of 0x0101010101010101ULL reduces the hit rate on lone "0x00" bytes, so it can be faster
> if the buffer contains some lone "0x00" bytes.
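
To illustrate the mask choice described above, here is a minimal sketch comparing the two tests on plain 64-bit values. The function names has_zero_any and has_zero_sparse are hypothetical and are not part of the patch or of libavcodec; byte positions are counted from the least significant byte, so how they map to memory order depends on endianness. With the sparse constant only bytes 0, 1, 3, 5 and 7 are tested, and every pair of adjacent bytes in the word includes at least one of those positions, so a "0x00 0x00" pair inside the word is still flagged.

```c
#include <stdint.h>

/* Classic test: nonzero iff ANY of the 8 bytes of t is 0x00. */
static int has_zero_any(uint64_t t)
{
    return ((t - 0x0101010101010101ULL) & ~t & 0x8080808080808080ULL) != 0;
}

/* Variant using the constants from the quoted patch: nonzero iff one of
 * bytes 0, 1, 3, 5 or 7 (counted from the least significant byte) is 0x00.
 * A lone zero in byte 2, 4 or 6 is not reported. */
static int has_zero_sparse(uint64_t t)
{
    return ((t - 0x0100010001000101ULL) & ~(t | 0x7fff7fff7fff7f7fULL)) != 0;
}
```

For example, 0x1122334455006677 has a lone zero in byte 2: has_zero_any() reports it, has_zero_sparse() does not. A lone zero in a tested position (for instance byte 1) is still reported, so the sparse mask lowers the hit rate on lone zeros rather than eliminating it.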

If you can improve ff_startcode_find_candidate_c(), please do so.
But the code should not be duplicated in a slightly different way.
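
As a rough sketch of what sharing the search could look like, assuming a candidate finder that returns the index of the first 0x00 byte (as described earlier in the thread): find_zero_candidate below is a hypothetical scalar stand-in, not the real ff_startcode_find_candidate_c(), and find_start_code_prefix is a hypothetical caller that does not exist in libavcodec. Only the caller knows about "00 00 01", so any speedup to the candidate search would benefit every user without a second copy of the scanning loop.

```c
#include <stdint.h>

/* Hypothetical stand-in for a shared candidate finder: returns the index
 * of the first 0x00 byte in buf[0..size), or size if there is none. */
static int find_zero_candidate(const uint8_t *buf, int size)
{
    int i;
    for (i = 0; i < size; i++)
        if (!buf[i])
            return i;
    return size;
}

/* Hypothetical caller: returns the index of the first 00 00 01 sequence
 * in buf[0..size), or size if there is none. */
static int find_start_code_prefix(const uint8_t *buf, int size)
{
    int pos = 0;

    while (pos + 3 <= size) {
        /* Jump to the next zero byte using the shared helper. */
        pos += find_zero_candidate(buf + pos, size - pos);
        if (pos + 3 > size)
            break;
        /* Check whether this zero starts a 00 00 01 prefix. */
        if (buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
        pos++;
    }
    return size;
}
```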

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are too smart to engage in politics are punished by being
governed by those who are dumber. -- Plato 