[FFmpeg-devel] [PATCH v2 3/7] avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt

yinshiyou-hf at loongson.cn yinshiyou-hf at loongson.cn
Thu Dec 28 04:18:35 EET 2023


> -----Original Message-----
> From: jinbo <jinbo at loongson.cn>
> Sent: 2023-12-27 12:50:15 (Wednesday)
> To: ffmpeg-devel at ffmpeg.org
> Cc: jinbo <jinbo at loongson.cn>
> Subject: [FFmpeg-devel] [PATCH v2 3/7] avcodec/hevc: Add pel_uni_w_pixels4/6/8/12/16/24/32/48/64 asm opt
> 

> +
> +.macro HEVC_PEL_UNI_W_PIXELS8_LSX src0, dst0, w
> +    vldrepl.d      vr0,    \src0,   0
> +    vsllwil.hu.bu  vr0,    vr0,     0
> +    vexth.wu.hu    vr5,    vr0
> +    vsllwil.wu.hu  vr0,    vr0,     0
> +    vslli.w        vr0,    vr0,     6
> +    vslli.w        vr5,    vr5,     6
> +    vmul.w         vr0,    vr0,     vr1
> +    vmul.w         vr5,    vr5,     vr1
> +    vadd.w         vr0,    vr0,     vr2
> +    vadd.w         vr5,    vr5,     vr2
You can use 'vmadd.w' here in place of the vmul.w/vadd.w pairs.
> +    vsra.w         vr0,    vr0,     vr3
> +    vsra.w         vr5,    vr5,     vr3
> +    vadd.w         vr0,    vr0,     vr4
> +    vadd.w         vr5,    vr5,     vr4
> +    vssrani.h.w    vr5,    vr0,     0
> +    vssrani.bu.h   vr5,    vr5,     0
> +.if \w == 6
> +    fst.s          f5,     \dst0,   0
> +    vstelm.h       vr5,    \dst0,   4,     2
> +.else
> +    fst.d          f5,     \dst0,   0
> +.endif
> +.endm
> +
> +.macro HEVC_PEL_UNI_W_PIXELS8x2_LASX src0, dst0, w
> +    vldrepl.d      vr0,    \src0,   0
> +    add.d          t2,     \src0,   a3
> +    vldrepl.d      vr5,    t2,      0
> +    xvpermi.q      xr0,    xr5,     0x02
> +    xvsllwil.hu.bu xr0,    xr0,     0
> +    xvexth.wu.hu   xr5,    xr0
> +    xvsllwil.wu.hu xr0,    xr0,     0
> +    xvslli.w       xr0,    xr0,     6
> +    xvslli.w       xr5,    xr5,     6
> +    xvmul.w        xr0,    xr0,     xr1
> +    xvmul.w        xr5,    xr5,     xr1
> +    xvadd.w        xr0,    xr0,     xr2
> +    xvadd.w        xr5,    xr5,     xr2

Using 'xvmadd.w' (the LASX form of 'vmadd.w') would be better here.

> +    xvsra.w        xr0,    xr0,     xr3
> +    xvsra.w        xr5,    xr5,     xr3
> +    xvadd.w        xr0,    xr0,     xr4
> +    xvadd.w        xr5,    xr5,     xr4
> +    xvssrani.h.w   xr5,    xr0,     0
> +    xvpermi.q      xr0,    xr5,     0x01
> +    xvssrani.bu.h  xr0,    xr5,     0
> +    add.d          t3,     \dst0,   a1
> +.if \w == 6
> +    vstelm.w       vr0,    \dst0,   0,     0
> +    vstelm.h       vr0,    \dst0,   4,     2
> +    vstelm.w       vr0,    t3,      0,     2
> +    vstelm.h       vr0,    t3,      4,     6
> +.else
> +    vstelm.d       vr0,    \dst0,   0,     0
> +    vstelm.d       vr0,    t3,      0,     1
> +.endif
> +.endm
> +
> +.macro HEVC_PEL_UNI_W_PIXELS16_LSX src0, dst0
> +    vld            vr0,    \src0,   0
> +    vexth.hu.bu    vr7,    vr0
> +    vexth.wu.hu    vr8,    vr7
> +    vsllwil.wu.hu  vr7,    vr7,     0
> +    vsllwil.hu.bu  vr5,    vr0,     0
> +    vexth.wu.hu    vr6,    vr5
> +    vsllwil.wu.hu  vr5,    vr5,     0
> +    vslli.w        vr5,    vr5,     6
> +    vslli.w        vr6,    vr6,     6
> +    vslli.w        vr7,    vr7,     6
> +    vslli.w        vr8,    vr8,     6
> +    vmul.w         vr5,    vr5,     vr1
> +    vmul.w         vr6,    vr6,     vr1
> +    vmul.w         vr7,    vr7,     vr1
> +    vmul.w         vr8,    vr8,     vr1
> +    vadd.w         vr5,    vr5,     vr2
> +    vadd.w         vr6,    vr6,     vr2
> +    vadd.w         vr7,    vr7,     vr2
> +    vadd.w         vr8,    vr8,     vr2

Use 'vmadd.w' here as well, and please check the rest of your code for the same pattern.
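
To illustrate the suggestion, here is a rough scalar sketch in C of one
32-bit lane, comparing the separate multiply and add with a
multiply-accumulate. The function names are made up for this example, and
the mapping of vr1/vr2/vr3/vr4 to weight, rounding offset, shift, and final
offset is only inferred from the quoted macros, not taken from the patch
description.

```c
#include <stdint.h>

/* Clamp to the 8-bit pixel range, modelling the combined effect of the
 * two saturating narrows (vssrani.h.w, vssrani.bu.h). */
static uint8_t clip_u8(int32_t v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v);
}

/* One lane as the quoted macros compute it: separate multiply and add.
 * Parameter roles (wx = vr1, offset = vr2, shift = vr3, ox = vr4) are
 * an assumption inferred from the code. */
static uint8_t uni_w_mul_add(uint8_t px, int32_t wx, int32_t offset,
                             int shift, int32_t ox)
{
    int32_t t = (int32_t)px * 64;   /* vslli.w by 6 */
    t = t * wx;                     /* vmul.w */
    t = t + offset;                 /* vadd.w */
    t = t >> shift;                 /* vsra.w (arithmetic shift assumed) */
    return clip_u8(t + ox);
}

/* Same lane with a multiply-accumulate. The accumulator has to hold the
 * offset before the operation, because vmadd.w adds the product into
 * its destination register. */
static uint8_t uni_w_madd(uint8_t px, int32_t wx, int32_t offset,
                          int shift, int32_t ox)
{
    int32_t acc = offset;                  /* copy of the offset vector */
    acc += ((int32_t)px * 64) * wx;        /* vmadd.w: vd = vd + vj * vk */
    acc = acc >> shift;
    return clip_u8(acc + ox);
}

/* Returns 1 if both forms agree for every 8-bit pixel value. */
static int fused_matches_unfused(int32_t wx, int32_t offset,
                                 int shift, int32_t ox)
{
    for (int px = 0; px < 256; px++) {
        if (uni_w_mul_add((uint8_t)px, wx, offset, shift, ox) !=
            uni_w_madd((uint8_t)px, wx, offset, shift, ox))
            return 0;
    }
    return 1;
}
```

Since the accumulate form overwrites its destination, the offset vector
would need to be copied into the destination register for each use, so
whether this saves instructions overall depends on how the surrounding
code is arranged.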

