[FFmpeg-devel] [PATCH 09/11] avcodec/x86: allow future 8-bit simple idct to have "DC only hack"
Henrik Gramner
henrik at gramner.com
Sat Jun 24 21:01:25 EEST 2017
On Mon, Jun 19, 2017 at 5:11 PM, James Darnley <jdarnley at obe.tv> wrote:
> + por m1, m8, m13
> + por m1, m12
> + por m1, [blockq+ 16] ; { row[1] }[0-7]
> + por m1, [blockq+ 48] ; { row[3] }[0-7]
> + por m1, [blockq+ 80] ; { row[5] }[0-7]
> + por m1, [blockq+112] ; { row[7] }[0-7]
Using a single register as the destination here means that each por
depends on the result of the previous one, so only one of these
instructions can execute per cycle. Splitting the chain across two
destination registers would double the (local) IPC.
Out-of-order execution might alleviate this, but there is no reason to
rely on it unnecessarily.
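
As a rough, untested sketch of the split (m2 is only a placeholder for
whichever register is free at this point, which is an assumption about
the surrounding code and not something taken from the patch):

    por  m1, m8, m13
    mova m2, [blockq+ 16] ; { row[1] }[0-7], starts the second chain
    por  m1, m12
    por  m2, [blockq+ 48] ; { row[3] }[0-7]
    por  m1, [blockq+ 80] ; { row[5] }[0-7]
    por  m2, [blockq+112] ; { row[7] }[0-7]
    por  m1, m2           ; merge both chains, same result as before

The two chains are independent until the final por, so adjacent
instructions can issue in the same cycle. The cost is one extra
instruction and one extra register.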