[FFmpeg-devel] [PATCH 1/2] lavc/pcm_tablegen: slight speedup of table generation
Ganesh Ajjanagadde
gajjanagadde at gmail.com
Sat Jan 2 16:59:52 CET 2016
On Wed, Dec 30, 2015 at 8:34 PM, Ganesh Ajjanagadde
<gajjanagadde at gmail.com> wrote:
> This gets rid of some branches to speed up table generation slightly
> (impact higher on mulaw than alaw). Tables are identical to before,
> tested with FATE.
>
> Sample benchmark (Haswell, GNU/Linux+gcc):
> old:
> 313494 decicycles in build_alaw_table, 4094 runs, 2 skips
> 315959 decicycles in build_alaw_table, 8190 runs, 2 skips
>
> 323599 decicycles in build_ulaw_table, 4095 runs, 1 skips
> 318849 decicycles in build_ulaw_table, 8188 runs, 4 skips
>
> new:
> 261902 decicycles in build_alaw_table, 4096 runs, 0 skips
> 266519 decicycles in build_alaw_table, 8192 runs, 0 skips
>
> 209657 decicycles in build_ulaw_table, 4096 runs, 0 skips
> 232656 decicycles in build_ulaw_table, 8192 runs, 0 skips
>
> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
> ---
> libavcodec/pcm_tablegen.h | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/libavcodec/pcm_tablegen.h b/libavcodec/pcm_tablegen.h
> index 1387210..7269977 100644
> --- a/libavcodec/pcm_tablegen.h
> +++ b/libavcodec/pcm_tablegen.h
> @@ -87,21 +87,21 @@ static av_cold void build_xlaw_table(uint8_t *linear_to_xlaw,
>  {
>      int i, j, v, v1, v2;
>
> -    j = 0;
> -    for(i=0;i<128;i++) {
> -        if (i != 127) {
> -            v1 = xlaw2linear(i ^ mask);
> -            v2 = xlaw2linear((i + 1) ^ mask);
> -            v = (v1 + v2 + 4) >> 3;
> -        } else {
> -            v = 8192;
> -        }
> -        for(;j<v;j++) {
> +    j = 1;
> +    linear_to_xlaw[8192] = mask;
> +    for(i=0;i<127;i++) {
> +        v1 = xlaw2linear(i ^ mask);
> +        v2 = xlaw2linear((i + 1) ^ mask);
> +        v = (v1 + v2 + 4) >> 3;
> +        for(;j<v;j+=1) {
> +            linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>              linear_to_xlaw[8192 + j] = (i ^ mask);
> -            if (j > 0)
> -                linear_to_xlaw[8192 - j] = (i ^ (mask ^ 0x80));
>          }
>      }
> +    for(;j<8192;j++) {
> +        linear_to_xlaw[8192 - j] = (127 ^ (mask ^ 0x80));
> +        linear_to_xlaw[8192 + j] = (127 ^ mask);
> +    }
>      linear_to_xlaw[0] = linear_to_xlaw[1];
>  }
>
> --
> 2.6.4
>
ping
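
For anyone who wants to check the "tables are identical" claim without a
full FATE run, here is a minimal standalone sketch that runs the old and
new loops side by side and compares the output. Several things in it are
assumptions rather than quotes from the tree: the alaw2linear/ulaw2linear
bodies follow the usual G.711 decoders as I recall them, the masks 0xd5
(alaw) and 0xff (ulaw) and the 16384-entry table size are assumed, and the
names build_xlaw_table_old, build_xlaw_table_new, xlaw_tables_match and
check_xlaw_tables are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIGN_BIT   0x80  /* sign bit of an encoded byte */
#define QUANT_MASK 0x0f  /* quantization field mask */
#define SEG_SHIFT  4     /* shift for the segment number */
#define SEG_MASK   0x70  /* segment field mask */
#define BIAS       0x84  /* bias for the mu-law linear code */
#define TABLE_SIZE 16384 /* assumed size of linear_to_xlaw */

/* Assumed A-law decoder (usual G.711 form). */
static int alaw2linear(unsigned char a_val)
{
    int t, seg;

    a_val ^= 0x55;
    t   = a_val & QUANT_MASK;
    seg = (a_val & SEG_MASK) >> SEG_SHIFT;
    if (seg)
        t = (t + t + 1 + 32) << (seg + 2);
    else
        t = (t + t + 1) << 3;
    return (a_val & SIGN_BIT) ? t : -t;
}

/* Assumed mu-law decoder (usual G.711 form). */
static int ulaw2linear(unsigned char u_val)
{
    int t;

    u_val = (unsigned char)~u_val;
    t = ((u_val & QUANT_MASK) << 3) + BIAS;
    t <<= (u_val & SEG_MASK) >> SEG_SHIFT;
    return (u_val & SIGN_BIT) ? (BIAS - t) : (t - BIAS);
}

/* Loop as it was before the patch: branches on i and on j. */
static void build_xlaw_table_old(uint8_t *linear_to_xlaw,
                                 int (*xlaw2linear)(unsigned char), int mask)
{
    int i, j = 0, v, v1, v2;

    for (i = 0; i < 128; i++) {
        if (i != 127) {
            v1 = xlaw2linear(i ^ mask);
            v2 = xlaw2linear((i + 1) ^ mask);
            v  = (v1 + v2 + 4) >> 3;
        } else {
            v = 8192;
        }
        for (; j < v; j++) {
            linear_to_xlaw[8192 + j] = i ^ mask;
            if (j > 0)
                linear_to_xlaw[8192 - j] = i ^ (mask ^ 0x80);
        }
    }
    linear_to_xlaw[0] = linear_to_xlaw[1];
}

/* Loop as in the patch: j = 0 and i = 127 are hoisted out. */
static void build_xlaw_table_new(uint8_t *linear_to_xlaw,
                                 int (*xlaw2linear)(unsigned char), int mask)
{
    int i, j = 1, v, v1, v2;

    linear_to_xlaw[8192] = mask;
    for (i = 0; i < 127; i++) {
        v1 = xlaw2linear(i ^ mask);
        v2 = xlaw2linear((i + 1) ^ mask);
        v  = (v1 + v2 + 4) >> 3;
        for (; j < v; j++) {
            linear_to_xlaw[8192 - j] = i ^ (mask ^ 0x80);
            linear_to_xlaw[8192 + j] = i ^ mask;
        }
    }
    for (; j < 8192; j++) {
        linear_to_xlaw[8192 - j] = 127 ^ (mask ^ 0x80);
        linear_to_xlaw[8192 + j] = 127 ^ mask;
    }
    linear_to_xlaw[0] = linear_to_xlaw[1];
}

/* Returns 1 if both loops produce byte-identical tables. */
static int xlaw_tables_match(int (*xlaw2linear)(unsigned char), int mask)
{
    static uint8_t old_tab[TABLE_SIZE], new_tab[TABLE_SIZE];

    memset(old_tab, 0, sizeof(old_tab));
    memset(new_tab, 0, sizeof(new_tab));
    build_xlaw_table_old(old_tab, xlaw2linear, mask);
    build_xlaw_table_new(new_tab, xlaw2linear, mask);
    return memcmp(old_tab, new_tab, sizeof(old_tab)) == 0;
}

/* Checks both laws; aborts via assert on any mismatch. */
static void check_xlaw_tables(void)
{
    assert(xlaw_tables_match(alaw2linear, 0xd5));
    assert(xlaw_tables_match(ulaw2linear, 0xff));
}
```

Calling check_xlaw_tables() from a small main() is enough to exercise it.
The hoisted linear_to_xlaw[8192] = mask store matches the old loop only
because the first threshold v is already positive at i = 0 for both laws,
so the old code wrote (0 ^ mask) at j = 0.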