[MPlayer-dev-eng] Performance test of (nearly) all libavcodec options

Rémi Guyomarch rguyom at pobox.com
Fri Jan 3 00:55:16 CET 2003


(yes, you will need a large screen to view some tables here, deal with
it)

test set :
  - source : "Pale Rider" DVD
  - length : 3 minutes
  - frame size : 720 x 576 cropped & scaled down to 576 x 240
  - frame rate : 25 fps

test hardware & OS :
  - Athlon XP 1800, 512 MB, FreeBSD 4.7
  - various processes running in the background

tested code :
  - libavcodec build 4649

codec options :
  $common = "vcodec=mpeg4:vhq:v4mv:vqmax=31:vlelim=-3:vcelim=-5:keyint=300:lumi_mask=0.12:dark_mask=0.15:scplx_mask=0.02:tcplx_mask=0.08:naq"
  $bfcommon = "$common:vb_qfactor=1.25:vb_qoffset=0.6"
  target bitrate = 725 kbits/s
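
For anyone who wants to reproduce a given row, here is a minimal sketch of how the $common / $bfcommon shorthand expands into one option string. The post does not show the actual command line, so the helper name lavcopts() and the way vbitrate and vpass are appended are assumptions; only the option strings themselves are taken from above.

```python
# Option strings copied from the "codec options" block above.
COMMON = ("vcodec=mpeg4:vhq:v4mv:vqmax=31:vlelim=-3:vcelim=-5:keyint=300:"
          "lumi_mask=0.12:dark_mask=0.15:scplx_mask=0.02:tcplx_mask=0.08:naq")
BFCOMMON = COMMON + ":vb_qfactor=1.25:vb_qoffset=0.6"


def lavcopts(base, *extra, bitrate=725, vpass=None):
    """Join a base string and extra options into one colon-separated value.

    Hypothetical helper: appending vbitrate/vpass here is an assumption,
    the original post only states the target bitrate and "2 passes".
    """
    parts = [base, *extra, f"vbitrate={bitrate}"]
    if vpass is not None:
        parts.append(f"vpass={vpass}")
    return ":".join(parts)


# Second pass of "$bfcommon:vmax_b_frames=2:cmp=2:subcmp=2"
print(lavcopts(BFCOMMON, "vmax_b_frames=2", "cmp=2", "subcmp=2", vpass=2))
```

The resulting string is what would follow -lavcopts on a mencoder command line; the rest of that command line (input, crop/scale filters, output) is not given in this post.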

notes :
  - The source itself is an old, noisy film.
  - 2 passes for each, encoding speed is the speed of the second pass.
  - I didn't test 'cmp' and 'subcmp' individually.
  - I didn't test other values of vb_qfactor and vb_qoffset. The values
    used here are a bit arbitrary.
  - PSNR is on a logarithmic scale; there's a big visual difference
    between 38 dB and 40 dB, for example.
  - It would be interesting to see the performance of XViD here ...
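
To make the note about the logarithmic scale concrete: PSNR is 10*log10(peak^2 / MSE), so a difference in dB corresponds to a ratio of mean squared errors. A small sketch (the function name is mine):

```python
def mse_ratio(psnr_low_db, psnr_high_db):
    """How many times larger the MSE is at the lower PSNR than at the higher."""
    return 10 ** ((psnr_high_db - psnr_low_db) / 10)


# Going from 40 dB down to 38 dB means about 58% more squared error.
print(round(mse_ratio(38.0, 40.0), 2))  # 1.58
```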


0. For those who don't want all the details
===========================================

*) Here's a table of various options, their result in visual quality
(PSNR, and yes, I know it's not a perfect tool) and the encoding
speed in fps.

      PSNR    fps  options

 1)   38.32   37   $common
 2)   38.91   35   $bfcommon:vmax_b_frames=2
 3)   39.24   25   $bfcommon:vmax_b_frames=2:cmp=2:subcmp=2
 4)   39.35   15   $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2
 5)   39.42   10   qpel:$common:cmp=2:subcmp=2
 6)   39.68    9   qpel:$bfcommon:vmax_b_frames=1
 7)   39.89    6   qpel:$bfcommon:vmax_b_frames=1:cmp=2:subcmp=2
 8)   40.12    4   qpel:$bfcommon:vmax_b_frames=1:cmp=3:subcmp=3:trell:last_pred=5

        1)      2)      3)      4)      5)      6)      7)      8) 
  q2:   156     161     184     121     499     285     306     200
  q3:   833     677     834     565    2515    1324    1411     900
  q4:  1561    1201    1221    1319    1321    1354    1339    1432
  q5:  1241    1368    1392    1073     165    1089    1082    1243
  q6:   479     720     709     903             406     355     509
  q7:   228     371     158     306              41       6     215
  q8:     2                     204                                
  q9:                             8
       ----    ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    4.65    4.46    4.85    3.26    4.03    3.95    4.36
                                                                   
   Y: 37.24   37.79   38.13   38.28   38.41   38.61   38.94   39.09
  Cb: 41.17   42.06   42.26   42.15   41.95   42.51   42.61   42.76
  Cr: 42.88   43.71   43.91   43.76   43.54   44.07   44.17   44.27
      -----   -----   -----   -----   -----   -----   -----   -----
 All: 38.32   38.91   39.24   39.35   39.42   39.68   39.89   40.12
                                                                   
 fps:   37      35      25      15      10      9       6       4  
                                                                   
rate: 727.4   727.4   727.7   727.7   728.2   727.0   728.0   727.6
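
The post does not say how the "All" row is derived from Y, Cb and Cr. One assumption that reproduces the numbers is a 4:2:0 weighting: convert each PSNR back to an MSE, average with weights 4:1:1 (four luma samples per Cb and per Cr sample), and convert back to dB. A sketch under that assumption:

```python
import math

PEAK = 255.0  # assumption: 8-bit samples


def psnr_to_mse(psnr_db):
    return PEAK ** 2 / 10 ** (psnr_db / 10)


def mse_to_psnr(mse):
    return 10 * math.log10(PEAK ** 2 / mse)


def combined_psnr(y_db, cb_db, cr_db):
    # Assumption: 4:2:0 video, i.e. 4 luma samples for each Cb and Cr sample.
    mse = (4 * psnr_to_mse(y_db) + psnr_to_mse(cb_db) + psnr_to_mse(cr_db)) / 6
    return mse_to_psnr(mse)


# Column 1) of the table above: Y 37.24, Cb 41.17, Cr 42.88
print(round(combined_psnr(37.24, 41.17, 42.88), 2))  # 38.32
```

Because luma carries two thirds of the weight, the "All" figure stays close to the Y figure even when the chroma PSNRs are several dB higher.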





I. Compare functions and EPZS diamond size
==========================================

*) increasing the EPZS diamond size doesn't seem to be a win : it can
significantly decrease speed and most of the time it decreases quality
too
*) the SATD compare function slightly increases quality (0.09 dB PSNR)
at the cost of a significant slowdown (37 fps -> 30 fps)


 1) $common
 2) dia=2
 3) {sub,}cmp=sse
 4) {sub,}cmp=psnr
 5) {sub,}cmp=dct
 6) {sub,}cmp=satd
 7) {sub,}cmp=satd, dia=2
 8) {sub,}cmp=satd, dia=3
 9) {sub,}cmp=satd, dia=4
10) {sub,}cmp=satd, dia=-1
11) {sub,}cmp=satd, dia=-2
12) {sub,}cmp=satd, dia=-3

        1)      2)      3)      4)      5)      6)      7)      8)      9)     10)     11)     12) 
  q2:   156     157     165     139     161     166     165     166     162     168     169     164
  q3:   833     824     877     771     850     857     838     836     845     842     847     843
  q4:  1561    1535    1595    1470    1610    1601    1598    1600    1584    1580    1606    1607
  q5:  1241    1264    1174    1340    1188    1192    1211    1207    1210    1218    1183    1201
  q6:   479     487     483     489     495     481     479     485     485     481     490     480
  q7:   228     231     204     285     194     201     207     204     212     209     203     203
  q8:     2       2       2       6       2       2       2       2       2       2       2       2
       ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    4.40    4.35    4.48    4.35    4.35
                						                                   
   Y: 37.24   37.22   37.28   37.13   37.34   37.33   37.33   37.32   37.32   37.32   37.33   37.33     
  Cb: 41.17   41.16   41.18   41.10   41.19   41.19   41.18   41.19   41.19   41.19   41.20   41.19     
  Cr: 42.88   42.89   42.91   42.84   42.92   42.92   42.91   42.92   42.92   42.92   42.93   42.91     
      -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----
 All: 38.32   38.31   38.36   38.22   38.41   38.41   38.40   38.40   38.39   38.40   38.41   38.40
                                   				                                   
 fps:   37      35      27      12      22      30      26      22      17      24      28      26
                						                                   
rate: 727.4   727.4   727.4   727.5   727.5   727.4   727.4   727.4   727.4   727.4   727.4   727.4
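
The q2..q8 rows are not defined in the post. Reading them as the number of frames encoded at each quantizer is an assumption, but it fits: column 1) sums to 4500 frames (3 minutes at 25 fps) and the weighted mean matches the avq row. A sketch of that check:

```python
def average_quantizer(hist):
    """hist maps quantizer value -> number of frames encoded with it."""
    frames = sum(hist.values())
    return sum(q * n for q, n in hist.items()) / frames


# Column 1) ($common) of the table above.
common = {2: 156, 3: 833, 4: 1561, 5: 1241, 6: 479, 7: 228, 8: 2}

print(sum(common.values()))                 # 4500
print(round(average_quantizer(common), 2))  # 4.39
```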



II. B frames and compare functions
==================================

*) B frames are a huge win ! at least on the PSNR side :) but don't
push them too far : use 1 or 2 B frames, not more. With 3 the gain
shrinks and with 4 the PSNR drops below the baseline.

*) Even if the average quantizer is larger with 1 or 2 B frames than
with none at all, the PSNR is better and visually the result is way
nicer.

*) The price to pay in terms of performance is very small.

*) With B frames, using SATD instead of the default ME compare
function gives larger PSNR gains than without B frames (+0.33 dB with
2 B frames vs. +0.09 dB with none). Using DCT instead of SATD gives
slightly more (0.01 to 0.02 dB) but the performance hit is too large
IMHO.

*) The combination of 2 B frames and SATD gives a 0.92 dB boost in PSNR
in exchange for a 32% speed drop.
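
The 0.92 dB / 32% figures come straight from the "All" and fps rows of columns 1) and 7) in the table below. A small sketch of the arithmetic (the helper name is mine):

```python
def tradeoff(base, variant):
    """Return (PSNR gain in dB, speed drop in percent) of variant vs. base.

    Each argument is a (psnr_db, fps) pair.
    """
    (base_psnr, base_fps), (var_psnr, var_fps) = base, variant
    gain_db = var_psnr - base_psnr
    drop_pct = 100 * (base_fps - var_fps) / base_fps
    return gain_db, drop_pct


# $common vs. $bfcommon:vmax_b_frames=2:cmp=2:subcmp=2
gain, drop = tradeoff((38.32, 37), (39.24, 25))
print(f"{gain:+.2f} dB for a {drop:.0f}% speed drop")  # +0.92 dB for a 32% speed drop
```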


 1) $common
 2) $bfcommon:vmax_b_frames=1
 3) $bfcommon:vmax_b_frames=2
 4) $bfcommon:vmax_b_frames=3
 5) $bfcommon:vmax_b_frames=4
 6) $bfcommon:vmax_b_frames=1:cmp=2:subcmp=2 (2==satd)
 7) $bfcommon:vmax_b_frames=2:cmp=2:subcmp=2 (2==satd)
 8) $bfcommon:vmax_b_frames=1:cmp=3:subcmp=3 (3==dct)
 9) $bfcommon:vmax_b_frames=2:cmp=3:subcmp=3 (3==dct)
10) $bfcommon:vmax_b_frames=1:cmp=1:subcmp=1 (1==sse)
11) $bfcommon:vmax_b_frames=2:cmp=1:subcmp=1 (1==sse)

        1)      2)      3)      4)      5)      6)      7)      8)      9)     10)     11)
  q2:   156     188     161     101      70     199     184     206     184     193     171
  q3:   833     795     677     536     396     875     834     869     847     816     709
  q4:  1561    1366    1201    1060     863    1396    1221    1391    1243    1386    1176
  q5:  1241    1252    1368    1367    1228    1237    1392    1257    1413    1244    1367
  q6:   479     559     720     927    1044     509     709     500     649     543     727
  q7:   228     317     371     494     802     271     158     264     162     304     348
  q8:     2      22              12      93      12              12              13
       ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    4.49    4.65    4.89    5.23    4.41    4.46    4.40    4.44    4.46    4.58
                           			           	           
   Y: 37.24   37.78   37.79   37.37   36.98   37.98   38.13   37.99   38.16   37.84   37.82
  Cb: 41.17   41.89   42.06   41.90   41.74   42.01   42.26   42.01   42.27   41.93   42.08
  Cr: 42.88   43.54   43.71   43.60   43.41   43.66   43.91   43.64   43.92   43.57   43.72
      -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----
 All: 38.32   38.89   38.91   38.54   38.17   39.07   39.24   39.08   39.26   38.94   38.94
                           			           	           
 fps:   37      35      35      34      34      26      25      18      17      24      23
                           			           	           
rate: 727.4   727.7   727.4   726.9   726.2   727.9   727.7   727.9   727.7   727.9   727.9



IV. QPEL, B frames and compare functions
========================================

*) Quarter Pixel Motion Estimation causes a huge performance drop
(37 fps -> 11 fps) because it's all in C; there's no MMX, SSE, etc.
assembly code.

*) QPEL is very effective : without any further optimization, QPEL
alone gives a whole dB gain in PSNR, yeah ! :)

*) If you want to use QPEL & B frames, use only 1 B frame; with more
you actually decrease the PSNR.


 1) $common
 2) $common:qpel
 3) $common:qpel:cmp=2:subcmp=2 (2==satd)
 4) $common:qpel:cmp=3:subcmp=3 (3==dct)
 5) $bfcommon:qpel:vmax_b_frames=1
 6) $bfcommon:qpel:vmax_b_frames=2
 7) $bfcommon:qpel:vmax_b_frames=1:cmp=2:subcmp=2 (2==satd)
 9) $bfcommon:qpel:vmax_b_frames=1:cmp=3:subcmp=3 (3==dct)

        1)      2)      3)      4)      5)      6)      7)      9)
  q2:   156     483     499     494     285     192     306     307
  q3:   833    2485    2515    2510    1324    1022    1411    1436
  q4:  1561    1361    1321    1330    1354    1341    1339    1309
  q5:  1241     171     165     166    1089    1347    1082    1084
  q6:   479                             406     588     355     356
  q7:   228                              41       8       6       7
  q8:     2
       ----    ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    3.27    3.26    3.26    4.03    4.25    3.95
           	           	           	           
   Y: 37.24   38.32   38.41   38.43   38.61   38.53   38.94   38.86
  Cb: 41.17   41.93   41.95   41.95   42.51   42.58   42.61   42.61
  Cr: 42.88   43.53   43.54   43.54   44.07   44.13   44.17   44.17
      -----   -----   -----   -----   -----   -----   -----   -----
 All: 38.32   39.34   39.42   39.44   39.68   39.62   39.89   39.90
           	           	           	           
 fps:   37      11      10      9       9       8       6       5
           	           	           	           
rate: 727.4   728.2   728.2   728.2   727.0   726.5   728.0   727.9


V. B frames, number of predictors and trellis quantization
==========================================================

*) Again, without QPEL, use 2 B frames. It's nearly as fast as 1 B
frame and slightly increases the overall PSNR.

*) last_pred alone isn't worth the trouble.

*) trellis quantization is slow (35 fps -> 19 fps) but you will gain
0.25 dB of PSNR, which isn't bad.


 1) $common
 2) $bfcommon:vmax_b_frames=1
 3) $bfcommon:vmax_b_frames=2
 4) $bfcommon:vmax_b_frames=1:last_pred=5
 5) $bfcommon:vmax_b_frames=2:last_pred=5
 6) $bfcommon:vmax_b_frames=1:last_pred=10
 7) $bfcommon:vmax_b_frames=2:last_pred=10
 8) $bfcommon:vmax_b_frames=1:trell
 9) $bfcommon:vmax_b_frames=2:trell
10) $bfcommon:vmax_b_frames=1:trell:last_pred=5
11) $bfcommon:vmax_b_frames=2:trell:last_pred=5
12) $bfcommon:vmax_b_frames=1:trell:last_pred=10
13) $bfcommon:vmax_b_frames=2:trell:last_pred=10

        1)      2)      3)      4)      5)      6)      7)      8)      9)     10)     11)     12)     13)
  q2:   156     188     161     188     164     188     161      99      91     102      95      98      93
  q3:   833     795     677     803     684     795     694     527     423     531     434     529     440
  q4:  1561    1366    1201    1368    1207    1380    1205    1243    1026    1246    1026    1258    1016
  q5:  1241    1252    1368    1255    1391    1254    1397    1019    1120    1021    1147    1008    1149
  q6:   479     559     720     557     685     556     682     998    1034     990    1045    1003    1031
  q7:   228     317     371     307     367     306     359     342     534     340     491     334     516
  q8:     2      22              21              20             252     270     250     260     249     253
  q9:                                                            19              19              20
       ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    4.49    4.65    4.49    4.63    4.49    4.63    4.98    5.18    4.97    5.14    4.97    5.14
                                   	                                                   	   
   Y: 37.24   37.78   37.79   37.80   37.81   37.80   37.82   38.06   38.05   38.08   38.09   38.07   38.09
  Cb: 41.17   41.89   42.06   41.89   42.06   41.90   42.07   42.02   42.14   42.03   42.17   42.03   42.15
  Cr: 42.88   43.54   43.71   43.54   43.70   43.53   43.71   43.65   43.76   43.64   43.78   43.64   43.78
      -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----   -----
 All: 38.32   38.89   38.91   38.90   38.93   38.90   38.94   39.14   39.15   39.16   39.19   39.15   39.18
                                   	                                                   	   
 fps:   37      35      35      31      30      27      26      19      19      19      18      17      16
                                   	                                                   	   
rate: 727.4   727.7   727.4   727.8   727.4   727.8   727.4   726.9   726.8   726.9   726.8   726.9   726.8
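
The 0.25 dB figure for trellis quantization can be checked against every matched pair in the table above, i.e. pairs of columns that differ only by trell. A minimal sketch; the pairing is read off the option list, the code itself is only an illustration:

```python
# "All" PSNR per option number, copied from the table in this section.
psnr = {2: 38.89, 3: 38.91, 4: 38.90, 5: 38.93, 6: 38.90, 7: 38.94,
        8: 39.14, 9: 39.15, 10: 39.16, 11: 39.19, 12: 39.15, 13: 39.18}

# Each pair is (without trell, with trell), all other options identical.
pairs = [(2, 8), (3, 9), (4, 10), (5, 11), (6, 12), (7, 13)]

deltas = [psnr[on] - psnr[off] for off, on in pairs]
print([round(d, 2) for d in deltas])        # [0.25, 0.24, 0.26, 0.26, 0.25, 0.24]
print(round(sum(deltas) / len(deltas), 2))  # 0.25
```

The gain is almost the same in every pair, which suggests that trell behaves independently of the B frame count and of last_pred on this clip.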



VI. Everything put into the mix, looking for the best PSNR
==========================================================

*) with QPEL, 1 B frame, trellis quantization, 5 predictors and the
DCT ME compare function, I got a gain of 1.80 dB, which is really HUGE !
Well, there's only one problem : encoding is nearly ten times slower
(37 fps -> 4 fps) ...
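
To put "nearly ten times slower" in wall-clock terms for the 3-minute test clip, here is a rough sketch. It only covers the second pass, since that is the speed reported in this post; the function name is mine.

```python
FRAMES = 3 * 60 * 25  # 3 minutes at 25 fps, from the test set description


def second_pass_seconds(fps):
    """Time for one second pass over the test clip at the given speed."""
    return FRAMES / fps


baseline_fps, best_fps = 37, 4  # $common vs. the best-PSNR combination
print(round(second_pass_seconds(baseline_fps)))  # 122
print(round(second_pass_seconds(best_fps)))      # 1125
print(round(baseline_fps / best_fps, 2))         # 9.25
```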


 1) $common
10) $bfcommon:vmax_b_frames=1:trell:last_pred=5
11) $bfcommon:vmax_b_frames=2:trell:last_pred=5
14) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2
15) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=3:subcmp=3

16) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2:qpel
17) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=3:subcmp=3:qpel

        1)     10)     11)     14)     15)     16)     17) 
  q2:   156     102      95     121     124     201     200
  q3:   833     531     434     565     564     909     900
  q4:  1561    1246    1026    1319    1319    1417    1432
  q5:  1241    1021    1147    1073    1066    1261    1243
  q6:   479     990    1045     903     904     493     509
  q7:   228     340     491     306     315     218     215
  q8:     2     250     260     204     199
  q9:            19               8       8
       ----    ----    ----    ----    ----    ----    ----
 avq:  4.39    4.97    5.14    4.85    4.85    4.35    4.36
                                   		           
   Y: 37.24   38.08   38.09   38.28   38.29   39.07   39.09
  Cb: 41.17   42.03   42.17   42.15   42.16   42.76   42.76
  Cr: 42.88   43.64   43.78   43.76   43.75   44.28   44.27
      -----   -----   -----   -----   -----   -----   -----
 All: 38.32   39.16   39.19   39.35   39.36   40.11   40.12
                                   		           
 fps:   37      19      18      15      11      5       4  
                                   		           
rate: 727.4   726.9   726.8   727.7   727.7   727.7   727.6

-- 
Rémi

