[MPlayer-dev-eng] Performance test of (nearly) all libavcodec options
Rémi Guyomarch
rguyom at pobox.com
Fri Jan 3 00:55:16 CET 2003
(yes, you will need a large screen to view some tables here, deal with
it)
test set :
- source : "Pale Rider" DVD
- length : 3 minutes
- frame size : 720 x 576 cropped & scaled down to 576 x 240
- frame rate : 25 fps
test hardware & OS :
- Athlon XP 1800, 512 MB, FreeBSD 4.7
- various processes running in the background
tested code :
- libavcodec build 4649
codec options :
$common = "vcodec=mpeg4:vhq:v4mv:vqmax=31:vlelim=-3:vcelim=-5:keyint=300:lumi_mask=0.12:dark_mask=0.15:scplx_mask=0.02:tcplx_mask=0.08:naq"
$bfcommon = "$common:vb_qfactor=1.25:vb_qoffset=0.6"
target bitrate = 725 kbits/s
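
For anyone wanting to reproduce a run, the option strings above can be expanded as in the sketch below. It is an illustration only. The file names are placeholders, and the crop/scale filter chain is left out because the exact values are not given here. Appending vbitrate and vpass this way is an assumption about how the two passes were driven, not the script actually used for these tests.

```python
# Sketch: expand the $common / $bfcommon templates into mencoder arguments.
# File names and the two-pass handling are assumptions for illustration.
COMMON = ("vcodec=mpeg4:vhq:v4mv:vqmax=31:vlelim=-3:vcelim=-5:keyint=300:"
          "lumi_mask=0.12:dark_mask=0.15:scplx_mask=0.02:tcplx_mask=0.08:naq")
BFCOMMON = COMMON + ":vb_qfactor=1.25:vb_qoffset=0.6"
TARGET_KBITS = 725


def lavcopts(base, extra="", vpass=1):
    """Join a base template, optional extra options, bitrate and pass number."""
    parts = [base]
    if extra:
        parts.append(extra)
    parts.append(f"vbitrate={TARGET_KBITS}")
    parts.append(f"vpass={vpass}")
    return ":".join(parts)


def mencoder_args(base, extra="", vpass=1, src="input.vob", dst="output.avi"):
    """Build an argument list; src and dst are hypothetical file names."""
    return ["mencoder", src, "-nosound", "-ovc", "lavc",
            "-lavcopts", lavcopts(base, extra, vpass), "-o", dst]


if __name__ == "__main__":
    # Configuration 2) of the summary table, first and second pass.
    for vpass in (1, 2):
        print(" ".join(mencoder_args(BFCOMMON, "vmax_b_frames=2", vpass=vpass)))
```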
notes :
- The source itself is an old, noisy film.
- 2 passes for each, encoding speed is the speed of the second pass.
- I didn't test 'cmp' and 'subcmp' individually.
- I didn't test other vb_qfactor and vb_qoffset values. The values used
here are a bit arbitrary.
- PSNR is on a logarithmic scale: there's a big visual difference
between 38 dB and 40 dB, for example.
- It would be interesting to see the performance of XViD here ...
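
To put a number on the logarithmic-scale remark: assuming the usual definition PSNR = 10 * log10(peak^2 / MSE), a PSNR difference maps to a ratio of mean squared errors. The sketch below (function names are illustrative) shows that going from 38 dB to 40 dB means roughly 1.6 times less squared error.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit samples assumed)."""
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio(db_gain):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (db_gain / 10.0)


if __name__ == "__main__":
    print(round(mse_ratio(40.0 - 38.0), 2))  # 1.58
```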
0. For those who don't want all the details
===========================================
*) Here's a table of various options, the resulting visual quality
(PSNR, and yes, I know it's not a perfect tool) and the encoding
speed in fps.
     PSNR  fps  options
1)  38.32   37  $common
2)  38.91   35  $bfcommon:vmax_b_frames=2
3)  39.24   25  $bfcommon:vmax_b_frames=2:cmp=2:subcmp=2
4)  39.35   15  $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2
5)  39.42   10  qpel:$common:cmp=2:subcmp=2
6)  39.68    9  qpel:$bfcommon:vmax_b_frames=1
7)  39.89    6  qpel:$bfcommon:vmax_b_frames=1:cmp=2:subcmp=2
8)  40.12    4  qpel:$bfcommon:vmax_b_frames=1:cmp=3:subcmp=3:trell:last_pred=5
          1)     2)     3)     4)     5)     6)     7)     8)
q2:      156    161    184    121    499    285    306    200
q3:      833    677    834    565   2515   1324   1411    900
q4:     1561   1201   1221   1319   1321   1354   1339   1432
q5:     1241   1368   1392   1073    165   1089   1082   1243
q6:      479    720    709    903      -    406    355    509
q7:      228    371    158    306      -     41      6    215
q8:        2      -      -    204      -      -      -      -
q9:        -      -      -      8      -      -      -      -
        ----   ----   ----   ----   ----   ----   ----   ----
avq:    4.39   4.65   4.46   4.85   3.26   4.03   3.95   4.36
Y:     37.24  37.79  38.13  38.28  38.41  38.61  38.94  39.09
Cb:    41.17  42.06  42.26  42.15  41.95  42.51  42.61  42.76
Cr:    42.88  43.71  43.91  43.76  43.54  44.07  44.17  44.27
       -----  -----  -----  -----  -----  -----  -----  -----
All:   38.32  38.91  39.24  39.35  39.42  39.68  39.89  40.12
fps:      37     35     25     15     10      9      6      4
rate:  727.4  727.4  727.7  727.7  728.2  727.0  728.0  727.6
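
The avq row appears to be the frame-weighted mean of the quantizer histogram, with the q2..q9 rows counting frames per quantizer. That reading is an inference from the numbers rather than something stated above, but it reproduces the published values for the two columns checked in this sketch (names are illustrative):

```python
def average_quantizer(histogram):
    """Frame-weighted mean quantizer from a {quantizer: frame_count} mapping."""
    total_frames = sum(histogram.values())
    weighted = sum(q * count for q, count in histogram.items())
    return weighted / total_frames


# Column 1) of the summary table ($common).
COMMON_HIST = {2: 156, 3: 833, 4: 1561, 5: 1241, 6: 479, 7: 228, 8: 2}
# Column 5) of the summary table (qpel:$common:cmp=2:subcmp=2).
QPEL_SATD_HIST = {2: 499, 3: 2515, 4: 1321, 5: 165}

if __name__ == "__main__":
    print(round(average_quantizer(COMMON_HIST), 2))
    print(round(average_quantizer(QPEL_SATD_HIST), 2))
```

The same function can be used to fill in an avq value wherever a table row below is incomplete.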
I. Compare functions and EPZS diamond size
==========================================
*) increasing the EPZS diamond size doesn't seem to be a win : it can
significantly decrease speed and most of the time it decreases quality
too
*) the SATD compare function slightly increases quality (0.09 dB PSNR) at
the cost of a significant slowdown (37 fps -> 30 fps)
1) $common
2) dia=2
3) {sub,}cmp=sse
4) {sub,}cmp=psnr
5) {sub,}cmp=dct
6) {sub,}cmp=satd
7) {sub,}cmp=satd, dia=2
8) {sub,}cmp=satd, dia=3
9) {sub,}cmp=satd, dia=4
10) {sub,}cmp=satd, dia=-1
11) {sub,}cmp=satd, dia=-2
12) {sub,}cmp=satd, dia=-3
          1)     2)     3)     4)     5)     6)     7)     8)     9)    10)    11)    12)
q2:      156    157    165    139    161    166    165    166    162    168    169    164
q3:      833    824    877    771    850    857    838    836    845    842    847    843
q4:     1561   1535   1595   1470   1610   1601   1598   1600   1584   1580   1606   1607
q5:     1241   1264   1174   1340   1188   1192   1211   1207   1210   1218   1183   1201
q6:      479    487    483    489    495    481    479    485    485    481    490    480
q7:      228    231    204    285    194    201    207    204    212    209    203    203
q8:        2      2      2      6      2      2      2      2      2      2      2      2
        ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----
avq:    4.39   4.40   4.35   4.48   4.35   4.35
Y:     37.24  37.22  37.28  37.13  37.34  37.33  37.33  37.32  37.32  37.32  37.33  37.33
Cb:    41.17  41.16  41.18  41.10  41.19  41.19  41.18  41.19  41.19  41.19  41.20  41.19
Cr:    42.88  42.89  42.91  42.84  42.92  42.92  42.91  42.92  42.92  42.92  42.93  42.91
       -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
All:   38.32  38.31  38.36  38.22  38.41  38.41  38.40  38.40  38.39  38.40  38.41  38.40
fps:      37     35     27     12     22     30     26     22     17     24     28     26
rate:  727.4  727.4  727.4  727.5  727.5  727.4  727.4  727.4  727.4  727.4  727.4  727.4
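
For readers unfamiliar with the compare functions, here is a minimal sketch of what SAD, SSE and SATD measure on a single 4x4 block. It follows the textbook definitions (SATD as the sum of absolute values of the Hadamard-transformed difference, un-normalized) and is not libavcodec's code, which uses its own block sizes and optimized routines.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def diff_block(a, b):
    """Per-pixel difference of two 4x4 blocks."""
    return [[a[y][x] - b[y][x] for x in range(4)] for y in range(4)]


def sad(a, b):
    """Sum of absolute differences."""
    return sum(abs(v) for row in diff_block(a, b) for v in row)


def sse(a, b):
    """Sum of squared errors."""
    return sum(v * v for row in diff_block(a, b) for v in row)


def matmul(m, n):
    """Plain 4x4 matrix product."""
    return [[sum(m[i][k] * n[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd(a, b):
    """Sum of absolute values of H4 * diff * H4 (un-normalized)."""
    transformed = matmul(matmul(H4, diff_block(a, b)), H4)
    return sum(abs(v) for row in transformed for v in row)
```

A uniform brightness offset lands entirely in one transform coefficient, while a single-pixel spike spreads over all sixteen, so SATD ranks the two kinds of error differently from SAD or SSE.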
II. B frames and compare functions
=================================
*) B frames are a huge win ! at least on the PSNR side :) but don't
push them too far, use 1 or 2 B frames, not more.
*) Even if the average quantizer is larger with 1 or 2 B frames than
with none at all, the PSNR is better and visually the result is way nicer.
*) The price to pay in terms of performance is very small.
*) Using SATD instead of the default ME compare function gives a
larger PSNR gain with B frames than without (+0.33 dB with 2 B frames
versus +0.09 dB with none). Using DCT instead of SATD gives slightly
more but the performance hit is too large IMHO.
*) The combination of 2 B frames and SATD gives a 0.92 dB boost of PSNR
in exchange for a 32% speed drop.
1) $common
2) $bfcommon:vmax_b_frames=1
3) $bfcommon:vmax_b_frames=2
4) $bfcommon:vmax_b_frames=3
5) $bfcommon:vmax_b_frames=4
6) $bfcommon:vmax_b_frames=1:cmp=2:subcmp=2 (2==satd)
7) $bfcommon:vmax_b_frames=2:cmp=2:subcmp=2 (2==satd)
8) $bfcommon:vmax_b_frames=1:cmp=3:subcmp=3 (3==dct)
9) $bfcommon:vmax_b_frames=2:cmp=3:subcmp=3 (3==dct)
10) $bfcommon:vmax_b_frames=1:cmp=1:subcmp=1 (1==sse)
11) $bfcommon:vmax_b_frames=2:cmp=1:subcmp=1 (1==sse)
          1)     2)     3)     4)     5)     6)     7)     8)     9)    10)    11)
q2:      156    188    161    101     70    199    184    206    184    193    171
q3:      833    795    677    536    396    875    834    869    847    816    709
q4:     1561   1366   1201   1060    863   1396   1221   1391   1243   1386   1176
q5:     1241   1252   1368   1367   1228   1237   1392   1257   1413   1244   1367
q6:      479    559    720    927   1044    509    709    500    649    543    727
q7:      228    317    371    494    802    271    158    264    162    304    348
q8:        2     22      -     12     93     12      -     12      -     13      -
        ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----
avq:    4.39   4.49   4.65   4.89   5.23   4.41   4.46   4.40   4.44   4.46   4.58
Y:     37.24  37.78  37.79  37.37  36.98  37.98  38.13  37.99  38.16  37.84  37.82
Cb:    41.17  41.89  42.06  41.90  41.74  42.01  42.26  42.01  42.27  41.93  42.08
Cr:    42.88  43.54  43.71  43.60  43.41  43.66  43.91  43.64  43.92  43.57  43.72
       -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
All:   38.32  38.89  38.91  38.54  38.17  39.07  39.24  39.08  39.26  38.94  38.94
fps:      37     35     35     34     34     26     25     18     17     24     23
rate:  727.4  727.7  727.4  726.9  726.2  727.9  727.7  727.9  727.7  727.9  727.9
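
The "0.92 dB for a 32% speed drop" figure comes straight from the All and fps rows. The small sketch below, with illustrative names, redoes that arithmetic and can be reused for any pair of columns.

```python
# (overall PSNR in dB, second-pass fps) taken from the table above.
RESULTS = {
    "$common": (38.32, 37),
    "$bfcommon:vmax_b_frames=2": (38.91, 35),
    "$bfcommon:vmax_b_frames=2:cmp=2:subcmp=2": (39.24, 25),
}


def tradeoff(baseline, candidate):
    """Return (PSNR gain in dB, speed drop in percent) versus the baseline."""
    base_psnr, base_fps = RESULTS[baseline]
    cand_psnr, cand_fps = RESULTS[candidate]
    gain_db = round(cand_psnr - base_psnr, 2)
    speed_drop_pct = round(100.0 * (base_fps - cand_fps) / base_fps)
    return gain_db, speed_drop_pct


if __name__ == "__main__":
    print(tradeoff("$common", "$bfcommon:vmax_b_frames=2:cmp=2:subcmp=2"))
```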
IV. QPEL, B frames and compare functions
========================================
*) Quarter Pixel Motion Estimation causes a huge performance drop
because it's all in C, there's no MMX, SSE etc... assembly code.
*) QPEL is very effective: without any further optimization, QPEL
alone gives a whole dB gain in PSNR, yeah ! :)
*) If you want to use QPEL & B frames, use only 1 B frame, with more
you actually decrease the PSNR.
1) $common
2) $common:qpel
3) $common:qpel:cmp=2:subcmp=2 (2==satd)
4) $common:qpel:cmp=3:subcmp=3 (3==dct)
5) $bfcommon:qpel:vmax_b_frames=1
6) $bfcommon:qpel:vmax_b_frames=2
7) $bfcommon:qpel:vmax_b_frames=1:cmp=2:subcmp=2 (2==satd)
9) $bfcommon:qpel:vmax_b_frames=1:cmp=3:subcmp=3 (3==dct)
          1)     2)     3)     4)     5)     6)     7)     9)
q2:      156    483    499    494    285    192    306    307
q3:      833   2485   2515   2510   1324   1022   1411   1436
q4:     1561   1361   1321   1330   1354   1341   1339   1309
q5:     1241    171    165    166   1089   1347   1082   1084
q6:      479      -      -      -    406    588    355    356
q7:      228      -      -      -     41      8      6      7
q8:        2      -      -      -      -      -      -      -
        ----   ----   ----   ----   ----   ----   ----   ----
avq:    4.39   3.27   3.26   3.26   4.03   4.25   3.95
Y:     37.24  38.32  38.41  38.43  38.61  38.53  38.94  38.86
Cb:    41.17  41.93  41.95  41.95  42.51  42.58  42.61  42.61
Cr:    42.88  43.53  43.54  43.54  44.07  44.13  44.17  44.17
       -----  -----  -----  -----  -----  -----  -----  -----
All:   38.32  39.34  39.42  39.44  39.68  39.62  39.89  39.90
fps:      37     11     10      9      9      8      6      5
rate:  727.4  728.2  728.2  728.2  727.0  726.5  728.0  727.9
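
To get a feel for what these fps figures mean in practice, the sketch below extrapolates second-pass time to a longer source. The 2-hour length is a hypothetical example, and the extrapolation assumes the speed measured on this 3-minute clip holds for a whole film, which it may not.

```python
def second_pass_hours(minutes, encode_fps, source_fps=25.0):
    """Hours needed for one pass over `minutes` of video at `encode_fps`."""
    frames = minutes * 60.0 * source_fps
    return frames / encode_fps / 3600.0


if __name__ == "__main__":
    # Hypothetical 120-minute film: $common (37 fps) versus $common:qpel (11 fps).
    for fps in (37, 11):
        print(fps, "fps ->", round(second_pass_hours(120, fps), 2), "hours")
```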
V. B frames, number of predictors and trellis quantization
==========================================================
*) Again, without QPEL, use 2 B frames. It's nearly as fast as 1 B
frame and increases the overall PSNR a bit.
*) last_pred alone isn't worth the trouble.
*) trellis quantization is slow but you will get 0.25 dB of PSNR which
isn't bad.
1) $common
2) $bfcommon:vmax_b_frames=1
3) $bfcommon:vmax_b_frames=2
4) $bfcommon:vmax_b_frames=1:last_pred=5
5) $bfcommon:vmax_b_frames=2:last_pred=5
6) $bfcommon:vmax_b_frames=1:last_pred=10
7) $bfcommon:vmax_b_frames=2:last_pred=10
8) $bfcommon:vmax_b_frames=1:trell
9) $bfcommon:vmax_b_frames=2:trell
10) $bfcommon:vmax_b_frames=1:trell:last_pred=5
11) $bfcommon:vmax_b_frames=2:trell:last_pred=5
12) $bfcommon:vmax_b_frames=1:trell:last_pred=10
13) $bfcommon:vmax_b_frames=2:trell:last_pred=10
          1)     2)     3)     4)     5)     6)     7)     8)     9)    10)    11)    12)    13)
q2:      156    188    161    188    164    188    161     99     91    102     95     98     93
q3:      833    795    677    803    684    795    694    527    423    531    434    529    440
q4:     1561   1366   1201   1368   1207   1380   1205   1243   1026   1246   1026   1258   1016
q5:     1241   1252   1368   1255   1391   1254   1397   1019   1120   1021   1147   1008   1149
q6:      479    559    720    557    685    556    682    998   1034    990   1045   1003   1031
q7:      228    317    371    307    367    306    359    342    534    340    491    334    516
q8:        2     22      -     21      -     20      -    252    270    250    260    249    253
q9:        -      -      -      -      -      -      -     19      -     19      -     20      -
        ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----   ----
avq:    4.39   4.49   4.65   4.49   4.63   4.49   4.63   4.98   5.18   4.97   5.14   4.97   5.14
Y:     37.24  37.78  37.79  37.80  37.81  37.80  37.82  38.06  38.05  38.08  38.09  38.07  38.09
Cb:    41.17  41.89  42.06  41.89  42.06  41.90  42.07  42.02  42.14  42.03  42.17  42.03  42.15
Cr:    42.88  43.54  43.71  43.54  43.70  43.53  43.71  43.65  43.76  43.64  43.78  43.64  43.78
       -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
All:   38.32  38.89  38.91  38.90  38.93  38.90  38.94  39.14  39.15  39.16  39.19  39.15  39.18
fps:      37     35     35     31     30     27     26     19     19     19     18     17     16
rate:  727.4  727.7  727.4  727.8  727.4  727.8  727.4  726.9  726.8  726.9  726.8  726.9  726.8
VI. Everything put into the mix, looking for the best PSNR
==========================================================
*) with QPEL, 1 B frame, trellis quantization, 5 predictors and the
DCT ME compare function, I got a gain of 1.80 dB, which is really HUGE !
Well, there's only one problem : encoding is nearly ten times slower ...
1) $common
10) $bfcommon:vmax_b_frames=1:trell:last_pred=5
11) $bfcommon:vmax_b_frames=2:trell:last_pred=5
14) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2
15) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=3:subcmp=3
16) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=2:subcmp=2:qpel
17) $bfcommon:vmax_b_frames=1:trell:last_pred=5:cmp=3:subcmp=3:qpel
          1)    10)    11)    14)    15)    16)    17)
q2:      156    102     95    121    124    201    200
q3:      833    531    434    565    564    909    900
q4:     1561   1246   1026   1319   1319   1417   1432
q5:     1241   1021   1147   1073   1066   1261   1243
q6:      479    990   1045    903    904    493    509
q7:      228    340    491    306    315    218    215
q8:        2    250    260    204    199      -      -
q9:        -     19      -      8      8      -      -
        ----   ----   ----   ----   ----   ----   ----
avq:    4.39   4.97   5.14   4.85   4.85   4.35   4.36
Y:     37.24  38.08  38.09  38.28  38.29  39.07  39.09
Cb:    41.17  42.03  42.17  42.15  42.16  42.76  42.76
Cr:    42.88  43.64  43.78  43.76  43.75  44.28  44.27
       -----  -----  -----  -----  -----  -----  -----
All:   38.32  39.16  39.19  39.35  39.36  40.11  40.12
fps:      37     19     18     15     11      5      4
rate:  727.4  726.9  726.8  727.7  727.7  727.7  727.6
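
Since the best PSNR costs nearly a factor of ten in speed, a practical way to read these results is to fix a minimum acceptable encoding speed and take the best PSNR above it. The sketch below does that over the eight configurations of the summary table in section 0. The function name and the idea of a speed budget are illustrative, not part of the tests.

```python
# (configuration number, overall PSNR in dB, second-pass fps) from section 0.
SUMMARY = [
    (1, 38.32, 37), (2, 38.91, 35), (3, 39.24, 25), (4, 39.35, 15),
    (5, 39.42, 10), (6, 39.68, 9), (7, 39.89, 6), (8, 40.12, 4),
]


def best_under_budget(min_fps):
    """Highest-PSNR configuration encoding at min_fps or faster, else None."""
    candidates = [row for row in SUMMARY if row[2] >= min_fps]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])


if __name__ == "__main__":
    for budget in (30, 20, 10, 1):
        print(budget, "fps ->", best_under_budget(budget))
```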
--
Rémi