[Ffmpeg-devel-irc] ffmpeg-devel.log.20151226

burek burek021 at gmail.com
Sun Dec 27 02:05:02 CET 2015


[00:34:30 CET] <tmm1> J_Darnley: so i've been staring at this yadif code and i noticed FFABS is double-evaluating all over the filter_line function
[00:36:28 CET] <J_Darnley> Are you sure?  It does many of them but I don't think any do the same two pixels.
[00:36:33 CET] <nevcairiel> compilers are generally trusted to remove that
[00:36:59 CET] <J_Darnley> Oh you mean when it advances to the next pixel
[00:37:07 CET] <tmm1> i mean in the macro itself
[00:37:08 CET] <tmm1> #define FFABS(a) ((a) >= 0 ? (a) : (-(a)))
[00:37:31 CET] <J_Darnley> oh, well yes, as nevcairiel said
[00:38:00 CET] <tmm1> ah i didn't realize the compiler was smart enough to calculate `a` only once there
[00:38:24 CET] <J_Darnley> Well, if you use -O0 it probably won't
[00:38:51 CET] <tmm1> sure, it won't do anything smart in that case
[00:38:52 CET] <J_Darnley> (but ffmpeg doesn't configure itself with that)
[00:39:36 CET] <J_Darnley> (someone should point Ganesh at this code)
[00:43:07 CET] <J_Darnley> tmm1: ffmpeg has lots of macros which end up doing that, the MAX and MIN ones are features in yadif
[00:43:15 CET] <J_Darnley> *featured
[00:47:36 CET] <iive> tmm1: what do you mean by double evaluating? How would you write that macro so it doesn't do that...
[00:49:36 CET] <tmm1> for a callsite like FFABS(c-e), it would do ((c-e) >= 0 ? (c-e) : (-(c-e))), so c-e is performed twice
[00:50:43 CET] <nevcairiel> just need to be careful not to put anything with side-effects in there
[00:50:49 CET] <nevcairiel> but otherwise trust the compiler
[00:51:10 CET] <iive> i can't find it at the moment, but if you have (i++) isn't it possible that it evaluates it 3 times?
[00:52:03 CET] <tmm1> yea i believe that's possible, hence the side-effects warning
[00:52:34 CET] <tmm1> it does look like the compiler produces smart output w/ -O2
[00:52:48 CET] <tmm1> so much for that alleged easy win..
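
Editor's note: a minimal sketch of the double evaluation discussed above. FFABS is copied from the log; abs_once, next_value, calls and the two counting functions are hypothetical names introduced for illustration and are not FFmpeg code. When the argument has a side effect, the macro runs it twice (once for the test, once for the selected arm), while an inline function runs it once. For a side-effect-free argument such as c-e, an optimizing build is generally expected to reuse the value, which matches what tmm1 saw with -O2.

```c
#include <assert.h>

/* Macro as quoted in the log: the argument text appears twice per path. */
#define FFABS(a) ((a) >= 0 ? (a) : (-(a)))

/* Hypothetical alternative: a function evaluates its argument once. */
static inline int abs_once(int a)
{
    return a >= 0 ? a : -a;
}

/* Hypothetical argument with a visible side effect (a call counter). */
static int calls;

static int next_value(void)
{
    calls++;
    return -5;
}

/* Returns how many times the argument ran when passed through the macro. */
static int macro_evaluations(void)
{
    calls = 0;
    int r = FFABS(next_value());   /* test runs it once, chosen arm runs it again */
    assert(r == 5);
    return calls;
}

/* Returns how many times the argument ran when passed to the function. */
static int function_evaluations(void)
{
    calls = 0;
    int r = abs_once(next_value());
    assert(r == 5);
    return calls;
}
```

Calling macro_evaluations() and function_evaluations() from a small test program shows the difference in call counts; the returned absolute value is the same in both cases.
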
[00:53:46 CET] <iive> that's kind of strange... i've seen compilers optimize common algebraic calculations even when explicitly written separately...
[00:53:50 CET] <J_Darnley> I doubt that you can make the C much faster.
[00:54:17 CET] <iive> you might try using built-in
[00:54:35 CET] <J_Darnley> Does arm gain anything if you operate on 16-bit ints rather than 32-bit?
[00:55:07 CET] <iive> does it even have 16 bit registers?
[00:57:16 CET] <tmm1> not sure
[00:58:01 CET] <tmm1> i think i might try using neon compiler intrinsics as a first pass rather than rewriting the whole thing in asm
[00:58:16 CET] <tmm1> trying to figure out which parts of the algorithm can actually be optimized with SIMD though
[00:59:09 CET] <J_Darnley> One thing you should definitely doublecheck is that the if(is_not_edge) branch is made constant.
[00:59:46 CET] <J_Darnley> At one time that was not constant (thanks Libav) and killed performance.
[01:00:56 CET] <iive> tmm1: you want to port yadif to arm?
[01:01:00 CET] <tmm1> should evaluate to if(1) from what i can tell
[01:01:01 CET] <J_Darnley> again, the compiler should be smart enough
[01:01:13 CET] <tmm1> yea seems like an obvious optimization for the compiler
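
Editor's note: a rough sketch of what "the branch is made constant" can look like in C. This is not the yadif code; filter_row, filter_row_interior, filter_row_edge and the simplified diagonal check are assumptions made up for illustration. The idea is that a shared inline body takes is_not_edge as a parameter, and each wrapper passes a literal 0 or 1, so after inlining the compiler can typically drop the test from the inner loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>   /* abs */

/* Hypothetical shared body. is_not_edge is meant to be a literal at every
 * call site so the branch can be resolved at compile time after inlining. */
static inline void filter_row(uint8_t *dst,
                              const uint8_t *above, const uint8_t *below,
                              int start, int end, int is_not_edge)
{
    assert(start <= end);
    for (int x = start; x < end; x++) {
        /* Default prediction: average of the pixels above and below. */
        int pred = (above[x] + below[x]) >> 1;

        if (is_not_edge) {
            /* Interior only: the diagonals read x-1 and x+1. */
            int best = abs(above[x] - below[x]);
            int s1   = abs(above[x - 1] - below[x + 1]);
            if (s1 < best) {
                best = s1;
                pred = (above[x - 1] + below[x + 1]) >> 1;
            }
            int s2 = abs(above[x + 1] - below[x - 1]);
            if (s2 < best)
                pred = (above[x + 1] + below[x - 1]) >> 1;
        }
        dst[x] = (uint8_t)pred;
    }
}

/* Interior pixels: the caller must keep x-1 and x+1 inside the row. */
static void filter_row_interior(uint8_t *dst, const uint8_t *above,
                                const uint8_t *below, int start, int end)
{
    filter_row(dst, above, below, start, end, 1);
}

/* Border pixels: no neighbour reads, plain vertical average. */
static void filter_row_edge(uint8_t *dst, const uint8_t *above,
                            const uint8_t *below, int start, int end)
{
    filter_row(dst, above, below, start, end, 0);
}
```

Whether the test really disappears depends on the compiler and flags, so checking the generated assembly is still the reliable way to confirm it, as suggested in the discussion.
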
[01:01:37 CET] <tmm1> iive: yea, trying to improve performance on aarch64
[01:03:02 CET] <iive> nice :)
[01:14:47 CET] <J_Darnley> If you want to try simd then you'll probably have to do the whole filter_line function
[01:15:19 CET] <J_Darnley> but I might suggest that you start with the code in the CHECK macros
[01:16:07 CET] <J_Darnley> That's where most of the x86 code is
[01:17:47 CET] <tmm1> cool, i was wondering what the mmx/sse optimized versions did differently
[01:17:54 CET] <J_Darnley> Nothing
[01:18:02 CET] <J_Darnley> They produce identical output
[01:18:49 CET] <J_Darnley> Are you looking for the files containing the x86 code?
[01:20:05 CET] <tmm1> i meant, what does vf_yadif.asm's filter_line do differently than the C version
[01:20:22 CET] <J_Darnley> (they are libavfilter/x86/{vf_yadif.asm,yadif-10.asm,yadif-16.asm})
[01:21:08 CET] <tmm1> yep i see them, my assembly is just really rusty
[01:26:36 CET] <iive> good luck polishing it :)
[01:51:53 CET] <tmm1> getting there, slowly
[01:54:40 CET] <tmm1> the x86 version is using the mmx registers and doing parallel loads/ops, obviously
[03:29:49 CET] <J_Darnley> it uses mm-regs for the mmx version and xmm-regs for the sse2 version
[03:30:18 CET] <J_Darnley> Well, that doesn't really matter.
[03:31:21 CET] <J_Darnley> (plus he left)
[06:02:24 CET] <tmm1> i saw the messages i missed in the channel log
[06:02:42 CET] <tmm1> can the spatial checks be skipped entirely when in nospatial mode?
[07:33:47 CET] <prelude2004c> hey anyone around?
[14:44:12 CET] <J_Darnley> Does the fate website list the machine triplet thing (as in gcc -dumpmachine) anywhere?
[14:44:53 CET] <J_Darnley> oh wait, nevermind
[18:26:45 CET] <ubitux> michaelni: sorry to ask you again, but can you upload http://b.pkh.me/empty-events-2167.srt ? (MD5=6b9dd871e29faf5104be85b1c494bc47)
[18:27:00 CET] <ubitux> (same directory, fate-samples/sub)
[18:44:15 CET] <michaelni> ubitux, uploaded
[18:44:55 CET] <ubitux> thanks
[18:49:57 CET] <metRo_> Hi, I need to compile ffmpeg, should I explain my problem here or in the ffmpeg channel?
[18:51:16 CET] <Daemon404> #ffmpeg
[18:52:46 CET] <metRo_> ok
[18:56:30 CET] <metRo_> since no one answered me in #ffmpeg, if anyone knows about pkg-config, can you help me there?
[19:18:04 CET] <J_Darnley> What's wrong with that thing now?
[19:21:39 CET] <Compn> ffmpeg's arch nemesis, pkg-config!
[19:22:03 CET] <JEEB> dunno if it's pkg-config's fault if your pc files contain paths from another chroot :P
[19:22:13 CET] <JEEB> (I helped him on the correct channel)
[19:22:41 CET] <J_Darnley> Ah
[19:23:21 CET] <J_Darnley> Well, it can only spit out the config it's given (garbage in; garbage out)
[20:23:17 CET] <cone-066> ffmpeg 03Michael Niedermayer 07master:e9e87822022f: avformat/img2dec: Skip checking the input files existence if it has already been opened
[20:23:18 CET] <cone-066> ffmpeg 03Michael Niedermayer 07master:e70d56b8ad5d: avformat/img2dec: Reuse main IO context instead of reopening a single file
[00:00:00 CET] --- Sun Dec 27 2015

