[Ffmpeg-devel-irc] ffmpeg-devel.log.20180118

burek burek021 at gmail.com
Fri Jan 19 03:05:04 EET 2018


[04:52:05 CET] <Compn> a wild ffm user appears
[09:05:58 CET] <wm4> can someone with access edit the issue tracker in there: https://github.com/FFmpeg/FFmpeg/pull/153
[09:06:16 CET] <wm4> since some people appear to be trying to open PRs to make bug reports
[09:07:18 CET] <wm4> actually looking again I'm not so sure, but should be a good idea anyway
[09:10:08 CET] <mateo`> some comments are pretty hilarious: "show your respect to the community" !
[10:03:42 CET] <KGB> [13FFV1] 15JeromeMartinez opened pull request #101: Add sample_difference description (06master...06sample_difference) 02https://git.io/vNRiS
[10:11:00 CET] <bogdanc> after going through all the cfhd_filters in cfhd.c is the output modified anywhere else?
[10:15:13 CET] <bogdanc> i'm trying to fix the distortion of the last 8 bottom pixels on cineform videos, and i've tried modifying the filter formulas where they affect only the last 8 pixels, but i'm starting to think it gets distorted somewhere else
[11:42:52 CET] <kierank> bogdanc: have you actually confirmed the distortion happens in a player
[11:43:36 CET] <bogdanc> yes
[11:43:56 CET] <bogdanc> i'm using the "Sample with the most obvious distortions uploaded:"
[11:44:06 CET] <bogdanc> and playing it in sony vegas
[11:44:32 CET] <kierank> Huh
[11:44:53 CET] <bogdanc> i mean, it's not really a distortion, it's messed up colors
[11:45:10 CET] <bogdanc> and i tried many things with the current used filters
[11:46:27 CET] <bogdanc> but i just achieved different types of distortions, some closer to how it's supposed to look
[11:47:59 CET] <bogdanc> but even when i don't let the output change those pixels i still get some distortions, so i'm starting to think it's some kind of post-decoding step, but i haven't found it yet
[11:50:17 CET] <kierank> what has sony vegas got to do with playing that file
[11:50:28 CET] <kierank> unless you're making it use ffmpeg
[11:51:35 CET] <bogdanc> it's just easy for me to analyze different outputs
[11:51:59 CET] <bogdanc> i create the output file using ffmpeg in ubuntu, then i place that output in sony vegas
[11:52:01 CET] <bogdanc> https://imgur.com/ZjWLtLc
[11:52:38 CET] <bogdanc> here you can see how the bottom pixels react when i don't apply the vert_filter
[11:52:38 CET] <kierank> that's a very odd way to do things since you're actually recompressing the video
[11:54:11 CET] <bogdanc> well, what do you recommend to use?
[11:57:24 CET] <kierank> personally make ffmpeg output to yuv and use pyuv viewer
[11:57:27 CET] <kierank> but some people use mpv
[11:57:46 CET] <kierank> but anything that can show the video untouched without conversions is the most important
[11:59:19 CET] <bogdanc> how do i make it output yuv?
[11:59:30 CET] <kierank> ./ffmpeg -i file.mov -y out.yuv
[12:00:44 CET] <bogdanc> ok, i'm going to use this from now on
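For reference, a slightly fuller form of that dump command (a sketch only; pinning the pixel format is an assumption, and the value should match whatever the cfhd decoder actually outputs for this sample, e.g. yuv422p10le):

    ./ffmpeg -i file.mov -c:v rawvideo -pix_fmt yuv422p10le -y out.yuv

A raw .yuv file carries no header, so whatever views it (pyuv, or ffplay with -f rawvideo) has to be told the resolution and pixel format by hand.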
[13:00:23 CET] <kierank> Gramner: not ffmpeg related but would you be interested in reviewing this guy's avx2 gather implementation
[13:00:24 CET] <kierank> https://github.com/glenvt18/libdvbcsa/pull/4
[15:51:47 CET] <wm4> does anyone know what AVProgram.flags contains
[15:53:01 CET] <wm4> as far as I can tell it's unused, maybe
[15:54:25 CET] <wm4> yeah commenting it and rebuilding ffmpeg works
[16:27:51 CET] <cone-321> ffmpeg 03Paul B Mahol 07master:8088b5d69c51: avfilter/af_afade: acrossfade: switch to activate
[16:38:31 CET] <jamrial> https://www.ffmpeg.org/doxygen/trunk/structAVStream.html
[16:38:36 CET] <jamrial> apparently doxygen chokes on struct fields that are wrapped in FF_API_ preprocessor checks
[16:39:31 CET] <jamrial> they are missing
[16:44:16 CET] <nevcairiel> That's ok, you shouldn't use them anymore :D
[16:46:16 CET] <jamrial> heh
[17:02:23 CET] <cone-321> ffmpeg 03James Almer 07master:4f6b34f1f803: avformat: small AVFormatContext doxy cosmetics
[17:08:51 CET] <wm4> I'm a fan of not removing documentation for deprecated fields, but those who look at the doxygen output are asking for trouble anyway
[17:12:35 CET] <nevcairiel> yeah i find doxygen mostly un-navigable anyway
[17:12:38 CET] <nevcairiel> rather look at the headers
[17:14:04 CET] <atomnuker> I use it occasionally
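If one did want the FF_API_-guarded fields to show up in the generated docs, the usual doxygen-side workaround is to predefine the guard macros; a sketch, assuming doc/Doxyfile is the configuration in use and FF_API_LAVF_AVCTX is one of the guards involved (both are assumptions, not what was actually done):

    PREDEFINED += FF_API_LAVF_AVCTX=1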
[17:16:16 CET] <wm4> http://ffmpeg.org/pipermail/ffmpeg-devel/2018-January/224164.html
[17:16:22 CET] <wm4> can we penalize CE already?
[17:16:30 CET] <wm4> I'm tired of his uncalled for bullshit
[17:30:14 CET] <RiCON> unless i'm missing some message in there, i can't tell what provoked him now
[17:31:39 CET] <philipl> Looks like a passive aggressive cheap shot
[17:31:45 CET] <RiCON> i'd assume he just mistyped Nicolas, since he's clearly not quoting you
[17:32:57 CET] <kierank> he's saying they should unite against the enemy that is wm4 (yes, wtf)
[17:35:57 CET] <RiCON> i definitely didn't read that much into it, but the "common goal" line is certainly missing some context
[18:23:52 CET] <Madsfoto> Good evening. Is this a good place to ask about extending the Atadenoise filter past 129 frames? If not, where would the appropriate place be?
[18:29:19 CET] <durandal_1707> Madsfoto: why?
[18:31:05 CET] <durandal_1707> 129 size is big enough imho
[18:34:24 CET] <Madsfoto> durandal_1707> For noise reduction I agree with you; however, I average dissimilar images together to create a flowing motion through the images, and that requires a lot of images averaged together. Currently I use ImageMagick to get the ~600-image averages I need, but it's much slower than ffmpeg (even at 129 pictures vs atadenoise)
[18:35:29 CET] <Madsfoto> as an example, (not mine) https://vimeo.com/81126520
[18:36:36 CET] <durandal_1707> Madsfoto: but atadenoise was never designed with that in mind; if you want a general mix, wait for the tmix filter.
[18:38:09 CET] <durandal_1707> main issue is that one needs to keep all pictures in memory for atadenoise
[18:38:45 CET] <durandal_1707> 600 frames of what size takes how much memory?
[18:39:30 CET] <Madsfoto> durandal_1707> 1920x1080 ~3 GB in Imagemagick (got 32 GB so it's not a problem)
[18:39:57 CET] <Madsfoto> also, I was unable to get any information about tmix filter off google :(
[18:42:47 CET] <durandal_1707> Madsfoto: tmix is just a temporal version of the mix filter, not yet in the codebase
[18:43:37 CET] <Madsfoto> That sounds amazing!
[18:43:56 CET] <sfan5> wouldn't you just need to increase `#define FF_BUFQUEUE_SIZE 129` ?
[18:44:23 CET] <sfan5> or is there any other problem other than RAM usage with that
[18:44:36 CET] <Madsfoto> I have no idea, I'm not a programmer
[18:45:28 CET] <durandal_1707> there are other params which are not designed for averaging frames
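A minimal sketch of what sfan5 is suggesting, assuming the override sits at the top of libavfilter/vf_atadenoise.c ahead of the bufferqueue.h include; memory use grows by one full frame per slot, and as durandal_1707 notes the remaining parameters were not tuned for averaging that many frames:

    /* libavfilter/vf_atadenoise.c -- hypothetical bump of the frame queue */
    #define FF_BUFQUEUE_SIZE 600   /* was 129 */
    #include "bufferqueue.h"

The filter's "size" option maximum would presumably need the same bump before values above 129 would be accepted.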
[18:50:57 CET] <Gramner> kierank: without looking into details that's kind of awkward to make fast in avx2. gather is really slow on HSW and all AMD CPUs so you'd only want to use that on SKL. a possibility is to store the entire lut in registers and "gather" from those using shuffles/shifts/blends, but 16 ymm regs might be too few since you'd need half of them just to store the lut.
[18:52:13 CET] <Gramner> with avx512 the latter approach would be quite manageable and with avx512vbmi straight up trivial (with the only minor complication being that no cpu has been released with avx512vbmi yet)
[18:53:23 CET] <kierank> jdarnley: ^
[18:54:34 CET] <atomnuker> Gramner: gathers are still slower than individual loads on skylake
[18:56:29 CET] <atomnuker> I think it was 2900 decicycles vs 900ish for 4 64-bit loads
[18:56:55 CET] <Gramner> yes but you'd need to do fewer loads overall compared to scalar lut[src[i]] so it might still be a win. also the throughput is almost the same as individual loads, it's mainly the latency that's bad, and that can be hidden if you process a sizeable amount of data before you need the results
[18:57:51 CET] <Gramner> oh, and don't use it to load just 4 values
[18:58:14 CET] <Gramner> minimum 8, preferably 16 (with zmm regs)
[18:58:39 CET] <Gramner> since the overhead gets proportionally larger the fewer values you load
[19:00:22 CET] <jdarnley> Okay, gather only really good on skylake
[19:00:29 CET] <jdarnley> *any good
[19:01:06 CET] <jdarnley> Even if you might do a "complicated" bit of math to get the address?
[19:01:06 CET] <Gramner> s/good/not terrible/
[19:04:58 CET] <Gramner> depends on the use case I guess. maybe?
[19:06:33 CET] <atomnuker> Gramner: I didn't load just 4 values, I loaded 8 values but each pair was always going to be next to each other so 4 loads for avx2 was faster
[19:09:40 CET] <Gramner> vpgatherdd ymm is 4 µops on SKL, 34 on HSW, and 66 on ryzen :D
[19:11:51 CET] <atomnuker> damn ryzen
[19:12:45 CET] <Gramner> well yeah, the number of loads is what matters. how many "values" those bits represent is irrelevant as far as memory loads go
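Illustrative only: ffmpeg's SIMD is written as hand-scheduled assembly rather than intrinsics, and the function below is hypothetical; this C sketch just shows the vectorized lut[src[i]] pattern under discussion, where one vpgatherdd replaces eight scalar loads (reasonable on SKL, slow on HSW and AMD as noted above):

    #include <immintrin.h>
    #include <stdint.h>

    /* 8 table lookups per iteration; the scalar tail is omitted for brevity */
    static void lut8_avx2(uint32_t *dst, const uint8_t *src,
                          const uint32_t *lut, int n)
    {
        for (int i = 0; i + 8 <= n; i += 8) {
            __m128i idx8  = _mm_loadl_epi64((const __m128i *)(src + i));        /* 8 byte-sized indices */
            __m256i idx32 = _mm256_cvtepu8_epi32(idx8);                         /* widen to 32-bit lanes */
            __m256i v     = _mm256_i32gather_epi32((const int *)lut, idx32, 4); /* lut[src[i]] for all 8 */
            _mm256_storeu_si256((__m256i *)(dst + i), v);
        }
    }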
[19:12:57 CET] <atomnuker> but ryzens have more units, right? or is it just for math ops?
[19:13:24 CET] <Gramner> ryzen has 4 simd execution units iirc, same as intel
[19:14:04 CET] <atomnuker> I think intel only had 2 256bit units, ryzens had 4 128bit
[19:16:00 CET] <Gramner> err, I meant to write 3. but looking it up it seems amd indeed has 4 although the 4th seems more limited
[19:18:03 CET] <Gramner> logical ops are the only ones that can run on all 4
[19:18:37 CET] <Gramner> intel has 3 256-bit ones. only one of them can do shuffles though. on skl-x there's one 512-bit and two 256-bit
[19:19:14 CET] <atomnuker> you can still run 256bit and 128bit ops on the 2 512bit ones, right?
[19:19:22 CET] <Gramner> yes
[19:19:24 CET] <Gramner> it's p5
[19:19:27 CET] <Gramner> the shuffle one
[19:19:53 CET] <Gramner> although it can only do shuffles on <512-bit
[19:20:03 CET] <Gramner> and logical
[19:20:10 CET] <Gramner> so and/or/xor
[19:20:21 CET] <atomnuker> wow, it's economical to unroll loops by 4 for that monster
[19:20:28 CET] <Gramner> I'm assuming the rest is power gated
[19:21:38 CET] <Gramner> with out-of-order execution you don't really need to unroll things by the exact count since you'll probably end up having an OOE window that covers 10 loop iterations anyway
[19:23:17 CET] <Gramner> skylake has an OOE window of 224 instructions for example
[19:23:55 CET] <Gramner> and can execute things wildly out of order within that window to try to maximize utilization of execution units
[19:24:40 CET] <atomnuker> crazy
[19:25:09 CET] <Gramner> modern cpu architectures are sorcery and dark magic
[19:25:24 CET] <Gramner> it's surprising they work as well as they do
[19:26:50 CET] <atomnuker> what's stopping them from putting 16 or 32 arithmetic units? area?
[19:27:04 CET] <wm4> CPU: "throw shit at the wall and see what sticks"
[19:27:26 CET] <Gramner> execution units are rarely the bottleneck so it wouldn't really help
[19:27:40 CET] <Gramner> we're mostly held back by memory really
[19:28:13 CET] <Gramner> you need things in L1 to get good utilization of execution units but L1 is tiny
[19:29:41 CET] <Gramner> cpu performance has scaled way more than memory performance over time
[19:29:44 CET] <atomnuker> 32k is pretty much okay, so I guess RAM->L1 is the bottleneck
[19:30:47 CET] <atomnuker> and you can't improve that much because relativity
[19:31:16 CET] <atomnuker> von neumann-less cpus when?
[19:41:33 CET] <atomnuker> this is ridiculous, why is realplayer still alive as well as the company behind it? I thought they went bankrupt ages ago
[19:42:27 CET] <atomnuker> how do they even get permission to use patents in their totally from scratch codecs?
[19:43:47 CET] <durandal_1707> for your amusement
[19:49:15 CET] <wm4> why does this rv thing require ffmpeg.c changes
[19:50:03 CET] <wm4> "RealMedia® HD is an awesome video codec!"
[19:50:08 CET] <wm4> good argument
[19:50:48 CET] <wm4> (that's the first thing you see on the website)
[19:52:18 CET] <wm4> the download is a .run file (probably some self extracting contraption) so my superficial curiosity ends here
[19:54:18 CET] <durandal_1707> wm4: what? it has a linux executable?
[19:55:47 CET] <BtbN> the ffmpeg.c changes seem weird. Why does the codec need special treatment like that?
[19:56:34 CET] <wm4> durandal_1707: probably... and a library at least (I was curious what they have in their headers)
[19:56:50 CET] <atomnuker> I don't think so, there are just libraries in that archive
[19:56:58 CET] <atomnuker> you can use --target to just extract it
[19:57:44 CET] <atomnuker> the libraries are however not stripped, there's some data as strings left
[19:58:17 CET] <wm4> well the patch includes a <librv11_sdk.h>
[19:58:29 CET] <wm4> so that file must be somewhere
[19:59:21 CET] <durandal_1707> we do not need this library bloat, tell him to foff
[20:00:10 CET] <durandal_1707> i do not accept anything from real, even gifts
[20:00:55 CET] <atomnuker> I'd accept a well written decoder which uses whatever it can from the hevc decoder, as a gift
[20:01:59 CET] <wm4> urgh, more entangled mpegvideo style mess?
[20:02:41 CET] <atomnuker> no, just the c/asm code, not a shared struct
[20:06:26 CET] <durandal_1707> don't accept it, and it will slowly die like the rest of real
[20:07:55 CET] <RiCON> librv11dec_decoder_deps="librv11dec"
[20:08:14 CET] <RiCON> the configure changes are really weird
[21:40:47 CET] <KGB> [13FFV1] 15michaelni pushed 1 new commit to 06master: 02https://git.io/vNEeU
[21:40:47 CET] <KGB> 13FFV1/06master 146d0323e 15Jérôme Martinez: Add sample_difference description.
[23:02:09 CET] <jamrial> durandal_1707: 8088b5d69c broke the acrossfade test
[23:03:02 CET] <durandal_1707> jamrial: there was a test?
[23:03:17 CET] <jamrial> durandal_1707: fate-filter-acrossfade
[23:03:30 CET] <durandal_1707> what broke?
[23:04:01 CET] <jamrial> what used to be one frame is now two
[23:04:16 CET] <jamrial> http://fate.ffmpeg.org/report.cgi?slot=x86_64-archlinux-gcc-threads-2&time=20180118181730
[23:04:35 CET] <durandal_1707> ahh, just update test
[23:13:25 CET] <cone-790> ffmpeg 03James Almer 07master:fb3fd4d50623: fate: update filter-acrossfade test reference file
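For the record, FATE reference files like that one are normally regenerated rather than edited by hand; a sketch, assuming the usual GEN=1 mechanism applies to this test:

    make fate-filter-acrossfade GEN=1

which reruns the test and rewrites the reference under tests/ref/fate/ with the new two-frame output.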
[23:15:14 CET] <durandal_1707> and i really dislike testing of float / double sample format filters
[23:15:54 CET] <durandal_1707> this one should just test the final hash
[00:00:00 CET] --- Fri Jan 19 2018

