[Ffmpeg-devel-irc] ffmpeg-devel.log.20170624

Sun Jun 25 03:05:03 EEST 2017

[00:01:40 CEST] <durandal_1707> the overlay filter needs some love: >8 bit support and asm
[00:04:51 CEST] <atomnuker> durandal_1707: funman had some interest in this as well, I think
[00:05:00 CEST] <atomnuker> I can't remember if he wrote something optimized
[00:10:55 CEST] <durandal_1707> atomnuker: wasnt it kierank ?
[00:11:30 CEST] <atomnuker> well, yeah, but it was funman who had to write something
[00:11:58 CEST] <durandal_1707> according to google he already did
[00:16:08 CEST] <jamrial> durandal_1707: did you check the two tests failures i mentioned above?
[00:16:14 CEST] <BBB> atomnuker: derf can be & confusing
[00:16:31 CEST] <jamrial> fate.ffmpeg.org is slowly turning very yellow
[00:16:48 CEST] <atomnuker> BBB: there's still peloverde
[00:17:16 CEST] <atomnuker> (if no one is available I'll do both)
[00:17:22 CEST] <durandal_1707> jamrial: just update checksums
[00:17:26 CEST] <BBB> atomnuker: k
[00:17:40 CEST] <BBB> why is fate so yellow
[00:17:47 CEST] <Compn> durandal_1707 : complex and daunting yes, but you can educate and maybe even trick some people into RE'ing things :)
[00:17:49 CEST] <jamrial> durandal_1707: but why did they change at all? the commit you reverted didn't affect them
[00:18:04 CEST] <jamrial> why does reverting it affect these tests?
[00:18:15 CEST] <jamrial> i don't think updating the checksum without knowing what changed is a good idea
[00:19:05 CEST] <jamrial> BBB: a recent commit broke two tests
[00:19:32 CEST] <durandal_1707> jamrial: the lavfi core is in weird state after removal of recursive code
[00:20:43 CEST] <durandal_1707> Compn: i will just wait for software source code leaks instead
[00:25:16 CEST] <durandal_1707> jamrial: and i failed to contact nicolas
[00:45:52 CEST] <atomnuker> ubitux: are you going to push your asm patches soon?
[00:54:48 CEST] <cone-545> ffmpeg Rostislav Pehlivanov master:e1120b1c5446: mdct15: add assembly optimizations for the 15-point FFT
[00:56:19 CEST] <iive> \o/
[01:04:38 CEST] <jamrial> durandal_1707: if you think it's ok then please update the checksums
[01:06:44 CEST] <durandal_1707> jamrial: the change is that older ones would get progressive flag set and new one doesnt
[01:08:22 CEST] <jamrial> alright
[01:11:50 CEST] <atomnuker> that haddps isn't getting hot at all and I have no explanation for this
[01:12:58 CEST] <iive> modern cpu are out-of-order
[01:13:08 CEST] <atomnuker> yep, that's what I thought, got lucky
[01:13:25 CEST] <iive> if it is hot, it probably is waiting for something above it.
[01:32:19 CEST] <iive> atomnuker: you are on skylake, right? addps has latency=4 invthru=1/2   ;  haddps has latency=6 invthru=2
[01:33:30 CEST] <atomnuker> yep
[01:34:01 CEST] <Gramner> all the hadd instructions are essentially 2 shuffle µops and one ALU µop on intel cpus
[01:36:15 CEST] <iive> Gramner: then why on my cpu doing the shifts and adds manually is faster?!
[01:36:24 CEST] <iive> why intel, why.....
[01:38:37 CEST] <Gramner> it's literally the same speed as doing two shuffles and one add. if you use shifts instead of shuffles that will use p01 instead of p5 which may be faster or slower depending on the port utilization
[01:39:30 CEST] <iive> shuffles... sorry.
[01:41:02 CEST] <Gramner> and they could've made those instructions fast if they wanted to. they just didn't want to spend transistors on it
[01:41:46 CEST] <Gramner> kind of a catch 22. people avoided using them because they were slow. and no point in making them fast because nobody used them
[01:41:50 CEST] <Gramner> I guess?
[01:42:14 CEST] <Gramner> and with EVEX they are gone completely so that's your long-term solution
[01:43:27 CEST] <iive> catch-22 exactly
[02:05:07 CEST] <iive> Gramner: oh, i see. I've been using hadd for horizontal sum, aka sum all elements. So I used 2 hadds this makes 4 shuffles and 2 sums in micro op.
[02:05:44 CEST] <Gramner> yes, that's bad. don't do that
[02:05:45 CEST] <iive> the sum macro used 2 shuffles and 2 sums, so it is faster.
[02:06:08 CEST] <durandal_1707> your own code?
[02:06:39 CEST] <iive> yes
[02:12:39 CEST] <iive> i wrote my own macro. it differs from x86util by that it leaves the sum already broadcasted in all elements.
[02:13:00 CEST] <durandal_1707> for ffmpeg?
[02:13:05 CEST] <rcombs> wait, so it's faster to do horizontal add using shuffles and vertical adds than to use the actual horizontal add instructions?
[02:13:13 CEST] <iive> yes, patch is on the maillist.
[02:13:28 CEST] <iive> rcombs: bingo
[02:13:38 CEST] Action: rcombs boggles
[02:13:44 CEST] <rcombs> is hadd completely useless then
[02:13:49 CEST] <iive> haddps doesn't do a full horizontal add, it does 2 by 2 add
[02:14:31 CEST] <rcombs> yeah, and you do 2 to sum the whole register
[02:15:02 CEST] <iive> yes
[02:16:12 CEST] <Gramner> the hadd instructions horizontally add two registers, not one
[02:16:30 CEST] <Gramner> using them on a single src is kind of useless
[02:17:04 CEST] <Gramner> they're basically only worth using if they happen to do exactly what you want
[02:19:10 CEST] <iive> yep
[02:20:39 CEST] <iive> and I do wonder where intel thought it would be useful...
[02:20:56 CEST] <iive> i mean, somebody must have beened that instruction badly...
[02:21:21 CEST] <iive> needed...
[02:33:10 CEST] <jamrial> iive: there are some places where it's perfect, like aacps's add_squares
[02:34:11 CEST] <jamrial> also aacsbr's autocorrelate
[02:35:00 CEST] <jamrial> but aside from those, as Gramner said, unless the kind of horizontal add it does is exactly the kind you need, you're better off not using it
[02:35:22 CEST] <jamrial> amd's xop has some good and fast single reg horizontal add. integer only, though
[02:39:27 CEST] <iive> so complex number math...
[02:39:41 CEST] <iive> that makes more sense.
[02:39:47 CEST] <iive> n8 ppl.
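The haddps exchange above can be modelled in plain C without intrinsics. The sketch below is a scalar model, not FFmpeg code: `vec4`, `haddps_model`, `hsum_two_hadd` and `hsum_shuffle_add` are hypothetical names, and the µop counts in the comments only repeat what Gramner and iive state in the log.

```c
/* Scalar model of one 4-float SSE register (hypothetical type). */
typedef struct { float f[4]; } vec4;

/* Model of HADDPS a, b: pairwise sums of a land in the low half,
 * pairwise sums of b in the high half.
 * Per Gramner above: 2 shuffle uops + 1 add uop on Intel. */
static vec4 haddps_model(vec4 a, vec4 b)
{
    vec4 r = {{ a.f[0] + a.f[1], a.f[2] + a.f[3],
                b.f[0] + b.f[1], b.f[2] + b.f[3] }};
    return r;
}

/* Full sum of one register with two HADDPS on the same source:
 * 4 shuffles + 2 adds in uops, as iive counts it. */
static float hsum_two_hadd(vec4 v)
{
    vec4 t = haddps_model(v, v);   /* {v0+v1, v2+v3, v0+v1, v2+v3} */
    t = haddps_model(t, t);        /* every lane holds the full sum */
    return t.f[0];
}

/* Full sum with explicit shuffles and vertical adds:
 * 2 shuffles + 2 adds, and the result ends up broadcast to all lanes. */
static vec4 hsum_shuffle_add(vec4 v)
{
    vec4 s = {{ v.f[2], v.f[3], v.f[0], v.f[1] }};        /* swap halves */
    vec4 t = {{ v.f[0] + s.f[0], v.f[1] + s.f[1],
                v.f[2] + s.f[2], v.f[3] + s.f[3] }};      /* vertical add */
    vec4 u = {{ t.f[1], t.f[0], t.f[3], t.f[2] }};        /* swap pairs */
    vec4 r = {{ t.f[0] + u.f[0], t.f[1] + u.f[1],
                t.f[2] + u.f[2], t.f[3] + u.f[3] }};      /* vertical add */
    return r;
}
```

With two different sources, `haddps_model(a, b)` returns four independent pair sums in one step, which matches the re*re + im*im pattern jamrial mentions for complex-number code.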
[03:40:09 CEST] <atomnuker> so if avx doesn't have any integer instructions how come lots of our video codec functions are marked as avx?
[03:40:15 CEST] <atomnuker> like the vp9 loopfilter
[03:41:48 CEST] <jamrial> atomnuker: what avx doesn't support is integer instructions with ymm regs
[03:42:01 CEST] <jamrial> that started with avx2 as i said the other day
[03:44:56 CEST] <atomnuker> yes, then what's the point in marking those functions as avx if they don't use ymm regs and they don't use floats at all?
[03:45:49 CEST] <jamrial> non destructive, three operand version of xmm instructions
[03:46:13 CEST] <atomnuker> ah, ok, I knew there had to be a reason
[03:46:27 CEST] <jamrial> in some cases, like with the transpose macros, you save a bunch of movas that way
[04:39:43 CEST] <cone-014> ffmpeg James Almer master:a579dbb4f7de: checkasm: add missing checks to float_dsp's butterflies_float test
[05:32:37 CEST] <cone-014> ffmpeg Reino17 master:078322f33ced: Add support for LibOpenJPEG v2.2/git
[06:18:47 CEST] <jamrial> atomnuker: https://trac.ffmpeg.org/ticket/6484
[06:47:10 CEST] <cone-014> ffmpeg James Almer master:349446e36f17: x86/mdct15: use three operand form for some instructions
[06:54:21 CEST] <cone-014> ffmpeg James Almer master:e5bce8b4ce7b: fate: update checksums for fate-lavf-ffm and fate-lavf-mxf
[07:53:08 CEST] <ubitux> atomnuker: maybe later today i guess
[08:35:01 CEST] <ubitux> wtf is wrong with libopenjpeg?
[08:35:18 CEST] <ubitux> is our version check going to grow after every minor release?
[08:35:41 CEST] <ubitux> how stupid is this
[08:37:20 CEST] <ubitux> they do have a pc file, we should use it
[08:38:08 CEST] <ubitux> current state is braindead
[09:06:23 CEST] <ubitux> michaelni: here is a sample https://0x0.st/Uma.mkv
[09:06:35 CEST] <ubitux> wm4: don't you get a ton of reports for that kind of stuff btw? ^
[09:09:07 CEST] <ubitux> maybe that's just because the 720p stream is broken...
[09:12:28 CEST] <ubitux> the 1080p stream seems to be working out
[09:18:24 CEST] <wm4> ubitux: about what?
[09:18:39 CEST] <ubitux> broken pgs sub
[09:18:46 CEST] <durandal_1707> what codecs use new mdct asm?
[09:19:04 CEST] <ubitux> wm4: but that might be because the file is broken, since the 1080p looks ok
[09:19:08 CEST] <ubitux> still, looks weird
[09:19:40 CEST] <ubitux> the 720p is basically broken any time you have multiple subs
[09:20:06 CEST] <wm4> ubitux: at which point is the sample broken how?
[09:20:22 CEST] <wm4> oh timing?
[09:20:37 CEST] <ubitux> it's missing subtitles
[09:20:47 CEST] <ubitux> compare with the 2nd stream
[09:21:05 CEST] <ubitux> as soon as you have "okonomiyaki" displays, the dialogues don't show up anymore
[09:21:39 CEST] <wm4> is it a libavcodec issue?
[09:22:35 CEST] <ubitux> no idea, didn't really investigate
[09:23:12 CEST] <ubitux> i mean, it happens with ffplay as well, it's not mpv
[09:23:26 CEST] <ubitux> (assuming your question was about ffmpeg vs the world, and not lavc vs lavf)
[10:20:33 CEST] <cone-014> ffmpeg Paul B Mahol master:565dc0e283a8: avfilter/vf_overlay: add auto format mode
[12:23:02 CEST] <michaelni> ubitux, about Uma.mkv, should i upload this to fate ? if so which directory ? or did you mean something else ?
[12:23:31 CEST] <ubitux> it's probably a bit large, it could be reduced
[12:23:41 CEST] <ubitux> i have no plan wrt that sample
[12:24:00 CEST] <ubitux> it's just the broken pgs sub i was referring to, and you seemed "interested" :)
[13:35:01 CEST] <atomnuker> durandal_1707: aac and opus
[13:35:10 CEST] <atomnuker> ubitux: so should I revert?
[13:35:49 CEST] <durandal_1707> revert what?
[13:36:05 CEST] <atomnuker> if the error happens only with windows and msvc couldn't we tweak the configure flag?
[13:36:27 CEST] <atomnuker> durandal_1707: that patch which did strip -wN @ -> strip -x to strip assembly
[13:53:48 CEST] <cone-014> ffmpeg Ronald S. Bultje master:97f7f831691f: vf_spp: only assign function pointers if permutation matches expectations.
[13:56:12 CEST] <jkqxz> michaelni:  What phrase would you prefer to describe an empty packet?  Other documentation doesn't suggest a common answer.
[14:12:45 CEST] <michaelni> jkqxz, i am not sure, but something that's unambiguous about what empty means, like dts/pts being uninitialized vs NOPTS, data being NULL vs not. only size=0 is really clear from "empty"
[14:14:13 CEST] <michaelni> empty packet as returned by av_init_packet() with data=NULL for example
[14:14:42 CEST] <michaelni> or empty packet, (size=0 other fields dont matter)
[14:15:01 CEST] <durandal_1707> i see why undefined shifts are bad but fixing them just to silence the error might not always be the correct solution
[14:16:44 CEST] <michaelni> durandal_1707, yes of course, is there a commit where you know/have a better solution?
[14:17:18 CEST] <durandal_1707> michaelni: no, this is quite a complex issue
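The concern about undefined shifts can be illustrated with a small C sketch. Left-shifting a negative signed integer is undefined behaviour in C; the two helpers below (`shl_via_unsigned` and `shl_via_multiply`, both hypothetical names, not taken from any FFmpeg commit) show two common rewrites. Which one is right depends on what the surrounding code intends, and that is an assumption the sketch cannot settle.

```c
#include <assert.h>
#include <stdint.h>

/* Shift in unsigned arithmetic, then convert back.
 * The shift itself is well defined for any v, but an overflowing result
 * silently wraps (the final conversion is implementation-defined and
 * wraps on two's-complement compilers such as GCC and Clang), so a
 * genuine overflow bug is hidden rather than fixed. */
static int32_t shl_via_unsigned(int32_t v, int n)
{
    assert(n >= 0 && n < 32);
    return (int32_t)((uint32_t)v << n);
}

/* Multiply by a power of two instead.
 * Well defined for negative v, but signed overflow remains undefined,
 * so a sanitizer can still report values that are really out of range. */
static int32_t shl_via_multiply(int32_t v, int n)
{
    assert(n >= 0 && n < 31);
    return v * (int32_t)(1 << n);
}
```

For in-range inputs both helpers agree, for example `shl_via_unsigned(-3, 2)` and `shl_via_multiply(-3, 2)` both give -12. They differ only when the result does not fit, which is exactly the case where silencing the warning may not be the correct fix.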
[14:19:07 CEST] <durandal_1707> can we ditch the older prores encoder? it's worse than kostya's one in every aspect
[14:25:51 CEST] <atomnuker> put a patch up on the ML and I'll take a look at it
[14:26:09 CEST] <atomnuker> dericed had some comments about which encoder he preferred
[14:48:44 CEST] <cone-014> ffmpeg Michael Niedermayer master:4976a3411f71: avcodec/mpeg4videodec: Fix GMC with videos of dimension 1
[15:02:15 CEST] <ubitux> atomnuker: do you ask me about the strip thing?
[15:03:16 CEST] <atomnuker> yep
[15:06:33 CEST] <ubitux> atomnuker: why me? i don't remember being involved in it
[15:13:04 CEST] <atomnuker> oh, sorry, by "maybe later today i guess" I thought you meant you'd look at the failure since it was right after jamrial posted it (and I didn't notice it was jamrial)
[15:13:23 CEST] <atomnuker> nvm then
[15:19:17 CEST] <ubitux> 00:45 <@atomnuker> ubitux: are you going to push your asm patches soon?
[15:19:21 CEST] <ubitux> i was replying to this ^
[15:21:27 CEST] <atomnuker> yep, I know
[15:53:56 CEST] <kierank> BBB: is it finally all done?
[15:54:11 CEST] <BBB> kierank: no
[15:54:15 CEST] <kierank> :(
[15:54:18 CEST] <BBB> kierank: still need a final approval for the idct patches
[15:54:36 CEST] <BBB> there's no outstanding issues ATM
[15:54:40 CEST] <BBB> but it isn't approved yet either
[15:54:44 CEST] <BBB> you know how this works
[15:56:16 CEST] <jkqxz> michaelni:  How about "The supplied packet is consumed and will be blank (as if just allocated) when this function returns.", then?
[16:07:01 CEST] <kierank> BBB: maybe just push it and let michaelni complain later
[16:07:49 CEST] <BBB> let's stay nice
[16:11:00 CEST] <durandal_1707> paying debts
[16:11:32 CEST] <michaelni> jkqxz, sounds good to me, thx
[16:12:50 CEST] <michaelni> BBB, which idct patches are left and need a review ?
[16:12:59 CEST] <BBB> 9/11, 10/11 and 11/11
[16:13:35 CEST] <BBB> 9/11 is approved by me
[16:13:37 CEST] <BBB> so that's fine
[16:13:44 CEST] <BBB> [PATCH 10/11] avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions
[16:13:49 CEST] <BBB> that one is outstanding
[16:14:04 CEST] <BBB> and then [PATCH 11/11] avcodec/x86: use new x86-64 functions for -idct simple, which is fairly trivial
[16:14:08 CEST] <BBB> 10/11 is the main outstanding one
[16:14:24 CEST] <BBB> I'm fine with it but you have had concerns in the past so I'm waiting for you to sign off also
[16:15:00 CEST] <BBB> I'll lgtm 11/11
[16:15:08 CEST] <BBB> so we can focus on 10/11
[16:32:34 CEST] <durandal_1707> atomnuker: where is your noisereduce code?
[18:18:30 CEST] <atomnuker> durandal_1707: https://github.com/atomnuker/FFmpeg/tree/noisereduct_filter
[18:43:55 CEST] <cone-014> ffmpeg Mark Thompson master:bc4e33ce0f0e: ffmpeg: Flush output BSFs when encode reaches EOF
[18:43:56 CEST] <cone-014> ffmpeg Mark Thompson master:49419925d333: vp9: Add bsf to fix reordering in raw streams
[18:43:57 CEST] <cone-014> ffmpeg Mark Thompson master:bde04604065d: vaapi_encode: Add VP9 support
[18:43:58 CEST] <cone-014> ffmpeg Mark Thompson master:dc81f1a2cef1: doc: Add VAAPI encoders
[19:13:13 CEST] <cone-014> ffmpeg Marton Balint master:c14fa7a330f6: avformat/aviobuf: fix flushing write buffers after seeking backward or forward
[19:13:14 CEST] <cone-014> ffmpeg Marton Balint master:09891c539162: avformat/aviobuf: add support for specifying minimum packet size and marking flush points
[19:13:15 CEST] <cone-014> ffmpeg Marton Balint master:eeeb595c7f1c: avformat: make flush_packets a tri-state and set it to -1 (auto) by default
[19:13:16 CEST] <cone-014> ffmpeg Marton Balint master:db9e87dd8c1c: avformat/file: increase min/max packet size to 256k for written files
[19:18:04 CEST] <cone-014> ffmpeg Paul B Mahol master:c90b88090c26: avfilter: do not leak AVFrame on failed buffer allocation
[19:18:05 CEST] <cone-014> ffmpeg Paul B Mahol master:f483949188dc: avfilter/af_headphone: do not free frame that's gonna be reused later
[19:26:26 CEST] <cone-014> ffmpeg Paul B Mahol master:c1b43e8452e7: avfilter/vf_overlay: remove rgb option
[20:03:01 CEST] <BBB> Gramner: I think we should do micro-optimizations on the simd separately
[20:03:14 CEST] <BBB> Gramner: it's been near-hell to get this patch-set algorithmically ready for inclusion
[20:03:24 CEST] <BBB> Gramner: I think j_darnley has already given up
[20:07:10 CEST] <Gramner> sure, do as you wish
[20:07:17 CEST] <BtbN> why the hell is Ubuntu still on freetype 2.6.3, even on 17.04
[20:07:29 CEST] <JEEB> huh
[20:07:38 CEST] <JEEB> when did 2.7 release? and now there's 2.8
[20:07:43 CEST] <BtbN> yep
[20:07:47 CEST] <BtbN> that's what I'm thinking
[20:07:56 CEST] <BtbN> 2.7 massively improved the rendering quality
[20:08:21 CEST] <BtbN> https://packages.ubuntu.com/artful/libfreetype6 17.10, still 2.6.3
[20:08:42 CEST] <durandal_170> have pics?
[20:08:49 CEST] <BtbN> kind of
[20:09:20 CEST] <BtbN> https://github.com/OGGM/oggm-sample-data/tree/master/baseline_images everything in 2.0.x is rendered with 2.6.3, everything in alt with 2.8.0
[20:09:26 CEST] <BtbN> it's breaking our test suite...
[20:11:00 CEST] <BtbN> https://github.com/OGGM/oggm-sample-data/commit/a9888f52fc3396ed7a8effa2f48b128acdad07c9 gives you compare-tools
[20:11:40 CEST] <atomnuker> the debian maintainer for freetype was a hardcore ubuntu spy who held it back
[20:11:52 CEST] <atomnuker> though he gave up quite some time ago and now debian has 2.8
[20:12:06 CEST] <BtbN> but why
[20:12:15 CEST] <atomnuker> they have their own freetype patches
[20:12:26 CEST] <BtbN> I don't want them
[20:12:33 CEST] <atomnuker> use debian :)
[20:13:47 CEST] <Gramner> debian experimental has 2.8, stretch and sid have 2.6
[20:14:45 CEST] <BBB> J_Darnley: will you push? kierank <<
[20:15:03 CEST] <kierank> yes i will let him do the hnour
[20:15:16 CEST] <kierank> honour
[20:16:00 CEST] <BBB> woohoo
[20:16:07 CEST] <BtbN> Can't even properly detect it, as there is no useful version information in that package without -dev: https://packages.ubuntu.com/zesty/amd64/libfreetype6/filelist
[21:08:25 CEST] <J_Darnley> BBB: I've just seen the new emails.  Thank you.
[21:09:25 CEST] <J_Darnley> I haven't quite given up.  My frustrations have moved onto network drivers.
[21:10:59 CEST] <J_Darnley> I'll test out Gramner's suggestion
[21:11:17 CEST] <J_Darnley> then I will rebase and I guess push
[21:45:42 CEST] <durandal_170> can wavpack decoding be simdable? i havent found good candidates
[21:49:08 CEST] <RiCON>  durandal_170: they have bsd-licensed asm in libwavpack at least
[21:51:47 CEST] <durandal_170> yea but i dont see how that can be improved much
[21:53:06 CEST] <durandal_170> or does wavpack do all its work in simd in one go?
[22:12:04 CEST] <JEEB> hmm, getting this with the latest master and I wonder if it's valid. not like I'm using ADPCM but it seemed awfully specific http://up-cat.net/p/05599869
[22:19:01 CEST] <cone-014> ffmpeg Michael Niedermayer master:933aa91e31d5: avcodec/hevcdec: check ff_init_cabac_decoder() for failure
[22:19:02 CEST] <cone-014> ffmpeg Michael Niedermayer master:247606768033: avcodec/hevcdec: Use error path if init_get_bits8() fails
[22:19:03 CEST] <cone-014> ffmpeg Jun Zhao master:2b7d9a1f3fa7: lavc/put_bits: Add put_bits64() to support up to 64 bits.
[22:19:04 CEST] <cone-014> ffmpeg Jun Zhao master:e61abe2d7329: lavc/golobm: Add set_ue_golomb_long to support up to 2^32 -2.
[22:19:05 CEST] <cone-014> ffmpeg Jun Zhao master:32deea87c1d6: lavc/tests/golomb: Add unit test for set_ue_golomb_long.
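The 2^32 - 2 limit in the commits above follows from how unsigned Exp-Golomb codes are built: the code for v is the binary form of v + 1, preceded by one zero bit for every bit after its leading one. The sketch below is a generic illustration of that rule, assuming nothing about how FFmpeg implements set_ue_golomb_long; `ue_code` and `ue_golomb_encode` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A codeword: the low `len` bits of `bits`, most significant bit first. */
typedef struct { uint64_t bits; int len; } ue_code;

/* Unsigned Exp-Golomb code for v, valid for v <= 2^32 - 2. */
static ue_code ue_golomb_encode(uint32_t v)
{
    assert(v < UINT32_MAX);            /* keep v + 1 within 32 bits */
    uint64_t x = (uint64_t)v + 1;
    int log2x = 0;
    while (x >> (log2x + 1))           /* log2x = floor(log2(x)) */
        log2x++;
    /* log2x zero bits, then the (log2x + 1)-bit value x */
    ue_code c = { x, 2 * log2x + 1 };
    return c;
}
```

For the largest allowed input, 2^32 - 2, the codeword is 63 bits long, so it no longer fits in a 32-bit write, and a bit writer that accepts up to 64 bits at once is a natural companion.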
[22:33:09 CEST] <cone-014> ffmpeg Paul B Mahol master:10542491113d: avcodec/adpcm_data: use uint16_t to handle all values
[22:52:18 CEST] <cone-014> ffmpeg Paul B Mahol master:5c1f4330d4c3: avfilter/vf_lut2: add support for gray10 and gray12 pixel formats
[23:38:38 CEST] <atomnuker> iive: dude, holy shit this thing is fast
[23:38:44 CEST] <atomnuker> avx2 works as well
[23:39:21 CEST] <durandal_170> how much fast?
[23:40:14 CEST] <atomnuker> 3.52 times on the default 96kbps
[23:40:40 CEST] <durandal_170> than pure C ?
[23:41:33 CEST] <atomnuker> yep
[23:51:55 CEST] <kierank> J_Darnley: can you push?
[23:52:58 CEST] <durandal_170> why? what you get?
[00:00:00 CEST] --- Sun Jun 25 2017