[Ffmpeg-devel-irc] ffmpeg-devel.log.20170730
burek
burek021 at gmail.com
Mon Jul 31 03:05:04 EEST 2017
[00:00:13 CEST] <iive> aha,
[01:22:38 CEST] <cone-843> ffmpeg 03Jun Zhao 07master:1e0c75ea165c: examples/hw_decode: Add a HWAccel decoding example.
[02:32:29 CEST] <atomnuker> jamrial: fixed everything and added sse3 version
[02:32:40 CEST] <atomnuker> one problem: changing the alignment to 32 segfaults
[02:32:59 CEST] <atomnuker> any ideas why that could happen?
[02:33:59 CEST] <jamrial> atomnuker: put sign_adjust_5 at the end
[02:36:44 CEST] <atomnuker> jamrial: didn't work
[02:38:04 CEST] <jamrial> it should. if the first constant is 16 bytes, whatever comes next will not be 32 byte aligned
[02:38:46 CEST] <jamrial> if you put sign_adjust_5 after the three new 32 byte constants, it should be good
[02:38:55 CEST] <atomnuker> oh, misread _5 as _r
[02:42:54 CEST] <iive> atomnuker: you need a new font for irc
[02:48:54 CEST] <atomnuker> jamrial: sent a v2 of the patch to your first reply
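The alignment point jamrial makes above can be checked with a little arithmetic. The following is only a sketch: it assumes the constants are laid out back to back in one section whose start is 32-byte aligned, and the function names are hypothetical (nothing here comes from the actual patch). It shows that a leading 16-byte constant pushes every following 32-byte constant off a 32-byte boundary, while putting the 16-byte constant last keeps them aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Lay out constants back to back, as an assembler does inside one section,
 * and record each constant's byte offset from the section start.
 * Assumption: the section start itself is 32-byte aligned. */
static void layout_offsets(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        offsets[i] = pos;
        pos += sizes[i];
    }
}

/* Returns 1 if every constant of at least `align` bytes starts on an
 * `align`-byte boundary, 0 otherwise. Handles up to 16 constants. */
static int wide_constants_aligned(const size_t *sizes, size_t n, size_t align)
{
    size_t offsets[16];

    assert(n <= 16);
    assert(align && (align & (align - 1)) == 0); /* must be a power of two */

    layout_offsets(sizes, n, offsets);
    for (size_t i = 0; i < n; i++)
        if (sizes[i] >= align && offsets[i] % align)
            return 0;
    return 1;
}
```

With sizes {16, 32, 32, 32} the 32-byte constants land at offsets 16, 48 and 80, so an aligned 32-byte load on any of them would fault. With {32, 32, 32, 16} they land at 0, 32 and 64.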
[03:44:47 CEST] <atomnuker> jamrial: what CPUs does HAVE_AVX2_FAST filter out which have avx2?
[04:00:30 CEST] <jamrial> atomnuker: excavator
[04:01:29 CEST] <atomnuker> it's slower than avx there?
[04:02:40 CEST] <jamrial> anything using ymm regs is slow in bulldozer based cpus
[04:02:54 CEST] <jamrial> of those, excavator is the only one with avx2
[04:03:52 CEST] <atomnuker> what were amd thinking even flagging avx2 support
[04:07:04 CEST] <jamrial> a checkmark in the feature list? then there's broadcast instructions, variable bit shift instructions (even though xop had its own kind as well), vpblendd, all working with xmm regs
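The filtering discussed above can be sketched as a flag test: take the AVX2 path only when the CPU reports AVX2 and is not marked as slow with ymm registers. The flag names and values below are hypothetical stand-ins for illustration, not FFmpeg's actual definitions, and the real check may involve further conditions.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not FFmpeg's actual values. */
enum {
    CPU_AVX     = 1 << 0,
    CPU_AVX2    = 1 << 1,
    CPU_AVXSLOW = 1 << 2, /* ymm operations are slow (Bulldozer family) */
    CPU_KNOWN   = CPU_AVX | CPU_AVX2 | CPU_AVXSLOW
};

/* Returns 1 when the AVX2 code path is worth selecting: the CPU has AVX2
 * and is not flagged as slow with ymm registers. Returns 0 otherwise. */
static int use_avx2_fast(int flags)
{
    assert((flags & ~CPU_KNOWN) == 0); /* reject bits this sketch ignores */
    return (flags & CPU_AVX2) && !(flags & CPU_AVXSLOW);
}
```

Under this model an Excavator-like CPU (AVX2 present, ymm slow) is rejected and falls back to a narrower version, which matches what jamrial describes.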
[04:17:28 CEST] <atomnuker> I'll need to put a big warning on the mdct function which says it underreads and overreads the input
[04:18:00 CEST] <atomnuker> aac and opus have the input in a struct with stuff before and after it so it won't segfault
[04:18:40 CEST] <atomnuker> it's dirty as hell but damn non-power-of-two transforms
[04:20:37 CEST] <atomnuker> len8 is mod 4 but not mod 8
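Why a length that is a multiple of 4 but not of 8 leads to overreads can be sketched as follows. This models only a simple forward loop that handles a fixed number of elements per iteration, not the actual postrotation loop, which works from both ends of the buffer. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements past the end of a `len`-element buffer that a loop
 * handling `step` elements per iteration touches on its last iteration. */
static size_t tail_overrun(size_t len, size_t step)
{
    assert(step > 0);
    return (step - len % step) % step;
}
```

For example, a length of 12 processed 8 elements at a time runs 4 elements past the end, while a length of 16 does not overrun at all.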
[04:40:58 CEST] <atomnuker> jamrial: the SSE3 version can be made to run on 32 bit machines too, since I'm using 8 gprs and 8 xmm regs
[04:41:05 CEST] <atomnuker> https://pars.ee/temp/0001-mdct15-add-inverse-transform-postrotation-SIMD.patch
[04:41:25 CEST] <atomnuker> do you see any way to save a single gpr somewhere somehow?
[04:51:37 CEST] <jamrial> atomnuker: you could move len8 back to stack (mov len8m, len8q) before the loop, load the second offset argument, use len8m in cmp, then move it back to a reg right after the loop
[04:52:26 CEST] <jamrial> no wait, those offsets are not arguments
[04:55:14 CEST] <jamrial> also, you can avoid having two cglobal lines by doing 5, 8, 8 + cpuflag(avx2) * 4
[04:58:41 CEST] <jamrial> atomnuker: maybe just push out/exp/len8 before LUT_LOAD_4D, reuse the reg instead of r7q, then pop it back
[04:59:15 CEST] <jamrial> you probably need to use PUSH and POP (x86inc magic)
[05:01:28 CEST] <atomnuker> huh, that was easy
[05:09:34 CEST] <atomnuker> it does slow things down by around 500 decicycles though
[05:15:44 CEST] <atomnuker> I found a better way, I can free up len8q's reg from the loop and push/pop it at the start and end
[05:24:41 CEST] <atomnuker> better yet I can omit len8 after init
[05:25:10 CEST] <atomnuker> turns out that I don't overwrite the end at all since it's mod 8
[05:25:51 CEST] <atomnuker> but because I offset the negative offset by 4 at the start I underwrite the output
[06:31:52 CEST] <cone-610> ffmpeg 03Matt Oliver 07master:a3833bee9482: win32_dlfcn: Support WinRT/UWP.
[06:31:52 CEST] <cone-610> ffmpeg 03Matt Oliver 07master:6cc677c0e828: lavf/os_support: Use existing WinRT config value.
[06:31:52 CEST] <cone-610> ffmpeg 03Matt Oliver 07master:b0c61209cd30: lavc/makefile: Add missing file dependencies.
[08:39:38 CEST] <cone-610> ffmpeg 03Rostislav Pehlivanov 07master:70eb77b34e9f: mdct15: add inverse transform postrotation SIMD
[08:40:27 CEST] <atomnuker> much cleaner, no crazy alignment requirements, no overreading/underreading, 32 bit sse3 version, 9%!!
[08:42:31 CEST] <atomnuker> crazy mdcts: 0, me: 2
[08:43:45 CEST] <atomnuker> only thing not SIMDd is the prereindexing, and the gains would be much lower there and having a forward version would be impossible
[10:27:15 CEST] <ubitux> atomnuker: i think you need to declare temp earlier in OVERALLOC()
[10:31:48 CEST] <atomnuker> I didn't need to have any OVERALLOC at all
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:1daacba91f7c: Revert "Revert "lavfi/buffersrc: push the frame deeper if requested.""
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:cffea1b4837b: lavfi: copy framesync into framesync2.
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:873306f265de: lavfi/framesync2: rename all conflicting symbols.
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:b77f041dff5f: lavfi: make FFERROR_NOT_READY available to filters.
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:ed1c884b9e0d: lavfi: add outlink helper functions.
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:4e0e9ce2dc67: lavfi/framesync2: implement "activate" design.
[12:37:22 CEST] <cone-563> ffmpeg 03Nicolas George 07master:0dd8320e16bc: lavfi/vf_stack: move to "activate" design.
[12:37:23 CEST] <cone-563> ffmpeg 03Nicolas George 07master:d07e25de763e: lavfi/vf_threshold: move to "activate" design.
[12:37:24 CEST] <cone-563> ffmpeg 03Nicolas George 07master:dbf7a670942d: lavfi/vf_remap: move to "activate" design.
[12:37:25 CEST] <cone-563> ffmpeg 03Nicolas George 07master:b894415a703e: lavfi/vf_premultiply: move to "activate" design.
[12:37:26 CEST] <cone-563> ffmpeg 03Nicolas George 07master:620608467f26: lavfi/vf_midequalizer: move to "activate" design.
[12:37:27 CEST] <cone-563> ffmpeg 03Nicolas George 07master:a5e3b0c1934f: lavfi/vf_maskedmerge: move to "activate" design.
[12:37:28 CEST] <cone-563> ffmpeg 03Nicolas George 07master:0bc331bd57dc: lavfi/vf_mergeplanes: move to "activate" design.
[12:37:29 CEST] <cone-563> ffmpeg 03Nicolas George 07master:27d8af03ae0d: lavfi/vf_maskedclamp: move to "activate" design.
[12:37:30 CEST] <cone-563> ffmpeg 03Nicolas George 07master:dbc4af862e74: lavfi/vf_lut2: move to "activate" design.
[12:37:31 CEST] <cone-563> ffmpeg 03Nicolas George 07master:5dbb111900b6: lavfi/vf_hysteresis: move to "activate" design.
[12:37:32 CEST] <cone-563> ffmpeg 03Nicolas George 07master:8b2cd8e0e412: lavfi/vf_displace: move to "activate" design.
[14:28:33 CEST] <cone-563> ffmpeg 03Marton Balint 07master:e433497160bd: avdevice/decklink_dec: set field order via codecpar
[16:09:06 CEST] <cone-563> ffmpeg 03Clément Bœsch 07master:ca23d3491d4c: sws/tests/pixdesc_query: save every pix fmts in a list
[16:09:07 CEST] <cone-563> ffmpeg 03Clément Bœsch 07master:d2c70fc88790: sws/tests/pixdesc_query: sort pixel formats
[16:09:08 CEST] <cone-563> ffmpeg 03Clément Bœsch 07master:4158fba3cdb7: sws/tests/pixdesc_query: replace rgb based pix fmts with endianess agnostic names
[16:10:26 CEST] <cone-563> ffmpeg 03Michael Niedermayer 07release/3.2:66395ac32bfb: Update for 3.2.7
[17:08:40 CEST] <cone-563> ffmpeg 03Clément Bœsch 07n3.2.7:HEAD: sws/tests/pixdesc_query: replace rgb based pix fmts with endianess agnostic names
[19:32:12 CEST] <rindolf> Hi all! In http://ffmpeg.org/download.html , it reads: « 3.2.7 was released on 2017-07-30. It is the latest stable FFmpeg release from the 3.2.7 release branch, which was cut from master on 2016-10-26. » the second "3.2.7" should be "3.2"
[19:33:33 CEST] <rindolf> also, the /topic reads "3.3.2" but there is already "3.3.3"
[19:47:39 CEST] <ubitux> durandal_1707: do you have a use case of the filter?
[19:48:24 CEST] <durandal_1707> it's a gimp replacement
[19:49:05 CEST] <ubitux> rindolf: topic fixed, web may need a patch
[19:49:41 CEST] <rindolf> ubitux: thanks
[19:50:05 CEST] <rindolf> ubitux: can i send a pull-request?
[19:50:14 CEST] <ubitux> better send a patch to the ml
[19:50:23 CEST] <rindolf> ubitux: ok
[19:50:36 CEST] <rindolf> ubitux: where are the sources?
[19:51:18 CEST] <ubitux> https://git.ffmpeg.org/ffmpeg-web
[19:52:11 CEST] <rindolf> ubitux: thanks
[19:52:34 CEST] <ubitux> durandal_1707: oh, it's the bucket thing?
[19:55:27 CEST] <durandal_1707> ubitux: yeah
[19:57:25 CEST] <ubitux> you need a threshold, an (optionally negative) border overlap and alpha fading in the overlap ;)
[20:06:40 CEST] <rindolf> ubitux: sent, thanks
[20:11:26 CEST] <durandal_170> ubitux: what?
[20:46:21 CEST] <ubitux> jamrial: you sent that mail in private
[20:46:29 CEST] <ubitux> durandal_170: i'm just enumerating features
[20:46:40 CEST] <ubitux> jamrial: that mail htmlsub
[20:46:42 CEST] <jamrial> ubitux: ugh, sorry
[20:46:49 CEST] <ubitux> no worry :)
[20:47:00 CEST] <jamrial> i thought i clicked reply list. stupid thunderbird
[20:50:00 CEST] <cone-563> ffmpeg 03Clément Bœsch 07master:797c232ef84d: sws/tests/pixdesc_query: fix use of free() instead of av_free()
[00:00:00 CEST] --- Mon Jul 31 2017