[Ffmpeg-devel-irc] ffmpeg-devel.log.20190304

Tue Mar 5 03:05:05 EET 2019

[13:04:50 CET] <cone-697> ffmpeg 03Martin Vignali 07master:9cb576fc1e3d: fate/qtrle : change 32b test to output bgra instead of rgb24
[13:04:51 CET] <cone-697> ffmpeg 03Martin Vignali 07master:5496a734882c: avcodec/qtrle : avoid swap in 32bpp decoding on little endian
[13:04:52 CET] <cone-697> ffmpeg 03Martin Vignali 07master:3278ea67c8f2: avcodec/qtrle : 32bpp dec copy two raw argb value at the same time
[13:04:53 CET] <cone-697> ffmpeg 03Martin Vignali 07master:88d0be1c0eea: avcodec/qtrle : improve 24bbp decoding speed
[15:29:53 CET] <durandal_1707> the QMF thing makes ac4 really slow, from 200x to 50x for stereo
[15:31:28 CET] <durandal_1707> there is no AVX for scalarproduct_float ..
[15:33:59 CET] <j-b> QMF?
[15:37:52 CET] <durandal_1707> j-b: Quadrature Mirror Filter, ac-4 version of aac's SBR
[15:39:00 CET] <JEEB> who here updates the site?
[15:39:01 CET] <JEEB> <@DEATH> https://ffmpeg.org/download.html says 4.1.1, why am I getting a 4.1 tarball?
[15:39:08 CET] <atomnuker> moral: think before you put easy to implement in hardware unsimdable filters in your codecs
[15:39:50 CET] <atomnuker> and really they should have done sbr in the frequency domain before invtx
[15:41:41 CET] <durandal_1707> JEEB: if one clicks on more releases, there is 4.1.1
[15:42:13 CET] <cone-697> ffmpeg 03Guo, Yejun 07master:402bf262375d: configure: add missing pthreads extralibs dependency for libvpx-vp9
[15:42:14 CET] <cone-697> ffmpeg 03Guo, Yejun 07master:d9b2668766e3: configure: use vpx_codec_vp8_dx/cx for libvpx-vp8 checking
[15:42:28 CET] <durandal_1707> atomnuker: this is just vanilla qmf analysis + qmf synthesis (actual decoding may be even slower)
[15:43:05 CET] <JEEB> durandal_1707: yea but the big download button seems to lead to 4.1 still
[15:43:11 CET] <JEEB> which might not be what we want
[15:44:40 CET] <atomnuker> you can't even do the subsample-for-analysis trick opus does to save time during decoding
[15:44:47 CET] <j-b> durandal_1707: ok
[16:18:54 CET] <cone-697> ffmpeg 03James Almer 07master:db332832a17c: configure: allow enabling libvpx vp9 modules when vp8 is disabled
[16:47:42 CET] <kierank> durandal_1707: should be trivial to write avx, no?
[16:56:31 CET] <durandal_1707> kierank: dunno
[17:07:14 CET] <nevcairiel> should be relatively easy, just need to take care not to overread since it only guarantees 16-byte data
[17:19:44 CET] <durandal_1707> nevcairiel: that is not an issue, problem is to sum higher values in registers
[17:20:26 CET] <nevcairiel> that should be no different to sse, no?
[17:23:37 CET] <durandal_1707> i get different result, so no
[17:28:12 CET] <jamrial> scalarproduct_float needs 16 byte aligned buffers, and length a multiple of 4, so avx is not possible
[17:28:47 CET] <nevcairiel> anything is possible, just some care being taken
[17:29:20 CET] <durandal_1707> that is just a limitation of the current function as in lavu, which is very stupid, especially requiring buffers aligned to 16
[17:30:14 CET] <atomnuker> yeah, I remember those alignment limitations were annoying for opus
[17:30:43 CET] <atomnuker> somehow I managed to use them even though nothing looked aligned at first
[17:58:47 CET] <jamrial> most of these were written years ago before avx was a thing
[17:59:26 CET] <jamrial> changing the alignment requirement isn't a problem in most codecs, but the lax elem number constraint probably is
[18:03:34 CET] <durandal_1707> jamrial: length is 64 so it is multiple of 4 last time i checked
[18:04:20 CET] <jamrial> in your ac4 code, yes. but that's not the only decoder using this
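The constraint discussed above (length only guaranteed to be a multiple of 4) means a loop that consumes 8 floats per iteration cannot assume the length divides evenly. A minimal plain-C sketch of one way to cope: an 8-wide main loop followed by at most one 4-wide tail. The name `dot_8wide_with_tail` is hypothetical, the lanes are modelled with scalar accumulators rather than real AVX registers, and this is not the libavutil implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not FFmpeg's API: dot product for len % 4 == 0,
 * laid out like an 8-lane SIMD loop with a 4-lane tail. */
static float dot_8wide_with_tail(const float *a, const float *b, size_t len)
{
    float acc[8] = { 0 };   /* one accumulator per modelled lane */
    float sum = 0.0f;
    size_t i = 0;

    assert(len % 4 == 0);   /* the contract stated in the discussion */

    /* Main loop: 8 floats per iteration (one 256-bit register's worth). */
    for (; i + 8 <= len; i += 8)
        for (int l = 0; l < 8; l++)
            acc[l] += a[i + l] * b[i + l];

    /* Tail: since len % 4 == 0, at most one group of 4 floats remains. */
    if (i < len)
        for (int l = 0; l < 4; l++)
            acc[l] += a[i + l] * b[i + l];

    /* Reduce the 8 lane accumulators to a single value. */
    for (int l = 0; l < 8; l++)
        sum += acc[l];
    return sum;
}
```

With this shape, lengths such as 4, 12 or 20 are handled by the tail, while lengths such as 8, 24 or 64 never enter it.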
[18:11:12 CET] <durandal_1707> i will use my own scalar product code
[18:12:19 CET] <j-b> durandal_1707: I would say focus on correctness before speed.
[18:13:09 CET] <kierank> J_Darnley: can you review the v210 patch someone wrote
[18:13:27 CET] <J_Darnley> oh probably
[18:13:33 CET] <J_Darnley> I must have missed it
[18:14:54 CET] <J_Darnley> I found it
[18:27:17 CET] <J_Darnley> Wow.  Is this doing something other than planar output?
[18:27:52 CET] <J_Darnley> Why's it doing so much stuff?
[18:30:35 CET] <J_Darnley> no, it is planar
[18:40:37 CET] <lrusak_> jkqxz: can you help clarify some things for me regarding AVDRMFrameDescriptor? Is it possible to allocate buffers (dumb/gbm) that we can have ffmpeg decode a frame into? I've been looking at hwcontext_drm and am not really sure if this is achievable. Any direction would be great. 
[20:04:32 CET] <durandal_1707> here is my SIMD code for scalarproduct_float: https://pastebin.com/wJjRQSTQ
[20:29:29 CET] <jamrial> that doesn't work with 4 element buffers, or 12 element buffers, or 24, etc
[20:29:42 CET] <jamrial> also, the horizontal add at the end is unnecessarily complex
[20:30:37 CET] <durandal_1707> jamrial: i posted that code to troll you into writing a proper version
[21:48:12 CET] <durandal_1707> jamrial: what would you use for hadd?
[21:51:58 CET] <jamrial> vextractf128 + addps, then do the horizontal add as it's done for sse
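The reduction order suggested above can be sketched as a scalar model in plain C. Here `v[0..3]` stands for the low 128-bit half of a 256-bit accumulator and `v[4..7]` for the high half. The function name `hsum8_model` is hypothetical, and the final two steps assume a common SSE pattern (fold the upper two lanes onto the lower two, then add the remaining pair); the log itself only names `vextractf128` and `addps`.

```c
#include <assert.h>

/* Hypothetical scalar model of an 8-lane horizontal add; no real SIMD here. */
static float hsum8_model(const float v[8])
{
    float lo[4];
    float s0, s1;

    /* Models vextractf128 + addps: add the high 128-bit half onto the low. */
    for (int l = 0; l < 4; l++)
        lo[l] = v[l] + v[l + 4];

    /* Assumed SSE-style step: fold lanes 2,3 onto lanes 0,1. */
    s0 = lo[0] + lo[2];
    s1 = lo[1] + lo[3];

    /* Assumed final step: add the two remaining lanes. */
    return s0 + s1;
}
```

Because float addition is not associative, this pairing can give results that differ in the last bits from a strict left-to-right sum, so comparing a SIMD version against a scalar reference usually needs a small tolerance.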
[00:00:00 CET] --- Tue Mar  5 2019