[Ffmpeg-devel-irc] ffmpeg-devel.log.20151004
burek
burek021 at gmail.com
Mon Oct 5 02:05:03 CEST 2015
[00:17:21 CEST] <cone-286> ffmpeg DHE master:76e3f8242d60: libavformat/hlsenc: Use of uninitialized memory unlinking old files
[02:57:13 CEST] <cone-286> ffmpeg Rodger Combs master:a2b8b163004e: lavf: add chromaprint muxer
[03:16:22 CEST] <jamrial> did videolan.org ssl certificate just expire?
[03:17:40 CEST] <atomnuker> >Expires on 04/10/16
[03:17:43 CEST] <atomnuker> doesn't seem so
[03:18:20 CEST] <jamrial> i just got a warning, and the cert says expires on 10/3/2015
[03:27:11 CEST] <DHE> it was issued just over 30 days ago... surprisingly clean number...
[03:34:40 CEST] <jamrial> atomnuker: sorry, i was talking about git.videolan.org
[06:52:40 CEST] <rcombs> jamrial: yup, looks expired to me as well
[06:53:10 CEST] <rcombs> looks like they've replaced the cert but haven't installed it on the git server yet
[06:53:24 CEST] <jamrial> good to know it's not just me, then
[06:54:31 CEST] <jamrial> j-b: ^
[07:00:44 CEST] <rcombs> pointed out in #videolan
[07:00:51 CEST] <rcombs> a look is apparently being taken
[07:27:14 CEST] <rcombs> fixed
[07:33:59 CEST] <jamrial> yep
[10:30:24 CEST] <nevcairiel> mmm my schannel tls module can create a connection and verify the certificate if requested .. reading and writing actual data to/from the socket are overrated, right
[10:57:53 CEST] <rcombs> heh
[10:57:59 CEST] <rcombs> nice :D
[11:09:08 CEST] <nevcairiel> why MS never made a slightly higher level API, i will probably never know
[11:09:26 CEST] <nevcairiel> the worst part is that client and server modes need to use quite different code paths
[11:09:31 CEST] <nevcairiel> but i dont need server mode
[11:09:35 CEST] Action: nevcairiel skips
[11:10:30 CEST] <rcombs> huh, funky given that all the other libs make it pretty similar
[11:10:51 CEST] <nevcairiel> this API is 100 times more low level than any of the others
[11:11:08 CEST] <nevcairiel> here http://pastebin.com/4hJBPirX
[11:11:17 CEST] <nevcairiel> look how much shit establishing a connection needs
[11:11:25 CEST] <rcombs> if you get that working we should add autodetection in configure for a cert file on linux, and turn on verification by default
[11:12:36 CEST] <rcombs> &ew
[11:17:11 CEST] <nevcairiel> I'm also simply ignoring all the ca file options for now
[11:17:15 CEST] <nevcairiel> as well as the listen mode
[11:17:23 CEST] <nevcairiel> finish the one mode I would use myself first =p
[11:18:27 CEST] <rcombs> good plan
[11:19:43 CEST] <nevcairiel> personally i dont care for cert validation, i dont have a way to give feedback to the user, so i would probably always turn it off again
[11:20:09 CEST] <nevcairiel> i should test what MSs URL reader does with an invalid cert
[12:39:16 CEST] <ubitux> rcombs: chromaprint is distributed with a .c
[12:39:19 CEST] <ubitux> .pc*
[12:39:26 CEST] <ubitux> you could have used require_pkg_config
[12:39:36 CEST] <wm4> but cehoyos
[12:39:37 CEST] <ubitux> now it's forever require :(
[12:39:53 CEST] <ubitux> well, now it will be considered a regression :)
[12:40:13 CEST] <wm4> no, it's been in the tree only for a day
[12:52:00 CEST] <nevcairiel> wee it works
[12:52:26 CEST] <nevcairiel> http://pastebin.com/w43hxTaH not long at all, noo
[12:56:25 CEST] <wm4> wonderful
[13:18:03 CEST] <cone-754> ffmpeg Kyle Swanson master:4f721bfd4612: avfilter/ebur128: add dualmono measurement option
[13:18:03 CEST] <cone-754> ffmpeg Clément Bœsch master:513fcd4167d4: avfilter/ebur128: use AV_OPT_TYPE_BOOL for video option
[13:20:47 CEST] <durandal_1707> BBB: I have issues with anaglyph function in stereo3d, can this be done at all?
[13:21:01 CEST] <BBB> let me see
[13:24:32 CEST] <BBB> durandal_1707: yeah that's not so hard, what's the issue with it?
[13:25:06 CEST] <BBB> durandal_1707: I guess horizontal stuff isn't very cool, but it's not super-hard, you basically have to convert the buffers from 8 pixels to 8 r's, 8 g's and 8 b's, and then do stuff with that
[13:34:32 CEST] <cone-754> ffmpeg Ching Yi, Chan master:5d926d547388: avformat/flacdec: support fast-seek
[13:38:58 CEST] <BBB> durandal_1707: the magic instruction (in ssse3) is pshufb, I'd load 3 registers (data, data+8 and data+16) which is 8 pixels of data (or maybe half of that, since you're currently working in dwords), and then pshufb them into orientation to have 4/8 r's, 4/8 g's and 4/8 b's, multiply them with appropriate coefficients, and then pshufb them back to rgb order
[14:00:04 CEST] <durandal_1707> BBB: what to use to get dwords back to bytes?
[14:02:26 CEST] <kierank> packusdw, packuswb or sometime shuffle iirc
[14:03:06 CEST] <J_Darnley> packusdw is sse4
[14:03:36 CEST] <J_Darnley> check the 16-bit yadif code I wrote if you want to emulate it.
[14:04:24 CEST] <J_Darnley> (assuming your values are in 0-INT_MAX range
[14:04:37 CEST] <kierank> iirc there is a macro
[14:14:20 CEST] <BBB> packssdw works
[14:14:45 CEST] <BBB> packssdw+packuswb
[14:14:48 CEST] <BBB> packssdw is sse2
[14:16:02 CEST] <J_Darnley> As I understand it, it should work if you don't have a value over 0x7fff
[14:20:56 CEST] <BBB> 0x7fffffff
[14:21:39 CEST] <BBB> (INT_MAX)
[14:22:19 CEST] <J_Darnley> but it saturates to the signed word range... oh wait, this is going to be packed again to bytes so who cares
[14:22:52 CEST] <BBB> right, it doesn't matter in this case
[14:23:32 CEST] <BBB> I also still wonder if 32bit intermediates is overkill
[14:23:37 CEST] <BBB> but we'll leave that for some other time :)
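The packssdw+packuswb exchange above can be checked with a scalar model. This is a minimal sketch in Python, not FFmpeg code: the function names mirror the instructions, and `dwords_to_bytes` is a hypothetical helper. It shows why the intermediate saturation to the signed word range is harmless when the final target is an unsigned byte.

```python
def sat(value, lo, hi):
    """Clamp value into [lo, hi]."""
    return max(lo, min(hi, value))

def packssdw(a, b):
    """Model of SSE2 packssdw: 4+4 signed dwords -> 8 signed words, saturating."""
    assert len(a) == 4 and len(b) == 4
    return [sat(v, -32768, 32767) for v in a + b]

def packuswb(a, b):
    """Model of SSE2 packuswb: 8+8 signed words -> 16 unsigned bytes, saturating."""
    assert len(a) == 8 and len(b) == 8
    return [sat(v, 0, 255) for v in a + b]

def dwords_to_bytes(d):
    """Hypothetical helper: narrow four dwords to four bytes in two steps."""
    words = packssdw(d, d)
    return packuswb(words, words)[:4]

# 100000 first saturates to 32767, then to 255; the end result is the same
# as clamping the dword straight to 0..255.
print(dwords_to_bytes([-5, 0, 200, 100000]))
```

Any dword above 0x7fff is clipped by packssdw first, but it would have ended up as 255 after packuswb anyway.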
[14:50:50 CEST] <nevcairiel> re: schannel, if someone is curious about a full patch for testing or something https://github.com/Nevcairiel/FFmpeg/commit/33271e6db717a3bf80e524da7d1c82b23174f416
[14:51:02 CEST] <nevcairiel> i briefly tried to implement listen mode, but it didnt work, so i gave up for now
[14:54:22 CEST] <wm4> what is listen mode?
[14:54:46 CEST] <nevcairiel> you can open a listening socket
[14:54:50 CEST] <nevcairiel> and connect from another machine
[14:55:23 CEST] <nevcairiel> ie. this works ffmpeg -i file -f mpegts -listen 1 tcp://<your ip>:1234/ .. and on second machine do ffmpeg -i tcp://<ip>:1234/
[14:55:31 CEST] <nevcairiel> just with tls encryption
[14:55:43 CEST] <wm4> sounds very useful (not)
[14:55:59 CEST] <nevcairiel> the basis is probably not too bad, its a cheap way to stream live media over the network
[14:56:06 CEST] <nevcairiel> if you need tls for that is another question
[14:57:15 CEST] <nevcairiel> you can use it the other way around as well, open a listening socket on some remote machine, and push content there, possibly for further processing, and once its done it could push it back
[14:57:24 CEST] <nevcairiel> in theory you could do some fun things with little effort
[14:57:27 CEST] <nevcairiel> but again, meh tls
[15:06:55 CEST] <nevcairiel> Anyway my main goal was to enable reading of https urls, which i solved
[15:07:06 CEST] <nevcairiel> some users keep complaining, and i cba to bundle gnutls or openssl
[15:07:51 CEST] <wm4> heh
[15:08:02 CEST] <wm4> yeah, that seems like a good thing
[15:13:55 CEST] <BBB> wm4: I'd like the #if HAVE_THREADS patch in, we have specific hooks in each AVCodec to only initialize the threading bits if threading is enabled, so it's clearly considered a feature important enough to obfuscate each AVCodec instance in our codebase
[15:13:59 CEST] <BBB> wm4: is that ok?
[15:14:39 CEST] <wm4> which patch?
[15:14:59 CEST] <wm4> the one about compiler warnings?
[15:15:02 CEST] <BBB> yes
[15:15:47 CEST] <BBB> (you objected to the concept of fixing that specific warning)
[15:15:52 CEST] <wm4> I still don't get the point (at all), but I won't block it
[15:16:23 CEST] <BBB> ok, ty
[15:25:57 CEST] Action: Compn quotes developer policy and runs
[15:26:04 CEST] <Compn> run for the hills!
[15:30:59 CEST] <BBB> hm, there's no hevc maintainer in MAINTAINERS
[15:31:08 CEST] <BBB> I guess nobody wants it
[15:42:41 CEST] <cone-754> ffmpeg Henrik Gramner master:ec85153f252c: checkasm: Fix compilation with --disable-avcodec
[15:58:24 CEST] <durandal_1707> so I need to do rgbrgbrgbrgb -> rrrrggggbbbb
[16:00:08 CEST] <durandal_1707> BBB ^
[16:04:07 CEST] <Gramner> pshufb
[16:05:58 CEST] <nevcairiel> pshufb is the only good way to do 3 component shuffling, f'ing odd counts
[16:06:29 CEST] <durandal_1707> yes but what arg to use?
[16:07:08 CEST] <kierank> you can read the doc and work it out
[16:07:38 CEST] <RiCON> nit: ffmpeg -protocols with different sort using librtmp http://j.fsbn.eu/PXde.png (left with ffrtmp, right with librtmp)
[16:09:15 CEST] <wm4> I would have expected left to be librtmp?
[16:09:25 CEST] <RiCON> oh, external libraries always get sorted after
[16:09:30 CEST] <wm4> yeah
[16:09:32 CEST] <RiCON> so it's probably intended
[16:09:54 CEST] <RiCON> and yeah, left is librtmp, typo
[16:09:57 CEST] <wm4> yeah, I think it's registration order
[16:15:31 CEST] <durandal_1707> ok I think I got order
[16:17:02 CEST] <RiCON> nevcairiel: trying schannel, seems to compile and work fine with mpv
[16:18:42 CEST] <JEEB> oh, native tls?
[16:19:38 CEST] <wm4> fun, lots of native APIs lately (libass is fontconfig free, now ffmpeg will be openssl free)
[16:21:13 CEST] <durandal_1707> if I have rrrrggggbbbb how to extract gggg?
[16:21:32 CEST] <durandal_1707> or bbbb?
[16:21:34 CEST] <nevcairiel> could shift it to the front and read 32-bit or something
[16:21:44 CEST] <nevcairiel> no idea if there is a smarter way
[16:22:02 CEST] <J_Darnley> A better question: where do you want to extract it to?
[16:22:42 CEST] <durandal_1707> to another register so I can add and multiply
[16:22:55 CEST] <J_Darnley> Is that a SIMD register then?
[16:23:02 CEST] <durandal_1707> Yes
[16:23:40 CEST] <J_Darnley> Perhaps a shift to low and unpack, or pshufd and unpack
[16:24:09 CEST] <J_Darnley> I think AVX has some more interesting blending instructions
[16:27:09 CEST] <J_Darnley> I'm starting to wonder whether I should gather various separate instruction references from across the web into one place.
[16:27:49 CEST] <J_Darnley> I, for one, can't remember many of the things added after SSE2
[16:28:26 CEST] <JEEB> I think they have just made the reference manual longer and longer
[16:28:43 CEST] <JEEB> and then there's some kind of interactive reference thing which might have been java
[16:28:53 CEST] <J_Darnley> They certainly have, and it's split A-M and N-Z
[16:29:21 CEST] <J_Darnley> I saw an online JS-based intrinsics reference
[16:29:26 CEST] <JEEB> :o
[16:29:32 CEST] <JEEB> that makes more sense than the Java applet
[16:29:40 CEST] <J_Darnley> https://software.intel.com/sites/landingpage/IntrinsicsGuide/
[16:29:48 CEST] <JEEB> wow
[16:29:50 CEST] <nevcairiel> does it use asm.js? :P
[16:29:50 CEST] <JEEB> finally
[16:29:54 CEST] <J_Darnley> It showed me just how inane it was
[16:30:06 CEST] <J_Darnley> So many ways to load a dword
[16:30:36 CEST] <J_Darnley> Fortunately you can search for an exact instruction name.
[16:30:40 CEST] <nevcairiel> keep note that its intrinsics, so a bunch map to the same instruction
[16:31:04 CEST] <JEEB> yeah
[16:35:27 CEST] <Gramner> https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
[16:35:28 CEST] <J_Darnley> If I knew any javascript I would probably steal that page and use it for actual assembly instructions.
[16:37:03 CEST] <Gramner> also https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf for future stuff
[16:37:16 CEST] <J_Darnley> OMG! There's a third volume!
[16:38:04 CEST] <Gramner> third volume is for os design iirc
[16:38:10 CEST] <J_Darnley> oh
[16:38:20 CEST] <J_Darnley> that's a little better
[16:38:25 CEST] <Gramner> and virtual machine things
[16:43:29 CEST] <RiCON> ffrtmpcrypt can only use openssl?
[16:44:38 CEST] <nevcairiel> apparently
[16:44:40 CEST] <nevcairiel> dont ask me why
[16:44:59 CEST] <J_Darnley> Because nobody wrote anything else?
[16:45:12 CEST] <J_Darnley> All you streaming nuts were satisfied?
[16:45:18 CEST] <nevcairiel> ah it can also use gmp
[16:45:31 CEST] <nevcairiel> but yeah it needs some magic that does dh exchanges
[16:45:36 CEST] <nevcairiel> that uses openssl or gmp
[16:45:48 CEST] <nevcairiel> or gmp/gcrypt, whatever
[16:51:06 CEST] <RiCON> only "mainstream" use of rtmpe i can remember is crunchyroll
[16:52:22 CEST] <JEEB> a lot of places use it since it's the least irritating thing that comes down as some kind of "DRM"
[17:11:38 CEST] <nevcairiel> i dunno, isnt it just weak encryption with a key exchange built into the protocol? you dont even need a crypt key or something to decrypt it, you just ask the server .. might as well use rtmps
[17:17:58 CEST] <nevcairiel> (in other news, schannel even works on xp)
[17:22:56 CEST] <durandal_1707> I get segv when trying to multiply with coefficients
[17:36:36 CEST] <J_Darnley> durandal_1707: post the code
[17:36:54 CEST] <J_Darnley> in the mean time I will guess: something isn't aligned
[17:38:39 CEST] <durandal_1707> I'm supposed to movd instead to pmulld directly
[17:39:40 CEST] <J_Darnley> huh?
[17:42:07 CEST] <durandal_1707> movd coeffs to mX
[17:43:18 CEST] <durandal_1707> how to concat rrrr +gggg+bbbb?
[17:43:43 CEST] <J_Darnley> shift + or
[17:54:17 CEST] <cone-754> ffmpeg Ganesh Ajjanagadde master:59594b11f962: Changelog: add note on ffplay dynamic volume control
[17:54:18 CEST] <cone-754> ffmpeg Ganesh Ajjanagadde master:c3e8de1c248f: ffplay: more robust thread creation
[18:14:16 CEST] <BBB> durandal_1707: segfault is probably because arguments are not aligned
[18:14:23 CEST] <BBB> durandal_1707: if they're 16-byte aligned, you can multiply directly
[18:15:24 CEST] <durandal_1707> BBB: I can't get red plane to work correctly
[18:15:47 CEST] <durandal_1707> just doing red plane for now
[18:18:19 CEST] <durandal_1707> can [x][] be accessed from asm?
[18:29:10 CEST] <durandal_170> BBB: https://github.com/richardpl/FFmpeg/blob/stereo3d/libavfilter/x86/vf_stereo3d.asm
[18:31:44 CEST] <BBB> what is [x][]?
[18:32:36 CEST] <durandal_1707> matrix
[18:32:50 CEST] <durandal_1707> NxM size
[18:33:31 CEST] <BBB> what code are you thinking of specifically?
[18:33:54 CEST] <BBB> if it's a matrix of int16_t bla[20][3], yeah, sure, you can open that in assembly
[18:34:18 CEST] <BBB> if it's int16_t *bla[3]; each filled with int16_t x[something], it's a little different, but still doable
[18:34:40 CEST] <BBB> the first is a simple dereference, the second needs two levels of dereferences
[18:34:50 CEST] <Gramner> something something AoS vs SoA
[18:35:28 CEST] <durandal_1707> just asking, unrelated to stereo3d
[18:35:38 CEST] <BBB> why do you shuf followed by unpack/zero-extend?
[18:35:42 CEST] <BBB> pshufb can zero-extend for you
[18:36:04 CEST] <BBB> pshufb m0, bytearray[0,-1,-1,-1,3,-1,-1,-1,6,-1,-1,-1,9,-1,-1,-1]
[18:36:11 CEST] <BBB> will create dword reds zero-extended
[18:36:26 CEST] <BBB> and then 0369->147A or 258B to get G/B
[18:36:41 CEST] <BBB> -1 means "zero it instead of take byte[idx]"
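The pshufb masks described above can be tried out with a scalar model. The sketch below is illustrative Python, not the filter's code: the pixel values are made up, and the helper names (`pshufb`, `dwords`, the `*_MASK` constants) are hypothetical. It follows the documented pshufb behaviour, where a mask byte with its top bit set (written as -1 here) produces a zero byte.

```python
def pshufb(src, mask):
    """Scalar model of SSSE3 pshufb on one 16-byte register.

    A mask byte with bit 7 set yields 0; otherwise its low 4 bits
    select a byte from src.
    """
    assert len(src) == 16 and len(mask) == 16
    out = []
    for m in mask:
        m &= 0xFF  # -1 becomes 0xFF, which has bit 7 set
        out.append(0 if m & 0x80 else src[m & 0x0F])
    return out

def dwords(reg):
    """Read a 16-byte register as four little-endian dwords."""
    return [int.from_bytes(bytes(reg[i:i + 4]), "little") for i in range(0, 16, 4)]

# Four packed RGB pixels followed by 4 bytes of padding (made-up values).
pixels = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33, 0, 0, 0, 0]

RED_MASK = [0, -1, -1, -1, 3, -1, -1, -1, 6, -1, -1, -1, 9, -1, -1, -1]
# 0369 -> 147A or 258B: bump every selecting index by 1 or 2 for G or B.
GREEN_MASK = [m + 1 if m >= 0 else m for m in RED_MASK]
BLUE_MASK = [m + 2 if m >= 0 else m for m in RED_MASK]

reds = dwords(pshufb(pixels, RED_MASK))
greens = dwords(pshufb(pixels, GREEN_MASK))
blues = dwords(pshufb(pixels, BLUE_MASK))
print(reds, greens, blues)
```

Each result register holds one colour channel as four zero-extended dwords, so no separate unpack step is needed.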
[18:37:02 CEST] <BBB> how large are the coefficients?
[18:37:13 CEST] <BBB> if they're 16bit, you should use pmaddwd, not pmulld
[18:38:14 CEST] <durandal_1707> they are bigger than that sometime
[18:39:01 CEST] <BBB> oh, unfortunate, ok
[18:45:29 CEST] <jamrial> durandal_1707: you wrote 20 for xmm regs by mistake in that file above
[18:46:12 CEST] <cone-754> ffmpeg Ganesh Ajjanagadde master:26e8895b7395: all: add _DEFAULT_SOURCE locally wherever needed
[18:47:02 CEST] <durandal_1707> jamrial: iknow, but thats not problem
[18:48:48 CEST] <cone-754> ffmpeg Ganesh Ajjanagadde master:2cbaa078d18a: avcodec: use HAVE_THREADS header guards to silence -Wunused-function
[19:01:26 CEST] <durandal_1707> BBB: so nothing obvious why I do not get red plane?
[19:01:47 CEST] <BBB> oh haven't checked that yet :)
[19:01:50 CEST] <BBB> will do in a bit
[19:10:28 CEST] <durandal_1707> I now get red something but it looks like mults and adds messed up something
[19:12:00 CEST] <cone-754> ffmpeg James Almer master:b70566d6ca7b: avcodec/alacdec: split off decorrelate_stereo and append_extra_bits as alacdsp
[19:21:33 CEST] <BBB> durandal_1707: your shufs are weird
[19:22:04 CEST] <BBB> durandal_1707: I think you want 3 shufs for rgbrgbrgbrgb -> r000r000r000r000 and g000g000g000g000 and b000b000b000b000
[19:22:26 CEST] <BBB> durandal_1707: and then after packusdw/wb that back to rrrrggggbbbbxxxx, you need another shuf to get back to rgbrgbrgbrgb(xxxx)
[19:22:33 CEST] <BBB> and then you can write out 4 pixels/12 bytes
[19:22:49 CEST] <BBB> durandal_1707: and once it works for 4 pixels, you can adjust it to pair in twos and do 8 pixels at a time
[19:35:20 CEST] <durandal_170> BBB: i did some changes, use same link, i hope its more obvious now, bug is still there
[19:36:57 CEST] <BBB> why is m8/m9/m10 all initialized from ana_matrix_rd?
[19:37:16 CEST] <BBB> and why movd?
[19:38:13 CEST] <BBB> so, I think your ana_matrix conversion in 8/9/10 is all wrong
[19:38:29 CEST] <BBB> so let's say that you load data in m0, right?
[19:38:44 CEST] <durandal_1707> yes
[19:38:55 CEST] <BBB> so 1/2/3 is now r000r000, g000g000 and b000b000
[19:39:07 CEST] <BBB> or, well, r[dword], g[dword] and b[dword]
[19:39:41 CEST] <BBB> so don't you want to multiply r by an _array_ of ana_r[0], g by an _array_ of ana_r[1] and b by an _array_ of ana_r[2]?
[19:40:16 CEST] <BBB> so, offsets are in byte, so do movd m8, [ana_matrix_rq+0], movd m9, [ana_matrix_rq+4] and movd m10, [ana_matrix_rq+8]
[19:40:31 CEST] <BBB> and then splat them in the register so they cover the whole register size, not just the first bit
[19:40:35 CEST] <BBB> pshufd m8, m8, q0000
[19:40:39 CEST] <BBB> pshufd m9, m9, q0000
[19:40:44 CEST] <BBB> and pshufd m10, m10, q0000
[19:41:08 CEST] <BBB> that changes m8=r[dword] into m8=r[dword],r[dword],r[dword],r[dword]
[19:41:11 CEST] <BBB> in all 16 bytes of the register
[19:41:15 CEST] <BBB> instead of just the first 4 bytes
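The movd plus pshufd splat described above can be modelled in scalar code. This is a hedged sketch in Python: `q()` assumes that x86inc's qABCD constants encode the immediate as (A<<6)|(B<<4)|(C<<2)|D, so q0000 is 0, and the coefficient value is only illustrative.

```python
def q(a, b, c, d):
    """Assumed encoding of x86inc's qABCD shuffle constants."""
    return (a << 6) | (b << 4) | (c << 2) | d

def movd(value):
    """Model of movd into an xmm register: low dword set, the rest zeroed."""
    return [value & 0xFFFFFFFF, 0, 0, 0]

def pshufd(src, imm8):
    """Model of SSE2 pshufd: each 2-bit field of imm8 picks a source dword."""
    assert len(src) == 4
    return [src[(imm8 >> (2 * i)) & 3] for i in range(4)]

def pmulld(a, b):
    """Model of SSE4.1 pmulld: per-dword multiply, keeping the low 32 bits."""
    return [(x * y) & 0xFFFFFFFF for x, y in zip(a, b)]

coeff = 19595              # illustrative coefficient, not read from the filter
reds = [10, 11, 12, 13]    # four zero-extended red samples

unsplatted = movd(coeff)               # [coeff, 0, 0, 0]
m8 = pshufd(unsplatted, q(0, 0, 0, 0)) # [coeff, coeff, coeff, coeff]

print(pmulld(reds, unsplatted))  # only the first pixel gets multiplied
print(pmulld(reds, m8))          # all four pixels get multiplied
```

Without the splat, three of the four products are zero, which is consistent with a plane that comes out mostly wrong.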
[19:42:01 CEST] <rcombs> ubitux: brew's package doesn't install the .pc for whatever reason :/
[19:42:16 CEST] <ubitux> ah, that's unfortunate then
[19:46:12 CEST] <durandal_1707> BBB: works!
[19:46:35 CEST] <BBB> \o/
[19:46:40 CEST] <BBB> now try the same for g/b
[20:02:40 CEST] <durandal_1707> if I want to shift to combine RGB what instruction to use?
[20:20:32 CEST] <BBB> pshufb again
[20:20:43 CEST] <BBB> you packuswb to get rrrr, gggg and bbbb
[20:21:09 CEST] <BBB> you can combine them together by using packusdw red, blue and packusdw green, green
[20:21:21 CEST] <BBB> and then packuswb red/blue, green
[20:21:38 CEST] <BBB> and then pshufb to orient correctly from rrrrggggbbbb to rgbrgbrgb
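The recombination steps above can be followed with a scalar model. This is an illustrative Python sketch, not the filter's code: the sample values and `RGB_MASK` are made up for the example. Note that with the pack order as written, the intermediate byte layout is rrrrbbbbgggggggg, so the final pshufb mask has to account for blue sitting before green.

```python
def packusdw(a, b):
    """Model of SSE4.1 packusdw: signed dwords -> unsigned words, saturating."""
    return [max(0, min(0xFFFF, v)) for v in a + b]

def packuswb(a, b):
    """Model of SSE2 packuswb: words are read as signed, output is 0..255."""
    out = []
    for w in a + b:
        s = w - 0x10000 if w >= 0x8000 else w  # reinterpret the 16-bit pattern
        out.append(max(0, min(255, s)))
    return out

def pshufb(src, mask):
    """Model of SSSE3 pshufb; a mask byte with bit 7 set (-1) yields zero."""
    return [0 if (m & 0x80) else src[m & 0x0F] for m in (x & 0xFF for x in mask)]

red = [1, 2, 300, 4]         # made-up dword results, one out of range
green = [10, 20, 30, 40]
blue = [100, -4, 102, 103]

rb = packusdw(red, blue)     # words: r r r r b b b b
gg = packusdw(green, green)  # words: g g g g g g g g
packed = packuswb(rb, gg)    # bytes: rrrr bbbb gggg gggg

# r_i sits at byte i, b_i at 4+i, g_i at 8+i.
RGB_MASK = [0, 8, 4, 1, 9, 5, 2, 10, 6, 3, 11, 7, -1, -1, -1, -1]
rgb = pshufb(packed, RGB_MASK)
print(rgb[:12])
```

One detail of this sequence: packuswb reads its input as signed words, so a dword of 0x8000 or more that survives packusdw would come out as 0 rather than 255. The packssdw+packuswb pair does not have that property.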
[20:29:59 CEST] <durandal_1707> BBB: I did it but its not bitexact, numbers are off by one or two
[20:37:46 CEST] <durandal_1707> I think I need clamping and not saturation
[20:42:17 CEST] <J_Darnley> what's the difference?
[20:42:40 CEST] <J_Darnley> anyway see the CLIP* macros
[20:43:35 CEST] <J_Darnley> durandal_1707 ^
[20:44:58 CEST] <BBB> durandal_1707: how are they different?
[20:45:35 CEST] <BBB> durandal_1707: now comes the fun part - debugging assembly
[20:45:37 CEST] <BBB> it's quite fun
[20:45:47 CEST] <BBB> durandal_1707: gdb, stepi, disass, breakpoint, etc.
[20:45:49 CEST] <BBB> all your friends
[20:45:59 CEST] <BBB> and info all-registers or print $xmm0
[20:46:00 CEST] <BBB> etc.
[20:50:06 CEST] <durandal_1707> also when using threads md5 changes
[20:50:07 CEST] <BBB> gtg, later
[20:50:25 CEST] <BBB> you may have an overwrite because you write 16 bytes
[20:50:28 CEST] <BBB> but only 12 bytes is useful
[20:50:37 CEST] <BBB> maybe try movq+movd instead of movu, and a shift in between
[20:50:41 CEST] <BBB> psrldq
[20:50:44 CEST] <BBB> bye
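The overwrite BBB describes can be illustrated with a scalar model of the two store strategies. This is a sketch in Python with hypothetical helper names: a 16-byte store writes 4 bytes past the 12 useful ones, while movq, psrldq by 8 and movd write exactly 12.

```python
def psrldq(reg, n):
    """Model of SSE2 psrldq: shift the whole register right by n bytes."""
    return reg[n:] + [0] * n

def store_movu(buf, off, reg):
    """16-byte unaligned store: writes all 16 bytes of the register."""
    buf[off:off + 16] = bytes(reg)

def store_12(buf, off, reg):
    """movq + psrldq + movd: writes only the 12 useful bytes."""
    buf[off:off + 8] = bytes(reg[:8])        # movq stores the low 8 bytes
    high = psrldq(reg, 8)                    # bring bytes 8..15 down to 0..7
    buf[off + 8:off + 12] = bytes(high[:4])  # movd stores the low 4 bytes

reg = list(range(1, 13)) + [0, 0, 0, 0]  # 4 RGB pixels + 4 junk bytes

# 0xAA marks bytes that belong to the next pixels (or another thread's slice).
clobbered = bytearray([0xAA] * 16)
store_movu(clobbered, 0, reg)

intact = bytearray([0xAA] * 16)
store_12(intact, 0, reg)

print(clobbered[12:], intact[12:])
```

With slice threading, those 4 extra bytes may belong to a neighbouring job, which would be one way for the md5 to vary between runs.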
[20:57:22 CEST] <durandal_1707> that did it
[20:58:26 CEST] <durandal_1707> Should I use stack instead of reading coeff over and over again?
[21:03:49 CEST] <jamrial> you have 16 xmm regs available (assuming this is x86_64). don't you have any of them free to store the coeffs?
[21:08:51 CEST] <durandal_1707> there are 18 coeffs
[21:17:46 CEST] <durandal_1707> jamrial: perhaps one or two
[21:18:19 CEST] <durandal_1707> but what to use for stack?
[21:18:24 CEST] <kurosu> some insns accept memory operands (ie an address instead of a reg)
[21:18:59 CEST] <kurosu> but this situation is unusual, so it might be related to how your algorithm lays out data
[21:19:29 CEST] <kurosu> but then, I haven't seen it, and it's maybe completely normal
[21:22:59 CEST] <kurosu> ok, ana_convert => normal
[21:24:08 CEST] <kurosu> so, yes, if they are dynamic, maybe have an aligned buffer part of the dsp struct or something, where the dsp is free to put data, eg by calling a function specific to setting the coeffs according to the dsp expectations
[21:25:09 CEST] <kurosu> ie pass ana_matrix_r,g,b to that func, it fills said buffer in the dsp, and call that dsp with the coeff buffer of the dsp struct instead of directly ana_matrix_r/g/b
[21:25:45 CEST] <RiCON> nevcairiel: Re: rtmpe: this works (?) -> http://sprunge.us/TYAK?diff
[21:26:36 CEST] <nevcairiel> i guess
[21:26:48 CEST] <nevcairiel> the check never accounted for having something else than openssl and gnutls
[21:27:29 CEST] <RiCON> compiled and worked with at least one crunchy video
[21:33:58 CEST] <nevcairiel> i cant test, i dont have gmp/gcrapt
[21:34:02 CEST] <nevcairiel> -a+y
[21:45:42 CEST] <cone-754> ffmpeg Paul B Mahol master:0c2b37fed43e: avfilter: add displace video filter
[22:04:30 CEST] <J_Darnley> durandal_1707: I'm reading through the stereo filter patch you sent
[22:28:36 CEST] <kurosu> durandal_1707, do you absolutely need int and 16 bits of precision in your coeffs? because you could get down to 15, and be able to use pmaddwd (16bx16b + 16bx16b -> 32b)
[22:29:30 CEST] <kurosu> that'd make the asm ok on sse2, and allow twice more computations per insn
[22:30:16 CEST] <kurosu> then again, the movd+pshufd seems like a pity, it would be so much better if the tables were unrolled
[22:32:11 CEST] <kurosu> it really seems like you should pass the output format (ANAGLYPH_*) as a parameter
[22:32:47 CEST] <kurosu> then have the asm sets the proper address in its own arrays from it
[22:33:29 CEST] <kurosu> like table + (array size for a mode) * mode
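kurosu's pmaddwd suggestion (16bx16b + 16bx16b -> 32b) can be modelled in scalar code. The sketch below is illustrative Python: the sample values are made up, and halving the 16-bit coefficients with a right shift is only an assumption about how they might be brought down to 15 bits.

```python
def pmaddwd(a, b):
    """Model of SSE2 pmaddwd: multiply 8 signed word pairs and add
    adjacent products, giving 4 signed dwords (wrapping at 32 bits)."""
    assert len(a) == 8 and len(b) == 8
    out = []
    for i in range(0, 8, 2):
        s = a[i] * b[i] + a[i + 1] * b[i + 1]
        out.append((s + 2**31) % 2**32 - 2**31)  # wrap to signed 32-bit
    return out

# Illustrative 16-bit coefficients reduced to 15 bits so they fit a signed word.
coeff_r, coeff_g = 19595 >> 1, 38470 >> 1

# Red and green samples interleaved as words: r0 g0 r1 g1 r2 g2 r3 g3.
samples = [10, 20, 11, 21, 12, 22, 13, 23]
coeffs = [coeff_r, coeff_g] * 4

# One instruction yields r*coeff_r + g*coeff_g for four pixels.
partial = pmaddwd(samples, coeffs)
print(partial)
```

Compared with pmulld on zero-extended dwords, each pmaddwd performs two multiplies and an add per output dword, which is the "twice more computations per insn" point, and it only needs SSE2.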
[22:36:56 CEST] <durandal_1707> kurosu: own arrays?
[22:51:30 CEST] <J_Darnley> Is there a spec or reference with decimal values for the coefficients?
[23:03:43 CEST] <durandal_1707> J_Darnley: just divide by 65536
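The "divide by 65536" answer treats the coefficients as fixed-point values with 16 fractional bits. A minimal sketch in Python, using an illustrative row of integers rather than values read from the filter:

```python
def fixed_to_float(coeff, frac_bits=16):
    """Convert a fixed-point coefficient to its decimal value."""
    return coeff / (1 << frac_bits)

# Illustrative row; these three integers sum to exactly 65536 (i.e. 1.0).
row = [19595, 38470, 7471]
decimals = [round(fixed_to_float(c), 3) for c in row]
print(decimals)
```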
[23:04:58 CEST] <durandal_1707> what gain would be storing shuffed coeffs on stack?
[23:10:09 CEST] <cone-754> ffmpeg Michael Niedermayer master:46f3015f3525: avcodec/tta: Un-break build without threads
[23:25:42 CEST] <cone-754> ffmpeg Rodger Combs master:854972b53dc7: libavformat/tls_securetransport: fix argument evalulation order UB
[23:31:39 CEST] <durandal_1707> yasm supports >16 registers?
[23:39:27 CEST] <durandal_1707> BBB: can't I use avx and its 32 registers?
[23:39:39 CEST] <BBB> I believe avx512 has 32 regs
[23:39:48 CEST] <BBB> avx only has 16 32byte registers (on 64bit)
[23:45:59 CEST] <durandal_1707> I see no support for such in ffmpeg yet
[23:47:26 CEST] <nevcairiel> no avx512 support yet, no
[23:49:44 CEST] <BBB> your hardware doesn't support it anyway
[23:49:46 CEST] <BBB> so it's ok
[23:50:35 CEST] <BBB> durandal_1707: and did some initial review of your patch - not bad, but lots can be improved, good job on a first try, that's a pretty tough function
[23:50:46 CEST] <BBB> durandal_1707: btw do you like this type of work?
[23:53:34 CEST] <durandal_1707> its nice to speed up code
[23:57:08 CEST] <durandal_1707> it would make more sense to write gbrp simd in swscale
[23:58:01 CEST] <BtbN> I wonder if OpenCL would make sense for some stuff in swscale
[23:58:11 CEST] <nevcairiel> go away =P
[23:58:37 CEST] <nevcairiel> swscale would need a cleaner architecture so you could plug scalers in more easily
[23:58:55 CEST] <BtbN> easy, just put every conversion into lavf!
[00:00:00 CEST] --- Mon Oct 5 2015