[Ffmpeg-devel-irc] ffmpeg-devel.log.20170901
burek
burek021 at gmail.com
Sat Sep 2 03:05:04 EEST 2017
[00:08:18 CEST] <michaelni> durandal_170, in git master the multiplication looks like just real valued not complex so while it scales its not compatible with convolution
[00:09:58 CEST] <durandal_170> michaelni: yes, i'm reusing rdft_horizontal and rdft_vertical and doing complex multiplication
[00:10:21 CEST] <michaelni> iam not sure if not one of these need to be changed to C2C
[00:11:33 CEST] <durandal_170> wouldn't same be needed if one does just inverse of itself?
[00:16:20 CEST] <atomnuker> BtbN: so CUdeviceptr are pointers but you can't access them directly?
[00:17:02 CEST] <atomnuker> what are AVFrames supposed to contain in the data pointers if they're on the hardware?
[00:17:20 CEST] <atomnuker> pointers to contexts so the hwcontext functions can access them?
[00:18:39 CEST] <nevcairiel> every hardware pixfmt defines what the data pointers contain
[00:18:55 CEST] <nevcairiel> so just opaque things you need to know how to access
[00:20:02 CEST] <rcombs> it can be an actual pointer, or an integer surface ID, or whatever else
[00:20:42 CEST] <atomnuker> and nothing else but hwcontext functions will access it, right?
[00:21:08 CEST] <nevcairiel> not necessarily
[00:21:18 CEST] <nevcairiel> its documented what it contains, so the user could access it
[00:21:35 CEST] <nevcairiel> and of course encoders/decoders/filters access it
[00:21:35 CEST] <wm4> hm I think opencl is one example where it's more complex?
[00:22:00 CEST] <wm4> most hw pixfmts have just a single pointer which is an API object in the underyling hw API
[00:22:18 CEST] <atomnuker> opencl isn't a hwaccel and its rather shoddily implemented
[00:22:51 CEST] <wm4> I thought you're doing this for a filter
[00:22:51 CEST] <atomnuker> I don't like it and I think we need to get rid of it, vulkan does everything it does better
[00:23:03 CEST] <jkqxz> There is an opencl hwcontext, but it's stalled for lack of any proper use.
[00:23:05 CEST] <wm4> unless you're talking about the old opencl filters
[00:23:12 CEST] <jkqxz> The opencl filters in ffmpeg are just awful.
[00:23:23 CEST] <atomnuker> wm4: not necessarily, there's jpeg decoding shaders which I'd like to implement
[00:23:28 CEST] <atomnuker> so its a hwaccel
[00:23:30 CEST] <wm4> yeah I recall jkqxz wanted to add some real opencl filters
[00:23:38 CEST] <wm4> atomnuker: oh
[00:24:04 CEST] <nevcairiel> that sounds stupid
[00:24:48 CEST] <jkqxz> What do you want JPEG decoding for? All the decode APIs implement it, but noone has bothered to write a hwaccel. That rather suggests that noone cares.
[00:25:39 CEST] <atomnuker> because they're already written and seem to perform better and to demonstrate it might be usable elsewhere to decode
[00:26:15 CEST] <nevcairiel> "because it exists" shouldnt be a reason to bloat up ffmpeg, you're one of those always eager to delete stuff thats not used by anything =p
[00:26:36 CEST] <atomnuker> they're faster then :)
[00:26:43 CEST] <jkqxz> Perform better than what? Does that include all the time messing around with upload/download?
[00:27:18 CEST] <atomnuker> yep
[00:27:25 CEST] <atomnuker> jpegturbo IIRC
[00:27:47 CEST] <jkqxz> (The whole point of the opencl stuff was that upload/download was terrible and you should do stuff only on the GPU, so it interoperates with DXVA2, D3D11 and VAAPI to do that. Upload/download is possible but to be avoided.)
[00:28:26 CEST] <wm4> atomnuker: is this a generic compute shader?
[00:28:34 CEST] <atomnuker> no, glsl
[00:28:35 CEST] <atomnuker> https://archive.org/details/lca2017-GPU_Accelerated_JPEG_Rendering
[00:28:46 CEST] <wm4> well compute shaders can be written in glsl
[00:29:24 CEST] <nevcairiel> meh its a partial decoder
[00:29:37 CEST] <wm4> aren't normal hwaccels too?
[00:29:38 CEST] <nevcairiel> you need to do bitstream shit on the cpu, which leads to super ugly code IMHO
[00:29:48 CEST] <nevcairiel> well no, hwaccels get the full slices
[00:29:53 CEST] <atomnuker> https://people.xiph.org/~negge/LCA2017.pdf
[00:29:56 CEST] <nevcairiel> this needs entropy decoding on the cpu
[00:30:04 CEST] <wm4> yeah, but everything else is done on the CPU
[00:30:18 CEST] <nevcairiel> so basically its at a state where video hwaccels were 10 years or so ago
[00:30:20 CEST] <wm4> hm makes sense
[00:30:23 CEST] <nevcairiel> and people moved on to things that dont suck
[00:30:32 CEST] <wm4> or which suck more
[00:30:39 CEST] Action: wm4 stares at apple and linux/arm things
[00:30:55 CEST] <nevcairiel> slice decoders are definitely better then MC/IDCT decoders from long ago =p
[00:30:58 CEST] <jkqxz> Hybrid stuff does have value for new codecs. JPEG isn't what I'd pick there though...
[00:32:48 CEST] <atomnuker> well its to proove a concept, don't expect me to write loopfilter shaders for vp9 just yet
[00:33:41 CEST] <jkqxz> VP9 will be in ~all new hardware pretty soon. Better start with AV1!
[00:33:56 CEST] <nevcairiel> the performance numbers in that paper include things like yuv -> rgb conversion, no wonder it turns up faster in the end
[00:34:38 CEST] <jkqxz> I bet they don't do correct colourspace conversion.
[00:35:12 CEST] <wm4> nevcairiel: they provide separate numbers for that
[00:35:52 CEST] <atomnuker> jkqxz: and look what good hardware decoding does when it fails to decode 90% of the videos I'd like to watch
[00:36:02 CEST] <atomnuker> hevc 10bits isn't even supported on skylake
[00:36:08 CEST] <atomnuker> its all because of profiles and levels
[00:36:19 CEST] <jkqxz> Are you one of those amine people who wants 10-bit H.264?
[00:36:31 CEST] <atomnuker> no, HDR 4K 10bit videos
[00:36:40 CEST] <wm4> I think people have tried to do h264 in shaders before
[00:36:52 CEST] <wm4> and the performance wasn't necessarily convincing, I think?
[00:36:57 CEST] <nevcairiel> not at all
[00:37:01 CEST] <jkqxz> So Kaby Lake, right. The profiles are a pain, yeah, but new ones will just be added as they become useful.
[00:37:09 CEST] <nevcairiel> nothing beats full hardware decoders anyway
[00:37:20 CEST] <nevcairiel> they got so fast in recent years
[00:37:26 CEST] <atomnuker> no, skylake, I have a skylake from the end of 2015, and yet it doesn't do 10bit
[00:37:27 CEST] <wm4> "hybrid" decoding done by intel drivers probably also uses programmable GPU parts
[00:37:30 CEST] <atomnuker> what use is it then?
[00:38:13 CEST] <nevcairiel> there is literally no hevc content commercially available though outside of walled-garden ecosystems we don't get to play in anyway =p
[00:39:15 CEST] <atomnuker> nevcairiel: they go fast? decoding low res h264 is faster in software
[00:39:27 CEST] <jkqxz> No, I mean it works on Kaby Lake. Anything Intel newer than what you have will do it.
[00:39:27 CEST] <nevcairiel> define fast
[00:39:28 CEST] <atomnuker> by 5 to 10 times too
[00:39:33 CEST] <atomnuker> 5 to 10 times
[00:39:41 CEST] <nevcairiel> i get like a thousand fps on 1080p
[00:39:52 CEST] <nevcairiel> no clue how well it scales down on lowres
[00:40:09 CEST] <atomnuker> jkqxz: considering how it has a different socket you need to fork out enough cash for a new machine
[00:40:13 CEST] <nevcairiel> but i don't particularly care if it goes lightning fast or ultra lightning fast
[00:40:37 CEST] <nevcairiel> also skylake has hybrid hevc10 decoding afaik, its not all that brilliant but it works
[00:40:39 CEST] <jkqxz> The constant overhead gets annoying. At low resolutions it's only worth using hardware decode if you need the output in GPU surfaces.
[00:40:53 CEST] <rcombs> nevcairiel: isn't that only useful with MFX
[00:41:00 CEST] <nevcairiel> works through dxva as well
[00:41:05 CEST] <rcombs> hmmm
[00:41:11 CEST] <rcombs> but not VAAPI, presumably
[00:41:14 CEST] <nevcairiel> not sure about platforms that dont have a central decoding api :p
[00:41:37 CEST] <wm4> I thought there are some "special" closed vaapi drivers that do hybrid
[00:41:48 CEST] <wm4> maybe they were bundled with MFX or some shit
[00:41:54 CEST] <atomnuker> I'm sure I'm also not the only one who'd like a fully multiplatform hwaccel, even if its hybrid
[00:41:57 CEST] <wm4> quicksync sdk something something
[00:42:03 CEST] <rcombs> wm4: yeah that's MFX
[00:42:04 CEST] <nevcairiel> hybrid generally sucks
[00:42:12 CEST] <atomnuker> nevcairiel: APIs suck more
[00:42:27 CEST] <nevcairiel> I can implement APIs, I can't fix inherent suckyness of a hybrid approach
[00:42:36 CEST] <wm4> khronos should make a hwaccel API... but it'd suck or nobody would use it or both
[00:42:57 CEST] <jkqxz> They do. It's called OpenMAX.
[00:42:59 CEST] <rcombs> ^
[00:42:59 CEST] <atomnuker> nevcairiel: well you could if it was open... which with vulkan it would be
[00:43:12 CEST] <nevcairiel> besides on a codec like hevc entropy decoding quickly becomes a major bottleneck
[00:43:13 CEST] <rcombs> iirc AMD uses it, don't they
[00:43:13 CEST] <jkqxz> It sucks and some people use it anyway.
[00:43:16 CEST] <nevcairiel> so good luck getting that any faster
[00:43:35 CEST] <rcombs> and also Realtek on e.g. RPi, and a fair bit of Android stuff
[00:43:54 CEST] <nevcairiel> atomnuker: well i can't, because its just how hybrid works, its sucky by design
[00:44:12 CEST] <atomnuker> nevcairiel: so you're fully complacent with not touching things anymore if they run faster than realtime on your system?
[00:44:16 CEST] <rcombs> meanwhile, I recently learned that Android MediaCodec is capable of doing zero-copy buffer-passing& and it's built on top of EGL
[00:44:39 CEST] <rcombs> they added an EGL function for "set the PTS of the next frame I send"
[00:44:42 CEST] <wm4> jkqxz: oh, that's sad
[00:44:46 CEST] <wm4> rcombs: oh, that's sad, too
[00:44:47 CEST] <rcombs> and then you send a frame by doing swapBuffers()
[00:44:53 CEST] <wm4> ..................
[00:45:01 CEST] <wm4> just...
[00:45:20 CEST] <wm4> does it refer to using a GL interop API and rendering to an offscreen surface?
[00:45:33 CEST] <wm4> you had to do this with vaapi/x11, and it sucked to hell and back
[00:45:33 CEST] <rcombs> I think that's the idea?
[00:45:36 CEST] <nevcairiel> atomnuker: what does that have to do with anything? I have full decoding capability through GPUs of all 3 major vendors, I just don't need subpar hybrid =p
[00:45:58 CEST] <wm4> (vaapi had only a X11 rendering API... so you rendered to a X pixmap and mapped that as GL texture)
[00:46:10 CEST] <rcombs> so I'm still not entirely sure if this is better or worse than getting the vendors to give us access to the devfs nodes and using OMX directly
[00:46:11 CEST] <nevcairiel> you're never going to get hybrid faster then those dedicated decode ASICs
[00:46:23 CEST] <rcombs> (all this GL nonsense is implemented on top of OMX)
[00:46:28 CEST] <wm4> rcombs: I think it's both
[00:46:41 CEST] <rcombs> well, net-better or net-worse
[00:47:00 CEST] <atomnuker> nevcairiel: and when a new codec comes out or some other one's unsupported profile becomes a large enough meme, then what?
[00:47:12 CEST] <rcombs> the primary advantages of using the GL stuff is that it's theoretically portable (within android anyway), and it doesn't involve directly fucking with OMX
[00:47:22 CEST] <nevcairiel> then I rather decode on software then some crap hybrid :)
[00:47:36 CEST] <atomnuker> nevcairiel: but that crap hybrid might be 2x faster
[00:47:46 CEST] <wm4> rcombs: I'd rather continue to insist that google take a look at existing APIs instead of continuing to fuck up
[00:47:57 CEST] <wm4> d3d11va and vaapi are good examples these days
[00:48:25 CEST] <rcombs> so, is it sufficiently worse than OMX (in terms of both API [and having to deal with GL] and performance) that using OMX is preferable?
[00:48:33 CEST] <rcombs> I dunno the answer
[00:48:51 CEST] <rcombs> wm4: do you have thoughts on the new v4l2 codec interface
[00:49:01 CEST] <jkqxz> I think d3d11va and vaapi are considered too hard, because they give too much control to the user.
[00:49:10 CEST] <nevcairiel> I like control :(
[00:49:12 CEST] <jkqxz> V4L2 is the logical conclusion of taking all control away from the user and hiding everything in kernel blobs.
[00:49:38 CEST] <rcombs> meanwhile VideoToolbox is halfway in between
[00:49:48 CEST] <nevcairiel> the linux people really like their monolitic-everything-kernel, dont they
[00:49:54 CEST] <rcombs> everything is hidden, except for frame reordering, which you've got to do yourself
[00:50:04 CEST] <rcombs> and also it doesn't tell you how
[00:50:08 CEST] <nevcairiel> VT is just the most terrible of them all
[00:50:26 CEST] <rcombs> nevcairiel: IMO the API would be just fine if it weren't for how it fucks up on reordering hard
[00:50:28 CEST] <wm4> rcombs: well I guess it's better than X vendor APIs that work the same anyway
[00:50:33 CEST] <nevcairiel> it doesn't give you any control but doesn't even do the things it ought to be doing
[00:50:53 CEST] <rcombs> (for decode, I mean)
[00:51:06 CEST] <wm4> rcombs: and other than being full-stream and being full of awkward kernel dev fuckups, it's reasonable I guess
[00:51:21 CEST] <nevcairiel> fullstream is the future!
[00:51:34 CEST] <nevcairiel> and we parse the btistream anyway because they dont export half the metadata
[00:51:40 CEST] <rcombs> I'm fine with fullstream in concept, as long as it actually does all the stuff
[00:51:40 CEST] <wm4> (awkward kernel dev fuckups means kernel devs don't know how to create APIs, and force it into existing, inadequate POSIX or related APIs)
[00:51:43 CEST] <wm4> like ioctl hell
[00:51:46 CEST] <rcombs> just, in practice nothing does all the stuff
[00:52:17 CEST] <rcombs> wm4: did you see the time when the VideoToolbox kernel people forgot to zero buffers before they gave them to the decoder
[00:52:19 CEST] <wm4> nevcairiel: yeat, VT is terrible
[00:52:22 CEST] <wm4> *yeah
[00:52:28 CEST] <wm4> nevcairiel: both API and implementation
[00:52:43 CEST] <wm4> rcombs: yep
[00:52:45 CEST] <RiCON> wm4: speaking of VT, what happened to your april 1st MF patch
[00:53:14 CEST] <rcombs> I've got nothing against the VT API apart from the stuff that's missing
[00:53:17 CEST] <wm4> RiCON: yeah, I should fix that I guess
[00:53:30 CEST] <rcombs> if it had the missing functionality, but otherwise worked the same way it does now, it'd be fine
[00:53:42 CEST] <wm4> rcombs: you mean reordering?
[00:53:50 CEST] <rcombs> reordering, and other metadata stuff
[00:54:02 CEST] <nevcairiel> i have a thick skin and can ignore "bad" APIs mostly, as long as I can actually reach my goal eventually
[00:54:16 CEST] <nevcairiel> but VT doesn't seem to make that possible
[00:54:42 CEST] <rcombs> it's like they got halfway through designing something decent, and then stopped
[00:54:57 CEST] <rcombs> exactly at the point where it did enough for their own applications
[00:54:59 CEST] <nevcairiel> probably designed the other half in the OSX video player app
[00:55:05 CEST] <nevcairiel> and forgot that it didnt belong there
[00:55:09 CEST] <wm4> also, now is the time that users are crying for hevc support on OSX
[00:55:16 CEST] <nevcairiel> didnt they add that
[00:55:25 CEST] <wm4> yes, it seems so
[00:55:29 CEST] <wm4> with no docs
[00:55:35 CEST] <nevcairiel> even has an encoder i heard
[00:56:06 CEST] <wm4> I also need to fix that ffmpeg's VT crashes if multithreading is enabled
[00:56:32 CEST] <rcombs> lol
[00:56:35 CEST] <nevcairiel> people have been bugging me about hwdecoding for linux, luckily mac didnt come up yet
[00:56:56 CEST] <nevcairiel> speaking about that, does avcodec have a built-in way to auto-select vaapi or vdpau on availability yet?
[00:56:59 CEST] <wm4> I think VT mostly works now... I only remember a patch by tmm1 which fixed some sps midstream change thing
[00:57:18 CEST] <wm4> nevcairiel: no
[00:57:54 CEST] <cone-182> ffmpeg 03Dale Curtis 07master:f1e47f87131d: avformat/mov: Bail when invalid sample data is present.
[00:57:54 CEST] <cone-182> ffmpeg 03Daniel Glöckner 07master:feb1dbc7bd4c: avformat/mov: prevent duplication of first fragment's ctts_data
[00:57:56 CEST] <nevcairiel> what would the best way be to "probe"? try to create a device context?
[00:58:19 CEST] <wm4> that's what I do, plus checking for awful emulation layers
[00:58:41 CEST] <jkqxz> Probe is nasty because it's hard to know whether a given stream will actually be decodable on it.
[00:59:06 CEST] <nevcairiel> i guess, but i'm not worried about artificial cases like multiple GPUs yet
[00:59:17 CEST] <wm4> usually API probing means either it probably works with most streams, or it's horseshit
[00:59:30 CEST] <nevcairiel> so if I can detect if the API is actually available (and perhaps rule out emulation things), that might be enough
[01:01:32 CEST] <jkqxz> Most other stuff is just about there. The last generic hwaccel patch (not yet complete) in the other tine finishes off knowing which devices are usable and matching them to formats.
[01:01:46 CEST] <jkqxz> Given that you just look at the possibly-usable devices and try to make them.
[01:03:35 CEST] <wm4> jkqxz: oh right, getting in this hwframes adjustment API we've talked about would be nice because apparently a new ffmpeg release is planned soon
[01:05:01 CEST] <durandal_170> michaelni: well, i cant get identity impulse to be all 1 in re and all 0 in im so C2C doesnt hold for that one
[01:15:16 CEST] <jkqxz> wm4: Yeah, I should get back to that one.
[01:15:51 CEST] <jkqxz> I've been distracted by finishing off CBS and playing with some silly stuff recently. (Anyone want no-CPU screen grab for Linux? Works with or without X! Also needs root, boo :(.)
[01:24:47 CEST] <wm4> no-CPU? wut
[01:29:49 CEST] <jkqxz> KMS lets you find the buffers being used for scanout as DRM objects which you can give to VAAPI.
[01:32:48 CEST] <kiroma_> Is there a filter that can blend frames together to lower framerate of video?
[01:33:19 CEST] <kiroma_> for example blend 10 frames together in a 600 fps input video to produce a smooth 60 fps one?
[01:33:21 CEST] <durandal_1707> michaelni: it have something to do with copy_rev()
[01:33:47 CEST] <durandal_1707> kiroma_: define blend
[01:34:54 CEST] <kiroma_> Uh
[01:35:52 CEST] <kiroma_> Overlay one image on top of another with 50% opacity?
[01:36:01 CEST] <durandal_1707> iirc for simple blend there is framerate filter
[01:36:48 CEST] <kiroma_> Oh
[01:36:58 CEST] <kiroma_> Okay thanks
[01:37:17 CEST] <kiroma_> (how did I not notice that?)
[02:47:57 CEST] <cone-182> ffmpeg 03pkviet 07master:73bed07373f2: avocdec/libopus: fix typo
[02:47:58 CEST] <cone-182> ffmpeg 03Yi(SÑ) 07master:c24bcb553650: avformat/nsvdec: Fix DoS due to lack of eof check in nsvs_file_offset loop.
[02:47:59 CEST] <cone-182> ffmpeg 03Yi(SÑ) 07master:900f39692ca0: avformat/mxfdec: Fix DoS issues in mxf_read_index_entry_array()
[02:48:00 CEST] <cone-182> ffmpeg 03Yi(SÑ) 07master:9d00fb9d70ee: avformat/mxfdec: Fix Sign error in mxf_read_primer_pack()
[02:48:13 CEST] <rpw> doublya
[02:53:30 CEST] <Compn> irc client actually displays those chinese characters
[02:53:34 CEST] <Compn> impressive
[03:05:14 CEST] <rpw> I'm calling scale_slice in vf_scale.c from multiple pthreads. I'm getting a segmentation fault.
[03:13:00 CEST] <rpw> Perhaps I need a unique struct SwsContext per thread. Can I just do a memcpy(new_sws, sws, sizeof(struct SwsContext))?
[04:21:56 CEST] <cone-182> ffmpeg 03Steven Liu 07master:837580f458f2: avformat/dash: move reused API to common file and header file
[06:52:48 CEST] <cone-182> ffmpeg 03Anton Khirnov 07master:b12e4d3bb8df: avio: add a destructor for AVIOContext
[07:22:47 CEST] <cone-182> ffmpeg 03Anton Khirnov 07master:78a7af823b7c: Use the new AVIOContext destructor.
[10:10:34 CEST] <wm4> stevenliu: you could post the dash patch also as series of incremental patches, if that makes it easier
[10:24:41 CEST] <stevenliu> wm4: you mean: make avformat/dash and avformat/dashdec to [PATCH 1/2] [PATCH 1/2] ? do i get the mean?
[10:26:03 CEST] <wm4> stevenliu: yeah
[10:26:19 CEST] <stevenliu> Ok, Thanks wm4 :D
[10:27:10 CEST] <wm4> and I get that these patch iterations are painful... but I also think it's required to fix the remaining things
[10:31:24 CEST] <stevenliu> yes, only a little modify in a big patch, review and check is a hard work, just me know which part is modified, reviewers to review it is not easy :(
[10:41:10 CEST] <durandal_170> michaelni: how is multiplication done in 2d fft?
[10:44:45 CEST] <BtbN> atomnuker, you can, but with CUDA functions.
[10:46:11 CEST] <wm4> <rpw> I'm calling scale_slice in vf_scale.c from multiple pthreads. I'm getting a segmentation fault.
[10:46:12 CEST] <wm4> <rpw> Perhaps I need a unique struct SwsContext per thread. Can I just do a memcpy(new_sws, sws, sizeof(struct SwsContext))?
[10:46:13 CEST] <wm4> no
[10:46:21 CEST] <wm4> even the slice calls are stateful
[10:46:39 CEST] <wm4> they probably store at least a line of the previous call in memory for interpolation or so
[10:46:50 CEST] <nevcairiel> i learned that the hard way, sws isnt exactly thread friendly
[10:47:23 CEST] <wm4> doing just conversion without scaling would be pretty trivially threadable
[10:47:36 CEST] <wm4> except when it converts to 420p or some shit as intermediate step
[10:47:40 CEST] <wm4> (does it still do this?)
[10:48:05 CEST] <nevcairiel> most of the conversions i cared to possibly make faster involved scaling, like 420 to rgb
[10:48:48 CEST] <wm4> I was thinking format conversion without size (or subsampling) change
[10:48:51 CEST] <nevcairiel> ultimately I NIH'ed that with simple bilinear chroma scaling because factor 2 is so easy
[10:49:11 CEST] <wm4> I just use GPU shaders duh
[10:49:26 CEST] <nevcairiel> i tell people to do that, but some want software conversion
[10:49:57 CEST] <wm4> stupid people or dshow whatever restrictions?
[10:50:08 CEST] <nevcairiel> mostly stupid people i guess
[10:50:20 CEST] <nevcairiel> or people who want dumb post-processing things that d ont like yuv so they need rgb first
[10:50:57 CEST] <wm4> we have user shaders for that kind of stuff (and vapoursynth things for the more hopeless nerds)
[10:51:16 CEST] <nevcairiel> noone ever bothered to make a vapoursynth directshow filter
[10:51:38 CEST] <wm4> doom9 folks are still tripping on avisynth
[10:51:53 CEST] <nevcairiel> but user shaders are possible in many dshow players, but apparently hlsl is hard
[10:51:59 CEST] <JEEB> yup
[10:52:09 CEST] <JEEB> heck, MPC-HC even had an included shader editor
[10:52:12 CEST] <JEEB> which I went WTF at
[10:52:20 CEST] <nevcairiel> i never even opened that i dont think
[10:52:41 CEST] <JEEB> I once noticed it in the menus and double-blinked
[10:55:01 CEST] <BtbN> Does patchwork not pick up E-Mails with patches attached anymore?
[11:01:42 CEST] <cone-136> ffmpeg 03Timo Rothenpieler 07master:a0b69e2b0a7b: avcodec/nvenc: add support for specifying entropy coding mode
[11:01:42 CEST] <cone-136> ffmpeg 03Timo Rothenpieler 07master:0e995eac2035: avcodec/nvenc: only push cuda context on encoder close if encoder exists
[11:15:13 CEST] <cone-136> ffmpeg 03Timo Rothenpieler 07release/3.3:bab4cb3fb55e: avcodec/nvenc: only push cuda context on encoder close if encoder exists
[12:54:04 CEST] <cone-136> ffmpeg 03Steven Liu 07master:adeb41afb80f: avformat/dash:add copyright to dash.c
[14:10:06 CEST] <mobi> hi
[14:10:17 CEST] <mobi> Nobody used youtube live channels?
[14:10:18 CEST] <mobi> It is broken after this commit
[14:10:39 CEST] <mobi> http://trac.ffmpeg.org/ticket/6490
[14:17:06 CEST] <durandal_170> stevenliu: ^
[14:18:09 CEST] <stevenliu> let me see
[15:03:42 CEST] <stevenliu> mobi: here?
[15:03:52 CEST] <stevenliu> Wnat your email address?
[15:05:44 CEST] <stevenliu> be used to add author
[15:06:23 CEST] <mobi> Yes , why do you need my email ?
[15:06:32 CEST] <stevenliu> Yes
[15:06:51 CEST] <stevenliu> Submit and commit modify need your author info
[15:07:16 CEST] <stevenliu> for example me: Steven Liu <lq at chinaffmpeg.org>
[15:07:55 CEST] <mobi> this workround is not good. That makes live play only a little less broken
[15:12:30 CEST] <stevenliu> let me think about it, don't worry
[15:13:03 CEST] <stevenliu> will fix it sunday or monday
[15:16:29 CEST] <mobi> ok, thx
[15:20:37 CEST] <mobi> stevenliu: you dash patch need this : +OBJS-$(CONFIG_DASH_DEMUXER) += dash.o dashdec.o for build without mux
[17:28:14 CEST] <cone-638> ffmpeg 03James Almer 07master:877076ffa17b: avformat/avio: update avio_alloc_context() doxy
[17:44:03 CEST] <ubitux> for the ppl on ARM/ARM64 who want to cycle count stuff but don't want to mess with kernel modules
[17:44:08 CEST] <ubitux> i did https://github.com/ubitux/FFmpeg/compare/perf
[17:44:35 CEST] <ubitux> i'll probably make a build switch and submit
[17:44:38 CEST] <ubitux> but it's already usable
[17:45:48 CEST] <atomnuker> wouldn't it be more useful being in lavu/timer.h rather than checkasm only?
[17:46:30 CEST] <ubitux> atomnuker: that's what i initially did, but it's way too much pain
[17:46:51 CEST] <ubitux> not sure that's even possible without a lot of problems
[17:47:27 CEST] <ubitux> first because i need a "global" fd
[17:47:39 CEST] <ubitux> (and i don't want to if () in the read time)
[17:48:13 CEST] <ubitux> the other problem is syscall (which requires _GNU_SOURCE)
[17:48:44 CEST] <ubitux> the nop time and other heuristics are pretty different in timer.h too
[17:48:52 CEST] <ubitux> so having a common api is very tricky
[17:49:05 CEST] <ubitux> i couldn't have a common api within libavutil without some avpriv shit
[17:49:19 CEST] <ubitux> so in the end i put that stuff in checkasm where it matters the most
[17:49:33 CEST] <ubitux> you're not supposed to use START/STOP_TIMER anymore
[17:49:47 CEST] <ubitux> just write your stuff in checkasm so it has coverage and benchmark infrastructure
[17:52:16 CEST] <atomnuker> too complicated and doesn't offer non-synthetic input
[17:57:19 CEST] <cone-638> ffmpeg 03wm4 07master:5d7667475680: lavf: make avio_read_partial() public
[18:03:21 CEST] <ubitux> atomnuker: that's all i can propose, sorry
[18:09:55 CEST] <iive> ubitux: please, checkasm is not reliable benchmark, it is in essense a syntetic test and does not cover real world usage
[18:10:29 CEST] <ubitux> yeah well, you can ask me all you want, i don't have any other solution
[18:10:39 CEST] <ubitux> and checkasm is covering my use case
[18:11:35 CEST] <ubitux> any solution i could come up with as part of the lavu/timer.h thing had too much outstanding issues that couldn't be solved
[18:14:38 CEST] <jamrial> iive: to check if an asm function is faster or not than the c version it's more than enough
[18:18:26 CEST] <iive> once again, this is syntheric test.
[18:19:25 CEST] <ubitux> which shouldn't mean much given that you rarely have any condition
[18:22:06 CEST] <iive> ?
[18:22:46 CEST] <ubitux> whatever the input you perform most of the time the same instructions in the simd code
[18:23:00 CEST] <ubitux> i know there are a few exceptions
[18:23:27 CEST] <iive> well, it wasn't for the last SIMD i wrote.
[18:23:59 CEST] <iive> and i've had cases where some change is slightly better in the synthetic tests, but doesn't bode so well in ffmpeg.
[18:24:29 CEST] <BBB> why not use START/STOP_TIMER?
[18:24:36 CEST] <BBB> ubitux: ^
[18:24:48 CEST] <ubitux> i replied earlier
[18:25:06 CEST] <iive> that's where it started. should we have start/stop for arm that is implemented ugly
[18:25:19 CEST] <BBB> ubitux: oh is that arm only? so on x86 its still ok?
[18:25:22 CEST] <iive> or we should rely on checkasm benchmark
[18:25:23 CEST] <atomnuker> you usually only have one start/stop timer in the entire code, so having global context isn't that bad
[18:25:44 CEST] <ubitux> BBB: it's not arm only, it's linux only, but it's mostly useful for arm
[18:25:50 CEST] <ubitux> (and yes, i'll make a build switch)
[18:26:06 CEST] <BBB> on x86 I can still use START/STOP_TIMER, right?
[18:26:15 CEST] <ubitux> i'm not touching start/stop timer
[18:26:32 CEST] <iive> we are talking about extending it.
[18:26:32 CEST] <BBB> ok so I can still use it then, tnx
[18:26:47 CEST] <ubitux> i'm just making possible to run checkasm on arm/arm64
[18:27:10 CEST] <ubitux> without compiling and running an unreliable random kernel module found on github
[18:27:38 CEST] <iive> ubitux: it would be useful to have start/stop that could work with arm
[18:27:52 CEST] <ubitux> you can't
[18:28:14 CEST] <ubitux> unless you use kernel API like i did, which requires various context stuff
[18:28:30 CEST] <BBB> ok, cool
[18:28:41 CEST] <ubitux> anyway, gtg
[18:28:42 CEST] <BBB> I probably misunderstood the earlier comment, sorry about that
[18:28:52 CEST] <BBB> cya
[18:40:46 CEST] <TimothyGu> FYI https://patchwork.ffmpeg.org/ still uses the old StartSSL certs that's blocked by Chrome etc.
[18:41:13 CEST] <BtbN> works for me
[18:41:13 CEST] <TimothyGu> michaelni: ^^
[18:41:41 CEST] <TimothyGu> I'm using Chrome beta if that matters
[18:42:11 CEST] <TimothyGu> but either way we should switch to let's encrypt, which ffmpeg.org already uses
[18:43:05 CEST] <michaelni> TimothyGu, talk with reimar, he did setup the lets encrypt stuff
[18:44:30 CEST] <TimothyGu> michaelni: is he in control of the patchwork server?
[18:45:50 CEST] <TimothyGu> BtbN: curl https://patchwork.ffmpeg.org/ also fails on debian stable
[18:48:01 CEST] <michaelni> TimothyGu, everyone with root access to the main server should have root access to patchwork
[19:07:57 CEST] <cone-638> ffmpeg 03James Almer 07master:3ec6d9c6b297: avfilter: remove duplicate and disabled trace log function
[19:38:03 CEST] <durandal_170> michaelni: i dont get simple horizontal only fft to match reference
[19:38:17 CEST] <durandal_170> michaelni: do I need to padd with zeroes or what?
[19:50:54 CEST] <cone-638> ffmpeg 03James Almer 07master:9aa24699302c: avcodec/internal: move FF_QSCALE_TYPE defines from avcodec.h
[19:57:32 CEST] <BtbN> There is a bug somewhere in ffmpeg. When transcoding interlaced, from mkv to mkv. Input: Stream #0:0[0x100]: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p(top first) Output: Stream #0:0(eng): Video: h264 (Main), yuv420p(top coded first (swapped))
[19:57:41 CEST] <michaelni> if the code doesnt pad with zeros then it likely needs to
[19:57:56 CEST] <BtbN> If I transcode to .ts instead, it says (top first) on the output as well
[19:58:05 CEST] <BtbN> So I guess something with the mkv muxer messes this up, somehow
[19:59:19 CEST] <BtbN> It can't be the codec. It happens with both libx264 and nvenc.
[19:59:27 CEST] <jamrial> BtbN: the field_order stuff is wrong atm, library wide
[19:59:31 CEST] <jamrial> there's a patch in the ml
[20:00:14 CEST] <BtbN> It works fine if I mux to ts
[20:00:19 CEST] <BtbN> And is broken if I mux to mkv
[20:04:22 CEST] <nevcairiel> thats because ts doesnt store any metadata whatsoever
[20:06:25 CEST] <durandal_170> michaelni: see http://www.ft.unicamp.br/docentes/magic/khoros/html-dip/c5/s2/dft-w.gif of this I get only right half
[20:06:34 CEST] <BtbN> The patch on the ML would not fix mkvenc though. It only changes some labels
[20:16:01 CEST] <jamrial> BtbN: probably https://git.videolan.org/?p=ffmpeg.git;a=blob;f=ffmpeg.c;h=ccb6638e0a4140d1349565691dacb4ba10268799;hb=HEAD#l1228 then
[20:16:42 CEST] <BtbN> Yes. Now that just looks plain wrong
[20:17:00 CEST] <jamrial> source mkv is TT, ffmpeg.c changes it to TB
[20:20:11 CEST] <durandal_170> atomnuker: simple horizontal fft should be symetrical, but it is not for me for some reason
[20:24:43 CEST] <BtbN> jamrial, I wonder if that is just plain wrong then. And should be dropped
[20:25:42 CEST] <BtbN> That AV_FIELD_TB/BT mode seems highly uncommon to me.
[20:26:55 CEST] <jamrial> BtbN: i don't know, to be honest. commit dcbf72836c9 added it but without any explanation
[20:28:11 CEST] <jamrial> send a patch to remove it, arguing it looks like weird heuristics
[20:55:31 CEST] <BtbN> jamrial, or maybe the heuristic is reversed?
[20:56:18 CEST] <jamrial> BtbN: i send a patch to reverse it. it wasn't popular
[20:56:42 CEST] <jamrial> s/send/sent
[20:57:22 CEST] <BtbN> The current version is clearly broken
[20:59:54 CEST] <doublya> I need some help with multithreading scale_slice in vf_scale.c. I'm testing two with slices on two threads. I'm getting a seg fault from desc[i].process, I'm looking through the possible function that coule be assigned to function pointer process.
[21:01:28 CEST] <wm4> wasn't this answered a few days ago? the answer is it can't and won't work
[21:02:57 CEST] <doublya> are you talking to me?
[21:04:32 CEST] <mobi> why come the fix in git? https://patchwork.ffmpeg.org/patch/4386/
[21:04:47 CEST] <mobi> why come the fix not in git? https://patchwork.ffmpeg.org/patch/4386/
[21:06:30 CEST] <wm4> doublya: yes
[21:25:39 CEST] <wm4> if you want that, a reasonable way would be adding threading directly to libswscale
[21:27:45 CEST] <durandal_170> wm4: slice threading is already in swscale
[21:28:54 CEST] <wm4> well that makes it easier
[21:29:31 CEST] <durandal_170> there was even gsoc project for it
[21:44:05 CEST] <wm4> I guess vf_scale could do frame threading easily... is there libavfilter frame threading support?
[21:45:13 CEST] <durandal_170> not yet
[21:51:45 CEST] <doublya> wm4: I need slice threading per frame. It seems as vf_scale.c may have had though of line by line threading via scale_slice function. If you're suggesting threading libswscale, it seems to me that the for loop on line 370 of swscale.c would be a good place.
[21:52:55 CEST] <doublya> I was thinking the SwsContext would be the only needed per thread when calling scale_slice (vf_scale.c)
[21:54:10 CEST] <wm4> as durandal_170 is saying this already exists
[21:54:23 CEST] <durandal_170> doublya: have you looked at sws_scale() in header?
[21:56:15 CEST] <durandal_170> doublya: yes, you can parallelize with execute only if nb_slices is set
[21:56:51 CEST] <durandal_170> note that it is marked currently for debug purposes only
[21:57:13 CEST] <durandal_170> ask michaelni why, but you can test slice scaling already
[22:01:45 CEST] <doublya> I'm just bypassing the if(scale->interlaced>0... block in vf_scale.c I'm threading with pthreads. Calling scale_slice from worker function on each p thread. testing with two threads. Seg fault happens in one of the functions in vscale.h .. the call to desc[i].process near the end of swscale function
[22:05:11 CEST] <durandal_170> doublya: why? there is internal threading api. use that
[22:06:55 CEST] <wm4> doublya: for the thousands time, it won't work and can't wolk
[22:06:57 CEST] <wm4> work, even
[22:08:10 CEST] <doublya> durandal_170: What are you suggesting exactly?
[22:08:24 CEST] <doublya> wm4: Well I need it. So maybe I need to write my own filter.
[22:11:01 CEST] <durandal_170> doublya: look at other filters, that call execute, they have own slice threading flag
[22:11:46 CEST] <durandal_170> libavfilter/vf_yadif.c: ctx->internal->execute(ctx, filter_slice, &td, NULL, FFMIN(h, ff_filter_get_nb_threads(ctx)));
[22:14:44 CEST] <kiroma> Does the `framerate` filter interpolate all frames from the input, or only two closest to interp_start/interp_end values?
[22:17:41 CEST] <durandal_170> kiroma: iirc those parameters have nothing to do with how many frames will be used from input
[22:20:32 CEST] <doublya> wm4: I looked at vf_yadif.c per durandal_170. Similar to what I'm doing already no?
[22:23:28 CEST] <doublya> I'll test it out
[23:49:12 CEST] <durandal_170> michaelni, atomnuker: i resolved bunch of issues switching to av_fft, now I keep getting output in only 2 quadrants, topleft and bottomright, 2 missing quadrants are overlayed over existing ones
[23:56:26 CEST] <atomnuker> so 2 quadrants are correct and 2 look like what?
[23:57:18 CEST] <durandal_170> atomnuker: i can get all quadrants, they are just overlaid onto each other
[23:58:48 CEST] <atomnuker> overlaid?
[00:00:00 CEST] --- Sat Sep 2 2017
More information about the Ffmpeg-devel-irc
mailing list