[Ffmpeg-devel-irc] ffmpeg-devel.log.20190827
burek
burek at teamnet.rs
Wed Aug 28 03:05:07 EEST 2019
[00:00:52 CEST] <lotharkript> are you saying that UB and timeout should not be fixed? or just timeout?
[00:00:55 CEST] <nevcairiel> if you feel like you need to count, you are missing the point anyway
[00:01:38 CEST] <baptiste> I think you are being stubborn as well
[00:02:12 CEST] <baptiste> maybe you should go and fix some fuzzer detected issues
[00:02:42 CEST] <lotharkript> For timeout, i agreed that the fuzzer does not make the difference between an infinite loop (DoS) and long time to decode.. So what do you propose?
[00:02:57 CEST] <lotharkript> Should we ignore the long time to decode and fix only infinite loop?
[00:03:31 CEST] <BtbN> Someone should look at the issue and judge them.
[00:03:41 CEST] <BtbN> And not blindly fix everything the fuzzer spits out
[00:04:19 CEST] <lotharkript> ok.. So, if the bug is about take too long to decode, should we leave it as Working as Intended?
[00:04:47 CEST] <lotharkript> leave it open?
[00:04:48 CEST] <nevcairiel> It entirely depends, is the point.
[00:05:12 CEST] <BtbN> Obviously a way to make a decoder jump into an infinite loop needs fixing.
[00:05:27 CEST] <BtbN> Something that makes it take 2 or 10 seconds longer... very much depends
[00:05:35 CEST] <lotharkript> ok.. I agree.. Then when the patch is submitted for review, should we talk about the "Take too long to decode" and come up with some solution?
[00:05:56 CEST] <lotharkript> or should we ignore a 10s time out for decoding a video frame?
[00:06:01 CEST] <nevcairiel> The problem is that "security" is used as an argument to white-wash any patch
[00:06:08 CEST] <nevcairiel> because who dare argue against security fixes
[00:06:59 CEST] <lotharkript> i understand that.. Right now, I'm trying to find a way about the time out.
[00:07:05 CEST] <BtbN> First step would be to move fuzzing off of the very secluded security ml. To a dedicated fuzzing ml, and give every known dev/maint access.
[00:07:12 CEST] <lotharkript> What is the best way to deal with timeout found by fuzzer?
[00:07:38 CEST] <BtbN> There are a whole bunch of timeouts you can probably just safely ignore
[00:08:58 CEST] <nicolas17> BtbN: would it be possible to make fuzz-found timeouts public while fuzz-found crashes stay private?
[00:09:09 CEST] <BtbN> Don't ask me
[00:09:18 CEST] <BtbN> imo it should just all stay private, but not THAT private as it is right now
[00:09:25 CEST] <BtbN> Just treat it the same like the coverity results
[00:09:59 CEST] <lotharkript> nicolas17: yes it is possible.. But i was asked to wait before doing the pull request
[00:20:38 CEST] <jamrial> the cfr -> vfr qtrle change was indeed to "fix" a fuzzer reported timeout
[00:20:59 CEST] <jamrial> something that should no longer be reproducible ever since i made the fuzzer use ref counted buffers
[00:21:08 CEST] <jamrial> and even more so if "avcodec/qtrle: call ff_reget_buffer() only when the picture data is going to change" is applied as well
[00:21:21 CEST] <BtbN> a fuzzer patch that changes behaviour, even arguable off-spec, is a no-go
[00:21:36 CEST] <nevcairiel> and now its being argued for as a generic performance improvement, just to justify it
[00:21:38 CEST] <jamrial> so what people complain about is that the output of a decoder was changed to shut up a timeout report
[00:22:03 CEST] <jamrial> BtbN: exactly
[00:22:06 CEST] <BtbN> This feels the wrong way around to me. There should be arguments about including the patch. Not about reverting it again.
[00:22:16 CEST] <BtbN> The revert is obvious, and then you can argue.
[00:22:25 CEST] <BtbN> About the if and how to do it properly.
[00:22:40 CEST] <nevcairiel> I have been arguing against the CFR changes for months, but in the beginning no one else cared, so it was me against "security" so it went in anyway
[00:22:46 CEST] <nevcairiel> I forgot what codec that was even on
[00:22:53 CEST] <nevcairiel> At least now more people care
[00:28:00 CEST] <jamrial> i'm not happy that i had to go and take direct action (aka, write a patch to make the fuzzer use the proper and recommended api myself) to effectively call the wave of silly 2 second timeout reports into question
[00:28:23 CEST] <jamrial> it wasn't me who realized the fuzzer wasn't using ref counted buffers. it wasn't me who complained about it for months
[00:28:43 CEST] <BtbN> I didn't even realize the code that did that was part of ffmpeg itself
[00:28:43 CEST] <jamrial> so why was i the one that wrote a trivial patch to deal with it instead of those who raised the issue?
[00:54:21 CEST] <lotharkript> what if we can assign the timeout bug to the owner of the file? And let him decide what to do?
[00:58:36 CEST] <lotharkript> It seems I can assign the bug to people. Will it be ok if i CC the maintainer of the codec for timeout? at this point, the dev should have access to the reproduce data
[00:59:55 CEST] <lotharkript> any one interested to see if this will work?
[01:00:02 CEST] <jamrial> the maintainers list is not exactly up to date, or accurate anymore
[01:00:23 CEST] <jamrial> you'll probably end up ccing someone that hasn't looked at ffmpeg codebase in years
[01:00:31 CEST] <lotharkript> can we then update the list?
[01:00:53 CEST] <lotharkript> for example, who is VC1? or MotionPixels?
[01:00:59 CEST] <lotharkript> or TAK?
[01:01:11 CEST] <lotharkript> or ARBC?
[01:01:19 CEST] <jamrial> tak is durandal_1707
[01:01:22 CEST] <nevcairiel> the majority of those probably don't have an active maintainer
[01:01:58 CEST] <nevcairiel> codecs that barely need changes over years don't get adopted when their original authors leave
[01:02:10 CEST] <lotharkript> Durandal_1707: do you want me to CC you on one of those time bug? So we can see if you can have access to it?
[01:02:19 CEST] <lotharkript> should we then not fuzz them?
[01:02:28 CEST] <lotharkript> and maybe fuzz only the ones with a maintainer?
[01:04:34 CEST] <nevcairiel> eh security issues can of course still happen in all parts of the code
[01:04:34 CEST] <durandal_1707> for tak or arbc timeouts cc me
[01:05:01 CEST] <lotharkript> ok.. Let's try one..
[01:05:12 CEST] <jamrial> i think the above about ignoring timeouts below a given threshold sounds better than skipping stuff from files without maintainer
[01:05:35 CEST] <jamrial> but no idea if that should be automated, or someone deciding if it's not worth looking at
[01:05:51 CEST] <jamrial> two second timeouts sound silly either way
[01:07:13 CEST] <lotharkript> The overall timeout for fuzzer is 25s..
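
The threshold idea discussed above could be sketched roughly as follows, in Python. The 25-second limit is the fuzzer's overall timeout quoted by lotharkript, and the 2-second cutoff is taken from jamrial's remark; the function name `triage_timeout` and the category labels are hypothetical, chosen only for illustration. Whether such triage should be automated at all was left open in the discussion.

```python
FUZZER_TIMEOUT_S = 25.0  # overall fuzzer timeout mentioned in the log
IGNORE_UP_TO_S = 2.0     # assumed cutoff, based on "two second timeouts sound silly"


def triage_timeout(elapsed_s, finished):
    """Classify one fuzzer timeout report (hypothetical helper).

    elapsed_s: wall-clock seconds the decoder ran on the sample
    finished:  True if decoding completed, False if the fuzzer gave up on it
    """
    if not finished or elapsed_s >= FUZZER_TIMEOUT_S:
        # Never completed within the overall limit: possible infinite loop.
        return "needs-fix"
    if elapsed_s <= IGNORE_UP_TO_S:
        # Finite and short: probably safe to ignore.
        return "ignore"
    # Finite but slow: someone has to look at the issue and judge it.
    return "judge"
```

With this split, only reports in the "judge" bucket would need a person to decide whether the decode time is acceptable.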
[01:10:02 CEST] <cone-077> ffmpeg 03Aman Gupta 07master:0821bc4eee25: avcodec/vaapi_encode: respect -force_key_frames setting
[09:28:26 CEST] <thardin> that doesn't seem like the proper reason to close a ticket
[09:31:22 CEST] <Compnn> baptiste makes a rare appearance
[09:32:53 CEST] <thardin> so I saw
[09:33:15 CEST] <thardin> I had wanted to run that mxf d-10 thing past him
[09:33:34 CEST] <thardin> hard to do while asleep
[09:34:23 CEST] <durandal_1707> all tickets without samples should be closed asap
[09:44:32 CEST] <thardin> why not just assign it to the reporter?
[09:47:13 CEST] <thardin> it's marked as an important regression, seems strange to close it
[11:02:22 CEST] <rcombs> anyone seen "error registering an input resource: unimplemented (22)" from nvenc? I'm getting it only when pairing nvenc with nvdec, with hwaccel_output_format=cuda
[11:04:03 CEST] <rcombs> seems to happen regardless of any filters
[11:04:29 CEST] <rcombs> oddly, it seems to happen on the _second_ frame, despite the params being basically identical
[11:10:00 CEST] <BtbN> that sounds odd
[11:10:05 CEST] <BtbN> never encountered that
[11:10:10 CEST] <BtbN> What GPU and Driver?
[11:11:31 CEST] <rcombs> Quadro P2000
[11:11:36 CEST] <rcombs> driver, uh
[11:11:56 CEST] <rcombs> idk but the date is from july
[11:12:02 CEST] <rcombs> this is on windows
[11:12:49 CEST] <rcombs> I've also had some very bizarre issues when trying to run this in a 32-bit windows process (segfaults in nvenc; can't tell where because gdb doesn't get any useful function bounds from DLLs)
[11:13:28 CEST] <rcombs> (though the segfaults only happen when I have a scale_cuda filter)
[11:13:53 CEST] <rcombs> this is in 64-bit, which doesn't crash, but gives this error
[11:14:30 CEST] <rcombs> hard to tell if it's an ffmpeg bug or an nvenc driver bug; not sure if I should poke nvidia or not
[11:15:51 CEST] <rcombs> making me miss VAAPI and open-source drivers
[11:15:59 CEST] <JEEB> yea :/
[11:16:02 CEST] <BtbN> I'd try and get the latest driver first
[11:16:04 CEST] <JEEB> you do have the closed blob at the end often
[11:16:10 CEST] <JEEB> but still, you have a lot of stuff semi-visible
[11:16:34 CEST] <rcombs> the installed driver is newer than the one in the current CUDA installer, at least
[11:16:56 CEST] <BtbN> Those are usually super outdated
[11:17:02 CEST] <rcombs> also wait the date is actually may
[11:17:13 CEST] <BtbN> Current driver for it is 436.02 from a week ago
[11:17:17 CEST] <rcombs> windows says "the best drivers for your device are already installed"
[11:17:21 CEST] <rcombs> is it lying
[11:17:36 CEST] <rcombs> and how do I check the driver version
[11:17:38 CEST] <BtbN> Why would Windows be able to tell what the latest nvidia driver is?
[11:17:46 CEST] <BtbN> You put the GPU in here: https://www.nvidia.de/drivers/beta
[11:18:04 CEST] <rcombs> idk it claims to know
[11:18:13 CEST] <BtbN> You must be new to Windows :P
[11:18:18 CEST] <rcombs> yes
[11:18:21 CEST] <rcombs> please let me stay that way
[11:18:22 CEST] <JEEB> that's what windows update has basically
[11:18:30 CEST] <BtbN> Not even what Windows Update has
[11:18:30 CEST] <JEEB> whatever nvidia pushed to MS at some point in history
[11:18:42 CEST] <BtbN> it's what driver it has locally for that device
[11:19:17 CEST] <rcombs> this is, uh, 430.86, I guess
[11:19:30 CEST] <BtbN> Yeah, you want to update that
[11:19:42 CEST] <BtbN> P2000 is a Notebook GPU, right?
[11:19:45 CEST] <rcombs> I'll try the newer version but forgive me for not being especially optimistic about this happening to have been fixed in the past few months
[11:19:46 CEST] <BtbN> Is it paired with an Intel one?
[11:20:15 CEST] <rcombs> not notebook, but there's also an intel GPU in this machine yes
[11:20:41 CEST] <BtbN> Cause when I put P2000 into the NVidia Page, it sends me to the mobile driver
[11:21:05 CEST] <BtbN> And Laptop-Optimus-Setups often have the video engine in the nvidia GPU partially or entirely disabled
[11:21:16 CEST] <rcombs> idk maybe it's a mobile part but it's on a card in this case
[11:21:25 CEST] <rcombs> I'm RDP'd into a coworker's machine poking at this
[11:21:40 CEST] <BtbN> Yeah, I'd update the driver and reboot
[11:21:50 CEST] <BtbN> Nvidia won't accept any bugreports otherwise anyway
[11:21:54 CEST] <nevcairiel> quadro P2000 exists both as mobile and desktop part
[11:22:36 CEST] <rcombs> the "recommended" one is 431.70, from late july
[11:22:45 CEST] <nevcairiel> its (surprisingly) a pascal part, similar to a 1060
[11:22:48 CEST] <rcombs> I take it I want the beta anyway?
[11:22:55 CEST] <BtbN> what do you mean, recommended?
[11:23:09 CEST] <rcombs> I mean there's a little "RECOMMENDED" label next to it on nvidia's search page
[11:23:39 CEST] <BtbN> 436.02 isn't Beta, it's just quite new
[11:23:50 CEST] <BtbN> I'm using it at home and it runs fine
[11:24:04 CEST] <nevcairiel> quadro uses different drivers, they dont have 436 yet
[11:24:12 CEST] <rcombs> well it's offering me it
[11:24:14 CEST] <BtbN> They do, at least on their download page
[11:24:23 CEST] <BtbN> https://www.nvidia.de/drivers/results/150305
[11:24:32 CEST] <rcombs> just "recommending" 431
[11:24:44 CEST] <nevcairiel> oh i selected the stable driver instead of the "new features" driver
[11:24:56 CEST] <rcombs> it's not listed as "WHQL", whatever that means
[11:25:07 CEST] <BtbN> Certified by Microsoft
[11:25:13 CEST] <rcombs> how do people actually use this platform to, like, do things
[11:25:24 CEST] <nevcairiel> it like, just works for us :D
[11:25:41 CEST] <rcombs> [insert joke about shipping your machine]
[11:25:45 CEST] <nevcairiel> its always the people that dont use it that somehow attract weird issues
[11:25:50 CEST] <BtbN> It gets specially fun when you realize windows 10 has two different GPU driver models
[11:26:02 CEST] <BtbN> and depending on which one your driver was initially installed, you need to download different ones
[11:26:25 CEST] <BtbN> Classic vs. DCH
[11:26:44 CEST] <rcombs> this is not what I would describe as an excellent sales pitch
[11:26:54 CEST] <rcombs> installing driver now
[11:26:54 CEST] <BtbN> from what I gather, you want to stick with Classic
[11:27:17 CEST] <nevcairiel> I dont think it ultimately makes much of a difference
[11:27:25 CEST] <BtbN> DCH performs notably worse
[11:27:27 CEST] <rcombs> I didn't get an option along those lines
[11:27:36 CEST] <BtbN> I don't think Quadro has DCH drivers yet
[11:27:46 CEST] <rcombs> but perf doesn't really matter here, I'm just trying to get a setup working that can test this code
[11:27:52 CEST] <nevcairiel> from what I can tell the only difference is that you get the control panel from the windows store in DCH mode
[11:28:13 CEST] <nevcairiel> the actual driver itself is even the same
[11:28:18 CEST] <nevcairiel> its just packaging that varies
[11:28:24 CEST] <rcombs> oh it actually updated the driver without needing a reboot
[11:28:38 CEST] <BtbN> you should still reboot
[11:28:44 CEST] <BtbN> nvenc is iffy sometimes when you don't
[11:29:23 CEST] <rcombs> same error as before, so sure I'll try a reboot
[11:31:10 CEST] <BtbN> The driver returning 22 from nvEncRegisterResource is also very likely not an ffmpeg issue
[11:31:46 CEST] <rcombs> the weird thing is that it's so specific (only happens if pipelining from nvdec)
[11:31:59 CEST] <BtbN> Well, in no other case does nvenc get cuda frames as input
[11:32:07 CEST] <BtbN> And that's the codepath where it happens
[11:32:18 CEST] <BtbN> You could alternatively try to decode using d3d11va, and pass in d3d frames
[11:32:39 CEST] <rcombs> it doesn't happen if I explicitly use a hwupload filter
[11:32:50 CEST] <BtbN> Oh
[11:32:51 CEST] <rcombs> I checked, it's taking the CUDA-input code path in that case
[11:33:21 CEST] <nevcairiel> nvdec and nvenc sharing surfaces has been a bit wonky in the past, iirc
[11:33:39 CEST] <rcombs> lemme see what happens if I hwdownload->hwupload, just for shits n' gigs
[11:34:18 CEST] <BtbN> Yeah, nvdec frames are a bit special
[11:34:28 CEST] <BtbN> but they _did_ work when I made them that way
[11:34:29 CEST] <rcombs> oh also, in mintty is there a reasonable way to move the cursor faster
[11:34:45 CEST] <JEEB> rcombs: btw I have my first version of TTML-in-MP4 that is not a complete monstrosity boiling up (other than the spec being spörs splräsh wtf)
[11:34:47 CEST] <rcombs> (or, I assume this is mintty)
[11:34:49 CEST] <BtbN> No idea, I don't use mintty anymore
[11:35:22 CEST] <rcombs> JEEB: like, the _code_ isn't a monstrosity? 'cause I'm pretty sure there's gonna be some monstrous shit involved no matter what there
[11:35:35 CEST] <JEEB> yes
[11:35:40 CEST] <BtbN> I'm at work right now, without a nvidia GPU close
[11:35:42 CEST] <JEEB> the code isn't completely evil incarnate
[11:35:48 CEST] <BtbN> so I can't test, but will do once I get home
[11:35:51 CEST] <JEEB> it's just stupid because you need to squash packets into one
[11:36:07 CEST] <BtbN> rcombs, try decoding with the old cuvid decoders
[11:36:09 CEST] <rcombs> okay yeah if I hwdownload->hwupload it's fine
[11:36:52 CEST] <nevcairiel> if you're on windows anyway and don't care about 12-bit or 444 support, just use d3d11va
[11:37:15 CEST] <BtbN> I'm a bit worried nvidia broke the way we operate nvdec in some update :/
[11:37:42 CEST] <nevcairiel> ultimately all you do is copy nvdec output onto a generic cuda array, no? its not directly tied to nvdec anymore after that?
[11:37:49 CEST] <BtbN> It is
[11:38:03 CEST] <BtbN> There is a special buffer context, that drags the nvdec context along, to avoid a copy
[11:38:13 CEST] <BtbN> so the frames coming out of nvdec are the mapped nvdec frame
[11:38:53 CEST] <BtbN> Without doing that, nvdec reaches only ~20% of the performance the cuvid decoder has
[11:39:19 CEST] <rcombs> nevcairiel: need scale+deint
[11:39:35 CEST] <BtbN> rcombs, cuvid might be just what you want then
[11:39:37 CEST] <nevcairiel> too bad I was too lazy to finish d3d11va vpp filter
[11:39:41 CEST] <JEEB> rcombs: it is just dumb but lol http://up-cat.net/p/6bf94b85
[11:39:44 CEST] <BtbN> the cuvid decoders have scale + deint built in
[11:39:59 CEST] <rcombs> fucking
[11:40:00 CEST] <nevcairiel> its somewhere on my list to actually write that thing
[11:40:04 CEST] <rcombs> you're giving me PTSD flashbacks
[11:40:15 CEST] <rcombs> to android shit
[11:40:23 CEST] Action: rcombs deep breaths
[11:40:33 CEST] <rcombs> at least it's not nvidia's android thing where they put the scaler IN THE ENCODER
[11:40:33 CEST] <JEEB> (I love it how track_id 3 gets printed as zero since there is no packets muxed yet so it hasn't decided yet?)
[11:40:37 CEST] <BtbN> though deint in the decoder is a bit troublesome, due to a decoder not being intended to double the framerate
[11:40:38 CEST] <JEEB> :D
[11:40:48 CEST] <BtbN> oh, nvenc absolutely does have a scaler
[11:40:53 CEST] <BtbN> ffmpeg just does not utilize it for anything
[11:40:58 CEST] <rcombs> good
[11:41:15 CEST] <rcombs> believe you me, scalers in lavc encoders were not meant to be
[11:41:22 CEST] <rcombs> I have some very evil hacks to support this shit
[11:43:02 CEST] <rcombs> but anyway, long-term I'm gonna need more filtering stuff as well (overlay, tonemap, etc), so I don't particularly want to rely on cramming everything in the decoder
[11:43:17 CEST] <BtbN> nevcairiel, http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/nvdec.c;h=b60da24301dffbf2849a885f2dc01a713c4a0c7c;hb=HEAD#l404 does most of that
[11:43:17 CEST] <rcombs> and I've been having decent success with nvdec+cuda filters+nvenc on linux
[11:43:49 CEST] <BtbN> I kinda wish there was a way to make OpenCL operate on CUDA frames without a roundtrip copy via system RAM
[11:44:37 CEST] <rcombs> same, but I'm considering putting together a sort of compatibility layer that lets you write kernels that work on both
[11:44:48 CEST] <rcombs> just a .h with a bunch of #defines for each
[11:45:05 CEST] <rcombs> BtbN: same error with the cuvid decoder
[11:45:34 CEST] <rcombs> [[aforementioned worries intensify]]
[11:45:35 CEST] <BtbN> ok, it's not the weirdly mapped frame then, cause cuvid does not do that.
[11:46:04 CEST] <rcombs> lemme try with the download+upload
[11:46:28 CEST] <rcombs> yeah it's fine if I do that
[11:46:33 CEST] <BtbN> Did something break in ffmpeg that makes nvenc use the wrong CUDA Context?
[11:46:48 CEST] <BtbN> I'll have to investigate this later
[11:47:06 CEST] <BtbN> Cause yeah: http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/cuviddec.c;h=acee78cf2cb74b8c93e01ea01c1936dc8b045eff;hb=HEAD#l527
[11:47:31 CEST] <BtbN> cuviddec literally makes a new frame and memcopies the data over. Equivalent to hwdownload+upload without the RAM roundtrip
[11:48:44 CEST] <rcombs> I've also got a scaler in the filter chain here, with src=dst res, which is effectively a GPU-side memcpy
[11:48:48 CEST] <rcombs> (same error with or without it)
[11:48:57 CEST] <rcombs> but any mapping oddities shouldn't survive that, right?
[11:49:03 CEST] <BtbN> scale_cuda should detect that case and pass through the frames untouched
[11:49:09 CEST] <rcombs> should; afaik it doesn't
[11:49:23 CEST] <nevcairiel> oh so cuviddec did the thing i was thinking of
[11:49:53 CEST] <BtbN> nvdec maps the frame and then constructs a frame that unmaps it on free
[11:50:04 CEST] <BtbN> cuvid just copies it
[11:50:22 CEST] <BtbN> cuvid can do that because it's a full fledged decoder, and thus can delay the copy until the GPU is ready
[11:50:39 CEST] <BtbN> nvdec can't, and doing the copy as late as possible is still too early and forces the GPU to sync and stall
[11:50:43 CEST] <rcombs> I'm a little behind master, but there doesn't seem to be anything relevant in between
[11:50:53 CEST] <BtbN> nah, there hasn't been any changes there lately
[11:51:08 CEST] <rcombs> ¯\_(ツ)_/¯ that kind of mapping/free thing is pretty common in these
[11:51:19 CEST] <rcombs> it's perfectly reasonable
[11:51:22 CEST] <nevcairiel> I think we could devise a way to have delay in the generic code of avcodec before the postprocess callback is called
[11:51:34 CEST] <BtbN> Hacking it into a AVFrame is not common though
[11:51:34 CEST] <nevcairiel> if it would be useful
[11:51:42 CEST] <rcombs> isn't it?
[11:51:43 CEST] <nevcairiel> but i suppose this method works
[11:52:14 CEST] <BtbN> To keep the frame mapped, the cuvid and cuda context need to stay alive as well
[11:52:14 CEST] <rcombs> I've done it on android
[11:52:14 CEST] <rcombs> yeah, I've done that
[11:52:14 CEST] <BtbN> so the frames also carry a buffer ref to those, so they don't get freed too early
[11:52:22 CEST] <rcombs> yup, same with my android thing
[11:52:32 CEST] <BtbN> I've never seen anything else do that
[11:52:49 CEST] <BtbN> But since it also happens with cuviddec, which does not do any such thing to begin with, that can't be the issue
[11:52:51 CEST] <rcombs> though in that case I'm just passing around mapped pointers, not GPU surfaces or anything
[11:53:26 CEST] <rcombs> (I'm told there's _some_ way to get surfaces out of mediacodec decoders, and to pass them into mediacodec encoders, but fuck if I can find any decent docs on how)
[11:53:46 CEST] <rcombs> (and some of that stuff's only available from java because fuck)
[11:54:38 CEST] <rcombs> but yeah, what you said
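
The lifetime scheme BtbN describes (a mapped frame that unmaps on free and carries a reference to the decoder context so the context is not freed too early) can be modelled in a few lines of Python. This is only a toy model of the idea, not the FFmpeg API: the `RefCounted` and `MappedFrame` classes, the `demo` function, and the log strings are all hypothetical, and a single "decoder" context stands in for the cuvid and CUDA contexts mentioned in the log.

```python
class RefCounted:
    """Tiny stand-in for a reference-counted buffer (not the FFmpeg API)."""

    def __init__(self, name, on_free):
        self.name = name
        self.refs = 1  # the creator holds the first reference
        self._on_free = on_free

    def ref(self):
        self.refs += 1
        return self

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self._on_free(self.name)


class MappedFrame:
    """A frame that stays mapped and keeps its decoder context alive."""

    def __init__(self, index, decoder_ctx, log):
        self.index = index
        self._decoder = decoder_ctx.ref()  # the frame drags the context along
        self._log = log
        self._freed = False
        log.append(f"map frame {index}")

    def free(self):
        if self._freed:
            return
        self._freed = True
        self._log.append(f"unmap frame {self.index}")  # unmap on free
        self._decoder.unref()  # may release the context


def demo():
    log = []
    decoder = RefCounted("decoder", lambda name: log.append(f"free {name}"))
    frames = [MappedFrame(i, decoder, log) for i in range(2)]
    decoder.unref()  # the decoder is closed first...
    for frame in frames:  # ...but its context survives until the last frame
        frame.free()
    return log
```

In this model the context is released only after the last outstanding frame has been unmapped, which is the ordering the buffer ref is meant to guarantee.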
[11:55:14 CEST] <BtbN> Does it also happen if you do just cuviddec -> nvenc without anything in between?
[11:55:46 CEST] <rcombs> lemme see
[11:56:02 CEST] <BtbN> Cause if not, it has to be something the scale filter does
[11:57:17 CEST] <rcombs> yup
[11:57:26 CEST] <BtbN> I have an NVidia GPU in my desktop again since yesterday, so I'll be able to test once I get home
[11:57:33 CEST] <BtbN> My AMD adventures were short and painful.
[11:57:52 CEST] Action: rcombs sighs
[11:57:58 CEST] <rcombs> I'll probably have to do some AMF stuff soon
[11:58:05 CEST] <BtbN> Do you need a GPU?
[11:58:09 CEST] <BtbN> I happen to have two
[11:58:31 CEST] <rcombs> possibly, but I can probably expense one if it comes up
[11:58:41 CEST] <BtbN> AMD sent me two dev GPUs for free
[11:58:49 CEST] <BtbN> so, I can literally just send you one
[11:59:01 CEST] <BtbN> One RX5700 and one RX590
[11:59:17 CEST] <rcombs> now I kinda want your dev contacts more than I want your GPUs
[11:59:48 CEST] <BtbN> They asked here in this channel a while ago, looking for developers interested in doing AMF stuff, and sent everyone a round of GPUs
[12:00:04 CEST] <rcombs> oh, totally missed that
[12:00:14 CEST] <rcombs> but yeah I might write an AMF decoder to get around needing to deal with mesa
[12:00:26 CEST] <nevcairiel> me too, i bought a 560 earlier this year for $work =p
[12:00:27 CEST] <rcombs> (have you tried to build mesa)
[12:00:35 CEST] <BtbN> The RX5700 is a hot mess on Windows
[12:00:43 CEST] <rcombs> (it DEPENDS ON LLVM)
[12:00:44 CEST] <nevcairiel> its an AMD gpu, its expected
[12:00:46 CEST] <BtbN> so if you want a RX5700 (non-XT) for free, send me your address.
[12:00:58 CEST] <rcombs> (can you imagine shipping something with a runtime dependency on LLVM)
[12:01:17 CEST] <BtbN> OpenCL also has that, as you can and do compile C code at runtime?
[12:01:19 CEST] <nevcairiel> dont most systems just come with mesa anyway
[12:01:32 CEST] <nevcairiel> i mean, the typical graphics stack always has it
[12:02:01 CEST] <rcombs> yeah, but I ship libva, and VAAPI breaks the lib<->driver ABI on every release
[12:02:10 CEST] <nevcairiel> fun.
[12:02:25 CEST] <nevcairiel> which is too bad, since i've been thinking about shipping libva2
[12:02:27 CEST] <BtbN> nevcairiel, it's so bad on Windows that playing a YouTube video on Firefox with hwaccel enabled has a more than 50% chance to send the system into a Bluescreen
[12:02:44 CEST] <rcombs> with the intel driver it's nbd, it's an easy build and costs like 5MB
[12:03:03 CEST] <rcombs> but the AMD driver is 25MB and the mesa build is awful
[12:03:55 CEST] <nevcairiel> I had two options, either change ffmpeg so libva is not linked-in, or ship the linked-in version of libva, so that we can support hwaccel on intel on reasonable modern systems, but still run in software mode on older ones, i figured shipping libva2 would be easier :D
[12:05:17 CEST] <rcombs> it is, if you only need intel and not AMD
[12:05:42 CEST] <BtbN> libva/vaapi is basically an Intel GPUs interna made into an API
[12:05:43 CEST] <nevcairiel> intel is the primary goal, since we care about NUCs a bit
[12:05:57 CEST] <BtbN> So for AMD to support libva, they kinda have to emulate an Intel GPU
[12:06:17 CEST] <rcombs> though you have to either set an env var to specify the driver path, or use a patch I wrote that's been in PR since forever that adds an RPATH-$ORIGIN-style feature
[12:06:39 CEST] <rcombs> and makes libva search for drivers in [its own library path]/dri instead of a hardcoded absolute path
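
A rough sketch of the lookup order rcombs describes, in Python: an environment variable takes priority, otherwise drivers are searched for in a `dri` directory next to the library itself. The function name `driver_search_dirs` is hypothetical, and the actual patch is not shown in the log, so its real behaviour may differ. The sketch assumes the override variable is `LIBVA_DRIVERS_PATH` with colon-separated entries, which the log itself does not name.

```python
import posixpath


def driver_search_dirs(libva_path, environ):
    """Return the directories to search for VA-API driver modules.

    libva_path: filesystem path of the loaded libva shared library
    environ:    mapping of environment variables
    """
    # Assumption: LIBVA_DRIVERS_PATH is the override, entries split on ":".
    override = environ.get("LIBVA_DRIVERS_PATH")
    if override:
        return [d for d in override.split(":") if d]
    # Otherwise look in "<directory containing libva>/dri".
    return [posixpath.join(posixpath.dirname(libva_path), "dri")]
```

For example, a libva shipped at `/opt/app/lib/libva.so.2` would look for its drivers in `/opt/app/lib/dri` unless the environment variable is set.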
[12:06:47 CEST] <nevcairiel> i sorta hope that those vulkan video decoding extensions actually work and give us something for all vendors on linux .... eventually
[12:07:06 CEST] <BtbN> Yeah, Vulkan Video Decode becoming the d3d11va of Linux would be godly
[12:07:23 CEST] <rcombs> still needs encode though
[12:07:29 CEST] <BtbN> Browsers would probably finally support hwaccel
[12:07:35 CEST] <nevcairiel> true, but decode would be a very good start
[12:07:40 CEST] <rcombs> fair enough
[12:08:00 CEST] <rcombs> and apparently someone in here is working on vulkan->AMF interop on the encode side
[12:08:07 CEST] <rcombs> it's on the ML iirc
[12:08:11 CEST] <nevcairiel> unfortunately a vendor-agnostic encode API seems not to be something anyone is interested in, since the hardware exposes way too different features
[12:08:27 CEST] <BtbN> Well, technically Linux has a vendor agnostic encode API
[12:08:29 CEST] <rcombs> there's the v4l2 thing!
[12:08:30 CEST] <BtbN> but nobody uses it
[12:08:38 CEST] <nevcairiel> see not being interested in it =p
[12:08:51 CEST] <rcombs> though also, doesn't AMF have OpenCL interop
[12:09:04 CEST] <nevcairiel> but regardless encoding is usually easier to get going with vendor-specific APIs
[12:09:11 CEST] <nevcairiel> well except intel on windows, qsv is so terrible
[12:09:32 CEST] <BtbN> I wonder, with Nvidia UVM... Could you just cast a CUdevptr to a OpenCL one, and it just works?
[12:09:48 CEST] <nevcairiel> (not that vaapi encoding is better, its so extremely low level .. but at least it being low level lets you fix what they wont)
[12:10:00 CEST] <rcombs> allegedly no, but in practice idk
[12:10:17 CEST] <BtbN> You can cast a CUdevptr to a char* and it just works
[12:10:23 CEST] <BtbN> even if it's in GPU memory
[12:10:43 CEST] <rcombs> do you get ridiculous performance characteristics
[12:10:55 CEST] <nevcairiel> isnt there a real interop?
[12:11:17 CEST] <nevcairiel> i suppose maybe not, nvidia doesnt like OCL
[12:11:28 CEST] <rcombs> if there is, their forums don't know about it
[12:11:33 CEST] <BtbN> There is absolutely no OpenCL <-> CUDA interop
[12:11:47 CEST] <nevcairiel> at least they are warmer towards vulkan
[12:12:03 CEST] <nevcairiel> has interop and everything
[12:12:14 CEST] <rcombs> …is there vulkan<->OpenCL interop
[12:12:58 CEST] <rcombs> (are you thinking the extremely dumb thing I'm thinking)
[12:13:15 CEST] <nevcairiel> obviously, but i reckon it might not work without copies
[12:13:48 CEST] <BtbN> Even CUDA<->OpenGL Interop is so weirdly designed you need to memcpy
[12:15:28 CEST] <rcombs> anyway
[12:16:19 CEST] <rcombs> lmk if you repro that nvidia issue
[12:16:51 CEST] <rcombs> in the interim, I'll poke nvidia about it
[12:37:46 CEST] <JEEB> rcombs: and with fragments it gets even more "fun" http://up-cat.net/p/d8fc4f9e
[12:37:50 CEST] <JEEB> xD
[12:38:27 CEST] <JEEB> yes, adding empty documents to fragments to keep the time line going
[12:38:30 CEST] <JEEB> ヽ(´ー｀)ノ
[12:39:57 CEST] <rcombs> JEEB: heh, I have a segment.c patch somewhere that lets it emit empty segments when necessary (by looping over end/start until it hits the target timestamp)
[12:41:38 CEST] <JEEB> and if you have only the subtitle track, this will 100% fail with "fragment at time X" because the packets only get actually added to index at fragmentation or footer writing :P
[12:41:53 CEST] <JEEB> unless you add some extra logic to check the queue
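
The approach rcombs mentions for segment.c (emitting empty segments by looping over end/start until the target timestamp is reached) might look roughly like this in Python. This is a sketch of the idea only, not the patch itself: the generator name `segments_until`, its parameters, and the fixed segment duration are assumptions made for illustration.

```python
def segments_until(prev_end, target_ts, seg_duration):
    """Yield (start, end) for the empty segments between prev_end and target_ts.

    All values share one time unit (for example seconds).
    """
    if seg_duration <= 0:
        raise ValueError("seg_duration must be positive")
    start = prev_end
    # Keep ending and restarting segments until the next packet's
    # timestamp falls inside the segment that is about to start.
    while start + seg_duration <= target_ts:
        end = start + seg_duration
        yield (start, end)
        start = end
```

If the previous segment ended at 10 and the next packet arrives at 25 with 5-unit segments, the loop emits three empty segments so the timeline stays continuous.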
[15:59:15 CEST] <cone-129> ffmpeg 03James Almer 07master:af70bfbeadc0: avcodec/h2645_parse: zero initialize the rbsp buffer
[16:28:38 CEST] <tmm1> is there a good doc on how format negotiation works with hwcontexts
[16:58:40 CEST] <jamrial> tmm1: jkqxz may know, but i guess the doxy is expected to be enough?
[16:59:33 CEST] <cone-129> ffmpeg 03James Almer 07master:33a53722dc5e: avcodec/qtrle: add a flush() callback
[17:14:47 CEST] <tmm1> the filter design document is really good, something like that for hwcontext design would be useful
[17:33:36 CEST] <BtbN> I wonder if that package is something I can ask for a refund for. Cause the cheap 16€ option is kinda shit.
[17:33:47 CEST] <BtbN> No insurance and tracking at all.
[17:59:14 CEST] <cone-129> ffmpeg 03Michael Niedermayer 07master:dead949a1fbf: avcodec/atrac9dec: Check block_align
[18:04:16 CEST] <durandal_1707> jamrial: have checked that after seeking qtrle its not black?
[18:15:50 CEST] <jamrial> durandal_1707: no
[18:16:34 CEST] <jamrial> durandal_1707: if you want i can remove the memset for the palette, if that's what you're concerned about
[18:17:35 CEST] <durandal_1707> hmm, try with ffplay?
[18:17:44 CEST] <nevcairiel> the palette should probably stay
[18:19:01 CEST] <nevcairiel> it might be constant throughout the entire file and only delivered in the beginning
[18:20:15 CEST] <BtbN> michaelni_, I'm not sure who usually handles that, but would shipping costs for a parcel to the US (AMD GPU for AMD devel stuff for rcombs) be eligible for a refund from the ffmpeg foundation? Probably 30~50€.
[18:20:48 CEST] <BtbN> rcombs, "./ffmpeg.exe -hwaccel cuda -hwaccel_output_format cuda -i D:/Cache/enctest.mkv -c:a copy -sn -c:v h264_nvenc -preset slow -rc vbr_hq -b:v 25M -cq 22 -y D:/Cache/test_out.mkv" works great for me on current master
[18:21:22 CEST] <BtbN> Same if I add in "-c:v h264_cuvid"
[18:23:17 CEST] <BtbN> Also works if I throw in "-vf scale_cuda=800:-2"
[18:25:12 CEST] <BtbN> That's on an RTX2070
[18:32:50 CEST] <BtbN> I must say, I'm impressed with turing nvenc
[18:34:01 CEST] <nevcairiel> the quality is really a large step up
[18:34:13 CEST] <nevcairiel> happy with the 1660 for my streaming PC
[18:35:03 CEST] <jamrial> durandal_1707: do you have a qtrle mov sample with more than one sync point and a palette in the bitstream?
[18:35:52 CEST] <BtbN> nevcairiel, it's also really good at holding CBR
[18:36:00 CEST] <durandal_1707> jamrial: there is ffmpeg encoder
[18:36:05 CEST] <BtbN> set it to 6 Mbit CBR, and sure enough, does not deviate one bit from it
[18:37:23 CEST] <nevcairiel> did older ones not manage that? it seems like strict CBR was something they always could do
[18:38:00 CEST] <BtbN> Not THAT good
[18:38:07 CEST] <BtbN> they padded a lot
[18:38:15 CEST] <BtbN> turing uses virtually no padding
[18:38:30 CEST] <BtbN> Unless nothing is happening, of course
[18:38:57 CEST] <nevcairiel> i see, you actually looked into the padding used :D
[18:39:10 CEST] <BtbN> Well, I just didn't enable it, and looked at the output :D
[18:40:14 CEST] <BtbN> I'm confused why things explode on that P2000 though, and work perfectly here
[18:41:02 CEST] <michaelni_> BtbN, i would approve it. But as we IIRC did not have such a case before, there may be other unforeseen obstacles; also other developers could in theory object. I do not know if there are other obstacles, for example international shipping will probably incur duty/tax
[18:41:38 CEST] <BtbN> Yeah, but those won't be on me, but on the receiving end.
[18:41:57 CEST] <BtbN> Not sure how that works, I intend to declare it as a gift, which it technically is.
[18:42:37 CEST] <michaelni_> here declaring as a gift won't help, i still have to pay full on gifts from china
[18:43:21 CEST] <nevcairiel> they also have to believe you :p
[18:44:29 CEST] <BradleyS> may i ask which gpu and to what country? i have a spare 570 here i was going to sell but could donate. i'm in the us
[18:44:39 CEST] <BtbN> RX5700 from DE to US
[18:44:46 CEST] <BradleyS> might be a 580, i have to look
[18:44:47 CEST] <durandal_1707> no gifts from me or to me :)
[18:45:09 CEST] <BtbN> I got it from AMD for devel work, but it's unlikely I will ever put it to that use.
[18:45:15 CEST] <BradleyS> ah, navi
[18:45:29 CEST] <BradleyS> definitely newer than what i have, but if he wants mine i can send it too
[18:45:44 CEST] <BtbN> They also sent me an RX590 along with it. Which I have no clue what to do with.
[18:46:11 CEST] <BradleyS> only bought it for benchmarking hardware encoding for handbrake, the results and comparisons are an article in our documentation
[18:46:13 CEST] <BradleyS> not sure what to do with it now
[18:46:26 CEST] <durandal_1707> BtbN: why you have so much unused stuff?
[18:46:29 CEST] <jamrial> play a game?
[18:46:43 CEST] <BradleyS> 1080 ti sc
[18:46:49 CEST] <BtbN> durandal_1707, because AMD asked me if I wanted free stuff. And I didn't say no.
[18:47:44 CEST] <BradleyS> i would have liked to test a newer gpu's asic but i was assured it wouldn't be that much difference and i didn't want to pay top dollar, got this one on ebay of all places
[18:48:16 CEST] <BtbN> RX5700 isn't even that expensive. It's like 300€ new
[18:48:34 CEST] <BradleyS> not awful but i think i paid ~100 USD for the 570 used
[18:48:41 CEST] <BtbN> Only problem, and reason why I'm not using it: The Windows drivers are an absolute catastrophe.
[18:55:40 CEST] <philipl> The new prime offload support for nvidia is pretty snazzy. I can do nvdec and vaapi work in the same desktop session.
[18:56:09 CEST] <BtbN> I mean, you could always do that, since nvdec does not depend on the desktop session at all.
[18:56:09 CEST] <philipl> vdpau shits the bed; that was amusing.
[18:56:20 CEST] <philipl> BtbN: but with working opengl/vulkan interop.
[18:57:14 CEST] <BradleyS> speaking of, if anyone could provide further comment/review on this, it would be helpful https://patchwork.ffmpeg.org/patch/14319/
[18:57:33 CEST] <BradleyS> we'd like to include it in handbrake but of course, wish to know whether it will be accepted into ffmpeg proper
[18:59:18 CEST] <philipl> It's self contained enough that I don't know why anyone would object, but it obviously would not work well if we ever get a vulkan hwcontext in.
[19:02:22 CEST] <BradleyS> my understanding is this is directly from amd, so i'm sure that information will be valuable if you would kindly reply
[19:03:05 CEST] <BradleyS> fyi this is our pull request from june https://github.com/HandBrake/HandBrake/pull/2151
[19:05:13 CEST] <philipl> BtbN: you have any objection to pushing that change? You did comment previously
[19:06:39 CEST] <BtbN> It looks fine to me
[19:28:59 CEST] <tmm1> looks pretty reasonable to me too
[19:29:49 CEST] <tmm1> i didn't realize amf worked on linux
[20:16:49 CEST] <durandal_1707> jamrial: so, tried ffmpeg qtrle encoder and playing output with ffplay?
[20:17:23 CEST] <jamrial> durandal_1707: no, was busy doing other stuff
[20:17:41 CEST] <jamrial> does it even code in a palette? it's not looking at frame side data, or even frame->data[1]
[20:18:26 CEST] <durandal_1707> agh, you mean it does not support pal8?
[20:18:34 CEST] <durandal_1707> that sucks
[20:19:00 CEST] <jamrial> ah, good point, didn't bother looking at the list of supported pix_fmts
[20:19:02 CEST] <jamrial> no, no pal8
[20:20:56 CEST] <jamrial> i'll remove the memset in any case. the mov demuxer seems to read the palette from the header (stsd atom), and then attach it into a single packet
[20:31:24 CEST] <cone-129> ffmpeg James Almer master:8b71cc3363b5: Revert "avcodec/qtrle: Do not output duplicated frames on insufficient input"
[20:31:25 CEST] <cone-129> ffmpeg James Almer master:d70bbdc5fa05: avcodec/qtrle: call ff_reget_buffer() only when the picture data is going to change
[20:31:26 CEST] <cone-129> ffmpeg James Almer master:b319feb05f40: avcodec/qtrle: don't clear the palette when flushing
[21:11:43 CEST] <tmm1> decoder codecs list hardware formats first in pix_fmt, then call ff_get_format to negotiate the format
[21:12:07 CEST] <tmm1> what about encoders.. looks like they list the hardware formats last? do they have to call something else to negotiate or does the filtergraph do it automatically
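The decoder-side negotiation tmm1 describes (hardware formats listed first, then a callback picks one) can be sketched roughly as below. This is a toy model under stated assumptions, not FFmpeg's actual code: `PixFmt`, `is_hw_format` and `pick_format` are hypothetical names invented for illustration, and the enum values do not correspond to real `AVPixelFormat` entries. It says nothing about how encoders negotiate, which is the open question above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for pixel formats; NOT FFmpeg's AVPixelFormat. */
typedef enum {
    FMT_NONE = -1,   /* terminates a candidate list */
    FMT_YUV420P,     /* software format */
    FMT_NV12,        /* software format */
    FMT_CUDA,        /* hardware format */
    FMT_VAAPI        /* hardware format */
} PixFmt;

/* Returns 1 if fmt is one of the toy hardware formats, 0 otherwise. */
static int is_hw_format(PixFmt fmt)
{
    return fmt == FMT_CUDA || fmt == FMT_VAAPI;
}

/*
 * Toy negotiation callback (hypothetical). The decoder offers a list of
 * candidates, hardware formats first, terminated by FMT_NONE. Return the
 * wanted hardware format if it is offered; otherwise fall back to the
 * first software format in the list, or FMT_NONE if there is none.
 */
static PixFmt pick_format(const PixFmt *candidates, PixFmt wanted_hw)
{
    PixFmt fallback = FMT_NONE;

    for (size_t i = 0; candidates[i] != FMT_NONE; i++) {
        if (candidates[i] == wanted_hw)
            return candidates[i];
        /* Remember the first software format as a fallback. */
        if (fallback == FMT_NONE && !is_hw_format(candidates[i]))
            fallback = candidates[i];
    }
    return fallback;
}
```

With a candidate list of `{ FMT_CUDA, FMT_VAAPI, FMT_YUV420P, FMT_NONE }`, asking for `FMT_VAAPI` returns `FMT_VAAPI`; with a software-only list the function falls back to the first software entry.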
[22:00:55 CEST] <rcombs> BtbN: huh, well I'll try on master for good measure, and if that doesn't work I suppose I'll blame the drivers on this particular card
[22:01:01 CEST] <rcombs> thanks for checking
[22:03:11 CEST] <BtbN> definitely worth a message to nvidia
[22:03:31 CEST] <BtbN> My video was a 1080p yuv420p video
[22:03:42 CEST] <BtbN> Yours wasn't something uncommon by any chance, was it?
[22:10:20 CEST] <rcombs> nope, 720p yuv420p
[22:11:51 CEST] <rcombs> (just a random file)
[00:00:00 CEST] --- Wed Aug 28 2019