[Ffmpeg-devel-irc] ffmpeg-devel.log.20180511

burek burek021 at gmail.com
Sat May 12 03:05:04 EEST 2018


[00:06:44 CEST] <durandal_1707> wm4: look at ffplay it seeks within y4m just fine
[00:08:51 CEST] <wm4> durandal_1707: doesn't here
[00:10:20 CEST] <JEEB> durandal_1707: explain the issue step by step on either the user or devel channel and let's see if we can get your issue sorted
[00:10:43 CEST] <JEEB> we are not espers and you didn't exactly share too much about your issue
[00:13:21 CEST] <durandal_1707> JEEB: how, when we contradict each other from the start
[00:18:07 CEST] <JEEB> I'd say it's closer to "one does not understand the other"
[00:18:40 CEST] <jkqxz> jamrial:  Can I infer from your comments on 5 and 6 of the H.264 SEI set that you reviewed all of it?
[00:18:47 CEST] <jkqxz> (Thank you for uploading the test file, btw.)
[00:19:34 CEST] <rcombs> so apparently hwcontext_dxva2 leaks threads
[00:21:39 CEST] <jamrial> jkqxz: no, but i did test them and as far as fate and a bunch of samples i tried it seemed ok
[00:21:45 CEST] <jamrial> none had pan-scan rectangle sei, though
[00:22:49 CEST] <jkqxz> I don't have any samples with it either, but the reference decoder does parse them and agreed with the ones I created.
[00:39:41 CEST] <jamrial> sounds good then :p
[00:40:53 CEST] <nevcairiel> rcombs: it doesnt open any, so how would that work? :)
[00:41:07 CEST] <rcombs> it doesn't itself, but d3d9.dll does
[00:44:50 CEST] <jkqxz> jamrial:  Ok; thank you!
[00:49:50 CEST] <rcombs> looks like the threads are created in IDirect3D9Ex_CreateDeviceEx
[00:55:49 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:4c9741a1dddf: cbs_h264: Fix handling of unknown SEI
[00:55:50 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:2b4121350092: h264_metadata: Remove redundant setting of SEI payload size
[00:55:51 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:9d375e114ac2: h264_metadata: Fix AUD writing
[00:55:52 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:d94dda742c8e: cbs_h264: Add support for pan-scan rectangle SEI messages
[00:55:53 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:ac687add84a1: cbs_h264: Add support for mastering display SEI messages
[00:55:54 CEST] <cone-593> ffmpeg 03Mark Thompson 07master:f995aa82d858: fate/cbs: Add an SEI test
[01:04:04 CEST] <rcombs> I dunno if anybody feels like debugging d3d9 code
[01:04:12 CEST] <rcombs> I'm definitely out of my depth there
[01:08:09 CEST] <nevcairiel> i've never noticed any excess threads
[01:08:17 CEST] <nevcairiel> are you sure you dont leak the hwcontext somehow
[01:17:27 CEST] <rcombs> I see one place where that could happen in an OOM case, but nowhere else
[01:17:31 CEST] <rcombs> (fixed that though)
[01:17:44 CEST] <rcombs> user's logs don't indicate that path was taken, either
[01:17:56 CEST] <rcombs> could be something driver-specific
[01:18:06 CEST] <nevcairiel> if its only one user, might be a driver bug indeed
[01:18:26 CEST] <rcombs> i686 binary running on 64-bit windows in VMWare
[01:18:45 CEST] <rcombs> no idea why anyone would ever want to run this app in a windows VM
[01:18:50 CEST] <nevcairiel> hwaccel in a VM, he deserves all the pain he gets
[01:19:11 CEST] <nevcairiel> those VM hwaccel virtualization drivers are super crappy
[01:19:34 CEST] <rcombs> threads apparently created like this https://gist.github.com/ae5aec2badd1eb154dbaf7c3ad9589f7
[01:19:54 CEST] <rcombs> had to stare at disas of lavu to figure out what function it was
[01:22:55 CEST] <rcombs> ¯\_(ツ)_/¯ I'll tell the user to poke the vmware people and in the interim that his use-case is insane
[01:36:31 CEST] <philipl> rcombs: the vmware video driver doesn't implement hwaccel does it??
[01:36:49 CEST] <rcombs> dunno, and also dunno if it matters
[01:37:04 CEST] <rcombs> I open a hwaccel to test whether or not it's available
[01:37:38 CEST] <nevcairiel> virtualbox at least implements virtualized dxva, i imagine vmware might as well,  but i havent used it in ages
[01:38:17 CEST] <nevcairiel> but these modes are generally quite unusable regardless
[01:38:49 CEST] <philipl> Having worked there - I can't imagine how they'd justify the engineering investment. Their paying customers certainly don't care about video playback.
[01:41:02 CEST] <rcombs> meanwhile in extremely nice: https://gist.github.com/e58670a07b529578b6a1db3186a2514d
[01:41:05 CEST] <nevcairiel> a quick google suggests that it does work
[01:41:16 CEST] <nevcairiel> or well, is supposed to
[01:41:46 CEST] <nevcairiel> well, at least with ESXi
[01:43:26 CEST] <philipl> nevcairiel: with pass-through GPUs...
[01:43:45 CEST] <nevcairiel> apparently also sharing hardware between multiple VMs
[01:44:14 CEST] <nevcairiel> at least D3D9, which DXVA2 is based on
[05:16:06 CEST] <Compn> i cant remember the cineform hd codec in mplayer
[05:16:14 CEST] <Compn> there were two versions of it
[05:16:22 CEST] <Compn> one worked and the other we all tried to forget about iirc
[05:16:40 CEST] <Compn> which looks about the same in ffmpeg with that sample
[05:18:02 CEST] <Compn> http://cineform.com/gopro-cineform-decoder
[05:18:26 CEST] <Compn> oh i might be thinking of canopus
[06:09:16 CEST] <jamrial_> jkqxz: re ticket 7200 above, https://pastebin.com/szcJw45K fixes it for me
[06:09:25 CEST] <jamrial_> the file has a lot of NALUs type 0 (invalid, afaik) of size 1, with that byte being zero
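For reference, a minimal sketch of how the one-byte H.264 NAL unit header splits into its three fields, which is why a lone zero byte reads as a NALU of type 0. The function name `parse_h264_nal_header` is made up for this illustration and is not an FFmpeg API; the 0x65 byte is just a common IDR slice header used for contrast.

```python
def parse_h264_nal_header(first_byte: int) -> dict:
    """Split the first byte of an H.264 NAL unit into its header fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x1,  # 1 bit, must be 0
        "nal_ref_idc": (first_byte >> 5) & 0x3,         # 2 bits
        "nal_unit_type": first_byte & 0x1F,             # 5 bits
    }

# A single zero byte, as described for the sample in the ticket.
zero_nal = parse_h264_nal_header(0x00)

# 0x65 = 0b0110_0101: nal_ref_idc 3, nal_unit_type 5 (IDR slice).
idr_nal = parse_h264_nal_header(0x65)
```

With the zero byte every field is 0, so the type comes out as 0, which the H.264 type table lists as unspecified rather than assigning it a meaning.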
[09:17:31 CEST] <cone-752> ffmpeg 03Tobias Rapp 07master:66ba303c5369: fate: add more tests for hue video filter
[10:44:29 CEST] <JEEB> tmm1: ahahahaha. I wonder if your PMT fixing bug actually just fixed my sample :D
[10:44:54 CEST] <JEEB> I seem to get PCR quicker now
[10:47:39 CEST] <JEEB> although I'll have to test with my usual way first to make sure...
[10:49:32 CEST] <JEEB> ah, no :) shows me how I shouldn't be lazy and test things without making sure the way of doing it is the same
[11:05:31 CEST] <JEEB> tmm1: so I guess for PID switches if we would want to support stuff like switching codecs we'd just need a flag "programs X,Y,Z got updated" with a list of pointers to matching AVPrograms?
[11:05:44 CEST] <JEEB> because that way you don't need to iterate over the thing yourself
[11:06:24 CEST] <JEEB> ffmpeg.c would need rework of course, haven't looked at how much
[11:10:22 CEST] <JEEB> not sure where such side data would get attached though
[11:10:26 CEST] <JEEB> stream? which stream?
[11:10:30 CEST] <JEEB> packet? which packet?
[11:11:13 CEST] <JEEB> if you could attach them to the avformat context then that would make it not matter which stream's packet is getting fed to you next, I guess?
[11:11:25 CEST] Action: JEEB scratches his head
[11:12:03 CEST] <JEEB> the stream merging thing will work as long as you are a) OK with losing the original PIDs b) having no support for codec switches
[11:51:39 CEST] <jkqxz> Trac is down?  I get 503.
[11:53:22 CEST] <durandal_1707> not for me
[11:56:26 CEST] <nevcairiel> seems to have come back just now
[11:56:29 CEST] <jkqxz> Right, working again.  Odd.
[12:09:14 CEST] <JEEB> trac is a bit of a weird piece of software
[12:09:28 CEST] <JEEB> if I recall correctly on some implementation details
[16:44:26 CEST] <philipl> BtbN: after your changes, GPU memory usage in mpv for a 4k/10bit video is reduced by 25% (comparing nvdec to cuvid)
[16:44:37 CEST] <philipl> 1.2GB->0.9GB
[16:45:21 CEST] <BtbN> didn't wm4 say mpv uses its own hw_frames_ctx, so the memory saving effect wouldn't work?
[17:11:19 CEST] <philipl> Not for nvdec. It just uses a device_ctx
[17:26:05 CEST] <wm4> philipl: that sounds wrong
[17:31:11 CEST] <wm4> generally it does what ffmpeg tells it though (but not sure which method is preferred)
[17:34:40 CEST] <BtbN> for nvdec/cuvid, giving it just a hw_device_ctx is preferred I'd say
[17:34:44 CEST] <BtbN> specially now
[17:47:55 CEST] <durandal_1707> ubitux: patch_diff_sq in nlmeans can be negative
[17:50:07 CEST] <ubitux> durandal_1707: huh, really?
[17:50:13 CEST] <ubitux> did i introduce a regression?
[17:53:12 CEST] <ubitux> are you sure it's not because it's >(1<<31)?
[17:54:10 CEST] <wm4> BtbN: I think it actually prefers frames_ctx over device_ctx because it can cache the memory allocations
[17:54:56 CEST] <BtbN> cache the memory allocations?
[17:55:01 CEST] <wm4> since mpv keeps some frames in the video pipeline during seeks, not caching the frame pool could momentarily double GPU memory use
[17:55:45 CEST] <BtbN> but the internal pool already caches the allocations
[17:56:06 CEST] <BtbN> It allocates 14 or so "frames" at the beginning, and then never again under normal operation
[18:01:53 CEST] <durandal_1707> ubitux: you are correct, still searching why results are soo blurry
[18:05:51 CEST] <ubitux> try to switch to u64
[18:05:55 CEST] <ubitux> just in case it's due to an overflow
[18:19:39 CEST] <durandal_1707> ubitux: right, i compiled with integer sanitizer: libavfilter/vf_nlmeans.c:401:50: runtime error: unsigned integer overflow: 2693 - 3291 cannot be represented in type 'unsigned int'
[18:19:51 CEST] <durandal_1707> libavfilter/vf_nlmeans.c:401:54: runtime error: unsigned integer overflow: 4294782596 + 186942 cannot be represented in type 'unsigned int'
[18:20:28 CEST] <durandal_1707> so the intermediate needs to be long
[18:22:28 CEST] <ubitux> what resolution?
[18:22:37 CEST] <durandal_1707> actually something goes into negative
[18:23:12 CEST] <durandal_1707> ubitux: same gray sample with fishes i had uploaded
[18:35:03 CEST] <philipl> wm4: For cuda it definitely just provides a device_ctx, and that's the right thing to be doing.
[18:37:44 CEST] <durandal_1707> ubitux: it appears this one does not affect output, at least not for that sample
[18:40:27 CEST] <wm4> philipl: and nvdec?
[18:42:23 CEST] <philipl> I'm still looking at the code. :-)
[18:42:30 CEST] <philipl> Can't argue about the lower memory footprint though.
[18:42:51 CEST] <ubitux> durandal_1707: how can i reproduce the glitch? (what exact command)
[18:43:58 CEST] <philipl> cuviddec only declares device_ctx support, so that's clear enough. nvdec declares both and I do see that mpv wants to use frames_ctx if it's possible.
[18:45:22 CEST] <durandal_1707> ubitux: you need to compile with clang 6.0
[18:46:03 CEST] <ubitux> filter output is different between compilers?
[18:46:35 CEST] <JEEB> he might be talking about the sanitizer?
[18:46:55 CEST] <durandal_1707> ubitux: compile with -fsanitize=integer
[18:47:01 CEST] <ubitux> i'm interested in the border glitch, not the sanitizer warning
[18:47:02 CEST] <JEEB> yes, so that was it :)
[18:47:42 CEST] <durandal_1707> ubitux: output appears to be same, with overflow or without
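A small sketch of why the output can match with or without the reported overflow, using the numbers from the sanitizer messages above. It simulates 32-bit unsigned arithmetic in Python; the helpers `u32_add`, `u32_sub` and `as_i32`, and the extra addend of 1000, are made up for this illustration and are not taken from vf_nlmeans.c.

```python
MASK32 = 0xFFFFFFFF

def u32_add(a: int, b: int) -> int:
    """Addition modulo 2**32, like C `unsigned int` on common platforms."""
    return (a + b) & MASK32

def u32_sub(a: int, b: int) -> int:
    """Subtraction modulo 2**32; a negative result wraps around."""
    return (a - b) & MASK32

def as_i32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed value (two's complement)."""
    return x - (1 << 32) if x & 0x80000000 else x

# Values from the two sanitizer reports.
wrapped_diff = u32_sub(2693, 3291)         # exact result would be -598
wrapped_sum = u32_add(4294782596, 186942)  # exact result exceeds 2**32 - 1

# The wrapped difference looks negative when viewed as a signed int.
diff_as_signed = as_i32(wrapped_diff)

# Adding a large enough value afterwards cancels the wrap:
# 2693 - 3291 + 1000 == 402 in exact arithmetic as well.
recovered = u32_add(wrapped_diff, 1000)
```

Unsigned wrap-around is arithmetic modulo 2^32, so an intermediate that wraps still yields the right final value as long as the true final value fits in 32 bits. That would be consistent with the sanitizer warning not changing the filter output for this sample.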
[18:53:27 CEST] <philipl> wm4: OK. How does this sound for a story.
[18:55:32 CEST] <philipl> mpv prefers frames_ctx, and it uses avcodec_get_hw_frames_parameters to set it up. That in turn uses the frame_params function provided by the hwaccel, which is specific to the decoder. The nvdec frame_params specify the dummy pool that BtbN added.
[18:55:42 CEST] <philipl> So yes, it's using a pool, but it's using the pool that nvdec wants it to use.
[18:55:49 CEST] <philipl> So you still get memory savings.
[18:55:56 CEST] <durandal_1707> ubitux: are you ok with parameter that controls amount of denoising?, the 1.f weight of centered pixel
[18:57:57 CEST] <ubitux> durandal_1707: yeah sure, just don't allow zero
[18:58:50 CEST] <wm4> philipl: so what happens on a seek?
[18:59:08 CEST] <wm4> normally the hw decoder will be recreated, but mpv makes sure it reuses the pool (in general for hwaccels)
[18:59:30 CEST] <BtbN> I don't see why that wouldn't still work the exact same
[19:00:06 CEST] <philipl> If you have a referenced AVFrame, it would continue to work because the AVFrame keeps the decoder alive.
[19:00:17 CEST] <BtbN> that doesn't matter
[19:00:31 CEST] <philipl> I didn't look in detail at seeks but I know seeks work, as I tried seeking.
[19:01:55 CEST] <durandal_1707> ubitux: minimal would be 1.f
[19:02:07 CEST] <ubitux> okay
[19:02:20 CEST] <ubitux> is there an actual need for adjusting this?
[19:07:21 CEST] <durandal_1707> for finer control, sigma is very crude currently
[19:12:47 CEST] <wm4> philipl: so at some point during seek, there will be 2 decoders allocated, right
[19:14:48 CEST] <philipl> That sounds likely.
[19:15:14 CEST] <philipl> I can look this evening and see what the exact lifetimes are.
[19:19:20 CEST] <wm4> assuming the decoders keep their internal frame pools alive that wouldn't be good
[19:22:29 CEST] <philipl> wm4: presumably the frames keeping the decoder alive will be discarded soon after. even with interpolation they don't stick around that long. Or are you worried about a GPU memory allocation spike?
[19:28:19 CEST] <wm4> yes, exactly
[19:29:52 CEST] <philipl> I will investigate later.
[19:31:00 CEST] <tmm1> JEEB: https://en.wikipedia.org/wiki/Program-specific_information documents the behavior i saw, "repeated until end of TS packet"
[19:31:30 CEST] <tmm1> also says the 0xc0 table i'm seeing is a SCTE specific "Programme Information Message"
[19:32:44 CEST] <JEEB> have you got a copy of the MPEG-TS spec btw?
[19:32:54 CEST] <JEEB> just out of interest :)
[19:33:46 CEST] <JEEB> but yes, the spec's part 2.4.4.8 shows how PMT contains arrays of things
[19:34:00 CEST] <JEEB> I just never really got to trying to match that against your code because asdf other things
[19:37:47 CEST] <tmm1> 2.4.4.8 only defines the structure of tid=0x2
[19:43:45 CEST] <tmm1> i'm sure it says somewhere in here that one ts packet can contain multiple tables but i'm not sure where
[19:44:56 CEST] <JEEB> 2.4.4.8 clearly says that PMT can have multiple descriptors etc :)
[19:45:03 CEST] <JEEB> or is it not about PMTs?
[19:45:33 CEST] <JEEB> yea I think that's just a single entry with multiple PIDs linked to it
[19:47:04 CEST] <tmm1> its confusing because PMT refers to the pid with the tables, and in there there's a "Program Map" table with table_id=2
[19:47:26 CEST] <tmm1> but the PMT pid can also contain other tables in the packets
[20:05:13 CEST] <JEEB> tmm1: I guess it's an actual program association section? 2.4.4.3. since that defines 0x02 as program_map_section?
[20:06:39 CEST] <JEEB> so it's actually a multi-part PAT
[20:06:56 CEST] <JEEB> (the one that has program_map_section as 0x02)
[20:11:11 CEST] <JEEB> (or they just decided to list all the alternatives for the tables under PA section)
[20:12:55 CEST] <tmm1> well its not on the PAT pid
[20:12:57 CEST] <tmm1> pretty confusing
[20:13:33 CEST] <tmm1> 2.4.4.2 kind of alludes to it, "first section" and "at least one section"
[20:14:08 CEST] <tmm1> its more clearly outlined in the dvb spec the wikipedia page links to
[20:14:41 CEST] <JEEB> the sections are talking about PAT's things most likely, the 2-31 table_id assignment values table just somehow ends up being after the first table
[20:15:10 CEST] <JEEB> or maybe not, I'm still trying to grasp what is a section and what is a table
[20:15:13 CEST] <JEEB> (lol)
[20:15:17 CEST] <tmm1> its not clear at all
[20:15:23 CEST] <tmm1> they seem to use them interchangeably
[20:16:08 CEST] <JEEB> the only clear thing is that you're supposed to have a table_id (finally understood what tid was!)
[20:16:22 CEST] <JEEB> and that we have various data under various table_ids
[20:16:29 CEST] <JEEB> of which 0x00 is PAT
[20:16:39 CEST] <JEEB> (and that's where the T in the PAT comes from I guess)
[20:17:00 CEST] <JEEB> but then again, the name of the data structure is program_association_section
[20:23:38 CEST] <JEEB> ok, 2.4.3.3 explains it more with the payload_unit_start_indicator thing. via context you know if a packet is going to be a PES packet or a section. and you can have multiple sections in some assembled data
[20:26:16 CEST] <JEEB> and it's not surprising that the other data comes in a !PAT PID
[20:26:53 CEST] <JEEB> "Only sections with this value of table_id (0x00 - PAT) are permitted within transport stream packets with PID value of 0x0000"
[20:33:11 CEST] <JEEB> ok, so with non-PES PID packets you synchronize on payload_unit_start_indicator (2.4.3.2), with its value being 1 and then the pointer_field tells the offset from the start of the data in a given packet
[20:36:23 CEST] <JEEB> trying to find a definite proof but as we see it's probably rather clear that if you can fit multiple sections in a single packet (or between multiple), the next one's table_id comes right after the previous ended?
[20:36:48 CEST] <JEEB> and that's how non-PES PID packets work
[21:10:21 CEST] <BtbN> philipl, wm4, if there wasn't a spike before, there can't be one now.
[21:10:30 CEST] <BtbN> It's only allocating less memory, not more
[21:14:22 CEST] <akravchenko188> jkqxz: I have sent couple of comments/questions. thanks
[21:37:03 CEST] <tmm1> JEEB: yea that other doc explicitly states that
[21:37:45 CEST] <tmm1> >There is never more than one pointer_field in a TS packet, as the start of any other section can be
[21:37:48 CEST] <tmm1> identified by counting the length of the first and any subsequent sections, since no gaps between sections within a TS
[21:38:39 CEST] <JEEB> yup
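The section layout discussed above can be sketched roughly as follows: in a payload with payload_unit_start_indicator set, the first byte is the pointer_field, and sections then follow back to back, each starting with table_id and a 12-bit section_length. This is a simplified illustration, not FFmpeg's mpegts.c: the name `split_sections` is hypothetical, the sample bytes are invented, and sections that continue into the next packet are simply not handled.

```python
def split_sections(payload: bytes) -> list:
    """Split a TS payload (payload_unit_start_indicator == 1) into sections.

    Returns a list of (table_id, section_bytes) tuples. Assumes every
    section fits entirely inside this one payload.
    """
    pointer_field = payload[0]
    pos = 1 + pointer_field  # skip to the start of the first new section
    sections = []
    while pos + 3 <= len(payload):
        table_id = payload[pos]
        if table_id == 0xFF:  # stuffing bytes fill the rest of the packet
            break
        # section_length is the low 12 bits of the two bytes after table_id
        section_length = ((payload[pos + 1] & 0x0F) << 8) | payload[pos + 2]
        end = pos + 3 + section_length
        if end > len(payload):
            break  # section continues in a later packet; not handled here
        sections.append((table_id, payload[pos:end]))
        pos = end  # the next table_id comes right after the previous section
    return sections

# Invented example: a table_id 0x02 section followed directly by a
# table_id 0xC0 section, then stuffing. Section bodies are dummy bytes.
sec_pmt = bytes([0x02, 0xB0, 0x05]) + bytes(5)
sec_c0 = bytes([0xC0, 0x70, 0x02]) + b"\x01\x02"
example_payload = b"\x00" + sec_pmt + sec_c0 + b"\xff" * 10
example_sections = split_sections(example_payload)
```

Here `example_sections` holds two entries, one per table_id, found purely by adding up section lengths from a single pointer_field.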
[21:49:53 CEST] <philipl> BtbN: The question was how much memory is sunk into the decoder itself and the internal decoder surface pool.
[21:50:06 CEST] <philipl> You'll have two of those around for a short period after a seek
[21:50:10 CEST] <philipl> But I'll find out.
[21:50:35 CEST] <BtbN> if you use the same hw_frames_ctx, no, you won't, ever
[21:53:15 CEST] <philipl> BtbN: I mean the actual CUvideodecoder and its internal resource allocations
[21:53:40 CEST] <BtbN> but those didn't change from before
[21:54:11 CEST] <BtbN> or do you mean because now maybe two decoders might exist in parallel for a few frames?
[21:54:40 CEST] <BtbN> that shouldn't really matter, as you'll have flushed it right before anyway
[21:56:14 CEST] <philipl> I mean two might exist in parallel.
[21:56:31 CEST] <philipl> If the memory overhead of that is minimal then it's not a problem, but that was what worried wm4
[22:13:34 CEST] <BtbN> well, it will be less of an overhead than the 14 extra frames before
[22:13:55 CEST] <philipl> I would expect so
[22:19:13 CEST] <wm4> why can't the decoder just be reused
[22:20:27 CEST] <BtbN> seeking is not exactly intended by nvidia iirc
[22:20:29 CEST] <philipl> It was because we couldn't find a way to actually flush it.
[22:20:38 CEST] <philipl> yeah
[22:20:45 CEST] <BtbN> if you flush it, it's just done for, not reuseable
[22:21:43 CEST] <BtbN> how is that even done with nvdec hwaccel?
[22:21:49 CEST] <BtbN> in cuvid it was an explicit implementation
[22:22:30 CEST] <philipl> That's true actually. I'm not sure how it works out in nvdec :-)
[22:22:55 CEST] <BtbN> actually, it's the nvidia parser that can't handle it
[22:23:03 CEST] <BtbN> I don't see how the pure decoder would care
[22:23:08 CEST] <philipl> ah. good point.
[22:23:18 CEST] <philipl> That means we're not destroying the decoder on seek with nvdec.
[22:23:26 CEST] <philipl> So there's no problem, right?
[22:23:53 CEST] <BtbN> the nvdec decoder, without parser, is extremely simplistic
[22:23:56 CEST] <BtbN> it consists of just 3 function calls
[22:24:07 CEST] <BtbN> decode, map, unmap. If you ignore the create/destroy calls
[23:32:40 CEST] <durandal_1707> nobody commented on bm3d? I get noticeably better results than nlmeans
[23:33:29 CEST] <JEEB> haven't needed denoising, unfortunately
[23:44:07 CEST] <durandal_1707> nice thing with bm3d is that you can use other denoisers as first step instead of bm3d itself
[23:54:41 CEST] <durandal_1707> JEEB: when will you switch to RustAV ?
[23:57:03 CEST] <JEEB> haven't seen anything too interesting from there
[00:00:00 CEST] --- Sat May 12 2018

