[Ffmpeg-devel-irc] ffmpeg-devel.log.20161122

burek burek021 at gmail.com
Wed Nov 23 03:05:03 EET 2016


[00:11:51 CET] <cone-568> ffmpeg 03Jun Zhao 07master:e72662e131e5: lavc/vaapi_encode_h264: fix poc incorrect issue after meeting idr frame.
[00:11:52 CET] <cone-568> ffmpeg 03Mark Thompson 07master:f242e0a0ff0d: vaapi_encode: Fix format specifier for bitrate logging
[03:05:12 CET] <cone-568> ffmpeg 03James Almer 07master:0b8df0ce48e6: avformat/utils: add missing brackets around arguments in av_realloc() call
[03:19:19 CET] <cone-568> ffmpeg 03Steven Liu 07master:d316b21dba22: avformat/flvenc: add no_metadata to flvflags
[07:23:01 CET] <jya> BBB: the 65% improvement of the avx2 IDCT optimisation, how much does that translate to in actual decoding speed?
[07:29:24 CET] <atomnuker> jya: as in, the latest patch of all of the previous ones as well?
[07:30:40 CET] <jya> atomnuker: in reference to commit f0a2b6249bb2426befa4c03247361268e45b13af. BBB had mentioned that libvpx having those avx2 optimisations, is what closed the gap speed-wise between libvpx and ffvp9
[07:31:13 CET] <jya> but i'm wondering what it translates to in real-world use
[07:35:29 CET] <atomnuker> well, ffvp9 was already faster, but the gap probably increased
[07:35:36 CET] <atomnuker> he isn't online atm though
[09:51:19 CET] <[-T-]> hi all
[09:51:41 CET] <[-T-]> I'm trying VAAPI encoding on the latest master code, on a Skylake, and I get: [AVHWDeviceContext @ 0x1b43500] Failed to query surface attributes: 20 (the requested function is not implemented).
[09:51:55 CET] <[-T-]> is Skylake supported ?
[09:57:40 CET] <jkqxz> You are trying to use the proprietary iHD driver from the media SDK?  Don't do that, use the normal driver.
[09:59:48 CET] <[-T-]> nope
[09:59:49 CET] <jkqxz> (You can kindof hack it into working, but it has weird problems and is unlikely to be worth it.  It's easiest to just consider it a backend to libmfx.)
[09:59:54 CET] <[-T-]> the normal one from i965-intel
[10:00:01 CET] <BtbN> What Kernel are you on?
[10:00:17 CET] <[-T-]> i tried both 4.4 and 4.4 patched with MSDK
[10:00:26 CET] <[-T-]> what kernel is recommended ?
[10:00:29 CET] <BtbN> those are both too old for Skylake iirc
[10:00:44 CET] <[-T-]> well the 4.4 patched works with skylake and MediaSDK 2017
[10:00:52 CET] <[-T-]> but i guess it's not relevant ?
[10:01:08 CET] <[-T-]> i can easily try a newer one
[10:01:11 CET] <BtbN> You should forget that abomination even exists, makes your life a lot easier
[10:01:15 CET] <[-T-]> lol
[10:01:17 CET] <[-T-]> well well
[10:01:37 CET] <[-T-]> i also saw that the h264_qsv implementation changed completely
[10:01:44 CET] <[-T-]> between 3.2 and master
[10:01:56 CET] <[-T-]> so you recommend to use the vaapi one ?
[10:02:11 CET] <BtbN> With vaapi working fine, there is no reason to deal with the horrible patch-everything-fest QSV on linux requires.
[10:02:36 CET] <[-T-]> yes, but does it support VPP scaling for instance ?
[10:02:41 CET] <[-T-]> and deint etc
[10:02:52 CET] <BtbN> vaapi does, yes
[10:02:57 CET] <BtbN> no idea if it's implemented in ffmpeg though
[10:03:18 CET] <[-T-]> ok
[10:03:27 CET] <[-T-]> I have one extra question for you please
[10:03:47 CET] <[-T-]> could you confirm how the layers are actually ordered
[10:04:02 CET] <[-T-]> is it LIBMFX->VAAPI->HW
[10:04:08 CET] <[-T-]> for the MediaSDK imp
[10:04:15 CET] <[-T-]> and VAAPI->hw for the vaapi one ?
[10:04:34 CET] <BtbN> as the media-sdk one is closed source in most parts, only Intel knows
[10:05:32 CET] <[-T-]> ok
[10:05:46 CET] <[-T-]> i will try this VAAPI imp again
[11:44:25 CET] <BtbN> is stdcall == cdecl on x64?
[11:45:23 CET] <BtbN> Cause I'm getting (unrelated) warnings about "NVENCSTATUS (__cdecl *)(NV_ENCODE_API_FUNCTION_LIST *)", and that function should be stdcall
[11:50:55 CET] <nevcairiel> x64 only has one calling convention
[11:51:03 CET] <nevcairiel> stdcall/cdecl only applies to x86
[11:52:27 CET] <BtbN> interesting that it explicitly marks it as cdecl then
[12:10:40 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:5c02d2827bef: compat/cuda: add dynamic loader
[12:10:41 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:e6464a44eda9: avutil/hwcontext_cuda: use dynamically loaded CUDA
[12:10:42 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:d9ad18f3b4db: avcodec/cuvid: use dynamically loaded CUDA/CUVID
[12:10:43 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:a0c9e76942ed: avfilter/vf_scale_npp: use dynamically loaded CUDA
[12:10:44 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:a66835bcb16c: avcodec/nvenc: use dynamically loaded CUDA
[12:10:45 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:b0ca90d7cbcc: avfilter/vf_hwupload_cuda: use new hwdevice allocation API
[12:10:46 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:0faf3c3a25ed: avfilter/vf_hwupload_cuda: check ff_formats_ref for errors
[12:10:47 CET] <cone-500> ffmpeg 03Timo Rothenpieler 07master:8228b714be25: configure: cuda is no longer nonfree, enable and autodetect by default
[12:10:48 CET] <cone-500> ffmpeg 03Miroslav SlugeH 07master:c4aca65a42eb: avcodec/nvenc: maximum usable surfaces are limited to maximum registered frames
[12:10:49 CET] <cone-500> ffmpeg 03Miroslav SlugeH 07master:de2faec2faf3: avcodec/nvenc: better surface allocation alghoritm, fix rc_lookahead
[12:10:50 CET] <cone-500> ffmpeg 03Miroslav Slugen 07master:10db40f37430: avcodec/cuvid: allow setting number of used surfaces
[13:02:47 CET] <wm4> jkqxz: how do you gpu memcpy on CPUs that don't support SSE4?
[13:04:03 CET] <BtbN> slowly
[13:05:06 CET] <wm4> eh
[13:08:54 CET] <BtbN> wm4, http://git.videolan.org/?p=vlc.git;a=blob;f=modules/video_chroma/copy.c;h=444d47a7a75e56a2531132f918a61f4ee2d55652;hb=HEAD#l118
[13:11:34 CET] <nevcairiel> basically you just use generic memcpy
[13:11:53 CET] <nevcairiel> (any modern memcpy should use vector instructions)
[13:12:18 CET] <nevcairiel> just not the special ones for gpu surfaces, but if you dont have those..
[13:12:50 CET] <jkqxz> Or get the GPU to do it for you.  It can see some of CPU memory.
[13:15:29 CET] <wm4> nevcairiel: does "modern" mean "not mingw's"?
[13:15:40 CET] <wm4> since mingw-w64 might still use msvcrt's
[13:15:45 CET] <wm4> (the ancient one)
[13:16:07 CET] <nevcairiel> msvcrt isnt ancient as such, it gets updated
[13:16:27 CET] <nevcairiel> its the baseline the OS was build against
[13:32:07 CET] <wm4> anyway, I'm getting reports of slowness when not using the sse copy, so that's still somehow a problem
[13:33:29 CET] <nevcairiel> well of course it's a problem, there is a reason these special instructions are used, an optimized generic memcpy can't fix that without them
[13:33:44 CET] <nevcairiel> the only other option is what jkqxz said, get the GPU to copy it for you if the API supports it
[13:36:27 CET] <wm4> would that imply use of staging textures?
[13:36:55 CET] <nevcairiel> Depending on the api, perhaps
[13:39:23 CET] <jkqxz> The GPU can only see some of CPU memory, so you need to tell it at allocation time that you want that.  (On Linux libdrm allocates such things.)
[13:40:03 CET] <wm4> nevcairiel: well, dxva2
[13:40:08 CET] <jkqxz> If you need to be able to copy to arbitrary places then you need to copy twice, once with the GPU and once with normal memcpy.  This is still 9001 times faster than CPU access to the GPU memory without magic memcpy.
[13:40:14 CET] <wm4> d3d11va always forces staging textures, which also makes it slower than dxva2
[13:40:19 CET] <wm4> (for copy-back)
[13:41:10 CET] <wm4> unfortunately I can't even test this stuff, because on my laptop with intel gpu a normal memcpy is just as fast
[13:43:38 CET] <jkqxz> Doesn't that mean it's cheating and giving you a copy of the surface in CPU memory?
[13:44:08 CET] <wm4> I don't know, I'm just locking the texture
[13:56:31 CET] <nevcairiel> Some drivers do that, yes.
[13:59:14 CET] <jkqxz> That's actually rather nice because it (presumably?) gives you the right two-step copy with the first one being implicit.
[13:59:23 CET] <jkqxz> It does rather mess with testing, though.
[14:04:28 CET] <wm4> wouldn't you just end up copying 3 times?
[14:12:27 CET] <BtbN> But with an Intel GPU, it doesn't even have real vram, it's all just system ram
[14:26:09 CET] <wm4> doesn't it depend how the cache is configured? but dunno
[14:27:42 CET] <BtbN> I'm quite sure an Intel Linux dev told me that the split you configure in the BIOS is useless and you can set it to minimum without worrying, as it uses the system ram anyway
[14:28:02 CET] <BtbN> No idea how the situation is on Windows
[14:37:34 CET] <BtbN> philipl, dynload set is merged now, in case you missed it.
[14:43:24 CET] <DHE> So everyone talks about how ffmpeg (the CLI tool) is single-threaded. Do you think there's a benefit to adding threaded processing to it?
[14:45:20 CET] <DHE> (Yes, I'm offering to [try] doing the work)
[14:54:26 CET] <BtbN> It would definitely be beneficial if it would run all the things in parallel
[14:54:35 CET] <BtbN> it would be one hell of a synchronization nightmare though
[14:56:19 CET] <DHE> what i had in mind was that the avcodec_{en,de}code_{video,audio} calls would be wrapped in threads and have a small buffer limit (maybe 2 or 3 frames). the main thread would just focus on moving the data around, PTS work, etc while the threads deal with the en/decoders
[14:56:39 CET] <DHE> so the codecs wouldn't bottleneck the whole process
[14:57:10 CET] <DHE> my main motivation is that I do have multiple outputs in most of my jobs so it seems that I might benefit from something like this
[14:57:19 CET] <BtbN> I highly suspect that it's not that easy, and there's a lot more stuff to consider
[14:57:50 CET] <BtbN> It might even be easier to implement some kind of auto-fork and pipe logic
[14:58:14 CET] <BtbN> Where it spawns multiple ffmpeg processes, with one of them handling the decoding and common filtering, and one per-output process
[14:58:50 CET] <Compn> so ffmpeg, if using multiple outputs, just uses one thread? hmm
[14:58:57 CET] <DHE> that could be a cross-platform nightmare
[14:59:00 CET] <Compn> that would be nice to multithread
[14:59:05 CET] <Compn> win32 uses pthreads
[14:59:20 CET] <Compn> or can use pthreads... lavc threading works on windows
[14:59:25 CET] <DHE> Compn: in the main ffmpeg cli, yes. the codecs may be multi-threaded which improves things, but I'm guessing not much.
[14:59:25 CET] <Compn> so i don't think there's much of a problem there
[14:59:37 CET] <DHE> I meant cross-platform nightmare for "forking" which isn't really a windows thing.
[14:59:40 CET] <Compn> ah
[14:59:49 CET] <Compn> so i'd say copy how lavc does it :)
[14:59:57 CET] <Compn> to avoid cross-plat nightmares
[15:00:37 CET] <DHE> oh libav already has threading utility functions...
[15:02:32 CET] <wm4> DHE: actually ffmpeg.c already uses threads
[15:02:37 CET] <wm4> for input (for whatever reason)
[15:02:50 CET] <BtbN> because it can block and stall the whole thing probably
[15:03:02 CET] <wm4> also, decoding is already multithreaded, so adding a threaded buffer as you suggested will in most cases help little
[15:03:13 CET] <DHE> wm4: that is true, yes
[15:03:21 CET] <BtbN> it still runs the decoders sequentially though
[15:03:28 CET] <BtbN> it's only multi threaded in the decoder itself
[15:03:46 CET] <wm4> on the other hand, this kind of buffer could just be added to libavcodec itself (for things which aren't threaded, like some dumb codecs)
[15:04:10 CET] <BtbN> Running filters and encoder in parallel would be much more interesting
[15:04:19 CET] <wm4> well the fact that decoders are threaded means the decode call returns almost immediately
[15:06:03 CET] <wm4> on the other hand, some cases might benefit from this
[15:06:27 CET] <wm4> e.g. hevc decoding is going to be very slow, while not saturating all available threads
[15:06:45 CET] <DHE> there's actually a lot of places I've wanted to put multi-threading that don't exist yet. filtergraphs would be one possibility. a long filtergraph could probably run each individual filter in a thread. complex graphs with [a]split filters could handle multiple paths simultaneously for example...
[15:07:02 CET] <DHE> but making a simple work queue on the decoder/encoder in the ffmpeg cli is the lower hanging fruit
[15:07:21 CET] <wm4> that's a whole different issue... though I bet running every single filter in its own thread would make things much slower
[15:10:34 CET] <Compn> so we're going to have cpu with 128 cores in the future? :\
[15:10:45 CET] <Compn> hehe
[15:11:17 CET] <microchip_> Compn: there are already CPUs with at least that many threads overall
[15:11:41 CET] <DHE> well, I have a machine with 20 cores + hyperthreading (per CPU) and I have an itch to put it to good use. multiple ffmpeg instances only scales so far
[15:12:14 CET] <microchip_> Xeon?
[15:12:17 CET] <DHE> (bottleneck is actually IO)
[15:12:22 CET] <DHE> yes, xeon e5-2698 v4
[15:12:27 CET] <microchip_> nice!
[15:15:48 CET] <jkqxz> DHE:  What is your use-case where multiple ffmpeg instances isn't the answer for maximising throughput?
[15:16:26 CET] <jkqxz> To me it seems like a rare case that you would want to get as much parallelism as possible with only one instance (the parts where it matters are generally already covered by the individually threaded components).
[15:17:02 CET] <DHE> jkqxz: the disks are spindles and there's a lot of seeking going on to keep the ffmpeg processes fed alongside writing out the (multiple) outputs
[15:17:20 CET] <DHE> I suspect that's a big part of why it's not scaling as well as I would hope...
[15:18:36 CET] <DHE> arguably that's my fault for not specing the machine out properly
[15:59:25 CET] <philipl> BtbN: yay. can you review the p016 changes now? :-)
[16:12:57 CET] <BBB> 20 cores wtf
[16:12:59 CET] <BBB> I want that also
[16:21:06 CET] <BtbN> philipl, They kind of depend on the P016 pixel format to exist first
[16:22:23 CET] <philipl> BtbN: I sent that review too
[16:23:32 CET] <BtbN> One thing I'm not sure about is whether calling ff_get_format outside of the init function is even supported
[16:25:07 CET] <wm4> it is
[16:25:15 CET] <wm4> the native decoders do it on stream changes
[16:25:17 CET] <wm4> and even on seeks
[16:26:15 CET] <BtbN> Ok, so that should be fine then.
[16:26:28 CET] <BtbN> Only thing is that the hwupload_cuda changes should be in their own commit
[16:26:39 CET] <philipl> I'll split that out
[16:27:14 CET] <BtbN> I'm also still not sure how good of an idea it is to implement this without any kind of official statement or documentation from nvidia about it even existing.
[16:27:31 CET] <philipl> A question: Should I mark decoded 10bit content as P010? It might make interoperability easier.
[16:27:55 CET] <BtbN> Does cuvid even give that information?
[16:28:22 CET] <BtbN> ah, yeah. It has bit_depth_luma_minus8
[16:28:27 CET] <philipl> It doesn't. The only output format they added is P016 but you know that decoded 10bit content in P016 is indistinguishable from P010
[16:28:45 CET] <wm4> did you check bitexactness?
[16:29:01 CET] <philipl> No, actually, I can do that.
[16:35:17 CET] <DHE> BBB: not when you see the price tag you won't
[16:36:05 CET] <philipl> wm4: So, 10bit ends up bitexact in framemd5 if I report it as P010. if I report it as P016, it does not. That might be a sign my swscale code is bad :-P
[16:36:08 CET] <philipl> Need to check 12bit
[16:36:24 CET] <wm4> neat
[16:36:34 CET] <wm4> libswscale might just randomly mangle the data, that's normal
[16:37:50 CET] <philipl> How do I control the output format for framemd5? -pix_fmt?
[16:40:15 CET] <wm4> no idea... probably
[16:42:17 CET] <philipl> So, right now framemd5 really wants the data to be in the same format as the initial input format (yuv420p12le in the 12bit case), so it does a conversion.
[16:42:24 CET] <philipl> That should be lossless, in theory
[16:42:36 CET] <philipl> but it's b0rked. I expect it's my sws code.
[16:44:35 CET] <BtbN> I wouldn't be surprised if nvidia actually documents the new pix format as P01x or something like that
[16:44:54 CET] <BtbN> So using the correct P010 format depending on the bit depth seems reasonable to me
[16:45:25 CET] <philipl> Yeah.
[16:46:56 CET] <philipl> Ehh. So it's auto-inserting a scaler and then deciding it can't convert to p016 up front, rather than waiting until it learns the decoder is outputting p016
[16:47:48 CET] <JEEB> that's always fun
[16:47:50 CET] <BtbN> The path of least resistance for that is probably adding output support for p016
[16:47:58 CET] <philipl> Yes.
[16:48:17 CET] <philipl> Funnily, if I declare output support, it will get past that point then see it doesn't need to and then it works.
[16:48:21 CET] <philipl> that's with no actual conversion code.
[16:48:38 CET] <philipl> Of course, if I want the software decoder to do framemd5 in p016, I need to write output conversion code.
[16:48:39 CET] <nevcairiel> it shouldnt need to convert to p016, only from =p
[16:49:01 CET] <philipl> If I want framemd5 values in p016 to compare, I do :-)
[16:49:03 CET] <BtbN> well, unless at some point nvenc can encode that
[16:49:11 CET] <philipl> yeah.
[16:49:28 CET] <nevcairiel> why not just convert the p016 down
[16:49:35 CET] <nevcairiel> instead of the software decoder up
[16:49:38 CET] <philipl> Ok, so P010 was confirmed bitexact with a conversion back to yuv420p10le.
[16:50:18 CET] <philipl> When I converted the same content in P016 down to yuv420p10le it was not bitexact.
[16:50:26 CET] <BtbN> is the nvenc patch even needed if cuvid outputs actual P010?
[16:50:27 CET] <philipl> So I think it's safe to say my swscale code is wrong.
[16:50:39 CET] <philipl> BtbN: It's needed for decoded 12bit
[16:50:48 CET] <nevcairiel> not necessarily, sws may try to do some aggressive dithering when you reduce bitdepth
[16:50:51 CET] <philipl> You don't want to force a software 'conversion'
[16:51:07 CET] <philipl> Hmm. Fair.
[16:51:20 CET] <philipl> Let me try to force yuv420p16
[16:51:54 CET] <philipl> Nope.
[16:52:22 CET] <philipl> argh. I was using the 10bit sample again for the software framemd5 :-(
[16:53:18 CET] <philipl> but nope.
[16:53:20 CET] <philipl> Still not the same
[16:54:47 CET] <philipl> but again, hard to prove it didn't try and do something smart as the swscale conversion is not the same in each case.
[16:59:30 CET] <philipl> BtbN: hrm. you explicitly needed to add unscaled conversion from 8bit to p010. So I probably am getting screwed by scaling here, even going from 12 -> 16bit?
[17:00:17 CET] <BtbN> I added them because the automatic one was horribly slow
[17:01:21 CET] <philipl> but were they bitexact before?
[17:01:50 CET] <philipl> you had to change test hashes
[17:02:32 CET] <BtbN> The results do differ
[17:02:39 CET] <BtbN> but I have no idea which of them is bitexact
[17:03:15 CET] <philipl> heh.
[17:03:38 CET] <philipl> Ok. so we've gone around a couple of circles here.
[17:03:55 CET] <philipl> 1) 10bit content, declared as P010 is bitexact in framemd5.
[17:04:27 CET] <philipl> 2) 10bit content declared as P016 and 12bit content (obviously P016) are not bitexact in framemd5 but I strongly suspect that's swscale making a mess of things.
[17:05:18 CET] <BBB> your code isnt necessarily wrong
[17:05:27 CET] <BBB> swscale may have a fast p010-to-yuv420p10 codepath
[17:05:31 CET] <BBB> that doesnt handle p016
[17:05:33 CET] <philipl> It does
[17:05:39 CET] <philipl> btbn added it
[17:05:43 CET] <BBB> oh
[17:05:46 CET] <BBB> <- stupid
[17:05:57 CET] <philipl> actually, it's the other way around.
[17:06:06 CET] <BtbN> there is fast yuv420p to p010
[17:06:17 CET] <nevcairiel> yuv420pbXX is sws internal format
[17:06:40 CET] <philipl> yes, I see it.
[17:06:40 CET] <BtbN> or was it nv12?
[17:06:43 CET] <philipl> both
[17:07:19 CET] <philipl> anyway.
[17:07:23 CET] <philipl> So, question:
[17:07:36 CET] <philipl> I'll update the diff to report 10bit as P010.
[17:07:46 CET] <philipl> Are we ok with the unclear nature of the 12bit stuff?
[17:08:00 CET] <cone-763> ffmpeg 03Timo Rothenpieler 07master:5ea8f7062300: avcodec/libx264: fix forced_idr logic
[17:28:30 CET] <philipl> BtbN: I've dropped the nvenc patch now that I'm calling 10bit content P010.
[17:53:09 CET] <wm4> philipl: what happens on older drivers?
[17:58:19 CET] <philipl> wm4: You'll get an error initializing the decoder with the unknown output format.
[17:59:34 CET] <wm4> sounds good
[17:59:58 CET] <wm4> what happens the other way around if the input video's profile is not supported by the decoder?
[18:00:33 CET] <philipl> You get another form of initialization error.
[18:01:03 CET] <philipl> 12bit is new in 375 as well, so you'd get that error on an older driver.
[18:03:07 CET] <wm4> or older hw I suppose
[18:03:54 CET] <philipl> Yes.
[18:06:09 CET] <wm4> that sounds pretty good
[18:17:28 CET] <gabrieliv> warning: avcodec_decode_video2 is deprecated, what should I use instead?
[18:21:50 CET] <jamrial> gabrieliv: avcodec_send_packet() and avcodec_receive_frame(). read the doxy in avcodec.h
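A hedged sketch of the replacement API jamrial points at: packets go in via avcodec_send_packet() and frames are drained via avcodec_receive_frame() until the decoder asks for more input. Error handling is abbreviated, and it needs FFmpeg 3.1+ headers to build; see the doxygen in avcodec.h for the full contract:

```c
#include <libavcodec/avcodec.h>

/* Decode one packet; pkt == NULL flushes the decoder at EOF. */
static int decode(AVCodecContext *avctx, AVFrame *frame, AVPacket *pkt)
{
    int ret = avcodec_send_packet(avctx, pkt);
    if (ret < 0)
        return ret;

    while (ret >= 0) {
        ret = avcodec_receive_frame(avctx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
            return 0;           /* needs more input, or fully drained */
        if (ret < 0)
            return ret;         /* real decoding error */

        /* ...use the decoded frame here... */
        av_frame_unref(frame);
    }
    return 0;
}
```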
[18:24:50 CET] <BtbN> philipl, where does the avctx->sw_pix_fmt come from? It's used to initialize the hwframes ctx, but it's never set.
[18:25:29 CET] <philipl> Heh. It's set by ff_get_format. Which is insane, but there you go.
[18:33:04 CET] <BtbN> what? So it has logic to also negotiate a software format, in case the pixel format is a hw one?
[18:33:39 CET] <BtbN> Also, wouldn't it be easier and shorter to only have one pix_fmt enum array, and set the second element, instead of 3 full arrays?
[18:34:11 CET] <BtbN> Would also be more future proof in case another format gets added
[18:35:44 CET] <philipl> ff_get_format doesn't do the right thing if there are multiple software formats
[18:35:49 CET] <philipl> It *always* chooses the last one
[18:36:07 CET] <philipl> So it can choose between hw and sw correctly but does nothing smart about the sw value
[18:36:39 CET] <philipl> Oh, I see. Change the second value.
[18:36:41 CET] <philipl> Yeah, I can do that.
[18:41:17 CET] <philipl> BtbN: https://github.com/philipl/ffmpeg/tree/cuvid
[18:41:23 CET] <philipl> I'll just keep updating that until you're happy with it
[18:46:34 CET] <BtbN> have you tried enum values higher than 1, btw.?
[18:46:39 CET] <BtbN> For the cuvid pixel format, that is
[18:47:45 CET] <BtbN> But what I meant is: https://github.com/philipl/FFmpeg/commit/d6a40c8f5ec5b849821bfc56dee5e005ddddbe6a#diff-b103d53200c9b354278248e68430a7d0R260
[18:48:01 CET] <BtbN> where is that sw_pix_fmt set? Does ff_get_format really do that, if the primary format is a hw one?
[18:48:34 CET] <philipl> BtbN: I have tried 2, and 2 doesn't work. I didn't try higher this time, although I did in the past.
[18:48:47 CET] <philipl> Yes, ff_get_format really does it.
[18:49:27 CET] <philipl> https://github.com/philipl/FFmpeg/blob/master/libavcodec/utils.c#L1122
[18:49:33 CET] <philipl> It's insane, but there it is.
[18:50:03 CET] <BtbN> always hardcoded to the last, wow
[18:50:55 CET] <BtbN> but yeah, otherwise this looks fine to me now.
[18:51:17 CET] <wm4> this code is basically designed for hwaccels
[18:51:23 CET] <wm4> doesn't make too much sense for hw decoders
[18:52:18 CET] <wm4> uh it even sets sw_pix_fmt for non-hwaccels
[18:52:26 CET] <wm4> so correction: this code doesn't make too much sense
[18:52:43 CET] <philipl> yep
[18:52:57 CET] <philipl> It's... highly predictable...
[18:53:34 CET] <BtbN> it would seem like a good idea to add a check for the selected format being a hwaccel one
[18:54:41 CET] <philipl> BtbN: So, if you're using the software read-back path, it won't be an hwaccel one.
[18:54:59 CET] <BtbN> hm?
[18:55:01 CET] <philipl> That's the value here, it's able to choose AV_PIX_FMT_CUDA vs AV_PIX_FMT_NV12 (or whatever)
[18:55:20 CET] <philipl> depending on how its being used. yeah?
[18:55:27 CET] <philipl> or are you asking a different question?
[18:55:56 CET] <BtbN> I mean just moving the setting of sw_pix_fmt to the end of the function, surrounded by an if(is_hwaccel_pix_fmt(ret))
[18:57:12 CET] <philipl> yeah.
[18:57:47 CET] <philipl> Ok, so I'll push the p016 pix fmt (but not the swscale input part) and then the cuvid changes.
[18:57:54 CET] <philipl> No one's reviewed the swscale part yet.
[19:08:02 CET] <philipl> BtbN: is this a micro version bump for avcodec?
[19:08:22 CET] <BtbN> yes
[19:08:26 CET] <philipl> and avutil for the hwcontext_cuda?
[19:08:45 CET] <BtbN> hm, if in doubt, it's always a micro bump.
[19:08:55 CET] <philipl> heh
[19:09:05 CET] <BtbN> I wonder if it'd make sense to also add support to hwupload_cuda
[19:10:30 CET] <philipl> presumably - but for nvenc more than anything else.
[19:11:06 CET] <philipl> one line change?
[19:11:43 CET] <BtbN> haven't looked at it, but as it mostly just forwards to the hwcontext functions, it can't be much more
[19:13:24 CET] <cone-763> ffmpeg 03Philip Langdale 07master:237421f14973: avutil: add P016 pixel format
[19:13:25 CET] <cone-763> ffmpeg 03Philip Langdale 07master:8d6c358ea8ec: libavutil/hwcontext_cuda: Support P010 and P016 formats
[19:13:26 CET] <cone-763> ffmpeg 03Philip Langdale 07master:81147b5596ea: avcodec/cuvid: Add support for P010/P016 as an output surface format
[19:20:39 CET] <gabrieliv> I need to compile and run some ffmpeg code on a remote machine where I don't have root access to install the ffmpeg-devel package. I've copied the ffmpeg source code snapshot to the remote machine and tried to compile my code with gcc -I/path/to/snapshot, but this fails due to the missing .so libraries (e.g. libavformat.so, libavcodec.so etc) that need to be linked together with the object files into my executable. Is there any way I could get the ffmpeg .so files in order to successfully build my code?
[19:21:49 CET] <llogan> wrong channel. see #ffmpeg
[19:24:12 CET] <gabrieliv> OK
[20:33:04 CET] <cone-763> ffmpeg 03Alex Converse 07master:3ee59939a1c1: libvpxenc: Support targeting a VP9 level
[21:11:20 CET] <microchip_> anyone working on AC-4 support? Is that even doable atm?
[21:19:44 CET] <JEEB> the "spec" is around
[21:19:51 CET] <JEEB> but I don't think anyone has worked on it
[21:19:58 CET] <JEEB> there are some DASH samples with it I hear
[21:20:14 CET] <microchip_> ic
[21:21:13 CET] <JEEB> http://testassets.dashif.org/#feature/details/57cd83dfb626efae4d44d458
[21:21:29 CET] <JEEB> the AC-4 test vectors at dashif
[21:26:48 CET] <microchip_> apparently 192 kbps is enough for 5.1 AC-4.... O.O
[21:27:02 CET] <microchip_> how the hell did they pull that off?
[21:27:59 CET] <llogan> cp opus AC-4
[21:28:09 CET] <microchip_> lol
[21:42:31 CET] <wm4> I surely hope AC-4 has so little success that it'll be the biggest embarrassment to its creators ever
[21:44:34 CET] <nevcairiel> is that the object based format
[21:44:44 CET] <wm4> oh some broadcasters already picked it up... looks like mankind is the embarrassment
[21:44:53 CET] <j-b> You mean ATSC picked it up?
[21:46:04 CET] <wm4> and ETSI according to wikipedia
[21:46:32 CET] <nevcairiel> so basically all the broadcasters? :p
[21:46:40 CET] <wm4> I'm not sure what kind of BSE these people have
[21:46:46 CET] <j-b> No, ETSI does not pick anything
[21:46:57 CET] <wm4> fuck them anyway
[21:46:57 CET] <j-b> They standardize what they are given
[21:47:09 CET] <j-b> They standardized a lot of things that never went in prod
[21:47:13 CET] <j-b> like DVB-CSA2
[21:47:14 CET] <nevcairiel> whats so bad about ac-4 anyway
[21:47:47 CET] <TD-Linux> other than it probably costing $$$$?
[21:47:59 CET] <nevcairiel> well so do practically all other formats they use
[21:48:01 CET] <TD-Linux> one good thing: it can't be as bad as ac3
[21:48:07 CET] <wm4> maybe because its only merit is that it's patented
[21:48:11 CET] <TD-Linux> though ac3 is expiring / has expired
[21:48:18 CET] <TD-Linux> actually ^ is probably its main merit
[21:50:36 CET] <microchip_> i use currently e-ac-3 for my movies
[21:50:39 CET] Action: microchip_ runs
[21:51:35 CET] <wm4> microchip_: I'm so sorry
[21:51:39 CET] <microchip_> :D
[21:51:49 CET] <nevcairiel> i just leave the movies in whatever format they come in
[21:54:59 CET] <microchip_> nevcairiel: i would too, but a) i don't have high-end gear to hear the diff, so I thought since i can't tell the diff between DTS/TrueHD lossless and 768 kbps e-ac-3, I'd go for the latter to save space and b) DTS core needs too high a bitrate to sound like e-ac-3
[21:58:09 CET] <TD-Linux> looking at the spec it's not an opus copy, but not very novel either
[21:58:24 CET] <TD-Linux> atomnuker will enjoy the 960 and 1920 sample dcts
[21:58:35 CET] <microchip_> :D
[21:59:39 CET] <atomnuker> not baffled at all, ac-3 used messed-up 512 windows and even more messed up 256 windows
[22:00:42 CET] <atomnuker> they're great at fucking up the most fundamental parts
[22:01:21 CET] <TD-Linux> (or maybe he won't because no one cares about ac-4)
[22:01:39 CET] <nevcairiel> once it starts showing in actual user-facing broadcasts, people might care
[22:01:59 CET] <atomnuker> like, I dunno, 2 organs responsible for hearing require more than channels
[22:02:02 CET] <microchip_> wait till yamaha/onkyo/denon adopt it
[22:03:13 CET] <llogan> Don't forget Coby
[22:03:34 CET] <microchip_> ;)
[23:40:56 CET] <jkqxz> BtbN:  Do you want to make any comment on the VAAPI H.265 scaling list patch?  (You wrote that file.)
[23:47:06 CET] <jkqxz> BtbN:  I would intend to just apply it as-is - it looks right and works on Skylake for me.  (And I doubt anyone else will be interested in commenting.)
[23:49:40 CET] <rcombs> why would broadcasters want to use AC-4 instead of Opus
[23:49:43 CET] <rcombs> idgi
[23:50:09 CET] <rcombs> I get why the fuckers at Dolby would want to come up with a new profit center, but I don't get why anyone would bite
[23:50:38 CET] <kierank> rcombs: object based audio
[23:50:54 CET] <rcombs> broadcasters care about that shit?
[23:51:02 CET] <kierank> yes
[23:51:10 CET] <rcombs> also doesn't Google have a free spec for that now
[23:51:15 CET] <kierank> they've been told people will spend money on tv for that
[23:51:22 CET] <kierank> no
[23:51:44 CET] <j-b> Opus can do Object-Based Audio.
[23:51:48 CET] <rcombs> I think they've heard wrong
[23:52:16 CET] <rcombs> (I'm doubtful many people will spend money on TV for the sake of object audio)
[23:52:25 CET] <rcombs> j-b: well there you go
[23:52:27 CET] <j-b> They will buy new TVs for 3D /s
[23:52:28 CET] <kierank> j-b: afaik object based audio != ambisonics
[23:52:36 CET] <j-b> kierank: correct.
[23:52:38 CET] <wm4> yeah, got to invent some new consumer gimmick
[23:52:42 CET] <kierank> j-b: so what does opus support
[23:52:56 CET] <kierank> http://www.pocket-lint.com/news/139452-bt-launches-dolby-atmos-sound-with-4k-tv-packages-get-the-stadium-experience-at-home
[23:52:57 CET] <wm4> just to create a reason to buy new expensive hw
[23:52:59 CET] <j-b> kierank: various "layouts"
[23:53:09 CET] <kierank> j-b: fixed layouts, no
[23:53:13 CET] <j-b> kierank: you have a stereo layout, a multichannel layout, an ambisonic layout
[23:53:24 CET] <j-b> kierank: but opus allows any layout, up to 255 channels
[23:53:34 CET] <kierank> afaik dolby atmos is dynamic layout or something
[23:53:36 CET] <kierank> based on the room
[23:53:42 CET] <j-b> so it's trivial to put 255 objects in Opus
[23:53:52 CET] <kierank> but the mixing data isn't there
[23:53:52 CET] <j-b> the dynamic can be done in the metadata
[23:53:55 CET] <rcombs> so you'd transmit location information in sideband data
[23:54:00 CET] <j-b> rcombs: yep
[23:54:01 CET] <kierank> j-b: where, opus doesn't have metadata?
[23:54:06 CET] <rcombs> and do mixing at a higher level?
[23:54:09 CET] <rcombs> that actually sounds sane
[23:54:21 CET] <wm4> +in
[23:54:24 CET] <kierank> you've already lost with sideband data in broadcast
[23:54:35 CET] <rcombs> wm4: well, sane compared to atmos
[23:54:39 CET] <j-b> like HDR10 vs HLG
[23:54:44 CET] <j-b> or Dolby Vision
[23:54:53 CET] <j-b> Broadcast people are so sloooow to react
[23:54:56 CET] <rcombs> Dolby Vision, bahahahaha
[23:55:12 CET] <rcombs> and isn't HDR10 just a fancy name for HEVC Main 10 + BT2020
[23:55:18 CET] <microchip_> why all the Dolby hate? :P
[23:55:18 CET] <j-b> they want HLG when all TVs have Dolby-Vision, and HDMI and Windows/Android use HDR10
[23:55:30 CET] <j-b> rcombs: + metadata
[23:55:35 CET] <kierank> in my live transmission, how do I transport this metadata
[23:55:55 CET] <j-b> where you want :)
[23:56:07 CET] <j-b> you have the same issue for HDR
[23:56:16 CET] <kierank> hlg is backwards compatible
[23:56:53 CET] <j-b> too late... I'm afraid.
[23:56:58 CET] <wm4> microchip_: because they're shitty assholes with bad tech?
[23:57:14 CET] <microchip_> wm4: better than DTS imho :p
[23:57:15 CET] <j-b> But no, the broadcaster will use AC-4 and get us 20 years of Dolby Patents
[23:57:24 CET] <kierank> only the french
[23:57:30 CET] <j-b> lol
[23:57:31 CET] <j-b> touché
[23:57:44 CET] <wm4> the french?
[23:57:46 CET] <j-b> but also the american, who are going with AC4 and MMT
[23:58:03 CET] <kierank> well atsc3 won't go anywhere
[23:58:04 CET] <kierank> so that's fine
[23:58:59 CET] <j-b> kierank: may $deity make you correct!
[23:59:52 CET] <wm4> apropos awesome tech... is there really no good way to determine the duration of gifs?
[00:00:00 CET] --- Wed Nov 23 2016


More information about the Ffmpeg-devel-irc mailing list