[Ffmpeg-devel-irc] ffmpeg-devel.log.20190323
burek
burek021 at gmail.com
Sun Mar 24 03:05:04 EET 2019
[00:00:00 CET] --- Sat Mar 23 2019
[00:45:13 CET] <Matti-> i found an amazing program that scans mp3 files and tells which mp3 encoder was used. if this can be done for mp3, is it possible to have a program that does the same for AAC and tells which AAC encoder was used
[00:45:29 CET] <Matti-> oops wrong channel
[01:00:08 CET] <cone-219> ffmpeg James Almer n4.1.2:HEAD: Merge commit '0676de935b1e81bc5b5698fef3e7d48ff2ea77ff'
[05:56:11 CET] <cone-682> ffmpeg Jun Zhao master:305025c8aede: lavfi/sidedata: add missed frame side data type
[05:56:11 CET] <cone-682> ffmpeg Jun Zhao master:fba42b33b7f2: lavf/flvdec: fix typo in log message
[08:04:26 CET] <cone-682> ffmpeg Steven Liu master:2cb29a5d8de0: avformat/avformat.h: update the comment from deprecated to new API
[08:04:27 CET] <cone-682> ffmpeg hwrenx master:bf05f621d583: lavc/libdavs2: add davs2_flush
[08:04:28 CET] <cone-682> ffmpeg hwrenx master:5252d594a155: lavc/libdavs2: fix frame dumping error description
[12:55:24 CET] <BtbN> I feel like I have forgotten one or two nvidia related patches, but I can't find them on the ML anymore.
[13:04:27 CET] <JEEB> was npp or cuda the thing we were trying to get rid of?
[13:05:53 CET] <JEEB> since I see a cuda crop filter being posted on the ML
[13:07:33 CET] <jkqxz> CUDA crop? Isn't that redundant, since you already have the cropping information on hardware frames?
[13:08:07 CET] <jkqxz> Hmm. Apparently I never merged the generic crop filter.
[13:15:34 CET] <jkqxz> That CUDA version seems wildly complicated. Why does it even copy? Aren't the CUDA data references just pointers anyway?
[13:16:03 CET] <jkqxz> (Or meant to look like them, at least.)
[13:16:19 CET] <nevcairiel> they are, and it really confused me when i saw it the first time, when I thought "why is someone doing pointer arithmetic on these opaque hardware pointers"
[13:16:55 CET] <BtbN> Those opaque hardware pointers can be cast to actual pointers and they just work most of the time as well.
[13:17:57 CET] <BtbN> Can you even do full crop, like top and left side, with just pointer arith?
[13:17:58 CET] <nevcairiel> that then leaves the question, why bother with cropping if you can just increment pointers
[13:18:08 CET] <nevcairiel> sure
[13:18:08 CET] <BtbN> well, top obviously, but left side
[13:18:24 CET] <nevcairiel> well in software you can
[13:18:45 CET] <BtbN> I'm pretty sure that would mess up the strict alignment requirements CUDA has on the pointers.
[13:18:51 CET] <nevcairiel> in software the only problem that comes up is that the resulting pointers are no longer aligned
[13:18:54 CET] <BtbN> So you could only crop in multiples of 256 or even 512
[13:19:09 CET] <nevcairiel> if hardware has similar annoyances, then you cant
[13:19:22 CET] <BtbN> Alignment requirements on CUDA frames are super strict
[13:19:48 CET] <BtbN> 256 byte aligned minimum, depending on GPU gen
[13:19:56 CET] <jkqxz> Do NVENC or any other things support the AVFrame cropping fields?
[13:20:27 CET] <BtbN> no
[13:20:31 CET] <jkqxz> Let me just send the generic one again. It's from a year ago, but still works nicely for VAAPI at least.
[13:20:49 CET] <jkqxz> Could scale_cuda support it easily?
[13:20:50 CET] <BtbN> I don't remember nvenc having cropping parameters either. nvdec has
[13:21:23 CET] <BtbN> Adding support for it to scale_cuda would require pretty much the same kind of code that filter adds.
[13:21:27 CET] <nevcairiel> in a perfect world you would want to get rid of cropping parameters when transcoding
[13:21:41 CET] <nevcairiel> because many players dont support top/left either
[13:22:02 CET] <BtbN> I'm not aware of nvenc supporting cropping parameters either, as in, specifying an input rect
[13:22:49 CET] <BtbN> It defines a NVENC_RECT type, but then proceeds to never use it
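A minimal C sketch of the pointer-arithmetic crop discussed above, assuming a single packed plane. The `Plane` struct, `crop_by_offset` and its `align` parameter are hypothetical names for illustration, not FFmpeg or CUDA API, and the 256-byte figure used in the comments is BtbN's estimate from the chat rather than a verified constant. It shows why a top crop keeps alignment whenever the line size is a multiple of the required alignment, while a left crop only does so for offsets that are themselves multiples of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plane description; not an FFmpeg or CUDA type. */
typedef struct Plane {
    uint8_t  *data;     /* first byte of the first row */
    ptrdiff_t linesize; /* bytes per row, including padding */
} Plane;

/*
 * Crop by moving the start pointer down `top` rows and right `left` pixels.
 * `align` is the assumed alignment the consumer requires (e.g. 256 bytes).
 * Returns 0 on success, or -1 (leaving the plane untouched) if the new
 * start pointer would no longer satisfy that alignment.
 */
static int crop_by_offset(Plane *p, int left, int top,
                          int bytes_per_pixel, size_t align)
{
    uint8_t *cropped;

    assert(p && p->data && left >= 0 && top >= 0 && bytes_per_pixel > 0);

    cropped = p->data + (ptrdiff_t)top * p->linesize
                      + (ptrdiff_t)left * bytes_per_pixel;

    if (align > 1 && (uintptr_t)cropped % align != 0)
        return -1; /* a copy into a fresh, aligned buffer would be needed */

    p->data = cropped;
    return 0;
}
```

When the function returns -1, the caller would have to fall back to copying the cropped region, which is presumably why the posted CUDA filter copies at all.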
[13:28:22 CET] <nevcairiel> in related topics, do the 16-series cards have the full turing video engine? I would assume so, right?
[13:28:47 CET] <BtbN> I'd assume the same, but lacking any Turing card, I can't test it.
[13:29:55 CET] <nevcairiel> well i have a 20 series, but that doesnt really allow me to extrapolate to the 16 series that I might want to plug into a media system
[13:31:19 CET] <BtbN> If they were to release a 16 series card with 8GB RAM like a 1670, I'd probably get it.
[13:33:17 CET] <nevcairiel> i dont think the 16-series is going to go upwards in performance, because it would seriously infringe on the 2060 then
[13:46:35 CET] <BtbN> Will need to get a 2070 or 2080 then at some point
[13:46:44 CET] <BtbN> My 1060 is already limiting me in some games
[18:52:11 CET] <durandal_1707> BBB: now what about qpel? let's use (A+B+1)>>1 for 1/2pel and (6A+2B+4)>>3 for 1/4pel and (2A+6B+4)>>3 for 3/4pel, how to make 16 combinations from those?
[18:52:55 CET] <durandal_1707> 0,0 is copy
[19:02:49 CET] <kierank> See how h264 does it
[19:08:16 CET] <durandal_1707> h264 is mess
[19:28:16 CET] <iive> i thought that all MC already do that. I even remember that RV40 turned out to use the wrong interpolation function for one of the cases e.g. [3,3]/4
[19:33:03 CET] <BBB> wait, wait
[19:33:15 CET] <BBB> the reason that h264 is the way it is, is because halfpel is special-cased
[19:33:23 CET] <BBB> the idea in h264 is that qpel positions are derived from hpel
[19:33:26 CET] <BBB> this is hell for decoders
[19:33:47 CET] <BBB> but for encoders, it is supposedly better
[19:34:02 CET] <BBB> that's why h264 is the way it is (in terms of dsp function pointer table design)
[19:34:09 CET] <BBB> for a more generic system, you don't need that
[19:34:24 CET] <BBB> just use a subpel position function argument x/y
[19:34:31 CET] <BBB> and use that as multiplier for A
[19:34:57 CET] <BBB> if mx is in range of [0..3], then the B coefficient is simply mx
[19:35:39 CET] <BBB> so out=A+(((B-A)*mx+2)>>2)
[19:35:40 CET] <BBB> IIRC
[19:35:48 CET] <BBB> it's just standard bilinear
[19:35:59 CET] <BBB> and then store that in temp, do the same for my vertical
[19:36:00 CET] <BBB> etc.
[19:36:05 CET] <BBB> durandal_1707: ^^
[19:55:47 CET] <durandal_1707> BBB: what about my? i have 16 combinations, 4Xx4Y
[19:57:25 CET] <BBB> but you don't need 16 functions
[19:57:30 CET] <BBB> just pass mx/my as function arguments
[19:57:36 CET] <BBB> and do the function I just gave you per direction
[19:57:39 CET] <BBB> it's simple bilinear
[19:57:43 CET] <BBB> there's nothing complex about it
[19:58:26 CET] <BBB> durandal_1707: if I'm not making sense, let me know, I can explain in some more detail if that helps
[19:59:29 CET] <iive> some of the 16 combinations are repeats.
[19:59:32 CET] <durandal_1707> so first 2 bits are x part of coded motion vector and rest is 2 bits for y part, that gives 0-15 values
[20:00:25 CET] <durandal_1707> first 4 cases are pure horizontal?
[20:04:15 CET] <durandal_1707> BBB: i do not understand how to do vertical - for which cases it is done?
[20:05:10 CET] <BBB> so you have a motion vector right?
[20:05:15 CET] <BBB> and the lowest 2 bits are fractional
[20:05:22 CET] <BBB> that is, they are the subpixel position
[20:05:29 CET] <BBB> of each (x, y) component
[20:05:31 CET] <BBB> yes?
[20:05:35 CET] <BBB> does that make sense so far?
[20:06:06 CET] <BBB> let's call this set of 2x 2 bits "mx" and "my"
[20:06:17 CET] <BBB> mx is the x mv subpixel component
[20:06:24 CET] <BBB> my is the y mv subpixel component
[20:06:26 CET] <BBB> ok?
[20:06:31 CET] <durandal_1707> ok
[20:06:46 CET] <BBB> ok, let's start with horizontal
[20:06:52 CET] <BBB> you said, for mx=0, the function is A
[20:07:02 CET] <BBB> for mx=1, it's (A*6+B*2+4)>>3
[20:07:10 CET] <BBB> for mx=2, it's (A+B+1)>>1
[20:07:27 CET] <BBB> for mx=3, it's (A*2+B*6+4)>>3
[20:07:28 CET] <BBB> ok?
[20:07:34 CET] <durandal_1707> yes
[20:08:02 CET] <BBB> I'm gonna rewrite that as (mx*B+(4-mx)*A+2)>>2
[20:08:18 CET] <BBB> which can also be written as A+(((B-A)*mx+2)>>2)
[20:08:28 CET] <BBB> this is the standard bilinear equation
[20:08:41 CET] <BBB> where A is src[x], and B is src[x+1]
[20:08:44 CET] <BBB> that's the horizontal part
[20:08:48 CET] <BBB> store it, do the same vertically
[20:08:49 CET] <BBB> and done
[20:09:26 CET] <BBB> it's just that vertically, you do A=src[x] and B=src[x+stride]
[20:09:41 CET] <BBB> so do the hor loop w x (h+1) times
[20:09:45 CET] <BBB> and the ver loop w x h times
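BBB's two-pass scheme can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the code from durandal_1707's decoder: `bilin`, `put_bilin_qpel` and `MAX_W` are made-up names, samples are 8-bit, and the source is assumed to have one readable extra column and row beyond the w x h block. The tap uses the `(mx*B+(4-mx)*A+2)>>2` form, which keeps every intermediate value non-negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_W 64 /* assumed upper bound on block width and height */

/* One bilinear tap: weight (4-m)/4 on a, m/4 on b, rounded to nearest. */
static inline int bilin(int a, int b, int m)
{
    return ((4 - m) * a + m * b + 2) >> 2;
}

/*
 * Quarter-pel bilinear prediction for a w x h block.
 * mx, my are the fractional parts (0..3) of the motion vector.
 */
static void put_bilin_qpel(uint8_t *dst, ptrdiff_t dst_stride,
                           const uint8_t *src, ptrdiff_t src_stride,
                           int w, int h, int mx, int my)
{
    uint8_t tmp[(MAX_W + 1) * MAX_W];

    assert(w > 0 && w <= MAX_W && h > 0 && h <= MAX_W);
    assert(mx >= 0 && mx < 4 && my >= 0 && my < 4);

    /* Horizontal pass over w x (h+1) samples: A = src[x], B = src[x+1]. */
    for (int y = 0; y < h + 1; y++)
        for (int x = 0; x < w; x++)
            tmp[y * w + x] = (uint8_t)bilin(src[y * src_stride + x],
                                            src[y * src_stride + x + 1], mx);

    /* Vertical pass over w x h samples: A = tmp row y, B = tmp row y+1. */
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * dst_stride + x] = (uint8_t)bilin(tmp[y * w + x],
                                                     tmp[(y + 1) * w + x], my);
}
```

With mx = 0 or my = 0 the corresponding pass degenerates to a copy, so a single function covers all 16 mx/my combinations; a real implementation would likely skip the unused pass rather than run it with a zero weight.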
[20:11:11 CET] <durandal_1707> other cases from 8 to 15 are just repeat of those?
[20:11:26 CET] <durandal_1707> 4-7 are vertical?
[20:30:21 CET] <BBB> what do you mean 8 to 15
[20:30:26 CET] <BBB> you have 4 mx positions and 4 my
[20:30:34 CET] <BBB> so in h264 these are unrolled to unique functions
[20:30:39 CET] <BBB> but for bilin you don't need that
[20:30:48 CET] <BBB> so it's not 4x4 functions; it's just 1 function handling 2x4 mx/my pairs
[20:34:19 CET] <durandal_1707> i need to map somehow those bits to something, so which of them are which ....
[20:36:07 CET] <BBB> I can see that we have some sort of a mismatch of information here
[20:36:29 CET] <BBB> I don't see what you don't understand and you can't see why I can't understand you :-p
[20:36:35 CET] <BBB> but I don't know what you mean, sorry
[20:36:48 CET] <BBB> can you show some code and tell me what is unclear?
[20:42:18 CET] <durandal_1707> BBB: https://pastebin.com/Axn1TKLz
[20:42:49 CET] <BBB> hehe
[20:42:51 CET] <BBB> mx=mode&3
[20:42:54 CET] <BBB> my=mode>>2
[20:42:57 CET] <BBB> and then do what I did above
[20:43:05 CET] <BBB> don't write it out using switches, that's terrible :)
[20:43:08 CET] <BBB> it'll be way too long
[20:49:17 CET] <durandal_1707> for which mx/my i pick vertical and v+h?
[20:55:56 CET] <BBB> if mx != 0, there is horizontal subsampling
[20:56:04 CET] <BBB> if my != 0, there is vertical subpixeling
[20:56:07 CET] <BBB> not subsampling, sorry
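The mapping BBB describes, from the 4-bit mode to mx/my and from there to the passes that are needed, can be sketched in C as below. The bit layout (mx in the low two bits, my in the next two) follows BBB's `mx=mode&3` / `my=mode>>2`; whether that matches the actual bitstream is an assumption, and `Subpel`, `split_mode`, `passes_for` and the `PASS_*` names are hypothetical.

```c
#include <assert.h>

/* Fractional motion vector parts, each in 0..3 (quarter-pel units). */
typedef struct Subpel {
    int mx; /* horizontal subpixel position */
    int my; /* vertical subpixel position */
} Subpel;

/* Split a 4-bit mode into mx (low 2 bits) and my (high 2 bits). */
static Subpel split_mode(int mode)
{
    Subpel s;

    assert(mode >= 0 && mode < 16);
    s.mx = mode & 3;
    s.my = mode >> 2;
    return s;
}

enum { PASS_COPY = 0, PASS_H = 1, PASS_V = 2, PASS_HV = PASS_H | PASS_V };

/* Horizontal filtering is needed iff mx != 0, vertical iff my != 0. */
static int passes_for(Subpel s)
{
    return (s.mx != 0 ? PASS_H : 0) | (s.my != 0 ? PASS_V : 0);
}
```

Under this layout mode 0 is a plain copy, modes 1 to 3 are purely horizontal, modes 4, 8 and 12 are purely vertical, and the remaining nine modes need both passes.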
[23:42:43 CET] <cone-134> ffmpeg Carl Eugen Hoyos master:5fceac1cdb82: lavd/v4l2: Fix the type of the probe function.
[00:00:00 CET] --- Sun Mar 24 2019