burek021 at gmail.com
Sun Dec 15 02:05:02 CET 2013
[00:58] <BBB> ubitux: did you look at simd? is it ok for you or have big comments?
[00:59] <ubitux> i didn't yet, i'll have a look tomorrow, but doubt i'll have a lot of comments though
[01:56] <cone-382> ffmpeg.git Michael Niedermayer master:f5cf0ea93a55: avformat/asf: clear uninitialized areas of packets before returning them
[01:56] <cone-382> ffmpeg.git Michael Niedermayer master:cf95dee3de59: avcodec/vc1dec: dont calculate unused values from uninitialized sprites
[01:56] <cone-382> ffmpeg.git Michael Niedermayer master:5f00b333a4c3: avcodec/vc1dec: zero SpriteData struct
[01:57] <cone-382> ffmpeg.git Michael Niedermayer master:48016f8febe6: avcodec/vc1dec: propagate errors from vc1_parse_sprites()
[02:14] <wm4> should the hypothetical high level API be powerful enough to handle crap like dvdnav?
[02:29] <cone-382> ffmpeg.git Michael Niedermayer master:3f4290a2060b: swscale/x86/rgb2rgb_template: try to fix build without AVX
[03:01] <cone-382> ffmpeg.git Michael Niedermayer master:445c58a8c6be: swscale/x86/rgb2rgb: Make sure COMPILE_TEMPLATE_AVX is defined
[03:37] <cone-382> ffmpeg.git Guillaume Martres master:c6afd0aacc31: hevc: fix PTL parsing
[03:37] <cone-382> ffmpeg.git Guillaume Martres master:dddc9b7a8ec3: hevc: don't check for errors in PTL code
[03:37] <cone-382> ffmpeg.git Guillaume Martres master:8e72e19f6468: hevc: remove unused PTL flags
[03:37] <cone-382> ffmpeg.git Guillaume Martres master:c90cdf4b6444: hevc: pack PTL representation using uint8_t
[03:37] <cone-382> ffmpeg.git Guillaume Martres master:ecb21d24373c: hevc: rename ptl structs and variables
[11:13] <cone-938> ffmpeg.git Michael Niedermayer master:2cfccd8060b9: avcodec/vc1: Factorize imode enum out / remove duplication
[11:23] <cone-938> ffmpeg.git Diego Biurrun master:70a7b24d56a8: avutil: Add deprecation ifdefs around obsolete intfloat code
[11:23] <cone-938> ffmpeg.git Michael Niedermayer master:acda7c8e20dc: Merge commit '70a7b24d56a823894440a372c46e89e212b89c35'
[11:33] <cone-938> ffmpeg.git Luca Barbato master:7cbe1ea9df83: configure: Move the bz2 and zlib checks below phtreads
[11:33] <cone-938> ffmpeg.git Michael Niedermayer master:7431923da8b7: Merge commit '7cbe1ea9df83ec66403fbf6400353bcb2242bf06'
[11:54] <ubitux> BBB: +%if ARCH_X86_64 ; TODO: 32-bit? (32-bit limited to 8 xmm reg, we use 13 here)
[11:54] <ubitux> 16 for you afaict
[11:56] <ubitux> where are the benchs? :p
[12:45] <cone-938> ffmpeg.git Luca Barbato master:a5a3b398fd9d: configure: Reorder pthreads checks
[12:45] <cone-938> ffmpeg.git Michael Niedermayer master:f5013913da36: Merge commit 'a5a3b398fd9dce38ca50b20f182b17a256d209f2'
[13:19] <cone-938> ffmpeg.git Luca Barbato master:c85aad9cb2af: doxy: Define a group for libswscale documentation
[13:19] <cone-938> ffmpeg.git Michael Niedermayer master:688c3d944de3: Merge remote-tracking branch 'qatar/master'
[13:21] <saste> michaelni, ping on "ffserver: add stream Metadata option"
[13:40] <BBB> ubitux: 16x16 = 16, 8x8 = 13, yes
[13:41] <BBB> ubitux: will update comment
[13:41] <BBB> ubitux: bench coming up, sec
[14:02] <cone-938> ffmpeg.git Stefano Sabatini master:54c596fe7a5f: lavf/movenc: improve feedback in case of unsupported codec
[14:02] <cone-938> ffmpeg.git Stefano Sabatini master:2cfe70ff3eb9: lavf/movenc: return meaningful error code from mov_write_header()
[14:02] <cone-938> ffmpeg.git Stefano Sabatini master:66a703ea0168: ffprobe: only show tags when explicitly requested
[14:03] <BBB> 9.7 -> 9.3 sec total decoding time for ped1080p.webm, and cycle timings from ~4100 -> ~750 for non-dc-only and ~950 -> ~130 for dc-only
[14:03] <BBB> not bad, not bad
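For readers who want the ratios behind those numbers, here is a small sketch that only redoes the arithmetic on the figures quoted above. The helper name `speedup` is made up for illustration, and the cycle counts are the approximate values from the log, so the results are rough.

```python
def speedup(before, after):
    """Return how many times faster 'after' is compared with 'before'."""
    return before / after

# Approximate per-call cycle counts quoted in the log for the 16x16 idct.
cycle_timings = {
    "non-dc-only": (4100, 750),
    "dc-only": (950, 130),
}

for name, (before, after) in cycle_timings.items():
    print(f"{name}: about {speedup(before, after):.1f}x faster")

# Whole-file decode time for the sample, in seconds.
print(f"total decode: about {speedup(9.7, 9.3):.2f}x faster")
```

The per-function gain is large while the whole-file gain is small, which fits the later remark that most of the remaining runtime sits in other functions.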
[14:04] <cone-938> ffmpeg.git Timothy Gu master:b242c156e5b9: examples/resample_audio: check av_samples_get_buffer_size() return code
[14:04] <cone-938> ffmpeg.git Timothy Gu master:c65fe9e9822c: examples/decoding_encoding: check av_samples_get_buffer_size() return code
[14:09] <BBB> ubitux: patch updated
[14:10] <BBB> give me a few minutes to test before I send it out
[14:15] <BBB> ubitux: patch sent
[14:16] <ubitux> BBB: did you upload ped1080p.webm somewhere?
[14:34] <BBB> no, I could
[14:34] <BBB> I probably should
[14:34] <BBB> I'll try to find some space somewhere
[14:35] <ubitux> mega.co.nz?
[14:36] <BBB> nah something that sounds legit
[14:36] <BBB> at 2 threads we beat libvpx already
[14:36] <BBB> not bad
[14:37] <BBB> and we're still massively c
[14:37] <BBB> (libvpx goes from 8.3 -> 6.9, we go from 9.3 -> 5.9)
[14:38] <ubitux> great
[14:39] <ubitux> i didn't know libvpx was threaded
[14:39] <BBB> threaded loopfilter and threaded tiles, afaik
[14:40] <BBB> so this one uses threaded loopfilter
[14:40] <ubitux> ok :p
[14:41] <BBB> anyway, I'll do other 16x16 subforms now, and then 32x32, and in between try to fix that parallelmode bug we both saw
[14:41] <BBB> and maybe that crasher you reported
[14:41] <ubitux> :)
[15:20] <BBB> I see basically all fixable runtime sitting in various loopfilter functions and 32x32 idct
[15:20] <BBB> so the todo list is quite easy
[16:57] <cone-938> ffmpeg.git Michael Niedermayer master:017e234c204f: avcodec/vc1: fix mb_height for field pictures
[17:42] <BBB> ubitux: also slightly updated patch on my github without the FIXME SWAPs
[17:42] <BBB> I'll work on subidcts now, I don't feel like redoing the register ordering, it has no benefit
[17:42] <ubitux> :)
[17:44] <BBB> the subidcts have a real speed gain, in particular b/c in almost all cases the primary things that disappear are the slow VP9_UNPACK_MULSUB_2W_4X things
[17:45] <BBB> (it's not like they can be sped up, they just do very difficult stuff)
[17:45] <BBB> so if you can get rid of them in subidcts, it has massive benefit for the whole function
[17:45] <ubitux> is there a benefit of having the 2 passes in VP9_IDCT16_1D?
[17:45] <ubitux> they're at the end of the macro
[17:46] <ubitux> (btw, weird align in the comments of the pmulhrsw calls)
[17:47] <ubitux> (in "from 3 stages back" sry)
[17:51] <BBB> alignment fixed
[17:52] <BBB> the 2-pass is basically because they very specifically depend on integrating with the final piece of the idct, that is, they are very state-dependent
[17:52] <BBB> this piece for example:
[17:52] <BBB> SUMSUB_BA w, 6, 9, 15 ; t6, t9
[17:52] <BBB> SUMSUB_BA w, 7, 8, 15 ; t7, t8
[17:53] <BBB> that's actually the end of the 1d idct, interleaved with the writing out (pass 2) or transpose (pass 1) to make the whole thing more optimal
[17:53] <BBB> so I think here it helps; for the 8x8/4x4, what you did was better
[17:54] <BBB> new patch with alignment-fix on github btw
[17:56] <ubitux> ok
[17:56] <ubitux> +%macro VP9_STORE_2XFULL 6; dc, tmp1, tmp2. tmp3, tmp4, zero
[17:56] <ubitux> '.' instead of ','
[17:58] <ubitux> in that macro, i'm not sure, but would it be somehow possible to do the add post packuswb?
[17:59] <ubitux> so you would only do 2 add instead of 4
[17:59] <ubitux> might not be possible at all, just wondering
[18:01] <ubitux> BBB: btw, no way to put [pw_512] into a reg in the second pass?
[18:03] <ubitux> actually you might be short on regs anyway :p
[18:04] <BBB> right, each reg means two movas to store/load it
[18:04] <BBB> dit fixed
[18:04] <BBB> dot*
[18:05] <BBB> add before instead of after, where? in VP9_STORE_2XFULL?
[18:05] <ubitux> yes
[18:05] <ubitux> like doing the add byte oriented or sth
[18:05] <ubitux> probably not a good idea :p
[18:05] <BBB> I think it can be in the [-255,255] range, so there's no way to be 8bit
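The range argument can be illustrated with a scalar model. This is a sketch under assumptions, not the code from the patch: `packuswb` here models a single lane of the instruction (signed 16-bit word saturated to an unsigned byte), and `pack_then_add` is a hypothetical naive byte-domain variant using one unsigned saturating add.

```python
def packuswb(word):
    """Model one lane of packuswb: saturate a signed word to 0..255."""
    return max(0, min(255, word))

def add_then_pack(pixel, dc):
    """Add the residual in the 16-bit domain, then saturate to a byte."""
    return packuswb(pixel + dc)

def pack_then_add(pixel, dc):
    """Hypothetical variant: pack both operands first, then do an unsigned
    saturating byte add. A negative dc is clamped to 0 by the pack step."""
    return min(255, packuswb(pixel) + packuswb(dc))

# A positive residual behaves the same either way (both saturate at 255).
print(add_then_pack(200, 100), pack_then_add(200, 100))

# A negative residual is lost once it has been packed to an unsigned byte.
print(add_then_pack(200, -100), pack_then_add(200, -100))
```

In this model a residual anywhere in [-255, 255] needs 9 bits, so a single unsigned byte cannot carry it through the pack step.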
[18:08] <ubitux> no other comment from me :p
[18:08] <ubitux> looks nice
[18:11] <cone-938> ffmpeg.git Michael Niedermayer master:c9f72e4b81ae: avcodec/vc1dec: fix mby_start for interlaced content
[18:11] <cone-938> ffmpeg.git Michael Niedermayer master:2224159c787e: avcodec/vc1: fix DIFF2/NORM2 with width<=16
[18:12] <BBB> michaelni: merge request github/rbultje/ffmpeg/vp9-simd?
[18:12] <BBB> michaelni: slightly updated version of the patch on the ml
[18:12] <ubitux> you still haven't added yourself to the copyright btw
[18:12] <BBB> oops
[18:13] <BBB> fixed
[18:13] <BBB> let's hope michaelni isn't mega-fast
[18:36] <cone-938> ffmpeg.git Ronald S. Bultje master:8d4c616fc05f: vp9/x86: idct_add_16x16_ssse3.
[18:36] <cone-938> ffmpeg.git Michael Niedermayer master:2d50ebc20b61: Merge remote-tracking branch 'rbultje/vp9-simd'
[19:25] <cone-938> ffmpeg.git Yu Xiaolei master:20bc574b862b: build fix: apetag.c depends on img2.c
[19:38] Action: Daemon404 wonders how much faster ffmpeg is than libvpx now
[19:40] <BBB> Daemon404: 1 thread libvpx 8.3 sec we 9.3 sec, 2 threads libvpx 6.9 we 5.9 (IIRC)
[19:40] <BBB> Daemon404: libvpx is mostly simd'ed, we are not yet, so we're not yet faster
[19:40] <BBB> Daemon404: probably after simd is done, we'll be faster
[19:41] <Daemon404> ah
[19:41] <Daemon404> i seem to remember ffvp8 being much faster
[19:52] <smarter> Daemon404: BBB made libvpx fast ;)
[19:52] <BBB> ffvp8 was much faster
[19:53] <BBB> _after_ the simd was done
[19:53] <BBB> ffvp9's simd is not done, again
[19:53] <BBB> so it's not yet faster
[19:53] <Daemon404> yes
[19:53] <BBB> or you mean ffvp8 vs libvpx when you asked ffmpeg vs libvpx?
[19:59] <BBB> Daemon404: you have to realize that if for sth. like 16x16 idct, I can make the function 5x faster, then simd is going to have a massively disproportionate effect on total speed
[19:59] <BBB> so as long as they have disproportionally more simd than us, they will be faster
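The relationship between a per-function speedup and total decode time can be sketched with Amdahl's law. This is a generic illustration, not a measurement: the function names are invented, the 5x figure is the one quoted above, and attributing the whole 9.7 -> 9.3 sec change to a single 5x-faster function is a simplifying assumption.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: total speedup when 'fraction' of the runtime
    becomes 'local_speedup' times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def fraction_from_timings(total_before, total_after, local_speedup):
    """Invert Amdahl's law: which share of the original runtime must the
    optimized code have had to explain the measured total times?"""
    saved = 1.0 - total_after / total_before
    return saved / (1.0 - 1.0 / local_speedup)

# Share of runtime implied by 9.7 s -> 9.3 s if one function got 5x faster.
share = fraction_from_timings(9.7, 9.3, 5.0)
print(f"implied share of runtime: {share:.1%}")

# If functions covering half of the runtime all became 5x faster:
print(f"total speedup at 50% coverage: {overall_speedup(0.5, 5.0):.2f}x")
```

Under these assumptions one function covers only a few percent of the runtime, while covering half of the runtime at 5x would cut total time by well over a third.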
[20:00] <Daemon404> [18:53] <@BBB> or you mean ffvp8 vs libvpx when you asked ffmpeg vs libvpx? <-- just a quip
[20:00] <BBB> oh ic
[20:00] <Daemon404> its not a matter of if
[20:00] <Daemon404> but when
[20:00] <Daemon404> is what i meant
[20:00] <BBB> yeah
[20:00] <BBB> help write simd
[20:37] <kierank> smarter: do you know what they are doing about chroma siting for hevc interlaced?
[20:41] <BBB> ubitux: if you want more review fun - https://github.com/rbultje/ffmpeg/commits/vp9-simd
[21:01] <kierank> can someone pm me the ffmpeg incoming read username/password
[21:01] <kierank> i remember the username but not the host and username
[21:02] <kierank> i remember the password i mean
[21:02] <Daemon404> what's the point of having a non-anonymous place for samples to be uploaded
[21:02] <Daemon404> that's the opposite of how everyone does it
[21:05] <nevcairiel> i always wanted to write a small upload webapp like the one vlc has, with some fancy logic inside to help organize them, since people always pick the worst one-click hosters
[21:05] <smarter> kierank: I don't know anything about interlaced stuff
[21:07] <JEEB> kierank, I thought it was shared with mplayer?
[21:07] <JEEB> and was anonymous
[21:07] <JEEB> ftp://upload.mplayerhq.hu/incoming/
[21:08] <JEEB> at least I remember the ffmpeg trac at one point pointing you towards mplayer for the bigger samples
[21:48] <rcombs> https://trac.ffmpeg.org/ticket/1582#comment:11 <-- anyone know what the details of full_chroma_int in swscale are, and whether or not that can be made the default behavior?
[21:49] <cone-938> ffmpeg.git Timothy Gu master:96093fe18024: Changelog: correct typo
[21:49] <cone-938> ffmpeg.git Stefano Sabatini master:de9ea40a40d8: doc/bitstream_filters: remove mp3_header_decompress filter
[21:50] <ubitux> wut?
[21:50] <ubitux> saste: no?
[21:50] <ubitux> BBB: will look :)
[21:51] <saste> ubitux, the commit was wrong
[21:51] <ubitux> saste: 75ec40b083ff40655a81c709ba5c9d867b2ed8a4
[21:51] <ubitux> The decompress filter is left in place for interoperability and support of
[21:51] <ubitux> files that used the compress filter.
[21:52] <saste> ubitux, s/decompress/compress
[21:52] <saste> compress is gone
[21:52] <ubitux> ok
[21:52] <ubitux> commit message was wrong, ok
[21:52] <saste> I mean the commit message was wrong
[21:57] <wm4> rcombs: usually the trade off is speed vs. quality
[21:57] <wm4> but I think nobody really knows what swscale does and when it can subtly fail at something
[21:58] <wm4> so bugs are possible too
[21:58] <wm4> when using "obscure" flags
[21:59] <rcombs> the behavior without it when doing 10bit->8bit is quite bad
[22:00] <nevcairiel> and the speed with it is quite low :D
[22:01] <rcombs> ouch
[22:01] <nevcairiel> well test for yourself, it usually works fine, but has quite the performance hit
[22:01] <wm4> the main use for libswscale was traditionally as conversion filter for playback in mplayer AFAIK
[22:01] <wm4> so the defaults are tuned for speed
[22:01] <rcombs> I'd only tried it for single screenshots
[22:02] <nevcairiel> there are reasons why i wrote my own conversions for most common cases ;)
[22:03] <rcombs> so, could improve either the +full_chroma_int option's speed, or the default's quality
[22:04] <rcombs> I'm not sure which would be easier, but defaulting to discoloring the output seems like a Bad Thing
[22:07] <cone-938> ffmpeg.git Michael Niedermayer master:892562e9218b: avformat/ipmovie: Check OPCODE_CREATE_TIMER size
[22:12] <wm4> rcombs: well, every application is supposed to supply its own flags
[22:12] <wm4> until recently, it even had it explicitly select a scaler algorithm
[22:15] <cone-938> ffmpeg.git Diego Biurrun master:5db4e88ecd32: configure: Detect Solaris libc in an OpenIndiana/illumos compatible way
[22:15] <cone-938> ffmpeg.git Michael Niedermayer master:f357ef2e0ce1: Merge commit '5db4e88ecd32485341f6150c00f5ee5bfa74f62d'
[22:15] <rcombs> well, is anyone against using full_chroma_int in ffmpeg by default?
[22:17] <kierank> what does that thing even do
[22:17] Action: kierank has never known
[22:17] <nevcairiel> swscale is so brilliant that it converts everything to yuv420, even if it just has to convert from yuv444 10-bit to yuv444 8-bit, it can go through 420 if that flag is not set :p
[22:17] <rcombs> kierank: I'm not sure of the specifics of it, but I know it changes the behavior of dithering so there's significantly less banding and fixes discoloration of the output
[22:18] <rcombs> not sure how, and now I know it has negative performance implications
[22:18] <nevcairiel> ie without the flag, it can drop lots of chroma information
[22:18] <rcombs> nevcairiel: that's incredibly shitty
[22:18] <kierank> I've always cargo culted and set the flag
[22:18] <rcombs> in my test, the input was yuv420p10LE
[22:18] <nevcairiel> not sure what happens in that case
[22:19] <nevcairiel> whats the output?
[22:19] <wm4> rcombs: welcome to swscale
[22:19] <wm4> everyone seems to hate it
[22:19] <rcombs> rgb24
[22:19] <nevcairiel> ah ok, if you enable the flag, it disables the cheap-but-fast "optimized" yuv->rgb converter
[22:19] <wm4> but it does all-to-all conversions from >100 pixel formats, so it's kind of... needed
[22:19] <nevcairiel> it feels like it does point scaling of chroma
[22:19] <wm4> *for
[22:20] <nevcairiel> with the flag it does proper upscaling of the chroma
[22:20] <nevcairiel> which is SIMD optimized, but still slower
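The difference between "point scaling" and "proper upscaling" of chroma can be shown on a single row. This is only a toy 1-D sketch: the function names are made up, and the filter and sample positions are assumptions chosen for simplicity, not what swscale actually implements.

```python
def upsample_nearest(row):
    """Point scaling: every chroma sample is simply repeated."""
    out = []
    for value in row:
        out.extend([value, value])
    return out

def upsample_linear(row):
    """Linear interpolation: the new sample between two neighbours is their
    rounded average; the last sample is repeated at the right edge."""
    out = []
    for i, value in enumerate(row):
        nxt = row[min(i + 1, len(row) - 1)]
        out.extend([value, (value + nxt + 1) // 2])
    return out

chroma_row = [100, 200]
print(upsample_nearest(chroma_row))  # hard step between the two colours
print(upsample_linear(chroma_row))   # intermediate value at the edge
```

The interpolating version does extra arithmetic per output sample, which is in line with it being slower even when SIMD optimized.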
[22:25] <cone-938> ffmpeg.git Michael Niedermayer master:417927af3c99: hdsenc: Avoid integer overflow
[22:25] <cone-938> ffmpeg.git Michael Niedermayer master:b8ed15d6378f: hdsenc: Fix an off by one error in an array size check
[22:25] <cone-938> ffmpeg.git Michael Niedermayer master:797f2a791397: hdsenc: Check the init_file() return code
[22:25] <cone-938> ffmpeg.git Michael Niedermayer master:6659364d3a47: Merge commit '797f2a791397210ec1b591b326658805c5dbf104'
[22:37] <BBB> nevcairiel is right, by default swscale converts everything to 420 _when scaling_
[22:38] <BBB> "scaling" is a very obscure term here and doesn't mean what you think it means
[22:38] <BBB> it basically means "there is no direct conversion function OR the input/output sizes are not identical"
[22:39] <BBB> and yes this means that when you go from 320x240 444 to 640x480 420, it will actually downscale the chroma to 160x120 420 before upscaling it back up to 320x240 for the 640x480:420
[22:39] <wm4> wow
[22:39] <BBB> unless you set full_chroma_int
[22:39] <BBB> or actually this is full_chroma_inp
[22:39] <BBB> int=interpolation, inp=input
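The plane sizes in that 320x240 444 to 640x480 420 example can be checked with a few lines. This is a sketch of the arithmetic only: `chroma_size` is a hypothetical helper, and the intermediate 420 step is taken from the description above rather than from the swscale source.

```python
def chroma_size(width, height, subsampling):
    """Return the (width, height) of one chroma plane for a given luma
    size and subsampling scheme, rounding odd dimensions up."""
    shifts = {"444": (0, 0), "422": (1, 0), "420": (1, 1)}
    shift_x, shift_y = shifts[subsampling]
    chroma_w = (width + (1 << shift_x) - 1) >> shift_x
    chroma_h = (height + (1 << shift_y) - 1) >> shift_y
    return chroma_w, chroma_h

source = chroma_size(320, 240, "444")        # chroma as stored in the input
intermediate = chroma_size(320, 240, "420")  # after the described 420 step
target = chroma_size(640, 480, "420")        # chroma needed by the output

print(source, intermediate, target)
```

The input chroma planes already have the size the output needs, so in this case the detour through 160x120 throws away three quarters of the chroma samples before interpolating them back.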
[22:40] <cone-938> ffmpeg.git Martin Storsjö master:6451c8853a07: sdp: Check theora colorspace before producing the configuration string
[22:40] <cone-938> ffmpeg.git Michael Niedermayer master:09c13ff7bdbc: Merge commit '6451c8853a07ff2e28bda950fb5e83fcf88c5cf4'
[22:40] <BBB> lots of fun
[22:40] <JEEB> the lovely swscale
[22:40] <BBB> \o/
[22:41] <BBB> swscale is a great conversion lib for any-to-any... it has some good conversions for old colorspace formats, like rgb<>yuv
[22:41] <BBB> but it hasn't kept up with more recent developments like 444 or >8bpp
[22:41] <BBB> so when you use these kind of things, it basically kinda sucks
[22:41] <JEEB> I wonder how swscale compares to avery lee's similar thing
[22:41] <JEEB> which also tries to support most paths without having too many specific paths
[22:42] <wm4> what happened to the thing used with vapoursynth?
[22:42] <JEEB> still around
[22:43] <wm4> does it suck or rock?
[22:44] <JEEB> relatively OK, still doesn't have too many generic paths IIRC
[22:45] <Daemon404> the development methodology is shit
[22:45] <Daemon404> "post zips on doom9"
[22:45] <JEEB> yes
[22:45] <wm4> lol
[22:45] <wm4> I wonder why people are doing this
[22:45] <wm4> git is so much more convenient
[22:45] <Daemon404> because it is doom9
[22:45] <Daemon404> and they've been doing it that way for a decade+ and they hate change
[22:45] <Daemon404> why do you think people are still forking avs (see: avs+)
[22:46] <wm4> meh
[22:46] <Daemon404> a lot of the vapoursynth hate is "but it's not avisynth"
[22:47] <Timothy_Gu> TBH *synth is a lot more powerful than lavfi
[22:48] <Timothy_Gu> And easier to use.
[22:48] <wm4> heh
[22:48] <Timothy_Gu> I still can't manage deinterlacing my old DVD using FFmpeg yadif... (yeah i know it should go into #ffmpeg)
[22:49] <wm4> hm, yadif usage should be pretty simple, though
[22:49] <nevcairiel> some old dvds are especially evil though
[22:51] <cone-938> ffmpeg.git Michael Niedermayer master:9aba0a6f7b3d: rtpdec_h264: Check the return value of functions doing allocations
[22:51] <cone-938> ffmpeg.git Michael Niedermayer master:12e81041200d: Merge remote-tracking branch 'qatar/master'
[23:05] <Daemon404> Timothy_Gu, people use yadif because it can do realtime
[23:05] <Daemon404> not because it is good
[23:15] <Timothy_Gu> saste: why do you do http://git.videolan.org/?p=ffmpeg.git;a=commit;h=de9ea40a40d8324c44ba9c20275a788954f701d4 ?
[23:16] <Timothy_Gu> saste: see http://git.videolan.org/?p=ffmpeg.git;a=commit;h=75ec40b083ff40655a81c709ba5c9d867b2ed8a4
[23:16] <Timothy_Gu> "The decompress filter is left in place for interoperability and support of files that used the compress filter."
[00:00] --- Sun Dec 15 2013