[Ffmpeg-devel-irc] ffmpeg-devel.log.20131026

burek burek021 at gmail.com
Sun Oct 27 02:05:02 CEST 2013


[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/0.10:558c1f35fa09: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/0.11:d49761b39643: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/0.9:ff3e385d849e: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.0:946815aa0974: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.1:a4b705b4cbb5: avcodec/h264: do not trust last_pic_droppable when marking pictures as done
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.1:8e72a8d1c278: avformat/mp3dec: perform seek resync in the correct direction
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.1:5bce35d9581c: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.2:a8b6721bedce: avcodec/h264: do not trust last_pic_droppable when marking pictures as done
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.2:833dce3818e3: avformat/mp3dec: perform seek resync in the correct direction
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/1.2:9195ef6f65cb: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:30] <cone-822> ffmpeg.git Stefano Sabatini release/2.0:17d169ce0f07: doc/Makefile: fix man pages uninstall path
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/2.0:cd7d575e90ce: avcodec/h264: do not trust last_pic_droppable when marking pictures as done
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/2.0:b7154758de3f: avformat/wavdec: Fix smv packet interleaving
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/2.0:94c7ee4d9e03: avformat/mp3dec: perform seek resync in the correct direction
[01:30] <cone-822> ffmpeg.git Michael Niedermayer release/2.0:782331be1eb2: avcodec/h264: reduce noisiness of "mmco: unref short failure"
[01:58] <cone-822> ffmpeg.git Michael Niedermayer master:3c9dd93faa9f: h264: make flush_change() set mmco_reset
[02:30] <cone-822> ffmpeg.git Michael Niedermayer master:780669ef7c23: avcodec/jpeg2000dec: non zero image offsets are not supported
[02:46] <cone-822> ffmpeg.git Kieran Kunhya master:4d6ee0725553: libavutil: x86: Add AVX2 capable CPU detection.
[02:46] <cone-822> ffmpeg.git Kieran Kunhya master:865b70bc5d1c: Add AVX2 capable CPU detection. Patch based on x264's AVX2 detection
[02:46] <cone-822> ffmpeg.git Michael Niedermayer master:a66570440215: Merge commit '4d6ee0725553a43ba88d6f8327ebcf8f1c5ae8d4'
[02:46] <cone-822> ffmpeg.git Michael Niedermayer release/1.1:69603724750b: h264: make flush_change() set mmco_reset
[02:46] <cone-822> ffmpeg.git Michael Niedermayer release/1.2:720e2d4143b5: h264: make flush_change() set mmco_reset
[02:47] <cone-822> ffmpeg.git Michael Niedermayer release/2.0:1d0e58372885: h264: make flush_change() set mmco_reset
[03:01] <BBB> ubitux: DEFINE_ARGS is just to define the GPRs (i.e. things like dst, stride)
[03:01] <BBB> ubitux: r0, r1 etc. are the "common" names for them, and we can alternatively name them using DEFINE_ARGS first, second
[03:01] <BBB> first will refer to r0, second to r1, etc
[03:01] <BBB> e.g. DEFINE_ARGS dst, stride makes dst r0, stride r1, etc.
[03:02] <BBB> ubitux: so you can refer to them by name instead of register, similar to how you do it in C
[03:02] <BBB> ubitux: STORE_DIFFx2 is to unpack a series of bytes, unpack to words, add the diff, pack, store
[03:03] <BBB> ubitux: VP9_MULTIPLY_SUMSUB looks good, I have a slightly alternative version that does approximately the same thing, I can share that later (it's a little more interleaved, but essentially identical in function)
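A rough scalar picture of the STORE_DIFFx2 steps described above (unpack bytes to words, add the diff, pack, store) may help. The C sketch below is only a model under assumptions: `store_diff_row` and `clip_uint8` are made-up names, and any shifting or rounding the real macro may also perform is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp to the unsigned 8-bit pixel range, standing in for a saturating pack. */
static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar model of one row: widen, add the residual, pack back. */
static void store_diff_row(uint8_t *dst, const int16_t *diff, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        int16_t wide = dst[i];        /* unpack byte -> word */
        int sum = wide + diff[i];     /* add the diff */
        dst[i] = clip_uint8(sum);     /* pack with unsigned saturation, store */
    }
}
```

The saturation matters at the edges of the pixel range: a pixel of 250 plus a diff of 10 ends up as 255, not 4.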
[03:04] <cone-822> ffmpeg.git Alex Converse master:adea4512c608: aacdec: Fix calls to avpriv_report_missing_feature().
[03:04] <cone-822> ffmpeg.git Alex Converse master:4e326ec76991: fate: aac: Add test for AAC-ELD
[03:04] <cone-822> ffmpeg.git Michael Niedermayer master:8c508a035428: Merge commit 'adea4512c6087280702e2423de55cea050e20a98'
[03:04] <cone-822> ffmpeg.git Michael Niedermayer master:7e19c549bafa: Merge remote-tracking branch 'qatar/master'
[03:10] <BBB> ubitux: looks mostly good, but you don't need the second transpose (that's a fix from vp8 to vp9; see rants from jason about that in vp8)
[03:11] <BBB> ubitux: also store_diff, you should be able to use m0, m1, m2 and m3 instead of your current more random ordering (transpose does the swap for you)
[09:37] <ubitux> BBB: oh ok, i was doing the second transpose because i wasn't understanding the way the store was working
[09:37] <ubitux> and yeah store diff args are broken for now, it was purely random
[09:38] <ubitux> BBB: i understood DEFINE_ARGS as a "redefine function's argument names"
[09:39] <ubitux> about interleaving the content of the MULTIPLY_SUMSUB, i assume it's because it's faster?
[09:39] <ubitux> well, i should be done by today hopefully
[10:39] <cone-25> ffmpeg.git Lukasz Marek master:c6c70c2bf76c: lavd/pulse_audio_enc: add another default to stream name
[10:39] <cone-25> ffmpeg.git Lukasz Marek master:c4281705492a: lavd/pulse_audio_enc: avoid vars in for()
[10:39] <cone-25> ffmpeg.git Michael Niedermayer master:0feecb62abab: Merge remote-tracking branch 'lukaszmluki/master'
[12:27] <cone-25> ffmpeg.git Michael Niedermayer master:6889b78fe038: doc/issue_tracker: add 2 missing issue types
[12:27] <cone-25> ffmpeg.git Michael Niedermayer master:fcd08b777059: avformat/md5enc: add format, version and column headers
[12:32] <BBB> ubitux: well vp8 does the second transpose, but vp9 doesn't need it (it inverts the order of the 1d idcts, so it's transpose;1d;transpose;1d, like what h264 and any other video codec in the world does, instead of vp8's 1d; transpose; 1d; transpose) - but the transpose before the first 1d is integrated with the coeff reader so simd doesn't have to do it (i.e. it's free)
[14:02] <cone-25> ffmpeg.git Marton Balint master:060c42bc3dff: ffplay: update and extend documentation for channel and stream switching
[14:02] <cone-25> ffmpeg.git Marton Balint master:2d059d8de1b4: ffplay: factor out picture freeing code
[14:02] <cone-25> ffmpeg.git Marton Balint master:04de0e04c5d0: ffplay: use av_frame_get_pkt_pos instead directly accessing pkt pos
[14:02] <cone-25> ffmpeg.git Marton Balint master:44758b4d17f6: ffplay: add support for libswresample options
[14:02] <cone-25> ffmpeg.git Michael Niedermayer master:444ce03f0f15: Merge remote-tracking branch 'cus/stable'
[14:17] <ubitux> BBB: ok
[14:17] <ubitux> BBB: infinitely faster like SWAP then @_@
[14:20] <BBB> indeed ;)
[14:24] <BBB> I'm really wasting too much time on irrelevant stuff sometimes...
[14:24] <BBB> anyway
[14:24] <BBB> (see email to list)
[14:25] <ubitux> :)
[14:26] <cone-25> ffmpeg.git Lukasz Marek master:99a4c86a32dc: configure: link with built libs when pc-uninstalled is used
[15:16] <cone-25> ffmpeg.git Michael Niedermayer master:41efb8d9a75f: avcodec/x86/cabac: include get_cabac_bypass_sign_x86() under #if !BROKEN_COMPILER
[15:16] <ubitux> ah shit i've constructed them in reverse..
[15:55] <ubitux> BBB: it seems i need to reverse all the words post idct before the add
[15:55] <ubitux> any idea how i could avoid this?
[15:55] <BBB> ?
[15:55] <BBB> no you don't
[15:55] <BBB> you mean you're looking at them in a debugger and they seem inversed?
[15:55] <ubitux> yes
[15:55] <BBB> that's just your debugger
[15:56] <BBB> :-p
[15:56] <ubitux> well, they're inverted in comparison to the content i'm adding to
[15:56] <ubitux> reg + reg
[15:56] <BBB> show me the code?
[15:56] <ubitux> one is in the wrong order
[15:56] <BBB> and the gdb output
[15:57] <cone-25> ffmpeg.git Michael Niedermayer master:e0b2bdd37a01: avcodec/h264_parser: heuristically detect non marked keyframes
[15:57] <ubitux> code is mostly unchanged from yesterday;
[15:57] <ubitux> then in the store it reaches:
[15:57] <ubitux> => 0xb03d91 <ff_gdb_break+21>:	paddw  mm6,mm3
[15:57] <ubitux> 69: $st3 = -nan(0xfffeffff00000000)
[15:57] <ubitux> 66: $st6 = -nan(0x7e007e007f0080)
[15:57] <ubitux> and it should be the other way around for one of the 2 reg
[15:58] <ubitux> st6 is the current destination, 16bit padded with 0
[15:58] <ubitux> st3 is the result of the second idct
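When reading a dump like this, it can help to keep in mind that the register is printed as one 64-bit number, most significant word first, whereas the first pixel in memory sits in the least significant word. The C sketch below (with a hypothetical helper name, `split_words`) lists the four 16-bit lanes in memory order for the value shown for $st6. It only illustrates the display order and does not say whether the code under discussion has a real ordering bug.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit register image into four 16-bit lanes, lowest lane first. */
static void split_words(uint64_t reg, uint16_t lanes[4])
{
    assert(lanes != 0);
    for (int i = 0; i < 4; i++)
        lanes[i] = (uint16_t)(reg >> (16 * i));
}
```

For 0x007e007e007f0080 this gives the lanes 0x0080, 0x007f, 0x007e, 0x007e, which is the reverse of the order in which the hex digits appear on screen.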
[15:59] <ubitux> just updated my branch
[15:59] <BBB> you didn't remove the second transpose
[16:00] <BBB> ah now it's gone
[16:00] <ubitux> BBB: i did, just updated :p
[16:01] <BBB> so what is st3 and st6?
[16:01] <BBB> and why are the coeffs inverted between the 2d idcts?
[16:02] <BBB> that doesn't seem right
[16:02] <ubitux> so st3 is the line 0 out of the 2nd idct
[16:02] <ubitux> and st6 is the first line of the dest buffer, 16b padded
[16:02] <BBB> you haven't loaded the dst buffer yet
[16:02] <BBB> that happens in STORE_DIFFx2
[16:03] <ubitux> yes i'm talking about the exec in STORE_DIFFx2
[16:03] <BBB> ah ok
[16:03] <ubitux> which i'm tracing
[16:03] <BBB> can you show the disassembly for the whole function?
[16:03] <ubitux> sure
[16:04] <BBB> I still think you inverted the coeffs between the 2 1d idcts
[16:04] <BBB> t2/t3
[16:04] <BBB> that might well explain this, since they are merely different in sign
[16:05] <ubitux> BBB: http://pastie.org/8432471
[16:05] <ubitux> BBB: i needed to do that to get correct result, will recheck
[16:08] <ubitux> yeah, it doesn't do the correct mult if i don't toggle them
[16:09] <BBB> how do you know it's incorrect?
[16:09] <BBB> look, I know for sure that the order should be identical between the two 1ds :)
[16:09] <BBB> I don't know which one is wrong, but I'll check ;)
[16:10] <ubitux> i'm comparing the outputs after each idct
[16:10] <ubitux> i have a C trace in parallel
[16:10] <BBB> ok... I'll check also
[16:10] <BBB> not sure
[16:10] <BBB> weird
[16:12] <ubitux> i checked the resulting registers after the 2 idct, they're correct, just reversed
[19:11] <ubitux> BBB: mmh i think i didn't notice it because my test case had 2 lines full of zero
[19:11] <BBB> ?
[19:11] <BBB> notice what?
[19:11] <ubitux> the bug i have
[19:12] <BBB> oh so it's fixed?
[19:12] <ubitux> didn't notice earlier in the writing of the function
[19:12] <ubitux> not yet but i'm tracing it
[19:12] <ubitux> the input block of my test case has 2 lines of zero and the code worked with that input
[19:12] <ubitux> but trying with another input, i get differences
[19:12] <BBB> ok
[19:12] <ubitux> i think i know what i messed up
[19:13] <BBB> :)
[19:19] <kurosu_> sometimes, it's nice to extract the idct code into a stand-alone, and feed it purely horizontal coeffs, then purely vertical ones.
[19:20] <ubitux> last time i did such a thing, i messed up my prng
[19:20] <ubitux> i had some non visible cycles 
[19:20] <ubitux> and all kind of weird regular zeros in the output
[19:20] <ubitux> i thought it was buggy, while it was in fact perfectly correct :))
[19:28] <cone-25> ffmpeg.git Michael Niedermayer master:d2db1bb7dea1: avformat/http: dont fail with unknown Content-Encodings
[19:43] <BBB> there's this code in x86util.asm which I don't understand
[19:43] <BBB> %elif cpuflag(3dnow)
[19:43] <BBB>     movq      %1, %2
[19:43] <BBB>     psrlq     %1, 32
[19:43] <BBB>     punpckldq %1, %2
[19:43] <BBB> %endif
[19:44] <BBB> what is 3dnow about that, and why is there no else clause?
[19:45] <BBB> and there's all these macros that are only used in 1 place
[19:45] <BBB> this is so sadly unmaintained... :/
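Read literally, the three instructions quoted above appear to swap the two 32-bit halves of a 64-bit register, and movq, psrlq and punpckldq are all plain MMX instructions. The C sketch below is a scalar model of that reading; `swap_dwords_model` and `check_swap_roundtrip` are made-up names, and why the branch is keyed on 3dnow is not answered by it.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the quoted sequence, with %1 as dst and %2 as src. */
static uint64_t swap_dwords_model(uint64_t src)
{
    uint64_t dst = src;                  /* movq      %1, %2 */
    dst >>= 32;                          /* psrlq     %1, 32 */
    /* punpckldq %1, %2: low dword of %1 stays low, low dword of %2 goes high */
    dst = (dst & 0xffffffffu) | ((src & 0xffffffffu) << 32);
    return dst;
}

/* Swapping the halves twice should give back the input. */
static void check_swap_roundtrip(uint64_t v)
{
    assert(swap_dwords_model(swap_dwords_model(v)) == v);
}
```

With the input 0x1122334455667788 the model returns 0x5566778811223344.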
[19:59] <ubitux> default output for display $st[0-9] in gdb sucks so much it's insane
[20:00] <ubitux> why are the defaults so broken in that debugger...
[20:00] <ubitux> no wonder after years of coding i still prefer printf over gdb for any C debugging
[20:14] <mathstuf_> ubitux: lldb is supposed to be better
[20:35] <ubitux> i'll consider then
[20:35] <ubitux> anyway, found my bug
[21:16] <durandal_1707> nice, more breakings from dark side
[21:40] <ubitux> yay it works
[21:41] <BBB> ubitux: cool
[21:41] <BBB> ubitux: don't forget to write a dc-only (and possibly a 2x2) version
[21:41] <BBB> these are simpler but will provide further speedups
[21:42] <BBB> i.e. eob == 1 and eob <= 3
[21:42] <ubitux> let me first clap my hands in joy
[21:42] <ubitux> BBB: can you detail those cases?
[21:43] <BBB> if eob == 1, only block[0] is filled in
[21:44] <durandal11707> new release, when?
[21:44] <BBB> this is a special case, because all residual values will have the same value
[21:44] <BBB> so you only need 2 multiplications
[21:44] <BBB> basically out[n] (for any n) = pmulhrsw(pmulhrsw(block[0], 11585x2), 11585x2)
[21:46] <BBB> ubitux: and once you know out[n], just pshufw that over the register, and add it to each dst[0-3]+stride[0-3]
[21:46] <BBB> ubitux: that's all
[21:46] <BBB> ubitux: (i.e. much less instructions, and thus faster)
[21:47] <BBB> ubitux: the 2x2 is eob <= 3 and is halfway in between, block[0, 1 and 4] are != 0, so you do a 1d idct only for the top half of the block
[21:48] <BBB> ubitux: n. of instructions is halfway between dc-only and the full (which you just wrote)
[21:49] <ubitux> ok, i'll consider this
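The dc-only formula above can be checked in scalar C. This is a sketch under assumptions: `pmulhrsw_model` models one 16-bit lane of the SSSE3 instruction, and `ref_round_mul` assumes the C reference multiplies by 11585 and rounds with a 14-bit shift, which is what the "11585x2" constant suggests. All function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of pmulhrsw: ((a * b >> 14) + 1) >> 1, assuming arithmetic shifts. */
static int16_t pmulhrsw_model(int16_t a, int16_t b)
{
    int32_t t = ((int32_t)a * b) >> 14;
    return (int16_t)((t + 1) >> 1);
}

/* Assumed reference rounding: multiply by 11585, round, shift by 14. */
static int ref_round_mul(int x)
{
    return (x * 11585 + (1 << 13)) >> 14;
}

/* out[n] for a dc-only block, following the formula quoted in the log. */
static int16_t dc_only_value(int16_t dc)
{
    return pmulhrsw_model(pmulhrsw_model(dc, 11585 * 2), 11585 * 2);
}

/* The two formulations should agree for a given dc coefficient. */
static void check_dc(int16_t dc)
{
    assert(dc_only_value(dc) == ref_round_mul(ref_round_mul(dc)));
}
```

Doubling the constant works because pmulhrsw shifts by 15 rather than 14, so multiplying by 2 x 11585 and shifting by 15 matches multiplying by 11585 and shifting by 14, rounding included.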
[21:50] <ubitux> BBB: do you have a trick for the final rounding?
[21:50] <michaelni> durandal11707, soon
[21:50] <ubitux> BBB: since i'm adding 8 to the 4 reg
[21:50] <ubitux> i'm guessing this could be put somewhere else transparently
[21:53] <BBB> ubitux: pmulhrsw :)
[21:53] <BBB> ubitux: and just don't use STORE_DIFFx2, just write your own, that's fine
[21:54] <ubitux> why shouldn't i use it?
[21:55] <BBB> ubitux: pmulhrsw(x, 2048) is the same as +8>>4
[21:55] <BBB> so just use that
[21:55] <ubitux> ah
[21:55] <ubitux> mmh
[21:55] <BBB> so it's one instruction instead of two
[21:55] <BBB> and you're already ssse3 so that's not an issue
[21:55] <ubitux> ok
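The rounding identity mentioned above, pmulhrsw(x, 2048) being the same as (x + 8) >> 4, can be verified with a scalar model. A minimal sketch, assuming arithmetic right shifts on signed values; `pmulhrsw_model` and `check_round_shift4` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of pmulhrsw: ((a * b >> 14) + 1) >> 1, assuming arithmetic shifts. */
static int16_t pmulhrsw_model(int16_t a, int16_t b)
{
    int32_t t = ((int32_t)a * b) >> 14;
    return (int16_t)((t + 1) >> 1);
}

/* 2048 is 2^11, so (x * 2^11 + 2^14) >> 15 reduces to (x + 8) >> 4. */
static void check_round_shift4(int16_t x)
{
    assert(pmulhrsw_model(x, 2048) == ((x + 8) >> 4));
}
```

Looping `check_round_shift4` over every 16-bit input is cheap and covers the negative values too.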
[21:56] <ubitux> BBB: while i'm at it, any idea why there was a DEFINE_ARGS and lea instead of just calling STORE_DIFFx2, lea, STORE_DIFFx2?
[21:56] <ubitux> (in vp8, i copied that part)
[21:57] <BBB> just so the reg has a name
[21:57] <BBB> but really there's no reason
[21:57] <BBB> so don't feel obliged to use it
[21:57] <ubitux> ah, okay
[21:57] <BBB> vp8 isn't perfect
[21:57] <BBB> it was pretty good, but it was also the first asm I ever wrote
[21:57] <BBB> so it's certainly got some beginner's issues
[21:58] <ubitux> ok :)
[21:59] <cone-25> ffmpeg.git Vittorio Giovara master:b284e1ffe343: mem: do not check for negative size
[21:59] <cone-25> ffmpeg.git Michael Niedermayer master:3fcc2684e49b: Merge commit 'b284e1ffe343d6697fb950d1ee517bafda8a9844'
[22:04] <cone-25> ffmpeg.git Anton Khirnov master:834259528b6c: fft-test: add a missing #include
[22:04] <cone-25> ffmpeg.git Michael Niedermayer master:c78a41698558: Merge remote-tracking branch 'qatar/master'
[22:08] <ubitux> BBB: is the eob relevant only for 4x4?
[22:11] <iive> eob as end of block, indicating the last non-zero coefficient?
[22:18] <llogan> burek: gusari seems to be having issues
[22:19] <llogan> BBB: did you post anything about katmai? i haven't seen it bu tit might be laziness on my part
[22:19] <llogan> *but it
[22:19] <llogan> see
[22:22] <llogan> burek: nevermind. now it's back. sorry for the noise.
[22:23] <burek> llogan, thanks
[22:23] <burek> i just investigated and killed a spammer :/
[22:23] <burek> they never sleep :S
[22:24] <llogan> i probably have 5 spammer pelts from the forum
[22:24] <llogan> but nobody buys these
[22:24] <burek> that's a hosting machine with a lot of websites on it
[22:25] <burek> and a lot of newbie webadmins..
[22:25] <burek> you just reported that at the perfect time
[22:25] <burek> for me to catch the spammer in the action :)
[22:25] <burek> thanks again :beer: :)
[22:25] <llogan> beer... i had too much last night and it's early here...
[22:27] <burek> :)
[22:31] <michaelni> ubitux, did you have any reason why you didn't apply "Clément Bœsch   (4.9K) [FFmpeg-devel] [PATCH] build: make sure probed tools don't wait for stdin input." ?
[22:32] <ubitux> i remember there was a comment, and so i didn't bother look again at the issue
[22:32] <ubitux> it looked relevant at first, but i didn't allocate time for it so far, sorry
[22:33] <ubitux> feel free to push if you think it's ok
[22:34] <michaelni> i thought so, but it doesn't work anymore :(
[22:35] <ubitux> ah? well i guess i need to look again
[22:36] <ubitux> but not now, as you can see i'm having a lot of fun with asm recently :]
[22:37] <michaelni> sure, no hurry
[22:43] <BBB> ubitux: no, all of them
[22:43] <BBB> ubitux: also, eob > 0 (by definition)
[22:43] <BBB> ubitux: so the !!eob is sort of silly (it's always 1)
[22:44] <ubitux> you were talking about eob < 3
[22:44] <BBB> ubitux: <=
[22:45] <ubitux> and so !!eob was required 
[22:45] <BBB> ?
[22:45] <BBB> why don't you assert(!!eob == 1) in there
[22:46] <BBB> or assert(!!eob);
[22:46] <BBB> (just for fun)
[22:46] <ubitux> im not following you, we might not be talking about the same thing
[22:46] <BBB> I guess... what did you mean?
[22:46] <ubitux> i was referring to the memset(block, 0, size * (!!eob)) i did in another commit
[22:47] <BBB> right, that's not necessary
[22:47] <BBB> !!eob is always 1
[22:47] <ubitux> !!eob, but not eob
[22:47] <BBB> right
[22:47] <ubitux> so you can't size * eob
[22:47] <BBB> right
[22:47] <ubitux> but maybe that's not what you suggest?
[22:47] <ubitux> :D
[22:48] <BBB> no not really
[22:48] <BBB> just replace that with the original code
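The point about !!eob can be made concrete: the double negation maps any non-zero eob to 1, so with eob > 0 the product size * (!!eob) is always just size. A small C sketch with invented names (`bytes_to_clear`, `check_clear_is_full`):

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes the memset in question would clear. */
static size_t bytes_to_clear(size_t size, int eob)
{
    return size * (size_t)(!!eob);   /* !!eob is 0 or 1, never eob itself */
}

/* With eob > 0, the whole block is cleared whatever eob is. */
static void check_clear_is_full(size_t size, int eob)
{
    assert(eob > 0);
    assert(bytes_to_clear(size, eob) == size);
}
```

Only an eob of 0 would change the result, and per the discussion that value does not occur here.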
[22:48] <BBB> what I suggest is this:
[22:48] <BBB> if (eob == 1) { special code } else if (eob <= 3) { special code } else { your current assembly code }
[22:49] <BBB> this if can be done in the c code in x86/vp9dsp_init.c
[22:49] <BBB> special code for eob == 1 does this: mm0 = pshufw(pmulhrsw(pmulhrsw(block[0], 11585x2), 11585x2), q0000)
[22:49] <ubitux> ah, sure
[22:49] <BBB> then do STORE_DIFFx2 with that mm0
[22:49] <ubitux> okay :)
[22:49] <BBB> for eob <= 3, it's a little harder
[22:50] <BBB> also, you only need 1 or 2 movd [mem], zeroreg instead of 4
[22:50] <ubitux> i'm rewriting the store thing currently, trying to put the pmulhrsw trick in
[22:50] <BBB> cool
[22:50] <ubitux> i'll look at the eob thing after
[22:50] <BBB> ok cool
[22:51] <BBB> it's not urgent, it's not a bug
[22:51] <BBB> it's just another speedup possibility
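The three-way split suggested above could look roughly like this in C. It is only a sketch: the enum and `pick_idct_path` are hypothetical names, and the real code would call the matching assembly function instead of returning a tag.

```c
#include <assert.h>

enum idct_path { IDCT_DC_ONLY, IDCT_2X2, IDCT_FULL };

/* Pick the cheapest transform that still covers all non-zero coefficients. */
static enum idct_path pick_idct_path(int eob)
{
    assert(eob > 0);          /* per the discussion, eob is never 0 here */
    if (eob == 1)
        return IDCT_DC_ONLY;  /* only block[0] is set */
    if (eob <= 3)
        return IDCT_2X2;      /* at most block[0], block[1], block[4] */
    return IDCT_FULL;         /* general case */
}
```

Keeping the branch in C, as suggested in the log, leaves each assembly routine free of eob checks.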
[22:54] <BBB> are your coefficients now correct?
[22:54] <BBB> (i.e. same for both 1d dimensions)
[22:56] <ubitux> BBB: yes everything is correct
[22:56] <ubitux> it passes fate etc
[22:56] <ubitux> and yes i could put them in the correct order now ;)
[22:56] <ubitux> my branch is up-to-date if you want to hf with it :p
[23:00] <BBB> so ... if the arguments to a macro are always the same
[23:00] <BBB> feel free to remove the arguments
[23:01] <BBB> (e.g. in my 32x32, I just have a macro called IDCT32_1D)
[23:01] <BBB> (with no arguments)
[23:01] <BBB> oh actually, that's a lie, I have one argument, but you don't need that
[23:01] <ubitux> i thought we could do some factoring :(
[23:01] <BBB> up to you
[23:02] <BBB> if you think it can be improved, then that's totally fine
[23:02] <BBB> if you don't think this will change, it may be cleaner to remove
[23:02] <BBB> up to you
[23:02] <ubitux> i need to know what the other functions will look like
[23:03] <BBB> which one?
[23:05] <ubitux> any other simd idct 
[23:05] <BBB> well you asked me to not put my code up so I didn't ;)
[23:06] <ubitux> haha yeah
[23:06] <ubitux> let me get done with that first one first :)
[23:06] <BBB> right
[23:27] <ubitux> BBB: did you finish it btw?
[23:30] <BBB> no, I rarely have free time in an environment that allows concentration for this kind of stuff
[23:30] <BBB> I continuously have screaming kids around me
[23:30] <BBB> it's virtually impossible to get any coding done :/
[23:31] <ubitux> teach them simd
[23:31] <ubitux> if i could somehow learn it, they certainly can as well
[23:32] <BBB> they're kind of young maybe
[23:32] <BBB> the youngest can't walk yet
[23:32] <BBB> I don't think he's quite ready for simd
[23:33] <ubitux> don't underestimate them
[23:34] <ubitux> anyway, i like this a lot, but it's as time consuming as reversing code
[23:34] <ubitux> i hope to get faster somehow :p
[23:35] <BBB> you'll get faster
[23:35] <ubitux> assuming i can hardly get slower, sure
[23:35] <BBB> it's a skill, you're still learning :)
[23:35] <BBB> don't feel bad
[23:35] <BBB> you're doing fine
[23:36] <BBB> try to work on avx-readiness
[23:37] <BBB> that's one thing that's fun for future
[23:37] <BBB> basically whenever you do mova mX, mY; think avx
[23:37] <BBB> avx allows 3-register instructions
[23:37] <BBB> like punpcklbw m0, m1, m2
[23:37] <BBB> think of that as mova m0, m1; punpcklbw m0, m2
[23:37] <BBB> so m0 is a dest only, and m1 and m2 are both unmodified sources
[23:37] <BBB> in avx, that's a single instruction
[23:37] <kierank> mova mX, [foo] as well
[23:38] <BBB> right
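The equivalence between the two-operand and three-operand forms can be modelled in scalar C. This sketch assumes 16-byte (xmm-sized) registers and uses invented function names; it models only the data flow, not the instruction encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of punpcklbw: interleave the low 8 bytes of a and b into dst. */
static void punpcklbw_model(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16])
{
    uint8_t out[16];
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
    memcpy(dst, out, 16);            /* temp buffer so dst may alias a or b */
}

/* Two-operand style: mova m0, m1 ; punpcklbw m0, m2 */
static void two_operand(uint8_t m0[16], const uint8_t m1[16], const uint8_t m2[16])
{
    memcpy(m0, m1, 16);
    punpcklbw_model(m0, m0, m2);
}

/* Three-operand style: punpcklbw m0, m1, m2 (both sources left untouched) */
static void three_operand(uint8_t m0[16], const uint8_t m1[16], const uint8_t m2[16])
{
    punpcklbw_model(m0, m1, m2);
}

/* Both forms should leave the same bytes in m0. */
static void check_forms_agree(const uint8_t m1[16], const uint8_t m2[16])
{
    uint8_t r2[16], r3[16];
    two_operand(r2, m1, m2);
    three_operand(r3, m1, m2);
    assert(memcmp(r2, r3, 16) == 0);
}
```

In the three-operand form the copy into m0 disappears, which is the saving being pointed at for every mova mX, mY pair.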
[23:39] <ubitux> ow
[23:39] <BBB> examples: line 101, 108, 121
[23:39] <ubitux> yeah i don't like those
[23:39] <BBB> lol :)
[23:39] <BBB> anyway
[23:39] <ubitux> especially L101
[23:39] <BBB> right
[23:39] <BBB> so avx doesn't work with mmx, so it won't affect your code
[23:40] <BBB> but it does work with xmm, so if you want to do the 8x8 after this, it's very relevant
[23:40] <ubitux> unfortunately, i might have one of the rare i7 without avx
[23:40] <ubitux> :D
[23:40] <BBB> so do I
[23:40] <BBB> kierank has a test machine
[23:40] <BBB> obe2 or so
[23:40] <ubitux> i have a remote one
[23:40] <kierank> i need to get one with avx2
[23:40] <ubitux> i think it has avx
[23:40] <BBB> ah cool
[23:40] <BBB> the new macbook pro has avx2/haswell
[23:41] <BBB> so I'm thinking of getting that one
[23:41] <kierank> I have a machine but nowhere to colo it :(
[23:41] <ubitux> yeah my remote one has avx
[23:41] <ubitux> not avx2 though
[23:45] <ubitux> BBB: btw, i'm really happy i could use SWAP :
[23:51] <BBB> thank pengvado, I think he designed all the original stuff in x86inc.asm
[00:00] --- Sun Oct 27 2013

