[Ffmpeg-devel-irc] ffmpeg-devel.log.20190530

Fri May 31 03:05:04 EEST 2019

[00:33:53 CEST] <mkver> Does anybody know a system where an attempted read via AV_RNxxA (xx = 32 or 64) at a nonaligned position actually crashes?
[00:35:47 CEST] <mkver> I'm asking because I improved (so I think) the search for 0x00 0x00 0x01 startcodes.
[00:36:24 CEST] <mkver> In particular, I added an implementation that does not require HAVE_FAST_UNALIGNED.
[00:37:02 CEST] <mkver> It uses aligned reads and I want to make sure that the way I establish alignment works fine.
[00:37:19 CEST] <mkver> Because I have no system to test this on.
[00:37:33 CEST] <nevcairiel> ARM has issues with  that
[00:37:44 CEST] <mkver> If anybody is interested: You can find the commits here: https://github.com/mkver/FFmpeg/commits/start_3
[00:38:07 CEST] <mkver> I unfortunately have only an X64 to test this on. Could you test this?
[00:38:35 CEST] <mkver> Notice that ARM has a different implementation of the startcode-search function alltogether.
[00:39:33 CEST] <Lynne> I think the issue with alignment on arm was that it was just slow, and that was just on older processors
[00:39:55 CEST] <nevcairiel> There is definitely systems were using RNxxA on unaligned data fails
[00:40:03 CEST] <nevcairiel> we had some FATE failures not too long  ago due to that
[00:40:14 CEST] <mkver> The ARM-specific function would need to be disabled (in libavcodec/arm/h264dsp_init_arm.c and libavcodec/arm/vc1dsp_init_arm.c).
[00:40:43 CEST] <Lynne> was it the compiler's fault? I think on such systems gcc handled paging wrt reads and writes
[00:42:16 CEST] <mkver> Good to hear of these failures. Could someone tests these patches on ARM (or other hardware with slow (or none) unaligned access)?
[00:42:38 CEST] <nevcairiel> I don't think the compiler can know if a pointer is necessarily going to be misaligned if you just cast a byte pointer to 32-bit or 64-bit types
[00:43:16 CEST] <nevcairiel> which is what the RN32A macro does, for example
[00:44:07 CEST] <mkver> I don't just cast a byte pointer to 32 or 64 bit types.
[00:44:23 CEST] <nevcairiel> well if you use RNxxA, it'll do that for you :D
[00:44:44 CEST] <mkver> My way of aligning a pointer is ((const uint8_t *)((uintptr_t)(buf + (mod) - 1) / (mod) * (mod))), where mod = 4 or 8.
[00:45:08 CEST] <mkver> The resulting pointer is then used in RNxxA
[00:45:59 CEST] <nevcairiel> fast_unaligned is really enabled on all systems that really matter, anyhow
[01:07:09 CEST] <jkqxz> 32-bit ARM can be configured to SIGBUS on unaligned access.
[01:07:53 CEST] <jkqxz> By default on Linux it traps to the kernel which fixes up the result (which is painfully slow if you do it much), but there is a setting to change that.
[01:08:08 CEST] <jkqxz> What test would you want to run?
[01:28:04 CEST] <mkver> Apply the patches in the linked branch and simply run ffmpeg -i <some H.264 elementary stream> -c copy -f null - and check whether you get the SIGBUS error.
[01:28:35 CEST] <mkver> And maybe change -f null - with a framehash muxer and compare the result to the current git head.
[01:29:56 CEST] <mkver> (And remember to deactivate the ARM-specific startcode function in libavcodec/arm/h264dsp_init_arm.c.)
[01:33:09 CEST] <nevcairiel> so if arm32 has an ASM variant, and arm64 and x86 have fast_unaligned, what systems is that for? :D
[01:34:05 CEST] <mkver> I don't know. Maybe mips?
[01:34:23 CEST] <nevcairiel> I suppose so, but seems like a lot of work without a known target :)
[01:34:29 CEST] <mkver> I thought that I could optimize it, so I did.
[01:36:06 CEST] <jkqxz> Can you give me a clean single patch?
[01:41:26 CEST] <mkver> Sent you an email.
[01:43:03 CEST] <jamrial> mkver: you may instead want to look at the startcode stuff in ff_h2645_extract_rbsp(), h2645_parse.c
[01:43:05 CEST] <jkqxz> Got it, thank you!  That's much easier than fighting with github.
[01:43:12 CEST] <jamrial> that one is run on every target
[01:43:51 CEST] <mkver> I have already looked at the startcode stuff in h2645_parse.c
[01:44:33 CEST] <mkver> Improving it (and cbs_h2645_assemble_fragment) is on my to-do list.
[01:44:38 CEST] <jamrial> jkqxz: in case you didn't know, when looking at a github commit, if you add .patch at the end of the url it will give you a git format-patch style patch
[01:45:58 CEST] <mkver> h2645_parse.c uses a suboptimal pattern. It could filter out more false positives without any extra cost.
[01:48:09 CEST] <mkver> And of course, h2645_parse.c could do better if unaligned is slow :)
[02:13:14 CEST] <mkver> Good night!
[02:13:36 CEST] <jkqxz> Hmm.  This CPU seems to be magically fixing up 4-byte reads/writes.
[02:13:48 CEST] <jkqxz> Oh, too late.
[02:14:07 CEST] <jkqxz> But I can get it to fail with 8-byte ones.
[10:53:42 CEST] <cone-233> ffmpeg 03Linjie Fu 07master:6895b350c31d: lavf/qsvvpp: avoid the double-free when working in sys memory mode
[11:40:24 CEST] <jkqxz> mkver:  Turns out that CPU fixes up most 4-byte unaligned accesses automatically.  (8-byte unaligned accesses do trap as expected.)
[11:40:38 CEST] <jkqxz> So while there were no problems, I don't think it tested what you wanted it to test.
[11:49:04 CEST] <mkver> Ok.
[11:49:23 CEST] <mkver> And the framehash output showed no problems?
[12:49:42 CEST] <vel0city> how do you guys get the Message-Id of an email you want to reply to in Gmail?
[12:50:20 CEST] <vel0city> I'm using an extension but it seems to not be working well
[12:51:34 CEST] <vel0city> I tried to send a reply via --in-reply-to and it went to a completely different one, for a different patchset
[12:52:52 CEST] <vel0city> I'm using this extension: https://chrome.google.com/webstore/detail/gmail-message-id-finder/comcoiiifldaaejpbgbincdkoohihbae
[12:53:35 CEST] <JEEB> vel0city: when you open an e-mail there's the three dots on teh right at the top
[12:53:44 CEST] <JEEB> "show original" is one of the alternatives
[12:54:01 CEST] <JEEB> and the first value that view shows is teh Message ID
[12:55:09 CEST] <vel0city> ah right, in that case, it's the same that the extension gave me
[12:55:18 CEST] <vel0city> something else must be wrong
[12:55:55 CEST] <vel0city> (more likely, _I'm_ doing something wrong)
[12:56:20 CEST] <vel0city> though I remember in-reply-to worked fine a few months ago
[19:38:54 CEST] <cone-100> ffmpeg 03Nick Renieris 07master:a7e018b05e0e: avcodec/tiff: Option to decode embedded thumbnail
[19:38:54 CEST] <cone-100> ffmpeg 03Nick Renieris 07master:661facb8a89a: libavcodec/tiff: Process SubIFDs tag with multiple entries
[19:38:54 CEST] <cone-100> ffmpeg 03Nick Renieris 07master:9c35285aea84: avcodec/tiff: Recognize DNG/CinemaDNG images
[20:31:39 CEST] <mjbshaw> I'm looking into fixing https://trac.ffmpeg.org/ticket/6841 but there are a few ways to go about it...
[20:32:10 CEST] <mjbshaw> I've got a patch that modifies the mkv muxer and makes it lie about the output stream's time_base (so it uses the audio's sample_rate instead of the "real" time base).
[20:32:37 CEST] <mjbshaw> And then in mkv_write_flush_packet it rescale's the packet's timing info to be in the "real" time base.
[20:32:51 CEST] <mjbshaw> It works, but like I said, it's lying.
[20:33:48 CEST] <mjbshaw> Alternatively, the issue could be fixed by changing ffmpeg.c so it rescale's the packet's timing info *after* applying the avoid_negative_ts hack rather than before it.
[20:34:45 CEST] <mjbshaw> While that seems more correct, it requires mucking with some of the bowels of the processing graph, and I'm not sure I have the time to do that.
[20:35:37 CEST] <mjbshaw> Does anyone have any brighter ideas?
[20:46:34 CEST] <mkver> I have been thinking about this ticket and ticket 7182 myself.
[20:49:27 CEST] <mkver> First, what does this have to do with ffmpeg.c? The avoid_negative_ts option is not specific to ffmpeg.c, but to libavformat in general.
[20:49:38 CEST] <mkver> A fix should be in the libraries.
[20:53:40 CEST] <mjbshaw> ffmpeg.c line 952 rescale's the packet's ts from 1/48000 to 1/1000.
[20:53:50 CEST] <mjbshaw> Which is where the rounding error is introduced.
[20:54:04 CEST] <mkver> Second, for opus in Matroska there avoid_negative_ts should actually not come into play at all.
[20:54:17 CEST] <mkver> the, not there
[20:56:11 CEST] <mkver> After all, the Matroska file format has the ability to signal encoder delay at the container level without offsetting the actual timestamps.
[20:58:18 CEST] <mjbshaw> Yeah, it's not clear to my why the mkv muxer defaults to avoid_negative_ts=1.
[20:58:29 CEST] <mkver> So the generic code (in mux.c) should be changed so that it doesn't offset all the tracks just because an opus track has an initial timestamp of -6.5ms.
[20:59:16 CEST] <mkver> It defaults to avoid_negative_ts because the Matroska file format can't adequatly represent negative timestamps.
[20:59:47 CEST] <mkver> The CodecDelay field is only there for encoder delay, not for general timestamp shifting.
[21:00:06 CEST] <mjbshaw> Oh yeah, I forgot Cluster.Timestamp is unsigned, not signed.
[21:00:23 CEST] <Lynne> is the problem rounding to ms precision? matroska does allow you to change the timebase, but the muxer would probably have to be rewritten to support that
[21:00:40 CEST] <mjbshaw> I've got a patch to support changing the time base of the mkv muxer.
[21:00:59 CEST] <mjbshaw> https://patchwork.ffmpeg.org/patch/1959/
[21:02:02 CEST] <mjbshaw> But yeah, the problem is caused by a rounding error. mkv defaults to a timebase of 1/1000 (and is required by webm). I don't think using a more precise time base is sufficient.
[21:02:53 CEST] <durandal_1707> why?
[21:02:56 CEST] <mkver> Supporting more timebases in the Matroska muxer is fine, but the timestamps that should actually be written for opus can actually be represented in the default ms timebase.
[21:03:44 CEST] <mkver> A more precise timebase would be sufficient (for Matroska, not for WebM, for which it is impossible), but it is not necessary.
[21:03:46 CEST] <mjbshaw> durandal_1707: because (1) it doesn't work for WebM and (2) it only pushes the rounding error down to a lower-order bit; it doesn't actually fix the rounding.
[21:04:30 CEST] <mkver> After all, the timestamps 0ms, 20ms, 40ms (which are the timestamps the blocks should have (before taking CodecDelay into account)) can be represented in the ms timebase.
[21:05:08 CEST] <mjbshaw> Assuming you're using 20ms Opus frames, yes. But Opus frames can be 2.5ms.
[21:06:26 CEST] <mkver> Yeah. For such streams, you really should use a finer timebase (if you don't want the jitter).
[21:06:38 CEST] <mjbshaw> But you can't for WebM.
[21:07:06 CEST] <mjbshaw> It won't be perfect, granted, but we could at least be a *little* more accurate.
[21:07:51 CEST] <mjbshaw> Anyway, I agree with your earlier point that we shouldn't be shifting the time stamps of all the streams just because of Opus's delay (which is already signaled in the container).
[21:08:54 CEST] <mkver> Yeah.
[21:09:33 CEST] <mkver> One solution would be to use different PTS and DTS to account for encoder delay for audio.
[21:10:01 CEST] <mkver> But the assumption that pts=dts for audio is so deeply ingrained into FFmpeg that this is probably a pipe dream.
[21:11:00 CEST] <nevcairiel> its also not accurate, since that delayed audio isn't just offset in time, but will actually disappear
[21:12:13 CEST] <mkver> Thinking about this a second time, it wouldn't work.
[21:15:24 CEST] <mkver> The problem of offsetting all tracks just because one seems to have a negative initial timestamp could be solved by a new flag that signals whether a muxer has the capability to signal an encoder delay by itself.
[21:16:13 CEST] <nevcairiel> You would have to verify that the signaled delay also matches the negative timestamp, of course, but that sounds sensible otherwise
[21:16:51 CEST] <mjbshaw> That sounds good to me.
[21:17:47 CEST] <nevcairiel> dont we already have a variety of modes for avoid_negative_ts? maybe that can just fit in there
[21:17:49 CEST] <mkver> Or more precisely: In mux.c, for a muxer for which said flag is set, the timestamp to look at and check for whether it is >= 0 is not the pts, but rather pts + time of the encoder delay (given by initial_padding).
[21:18:11 CEST] <mkver> There is already a avoid_negative_ts_use_pts which is only used for Matroska.
[21:18:19 CEST] <mjbshaw> I was just going to mention avoid_negative_ts_use_pts.
[21:18:24 CEST] <mkver> So it could be reused for this.
[21:18:55 CEST] <mkver> This would solve one of the four problems currently existing with CodecDelay.
[21:19:08 CEST] <mjbshaw> What are the other three problems?
[21:20:03 CEST] <mkver> 1. #7182: The Matroska muxer writes the CodecDelay flag, but doesn't actually modify the timestamps.
[21:21:08 CEST] <mkver> This means that a track for which the CodecDelay field is nonzero gets offset compared to tracks for which it is not (or for which it has a different value).
[21:21:27 CEST] <mkver> 2. The rounding issue in the Matroska muxer.
[21:22:52 CEST] <mkver> 3. The rounding issue in the Matroska demuxer: The demuxer will correctly (sample-accurately, not ns-accurately) export the CodecDelay as initial_padding. And it will offset the timestamps, but it rounds the timestamps to the timebase of the Matroska file (usually 1ms).
[21:23:37 CEST] <mkver> So an opus block with timestamp 0 (before taking CodecDelay into account) and CodecDelay will leave the demuxer with a timestamp of -7ms.
[21:24:58 CEST] <mkver> Possible solution: Use a finer timebase for a track with CodecDelay.
[21:25:14 CEST] <mkver> But this has side-effects by claiming to be more precise than it is.
[21:26:36 CEST] <Lynne> its better than having everything muxed be downgraded to 1/1000 timebase
[21:26:56 CEST] <mkver> Think about it this way: You have an audio track with a default duration of 21 1/3ms (AAC) and an encoder delay of 21 1/3.
[21:27:24 CEST] <mkver> The possible solution that I am currently talking about is for problem 3 (for the demuxer).
[21:27:51 CEST] <mkver> Btw: Michael, did you receive my mail that I just sent?
[21:28:04 CEST] <Lynne> though I did hear that matroska files with non-standard (!= 1/1000) timebases break a lot of demuxers
[21:28:48 CEST] <mkver> These demuxers would then be broken, obviously.
[21:29:08 CEST] <mkver> (Except if there are for WebM only, as WebM requires a 1ms timebase.)
[21:29:25 CEST] <mjbshaw> mkver: I haven't received anything.
[21:29:34 CEST] <mkver> I'll resend it.
[21:33:42 CEST] <mjbshaw> For #3, the way I've handled it (outside of FFmpeg) is to claim all audio streams in MKV have a time_base of 1/sample_rate (and only non-audio streams use the container's 1/1000 (or whatever) timebase).
[21:34:28 CEST] <mkver> Back to the demuxer problem: The actual timestamps are -21 1/3ms, 0ms, 21 1/3ms, 42 2/3ms, ..., but the timestamps of the blocks are 0, 21ms, 43ms,... (i.e. from these timestamps alone it appears as if the second packet had a longer duration than the first).
[21:36:04 CEST] <mkver> If one now uses a finer timebase to export this stream (one fine enough to signal the CodecDelay) and simply offsets all timestamps by the CodecDelay, then the fact that some packets have a duration of 21ms and some of 22ms will remain.
[21:37:18 CEST] <mkver> But with the 1ms timebase this could be put down to being a side-effect of the timebase (so it did not really signal that there were gaps).
[21:38:15 CEST] <mkver> This changes when the demuxer exports this with a finer timebase. Now the track really looks as if it had some gaps and some parts where the audio from two packets should be played simultaneously.
[21:39:06 CEST] <mkver> To solve this one would have to interpolate the timestamps (via their duration).
[21:39:50 CEST] <mjbshaw> This example is audio, right? Are the packet times/durations not perfectly representable using a time base of 1/sample_rate? If not, wtf kind of audio codec is emitting fractional audio frames?
[21:40:49 CEST] <mkver> It is an audio example (based upon AAC).
[21:41:17 CEST] <mkver> Lots of audio packets duration can't be perfectly represented in Matroska.
[21:41:50 CEST] <mkver> That's because most common sample rates have a 3 in their factorization.
[21:42:21 CEST] <mkver> 1/sample rate is not a multiple of ns then and so is not perfectly representable in Matroska.
[21:43:23 CEST] <mkver> And the number of samples in an audio packet doesn't have a three in its factorization, then the duration of said packet won't be perfectly representable either.
[21:44:33 CEST] <mkver> This applies to most codecs: DTS (512 samples (10 2/3 ms)), AAC (1024 samples), certain modes of flac...
[21:45:12 CEST] <mjbshaw> I need to look at matroskadec.c, but in my own demuxer+player (which handles WebM) I "lie" about the time base of streams contained in MKV/WebM.
[21:45:23 CEST] <mjbshaw> Audio streams all get reported as having a time base of 1/sample_rate.
[21:45:32 CEST] <mjbshaw> Video streams get a time base of whatever the container says.
[21:45:49 CEST] <mjbshaw> Doing this in ffmpeg would fix the rounding issue in the demuxer.
[21:45:50 CEST] <mkver> AC3 is fine though (at 48kHz): 1536 = 3 * 512 has a duration of 32ms. 
[21:46:19 CEST] <mkver> And how do you fix up the issue I talked about earlier?
[21:46:46 CEST] <mjbshaw> Which one :)
[21:47:36 CEST] <mkver> You have real timestamps of 0ms, 21 1/3ms, 42 2/3ms..., but the container only contains the timestamps of 0, 21ms, 43ms (timebase 1ms).
[21:48:36 CEST] <mkver> If you now simply convert this to (say) 48kHz, then do you make it by multiplying 21*48 and 43*48 or do you actually interpolate the real timestamps?
[21:49:40 CEST] <mkver> (I.e. do you take into account that 21ms and 43ms are not the real timestamps, but just the approximations in the 1ms timebase and fix them up to 21 1/3ms and 42 2/3ms before converting them to the 1/sample rate timebase?)
[21:49:44 CEST] <nevcairiel> slightly jittery timestamps for audio is something everyone should be handling already anyway, its also not so bad since after decoding you can just make them sample accurate and just check it doesnt drift apart
[21:50:12 CEST] <mjbshaw> The player ignores the PTS of audio packets except for when seeking.
[21:50:41 CEST] <mkver> So it just plays them as it finds them and doesn't care.
[21:50:47 CEST] <mjbshaw> As nevcairiel said, audio timestamps suck anyway and so the only way to really handle this in a player is to ignore the timestamps when playing contiguously.
[21:51:25 CEST] <mjbshaw> So the first packet gets its timestamp from the container (lossily converted to a timebase of 1/sample_rate).
[21:51:29 CEST] <nevcairiel> Thats what I do too. Primarily just keep a "counter" based on decoded samples and check that it doesn't drift apart from the timestamps, in which case something would be wrong
[21:51:49 CEST] <mjbshaw> Subsequent packets get their timestamp by doing previous_packet_pts + previous_back_duration and ignoring the container's PTS.
[21:52:20 CEST] <mjbshaw> where previous_back_duration is a sample-accurate duration of the packet.
[21:53:19 CEST] <nevcairiel> for the record, while such logic is fine for a decoder or player to perform, a demuxer should just spit out the container timestamps, and let the other layers then ignore them (or use them :P)
[21:55:00 CEST] <mkver> mjbshaw: I hope you are at least doing dome sanity checks and don't completely ignore container pts.
[21:55:43 CEST] <mkver> nev, what do you think of the strategy I outlined at the end of this post: https://forum.doom9.org/showthread.php?p=1848928#post1848928 ?
[21:56:42 CEST] <mjbshaw> mkver: Eh, there's only one of two options if the container PTS drifts horribly away from the sample-accurate timing based on actual contents. At that point the file is broken, and you can either continue playing it (what I do) or not (e.g., seek, warn the user, etc.).
[21:57:26 CEST] <mjbshaw> Anyway, the advantage of reporting the audio stream's timestamps using a timebase of 1/sample_rate is that you can then accurately represent stuff like the codec delay.
[21:57:33 CEST] <mkver> Ok, makes sense.
[22:00:27 CEST] <nevcairiel> mkver: that sounds way too complicated just to avoid a tiny bit of timestamp jitter which is inherent to many container formats
[22:01:29 CEST] <mkver> I somehow expected this from you.
[22:02:15 CEST] <nevcairiel> its really nto a problem that needs solving :)
[22:02:34 CEST] <nevcairiel> there is so many problems that do, rather invest time in those, personally
[22:03:48 CEST] <mkver> Back to the rounding issues. Maybe we are thinking about this to complicated. Maybe we should simply use a different rounding method. One that does not round down -6.5ms and round up 13.5ms.
[22:04:37 CEST] <mkver> If both are rounded the same, the 1ms gap is gone.
[22:07:12 CEST] <mjbshaw> I'd be fine with that. Not quite as complete of a solution, but it's suuuper simple.
[22:07:49 CEST] <mkver> Yeah, simplicity rules.
[00:00:00 CEST] --- Fri May 31 2019