[FFmpeg-devel] [PATCH] Add NVENC encoder
Philip Langdale
philipl at overt.org
Sat Nov 29 16:26:07 CET 2014
On 29 Nov 2014 03:45, Nicolas George <george at nsup.org> wrote:
>
> Le septidi 7 frimaire, an CCXXIII, Timo Rothenpieler a écrit :
> > Done that primarily to keep things cleaned up and easier to read.
> > Can as well put it all in one huge file.
>
> IMHO, your choice in the end.
>
> > Propably, will split that out when i get to it.
>
> Thanks.
>
> > Most of this code is ported from a C++ Project where everything had
> > to be camel case, so those and the C++ malloc casts are some of the
> > leftovers.
>
> Idem, IMHO your choice.
>
> > The maximum is the number of adjacent B Frames plus one. So 3 in the
> > case of NVENC, unless they change the supported gop patterns.
>
> In that case, I suspect to take some margin and use a hardcoded size for the
> queue: with 8 or 16, there is plenty of time to update the size if nvidia
> releases new GOP patterns.
>
> > I basically need a fifo like structure, where i can queue output
> > surfaces until NVENC is done filling them.
> > An array doesn't realy reflect that usage, as new elements are added
> > to the front, and taken from the back.
>
> An array used as a circular buffer is perfect for that.
For what it's worth, the implementation I did against the
nvidia patch was a circular buffer and it works fine. The
docs I read say 4 + num b frames and max b frames can
go up to 16 - not 2.
> > Is a simple assert on the return value enough? Can't continue in a
> > sane way anyway if it ever fails.
>
> No: an assert can only be used when the error is not possible except for a
> bug in the code. A malloc failure is always possible, so a proper error
> handling, eventually returning ENOMEM, is necessary. That is annoying but
> mostly unavoidable.
>
> > I renamed it to enqueue/dequeue.
>
> Thanks.
>
> > This definitely can't be replaced, its purpose isn't just a plain
> > list, but sorting of the input timestamps, so the dts is still
> > monotonic after re-ordering for B frames.
>
> As far as I can see, you do not need the sorting random access to the sorted
> timestamps but only to be able to extract the lowest one. This is precisely
> what a heap is very good at. I assure you, in this case a heap will be
> almost as simple to implement (certainly simpler if you make it fixed-size)
> and way more efficient.
Yeah, you don't need to sort. Circular fifo works fine.
> > Only adding the CUDA error code, which then has to be looked up manualy.
>
> Too bad ;(
>
> > What do you mean by that? Printing which presets are available in
> > the error message?
>
> Yes. Something like that is always nice:
>
> # x264 [error]: invalid preset 'list'
> # [libx264 @ 0x172a560] Error setting preset/tune list/(null).
> # [libx264 @ 0x172a560] Possible presets: ultrafast superfast veryfast faster
> # fast medium slow slower veryslow placebo
> # [libx264 @ 0x172a560] Possible tunes: film animation grain stillimage psnr
> # ssim fastdecode zerolatency
>
> > Not entirely sure why i did that this way. Copied it straight from
> > the libx264 encoder, without thinking too much about it. I can just
> > set the profileGUID straight from that switch and can remove the
> > second profile variable(which the libx264 encoder has in exactly the
> > same conflicting way) entirely.
>
> It is entirely possible that libx264 has reasons that I do not know to do
> things that way. It is also possible that the options are there just for
> historic reasons.
>
> > >>+ default:
> > >>+ avctx->coded_frame->pict_type = AV_PICTURE_TYPE_NONE;
> > Not that I'm aware of. But i don't know what else to assume in this case.
>
> Then I believe some kind of feedback would be needed. If it is really never
> supposed to happen, an assert is acceptable; otherwise, an error requesting
> a sample may be better.
>
> > The maximum supported number of surfaces is allocated, if it'd ever
> > run out, there'd be a bug in the code managing the surfaces.
>
> Then an assert is exactly what is needed.
>
> > ff_cuCtxCreate is a library function loaded from the CUDA dll/so.
> > Same for all the other ff_cu* functions, there is no way to change
> > what it returns, as it's not my function.
>
> I missed that, sorry.
>
> > >>+#define ifav_log(...) if (avctx) { av_log(__VA_ARGS__); }
> > >Looks strange: why no error message when there is no context?
> > Is it possible to call av_log without a context?
>
> Yes, it just prints the message without the [libfoo @ 0x42] prefix.
>
> > No, does it need to be? Can multiple threads create the coded at the
> > same time?
>
> It looks like it is called from encode_init, which can definitely be called
> by multiple threads.
>
> Thanks for your efforts. Hope this helps.
>
> Regards,
>
> --
> Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
More information about the ffmpeg-devel
mailing list