[FFmpeg-devel] [PATCH] Add NVENC encoder

Sat Nov 29 12:45:05 CET 2014

Le septidi 7 frimaire, an CCXXIII, Timo Rothenpieler a écrit :
> Done that primarily to keep things cleaned up and easier to read.
> Can as well put it all in one huge file.

IMHO, your choice in the end.

> Propably, will split that out when i get to it.

Thanks.

> Most of this code is ported from a C++ Project where everything had
> to be camel case, so those and the C++ malloc casts are some of the
> leftovers.

Idem, IMHO your choice.

> The maximum is the number of adjacent B Frames plus one. So 3 in the
> case of NVENC, unless they change the supported gop patterns.

In that case, I suspect to take some margin and use a hardcoded size for the
queue: with 8 or 16, there is plenty of time to update the size if nvidia
releases new GOP patterns.

> I basically need a fifo like structure, where i can queue output
> surfaces until NVENC is done filling them.
> An array doesn't realy reflect that usage, as new elements are added
> to the front, and taken from the back.

An array used as a circular buffer is perfect for that.

> Is a simple assert on the return value enough? Can't continue in a
> sane way anyway if it ever fails.

No: an assert can only be used when the error is not possible except for a
bug in the code. A malloc failure is always possible, so a proper error
handling, eventually returning ENOMEM, is necessary. That is annoying but
mostly unavoidable.

> I renamed it to enqueue/dequeue.

Thanks.

> This definitely can't be replaced, its purpose isn't just a plain
> list, but sorting of the input timestamps, so the dts is still
> monotonic after re-ordering for B frames.

As far as I can see, you do not need the sorting random access to the sorted
timestamps but only to be able to extract the lowest one. This is precisely
what a heap is very good at. I assure you, in this case a heap will be
almost as simple to implement (certainly simpler if you make it fixed-size)
and way more efficient.

> Only adding the CUDA error code, which then has to be looked up manualy.

Too bad ;(

> What do you mean by that? Printing which presets are available in
> the error message?

Yes. Something like that is always nice:

# x264 [error]: invalid preset 'list'
# [libx264 @ 0x172a560] Error setting preset/tune list/(null).
# [libx264 @ 0x172a560] Possible presets: ultrafast superfast veryfast faster
# fast medium slow slower veryslow placebo
# [libx264 @ 0x172a560] Possible tunes: film animation grain stillimage psnr
# ssim fastdecode zerolatency

> Not entirely sure why i did that this way. Copied it straight from
> the libx264 encoder, without thinking too much about it. I can just
> set the profileGUID straight from that switch and can remove the
> second profile variable(which the libx264 encoder has in exactly the
> same conflicting way) entirely.

It is entirely possible that libx264 has reasons that I do not know to do
things that way. It is also possible that the options are there just for
historic reasons.

> >>+        default:
> >>+        avctx->coded_frame->pict_type = AV_PICTURE_TYPE_NONE;
> Not that I'm aware of. But i don't know what else to assume in this case.

Then I believe some kind of feedback would be needed. If it is really never
supposed to happen, an assert is acceptable; otherwise, an error requesting
a sample may be better.

> The maximum supported number of surfaces is allocated, if it'd ever
> run out, there'd be a bug in the code managing the surfaces.

Then an assert is exactly what is needed.

> ff_cuCtxCreate is a library function loaded from the CUDA dll/so.
> Same for all the other ff_cu* functions, there is no way to change
> what it returns, as it's not my function.

I missed that, sorry.

> >>+#define ifav_log(...) if (avctx) { av_log(__VA_ARGS__); }
> >Looks strange: why no error message when there is no context?
> Is it possible to call av_log without a context?

Yes, it just prints the message without the [libfoo @ 0x42] prefix.

> No, does it need to be? Can multiple threads create the coded at the
> same time?

It looks like it is called from encode_init, which can definitely be called
by multiple threads.

Thanks for your efforts. Hope this helps.

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20141129/fa6f2530/attachment.asc>