[FFmpeg-devel] [PATCH] libavcodec: add bit-rate support to RoQ video encoder

Victor Luchitz vluchits at gmail.com
Thu Jan 25 00:09:59 EET 2024


In our case, the machine we're targeting (the Sega 32X) has only 256KB
of RAM. Even much more modern consoles such as the Xbox or even the PS3
didn't have enough RAM to hold an entire CD-ROM.

We also have to be concerned about how fast we can move data to the
main CPU. Say you make the first frame BIG to make it look its best, then
compensate with much smaller delta frames... is the first frame too much
data to move to the CPU quickly enough? Well, maybe not an issue for the
first frame, but definitely an issue for keyframes in the middle of the
stream.
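
To make that concern concrete, here is a rough sketch of a read-buffer
model. It is only an illustration under stated assumptions: the drive
delivers a fixed number of bytes per frame interval, a frame can only be
shown once it is fully buffered, and the buffer has a fixed capacity.
The function name first_underrun and all of the numbers are hypothetical
and are not taken from the patch or from our player.

```c
#include <stddef.h>

/* Hypothetical model, not code from the patch.
 * bytes_per_tick: bytes the drive delivers during one frame interval.
 * capacity:       size of the read buffer in bytes.
 * Returns the index of the first frame that is not fully buffered when
 * it is due, or -1 if every frame arrives in time. */
static int first_underrun(const long *frame_bytes, size_t n,
                          long bytes_per_tick, long capacity)
{
    long buffered = 0;

    for (size_t i = 0; i < n; i++) {
        buffered += bytes_per_tick;      /* data read during this interval */
        if (buffered > capacity)
            buffered = capacity;         /* reading stalls when the buffer is full */
        if (frame_bytes[i] > buffered)
            return (int)i;               /* frame is late: buffer underrun */
        buffered -= frame_bytes[i];      /* frame handed to the decoder */
    }
    return -1;
}
```

For example, a 1X drive reads 75 sectors of 2048 bytes per second, i.e.
153600 bytes/s. At an assumed 15 fps that is 10240 bytes per frame
interval. With a 64KB buffer, the sizes {6000, 6000, 6000, 40000, 6000}
underrun at index 3: the savings from three small frames are not enough
to pay for one large keyframe in the middle of the stream.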



On Thu, Jan 25, 2024 at 12:30 AM Tomas Härdin <git at haerdin.se> wrote:

> On Wed, 2024-01-24 at 11:50 +0300, Victor Luchitz wrote:
> > On Tue, Jan 23, 2024 at 8:44 PM Tomas Härdin <git at haerdin.se> wrote:
> > >
> > > Anyway, using -b:v 100k causes the encoder to effectively become
> > > stuck on the first frame, being unable to go below 621 kbps and
> > > increasing qscale very slowly. But you mentioned this already of
> > > course. Perhaps there should be a faster "startup" phase?
> > > Subsequent frames being P-frames may lead to the average hitting
> > > the target bitrate.
> > >
> >
> > I'm not sure we'd go as far as special-casing the encoder for
> > extremely low bit-rates at high image resolutions. I mean, if you
> > want low picture quality, the proper way for RoQ would be to scale
> > down your video to a lower resolution instead of turning the video
> > into a huge ugly mess of macroblocks.
>
> I'm not really suggesting such low bitrates are a good idea, but they
> shouldn't cause the encoder to choke. What happens if there's suddenly
> a frame that's very complex, say white noise?
>
> > The main issues with allowing a larger keyframe and then trying to
> > make it up on subsequent frames are: it is doable for the FIRST
> > frame, but may cause a buffer underrun on later keyframes; and we
> > are targeting CD-ROM media with this patch, be it 1X, 2X, whatever.
> > We cannot have too much upward variance in the frame sizes. Smaller
> > is allowable, but not bigger.
>
> How bursty is CD-ROM anyway? I suspect something similar to VBV would
> help here. Surely there are buffers involved?
>
> > It's OK if encoding takes some time :)
>
> Yeah but it shouldn't hang
>
> >
> >
> > >
> > > I'm also curious about this hunk:
> > >
> > > > -        if (enc->lambda > 100000) {
> > > > +        if (enc->lambda > 100000000) {
> > > > >              av_log(roq->logctx, AV_LOG_ERROR, "Cannot encode video in Quake compatible form\n");
> > > >              return AVERROR(EINVAL);
> > > >          }
> > >
> > > Where in the Quake (3?) source code is this limitation? Seems
> > > rather that there should be a retry limit. There's probably no
> > > harm in keeping going so long as the packet doesn't end up above
> > > the limit.
> > >
> >
> > In our tests we found that lambda can go way higher than the
> > original 100000. Stopping at 100000 is artificially restricting
> > yourself when you can go much further. 100000000 is high enough to
> > allow 1X bitrates even on extreme frames.
>
> I figured as much. But this could also be arrived at with some formula
> I suspect. A higher limit isn't wrong per se.
>
> > > A quick logarithmic regression on bitrate vs qscale suggests it
> > > scales with somewhere between qscale^-0.2833 and qscale^-0.2842.
> > > Conversely, to hit a specific bitrate, try scaling qscale with
> > > the bitrate ratio raised to around 3.5. A more conservative
> > > exponent like 2 is probably also fine. See patch attached.
> > >
> >
> > Thanks so much for the patch! I've tested it on a pretty long
> > hi-res video at 835kbps and everything seems to work fine. The
> > runtime was 2m39s for both the original and your version.
>
> Yep. But, the fact that I could get it to hang at certain points
> doesn't fully instill confidence.
>
> A better approach might be to buffer an entire GOP, then find a lambda
> that brings the entire GOP within the desired bitrate. If the encoder
> happens to insert an extra I-frame, emit the I and P frames up to that
> point and buffer up enough frames to make another GOP. Say with a GOP
> size of 6, it would normally look like:
>
>   [IPPPPP][IPPPPP]...
>
> But sometimes like this, if nothing special is done:
>
>   [IPPPIP][IPPPPP]...
>
> This is probably wasteful however, so better to do
>
>   [IPPP][IPPPPP][IP...
>
> That second I-frame would probably have been quite cheap to fit within
> the allotted bits. It might for example be a cut to white and fade in.
>
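
The GOP splitting described above can be sketched roughly as follows.
This is an illustration under assumptions, not encoder code: forced_i is
a hypothetical per-frame flag meaning "the encoder decided this frame
must be an I-frame", and split_gops is a made-up name. It only computes
where the GOP boundaries fall; finding one lambda per GOP would then
operate on each group.

```c
#include <stddef.h>

/* Hypothetical sketch: split n frames into GOPs.
 * A new GOP starts when the encoder forces an I-frame (forced_i[i] != 0)
 * or when the current GOP already holds gop_size frames.
 * GOP lengths are written to gop_len (room for max_gops entries).
 * Returns the number of GOPs written. */
static size_t split_gops(const int *forced_i, size_t n, size_t gop_size,
                         size_t *gop_len, size_t max_gops)
{
    size_t count = 0;   /* GOPs emitted so far */
    size_t cur = 0;     /* frames in the GOP being built */

    if (gop_size == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (cur > 0 && (forced_i[i] || cur == gop_size)) {
            if (count == max_gops)
                return count;            /* output array is full */
            gop_len[count++] = cur;      /* close the GOP before this frame */
            cur = 0;
        }
        cur++;                           /* frame i joins the current GOP */
    }
    if (cur > 0 && count < max_gops)
        gop_len[count++] = cur;          /* trailing, possibly short, GOP */
    return count;
}
```

With a GOP size of 6 and an extra I-frame forced at frame 4 of a
12-frame sequence, this yields lengths 4, 6 and 2, which corresponds to
[IPPP][IPPPPP][IP... in the notation above.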
> If I understand this patch correctly, it tries to make every packet
> roughly the same size, which seems excessive. Does the machine you're
> targeting not have enough RAM to buffer an entire CD-ROM? Or even a
> couple of seconds?
>
> /Tomas


-- 
Best regards,
 Victor Luchitz

