[FFmpeg-devel] [PATCH] Support > 8 bit input in yuv2rgb.
Reimar Döffinger
Reimar.Doeffinger at gmx.de
Sat Nov 9 13:05:53 CET 2013
On 09.11.2013, at 12:37, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Fri, Nov 08, 2013 at 10:47:20PM +0100, Reimar Döffinger wrote:
>> Significantly faster than the default path (which defaults to
>> bicubic scaling even if no real scaling happens), though
>> the templating is kind of ugly and increases code size a bit.
>>
>> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
>> ---
>> libswscale/swscale_unscaled.c | 3 +
>> libswscale/yuv2rgb.c | 550 ++++++------------------------------------
>> libswscale/yuv2rgb_template.c | 458 +++++++++++++++++++++++++++++++++++
>> 3 files changed, 537 insertions(+), 474 deletions(-)
>> create mode 100644 libswscale/yuv2rgb_template.c
>>
>> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
>> index 83086f7..8842f35 100644
>> --- a/libswscale/swscale_unscaled.c
>> +++ b/libswscale/swscale_unscaled.c
>> @@ -1217,6 +1217,9 @@ void ff_get_unscaled_swscale(SwsContext *c)
>> }
>> /* yuv2bgr */
>> if ((srcFormat == AV_PIX_FMT_YUV420P || srcFormat == AV_PIX_FMT_YUV422P ||
>> + srcFormat == AV_PIX_FMT_YUV420P9 || srcFormat == AV_PIX_FMT_YUV422P9 ||
>> + srcFormat == AV_PIX_FMT_YUV420P10 || srcFormat == AV_PIX_FMT_YUV422P10 ||
>> + srcFormat == AV_PIX_FMT_YUV420P16 || srcFormat == AV_PIX_FMT_YUV422P16 ||
>> srcFormat == AV_PIX_FMT_YUVA420P) && isAnyRGB(dstFormat) &&
>> !(flags & SWS_ACCURATE_RND) && (c->dither == SWS_DITHER_BAYER || c->dither == SWS_DITHER_AUTO) && !(dstH & 1)) {
>> c->swscale = ff_yuv2rgb_get_func_ptr(c);
>> diff --git a/libswscale/yuv2rgb.c b/libswscale/yuv2rgb.c
>> index 77c56a9..28de37e 100644
>> --- a/libswscale/yuv2rgb.c
>> +++ b/libswscale/yuv2rgb.c
>> @@ -54,72 +54,72 @@ const int *sws_getCoefficients(int colorspace)
>> }
>>
>> #define LOADCHROMA(i) \
>> - U = pu[i]; \
>> - V = pv[i]; \
>> + U = pu[i] >> shift; \
>> + V = pv[i] >> shift; \
>> r = (void *)c->table_rV[V+YUVRGB_TABLE_HEADROOM]; \
>> g = (void *)(c->table_gU[U+YUVRGB_TABLE_HEADROOM] + c->table_gV[V+YUVRGB_TABLE_HEADROOM]); \
>> b = (void *)c->table_bU[U+YUVRGB_TABLE_HEADROOM];
>
> are the shifts faster than bigger tables ?
> (it would be slightly more accurate with bigger tables)
I haven't tested. But note that I also added 16-bit support, so we are talking about 256 times larger tables.
If keeping the shift for Y, that would still be around 48 MB, if I calculated right.
There could be a "compromise": make the tables for Y, U and V 9 bit and only shift for > 9 bit input, to get better precision. That would only increase their size 4x, I believe.
I guess even making the tables 10 bit might still be reasonable...
However, in both cases I think that will mean further changes, since the HEADROOM stuff would also need to be increased.
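
To make the trade-off above concrete, here is a minimal sketch of the index arithmetic only. The helper names (index_shift, index_growth, table_index) are hypothetical and not part of libswscale; the sketch assumes a table indexed by table_bits bits and a source sample of src_depth bits. It counts index values for a single lookup dimension and does not try to reproduce the total memory estimates given above, which depend on how the real tables are laid out.

```c
#include <assert.h>

/* Hypothetical helper (not a libswscale function): right shift needed to
 * map a src_depth-bit sample onto a table indexed with table_bits bits.
 * No shift is needed when the table already covers the full input range. */
static int index_shift(int src_depth, int table_bits)
{
    assert(src_depth > 0 && table_bits > 0);
    return src_depth > table_bits ? src_depth - table_bits : 0;
}

/* Hypothetical helper: how many times more index values a table_bits-bit
 * table has compared with the existing 8-bit one (one dimension only). */
static unsigned long index_growth(int table_bits)
{
    assert(table_bits >= 8 && table_bits < 32);
    return 1UL << (table_bits - 8);
}

/* Hypothetical helper: table index for one sample, before any headroom
 * offset such as YUVRGB_TABLE_HEADROOM is added by the caller. */
static unsigned table_index(unsigned sample, int src_depth, int table_bits)
{
    return sample >> index_shift(src_depth, table_bits);
}
```

With these definitions, index_growth(16) is 256, which matches the "256 times larger" figure for unshifted 16-bit input, while table_index(1023, 10, 8) drops the two low bits and yields 255. A 9- or 10-bit table would keep one or two of those bits for 10-bit input, at the cost of a larger table and a correspondingly larger headroom.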