[FFmpeg-devel] [PATCH v4 4/8] swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats

Ramiro Polla ramiro.polla at gmail.com
Tue Dec 3 12:27:23 EET 2024


On Tue, Dec 3, 2024 at 3:35 AM Michael Niedermayer
<michael at niedermayer.cc> wrote:
> On Sun, Dec 01, 2024 at 07:20:06PM +0100, Ramiro Polla wrote:
> > There is an issue with the constants used in YUV to YUV range conversion,
> > where the upper bound is not respected when converting to mpeg range.
> >
> > With this commit, the constants are calculated at runtime, depending on
> > the bit depth. This approach also allows us to more easily understand how
> > the constants are derived.
> >
> > For bit depths <= 14, the number of fixed point bits has been set to 14
> > for all conversions, to simplify the code.
> > For bit depths > 14, the number of fixed point bits has been raised and
> > set to 18, to allow for the conversion to be accurate enough for the mpeg
> > range to be respected.
> >
> > The convert functions now take the conversion constants (coeff and offset)
> > as function arguments.
> > For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit.
> > For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit.
> >
> > x86_64:
> > chrRangeFromJpeg8_1920_c:    2127.4   2125.0  (1.00x)
> > chrRangeFromJpeg16_1920_c:   2325.2   2127.2  (1.09x)
> > chrRangeToJpeg8_1920_c:      3166.9   3168.7  (1.00x)
> > chrRangeToJpeg16_1920_c:     2152.4   3164.8  (0.68x)
> > lumRangeFromJpeg8_1920_c:    1263.0   1302.5  (0.97x)
> > lumRangeFromJpeg16_1920_c:   1080.5   1299.2  (0.83x)
> > lumRangeToJpeg8_1920_c:      1886.8   2112.2  (0.89x)
> > lumRangeToJpeg16_1920_c:     1077.0   1906.5  (0.56x)
> >
> > aarch64 A55:
> > chrRangeFromJpeg8_1920_c:   28835.2  28835.6  (1.00x)
> > chrRangeFromJpeg16_1920_c:  28839.8  32680.8  (0.88x)
> > chrRangeToJpeg8_1920_c:     23074.7  23075.4  (1.00x)
> > chrRangeToJpeg16_1920_c:    17318.9  24996.0  (0.69x)
> > lumRangeFromJpeg8_1920_c:   15389.7  15384.5  (1.00x)
> > lumRangeFromJpeg16_1920_c:  15388.2  17306.7  (0.89x)
> > lumRangeToJpeg8_1920_c:     19227.8  19226.6  (1.00x)
> > lumRangeToJpeg16_1920_c:    15387.0  21146.3  (0.73x)
> >
> > aarch64 A76:
> > chrRangeFromJpeg8_1920_c:    6324.4   6268.1  (1.01x)
> > chrRangeFromJpeg16_1920_c:   6339.9  11521.5  (0.55x)
> > chrRangeToJpeg8_1920_c:      9656.0   9612.8  (1.00x)
> > chrRangeToJpeg16_1920_c:     6340.4  11651.8  (0.54x)
> > lumRangeFromJpeg8_1920_c:    4422.0   4420.8  (1.00x)
> > lumRangeFromJpeg16_1920_c:   4420.9   5762.0  (0.77x)
> > lumRangeToJpeg8_1920_c:      5949.1   5977.5  (1.00x)
> > lumRangeToJpeg16_1920_c:     4446.8   5946.2  (0.75x)
> >
> > NOTE: all SIMD optimizations for range_convert have been disabled.
> >       They will be re-enabled when they are fixed for each architecture.
> >
> > NOTE2: the same issue still exists in rgb2yuv conversions, which is not
> >        addressed in this commit.
> > ---
> >  libswscale/aarch64/swscale.c                  |   5 +
> >  libswscale/hscale.c                           |   6 +-
> >  libswscale/swscale.c                          | 113 +++++++++--
> >  libswscale/swscale_internal.h                 |  26 ++-
> >  libswscale/x86/swscale.c                      |   5 +
> >  tests/checkasm/sw_range_convert.c             |  68 ++++++-
> >  .../fate/filter-alphaextract_alphamerge_rgb   | 100 +++++-----
> >  tests/ref/fate/filter-pixdesc-gray10be        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray10le        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray12be        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray12le        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray14be        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray14le        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray16be        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray16le        |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray9be         |   2 +-
> >  tests/ref/fate/filter-pixdesc-gray9le         |   2 +-
> >  tests/ref/fate/filter-pixdesc-ya16be          |   2 +-
> >  tests/ref/fate/filter-pixdesc-ya16le          |   2 +-
> >  tests/ref/fate/filter-pixdesc-yuvj411p        |   2 +-
> >  tests/ref/fate/filter-pixdesc-yuvj420p        |   2 +-
> >  tests/ref/fate/filter-pixdesc-yuvj422p        |   2 +-
> >  tests/ref/fate/filter-pixdesc-yuvj440p        |   2 +-
> >  tests/ref/fate/filter-pixdesc-yuvj444p        |   2 +-
> >  tests/ref/fate/filter-pixfmts-copy            |  34 ++--
> >  tests/ref/fate/filter-pixfmts-crop            |  34 ++--
> >  tests/ref/fate/filter-pixfmts-field           |  34 ++--
> >  tests/ref/fate/filter-pixfmts-fieldorder      |  30 +--
> >  tests/ref/fate/filter-pixfmts-hflip           |  34 ++--
> >  tests/ref/fate/filter-pixfmts-il              |  34 ++--
> >  tests/ref/fate/filter-pixfmts-lut             |  18 +-
> >  tests/ref/fate/filter-pixfmts-null            |  34 ++--
> >  tests/ref/fate/filter-pixfmts-pad             |  22 +--
> >  tests/ref/fate/filter-pixfmts-pullup          |  10 +-
> >  tests/ref/fate/filter-pixfmts-rotate          |   4 +-
> >  tests/ref/fate/filter-pixfmts-scale           |  34 ++--
> >  tests/ref/fate/filter-pixfmts-swapuv          |  10 +-
> >  .../ref/fate/filter-pixfmts-tinterlace_cvlpf  |   8 +-
> >  .../ref/fate/filter-pixfmts-tinterlace_merge  |   8 +-
> >  tests/ref/fate/filter-pixfmts-tinterlace_pad  |   8 +-
> >  tests/ref/fate/filter-pixfmts-tinterlace_vlpf |   8 +-
> >  tests/ref/fate/filter-pixfmts-transpose       |  28 +--
> >  tests/ref/fate/filter-pixfmts-vflip           |  34 ++--
> >  tests/ref/fate/fitsenc-gray                   |   2 +-
> >  tests/ref/fate/fitsenc-gray16be               |  10 +-
> >  tests/ref/fate/gifenc-gray                    | 186 +++++++++---------
> >  tests/ref/fate/idroq-video-encode             |   2 +-
> >  tests/ref/fate/jpg-icc                        |   8 +-
> >  tests/ref/fate/sws-yuv-colorspace             |   2 +-
> >  tests/ref/fate/sws-yuv-range                  |   2 +-
> >  tests/ref/fate/vvc-conformance-SCALING_A_1    | 128 ++++++------
> >  tests/ref/lavf/gray16be.fits                  |   4 +-
> >  tests/ref/lavf/gray16be.pam                   |   4 +-
> >  tests/ref/lavf/gray16be.png                   |   6 +-
> >  tests/ref/lavf/jpg                            |   6 +-
> >  tests/ref/lavf/smjpeg                         |   6 +-
> >  tests/ref/pixfmt/gbrp-gray                    |   2 +-
> >  tests/ref/pixfmt/gbrp-gray10be                |   2 +-
> >  tests/ref/pixfmt/gbrp-gray10le                |   2 +-
> >  tests/ref/pixfmt/gbrp-gray12be                |   2 +-
> >  tests/ref/pixfmt/gbrp-gray12le                |   2 +-
> >  tests/ref/pixfmt/gbrp-gray16be                |   2 +-
> >  tests/ref/pixfmt/gbrp-gray16le                |   2 +-
> >  tests/ref/pixfmt/gbrp-yuvj420p                |   2 +-
> >  tests/ref/pixfmt/gbrp-yuvj422p                |   2 +-
> >  tests/ref/pixfmt/gbrp-yuvj440p                |   2 +-
> >  tests/ref/pixfmt/gbrp-yuvj444p                |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray                  |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray10be              |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray10le              |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray12be              |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray12le              |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray16be              |   2 +-
> >  tests/ref/pixfmt/gbrp10-gray16le              |   2 +-
> >  tests/ref/pixfmt/gbrp10-yuvj420p              |   2 +-
> >  tests/ref/pixfmt/gbrp10-yuvj422p              |   2 +-
> >  tests/ref/pixfmt/gbrp10-yuvj440p              |   2 +-
> >  tests/ref/pixfmt/gbrp10-yuvj444p              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray                  |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray10be              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray10le              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray12be              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray12le              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray16be              |   2 +-
> >  tests/ref/pixfmt/gbrp12-gray16le              |   2 +-
> >  tests/ref/pixfmt/gbrp12-yuvj420p              |   2 +-
> >  tests/ref/pixfmt/gbrp12-yuvj422p              |   2 +-
> >  tests/ref/pixfmt/gbrp12-yuvj440p              |   2 +-
> >  tests/ref/pixfmt/gbrp12-yuvj444p              |   2 +-
> >  tests/ref/pixfmt/gbrp16-gray16be              |   2 +-
> >  tests/ref/pixfmt/gbrp16-gray16le              |   2 +-
> >  tests/ref/pixfmt/rgb24-gray                   |   2 +-
> >  tests/ref/pixfmt/rgb24-gray10be               |   2 +-
> >  tests/ref/pixfmt/rgb24-gray10le               |   2 +-
> >  tests/ref/pixfmt/rgb24-gray12be               |   2 +-
> >  tests/ref/pixfmt/rgb24-gray12le               |   2 +-
> >  tests/ref/pixfmt/rgb24-gray16be               |   2 +-
> >  tests/ref/pixfmt/rgb24-gray16le               |   2 +-
> >  tests/ref/pixfmt/rgb24-yuvj420p               |   2 +-
> >  tests/ref/pixfmt/rgb24-yuvj422p               |   2 +-
> >  tests/ref/pixfmt/rgb24-yuvj440p               |   2 +-
> >  tests/ref/pixfmt/rgb24-yuvj444p               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray                   |   2 +-
> >  tests/ref/pixfmt/rgb48-gray10be               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray10le               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray12be               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray12le               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray16be               |   2 +-
> >  tests/ref/pixfmt/rgb48-gray16le               |   2 +-
> >  tests/ref/pixfmt/rgb48-yuvj420p               |   2 +-
> >  tests/ref/pixfmt/rgb48-yuvj422p               |   2 +-
> >  tests/ref/pixfmt/rgb48-yuvj440p               |   2 +-
> >  tests/ref/pixfmt/rgb48-yuvj444p               |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray10be             |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray10le             |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray12be             |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray12le             |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray16be             |   2 +-
> >  tests/ref/pixfmt/yuv444p-gray16le             |   2 +-
> >  tests/ref/pixfmt/yuv444p-yuvj420p             |   2 +-
> >  tests/ref/pixfmt/yuv444p-yuvj422p             |   2 +-
> >  tests/ref/pixfmt/yuv444p-yuvj440p             |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray               |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray10be           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray10le           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray12be           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray12le           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray16be           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-gray16le           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-yuvj420p           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-yuvj422p           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-yuvj440p           |   2 +-
> >  tests/ref/pixfmt/yuv444p10-yuvj444p           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray               |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray10be           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray10le           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray12be           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray12le           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray16be           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-gray16le           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-yuvj420p           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-yuvj422p           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-yuvj440p           |   2 +-
> >  tests/ref/pixfmt/yuv444p12-yuvj444p           |   2 +-
> >  tests/ref/pixfmt/yuv444p16-gray16be           |   2 +-
> >  tests/ref/pixfmt/yuv444p16-gray16le           |   2 +-
> >  tests/ref/pixfmt/yuvj420p                     |   2 +-
> >  tests/ref/pixfmt/yuvj422p                     |   2 +-
> >  tests/ref/pixfmt/yuvj440p                     |   2 +-
> >  tests/ref/pixfmt/yuvj444p                     |   2 +-
> >  tests/ref/seek/lavf-jpg                       |   8 +-
> >  tests/ref/seek/vsynth_lena-mjpeg              |  40 ++--
> >  tests/ref/seek/vsynth_lena-roqvideo           |   2 +-
> >  tests/ref/vsynth/vsynth1-amv                  |   8 +-
> >  tests/ref/vsynth/vsynth1-mjpeg                |   6 +-
> >  tests/ref/vsynth/vsynth1-mjpeg-422            |   6 +-
> >  tests/ref/vsynth/vsynth1-mjpeg-444            |   6 +-
> >  tests/ref/vsynth/vsynth1-mjpeg-huffman        |   6 +-
> >  tests/ref/vsynth/vsynth1-mjpeg-trell          |   8 +-
> >  tests/ref/vsynth/vsynth1-mjpeg-trell-huffman  |   8 +-
> >  tests/ref/vsynth/vsynth1-roqvideo             |   8 +-
> >  tests/ref/vsynth/vsynth2-amv                  |   6 +-
> >  tests/ref/vsynth/vsynth2-mjpeg                |   6 +-
> >  tests/ref/vsynth/vsynth2-mjpeg-422            |   6 +-
> >  tests/ref/vsynth/vsynth2-mjpeg-444            |   6 +-
> >  tests/ref/vsynth/vsynth2-mjpeg-huffman        |   6 +-
> >  tests/ref/vsynth/vsynth2-mjpeg-trell          |   8 +-
> >  tests/ref/vsynth/vsynth2-mjpeg-trell-huffman  |   8 +-
> >  tests/ref/vsynth/vsynth2-roqvideo             |   8 +-
> >  tests/ref/vsynth/vsynth3-amv                  |   8 +-
> >  tests/ref/vsynth/vsynth3-mjpeg                |   8 +-
> >  tests/ref/vsynth/vsynth3-mjpeg-422            |   8 +-
> >  tests/ref/vsynth/vsynth3-mjpeg-444            |   6 +-
> >  tests/ref/vsynth/vsynth3-mjpeg-huffman        |   8 +-
> >  tests/ref/vsynth/vsynth3-mjpeg-trell          |   6 +-
> >  tests/ref/vsynth/vsynth3-mjpeg-trell-huffman  |   6 +-
> >  tests/ref/vsynth/vsynth_lena-amv              |   6 +-
> >  tests/ref/vsynth/vsynth_lena-mjpeg            |   8 +-
> >  tests/ref/vsynth/vsynth_lena-mjpeg-422        |   6 +-
> >  tests/ref/vsynth/vsynth_lena-mjpeg-444        |   6 +-
> >  tests/ref/vsynth/vsynth_lena-mjpeg-huffman    |   8 +-
> >  tests/ref/vsynth/vsynth_lena-mjpeg-trell      |   8 +-
> >  .../vsynth/vsynth_lena-mjpeg-trell-huffman    |   8 +-
> >  tests/ref/vsynth/vsynth_lena-roqvideo         |   8 +-
> >  184 files changed, 880 insertions(+), 725 deletions(-)
>
> should be ok if tested and output values are ok

Thanks. I'll commit the patchset in a couple of days if there are no
more comments. Martin had already ok'd the aarch64 patches. I'd
appreciate it if someone could have a look at the x86 patches.
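
For anyone reading along, here is a minimal sketch of the idea in the
commit message: deriving the fixed-point coeff and offset from the bit
depth so that the upper bound of the mpeg range is hit exactly. It is
not the code from the patch. The names RangeCoeffs, luma_to_mpeg_coeffs
and luma_to_mpeg are hypothetical, it handles only luma in the
full-to-mpeg direction, and it works on plain sample values, ignoring
any intermediate precision swscale uses internally. The 14/18
fixed-point bits follow the commit message.

```c
#include <stdint.h>

/* Hypothetical container for the conversion constants (not swscale API). */
typedef struct RangeCoeffs {
    uint32_t coeff;   /* fixed-point scale factor                    */
    int64_t  offset;  /* fixed-point offset, includes rounding term  */
    int      shift;   /* number of fixed-point (fractional) bits     */
} RangeCoeffs;

/* Derive the constants for full-range -> mpeg-range luma at bit_depth. */
static RangeCoeffs luma_to_mpeg_coeffs(int bit_depth)
{
    /* 14 fractional bits up to 14-bit samples, 18 above (per the commit). */
    const int     shift   = bit_depth <= 14 ? 14 : 18;
    const int64_t src_max = (INT64_C(1) << bit_depth) - 1;      /* e.g. 255 */
    const int64_t dst_min = INT64_C(16)  << (bit_depth - 8);    /* e.g. 16  */
    const int64_t dst_max = INT64_C(235) << (bit_depth - 8);    /* e.g. 235 */
    const int64_t num     = (dst_max - dst_min) << shift;
    RangeCoeffs c;

    /* coeff = (dst range / src range) in fixed point, rounded to nearest. */
    c.coeff  = (uint32_t)((num + src_max / 2) / src_max);
    /* offset = dst_min in fixed point, plus 0.5 so the final shift rounds. */
    c.offset = (dst_min << shift) + (INT64_C(1) << (shift - 1));
    c.shift  = shift;
    return c;
}

/* Apply the conversion to one sample; 64-bit math avoids overflow at 16 bits. */
static int luma_to_mpeg(int v, RangeCoeffs c)
{
    return (int)((v * (int64_t)c.coeff + c.offset) >> c.shift);
}
```

With these constants, luma_to_mpeg(255, luma_to_mpeg_coeffs(8)) gives 235
and luma_to_mpeg(65535, luma_to_mpeg_coeffs(16)) gives 60160 (235 << 8),
so the upper bound is respected in this simplified model. At 16 bits the
product of sample and coeff no longer fits in 32 bits, which is
consistent with the wider types mentioned for bit depths > 14.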

Ramiro

