[FFmpeg-devel] [PATCH] Unfinished GSoC qual task (16bit VQA Video Decoder)
Michael Niedermayer
michaelni
Sun Apr 26 02:38:55 CEST 2009
On Sun, Apr 26, 2009 at 01:03:23AM +0200, Adam Iglewski wrote:
> Hi,
>
> Thanks for reviews. Updated patches attached.
>
> Vitor Sessak pisze:
> > Duplicating 4XM code is not ok (you could archive the same result of
> > your patch just putting "case CODEC_ID_ADPCM_IMA_WS:" inside the 4XM
> > case block). Also, did you test your code with our samples archive? It
> > would surprise me that CODEC_ID_ADPCM_IMA_WS never worked for any file...
> >
>
> I did. Files with one channel plays fine but just try playing any sample
> from Tiberian Sun.
>
> http://samples.mplayerhq.hu/game-formats/vqa/Tiberian%20Sun%20VQAs/
>
> As far as I can tell they are the only ones with stereo sound. I'm
> attaching new patch as suggested.
>
> >> +static inline void vqa_copy_hc_block(unsigned short *pixels,int
> >> pixel_ptr,int stride,unsigned short *codebook,
> >> + int vector_index,int block_h)
> >
> > I prefer using uint16_t instead of "unsigned short", since what we
> > actually want is a 16-bit uint...
>
> Fixed.
>
> >> +{
> >> + int pixel_y;
> >> + for (pixel_y = 0; pixel_y < block_h; pixel_y++) {
> >> + pixels[pixel_ptr] = codebook[vector_index++];
> >> + pixels[pixel_ptr+1] = codebook[vector_index++];
> >
> > If you add one more whitespace it is more readable:
> >
> > pixels[pixel_ptr ] = codebook[vector_index++];
> > pixels[pixel_ptr+1] = codebook[vector_index++];
> >
> > but I think here one memcpy would be even better...
> >
>
> Fixed. Now using memcpy.
>
> >> +
> >> + pixel_ptr = ((block_y * s->vector_height) *
> >> stride)+block_x*block_inc;
> >> + vptr_action_code = AV_RL16(&src[src_ptr]);
> >> + src_ptr+=2;
> >
> > Instead of using src_ptr and constructs like src[src_ptr++], you could
> > just increment the src pointer. More precisely, instead of
> >
> > a= src[src_ptr++];
> >
> > just
> >
> > a = *src++;
> >
> > and instead of
> >
> > a= AV_RL16(&src[src_ptr]);
> > src_ptr+=2;
> >
> > just
> >
> > a = bytestream_get_le16(&src);
> >
>
> Fixed. But should I use bytestream_get_byte or *ptr++ is OK?
*ptr++ is fine
>
> > Also please read your code again to check if a malicious malformed VQA
> > file would not make your code write to unallocated memory. This would be
> > a serious security breach...
>
> I added macro that checks if number of blocks to write will exceed total
> number of blocks in frames.
>
> Michael Niedermayer pisze:
> [...]
>>> + if(!AV_RL16(&vqa_header[14]))
>>> + avctx->pix_fmt = PIX_FMT_RGB555;
>>> + else
>>> + avctx->pix_fmt = PIX_FMT_PAL8;
>>> +
>> inconsistent indention
>
> Fixed.
>
>>> /* the vector dimensions have to meet very stringent requirements */
>>> if ((s->vector_width != 4) ||
>>> ((s->vector_height != 2) && (s->vector_height != 4))) {
>>> @@ -205,11 +211,17 @@ static void decode_format80(const unsign
>>> int src_index = 0;
>>> int dest_index = 0;
>>> + int new_format = 0;
>>> int count;
>>> int src_pos;
>>> unsigned char color;
>>> int i;
>>> + if ((!src[src_index]) || (src_size > 0xffff)) {
>> superflous ()
>
> Fixed.
>
>>> + new_format = 1;
>>> + src_index++;
>>> + }
>>> +
>>> while (src_index < src_size) {
>>> vqa_debug(" opcode %02X: ", src[src_index]);
>>> @@ -230,6 +242,8 @@ static void decode_format80(const unsign
>>> count = AV_RL16(&src[src_index]);
>>> src_index += 2;
>>> src_pos = AV_RL16(&src[src_index]);
>>> + if(new_format)
>>> + src_pos = dest_index-src_pos;
>>> src_index += 2;
>>> vqa_debug("(1) copy %X bytes from absolute pos %X\n", count,
>>> src_pos);
>>> CHECK_COUNT();
>>> @@ -252,6 +266,8 @@ static void decode_format80(const unsign
>>> count = (src[src_index++] & 0x3F) + 3
>>> src_pos = AV_RL16(&src[src_index]);
>>> + if(new_format)
>>> + src_pos = dest_index-src_pos;
>>> src_index += 2;
>>> vqa_debug("(3) copy %X bytes from absolute pos %X\n", count,
>>> src_pos);
>>> CHECK_COUNT();
>> looks like code duplication (this should be fixed in a seperate patch)
>>
>
> Will do.
>
>>> @@ -291,6 +307,124 @@ static void decode_format80(const unsign
>>> dest_index, dest_size);
>>> }
>>> +static inline void vqa_copy_hc_block(unsigned short *pixels,int
>>> pixel_ptr,int stride,unsigned short *codebook,
>>> + int vector_index,int block_h)
>>> +{
>>> + int pixel_y;
>>> + for (pixel_y = 0; pixel_y < block_h; pixel_y++) {
>>> + pixels[pixel_ptr] = codebook[vector_index++];
>>> + pixels[pixel_ptr+1] = codebook[vector_index++];
>>> + pixels[pixel_ptr+2] = codebook[vector_index++];
>>> + pixels[pixel_ptr+3] = codebook[vector_index++];
>>> + pixel_ptr += stride;
>>> + }
>>> +}
>> vector_index is redundant, so is pixel_ptr
>
> Fixed. And as Vitor suggested now using memcpy.
memcpy might be a bad choice, did you benchmark this one?
personally id cast to uint64_t and do a single assignment ...
>
> [...]
>>> + block_x = block_y = 0;
>>> + while(total_blocks > 0) {
>>> +
>>> + pixel_ptr = ((block_y * s->vector_height) *
>>> stride)+block_x*block_inc;
>> superflous ()
>
> Fixed.
>
>>> + vptr_action_code = AV_RL16(&src[src_ptr]);
>>> + src_ptr+=2;
>> bytestream_get_le16()
>
> Fixed.
>
> [...]
>>> +
>>> + /* Write block number (Val & 0xff) and then write Count
>>> blocks
>>> + * getting their indexes by reading next Count bytes from
>>> + * the VPTR chunk. Count is (((Val/256) & 0x1f)+1)*2. */
>>> + case 0x4000:
>>> + vector_index = (vptr_action_code & 0xff) << index_shift;
>> please indent the comments differerntly so the code is easier to
>> see / is more seperated from the comments
>
> Fixed.
>
> [...]
>>> + case 0xa000:
>>> + vector_index = (vptr_action_code & 0x1fff) <<
>>> index_shift;
>>> + blocks_done = src[src_ptr++];
>>> +
>>> + for(i=0;i<blocks_done;i++) {
>>> + vqa_copy_hc_block(pixels,pixel_ptr,stride,codebook,
>>> + vector_index,s->vector_height);
>>> + pixel_ptr += block_inc;
>>> + }
>> looks exploitable
>
> As I wrote above. I added a macro to avoid writing outside frame
> memory.
>
> [..]
>>> @@ -353,6 +489,14 @@ static void vqa_decode_chunk(VqaContext
>>> case VPTZ_TAG:
>>> vptz_chunk = index;
>>> break;
>>> +
>> trailing whitespace
>
> Fixed.
>
>>> + case VPTR_TAG:
>>> + vptr_chunk = index;
>>> + break;
>>> +
>>> + case VPRZ_TAG:
>>> + vprz_chunk = index;
>>> + break;
>>>
>> why are the chunks not decoded immedeatly?
>
> This was the original code flow and I just added chunks for
> decoding 16 bit frames. But I believe that this is because chunk order
> in file does not guarantee correct order for decoding frame eg. VPTR chunk
> which contain indices to codebook can appear before CBFZ chunk which
> contain codebook. But I might be wrong. Maybe Mike Melanson
> will know better ?
mike?
anyway, if true its still possible to have a simple function to traverse the
chunks and search for a specific one, that then could just be called with the
types in the needed order, i suspect this might be slightly simpler
>
> And regarding comments about using bytestream_get_le16 instead of AV_RL16
> and incrementing ptr. Should I replace it in every place in the file or
> just in new code added by me?
a seperate patch simplifying all is welcome
>
> Again thank for the reviews.
>
> Adam
>
> --- ffmpeg/libavcodec/vqavideo.c 2009-04-20 20:37:46.000000000 +0200
> +++ ffmpeg_work/libavcodec/vqavideo.c 2009-04-26 00:01:43.000000000 +0200
> @@ -70,10 +70,19 @@
>
> #include "libavutil/intreadwrite.h"
> #include "avcodec.h"
> +#include "bytestream.h"
>
> #define PALETTE_COUNT 256
> #define VQA_HEADER_SIZE 0x2A
> #define CHUNK_PREAMBLE_SIZE 8
> +#define CHECK_BLOCKS_COUNT(n) \
> + if ((total_blocks - n) < 0) { \
> + av_log(s->avctx, AV_LOG_ERROR, " VQA video warning: writing %d blocks will exceed frame block count %d\n", \
> + n, total_blocks); \
> + av_log(s->avctx, AV_LOG_ERROR, " VQA video warning: action code %X, action count %d\n", \
> + vptr_action_code & 0xe000, n); \
> + return; \
> + }
macros are not a solution to code duplication, the code is still duplicated
in the binary
[...]
> @@ -230,6 +251,8 @@ static void decode_format80(const unsign
> count = AV_RL16(&src[src_index]);
> src_index += 2;
> src_pos = AV_RL16(&src[src_index]);
> + if(new_format)
> + src_pos = dest_index-src_pos;
indent ...
[...]
> + block_x = block_y = 0;
> + while(total_blocks > 0) {
> +
> + pixels = (uint16_t *)s->frame.data[0] + (block_y * s->vector_height * stride)+block_x*block_inc;
> + vptr_action_code = bytestream_get_le16(&src);
> +
> + switch(vptr_action_code & 0xe000) {
> +
> + /* Skip Count blocks. Count is (Val & 0x1fff). */
> + case 0x0000:
> + blocks_done = vptr_action_code & 0x1fff;
> + break;
> +
> + /* Write block number (Val & 0xff) Count times.
> + * Count is (((Val/256) & 0x1f)+1)*2
> + */
> + case 0x2000:
> + vector_index = (vptr_action_code & 0xff) << index_shift;
> + blocks_done = (((vptr_action_code>>8) & 0x1f)+1) << 1;
> + CHECK_BLOCKS_COUNT(blocks_done);
> +
> + for(i=0;i<blocks_done;i++) {
> + vqa_copy_hc_block(pixels,stride,&codebook[vector_index],s->vector_height);
> + pixels += block_inc;
> + }
> + break;
> +
> + /* Write block number (Val & 0xff) and then write Count blocks
> + * getting their indexes by reading next Count bytes from
> + * the VPTR chunk. Count is (((Val/256) & 0x1f)+1)*2
> + */
> + case 0x4000:
> + vector_index = (vptr_action_code & 0xff) << index_shift;
> + blocks_done = (((vptr_action_code>>8) & 0x1f)+1) << 1;
> + CHECK_BLOCKS_COUNT(blocks_done+1);
> +
> + vqa_copy_hc_block(pixels,stride,&codebook[vector_index],s->vector_height);
> + pixels += block_inc;
> +
> + for(i=0;i<blocks_done;i++) {
> + vector_index = *src++;
> + vector_index <<= index_shift;
> + vqa_copy_hc_block(pixels,stride,&codebook[vector_index],s->vector_height);
> + pixels += block_inc;
> + }
> + blocks_done++;
> + break;
> +
> + /* Write block (Val & 0x1fff).*/
> + case 0x6000:
> + vector_index = (vptr_action_code & 0x1fff) << index_shift;
> + CHECK_BLOCKS_COUNT(1);
> + vqa_copy_hc_block(pixels,stride,&codebook[vector_index],s->vector_height);
> + blocks_done=1;
> + break;
> +
> + /* Write block (Val & 0x1fff) Count times.
> + * Count is the next byte from the VPTR chunk
> + */
> + case 0xa000:
> + vector_index = (vptr_action_code & 0x1fff) << index_shift;
> + blocks_done = *src++;
> + CHECK_BLOCKS_COUNT(blocks_done);
> +
> + for(i=0;i<blocks_done;i++) {
> + vqa_copy_hc_block(pixels,stride,&codebook[vector_index],s->vector_height);
> + pixels += block_inc;
> + }
theres alot of code duplication in there, its basically the same thing
done over and over again
> + break;
> + }
> + block_y += (block_x + blocks_done) / blocks_wide;
> + block_x = (block_x + blocks_done) % blocks_wide;
> + total_blocks -= blocks_done;
can count cross the right border and continue to the next row?
if yes, your code is buggy
if no, it can be simplified into a for(all rows) for(all columns)
[...]
> @@ -581,12 +750,22 @@ static int vqa_decode_frame(AVCodecConte
> av_log(s->avctx, AV_LOG_ERROR, " VQA Video: get_buffer() failed\n");
> return -1;
> }
> + } else {
> + s->frame.reference = 1;
> + s->frame.buffer_hints = FF_BUFFER_HINTS_VALID | FF_BUFFER_HINTS_PRESERVE | FF_BUFFER_HINTS_REUSABLE;
> + if (avctx->reget_buffer(avctx, &s->frame)) {
> + av_log(s->avctx, AV_LOG_ERROR, "reget_buffer() failed\n");
> + return -1;
> + }
> + }
indent
[...]
> --- ffmpeg/libavcodec/adpcm.c 2009-04-20 18:29:22.000000000 +0200
> +++ ffmpeg_work/libavcodec/adpcm.c 2009-04-26 00:27:30.000000000 +0200
> @@ -1005,6 +1005,7 @@ static int adpcm_decode_frame(AVCodecCon
> if (cs->step_index < 0) cs->step_index = 0;
> if (cs->step_index > 88) cs->step_index = 88;
>
> + case CODEC_ID_ADPCM_IMA_WS:
> m= (buf_size - (src - buf))>>st;
> for(i=0; i<m; i++) {
> *samples++ = adpcm_ima_expand_nibble(&c->status[0], src[i] & 0x0F, 4);
> @@ -1156,25 +1157,6 @@ static int adpcm_decode_frame(AVCodecCon
> src++;
> }
> break;
> - case CODEC_ID_ADPCM_IMA_WS:
> - /* no per-block initialization; just start decoding the data */
> - while (src < buf + buf_size) {
> -
> - if (st) {
> - *samples++ = adpcm_ima_expand_nibble(&c->status[0],
> - src[0] >> 4 , 3);
> - *samples++ = adpcm_ima_expand_nibble(&c->status[1],
> - src[0] & 0x0F, 3);
> - } else {
> - *samples++ = adpcm_ima_expand_nibble(&c->status[0],
> - src[0] >> 4 , 3);
> - *samples++ = adpcm_ima_expand_nibble(&c->status[0],
> - src[0] & 0x0F, 3);
> - }
> -
> - src++;
> - }
> - break;
> case CODEC_ID_ADPCM_XA:
> while (buf_size >= 128) {
> xa_decode(samples, src, &c->status[0], &c->status[1],
That will break the existing cases that use CODEC_ID_ADPCM_IMA_WS i assume
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090426/f473fb91/attachment.pgp>
More information about the ffmpeg-devel
mailing list