[FFmpeg-devel] [RFC] Direct Stream Transfer (DST) decoder
Reimar Döffinger
Reimar.Doeffinger at gmx.de
Tue Sep 30 21:00:12 CEST 2014
On Tue, Sep 30, 2014 at 11:42:20PM +1000, Peter Ross wrote:
> +/**
> + * Calculate FS44 ratio
> + */
> +#define DSD_FS44(sample_rate) (sample_rate / 44100)
> +
> +/**
> + * Calculate DST frame size
> + * @return samples per frame (1-bit samples)
> + */
> +#define DST_SAMPLES_PER_FRAME(sample_rate) (588 * DSD_FS44(sample_rate))
A header for just these seems a bit overkill.
Also I'd prefer static inline functions over macros when there is no
reason to use macros.
> +static int read_map(GetBitContext *gb, Table *t, unsigned int map[DST_MAX_CHANNELS], int channels)
> +{
> + int ch;
> + t->elements = 1;
> + if (!get_bits1(gb)){
> + map[0] = 0;
> + for (ch = 1; ch < channels; ch++) {
> + int bits = av_log2(t->elements) + 1;
> + map[ch] = get_bits(gb, bits);
> + if (map[ch] == t->elements) {
> + t->elements++;
> + if (t->elements >= DST_MAX_ELEMENTS)
> + return AVERROR_INVALIDDATA;
> + } else if (map[ch] > t->elements) {
> + return AVERROR_INVALIDDATA;
> + }
> + }
> + } else {
> + memset(map, 0, sizeof(*map) * DST_MAX_CHANNELS);
> + }
Maybe I just don't understand how this should be used (in that case a
comment would be nice), but I find it suspicious that the if part
initializes channels elements while the else part does initialize
DST_MAX_CHANNELS elements.
Personally I also prefer to have one-liners always in the "if" branch.
> +static av_always_inline int get_sr_golomb_dst(GetBitContext *gb, unsigned int k)
> +{
> +#if 0
> + /* 'run_length' upper bound is not specified; we can never be sure it will fit into get_bits cache */
> + int v = get_ur_golomb(gb, k, INT_MAX, 0);
The code is too hard to read for the time of day (a comment on those
different golomb variants sure would help), but sure that
get_ur_golomb_shorten doesn't happen to do what you need?
> + if (x >= 0)
> + c -= (x + 4) / 8;
> + else
> + c += (-x + 3) / 8;
Uh, isn't this actually
c -= (x + 4) >> 3;
?
> +static uint8_t prob_dst_x_bit(int c)
> +{
> + return (ff_reverse[c & 127] >> 1) + 1;
> +}
Since this seems to be in the inner loop, making
your own table might be worth it...
> +static void build_filter(int16_t table[DST_MAX_ELEMENTS][16][256], const Table *fsets)
> +{
> + int i, j, k, l;
> + for (i = 0; i < fsets->elements; i++) {
> + int length = fsets->length[i];
> + for (j = 0; j < 16; j++) {
> + int total = FFMAX(0, FFMIN(length - j * 8, 8));
av_clip?
> + for (k = 0; k < 256; k++) {
> + int v = 0;
> + for (l = 0; l < total; l++)
> + v += (((k >> l) & 1) * 2 - 1) * fsets->coeff[i][j * 8 + l];
Is this faster in a relevant way than something more readable like
coeff = fsets->coeff[i][j * 8 + l];
v += k & (1 << l) ? coeff : -coeff;
?
> + unsigned int half_prob[DST_MAX_CHANNELS];
> + unsigned int map_ch_to_felem[DST_MAX_CHANNELS];
> + unsigned int map_ch_to_pelem[DST_MAX_CHANNELS];
> + DECLARE_ALIGNED(16, uint8_t, status)[DST_MAX_CHANNELS][16];
> + DECLARE_ALIGNED(16, int16_t, filter)[DST_MAX_ELEMENTS][16][256];
At least the last one is quite large I think.
Please consider putting it in the context or so instead.
That's also less problematic with alignment.
> + if (!(avpkt->data[0] & 1)) {
> + if (frame->nb_samples > avpkt->size - 1)
> + av_log(avctx, AV_LOG_WARNING, "short frame");
> + memcpy(frame->data[0], avpkt->data + 1, FFMIN(frame->nb_samples * avctx->channels, avpkt->size - 1));
Hm, shouldn't there be a check for avpkt->size > 0 somewhere first?
> + if ((ret = init_get_bits8(&gb, avpkt->data, avpkt->size)) < 0)
> + if ((ret = read_map(&gb, &fsets, map_ch_to_felem, avctx->channels)) < 0)
> + if ((ret = read_map(&gb, &probs, map_ch_to_pelem, avctx->channels)) < 0)
I still think putting assignments into an if is really bad idea all
around.
> + if (same) {
> + probs.elements = fsets.elements;
> + memcpy(map_ch_to_pelem, map_ch_to_felem, sizeof(map_ch_to_felem));
> + } else {
> + avpriv_request_sample(avctx, "Same_Mapping=0");
> + return ret;
> + }
Simpler as
> if (!same) {
> avpriv_request_sample(avctx, "Same_Mapping=0");
> return ret;
> }
and no else.
Also, paranoia: for memcpy preferably use sizeof(dst) over sizeof(src),
the effects are less bad if it goes wrong.
> + uint64_t * s = (uint64_t*)status[ch];
That looks like a strict aliasing violation.
> + v = ((predict >> 15) ^ residual) & 1;
> + frame->data[0][ (i >> 3) * avctx->channels + ch] |= v << (7 - (i & 0x7 ));
> +
> +#if HAVE_BIGENDIAN
> + /* FIXME: not tested */
> + s[0] = (s[0] << 1) | ((s[1] >> 63) & 1);
> + s[1] = (s[1] << 1) | v;
> +#else
> + s[1] = (s[1] << 1) | ((s[0] >> 63) & 1);
> + s[0] = (s[0] << 1) | v;
> +#endif
Haven't fully understood this, but wouldn't it be more efficient
especially on 32 bit systems to first collect e.g. 8 bits and only
then move them into s?
Especially if unrolling would be viable, in which case e.g.
7 - (i & 0x7 ) would become a constant as well.
More information about the ffmpeg-devel
mailing list