[FFmpeg-devel] [PATCH] Add support for Audible AA files
Nicolas George
george at nsup.org
Mon Jul 27 15:14:45 CEST 2015
Le nonidi 9 thermidor, an CCXXIII, Vesselin Bontchev a écrit :
> Hi!
>
> This patch adds support for Audible AA files.
>
> Audible samples can be obtained from,
>
> https://gitlab.com/vesselin.bontchev/audible-samples/tree/master
> https://samples.ffmpeg.org/audible/
>
> Currently, this code generates corrupt audio output (in some places, and deterministically).
Does this happen with the Go and Python implementations too?
>
> By posting this patch, I am hoping to get some early feedback, and help in making this work.
>
> Thanks,
> Vesselin
> From ba73543efc3fdc4b5c61e9cb56f998d748716e00 Mon Sep 17 00:00:00 2001
> From: Vesselin Bontchev <vesselin.bontchev at yandex.com>
> Date: Sun, 19 Jul 2015 23:16:36 +0200
> Subject: [PATCH] Add support for Audible AA files
>
> https://en.wikipedia.org/wiki/Audible.com#Quality
> ---
> doc/demuxers.texi | 10 ++
> doc/general.texi | 2 +
> libavformat/Makefile | 1 +
> libavformat/aadec.c | 381 +++++++++++++++++++++++++++++++++++++++++++++++
> libavformat/allformats.c | 1 +
> 5 files changed, 395 insertions(+)
> create mode 100644 libavformat/aadec.c
>
> diff --git a/doc/demuxers.texi b/doc/demuxers.texi
> index e45e1af..df95233 100644
> --- a/doc/demuxers.texi
> +++ b/doc/demuxers.texi
> @@ -18,6 +18,16 @@ enabled demuxers.
>
> The description of some of the currently available demuxers follows.
>
> + at section aa
> +
> +Audible Format 2, 3, and 4 demuxer.
> +
> +This demuxer is used to demux Audible Format 2, 3, and 4 (.aa) files.
> +
> + at example
> +ffmpeg -v debug -i input.aa -c:a copy output.wav
> + at end example
> +
> @section applehttp
>
> Apple HTTP Live Streaming demuxer.
> diff --git a/doc/general.texi b/doc/general.texi
> index a260e79..2b782e0 100644
> --- a/doc/general.texi
> +++ b/doc/general.texi
> @@ -228,6 +228,8 @@ library:
> @item 8088flex TMV @tab @tab X
> @item AAX @tab @tab X
> @tab Audible Enhanced Audio format, used in audiobooks.
> + at item AA @tab @tab X
> + @tab Audible Format 2, 3, and 4, used in audiobooks.
> @item ACT Voice @tab @tab X
> @tab contains G.729 audio
> @item Adobe Filmstrip @tab X @tab X
> diff --git a/libavformat/Makefile b/libavformat/Makefile
> index cc73fd8..8405f8d 100644
> --- a/libavformat/Makefile
> +++ b/libavformat/Makefile
> @@ -395,6 +395,7 @@ OBJS-$(CONFIG_RTSP_DEMUXER) += rtsp.o rtspdec.o httpauth.o \
> OBJS-$(CONFIG_RTSP_MUXER) += rtsp.o rtspenc.o httpauth.o \
> urldecode.o
> OBJS-$(CONFIG_SAMI_DEMUXER) += samidec.o subtitles.o
> +OBJS-$(CONFIG_AA_DEMUXER) += aadec.o
> OBJS-$(CONFIG_SAP_DEMUXER) += sapdec.o
> OBJS-$(CONFIG_SAP_MUXER) += sapenc.o
> OBJS-$(CONFIG_SBG_DEMUXER) += sbgdec.o
> diff --git a/libavformat/aadec.c b/libavformat/aadec.c
> new file mode 100644
> index 0000000..bd994e8
> --- /dev/null
> +++ b/libavformat/aadec.c
> @@ -0,0 +1,381 @@
> +/*
> + * Audible AA demuxer
> + * Copyright (c) 2015 Vesselin Bontchev
> + *
> + * Inspired by https://github.com/jteeuwen/audible, and
> + * https://code.google.com/p/pyaudibletags/source projects.
Inspired, or translated from Go to C? The readString() wrapper seems to
indicate a step-by-step translation, and that warrants copyright. The
licence is GPL-compatible, though, but attribution must be preserved.
> + *
> + * See https://en.wikipedia.org/wiki/Audible.com#Quality for
> + * format details.
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "avformat.h"
> +#include "internal.h"
> +#include "libavutil/intreadwrite.h"
> +#include "libavutil/tea.h"
> +#include "libavutil/fifo.h"
> +#include "libavutil/opt.h"
> +
> +#define AA_MAGIC 1469084982 /* this identifies an audible .aa file */
> +#define MAX_CODEC_SECOND_SIZE 3982
> +#define MAX_TOC_ENTRIES 64
> +#define TEA_BLOCK_SIZE 8
> +
> +typedef struct AADemuxContext {
> + int64_t data_end;
Looks unused.
> + uint32_t HeaderSeed;
> + uint32_t Filesize;
> + uint32_t Magic;
> + uint32_t tocSize;
> + uint32_t npairs;
> + union {
> + unsigned char key[16];
> + uint32_t part[4];
> + } HeaderKey;
> + uint32_t TOC[MAX_TOC_ENTRIES][2];
These fields are used only in the read_header() function, they could
(should?) be local variables.
And some of them are stored but never accessed.
And TOC could be a structure.
> +
> + void *aa_fixed_key;
> + int aa_fixed_key_size;
> + uint32_t start;
> + uint32_t end;
Same as above: local.
> + int32_t codec_second_size;
> + char codec_name[64];
Local.
> + struct AVTEA *tea_ctx;
> + uint8_t final_key[16];
> + int64_t current_chapter_size;
> + int64_t current_codec_second_size;
> + uint32_t trailing_bytes;
Looks unused.
> + int64_t total_parsed;
Looks unused.
> + struct AVFifoBuffer *cbuf;
> + int chapter_idx;
> +} AADemuxContext;
> +
> +static int32_t GetSecondSizeByCodecID(char *codec_name)
Any reason to use UglyCamelCase for this?
> +{
> + int32_t result = 0;
> +
> + if (!strcmp(codec_name, "mp332")) {
> + // 0xC00D (marker in file)
I do not understand the comment.
> + result = 3982;
> + } else if (!strcmp(codec_name, "acelp85")) {
> + // 0xC00C
> + result = 2000;
> + } else if (!strcmp(codec_name, "acelp85")) {
> + // 0xC010
> + result = 1045;
> + }
> +
> + return result;
> +}
> +
> +static void readString(AVIOContext *pb, int size, char *output)
> +{
> + avio_read(pb, output, size);
> +}
Error code ignored. And the usefulness of this wrapper is hard to see.
> +
> +static int aa_read_header(AVFormatContext *s)
> +{
> + AVStream *st;
> + AADemuxContext *c;
> + AVIOContext *pb;
> +
> + int i, j, idx;
> + uint32_t nkey;
> + uint32_t nval;
> + char key[512] = {0};
> + char val[512] = {0};
> + unsigned char buffer[512] = {0};
> + uint32_t v0, v1;
> + unsigned char dst[8];
> + unsigned char src[8];
> + unsigned char *output = buffer;
Looks useless.
> + int largest_idx = -1;
> + int64_t largest_size = -1;
> + int64_t current_size = -1;
> +
> + c = s->priv_data;
> + pb = s->pb;
You could init that directly while declaring the variables, that is how it
is done.
> + c->tea_ctx = av_tea_alloc();
> + if (!c->tea_ctx)
> + return AVERROR(ENOMEM);
> + c->cbuf = av_fifo_alloc(MAX_CODEC_SECOND_SIZE * 2);
> + if (!c->cbuf)
> + return AVERROR(ENOMEM);
Leaks.
> +
> + c->Filesize = avio_rb32(pb); // file size
> + c->Magic = avio_rb32(pb); // magic string
> + c->tocSize = avio_rb32(pb); // TOC size
> + avio_rb32(pb); // unidentified integer
> + for (i = 0; i < c->tocSize; i++) { // read TOC
> + avio_rb32(pb); // TOC entry index
> + c->TOC[i][0] = avio_rb32(pb); // block Offset
> + c->TOC[i][1] = avio_rb32(pb); // block size
Unsafe, tocSize needs to be validated.
> + }
> + avio_read(pb, buffer, 24); // header termination block (ignored)
> + c->npairs = avio_rb32(pb); // read dictionary entries
> + for (i = 0; i < c->npairs; i++) {
> + memset(val, 0, sizeof(val));
> + memset(key, 0, sizeof(key));
> + avio_r8(pb); // unidentified byte
> + nkey = avio_rb32(pb); // key string length
> + nval = avio_rb32(pb); // value string length
> + if (nkey > 512) {
sizeof(key)
> + avio_seek(pb, nkey, SEEK_CUR);
avio_skip()
> + } else {
> + readString(pb, nkey, key); // key string
> + }
> + if (nval > 512) {
> + avio_seek(pb, nval, SEEK_CUR);
Ditto.
> + } else {
> + readString(pb, nval, val); // value string.
> + }
> + if (!strncmp(key, "codec", 5)) {
This will match "codec_params" too, you need to take the length into
account. I suggest to add a 0-terminator to key and use a simple strcmp().
> + strncpy(c->codec_name, val, sizeof(c->codec_name) - 1);
> + }
> + if (!strncmp(key, "HeaderSeed", 10)) {
> + c->HeaderSeed = atoi(val);
> + }
> + if (!strncmp(key, "HeaderKey", 9)) {
> + j = 0;
> + for (idx = 0; idx < 4; idx++) {
> + c->HeaderKey.part[idx] = (uint32_t)atoi(val + j);
> + c->HeaderKey.part[idx] = AV_RB32(c->HeaderKey.part + idx); // convert to BE!
Urgh!
uint32_t key_part = atoi(...);
AV_WB32(c->header_key + idx + 4, key_part);
> + if (idx == 3)
> + break;
> + while (val[j] != ' ') // find the next space
> + j++;
> + j = j + 1; // skip over the space
You could do much simpler using sscanf():
sscanf(val, "%d%d%d%d",
&key_part[0], &key_part[1], &key_part[2], &key_part[3]);
> + }
> + }
> + }
> +
> + /* decryption key derivation */
> + av_tea_init(c->tea_ctx, c->aa_fixed_key, 16);
> + c->codec_second_size = GetSecondSizeByCodecID(c->codec_name);
> + if (c->codec_second_size == -1) {
The function never returns -1.
> + return AVERROR_DECODER_NOT_FOUND;
> + }
> + memcpy(output + 0, "\x00\x00", 2); // purely for padding purposes
> + memcpy(output + 2, &c->HeaderKey, 16);
> + idx = 0;
> + for (i = 0; i < 3 * TEA_BLOCK_SIZE; i += TEA_BLOCK_SIZE) {
> + v0 = c->HeaderSeed;
> + v1 = c->HeaderSeed + 1;
> + AV_WB32(src, v0);
> + AV_WB32(src + 4, v1);
> + c->HeaderSeed = v1 + 1;
> + av_tea_crypt(c->tea_ctx, dst, src, TEA_BLOCK_SIZE, NULL, 0); // TEA ECB encrypt
> + for (j = 0; j < 8 && idx < 18; j+=1, idx+=1) {
> + output[idx] = output[idx] ^ dst[j];
> + }
Not entirely sure: is it not a re-implementation of CBC mode?
> + }
> + memcpy(c->final_key, output + 2, 16); // skip first 2 bytes of output
> +
> + /* decoder setup */
> + st = avformat_new_stream(s, NULL);
> + if (!st)
> + return AVERROR(ENOMEM);
> + st->codec->codec_type = AVMEDIA_TYPE_AUDIO;
> + if (!strcmp(c->codec_name, "mp332")) {
> + st->codec->codec_id = AV_CODEC_ID_MP3;
> + } else if (!strcmp(c->codec_name, "acelp85")) {
> + st->codec->codec_id = AV_CODEC_ID_SIPR;
> + st->codec->block_align = 19;
> + st->codec->channels = 1;
> + st->codec->sample_rate = 8500;
> + } else if (!strcmp(c->codec_name, "acelp16")) {
> + st->codec->codec_id = AV_CODEC_ID_SIPR;
> + st->codec->block_align = 20;
> + st->codec->channels = 1;
> + st->codec->sample_rate = 16000;
> + } else {
> + return AVERROR_DECODER_NOT_FOUND;
> + }
> +
> + /* determine, and jump to audio start offset */
> + for (i = 1; i < c->tocSize; i++) { // skip the first entry!
> + current_size = c->TOC[i][1];
> + if (current_size > largest_size) {
> + largest_idx = i;
> + largest_size = current_size;
> + }
> + }
> + c->start = c->TOC[largest_idx][0];
> + c->end = c->TOC[largest_idx][1];
> + avio_seek(pb, c->start, SEEK_SET);
> + c->current_chapter_size = 0;
> +
> + return 0;
> +}
> +
> +static int aa_read_packet(AVFormatContext *s, AVPacket *pkt)
> +{
> + AADemuxContext *c;
> + unsigned char dst[8];
> + unsigned char src[8];
> + int i;
> + int trailing_bytes;
> + int blocks;
> + uint8_t buf[MAX_CODEC_SECOND_SIZE * 2];
> + int written = 0;
> + int queued_size;
> + int ret = 0;
> + int shortfall;
> + int extra;
> +
> + c = s->priv_data;
> +
> + // do we already have some audio ouput from the last invocation?
> + queued_size = av_fifo_size(c->cbuf);
> + if (queued_size) {
> + // av_log(s, AV_LOG_DEBUG, "[aa] draining queued data (%d bytes)\n", queued_size);
> + av_fifo_generic_read(c->cbuf, buf, queued_size, NULL);
> + written = written + queued_size;
> + }
> +
> + // are we at the start of a chapter?
> + if (c->current_chapter_size == 0) {
> + c->current_chapter_size = avio_rb32(s->pb);
> + if (c->current_chapter_size == 0) {
> + ret = AVERROR_EOF;
> + goto end;
> + }
> + av_log(s, AV_LOG_DEBUG, "[aa] chapter %d (%ld bytes)\n", c->chapter_idx, c->current_chapter_size);
> + c->chapter_idx = c->chapter_idx + 1;
> + avio_rb32(s->pb); // data start offset
> + c-> current_codec_second_size = c->codec_second_size;
Nit: strange space after ->.
> + }
> +
> + // is this the last block in this chapter?
> + if (c->current_chapter_size / c->current_codec_second_size == 0) {
> + c->current_codec_second_size = c->current_chapter_size % c->current_codec_second_size;
> + }
> +
> + // decrypt c->current_codec_second_size bytes
> + blocks = c->current_codec_second_size / TEA_BLOCK_SIZE;
> + for (i = 0; i < blocks; i++) {
> + avio_read(s->pb, src, TEA_BLOCK_SIZE);
> + av_tea_init(c->tea_ctx, c->final_key, 16);
> + av_tea_crypt(c->tea_ctx, dst, src, 1, NULL, 1);
> + memcpy(buf + written, dst, TEA_BLOCK_SIZE);
> + written = written + TEA_BLOCK_SIZE;
> + }
> + trailing_bytes = c->current_codec_second_size % TEA_BLOCK_SIZE;
> + if (trailing_bytes != 0) { // trailing bytes are left unencrypted!
> + avio_read(s->pb, src, trailing_bytes);
> + memcpy(buf + written, dst, trailing_bytes);
Looks strange: you read to src but copy dst. May this be the cause of the
glitches?
> + written = written + trailing_bytes;
> + }
> +
> + // update state
> + c->current_chapter_size = c->current_chapter_size - c->current_codec_second_size;
> + if (c->current_chapter_size <= 0)
> + c->current_chapter_size = 0;
> +
> + if (written < c->codec_second_size) { // don't have enough output for the decoder, pull in data from the next chapter
> + shortfall = c->codec_second_size - written;
> + c->current_chapter_size = avio_rb32(s->pb);
> + if (c->current_chapter_size == 0) {
> + ret = AVERROR_EOF;
> + goto end;
> + }
> + av_log(s, AV_LOG_DEBUG, "[aa] chapter %d (%ld bytes)\n", c->chapter_idx, c->current_chapter_size);
> + c->chapter_idx = c->chapter_idx + 1;
> + avio_rb32(s->pb); // data start offset (ignored)
> + c->current_codec_second_size = c->codec_second_size;
> + blocks = c->current_codec_second_size / TEA_BLOCK_SIZE;
> + for (i = 0; i < blocks; i++) {
> + avio_read(s->pb, src, 8);
> + av_tea_init(c->tea_ctx, c->final_key, 16);
> + av_tea_crypt(c->tea_ctx, dst, src, 1, NULL, 1);
> + av_fifo_generic_write(c->cbuf, dst, TEA_BLOCK_SIZE, NULL);
> + }
> + trailing_bytes = c->current_codec_second_size % TEA_BLOCK_SIZE;
> + if (trailing_bytes != 0) {
> + avio_read(s->pb, src, trailing_bytes);
> + av_fifo_generic_write(c->cbuf, src, trailing_bytes, NULL);
> + }
This code looks duplicated.
> + c->current_chapter_size = c->current_chapter_size - c->current_codec_second_size;
> + if (c->current_chapter_size <= 0)
> + c->current_chapter_size = 0;
> +
> + // borrow "shortfall" bytes
> + av_fifo_generic_read(c->cbuf, buf + written, shortfall, NULL);
> + written = written + shortfall;
> + }
> +
> + // give only codec_second_size bytes to the decoder
> + if (written > c->codec_second_size) { // c->cbuf drain triggers this
> + extra = written - c->codec_second_size;
> + av_fifo_generic_write(c->cbuf, buf + c->codec_second_size, extra, NULL);
> + written = c->codec_second_size;
> + }
> +
> + av_init_packet(pkt);
> + pkt->data = NULL;
> + pkt->size = 0;
> + pkt->pos = avio_tell(s->pb);
> +
> + // av_log(s, AV_LOG_DEBUG, "[aa] giving %d bytes to decoder\n", written);
> + av_grow_packet(pkt, written);
IIRC, av_new_packet() can do that in a single step.
And do not forget to check memory allocations.
> + memcpy(pkt->data, buf, written);
> +
> +end:
> + return ret;
> +}
> +
> +static int aa_probe(AVProbeData *p)
> +{
> + unsigned char *buf = p->buf;
> +
> + // first 4 bytes are file size, next 4 bytes are the magic
> + if (AV_RB32(buf+4) != AA_MAGIC)
> + return AVPROBE_SCORE_EXTENSION;
> +
> + return AVPROBE_SCORE_MAX;
Looks very wrong: 32-bits magic number -> MAX, anything else -> EXTENSION.
Should be: 32->bits magic number -> MAX/4, anything -> 0.
> +}
> +
> +#define OFFSET(x) offsetof(AADemuxContext, x)
> +#define FLAGS AV_OPT_FLAG_DECODING_PARAM
> +static const AVOption aa_options[] = {
> + { "aa_fixed_key", // extracted from libAAX_SDK.so and AAXSDKWin.dll files!
> + "Fixed key used for handling Audible AA files", OFFSET(aa_fixed_key),
> + AV_OPT_TYPE_BINARY, {.str="77214d4b196a87cd520045fd2a51d673"},
> + .flags = AV_OPT_FLAG_DECODING_PARAM },
> + { NULL },
> +};
> +
> +static const AVClass aa_class = {
> + .class_name = "aa",
> + .item_name = av_default_item_name,
> + .option = aa_options,
> + .version = LIBAVUTIL_VERSION_INT,
> +};
> +
> +AVInputFormat ff_aa_demuxer = {
> + .name = "aa",
> + .long_name = NULL_IF_CONFIG_SMALL("Audible AA format files"),
> + .priv_class = &aa_class,
> + .priv_data_size = sizeof(AADemuxContext),
> + .extensions = "aa",
> + .read_probe = aa_probe,
> + .read_header = aa_read_header,
> + .read_packet = aa_read_packet,
> + .flags = AVFMT_GENERIC_INDEX,
> +};
> diff --git a/libavformat/allformats.c b/libavformat/allformats.c
> index 181cb9e..60ec0ca 100644
> --- a/libavformat/allformats.c
> +++ b/libavformat/allformats.c
> @@ -262,6 +262,7 @@ void av_register_all(void)
> REGISTER_MUXER (RTP_MPEGTS, rtp_mpegts);
> REGISTER_MUXDEMUX(RTSP, rtsp);
> REGISTER_DEMUXER (SAMI, sami);
> + REGISTER_DEMUXER (AA, aa);
Alphabetical order.
> REGISTER_MUXDEMUX(SAP, sap);
> REGISTER_DEMUXER (SBG, sbg);
> REGISTER_DEMUXER (SDP, sdp);
Regards,
--
Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20150727/4629c6e4/attachment.sig>
More information about the ffmpeg-devel
mailing list