[FFmpeg-user] Whisper in ffmpeg 8

Fri Aug 15 00:23:32 EEST 2025

On Thu, 14 Aug 2025 at 22:15, Bernhard Döbler <programmer at bardware.de> wrote:
>
> Hi,
>
>
> yesterday the news made the rounds that ffmpeg 8 is going to be
> released soon and that it will contain Whisper, an AI program that can
> understand spoken text and create subtitles.
>
> Their GitHub page https://github.com/ggml-org/whisper.cpp says they
> offer a handful of models.
>
> Model   Disk      Mem
> tiny    75 MiB    ~273 MB
> base    142 MiB   ~388 MB
> small   466 MiB   ~852 MB
> medium  1.5 GiB   ~2.1 GB
> large   2.9 GiB   ~3.9 GB
>

There is a commit [1] that adds Whisper support as an audio filter,
documented at [2]. As the docs note, you will need to provide a model
file yourself.

> How does this work? Will all of this be compiled into the ffmpeg binary?

A --enable-whisper configure option was added (default: no) [3], so
whether Whisper support is compiled in is up to whoever builds your
binary. Either way, the model is a separate file that you supply.
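
If it helps, here is a rough Python sketch of how you could check
whether a given binary was configured with --enable-whisper and put
together a filter invocation. The function names and file names are
made up for illustration, and the filter option names (model,
destination, format) are my reading of the docs at [2], so please check
them against your build before relying on this.

```python
import shutil
import subprocess


def has_whisper(version_text: str) -> bool:
    """True if the configure flags printed by `ffmpeg -version`
    include --enable-whisper."""
    return "--enable-whisper" in version_text.split()


def whisper_command(src: str, model: str, out_srt: str) -> list[str]:
    """Build an ffmpeg argument list that runs the whisper audio filter.

    Assumption: option names follow the filter docs [2]. Paths that
    contain ':' or other filtergraph special characters would need
    escaping, which this sketch does not attempt.
    """
    af = f"whisper=model={model}:destination={out_srt}:format=srt"
    # Audio only (-vn); discard the filtered output, keep the subtitle file.
    return ["ffmpeg", "-i", src, "-vn", "-af", af, "-f", "null", "-"]


if __name__ == "__main__":
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        print("ffmpeg not found on PATH")
    else:
        result = subprocess.run(
            [ffmpeg, "-version"], capture_output=True, text=True
        )
        if has_whisper(result.stdout):
            # Hypothetical file names; the model must be downloaded separately.
            print(" ".join(whisper_command("input.mp4", "ggml-base.bin", "out.srt")))
        else:
            print("this ffmpeg build was not configured with --enable-whisper")
```

The script only prints the command instead of running it, since the
model path depends on where you downloaded the model.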

Cheers,
Rob

[1]: https://github.com/FFmpeg/FFmpeg/commit/13ce36fef98a3f4e6d8360c24d6b8434cbb8869b
[2]: https://ffmpeg.org/ffmpeg-filters.html#whisper-1
[3]: https://github.com/FFmpeg/FFmpeg/blob/47c6af7d299c96b2e65f5f10526e0f34e00b23c8/configure#L339