[FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions
Alexander Strasser
eclipse7 at gmx.net
Fri Jul 4 02:14:11 EEST 2025
On 2025-07-03 02:16 +0200, Gerion Entrup wrote:
> On Tuesday, 1 July 2025, 12:58:23 Central European Summer Time, Alexander Strasser via ffmpeg-devel wrote:
[...]
> > Thus I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
> >
> > Leaving all ethical issues aside for a moment I still see 2 very big
> > problems with AI generated code:
> >
> > * looks generally plausible but is often subtly wrong
> >   * leading to more work, regressions and costs
> >     * which often lands on a different group of people (other
> >       projects, reviewers, bug finders, bug fixers, etc.)
> >     * which are sometimes delayed for quite some time increasing
> >       the costs of fixing them
> > * license/copyright violations
> >   * this might be sometimes a non-issue with small changes
> >   * but especially for complete components the risk seems high
> >
> > There is a lot more to the topic and I probably forgot to bring up
> > many more important aspects and details. Please feel free to bring
> > more things up in the discussion!
> >
> > There was a preparation in the musl project to put up a policy[2],
> > it has not yet been finalized and realized as far as I understand.
>
> Just to link it here: this reminds me of the Gentoo Linux discussion:
> https://archives.gentoo.org/gentoo-dev/9007c921a8a57655ecb2027eb4be4bff02673af4.camel@zougloub.eu/T/#t
> https://wiki.gentoo.org/wiki/Project:Council/AI_policy
Thanks for the links to the Gentoo discussion and policy!
IMHO the discussion and the resulting policy are interesting, and maybe
something similar would be appropriate for FFmpeg.
I also became aware of the LLVM policy:
https://llvm.org/docs/DeveloperPolicy.html#ai-generated-contributions
But I must say I do not like it as much. To cite the most critical part:
    As such, the LLVM policy is that contributors are permitted to use
    artificial intelligence tools to produce contributions, provided that
    they have the right to license that code under the project license.
    Contributions found to violate this policy will be removed just like
    any other offending contribution.
For "AI" (in the LLM sense) I think it's usually not at all easy to
say whether one has the right to license the code, given that the model
is trained on a huge corpus of copyrighted code under particular licenses.
Anyway, they agree on the license/copyright concern I raised, as does Gentoo.
The LLVM policy also comes to a similar conclusion, as does Gentoo,
regarding the waste of project resources:
    We encourage contributors to review all generated code before sending
    it for review to verify its correctness and to understand it so that
    they can answer questions during code review. Reviewing and maintaining
    generated code that the original contributor does not understand is not
    a good use of limited project resources.
If anyone has more examples of such policies at hand, it would be
interesting to know about them and take a look.
Best regards,
Alexander
> > It also brings up the point, that it is not really related to
> > recent "AI" tech, but more to the origin of work and its handling.
> > Unfortunately "AI" made problems with this a lot more common.
> >
> >
> > Best regards,
> > Alexander
> >
> > 1. https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2025-April/342146.html
> > 2. https://www.openwall.com/lists/musl/2024/10/19/3