[FFmpeg-devel] [RFC] Introducing policies regarding "AI" contributions

Mon Jul 7 01:18:46 EEST 2025

On 2025-07-05 14:20 +0300, Rémi Denis-Courmont wrote:
> On Tuesday, 1 July 2025, 13:58:23 Eastern European Summer Time, Alexander 
> Strasser via ffmpeg-devel wrote:
> > (...) I want this thread to start a discussion, that eventually leads
> > to a policy about submitting and integrating "AI" generated content.
> 
> Well, you can define a policy and/or make a public statement on FFmpeg.org, but 
> as others said, just like we can't prevent someone misattributing their 
> contributions and violating copyrights, we can't credibly prevent (mis)use of 
> LLMs to generate code.

Yes, that's what I had in mind.

At least it gives us something to point people to when we detect such
submissions.

Also, I believe it will help on its own, insofar as people find it and
read it.

> There is also a problem of definition. While I don't personally use computer 
> assistance, I think it's fine to use language servers to automatically generate 
> or suggest boilerplate, possible contextual completions, etc. While this sort 
> of technology predates LLMs and is clearly distinct from it **at the moment**, 
> it's going to be hard to define "AI" and where to draw a line.

I don't think this has the same problems. My intention was to cover "AI"
in the current form where it means LLMs trained on a huge corpus of code
under many licenses, and probably a lot of it without any license at all.

> Ultimately, I think you need to define the problem(s) as far as FFmpeg-devel is 
> concerned. Potential copyright violations are not new, and I think the current 
> policies and license terms are adequate, regardless of AI.
> 
> Low quality patches are also not really a new problem, and they can be 
> rejected with the current processes too.
> 
> *Maybe* LLM usage will (willingly or unwittingly) lead to denial-of-service 
> attacks on the review capacity and motivation of the FFmpeg-devel, TC and GA 
> membership, but that remains highly speculative, and I think we don't need to 
> solve that what-if problem yet. And again, this attack does not necessarily 
> need an LLM to be carried out.

I fear that could happen.

In general the usage of LLMs for codegen mostly shifts the line of work
in an unfavourable way :(

It also makes it possible to create a lot of code with little effort.
Code in itself is not really valuable though.
Having people who understand the code and maintain it is!

I agree that we might not need to react to that ahead of time, but in
general the review capacity in FFmpeg has been too low for a long time.
So I guess it's better to watch closely.

Best regards,
  Alexander